Strategies for Automatic Release of Foreign Resources: pinterface

Nov 30, 2011 03:19

Wanting to avoid forcing users of burgled-batteries to deal with CPython reference counts for Python types with unknown translation, I'm exploring not one, but two separate strategies for automatically handling the issue.

The second, upon which I based my previous ill-advised CFFI trick [1], and for which I am grateful to Paul Khuong for correcting my error, is to let the GC decide when a resource is no longer needed. As I discovered while attempting to test it, however, this does not necessarily free one from thinking about when a resource can be freed.

The first approach, which is considerably more deterministic, is also somewhat more limiting in extent. In that approach, pointers are also wrapped, but the wrapped-value has dynamic-extent within a barrier [2]. Leave the barrier, and any wrappers are relieved of their wrapped value.

Both approaches have at least one downside in common: wrapped values cannot be used in place of a POINTER. This can be lessened somewhat for CFFI's TRANSLATE-*-FOREIGN functions (with some caveats involving the use of the EXPAND- variants):

    (defmethod translate-to-foreign ((value wrapped) type)
      (translate-to-foreign (wrapped-value value) type))

    (defmethod translate-from-foreign ((value wrapped) type)
      (translate-from-foreign (wrapped-value value) type))

    (defmethod free-translated-object ((value wrapped) type param)
      (free-translated-object (wrapped-value value) type param))

but MEM-REF and friends (reasonably!) assume POINTERs, so there's no getting around callers noticing the difference. Whether that's a worthwhile tradeoff depends on your goals.

Release by GC
The Good: Foreign resource has indefinite extent
The Bad: Freeing of foreign resource is unpredictable
The Story:
While I'm not always particularly good about testing, refcnts are incredibly easy to screw up and tracking down refcnt bugs is unpleasant, so I've been trying to make sure I test things involving them to catch when I screw up.

After a while flailing about with no idea what I was doing (seriously: at some point I decided adding threads into the mix would make it easier to test things involving the GC), I had an epiphany. "Gee, I wonder if trivial-garbage has finalizer-related tests?" It does. Well then, I'll just copy whatever they do.

    ;;; I don't really understand this, but it seems to work, and stems
    ;;; from the observation that typing the code in sequence at the REPL
    ;;; achieves the desired result. Superstition at its best.
    (defmacro voodoo (string)
      `(funcall
        (compile nil `(lambda ()
                        (eval (let ((*package* (find-package :tg-tests)))
                                (read-from-string ,,string)))))))

I'm not particularly thrilled with the "call with strings of Lisp code" interface for VOODOO, so I messed with it a little, but not too much, because I understand it even less than the original author does.

    (defun voodoo (expr)
      (funcall
       (compile nil `(lambda ()
                       (eval (read-from-string (prin1-to-string ',expr))))))
      (values))

Having simplified my tests, and confident the remaining problems I was seeing were in fact problems with my tests and not with the thing being tested, I set about discovering what those problems might be.

    (let* ((wrapped (run code))
           (unwrapped (wrapped-value wrapped)))
      (do-things-with unwrapped))

Spot the problem? If not, here's a hint: after what point is WRAPPED eligible for GC?

WRAPPED might get GCed at any point after (wrapped-value wrapped) is calculated. If that happens, the finalizer on WRAPPED changes, and potentially destroys, what UNWRAPPED points to. And then you've got a problem with the test (or, potentially, user code).
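
To make the hazard concrete, here is a minimal sketch of how a GC-released wrapper might attach its finalizer using trivial-garbage. It is not the burgled-batteries implementation: GC-WRAPPER, WRAP-FOR-GC and the RELEASE-FUNCTION parameter are hypothetical names, and the code assumes the trivial-garbage system (package nickname TG) is already loaded.

```lisp
;; Assumes trivial-garbage is already loaded; TG is its package nickname.
;; GC-WRAPPER and WRAP-FOR-GC are hypothetical names for illustration.
(defclass gc-wrapper ()
  ((value :initarg :wrap :reader wrapped-value)))

(defun wrap-for-gc (pointer release-function)
  "Wrap POINTER so RELEASE-FUNCTION is called on it once the wrapper is collected."
  (let ((wrapper (make-instance 'gc-wrapper :wrap pointer)))
    ;; The closure captures POINTER, never WRAPPER: a finalizer that
    ;; referenced WRAPPER would keep it reachable, so it would never run.
    (tg:finalize wrapper (lambda () (funcall release-function pointer)))
    wrapper))
```

Nothing here ties the lifetime of POINTER to anything but the wrapper object, which is why holding only the unwrapped pointer offers no protection.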

Lesson learned: don't wander around with an unwrapped pointer in a &body. But it's worse than that. Don't even pass the unwrapped pointer to a function if you can avoid it:

    (let ((wrapped (run code)))
      (do-things-with (wrapped-value wrapped)))

WRAPPED becomes eligible for GC as soon as WRAPPED-VALUE returns, which means the C object might go away before the pointer is passed to DO-THINGS-WITH, or any time during the execution of DO-THINGS-WITH.

If you're lucky, liveness will be ensured by wrapped pointers being part of some larger data structure, but if not, it becomes necessary to make sure WRAPPED is required later on, to avoid it becoming eligible for GC too soon.

    (let* ((wrapped (run code))
           (unwrapped (wrapped-value wrapped)))
      (do-things-with unwrapped)
      (calculate-something-using wrapped))

Unfortunately, if using wrapped pointers, one must keep in mind the liveness of the wrapper, lest foreign resources disappear too soon, which makes them not quite so effortless as I had hoped.
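
The "make sure WRAPPED is required later on" trick could be packaged as a small macro. The following is only a sketch: TOUCH and WITH-UNWRAPPED are hypothetical names, the WRAPPED class is a stand-in defined here so the example is self-contained, and relying on a NOTINLINE call to keep the wrapper live is an assumption about how compilers behave in practice, not something the standard promises.

```lisp
;; Stand-in wrapper class so the sketch runs on its own.
(defclass wrapped ()
  ((value :initarg :wrap :reader wrapped-value)))

;; NOTINLINE so the compiler has to perform a real call, which means it
;; has to keep the argument around until the call happens.
(declaim (notinline touch))
(defun touch (object)
  "Do nothing; exists only so callers hold a reference up to this call."
  object)

(defmacro with-unwrapped ((var wrapper) &body body)
  "Bind VAR to the unwrapped value of WRAPPER, referencing WRAPPER again after BODY."
  (let ((w (gensym "WRAPPER")))
    `(let* ((,w ,wrapper)
            (,var (wrapped-value ,w)))
       ;; Return BODY's values, but use the wrapper once more afterwards.
       (multiple-value-prog1 (progn ,@body)
         (touch ,w)))))

;; Usage: the wrapper is still referenced after the body has run.
(with-unwrapped (pointer (make-instance 'wrapped :wrap 42))
  (1+ pointer))
```

This does nothing for pointers that escape the body, so it narrows the problem rather than removing it.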
Release by Barrier
The Good: Foreign resource is freed predictably
The Bad: Ham-fisted approach
The Story:
I started with refcnt barriers because they're considerably less scary to write. Their predictability also makes them considerably easier to debug and reason about.

    (defclass barrier-wrapper ()
      ((value :initarg :wrap :reader wrapped-value)))

Wrapper. The invalidation of pointers upon exiting the barrier necessitates an ability to modify the VALUE slot. Hopefully nobody does.

    (defvar *barrier-objects*)

    (defmethod initialize-instance :after ((object barrier-wrapper) &key)
      (push object *barrier-objects*))

Nothing fancy. Just quick-and-easy tracking of wrappers created within a barrier.

    (defmacro with-barrier (&body body)
      `(let ((*barrier-objects* (list)))
         (unwind-protect (progn ,@body)
           ;; Runs on normal and non-local exit alike.
           (mapc #'invalidate-wrapper *barrier-objects*))))

    (defun invalidate-wrapper (wrapper)
      (free-somehow (wrapped-value wrapper))
      (slot-makunbound wrapper 'value))

Clearly I'm taking liberties with #'FREE-SOMEHOW, but you get the point. Upon exiting the extent of the form, we free the resource and invalidate the wrapped value. I unbind the slot, but setting it to NIL might better suit your tastes.

This is certainly useful in some cases, such as where you're processing things in a loop and each iteration is unrelated to the next, so you don't need to carry any values over. And it's not as if WITH-THING is unheard-of. But the complete lack of subtlety bothers me.
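
As an illustration of the loop case, here is a self-contained sketch assembled from the fragments above. FREE-SOMEHOW is replaced by a stand-in that merely records what it was asked to free, and *FREED* and SUM-ITEMS are hypothetical names; plain integers play the role of foreign pointers.

```lisp
(defvar *freed* '()
  "Log of released values, newest first, so the effect is observable.")

(defun free-somehow (pointer)
  ;; Stand-in for a real release function (e.g. a refcount decrement).
  (push pointer *freed*))

(defclass barrier-wrapper ()
  ((value :initarg :wrap :reader wrapped-value)))

(defvar *barrier-objects*)   ; unbound outside WITH-BARRIER

(defmethod initialize-instance :after ((object barrier-wrapper) &key)
  (push object *barrier-objects*))

(defun invalidate-wrapper (wrapper)
  (free-somehow (wrapped-value wrapper))
  (slot-makunbound wrapper 'value))

(defmacro with-barrier (&body body)
  `(let ((*barrier-objects* (list)))
     (unwind-protect (progn ,@body)
       (mapc #'invalidate-wrapper *barrier-objects*))))

(defun sum-items (items)
  "Wrap each item inside its own barrier; nothing wrapped outlives an iteration."
  (let ((total 0))
    (dolist (item items total)
      (with-barrier
        (let ((wrapper (make-instance 'barrier-wrapper :wrap item)))
          (incf total (wrapped-value wrapper)))))))
```

Each iteration frees its own wrapper before the next begins. A wrapper returned out of WITH-BARRIER comes back with its VALUE slot already unbound, which is the lack of subtlety in question, and creating one outside any barrier signals an unbound-variable error.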

While it'd certainly be possible to provide a little more nuance by adding in the ability to promote objects to the next barrier or GC-ability, I'm not convinced reintroducing parts of manual memory management is an appropriate method of dealing with the shortcomings of a system meant to automate it away.
Conclusions

While both are (to me) clear winners over forcing the issue on library users, I don't think either is a clear winner over the other. Letting the GC handle things is more conceptually elegant, but practically speaking it has some thorny issues that are difficult to overcome and even harder to spot [3]. Barriers are easy to reason about, but they're lacking in useful nuance, and I suspect would wind up being too large to be practical, or too small to be useful.

So, for now at least, burgled-batteries admits both approaches, though I have not yet come up with a good way to enable the approaches to be interleaved.

A third approach might be a sort of hybrid, where an implementation similar to barriers is used not to ensure foreign objects are destroyed upon exiting the form's extent, but to ensure they remain available within it. A "corral", I guess you might call it?
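
Purely as speculation, such a corral might look something like the sketch below. Every name in it (CORRALLED-WRAPPER, *CORRAL*, WITH-CORRAL) is hypothetical, and it assumes release is still left to a GC finalizer attached elsewhere; the corral only holds references.

```lisp
(defclass corralled-wrapper ()
  ((value :initarg :wrap :reader wrapped-value)))

(defvar *corral*)   ; unbound outside WITH-CORRAL

(defmethod initialize-instance :after ((object corralled-wrapper) &key)
  ;; Inside a corral, remember the wrapper so it stays reachable.
  ;; Outside one, do nothing and leave it to the GC as usual.
  (when (boundp '*corral*)
    (push object *corral*)))

(defmacro with-corral (&body body)
  "Keep every wrapper created during BODY reachable until BODY exits."
  `(let ((*corral* (list)))
     ,@body))
```

Unlike a barrier, leaving the form frees nothing. The binding simply goes away, and any wrapper not referenced elsewhere becomes eligible for GC again.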

At any rate, the search continues for the perfect means of automatically dealing with foreign resources.
Footnotes

[1] One of the great things about being wrong on Planet Lisp instead of just my own little echo chamber: somebody is vastly more likely to notice and correct me. Trying to debug the problems that would have arisen from my original approach would have been Not Fun.
[2] It is entirely possible I have incorrectly co-opted the term "barrier". If so, I'd be happy to switch to a more correct term, if one exists.
[3] For instance: You unwrap and pass the value to foreign code, foreign code issues a callback, callback triggers a GC, wrapper goes away taking the wrapped foreign value with it.

cffi, lisp