[RFC/ANN] CachedFunctions.jl

Hi, I just want to announce my package and request comments on it:


I wrote this as a possible solution to the calculation of higher-order Hessians, inspired heavily by this post. The idea is simple: if you have an in-place function, know the sizes of the input and the output beforehand, and the function returns the same type as the input, then the creation of caches can be done programmatically instead of manually.
The idea is to transform f!(y,x) into f(x) and be able to use that function freely (the main motivation being passing it to a ForwardDiff routine to reduce allocations). I'm working on a CachedJacobian built on this package to calculate higher-order Jacobians without allocations (also preallocating the caches needed by ForwardDiff and FiniteDiff). Comments about the interface (described in the README), feature requests, and corrections would be greatly appreciated.
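The core idea can be sketched roughly as follows (an illustrative minimal version, not the package's actual API, assuming the output has the same shape and eltype as the input):

```julia
# Wrap an in-place f!(y, x) so it can be called as f(x), with the output
# buffer allocated once per input element type and then reused.
struct Cached{F}
    f!::F
    caches::IdDict{DataType,Any}   # eltype => preallocated output buffer
end

Cached(f!) = Cached(f!, IdDict{DataType,Any}())

function (c::Cached)(x::AbstractArray{T}) where {T}
    # allocate a cache the first time eltype T is seen; reuse it afterwards
    y = get!(() -> similar(x), c.caches, T)
    c.f!(y, x)
    return y
end

double!(y, x) = (y .= 2 .* x; y)
f = Cached(double!)
f([1.0, 2.0])   # first call allocates the Float64 cache
f([3.0, 4.0])   # subsequent calls reuse the same buffer
```

Keying the cache on the element type is what lets the same wrapper work for both plain `Float64` arrays and `ForwardDiff.Dual` arrays without reallocating.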
Already registered! Waiting to be added to the General registry.


From a quick look at the code, it looks like invoking f(x) from different threads might corrupt the data. Do you have a plan/idea for thread safety?


Good catch! I'm going to look into how to make the calls thread-safe. The only thing that comes to mind right now is using a lock at the CachedFunction call level. Another approach would be creating nthreads() caches managed by the CachedFunction, with the cache selection also including the threadid as part of the key.
If you have something in mind, let me know!

Yes, I think I’d use threadid, too. Something like Vector{IdDict{DataType, Union{AbstractArray,Number}}}. I don’t know if "threadid as a key" works, because a Dict can (in principle) mutate its underlying data structure in a non-thread-safe manner.
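The Vector-of-IdDicts idea might look something like this (a sketch, not the package's implementation):

```julia
using Base.Threads: nthreads, threadid

# One cache dict per thread, indexed by threadid(), so no two threads
# ever share a buffer and no dict is mutated from two threads at once.
struct ThreadCached{F}
    f!::F
    caches::Vector{IdDict{DataType,Any}}
end

ThreadCached(f!) = ThreadCached(f!, [IdDict{DataType,Any}() for _ in 1:nthreads()])

function (c::ThreadCached)(x::AbstractArray{T}) where {T}
    local_dict = c.caches[threadid()]         # this thread's private dict
    y = get!(() -> similar(x), local_dict, T)
    c.f!(y, x)
    return y
end
```

One caveat: on newer Julia versions tasks can migrate between threads at yield points, so `threadid()`-based indexing is only safe if `f!` does not yield (or if the task is sticky).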


Also, come to think of it, it might be possible to get puzzling behavior even without threading, just with @async. Consider

@assert f isa CachedFunction
@assert typeof(x1) == typeof(x2)

task = @async f(x1)
y2 = f(x2)
y1 = fetch(task)   # fetch already waits for the task to finish

@assert y1 === y2  # both are the same cached buffer; maybe this is puzzling?

It’s a bit of a silly example, but it could introduce a subtle bug in something like

@assert f isa CachedFunction
@assert typeof(x1) == typeof(x2)

@sync begin
    @async g(f(x1))
    @async h(f(x2))
end

If g contains some I/O, f(x2) can be called before g has finished using the output of f(x1); since both calls return the same cached buffer, the second call overwrites it.

I suppose it seems weird to worry about I/O with CachedFunction, given that it’s presumably geared towards compute-intensive work. But it’s conceivable in the Julia ecosystem that someone might want to wrap a computational package in a web API; in that case, worrying about I/O starts to make sense. Though I’m not sure if it’s possible to “fix” this other than by re-implementing GC.

No, not at all! I’ve come to the realization that this is a harder problem than I thought. I was thinking of modifying the CachedFunction constructor to add a threadsafe keyword, where threadsafe = false would give the current behavior, whereas true would lock the caching stage and the evaluation stage.
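A rough sketch of what `threadsafe = true` could look like, with all the unsafe operations behind a `ReentrantLock` stored in the struct (illustrative names, not the package's actual types):

```julia
# Cache lookup and evaluation both happen while holding the lock.
struct LockedCached{F}
    f!::F
    caches::IdDict{DataType,Any}
    lock::ReentrantLock
end

LockedCached(f!) = LockedCached(f!, IdDict{DataType,Any}(), ReentrantLock())

function (c::LockedCached)(x::AbstractArray{T}) where {T}
    lock(c.lock) do
        y = get!(() -> similar(x), c.caches, T)
        c.f!(y, x)
        return y   # returned after the lock is released
    end
end
```

Note that this protects the cache structure and the evaluation itself, but the returned `y` is still the shared buffer, so a later call can overwrite it while an earlier caller is still reading it; the @async examples above are exactly this hazard.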

Another approach is to create different constructors for different uses: a SyncCachedFunction, a ParallelCachedFunction (to do calculations on all threads at once, for example), and others if necessary.

In the end, it should be clear to the user what to expect from the created CachedFunction, and those examples are great for bringing the expected usage to light. This is also the right moment to make a lot of breaking changes, if they would result in a better package. If you have any requests or proposals, they would be very (emphasis on very) appreciated :grinning:

Thinking about this a bit more, I guess you can add an interface like lock(g, f::CachedFunction, x) that locks the output y of f(x) while evaluating g(y). I think it’s a nice way to declare the lifetime of the cached output.

@sync begin
    @async lock(f, x1) do y  # y = f(x1)
        g(y)
    end
    @async lock(f, x2) do y  # y = f(x2)
        h(y)
    end
end
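One possible implementation of such an interface, assuming the CachedFunction stores a `ReentrantLock` in a hypothetical `lock` field:

```julia
# Hold the function's lock for the whole lifetime of g's use of y,
# so no other caller can overwrite the cached output in the meantime.
function Base.lock(g, f::CachedFunction, x)
    lock(f.lock) do
        y = f(x)   # evaluate while holding the lock...
        g(y)       # ...so y cannot be overwritten until g returns
    end
end
```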

Note that f can still implement a thread-local cache such that

@sync begin
    @spawn lock(f, x1) do y
        g(y)
    end
    @spawn lock(f, x2) do y
        h(y)
    end
end

can still do the work in parallel if there are enough threads.

…although I guess this would deadlock with

@assert typeof(x1) == typeof(x2)

lock(f, x1) do y1
    lock(f, x2) do y2
        g(y1, y2)
    end
end

unless it is explicitly supported.

Note that LRUCache.jl provides a thread-safe associative structure, which I use to cache temporaries in e.g. TensorOperations.jl. Nonetheless, I still use the threadid() as part of the key, but that alone is not sufficient if you store all cached results in a common structure.
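The (threadid, type)-keyed scheme described here might look roughly like this (a sketch using the LRUCache.jl API):

```julia
using LRUCache          # external package; LRU itself is thread-safe
using Base.Threads: threadid

# Key on (thread id, element type) so each thread gets its own buffer,
# which can then be overwritten in place safely after retrieval.
const BUFFERS = LRU{Tuple{Int,DataType},Any}(maxsize = 32)

function get_buffer(x::AbstractArray{T}) where {T}
    get!(() -> similar(x), BUFFERS, (threadid(), T))
end
```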


Although, the threadid() part of the key might be because of how I use the objects stored in the cache: they are indeed overwritten in place after being retrieved, so I want to make sure different objects are associated with different threads. If you are only looking up values, the thread safety of the LRU structure alone should be sufficient.


So, basically, what I have to do is put all the unsafe operations behind a lock stored in the struct, right? What’s the difference between the reentrant lock and the spinlock in Julia?
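For reference, the practical difference between the two Base lock types:

```julia
using Base.Threads: SpinLock

r = ReentrantLock()
lock(r)
lock(r)        # OK: the same task may re-acquire a ReentrantLock
unlock(r); unlock(r)

s = SpinLock()
lock(s)
# lock(s)      # would spin forever: SpinLock is NOT reentrant
unlock(s)
```

A `ReentrantLock` cooperates with the scheduler (a blocked task yields), so it is safe to hold across code that may yield. A `SpinLock` busy-waits and should only guard short, non-yielding critical sections; for a CachedFunction that evaluates arbitrary user code, `ReentrantLock` is the safer default.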