Description
Currently LLamaSharp is in a bit of an awkward position with async support.

- `ILLamaExecutor` specifies two infer methods, one async and one non-async.
- All implementations of `InferAsync` currently just call `Infer` (sketched below), which is useless!
- Inference itself is not really async (although we could make it act like it).
- There are some things LLamaSharp does which could be async (e.g. saving/loading session files), but they're not, because they're implemented in the non-async `Infer` method. Making them async would require duplicating a huge amount of inference code.

So we're in the position where we pretend to have async but don't actually have it, and the small amount of work that could be async isn't!
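For illustration, the shape being described looks roughly like the sketch below. Names and signatures (`IExecutorSketch`, `ExecutorSketch`, the single `prompt` parameter) are simplified assumptions, not the actual `ILLamaExecutor` definition:

```csharp
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch only; names and signatures are simplified and may not
// match the real ILLamaExecutor.
public interface IExecutorSketch
{
    IEnumerable<string> Infer(string prompt, CancellationToken token = default);
    IAsyncEnumerable<string> InferAsync(string prompt, CancellationToken token = default);
}

public class ExecutorSketch : IExecutorSketch
{
    public IEnumerable<string> Infer(string prompt, CancellationToken token = default)
    {
        // Synchronous evaluation loop: tokenize, evaluate, sample, plus any
        // session file saving/loading, all on the calling thread.
        yield return "token";
    }

    // "Async" in name only: it just iterates the blocking Infer call.
    public async IAsyncEnumerable<string> InferAsync(string prompt,
        [EnumeratorCancellation] CancellationToken token = default)
    {
        await Task.CompletedTask; // nothing is actually awaited
        foreach (var piece in Infer(prompt, token))
            yield return piece;
    }
}
```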
Option 1: Remove Async
Right now the library is not really async: the small amount of IO work in the library isn't async, and all inference computations block. So we could just remove async support altogether and lose nothing.
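To make that concrete, here is a hedged sketch of what Option 1 could look like (the names `ISyncOnlyExecutorSketch` and `InferOffThread` are invented for illustration): the library keeps only the blocking method, and any off-loading is left to the caller.

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch of Option 1; not a real LLamaSharp type.
public interface ISyncOnlyExecutorSketch
{
    IEnumerable<string> Infer(string prompt, CancellationToken token = default);
}

public static class CallerSide
{
    // The library stays synchronous; a caller that must not block its thread
    // (e.g. a UI or ASP.NET request thread) wraps the call itself.
    public static Task<List<string>> InferOffThread(ISyncOnlyExecutorSketch executor, string prompt)
    {
        return Task.Run(() => new List<string>(executor.Infer(prompt)));
    }
}
```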
Option 2: Remove Non-Async
The other option is to remove non-async support. That way the inference code would all be in an async context: any IO work would be easy to make properly async, and the expensive computations (i.e. `Eval()`) could be done in a `Task.Run` call and awaited so they don't block the calling thread.
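A hedged sketch of what Option 2 could look like is below. `EvalPlaceholder` stands in for the expensive native evaluation, and the session-file handling is invented purely for illustration; none of these names are the real LLamaSharp API.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch of Option 2: a single async inference path.
public class AsyncOnlyExecutorSketch
{
    private float[] EvalPlaceholder(IReadOnlyList<int> tokens)
    {
        // Stand-in for the blocking native Eval() call.
        return new float[0];
    }

    public async IAsyncEnumerable<string> InferAsync(string prompt, string sessionPath,
        [EnumeratorCancellation] CancellationToken token = default)
    {
        // IO work (e.g. loading a session file) can now be genuinely async.
        byte[] sessionState = await File.ReadAllBytesAsync(sessionPath, token);

        var tokens = new List<int>(); // ... tokenized prompt would go here ...

        while (!token.IsCancellationRequested)
        {
            // Push the expensive computation onto the thread pool and await it,
            // so the calling thread is not blocked during evaluation.
            float[] logits = await Task.Run(() => EvalPlaceholder(tokens), token);

            // ... sample the next token from logits, detokenize, check for EOS ...
            yield return "token";
            break; // one step only, for the sake of the sketch
        }
    }
}
```

Callers would then consume this with `await foreach (var piece in executor.InferAsync(...))`.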
Personally I prefer option 2. But I realise that requiring async may not be preferred by everyone! Thoughts?