Conversation

@mkhludnev (Contributor) commented Sep 23, 2024

Here I want to boost `QdrantVectorStore.aadd_texts` performance with a truly async implementation.

For now it's just a test in the form of a benchmark. It also includes a `LocalAIEmbeddings` implementation that allows bulk requests (#22666).

  • Lint and test: needs `openai>=1.3` to run it (#22399).

@efriis efriis added the partner label Sep 23, 2024

@efriis efriis self-assigned this Sep 23, 2024
@mkhludnev (Contributor, Author) commented:

Heya!!
I repeated the same exercise with a remote Qdrant & LocalAI and got a significant gain (honestly, I deliberately sized the test data to favor async code that spawns parallel requests up front):
(image: benchmark results)
The interesting thing: the old `Qdrant.aadd_texts`, which uses an async generator, is slower than truly async code using `create_task()`.
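For anyone curious why, here is a minimal, self-contained sketch of the difference. The `upsert_batch` coroutine is a hypothetical stand-in for one Qdrant batch upsert (not the real client API); it contrasts awaiting batches one by one with spawning them all via `asyncio.create_task()`:

```python
import asyncio
import time

# Hypothetical stand-in for one batch upsert: ~0.1 s of network I/O.
async def upsert_batch(batch_id: int) -> int:
    await asyncio.sleep(0.1)  # simulated network round-trip
    return batch_id

async def sequential(n: int) -> float:
    """Async-generator style: each batch is awaited before the next starts."""
    start = time.perf_counter()
    for i in range(n):
        await upsert_batch(i)
    return time.perf_counter() - start

async def concurrent(n: int) -> float:
    """create_task() style: all batches are in flight at once."""
    start = time.perf_counter()
    tasks = [asyncio.create_task(upsert_batch(i)) for i in range(n)]
    await asyncio.gather(*tasks)
    return time.perf_counter() - start

print(f"sequential: {asyncio.run(sequential(5)):.2f}s")  # ~0.50 s
print(f"concurrent: {asyncio.run(concurrent(5)):.2f}s")  # ~0.10 s
```

With five batches, the sequential version pays the round-trip latency five times in a row, while the `create_task()` version pays it roughly once.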

@mkhludnev (Contributor, Author) commented:

@Anush008 wdyt?

@Anush008 (Contributor) commented:

This also includes the time for generating embeddings. Correct?

@mkhludnev (Contributor, Author) commented:

> This also includes the time for generating embeddings. Correct?

Right.

@mkhludnev (Contributor, Author) commented:

@Anush008 using `create_task()` gives more parallelism, so it lets us put a higher load on the server, reducing wall-clock time. However, I'm not sure that's the purpose of the async `VectorStore.a*` methods.
It's not clear, e.g., in the scope of #11141. @baskaryan, could you comment on what the purpose of the `VectorStore` async methods is?

@efriis (Contributor) commented Sep 30, 2024

hey team!

while this is in draft, could you open it against your own fork, and reopen the PR against the main project when it's ready for our team's review?

@efriis efriis closed this Sep 30, 2024
@mkhludnev (Contributor, Author) commented:

@efriis thanks for your reply.
Could you please share your view: is it worth contributing a `QdrantVectorStore.aadd_texts()` that processes batches in parallel, reducing wall-clock time at the cost of higher CPU utilization?
cc @Anush008

@Anush008 (Contributor) commented Oct 1, 2024

If we remove the embeddings generation, is the difference in performance worth the code duplication?

@mkhludnev (Contributor, Author) commented Oct 1, 2024

> If we remove the embeddings generation, is the difference in performance worth the code duplication?

That's actually what I'm asking. Embedding is just one of the pipeline stages; even if we skip it, `aadd_texts` can still send a few batches in parallel. Whether it's worth it, I don't know. Here's an example of parallel execution from another project (if you've ever heard of it ;) ). UPD: although it turns out that code doesn't process batches concurrently.
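As a sketch of what such a contribution could look like (illustrative only, not the PR's actual code; `upsert_batch` and `aadd_texts_parallel` are hypothetical names), batches can be sent concurrently with `asyncio.gather()` while a semaphore caps how many are in flight, addressing the higher-load concern:

```python
import asyncio

# Hypothetical stand-in for one batch upsert to the vector store.
async def upsert_batch(batch: list) -> int:
    await asyncio.sleep(0.05)  # simulated network round-trip
    return len(batch)

async def aadd_texts_parallel(batches: list, max_parallel: int = 4) -> int:
    """Send batches concurrently, with at most `max_parallel` in flight."""
    sem = asyncio.Semaphore(max_parallel)

    async def limited(batch: list) -> int:
        async with sem:  # blocks while max_parallel upserts are running
            return await upsert_batch(batch)

    counts = await asyncio.gather(*(limited(b) for b in batches))
    return sum(counts)

total = asyncio.run(aadd_texts_parallel([["a", "b"], ["c"], ["d", "e", "f"]]))
print(total)  # 6 texts added across 3 batches
```

Tuning `max_parallel` would let callers trade wall-clock time against load on the Qdrant server.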

