Conversation

@mkhludnev (Contributor) commented Sep 23, 2024

Here I want to boost `QdrantVectorStore.aadd_texts` performance with a truly async implementation.

For now it's just a test in the form of a benchmark. It also includes a `LocalAIEmbeddings` implementation that allows bulk requests (#22666).

  • Lint and test: needs `openai>=1.3` to run it (#22399).

@efriis efriis added the partner label Sep 23, 2024

@efriis efriis self-assigned this Sep 23, 2024
@mkhludnev (Contributor, Author) commented:

Heya!!
I repeated the same exercise with a remote Qdrant & LocalAI and got a significant gain (honestly, I deliberately sized the test data to favor async code that spawns parallel requests up front):
(image: benchmark results)
The interesting thing: the old `Qdrant.aadd_texts`, which uses an async generator, is slower than truly async code using `create_task()`.
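For anyone curious why, here is a minimal, self-contained sketch of the difference. The `upsert_batch` coroutine is a hypothetical stand-in for one Qdrant batch upsert (not the real client API); it contrasts awaiting batches one by one with spawning them all via `asyncio.create_task()`:

```python
import asyncio
import time

# Hypothetical stand-in for one batch upsert: ~0.1 s of network I/O.
async def upsert_batch(batch_id: int) -> int:
    await asyncio.sleep(0.1)  # simulated network round-trip
    return batch_id

async def sequential(n: int) -> float:
    """Async-generator style: each batch is awaited before the next starts."""
    start = time.perf_counter()
    for i in range(n):
        await upsert_batch(i)
    return time.perf_counter() - start

async def concurrent(n: int) -> float:
    """create_task() style: all batches are in flight at once."""
    start = time.perf_counter()
    tasks = [asyncio.create_task(upsert_batch(i)) for i in range(n)]
    await asyncio.gather(*tasks)
    return time.perf_counter() - start

print(f"sequential: {asyncio.run(sequential(5)):.2f}s")  # ~0.50 s
print(f"concurrent: {asyncio.run(concurrent(5)):.2f}s")  # ~0.10 s
```

With five batches, the sequential version pays the round-trip latency five times in a row, while the `create_task()` version pays it roughly once.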

@mkhludnev (Contributor, Author) commented:

@Anush008 wdyt?

@Anush008 (Contributor) commented:

This also includes the time for generating embeddings. Correct?

@mkhludnev (Contributor, Author) commented:

> This also includes the time for generating embeddings. Correct?

Right.

@mkhludnev (Contributor, Author) commented:

@Anush008 using `create_task()` gives more parallelism, so it lets us put a higher load on the server, reducing wall-clock time. However, I'm not sure that's the purpose of the async `VectorStore.a*` methods.
It's not clear, e.g., in the scope of #11141. @baskaryan, could you comment on what the purpose of the `VectorStore` async methods is?

@efriis (Contributor) commented Sep 30, 2024

hey team!

while this is in draft, could you open it against your own fork, and reopen the PR against the main project when it's ready for our team's review?

@efriis efriis closed this Sep 30, 2024
@mkhludnev (Contributor, Author) commented:

@efriis thanks for your reply.
Could you please share your view: is it worth contributing a `QdrantVectorStore.aadd_texts()` that processes batches in parallel, reducing wall-clock time at the cost of higher CPU utilization?
cc @Anush008

@Anush008 (Contributor) commented Oct 1, 2024

If we remove the embeddings generation, is the difference in performance worth the code duplication?

@mkhludnev (Contributor, Author) commented Oct 1, 2024

> If we remove the embeddings generation, is the difference in performance worth the code duplication?

That's actually what I'm asking. Embedding is just one of the pipeline stages; even if we skip it, `aadd_texts` can still send a few batches in parallel. Whether it's worth it, I don't know. Here's an example of parallel execution from another project (if you've ever heard of it ;) ). UPD: although it turns out that code doesn't process batches concurrently.
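As a sketch of what such a contribution could look like (illustrative only, not the PR's actual code; `upsert_batch` and `aadd_texts_parallel` are hypothetical names), batches can be sent concurrently with `asyncio.gather()` while a semaphore caps how many are in flight, addressing the higher-load concern:

```python
import asyncio

# Hypothetical stand-in for one batch upsert to the vector store.
async def upsert_batch(batch: list) -> int:
    await asyncio.sleep(0.05)  # simulated network round-trip
    return len(batch)

async def aadd_texts_parallel(batches: list, max_parallel: int = 4) -> int:
    """Send batches concurrently, with at most `max_parallel` in flight."""
    sem = asyncio.Semaphore(max_parallel)

    async def limited(batch: list) -> int:
        async with sem:  # blocks while max_parallel upserts are running
            return await upsert_batch(batch)

    counts = await asyncio.gather(*(limited(b) for b in batches))
    return sum(counts)

total = asyncio.run(aadd_texts_parallel([["a", "b"], ["c"], ["d", "e", "f"]]))
print(total)  # 6 texts added across 3 batches
```

Tuning `max_parallel` would let callers trade wall-clock time against load on the Qdrant server.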

