@Daniel-Robbins Daniel-Robbins commented Dec 12, 2023

This demo shows how to train vector representations of items with Word2vec and recommend items based on the similarity of their vectors. It consists of 5 parts:

  1. Prepare item sequences based on user behavior.
  2. Train a CBOW model using the Word2Vec module of the gensim library.
  3. Extract all embedding data and write it to chDB.
  4. Query chDB by cosine distance to find movies similar to the input movie.
  5. A simple unit test for vector data insertion and querying.
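
To make steps 1 and 4 concrete, here is a minimal standard-library sketch. It is not the code in this PR: the event format, the function names (`build_item_sequences`, `cosine_distance`, `most_similar`), and the toy embeddings are all assumptions made for illustration. In the demo itself, training is done by gensim's Word2Vec and the similarity query runs inside chDB.

```python
import math
from collections import defaultdict


def build_item_sequences(events):
    """Step 1 (sketch): group (user_id, item_id, timestamp) events into
    one time-ordered item sequence per user. The event layout is assumed."""
    by_user = defaultdict(list)
    for user_id, item_id, ts in events:
        by_user[user_id].append((ts, item_id))
    # Sort users for a deterministic order, then each user's items by timestamp.
    return [
        [item for _, item in sorted(pairs)]
        for _, pairs in sorted(by_user.items())
    ]


def cosine_distance(a, b):
    """Cosine distance: 1 - (a . b) / (|a| * |b|). Assumes non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def most_similar(embeddings, target, k=2):
    """Step 4 (sketch): return the k items whose vectors are closest to
    the target's vector, smallest cosine distance first."""
    target_vec = embeddings[target]
    scored = [
        (cosine_distance(target_vec, vec), item)
        for item, vec in embeddings.items()
        if item != target
    ]
    return [item for _, item in sorted(scored)[:k]]


if __name__ == "__main__":
    # Hypothetical viewing events: (user, movie, timestamp).
    events = [("u1", "m2", 20), ("u1", "m1", 10), ("u2", "m3", 5)]
    print(build_item_sequences(events))

    # Hypothetical 2-dimensional embeddings standing in for trained vectors.
    embeddings = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
    print(most_similar(embeddings, "a", k=1))
```

The sequences from step 1 play the role of "sentences" for Word2vec, so items that co-occur in the same users' histories end up with nearby vectors. Ranking by ascending cosine distance then puts the most similar movies first.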

lmangani previously approved these changes Dec 12, 2023
Thanks @Daniel-Robbins for all the amazing contributions 🤟

  • No checks needed for examples.

@Daniel-Robbins Daniel-Robbins marked this pull request as ready for review December 14, 2023 10:15
@auxten auxten merged commit bd7ff5a into chdb-io:main Dec 14, 2023