Description
In SVS, we separate the concepts of dataset and index. This allows an index to accept different datasets, such as FP32, Scalar Quantization (SQDataset), LVQ, and LeanVec.
Currently, using SQDataset as an example, users must call compress to get an SQDataset and then pass it to the index. However, most use cases don't care about the dataset itself, so this two-step process is usually unnecessary. Additionally, because each dataset (i.e., type) is determined at compile time, this approach makes runtime fallback impossible to implement in SVS.
auto loaded =
    svs::VectorDataLoader<float>(std::filesystem::path(SVS_DATA_DIR) / "data_f32.svs").load();
// SQDataset is determined at compile time
auto data = svs::scalar::SQDataset<std::int8_t>::compress(loaded, threadpool);
auto parameters = svs::index::vamana::VamanaBuildParameters{};
svs::Vamana index = svs::Vamana::build<float>(
    parameters, data, svs::distance::DistanceL2(), num_threads
);
I propose adding index building APIs that directly accept uncompressed data and take the dataset type as a parameter to determine which dataset format to use internally.
auto loaded =
    svs::VectorDataLoader<float>(std::filesystem::path(SVS_DATA_DIR) / "data_f32.svs").load();
auto parameters = svs::index::vamana::VamanaBuildParameters{};
// Proposed: pass the uncompressed data plus the desired dataset type (svs::SQ8).
// Internally, SVS falls back to uncompressed data if scalar quantization fails.
svs::Vamana index = svs::Vamana::build<float>(
    parameters, loaded, svs::distance::DistanceL2(), num_threads, svs::SQ8
);
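To make the runtime-fallback idea concrete, here is a minimal, library-independent sketch of how the dataset kind could be chosen at runtime. Everything in it (DatasetKind, Fp32Data, Sq8Data, try_quantize_sq8, select_dataset) is a hypothetical name invented for illustration, not part of the SVS API, and the quantizer is a toy symmetric int8 scheme rather than SVS's scalar quantization.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <optional>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical stand-ins for this sketch; none of these are SVS types.
enum class DatasetKind { FP32, SQ8 };

struct Fp32Data {
    std::vector<float> values;
};

struct Sq8Data {
    std::vector<std::int8_t> codes;
    float scale; // original value is approximately code * scale
};

// The index would hold one of these alternatives, chosen at runtime.
using AnyDataset = std::variant<Fp32Data, Sq8Data>;

// Toy symmetric int8 quantizer. Returns std::nullopt when the input cannot be
// quantized here (empty input, non-finite values, or all zeros).
inline std::optional<Sq8Data> try_quantize_sq8(const std::vector<float>& values) {
    float max_abs = 0.0f;
    for (float v : values) {
        if (!std::isfinite(v)) {
            return std::nullopt;
        }
        max_abs = std::max(max_abs, std::fabs(v));
    }
    if (values.empty() || max_abs == 0.0f) {
        return std::nullopt;
    }
    Sq8Data out{{}, max_abs / 127.0f};
    out.codes.reserve(values.size());
    for (float v : values) {
        // |v| <= max_abs, so the rounded quotient stays within [-127, 127].
        out.codes.push_back(static_cast<std::int8_t>(std::lround(v / out.scale)));
    }
    return out;
}

// Select the internal storage from a runtime parameter, falling back to FP32
// when the requested compression fails.
inline AnyDataset select_dataset(const std::vector<float>& values, DatasetKind kind) {
    if (kind == DatasetKind::SQ8) {
        if (auto sq = try_quantize_sq8(values)) {
            return std::move(*sq);
        }
    }
    return Fp32Data{values};
}
```

With a variant-like wrapper, the caller only passes a runtime tag, and the build path can inspect which alternative was actually selected (for example with std::holds_alternative) before constructing the index.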
Another advantage is that we could optionally build the graph from the uncompressed data rather than the compressed data. Building from compressed data typically gives lower-quality graphs, because compression introduces approximation errors that can degrade the graph structure during construction.
Note that when calling build this way, we can compress the data and build the graph in parallel, since the two steps are independent when the graph is built from the uncompressed data.
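The overlap of compression and graph construction could look roughly like the sketch below, which uses std::async from the standard library. The functions compress_stub and build_graph_stub and the BuiltIndex struct are hypothetical placeholders, not SVS code; they only stand in for the two independent steps that both read the same uncompressed input.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <future>
#include <utility>
#include <vector>

// Hypothetical placeholder for compression: rounds and clamps to int8.
inline std::vector<std::int8_t> compress_stub(const std::vector<float>& data) {
    std::vector<std::int8_t> codes;
    codes.reserve(data.size());
    for (float v : data) {
        long rounded = std::clamp(std::lround(v), -127L, 127L);
        codes.push_back(static_cast<std::int8_t>(rounded));
    }
    return codes;
}

// Hypothetical placeholder for graph construction: node i links to node i + 1,
// wrapping around at the end.
inline std::vector<std::size_t> build_graph_stub(const std::vector<float>& data) {
    std::vector<std::size_t> next(data.size());
    for (std::size_t i = 0; i < data.size(); ++i) {
        next[i] = (i + 1) % data.size();
    }
    return next;
}

struct BuiltIndex {
    std::vector<std::size_t> graph;
    std::vector<std::int8_t> codes;
};

// Both steps only read the uncompressed input, so they can run concurrently.
inline BuiltIndex build_in_parallel(const std::vector<float>& data) {
    // Start compression on another thread.
    auto codes = std::async(std::launch::async, [&data] { return compress_stub(data); });
    // Build the graph from the uncompressed data on the calling thread.
    auto graph = build_graph_stub(data);
    // Wait for compression to finish, then combine the two results.
    return BuiltIndex{std::move(graph), codes.get()};
}
```

In a real implementation, both steps would presumably share the index's thread pool rather than spawn a separate thread, so the split of threads between compression and graph construction would be a tuning decision.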