Conversation

ahuber21
Contributor

This PR adds a new class, SQDataset, which implements global scalar quantization.
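For context, global scalar quantization derives a single scale/bias pair from the dataset-wide min and max and maps every float onto the int8 range. Below is a minimal standalone sketch of the idea; the names are illustrative and this is not the actual SQDataset API:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Sketch of global scalar quantization: one scale/bias pair is derived
// from the global min/max of the whole dataset, then every float value
// is mapped onto the int8 code range. Names are illustrative, not SVS API.
struct SQParams {
    float scale; // (max_data - min_data) / (max_quant - min_quant)
    float bias;  // offset so that min_data maps to the smallest code
};

inline SQParams compute_params(const std::vector<float>& data) {
    auto [mn, mx] = std::minmax_element(data.begin(), data.end());
    constexpr float min_q = -128.0F; // int8 min
    constexpr float max_q = 127.0F;  // int8 max
    float scale = (*mx - *mn) / (max_q - min_q);
    float bias = *mn - min_q * scale;
    return {scale, bias};
}

inline std::int8_t quantize(float x, const SQParams& p) {
    float q = std::round((x - p.bias) / p.scale);
    q = std::clamp(q, -128.0F, 127.0F);
    return static_cast<std::int8_t>(q);
}

inline float dequantize(std::int8_t q, const SQParams& p) {
    return static_cast<float>(q) * p.scale + p.bias;
}
```

With data spanning [-127, 127], this yields scale = 254/255, matching the test discussed later in this thread.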

@ahuber21 ahuber21 force-pushed the dev/ahuber/default-quantization branch from 075f55f to 0b8f7e8 Compare March 21, 2025 14:51
@dian-lun-lin
Member

dian-lun-lin commented Apr 10, 2025

We need to make SQDataset resizable in order to make it work with the dynamic vamana index. The dynamic vamana index requires a resizable dataset because it calls compact and resize (see this). Please implement these two functions in SQDataset. I think calling data_.compact() and data_.resize() internally should be enough.

For reference, see the SimpleData compact and resize implementation.
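The delegation suggested above could look roughly like the following sketch. The class shape, member names, and the index-list compact signature are assumptions for illustration only; the real SVS interfaces differ:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Toy sketch of the suggested delegation: the quantized dataset forwards
// resize() and compact() to its underlying storage. Member names and the
// index-based compact signature are assumptions, not the SVS API.
class SQDatasetSketch {
  public:
    explicit SQDatasetSketch(std::vector<std::int8_t> data)
        : data_(std::move(data)) {}

    size_t size() const { return data_.size(); }

    // Grow or shrink the underlying storage.
    void resize(size_t new_size) { data_.resize(new_size); }

    // Keep only the entries listed in `ids`, in order,
    // packing them to the front (drops deleted slots).
    void compact(const std::vector<size_t>& ids) {
        std::vector<std::int8_t> packed;
        packed.reserve(ids.size());
        for (size_t id : ids) {
            packed.push_back(data_.at(id));
        }
        data_ = std::move(packed);
    }

  private:
    std::vector<std::int8_t> data_; // stands in for the real backing store
};
```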

@razdoburdin
Contributor

I have taken a look at the ARM CI failure. It looks like a problem with EVE. Is it possible to update EVE to the latest stable version? There is a chance that the problem is already solved on their side.

Please go ahead and update EVE. As long as it does not break our existing codebase, it's fine.

I have created a PR to the dev branch: #117

@ahuber21
Contributor Author

Thank you @razdoburdin for the PR against this branch. I've checked my usage of eve, and I don't think I should have used it in the fix_argument() function. I've replaced it with a simple std::reduce() and thereby removed the need to #include eve/algo.h, which hopefully fixes the build error on ARM.
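The swap described here, in miniature: a reduction that might previously have gone through eve's SIMD algorithms can be expressed with std::reduce from &lt;numeric&gt;, dropping the eve/algo.h include. The real fix_argument() signature is not shown in this thread, so this is only an analogous example:

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Minimal illustration of replacing a SIMD-library reduction with the
// standard library: std::reduce sums the elements without needing eve.
// The function name and shape here are hypothetical stand-ins.
inline float sum_components(const std::vector<float>& values) {
    // Unordered reduction; fine for a plain sum.
    return std::reduce(values.begin(), values.end(), 0.0F);
}
```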

@ibhati @dian-lun-lin as discussed, I reverted the re-compression changes (and the check for trivial data). My last action for this PR will be to clean up the RNG in scalar.cpp, which I'm hoping to finish before your AM so that we can merge quickly.

@ibhati ibhati requested a review from dian-lun-lin April 30, 2025 14:29
// Thread-local accumulators
std::vector<MinMaxAccumulator> tls(threadpool.size());

// Compute mean and squared sum
Member

Suggested change
// Compute mean and squared sum
// Compute min and max values in dataset

}
};

// operator to find global min and max in dataset
Member

Suggested change
// operator to find global min and max in dataset
// Operator to find global min and max in dataset

}
};
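The thread-local min/max reduction that these snippets come from follows a common pattern: each worker folds its slice of the data into a private accumulator, and the partial results are merged once all workers finish. A standalone sketch using plain std::thread in place of the SVS threadpool (names are illustrative):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <thread>
#include <vector>

// Each thread updates only its own accumulator, so no locking is needed
// during the scan; the partials are merged after join().
struct MinMaxAccumulator {
    float min = std::numeric_limits<float>::max();
    float max = std::numeric_limits<float>::lowest();

    void accumulate(float value) {
        min = std::min(min, value);
        max = std::max(max, value);
    }

    void merge(const MinMaxAccumulator& other) {
        min = std::min(min, other.min);
        max = std::max(max, other.max);
    }
};

inline MinMaxAccumulator
global_min_max(const std::vector<float>& data, size_t num_threads) {
    std::vector<MinMaxAccumulator> tls(num_threads); // thread-local accumulators
    std::vector<std::thread> workers;
    size_t chunk = (data.size() + num_threads - 1) / num_threads;
    for (size_t t = 0; t < num_threads; ++t) {
        workers.emplace_back([&, t] {
            size_t begin = t * chunk;
            size_t end = std::min(begin + chunk, data.size());
            for (size_t i = begin; i < end; ++i) {
                tls[t].accumulate(data[i]);
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    MinMaxAccumulator result;
    for (const auto& acc : tls) {
        result.merge(acc);
    }
    return result;
}
```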

// operator to compress a dataset using a threadpool
Member

Suggested change
// operator to compress a dataset using a threadpool
// Operator to compress a dataset using a threadpool

// Compute mean and squared sum
threads::parallel_for(
threadpool,
threads::DynamicPartition(data.size(), batch_size),
Member

Where did we get this batch_size value from? If there is no strong reason for using DynamicPartition, we can simply use StaticPartition as used in most of the code. See an example here, for instance: https://github.com/intel/ScalableVectorSearch/blob/main/include/svs/index/vamana/index.h#L553

Member

Can we use StaticPartition here as well?
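For readers outside the codebase, the distinction being discussed: a static partition assigns each worker one fixed contiguous slice up front, while a dynamic partition hands out batches on demand through a shared counter (useful when per-item cost is uneven, at the price of synchronization). A conceptual sketch of both strategies, not the SVS threads API:

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <utility>

// Static: worker t of n gets one contiguous slice, computed up front.
inline std::pair<size_t, size_t>
static_slice(size_t total, size_t t, size_t n) {
    size_t chunk = (total + n - 1) / n; // ceiling division
    size_t begin = std::min(t * chunk, total);
    size_t end = std::min(begin + chunk, total);
    return {begin, end};
}

// Dynamic: each worker repeatedly claims the next batch from a shared
// counter until the work runs out. Returns false when nothing is left.
inline bool dynamic_claim(std::atomic<size_t>& next, size_t total,
                          size_t batch, size_t& begin, size_t& end) {
    begin = next.fetch_add(batch);
    if (begin >= total) {
        return false;
    }
    end = std::min(begin + batch, total);
    return true;
}
```

Static partitioning avoids the shared counter entirely, which is why it tends to be the default where the per-item work is uniform.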

std::move(compressed), scale, bias};
}

/// @brief Compact the dataset
Member

Just to keep in mind: we need to add the Doxygen docstrings for the other functions and some more documentation for SQDataset in the next PR.

CATCH_REQUIRE(svs_test::prepare_temp_directory());
size_t num_threads = 2;

// use uncompressed reference results which should be identical
Member

Suggested change
// use uncompressed reference results which should be identical
// Use uncompressed reference results which should be identical

// Scale is calculated from (max_data - min_data) / (max_quant - min_quant)
// The dataset features values [-127, 127], the quantization range is given by the MIN
// and MAX elements of the provided type.
constexpr float MIN = std::numeric_limits<T>::min();
Member

Isn't the minimum value for int8 -128 instead of -127?

Contributor Author

Yes, but the dataset is symmetric. I'm not sure why, but it doesn't cause issues.

// and MAX elements of the provided type.
constexpr float MIN = std::numeric_limits<T>::min();
constexpr float MAX = std::numeric_limits<T>::max();
constexpr float exp_scale = 254.0F / float(MAX - MIN);
Member

Suggested change
constexpr float exp_scale = 254.0F / float(MAX - MIN);
constexpr float exp_scale = 255.0F / float(MAX - MIN);

Contributor Author

No, the dataset min and max values are -127 and 127, respectively. I could have used min() and max(), but figured I would use this knowledge about our test data in the test case.
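Spelling out the arithmetic behind this resolution: the test data spans [-127, 127] (hence 254 in the numerator, a property of the data), while the int8 quantization range spans [-128, 127] (hence 255 in the denominator, a property of the type):

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Data range: [-127, 127]  ->  max_data - min_data = 254
// Quant range: [-128, 127] ->  max_quant - min_quant = 255
constexpr float MIN = std::numeric_limits<std::int8_t>::min(); // -128
constexpr float MAX = std::numeric_limits<std::int8_t>::max(); //  127
// scale = (max_data - min_data) / (max_quant - min_quant) = 254 / 255
constexpr float exp_scale = 254.0F / (MAX - MIN);
```

This is why the reviewer's suggested 255.0F numerator would be wrong here: the numerator comes from the data, not from the type.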

// Calculations are performed in float everywhere and should therefore produce the exact
// same results
CATCH_REQUIRE(sq_dataset.get_scale() == exp_scale);
CATCH_REQUIRE(sq_dataset.get_bias() == exp_bias);
Member

Why are these checks not failing if the int8 minimum is -128? I'm not sure.

Contributor Author

Yeah, as explained above, these values are not determined from the compression range but rather from the data. So all good!

@ahuber21 ahuber21 requested a review from ibhati April 30, 2025 21:06
Member
@ibhati ibhati left a comment

Looks good, just see if we can use StaticPartition for the other instance as well.

// Compute mean and squared sum
threads::parallel_for(
threadpool,
threads::DynamicPartition(data.size(), batch_size),
Member

Can we use StaticPartition here as well?

@ahuber21
Contributor Author

ahuber21 commented May 5, 2025

@ibhati everything should be there now. Please update your "change requested" if you think we're ready to merge.

Member
@ibhati ibhati left a comment

Thanks @ahuber21 for enabling this capability in SVS.

@ibhati
Member

ibhati commented May 5, 2025

/intelci

@ahuber21 ahuber21 merged commit 94e24ba into main May 6, 2025
18 checks passed
@ahuber21 ahuber21 deleted the dev/ahuber/default-quantization branch May 6, 2025 07:09