blaginin
Member

@blaginin blaginin commented Jul 29, 2025

Closes #1424

@blaginin blaginin self-assigned this Jul 29, 2025
@blaginin blaginin changed the title from "ArrayEquals kernel" to "feat:ArrayEquals kernel" Jul 29, 2025
@blaginin blaginin changed the title from "feat:ArrayEquals kernel" to "feat: ArrayEquals kernel" Jul 29, 2025
@blaginin blaginin added the feature label Jul 29, 2025
# Conflicts:
#	vortex-array/src/compute/mod.rs

cloudflare-workers-and-pages bot commented Jul 29, 2025

Deploying vortex-bench with Cloudflare Pages

Latest commit: a907efe
Status: ✅  Deploy successful!
Preview URL: https://15d5d328.vortex-93b.pages.dev
Branch Preview URL: https://db-kernel-eq.vortex-93b.pages.dev


blaginin and others added 2 commits July 29, 2025 20:27
Signed-off-by: blaginin <[email protected]>
- Change array_equals to take two arrays as arguments
- Fix imports and add missing 'use' keyword
- Update ArrayEqualsArgs to expect 2 inputs instead of 3
- Use ARRAY_EQUALS_FN instead of IS_SORTED_FN

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
@coveralls

coveralls commented Jul 29, 2025

Coverage Status

coverage: 82.71% (+0.03%) from 82.684%
when pulling 9b2c808 on db/kernel-eq
into 6fb0f3e on develop.

Claude and others added 6 commits July 29, 2025 20:44
- Add chunked comparison with early exit for performance
- Use constant array checks to quickly detect differences
- Add fallback to canonical arrays for non-canonical inputs
- Use 64K batch size for chunked processing
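
The chunked comparison with early exit described in this commit can be illustrated on plain slices. This is only a sketch under stated assumptions: `chunked_equals` and `DEFAULT_BATCH_SIZE` are hypothetical names, and the PR's own code operates on Vortex arrays rather than `&[T]`.

```rust
/// Hypothetical default batch size (64K elements), mirroring the commit note.
const DEFAULT_BATCH_SIZE: usize = 64 * 1024;

/// Compare two slices batch by batch, stopping at the first batch that differs.
/// `batch_size` must be non-zero.
fn chunked_equals<T: PartialEq>(lhs: &[T], rhs: &[T], batch_size: usize) -> bool {
    assert!(batch_size > 0, "batch_size must be non-zero");
    // Arrays of different lengths can never be equal.
    if lhs.len() != rhs.len() {
        return false;
    }
    // Equal lengths guarantee both slices are split at the same boundaries.
    lhs.chunks(batch_size)
        .zip(rhs.chunks(batch_size))
        // `all` short-circuits, so later batches are skipped after a mismatch.
        .all(|(a, b)| a == b)
}
```

A call such as `chunked_equals(&a, &b, DEFAULT_BATCH_SIZE)` only touches the batches up to and including the first one that differs, which is the point of the early exit.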

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
- Treat NULL == NULL as true for array equality
- Check individual comparison results when not constant
- Add proper null handling in chunked comparison
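
To make the NULL == NULL rule concrete, the sketch below contrasts SQL-style element comparison (where a NULL operand yields NULL) with the array-equality rule this commit describes. The function names and the use of `Option<i32>` to model nullable values are assumptions for illustration, not the PR's code.

```rust
/// SQL-style element comparison: any NULL operand yields NULL (`None`).
fn sql_eq(a: Option<i32>, b: Option<i32>) -> Option<bool> {
    match (a, b) {
        (Some(x), Some(y)) => Some(x == y),
        _ => None,
    }
}

/// Array-equality element comparison: NULL equals NULL,
/// and NULL never equals a non-null value.
fn array_eq_element(a: Option<i32>, b: Option<i32>) -> bool {
    match (a, b) {
        (None, None) => true,
        (Some(x), Some(y)) => x == y,
        _ => false,
    }
}

/// Whole-array equality built on the element rule above.
fn nullable_arrays_equal(lhs: &[Option<i32>], rhs: &[Option<i32>]) -> bool {
    lhs.len() == rhs.len()
        && lhs.iter().zip(rhs).all(|(a, b)| array_eq_element(*a, *b))
}
```

Under the SQL rule the comparison result for a NULL position is itself NULL, so an equality check has to inspect validity separately; the array-equality rule folds that into a plain boolean.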

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Major changes:
- Extract chunked comparison logic into separate functions
- Move statistics checking into dedicated helper
- Add configurable batch_size parameter (with 64K default)
- Split comparison result checking into modular functions
- Add comprehensive tests including float precision and chunked arrays

Functions added:
- compare_chunked(): Main chunked comparison with configurable batch size
- compare_batch(): Single batch comparison
- check_constant_result(): Handle constant comparison results
- check_non_constant_result(): Handle element-wise checking with stats optimization
- check_comparison_stats(): Quick rejection via min/max stats
- check_null_equality(): Proper null comparison handling
- check_stats_equality(): Early exit via statistics comparison
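
The statistics-based early exit can be sketched as follows. The `Stats` struct and `stats_rule_out_equality` are hypothetical stand-ins, not the helpers listed above; the one property the sketch relies on is that differing statistics prove inequality, while matching statistics prove nothing.

```rust
/// Hypothetical per-array statistics; `None` means the statistic is unavailable.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Stats {
    min: Option<i64>,
    max: Option<i64>,
    null_count: Option<usize>,
}

/// True only when both values are known and they disagree.
fn differs<T: PartialEq>(a: Option<T>, b: Option<T>) -> bool {
    match (a, b) {
        (Some(x), Some(y)) => x != y,
        _ => false,
    }
}

/// Returns `Some(false)` when the statistics prove the arrays differ, and
/// `None` when they are inconclusive and a full comparison is still needed.
fn stats_rule_out_equality(lhs: &Stats, rhs: &Stats) -> Option<bool> {
    if differs(lhs.min, rhs.min)
        || differs(lhs.max, rhs.max)
        || differs(lhs.null_count, rhs.null_count)
    {
        return Some(false);
    }
    None
}
```

The function never returns `Some(true)`: two arrays can share min, max, and null count and still differ element by element.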

Tests added:
- test_float_precision(): Verify float comparison behavior
- test_batch_size_functionality(): Test large array handling
- test_primitive_vs_dict_array(): Test different encodings

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Add missing patterns found in compare.rs, filter.rs, and cast.rs:

- Add debug logging for fallback operations
- Add constant null array handling (NULL constants equal only to NULL constants)
- Add TODO comment for future floating-point optimizations
- Add comprehensive test for constant null arrays

These changes align array_equals with established patterns in the codebase
and improve debuggability and correctness for edge cases.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
…ations

- Add early return for empty arrays regardless of type
- Implement comprehensive constant array optimization with statistics checks
- Use min/max statistics to rule out equality for non-null constants
- Add null count statistics check for null constant comparisons
- Remove redundant empty array and constant null checks
- Add mixed constant/non-constant array test coverage
- Fix clippy warnings by removing unnecessary clones
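
A rough illustration of the empty-array early return and the constant-array cases, assuming a hypothetical two-variant `Array` enum with a single element type. The real implementation works with Vortex encodings and statistics, which this sketch does not model.

```rust
/// Hypothetical array with two encodings: one repeated value, or explicit values.
enum Array {
    Constant { value: Option<i64>, len: usize },
    Values(Vec<Option<i64>>),
}

impl Array {
    fn len(&self) -> usize {
        match self {
            Array::Constant { len, .. } => *len,
            Array::Values(v) => v.len(),
        }
    }
}

fn arrays_equal(lhs: &Array, rhs: &Array) -> bool {
    if lhs.len() != rhs.len() {
        return false;
    }
    // Two empty arrays are equal whatever their encoding.
    if lhs.len() == 0 {
        return true;
    }
    match (lhs, rhs) {
        // Two constants: a single scalar comparison decides the result.
        (Array::Constant { value: a, .. }, Array::Constant { value: b, .. }) => a == b,
        // Constant vs. explicit values: every element must equal the constant.
        (Array::Constant { value, .. }, Array::Values(v))
        | (Array::Values(v), Array::Constant { value, .. }) => v.iter().all(|x| x == value),
        // Two explicit arrays: element-wise comparison.
        (Array::Values(a), Array::Values(b)) => a == b,
    }
}
```

Because `Option`'s `==` treats `None == None` as true, a null constant here equals only arrays whose elements are all null, in line with the null handling described in the earlier commits.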

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
This test demonstrates that arrays containing -0.0 and +0.0 are currently
treated as unequal, even though IEEE 754 comparison considers the two
values equal (-0.0 == +0.0 returns true).

The failing test shows a correctness issue in the array equality
implementation that needs to be addressed.
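
The difference between the two notions of float equality can be shown with the standard library alone. The commit does not say why the test fails; comparing bit patterns instead of values is one plausible cause and is used here purely as an assumption for illustration. All function names are hypothetical.

```rust
/// Bitwise comparison: distinguishes -0.0 from +0.0 and treats
/// identical NaN bit patterns as equal.
fn bitwise_eq(a: f64, b: f64) -> bool {
    a.to_bits() == b.to_bits()
}

/// IEEE 754 comparison: -0.0 == +0.0 is true, NaN == NaN is false.
fn ieee_eq(a: f64, b: f64) -> bool {
    a == b
}

/// Array equality parameterized over the element comparison.
fn float_arrays_equal(lhs: &[f64], rhs: &[f64], eq: fn(f64, f64) -> bool) -> bool {
    lhs.len() == rhs.len() && lhs.iter().zip(rhs).all(|(a, b)| eq(*a, *b))
}
```

With `ieee_eq`, `[1.0, -0.0]` and `[1.0, 0.0]` compare equal; with `bitwise_eq` they do not. The two rules also disagree on NaN, so whichever one an array equality adopts needs to be a deliberate choice.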

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
@robert3005
Contributor

I always found this weird to be a kernel. I think once we land #4188 this will be easier to implement in a reasonable way. Otherwise I would define this as PartialEq for Canonical since you really shouldn't have to dispatch this to the encoding

@blaginin
Member Author

Agreed! Joe and Nick mentioned the kernel trait may change quite a bit, so holding this off until it's merged

@joseph-isaacs
Contributor

I would say define PartialEq for Canonical. I think this will be useful.
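
For readers unfamiliar with the suggestion, here is a minimal sketch of what implementing `PartialEq` on a canonical form could look like. The `Canonical` enum below is a hypothetical stand-in with three invented variants, not Vortex's actual type. The idea it illustrates is that once arrays are decoded to a canonical representation, equality needs no per-encoding dispatch.

```rust
/// Hypothetical stand-in for a canonical (fully decoded) array;
/// not Vortex's real `Canonical` type.
#[derive(Debug)]
enum Canonical {
    Bool(Vec<Option<bool>>),
    Int(Vec<Option<i64>>),
    Utf8(Vec<Option<String>>),
}

impl PartialEq for Canonical {
    fn eq(&self, other: &Self) -> bool {
        match (self, other) {
            (Canonical::Bool(a), Canonical::Bool(b)) => a == b,
            (Canonical::Int(a), Canonical::Int(b)) => a == b,
            (Canonical::Utf8(a), Canonical::Utf8(b)) => a == b,
            // Different variants hold different types, so they are never equal.
            _ => false,
        }
    }
}
```

With this in place, `==` and `assert_eq!` work directly on canonical values, and nulls compare equal to nulls because `Option`'s own `PartialEq` is used for the elements.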

Labels
feature
Development

Successfully merging this pull request may close these issues.

Implement array_equals compute fn and PartialEq for ArrayData
4 participants