Skip to content

Conversation

pkolaczk
Copy link

It is possible to insert non-equal vectors
which score the same by the provided similarity measure.
E.g. vectors (0.1, 0.1), (0.2, 0.2), (0.3, 0.3) all
are really the same point under cosine metric
and any pair of those would score 1.0 similarity.
This edge case caused some serious issues with graph
connectivity and the queries returned at most 33 nodes,
even if for large graphs.

This PR fixes it by improving fairness of node selection
when nodes score the same. This is achieved by a
small modification to how node ids are encoded
in the NodeQueue. When nodes score the same, they were compared
by node ids, which always preferred the nodes added earlier.
That created a huge bias. If we reverse the bits
of node ids, now this shuffles their order and breaks the
systematic bias towards the older nodes.

Another part of the fix is making sure nodes with the same
score don't block backlinks to be formed. By placing new neighbours
before the neighbours with the same score, we're giving
them a chance to be linked to. It will drop some other neighbor
from the list, but considering it was present in the graph for some
time already, it is much more likely to be already well connected.

@pkolaczk pkolaczk marked this pull request as draft September 16, 2025 12:25
@pkolaczk pkolaczk force-pushed the fix-recall-on-score-ties branch from 26c0c8b to b62c809 Compare September 16, 2025 12:33
It is possible to insert non-equal vectors
which score the same by the provided similarity measure.
E.g. vectors (0.1, 0.1), (0.2, 0.2), (0.3, 0.3) all
are really the same point under cosine metric
and any pair of those would score 1.0 similarity.
This edge case caused some serious issues with graph
connectivity and the queries returned at most 33 nodes,
even if for large graphs.

This PR fixes it by improving fairness of node selection
when nodes score the same. This is achieved by a
small modification to how node ids are encoded
in the NodeQueue. When nodes score the same, they were compared
by node ids, which always preferred the nodes added earlier.
That created a huge bias. If we reverse the bits
of node ids, now this shuffles their order and breaks the
systematic bias towards the older nodes.

Another part of the fix is making sure nodes with the same
score don't block backlinks to be formed. By placing new neighbours
before the neighbours with the same score, we're giving
them a chance to be linked to. It will drop some other neighbor
from the list, but considering it was present in the graph for some
time already, it is much more likely to be already well connected.
@pkolaczk pkolaczk force-pushed the fix-recall-on-score-ties branch from b62c809 to aaf4136 Compare September 16, 2025 12:44
@pkolaczk pkolaczk marked this pull request as ready for review September 16, 2025 12:45
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes a critical issue where vectors with identical similarity scores could cause poor recall in graph connectivity. The fix involves improving fairness in node selection when scores are tied by reversing node ID bits instead of using raw node IDs, and modifying insertion order to place new neighbors before existing ones with the same score.

Key changes:

  • Modified node ID encoding in NodeQueue to use bit reversal instead of complement for fairer tie-breaking
  • Changed NodeArray insertion to use leftmost insertion point instead of rightmost for same-score nodes
  • Added comprehensive test coverage for the cosine similarity edge case

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

File Description
NodeQueue.java Updates node encoding to use Integer.reverse() instead of bitwise complement for fairer ordering
NodeArray.java Changes insertion logic to place new nodes before existing ones with same scores
TestVectorGraph.java Adds test case for vectors with identical cosine similarity scores
TestNodeArray.java Updates existing test expectations to match new tie-breaking behavior

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@pkolaczk pkolaczk marked this pull request as draft September 16, 2025 12:49
@pkolaczk
Copy link
Author

Even after those changes the number of results is sometimes less than 50% of the graph and the test is a bit flaky. I'm kinda reluctant to decrease the threshold in a test further before understanding fully why the graph still falls apart.
Maybe there is a better approach to fix it? Any ideas?

@pkolaczk
Copy link
Author

There is something fishy going on here. The tests are sometimes failing with returning even fewer vectors than 33; which is unexpected to me considering each vector has space to connect to 32 neighbors.

E.g. some return just 1 or 2 vectors. This happens only 1-10 / 1000 times.

So far I debugged those failed runs to have some issue when constructing the graph.
The searches are just constantly returning the same 0, 1 or 2 nodes:

Starting search at level 0 with 1 candidates; total graph size = 6
Search completed: visited 0, expanded 1 (1 at base layer)
Starting search at level 0 with 1 candidates; total graph size = 6
Search completed: visited 0, expanded 1 (1 at base layer)
Starting search at level 0 with 1 candidates; total graph size = 6
Starting search at level 0 with 1 candidates; total graph size = 9
Search completed: visited 0, expanded 1 (1 at base layer)
Search completed: visited 0, expanded 1 (1 at base layer)
Starting search at level 0 with 1 candidates; total graph size = 6
Search completed: visited 0, expanded 1 (1 at base layer)
Starting search at level 0 with 1 candidates; total graph size = 6
Search completed: visited 0, expanded 1 (1 at base layer)
Starting search at level 0 with 1 candidates; total graph size = 13
Search completed: visited 0, expanded 1 (1 at base layer)
Starting search at level 0 with 1 candidates; total graph size = 7
Search completed: visited 0, expanded 1 (1 at base layer)
Starting search at level 0 with 1 candidates; total graph size = 9
Starting search at level 0 with 1 candidates; total graph size = 15
Search completed: visited 0, expanded 1 (1 at base layer)
Starting search at level 0 with 1 candidates; total graph size = 16
Search completed: visited 0, expanded 1 (1 at base layer)
Starting search at level 0 with 1 candidates; total graph size = 17
Search completed: visited 0, expanded 1 (1 at base layer)
Starting search at level 0 with 1 candidates; total graph size = 10
Starting search at level 0 with 1 candidates; total graph size = 18
Search completed: visited 0, expanded 1 (1 at base layer)
Starting search at level 0 with 1 candidates; total graph size = 11
Search completed: visited 0, expanded 1 (1 at base layer)
Starting search at level 0 with 1 candidates; total graph size = 12
Starting search at level 0 with 1 candidates; total graph size = 20
Search completed: visited 0, expanded 1 (1 at base layer)
Starting search at level 0 with 1 candidates; total graph size = 21
Search completed: visited 0, expanded 1 (1 at base layer)
Starting search at level 0 with 1 candidates; total graph size = 14
Starting search at level 0 with 1 candidates; total graph size = 22
Search completed: visited 0, expanded 1 (1 at base layer)
Search completed: visited 0, expanded 1 (1 at base layer)
Search completed: visited 0, expanded 1 (1 at base layer)
Starting search at level 0 with 1 candidates; total graph size = 19
Search completed: visited 0, expanded 1 (1 at base layer)
Starting search at level 0 with 1 candidates; total graph size = 25
Search completed: visited 0, expanded 1 (1 at base layer)
Starting search at level 0 with 1 candidates; total graph size = 26
Search completed: visited 0, expanded 1 (1 at base layer)
Starting search at level 0 with 1 candidates; total graph size = 27
Search completed: visited 0, expanded 1 (1 at base layer)
Starting search at level 0 with 1 candidates; total graph size = 28
Starting search at level 0 with 1 candidates; total graph size = 29
Search completed: visited 0, expanded 1 (1 at base layer)
Search completed: visited 0, expanded 1 (1 at base layer)
Search completed: visited 0, expanded 1 (1 at base layer)
Starting search at level 0 with 1 candidates; total graph size = 23
Search completed: visited 0, expanded 1 (1 at base layer)
Starting search at level 0 with 1 candidates; total graph size = 24
Starting search at level 0 with 1 candidates; total graph size = 33
Search completed: visited 0, expanded 1 (1 at base layer)
Search completed: visited 0, expanded 1 (1 at base layer)
Search completed: visited 0, expanded 1 (1 at base layer)
Starting search at level 0 with 1 candidates; total graph size = 30
Starting search at level 0 with 1 candidates; total graph size = 36
Search completed: visited 0, expanded 1 (1 at base layer)
Starting search at level 0 with 1 candidates; total graph size = 37
Search completed: visited 0, expanded 1 (1 at base layer)
Starting search at level 0 with 1 candidates; total graph size = 31
Search completed: visited 0, expanded 1 (1 at base layer)
Starting search at level 0 with 1 candidates; total graph size = 33
Search completed: visited 0, expanded 1 (1 at base layer)
Starting search at level 0 with 1 candidates; total graph size = 39
Search completed: visited 0, expanded 1 (1 at base layer)
Starting search at level 0 with 1 candidates; total graph size = 41
Search completed: visited 0, expanded 1 (1 at base layer)
Starting search at level 0 with 1 candidates; total graph size = 34
Search completed: visited 0, expanded 1 (1 at base layer)
Starting search at level 0 with 1 candidates; total graph size = 43
Search completed: visited 0, expanded 1 (1 at base layer)
Starting search at level 0 with 1 candidates; total graph size = 44
Search completed: visited 0, expanded 1 (1 at base layer)
Starting search at level 0 with 1 candidates; total graph size = 45
Search completed: visited 0, expanded 1 (1 at base layer)
Starting search at level 0 with 1 candidates; total graph size = 46
Search completed: visited 0, expanded 1 (1 at base layer)
Starting search at level 0 with 1 candidates; total graph size = 47
Search completed: visited 0, expanded 1 (1 at base layer)
Starting search at level 0 with 1 candidates; total graph size = 48
Search completed: visited 0, expanded 1 (1 at base layer)
Starting search at level 0 with 1 candidates; total graph size = 49
Search completed: visited 0, expanded 1 (1 at base layer)
Starting search at level 0 with 1 candidates; total graph size = 35
Search completed: visited 0, expanded 1 (1 at base layer)
Search completed: visited 0, expanded 1 (1 at base layer)
Starting search at level 0 with 1 candidates; total graph size = 38
Search completed: visited 0, expanded 1 (1 at base layer)
Starting search at level 0 with 1 candidates; total graph size = 40
Search completed: visited 0, expanded 1 (1 at base layer)
Starting search at level 0 with 1 candidates; total graph size = 42
Search completed: visited 0, expanded 1 (1 at base layer)
Starting search at level 0 with 1 candidates; total graph size = 50
Search completed: visited 0, expanded 1 (1 at base layer)
Starting search at level 0 with 1 candidates; total graph size = 50
Visiting neighbor 0 of node 15 at level 0: node 47
Visiting neighbor 0 of node 47 at level 0: node 15
Already visited
Search completed: visited 1, expanded 2 (2 at base layer)

java.lang.AssertionError: Should return almost all vectors, expected at least: 25, got: 2

@tlwillke tlwillke self-requested a review September 17, 2025 16:48
@tlwillke
Copy link
Collaborator

@pkolaczk we are investigating a potentially related issue. Let's wait on this for a minute.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants