@bevzzz bevzzz commented Aug 25, 2025

This PR adds the following data types:

  • boolean / boolean[]
  • number / number[]
  • date / date[]
  • uuid / uuid[]
  • text[]
  • integer[]

Create new property types

The collection creation API allows creating properties of each of these types via a familiar API.

collection -> collection.properties(Property.boolArray("flags"), Property.textArray("tags"))

Aggregation queries

Accordingly, new aggregation metrics are available:

Aggregate.bool(m -> m.percentageFalse().totalTrue())
Aggregate.number(m -> m.min().max().mean())
Aggregate.date(m -> m.median().mode())

New date format

While working on this PR I learned that java.util.Date is considered a legacy class that has been superseded by java.time.Instant & co. as of Java 8. The new family of date types in java.time is more versatile and flexible, which is why client6 now decodes dates into OffsetDateTime.

OffsetDateTime preserves a lot of the information from the timestamp and can be easily converted to Instant via OffsetDateTime::toInstant or Date via Date.from(instant).
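
The conversions mentioned above can be sketched with the JDK alone. This is a minimal illustration, not client6 code: the class name DateConversionSketch and the sample timestamp are made up for the example, and only standard java.time / java.util calls are used.

```java
import java.time.Instant;
import java.time.OffsetDateTime;
import java.util.Date;

// Hypothetical helper class, for illustration only; not part of client6.
final class DateConversionSketch {
    // The offset is dropped, the point on the timeline is kept.
    static Instant toInstant(OffsetDateTime value) {
        return value.toInstant();
    }

    // For APIs that still require the legacy java.util.Date.
    static Date toLegacyDate(OffsetDateTime value) {
        return Date.from(value.toInstant());
    }

    public static void main(String[] args) {
        // Made-up timestamp standing in for a decoded "date" property value.
        OffsetDateTime decoded = OffsetDateTime.parse("2025-08-25T13:04:00+02:00");

        System.out.println(decoded.getOffset()); // +02:00
        System.out.println(toInstant(decoded));  // 2025-08-25T11:04:00Z

        // Date and Instant describe the same millisecond on the timeline.
        System.out.println(
            toLegacyDate(decoded).getTime() == toInstant(decoded).toEpochMilli()); // true
    }
}
```

Note that the offset (+02:00 here) is only available on the OffsetDateTime; both Instant and Date discard it.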

New "number" format

Prior to this PR client6 used Number to represent values of the "number" properties. While technically correct (Number is the base class for all numeric values in Java) this was only coincidental. Using a more specific type, such as Double, is more accurate and corresponds to the actual value being sent over the wire (in protobuf those properties are represented with double).

The same goes for Integer -> Long. Weaviate returns int64 values, which are best represented as Long in the Java type system.
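
To illustrate why the concrete types matter, here is a small JDK-only sketch. It is not client6 code: the class NumericTypesSketch and the sample values are hypothetical. It shows that an int64 value can exceed the Integer range, and that going through Number forces the caller to pick a conversion that may silently lose data.

```java
// Hypothetical class, for illustration only; not part of client6.
final class NumericTypesSketch {
    // With a bare Number the caller has to choose a conversion;
    // intValue() narrows silently and can wrap around or truncate.
    static int narrowToInt(Number value) {
        return value.intValue();
    }

    public static void main(String[] args) {
        // Made-up int64 value above Integer.MAX_VALUE (2_147_483_647).
        Long int64Value = 3_000_000_000L;
        System.out.println(int64Value);              // 3000000000
        System.out.println(narrowToInt(int64Value)); // -1294967296

        // A "number" property travels as a double, so Double keeps the fraction.
        Double number = 12.5;
        System.out.println(number);                  // 12.5
        System.out.println(narrowToInt(number));     // 12
    }
}
```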

For consistency with the other clients, we follow the naming scheme of calling methods and types the way the data types are called in Weaviate ❗
For example, if I want to get the value of "percentageCarbs" for a group in the aggregation, I might do this:

Double pctCarbs = result.groups(0).groupedBy().number(); // "number" is the collection property type

which will return Double, because that's the underlying Java type.

bevzzz added 16 commits August 25, 2025 13:04
Switch from representing Weaviate dates as (legacy) Date
to the modern and more accurate OffsetDateTime, so that no information is lost.
This should disambiguate errors which otherwise appear to refer
to Java built-in data types. E.g. all numbers extend Number, and
yet an exception might say that 'amoung_long' is not a NUMBER property.
Use primitive types where possible.
Accept int where long is accepted.
Accept float where double is accepted.
List/Array properties only support Long/Double.

@orca-security-eu orca-security-eu bot left a comment

Orca Security Scan Summary

Check: Secrets
Status: Passed
Issues by priority: high 0, medium 0, low 0, info 0

@bevzzz bevzzz merged commit cf4d1cc into v6 Aug 26, 2025
2 checks passed
@bevzzz bevzzz deleted the v6-data-types branch August 26, 2025 09:05