Add SEP for Alignment of SPARQL Functions with ISO SQL Standard #214
Open: domel wants to merge 2 commits into w3c:main from domel:sep-0010
@@ -0,0 +1,132 @@
## Alignment of SPARQL Built-in Functions with ISO SQL Standard Functions

## Short name
SPARQL-SQL-FUNCTIONS

## SEP number
SEP-10

## Authors
Dominik Tomaszuk (University of Bialystok)

## Abstract
SPARQL 1.1 defines a limited set of built-in functions for string manipulation, numeric operations, date/time handling, and conditional logic. However, many commonly used functions standardized in ISO/IEC 9075:2023 (SQL:2023) are not currently available in SPARQL. This SEP proposes extending SPARQL with additional non-aggregate functions from the SQL standard to improve interoperability, completeness, and usability. Functions such as `TRIM`, `LPAD`, `RPAD`, `MOD`, `POWER`, `SQRT`, `EXP`, `LOG`, `DATE_ADD`, `TIMESTAMPDIFF`, `CASE`, `NULLIF`, `GREATEST`, and `LEAST` are widely used in database query processing but lack equivalents in SPARQL. By introducing these functions, SPARQL can align better with existing standards, reduce the learning curve for developers, and provide richer query expressivity for RDF data.

## Motivation
SPARQL 1.1 (2013) provides only a minimal set of built-in functions compared to SQL.
Key limitations include:
- Missing string manipulation functions (`TRIM`, `LPAD`, `RPAD`, `POSITION`).
- Missing numeric/math functions (`MOD`, `POWER`, `SQRT`, `EXP`, `LOG`, `SIN`, `COS`, `TAN`).
- Limited date/time support (no `DATE_ADD`, `TIMESTAMPDIFF`, or `INTERVAL` arithmetic).
- Missing conditional/logical functions (`CASE`, `NULLIF`).
- No generalized comparative functions (`GREATEST`, `LEAST`).

These gaps limit SPARQL’s usability in data integration and analytics scenarios where users expect similar functionality to SQL. They also complicate interoperability in hybrid systems where RDF data is queried alongside relational databases.

Scope: This change affects the **SPARQL functions and operators specification**, not the core query language semantics.

## Rationale and Alternatives
Rationale:
- **Interoperability**: SQL (ISO/IEC 9075:2023) is the most widely deployed query language. Aligning SPARQL functions with SQL reduces friction in adopting SPARQL.
- **Developer familiarity**: Many practitioners know SQL but not SPARQL. Familiar function names and semantics ease adoption.
- **Expressivity**: The missing functions require complex workarounds or external processing in current SPARQL.

Alternatives considered:
1. Keep SPARQL minimal and rely on external application logic.
2. Define SPARQL-only extensions with new function names.
3. Adopt ISO SQL function names directly to ensure compatibility.

This SEP recommends option (3) for consistency with established standards.

## Evidence of consensus
- Multiple research works and developer reports highlight frustration with missing SPARQL functions.
- W3C Community Group discussions on SPARQL 1.2 already acknowledge gaps in function support.
- SQL alignment (ISO/IEC 9075:2023) has been proposed informally in workshops and mailing lists.

## Specification
The following new functions are proposed to be added to SPARQL:

### String functions
- `TRIM(string)`, `LTRIM(string)`, `RTRIM(string)`
- `LPAD(string, length, padchar)`
- `RPAD(string, length, padchar)`
- `POSITION(substring IN string)`
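
The string functions above could behave roughly as in the following Python sketch. It assumes SQL-style 1-based positions, a `POSITION` result of 0 when the substring is absent, trimming of the space character only, and truncation when the input is already longer than the requested length. The lowercase helper names are hypothetical, and the truncation rule is an assumption that this SEP does not state.

```python
def trim(s: str) -> str:
    """TRIM(string): remove leading and trailing spaces (assumption: U+0020 only)."""
    return s.strip(" ")


def ltrim(s: str) -> str:
    """LTRIM(string): remove leading spaces only."""
    return s.lstrip(" ")


def rtrim(s: str) -> str:
    """RTRIM(string): remove trailing spaces only."""
    return s.rstrip(" ")


def _padding(current: int, length: int, padchar: str) -> str:
    """Build exactly (length - current) characters by repeating padchar."""
    if not padchar:
        raise ValueError("padchar must not be empty")
    needed = length - current
    return (padchar * needed)[:needed]


def lpad(s: str, length: int, padchar: str = " ") -> str:
    """LPAD(string, length, padchar): pad on the left, or truncate (assumption)."""
    if length <= len(s):
        return s[:max(length, 0)]
    return _padding(len(s), length, padchar) + s


def rpad(s: str, length: int, padchar: str = " ") -> str:
    """RPAD(string, length, padchar): pad on the right, or truncate (assumption)."""
    if length <= len(s):
        return s[:max(length, 0)]
    return s + _padding(len(s), length, padchar)


def position(substring: str, string: str) -> int:
    """POSITION(substring IN string): 1-based index of the first match, 0 if none."""
    return string.find(substring) + 1
```

For instance, `lpad("42", 5, "0")` gives `"00042"` and `position("b", "abc")` gives `2`.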

### Numeric functions
- `MOD(numeric, numeric)`
- `POWER(x, y)`
- `SQRT(x)`
- `EXP(x)`
- `LN(x)`, `LOG10(x)`
- `SIN(x)`, `COS(x)`, `TAN(x)`
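
One detail an implementation of the numeric functions above would need to settle is the sign of `MOD` for negative operands: in SQL the result takes the sign of the dividend, which differs from the `%` operator in some host languages. The Python sketch below illustrates that convention. The function names are hypothetical, and returning `None` for a zero divisor or out-of-domain input is only a stand-in for whatever error behavior SPARQL would adopt.

```python
import math


def sql_mod(n, m):
    """MOD(n, m) with the result carrying the sign of the dividend n.

    Returns None for a zero divisor (assumption: stands in for an error).
    """
    if m == 0:
        return None
    remainder = abs(n) % abs(m)
    return -remainder if n < 0 else remainder


def sql_sqrt(x):
    """SQRT(x); None for negative input (assumption: stands in for an error)."""
    return math.sqrt(x) if x >= 0 else None


def sql_ln(x):
    """LN(x), the natural logarithm; None for non-positive input (assumption)."""
    return math.log(x) if x > 0 else None
```

With these definitions `sql_mod(-7, 3)` is `-1`, whereas Python's `-7 % 3` is `2`, so a direct mapping onto the host language's operator would not be enough.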

### Date/Time functions
- `DATE_ADD(date, interval)`
- `TIMESTAMPDIFF(unit, t1, t2)`
- Support for `INTERVAL` literals (e.g., `INTERVAL '7' DAY`)
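
The date/time functions above might work along the lines of this Python sketch. It is restricted to fixed-length units (`SECOND`, `MINUTE`, `HOUR`, `DAY`), models an interval as an amount plus a unit, and assumes that `TIMESTAMPDIFF(unit, t1, t2)` returns `t2 - t1` in whole units, truncated toward zero. The argument order, the truncation rule, and the helper names are all assumptions; calendar units such as `MONTH` and `YEAR` are left out because their length varies.

```python
from datetime import datetime, timedelta

# Fixed-length units only; MONTH and YEAR need calendar arithmetic.
_UNIT_SECONDS = {"SECOND": 1, "MINUTE": 60, "HOUR": 3600, "DAY": 86400}


def date_add(value: datetime, amount: int, unit: str) -> datetime:
    """DATE_ADD(value, INTERVAL 'amount' unit) for fixed-length units."""
    return value + timedelta(seconds=amount * _UNIT_SECONDS[unit])


def timestampdiff(unit: str, t1: datetime, t2: datetime) -> int:
    """Whole units from t1 to t2 (t2 - t1), truncated toward zero (assumption)."""
    delta = t2 - t1
    # Work in integer microseconds to avoid floating-point rounding.
    total_us = (delta.days * 86400 + delta.seconds) * 10**6 + delta.microseconds
    unit_us = _UNIT_SECONDS[unit] * 10**6
    whole = abs(total_us) // unit_us
    return whole if total_us >= 0 else -whole
```

Under these assumptions, `INTERVAL '7' DAY` added to 1 January 2024 corresponds to `date_add(datetime(2024, 1, 1), 7, "DAY")`, which yields 8 January 2024.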

### Conditional and logical functions
- `CASE WHEN ... THEN ... ELSE ... END`
- `NULLIF(x, y)`
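
The conditional functions above can be sketched in Python as follows, with `None` standing in for an unbound value (SPARQL has no `NULL`). The helper names are hypothetical, branches are passed as zero-argument callables so that only the selected result is evaluated, and Python's `==` is used in place of SPARQL's equality operator; all of these are assumptions made for illustration.

```python
UNBOUND = None  # assumption: None models an unbound (null-equivalent) value


def nullif(x, y):
    """NULLIF(x, y): unbound when x equals y, otherwise x."""
    return UNBOUND if x == y else x


def case_when(branches, otherwise=None):
    """CASE WHEN c1 THEN r1 ... ELSE e END.

    branches is a list of (condition, result) pairs of zero-argument
    callables. The first condition that evaluates to true selects its
    result; later branches are not evaluated. Without a matching branch
    and without an ELSE part, the value is unbound.
    """
    for condition, result in branches:
        if condition():
            return result()
    return otherwise() if otherwise is not None else UNBOUND
```

A two-branch `CASE` corresponds to the existing SPARQL 1.1 `IF(condition, then, else)` function; the proposed form mainly helps once several `WHEN` branches are chained.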

### Comparative functions
- `GREATEST(x1, x2, …)`
- `LEAST(x1, x2, …)`
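
A Python sketch of the comparative functions above follows. It assumes that any unbound argument (modeled as `None`) makes the whole result unbound, which is one of the conventions found among SQL engines; this SEP does not settle the point, and some engines skip such arguments instead. The function names are hypothetical, and Python's built-in ordering stands in for SPARQL's comparison operators.

```python
def greatest(*args):
    """GREATEST(x1, x2, ...): largest argument, or None if any is unbound."""
    if not args:
        raise TypeError("GREATEST requires at least one argument")
    if any(a is None for a in args):
        return None  # assumption: unbound input propagates to the result
    return max(args)


def least(*args):
    """LEAST(x1, x2, ...): smallest argument, or None if any is unbound."""
    if not args:
        raise TypeError("LEAST requires at least one argument")
    if any(a is None for a in args):
        return None  # assumption: unbound input propagates to the result
    return min(args)
```

Whichever rule is chosen for unbound arguments, it seems worth stating explicitly in the specification, since results would otherwise differ across engines.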

Each function should follow ISO/IEC 9075:2023 semantics, adapted for RDF datatypes (notably `xsd:dateTime`, `xsd:decimal`, etc.).

## Backwards Compatibility
- No impact on existing queries: all proposed functions are new additions.
- Existing SPARQL functions (`STRLEN`, `UCASE`, `LCASE`, etc.) remain valid.
- Where a proposed function overlaps with an existing SPARQL function (e.g., `CONCAT`), the existing SPARQL semantics, which are already aligned with SQL, continue to apply.

## Tests and Implementations
- Test cases must cover typical inputs, edge cases (e.g., empty strings, NaN, null-equivalent values), and datatype conversions.
- Prototype implementations could be built on top of Apache Jena ARQ and RDF4J.
- Alignment tests should compare outputs against equivalent SQL queries on relational backends.
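
An alignment test of the kind described in the last bullet might look like the Python sketch below. It uses the in-memory SQLite database from Python's standard library as a stand-in relational backend and compares it with hypothetical candidate implementations (`candidate_trim`, `candidate_nullif`). SQLite is not a complete ISO/IEC 9075:2023 implementation, so a real suite would probably target a more conformant engine and the actual SPARQL prototype.

```python
import sqlite3


def candidate_trim(s):
    """Hypothetical candidate implementation of TRIM(string)."""
    return s.strip(" ")


def candidate_nullif(x, y):
    """Hypothetical candidate implementation of NULLIF(x, y)."""
    return None if x == y else x


# Each case: SQL text with placeholders, parameters, candidate function.
CASES = [
    ("SELECT TRIM(?)", ("  padded  ",), candidate_trim),
    ("SELECT TRIM(?)", ("",), candidate_trim),
    ("SELECT NULLIF(?, ?)", (1, 1), candidate_nullif),
    ("SELECT NULLIF(?, ?)", (1, 2), candidate_nullif),
]


def run_alignment(cases):
    """Return the cases where the candidate and the SQL backend disagree."""
    mismatches = []
    conn = sqlite3.connect(":memory:")
    try:
        for sql, params, candidate in cases:
            expected = conn.execute(sql, params).fetchone()[0]
            actual = candidate(*params)
            if expected != actual:
                mismatches.append((sql, params, expected, actual))
    finally:
        conn.close()
    return mismatches
```

Calling `run_alignment(CASES)` returns a list of mismatches; an empty list means the candidates agreed with the backend on every case, including the empty-string and null-equivalent edge cases.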

---

## Appendix A: Function Mapping between SQL and SPARQL 1.1

| SQL Function | SPARQL 1.1 Equivalent |
|-----------------|------------------------|
| LENGTH | STRLEN |
| TRIM | |
| LTRIM | |
| RTRIM | |
| LPAD | |
| RPAD | |
| POSITION | |
| UPPER | UCASE |
| LOWER | LCASE |
| SUBSTRING | SUBSTR |
| CONCAT | CONCAT |
| REPLACE | REPLACE |
| REGEXP_MATCHES | REGEX |
| ABS | ABS |
| MOD | |
| CEIL / CEILING | CEIL |
| FLOOR | FLOOR |
| ROUND | ROUND |
| EXP | |
| LN | |
| LOG10 | |
| POWER | |
| SQRT | |
| SIN | |
| COS | |
| TAN | |
| CURRENT_TIMESTAMP | NOW |
| EXTRACT | YEAR, MONTH, DAY, HOURS, MINUTES, SECONDS |
| INTERVAL | |
| DATE_ADD | |
| TIMESTAMPDIFF | |
| CASE | |
| COALESCE | COALESCE |
| NULLIF | |
| GREATEST | |
| LEAST | |
| CAST | STR(), xsd:type(...) |
| CURRENT_USER | |
Isn't it already covered by `xsd:duration` that is supported by a few implementations?

`xsd:duration` is indeed supported in some implementations, but it only provides the datatype. What is missing are standardized functions and operators (e.g., `DATE_ADD`, `TIMESTAMPDIFF`) that make such durations practically usable within queries across engines.