pycottas

pycottas is a library for working with compressed RDF files in the COTTAS format. COTTAS stores triples as a triple table in Apache Parquet. pycottas is built on top of DuckDB and provides an HDT-like interface.

Features ✨

  • Compression and decompression of RDF files.
  • Querying COTTAS files with triple patterns.
  • RDFLib store backend for querying COTTAS files with SPARQL.
  • Supports RDF datasets (quads).
  • Can be used as a library or via command line.
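
To make the idea of a triple pattern concrete, here is a minimal, purely conceptual sketch in plain Python. It is not how pycottas is implemented (pycottas keeps the triple table in Parquet and queries it through DuckDB); the names `parse_pattern`, `search` and `table` are hypothetical and exist only for this illustration. It also assumes a simplified pattern syntax with only variables (`?x`) and IRIs (`<...>`), and no literals.

```python
# Conceptual sketch only: an in-memory triple table and a naive pattern scan.
# All names here are hypothetical and are not part of the pycottas API.
from typing import Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]
Pattern = Tuple[Optional[str], Optional[str], Optional[str]]

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def parse_pattern(text: str) -> Pattern:
    """Turn '?s <iri> ?o' into a 3-tuple where None marks a variable."""
    terms = text.split()
    if len(terms) != 3:
        raise ValueError(f"expected 3 terms, got {len(terms)}: {text!r}")
    # Variables become None (wildcards); IRIs lose their angle brackets.
    s, p, o = (None if t.startswith("?") else t.strip("<>") for t in terms)
    return (s, p, o)


def search(table: List[Triple], pattern: Pattern) -> Iterator[Triple]:
    """Yield every triple whose bound positions equal the pattern's terms."""
    for triple in table:
        if all(p is None or p == t for p, t in zip(pattern, triple)):
            yield triple


# A tiny triple table: one row per triple, sorted for deterministic output.
table = sorted([
    ("http://example.org/bob", RDF_TYPE, "http://example.org/Person"),
    ("http://example.org/alice", RDF_TYPE, "http://example.org/Person"),
    ("http://example.org/alice", "http://example.org/knows", "http://example.org/bob"),
])

matches = list(search(table, parse_pattern(f"?s <{RDF_TYPE}> ?o")))
for triple in matches:
    print(triple)
```

The pattern `?s <...#type> ?o` leaves subject and object unbound, so the scan returns the two `rdf:type` rows and skips the `knows` row. A columnar engine answers the same question by filtering on the predicate column instead of looping over rows in Python.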

Documentation 📑

Read the documentation.

Getting Started 🚀

PyPI is the fastest way to install pycottas:

pip install pycottas

We recommend installing pycottas in a virtual environment.

import pycottas
from rdflib import Graph, URIRef

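# Compress an RDF file into a COTTAS file
pycottas.rdf2cottas('my_file.ttl', 'my_file.cottas', index='spo')
# Evaluate a triple pattern directly against the COTTAS file
res = pycottas.search('my_file.cottas', '?s <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?o')
print(res)
# Decompress the COTTAS file back into an RDF file
pycottas.cottas2rdf('my_file.cottas', 'my_file.nt')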

# COTTASDocument class for querying with triple patterns
cottas_doc = pycottas.COTTASDocument('my_file.cottas')
# the triple pattern can be a string (below) or a tuple of RDFLib terms
res = cottas_doc.search('?s <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?o')

# COTTASStore class for querying with SPARQL
graph = Graph(store=pycottas.COTTASStore('my_file.cottas'))
res = graph.query('''
  PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
  SELECT DISTINCT ?s ?o WHERE {
    ?s rdf:type ?o .
  } LIMIT 10''')
for row in res:
    print(row)

To run pycottas from the command line, see the documentation.

License 🔓

pycottas is available under the Apache License 2.0.

Author & Contact 📬

Universidad Politécnica de Madrid.

Citing 💬

If you use pycottas in your work, please cite the ISWC paper:

@inproceedings{arenas2025cottas,
  title     = {{COTTAS: Columnar Triple Table Storage for Efficient and Compressed RDF Management}},
  author    = {Arenas-Guerrero, Julián and Ferrada, Sebastián},
  booktitle = {Proceedings of the 24th International Semantic Web Conference, ISWC},
  year      = {2025},
}
