Querying semantic data cubes

Querying with SPARQL

SPARQL is the standard query language for triple stores. The SPARQL 1.1 specification is available at https://www.w3.org/TR/sparql11-overview/. To learn SPARQL by example, see the tutorial "SPARQL by Example". For complete coverage, see the book “Learning SPARQL, 2nd edition” by Bob DuCharme, ISBN 978-1-449-37143-2.

As an example, suppose we want to query our Olympics data cube for all observations where numberofmedals is greater than 20. For each observation we want the country, year, sex, and number of medals, ordered by country and then by number of medals (descending).

The corresponding SPARQL query is:

prefix owl: <http://www.w3.org/2002/07/owl#> 
prefix void: <http://rdfs.org/ns/void#> 
prefix dcterms: <http://purl.org/dc/terms/> 
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> 
prefix dcat: <http://www.w3.org/ns/dcat#> 
prefix sdmx-dimension: <http://purl.org/linked-data/sdmx/2009/dimension#> 
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> 
prefix sdmx-attribute: <http://purl.org/linked-data/sdmx/2009/attribute#> 
prefix qb: <http://purl.org/linked-data/cube#> 
prefix skos: <http://www.w3.org/2004/02/skos/core#> 
prefix xsd: <http://www.w3.org/2001/XMLSchema#> 
prefix sdmx-concept: <http://purl.org/linked-data/sdmx/2009/concept#> 
SELECT DISTINCT ?country ?year ?sex ?number
WHERE {
  ?obs a qb:Observation ;
       sdmx-dimension:refArea    ?countryid ;
       sdmx-dimension:refPeriod  ?year ;
       sdmx-dimension:sex        ?sexid ;
       <https://example.org/ns/olympics#numberofmedals>  ?number .
  FILTER (?number > 20)
  ?countryid skos:prefLabel ?country .
  FILTER (lang(?country) = 'en')
  ?sexid skos:prefLabel ?sex .
  FILTER (lang(?sex) = 'en')
}
ORDER BY ?country desc(?number)
     

You can send this query over the SPARQL HTTP protocol to the SPARQL endpoint exposed by the triple store that holds the triples. The format of the endpoint URL depends on the triple store used. In a previous section, we deployed a SPARQL endpoint using RDF4J at http://localhost:8080/rdf4j-server/repositories/datacube_olympics (where datacube_olympics is the id of the repository).
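As a minimal sketch of what such a request looks like in code, the Python snippet below builds a SPARQL protocol POST using only the standard library. The helper names `build_sparql_request` and `run_query` are hypothetical, and the shortened query is a placeholder for the full query in this section; the endpoint URL is the one deployed earlier.

```python
import urllib.request

# Endpoint from the text; adjust host, port, and repository id to your setup.
ENDPOINT = "http://localhost:8080/rdf4j-server/repositories/datacube_olympics"

# Shortened placeholder query; substitute the full query from this section.
QUERY = """
PREFIX qb: <http://purl.org/linked-data/cube#>
SELECT ?obs WHERE { ?obs a qb:Observation } LIMIT 5
"""


def build_sparql_request(endpoint: str, query: str,
                         accept: str = "text/csv") -> urllib.request.Request:
    """Build a POST request that carries the unencoded query as the body."""
    return urllib.request.Request(
        endpoint,
        data=query.encode("utf-8"),
        headers={
            # Tells the endpoint that the body is a raw SPARQL query.
            "Content-Type": "application/sparql-query",
            # Requested result format.
            "Accept": accept,
        },
        method="POST",
    )


def run_query(endpoint: str, query: str) -> str:
    """Send the query and return the response body (needs a running endpoint)."""
    with urllib.request.urlopen(build_sparql_request(endpoint, query)) as response:
        return response.read().decode("utf-8")
```

With the RDF4J server running, `print(run_query(ENDPOINT, QUERY))` should print the result set as CSV. Nothing is sent until `run_query` is called.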

Sending the query to this endpoint with Paw gives the following response:

[Screenshot: query response in Paw]

The same query can be sent with curl, a command-line utility for transferring data with URLs (more information at https://curl.haxx.se/):

## Request
curl -X "POST" "http://localhost:8080/rdf4j-server/repositories/datacube_olympics" \
     -H 'Content-Type: application/sparql-query' \
     -H 'Accept: text/csv' \
     -d $'prefix owl: <http://www.w3.org/2002/07/owl#> 
prefix void: <http://rdfs.org/ns/void#> 
prefix dcterms: <http://purl.org/dc/terms/> 
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> 
prefix dcat: <http://www.w3.org/ns/dcat#> 
prefix sdmx-dimension: <http://purl.org/linked-data/sdmx/2009/dimension#> 
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> 
prefix sdmx-attribute: <http://purl.org/linked-data/sdmx/2009/attribute#> 
prefix qb: <http://purl.org/linked-data/cube#> 
prefix skos: <http://www.w3.org/2004/02/skos/core#> 
prefix xsd: <http://www.w3.org/2001/XMLSchema#> 
prefix sdmx-concept: <http://purl.org/linked-data/sdmx/2009/concept#> 
SELECT DISTINCT ?country ?year ?sex ?number
WHERE {
  ?obs a qb:Observation ;
       sdmx-dimension:refArea    ?countryid ;
       sdmx-dimension:refPeriod  ?year ;
       sdmx-dimension:sex        ?sexid ;
       <https://example.org/ns/olympics#numberofmedals>  ?number .
  FILTER (?number > 20)
  ?countryid skos:prefLabel ?country .
  FILTER (lang(?country) = "en")
  ?sexid skos:prefLabel ?sex .
  FILTER (lang(?sex) = "en")
}
ORDER BY ?country desc(?number)'
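Because the request asks for `text/csv`, the response can be processed with any CSV reader. The sketch below shows one way to do that in Python; `parse_sparql_csv` is a hypothetical helper, and `SAMPLE_CSV` contains made-up placeholder rows in the shape of the expected result, not actual output from the endpoint.

```python
import csv
import io

# Hypothetical sample in the shape of the text/csv response; the values are
# placeholders for illustration, not actual query results.
SAMPLE_CSV = (
    "country,year,sex,number\r\n"
    "Exampleland,2012,Female,25\r\n"
    "Exampleland,2012,Male,21\r\n"
)


def parse_sparql_csv(text: str) -> list[dict[str, str]]:
    """Turn a SPARQL CSV result into one dict per row, keyed by variable name."""
    return list(csv.DictReader(io.StringIO(text)))


rows = parse_sparql_csv(SAMPLE_CSV)
for row in rows:
    # CSV results carry no datatypes, so numeric columns are converted explicitly.
    print(row["country"], row["year"], row["sex"], int(row["number"]))
```

If the datatypes or language tags of the values matter, requesting `application/sparql-results+json` instead of `text/csv` keeps that information in the response.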
        

Querying with CubiQL

CubiQL is an implementation of the GraphQL query language for querying RDF data cubes. More background information on CubiQL is available here: https://github.com/Swirrl/graphql-qb.

In the previous section, we deployed a CubiQL endpoint at http://localhost:9090/graphql.

We can use a GraphQL IDE such as GraphiQL (see https://github.com/graphql/graphiql) and point it at our CubiQL endpoint to check whether the endpoint is working.

In the left pane we define the structure of the data we want to retrieve.

[Screenshot: CubiQL endpoint in GraphiQL]

The IDE helps in formulating the query by drawing on the underlying GraphQL type system, which documents the objects and their retrievable fields.

[Screenshot: guided query composing in GraphiQL]
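The type system that the IDE relies on can also be queried directly through GraphQL introspection, which doubles as a quick check that the endpoint responds. The Python sketch below builds such a request. It assumes that the CubiQL endpoint accepts a JSON-encoded POST body, as most GraphQL servers do; the helper names `build_graphql_request` and `run_graphql` are hypothetical.

```python
import json
import urllib.request

# CubiQL endpoint from the text.
GRAPHQL_ENDPOINT = "http://localhost:9090/graphql"

# Standard GraphQL introspection query: lists the name and kind of every type
# in the schema. It does not depend on any CubiQL-specific field.
INTROSPECTION_QUERY = "{ __schema { types { name kind } } }"


def build_graphql_request(endpoint: str, query: str) -> urllib.request.Request:
    """Build a POST request with the query wrapped in a JSON body.

    Assumption: the server accepts Content-Type application/json.
    """
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def run_graphql(endpoint: str, query: str) -> dict:
    """Send the query and decode the JSON response (needs a running endpoint)."""
    with urllib.request.urlopen(build_graphql_request(endpoint, query)) as response:
        return json.load(response)
```

With the CubiQL server running, `run_graphql(GRAPHQL_ENDPOINT, INTROSPECTION_QUERY)` should return a dictionary whose `data` entry lists the schema types. A connection error indicates that the endpoint is not reachable.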

More examples of queries can be found at https://github.com/Swirrl/graphql-qb.

See also:

Learning SPARQL