Querying with SPARQL

Prepare yourself for the lab

Lab instructions

At the end of the lab, each group should email their first project to the teacher. It consists of:

  1. a list of the names of all project authors,
  2. the final *.ttl file with the graph developed during the previous lab,
  3. the set of SPARQL queries against the knowledge graph developed during today's lab (up to Section 4. ASK and DESCRIBE queries).

1. SPARQL = Pattern matching [20 minutes]

  1. Do you have your knowledge graph, developed during the previous lab? If not, now is the time to find it!
  2. Open your preferred tool for querying RDF files with SPARQL (see Tools required for this lab at the top of this page) and execute your first simple SELECT query against your knowledge graph:
    SELECT ?a ?b ?c
    WHERE {
      ?a ?b ?c
    }
    LIMIT 10
  3. Now, it's time to explore your graph more! Prepare two queries for your graph that extract some interesting information. Use only triple patterns – we will move to more complicated things in the subsequent sections.
    • If you want to ask about all members of a container, you can use rdfs:member, the superproperty of all the rdf:_1, rdf:_2, … membership relations (your tool must apply RDFS inference for it to match them), e.g.:
      BASE   <http://example.org/> 
      PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
      PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
       
      SELECT ?who ?place
      WHERE {
        ?who <visited> [ a rdf:Bag ;
                         rdfs:member ?place ]
      }
      LIMIT 10

      This query selects all pairs of people and the places they visited (it can be run against the https://krzysztof.kutt.pl/didactics/semweb/bob_and_mona_lisa.rdf file).

    • Save the queries for the report!
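
As an illustration of what can be done with triple patterns alone, here is a minimal sketch that assumes a hypothetical graph in which people are linked to the people they know with ex:knows and to their names with ex:name. Both properties and the http://example.org/ namespace are placeholders, so substitute the vocabulary of your own graph. Reusing a variable in several patterns joins them:

    PREFIX ex: <http://example.org/>

    # Names of people together with the names of the people they know.
    # ex:knows and ex:name are hypothetical properties.
    SELECT ?personName ?friendName
    WHERE {
      ?person ex:knows ?friend .
      ?person ex:name  ?personName .
      ?friend ex:name  ?friendName .
    }
    LIMIT 10

Each additional pattern narrows the result, because a row is returned only when all patterns match at once.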

2. Constraints: FILTER [15 minutes]

After matching an RDF graph pattern, you can also put constraints on the results that determine which rows are included or excluded. This is done with the FILTER construct. Let's try it now on your knowledge graphs.

  1. Your graph should contain at least a few different datatypes (this was a requirement in a previous lab!). Select two of them (e.g., boolean, string, numeric, date) and check what functions can be used in filters for them.
  2. Prepare and execute two queries (one for each selected datatype) that filter something interesting in your knowledge graph.
    • Save the queries for the report!
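
The sketch below shows one numeric and one string constraint. It assumes a hypothetical graph in which people have an ex:age (a number) and an ex:name (a string); these properties are placeholders for whatever your graph contains. The two FILTER lines are independent, so either can be used on its own:

    PREFIX ex: <http://example.org/>

    # People of working age whose name starts with "a" or "A".
    # ex:name and ex:age are hypothetical properties.
    SELECT ?person ?name ?age
    WHERE {
      ?person ex:name ?name ;
              ex:age  ?age .
      # Numeric constraint: comparison operators combined with &&
      FILTER(?age >= 18 && ?age < 65)
      # String constraint: STR drops any language tag, LCASE ignores case
      FILTER(STRSTARTS(LCASE(STR(?name)), "a"))
    }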

3. SPARQL as rule language [15 minutes]

So far, we have seen that the answers to SPARQL queries can take the form of a table. In this section, we will take a look at CONSTRUCT queries, whose answers take the form of an RDF graph. They provide a way to introduce “rules” into RDF datasets.

  1. Now, it's time for you to develop 2 CONSTRUCT queries that provide useful rules for your knowledge graph!
    • Save the queries for the report!
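
A minimal sketch of such a rule, assuming a hypothetical ex:hasParent property in your graph (ex:hasGrandparent is likewise invented for this example). The WHERE part is the condition of the rule, and the CONSTRUCT template is its conclusion:

    PREFIX ex: <http://example.org/>

    # Rule: a parent of a parent is a grandparent.
    # ex:hasParent and ex:hasGrandparent are hypothetical properties.
    CONSTRUCT {
      ?grandchild ex:hasGrandparent ?grandparent .
    }
    WHERE {
      ?grandchild ex:hasParent ?parent .
      ?parent     ex:hasParent ?grandparent .
    }

The query returns the derived triples as a new graph and does not modify the queried data.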

4. ASK and DESCRIBE queries [15 minutes]

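SPARQL also provides two more query types: ASK, which returns a boolean saying whether the pattern has any match, and DESCRIBE, which returns an RDF graph describing the given resources.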

  1. Prepare at least one ASK query that checks something interesting in your knowledge graph.
  2. Prepare at least one DESCRIBE query that describes the most interesting “thing” in your knowledge graph.
  3. Save the queries for the report!
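
A sketch of both query types, again over a hypothetical vocabulary (ex:age and ex:name are placeholders). The first query answers true or false:

    PREFIX ex: <http://example.org/>

    # Is there anyone in the graph older than 100?
    # ex:age is a hypothetical property.
    ASK {
      ?person ex:age ?age .
      FILTER(?age > 100)
    }

The second one returns an RDF graph about every resource bound to ?person:

    PREFIX ex: <http://example.org/>

    # Describe every resource whose name is "Bob".
    # ex:name is a hypothetical property.
    DESCRIBE ?person
    WHERE {
      ?person ex:name "Bob" .
    }

Which triples a DESCRIBE result contains is decided by the query engine, so the output may differ between tools.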

5. DBpedia SPARQL Endpoint [30 minutes]

SPARQL queries can be run against an RDF file, as we did in the previous sections. It is also possible to use special-purpose web services called SPARQL Endpoints. As we already know Wikidata, we will explore DBpedia in this section.

  1. Do you remember your task from the first lab? You were asked to prepare a list of the 15 most populous countries in Europe based on Wikipedia. Now we know enough to do it with SPARQL and DBpedia instead of manually!
  2. As DBpedia is built from data extracted from Wikipedia, it should contain some information about Poland. We don't know what URI Poland has in DBpedia, but we know the name Poland, and we remember that the rdfs:label property is useful. Maybe this will help us? Let's try!
  3. Open your preferred tool for querying SPARQL Endpoints (see Tools required for this lab at the top of this page).
  4. Enter http://dbpedia.org/sparql as SPARQL Endpoint.
  5. What do we know so far? There should be some URI (?country) that probably has an rdfs:label relation with the object “Poland”@en. This can be easily translated into a SPARQL query:
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX dbr: <http://dbpedia.org/resource/>
    PREFIX dbp: <http://dbpedia.org/property/>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
     
    SELECT ?country
    WHERE { 
      ?country rdfs:label "Poland"@en .
    }
    • Hint: some useful prefixes are already in place to assist you in this task.
  6. Success! Now, we can expand this query to find information about the population of Poland.
    • Hint: the following line may be useful to get only objects that are numbers (like population)
      FILTER(ISNUMERIC(?val))
  7. Now, prepare the actual query that returns the list of the 15 most populous countries in Europe!
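
One possible starting point for step 6 is sketched below. It makes no assumption about which DBpedia property holds the population: it lists every property of the resource labelled “Poland” whose value is a number, so that you can scan the results and pick the right one yourself:

    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

    # Every numeric-valued property of the resource labelled "Poland".
    SELECT ?country ?property ?val
    WHERE {
      ?country rdfs:label "Poland"@en ;
               ?property  ?val .
      FILTER(ISNUMERIC(?val))
    }

Once you have identified the property, replace ?property with it and drop the label constraint to query other countries.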

6. Aggregation [30 minutes]

  1. Poland is divided into 16 voivodeships (PL: województwo), which are further divided into 314 counties (PL: powiat). In this task, we will examine this division more closely.
  2. Prepare a query (using your preferred tool for querying SPARQL Endpoints, against DBpedia) that returns a list of voivodeships and the number of counties in each of them. The list should contain only voivodeships with 20 or more counties and should be ordered by the number of counties.
  3. The results should look like this:
  4. Hint – useful URIs (you can use dbo, dbr and dbp prefixes defined in previous section):
    • county: dbr:Powiat
    • voivodeship: dbr:Voivodeships_of_Poland
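
The general shape of an aggregation query is sketched below on an unrelated, hypothetical example (books linked to their authors with ex:writtenBy, a placeholder property). GROUP BY forms the groups, COUNT computes a value per group, HAVING filters the groups, and ORDER BY sorts the result:

    PREFIX ex: <http://example.org/>

    # Authors with at least 5 books, most productive first.
    # ex:writtenBy is a hypothetical property.
    SELECT ?author (COUNT(?book) AS ?books)
    WHERE {
      ?book ex:writtenBy ?author .
    }
    GROUP BY ?author
    HAVING (COUNT(?book) >= 5)
    ORDER BY DESC(?books)

Note that the condition on the count goes into HAVING rather than FILTER, because FILTER is evaluated on individual rows before they are grouped.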

Learn more!

SPARQL:

Sample queries in SPARQL:

Tools:

DB2RDF (RDF and Relational Databases):