courses:semint:lab_rules [IIS Wiki]

Last verification: 20220909
Tools required for this lab:
- Protégé Desktop 5

Introduction to SHACL.pdf

At the end of the lab, each group should email their second project to the teacher. It consists of:

- the final *.ttl file with the ontology (started in the Ontology 101 lab, refined during the Advanced ontology engineering lab, and extended with SWRL rules during today's lab),
- the second *.ttl file with SHACL shapes (developed during the second part of today's lab).

0. Prepare the Protégé Desktop [5 minutes]

Run the Protégé Desktop.
Open File → Check for plugins…
Tick the checkboxes to the left of the following two plugins (if they are not listed, they are already installed in the newest version):
1. SWRLTab Protege 5.0+ Plugin
2. SHACL4Protege Constraint Validator
Click Install
After a while, you will see the message: “Updates will take effect when you next start Protege.”
Close Protégé and run it again (to load the new plugins).

1. Introduction to SWRL [15 minutes]

The OWL 2 language (known from the previous labs on ontologies and reasoning) cannot express every relation. For example, it cannot express the relation child of married parents, because OWL 2 has no way to state a relation between two individuals that are both related to a third one (here: the two parents of the same child must be spouses).

To close this gap, and to allow this and other kinds of inference, the Semantic Web Rule Language (SWRL) was introduced:
```
Person(?x) ^ hasParent(?x, ?y) ^ hasParent(?x, ?z) ^ hasSpouse(?y, ?z) -> ChildOfMarriedParents(?x)
```
Also, some things can be described in OWL but are more intuitive when defined as a SWRL rule.
As you can see, rules in SWRL have a simple structure:
- There is a left-hand side (called the antecedent) and a right-hand side (called the consequent). They are separated by an arrow (->).
- Each expression in a SWRL rule is separated by a ^ sign (logical and).
- All parameters (variables that are wildcards and get bound dynamically as the rule fires) are preceded by a ?.
- There are three types of expressions in SWRL:
  - Class expressions. This is the name of a class followed by parentheses with a parameter inside, e.g., Person(?x) will bind ?x to an instance of the class Person and will iterate over each instance of the Person class.
  - Property expressions. This is the name of a property followed by parentheses and two parameters: the first is the subject, and the second is the object of a relation, e.g., hasParent(?x, ?y) will bind ?y to each parent of ?x (note that ?x is already bound by the previous expression).
  - Built-in functions. SWRL has a number of built-in functions for mathematical tests, string tests, etc. All of them are prefixed with swrlb:, e.g., the comparison built-in swrlb:greaterThan(?age, 18) succeeds if the value of ?age is greater than 18. See 8. Built-Ins for the full list and documentation.
The consequent of the rule fires if and only if every expression in the antecedent is satisfied.
Protégé supports SWRL rules:
- They are parsed and interpreted by both the HermiT and Pellet reasoners.
- There are two dedicated interfaces to view and edit SWRL rules (there are slight differences between them, e.g., the ^ signs are replaced with commas in the Rule view):
  - The SWRLTab: if it is not visible, select Window → Tabs → SWRLTab (there is no need to use the Drools engine in the bottom panel; we will use the HermiT/Pellet reasoners)
  - The Rule view: select Window → Views → Ontology views → Rules and click the place in Protégé where this view should appear.

Download the family.swrl.owl ontology (mirror: family.swrl.owl) and load it into Protege.
Take a look at the existing classes, properties, and instances (in the Entities tab).
Open one of the dedicated interfaces for the SWRL (as described above).
Take a look at the SWRL rules. Do you understand them?
Run the rules:
1. In the Reasoner menu, select the HermiT or Pellet reasoner
2. Reasoner → Start Reasoner (or Synchronize reasoner if it was started before)
Take a look at the inferred knowledge.

2. SWRL in use [20 minutes]

Open your own ontology in the Protege Desktop.
Prepare 3-5 rules for your ontology.
Run them to check if they work correctly.
After adding rules, don't forget to save the new version of the ontology so you don't lose your work if something goes wrong!

3. Introduction to SHACL [20 minutes]

SHACL stands for Shapes Constraint Language (no, these are not geometric shapes). It's all about constraints (a shape is a collection of constraints that shapes the data).
- Unlike ontologies (which model a selected part of the real world), SHACL is used to describe (low-level) requirements for specific RDF triples.
- The starting point is our knowledge graph. Alongside it, we create a second graph (with SHACL triples) that describes the constraints.
Whatever you can do with SHACL you can also do with OWL or SWRL. So why do we need SHACL?
(based on SHACL and OWL Compared):
- OWL was designed to support inferencing, but in practice it is applicable only to certain kinds of inference. As a result, SWRL rules (above) or a set of SPARQL CONSTRUCT queries are needed as well (e.g., if you want to change wrong triples, you need to use SPARQL; OWL and SWRL cannot do this).
  SHACL is used to actually verify constraints (instead of performing inference). SHACL can also be used to define rules.
- The meaning of OWL restrictions is often confused: OWL does not constrain anything but rather describes inferences.
  SHACL produces a list of violated constraints instead of stating that there is a contradiction and the entire graph is inconsistent.
- Reasoning under the Open World Assumption is tricky, especially if you want to determine whether all records meet the constraints.
  SHACL verifies constraints under the Closed World Assumption.
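The difference between the two assumptions can be illustrated with a small sketch. The fragment below is invented for illustration (the shape name :UserNameShape and the nameless :bob are assumptions, not part of the lab files). It states the same requirement, that every user has a name, once as an OWL axiom and once as a SHACL shape:

```turtle
@prefix :       <http://example.org/> .
@prefix owl:    <http://www.w3.org/2002/07/owl#> .
@prefix rdfs:   <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh:     <http://www.w3.org/ns/shacl#> .
@prefix xsd:    <http://www.w3.org/2001/XMLSchema#> .
@prefix schema: <http://schema.org/> .

# OWL axiom: every :User has at least one schema:name.
:User rdfs:subClassOf [
   a owl:Restriction ;
   owl:onProperty schema:name ;
   owl:minCardinality "1"^^xsd:nonNegativeInteger
] .

# SHACL shape (hypothetical name): every :User must have
# at least one schema:name stated in the data.
:UserNameShape a sh:NodeShape ;
   sh:targetClass :User ;
   sh:property [
      sh:path schema:name ;
      sh:minCount 1
   ] .

# Data: a user without a name.
:bob a :User .
```

Under the Open World Assumption, an OWL reasoner is not expected to report an inconsistency for :bob; it only concludes that :bob has some name that is not stated. A SHACL validator checks the triples that are actually present and should report a sh:minCount result for :bob.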

Open the Shapes Constraint Language (SHACL) recommendation. It may be useful as a reference.
Open Zazuko SHACL Playground.
1. Click on the Shapes Graph icon (top left). Select Format: text/turtle (at the top of the left panel)
2. Click on the Data Graph icon (next to Shapes Graph icon). Select Format: text/turtle.
3. Now, you can see graphs in Turtle syntax in both windows: Shapes Graph and Data Graph.
Let's start with a simple example.
1. Copy the following Shapes Graph to SHACL Playground:
```
@prefix :       <http://example.org/> .
@prefix sh:     <http://www.w3.org/ns/shacl#> .
@prefix xsd:    <http://www.w3.org/2001/XMLSchema#> .
@prefix schema: <http://schema.org/> .
 
:UserShape a sh:NodeShape ;
   sh:targetNode :alice, :bob, :carol ;
   sh:property [
      sh:path schema:name ; 
      sh:minCount 1; 
      sh:maxCount 1;
      sh:datatype xsd:string
   ] ;
   sh:property [
      sh:path schema:email ; 
      sh:minCount 1; 
      sh:maxCount 1; 
      sh:nodeKind sh:IRI
   ] .
```
  and the following Data Graph:
```
@prefix :       <http://example.org/> .
@prefix sh:     <http://www.w3.org/ns/shacl#> .
@prefix xsd:    <http://www.w3.org/2001/XMLSchema#> .
@prefix schema: <http://schema.org/> .
 
:alice schema:name "Alice Cooper" ;
       schema:email <mailto:alice@mail.org> .
 
:bob   schema:firstName "Bob" ;
       schema:email <mailto:bob@mail.org> .
 
:carol schema:name "Carol" ;
       schema:email "carol@mail.org" .
```
2. The Data Graph should be self-explanatory.
  The Shapes Graph defines :UserShape, an instance of the sh:NodeShape class. It specifies the constraints to check against the Data Graph:
  - First, sh:targetNode lists the nodes in the Data Graph that must satisfy these constraints.
  - Then, we constrain two outgoing relations (i.e., properties whose subject is a target node). sh:path names the actual relation (it can be more sophisticated, e.g., a property path that chains several relations).
  - For each property, we define the cardinality (min and max count; here, each user must have exactly one name and exactly one email) and the expected object type (more precisely, every restriction on a relation, i.e., every blank node in this example, is a PropertyShape).
3. The Validation Report is presented below. It is rendered as HTML, but the SHACL standard defines the exact report structure in RDF. To see the raw report, click the Validation Report icon (third icon in the top left corner) and select Display errors as Raw RDF.
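To illustrate the remark that sh:path can be more than a single relation, here is a minimal sketch of a sequence path. The shape name :PostalCodeShape and the address data are hypothetical and not part of the lab files; schema:address and schema:postalCode serve only as example properties.

```turtle
@prefix :       <http://example.org/> .
@prefix sh:     <http://www.w3.org/ns/shacl#> .
@prefix schema: <http://schema.org/> .

# Shapes Graph (hypothetical shape)
:PostalCodeShape a sh:NodeShape ;
   sh:targetNode :alice ;
   sh:property [
      # Sequence path: follow schema:address, then schema:postalCode
      sh:path ( schema:address schema:postalCode ) ;
      sh:minCount 1
   ] .

# Data Graph (invented for illustration)
:alice schema:address [ schema:postalCode "30-059" ] .
```

Here :alice should conform, because one postal code is reachable through her address. Removing the schema:postalCode triple should produce a sh:minCount result.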
There are some errors in the Data Graph. Fix them to check if you understand the defined shapes correctly!
SHACL is for humans, so make it more human-friendly! Play with all three relations:
1. You can define your own message in a shape, using the sh:message relation.
2. You can also define the severity of each constraint, using the sh:severity relation → possible values are sh:Info, sh:Warning, and sh:Violation (by default, all are Violations; note that the UI shows only Violations! To see other severity levels, take a look at the raw report).
3. You can also deactivate a specific constraint by setting sh:deactivated to true (there is no need to comment it out or remove it; a single added statement is enough, and the constraint will not be checked).
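As a starting point for experimenting, the three relations could be attached to property shapes as in the sketch below. The shape name :UserHintShape and the message texts are hypothetical; the target nodes are those of the Data Graph above.

```turtle
@prefix :       <http://example.org/> .
@prefix sh:     <http://www.w3.org/ns/shacl#> .
@prefix schema: <http://schema.org/> .

# Hypothetical shape; it targets the same three nodes as :UserShape.
:UserHintShape a sh:NodeShape ;
   sh:targetNode :alice, :bob, :carol ;
   sh:property [
      sh:path schema:email ;
      sh:nodeKind sh:IRI ;
      sh:message "The e-mail should be an IRI, e.g. a mailto: address." ;
      sh:severity sh:Warning       # reported as a warning, not a violation
   ] ;
   sh:property [
      sh:path schema:name ;
      sh:minCount 1 ;
      sh:message "Every user needs a name." ;
      sh:deactivated true          # this property shape is skipped during validation
   ] .
```

With the Data Graph above, the literal e-mail of :carol is expected to produce a result with severity sh:Warning and the custom message, while this shape does not report the missing schema:name of :bob, because that property shape is deactivated.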

Constraints applied to specific nodes listed by URI (sh:targetNode as above) are not very useful. However, you can also constrain all instances of a specific class (sh:targetClass) or all nodes that occur as subjects or objects of a specific relation (sh:targetSubjectsOf and sh:targetObjectsOf).
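As a minimal sketch of the last of these target types, the fragment below uses sh:targetObjectsOf. The shape name :KnownPersonShape, the nodes :dave and :erin, and the use of schema:knows are assumptions made for illustration only.

```turtle
@prefix :       <http://example.org/> .
@prefix sh:     <http://www.w3.org/ns/shacl#> .
@prefix schema: <http://schema.org/> .

# Shapes Graph (hypothetical shape)
:KnownPersonShape a sh:NodeShape ;
   sh:targetObjectsOf schema:knows ;   # targets every node that is an object of schema:knows
   sh:property [
      sh:path schema:name ;
      sh:minCount 1
   ] .

# Data Graph (invented for illustration)
:alice schema:knows :dave, :erin .
:dave  schema:name "Dave" .
# :erin has no schema:name, so she is expected to violate the shape.
# :alice is only a subject of schema:knows, so she is not a target.
```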

Load the new Data Graph (with specification of instances of :User class):

```
@prefix :       <http://example.org/> .
@prefix sh:     <http://www.w3.org/ns/shacl#> .
@prefix xsd:    <http://www.w3.org/2001/XMLSchema#> .
@prefix schema: <http://schema.org/> .

:alice a           :User ;
       schema:name "Alice Cooper" ;
       schema:email <mailto:alice@mail.org> .

:bob   a           :User ;
       schema:firstName "Bob" ;
       schema:email <mailto:bob@mail.org> .

:carol a           :User ;
       schema:name "Carol" ;
       schema:email "carol@mail.org" .
```

Change the definition of the :UserShape so that it applies to:
1. All instances of :User class
2. All subjects of schema:email relation

SHACL provides logical operators to combine constraints.

Before you dive into this topic, it may be useful to separate PropertyShapes. As a result, it will be easier to control the constraints, as below:

```
@prefix :       <http://example.org/> .
@prefix sh:     <http://www.w3.org/ns/shacl#> .
@prefix xsd:    <http://www.w3.org/2001/XMLSchema#> .
@prefix schema: <http://schema.org/> .

:UserShape a sh:NodeShape ;
   sh:targetClass :User ;
   sh:and ( :CheckName :CheckEmail ) .

:CheckName
   sh:path schema:name ;
   sh:minCount 1;
   sh:maxCount 1;
   sh:datatype xsd:string .

:CheckEmail
   sh:path schema:email ;
   sh:minCount 1;
   sh:maxCount 1;
   sh:nodeKind sh:IRI .
```
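The list syntax of sh:and shown above is shared by the other logical operators. As a minimal sketch (the shape name :EmailValueShape is hypothetical, and relaxing the e-mail constraint in this way is only an illustration), sh:or accepts a value when at least one of the listed shapes is satisfied:

```turtle
@prefix :       <http://example.org/> .
@prefix sh:     <http://www.w3.org/ns/shacl#> .
@prefix xsd:    <http://www.w3.org/2001/XMLSchema#> .
@prefix schema: <http://schema.org/> .

# Hypothetical shape: an e-mail may be an IRI or a plain string.
:EmailValueShape a sh:NodeShape ;
   sh:targetClass :User ;
   sh:property [
      sh:path schema:email ;
      sh:or (
         [ sh:nodeKind sh:IRI ]        # e.g. <mailto:alice@mail.org>
         [ sh:datatype xsd:string ]    # e.g. "carol@mail.org"
      )
   ] .
```

With this shape, both the IRI e-mail of :alice and the string e-mail of :carol should be accepted.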

Now, let's go back to the Data Graph provided at the beginning, where some users have schema:firstName while others have schema:name. Provide shapes that allow either of them (but not both at the same time). 4.6 Logical Constraint Components may be useful (take a look at sh:xone).

4. SHACL and Protégé [10 minutes]

Protégé has a dedicated plugin for SHACL. We will try it now (it should already be installed as the first task of this lab).
Download the example OWL file (mirror) and load it into Protégé (simply File → Open…)
Open the dedicated tab by selecting: Window → Tabs → SHACL Editor.

By default, it shows a sample SHACL graph with ex:PersonShape. If it is not there, simply copy the following graph into the SHACL Editor:

```
@prefix rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix sh:    <http://www.w3.org/ns/shacl#> .
@prefix xsd:   <http://www.w3.org/2001/XMLSchema#> .
@prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex:    <http://www.example.org/#> .
@prefix owl:   <http://www.w3.org/2002/07/owl#> .

ex:PersonShape
    a sh:NodeShape ;
    sh:targetClass ex:Person ; # Applies to all persons
    sh:property [              # _:b0
        sh:path ex:ssn ;       # constrains the values of ex:ssn
        sh:maxCount 1 ;
    ] ;
    sh:property [              # _:b1
        sh:path ex:ssn ;       # constrains the values of ex:ssn
        sh:datatype xsd:string ;
        sh:pattern "^\\d{3}-\\d{2}-\\d{4}$" ;
        sh:severity sh:Warning ;
    ] ;
    sh:closed true ;
    sh:ignoredProperties ( rdf:type owl:topDataProperty owl:topObjectProperty ) ;
    .
```

Click Validate to see the Validation Report (in the table, at the bottom).
You can also start the reasoner before running the validation → the shapes will then also be validated against the inferred knowledge.
You can also filter the report by classes and instances (simply click on a class/instance in the left panel of the SHACL Editor tab).
There are new types of constraints here: sh:pattern and sh:closed (combined with sh:ignoredProperties). Do you understand them?
For a list of available (core) constraints, see the table below and the 4. Core Constraint Components section of the SHACL recommendation.
There are 6 violations. Fix them! (fix the knowledge base and/or change the constraints)
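To check your understanding of sh:pattern and sh:closed, consider the small data sketch below. The individuals ex:Dana, ex:Eli, and ex:Fay and the property ex:nickname are hypothetical and do not come from the example OWL file; the comments describe how ex:PersonShape is expected to treat each of them.

```turtle
@prefix ex:  <http://www.example.org/#> .

# Expected to conform: a single ex:ssn value that matches ^\d{3}-\d{2}-\d{4}$
ex:Dana a ex:Person ;
    ex:ssn "123-45-6789" .

# Expected sh:pattern result (severity sh:Warning): the last group contains a letter
ex:Eli a ex:Person ;
    ex:ssn "987-65-432A" .

# Expected sh:closed result: ex:nickname is neither constrained by a
# sh:property of the shape nor listed in sh:ignoredProperties
ex:Fay a ex:Person ;
    ex:ssn "111-22-3333" ;
    ex:nickname "F" .
```

Because the shape is closed, only ex:ssn (declared through sh:path) and the ignored properties such as rdf:type may appear on a person; any other property is reported.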

5. SHACL in use [20 minutes]

Open your own ontology (with the SWRL rules developed during this lab) in Protégé.
Prepare some SHACL shapes for your ontology (at least 2 PropertyShapes).
Validate the knowledge base and observe the Validation Report to check if they work correctly.
Save the SHACL shapes in a separate file (they are not saved with the ontology; you need to save them using the Save button in the SHACL Editor tab; the file should have the .ttl extension, as it is a regular Turtle file).

These instructions are based on the New Protégé Pizza Tutorial, the OWL 2 and SWRL Tutorial, and the Shapes applications and tools tutorial.

SWRL:

Standard:
- W3C Submission: SWRL: A Semantic Web Rule Language Combining OWL and RuleML
Tutorials:
- New Protégé Pizza Tutorial (2021)
- SWRL Process Modeling Tutorial (2020)
- OWL 2 and SWRL Tutorial (2012)
- Excellent slides about SWRL: SWRL2009ProtegeConference.pdf
Tools:
- SWRLAPI – API for Java; used as a base for Protege plugin

SHACL:

Standard:
- Shapes Constraint Language (SHACL)
- SHACL Advanced Features (e.g., SHACL Functions and SHACL Rules)
Readings:
- Validating RDF Data (2018) – free HTML book on SHACL (and ShEx)
- SHACL and OWL Compared
Tutorials:
- Tutorials at ValidatingRDF
- Introduction to SHACL.pdf
Tools:
- SHACL Playground
- SHACL Validator at RDFShape
- SHACL4Protege
- A list of open-source tools is available, e.g., on Wikipedia
ShEx (Shape Expressions) – although similar to SHACL and with a similar purpose (specifying the structure of RDF graphs), it is an independent language:
- SHACL defines constraints for specific classes (and validates them), while ShEx defines the schema (and tries to best match the definitions with instances)
- Standard: Shape Expressions Language
- A chapter in Validating RDF Data: Chapter 7. Comparing ShEx and SHACL