Overview

Inference using RDFS and OWL can be a confusing topic, because it is not intuitive how concepts like Domain and Range work under the Open World Assumption. It is common for folks to interpret, for example, the Domain of an Object or Data Property in RDFS as a restriction on allowable values, when in fact inference uses it to derive new type information.

In this post, I'll use the OWL-RL library to show how inference works when specifying the Domain of a property, to help others avoid some of the modeling errors I've made and to better understand how inference can be beneficial.

First, we'll need an environment to work in, so if you want to execute this notebook yourself, you'll need to install the following.

Dependencies

To get started you will need the following libraries. Here I use conda to create a custom environment:

conda create -n inference pip pyparsing html5lib notebook
source activate inference
pip install https://github.com/RDFLib/rdflib/archive/master.zip
pip install https://github.com/RDFLib/OWL-RL/archive/master.zip

Import RDF Libraries

In [1]:
import rdflib
import RDFClosure

Create and Populate a Graph

  • First we create a graph to work with as an example
  • Then we add a single triple to the graph that only includes a label
In [2]:
g = rdflib.Graph()
In [3]:
ttl = """@prefix : <#> .
         @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
         @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
         @prefix xml: <http://www.w3.org/XML/1998/namespace> .
         @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
         
         :non-typed-indv rdfs:label "Example Subject with no rdf:type information." .
      """
In [4]:
g.parse(data=ttl, format='turtle')
Out[4]:
<Graph identifier=N3b10fe26085349e6bec65a1a5d333a92 (<class 'rdflib.graph.Graph'>)>
In [5]:
print g.serialize(format='turtle')
@prefix : <file:///Users/nicholsn/Repos/nicholsn.github.io/content/notebooks/#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

:non-typed-indv rdfs:label "Example Subject with no rdf:type information." .


Define an OWL Class and Datatype Property

  • Here we pull some example classes from an OWL file in the Neuroimaging Data Model (NIDM)
  • The first is an owl:Class representing a Mask Map
  • Next is an owl:DatatypeProperty with the class above listed as its rdfs:domain
  • We then add these semantics to the same graph as our simple example above
In [6]:
owl="""
    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix xml: <http://www.w3.org/XML/1998/namespace> .
    @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
    @prefix owl: <http://www.w3.org/2002/07/owl#> .
    @prefix obo: <http://purl.obolibrary.org/obo/> .
    @prefix nidm: <http://purl.org/nidash/nidm#> .

    nidm:NIDM_0000054 rdf:type owl:Class ;
                      rdfs:label "Mask Map" ;
                      obo:IAO_0000115 "A binary map representing the exact set of elements (e.g., pixels, voxels, vertices, and faces) in which an activity was performed (e.g. the mask map generated by the model parameter estimation activity represents the exact set of voxels in which the mass univariate model was estimated) and/or restraining the space in which an activity was performed (e.g. the mask map used by inference)" .
                    
    nidm:NIDM_0000158 rdf:type owl:DatatypeProperty ;
                  rdfs:label "noise FWHM In Vertices" ;                  
                  rdfs:comment "Range: Vector of positive floats." ;
                  obo:IAO_0000115 "Estimated Full Width at Half Maximum of the spatial smoothness of the noise process in vertices." ;                  
                  rdfs:domain nidm:NIDM_0000054 .
    """
In [7]:
g.parse(data=owl, format='turtle')
Out[7]:
<Graph identifier=N3b10fe26085349e6bec65a1a5d333a92 (<class 'rdflib.graph.Graph'>)>
In [8]:
print g.serialize(format='turtle')
@prefix : <file:///Users/nicholsn/Repos/nicholsn.github.io/content/notebooks/#> .
@prefix nidm: <http://purl.org/nidash/nidm#> .
@prefix obo: <http://purl.obolibrary.org/obo/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

:non-typed-indv rdfs:label "Example Subject with no rdf:type information." .

nidm:NIDM_0000158 a owl:DatatypeProperty ;
    rdfs:label "noise FWHM In Vertices" ;
    obo:IAO_0000115 "Estimated Full Width at Half Maximum of the spatial smoothness of the noise process in vertices." ;
    rdfs:comment "Range: Vector of positive floats." ;
    rdfs:domain nidm:NIDM_0000054 .

nidm:NIDM_0000054 a owl:Class ;
    rdfs:label "Mask Map" ;
    obo:IAO_0000115 "A binary map representing the exact set of elements (e.g., pixels, voxels, vertices, and faces) in which an activity was performed (e.g. the mask map generated by the model parameter estimation activity represents the exact set of voxels in which the mass univariate model was estimated) and/or restraining the space in which an activity was performed (e.g. the mask map used by inference)" .


Applying RDFS Reasoning

  • First we will just apply reasoning to the graph as-is and see the result
  • Then we will add our datatype property to the individual to see what the result is after reasoning
In [9]:
rdfs = RDFClosure.DeductiveClosure(RDFClosure.RDFS_Semantics)
rdfs.expand(g)
In [10]:
print g.serialize(format='turtle')
@prefix : <file:///Users/nicholsn/Repos/nicholsn.github.io/content/notebooks/#> .
@prefix nidm: <http://purl.org/nidash/nidm#> .
@prefix obo: <http://purl.obolibrary.org/obo/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

:non-typed-indv a rdfs:Resource ;
    rdfs:label "Example Subject with no rdf:type information." .

nidm:NIDM_0000158 a rdfs:Resource,
        owl:DatatypeProperty ;
    rdfs:label "noise FWHM In Vertices" ;
    obo:IAO_0000115 "Estimated Full Width at Half Maximum of the spatial smoothness of the noise process in vertices." ;
    rdfs:comment "Range: Vector of positive floats." ;
    rdfs:domain nidm:NIDM_0000054 .

rdfs:Literal a rdfs:Resource .

obo:IAO_0000115 a rdf:Property ;
    rdfs:subPropertyOf obo:IAO_0000115 .

nidm:NIDM_0000054 a rdfs:Resource,
        owl:Class ;
    rdfs:label "Mask Map" ;
    obo:IAO_0000115 "A binary map representing the exact set of elements (e.g., pixels, voxels, vertices, and faces) in which an activity was performed (e.g. the mask map generated by the model parameter estimation activity represents the exact set of voxels in which the mass univariate model was estimated) and/or restraining the space in which an activity was performed (e.g. the mask map used by inference)" .

rdf:type a rdf:Property ;
    rdfs:subPropertyOf rdf:type .

rdfs:comment a rdf:Property ;
    rdfs:subPropertyOf rdfs:comment .

rdfs:domain a rdf:Property ;
    rdfs:subPropertyOf rdfs:domain .

rdfs:label a rdf:Property ;
    rdfs:subPropertyOf rdfs:label .

rdfs:subPropertyOf a rdf:Property ;
    rdfs:subPropertyOf rdfs:subPropertyOf .

owl:Class a rdfs:Resource .

owl:DatatypeProperty a rdfs:Resource .


Notes on the expanded graph

  • Here you can see that some basic triples have been added to the graph that were not previously there.
  • Nothing very interesting has been added, although you'll see that our :non-typed-indv is now typed as an rdfs:Resource
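To see where these "basic" triples come from, here is a minimal sketch of three RDFS entailment rules (named rdfD2, rdfs4a and rdfs6 in RDF 1.1 Semantics) applied in a single pass over plain Python tuples. It is only an illustration and not how OWL-RL is implemented; the apply_basic_rules helper and the shortened prefixed names are assumptions made for this example.

```python
# Hypothetical helper: a simplified, single-pass version of three RDFS rules.
# Triples are plain (subject, predicate, object) tuples of strings.
RDF_TYPE = "rdf:type"
RDF_PROPERTY = "rdf:Property"
RDFS_RESOURCE = "rdfs:Resource"
RDFS_SUBPROPERTY_OF = "rdfs:subPropertyOf"


def apply_basic_rules(triples):
    """Return the new triples entailed by one pass of rdfs4a, rdfD2 and rdfs6."""
    inferred = set()
    for s, p, o in triples:
        # rdfs4a: anything used as a subject is an rdfs:Resource
        inferred.add((s, RDF_TYPE, RDFS_RESOURCE))
        # rdfD2: anything used as a predicate is an rdf:Property
        inferred.add((p, RDF_TYPE, RDF_PROPERTY))
        # rdfs6: every rdf:Property (just derived above) is a sub-property of itself
        inferred.add((p, RDFS_SUBPROPERTY_OF, p))
    # Only report triples that were not already asserted
    return inferred - set(triples)


graph = {
    (":non-typed-indv", "rdfs:label", "Example Subject with no rdf:type information."),
}
added = apply_basic_rules(graph)
for triple in sorted(added):
    print(triple)
```

For the single label triple, this yields the same three statements that show up in the expanded graph: the individual becomes an rdfs:Resource, and rdfs:label becomes an rdf:Property that is a sub-property of itself.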

Adding a Datatype Property

  • We now use one of the NIDM datatype properties on our example individual and give it a value
  • Note that when the statement is parsed it is simply appended below our :non-typed-indv
In [11]:
datatype = """
           @prefix : <#> .
           @prefix nidm: <http://purl.org/nidash/nidm#> .

           :non-typed-indv nidm:NIDM_0000158 "[2.95, 2.96, 2.61]" .
"""
In [12]:
g.parse(data=datatype, format='turtle')
Out[12]:
<Graph identifier=N3b10fe26085349e6bec65a1a5d333a92 (<class 'rdflib.graph.Graph'>)>
In [13]:
print g.serialize(format='turtle')
@prefix : <file:///Users/nicholsn/Repos/nicholsn.github.io/content/notebooks/#> .
@prefix nidm: <http://purl.org/nidash/nidm#> .
@prefix obo: <http://purl.obolibrary.org/obo/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

:non-typed-indv a rdfs:Resource ;
    rdfs:label "Example Subject with no rdf:type information." ;
    nidm:NIDM_0000158 "[2.95, 2.96, 2.61]" .

nidm:NIDM_0000158 a rdfs:Resource,
        owl:DatatypeProperty ;
    rdfs:label "noise FWHM In Vertices" ;
    obo:IAO_0000115 "Estimated Full Width at Half Maximum of the spatial smoothness of the noise process in vertices." ;
    rdfs:comment "Range: Vector of positive floats." ;
    rdfs:domain nidm:NIDM_0000054 .

rdfs:Literal a rdfs:Resource .

obo:IAO_0000115 a rdf:Property ;
    rdfs:subPropertyOf obo:IAO_0000115 .

nidm:NIDM_0000054 a rdfs:Resource,
        owl:Class ;
    rdfs:label "Mask Map" ;
    obo:IAO_0000115 "A binary map representing the exact set of elements (e.g., pixels, voxels, vertices, and faces) in which an activity was performed (e.g. the mask map generated by the model parameter estimation activity represents the exact set of voxels in which the mass univariate model was estimated) and/or restraining the space in which an activity was performed (e.g. the mask map used by inference)" .

rdf:type a rdf:Property ;
    rdfs:subPropertyOf rdf:type .

rdfs:comment a rdf:Property ;
    rdfs:subPropertyOf rdfs:comment .

rdfs:domain a rdf:Property ;
    rdfs:subPropertyOf rdfs:domain .

rdfs:label a rdf:Property ;
    rdfs:subPropertyOf rdfs:label .

rdfs:subPropertyOf a rdf:Property ;
    rdfs:subPropertyOf rdfs:subPropertyOf .

owl:Class a rdfs:Resource .

owl:DatatypeProperty a rdfs:Resource .


Inferring additional type information using rdfs:domain semantics

  • Next we will expand the graph and see what happens
In [14]:
rdfs.expand(g)
print g.serialize(format='turtle')
@prefix : <file:///Users/nicholsn/Repos/nicholsn.github.io/content/notebooks/#> .
@prefix nidm: <http://purl.org/nidash/nidm#> .
@prefix obo: <http://purl.obolibrary.org/obo/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

:non-typed-indv a nidm:NIDM_0000054,
        rdfs:Resource ;
    rdfs:label "Example Subject with no rdf:type information." ;
    nidm:NIDM_0000158 "[2.95, 2.96, 2.61]" .

rdfs:Literal a rdfs:Resource .

obo:IAO_0000115 a rdf:Property,
        rdfs:Resource ;
    rdfs:subPropertyOf obo:IAO_0000115 .

nidm:NIDM_0000158 a rdf:Property,
        rdfs:Resource,
        owl:DatatypeProperty ;
    rdfs:label "noise FWHM In Vertices" ;
    obo:IAO_0000115 "Estimated Full Width at Half Maximum of the spatial smoothness of the noise process in vertices." ;
    rdfs:comment "Range: Vector of positive floats." ;
    rdfs:domain nidm:NIDM_0000054 ;
    rdfs:subPropertyOf nidm:NIDM_0000158 .

rdf:type a rdf:Property,
        rdfs:Resource ;
    rdfs:subPropertyOf rdf:type .

rdfs:comment a rdf:Property,
        rdfs:Resource ;
    rdfs:subPropertyOf rdfs:comment .

rdfs:domain a rdf:Property,
        rdfs:Resource ;
    rdfs:subPropertyOf rdfs:domain .

rdfs:label a rdf:Property,
        rdfs:Resource ;
    rdfs:subPropertyOf rdfs:label .

rdfs:subPropertyOf a rdf:Property,
        rdfs:Resource ;
    rdfs:subPropertyOf rdfs:subPropertyOf .

owl:Class a rdfs:Resource .

owl:DatatypeProperty a rdfs:Resource .

nidm:NIDM_0000054 a rdfs:Resource,
        owl:Class ;
    rdfs:label "Mask Map" ;
    obo:IAO_0000115 "A binary map representing the exact set of elements (e.g., pixels, voxels, vertices, and faces) in which an activity was performed (e.g. the mask map generated by the model parameter estimation activity represents the exact set of voxels in which the mass univariate model was estimated) and/or restraining the space in which an activity was performed (e.g. the mask map used by inference)" .

rdf:Property a rdfs:Resource .

rdfs:Resource a rdfs:Resource .


Interpreting the results

  • As you'll notice, we now have an additional rdf:type associated with our :non-typed-indv indicating that it is a nidm:NIDM_0000054
  • This new type indicates that our object is a "Mask Map", but is that what we wanted? Possibly, but let's think about this a bit more...

When we are modeling a given domain in OWL it is important to know what inferences will come out of our design decisions. Here, the use of rdfs:domain on an owl:DatatypeProperty caused us to infer specific type information simply by using the property.
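The rule at work here is the RDFS domain rule (rdfs2): if a property has an rdfs:domain of some class, then any subject that uses the property is inferred to be a member of that class. A minimal sketch in plain Python, assuming triples are simple tuples of prefixed names (the infer_types_from_domain helper is hypothetical and is not part of rdflib or OWL-RL):

```python
# Hypothetical helper illustrating RDFS rule rdfs2 (domain entailment).
def infer_types_from_domain(triples):
    """For each (p rdfs:domain C) and (s p o), infer (s rdf:type C)."""
    # Collect every (property, class) pair declared with rdfs:domain
    domains = [(s, o) for s, p, o in triples if p == "rdfs:domain"]
    inferred = set()
    for prop, cls in domains:
        for s, p, o in triples:
            if p == prop:
                # The subject is typed; the statement is never rejected
                inferred.add((s, "rdf:type", cls))
    # Only report triples that were not already asserted
    return inferred - set(triples)


graph = {
    ("nidm:NIDM_0000158", "rdfs:domain", "nidm:NIDM_0000054"),
    (":non-typed-indv", "nidm:NIDM_0000158", "[2.95, 2.96, 2.61]"),
}
new_types = infer_types_from_domain(graph)
print(new_types)
```

Note that nothing in the rule can fail: the only possible outcome of using the property is an additional rdf:type statement about the subject.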

In this case the property is "noise FWHM In Vertices", which is pretty specific, with the definition:

"Estimated Full Width at Half Maximum of the spatial smoothness of the noise process in vertices."

If the author of the data read the definition, then inferring that any resource using this property is a "Mask Map" is probably safe; however, it is important to ask the question:

"Is there ever a situation where this property could be reasonably applied to an object that is not a Mask Map?"

Possibly not, but inferring new types is not what I originally thought of intuitively in the context of data modeling. My original expectation, from a modeling perspective using XSD or relational databases, was that rdfs:domain would place a restriction that could be used to validate whether this object indeed conforms to its type definition. This is clearly not the case with RDFS reasoning.
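For contrast, the validation-style reading I had in mind would look something like the sketch below: a closed-world check that flags any subject using a property without already being explicitly typed as the property's domain class. This is not something RDFS or OWL-RL provides; the find_domain_violations helper is hypothetical, works on plain tuples of prefixed names, and would have to be run on the asserted graph before any inference is applied.

```python
# Hypothetical closed-world check; RDFS reasoning does NOT behave this way.
def find_domain_violations(triples):
    """Report (subject, property, expected class) wherever a subject uses a
    property without being explicitly typed as the property's domain class."""
    domain_pairs = {(s, o) for s, p, o in triples if p == "rdfs:domain"}
    explicit_types = {(s, o) for s, p, o in triples if p == "rdf:type"}
    violations = set()
    for prop, cls in domain_pairs:
        for s, p, o in triples:
            if p == prop and (s, cls) not in explicit_types:
                violations.add((s, prop, cls))
    return sorted(violations)


asserted = {
    ("nidm:NIDM_0000158", "rdfs:domain", "nidm:NIDM_0000054"),
    (":non-typed-indv", "nidm:NIDM_0000158", "[2.95, 2.96, 2.61]"),
}
violations = find_domain_violations(asserted)
print(violations)

# Once the individual is explicitly typed, the check passes
typed = asserted | {(":non-typed-indv", "rdf:type", "nidm:NIDM_0000054")}
print(find_domain_violations(typed))
```

The same two triples that led the reasoner to add a type are reported as a problem by this check, which is exactly the difference between inferring under an open world and validating under a closed one.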

For validation with RDF models there is, surprisingly, no straightforward solution, although there are a number of efforts in this space, such as SPIN Rules and ShEx (RDF Shape Expressions).

Another avenue to explore in a future post is the use of OWL Property Restrictions.


