Magic sets, DLP, and other strange ways to implement semantic web expert systems
I just finished some changes to python-dlp including a modification to
FuXi that includes an implementation of the Magic Set Transformation
(MST) method for RIF-like horn clauses. The most useful, immediate
value this has for me is to be able to (essentially) implement a DLP
(description logic programming) entailment regime for a SPARQL query
service.
Consider the test case for the SymmetricProperty OWL test.
The base facts are:
first:Ghent first:path first:Antwerp .
first:path a owl:SymmetricProperty
The goal we are trying to prove is:
first:Antwerp first:path first:Ghent
I.e., a user wants to query an RDF dataset that includes an RDF graph
with the above statements and is expected to implement an entailment
regime for OWL-DL RDF such that the following query gives a positive
answer:
ASK { first:Antwerp first:path first:Ghent }
The general pD* rule that would normally apply in helping answer this query is:
{?P a owl:SymmetricProperty. ?S ?P ?O} => {?O ?P ?S}.
Re-written in a familiar (prolog-like) RIF-BLD syntax:
Forall ?P ?O ?S ( ?P(?O ?S) :- And( owl:SymmetricProperty(?P) ?P(?S ?O) ) )
In order to maintain consistency, a rule-based engine that used this
clause to implement the definition of a symmetric property would need
to fire it for *every* triple in the fact base (in order to properly
calculate the herbrand base) because of the 2nd triple pattern in the
body / antecedent / left-hand-side of the rule: ?S ?P ?O
However, the DLP approach that converted tree-based OWL axioms into
colloquial horn clauses would allow us to use (instead) a
domain-specific rule:
Forall ?Y ?X ( first:path(?Y ?X) :- first:path(?X ?Y) )]
This rule is domain-specific in the sense that it only applies to
instances of the first:path predicate rather than for every predicate.
As a result of this transformation, the procedural evaluation of the
rule for symmetry has been reduced from the worst case to only the
fraction of the RDF dataset concerning first:path statements.
So, a knowledge base that could exhaustively evaluate rules in a
top-down fashion (via 'forward chaining') prior to bringing up the
SPARQL service could answer that question against the (smaller)
entailed RDF graph.
However, with the MST implementation, if the query was known a priori
the ruleset can be modified into a version that further restricts the
amount of redundant work done during the inference process. For
example, even if the SPARQL service is known to never have to answer
that query, the colloquial rule above would still be needed by a niave
implemenation and would apply to every statement that used the
first:path predicate.
The diagram above is a Graph-viz rendering of a Proof Markup Language
(PML) proof tree generated by taking the colloquial rule, modifying it
using the MST algorithm, evaluating the base facts against the
ruleset, and adding an RDF statement that 'triggers' the
backward-chaining process. Fuxi includes a nice set of utilities for generating proof tree vizualizations.
Essentially, performing a top-down (or
forward chaining) evaluation of the rules and the facts simulates a
backward-chained proof.
Below are the 3 rules that replace the original domain-specific rule:
:path_magic(?LOC1 ? LOC2) :- And( :path_magic(?X ? LOC2)
:path_bf(?X ? LOC1) :path_magic(?X) )
:path_magic(?X) :- :path_magic(?X ?LOC)
:path(?X ?LOC1) :- And( :path_magic(?X ? LOC1) :path_magic(?X)
:path(?X ?LOC2) :path_magic(? LOC2 ? LOC1) :path(?LOC2 ?LOC1) )
And finally, the trigger for the proof is the following RDF statement:
first:Antwerp first:path_magic first:Ghent
The first two rules, pass through information about the sub-goals of
the query and essentially block the final rule from taking effect
until the trigger is added to the fact graph. It is clear to see that
the 3rd rule, will no longer apply to every RDF statement with a
first:path predicate, but rather only statements of that kind where
the subject and / or object terms are part of a query. So, for a
SPARQL service where we do not expect to answer queries that rely on
supporting symmetric properties in the first:path predicate, no
calculations will be performed and no unnecessary RDF statements will
be entailed.
I hope to write a bit more about some of the benefits of a Python
toolkit for building Semantic Web expert systems. I touched a bit on
these in my InfixOWL write-up and presentation, but haven't really put
the whole picture together.
-- Chimezie



