
Horn-clause inference rules learned by Sherlock

These files contain the Horn-clause inference rules learned by the Sherlock system. The methodology for constructing and evaluating these rules was described in:

"Learning First-Order Horn Clauses from Web Text"
S. Schoenmackers, J. Davis, O. Etzioni, and D. Weld
In EMNLP 2010
(paper available at: http://ai.cs.washington.edu/pubs/184)

There are two versions of the files available at http://www.cs.washington.edu/research/sherlock-hornclauses/:
The contents of the rules zip files are:

Each file contains a header line (beginning with a '#') followed by a series of rules, typed relations, or class names, one per line. Within a line, the fields are tab-separated. The header line describes what each field corresponds to.
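As a rough illustration of the layout described above, the following Python sketch reads a file with a '#' header line and tab-separated fields. The function name read_table and the sample column names are hypothetical, and the sketch assumes the header's field names are themselves tab-separated; the real header text should be checked in the downloaded files.

```python
import io


def read_table(stream):
    """Return (header, rows) for a tab-separated file with a '#' header line.

    Assumption: the field names in the header are tab-separated,
    just like the data lines.
    """
    header = None
    rows = []
    for line in stream:
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        if line.startswith("#"):
            if header is None:
                # Drop the leading '#' and split the names on tabs.
                header = [name.strip() for name in line.lstrip("#").split("\t")]
            continue
        rows.append(line.split("\t"))
    return header, rows


# Made-up sample content; the column names are placeholders.
sample = "# relation\targ1 class\targ2 class\nbe locate in\tcity\tplace\n"
header, rows = read_table(io.StringIO(sample))
```

With a real file, the same function could be called on an open file handle, for example open("alltypedrelations.txt", encoding="utf-8"), assuming the file is UTF-8 text.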

In the alltypedrelations.txt file, the fields list the relation, class of the first argument, class of the second argument, and a couple of scores indicating how often the relation occurred in our corpus and how much more likely it was to occur than random (log PMI).

In each of the rules files, the fields first list the rule, how many relations are in the body of the rule (1 or 2), and the rule's score according to several rule scoring metrics.
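A minimal sketch of how one line of a rules file might be split into its parts, following the field order given above (rule, body size, then scores). The helper name parse_rule_row is hypothetical, the score values in the sample are made up, and the sketch assumes every score field is a plain decimal number.

```python
def parse_rule_row(fields):
    """Split one rules-file row into (rule_text, body_size, scores).

    Assumptions: field 0 is the rule, field 1 is the number of body
    relations (1 or 2), and all remaining fields are numeric scores.
    """
    rule_text = fields[0]
    body_size = int(fields[1])
    if body_size not in (1, 2):
        raise ValueError(f"unexpected body size: {body_size}")
    scores = [float(value) for value in fields[2:]]
    return rule_text, body_size, scores


# The rule comes from the example on this page; the two scores are placeholders.
line = (
    "be bear in(writer_A, place_B) :- "
    "be bear in(writer_A, city_C), be locate in(city_C, place_B)"
    "\t2\t0.5\t2.0"
)
rule_text, body_size, scores = parse_rule_row(line.split("\t"))
```

Which metric each score belongs to is given by the header line of the file, so in practice the scores would be paired with the header's field names rather than used by position.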

The rules are formatted to be human-readable and reasonably easy to parse mechanically. In the files they are written like Prolog/Datalog rules. For example, one of the rules is:

rule "be bear in(writer_A, place_B) :- be bear in(writer_A, city_C), be locate in(city_C, place_B) "

This rule can be read as:
If 'A is born in C' and 'C is located in B', then 'A is born in B', where A is a member of the class of writers, B is a member of the class of places, and C is a member of the class of cities.
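The reading above can be recovered mechanically. The sketch below splits a rule at ':-' into a head and a body and breaks each argument such as writer_A into a class name and a variable. The names Atom, parse_atoms, and parse_rule are hypothetical, and the sketch assumes the variable is whatever follows the last underscore of an argument; it takes the bare rule text, without any surrounding quotes.

```python
import re
from typing import NamedTuple


class Atom(NamedTuple):
    relation: str
    args: tuple  # tuple of (class_name, variable) pairs


# A relation name (no parentheses or commas) followed by a parenthesised argument list.
ATOM_RE = re.compile(r"([^(),]+)\(([^()]*)\)")


def parse_atoms(text):
    """Return every relation(arg, arg) atom found in text, in order."""
    atoms = []
    for match in ATOM_RE.finditer(text):
        relation = match.group(1).strip()
        # Assumption: 'writer_A' means class 'writer', variable 'A'.
        args = tuple(
            tuple(arg.strip().rsplit("_", 1)) for arg in match.group(2).split(",")
        )
        atoms.append(Atom(relation, args))
    return atoms


def parse_rule(rule_text):
    """Split 'head :- body' into (head_atom, list_of_body_atoms)."""
    head_text, body_text = rule_text.split(":-", 1)
    head = parse_atoms(head_text)
    if len(head) != 1:
        raise ValueError("expected exactly one atom in the rule head")
    return head[0], parse_atoms(body_text)


head, body = parse_rule(
    "be bear in(writer_A, place_B) :- "
    "be bear in(writer_A, city_C), be locate in(city_C, place_B) "
)
```

Here head holds the conclusion ('A is born in B') and body holds the two conditions, so the shared variable C can be found by comparing the argument pairs of the two body atoms.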

Unfortunately, due to licensing restrictions we are unable to provide the raw extractions.

The format of the class instances file is 4 fields separated by tabs:

This page was last updated March 1, 2011.