To simplify the task of annotating regular structures (e.g., tables or lists), the Mangrove system provides the reglist command. This document describes how to use it.
The reglist command can help to annotate any collection of items (such as publications, events, or interests). To be concrete, this document describes how to annotate all "events" in a table with a single template statement, referred to as a regular list, or simply a reglist. This simplifies annotation and makes it easier to maintain the annotations when you edit your document.
Using such a template comes down to adding a reglist macro that describes the structure of a single recurring element (e.g., a row in a table).
This "definition" should be written before the very first actual event in the table. For example, if the first row of the table contains the column headings, the template should immediately follow it; otherwise it should come right after the opening <table> tag.
The position of the closing reglist tag defines the scope of the macro, i.e., the semantic structure defined in the template is applied only to the rows enclosed between the pair of opening and closing reglist tags.
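The scoping rule can be illustrated with a minimal Python sketch. It is not part of Mangrove; it only shows which rows fall between the opening and closing reglist comments. The closing comment has the form used in this document's example, while the shape of the opening comment (assumed here to start with `<!-- <reglist`), the helper name `rows_in_scope`, and the extra rows in the sample page are assumptions made for the illustration.

```python
import re

# Assumption: the opening comment starts with "<!-- <reglist"; its template
# value is deliberately left out. The closing comment is "<!-- </reglist> -->".
SCOPE = re.compile(
    r"<!--\s*<reglist\b.*?-->(.*?)<!--\s*</reglist>\s*-->",
    re.DOTALL,
)


def rows_in_scope(html):
    """Return the table rows enclosed by a reglist open/close pair."""
    rows = []
    for body in SCOPE.findall(html):
        rows.extend(re.findall(r"<tr\b.*?</tr>", body, re.DOTALL | re.IGNORECASE))
    return rows


# Hypothetical page: a heading row, one event row inside the scope,
# and one row after the closing tag that must not be annotated.
PAGE = """
<table>
<tr><th>Date</th><th>Topic</th></tr>
<!-- <reglist (template value omitted)> -->
<tr><td>Feb 3, 2003</td><td>Evolving the Semantic Web with Mangrove</td></tr>
<!-- </reglist> -->
<tr><td>outside</td><td>not annotated</td></tr>
</table>
"""

scoped = rows_in_scope(PAGE)
print(len(scoped))
```

Only the row between the two comments is returned; the heading row before the opening comment and the row after the closing comment are left alone.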
The following HTML fragment is annotated with a reglist macro. The relevant code is in bold.
<td>Feb 3, 2003</td>
<td>Evolving the Semantic Web with Mangrove</td>
<!-- </reglist> -->
The reglist element has its own syntax and is very similar to a regular expression. Let's take a look at the details.
The reglist element is enclosed in an HTML comment. (In comparison, the regular semantic tags can be added directly among other HTML tags.) Note that the comment contains no tags other than the reglist itself.
reglist does not have the “uw:” prefix. The reason is that the reglist element is a macro command, and it is available in all user-defined name spaces.
Let’s go through the above example with the table and explore the actual syntax of the element.
In this case the reglist element describes the structure of a single row from the table. (The very first and the last tag in the value of the reglist define that scope.)
The string representing the value of the reglist element is actually the skeleton of a row from the table (the HTML elements) with additional semantic tags and the special symbol “...” (without the quotes).
The reglist tells the semantic parser to treat each row in this table as an event. The data in the first column should be interpreted as the date of that event, the next one as the event’s topic, etc. The last column does not have any semantic tags (i.e., we have only the “...” placeholder between its HTML tags). This means that there is a column in the table, but we do not want to annotate its contents, or there is no suitable semantic tag for it in the name space we are currently using.
The symbol “...” is a placeholder for the data that is present in the actual table.
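To make the template semantics concrete, here is a minimal Python sketch of how a row skeleton could be matched against table rows. It is an illustration only, not Mangrove's parser: the template string, the `uw:topic` tag name, the helper names `template_to_regex` and `annotate`, and the third cell of the sample row are all assumptions made for this example.

```python
import re

# Hypothetical template value: an HTML row skeleton plus semantic tags,
# with "..." marking where the actual cell data goes.
TEMPLATE = (
    "<tr><uw:event>"
    "<td><uw:date>...</uw:date></td>"
    "<td><uw:topic>...</uw:topic></td>"
    "<td>...</td>"
    "</uw:event></tr>"
)

SEMANTIC_TAG = re.compile(r"</?\w+:\w+>")      # e.g. <uw:date>, </uw:date>
OPENING_SEMANTIC = re.compile(r"<\w+:(\w+)>")  # captures the tag name


def template_to_regex(template):
    """Build a regex from the skeleton; each "..." becomes a capture group."""
    parts, fields, prev = [], [], ""
    for token in re.split(r"(<[^>]+>)", template):
        if not token.strip():
            continue
        if token == "...":
            # The field name comes from a semantic tag opened right before
            # the placeholder; otherwise the column stays unannotated.
            opened = OPENING_SEMANTIC.fullmatch(prev)
            fields.append(opened.group(1) if opened else None)
            parts.append(r"(.*?)")
        elif not SEMANTIC_TAG.fullmatch(token):
            # Plain HTML tags must appear literally in the page.
            parts.append(re.escape(token))
        prev = token
    return re.compile(r"\s*".join(parts), re.DOTALL | re.IGNORECASE), fields


def annotate(html, template):
    """Return one dict of annotated fields per row matching the template."""
    regex, fields = template_to_regex(template)
    return [
        {name: value for name, value in zip(fields, match.groups()) if name}
        for match in regex.finditer(html)
    ]


# Sample row: the first two cells come from the example in this document,
# the third cell is invented to stand for an unannotated column.
ROWS = """
<tr><td>Feb 3, 2003</td>
<td>Evolving the Semantic Web with Mangrove</td>
<td>n/a</td></tr>
"""

events = annotate(ROWS, TEMPLATE)
print(events)
```

In this sketch the unannotated third column is still matched, so the row structure has to agree with the template, but its contents are dropped from the result.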
The order of the semantic tags and their neighboring HTML tags can be switched, i.e., it is correct to use "<tr><uw:event>" or "<uw:event><tr>".
Here is an example of a reglist element used to annotate an HTML list.
The reglist element is not supported by the GUI tagger, so if you would like to use it, you have to edit the source of your web page by hand (i.e., with a text editor).
The reglist tag and the semantic tags inside it are ignored by traditional browsers and thus will not disrupt the look and feel of your web page. Likewise, HTML formatting tags are ignored by the semantic parser, so you can tag your data without having to make any other changes to the HTML.
<html> tag as specified above.
reglist element doesn't have a "uw:" prefix and that it is inside an HTML comment.
reglist element (also inside a comment).
reglist tag corresponds exactly to the actual structure of the HTML tags in the table. E.g., if you have 3 columns in the table, describe all of them in the reglist value; if you use closing HTML tags in the reglist value, check that all of their corresponding columns have that closing tag.
<uw:date> (including a year) and a