/** *
* *
* *
* This Jacc grammar is a transcription of the EBNF for the canonical
* syntax of the RIF BLD. This syntax is canonical in that this EBNF
* defines the kernel constructs used for the BLD-to-XML transformation
* rules. In addition to the canonical BLD PS language, it has been
* proposed to allow a simpler syntax for writing RIF use cases. This
* simpler syntax, the so-called Abridged PS, extends the canonical syntax
* by allowing various shorthands for RIF constants and for common
* expressions such as arithmetic. This additional syntax is not canonical
* PS: it is just syntactic sugar that is desugared into the canonical form.
* *
* *
| Token | Value |
|---|---|
| OPENPAR | '(' |
| CLOSEPAR | ')' |
| OPENBRA | '[' |
| CLOSEBRA | ']' |
| OPENMETA | '(*' |
| CLOSEMETA | '*)' |
| DOCUMENT | 'Document' |
| BASE | 'Base' |
| PREFIX | 'Prefix' |
| IMPORT | 'Import' |
| GROUP | 'Group' |
| EXTERNAL | 'External' |
| AND | 'And' |
| OR | 'Or' |
| EXISTS | 'Exists' |
| FORALL | 'Forall' |
| IF | ':-' |
| ARROW | '->' |
| LEXSPACE | '^^' |
| EQUAL | '=' |
| MEMBER | '#' |
| SUBCLASS | '##' |
| COLON | ':' |
| NUMBER | (possibly signed) integer, decimal, or floating-point |
| VARIABLE | maximal-length sequence of word characters starting with a '?' |
| LOCALNAME | maximal-length sequence of word characters starting with a '_' |
| STRING | a double-quoted string containing any character (using '\\' to escape '"') |
| IDENTIFIER | maximal-length sequence of word characters starting with a letter |
* *
* NUMBER is a token representing numbers.
*
* VARIABLE is a token recognized thanks to its leading '?', but the token
* returned by the lexer suppresses this leading '?'. This means that '?' is
* not a separate punctuation mark, unlike what is shown by the BLC
* language's EBNF.
*
* LOCALNAME is a token recognized thanks to its leading '_', but the token
* returned by the lexer suppresses this leading '_'. This means that '_' is
* not a separate punctuation mark, unlike what is shown by the DTB
* language's EBNF for shorthands.
*
* Using STRING dispenses with the spurious "..."^^ notation, making '^^' an
* infix operator. In other words, the initial and final double quotes are
* part of the token itself and need not appear at the grammar level.
*
* IDENTIFIER is any maximal sequence of non-separator, non-punctuation
* Unicode characters that does not start with a '?'.
*
* Note that the colon character (':') is tokenized as punctuation. Indeed,
* a SymSpace is parsed as a pair of IDENTIFIERs separated by a colon. This
* means that ':' is a separate punctuation mark, unlike what is shown by
* the EBNF.
*
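* The token conventions above can be illustrated with a minimal Java
* sketch. It is not the tokenizer generated for this grammar: the class
* name TokenSketch, the "KIND:text" output format, and the regular
* expressions are assumptions made for illustration. Keywords such as
* 'Exists' are not distinguished from IDENTIFIER here, and word characters
* are limited to what \w matches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of the token conventions; not the actual lexer.
final class TokenSketch {
    // One named group per token kind; alternatives are tried in order,
    // so multi-character punctuation is listed before its prefixes.
    private static final Pattern TOKEN = Pattern.compile(
        "\\s+"
        + "|(?<STRING>\"(?:\\\\.|[^\"\\\\])*\")"
        + "|(?<VARIABLE>\\?\\w+)"
        + "|(?<LOCALNAME>_\\w+)"
        + "|(?<NUMBER>[+-]?\\d+(?:\\.\\d+)?(?:[eE][+-]?\\d+)?)"
        + "|(?<IDENTIFIER>\\p{L}\\w*)"
        + "|(?<PUNCT>\\(\\*|\\*\\)|:-|->|\\^\\^|##|[()\\[\\]=#:])");

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        int pos = 0;
        while (pos < input.length()) {
            m.region(pos, input.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("Unexpected character at offset " + pos);
            }
            pos = m.end();
            if (m.group("STRING") != null) {
                // The surrounding double quotes are part of the token itself.
                tokens.add("STRING:" + m.group("STRING"));
            } else if (m.group("VARIABLE") != null) {
                // The leading '?' identifies the token but is suppressed.
                tokens.add("VARIABLE:" + m.group("VARIABLE").substring(1));
            } else if (m.group("LOCALNAME") != null) {
                // The leading '_' identifies the token but is suppressed.
                tokens.add("LOCALNAME:" + m.group("LOCALNAME").substring(1));
            } else if (m.group("NUMBER") != null) {
                tokens.add("NUMBER:" + m.group("NUMBER"));
            } else if (m.group("IDENTIFIER") != null) {
                tokens.add("IDENTIFIER:" + m.group("IDENTIFIER"));
            } else if (m.group("PUNCT") != null) {
                tokens.add("PUNCT:" + m.group("PUNCT"));
            }
            // Otherwise only whitespace matched, which is skipped.
        }
        return tokens;
    }
}
```

* With this sketch, tokenizing "abc"^^xs:string yields a STRING token
* (quotes included), the '^^' punctuation, and then IDENTIFIER xs, the
* ':' punctuation, and IDENTIFIER string, which mirrors the way a SymSpace
* is parsed as two IDENTIFIERs separated by a colon.
*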
* In the RIF specification of the EBNF of the Rule Language, it is
* specified that:
*
* IRIMETA ::= '(*' IRICONST? (Frame | 'And' '(' Frame* ')')? '*)'
* Frame ::= TERM '[' (TERM '->' TERM)* ']'
* TERM ::= IRIMETA? (Const | Var | Expr | 'External' '(' Expr ')')
* Const ::= '"' UNICODESTRING '"^^' SYMSPACE | CONSTSHORT
* SYMSPACE ::= ANGLEBRACKIRI | CURIE
*
*
* where CONSTSHORT, ANGLEBRACKIRI, and CURIE
* are defined (in the DTB
* shorthand notation for RIF constants) by:
*
*
* CURIE ::= PNAME_LN | PNAME_NS
* CONSTSHORT ::= ANGLEBRACKIRI // shorthand for "..."^^rif:iri
* | CURIE // shorthand for "..."^^rif:iri
* | '"' UNICODESTRING '"' // shorthand for "..."^^xs:string
* | NumericLiteral // shorthand for "..."^^xs:integer,xs:decimal,xs:double
* | '_' LocalName // shorthand for "..."^^rif:local
*
* where:
*
* ANGLEBRACKIRI ::= '<' ([^<>"{}|^`\]-[#x00-#x20])* '>'
* PNAME_LN ::= PNAME_NS PN_LOCAL
* PNAME_NS ::= PN_PREFIX? ':'
* PN_LOCAL ::= (PN_CHARS_U | [0-9]) ((PN_CHARS|'.')* PN_CHARS)?
* PN_PREFIX ::= PN_CHARS_BASE ((PN_CHARS|'.')* PN_CHARS)?
* PN_CHARS_U ::= PN_CHARS_BASE | '_'
* PN_CHARS ::= PN_CHARS_U
* | '-'
* | [0-9]
* | #x00B7
* | [#x0300-#x036F]
* | [#x203F-#x2040]
* PN_CHARS_BASE ::= [A-Z]
* | [a-z]
* | [#x00C0-#x00D6]
* | [#x00D8-#x00F6]
* | [#x00F8-#x02FF]
* | [#x0370-#x037D]
* | [#x037F-#x1FFF]
* | [#x200C-#x200D]
* | [#x2070-#x218F]
* | [#x2C00-#x2FEF]
* | [#x3001-#xD7FF]
* | [#xF900-#xFDCF]
* | [#xFDF0-#xFFFD]
* | [#x10000-#xEFFFF]
*
*
* Tokenizing the PS grammar is complicated by the fact that the IRIs
* given as arguments to the pragmas Prefix and Base, which declare
* shorthands for IRIs, are not written as double-quoted strings.
* Handling them unquoted would require parsing IRIs, which is beyond
* our prototype's goal and unnecessary in this case. The problem does
* not arise in the canonical PS, where all such IRIs are double-quoted
* strings, which greatly simplifies the tokenizing. It is just as
* simple to require double quotes for the Prefix and Base pragmas,
* which is what our prototype does.
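*
* As an illustration of why quoted IRIs keep things simple, here is a
* minimal Java sketch of a prefix table. It assumes a declaration such as
* Prefix(ex "http://example.org/") has already been tokenized into a name
* and a STRING token; the class PrefixTableSketch, its method names, and
* expansion by plain concatenation are assumptions, not the prototype's
* code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; not taken from the prototype.
final class PrefixTableSketch {
    private final Map<String, String> prefixes = new HashMap<>();

    // Records a Prefix declaration. The IRI arrives as a STRING token,
    // i.e. with its double quotes, so no IRI parsing is needed.
    void declare(String name, String quotedIri) {
        boolean quoted = quotedIri.length() >= 2
            && quotedIri.startsWith("\"")
            && quotedIri.endsWith("\"");
        if (!quoted) {
            throw new IllegalArgumentException("Expected a double-quoted IRI: " + quotedIri);
        }
        prefixes.put(name, quotedIri.substring(1, quotedIri.length() - 1));
    }

    // Expands a CURIE of the form prefix:local using the recorded prefixes.
    String expand(String curie) {
        int colon = curie.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("Not a CURIE: " + curie);
        }
        String iri = prefixes.get(curie.substring(0, colon));
        if (iri == null) {
            throw new IllegalArgumentException("Undeclared prefix in: " + curie);
        }
        return iri + curie.substring(colon + 1);
    }
}
```

* After declare("ex", "\"http://example.org/\""), calling
* expand("ex:parent") returns http://example.org/parent.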
*
*
*
* The original EBNF is accessible in the specification of the BLD
* Rule Language. It is reproduced here for convenience:
*
****************************************************************************
*
* Document ::= IRIMETA? 'Document' '(' Base? Prefix* Import* Group? ')'
* Base ::= 'Base' '(' IRI ')'
* Prefix ::= 'Prefix' '(' Name IRI ')'
* Import ::= IRIMETA? 'Import' '(' IRICONST PROFILE? ')'
* Group ::= IRIMETA? 'Group' '(' (RULE | Group)* ')'
* RULE ::= (IRIMETA? 'Forall' Var+ '(' CLAUSE ')') | CLAUSE
* CLAUSE ::= Implies | ATOMIC
* Implies ::= IRIMETA? (ATOMIC | 'And' '(' ATOMIC* ')') ':-' FORMULA
* PROFILE ::= TERM
*
****************************************************************************
*
* The Jacc rules corresponding to this EBNF are given in BLR.grm.
*
*
* FORMULA ::= ATOMIC
* | IRIMETA? 'And' '(' FORMULA* ')'
* | IRIMETA? 'Or' '(' FORMULA* ')'
* | IRIMETA? 'Exists' Var+ '(' FORMULA ')'
* | IRIMETA? 'External' '(' Atom | Frame ')'
* ATOMIC ::= IRIMETA? (Atom | Equal | Member | Subclass | Frame)
* Atom ::= UNITERM
* UNITERM ::= Const '(' (TERM* | (Name '->' TERM)*) ')'
* Equal ::= TERM '=' TERM
* Member ::= TERM '#' TERM
* Subclass ::= TERM '##' TERM
* Frame ::= TERM '[' (TERM '->' TERM)* ']'
* TERM ::= IRIMETA? (Const | Var | Expr | 'External' '(' Expr ')')
* Expr ::= UNITERM
* Const ::= '"' UNICODESTRING '"^^' SYMSPACE | CONSTSHORT
* Name ::= UNICODESTRING
* Var ::= '?' UNICODESTRING
* SYMSPACE ::= ANGLEBRACKIRI | CURIE
*
* IRIMETA ::= '(*' IRICONST? (Frame | 'And' '(' Frame* ')')? '*)'
*
****************************************************************************
*
* The Jacc rules corresponding to this EBNF are given in BLC.grm.
*
*
* Essentially, the format of a Jacc grammar is that of a Yacc grammar. As
* in Yacc, Jacc rules may be annotated with semantic actions in the form of
* Java code involving the rule's RHS constituents (denoted by $1, $2, ...,
* $n - the so-called pseudo-variables, where the index n in $n refers to
* the position of the RHS constituent). Such actions appear between curly
* braces ('{' and '}') wherever a symbol may appear in a rule's RHS.
*
* Jacc also allows an additional form of annotation in the RHS of a rule
* to indicate the XML serialization pattern of the abstract syntax tree
* (AST) node corresponding to a derivation with this rule. This XML
* serialization meta-annotation comes between square brackets ('[' and
* ']') and is of the form described in a simple XML serialization
* annotation language.
*
* For example, the annotated rule:
*
*
*
* QUANTIF
* : 'Exists' Var_plus '(' CONDIT ')'
* [
* nsprefix : hrl
* localname : quantifier
* attributes : {kind="existential"}
* children : (2,4)
* ]
* ;
*
*
*
* means that an AST node for this rule will be serialized thus:
*
*
*
* <hrl:quantifier kind="existential">
* (XML serialization of Var_plus)
* (XML serialization of CONDIT)
* </hrl:quantifier>
*
*
*
* Rules without an XML serialization annotation follow a default behavior:
* the serialization of the node is the concatenation of the serializations
* of its RHS constituents, omitting punctuation tokens; i.e., empty nodes
* and literal tokens (namely, tokens that do not carry a value).
* (See the Jacc XML
* annotation manual for more details.)
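*
* The effect of an annotation and of the default behavior can be sketched
* in Java as follows. This is only an illustration under stated
* assumptions: the class XmlAnnotationSketch and its methods are
* hypothetical, RHS constituents are represented by their already
* serialized strings (an empty string for a punctuation token), children
* positions are taken to be 1-based as in the example above, and attribute
* values are not escaped.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not Jacc's serialization code.
final class XmlAnnotationSketch {
    // Annotated rule: wrap the selected RHS constituents in an element
    // built from nsprefix, localname, and attributes.
    static String serialize(String nsprefix, String localname,
                            Map<String, String> attributes,
                            int[] children, List<String> rhs) {
        StringBuilder sb = new StringBuilder();
        sb.append('<').append(nsprefix).append(':').append(localname);
        for (Map.Entry<String, String> a : attributes.entrySet()) {
            sb.append(' ').append(a.getKey())
              .append("=\"").append(a.getValue()).append('"');
        }
        sb.append('>');
        for (int position : children) {
            sb.append(rhs.get(position - 1)); // positions are 1-based
        }
        sb.append("</").append(nsprefix).append(':').append(localname).append('>');
        return sb.toString();
    }

    // Unannotated rule: concatenate the constituents' serializations;
    // punctuation tokens contribute an empty string and so disappear.
    static String serializeDefault(List<String> rhs) {
        StringBuilder sb = new StringBuilder();
        for (String constituent : rhs) {
            sb.append(constituent);
        }
        return sb.toString();
    }
}
```

* For the QUANTIF rule above, the RHS has five constituents and the
* annotation selects positions 2 and 4, so only the serializations of
* Var_plus and CONDIT end up inside the hrl:quantifier element.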
*
*
* For example, see the two test files examples/Test1.bld and
* examples/Test2.bld. Running the command examples/bld on them produces
* the XML trees shown in examples/Test1.xml and examples/Test2.xml.
*/