This returns this
RegExpChoice's
normal form according to the following rewrite rules.
These rules are categorized into three subgroups, depending on the
nature of this RegExpConcat's left() and
right() components, corresponding to the three following
cases:
- left() or right() is EMPTY;
- left() and right() forms are non-numeric
unary expressions forming a matching pair, or this is of the
form A|(B|C), where A and B form such a
(restricted) matching
pair;
- power/range rules (taking care of choices of numeric unary
expressions);
- commutation rules;
- the rest.
We simply refer to a rule in each group as (for lack of better
terms!):
- an empty-rule,
- a pair-rule,
- an power-rule,
- an swap-rule,
- an other-rule,
respectively. [In what follows, the "
qualifier-rule"
notation will be used systematically and exclusively to qualify
these groups.]
Empty-rules
The rules for a choice involving
EMPTY are:
-
[CH:E_]
()|X
→
X?
-
[CH:_E]
X|()
→
X?
********************************************************************
Match-rules
The rules for choosing among any two regular expressions forming a
"matching pair" are given
by the table below - but restricted to non-numeric unary
operators.
|
_|_
|
X
|
X?
|
X+
|
X\*
|
|
X
|
|
|
|
|
|
X?
|
|
|
|
|
|
X+
|
|
|
|
|
|
X\*
|
|
|
|
|
Each entry in this table is labeled
[CH:AB] where
A and
B are a pair of letters in the
set
{X O P S}. Each
letter is a mnemonic for the case it stands for:
X = "eXpression",
O = "Option",
P = "Plus",
S = "Star". To each entry in
the above table labeled
[CH:AB], there correspond
two rules, labeled
[CH:AB1] and
[CH:AB2]: the first one covers
the simple case and the second one the nested case. They are listed
again below for the sake of clarity.
The entries in red are never used
as they are systematically preempted earlier by the the
X/X case. Indeed, it is never
possible for matchingPairCode()
to return any of the values OO, PP, nor
SS. Therefore, the 6 corresponding rules [CH:OO1], [CH:OO2], [CH:PP1], [CH:PP2], [CH:SS1], and [CH:SS2], are redundant and could be
eliminated altogether. As a matter of fact, each one reduces to the
corresponding [CH:XX] rule. This can
be trivially verified by looking at the rules (idempotence of
choice). The redundant rules are given here for the sake of
completeness and for the justification of correctness of this
specification.
********************************************************************
-
[CH:XX1]
X|X
→
X
-
[CH:XX2]
X|(X|Y)
→
X|Y
-
[CH:XO1]
X|X?
→
X?
-
[CH:XO2]
X|(X?|Y)
→
X?|Y
-
[CH:XP1]
X|X+
→
X+
-
[CH:XP2]
X|(X+|Y)
→
X+|Y
-
[CH:XS1]
X|X\*
→
X\*
-
[CH:XS2]
X|(X\*|Y)
→
X\*|Y
-
[CH:OX1]
X?|X
→
X?
-
[CH:OX2]
X?|(X|Y)
→
X?|Y
- (redundant)
[CH:OO1]
X?|X?
→
X?
- (redundant)
[CH:OO2]
X?|(X?|Y)
→
X?|Y
-
[CH:OP1]
X?|X+
→
X\*
-
[CH:OP2]
X?|(X+|Y)
→
X\*|Y
-
[CH:OS1]
X?|X\*
→
X\*
-
[CH:OS2]
X?|(X\*|Y)
→
X\*|Y
-
[CH:PX1]
X+|X
→
X+
-
[CH:PX2]
X+|(X|Y)
→
X+|Y
-
[CH:PO1]
X+|X?
→
X\*
-
[CH:PO2]
X+|(X?|Y)
→
X\*|Y
- (redundant)
[CH:PP1]
X+|X+
→
X+
- (redundant)
[CH:PP2]
X+|(X+|Y)
→
X+|Y
-
[CH:PS1]
X+|X\*
→
X\*
-
[CH:PS2]
X+|(X\*|Y)
→
X\*|Y
-
[CH:SX1]
X\*|X
→
X\*
-
[CH:SX2]
X\*|(X|Y)
→
X\*|Y
-
[CH:SO1]
X\*|X?
→
X\*
-
[CH:SO2]
X\*|(X?|Y)
→
X\*|Y
-
[CH:SP1]
X\*|X+
→
X\*
-
[CH:SP2]
X\*|(X+|Y)
→
X\*|Y
- (redundant)
[CH:SS1]
X\*|X\*
→
X\*
- (redundant)
[CH:SS2]
X\*|(X\*|Y)
→
X\*|Y
********************************************************************
Power-rules
The cases of
N = "N-th
power" and
R = "power
Range" missing in the table above need special attention and are
given here.
-
[CH:XN^2]
X | X^2
→
X_1^2
-
[CH:XR0n]
X | X_0^n
→
X_0^n
-
[CH:XR1n]
X | X_1^n
→
X_1^n
-
[CH:XR2n]
X | X_2^n
→
X_1^n
-
[CH:NN_1]
X^n | X^n+1
→
X_n^n+1
-
[CH:NR_1]
X^n | X_n+1^q
→
X_n^q
-
[CH:NR_2]
X^n | X_p^q
→
X_p^q
(if p ≤ n ≤ q)
-
[CH:RN_1]
X_m^n | X^n+1
→
X_m^n+1
-
[CH:RR12]
X_m^n | X_p^q
→
X_m^Max(n,q)
(if p ≤ n+1)
Swap-rules
Although choice is a commutative operation on regular expressions,
we put it systematically in some predefined order based on the nature
of the operands. This is what the swap-rules do.
-
[CH:ba%1]
b | a
→
a | b
(if a < b)
-
[CH:ba%2]
b | (a | X)
→
a | (b | X)
(if a < b)
-
[CH:mn%1]
b^m | a^n
→
a^n | b^m
(if a < b)
-
[CH:mn%2]
b^m | (a^n | X)
→
a^n | (b^m | X)
(if a < b)
-
[CH:NX%1]
X^n | X
→
X | X^n
-
[CH:NX%2]
X^n | (X | Y)
→
X | (X^n | Y)
-
[CH:NN%1]
X^n | X^m
→
X^m | X^n
(if m < n)
-
[CH:NN%2]
X^n | (X^m | Y)
→
X^m | (X^n | Y)
(if m < n)
-
[CH:NR]
X^n | X_p^q
→
X_p^q | X^n
(if p ≤ q ≤ n)
-
[CH:RX]
X_m^n | X
→
X | X_m^n
-
[CH:RN]
X_p^q | X^m
→
X^m | X_p^q
(if m ≤ q)
-
[CH:RR2]
X_p^q | (X_m^n | Y)
→
X_m^n | (X_p^q | Y)
(if m < p)
-
[CH:ba%RR2]
b_m^n | (a_p^q | X)
→
a_p^q | (b_m^n | X)
(if a < b)
-
[CH:RR11]
X_p^q | X_m^n
→
X_m^n | X_p^q
(if m < p)
Other-rules
-
[CH:O_]
X? | Y
→
(X | Y)?
-
[CH:_O]
X | Y?
→
(X | Y)?
-
[CH:AS]
(X | Y) | Z
→
X | (Y | Z)
-
[CH:XSX]
X.Y\* | X
→
X.Y\*
-
[CH:SXX]
Y\*.X | X
→
Y\*.X
-
[CH:XPX]
X.Y+ | X
→
X.Y\*
-
[CH:PXX]
Y+.X | X
→
Y\*.X
-
[CH:XXS]
X | X.Y\*
→
X.Y\*
-
[CH:XSX]
X | Y\*.X
→
Y\*.X
-
[CH:XXP]
X | X.Y+
→
X.Y\*
-
[CH:XPX]
X | Y+.X
→
Y\*.X