

Regexp::Grammars
also offers full manual control over the distillation
process. If you use the reserved word MATCH as the alias for a subrule
call:
<MATCH=filename>
or a subpattern match:
<MATCH=( \w+ )>
or a code block:
<MATCH=(?{ 42 })>
then the current rule will treat the return value of that subrule, pattern, or code block as its complete result, and return that value instead of the usual result-hash it constructs. This is the case even if the result has other entries that would normally also be returned.
For example, in a rule like:
    <rule: term>
          <MATCH=literal>
        | <left_paren> <MATCH=expr> <right_paren>
The use of MATCH aliases causes the rule to return either whatever <literal> returns, or whatever <expr> returns (provided it's between left and right parentheses).
Note that, in this second case, even though <left_paren> and <right_paren> are captured to the result-hash, they are not returned, because the MATCH alias overrides the normal "return the result-hash" semantics and returns only what its associated subrule (i.e. <expr>) produces.
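As a minimal sketch of the behaviour just described, the following script wraps a reduced version of the term rule in a complete grammar. The token definitions (literal, left_paren, right_paren), the self-recursive use of term in place of expr, and the helper parse_term are assumptions made for illustration; they do not come from the original text. It assumes Regexp::Grammars is installed.

```perl
use v5.10;
use warnings;

# Hypothetical mini-grammar: a term is a number, optionally parenthesized.
my $term_grammar = do {
    use Regexp::Grammars;
    qr{
        <term>

        <rule: term>
              <MATCH=literal>
            | <left_paren> <MATCH=term> <right_paren>

        <token: literal>
            <MATCH=( \d+ )>

        <token: left_paren>
            \(

        <token: right_paren>
            \)
    }xms
};

# Hypothetical helper: return what the term rule produced, or undef.
sub parse_term {
    my ($input) = @_;
    return $input =~ $term_grammar ? $/{term} : undef;
}

for my $input ('42', '( 42 )') {
    my $result = parse_term($input);
    say "$input => ", (ref $result ? 'a result-hash' : $result);
}
```

If the aliases behave as described, both inputs should produce the plain string matched by the literal, with no trace of the parentheses in the result.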
The following example illustrates the use of the MATCH alias:
$ cat demo_calc.pl
#!/usr/local/lib/perl/5.10.1/bin/perl5.10.1
use v5.10;
use warnings;

my $calculator = do{
    use Regexp::Grammars;
    qr{
        <Answer>

        <rule: Answer>
            <X=Mult> <Op=([+-])> <Y=Answer>
          | <MATCH=Mult>

        <rule: Mult>
            <X=Pow> <Op=([*/%])> <Y=Mult>
          | <MATCH=Pow>

        <rule: Pow>
            <X=Term> <Op=(\^)> <Y=Pow>
          | <MATCH=Term>

        <rule: Term>
            <MATCH=Literal>
          | \( <MATCH=Answer> \)

        <token: Literal>
            <MATCH=( [+-]? \d++ (?: \. \d++ )?+ )>
    }xms
};

while (my $input = <>) {
    if ($input =~ $calculator) {
        use Data::Dumper 'Dumper';
        warn Dumper \%/;
    }
}
Here is a sample run:
$ ./demo_calc.pl
2+3*5
$VAR1 = {
          '' => '2+3*5',
          'Answer' => {
                        '' => '2+3*5',
                        'Op' => '+',
                        'X' => '2',
                        'Y' => {
                                 '' => '3*5',
                                 'Op' => '*',
                                 'X' => '3',
                                 'Y' => '5'
                               }
                      }
        };
4-5-2
$VAR1 = {
          '' => '4-5-2',
          'Answer' => {
                        '' => '4-5-2',
                        'Op' => '-',
                        'X' => '4',
                        'Y' => {
                                 '' => '5-2',
                                 'Op' => '-',
                                 'X' => '5',
                                 'Y' => '2'
                               }
                      }
        };
Note how the tree built for the expression 4-5-2 leans to the right: it
groups the input as 4-(5-2), which is the wrong hierarchy for a
left-associative operator such as subtraction. Fixing this requires a
left-associative formulation of the rules with the left recursion eliminated.
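To make the consequence of the right-leaning tree concrete, here is a small sketch in plain Perl (no grammar involved) that evaluates the operands of 4-5-2 with both groupings. The helper names subtract_left and subtract_right are hypothetical and exist only for this illustration.

```perl
use strict;
use warnings;
use List::Util 'reduce';

# Left-associative grouping: ((4 - 5) - 2), the conventional reading.
sub subtract_left {
    my @operands = @_;
    return reduce { $a - $b } @operands;
}

# Right-associative grouping: (4 - (5 - 2)), the shape of the tree
# that the right-recursive rules build.
sub subtract_right {
    my @operands = @_;
    # Fold from the right: start with the last operand and work backwards.
    return reduce { $b - $a } reverse @operands;
}

print "left-associative:  ", subtract_left(4, 5, 2),  "\n";   # -3
print "right-associative: ", subtract_right(4, 5, 2), "\n";   # 1
```

The two groupings disagree (-3 versus 1), which is why the shape of the tree matters as soon as the result is evaluated rather than merely printed.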
It's also possible to control what a rule returns from within a code block. Regexp::Grammars provides a set of reserved variables that give direct access to the result-hash.
The result-hash itself can be accessed as %MATCH within any code block
inside a rule. For example:
    <rule: sum>
        <X=product> \+ <Y=product>
            <MATCH=(?{ $MATCH{X} + $MATCH{Y} })>
Here, the rule matches a product (aliased 'X' in the result-hash), then a literal '+', then another product (aliased to 'Y' in the result-hash). The rule then executes the code block, which accesses the two saved values (as $MATCH{X} and $MATCH{Y}), adding them together. Because the block is itself aliased to MATCH, the sum produced by the block becomes the (only) result of the rule.
It is also possible to set the rule result from within a code block
(instead of aliasing it). The special override return value is
represented by the special variable $MATCH. So the previous example
could be rewritten:
    <rule: sum>
        <X=product> \+ <Y=product>
            (?{ $MATCH = $MATCH{X} + $MATCH{Y} })
Both forms are identical in effect. Any assignment to $MATCH overrides
the normal "return all subrule results" behaviour.
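The equivalence of the two forms can be checked with a short script along the following lines. This is only a sketch: the product token is reduced to a bare integer, and the names $sum_aliased, $sum_assigned and evaluate_sum are hypothetical, introduced for this illustration. It assumes Regexp::Grammars is installed.

```perl
use v5.10;
use warnings;

# Form 1: the code block itself is aliased to MATCH.
my $sum_aliased = do {
    use Regexp::Grammars;
    qr{
        <sum>

        <rule: sum>
            <X=product> \+ <Y=product>
                <MATCH=(?{ $MATCH{X} + $MATCH{Y} })>

        <token: product>
            <MATCH=( \d+ )>
    }xms
};

# Form 2: the code block assigns to $MATCH directly.
my $sum_assigned = do {
    use Regexp::Grammars;
    qr{
        <sum>

        <rule: sum>
            <X=product> \+ <Y=product>
                (?{ $MATCH = $MATCH{X} + $MATCH{Y} })

        <token: product>
            <MATCH=( \d+ )>
    }xms
};

# Hypothetical helper: run one grammar and return what <sum> produced.
sub evaluate_sum {
    my ($grammar, $input) = @_;
    return $input =~ $grammar ? $/{sum} : undef;
}

for my $grammar ($sum_aliased, $sum_assigned) {
    my $result = evaluate_sum($grammar, '2 + 3');
    say defined $result ? $result : 'no match';
}
```

Both grammars should report the same number for the same input, since in each case the sum replaces the result-hash as the value of the rule.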
Assigning to $MATCH directly is particularly handy if the result may
not always be distillable, for example:
    <rule: sum>
        <X=product> \+ <Y=product>
            (?{ if (!ref $MATCH{X} && !ref $MATCH{Y}) {
                    # Reduce to sum, if both terms are simple scalars...
                    $MATCH = $MATCH{X} + $MATCH{Y};
                }
                else {
                    # Return full syntax tree for non-simple case...
                    $MATCH{op} = '+';
                }
            })
Note that you can also partially override the subrule return
behaviour. Normally, the subrule returns the complete text it matched
under the empty key of its result-hash. That is, of course, $MATCH{""},
so you can override just that behaviour by directly assigning to that
entry.
For example, if you have a rule that matches key/value pairs from a configuration file, you might prefer that any trailing comments not be included in the matched text entry of the rule's result-hash. You could hide such comments like so:
    <rule: config_line>
        <key> : <value> <comment>?
            (?{
                # Edit trailing comments out of "matched text" entry...
                $MATCH{""} = "$MATCH{key} : $MATCH{value}";
            })
Some more examples of the uses of $MATCH:
    <rule: FuncDecl>
      # Keyword  Name               Keep return the name (as a string)...
        func     <Identifier> ;     (?{ $MATCH = $MATCH{'Identifier'} })

    <rule: NumList>
      # Numbers in square brackets...
        \[
            ( \d+ (?: , \d+)* )
        \]

      # Return only the numbers...
        (?{ $MATCH = $CAPTURE })

    <token: Cmd>
      # Match standard variants then standardize the keyword...
        (?: mv | move | rename )    (?{ $MATCH = 'mv'; })
$CAPTURE and $CONTEXT are both aliases for the built-in read-only $^N variable, which always contains the substring matched by the nearest preceding (...) capture. $^N still works perfectly well, but these are provided to improve the readability of code blocks and error messages respectively.
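Since $CAPTURE is described as an alias for $^N, the underlying mechanism can be tried in plain Perl without Regexp::Grammars. The sketch below mirrors the shape of the NumList rule; the function bracketed_numbers and the package variable $numbers are hypothetical names chosen for this illustration.

```perl
use strict;
use warnings;

# Package variable assigned from inside the regex code block.
our $numbers;

# Hypothetical helper: return the comma-separated numbers found between
# square brackets, or undef if the text contains no such list.
sub bracketed_numbers {
    my ($text) = @_;
    local $numbers;
    my $matched = $text =~ /
        \[
            ( \d+ (?: , \d+ )* )
        \]
        (?{ $numbers = $^N })   # $^N holds the most recently closed capture
    /x;
    return $matched ? $numbers : undef;
}

print bracketed_numbers('[10,20,30]'), "\n";   # 10,20,30
```

Inside a Regexp::Grammars rule the same value would be read as $CAPTURE, which is easier to recognize in a code block than $^N.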
The following code implements a calculator that performs the distillation inside the code blocks:
pl@nereida:~/Lregexpgrammars/demo$ cat demo_calc_inline.pl
use v5.10;
use warnings;

my $calculator = do{
    use Regexp::Grammars;
    qr{
        <Answer>

        <rule: Answer>
            <X=Mult> \+ <Y=Answer>
                (?{ $MATCH = $MATCH{X} + $MATCH{Y}; })
          | <X=Mult> - <Y=Answer>
                (?{ $MATCH = $MATCH{X} - $MATCH{Y}; })
          | <MATCH=Mult>

        <rule: Mult>
            <X=Pow> \* <Y=Mult>
                (?{ $MATCH = $MATCH{X} * $MATCH{Y}; })
          | <X=Pow> / <Y=Mult>
                (?{ $MATCH = $MATCH{X} / $MATCH{Y}; })
          | <X=Pow> % <Y=Mult>
                (?{ $MATCH = $MATCH{X} % $MATCH{Y}; })
          | <MATCH=Pow>

        <rule: Pow>
            <X=Term> \^ <Y=Pow>
                (?{ $MATCH = $MATCH{X} ** $MATCH{Y}; })
          | <MATCH=Term>

        <rule: Term>
            <MATCH=Literal>
          | \( <MATCH=Answer> \)

        <token: Literal>
            <MATCH=( [+-]? \d++ (?: \. \d++ )?+ )>
    }xms
};

while (my $input = <>) {
    if ($input =~ $calculator) {
        say '--> ', $/{Answer};
    }
}
4-2-2
8/4/2
2^2^3

