Faster regular expression pattern matching in the grammar compiler.

The sequence operator (x+y) was implemented by splitting the string to be matched at all positions and trying to match the parts against the two subpatterns. To reduce the number of splits, we now estimate the minimum and maximum length of the string that the subpatterns could match. For common cases, where one of the subpatterns is a string of known length, like in (x+"y") or (x + ("a"|"o"|"u"|"e")+"y"), only one split will be tried.
2013-02-27 20:59:43 +00:00
parent 95c4cbb8f5
commit 0feb386691
5 changed files with 50 additions and 10 deletions
@@ -238,6 +238,7 @@ ppCase q (p,e) = ppPatt q 0 p <+> text "=>" <+> ppTerm q 0 e

 ppPatt q d (PAlt p1 p2) = prec d 0 (ppPatt q 0 p1 <+> char '|' <+> ppPatt q 1 p2)
 ppPatt q d (PSeq p1 p2) = prec d 0 (ppPatt q 0 p1 <+> char '+' <+> ppPatt q 1 p2)
+ppPatt q d (PMSeq (_,p1) (_,p2)) = prec d 0 (ppPatt q 0 p1 <+> char '+' <+> ppPatt q 1 p2)
 ppPatt q d (PC f ps)    = if null ps
                            then ppIdent f
                            else prec d 1 (ppIdent f <+> hsep (map (ppPatt q 3) ps))