Staging Regular Expressions with Moore Cayley Fusion
Regular expressions are a tool for recognising regular languages, historically implemented using derivatives or non-deterministic finite automata. They are convenient for many light-weight parsing workloads, but their traditional formulation only lends them to matching text, not returning fully-structured results. This contrasts with other forms of parsing, where the aim is to extract meaningful data, for example abstract syntax trees. Yet, most regular expression libraries do not support this useful output, and those that do are often slower, and backed by parser combinator libraries.
This paper redesigns regular expressions to make use of Moore machines as the underlying machinery; this way the regular expression matcher can produce results. We further show how to produce heterogeneous results, providing a classical applicative interface. To make this representation performant, we make use of the relationship between Cayley representations, continuation-passing style, and staged meta-programming to generate performant code for regular expression matching with fully-structured results.
Benchmarks demonstrate that our approach out-performs commonly used regular expression libraries in Haskell, as well as the \texttt{parsley} library, which is the state of the art for performance of parser combinator libraries in Haskell.
Thu 15 SepDisplayed time zone: Belgrade, Bratislava, Budapest, Ljubljana, Prague change
14:00 - 15:30 | |||
14:00 30mTalk | A Totally Predictable Outcome: An Investigation of Traversals of Infinite Structures Haskell Gershom Bazerman Arista Networks | ||
14:30 30mTalk | Open Transactional Actions: Interacting with non-transactional resources in STM Haskell Haskell Jonathas Augusto de Oliveira Conceição Universidade Federal de Pelotas, André Rauber Du Bois Universidade Federal de Pelotas, Gerson Cavalheiro Universidade Federal de Pelotas, Samuel Feitosa Universidade Federal da Fronteira Sul, Rodrigo G. Ribeiro Federal University of Ouro Preto | ||
15:00 30mTalk | Staging Regular Expressions with Moore Cayley Fusion Haskell |