Parser Combinators Beat Regexes

https://news.ycombinator.com/rss Hits: 13
Summary

The next part of the Advent of Code puzzle involves interpreting instructions called do() and don't() which turn on and off the contributions of mul instructions to the sum. As we parse, we now need to keep track of one bit of state. This is a nightmare for regexes to deal with, because they recognise regular languages, and regular languages are literally stateless languages.2 There’s more nuance, of course, but as a first approximation. Technically regular languages are those that can be recognised by a finite state automaton, and if there are a finite number of states (as there are in this case) then all of them can be encoded in an fsa but let’s not get pedantic here. But with the parser-based solution, we can lift it into a state transformer, and we get a stateful parser.3 Note that in a serious application, we might have lexing and parsing as separate steps, but parser combinators give us the freedom to combine both steps for tiny parsers like this. In[3]: module Main where import Control.Applicative (asum, (<|>)) import qualified Control.Monad.State.Class as State import qualified Control.Monad.State.Strict as State import qualified Data.Attoparsec.ByteString.Char8 as Parser import Data.Bool (bool) import Data.ByteString (ByteString) import qualified Data.ByteString as ByteString import qualified Data.ByteString.Char8 as Char8 import qualified Data.Either as Either test_input :: ByteString test_input = Char8.pack "xmul(2,4)&mul[3,7]!^don't()_mul(5,5)+mul(32,64](mul(11,8)undo()?mul(8,5))" parser_matches :: ByteString -> Int parser_matches input = let -- Upon successfully parsing a do instruction, -- enable contribution of mul instructions. enable = do State.lift (Parser.string (Char8.pack "do()")) State.put True -- Upon successfully parsing a don't instruction, -- disable contribution of mul instructions. disable = do State.lift (Parser.string (Char8.pack "don't()")) State.put False -- Parse a mul instruction just as before, except -- now lifted into a stateful...

First seen: 2025-04-09 23:41

Last seen: 2025-04-10 11:43