If you’re making a compiler, you probably need to do some parsing and emitting.
‘Parsing’ is a fancy way of saying ‘reading code’.
‘Emitting’ is a fancy way of saying ‘writing code’.
When you parse, you often want to emit something as well. Maybe you want to emit your target language. Or an intermediate representation (eg: an abstract syntax tree). Or maybe you just want to emit whether the code is valid or not.
When you parse something, what information do you want back?
Maybe you just want to know if the code parses successfully or not. Maybe you want to know how far it can parse successfully before failing. Maybe you want a helpful error message when it fails. Maybe you want to see how the parser broke down your code into smaller pieces.
For this reason, I like to return ALL the information I possibly can.
Given to you in a result object.
Was the parse successful or not? A simple yes/no question.
A copy of the source code that you tried to parse. For your reference purposes only.
The snippet of code that you successfully parsed. It could be a small part of it, or your entire source, or nothing.
If the parse failed, it’s the longest possible snippet of code that is valid, which is highly subjective.
The remainding code, after you take off the snippet.
An object with some helpful information. It always includes a message. It might include some more information on a case-by-case basis.
Whether a failure should interrupt the parse or not. Yes or no.
Nested arrays containing results for everything that was parsed as part of this parse.
Your emitted code. What your snippet got transformed into.
The term that the compiler attempted to parse.
Optionally, some extra information. Anything you want.
Here’s the twist. Maybe.
The code you parse… doesn’t have to be text. It could be an array, a tree, or anything you want. And you can emit anything you want too.
No matter what, these are the concepts I reach for whenever I want to parse something. Whether that’s some text, a list of tokens, or a tree.
Back to the wikiblogarden.