Enhancements
This guide follows a guide for NFA Construction. If this is your first article here it might help to start at the Compiler Construction Guide.
It will come as no surprise that the NFA Construction method that we have used up to this point is not the most efficient. The idea is to have a solid starting ground to build the concepts and then improve on them. Whether that means that you will start over with a new paradigm such as an object-oriented approach or a purely functional approach, either way, at least, this will serve as a fundamental start to creating a great NFA Lexical Constructor.
Here are a couple of ideas for improvements regarding our current implementation.
Memoize Your Epsilon Closure Function
The epsilon closure function can often take quite a bit of memory, and constructing a 2d array as the transition table will help a lot. But Looking for a manner to memoize this function can often cause a significant change concerning setup time for the lexer.
Improve The Underlying Regular Expressions
The following enhancement you can add to this construction is some manner of construction multi-or regular expressions. Since we branch off when constructing or statements, it might be more efficient to develop a design that expects multiple logical or arguments. Instead of creating a massive tree of epsilon transitions and tasking our epsilon closure with the additive work of consolidating them, we may initially start with one initial state pointing to the list of symbol states.
Use Enhanced Regular Expressions
This enhancement is purely for ease of implementation. Currently, we have no way to construct complex regular expressions other than our partial solution, where we created two functions to construct larger regular expressions out of strings. The idea here is the same, but to create regular expressions such as the plus operator, all except, etc.
Change The States of NFA Concatenation
Currently, we construct the concatenation NFA by building two new states. We could improve this by implementing a construction that uses the first machine’s initial state as the new initial state. Then it would change the accepting states of the first machine to be the initial states of the second machine.
The idea here is that we will save time constructing the NFA with fewer shared states and save time computing unnecessary epsilon transitions.


Continue to Read
If you know anything about compiler construction, it’s safe to assume you do a lot of reading. The topic is heavy on research since most of the subject was discovered in the early years of computing. Continuing to read about theory often gives us a better insight into improvements that we may have overlooked in the first read-over. Here is a couple of resources I value in the topic. If you have any books that have benefited you on the subject, don’t be afraid to share in the comment section!
- Introduction to Languages and the Theory of Computation, By John Martin
- Compilers: Principles, Techniques, and Tools, By Alfred Aho, Monica Lam, Ravi Sethi, and Jeffrey Ullman
Books
Listed below are a few of my favorite books that taught me a fair amount of compiler construction. If you have books that benefited you on this topic, please share them in the comments below and expand this list!
