Enhancements
his article follows a guide for Compiler Construction. If this is your first article here, it might help to start at the Compiler Construction Guide.
Once again we have not built the best lexical analyzer generator that was possible. This is because it is much more important to understand why things happen at first rather than blindly building something. With that being said here are a couple of ideas to improve your derivative-based lexical analyzers.
Finding Lexeme
Currently, we test strings to match regular expressions. It would be better if we took a state-based approach and found each state for each regular expression. With the current implementation, we have to redo work that is otherwise unnecessary.
Simplification
Simplification can be improved if it were joined with construction. The idea is that when taking calling nullable or the derivative and checking then could make space complexity less of an issue and depending on the implementation could also be done faster.
Improving the Underlying Regular Expressions
Once again we see this come up. We have not optimized the abilities of the logical or statements, and this time concatenation and logical and both also could be improved on.
This is not as important of an upgrade for time complexity as it was in the NFA Construction. This time this is a space problem. With fewer branches, we save drastically more room
Use Enhanced Regular Expressions
This enhancement is purely for ease of implementation. Currently, we have no way to construct complex regular expressions other than our partial solution, where we created two functions to construct larger regular expressions out of strings. The idea here is the same, but to create regular expressions such as the plus operator, all except, etc.
Looking into Research Papers
At this point of your study in lexical analysis, the primary way to learn is by finding research papers that have investigated possible improvements. The field is pretty well established so there is surely enough information out there. It can just be a challenge finding it at times. If you have anything that you have found for improvements make sure to share it in the comments below for others!
Books
Listed below are a few of my favorite books that taught me a fair amount of compiler construction. If you have books that benefited you on this topic, please share them in the comments below and expand this list!
- Introduction to Languages and the Theory of Computation, By John Martin
- Regular Algebra and Finite Machines, By John Conway
- Compilers: Principles, Techniques, and Tools, By Alfred Aho, Monica Lam, Ravi Sethi, and Jeffrey Ullman
- Introduction to the Theory of Computation, By Michael Sipser
