JFlex was recommended in the language article, and all the example projects I’ve found so far use it to build the lexer, so the first thing I did was install the JFlex plugin for IDEA. It seemed to work mostly fine for the examples. Then I tried out some simple things like finding keywords and whitespace and marking the rest as “bad chars”. The lexer doesn’t need to be complete yet, just good enough that I can learn how to build the syntax highlighter later on. I played around with it a bit but felt I needed some way to see the output, so I built a wrapper for the Lexer class that just logs every token it emits. After that I experimented with the JFlex code to see how to do different things. I already knew regular expressions, so that part wasn’t so hard. The tricky part was the “states”, but after a while I think I understand those too.
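To give an idea of what I mean by a logging wrapper, here is a minimal sketch. It is not the real plugin code: `SimpleLexer` is a made-up, stripped-down stand-in for the actual lexer API, and the whitespace-splitting toy lexer exists only so the example runs on its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the real lexer API (an assumption for this sketch):
// returns the next token, or null once the input is exhausted.
interface SimpleLexer {
    String nextToken();
}

// Wraps any SimpleLexer and records every token it hands out.
final class LoggingLexer implements SimpleLexer {
    private final SimpleLexer delegate;
    private final List<String> log = new ArrayList<>();

    LoggingLexer(SimpleLexer delegate) {
        this.delegate = delegate;
    }

    @Override
    public String nextToken() {
        String token = delegate.nextToken();
        if (token != null) {
            log.add(token);
            System.out.println("token: " + token);
        }
        return token;
    }

    List<String> loggedTokens() {
        return log;
    }
}

final class LoggingLexerDemo {
    // Toy lexer that simply splits the input on whitespace.
    static SimpleLexer whitespaceLexer(String input) {
        Iterator<String> parts = Arrays.asList(input.trim().split("\\s+")).iterator();
        return () -> parts.hasNext() ? parts.next() : null;
    }

    public static void main(String[] args) {
        LoggingLexer lexer = new LoggingLexer(whitespaceLexer("if x then y"));
        while (lexer.nextToken() != null) {
            // Drain the lexer; the wrapper logs each token as a side effect.
        }
        System.out.println(lexer.loggedTokens());
    }
}
```

The nice thing about wrapping rather than editing the generated lexer is that the logging can be dropped later without touching the JFlex file.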
The biggest issue I have with the lexer right now is knowing how “smart” to make it. Using states you can make it pretty smart, but I suspect a lot of that would be better placed in the parser. I’ll give you an example:
13s could, for instance, be lexed as [integerLiteral(13)][durationSuffix(s)], as [durationLiteral(13s)], or as [secondLiteral(13)], and probably in lots of other ways too. So how do I choose which way to do it? I think I will just have to work a bit with the systems that depend on the data, and it will become clearer.
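To make the trade-off concrete, here is a rough sketch of the first two options in plain Java. It uses `java.util.regex` instead of JFlex so it can run on its own, and the class and method names (`DurationLexingDemo`, `lexFine`, `lexCoarse`) are invented for illustration. Tokens are just strings here; a real lexer would return token types.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DurationLexingDemo {
    // Digits followed by a lowercase suffix, e.g. "13s".
    private static final Pattern DURATION = Pattern.compile("(\\d+)([a-z]+)");

    // "Dumb" lexer: emit two small tokens and let the parser combine them.
    static List<String> lexFine(String input) {
        Matcher m = DURATION.matcher(input);
        if (!m.matches()) {
            throw new IllegalArgumentException("not a duration: " + input);
        }
        List<String> tokens = new ArrayList<>();
        tokens.add("integerLiteral(" + m.group(1) + ")");
        tokens.add("durationSuffix(" + m.group(2) + ")");
        return tokens;
    }

    // "Smart" lexer: emit one token that already means "a duration".
    static List<String> lexCoarse(String input) {
        if (!DURATION.matcher(input).matches()) {
            throw new IllegalArgumentException("not a duration: " + input);
        }
        List<String> tokens = new ArrayList<>();
        tokens.add("durationLiteral(" + input + ")");
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(lexFine("13s"));
        System.out.println(lexCoarse("13s"));
    }
}
```

With the fine-grained version the parser has to know that an integer followed directly by a suffix is a duration; with the coarse version the lexer carries that knowledge and the parser stays simpler.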
Another minor issue I had was with the example code. The examples all used the “interface for constants” pattern. I’ve decided against that and will instead use the “class for constants” pattern together with static imports. Personal preference, I guess.
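Roughly what I have in mind for the constants class, as a sketch only. The names (`TokenNames`, `classify`, the package `com.example.lexer`) are made up for this example, and the constants are plain strings rather than real token types.

```java
// Constants holder: final and non-instantiable, so nothing can "implement" it
// the way classes implement a constants interface.
final class TokenNames {
    static final String KEYWORD = "KEYWORD";
    static final String WHITE_SPACE = "WHITE_SPACE";
    static final String BAD_CHARACTER = "BAD_CHARACTER";

    private TokenNames() {
        // No instances; this class only holds constants.
    }
}

final class TokenNamesDemo {
    // If TokenNames lived in a named package (say, the hypothetical com.example.lexer),
    // `import static com.example.lexer.TokenNames.*;` would let this code write
    // KEYWORD without the class prefix. Qualified names are used here only because
    // this sketch is a single file without a package.
    static String classify(String text) {
        if (text.trim().isEmpty()) {
            return TokenNames.WHITE_SPACE;
        }
        if (text.equals("if") || text.equals("else")) {
            return TokenNames.KEYWORD;
        }
        return TokenNames.BAD_CHARACTER;
    }

    public static void main(String[] args) {
        System.out.println(classify("if"));
        System.out.println(classify("   "));
        System.out.println(classify("13s"));
    }
}
```

Compared with the interface version, the constants don’t leak into the public API of every class that uses them.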
Also, having looked through the plugin code, it doesn’t seem to use a lot of dependency injection. I think I might try to integrate Guice into this, which would make things easier to test in isolation. But I’m not sure it’s possible; I will have to look it up.