Class SimpleRegexLexer

  • All Implemented Interfaces:
    Lexer

    public class SimpleRegexLexer
    extends java.lang.Object
    implements Lexer
    This is a "dynamic" Lexer that uses regex patterns to parse any document. It is NOT as fast as the JFlex-generated lexers: the current implementation is about 20x slower than a JFlex lexer (5000 lines in 100ms, vs 5ms for a JFlex lexer). It is still usable for a few hundred lines; 500 lines parse in about 10ms. Performance also depends on how complex the regexes are and how many of them actually produce a match.
    Since the KEYWORD TokenType is ordered before IDENTIFIER, the KEYWORD token takes precedence even when the same text also matches as an IDENTIFIER. This is a neat side effect of the ordering of the TokenTypes: the lexer then only needs to add the non-overlapping matches. And since longer matches are found first, longer identifiers that start with a keyword are properly matched as identifiers. This behaviour can easily be modified by overriding the compareTo method.
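    The precedence rules described above can be sketched in a few lines. This is a minimal illustration only, not the library's source: the Kind enum, the Tok class, the ORDER comparator and the sample patterns are hypothetical stand-ins for TokenType, Token and compareTo. It assumes candidates are ordered by start position, then longer match first, then enum order, and that overlapping candidates are dropped.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexPrecedenceSketch {
    // Hypothetical stand-in for TokenType; declaration order defines precedence.
    enum Kind { KEYWORD, IDENTIFIER, NUMBER }

    // Hypothetical stand-in for Token.
    static final class Tok {
        final Kind kind;
        final int start;
        final int length;

        Tok(Kind kind, int start, int length) {
            this.kind = kind;
            this.start = start;
            this.length = length;
        }

        int end() { return start + length; }
    }

    // Assumed ordering: earlier start, then longer match, then lower enum ordinal.
    static final Comparator<Tok> ORDER = (a, b) -> {
        if (a.start != b.start) return Integer.compare(a.start, b.start);
        if (a.length != b.length) return Integer.compare(b.length, a.length);
        return a.kind.compareTo(b.kind);
    };

    // Sample patterns, chosen for illustration only.
    static Map<Kind, Pattern> samplePatterns() {
        Map<Kind, Pattern> p = new EnumMap<>(Kind.class);
        p.put(Kind.KEYWORD, Pattern.compile("if|else"));
        p.put(Kind.IDENTIFIER, Pattern.compile("[A-Za-z_]\\w*"));
        p.put(Kind.NUMBER, Pattern.compile("\\d+"));
        return p;
    }

    static List<Tok> lex(CharSequence text, Map<Kind, Pattern> patterns) {
        // Collect every match of every pattern, sorted by ORDER.
        TreeSet<Tok> candidates = new TreeSet<>(ORDER);
        for (Map.Entry<Kind, Pattern> e : patterns.entrySet()) {
            Matcher m = e.getValue().matcher(text);
            while (m.find()) {
                if (m.end() > m.start()) {
                    candidates.add(new Tok(e.getKey(), m.start(), m.end() - m.start()));
                }
            }
        }
        // Keep only candidates that do not overlap an already accepted token.
        List<Tok> tokens = new ArrayList<>();
        int lastEnd = 0;
        for (Tok t : candidates) {
            if (t.start >= lastEnd) {
                tokens.add(t);
                lastEnd = t.end();
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (Tok t : lex("if iffy 42", samplePatterns())) {
            System.out.println(t.kind + " start=" + t.start + " length=" + t.length);
        }
    }
}
```

    For the input "if iffy 42", both KEYWORD and IDENTIFIER match "if" at position 0 with the same length, so KEYWORD wins on enum order. At position 3 the IDENTIFIER match "iffy" is longer than the KEYWORD match "if", so the identifier is kept and the overlapping keyword candidate is dropped.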
    • Constructor Detail

      • SimpleRegexLexer

        public SimpleRegexLexer(java.util.Map props)
      • SimpleRegexLexer

        public SimpleRegexLexer(java.lang.String propsLocation)
                         throws java.io.IOException
        Throws:
        java.io.IOException
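    The page does not document the layout of props. As a hedged sketch only, the code below assumes that each key names a token type and each value is the regex for that type, and that a properties location is read with java.util.Properties. The Kind enum, compile and load are hypothetical names for illustration, not part of SimpleRegexLexer.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.EnumMap;
import java.util.Map;
import java.util.Properties;
import java.util.regex.Pattern;

class PropsToPatternsSketch {
    // Hypothetical stand-in for TokenType.
    enum Kind { KEYWORD, IDENTIFIER, NUMBER }

    // Assumption: key = token type name, value = regex source.
    static Map<Kind, Pattern> compile(Map<?, ?> props) {
        Map<Kind, Pattern> patterns = new EnumMap<>(Kind.class);
        for (Map.Entry<?, ?> e : props.entrySet()) {
            Kind kind = Kind.valueOf(e.getKey().toString().trim());
            patterns.put(kind, Pattern.compile(e.getValue().toString()));
        }
        return patterns;
    }

    // Reads properties-format text; like the second constructor, this can fail with IOException.
    static Map<Kind, Pattern> load(Reader source) throws IOException {
        Properties props = new Properties();
        props.load(source);
        return compile(props);
    }

    public static void main(String[] args) throws IOException {
        // In a properties file a backslash is an escape character, so the file
        // text must contain "\\d+" to produce the regex "\d+".
        String fileText = "KEYWORD=if|else\nNUMBER=\\\\d+\n";
        Map<Kind, Pattern> patterns = load(new StringReader(fileText));
        System.out.println(patterns);
    }
}
```

    Since java.util.Properties implements java.util.Map, the same compile step would serve both a Map supplied directly and one loaded from a location.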
    • Method Detail

      • parse

        public void parse(javax.swing.text.Segment segment,
                          int ofst,
                          java.util.List<Token> tokens)
        Description copied from interface: Lexer
        This is the only method a Lexer needs to implement. It will be passed a Segment of text, and it should add a non-overlapping Token to the supplied list for each token it recognizes in that text.
        Specified by:
        parse in interface Lexer
        Parameters:
        segment - Text to parse.
        ofst - offset to add to start of each token (useful for nesting)
        tokens - List to which the recognized Tokens are added. The caller creates the list, so it can choose the appropriate List implementation and initial size; the parse method only adds to it.
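
        The role of ofst and of the caller-supplied list can be sketched as follows. This is an illustration under assumptions, not the library's code: Tok is a hypothetical stand-in for Token, and the single WORD pattern stands in for whatever patterns a real lexer uses. It relies only on javax.swing.text.Segment being a CharSequence, so a Matcher can run over it directly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.swing.text.Segment;

class ParseOffsetSketch {
    // Hypothetical stand-in for Token.
    static final class Tok {
        final int start;
        final int length;

        Tok(int start, int length) {
            this.start = start;
            this.length = length;
        }
    }

    static final Pattern WORD = Pattern.compile("\\w+");

    // Same shape as Lexer.parse: appends to the caller's list and returns nothing.
    static void parse(Segment segment, int ofst, List<Tok> tokens) {
        Matcher m = WORD.matcher(segment);
        while (m.find()) {
            // Match positions are relative to the segment; ofst shifts them
            // into the coordinates of the enclosing document.
            tokens.add(new Tok(m.start() + ofst, m.end() - m.start()));
        }
    }

    public static void main(String[] args) {
        char[] doc = "ab cd ef".toCharArray();
        // Parse only the nested region "cd ef", which begins at document index 3.
        Segment nested = new Segment(doc, 3, 5);
        List<Tok> tokens = new ArrayList<>(4); // the caller picks implementation and size
        parse(nested, 3, tokens);
        for (Tok t : tokens) {
            System.out.println("start=" + t.start + " length=" + t.length);
        }
    }
}
```

        Passing the region's starting index as ofst makes the reported starts (3 and 6 here) line up with the full document rather than with the nested segment.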