2.21. Lexing

Felix provides a mechanism for constructing lexers. The reglex construction matches a prefix of the input string against a list of regexps. Of all possible matches, reglex chooses the longest. As with regmatch, if more than one regexp matches that longest prefix, the one written first is used.

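The expression for each regexp has access to three values of type iterator: lexeme_start, lexeme_end and buffer_end. In the example below, lexeme_start and lexeme_end delimit the matched lexeme, and the reglex expression as a whole yields an iterator paired with the value of the selected expression. The driver loop uses that iterator as the position from which to lex the next token.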
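The matching rule described above (longest prefix wins, ties go to the regexp written first) can be modelled outside Felix. The following Python sketch is only an illustration of that rule, not of how reglex is implemented; the names RULES, KEYWORD_RULES and lex_one are hypothetical, and integer indices stand in for Felix iterators.

```python
import re

# Rules in written order, mirroring the token kinds of the Felix example.
# These names are illustrative assumptions, not part of Felix.
RULES = [
    ("Number", re.compile(r"[0-9]+")),
    ("Identifier", re.compile(r"[A-Za-z_]+")),
    ("White", re.compile(r" +")),
]

def lex_one(text, start, rules=RULES):
    """Return (next_position, (kind, lexeme)) for the longest match at start.

    Each pattern here is simple enough that Python's greedy matching
    already yields that pattern's longest prefix match.
    """
    best = None
    for kind, pattern in rules:
        m = pattern.match(text, start)
        if m is None or m.end() == start:
            continue  # no match, or an empty match: not a usable lexeme
        # Strict '>' keeps the earlier rule when two matches tie on length.
        if best is None or m.end() > best[0]:
            best = (m.end(), (kind, m.group()))
    if best is None:
        raise ValueError("no rule matches at position %d" % start)
    return best

# A keyword rule written before a more general identifier rule.
KEYWORD_RULES = [
    ("Keyword", re.compile(r"if")),
    ("Identifier", re.compile(r"[a-z]+")),
]

print(lex_one("if", 0, KEYWORD_RULES))    # (2, ('Keyword', 'if'))
print(lex_one("iffy", 0, KEYWORD_RULES))  # (4, ('Identifier', 'iffy'))
```

On "if" both rules match two characters, so the rule written first is chosen; on "iffy" the identifier rule matches more characters and wins regardless of order.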

Start felix section to tut/tutorial/tut-1.21-0.flx[1 /1 ]
#line 1176 "./lpsrc/flx_tutorial.pak"
#import <flx.flxh>
open Lexer;

// Character classes and the regexps built from them.
regexp lower = ["abcdefghijklmnopqrstuvwxyz"];
regexp upper = ["ABCDEFGHIJKLMNOPQRSTUVWXYZ"];
regexp digit = ["0123456789"];
regexp alpha = lower | upper | "_";
regexp space = " ";
regexp white = space +;

// Lex one token beginning at 'start'. The result pairs an iterator
// with (token kind, lexeme text).
fun lexit(start:iterator, finish:iterator):
  iterator * (string * string)
=
{
  return
    reglex start to finish with
    | digit+ => "Number",
      string_between(lexeme_start,lexeme_end)

    | alpha+ =>  "Identifier",
      string_between(lexeme_start,lexeme_end)

    | white =>  "White",
      string_between(lexeme_start,lexeme_end)
    endmatch
  ;
}


var s = "A string 2 lex";
val first = start_iterator s;
val finish = end_iterator s;
var current = first;

// Lex repeatedly from the current position until the input is consumed.
while { current != finish }
{
    match lexit(current, finish) with
    | ?next,(?kind,?lexeme) =>
    {
      current = next;
      print kind; print ": "; print lexeme; endl;
    }
    endmatch
  ;
};
print "Done.\n";
End felix section to tut/tutorial/tut-1.21-0.flx[1]
Start data section to tut/tutorial/tut-1.21-0.expect[1 /1 ]
Identifier: A
White:
Identifier: string
White:
Number: 2
White:
Identifier: lex
Done.
End data section to tut/tutorial/tut-1.21-0.expect[1]


2.21.1. A (much) longer example