5. The Felix Compiler

Basic phase diagram:
  Lexer ->
  Token Filter ->
  Parser ->
  Syntax Macro Processor ->
  Desugaring/Lambda Lifting ->
  Unbound Symbol Table Construction ->
  Lookup/Binding/Use analysis ->
  Optimisation ->
  Code generation
The lexer generates tokens, which are passed to the token filter to strip out white spaces (since these make LR(1) parsing almost impossible by eating up lookaheads). Any token stream macro processing will be inserted here. The parser then builds the Abstract Syntax Treee from the filtered token stream.

The syntax macro processor is a basic term calculator for generating and reducing terms. The macro processing phase includes constant folding, with special handling for conditionals so that conditional compilation of syntactically correct phrases doesn't require any special syntax.

Lambda lifting replaces each anonymous function used in an expressions with a fresh name and defines the name in some context including the expression.

Desugaring includes lambda lifting and a few other rewrites. For example, the if/then/else/endif expression is replaced by an equivalent match. [We want to put this in the library eventually]

The unbound symbol table is a map from indexes to symbol definitions, and is a convolution of the hierarchical input structure of the AST. Each symbol has a unique index, a link to its parent context, and each context containing definitions has a map from names to indexes. However, the uses of symbols are unbound.

The binding phase performs lookups on the types, expressions, directives, and executable statements to replace each names with its index in the symbol table, resulting in the fully bound symbol table.

This phase is extremely complex, since the binding of all components must be done on the unbound symbol table. Whilst simple lookups present no problem, functions can be overloaded. This means the type of an application cannot be determined until the overload is resolved, which requires typing the argument of the function, which in turn depends on overloading.

In addition, the open directive adds considerable complication, since in Felix everything in the same scope is considered to be mutually recursive. This means the argument of the open directive requires lookup of the name in a context enriched by all the other open directives, but the enrichment requires lookup of the argument of those open directives.

Even more complicated: Felix has module expressions, and so the prefix part of a name doesn't have to be a module name .. it can be a module expression. Module expressions include functor applications, and functors can be overloaded. The argument type of a functor must be matched to the supplied module, which requires binding every element of the interface and and the signatures of the elements of the module.

The binding and lookup is driven by a flow analyser than is iniialised with the root procedure (the init routine of the top level module) plus any exported procedures or functions. By chasing down all calls, it discovers all instances types used in the program, and all instances of functions and procedures. An instances is a binding of a type or function to a list of types, which is the basic mechanism for generic programming.

The code generator constructs classes for each type, function and procedure, and generates descriptors for each entity which includes a list of offsets at which frame object pointers are located so the garbage collector can operate.

Finally, the code generator makes C wrappers for exported functions, and provides initialisation and termination functions to construct and destroy the instance frame objects (global, process, and thread frames at present).
5.1. Configuration loader
5.2. Version control
5.3. The front end
5.4. System dependent path handling
5.5. Felix specific Utilities
5.6. Utilities
5.7. AST
5.8. Types
5.9. Mappings
5.10. Routines to extract source reference from terms
5.11. Print module
5.12. Compile time exceptions
5.13. Charset
5.14. Elide unused entries
5.15. Unification
5.16. Parser
5.17. Parser object
5.18. Parser test harness
5.19. String handling
5.20. Internationalised Identifier support
5.21. Keywords
5.22. Lexer
5.23. Pre token filters
5.24. Pre token printer
5.25. Tokeniser
5.26. Lexer test harness
5.27. Pattern matching utilities
5.28. Constant folding
5.29. Macro Expansion
5.30. C format string
5.31. Cil to Felix
5.32. Desugaring
5.33. Build Symbol tables
5.34. Meta typing and Beta reduction
5.35. The type registry
5.36. Generic support
5.37. overload resolution
5.38. Name Lookup
5.39. Name Binding
5.40. Bind executable statements
5.41. Name Binding
5.42. Downgrade Abstract types
5.43. Typeclass checker
5.44. Axiom Check
5.45. label management
5.46. Control Flow
5.47. Optimisation stuff
5.48. Elide unused entries
5.49. Inline exes
5.50. Reparenting
5.51. Reductions
5.52. Fold vars
5.53. Inlining
5.54. Monomorphisation
5.55. Make stack calls
5.56. Elide unused entries
5.57. Thread frame pointer required detector
5.58. Elide unused entries
5.59. The back end
5.60. Type generator
5.61. Code fragment inliner
5.62. DFA
5.63. Lexer generator
5.64. Expression unraveller
5.65. Display calcs
5.66. GC shape object generator
5.67. C++ Code generator
5.68. Why interface
5.69. C++ Code generator
5.70. Get options
5.71. Stub mainline