The PDE Preprocessor is based on the Java Grammar that comes with
ANTLR 2.7.2. Moving it forward to a new version of the grammar
shouldn't be too difficult.

Here's some info about the various files in this directory:

java.g: this is the ANTLR grammar for Java 1.3/1.4 from the ANTLR
distribution. It is in the public domain. The only change to this
file from the original is the uncommenting of the clauses required
to support assert().

java.tree.g: this describes the Abstract Syntax Tree (AST) generated
by java.g. It is only here as a reference for coders hacking on the
preprocessor; it is not built or used at all. Note that pde.g
overrides some of the java.g rules, so PDE ASTs differ from this
description in a few minor ways. Also in the public domain.

pde.g: this is the grammar and lexer for the PDE language itself. It
subclasses the java.g grammar and lexer. There are a couple of
overrides to java.g that I hope to convince the ANTLR folks to fold
back into their grammar, but most of this file is highly specific to
PDE itself.

PdeEmitter.java: this class traverses the AST generated by the PDE
Recognizer and emits it as Java code, doing any necessary
transformations along the way. It is based on JavaEmitter.java,
available from antlr.org, written by Andy Tripp, who has given
permission for it to be distributed under the GPL.

ExtendedCommonASTWithHiddenTokens.java: this adds a necessary
initialize() method, as well as a number of methods to allow for XML
serialization of the parse tree in such a way that the hidden tokens
are visible. Much of the code is taken from the original
CommonASTWithHiddenTokens class. I hope to convince the ANTLR folks
to fold these changes back into that class so that this file will be
unnecessary.

TokenStreamCopyingHiddenTokenFilter.java: this class provides
TokenStreamHiddenTokenFilters with the concept of tokens which can be
copied so that they are seen by both the hidden token stream and the
parser itself.
This is useful when one wants to use an existing parser (like the
Java parser included with ANTLR) that throws away some tokens to
create a parse tree which can be used to spit out a copy of the code
with only minor modifications. Partially derived from ANTLR code. I
hope to convince the ANTLR folks to fold this functionality back into
ANTLR proper as well.

whitespace_test.pde: a torture test to ensure that the preprocessor
is correctly preserving whitespace, comments, and other hidden
tokens. See the comments in the code for details about how to run
the test.

All other files in this directory are generated at build time by
ANTLR itself. The ANTLR manual goes into a fair amount of detail
about what each type of file is for.

....

Current Preprocessor Substitutions:

"compiler.substitute_floats" (currently "substitute_f")
- treat doubles as floats, i.e. 12.3 becomes 12.3f, so that people
  don't have to add f after their numbers all the time. having to
  add the f is confusing for beginners.

"compiler.enhanced_casting"
- byte(), char(), int(), float() work for casting. this is basic in
  the current implementation, but should be expanded as described
  above. color() works similarly to int(); however, there is also a
  *function* called color(r, g, b) in p5. will this cause trouble?

"compiler.color_datattype"
- 'color' is aliased to 'int' as a datatype to represent ARGB packed
  into a single int, commonly used in p5 for pixels[] and other color
  operations. this is just a search/replace type thing, and it can be
  used interchangeably with int.

"compiler.web_colors" (currently "inline_web_colors")
- color c = #cc0080; should unpack to 0xffcc0080 (the ff at the top
  is so that the color is opaque), which is just an int.

Other preprocessor functionality

- detects what 'mode' the program is in: static (no function
  brackets at all, just assumes everything is in draw), active
  (setup plus draw or loop), and java mode (full java support).
  http://proce55ing.net/reference/environment/index.html

- size and background are pulled from draw mode programs and placed
  into setup(). this has a problem if size() is based on a variable,
  which we try to keep people from doing, but we would like to be
  able to support it (perhaps by requiring the size() arguments to be
  final?)

- currently does a godawful scrambling of the comments so that the
  substitution doesn't try to run on them. this also causes lots of
  bizarro bugs.

Possible?

- would be nice to just type code wherever, mixing a 'static' style
  app with a few functions. would be simpler for starting out. but it
  seems that the declarations would have to be pulled out, and that
  all seems problematic. or maybe it could all be inside a
  static { } block, but that wouldn't seem to work either.
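
The substitutions listed under "Current Preprocessor Substitutions"
can be illustrated with a minimal sketch. It assumes plain regular
expressions rather than the ANTLR grammar the preprocessor is built
on, and the names SubstitutionSketch, inlineWebColors,
substituteFloats and colorDatatype are hypothetical; none of them
exist in this directory. Like any search/replace approach, it would
also rewrite text inside comments and string literals.

```java
import java.util.regex.Pattern;

/** Illustration only: regex stand-ins for three of the substitutions. */
final class SubstitutionSketch {

    // '#' followed by exactly six hex digits, e.g. #cc0080.
    private static final Pattern WEB_COLOR =
        Pattern.compile("#([0-9A-Fa-f]{6})(?![0-9A-Fa-f])");

    // A decimal literal such as 12.3 with no suffix or exponent.
    // Assumption: literals like "1." or ".5" are left alone.
    private static final Pattern DOUBLE_LITERAL =
        Pattern.compile("(?<![\\w.])(\\d+\\.\\d+)(?![\\w.])");

    // The word 'color' when it is not followed by '(', so that calls
    // to the color(r, g, b) function are not touched.
    private static final Pattern COLOR_TYPE =
        Pattern.compile("\\bcolor\\b(?!\\s*\\()");

    /** #cc0080 becomes 0xffcc0080 (ff makes the color opaque). */
    static String inlineWebColors(String source) {
        return WEB_COLOR.matcher(source).replaceAll("0xff$1");
    }

    /** 12.3 becomes 12.3f. */
    static String substituteFloats(String source) {
        return DOUBLE_LITERAL.matcher(source).replaceAll("$1f");
    }

    /** 'color' used as a datatype becomes 'int'. */
    static String colorDatatype(String source) {
        return COLOR_TYPE.matcher(source).replaceAll("int");
    }

    public static void main(String[] args) {
        String pde = "color c = #cc0080; float x = 12.3;";
        String java = colorDatatype(substituteFloats(inlineWebColors(pde)));
        System.out.println(java);
    }
}
```

Running main on the sample line prints
int c = 0xffcc0080; float x = 12.3f;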
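
The mode detection described under "Other preprocessor functionality"
could be approximated as follows. This is a rough sketch under stated
assumptions, not the preprocessor's actual logic: it assumes that a
"public class" declaration marks java mode and that a void setup(),
draw() or loop() declaration marks active mode. ModeSketch, Mode and
detect are hypothetical names.

```java
import java.util.regex.Pattern;

/** Illustration only: guesses the program mode from its source text. */
final class ModeSketch {

    enum Mode { STATIC, ACTIVE, JAVA }

    // Assumption: java mode programs declare a public class.
    private static final Pattern PUBLIC_CLASS =
        Pattern.compile("\\bpublic\\s+class\\s+\\w+");

    // Assumption: active mode programs declare setup(), draw() or loop().
    private static final Pattern ACTIVE_FUNCTION =
        Pattern.compile("\\bvoid\\s+(?:setup|draw|loop)\\s*\\(");

    static Mode detect(String source) {
        if (PUBLIC_CLASS.matcher(source).find()) {
            return Mode.JAVA;
        }
        if (ACTIVE_FUNCTION.matcher(source).find()) {
            return Mode.ACTIVE;
        }
        // No function declarations found: treat everything as draw code.
        return Mode.STATIC;
    }

    public static void main(String[] args) {
        System.out.println(detect("size(200, 200);\nline(0, 0, 10, 10);"));
    }
}
```

Because it matches raw text, this sketch is fooled by the same things
as the substitutions: a comment containing "void setup(" would be
enough to report active mode.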