Class: LexDictionary | Zope-2.2.1-src/lib/python/Products/ZGadflyDA/gadfly/kjParser.py |
---|---|

Lexical dictionary class. This data structure is used by the lexical parser below.

Basic operations:

LD.punctuation(string) registers a string as a punctuation. EG: LD.punctuation(":"). Punctuations are treated as a special kind of keyword that is recognized even when not surrounded by whitespace. IE, "xend" will not be recognized as "x end", but "x;" will be recognized as "x ;" if "end" is a regular keyword but ";" is a punctuation. Only single-character punctuations are supported (for now), ie, ":=" must be recognized as ":" "=" above the lexical level.

LD.comment(compiled_reg_expression) registers a comment pattern. EG: LD.comment(regex.compile("--.*\n")) asks to recognize ansi/sql comments like "-- correct?\n".

LD.keyword(keyword_string, canonicalstring) specifies a keyword string that should map to the canonicalstring when translated to the lexical stream. EG: LD.keyword("begin","BEGIN"); LD.keyword("BEGIN","BEGIN") will recognize upcase or downcase begins, but not mixed case. (Automatic upcasing is allowed below at parser level.)

LD[compiled_reg_expression] = (TerminalFlag, Function) # assignment! Specifies a regular expression that should be associated with the lexical terminal marker TerminalFlag. EG: LD[regex.compile("[0-9]+")] = ("integer",string.atoi). The Function should be a function of one string argument that interprets the matching string as a value. If None is given, the string itself is used as the interpretation. (A better choice above would be a function which "tries" atoi first and uses atol on overflow.) NOTE: ambiguity among regular expressions will be decided arbitrarily (fix?).

LD[string] # retrieval! Returns ((KEYFLAG, Keywordstring), Keywordstring) if the (entire) string matches a keyword or a punctuation Keywordstring. Otherwise returns ((TERMFLAG, Terminalname), value) if the (entire) string matches the regular expression for a terminal flagged by Terminalname; value is the interpreted value. TerminalFlag had better be something other than KEYFLAG! Otherwise raises an error. Comments are not filtered here!

The following additional functions are used for autodocumentation in declaring rules, etcetera.

begin = LD.keyword("begin") sets variable "begin" to (KEYFLAG, "BEGIN") if "begin" maps to keyword "BEGIN" in LD.

integer = LD.terminal("integer") sets variable integer to ("integer", Function) if "integer" is a registered terminal; Function is its associated interpretation function.
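
The operations described above can be sketched in modern Python as follows. This is a toy stand-in written only from the docstring, not the kjParser implementation: the class name MiniLexDictionary, the KEYFLAG and TERMFLAG values, the optional second argument to keyword, and the use of KeyError are all assumptions made for illustration. The standard `re` module replaces the old `regex` module, and `int` replaces `string.atoi`.

```python
import re

# Placeholder flag values; the real module defines its own (assumption).
KEYFLAG = 1
TERMFLAG = 2


class MiniLexDictionary:
    """Hypothetical sketch of the interface described in the docstring."""

    def __init__(self):
        self.keywords = {}    # keyword or punctuation -> canonical string
        self.comments = []    # compiled comment patterns (stored only)
        self.terminals = []   # (compiled pattern, terminal name, function)

    def punctuation(self, char):
        # Only single-character punctuations are supported.
        if len(char) != 1:
            raise ValueError("punctuation must be a single character")
        self.keywords[char] = char

    def comment(self, pattern):
        # Comments are recorded here but not filtered on lookup.
        self.comments.append(pattern)

    def keyword(self, string, canonical=None):
        # Two-argument form registers; one-argument form only looks up.
        if canonical is not None:
            self.keywords[string] = canonical
        return (KEYFLAG, self.keywords[string])

    def terminal(self, name):
        for _pattern, term_name, function in self.terminals:
            if term_name == name:
                return (name, function)
        raise KeyError(name)

    def __setitem__(self, pattern, flag_and_function):
        # LD[compiled_pattern] = (TerminalFlag, Function)
        name, function = flag_and_function
        self.terminals.append((pattern, name, function))

    def __getitem__(self, string):
        # Keywords and punctuations take priority over terminals.
        if string in self.keywords:
            canonical = self.keywords[string]
            return ((KEYFLAG, canonical), canonical)
        # The first matching terminal wins, so ambiguity between
        # patterns is resolved by registration order in this sketch.
        for pattern, name, function in self.terminals:
            if pattern.fullmatch(string):
                value = string if function is None else function(string)
                return ((TERMFLAG, name), value)
        raise KeyError(string)


ld = MiniLexDictionary()
ld.punctuation(":")
ld.comment(re.compile("--.*\n"))
ld.keyword("begin", "BEGIN")
ld.keyword("BEGIN", "BEGIN")
ld[re.compile("[0-9]+")] = ("integer", int)

print(ld["begin"])            # keyword lookup
print(ld["42"])               # terminal lookup with interpreted value
print(ld.terminal("integer")) # autodocumentation-style accessor
```

Because keywords are checked before terminals, a string that is both a registered keyword and a match for some terminal pattern comes back as a keyword. Mixed-case input such as "Begin" raises KeyError here, mirroring the note that only the registered spellings are recognized.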