The PHC package contains a program (maketea) to automatically create the Abstract Syntax Tree (AST) out of a definition given in a phc.tea file.
To the definition of a statement, I added the last four alternatives (xml_element, xml_element_attribute, escaped_print and xml_processing_instruction) to have AST nodes for the new statements:
statement ::= if | while | do | for | foreach | switch | break | continue | return | static_declaration | unset | declare | try | throw | eval_expr | xml_element | xml_element_attribute | escaped_print | xml_processing_instruction ;
Notice that these definitions only state what data has to be stored in the tree. They indicate node dependencies and repetitions, not the actual syntax of the statements, which is given in the parser. Thus, in the following:
xml_element ::= xml_element_name statement*;
xml_element_attribute ::= xml_element_name expr;
escaped_print ::= expr*;
xml_processing_instruction ::= xml_element_name statement*;
xml_element_name ::= ns_is_var:"$"? xml_namespace:TAG_NAME? is_var:"$"? TAG_NAME;
an xml_element is composed of an xml_element_name and a (possibly empty) list of statements. This shows which nodes or lists of nodes have to be available and how they relate to each other. Most of these nodes require an xml_element_name, which is composed of two parts: an optional namespace and the actual name. Either can be a variable or a literal, so for each one there is a flag (ns_is_var and is_var) that indicates the presence of a "$" sign, which is the PHP prefix for variables. Both names are of type TAG_NAME: the second keeps that as its name, while the first, though of the same type, is called xml_namespace. The namespace is optional (the ? symbol indicates that it can take a NULL value), while the name is not. The maketea utility assumes that uppercase identifiers are tokens, so a node of type TAG_NAME will be created.
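To make the shape of this data concrete, here is a minimal sketch in Python. It is not what maketea generates; it is only an illustration of the fields described above. The class names and the tag_name field name are my own assumptions, and Statement is a placeholder standing in for any statement node.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TagName:
    """Token node holding the text of a TAG_NAME (hypothetical class)."""
    value: str


@dataclass
class XmlElementName:
    """Mirrors: ns_is_var:"$"? xml_namespace:TAG_NAME? is_var:"$"? TAG_NAME"""
    tag_name: TagName                        # mandatory: no '?' in the grammar
    xml_namespace: Optional[TagName] = None  # '?' means it may be NULL
    ns_is_var: bool = False                  # True if the namespace had a "$"
    is_var: bool = False                     # True if the name had a "$"


@dataclass
class Statement:
    """Placeholder for any statement node (assumption for this sketch)."""


@dataclass
class XmlElement(Statement):
    """Mirrors: xml_element ::= xml_element_name statement*"""
    name: XmlElementName
    statements: List[Statement] = field(default_factory=list)  # may be empty


# An element whose name is a literal and whose namespace is absent,
# containing one nested element whose name comes from a variable.
inner = XmlElement(XmlElementName(TagName("item"), is_var=True))
outer = XmlElement(XmlElementName(TagName("list")), [inner])
```

Note that nothing here says how the element is written in the source code; as in the phc.tea definition, only the stored data and its nesting are described.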
In order to operate on these nodes in a more generic way, they can be added to the lists of generic node types. Those that are full statements and can have full comments on separate lines are added to commented_node, those that are only a part of a full statement are simply nodes, and a TAG_NAME is, of course, an identifier. This is done in the following section, where my additions are the last alternatives of each rule.
node ::= php_script | class_mod | signature | method_mod | formal_parameter | type | attr_mod | directive | list_element | variable_name | target | array_elem | method_name | actual_parameter | class_name | commented_node | expr | identifier | formal_parameter* | directive* | array_elem* | actual_parameter* | INTERFACE_NAME* | list_element* | expr* | xml_element_name ;
commented_node ::= member | statement | interface_def | class_def | switch_case | catch | interface_def* | class_def* | member* | statement* | switch_case* | catch* | xml_element | xml_element_attribute | escaped_print | xml_processing_instruction ;
identifier ::= INTERFACE_NAME | CLASS_NAME | METHOD_NAME | VARIABLE_NAME | DIRECTIVE_NAME | CAST | OP | CONSTANT_NAME | TAG_NAME ;
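The benefit of these groupings can be sketched as a small class hierarchy in Python. This is an illustration only, under the assumption that each generic type behaves like a base class. The class names, the comments attribute and the attach_comment helper are hypothetical and do not come from PHC.

```python
class Node:
    """Most generic type: everything in the tree is a node."""


class CommentedNode(Node):
    """Nodes that can carry full comments on separate lines."""

    def __init__(self):
        self.comments = []


class Identifier(Node):
    """Generic type for name tokens."""

    def __init__(self, value):
        self.value = value


class TagName(Identifier):
    """TAG_NAME, listed under the identifier rule."""


class XmlElementName(Node):
    """Only a part of a statement, so it is a plain node."""

    def __init__(self, tag_name):
        self.tag_name = tag_name


class EscapedPrint(CommentedNode):
    """A full statement, listed under commented_node."""

    def __init__(self, exprs=()):
        super().__init__()
        self.exprs = list(exprs)


def attach_comment(node, text):
    """Generic operation: accepts any commented node, rejects the rest."""
    if not isinstance(node, CommentedNode):
        raise TypeError(f"{type(node).__name__} cannot carry a comment")
    node.comments.append(text)


stmt = EscapedPrint()
attach_comment(stmt, "// greeting")  # fine: escaped_print is a commented_node
```

Code written once against the generic type then handles the new statements without knowing about them individually, while calling attach_comment on an XmlElementName raises a TypeError.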
To process the phc.tea file, Haskell, a functional programming language, is required. It is not installed by default in popular Linux distributions, though it is available as an extra package, and the build process might give you an error if it doesn't find it. After all, the public interface to PHC is via the plug-ins, which are meant to process standard PHP, not to add extended instructions to it.