Dear users of JavaCC,

This is the first time I'm using a compiler compiler, and I'm not familiar
with building compilers or interpreters, nor with abstract machines. I'm
trying to write the grammar for a very simple scripting language that has
only three constructs: variable output, variable assignment, and function
call.
$var // output contents of var
$var = "123" // assign string to var
$var = $var2 // copy value of $var2 into $var
$var = #func() // assign return value of function to var
#func( "123", $var ) // call a function with two parameters
The generated parser works as long as the input consists only of language
constructs (see the attached grammar file). Now comes the tricky part: the
language should be embeddable in *any* kind of text, especially HTML:
<html>
$var = "Hello world"
<h1>$var</h1>
</html>
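To make the requirement concrete, here is a rough sketch in plain Java (not
JavaCC) of the kind of splitting I have in mind. The class name
`EmbeddedScan`, the `TEXT:`/`CODE:` tags, and the regular expression are
assumptions made up for illustration only. It recognizes just the leading
`$name` or `#name` marker, not assignments or argument lists, so it shows the
desired behaviour rather than a solution.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EmbeddedScan {
    // Hypothetical pattern: "$" or "#" followed by an identifier,
    // mirroring VAR_IDENT and FUNC_IDENT in the attached grammar.
    private static final Pattern CONSTRUCT =
        Pattern.compile("[$#][A-Za-z_][A-Za-z_0-9]*");

    /** Splits the input into chunks tagged as plain TEXT or language CODE. */
    public static List<String> split(String input) {
        List<String> chunks = new ArrayList<>();
        Matcher m = CONSTRUCT.matcher(input);
        int last = 0;
        while (m.find()) {
            // Everything between two constructs is passed through as text.
            if (m.start() > last) {
                chunks.add("TEXT:" + input.substring(last, m.start()));
            }
            chunks.add("CODE:" + m.group());
            last = m.end();
        }
        // Trailing text after the last construct, if any.
        if (last < input.length()) {
            chunks.add("TEXT:" + input.substring(last));
        }
        return chunks;
    }

    public static void main(String[] args) {
        for (String chunk : split("<h1>$var</h1>")) {
            System.out.println(chunk);
        }
    }
}
```

For the line `<h1>$var</h1>` this yields a text chunk `<h1>`, a code chunk
`$var`, and a text chunk `</h1>`. What I cannot figure out is how to get the
same effect from the JavaCC token definitions.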
I have been struggling for days to implement the "allow non-language
tokens" logic. Every time I implement it, it seems to take precedence over
all other constructs. I'm probably not thinking the "JavaCC way" and am
thus working against how things are done in JavaCC.

I hope you can help me! Thanks!
--
(\ (\ Sincerely
( ^_~)
(_(")(") Sven Jacobs
Attachment: grammar.jj
PARSER_BEGIN(Parser)
package de.svenjacobs.tpl2html.parser;

public class Parser {
    public static void main(String args[]) throws ParseException {
        Parser parser = new Parser( System.in );
        parser.Process();
    }
}
PARSER_END(Parser)
SKIP :
{
    " "
  | "\t"
}

TOKEN : /* EOL */
{
    < EOL: [ "\n", "\r" ] >
}

TOKEN : /* IDENTIFIERS */
{
    < IDENTIFIER: ["a"-"z","A"-"Z","_"] ( ["a"-"z","A"-"Z","_","0"-"9"] )* >
  | < VAR_IDENT: "$" <IDENTIFIER> >
  | < FUNC_IDENT: "#" <IDENTIFIER> >
}

TOKEN : /* SEPARATORS */
{
    < LPAREN: "(" >
  | < RPAREN: ")" >
  | < COMMA: "," >
}

TOKEN : /* OPERATORS */
{
    < ASSIGN: "=" >
  | < PLUS_ASSIGN: "+=" >
}

TOKEN : /* LITERALS */
{
    < STRING_LITERAL:
        "\""
        (   ( ~[ "\"", "\n", "\r", "\\" ] )
          | ( "\\" [ "\"", "\\", "n", "r" ] )
        )*
        "\""
    >
}
void Process() :
{}
{
    ( [ Statement() ] <EOL> )* <EOF>
}

void Statement() :
{}
{
    LOOKAHEAD( 2 )
    VariableAssignment()
  | Variable()
  | Function()
}

void Param() :
{}
{
    <STRING_LITERAL> | Variable() | Function()
}

void Variable() :
{}
{
    <VAR_IDENT>
}

void VariableAssignment() :
{}
{
    Variable() ( <ASSIGN> | <PLUS_ASSIGN> ) Param()
}

void Function() :
{}
{
    <FUNC_IDENT>
    <LPAREN>
    [ Param() ( <COMMA> Param() )* ]
    <RPAREN>
}