A formal grammar is a mathematical construct. To define the language for Bison, you must write a file expressing the grammar in Bison syntax: a Bison grammar file. See Bison Grammar Files.
A nonterminal symbol in the formal grammar is represented in Bison input
as an identifier, like an identifier in C. By convention, it should be
in lower case, such as expr, stmt or declaration.
The Bison representation for a terminal symbol is also called a token
type. Token types can also be represented as C-like identifiers. By
convention, these identifiers should be upper case to distinguish them from
nonterminals: for example, INTEGER, IDENTIFIER, IF or
RETURN. A terminal symbol that stands for a particular keyword in
the language should be named after that keyword converted to upper case.
The terminal symbol error is reserved for error recovery.
See Symbols.
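The naming conventions above can be sketched in a small grammar file. This is only an illustration: the token and nonterminal names are made up for the example, and the %token declarations are included so that Bison treats the upper-case names as terminals.

```yacc
/* Token types (terminals): upper case by convention.
   All names here are illustrative, not part of any real grammar. */
%token INTEGER IDENTIFIER
%token IF RETURN          /* named after the keywords "if" and "return" */

%%

/* Nonterminals: lower case by convention. */
stmt:
    RETURN expr
  | IF expr stmt
  | error                 /* reserved terminal, used only for error recovery */
  ;

expr:
    INTEGER
  | IDENTIFIER
  ;
```

Keeping the two cases distinct makes it possible to tell at a glance which symbols come from the lexical analyzer and which are defined by rules.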
A terminal symbol can also be represented as a character literal, just like a C character constant. You should do this whenever a token is just a single character (parenthesis, plus-sign, etc.): use that same character in a literal as the terminal symbol for that token.
A third way to represent a terminal symbol is with a C string constant containing several characters. See Symbols, for more information.
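As a rough sketch of the two literal forms, the grammar below uses '+', '(' and ')' as character literal tokens and "<=" as a string literal token. The name LE is hypothetical; it is declared here only to give the string literal a symbolic name, which is one way Bison allows such a token to be introduced.

```yacc
%token INTEGER
%token LE "<="            /* hypothetical name LE, with "<=" as its literal form */

%%

cmp:
    sum
  | sum "<=" sum          /* multicharacter token written as a C string constant */
  ;

sum:
    term
  | sum '+' term          /* single-character token written as a character literal */
  ;

term:
    INTEGER
  | '(' cmp ')'
  ;
```

Written this way, the rules read much like the text they describe, with no need to invent names such as PLUS or LPAREN for single characters.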
The grammar rules also have an expression in Bison syntax. For example,
here is the Bison rule for a C return statement. The semicolon in
quotes is a literal character token, representing part of the C syntax for
the statement; the naked semicolon, and the colon, are Bison punctuation
used in every rule.
     stmt:   RETURN expr ';'
             ;
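To show the same punctuation in a slightly larger setting, here is a minimal sketch of a complete grammar file built around that rule. The second alternative and the expr rule are assumptions added for illustration; they are not part of the example above.

```yacc
%token RETURN INTEGER     /* illustrative token types */

%%

stmt:
    RETURN expr ';'       /* ';' in quotes: a token the parser must read */
  | RETURN ';'            /* a second alternative, separated by a vertical bar */
  ;                       /* bare semicolon: Bison punctuation ending the rule */

expr:
    INTEGER
  ;
```

The quoted ';' belongs to the language being parsed, while the colon, the vertical bar and the unquoted semicolon belong to Bison's own notation.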