Symbols can be either `TK_ident` or `TK_mark`. They are saved in a
table of known symbols and the resulting parser will report them as
`TK_reserved + N`. A small set of identifiers are reserved for the
-different token types that `scanner` can report.
+different token types that `scanner` can report, and an even smaller set
+are reserved for a special token that the parser can generate (`EOL`), as
+will be described later. This latter set cannot use predefined numbers,
+so its members are marked as `isspecial` for now and will be assigned
+numbers along with the non-terminals later.
###### declarations
{ TK_out, "OUT" },
{ TK_newline, "NEWLINE" },
{ TK_eof, "$eof" },
+ { -1, "EOL" },
};
+
###### symbol fields
short num;
+ unsigned int isspecial:1;
Note that `TK_eof` and the two `TK_*_comment` tokens cannot be
recognised. The former is automatically expected at the end of the text
s = sym_find(g, t);
s->type = Terminal;
s->num = reserved_words[i].num;
+ s->isspecial = 1;
}
}
Once we have built everything we allocate arrays for the two lists:
symbols and itemsets. This allows more efficient access during
-reporting. The symbols are grouped as terminals and then non-terminals,
-and we record the changeover point in `first_nonterm`.
+reporting. The symbols are grouped as terminals, then non-terminals,
+then virtual symbols, with the start of the non-terminals recorded as
+`first_nonterm`. Special terminals -- meaning just `EOL` -- are numbered
+with the non-terminals so that the scanner is never expected to report
+them.
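
The grouping described above can be sketched as three numbering passes. This is only an illustration under stated assumptions: `struct demo_sym`, `assign_numbers` and `TK_RESERVED_DEMO` are hypothetical stand-ins rather than the definitions used by parsergen, and the second and third passes are inferred from the prose, since the patch shows only the first.

```c
/* Hypothetical stand-ins; not the real parsergen types or constants. */
enum sym_type { Terminal, Nonterminal, Virtual };
enum { TK_RESERVED_DEMO = 8 };	/* assumed first free token number */

struct demo_sym {
	const char *name;
	enum sym_type type;
	int num;			/* negative until a number is assigned */
	unsigned int isspecial:1;	/* reserved name, e.g. "EOL" */
};

/*
 * Number the symbols in three groups and report where the second
 * group starts.  Returns the total number of symbol numbers in use.
 */
static int assign_numbers(struct demo_sym *syms, int n, int *first_nonterm)
{
	int snum = TK_RESERVED_DEMO;
	int i;

	/* Pass 1: ordinary terminals, which the scanner can report. */
	for (i = 0; i < n; i++)
		if (syms[i].num < 0 && syms[i].type == Terminal &&
		    !syms[i].isspecial)
			syms[i].num = snum++;
	*first_nonterm = snum;

	/* Pass 2 (assumed): non-terminals plus special terminals. */
	for (i = 0; i < n; i++)
		if (syms[i].num < 0 && syms[i].type != Virtual)
			syms[i].num = snum++;

	/* Pass 3 (assumed): whatever is left, i.e. virtual symbols. */
	for (i = 0; i < n; i++)
		if (syms[i].num < 0)
			syms[i].num = snum++;

	return snum;
}
```

Because a special terminal ends up at or above `first_nonterm`, any test of the form `sym < first_nonterm` no longer covers every terminal, which is why the generated `do_free` needs the extra `|| sym == N` clauses.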
###### grammar fields
struct symbol **symtab;
struct itemset *is;
int snum = TK_reserved;
for (s = g->syms; s; s = s->next)
- if (s->num < 0 && s->type == Terminal) {
+ if (s->num < 0 && s->type == Terminal && !s->isspecial) {
s->num = snum;
snum++;
}
for (i = TK_reserved;
i < g->num_syms;
i++)
- if (g->symtab[i]->type == Nonterminal)
+ if (g->symtab[i]->type == Nonterminal ||
+ g->symtab[i]->isspecial)
fprintf(f, "\t\"%.*s\",\n", g->symtab[i]->name.len,
g->symtab[i]->name.txt);
fprintf(f, "};\n\n");
fprintf(f, "static void do_free(short sym, void *asn)\n");
fprintf(f, "{\n");
fprintf(f, "\tif (!asn) return;\n");
- fprintf(f, "\tif (sym < %d) {\n", g->first_nonterm);
+ fprintf(f, "\tif (sym < %d", g->first_nonterm);
+ /* Need to handle special terminals too */
+ for (i = 0; i < g->num_syms; i++) {
+ struct symbol *s = g->symtab[i];
+ if (i >= g->first_nonterm && s->type == Terminal &&
+ s->isspecial)
+ fprintf(f, " || sym == %d", s->num);
+ }
+ fprintf(f, ") {\n");
fprintf(f, "\t\tfree(asn);\n\t\treturn;\n\t}\n");
fprintf(f, "\tswitch(sym) {\n");