As we don't keep the full look-ahead set, we need to pay a
bit more attention when discarding input symbols, looking
for one we recognize. We need to consider anything
that can be shifted in any state we can reach by simple
shifting.
Signed-off-by: NeilBrown <neil@brown.name>
terminates any line-like structure - we try to reduce down to at most
one symbol for each line where newlines are allowed.
terminates any line-like structure - we try to reduce down to at most
one symbol for each line where newlines are allowed.
+When, during error handling, we discard token read in, we want to keep
+discarding until we see one that is recognised. If we had a full set
+of LR(1) grammar states, this will mean looking in the look-ahead set,
+but we don't keep a full look-ahead set. We only record the subset
+that leads to SHIFT. We can, however, deduce the look-ahead set but
+looking at the SHIFT subsets for all states that we can get to by
+reducing zero or more times. So we need a little function which
+checks if a given token is in any of these look-ahead sets.
+
###### parser includes
#include "parser.h"
###### parser includes
#include "parser.h"
+
+ static int in_lookahead(struct token *tk, const struct state *states, int state)
+ {
+ while (state >= 0) {
+ if (search(&states[state], tk->num) >= 0)
+ return 1;
+ if (states[state].reduce_prod < 0)
+ return 0;
+ state = search(&states[state], states[state].reduce_sym);
+ }
+ return 0;
+ }
+
void *parser_run(struct token_state *tokens,
const struct state states[],
int (*do_reduce)(int, void**, struct token_config*, void*),
void *parser_run(struct token_state *tokens,
const struct state states[],
int (*do_reduce)(int, void**, struct token_config*, void*),
break;
}
tos = &p.stack[p.tos-1];
break;
}
tos = &p.stack[p.tos-1];
- while (search(&states[tos->state], tk->num) < 0 &&
+ while (!in_lookahead(tk, states, tos->state) &&
tk->num != TK_eof) {
free(tk);
tk = tok_copy(token_next(tokens));
tk->num != TK_eof) {
free(tk);
tk = tok_copy(token_next(tokens));