From 4d085c0f91408abb43eeeddd022b13569e3682a4 Mon Sep 17 00:00:00 2001
From: NeilBrown
Date: Sun, 21 Jul 2013 17:53:51 +1000
Subject: [PATCH] scanner: don't allow an unknown mark to run into a string or comment

This means that e.g. printf("hello") where no marks are declared will
not treat (" as an known mark, but instead find ( and then a string.

This is important for skipping over C code in 'parsergen'.

Signed-off-by: NeilBrown
---
 csrc/scanner.mdc | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/csrc/scanner.mdc b/csrc/scanner.mdc
index 6dda848..7e33d0c 100644
--- a/csrc/scanner.mdc
+++ b/csrc/scanner.mdc
@@ -318,6 +318,12 @@ in a known mark, it will return that first known mark.
 
 If no known mark is found we will test against strings and comments
 below before giving up and assuming an unknown mark.
+
+If an unknown mark contains a quote character or a comment marker, and
+that token is not being ignored, then we terminate the unknown mark
+before that quote or comment. This ensure that an unknown mark
+immediately before a string is handled correctly.
+
 If `TK_mark` is ignored, then unknown marks as returned as an error.
 
 ###### token types
@@ -329,6 +335,7 @@ Known marks are included in the same list as the list of known words.
 	tk.num = TK_error;
 	while (is_mark(ch, state->conf)) {
 		int n;
+		wchar_t prev;
 		close_token(state, &tk);
 		n = find_known(state->conf, tk.txt);
 		if (n >= 0)
@@ -339,7 +346,22 @@ Known marks are included in the same list as the list of known words.
 			close_token(state, &tk);
 			return tk;
 		}
+		prev = ch;
+		if (prev == '/')
+			save_unget_state(state);
 		ch = get_char(state);
+		if (!(ignored && (1<