From: NeilBrown
Date: Sun, 21 Jul 2013 07:53:51 +0000 (+1000)
Subject: scanner: don't allow an unknown mark to run into a string or comment
X-Git-Tag: draftparser~7
X-Git-Url: https://ocean-lang.org/code/?p=ocean;a=commitdiff_plain;h=4d085c0f91408abb43eeeddd022b13569e3682a4

scanner: don't allow an unknown mark to run into a string or comment

This means that e.g. printf("hello") where no marks are declared
will not treat (" as an known mark, but instead find ( and then a
string.

This is important for skipping over C code in 'parsergen'.

Signed-off-by: NeilBrown
---

diff --git a/csrc/scanner.mdc b/csrc/scanner.mdc
index 6dda848..7e33d0c 100644
--- a/csrc/scanner.mdc
+++ b/csrc/scanner.mdc
@@ -318,6 +318,12 @@ in a known mark, it will return that first known mark.
 If no known mark is found we will test against strings and comments
 below before giving up and assuming an unknown mark.
+
+If an unknown mark contains a quote character or a comment marker, and
+that token is not being ignored, then we terminate the unknown mark
+before that quote or comment. This ensure that an unknown mark
+immediately before a string is handled correctly.
+
 If `TK_mark` is ignored, then unknown marks as returned as an error.
 
 ###### token types
@@ -329,6 +335,7 @@ Known marks are included in the same list as the list of known words.
 	tk.num = TK_error;
 	while (is_mark(ch, state->conf)) {
 		int n;
+		wchar_t prev;
 		close_token(state, &tk);
 		n = find_known(state->conf, tk.txt);
 		if (n >= 0)
@@ -339,7 +346,22 @@ Known marks are included in the same list as the list of known words.
 			close_token(state, &tk);
 			return tk;
 		}
+		prev = ch;
+		if (prev == '/')
+			save_unget_state(state);
 		ch = get_char(state);
+		if (!(ignored && (1<