NeilBrown [Mon, 8 Nov 2021 08:35:25 +0000 (19:35 +1100)]
oceani: update min_depth promptly.
As the loop in var_block_close() continues until min_depth is too low,
we need to set it promptly to stop the same variable being processed
again before it has been merged.
NeilBrown [Sat, 6 Nov 2021 23:59:23 +0000 (10:59 +1100)]
oceani: create separate scope for do part of while
Any variables created in the do part won't be created in the final
iteration, so we want them to be constrained to the do part, not seen as
part of the whole loop body.
This makes while/do match if/then better.
NeilBrown [Sat, 6 Nov 2021 02:04:54 +0000 (13:04 +1100)]
oceani: move var_block_close() calls to the code sections that close the block
Rather than calling var_block_close() from common non-terminals, move
the calls into the body of the parent non-terminal. This places them
after the 'struct exec' which represents the scope has been created.
This is needed to attach the variables to the point where their scope is
closed, so they can be freed.
This change helped me focus on some untested - and broken - code.
NeilBrown [Fri, 5 Nov 2021 23:55:01 +0000 (10:55 +1100)]
oceani: simplify loop in var_block_close()
The 'step' was not in the 'for' header, which makes it harder to follow
how the loop works.
Also add a comment to explain what is happening when ->name->var != v.
NeilBrown [Sat, 30 Oct 2021 04:49:08 +0000 (15:49 +1100)]
oceani-tests: add test for declaring a CondScope variable
If a variable was declared in all branches of a structured command, it
may or may not be declared as something else afterwards.
We need to test both options.
NeilBrown [Sun, 17 Oct 2021 10:03:01 +0000 (21:03 +1100)]
oceani: move variable values to a stack frame.
We have two frames - one for global values (currently always constant)
and one for local variables.
When we get functions, the local variable frame will be managed with a
stack of frames.
NeilBrown [Sun, 17 Oct 2021 02:35:58 +0000 (13:35 +1100)]
oceani: add parse_context arg to all interp functions, and a few others.
When I switch variables to use a stack frame, I'll need the
parse_context available more broadly (as it will hold the stack).
So add it to a selection of functions now.
NeilBrown [Sat, 16 Oct 2021 05:58:42 +0000 (16:58 +1100)]
oceani: differentiate static-sized arrays from others.
Some arrays will always have the same size - a static size.
Others might have a different size each time their scope is entered, if
the size is calculated from a variable.
The latter need to be reallocated whenever scope is entered, the former
do not.
This will matter when we create call frames to be able to handle
recursion.
NeilBrown [Sat, 16 Oct 2021 05:27:41 +0000 (16:27 +1100)]
oceani: don't allocate init value for non-initialized fields.
Struct fields that aren't explicitly initialised must be initialized to
a 'null' value. This can happen at interp-time. There is no need to
allocate a null value when parsing.
NeilBrown [Thu, 14 Oct 2021 02:43:02 +0000 (13:43 +1100)]
oceani: handle variable-sized arrays better.
An array with size set by a constant variable(!) might have a different
size each time the declaration is encountered. So we need to
re-evaluate the size each time.
We currently re-evaluate the size only if it is zero.
So for numerical-constant sized arrays, evaluate size during parsing.
For other arrays, re-evaluate each time using a new prepare_type method.
NeilBrown [Tue, 12 Oct 2021 10:28:47 +0000 (21:28 +1100)]
oceani: fix a couple of issues
1/ when a variable declared in a loop was re-initialized, we didn't free
the old value before allocating a new one.
2/ when assigning to an out-of-bounds array index, we created an rval,
but never freed it.
NeilBrown [Sat, 2 Oct 2021 22:36:50 +0000 (09:36 +1100)]
ocean: introduce prefix op for string->number conversion.
Rather than having magic conversion of command line args to numbers as
needed, introduce '$' as a prefix op to do the conversion.
This is a step towards changing 'program' to be a 'main' function.
NeilBrown [Wed, 10 Mar 2021 01:37:46 +0000 (12:37 +1100)]
oceani: updates for new approach to parsing indents.
Now that IN is a valid stand-alone token, it makes sense to change the
grammar for ocean.
We don't need the ':' before an indent if there is some other terminal
there. So:
  while
     statements
  do
     statements
doesn't require any ':'.
We use the ':' to separate an expression from following statements,
in 'if' and 'while' and 'case'.
NeilBrown [Wed, 10 Mar 2021 00:49:24 +0000 (11:49 +1100)]
parsergen: add support for EOL token
An EOL token is generated when a NEWLINE is found and an EOL can be
shifted. This allows a production to declare that it must finish at the
end of a line, without consuming the NEWLINE.
NeilBrown [Wed, 10 Mar 2021 00:38:55 +0000 (11:38 +1100)]
parsergen: implement new handling of IN/OUT and NEWLINE
IN/OUT are now expected in the grammar.
In a state where an IN can be shifted, IN symbols are significant to the
grammar. IN symbols appearing anywhere else are ignored (except for how
they affect NEWLINEs).
OUT symbols are ignored precisely when the matching IN was ignored.
NEWLINEs are ignored if the most recent IN was ignored, otherwise they
are significant for the grammar.
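The rules above can be sketched as a small stack of "ignored" flags, one per
unmatched IN. This is only an illustration, not the parsergen code: the names
layout_significant, struct indent_track and MAX_INDENT are hypothetical, and
treating a NEWLINE with no enclosing IN as significant is an assumption.

```c
#include <assert.h>

#define MAX_INDENT 64

enum layout_tok { TK_IN, TK_OUT, TK_NEWLINE };

struct indent_track {
	unsigned char ignored[MAX_INDENT];	/* one flag per unmatched IN */
	int depth;
};

/* Return 1 if the token is significant to the grammar, 0 if ignored. */
int layout_significant(struct indent_track *t, enum layout_tok tk,
		       int can_shift_in)
{
	switch (tk) {
	case TK_IN:
		/* IN matters only if the current state can shift it. */
		assert(t->depth < MAX_INDENT);
		t->ignored[t->depth++] = !can_shift_in;
		return can_shift_in != 0;
	case TK_OUT:
		/* OUT is ignored exactly when its matching IN was. */
		assert(t->depth > 0);
		return !t->ignored[--t->depth];
	case TK_NEWLINE:
		/* Assumption: with no enclosing IN, NEWLINE is significant. */
		return t->depth == 0 || !t->ignored[t->depth - 1];
	}
	return 1;
}
```

The caller would pass can_shift_in according to the current parser state; it
is only consulted for IN tokens.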
NeilBrown [Fri, 5 Mar 2021 10:24:14 +0000 (21:24 +1100)]
parsergen: add support for "special" terminals.
We will want a new terminal "EOL", which is like "NEWLINE", but
different. There is currently no room in the numbering for something
like that, so make some room.
NeilBrown [Fri, 5 Mar 2021 09:31:32 +0000 (20:31 +1100)]
parsergen: remove line_like information.
I'm going to change the 2D nature of the parser over several patches.
First I remove what I don't want, then I add what I do.
During this series, tests won't work!
NeilBrown [Fri, 26 Feb 2021 06:33:43 +0000 (17:33 +1100)]
parsergen: don't use static buffer for result value.
Add the size of the result value to the per-state information, so it can
be allocated before calling do_reduce(), thus removing the need for a
an overly large static buffer.
NeilBrown [Fri, 5 Mar 2021 08:20:22 +0000 (19:20 +1100)]
parsergen: change how reserved_words are stored
Rather than a simple array with holes, have a dense array mapping number
to name. This will enable a future change which adds names that don't
have numbers assigned.
NeilBrown [Sun, 11 Oct 2020 03:49:07 +0000 (14:49 +1100)]
parsergen: add more power to symbol references in generated code
As well as symbol references like "$2", you can now use references
with letters like "$Ss". This will find the shortest symbol in the
production that contains all the given letters in the given order.
There must be a unique shortest symbol.
If that same symbol occurs multiple times, later instances can be given
with a numeric suffix such as "$Ss2".
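A minimal sketch of how such a letter reference might be resolved, assuming
case-sensitive matching and that "unique" means no two different symbol names
tie for shortest. The function names resolve_ref and has_letters_in_order are
hypothetical and are not taken from parsergen.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if every character of 'letters' appears in 'name', in order. */
static int has_letters_in_order(const char *name, const char *letters)
{
	for (; *name && *letters; name++)
		if (*name == *letters)
			letters++;
	return *letters == '\0';
}

/*
 * Resolve a letter reference against the symbols of one production.
 * 'instance' is 1 for "$Ss", 2 for "$Ss2", and so on.
 * Returns the index of the chosen symbol, or -1 if nothing matches,
 * the shortest match is not unique, or there are too few instances.
 */
int resolve_ref(const char *const *syms, int nsyms,
		const char *letters, int instance)
{
	const char *best = NULL;
	int ambiguous = 0;
	int i;

	assert(letters[0] != '\0');
	for (i = 0; i < nsyms; i++) {
		if (!has_letters_in_order(syms[i], letters))
			continue;
		if (best && strcmp(best, syms[i]) == 0)
			continue;	/* another instance of the same symbol */
		if (!best || strlen(syms[i]) < strlen(best)) {
			best = syms[i];
			ambiguous = 0;
		} else if (strlen(syms[i]) == strlen(best)) {
			ambiguous = 1;	/* a different symbol ties for shortest */
		}
	}
	if (!best || ambiguous)
		return -1;
	/* Walk the production again to pick the requested instance. */
	for (i = 0; i < nsyms; i++)
		if (strcmp(syms[i], best) == 0 && --instance == 0)
			return i;
	return -1;
}
```

For a production body "Statements SimpleStatements NEWLINE Statements", the
reference "Ss" would pick the first Statements and "Ss2" the second, while
"Si" can only match SimpleStatements.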
NeilBrown [Sat, 10 Oct 2020 23:34:06 +0000 (10:34 +1100)]
parsergen: allow terminals to be declared.
By default, any non-virtual symbol that does not appear in the head of a
production is assumed to be a Terminal.
For larger grammars, this misses out on an opportunity to detect errors.
So allow a "$TERM" line to list terminals (that do not appear in
precedence lines). If any $TERM line is given, then generate an error
if any symbol appears in a production but is not declared, either
as terminal or non-terminal.
NeilBrown [Sat, 10 Oct 2020 22:50:12 +0000 (09:50 +1100)]
parsergen: avoid infinite loop on error.
If the grammar allows "ERROR" in a recursive location, error handling
can loop forever.
e.g.
foo -> foo bar
foo -> ERROR
Rather than detect and reject such grammars, detect the infinite loop
as it starts, and discard an extra token.
i.e. if error handling doesn't discard any tokens from the input
stream, and another error is triggered before anything is shifted, then
we force the next error handling phase to discard at least one token,
or to abort if that token is EOF.
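The check described above might be tracked with a few flags, as in the
following sketch. It is not the parsergen implementation: struct error_guard
and the guard_* functions are hypothetical names, and the parser is assumed to
call them at the points named in the comments.

```c
#include <assert.h>

enum recover_action { RECOVER_NORMAL, RECOVER_FORCE_DISCARD, RECOVER_ABORT };

struct error_guard {
	int had_error;		/* at least one recovery has run */
	int last_discarded;	/* tokens discarded by the last recovery */
	int shifted_since;	/* an input token was shifted since then */
};

/* Call whenever a token from the input stream is shifted. */
void guard_on_shift(struct error_guard *g)
{
	g->shifted_since = 1;
}

/* Call when an error-handling phase finishes. */
void guard_after_recovery(struct error_guard *g, int discarded)
{
	assert(discarded >= 0);
	g->had_error = 1;
	g->last_discarded = discarded;
	g->shifted_since = 0;
}

/* Call when a new error is detected, to decide how to recover. */
enum recover_action guard_on_error(const struct error_guard *g,
				   int next_is_eof)
{
	/* No progress since the last recovery: we are about to loop. */
	if (g->had_error && g->last_discarded == 0 && !g->shifted_since)
		return next_is_eof ? RECOVER_ABORT : RECOVER_FORCE_DISCARD;
	return RECOVER_NORMAL;
}
```

A zero-initialised struct error_guard is the starting state. With the grammar
above, the second consecutive error would return RECOVER_FORCE_DISCARD rather
than reducing to ERROR again without consuming input.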
NeilBrown [Tue, 6 Oct 2020 06:02:22 +0000 (17:02 +1100)]
parsergen: detect left-recursive symbols in non-final position.
A left-recursive symbol that appears other than at the end of a
production causes problems for indent-based parsing, as described in the
document. So teach parsergen to be able to report them.
Ocean currently has several of these, which I'll need to look into at a
later date.
NeilBrown [Tue, 6 Oct 2020 04:44:46 +0000 (15:44 +1100)]
scanner: change the meaning of ignoring comment tokens.
Previously ignoring comment tokens meant they were still parsed, but not
returned. The only way to stop them being parsed was to declare
known marks for the start symbols.
This made it impossible for parsergen to define a language that had
a known mark that would otherwise start a comment.
So change the ignoring of comment tokens to mean they aren't parsed. If
you want to parse comments but not return them, leave the new
"return_comments" field as 0. In the unusual case that you want to
return comments set return_comments to 1.
Confirm that this has the desired effect by adding "//" as an
integer-division operator to the sample calculator.
NeilBrown [Mon, 5 Oct 2020 23:00:31 +0000 (10:00 +1100)]
indent_test: fix makefile
Maybe 'make' has changed a little to be less forgiving, but 'make itest'
isn't working now. All of LDLIBS are included in the 'cc' line, but
there are no dependencies to make sure they have been built.
The problem is that I'm using LDLIBS for different programs which need
different libs. This isn't such a good idea.
So change indent_test to use itestLDLIBS and itestCFLAGS.
NeilBrown [Fri, 28 Jun 2019 09:36:49 +0000 (19:36 +1000)]
parsergen: only non-terminals should make a state "starts_line"
If a state is followed by NEWLINE, then it isn't starts_line - more like
ends_line.
It is only non-terminals containing NEWLINEs that cause a state
to be starts_line.
So move the test to after we stop looking at terminals.
NeilBrown [Sun, 23 Jun 2019 05:37:50 +0000 (15:37 +1000)]
oceani: allow 'then' in simple if statements.
Allow 'then' after "if expression", and don't require a ':' if
it is followed by simple statements.
Similarly, "else" doesn't need a colon for simple statements.
NeilBrown [Sun, 23 Jun 2019 04:41:47 +0000 (14:41 +1000)]
oceani: change parsing for ; at end
When we have 'for' and 'then' on the same line, I want to
require a ';' for the 'for' (and 'while').
So change SimpleStatements to never end with ';', and require
a ; or Newline after each instance of SimpleStatements.
NeilBrown [Sun, 23 Jun 2019 04:29:13 +0000 (14:29 +1000)]
oceani: modify grammar to not waste stack on newlines
Current grammar uses one stack frame per newline for leading
newlines as these productions are right-recursive. This is
unnecessary and inelegant. Change to use a left-recursive Newlines
production.
NeilBrown [Sun, 23 Jun 2019 03:51:46 +0000 (13:51 +1000)]
indent_test: reduce stack usage for preceding NEWLINEs
In the cases where we allow preceding newlines (Statementlist Open Close)
we currently use one parse-stack frame for each newline. While there are
unlikely to be many, this is inelegant.
Change the right-recursive form to use a left-recursive Newlines rule
that absorbs one or more NEWLINEs using at most 2 stack frames.
NeilBrown [Sun, 23 Jun 2019 00:21:14 +0000 (10:21 +1000)]
parsergen: allow $$OUT to be satisfied at start-of-line.
If a $$OUT (or $$NEWLINE) production is being reduced at
start-of-line (with no indents), then that is satisfactory,
we don't need NEWLINE etc as look-ahead.
This means that in cases where this is relevant, the computed
lookahead is wrong - we shouldn't have stripped it.
I don't think this matters as it only affects conflict warnings,
and I think these will be reported at a higher level if relevant.
In essence, the $$OUT marking is like a precedence marking which
suppresses shift/reduce warnings as it says that the decision is being made
on some basis other than look-ahead.
NeilBrown [Sun, 16 Jun 2019 01:31:54 +0000 (11:31 +1000)]
parsergen: fix up look-ahead for $$NEWLINE items.
I was discarding all non-newlines from the lookahead
in the wrong place.
I need to do it based on the productions added, not
the item they are generated by.