During parsing and analysis we don't want to "init" a
variable as that can allocate memory - when we add arrays,
we might not know yet how much memory.
So introduce 'prepare' to prepare a value - such that calling
free on it will work - without allocating.
'init' is then called when a variable is declared - unless something
is assigned to it instead.
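
The split between 'prepare', 'init' and 'free' might look roughly like the
following. This is only a sketch: `struct value`, `val_prepare`, `val_init`
and `val_free` are hypothetical names chosen for illustration, not taken
from oceani.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical value holding a heap-allocated string. */
struct value {
	char *str;	/* NULL until val_init() or an assignment fills it */
};

/* 'prepare': make the value safe to free, without allocating. */
static void val_prepare(struct value *v)
{
	v->str = NULL;
}

/* 'init': give the value its default contents; this may allocate. */
static int val_init(struct value *v)
{
	assert(v->str == NULL);	/* expects a prepared, unused value */
	v->str = calloc(1, 1);	/* an empty string */
	return v->str ? 0 : -1;
}

/* 'free': valid after either prepare or init. */
static void val_free(struct value *v)
{
	free(v->str);	/* free(NULL) is a no-op */
	v->str = NULL;
}
```

The parser and analysis passes would only ever call `val_prepare`, so error
paths can free everything unconditionally.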
It was always intended that a type could come between
the : and = of a declaration.
It makes more sense, and simplifies the grammar, if
we stop treating := as ever being a token.
So now : and = are separate tokens.
If the parser sees ":=", it will happily treat that
as two separate tokens, which is what we want.
Rather than an enum listing allowed types, we now
have a 'struct type' which can contain any type.
For now it just has an enum and function pointer
to the existing function, but that can be extended.
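
As a rough illustration of a 'struct type' that carries an enum plus a
function pointer: the field names, the `cmp` operation and `Tnum` below are
assumptions made for this sketch, since the message does not say which
existing function is referenced.

```c
#include <assert.h>
#include <stddef.h>

enum vtype { Vnum, Vbool };	/* the old enum, kept for now */

struct type;

struct value {
	const struct type *type;
	long num;
};

struct type {
	enum vtype vtype;
	/* hypothetical per-type operation reached through the type */
	int (*cmp)(const struct value *a, const struct value *b);
};

static int num_cmp(const struct value *a, const struct value *b)
{
	return (a->num > b->num) - (a->num < b->num);
}

static const struct type Tnum = { Vnum, num_cmp };

/* Callers dispatch through the type rather than switching on the enum. */
static int value_cmp(const struct value *a, const struct value *b)
{
	assert(a->type == b->type);
	return a->type->cmp(a, b);
}
```

Adding a new type then means adding another `struct type` instance rather
than extending every switch statement.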
It is never used so there isn't much point
storing it. If/When we do use it, we'll
parse it immediately and store the meaning.
And in any case, it should be char[3].
NeilBrown [Mon, 19 Feb 2018 05:58:40 +0000 (16:58 +1100)]
oceani: minimal error tracking.
1/ If the parser hits an error, catch it at eof by having
Program -> ERROR
2/ Allow the presence of an error to be recorded in the context.
3/ If an error occurred, don't try to run, and do exit with an error
status.
NeilBrown [Mon, 19 Feb 2018 05:46:04 +0000 (16:46 +1100)]
oceani: Expression etc should be 'exec', not 'binode'.
As a 'var' and a 'val' are possible expressions,
'binode' isn't correct. This is obvious when you
consider that I needed to cast $1 for a var or val
before assigning it to $0 for Factor -> Value etc.
So change all these to expect the more generic 'struct exec *'.
Without this change, error handling can try to free a var as though it
was a binode, and get into trouble.
NeilBrown [Mon, 19 Feb 2018 05:40:06 +0000 (16:40 +1100)]
parsergen: enable error handling.
The error handling code currently aborts early because
it was badly broken.
After recent changes it works well enough for experimenting,
so remove the exit(1) and other unnecessary code, and
let's experiment.
NeilBrown [Mon, 19 Feb 2018 05:38:12 +0000 (16:38 +1100)]
parsergen: improve symbol-discard in error handling.
As we don't keep the full look-ahead set, we need to pay a
bit more attention when discarding input symbols, looking
for one we recognize. We need to consider anything
that can be shifted in any state we can reach by simple
shifting.
NeilBrown [Mon, 19 Feb 2018 05:31:14 +0000 (16:31 +1100)]
parsergen: be careful shifting TK_error
shift() behaved a little differently when p.tos == 0,
and if the stack is completely empty, there is little
point trying to shift TK_error as there is no state
to work with.
NeilBrown [Mon, 19 Feb 2018 04:32:46 +0000 (15:32 +1100)]
parsergen: remove symbol synthesis option.
This idea never worked, and cannot work as we cannot
magically synthesize the ast node to go with a synthesized
symbol.
If we want to synthesize something on error, we just use
a production like
foo -> ERROR ${ $0 = a_new_for(); }$
NeilBrown [Sun, 18 Feb 2018 05:10:59 +0000 (16:10 +1100)]
scanner: fix calculation of column.
When we strip the expected indent from the
start of each line, we need to update 'col'
to correctly account for tabs.
Previous code effectively assumed tabs were 4 spaces.
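
A tab-aware column calculation could be sketched as below. The tab width of
8 and the name `column_after` are assumptions for this example; the message
only says that the old code behaved as if tabs were 4 spaces.

```c
#include <assert.h>
#include <stddef.h>

enum { TAB_WIDTH = 8 };	/* assumption: tab stops every 8 columns */

/*
 * Return the column reached after consuming the first 'len' bytes
 * of 'line', starting from column 0.
 */
static int column_after(const char *line, size_t len)
{
	int col = 0;
	size_t i;

	assert(line != NULL);
	for (i = 0; i < len; i++) {
		if (line[i] == '\t')
			col += TAB_WIDTH - col % TAB_WIDTH;	/* next tab stop */
		else
			col += 1;
	}
	return col;
}
```

The point is that a tab advances to the next tab stop, so its width depends
on the column where it starts and cannot be a fixed number of spaces.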
NeilBrown [Wed, 31 Jan 2018 03:25:16 +0000 (14:25 +1100)]
New lang: Stoney Creek
This is the second iteration of language design.
It adds scoped variables.
Variables must be declared before use, but they
can be declared in both branches of an 'if', then
used afterwards as the one variable.
No hole-in-scope is allowed: names that are declared
cannot be redeclared in a subordinate scope.
A test program is included:
make sayhello
Note that there are no useful error messages yet.
That is the next step.
NeilBrown [Mon, 29 Jan 2018 05:20:01 +0000 (16:20 +1100)]
parsergen.mdc: add precedence handling
This hasn't been documented properly in the text yet, but
the example has been changed to work and it seems good.
There is no support for precedence to select between two
reductions because I don't believe that ever happens :-) and
I haven't done anything special for non-associative because
I don't know what I would do.
NeilBrown [Tue, 6 Feb 2018 05:42:01 +0000 (16:42 +1100)]
parsergen: record line number of reduce fragments.
If there is an error in a code fragment used to handle
a 'reduce' action, we need the compiler to report the
correct line from the grammar file.
This information is easily available from the scanner,
we just need to pass it along.
NeilBrown [Fri, 3 Oct 2014 04:30:36 +0000 (14:30 +1000)]
parsergen: remove special casing for pop(0).
If pop() is asked to remove nothing from the stack, it now
does exactly the right thing and returns the value that we want.
So some special-casing can be removed.
NeilBrown [Fri, 3 Oct 2014 03:28:32 +0000 (13:28 +1000)]
parsergen: revise rule for NEWLINE forcing reduce
If the whole line is a single symbol, then it isn't appropriate
for a NEWLINE to force a reduce (it may be for an OUT, but as the
NEWLINE shifts (the OUT doesn't) we don't need to push so hard).
NeilBrown [Fri, 3 Oct 2014 03:24:36 +0000 (13:24 +1000)]
parsergen: don't use 'frame' to pass args to shift() or receive from pop()
'struct frame' holds a number of fields that shift()
ignores and pop() doesn't fill in.
So it is a bit confusing to see a frame passed in
and mostly ignored.
So just pass in the fields that are actually needed.
This fixes a bug where 'since_newline' was set wrongly when a newline
is shifted.
NeilBrown [Sun, 22 Jun 2014 05:12:41 +0000 (15:12 +1000)]
parsergen: fix up stack management
The stack has alternating states and symbols. I had grouped a state
with the following symbol as the first thing pushed is a state and the
next is a symbol.
It works much better to group the other way. First we push just state zero.
Then we push some symbol and the state which 'goto' leads to.
In particular this keeps the 'shift' that happens after "reduce"
quite separate from the 'shift' that happens when the look-ahead is
shifted in. Previously the post-reduce shift was stealing the indent
information that should have stayed in the look-ahead buffer.
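
The revised grouping can be pictured with a small sketch. All names here
(`struct frame`, `stack_shift`, `stack_pop`) and the fixed-size array are
hypothetical, and the 'goto' lookup is left to the caller; this is not the
parsergen code itself.

```c
#include <assert.h>

enum { STACK_MAX = 64, NO_SYMBOL = -1 };

struct frame {
	int sym;	/* symbol that was shifted, NO_SYMBOL for the base frame */
	int state;	/* state that 'goto' on that symbol led to */
};

struct stack {
	struct frame frames[STACK_MAX];
	int tos;	/* number of frames in use */
};

/* Start with just state zero on the stack. */
static void stack_init(struct stack *s)
{
	s->frames[0].sym = NO_SYMBOL;
	s->frames[0].state = 0;
	s->tos = 1;
}

/* Push a symbol together with the state it leads to. */
static void stack_shift(struct stack *s, int sym, int next_state)
{
	assert(s->tos < STACK_MAX);
	s->frames[s->tos].sym = sym;
	s->frames[s->tos].state = next_state;
	s->tos++;
}

/* Remove 'count' frames and return the state now on top. */
static int stack_pop(struct stack *s, int count)
{
	assert(count >= 0 && count < s->tos);
	s->tos -= count;
	return s->frames[s->tos - 1].state;
}
```

With this layout, popping zero frames simply reports the current state and
needs no special case.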
NeilBrown [Sun, 15 Jun 2014 08:44:14 +0000 (18:44 +1000)]
parsergen: work-around for indent parsing problem.
There was a problem with my reasoning about parsing indents.
Resolving it properly will take a bit of work, but this little 'fix'
handles an easy case for now.
NeilBrown [Sat, 31 May 2014 05:56:20 +0000 (15:56 +1000)]
parsergen: pass 'config' in to 'reduce' function.
As we only support synthesised attributes and no inherited attributes,
we have no way for the reduce functions to access any context (such
as building a table of variables) except via global variables (yuck).
So pass the 'context' pointer through. The main program can embed
this in a larger structure which contains relevant context, and
the reduce functions can find that using pointer manipulation.
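
The "pointer manipulation" mentioned above is the usual container-of idiom.
A minimal sketch, assuming hypothetical structure and function names
(`struct parse_config`, `struct lang_context`, `to_context`):

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the generic structure the parser passes through. */
struct parse_config {
	int placeholder;	/* hypothetical field */
};

/* Hypothetical larger structure owned by the main program. */
struct lang_context {
	int var_count;			/* e.g. size of a variable table */
	struct parse_config config;	/* embedded, not pointed to */
};

/* Recover the enclosing context from the embedded member. */
static struct lang_context *to_context(struct parse_config *cfg)
{
	assert(cfg != NULL);
	return (struct lang_context *)
		((char *)cfg - offsetof(struct lang_context, config));
}

/* A reduce-style function that only receives the embedded pointer. */
static void reduce_declare_var(struct parse_config *cfg)
{
	to_context(cfg)->var_count += 1;
}
```

The main program would pass `&ctx.config` to the parser, and every reduce
fragment can get back to `ctx` without any global variable.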
NeilBrown [Sun, 11 May 2014 04:23:10 +0000 (14:23 +1000)]
parsergen: make use of --tag for calc grammar
Rather than placing the 'calc' grammar in a separate file,
add 'calc:' tags to the sections and use --tag option to
extract the grammar directly from the parsergen.mdc file.
NeilBrown [Sun, 11 May 2014 04:21:26 +0000 (14:21 +1000)]
parsergen: add --tag option.
Normally parsergen extracts three sections: header, code, and grammar.
With "--tag foo", it will ignore anything that doesn't start "foo:",
will extract "foo: header", "foo: code", and "foo: grammar", and only
complain if there are other "foo:" headers.
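
The tagged matching rule can be sketched as a small classifier. The enum,
the function name `classify`, the exact-case comparison and the skipping of
spaces after the colon are all assumptions made for this example.

```c
#include <assert.h>
#include <string.h>

enum section {
	SEC_IGNORE,	/* does not start with "<tag>:" */
	SEC_HEADER,
	SEC_CODE,
	SEC_GRAMMAR,
	SEC_UNKNOWN	/* starts with "<tag>:" but is not one of the three */
};

/* Classify a section name when "--tag <tag>" was given. */
static enum section classify(const char *name, const char *tag)
{
	size_t n;

	assert(name != NULL && tag != NULL);
	n = strlen(tag);
	if (strncmp(name, tag, n) != 0 || name[n] != ':')
		return SEC_IGNORE;
	name += n + 1;
	while (*name == ' ')
		name++;
	if (strcmp(name, "header") == 0)
		return SEC_HEADER;
	if (strcmp(name, "code") == 0)
		return SEC_CODE;
	if (strcmp(name, "grammar") == 0)
		return SEC_GRAMMAR;
	return SEC_UNKNOWN;
}
```

A caller would silently skip `SEC_IGNORE` sections and report an error only
for `SEC_UNKNOWN`.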
NeilBrown [Sat, 10 May 2014 10:58:02 +0000 (20:58 +1000)]
mdcode.mdc: Allow more sections than just Example: to be ignored.
"Example:" is no longer a special case. Any section name which
starts "Word:", for some "Word", does not need to be included in
other sections.
"File:" is still a special case of that and will be stored in the
named file.
NeilBrown [Sat, 10 May 2014 23:42:02 +0000 (09:42 +1000)]
parsergen: initialise parser.next properly.
Most fields in the 'next' frame should be initialised to zero,
but some need to be properly initialised. Otherwise we might not handle
early newlines correctly.