ocean-lang.org Git - ocean/log
ocean
10 years agoparsergen: typo: i, not 1. workingparser
NeilBrown [Tue, 7 Oct 2014 06:05:20 +0000 (17:05 +1100)]
parsergen: typo: i, not 1.

This makes some newline handling break.

Signed-off-by: NeilBrown <neil@brown.name>
10 years agoparsergen: update description to match current reality.
NeilBrown [Fri, 3 Oct 2014 04:52:16 +0000 (14:52 +1000)]
parsergen: update description to match current reality.

In particular, the handling of indents and newlines was a
bit outdated.

Signed-off-by: NeilBrown <neil@brown.name>
10 years agoparsergen: remove special casing for pop(0).
NeilBrown [Fri, 3 Oct 2014 04:30:36 +0000 (14:30 +1000)]
parsergen: remove special casing for pop(0).

If pop() is asked to remove nothing from the stack, it now
does exactly the right thing and returns the value that we want.
So some special-casing can be removed.

Signed-off-by: NeilBrown <neil@brown.name>
10 years agoindent_test: integrate into Makefile scheme nicely.
NeilBrown [Fri, 3 Oct 2014 04:01:45 +0000 (14:01 +1000)]
indent_test: integrate into Makefile scheme nicely.

The test code is now included in indent_test.mdc,
and there is a '.mk'.
  make tests

will run the test.

Signed-off-by: NeilBrown <neil@brown.name>
10 years agoparsergen: update doc for change from 'starts line' to 'line like'
NeilBrown [Fri, 3 Oct 2014 03:37:17 +0000 (13:37 +1000)]
parsergen: update doc for change from 'starts line' to 'line like'

Signed-off-by: NeilBrown <neil@brown.name>
10 years agoparse trace: report since_newline rather than newline_permitted
NeilBrown [Fri, 3 Oct 2014 03:29:42 +0000 (13:29 +1000)]
parse trace: report since_newline rather than newline_permitted

The number in newline_permitted isn't interesting.
The number in since_newline is.  So print that.

Signed-off-by: NeilBrown <neil@brown.name>
10 years agoindent test: various fixes to match new design.
NeilBrown [Fri, 3 Oct 2014 03:29:14 +0000 (13:29 +1000)]
indent test: various fixes to match new design.

Signed-off-by: NeilBrown <neil@brown.name>
10 years agoparsergen: revise rule for NEWLINE forcing reduce
NeilBrown [Fri, 3 Oct 2014 03:28:32 +0000 (13:28 +1000)]
parsergen: revise rule for NEWLINE forcing reduce

If the whole line is a single symbol, then it isn't appropriate
for a NEWLINE to force a reduce (it may be for an OUT, but as the
NEWLINE shifts (the OUT doesn't) we don't need to push so hard).

Signed-off-by: NeilBrown <neil@brown.name>
10 years agoparsergen: fix incorrect 'newline_permitted' setting.
NeilBrown [Fri, 3 Oct 2014 03:28:05 +0000 (13:28 +1000)]
parsergen: fix incorrect 'newline_permitted' setting.

If a state 'starts_line', then a newline is explicitly permitted
(once indents have gone), not explicitly denied!

Signed-off-by: NeilBrown <neil@brown.name>
10 years agoparsergen: pop was not computing start_of_line properly.
NeilBrown [Fri, 3 Oct 2014 03:27:40 +0000 (13:27 +1000)]
parsergen: pop was not computing start_of_line properly.

If there is any line start in the sequence being popped,
then the new symbol is considered to start a line.
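
The rule stated above can be sketched as follows. This is only an illustration, not the code from parsergen: the `struct frame` layout, the `line_start` field, and the function name `popped_line_start` are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stack frame: only the flag this rule needs. */
struct frame {
	bool line_start;	/* this frame's symbol started a line */
};

/*
 * Decide whether the symbol that replaces the top 'num' frames should
 * be treated as starting a line: it does if any popped frame did.
 * stack[depth - 1] is the top of the stack.
 */
bool popped_line_start(const struct frame *stack, size_t depth, size_t num)
{
	bool starts = false;

	assert(num <= depth);
	for (size_t i = depth - num; i < depth; i++)
		if (stack[i].line_start)
			starts = true;
	return starts;
}
```

In this sketch, popping nothing yields "false" without any special case, because the loop body never runs.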

Signed-off-by: NeilBrown <neil@brown.name>
10 years agoparsergen: get rid of 'next' in parser_run()
NeilBrown [Fri, 3 Oct 2014 03:26:33 +0000 (13:26 +1000)]
parsergen: get rid of 'next' in parser_run()

It isn't used for anything useful any more.

Signed-off-by: NeilBrown <neil@brown.name>
10 years agoparsergen: next.indents in parser_run is always zero.
NeilBrown [Fri, 3 Oct 2014 03:26:00 +0000 (13:26 +1000)]
parsergen: next.indents in parser_run is always zero.

Now that indents are counted with the previous symbol,
next.indents is always zero.  So stop using it or updating it.

Signed-off-by: NeilBrown <neil@brown.name>
10 years agoparsergen: don't pass full frame to parser_trace()
NeilBrown [Fri, 3 Oct 2014 03:25:25 +0000 (13:25 +1000)]
parsergen: don't pass full frame to parser_trace()

parser_trace() only uses 2 fields, so only pass those.

Signed-off-by: NeilBrown <neil@brown.name>
10 years agoparsergen: don't use 'frame' to pass args to shift() or receive from pop()
NeilBrown [Fri, 3 Oct 2014 03:24:36 +0000 (13:24 +1000)]
parsergen: don't use 'frame' to pass args to shift() or receive from pop()

'struct frame' holds a number of fields that shift()
ignores and pop() doesn't fill in.
So it is a bit confusing to see a frame passed in
and mostly ignored.

So just pass in the fields that are actually needed.
This fixes a bug where 'since_newline' was set wrongly when a newline
is shifted.

Signed-off-by: NeilBrown <neil@brown.name>
10 years agoparsergen: fix handling of Newline in parse.
NeilBrown [Fri, 3 Oct 2014 03:22:29 +0000 (13:22 +1000)]
parsergen: fix handling of Newline in parse.

The required handling for 'newline' when not ignored is:

    if the current state can REDUCE and the reduction length is no
    more symbols than the frames-since-start-of-line count, we REDUCE.

'can REDUCE' means "reduce_size >= 0", not ">".
'no more symbols' means "reduce_size <= tos->since_newline", not "<".
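
As a rough sketch of the condition described in this message (not the actual parser_run() code), the test could be written as below. The `struct frame_info` type and the function name are assumptions for the example; `reduce_size` and `since_newline` are the names used above, and a `reduce_size` of -1 is assumed to mean "this state cannot reduce".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of the top-of-stack frame: only the field read here. */
struct frame_info {
	int since_newline;	/* frames pushed since the start of the line */
};

/*
 * Return true when a NEWLINE should force a REDUCE:
 * the state can reduce (reduce_size >= 0, so an empty production counts)
 * and the reduction pops no more symbols than have been seen on this
 * line (reduce_size <= since_newline, so equality counts too).
 */
bool newline_forces_reduce(int reduce_size, const struct frame_info *tos)
{
	assert(tos != NULL);
	return reduce_size >= 0 && reduce_size <= tos->since_newline;
}
```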

Signed-off-by: NeilBrown <neil@brown.name>
10 years agoparsergen: fix handling of TK_in during parse.
NeilBrown [Thu, 2 Oct 2014 11:01:41 +0000 (21:01 +1000)]
parsergen: fix handling of TK_in during parse.

Now that an 'in' is always considered to be *after* a symbol,
we want to apply the 'in' token to the current top-of-stack, not to the
'next' frame.

Signed-off-by: NeilBrown <neil@brown.name>
10 years agoparsergen: remove starts_indented.
NeilBrown [Thu, 2 Oct 2014 10:59:55 +0000 (20:59 +1000)]
parsergen: remove starts_indented.

This is no longer used.

Signed-off-by: NeilBrown <neil@brown.name>
10 years agoparsergen: add since_indent to stack frame.
NeilBrown [Thu, 2 Oct 2014 10:58:38 +0000 (20:58 +1000)]
parsergen: add since_indent to stack frame.

This counts frames since last indent.
It is used to determine when an OUT can be canceled and when we must
force a reduction.

Signed-off-by: NeilBrown <neil@brown.name>
10 years agoparsergen: calculate and record "min_prefix" for each state.
NeilBrown [Thu, 2 Oct 2014 10:51:36 +0000 (20:51 +1000)]
parsergen: calculate and record "min_prefix" for each state.

This is needed to determine when we can cancel a TK_out.

Signed-off-by: NeilBrown <neil@brown.name>
10 years agoparsergen: revise "newline_permitted" definition.
NeilBrown [Thu, 2 Oct 2014 10:51:01 +0000 (20:51 +1000)]
parsergen: revise "newline_permitted" definition.

This is in line with "new" approach.  A newline is permitted/expected
if a starts_line state is closer to top of stack than an indent.
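
One way to picture this definition is a scan down from the top of the stack, stopping at whichever comes first: an indent or a starts_line state. The sketch below is an assumption-laden illustration, not parsergen's implementation: the frame layout is invented, and treating a frame that has both an indent and a starts_line state as "indent first" is a choice made only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stack frame with just the two facts the rule needs. */
struct frame {
	bool starts_line;	/* state in this frame starts a line-like element */
	int indents;		/* indents recorded with this frame */
};

/*
 * stack[depth - 1] is the top of the stack.  A newline is permitted if,
 * walking down from the top, a starts_line state is met before any indent.
 * An empty stack permits nothing.
 */
bool newline_permitted(const struct frame *stack, size_t depth)
{
	assert(stack != NULL || depth == 0);
	for (size_t i = depth; i > 0; i--) {
		const struct frame *f = &stack[i - 1];

		if (f->indents > 0)
			return false;	/* indent is nearer the top */
		if (f->starts_line)
			return true;	/* line start is nearer the top */
	}
	return false;
}
```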

Signed-off-by: NeilBrown <neil@brown.name>
10 years agoparsergen: adjust for new definition of line_like symbols.
NeilBrown [Thu, 2 Oct 2014 10:49:59 +0000 (20:49 +1000)]
parsergen: adjust for new definition of line_like symbols.

A symbol is line-like if it is followed by a NEWLINE, or
any symbol which starts with a NEWLINE.

Signed-off-by: NeilBrown <neil@brown.name>
10 years agoMore indent_test.cgm mods.
NeilBrown [Thu, 2 Oct 2014 10:49:14 +0000 (20:49 +1000)]
More indent_test.cgm mods.

Still trying to figure it out.

Want to make sure only the right things are line-like.

Signed-off-by: NeilBrown <neil@brown.name>
10 years agoNEWLINE must only ever follow a 'linelike' symbol.
NeilBrown [Thu, 2 Oct 2014 10:47:40 +0000 (20:47 +1000)]
NEWLINE must only ever follow a 'linelike' symbol.

Signed-off-by: NeilBrown <neil@brown.name>
10 years agoNewline handling stuff
NeilBrown [Thu, 2 Oct 2014 10:45:32 +0000 (20:45 +1000)]
Newline handling stuff

Signed-off-by: NeilBrown <neil@brown.name>
10 years agoparsergen: various updates.
NeilBrown [Thu, 2 Oct 2014 10:42:04 +0000 (20:42 +1000)]
parsergen: various updates.

- add starts_line flag for symbols
- add starts_newline flag for stack frames

and related changes

Signed-off-by: NeilBrown <neil@brown.name>
10 years agoparsergen: improve tracing.
NeilBrown [Sun, 22 Jun 2014 05:18:32 +0000 (15:18 +1000)]
parsergen: improve tracing.

1/ perform the "is null?" test on trace find in parser_trace.
   code is cleaner that way

2/ Report the action at each step in the parse.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoparsergen: fix up stack management
NeilBrown [Sun, 22 Jun 2014 05:12:41 +0000 (15:12 +1000)]
parsergen: fix up stack management

The stack has alternating states and symbols.  I had grouped a state
with the following symbol as the first thing pushed is a state and the
next is a symbol.

It works much better to group the other way.  First we push just state zero.
Then we push some symbol and the state which 'goto' leads to.

In particular, this keeps the 'shift' that happens after "reduce"
quite separate from the 'shift' that happens when the look-ahead is
shifted in.  Previously the post-reduce shift was stealing the indent
information that should have stayed in the look-ahead buffer.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoparsergen: work-around for indent parsing problem.
NeilBrown [Sun, 15 Jun 2014 08:44:14 +0000 (18:44 +1000)]
parsergen: work-around for indent parsing problem.

There was a problem with my reasoning about parsing indents.
Resolving it properly will take a bit of work, but this little 'fix'
handles an easy case for now.

Signed-off-by: NeilBrown <neil@brown.name>
10 years agoparsergen: fix return of final result.
NeilBrown [Sun, 15 Jun 2014 07:59:27 +0000 (17:59 +1000)]
parsergen: fix return of final result.

We cannot really shift the final result onto the stack, because
there is no 'goto' for '$eof' in that final state.

So if the shift() fails, hold onto the result and ultimately return it.
This means we don't need to pop it off the stack at the end.

Signed-off-by: NeilBrown <neil@brown.name>
10 years agoparsergen: don't leave garbage in the $0 buffer.
NeilBrown [Sat, 31 May 2014 10:14:53 +0000 (20:14 +1000)]
parsergen: don't leave garbage in the $0 buffer.

As this is static it gets reused.  We are likely to free pointers
in it, so after doing that we should make sure it gets zeroed.

Signed-off-by: NeilBrown <neil@brown.name>
10 years agoparsergen: ensure value returned from parser_run is initialised.
NeilBrown [Sat, 31 May 2014 10:14:01 +0000 (20:14 +1000)]
parsergen: ensure value returned from parser_run is initialised.

If we don't accept the program, we currently return an uninitialized
value.

Signed-off-by: NeilBrown <neil@brown.name>
10 years agoparsergen: make sure result in start symbol is returned.
NeilBrown [Sat, 31 May 2014 10:12:20 +0000 (20:12 +1000)]
parsergen: make sure result in start symbol is returned.

Because we synthesize production-zero, we need to make sure
to add the relevant 'code' to it to preserve the returned value.

Signed-off-by: NeilBrown <neil@brown.name>
10 years agoparsergen: pass 'config' in to 'reduce' function.
NeilBrown [Sat, 31 May 2014 05:56:20 +0000 (15:56 +1000)]
parsergen: pass 'config' in to 'reduce' function.

As we only support synthesised attributes and no inherited attributes,
we have no way for the reduce functions to access any context (such
as building a table of variables) except via global variables (yuck).

So pass the 'context' pointer through.  The main program can embed
this in a larger structure which contains relevant context, and
the reduce functions can find that using pointer manipulation.
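
The "embed and recover" pattern described here might look like the sketch below. All names (`struct config_stub`, `struct parse_context`, `context_of`, `reduce_declare_variable`) are hypothetical stand-ins for this example; the only idea taken from the message is that the pointer handed to the reduce functions lives inside a larger structure owned by the main program.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the structure whose pointer is passed to reduce functions. */
struct config_stub {
	int trace;
};

/* The main program's larger structure, with the config embedded in it. */
struct parse_context {
	int variable_count;		/* example of per-parse context */
	struct config_stub config;	/* &config is what gets passed around */
};

/* Recover the enclosing context from the embedded config pointer. */
struct parse_context *context_of(struct config_stub *c)
{
	assert(c != NULL);
	return (struct parse_context *)
		((char *)c - offsetof(struct parse_context, config));
}

/* A reduce-style function that updates context without any globals. */
void reduce_declare_variable(struct config_stub *c)
{
	context_of(c)->variable_count += 1;
}
```

This only works if every pointer passed in really is embedded in a `struct parse_context`; the sketch does not check that.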

Signed-off-by: NeilBrown <neil@brown.name>
10 years agoparsergen: discard text_cmp now that it is in a library
NeilBrown [Sat, 31 May 2014 05:51:22 +0000 (15:51 +1000)]
parsergen: discard text_cmp now that it is in a library

Signed-off-by: NeilBrown <neil@brown.name>
10 years agomdcode: normalise text_cmp and export it.
NeilBrown [Sat, 31 May 2014 05:47:25 +0000 (15:47 +1000)]
mdcode: normalise text_cmp and export it.

It is silly using a non-standard text order when it isn't really needed
and other code might benefit from having this function available.

Signed-off-by: NeilBrown <neil@brown.name>
10 years agoindent_test / parsergen: fix various memory leaks.
NeilBrown [Sun, 11 May 2014 07:18:21 +0000 (17:18 +1000)]
indent_test / parsergen: fix various memory leaks.

thanks valgrind...

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoindent_test: make use of new $<N syntax.
NeilBrown [Sun, 11 May 2014 06:59:56 +0000 (16:59 +1000)]
indent_test: make use of new $<N syntax.

This removes a lot of boring code (and also removes some code that was wrong).

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoparsergen: support $<N in code fragments
NeilBrown [Sun, 11 May 2014 06:58:40 +0000 (16:58 +1000)]
parsergen: support $<N in code fragments

This removes the need to add "$N = NULL" to avoid the referenced structure
being removed.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoparsergen: review and update text.
NeilBrown [Sun, 11 May 2014 06:04:53 +0000 (16:04 +1000)]
parsergen: review and update text.

Fix lots of typos, improve poor descriptions, and update some text to
match recent changes.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoindent_test: make use of pointers as types for non-terminals.
NeilBrown [Sun, 11 May 2014 05:33:12 +0000 (15:33 +1000)]
indent_test: make use of pointers as types for non-terminals.

As we are building an AST, pointers work much better and clean
up the code a lot.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoparsergen: allow pointers as well as struct to be associated with nonterminals.
NeilBrown [Sun, 11 May 2014 05:30:53 +0000 (15:30 +1000)]
parsergen: allow pointers as well as struct to be associated with nonterminals.

This makes it a lot easier when building up an AST.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoparsergen: remove unused 'start' field from grammar.
NeilBrown [Sun, 11 May 2014 04:57:10 +0000 (14:57 +1000)]
parsergen: remove unused 'start' field from grammar.

This field is only used to see if we have found the start symbol yet,
and that can be done using "production_count".

Further, the value actually stored here is "-1" as symbol numbers
haven't been assigned yet.

So discard 'start'.  We know that the real 'start' symbol is whatever
starts production '0'.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoparsergen: make use of --tag for calc grammar
NeilBrown [Sun, 11 May 2014 04:23:10 +0000 (14:23 +1000)]
parsergen: make use of --tag for calc grammar

Rather than placing the 'calc' grammar in a separate file,
add 'calc:' tags to the sections and use the --tag option to
extract the grammar directly from the parsergen.mdc file.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoparsergen: add --tag option.
NeilBrown [Sun, 11 May 2014 04:21:26 +0000 (14:21 +1000)]
parsergen: add --tag option.

Normally parsergen extracts three sections: header, code, and grammar.
With "--tag foo", it will ignore anything that doesn't start "foo:",
will extract "foo: header", "foo: code", and "foo: grammar", and only
complain if there are other "foo:" headers.
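
The selection rule for the tagged case can be sketched as a small classifier. This is an illustration under assumptions, not parsergen's code: the enum, the function name, exact (case-sensitive) matching, and skipping spaces after the colon are all choices made for the example.

```c
#include <assert.h>
#include <string.h>

enum section_kind {
	SEC_IGNORE,	/* does not start with "tag:", silently skipped */
	SEC_HEADER,
	SEC_CODE,
	SEC_GRAMMAR,
	SEC_UNKNOWN	/* starts with "tag:" but is not one of the three */
};

/* Classify a section name when "--tag <tag>" was given. */
enum section_kind classify_section(const char *name, const char *tag)
{
	size_t len;

	assert(name != NULL && tag != NULL);
	len = strlen(tag);
	if (strncmp(name, tag, len) != 0 || name[len] != ':')
		return SEC_IGNORE;

	name += len + 1;		/* step past "tag:" */
	while (*name == ' ')
		name++;

	if (strcmp(name, "header") == 0)
		return SEC_HEADER;
	if (strcmp(name, "code") == 0)
		return SEC_CODE;
	if (strcmp(name, "grammar") == 0)
		return SEC_GRAMMAR;
	return SEC_UNKNOWN;		/* the caller would complain here */
}
```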

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agomdcode.mdc: Allow more sections than just Example: to be ignored.
NeilBrown [Sat, 10 May 2014 10:58:02 +0000 (20:58 +1000)]
mdcode.mdc: Allow more sections than just Example: to be ignored.

"Example:" is no longer a special case.  Any section name with
a "Word:" prefix, for some "Word", does not need to be included in
other sections.
"File:" is still a special case of that and will be stored in the
named file.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoparsergen: Don't look beyond the bottom of stack...
NeilBrown [Sat, 10 May 2014 23:43:41 +0000 (09:43 +1000)]
parsergen: Don't look beyond the bottom of stack...

If the stack is empty, looking beyond for whether newlines are
permitted is dangerous.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoparsergen: initialise parser.next properly.
NeilBrown [Sat, 10 May 2014 23:42:02 +0000 (09:42 +1000)]
parsergen: initialise parser.next properly.

Most fields in the 'next' frame should be initialised to zero,
but some need to be properly initialised.  Otherwise we might not handle
early newlines correctly.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoparsergen: don't ignore first token
NeilBrown [Sat, 10 May 2014 23:40:32 +0000 (09:40 +1000)]
parsergen: don't ignore first token

When parsing the grammar we currently skip the first token by mistake.
It is often a TK_newline so not much is lost.  But not always...

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoScanner: minor text updates
NeilBrown [Tue, 3 Jun 2014 11:17:57 +0000 (21:17 +1000)]
Scanner: minor text updates

Signed-off-by: NeilBrown <neil@brown.name>
10 years agoscanner: make sure parsing finishes properly when no final end-of-line.
NeilBrown [Tue, 3 Jun 2014 11:16:58 +0000 (21:16 +1000)]
scanner: make sure parsing finishes properly when no final end-of-line.

If the last node doesn't end with an end-of-line, we still need
to produce all delayed tokens.

Signed-off-by: NeilBrown <neil@brown.name>
10 years agoScanner: parsing of comments and strings must recognise end-of-node
NeilBrown [Tue, 3 Jun 2014 11:15:45 +0000 (21:15 +1000)]
Scanner: parsing of comments and strings must recognise end-of-node

If the file does not end in a newline, then a node might not also.

This requires a little more care in parsing strings and comments.

Signed-off-by: NeilBrown <neil@brown.name>
10 years agoscanner: clarify the interaction between parsing marks and comments/strings
NeilBrown [Tue, 3 Jun 2014 11:14:41 +0000 (21:14 +1000)]
scanner: clarify the interaction between parsing marks and comments/strings

Signed-off-by: NeilBrown <neil@brown.name>
10 years agoscanner: fix up detection of comments blended with marks.
NeilBrown [Tue, 3 Jun 2014 11:07:58 +0000 (21:07 +1000)]
scanner: fix up detection of comments blended with marks.

If we see the start of a comment in an unknown mark, we need to
finish the mark before the start of the comment.

Signed-off-by: NeilBrown <neil@brown.name>
10 years agoScanner: Capital E and P should be usable for exponents/powers.
NeilBrown [Tue, 3 Jun 2014 10:08:13 +0000 (20:08 +1000)]
Scanner: Capital E and P should be usable for exponents/powers.

Signed-off-by: NeilBrown <neil@brown.name>
10 years agoscanner: provide wchar includes for all clients of library.
NeilBrown [Tue, 3 Jun 2014 10:06:26 +0000 (20:06 +1000)]
scanner: provide wchar includes for all clients of library.

As the tokens found by the scanner contain unicode, we should encourage
clients of the library to include the unicode/wchar headers.

Signed-off-by: NeilBrown <neil@brown.name>
10 years agomdcode.mdc: fix error message slightly.
NeilBrown [Sat, 10 May 2014 10:31:06 +0000 (20:31 +1000)]
mdcode.mdc: fix error message slightly.

It is only sections with code which need to be referenced.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoAdd test parser for indent and linebreak handling.
NeilBrown [Sun, 4 May 2014 11:55:29 +0000 (21:55 +1000)]
Add test parser for indent and linebreak handling.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoparsergen: fix a couple of typos in text. linebreakparser
NeilBrown [Sun, 4 May 2014 10:41:50 +0000 (20:41 +1000)]
parsergen: fix a couple of typos in text.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoparsergen: improve tracing of parse for line-oriented details.
NeilBrown [Sun, 4 May 2014 10:40:35 +0000 (20:40 +1000)]
parsergen: improve tracing of parse for line-oriented details.

Report if each state is known to start a line, and if each
frame permits a newline.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoparsergen: track when newline is permitted, and discard if not.
NeilBrown [Sun, 4 May 2014 10:31:05 +0000 (20:31 +1000)]
parsergen: track when newline is permitted, and discard if not.

A newline is only permitted (as a recognised symbol) if we are
parsing a non-indented line-like segment.
If we have seen an internal indent since the last line-like start,
newline tokens should be ignored.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoparsergen: compute starts_line for each state.
NeilBrown [Sun, 4 May 2014 10:21:00 +0000 (20:21 +1000)]
parsergen: compute starts_line for each state.

Using the per-symbol "can_eol" we can deduce for each state whether
it is expected to (sometime) start a line-oriented syntax element.

The flag is "starts_line" and is made available to the parser.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoparsergen: compute "can_eol" for each symbol.
NeilBrown [Sat, 3 May 2014 21:55:03 +0000 (07:55 +1000)]
parsergen: compute "can_eol" for each symbol.

A symbol is "can_eol" if it can derive a phrase which ends with a
newline token.
This will allow us to recognise line-like sections of code and
thus know when to ignore newlines and when not to.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoparsergen: add handling for TK_IN and TK_OUT indentparser
NeilBrown [Sun, 24 Nov 2013 06:54:02 +0000 (17:54 +1100)]
parsergen: add handling for TK_IN and TK_OUT

Indents are tracked.  The end of an indented region forces certain
reductions.  And indents are managed during error handling.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoparsergen: centralise (some of) the collecting of next token.
NeilBrown [Sun, 24 Nov 2013 07:14:11 +0000 (18:14 +1100)]
parsergen: centralise (some of) the collecting of next token.

A future patch will introduce new sites where we want to
discard the current token.
Rather than calling "token_next" at each site, make it possible
to just set "tk = NULL", and the next token will automatically
be collected when needed.
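
A minimal sketch of this "discard by setting NULL, fetch on demand" idea follows. The token source, its fields, and the function names `next_from_source` and `current_token` are hypothetical; `next_from_source` merely stands in for the real "token_next" call, whose signature is not shown in this log.

```c
#include <assert.h>
#include <stddef.h>

struct token {
	int num;
};

/* Hypothetical token source: hands out tokens from a fixed array. */
struct token_source {
	struct token *toks;
	size_t count;
	size_t pos;
};

/* Stand-in for the scanner call; returns NULL when input is exhausted. */
struct token *next_from_source(struct token_source *src)
{
	assert(src != NULL);
	return src->pos < src->count ? &src->toks[src->pos++] : NULL;
}

/*
 * Single place where the next token is collected.  Any site that wants
 * to discard the current token just sets *tk = NULL; the replacement is
 * fetched here the next time a token is needed.
 */
struct token *current_token(struct token_source *src, struct token **tk)
{
	assert(tk != NULL);
	if (*tk == NULL)
		*tk = next_from_source(src);
	return *tk;
}
```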

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoparsergen: record a preferred shift-symbol for error recovery.
NeilBrown [Sun, 24 Nov 2013 06:50:09 +0000 (17:50 +1100)]
parsergen: record a preferred shift-symbol for error recovery.

When we find there is no valid parse step, one option that we don't
currently try is to synthesize a symbol and shift it.  i.e. insert a
missing symbol.

A future patch will provide a circumstance where this is the ideal
response, so here we choose a symbol which could usefully be shifted.
We choose the one that will get us closest to the end of a production.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoparsergen: unify the "next" frame to go onto stack.
NeilBrown [Sun, 24 Nov 2013 06:36:41 +0000 (17:36 +1100)]
parsergen: unify the "next" frame to go onto stack.

We currently have a current 'state' in the parser and a 'sym'
which is a local variable passed around in different ways.
Both of these get pushed onto the stack at the next 'shift'.

We will shortly add some more fields to the stack frame, so unify
'state' and 'sym' into a 'next' struct in the parser struct which can
easily be extended.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoparsergen: various cosmetic fixes
NeilBrown [Sun, 24 Nov 2013 05:53:34 +0000 (16:53 +1100)]
parsergen: various cosmetic fixes

- various typos
- remove double spaces
- re-arrange some text slightly.

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoparsergen: report some tokens better when tracing.
NeilBrown [Thu, 25 Jul 2013 10:23:26 +0000 (20:23 +1000)]
parsergen: report some tokens better when tracing.

Some tokens are best traced by giving their name rather
than their content.  e.g. newline.  So make a special
case of those.

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoRefresh boot-strap files.
NeilBrown [Thu, 25 Jul 2013 10:15:43 +0000 (20:15 +1000)]
Refresh boot-strap files.

There have been some changes to mdcode.mdc, so time to
update the generated files in 'boot-strap'

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoRename Indent and Undent to IN and OUT
NeilBrown [Thu, 25 Jul 2013 10:13:22 +0000 (20:13 +1000)]
Rename Indent and Undent to IN and OUT

These names work better for me.
There is an indent on every line, so the place where the
indent increases shouldn't be called "indent".
And "undent" isn't a word.

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoparsergen: remove 'depth' arg from do_reduce.
NeilBrown [Sun, 21 Jul 2013 10:44:07 +0000 (20:44 +1000)]
parsergen: remove 'depth' arg from do_reduce.

This was only used for tracing and now tracing is done differently.

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoparsergen: allow "$void" to remove current type. draftparser
NeilBrown [Sun, 21 Jul 2013 09:27:27 +0000 (19:27 +1000)]
parsergen: allow "$void" to remove current type.

This allows return to the initial state where nonterminals
have no type.

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoparsergen: improve tracing.
NeilBrown [Sun, 21 Jul 2013 09:18:09 +0000 (19:18 +1000)]
parsergen: improve tracing.

Change the tracing to print the full stack after every step.
This can be cross-referenced with the report that parsergen
can generate to get a full picture of how the parse is progressing.

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoparsergen: make sure we continue making states until all done.
NeilBrown [Sun, 21 Jul 2013 08:20:00 +0000 (18:20 +1000)]
parsergen: make sure we continue making states until all done.

Whenever we add a state, we need to check again, as it might have been
added early in the list.
There is probably a much more efficient way to do this....

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoparsergen: change sort order for items.
NeilBrown [Sun, 21 Jul 2013 08:10:45 +0000 (18:10 +1000)]
parsergen: change sort order for items.

I want items with larger indexes to always precede smaller
indexes.  This makes the report easier to follow.
So negate the index number (and add an offset) when sorting.
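
The trick of negating the index and adding an offset so that an ordinary ascending sort puts larger indexes first can be sketched like this. The item layout, the `MAX_INDEX` offset of 31, and the choice to make the index the primary key with the production number as tie-breaker are assumptions for the example, not details confirmed by this log.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical item: a position 'index' within production 'production'. */
struct item {
	unsigned short production;
	unsigned short index;
};

enum { MAX_INDEX = 31 };	/* assumed upper bound, used as the offset */

/* Larger index gives a smaller key, so ascending order lists it first. */
int item_key(const struct item *it)
{
	assert(it->index <= MAX_INDEX);
	return ((MAX_INDEX - it->index) << 16) | it->production;
}

/* qsort() comparison on the derived key. */
int item_cmp(const void *a, const void *b)
{
	int ka = item_key(a);
	int kb = item_key(b);

	return (ka > kb) - (ka < kb);
}
```

Sorting with `qsort(items, n, sizeof(items[0]), item_cmp)` then lists items with larger indexes before those with smaller ones, with production order preserved among equal indexes.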

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoparsergen: change symset function to use 'unsigned short'.
NeilBrown [Sun, 21 Jul 2013 08:04:24 +0000 (18:04 +1000)]
parsergen: change symset function to use 'unsigned short'.

To make full use of all 16 bits of a sym, we should make sure
we consistently use "unsigned".

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoparsergen: fix bug testing return value for shift()
NeilBrown [Sun, 21 Jul 2013 08:00:13 +0000 (18:00 +1000)]
parsergen: fix bug testing return value for shift()

shift() returns 0 on failure, not negative.

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoparsergen - adjust for recent scanner fix.
NeilBrown [Sun, 21 Jul 2013 07:58:34 +0000 (17:58 +1000)]
parsergen - adjust for recent scanner fix.

scanner was recently fixed to correctly skip over C code with
a string immediately after a '(', so we can remove the extra
spaces we inserted before.

For this to work, we need to stop ignoring TK_String

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoscanner: don't allow an unknown mark to run into a string or comment
NeilBrown [Sun, 21 Jul 2013 07:53:51 +0000 (17:53 +1000)]
scanner: don't allow an unknown mark to run into a string or comment

This means that e.g.
   printf("hello")
where no marks are declared will not treat
    ("
as an unknown mark, but instead find
    (
and then a string.

This is important for skipping over C code in 'parsergen'.

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoscanner: initialise token state properly.
NeilBrown [Sun, 21 Jul 2013 07:52:39 +0000 (17:52 +1000)]
scanner: initialise token state properly.

We need to call do_strip() at the start to ensure the correct
col and indent is used for the first token.

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoscanner: fix do_strip bug.
NeilBrown [Sun, 21 Jul 2013 07:51:37 +0000 (17:51 +1000)]
scanner: fix do_strip bug.

do_strip wasn't stripping tabs properly.

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoMakefile: if bootstrap was required, suggest "make" be run again
NeilBrown [Fri, 12 Jul 2013 21:25:00 +0000 (07:25 +1000)]
Makefile: if bootstrap was required, suggest "make" be run again

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoUpdate .gitignore with new build targets
NeilBrown [Fri, 12 Jul 2013 21:23:03 +0000 (07:23 +1000)]
Update .gitignore with new build targets

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoNew file: parsergen
NeilBrown [Fri, 12 Jul 2013 21:21:37 +0000 (07:21 +1000)]
New file: parsergen

This reads and analyses a grammar and generates a parser.

It includes a simple calculator.

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoscanner: must call close_token before returning the token.
NeilBrown [Fri, 12 Jul 2013 21:18:06 +0000 (07:18 +1000)]
scanner: must call close_token before returning the token.

Signed-off-by: NeilBrown <neilb@suse.de>

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoscanner: make the array of known words "const"
NeilBrown [Fri, 12 Jul 2013 21:15:32 +0000 (07:15 +1000)]
scanner: make the array of known words "const"

After all, it shouldn't change.

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agomdcode: rename code_print to code_node_print and export it.
NeilBrown [Fri, 12 Jul 2013 21:13:55 +0000 (07:13 +1000)]
mdcode: rename code_print to code_node_print and export it.

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoFix typo in mdcode.mdc
NeilBrown [Fri, 12 Jul 2013 21:12:21 +0000 (07:12 +1000)]
Fix typo in mdcode.mdc

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoscanner.mdc: lexical scanner for Ocean.
NeilBrown [Sat, 22 Jun 2013 09:18:55 +0000 (19:18 +1000)]
scanner.mdc: lexical scanner for Ocean.

This scanner does lexical analysis and produces tokens.
It also handles numbers and escapes in strings.

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoMakefile: auto-support .mdc files.
NeilBrown [Sat, 22 Jun 2013 09:17:49 +0000 (19:17 +1000)]
Makefile: auto-support .mdc files.

Any *.mdc causes "md2c" to be run to create a "*.mk" file which is
included and used.

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoRecord in a code_node whether it was indented or not.
NeilBrown [Sat, 22 Jun 2013 07:39:23 +0000 (17:39 +1000)]
Record in a code_node whether it was indented or not.

This is needed to correctly adjust for this indent when processing the
code.

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agomdcode, md2c - extract C code from a literate markdown program
NeilBrown [Wed, 5 Jun 2013 20:20:35 +0000 (06:20 +1000)]
mdcode, md2c - extract C code from a literate markdown program

All C programs here will be written in literate style using markdown.
md2c strips out the C code so that it can be compiled.

Signed-off-by: NeilBrown <neilb@suse.de>