+
+
+ I have a problem.
+ I want
+
+ else: a := b
+
+ to parse the same as
+
+ else:
+     a := b
+
+ and for the last newline to close the elsepart.
+ But the latter has 2 newlines while the former only has one
+ and I don't have any obvious justification for ignoring either.
+ I think it is the Newline before the OUT that is extra.
+
+ I could drop the newline before the OUT, assuming newlines
+ separate things, and the OUT will force any reductions needed.
+ But then we have fewer newlines reported than actual.
+ (Same imbalance happens with multiline comments and strings, so maybe
+ that is OK).
+ Another way to look at it is that the newline following an IN is discarded
+ (or always ignored) and not moved to after the OUT.
+ So (maybe) the newline at an IN or OUT is reported *after* the IN or OUT.
+ so
+ A
+         B
+         C
+     D
+
+ Would be A IN NL B NL C OUT IN NL D OUT NL
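+
+ A rough sketch of a scanner pass that reports the newline *after* any
+ IN or OUT, as proposed above. The function name layout_tokens and the
+ exact indentation of the A/B/C/D example are assumptions made for
+ illustration (B and C deeper than D, D deeper than A); this is not the
+ real scanner.
+

```python
def layout_tokens(text):
    """Turn indented lines into words plus IN / OUT / NL markers."""
    tokens = []
    indents = [0]          # stack of currently open indentation widths
    first = True
    for line in text.splitlines():
        if not line.strip():
            continue       # blank lines are skipped in this sketch
        width = len(line) - len(line.lstrip(" "))
        if not first:
            # Close every indent that is deeper than this line ...
            while indents[-1] > width:
                indents.pop()
                tokens.append("OUT")
            # ... then open a new one if the line is still deeper.
            if width > indents[-1]:
                indents.append(width)
                tokens.append("IN")
            # The line break is reported after any IN or OUT.
            tokens.append("NL")
        first = False
        tokens.extend(line.split())
    # At end of input close the remaining indents, then the last line.
    while len(indents) > 1:
        indents.pop()
        tokens.append("OUT")
    tokens.append("NL")
    return tokens


example = "A\n        B\n        C\n    D\n"
print(" ".join(layout_tokens(example)))
# prints: A IN NL B NL C OUT IN NL D OUT NL
```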
+
+ The parser always ignores the NL after an IN but uses other
+ NL to reduce to a single symbol (if possible)
+ OR maybe it doesn't ignore (unless not line-like context) and
+ lines are preceded by NL, not followed by them...
+ No, followed is usually good.. though separated is better... so precede!!
+
+ Ok, this isn't working.
+ A construct
+ if cond:
+     pass
+
+ cannot be reduced to a Statement until we know what comes next, and it
+ might be separated by several newlines.
+ So the newlines need to be part of the Statement.
+ But that means we cannot have newlines at the front of a statement.
+ But that was the point...
+
+ Maybe a Statementlist is a series of StatementNL followed by a Statement
+
+ We allow
+ StatementNL -> Statement Newlines
+ as a general catch-all, but when we have something like if, or anything
+ with an optional tail "else:" or "case:"
+ We say:
+ StatementNL -> if Expression Block Newlines
+ But that would produce a conflict with
+ Statement -> if Expression Block
+ As a newline could either trigger a reduce to Statement, or a shift.
+ Obviously we shift, but maybe we use precedence to force the point.
+
+ Can we handle 'else if' ...
+ IfStatementNL -> if Expression Block Newlines
+ | if Expression Block else IfStatementNL
+
+ ... I'm contemplating having the parser duplicate NL as necessary, so
+ that
+ if test: action
+ can appear to be followed by 2 NL, one to terminate the 'action' statement
+ and one to terminate the whole 'if'.
+ This might mean I need to extend when NL are discarded - to ensure they
+ don't get duplicated too much.
+ 1/ if state does not permit newlines, discard
+ 2/ else if I can reduce all symbols since start of line, do that.
+ 3/ else if can shift, do that
+ 4/ else if only one symbol since newline, discard.
+ 5/ else ERROR
+
+ This means that we cannot recognise multiple newlines
+ or does it.
+ If we shift a Newline, that is since_newline=0;
+ If we reduce that to Newlines, that is still since_newline=0
+ If 4/discard only applies when since_newline==1 -- we win.
+
+ Currently since_newline essentially means the symbol contains a newline.
+ So 'statements' usually does, but 'statement' doesn't.
+ When we shift the newline and reduce, it all becomes since_newline=0.
+ That is when we want to ignore newlines.
+
+ 15jun2019 - still working this through..
+
+ Normally the parser does
+     shift or else reduce or else error
+ exceptions are TK_in which is simply recorded and
+     TK_out: reduce until there is a TK_in in scope, then cancel, else error
+     TK_newline:
+         if not newline_permitted (indent since last starts_line state)
+             Discard
+         if can Reduce to at most start-of-line, reduce
+         if can Shift, duplicate and Shift
+         if can Reduce, do so
+         if 0 since newline, Discard
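+
+ The TK_newline cascade above, written out as a decision function.
+ This is only a sketch: the function and parameter names are made up
+ here, the parser queries are passed in as plain booleans, and the
+ final "error" is an assumption carried over from the earlier
+ numbered list ("5/ else ERROR").
+

```python
def newline_action(newline_permitted, can_reduce_in_line, can_shift,
                   can_reduce, since_newline):
    """Pick what to do with a TK_newline; the first matching rule wins."""
    if not newline_permitted:
        return "discard"            # indent since last starts_line state
    if can_reduce_in_line:
        return "reduce"             # reduce reaches at most start-of-line
    if can_shift:
        return "shift"              # "duplicate and Shift"
    if can_reduce:
        return "reduce"
    if since_newline == 0:
        return "discard"            # nothing new since the last newline
    return "error"                  # assumed fall-through


print(newline_action(True, False, True, True, 3))   # prints: shift
```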
+
+ since_newline needs to be changed a bit.
+ A TK_newline token *isn't* zero, it is N+1. The token *after*
+ the NEWLINE is zero - so that
+
+ Arg. I'm struggling with the fact that having shifted a newline,
+ we are both at the end of a line, and at the start of the next.
+ When I see a newline, I want to reduce until the end of line
+ is in the same state as the start of that line.
+
+ Maybe I do want newline to be a separator.
+ What if I don't actually include the newline in the grammar, just like in/out.
+ Instead we mark select productions as lines. This is like marking
+ for precedence.
+ A marked production is reduced when a newline is seen providing it won't
+ contain any indents.
+ So: if the reducible item in a state is marked, the start gets marked.
+ When we see a newline, if the state is marked and the reduce size does not
+ exceed since_indent, we reduce. Otherwise we discard.
+ No... I need an error condition too.
+ So I need the state to have a starts_line marking, when a new item is marked.
+
+ So:
+ productions can be marked $$NEWLINE which flags the production as line-like
+ a state with an item with DOT at start of a line-like production is starts_line
+ a state with an item with DOT at the end of a line-like production is ends_line
+ We track indents as before.
+ When we process an indent or newline, we set since_newline to 0
+ When we see a newline we do one of:
+ if not newline_permitted, we discard
+ if top state starts line, we discard
+ else reduce or else error
+
+No.....
+ A production -> { statements }
+ needs to ignore newlines either side of statements.
+ It is a multi-line production - newlines don't matter.
+ Maybe there are several sorts of symbols:
+ - in-line: must not be broken across lines unless indented
+ - line-like: is terminated (reduced) by a newline
+ - multi-line: newlines are ignored
+
+ We tag symbols which are line-like.
+ Any symbol which can derive a line-like symbol is multi-line
+ Any other symbol is in-line.
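+
+ A small sketch of that three-way classification as a fixed-point
+ computation. The toy grammar, the dict-of-productions representation
+ and the name classify are assumptions for illustration only; a tagged
+ line-like symbol is assumed to stay line-like even if it derives
+ another line-like symbol.
+

```python
def classify(grammar, line_like):
    """Label each non-terminal line-like, multi-line or in-line."""
    multi_line = set()
    changed = True
    while changed:                       # iterate to a fixed point
        changed = False
        for head, bodies in grammar.items():
            if head in line_like or head in multi_line:
                continue
            # A symbol that can derive a line-like symbol is multi-line.
            if any(sym in line_like or sym in multi_line
                   for body in bodies for sym in body):
                multi_line.add(head)
                changed = True
    return {head: ("line-like" if head in line_like else
                   "multi-line" if head in multi_line else "in-line")
            for head in grammar}


# Hypothetical fragment, loosely based on the names used in these notes.
GRAMMAR = {
    "Block": [["{", "Statements", "}"]],
    "Statements": [["SimpleStatements"], ["Statements", "SimpleStatements"]],
    "SimpleStatements": [["Expression"]],
    "Expression": [["NAME"]],
}
kinds = classify(GRAMMAR, {"SimpleStatements"})
for name, kind in kinds.items():
    print(name, kind)
```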
+
+ So SimpleStatements, ifhead, elsepart, casepart etc are linelike
+
+$line SimpleStatements IfHead ....
+
+ A state that is at the start of a linelike symbol starts_line
+ Any state in a multi-line production starts_line
+
+ if tos starts_line, newlines are ignored.
+ else if there is an indent since the starts_line, newlines are ignored.
+ But if there are symbols since starts_line, we have to reduce until we
+ are in a starts_line state (or can see an indent).
+
+No....
+ Block -> : Statementlist
+is none of these. It must be reduced by a newline, but isn't entirely line-like.
+ Block -> { Statementlist }
+is multi-line
+
+but maybe Block here is neither. It only becomes linelike when it
+terminates a Statement, which is linelike. Or terminates an ifHead
+
+Should this be legal?
+ a:=b;pass if something
+probably not. I want to require at least ';' or NEWLINE.
+That means I need to include NEWLINE in the grammar.
+ Statements -> SimpleStatements ; Statement
+ | Statements Statement
+ | Statement
+
+ if cond: if cond: a:=b
+NEWLINE reduces this down to IfHead
+IfHead -> IfHead NEWLINE
+Statement -> IfHead
+ | IfHead ElsePart
+
+ElsePart -> else BLOCK
+ | else IfHead
+ | else IfHead ElsePart
+ | ElsePart NEWLINE
+
+
+Statements -> Statements Statement
+ | Statement
+
+
+if cond { statements } else { statements}
+but not
+if cond: statements else: statements
+
+so :statements must expect a NEWLINE but then
+ if cond: if cond: statement
+expects 2 NEWLINEs.
+Maybe
+ Block -> : IN statements NEWLINE
+if there is no indent, we synth one which triggers an OUT NEWLINE pair.
+This could be automatic.
+ If a linelike is followed by a newline, we synthesise an IN before it.
+
+That requires a hack to the scanner: Synth Indent
+
+What if
+
+ Block -> { Statements }
+ | : Statements NEWLINE
+ | : SimpleStatements
+
+Then
+ if Expr Block
+
+might not end in a NEWLINE so else could come immediately. Is that OK?
+ if expr: a:=0 if expr
+must be forbidden. That requires a newline.
+
+ Block -> : IN statements OUT NEWLINE
+
+In marks state as 'needs_indent' If an indent arrives, fine.
+If something else, we record that next newline (After balanced in/out)
+must synth extra out/newline.
+
+ Block -> { Statements }
+ | : statementblock
+
+ statementblock -> Statements $$line
+
+$$line means it must be reduced by a newline. If something else tries,
+it is an error and we skip to newline.
+It also strips everything but NEWLINE from the (effective) lookahead
+to avoid reporting conflicts, as those things will never be shifted.
+
+ IfHead -> if Expr Block
+ | IfHead NEWLINE
+ Block -> { statements }
+ | : statements $$line
+
+ IfStatement -> IfHead
+ | IfHead IfTail
+ | IfHead else IfStatement
+ IfTail -> else Block
+ | IfTail NEWLINE
+
+
+------
+20jun2019 Happy 87th birthday Dad.
+
+I'm not convinced about $$NEWLINE
+
+ else: simplestatementlist
+
+should be able to parse simplestatementlist without a newline, and
+use the newline to close the if/else.
+whereas
+ else:
+ statementlist
+
+has a newline to close the statementlist and another to close the if/else.
+But can the LR parser tell the difference?
+It only sees that newlines don't forcibly reduce the else:
+So when it sees the newline at the end of simplestatementlist,
+it cannot shift because there is a sub-line thing that can be reduced.
+So this becomes elsepart before the newline is absorbed.
+Whereas in statementlist, the newline can be shifted creating simplestatementline.
+
+What about
+ if cond: if cond: statement
+
+Again, the newline cannot be shifted while we can reduce
+
+But.... how does conflict analysis know that an 'if', for example, is not
+permitted after simplestatementlist?
+
+Ahh.. This is exactly what $$NEWLINE is for. Maybe it should be $$OUT.
+Either way, the grammar is ambiguous and relies on newlines or indentation to
+close the production, and this fact needs to be explicit.
+Requiring OUT is probably best as it means
+ if cond:
+ statements
+ else:
+ statements
+
+works even though there is no newline after the first statements.
+Here I want the 'statements' to be closed by OUT, but the whole to
+be closed by NEWLINE.
+So maybe I need both $$NEWLINE and $$OUT ??
+
+
+$$OUT makes lots of sense. It is exactly how we expect :statements to
+be closed - where we allow NEWLINE to have the same effect.
+
+$$NEWLINE is good for closing a complex if or for etc. It means that
+nothing else can be on the same line - allowing for indents.
+How do we implement that?
+Any production in the grammar that represents a full line but doesn't
+end with a newline should be marked $$NEWLINE
+The head of that production should recursively absorb NEWLINEs.
+
+I'm not yet clear on exactly the difference between $$OUT and $$NEWLINE.
+I would put $$OUT after
+ block -> : Statements $$OUT
+and $$NEWLINE at the end of a statement that must end a line
+ condstatement -> Ifpart IfSuffix $$NEWLINE
+ | condstatement NEWLINE
+
+Maybe I need a worked example.
+ while conda:
+     if condb: if condc: action
+ pass
+
+So after action there is NL OUT NL pass
+The NL sees that it can reduce, and the if allows the NL to reduce it so
+ while COND : IN if COND : statement [ NL OUT NL ]
+again the NL can reduce. Note that we *don't* absorb the NL in the statement
+ while COND : IN statement [ NL OUT NL ]
+Now we can shift the NL
+ while COND : IN statement [ OUT NL ]
+Now the OUT forces a reduction
+ while COND BLOCK(in) [ OUT NL ]
+Now the out is cancelled
+ while COND BLOCK [NL]
+and the while is reduced.
+
+So the $$NEWLINE must always see a newline (or $$EOF)
+An $$OUT must see an OUT or a NEWLINE (if there was no IN)
+
+$$OUT causes the LA set for items with the production to be empty.
+It is never credible that anything will be shifted so any apparent LA
+contents can be ignored.
+The state when a $$OUT is reducible has a precedence higher than any terminal, so
+nothing can be shifted and no completion should be possible.
+The state when a $$NEWLINE is reducible is much the same.
+
+Maybe I don't want NEWLINE in the grammar, only $$NEWLINE??
+How would we recognize a blank line?
+ command -> $$NEWLINE
+??
+We would need a new rule for discarding newlines.
+e.g. when the top-but-one state is start-of-line we discard and mark the top
+state s-o-l. That stops us discarding a newline until it reduces something that is at the start of a line....
+
+1/ if there is an indent since the last start-of-line state, discard NEWLINEs
+2/ if ....
+
+Q: When is a NEWLINE an error?
+A: when it isn't ignored and we cannot reduce and
+ top or top-but-one state isn't starts_line.??
+
+So we need extra state info and extra frame info.
+
+State has:
+ - starts-line - is at start of a $$NEWLINE production
+ - ends-line - is at end of unreduced $$NEWLINE production
+ - ends-indent - is at the end of a $$OUT production
+ - min-prefix - how far back a 'in' can be and still cancel
+
+Frame has:
+ - indents - count in or after that sym
+ - line_start - there was a line start (IN or NEWLINE) immediately after
+ the symbol
+ - newline_permitted: no indent since start-line
+ - since_indent: number of frames where indents==0
+ - since_newline: number of frames where line_start==0
+
+If we see a NEWLINE then:
+ if ! newline_permitted, discard
+ elseif can reduce and reduce_count <= since_newline - reduce
+ elseif since_newline <= 1, and state.starts-line, discard and record line_start
+ else error
+
+If we see an IN
+ increment indents, set line_start
+
+If we see an OUT
+ if reduce_size <= since_indent, reduce
+ if min_prefix >= since_indent, cancel
+ else error
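+
+A sketch of the frame bookkeeping and the three token rules above, to
+check that they hang together. Everything here is an assumption made
+for illustration: the Frame fields mirror the list above, starts_line
+is stored on the frame as a stand-in for the state flag, reduce_count
+is None when the top state has nothing to reduce, and an empty search
+for a starts_line state is treated as "not permitted".
+

```python
from dataclasses import dataclass


@dataclass
class Frame:
    indents: int = 0           # INs recorded in or after this symbol
    line_start: bool = False   # a line start came immediately after it
    starts_line: bool = False  # state pushed with this frame starts a line


def count_back(stack, stop):
    """Number of frames, from the top, before stop(frame) is true."""
    n = 0
    for frame in reversed(stack):
        if stop(frame):
            break
        n += 1
    return n


def since_indent(stack):
    return count_back(stack, lambda f: f.indents > 0)


def since_newline(stack):
    return count_back(stack, lambda f: f.line_start)


def newline_permitted(stack):
    # No indent since the most recent starts_line state.
    for frame in reversed(stack):
        if frame.indents > 0:
            return False
        if frame.starts_line:
            return True
    return False


def on_newline(stack, reduce_count):
    if not newline_permitted(stack):
        return "discard"
    gap = since_newline(stack)
    if reduce_count is not None and reduce_count <= gap:
        return "reduce"
    if gap <= 1 and stack[-1].starts_line:
        stack[-1].line_start = True      # discard and record line_start
        return "discard"
    return "error"


def on_in(stack):
    stack[-1].indents += 1
    stack[-1].line_start = True


def on_out(stack, reduce_size, min_prefix):
    gap = since_indent(stack)
    if reduce_size is not None and reduce_size <= gap:
        return "reduce"
    if gap < len(stack) and min_prefix >= gap:
        stack[len(stack) - 1 - gap].indents -= 1   # cancel that IN
        return "cancel"
    return "error"


stack = [Frame(starts_line=True), Frame(), Frame()]
print(on_newline(stack, 2))      # reduce: two symbols, all on this line
on_in(stack)
print(on_newline(stack, 2))      # discard: indent since the line start
print(on_out(stack, None, 1))    # cancel: the IN is on the top frame
```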
+
+How does error handling work?
+Normally we pop states until we can shift ERROR
+Then we discard tokens until we can shift one.
+
+However we need to do something different for IN OUT NEWLINE.
+For IN, we simply increment a counter
+For OUT we decrement if it is positive.
+ If it is zero and the state ends-indent, then we are synced.
+ If it doesn't, we need to pop more states until we have an indent to cancel.
+For NEWLINE if the state ends-line or ends-indent and ...something... we are synced.
+ else we skip it??
+
+... no, that doesn't work because I cannot see a way to describe an optional newline.
+
+Let's try with just $$OUT which requires OUT or NEWLINE...
+We put $$OUT on productions that must be closed in a 2-d obvious way.
+So they can be at the end of a line or at the end of an indented block.
+So
+ : statements $OUT
+means the next line after the : cannot be indented.
+However
+ Block -> : statementline | : statementblock | { statements }
+ statementblock -> statements $NEWLINE
+means I can have else indented, or on same line as single statement
+
+ if cond: a = b; else: b = a
+ if cond:
+ a = b
+ else:
+ b = a
+
+The whole 'if' needs a $NEWLINE marking to ensure a following statement isn't
+indented.
+So implementation is almost exactly what I have:
+ - if anything else is lookahead when reducing that production, it is an error.
+ - remove non-newlines from lookahead in items
+
+But I don't think $$OUT is quite what I want to call it.
+That doesn't quite cover end-of-line possibilities.
+Maybe allow $$NEWLINE or $$OUT but with same behaviour.
+
+.... still not there.
+Another way to satisfy a $$OUT reduction is for it to already look right.
+So: No indents and at start-of-line
+
+But that upsets the modification to look-ahead as we can no longer assume
+the next token.
+I think this might be more like a precedence thing??
+Without look-ahead modification, the first token of a statement can be shifted
+ before a newline forces a reduction....
+
+Maybe I do need two sorts of productions.
+ $$OUT requires an out/newline to reduce it.
+ $$NEWLINE follows either $$OUT or NEWLINE and requires start-of-line and no indents.
+ or $$OUT or $$NEWLINE
+
+Or does it matter. Over-modification of the look-ahead suppresses warnings, but
+doesn't affect the parse.
+Will we get warnings anyway?
+
+
+--------
+Are left-recursive symbols in a non-final position always bad?
+
+Left-recursive symbols cannot be closed by forcing a reduction.
+So if one starts in an indented region (in which newlines are ignored)
+it could continue afterwards - unless we make that an explicit error somehow.
+If they appear at the end of some other production, that one will (maybe)
+be reduced as well so (maybe) no problem...
+
+if cond:
+       a()
+       b()
+    c()
+
+is weird and I want to forbid it. All that is between b() and c() is
+NL OUT IN. NL closed b(), so it is just OUT IN
+So I do want the statementlist to close.
+
+ a :=
+       1 + 2
+    * 3 + 4
+
+is very wrong. How much can I help?
+The OUT will reduce "1 + 2" which will then become
+ ((1 + 2) * 3) + 4
+which would be highly confusing.
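+
+To make the difference concrete, here is the arithmetic for both
+readings (the variable names are just labels for this note):
+

```python
conventional = 1 + 2 * 3 + 4         # '*' binds tighter: 1 + (2 * 3) + 4
forced_by_out = ((1 + 2) * 3) + 4    # "1 + 2" reduced first by the OUT
print(conventional, forced_by_out)   # prints: 11 13
```
+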
+So something about this must be disallowed.
+Maybe when newlines are ignored, OUT doesn't force a reduce??
+I can make it an error by having Expression reduce to something else.
+
+Do I want an error even for
+ 1 + 2
+ * 3 + 4
+??
+
+I could achieve that by adding extra checks when we SHIFT at
+start of line.
+If we could reduce tokens since previous SOL, then we have 2D ambiguity.
+
+ 1 + 2 *
+ 3 + 4
+
+That is just as ambiguous, but we cannot reduce anything.
+When we see the second '+', the reduction crosses a line-start but doesn't
+result in a line-start.
+
+So: a reduce that doesn't contain an indent, but does contain a start-of-line
+must reduce to that start of a line.
+
+This means we need to keep the start-of-line when we "IGNORE" a newline.
+
+Can I use this sort of logic to avoid the need for the extra reduction,
+or for the $$OUT markings??
+
+1/ The point of extra reduction is to avoid consuming more after an OUT or
+ ignored NL.
+ if cond:
+       a()
+       b()
+    c()
+
+ must be an error. The OUT reduced a()b(). The stack is then
+ if cond : statements(n1) . IN Ident
+ The first indent is gone. There is no error until we see all of c() so
+ if cond : statements(n1) simplestatement(i) . NL OUT NL
+
+ Is it a problem that the simplestatement is indented?
+ if cond : statements(n1) statement(i) . OUT NL
+
+ Q: is it an error to reduce a sequence containing an (uncancelled) indent?
+
+2/ The $$OUT markings guard against exactly a reduction containing an uncancelled IN.
+
+So maybe I have two new rules.
+
+ 1/ a reduction must not include any uncancelled indent. pop() must return 0.
+ 2/ a reduction that contains an unindented start-of-line must begin with start-of-line.
+ So when we cancel an indent, we also cancel line starts since there.
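+
+A sketch of those two rules as a check made before a reduce. The
+representation is assumed: each popped frame is an (indents,
+line_start) pair, where line_start means a line start came immediately
+after that symbol, and preceded_by_line_start says whether the first
+popped symbol itself began a line. A line start after the last popped
+symbol is treated as lying outside the reduction.
+

```python
def reduction_allowed(popped, preceded_by_line_start):
    """popped: list of (indents, line_start) pairs, oldest first."""
    # Rule 1: no uncancelled indent inside the reduction.
    if sum(indents for indents, _ in popped) != 0:
        return False
    # Rule 2: if the body crosses a line start it must begin at one.
    crosses_line = any(line_start for _, line_start in popped[:-1])
    if crosses_line and not preceded_by_line_start:
        return False
    return True


# "2 * <newline> 3" reduced in the middle of "1 + 2 * / 3 + 4":
# it crosses a line start but does not begin at one, so it is refused.
print(reduction_allowed([(0, False), (0, True), (0, False)], False))
```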
+
+One other value of $$OUT is that it avoided conflicts - most symbols could not
+be shifted. That should have only applied to $$NEWLINE(!) and doesn't apply
+at all if I drop the marking and use internal rules instead.
+So how do I avoid reporting conflicts?
+
+Really, there shouldn't be any conflict as NEWLINE should be expected.
+Let's go back to that idea.
+
+1/ A linelike thing MAY start with Newlines and MUST end with a NEWLINE
+2/ A SimpleStatement is not linelike and doesn't include any Newlines
+3/ if condition : SimpleStatement
+ is a SimpleStatement.
+
+4/ When we see the NEWLINE after "if condition : SimpleStatement" we have a shift/reduce
+ conflict as we could SHIFT to make a complex statement, or reduce the whole thing
+ to a SimpleStatement.
+ Default action is SHIFT but in this case we want REDUCE - due to precedence?
+
+ However when we see the NEWLINE after "if condition :IN Simplestatement"
+ we cannot REDUCE as there is an uncancelled indent, so we have to shift.
+
+ But when we reduce, we only want to Reduce to IfHead so that an 'else' can appear
+ on the next line.
+
+ If we see IN .. just continue.
+
+What do I need to do:
+
+ 1/ Change grammar to expect blank-lines before and to have a NEWLINE at the end
+ of any line-like thing.
+ This requires IfHeadNL and IfHead. ditto for switch, while, then ...
+ This gets complex with
+ for a:=0; then a += 1; while a < 10:
+ which could have several newlines
+ for a:=0
+ then a+=1 ; while a < 10:
+
+ ForPart -> for simplestatements ;
+ | for simplestatements NEWLINE
+ | for Block
+ | Newlines ForPart
+
+ ThenPart -> then SimpleStatements ;
+ | then SimpleStatements NEWLINE
+ | then BLOCK
+ | Newlines ThenPart
+
+ 2/ disallow Reduce when embedded indents - report ERROR
+ 3/ disallow Reduce when embedded start-of-line.
+ 4/ TK_newline uses these rules to decide when to force a reduce.
+
+A/ A parser symbol that starts after an IN must end before the OUT
+B/ A parser symbol that starts before an IN must end at-or-after the OUT
+ only if the symbol is not line-like ???
+
+C/ A parser symbol that starts after a line-start and before an indent must end
+ by the end of line
+D/ A parser symbol that starts at a line-start must end before the end-of-line,
+ or at a subsequent end-of-line.
+
+A is satisfied by forcing a reduce on OUT and reporting error if IN cannot be cancelled
+B is satisfied if we report an error if we try to reduce an uncancelled IN
+C is satisfied by forcing a reduce *after* shifting NL and reporting ERROR if
+ min_prefix exceeds the line
+D is satisfied if we report an error when reducing at eol crosses a NL and doesn't start
+ at start-of-line.
+
+C is interesting - do we reduce *after* shifting NL?? I think we do, yes.
+
+So: when can I suppress conflicts, and how do I handle reduce/reduce conflicts?
+
+I need to be sure that a line-like ends with an unindented newline.
+I can trigger an error when that doesn't happen, but I want more.
+I want to encourage it to happen. So if the grammar allows a NEWLINE it
+will be shifted in, but if we have already seen an OUT, we ignore the NEWLINE
+rather than trigger an error.
+Also, I need to not report shift/reduce conflicts on whatever comes next.
+i.e. if
+ a -> b c
+and c can end with a, then both c and a can be followed by the same things.
+This is a conflict. If c (and a) end with NEWLINE we declare the conflict
+resolved.
+
+An IfHead might not end with a NEWLINE. So to make a statement we need
+to follow it with an optional NEWLINE. Let's see if we can make that work.
+
+What is a SimpleCondStatement? It has no blank lines or unindented breaks..
+
+ ForPart ThenPart WhilePart CondSuffix OptNL
+
+We don't need a distinct Simple class!! If it didn't start at SOL,
+then unindented NEWLINEs must be terminal Wooho!!
+
+This requires that "Statements" doesn't insist on following NEWLINE
+
+A SimpleStatement can be followed by a ';', a Statement cannot. That
+difference is still needed.
+So
+ SimpleStatements -> SimpleStatement | SimpleStatements ; SimpleStatement
+
+SimpleStatements can end with a ComplexStatement and no NEWLINE.
+ComplexStatements must end with a NEWLINE after each statement except the last
+
+Each Part (including SSlist) ends with arbitrary NEWLINEs. These will
+only ever be at the same indent level.
+A ComplexStatement must be separated from next by a NEWLINE.
+So if the final non-empty Part does not end with NEWLINEs, how do we require one?
+Maybe not..
+
+What if a Part doesn't end with NEWLINE ever, but can start with them
+
+CondStatement -> IfPart Newlines
+ | IfPart IfTail...
+
+I think I need a CondStatement which doesn't end with a newline and a
+CondStatementNL which does. Then anything that can end a cond statement
+must come in two versions.
+ IfPart ElsePart CasePart WhilePart CondSuffix
+
+If we expect the non-NL, we accept the NL but not vice-versa.
+
+------
+problem.
+in
+ if cond:
+     cmd1
+     cmd2
+
+the 'if' that started before the indent must finish at/after the indent.
+But in
+ if a = b or
+ c = d :
+ do something
+The Expr that started before the first indent may finish well before the indent finishes.
+I think this is because Expr is not linelike but 'if' is.
+
+So I don't want an error when reducing if there is an indent, unless the new top state
+starts_line
+...
+
+OK, I'm up to the part where I need to hide conflicts that I can automatically resolve.
+I have:
+
+ State 7 has 2 (or more) reducible items
+ IfHead -> IfHeadNL . [25]
+ IfStatementNL -> IfHeadNL . [27] (2Right)
+ State 35 has 2 (or more) reducible items
+ IfTailNL -> else IfStatementNL . [32] (2Right)
+ IfStatement -> IfStatementNL . [30]
+
+I need to clarify the rules that I'm working with.
+
+1/ Statements might not end with a NL but as it is linelike...
+
+2/ IfHeadNL IfStatementNL IfTailNL all end with a NEWLINE
+ IfHead IfStatement IfTail might not (but they may)
+
+Why did I want IfHead -> IfHeadNL??
+Because I might have
+ if cond:
+ action
+
+
+ else: bar
+
+No, that is still an IfStatementNL. Once there is any NL, we cannot fit it
+on a line.
+
+Hmm... the FooNL pattern is getting out of control.
+Why do I need this again?
+
+Because when "if cond : statements" is followed by a NEWLINE I need to
+hang on to the parser state - not reducing to statement - until I see an 'else' or don't.
+
+If I do see the else, the difference doesn't matter. If I don't then I need
+to know if I have a NEWLINE.
+
+So I could have
+ IfHead -> if Expression : Statements
+ IfHeadNL -> IfHead Newlines
+
+ IfElse -> IfHead else
+ | IfHeadNL else
+
+ Statement -> IfHeadNL
+
+ But I want "if Expr then SimpleStatements"
+ where SimpleStatements can end with "IfHead"
+So when I see:
+
+ if foo : if bar : baz NEWLINE
+
+That NEWLINE mustn't turn "if bar: baz" into an IfHeadNL
+ We need to first turn "if bar: baz" into a SimpleStatement, then
+ "if foo : SimpleStatement" into an "IfHead", but NOT into a SimpleStatement.
+
+Arg. I might not know to reduce something until I've seen an IN. It is the
+'else'
+
+What if an Statement *always* ends with a newline.
+So
+ if Expr : Statement
+also ends with a newline and can be a statement
+But if there is an IN after ':' the newline is hidden.
+So that doesn't work.
+
+What if a NEWLINE absolutely has to be at the top level.
+If a symbol contains a NEWLINE, then it must be at the start of a line,
+possibly indented.
+So if it isn't indented, it mustn't contain a NEWLINE - no NEWLINE will get shifted in.
+
+ Statement -> if Expr : Statements
+ | SimpleStatements
+
+What does Statements look like? It must end a NEWLINE
+ Statements -> Statements Statement NEWLINE
+ | Statement NEWLINE
+ | Statements NEWLINE
+ | SimpleStatements
+
+ SimpleStatements -> SimplePrefix ; Statement
+ SimplePrefix -> SimpleStatement
+ | SimplePrefix ; SimpleStatement
+
+
+01July2019
+ I think I have a very different approach - it incorporates a lot of
+ the ideas so far and is maybe better.
+
+ From the top:
+
+ We have a simplified SLR(1) grammar where each state has at most one
+ reducible production. We don't have an action table, but use the goto table
+ to decide if a terminal can be shifted. If it can, we do. If it cannot,
+ we reduce or trigger an error.
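+
+ A minimal sketch of that driver loop, assuming a hand-built goto
+ table for the toy grammar  E -> E + n | n  (the grammar, the state
+ numbering and the "accept" marker are invented for this example).
+ There is no action table: a terminal is shifted if the goto table
+ has an entry for it, otherwise the state's single reducible
+ production is reduced, otherwise it is an error.
+

```python
# goto table: state -> {symbol: next state}
GOTO = {
    0: {"n": 1, "E": 2},
    2: {"+": 3, "$eof": "accept"},
    3: {"n": 4},
}
# At most one reducible production per state: state -> (head, body size)
REDUCE = {1: ("E", 1), 4: ("E", 3)}


def parse(tokens):
    """Return the list of parser actions taken for a token sequence."""
    stack = [0]
    trace = []
    for tok in list(tokens) + ["$eof"]:
        while True:
            state = stack[-1]
            target = GOTO.get(state, {}).get(tok)
            if target == "accept":
                trace.append("accept")
                return trace
            if target is not None:           # can shift, so we do
                stack.append(target)
                trace.append("shift " + tok)
                break
            if state in REDUCE:              # cannot shift: reduce
                head, size = REDUCE[state]
                del stack[-size:]
                stack.append(GOTO[stack[-1]][head])
                trace.append("reduce " + head)
                continue
            raise SyntaxError("unexpected " + tok)   # or trigger an error
    return trace


print(parse(["n", "+", "n"]))
```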
+
+ Onto this we add handling for IN/OUT and NL. NL can appear in the grammar,
+ IN/OUT cannot.
+
+ Any non-terminal which can derive a NL is deemed "line-like".
+ Such non-terminals will normally appear at the start of a line - possibly indented.
+ These non-terminals can have some productions that have a NL (usually at the end)
+ and some that contain no NL.
+ If a non-terminal appears other than at the start of a line then no NL will ever
+ be shifted into it, so a production without NL will be used.
+ If it does appear at the start of a line, then any production can be used, though
+ it must end up ending in a NL.
+
+ The above paints an incorrect picture of how LR parsing works. At any given
+ time you don't know what non-terminal is being matched, so we cannot exclude
+ NL based on the non-terminal. We only know what parser state(s) we are in.
+ So: any parser state which is at the start of a line-like non-terminal is
+ flagged as "starts_line".
+ Also, in each state we store a "min prefix" which is the minimum non-zero number of
+ symbols before "dot" in any item in the set. This gives a sense of where we are
+ in the parse.
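+
+ A sketch of how those two per-state values could be computed from an
+ item set. The item representation (head, body, dot) and the function
+ name are assumptions; "at the start of a line-like non-terminal" is
+ read here as "dot sits immediately before a line-like non-terminal".
+

```python
def state_flags(items, line_like):
    """items: iterable of (head, body, dot); returns (starts_line, min_prefix)."""
    items = list(items)
    # starts_line: some item has dot just before a line-like non-terminal.
    starts_line = any(dot < len(body) and body[dot] in line_like
                      for _head, body, dot in items)
    # min_prefix: smallest non-zero number of symbols before dot.
    prefixes = [dot for _head, _body, dot in items if dot > 0]
    min_prefix = min(prefixes) if prefixes else 0
    return starts_line, min_prefix


# Hypothetical state reached after shifting ':' in  Block -> : Statements
after_colon = [
    ("Block", (":", "Statements"), 1),
    ("Statements", ("Statement",), 0),
    ("Statements", ("Statements", "Statement"), 0),
]
print(state_flags(after_colon, {"Statements", "Statement"}))
# prints: (True, 1)
```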
+
+ If min_prefix of the top state is less than the number of symbols since start-of-line,
+ then we will not SHIFT a newline.
+
+ Indents (IN) are recorded after the symbol they follow. If there is an IN
+ since the most recent starts_line state, then any NL is ignored.
+ An OUT will cancel the most recent IN, providing it is in the top min_prefix symbols.
+ If not, we need to reduce something first.
+
+ So when we see an OUT, we reduce until we can cancel.
+ When we see a NL, we reduce until the min_prefix reaches at least to the
+ start of the line. Then we can shift the NL.
+ After shifting the NL, the whole line should be reduced.
+
+ When a line-like non-terminal produces a sequence that *doesn't*
+ end with an explicit NEWLINE, the grammar analysis ensures that
+ nothing can be shifted in after the end of the production. This
+ forces it to be reduced into the non-terminal.
+
+ For example
+ simplestatement -> var = expr | print expr
+ simplestatements -> simplestatement | simplestatements simplestatement
+ SSline -> simplestatements NEWLINE | simplestatements ; condstatement NEWLINE
+
+ statement ->..
+ | simplestatements NEWLINE
+ | simplestatements ; statement
+ | if expr then statements NEWLINE
+ | if expr then statements Newlines else statements NEWLINE
+
+ When a non-terminal is explicitly followed by a NEWLINE, it is line-like
+ also if it contains a NEWLINE or linelike, it is linelike.
+
+
+ ifhead -> if expr : statements
+ | ifhead NEWLINE
+
+ iftail -> else : statements
+ | else ifstatement
+
+ ifstatement -> ifhead
+ | ifhead iftail
+
+ statement -> simplelist | ifstatement | Newlines simplelist | Newlines ifstatement
+
+ statements -> statement | statements statement
+
+ A line-like that contains newlines must be reduced by OUT or NEWLINE.
+
+ How can I know that a statements can be followed by else in
+ if cond : statements else: statements
+ or
+ IfHead IfTail
+ but not by 'if' in
+ statements statement
+
+ Maybe I could have a $sol token.
+ If at $sol, and cannot shift then try shifting $sol..
+ then
+ statements -> statements $sol statement
+ or maybe $eol is better, then we can have NEWLINEs at start of statement.
+ OR maybe either... $linebreak is shifted if previous or next is NEWLINE
+ An IN doesn't allow a LINEBREAK.
+
+Just to repeat myself:
+ Arg. I might not know to reduce something until I've seen an IN. It is the
+ 'else'
+
+After above, new approach: IfHead and Statement have NL versions, nothing else does.
+Maybe fixed the above...
+
+So
+
+ StatementNL -> Statement NEWLINE | IfHeadNL
+ Statements -> StatementNL | Statements StatementNL
+ StatementList -> Statement | Statements
+
+ Block -> : Statements | Open Statements Close
+
+ IfStatement -> IfHead | IfHead IfTail | IfHeadNL IfTail
+ IfHead -> if Expr Block
+ IfHeadNL -> IfHead NEWLINE | IfHeadNL NEWLINE
+ IfTail -> else Block | else IfStatement
+
+Close, but the IfHeadNL in "else IfStatement" cannot accept newlines.
+What if
+ IfHead -> if Expr Block | IfHeadNL else IfHead | IfHead else IfHead
+
+No, I think I have to take a totally different approach.
+
+IfPart elsepart switchpart whilepart etc are all syntactically valid
+as stand-alone statements in the base grammar.
+We use the code to fail a stand-alone elsepart that isn't preceded by an ifpart, whilepart or casepart.
+
+So statements never contain newlines, only the statement-list does.
+It puts a NEWLINE at the end of each statement
+ statementlist -> statement | statementlist NEWLINE statement
+statement can be empty string, thus allowing blank lines and a NEWLINE at the end,
+which the parser will require.
+
+
+[[ thought experiment - interesting, but gets unwieldy with
+ more complex statements
+statements -> simpleline NL statements
+ | ifhead NL iftail NL statements
+ | ifhead NL statements
+ | ifstatement NL statements
+
+iftail -> else block
+ | else ifhead iftail
+ | NL iftail
+
+ifhead -> Newlines if expr block
+ifstatement -> ifhead iftail
+]]