+
+27jan2021
+ I need a new idea concerning starts-line states. I need some refinement somehow
+ The state
+ block -> { statementlist . }
+ should ignore newlines - providing statementlist isn't recursive - but doesn't
+ because
+ block -> { . statementlist }
+ is further up the stack, and that is a startsline state
+
+ Maybe the thing is that the latter is startsline only because of statementlist, and
+ now that statementlist is gone, the startsline-ness lapses.
+
+ So in the former state, it is not startsline, and it is not terminal, so it
+ suppresses a startsline state 2 levels up.
+
+ But does that help? We would suppress the startsline-ness but there are
+ no remaining indents to ignore the newline.
+ Why can I ignore a newline in "if cond { st }" but not in "a = ( x )"
+ ??
+ Ahhh. This helps because the new top startsline would be at the start
+ of a line, so newlines can be shifted. The grammar can explicitly
+ allow a newline there... only then the state becomes a startsline
+ state?? or does it? But it is the top state, so it doesn't matter.
+
+ Rule: a NEWLINE cannot be SHIFTed if the topmost active startsline state
+ is not at the non-indented start of a line. This is because newline
+ must be meant to end a line started earlier - where starts-line was at
+ the beginning of a line.
+ The stop state is never "active" as the line it would start hasn't
+ actually started. If the shifted newline reduces immediately, the
+ grammar is probably broken.
+ Also a state is inactive if a subsequent state declares it to be. This
+ happens when a state is non-terminal (not reducible), and is not startsline.
+ The smallest prefix length of all core items indicates how many
+ preceding states are deactivated. If min-prefix is N, then N-1 states
+ are deactivated.
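
A minimal sketch of the rule above, in Python, under one reading of the notes. The `State` record, its field names, and `can_shift_newline` are hypothetical and are not taken from the real parser. It also assumes that a stack with no active startsline state permits the shift, which the notes do not say.

```python
from dataclasses import dataclass

@dataclass
class State:
    starts_line: bool    # state is a startsline state
    reducible: bool      # state is "terminal" (can reduce)
    min_prefix: int      # smallest prefix length over all core items
    at_line_start: bool  # state was entered at the non-indented start of a line

def can_shift_newline(stack):
    """stack[-1] is the top of the parse stack."""
    suppressed = 0  # how many states below the current one are deactivated
    for depth, st in enumerate(reversed(stack)):
        deactivated = suppressed > 0
        if suppressed > 0:
            suppressed -= 1
        # A non-reducible, non-startsline state with min-prefix N
        # deactivates the N-1 states beneath it.
        if not st.starts_line and not st.reducible:
            suppressed = max(suppressed, st.min_prefix - 1)
        # The top state is never active; deactivated states are skipped.
        if depth == 0 or deactivated:
            continue
        if st.starts_line:
            return st.at_line_start
    return True  # assumption: nothing active forbids the shift
```

With the stack for "block -> { statementlist . }" on top (min-prefix 2), the startsline state "block -> { . statementlist }" directly beneath it is skipped, and the decision falls to the next startsline state further down.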
+
+
+ So what do I need to code:
+ - I need to record with each state how far back it suppresses
+ start-line states.
+ - enhance test for shifting newline
+
+30jan2021
+ OK, the parsing code seems to do what I want, now I need to fix the grammar.
+ The context is structure statements which contain lines. e.g.
+ if cond:
+ statements
+ else:
+ statements
+
+ The "if cond: statements" is a whole line so it looks like a statement.
+ But then we see "else" which isn't the start of a statement.
+ I've considered two avenues.
+ 1/ decide that "else: statements" is a valid statement and generate errors
+ in the semantic analysis if the preceding statement doesn't like the else.
+ 2/ enumerate all the possibilities to the grammar as 1 or more lines.
+ ifstatement -> ifline | ifheadline elseline ...
+ But that seems problematic with cascaded "else if"
+
+ So let's try avenue 1. "else block" and "else ifstatement" are statements.
+
+03feb2021
+ indent_test seems to work, now trying to convert ocean.
+ My plan is that the various parts of a condstatement can either be
+ all on one "line", or some of them on their own lines.
+ The parts are:
+
+ for then while do case* else
+ switch case* else
+ if then else
+
+ a for,while,switch,if,do can start a statement
+ and this determines what other parts are allowed.
+ So we need to allow continuations of
+
+ after for
+ then? while case* else?
+ after while
+ do* case* else?
+ after switch
+ case* else?
+ after if
+ then? else?
+ after do
+ -nothing
+
+
+ But wait... what happens with "else"?
+ I want to allow "else" to be followed by a CondStatement so
+ if cond:
+ stuff
+ else if cond:
+ stuff
+
+ works. I guess there is not much of an issue there: the 'else' becomes an
+ optional prefix to a condstatement.
+ Calling var_block_close at the right time might be awkward as we don't
+ know when we are parsing the end of a CondStatement.
+
+ Pause and reflect: what is the problem we are trying to solve, and does
+ it still apply?
+
+ The problem is newlines. When we see one we don't know whether to
+ reduce to a Statement or just to an (e.g.) IfPart.
+ We would need to allow several Newlines while staying at IfPart.
+ Then if we see 'else' we shift that, otherwise reduce to Statement
+
+ ifstatement -> ifhead elsepart
+ | ifheadnl elsepart
+ | ifheadnl
+
+
+ But wait... indent_test is broken!!
+ If I indent the 'else' one space, it looks like an ElseStatement after
+ the Statementlist that should be closed - but is recursive.
+ I can change it to a BStatementlist, but there is nothing to force that
+ to reduce. We prevent shifting until the outdent is cleared, but that
+ happens with the Statementlist. Maybe don't clear the outdent if the
+ top symbol state had a reduce-length of 1.??
+
+ OK.. that's fixed. Let's get back to the bigger problem.
+
+ A statement can be:
+ ->
+ | simplestatements NEWLINEs
+ | IfHeadNL
+ | IfHead IfSuffixNL
+ | IfHeadNL IfSuffixBL
+ | SwitchPart CondSuffixNL
+ | SwitchPartNL CondSuffixNL
+ | WhilePart CondSuffixNL
+ | WhilePartNL CondSuffixNL
+ | ForPart WhilePart CondSuffixNL
+ | ForPart WhilePartNL CondSuffixNL
+ | ForPartNL WhilePart CondSuffixNL
+ | ForPartNL WhilePartNL CondSuffixNL
+
+ ... and some for ThenPart and ThenPartNL
+
+ ForPart -> for simplestatements
+ | for Block
+ ForPartNL -> ForPart NEWLINE
+ | ForPartNL NEWLINE
+ IfHeadNL -> IfHead NEWLINE
+ | IfHeadNL NEWLINE
+ IfSuffixNL -> IfSuffix NEWLINE
+ | else Block NEWLINE
+ | else statement
+ SwitchPart -> switch Expr
+ | switch Block
+ SwitchPartNL -> SwitchPart NEWLINE
+ | SwitchPartNL NEWLINE
+ CondSuffixNL -> IfSuffixNL
+ | CasePart CondSuffixNL
+ | CasePartNL CondSuffixNL
+
+ CasePart -> case Expr Block
+ CasePartNL -> CasePart NEWLINE
+ | CasePartNL NEWLINE
+
+05feb2021
+
+ Above looks promising but doesn't quite work.
+ The "statement" after an "else" must be "statementNONL" because no
+ further newline is expected, but even then it isn't quite right
+
+ if expr1:
+ stat1
+ else if cond2:
+ stat2
+
+ scans as: if expr1 : IN stat1 NL OUT IN else if cond2 : IN stat2 NL OUT NL OUT NL
+
+ whereas
+ if expr1 :
+ stat1
+ else if cond2: stat2
+
+ scans as: if expr1 : IN stat1 NL OUT IN else if cond2 : stat2 NL OUT NL
+
+ In both cases there are more NLs than things that need to be ended.
+ We always want a NL for the starting 'if', and in the first case we need a NL
+ for 'stat2'. I wonder what that means.
+
+ Separately
+
+ if cond block else block NL
+
+ because the state before 'else' is startsline the NEWLINE cannot be shifted.
+ That seems to mean the NEWLINE must be in the production that starts the line,
+ so "CasePartNL" etc cannot be used.....
+
+ Bingo(??) I change each statement type to be a FooNL, or list thereof, with
+ FooNL -> stuff and nonsense NEWLINE
+ | FooNL NEWLINE
+
+ But what about that extra NL .... which now seems not to be a problem
+
+ Ah-ha. The second (of 3) is ignored because it is indented. All good (for now).
+
+06feb2021
+ The longest multi-line thing is
+ For Then While Do Case... Else
+
+ Each can be on a new line, or on previous line.
+ How can Case be handled? I guess they all need to be the same.
+
+ What about
+ if cond1:
+ stat1
+ else if cond2:
+ stat2
+ else if cond3....
+
+ ??? That looks awkward.
+
+ Can I have
+ For -> ForPart
+ | For NEWLINE
+ ??
+ I should test and see. ... I don't think so. At least not without more
+ smarts for newline handling.
+
+ So back to
+ For Then While Do Case... Else NEWLINE
+
+ Other forms are
+
+ ForNL Then While Do Case... Else
+ ForNL ThenNL While Do Case... Else
+ ForNL ThenNL WhileNL Do Case... Else
+ For Then While Do Case... Else
+ For Then While Do Case... Else
+ For Then While Do Case... Else
+ For Then While Do Case... Else
+
+ more than 64 combinations....
+
+ First line is one of:
+
+ For
+ For Then
+ For Then While
+ For Then While Do
+ For Then While Do Case
+ For Then While Do Case Else
+
+ Then
+ Then.. 5 options
+ then 4, 3, 2, 1
+ Maybe only 21 parts
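
The two counts above can be checked with a small sketch. It assumes "parts" means contiguous runs of the six-part sequence, and that each of the six parts independently may or may not be followed by a newline. The names `PARTS` and `contiguous_runs` are illustrative only.

```python
PARTS = ["For", "Then", "While", "Do", "Case", "Else"]

def contiguous_runs(parts):
    """Every non-empty contiguous run, e.g. ['Then', 'While', 'Do']."""
    n = len(parts)
    return [parts[i:j] for i in range(n) for j in range(i + 1, n + 1)]

runs = contiguous_runs(PARTS)
# Runs starting at For: 6, at Then: 5, then 4, 3, 2, 1.
run_count = len(runs)              # 6 + 5 + 4 + 3 + 2 + 1
nl_variants = 2 ** len(PARTS)      # newline / no newline after each part
```

Under those assumptions `run_count` is 21 and `nl_variants` is 64, which matches the figures in the notes.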
+
+ Cases should be easy. A list of caselines, each as list of case parts.
+ Followed by an elseline which has zero or more caseparts and an elsepart.
+
+ I think I need to change how NEWLINE is handled, do minprefix differently.
+ It is used to ignore stuff when deciding which startsline states can prevent a
+ newline from shifting. Review exactly what is wanted there.
+
+ What exactly do I do with newlines?
+ - If a production contains a literal NEWLINE, the head is marked line-like
+ - forbid shifting NEWLINE when recent starts_line state is not at actual
+ start of line... but ignore intermediate states based on min_prefix
+ - record where lines actually start
+ - ignore if indent since starts-line state
+ and that is all.
+
+ Note that any state where an item starts with a line-like symbol is a
+ starts-line state.
+ Any state that can reduce to a line-like symbol requires indents to be
+ balanced.
+ starts_line states only affect ignoring newlines and choosing when to
+ allow shift, as described above.
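
A rough sketch of the first two definitions, assuming a grammar held as (head, body) pairs and LR items as (head, body, dot) triples. It reads "an item starts with a line-like symbol" as "the symbol just after the dot is line-like", which is an interpretation rather than something the notes state. The function names are hypothetical.

```python
GRAMMAR = [
    ("ForPart", ("for", "Block")),
    ("ForPartNL", ("ForPart", "NEWLINE")),
    ("ForPartNL", ("ForPartNL", "NEWLINE")),
    ("Statement", ("ForPartNL", "WhilePart")),
]

def line_like_heads(grammar):
    """Heads of productions whose body contains a literal NEWLINE."""
    return {head for head, body in grammar if "NEWLINE" in body}

def is_starts_line(items, line_like):
    """True if any item in the state has a line-like symbol after its dot."""
    return any(
        dot < len(body) and body[dot] in line_like
        for _head, body, dot in items
    )

LINE_LIKE = line_like_heads(GRAMMAR)
```

Here only `ForPartNL` is line-like, so a state holding the item "Statement -> . ForPartNL WhilePart" counts as starts-line, while one holding "Statement -> ForPartNL . WhilePart" does not.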
+
+ Thoughts:
+ I could extend 'line-like' to any production containing a symbol that
+ starts with NEWLINE. The Newlines would work.
+ Rather than 'min_prefix' I could store 'since-newline-or-start' so
+ that multiple newlines in a production would make sense.
+
+10feb2021
+ New thoughts. I wonder if they will work.
+
+ Change the scanner to produce paired SOL and EOL tokens, where EOL is
+ much like NEWLINE currently and is delayed by paired IN/OUT.
+ Also skip blank lines, so only get a SOL if there is text on the line.
+
+ Now a production needs to be explicit about being at the start of a
+ line.
+ Maybe we can even do
+ OptNL ->
+ | EOL SOL
+
+ So:
+ statement -> SOL SimpleStatements EOL
+ | SOL CondStatement EOL
+
+ If the grammar requires an EOL followed by an EOL, there must be an
+ implied OUT.
+
+ in "if cond block else"
+ how do we know when the "block" is finished so that the "else" can be
+ shifted?
+ The expansion of 'block' will (possibly) end with an EOL. For "else" to
+ follow EOL without a SOL, there must be an OUT.
+
+12feb2021
+ I need to clarify how the scanner must work for SOL/EOL so that I can
+ write code that works.
+
+ SOL needs to be generated when we see a non-space character on a new line.
+ This is the same time that we need to possibly generate IN, which is in
+ check_indent.
+ So at start of line we scan for non-space, then unget and set check_indent.
+ In check_indent we assume start-of-line and generate SOL after any IN.
+
+ EOL needs to be generated after we see a NEWLINE (or maybe EOF) on a
+ non-empty line. It may be delayed until after indents, so we need to store
+ it. We delay it until after multiple blank lines, so we always need to
+ store it. So ->indent_eol[->indent_level] is a delayed EOL, if ->num
+ is not TK_error.
+
+ I think we need a flag for 'at start of line' which means the line
+ seen so far is empty. So much like my "non_empty"
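
The scanner behaviour described above could be sketched as below. This is a line-based simplification, not the real scanner: words are split on whitespace, each leading whitespace character counts as one column, and a partial outdent simply closes every deeper level. `pending_eol` plays the role of the stored per-level delayed EOL; the name `scan` and the list-of-strings token form are assumptions.

```python
def scan(text):
    tokens = []
    indents = [0]          # stack of indent widths
    pending_eol = [False]  # pending_eol[i]: delayed EOL owed at level i

    def unwind(indent):
        # Emit delayed EOLs and OUTs until the stack matches `indent`.
        while True:
            if pending_eol[-1]:
                tokens.append("EOL")
                pending_eol[-1] = False
            if len(indents) > 1 and indent < indents[-1]:
                indents.pop()
                pending_eol.pop()
                tokens.append("OUT")
            else:
                return

    for line in text.splitlines():
        words = line.split()
        if not words:
            continue  # blank line: no SOL, any pending EOL stays delayed
        indent = len(line) - len(line.lstrip())
        if indent > indents[-1]:
            # Deeper line: the outer EOL stays pending until the OUT.
            indents.append(indent)
            pending_eol.append(False)
            tokens.append("IN")
        else:
            unwind(indent)
        tokens.append("SOL")  # SOL comes after any IN
        tokens.extend(words)
        pending_eol[-1] = True
    unwind(-1)  # end of input closes everything
    return tokens
```

For the input "a", then an indented "b", then "c" at the original level, this yields SOL a IN SOL b EOL OUT EOL SOL c EOL, so the EOL for the first line arrives only after the indented block has closed.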
+
+ OK - much easier to get it right once I've thought it through :-)