review/summary.
We track indents and newlines. The goal is resolve ambiguities and detect errors.
Ambiguities are resolved by forcing a REDUCE in some circumstances when an OUT or NEWLINE is seen.
- Errors happen when an there are too many OUTs.
+ Errors happen when there are too many OUTs.
NEWLINEs are a normal part of a grammar, except that they get ignored sometimes when they are not relevant.
and are protected.
seen so far is empty. So much like my "non_empty"
OK - much easier to get it right once I've thought it through :-)
+
+13feb2021
+
+ This isn't quite working how I had hoped :-(
+ The "EOL SOL" pair, or more the "SOL else" pair suggests I need a look-ahead
+ for 2 to recognise if I have an IfSuffix or not.
+ But I know an LR(2) grammar can be rewritten as LR(1) (Did I learn that in uni?)
+ How can I do that?
+
+ StatementList -> SOL SimpleStatements EOL StatementList
+ | SOL IfHead EOL StatementList
+ | SOL IfHead IfSuffix StatementList
+ | SOL IfHead EOL SOL IfSuffix StatementList
+ |
+ So if we see EOL SOL we can wait for else, which leads to IfSuffix, or
+ something else for StatementList.
+ But I don't want to allow StatementList to be empty. I can achieve this
+ by duplicating the above for a StatementList_nonempty. A bit ugly.
+
+ Also, this is right-recursive which uses a lot of stack.
+ I can compress it a bit by making an IfStat include the following statement.
+ SL -> Stat | SL Stat
+
+ Stat -> SOL SimpList EOL
+ | IfX Stat
+ | IfX SOL IfSuffix
+ | SOL IfHead IfSuffix
+
+ IfX -> SOL IfHead EOL
+ IfHead -> if Expr Block
+ IfSuffix -> else Block
+ | else IfHead
+ | else IfHead IfSuffix
+ | else IfHead EOL SOL IfSuffix
+ | else IfHead EOL Stat
+
+
+ Getting there... (again).
+ Problem:
+ if cond1:
+ if cond2:
+ stat1
+ else:
+
+ The 'else' pairs with cond2.
+ There is an EOL after "if cond2: stat1" and then "SOL else"
+ which looks just the same as
+ if cond1:
+ if cond2:
+ stat1
+ else:
+
+ The only difference is an extra OUT IN which we currently ignore.
+
+ How can I use the OUT?
+ I have
+ SOL IfHead EOL .... OUT IN SOL
+ and I need the OUT to tell me to Reduce, or to block the Shift of SOL.
+ But if I simply block Shift when I have an OUT, the SOL IfHead EOL
+ becomes a Statement which is merged into the StatementList and then
+ the SOL is Shifted. I need to go all the way to make that StatementList
+ a Block and IfHead.
+ If I hold out with the OUT longer until reduce_size!=1
+ I get further but
+ IfHead else IfHead .... EOL
+ cannot shift the EOL
+
+ Maybe I need to use min_prefix, but I really don't like that.
+ Need to think this through.
+
+ Well, I have it working.
+
+ I suppress shift if there are outs, EXCEPT for TK_eol. Why?
+ Also I use the Bstatementlist indirection
+ and don't cancel the out if reduce_size==1
+
+ It's a bit clunky. Can I justify it?
+
+ I'd like the tokens to be different. With
+ if cond:
+ st
+ else:
+
+ The SOL before the else is ignored because we don't expect SOL there.
+ Trouble is in the problem case, SOL doesn't get ignored until later.
+
+ Can I *only* prevent a shift of SOL when it is unbalanced?
+
+ So: prevent shift of SOL if there is an uncancelled out, otherwise it will
+ be assumed to be at the wrong level.
+ Better, but not completely happy...
+
+14feb2021 valentines day
+
+ What if the rule for cancelling indents was that the cancel couldn't cross
+ a starts-line state. How would that work out?
+
+15feb2021
+ I didn't have time to pursue that, and now I'm a lot less convinced.
+
+ New idea: Allow IN and OUT in the grammar, and selectively ignore them
+ like we do with SOL EOL.
+ That way, OUT could force a reduce which could not then be extended, so that
+ whole issue of recursive productions becomes moot.
+
+ When are indents relevant? Maybe we have starts-block states which
+ expect IN, and we ignore IN if there is an indent since the last
+ starts-block state.
+ So
+ block -> : IN statementlist OUT
+ | : simplestatements
+ would ignore IN until we hit the :, then IN becomes relevant.
+ If we don't see an IN it must be simplestatements. Do we allow IN
+ there-in? Probably not. It would look confusing.
+ But if we get an IN, then we start ignoring INs again.
+
+ The OUT absolutely must balance the IN, so we ignore OUT whenever the matching
+ IN was ignored.
+
+ We still refuse to skip OUT if the matching IN is too far away. Must be in top
+ frame.
+
+ Clarify handling of OUT when the IN was ignored...
+ A linelike production that started before the IN must not reduce until
+ after the OUT???
+
+ Any production that started after the IN must reduce before the OUT.
+ We don't force it to reduce, we flag an error.
+ So if we reduce some symbols which contain more OUT than IN, that is
+ an error
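
The reduce-time check described above could be sketched roughly as follows. This is only an illustration in Python, not the parser's actual code: the `Frame` class, its `indents` field, `IndentError`, and `reduce_frames` are all hypothetical names, and the sketch assumes each stack frame carries a single net count (IN minus OUT) for its symbol.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    symbol: str
    indents: int = 0  # hypothetical: net IN minus OUT seen inside this symbol


class IndentError(Exception):
    """Raised when a reduced symbol contains more OUT than IN."""


def reduce_frames(stack, count, new_symbol):
    """Pop `count` frames and push one frame for `new_symbol`.

    The new frame inherits the summed indent count of the popped frames.
    A negative sum means an OUT closed an indent that was opened before
    this production started, which the notes treat as an error.
    """
    start = len(stack) - count
    popped = stack[start:]
    del stack[start:]
    net = sum(f.indents for f in popped)
    if net < 0:
        raise IndentError(f"{new_symbol} contains {-net} unmatched OUT")
    frame = Frame(new_symbol, net)
    stack.append(frame)
    return frame
```

Nothing is forced to reduce here. The error only surfaces when a reduction happens to gather an unbalanced set of frames, which matches "we don't force it to reduce, we flag an error".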
+
+17feb2021
+ I need to track in/out carefully so they match properly and I ignore the right
+ OUTs.
+ IN is ignored whenever SOL/EOL would be. OUT is ignored precisely when the matching
+ IN was ignored.
+ I also want to track all ins and outs until they cancel in a reduction.
+ It is only at the reduction step that we can determine if an error occurred.
+ An error is when a symbol contains nett negative indent.
+ So we can just count indents in each symbol.
+ Some in/out are within symbols, possibly IN and OUT. Others which are ignored
+ exist between symbols. A frame holds (symbol+internal indents),(state+pending indents).
+ To track which OUT to ignore we need a depth count and a bit-set.
+ If a bit is set, then the IN was ignored so the OUT must be too.
+ If clear, the IN was shifted, so the OUT must be too.
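
A minimal sketch of the depth count plus bit-set idea, written in Python purely for illustration. The `IndentTracker` class and its method names are hypothetical. The decision whether a given IN is ignored is assumed to come from the parser state (wherever SOL/EOL would be ignored), so here it is just passed in as a flag.

```python
class IndentTracker:
    """Remember, per nesting level, whether the IN was ignored or shifted."""

    def __init__(self):
        self.depth = 0    # number of INs not yet matched by an OUT
        self.ignored = 0  # bit i set => the IN at depth i was ignored

    def on_in(self, ignore):
        """Record an IN; `ignore` is the parser's decision for this token."""
        bit = 1 << self.depth
        if ignore:
            self.ignored |= bit
        else:
            self.ignored &= ~bit
        self.depth += 1
        return "ignore" if ignore else "shift"

    def on_out(self):
        """Treat an OUT exactly as its matching IN was treated."""
        if self.depth == 0:
            raise ValueError("OUT without a matching IN")
        self.depth -= 1
        bit = 1 << self.depth
        was_ignored = bool(self.ignored & bit)
        self.ignored &= ~bit  # clear the bit so the level can be reused
        return "ignore" if was_ignored else "shift"
```

For example, an ignored IN followed by a shifted IN gives back "shift" for the first OUT and "ignore" for the second, so each OUT mirrors its own IN regardless of what happened at other levels.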
+
+ I need to get indents_on_line right.
+ Previously I tracked them before this frame. I don't know why...
+ I want 0 when starts_line