X-Git-Url: https://ocean-lang.org/code/?p=ocean-D;a=blobdiff_plain;f=twod;h=4c9a44ec077b31265f5e7aef4df1ad204cae9ddc;hp=9d94cda7e6be8ffb17ed13dbb1415abbc254ccca;hb=ca5610c0c857f3c4829fadde9f4a76f5010ffd2a;hpb=ee1ea60b6b67ecf08b62b1051a6da41f8eb5adc4
diff --git a/twod b/twod
index 9d94cda..4c9a44e 100644
--- a/twod
+++ b/twod
@@ -2896,7 +2896,7 @@ ifstatement -> ifhead iftail
 review/summary. We track indents and newlines. The goal is resolve ambiguities and detect errors. Ambiguities are resolved by forcing a REDUCE in some circumstances when an OUT or NEWLINE is seen.
- Errors happen when an there are too many OUTs.
+ Errors happen when there are too many OUTs.
 NEWLINEs are a normal part of a grammar, except that they get ignored sometimes when they are not relevant. and are protected.
@@ -3319,3 +3319,261 @@ ifstatement -> ifhead iftail
 seen so far is empty. So much like my "non_empty" OK - much easier to get it right once I've thought it through :-)
+
+13feb2021
+
+ This isn't quite working how I had hoped :-(
+ The "EOL SOL" pair, or more the "SOL else" pair suggests I need a look-ahead
+ of 2 to recognise if I have an IfSuffix or not.
+ But I know an LR(2) can be re-written as LR(1) (Did I learn that in uni?)
+ How can I do that?
+
+ Statementlist -> SOL SimpleStatements EOL Statementlist
+     | SOL Ifhead EOL Statementlist
+     | SOL Ifhead IfSuffix Statementlist
+     | SOL IfHead EOL SOL IfSuffix Statementlist
+     |
+ So if we see EOL SOL we can wait for else, which leads to IfSuffix, or
+ something else for StatementList.
+ But I don't want to allow StatementList to be empty. I can achieve this
+ by duplicating the above for a StatementList_nonempty. A bit ugly.
+
+ Also, this is right-recursive which uses a lot of stack.
+ I can compress it a bit. By making an IfStat include the following statement.
+ SL -> Stat | SL Stat
+
+ Stat -> SOL SimpList EOL
+     | IfX Stat
+     | IfX SOL IfSuffix
+     | SOL IfHead IfSuffix
+
+ IfX -> SOL IfHead EOL
+ IfHead -> if Expr Block
+ IfSuffix -> else Block
+     | else IfHead
+     | else IfHead IfSuffix
+     | else IfHead EOL SOL IfSuffix
+     | else IfHead EOL Stat
+
+
+ Getting there... (again).
+ Problem:
+   if cond1:
+     if cond2:
+       stat1
+     else:
+
+ The 'else' pairs with cond2.
+ There is an EOL after "if cond2: stat1" and then "SOL else"
+ which looks just the same as
+   if cond1:
+     if cond2:
+       stat1
+   else:
+
+ The only difference is an extra OUT IN which we currently ignore.
+
+ How can I use the OUT?
+ I have
+   SOL IFHead EOL .... OUT IN SOL
+ and I need the OUT to tell me to Reduce, or to block the Shift of SOL.
+ But if I simply block Shift when I have an OUT, the SOL IfHead EOL
+ becomes a Statement which is merged into the StatementList and then
+ the SOL is Shifted. I need to go all the way to make that Statementlist
+ a Block and IfHead.
+ If I hold out with the OUT longer until reduce_size!=1
+ I get further but
+   IfHead else IfHead .... EOL
+ cannot shift the EOL
+
+ Maybe I need to use min_prefix, but I really don't like that.
+ Need to think this through.
+
+ Well, I have it working.
+
+ I suppress shift if there are outs EXCEPT for TK_eol. Why?
+ Also I use the Bstatementlist indirection
+ and don't cancel the out if reduce_size==1
+
+ It's a bit clunky. Can I justify it?
+
+ I'd like the tokens to be different. With
+   if cond:
+     st
+   else:
+
+ The SOL before the else is ignored because we don't expect SOL there.
+ Trouble is in the problem case, SOL doesn't get ignored until later.
+
+ Can I *only* prevent a shift of SOL when it is unbalanced?
+
+ So: prevent shift of SOL if there is an uncancelled out, otherwise it will
+ be assumed to be at the wrong level.
+ Better, but not completely happy...
+
+14feb2021 valentines day
+
+ What if the rule for cancelling indents was that the cancel couldn't cross
+ a starts-line state.
+ How would that work out?
+
+15feb2021
+ I didn't have time to pursue that, and now I'm a lot less convinced.
+
+ New idea: Allow IN and OUT in the grammar, and selectively ignore them
+ like we do with SOL EOL.
+ That way, OUT could force a reduce which could not then be extended, so that
+ whole issue of recursive productions becomes moot.
+
+ When are indents relevant? Maybe we have starts-block states which
+ expect IN, and will ignore IN if there is an indent since the last
+ starts-block state.
+ So
+   block -> : IN statementlist OUT
+       | : simplestatements
+ would ignore IN until we hit the :, then IN becomes relevant.
+ If we don't see an IN it must be simplestatements. Do we allow IN
+ there-in? Probably not. It would look confusing.
+ But if we get an IN, then we start ignoring INs again.
+
+ The OUT absolutely must balance the IN, so we ignore OUT whenever the matching
+ IN was ignored.
+
+ We still refuse to skip OUT if the matching IN is too far away. Must be in top
+ frame.
+
+ Clarify handling of OUT when the IN was ignored...
+ A linelike production that started before the IN must not reduce until
+ after the OUT???
+
+ Any production that started after the IN must reduce before the OUT.
+ We don't force it to reduce, we flag an error.
+ So if we reduce some symbols which contain more OUT than IN, that is
+ an error
+
+17feb2021
+ I need to track in/out carefully so they match properly and I ignore the right
+ OUTs.
+ IN is ignored whenever SOL/EOL would be. OUT is ignored precisely when the matching
+ IN was ignored.
+ I also want to track all ins and outs until they cancel in a reduction.
+ It is only at the reduction step that we can determine if an error occurred.
+ An error is when a symbol contains nett negative indent.
+ So we can just count indents in each symbol.
+ Some in/out are within symbols, possibly IN and OUT. Others which are ignored
+ exist between symbols. A frame holds (symbol+internal indents),(state+pending indents).
+ To track which OUT to ignore we need a depth count and a bit-set.
+ If a bit is set, then the IN was ignored so the OUT must be too.
+ If clear, the IN was shifted, so the OUT must be too.
+
+ I need to get indents_on_line right.
+ Previously I tracked them before this frame. I don't know why...
+ I want 0 when starts_line
+
+19feb2021
+ OK, new approach is looking really good. Need to make sure it isn't too hard
+ to use.
+ Tricky area is multi-line statements that don't *have* to be multi-line.
+
+ We cannot reduce "SOL IfHead EOL" to a statement as we cannot tell if it
+ is complete until we shift the SOL and look for an "else".
+ One option is "statement -> SOL IfHead EOL statement | SOL IfHead EOL IfTail"
+ So "statement" is really a sublist of statements.
+ Easy in indent_test, what about in ocean?
+
+ There are lots of parts that can be on a line:
+   if, else, for, then, while, do, switch, case
+
+ if and while can be "expr block" or "block" and the thenpart/dopart
+ else can be "block" or "statement"
+ then is optional in for, required in some if
+
+ ifpart -> if expr block | if block then block | if block EOL SOL then block
+
+ OR??
+
+ ifpart -> if expr block EOL SOL | if block then block EOL SOL...
+
+ What if I support backtracking over terminals? So if I cannot shift
+ and cannot reduce, I back up until I can reduce, then do so?
+
+ Then I can shift the SOL and if there is an else, I'm good. If not I back up
+ and reduce the statement
+ So
+   statement -> SOL simple EOL
+       | SOL ifhead EOL
+       | SOL ifhead EOL SOL elsepart EOL
+       | SOL ifhead elsepart EOL
+ would work.
+ But do I need it?
+
+ statement -> simple EOL
+     | ifhead EOL
+     | ifhead EOL SOL statement
+     | ifhead EOL SOL iftail
+     | whilepart
+     | forhead whilepart
+     | switchead casepart
+
+
+ ifhead -> if block then block | if expr block | if block EOL SOL then block
+ iftail -> else block | else statement
+
+ whilehead -> while expr block | while block EOL SOL do block | while block do block
+ whilepart -> whilehead EOL
+     | whilehead EOL SOL statement
+     | whilehead casepart
+     | whilehead EOL SOL casepart
+
+ casepart -> casehead casepart
+     | casehead EOL SOL casepart
+     | casehead EOL SOL statement
+     | iftail
+ casehead -> case expr block
+
+22feb2021
+ I've had a new idea - let's drop SOL! Now that I have IN, it isn't really needed.
+ We can assume SOL follows EOL or IN .... maybe.
+ Problem is if we want to require IN/OUT around something that is not line-oriented.
+ Might that ever matter?
+ No, I don't think so.
+
+23feb2021
+ Maybe this makes it really really easy.
+ We don't mark different sorts of states, and we only track which indents were
+ 'ignored'.
+
+ Then:
+ IN never causes a reduction, it is either shifted or ignored.
+ An EOL is ignored if the most recent IN was ignored, otherwise it is a normal
+ token.
+ An OUT is similarly ignored if the matching indent was ignored. It also
+ cancels that indent.
+
+ Is that too easy?
+
+ .... no, it seems to work.
+
+ So: back to the ocean grammar
+
+ statement -> simple EOL
+     | ifhead EOL
+     | ifhead EOL iftail
+     | whilepart
+     | forhead whilepart
+     | switchead casepart
+
+
+ ifhead -> if block then block | if expr block | if block EOL then block
+ iftail -> else block EOL | else statement
+
+ whilehead -> while expr block | while block EOL do block | while block do block
+ whilepart -> whilehead EOL
+     | whilehead casepart
+     | whilehead EOL casepart
+
+ casepart -> casehead casepart
+     | casehead EOL casepart
+     | casehead EOL
+     | iftail
+ casehead -> case expr block
+
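The 23feb2021 rules (IN is shifted or ignored; EOL is ignored if the most recent IN was ignored; OUT is ignored if its matching IN was, and cancels it) can be sketched with the depth count and bit-set from 17feb2021. This is only a rough Python sketch, not the parser's code. IndentTracker, is_ignored and the in_wanted flag (standing in for "the current state can shift IN") are made-up names, and treating EOL as a normal token when no IN is open is an assumption.

```python
class IndentTracker:
    """Hypothetical helper: remembers which open INs were ignored."""

    def __init__(self):
        self.depth = 0    # number of INs not yet cancelled by an OUT
        self.ignored = 0  # bit i set => the IN at depth i was ignored

    def is_ignored(self, token, in_wanted=False):
        """Return True if IN/EOL/OUT should be ignored, False otherwise.

        in_wanted is an assumption standing in for a parse-table lookup:
        True when the current state is able to shift an IN.
        """
        if token == "IN":
            # IN never causes a reduction: it is shifted or ignored.
            if not in_wanted:
                self.ignored |= 1 << self.depth
            self.depth += 1
            return not in_wanted
        if token == "EOL":
            # Ignored only if the most recent open IN was ignored.
            # With no open IN it is a normal token (assumption).
            if self.depth == 0:
                return False
            return bool((self.ignored >> (self.depth - 1)) & 1)
        if token == "OUT":
            if self.depth == 0:
                raise ValueError("OUT without a matching IN")
            # The OUT cancels the matching IN and shares its fate.
            self.depth -= 1
            bit = 1 << self.depth
            was_ignored = bool(self.ignored & bit)
            self.ignored &= ~bit
            return was_ignored
        return False  # every other token is handled normally


if __name__ == "__main__":
    t = IndentTracker()
    # An indent the grammar did not ask for, then one it did.
    for tok, wanted in [("IN", False), ("EOL", False), ("IN", True),
                        ("EOL", False), ("OUT", False), ("OUT", False)]:
        print(tok, "ignored" if t.is_ignored(tok, wanted) else "normal")
```

A single integer as the bit-set keeps the sketch short; a real parser would presumably need a bounded set and would also have to count indents per symbol to flag the "too many OUTs" error at reduction time, which this sketch does not attempt.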