ocean-lang.org Git - ocean/blob - csrc/oceani.mdc

   1 # Ocean Interpreter - Stoney Creek version
   2
   3 Ocean is intended to be an compiled language, so this interpreter is
   4 not targeted at being the final product.  It is, rather, an intermediate
   5 stage, and fills that role in two distinct ways.
   6
   7 Firstly, it exists as a platform to experiment with the early language
   8 design.  An interpreter is easy to write and easy to get working, so
   9 the barrier for entry is lower if I aim to start with an interpreter.
  10
  11 Secondly, the plan for the Ocean compiler is to write it in the
  12 [Ocean language](http://ocean-lang.org).  To achieve this we naturally
  13 need some sort of boot-strap process and this interpreter - written in
  14 portable C - will fill that role.  It will be used to bootstrap the
  15 Ocean compiler.
  16
  17 Two features that are not needed to fill either of these roles are
  18 performance and completeness.  The interpreter only needs to be fast
  19 enough to run small test programs and occasionally to run the compiler
  20 on itself.  It only needs to be complete enough to test aspects of the
  21 design which are developed before the compiler is working, and to run
  22 the compiler on itself.  Any features not used by the compiler when
  23 compiling itself are superfluous.  They may be included anyway, but
  24 they may not.
  25
  26 Nonetheless, the interpreter should end up being reasonably complete,
  27 and any performance bottlenecks which appear and are easily fixed, will
  28 be.
  29
  30 ## Current version
  31
  32 This second version of the interpreter exists to test out the
  33 structured statement providing conditions and iteration, and simple
  34 variable scoping.  Clearly we need some minimal other functionality so
  35 that values can be tested and instructions iterated over.  All that
  36 functionality is clearly not normative at this stage (not that
  37 anything is **really** normative yet) and will change, so early test
  38 code will certainly break in later versions.
  39
  40 The under-test parts of the language are:
  41
  42  - conditional/looping structured statements
  43  - the `use` statement which is needed for that
  44  - Variable binding using ":=" and "::=", and assignment using "=".
  45
  46 Elements which are present to make a usable language are:
  47
  48  - "blocks" of multiple statements.
  49  - `pass`: a statement which does nothing.
  50  - expressions: `+`, `-`, `*`, `/` can apply to numbers and `++` can
  51    catenate strings.  `and`, `or`, `not` manipulate Booleans, and
  52    normal comparison operators can work on all three types.
  53  - `print`: will print the values in a list of expressions.
  54  - `program`: is given a list of identifiers to initialize from
  55    arguments.
  56
  57 ## Naming
  58
  59 Versions of the interpreter which obviously do not support a complete
  60 language will be named after creeks and streams.  This one is Stoney
  61 Creek.
  62
  63 Once we have something reasonably resembling a complete language, the
  64 names of rivers will be used.
  65 Early versions of the compiler will be named after seas.  Major
  66 releases of the compiler will be named after oceans.  Hopefully I will
  67 be finished once I get to the Pacific Ocean release.
  68
  69 ## Outline
  70
  71 As well as parsing and executing a program, the interpreter can print
  72 out the program from the parsed internal structure.  This is useful
  73 for validating the parsing.
  74 So the main requirements of the interpreter are:
  75
  76 - Parse the program, possible with tracing
  77 - Analyse the parsed program to ensure consistency
  78 - print the program
  79 - execute the program
  80
  81 This is all performed by a single C program extracted with
  82 `parsergen`.
  83
  84 There will be two formats for printing the program: a default and one
  85 that uses bracketing.  So a `--bracket` command line option is needed
  86 for that.  Normally the first code section found is used, however an
  87 alternate section can be requested so that a file (such as this one)
  88 can contain multiple programs This is effected with the `--section`
  89 option.
  90
  91 ###### File: oceani.mk
  92
  93         myCFLAGS := -Wall -g -fplan9-extensions
  94         CFLAGS := $(filter-out $(myCFLAGS),$(CFLAGS)) $(myCFLAGS)
  95         myLDLIBS:= libparser.o libscanner.o libmdcode.o -licuuc
  96         LDLIBS := $(filter-out $(myLDLIBS),$(LDLIBS)) $(myLDLIBS)
  97         ## libs
  98         all :: $(LDLIBS) oceani
  99         oceani.c oceani.h : oceani.mdc parsergen
 100                 ./parsergen -o oceani --LALR --tag Parser oceani.mdc
 101         oceani.mk: oceani.mdc md2c
 102                 ./md2c oceani.mdc
 103
 104         oceani: oceani.o $(LDLIBS)
 105                 $(CC) $(CFLAGS) -o oceani oceani.o $(LDLIBS)
 106
 107 ###### Parser: header
 108         ## macros
 109         ## ast
 110         struct parse_context {
 111                 struct token_config config;
 112                 ## parse context
 113         };
 114
 115 ###### macros
 116
 117         #define container_of(ptr, type, member) ({                      \
 118                 const typeof( ((type *)0)->member ) *__mptr = (ptr);    \
 119                 (type *)( (char *)__mptr - offsetof(type,member) );})
 120
 121         #define config2context(_conf) container_of(_conf, struct parse_context, \
 122                 config)
 123
 124 ###### Parser: code
 125
 126         #include <unistd.h>
 127         #include <stdlib.h>
 128         #include <fcntl.h>
 129         #include <errno.h>
 130         #include <sys/mman.h>
 131         #include <string.h>
 132         #include <stdio.h>
 133         #include <locale.h>
 134         #include <malloc.h>
 135         #include "mdcode.h"
 136         #include "scanner.h"
 137         #include "parser.h"
 138
 139         ## includes
 140
 141         #include "oceani.h"
 142
 143         ## forward decls
 144         ## value functions
 145         ## ast functions
 146         ## core functions
 147
 148         #include <getopt.h>
 149         static char Usage[] = "Usage: oceani --trace --print --noexec --brackets"
 150                               "--section=SectionName prog.ocn\n";
 151         static const struct option long_options[] = {
 152                 {"trace",     0, NULL, 't'},
 153                 {"print",     0, NULL, 'p'},
 154                 {"noexec",    0, NULL, 'n'},
 155                 {"brackets",  0, NULL, 'b'},
 156                 {"section",   1, NULL, 's'},
 157                 {NULL,        0, NULL, 0},
 158         };
 159         const char *options = "tpnbs";
 160         int main(int argc, char *argv[])
 161         {
 162                 int fd;
 163                 int len;
 164                 char *file;
 165                 struct section *s;
 166                 char *section = NULL;
 167                 struct parse_context context = {
 168                         .config = {
 169                                 .ignored = (1 << TK_line_comment)
 170                                          | (1 << TK_block_comment),
 171                                 .number_chars = ".,_+-",
 172                                 .word_start = "_",
 173                                 .word_cont = "_",
 174                         },
 175                 };
 176                 int doprint=0, dotrace=0, doexec=1, brackets=0;
 177                 struct exec **prog;
 178                 int opt;
 179                 while ((opt = getopt_long(argc, argv, options, long_options, NULL))
 180                        != -1) {
 181                         switch(opt) {
 182                         case 't': dotrace=1; break;
 183                         case 'p': doprint=1; break;
 184                         case 'n': doexec=0; break;
 185                         case 'b': brackets=1; break;
 186                         case 's': section = optarg; break;
 187                         default: fprintf(stderr, Usage);
 188                                 exit(1);
 189                         }
 190                 }
 191                 if (optind >= argc) {
 192                         fprintf(stderr, "oceani: no input file given\n");
 193                         exit(1);
 194                 }
 195                 fd = open(argv[optind], O_RDONLY);
 196                 if (fd < 0) {
 197                         fprintf(stderr, "oceani: cannot open %s\n", argv[optind]);
 198                         exit(1);
 199                 }
 200                 len = lseek(fd, 0, 2);
 201                 file = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
 202                 s = code_extract(file, file+len, NULL);
 203                 if (!s) {
 204                         fprintf(stderr, "oceani: could not find any code in %s\n",
 205                                 argv[optind]);
 206                         exit(1);
 207                 }
 208                 if (section) {
 209                         struct section *ss;
 210                         for (ss = s; ss; ss = ss->next) {
 211                                 struct text sec = ss->section;
 212                                 if (sec.len == strlen(section) &&
 213                                     strncmp(sec.txt, section, sec.len) == 0)
 214                                         break;
 215                         }
 216                         if (ss)
 217                                 prog = parse_oceani(ss->code, &context.config,
 218                                                     dotrace ? stderr : NULL);
 219                         else {
 220                                 fprintf(stderr, "oceani: cannot find section %s\n",
 221                                         section);
 222                                 exit(1);
 223                         }
 224                 } else
 225                         prog = parse_oceani(s->code, &context.config,
 226                                     dotrace ? stderr : NULL);
 227                 if (prog && doprint)
 228                         print_exec(*prog, 0, brackets);
 229                 if (prog && doexec) {
 230                         if (!analyse_prog(*prog, &context)) {
 231                                 fprintf(stderr, "oceani: type error in program\n");
 232                                 exit(1);
 233                         }
 234                         interp_prog(*prog, argv+optind+1);
 235                 }
 236                 if (prog) {
 237                         free_exec(*prog);
 238                         free(prog);
 239                 }
 240                 while (s) {
 241                         struct section *t = s->next;
 242                         code_free(s->code);
 243                         free(s);
 244                         s = t;
 245                 }
 246                 ## free context
 247                 exit(0);
 248         }
 249
 250 ### Analysis
 251
 252 These four requirements of parse, analyse, print, interpret apply to
 253 each language element individually so that is how most of the code
 254 will be structured.
 255
 256 Three of the four are fairly self explanatory.  The one that requires
 257 a little explanation is the analysis step.
 258
 259 The current language design does not require (or even allow) the types
 260 of variables to be declared, but they must still have a single type.
 261 Different operations impose different requirements on the variables,
 262 for example addition requires both arguments to be numeric, and
 263 assignment requires the variable on the left to have the same type as
 264 the expression on the right.
 265
 266 Analysis involves propagating these type requirements around and
 267 consequently setting the type of each variable.  If any requirements
 268 are violated (e.g. a string is compared with a number) or if a
 269 variable needs to have two different types, then an error is raised
 270 and the program will not run.
 271
 272 If the same variable is declared in both branchs of an 'if/else', or
 273 in all cases of a 'switch' then the multiple instances may be merged
 274 into just one variable if the variable is references after the
 275 conditional statement.  When this happens, the types must naturally be
 276 consistent across all the branches.  When the variable is not used
 277 outside the if, the variables in the different branches are distinct
 278 and can be of different types.
 279
 280 Determining the types of all variables early is important for
 281 processing command line arguments.  These can be assigned to any type
 282 of variable, but we must first know the correct type so any required
 283 conversion can happen.  If a variable is associated with a command
 284 line argument but no type can be interpreted (e.g. the variable is
 285 only ever used in a `print` statement), then the type is set to
 286 'string'.
 287
 288 Undeclared names may only appear in "use" statements and "case" expressions.
 289 These names are given a type of "label" and a unique value.
 290 This allows them to fill the role of a name in an enumerated type, which
 291 is useful for testing the `switch` statement.
 292
 293 As there are, as yet, no distinct types that are compatible, there
 294 isn't much subtlety in the analysis.  When we hav distinct number
 295 types, this will become more interesting.
 296
 297 ## Data Structures
 298
 299 One last introductory step before detailing the language elements and
 300 providing their four requirements is to establish the data structures
 301 to store these elements.
 302
 303 There are two key objects that we need to work with: executable
 304 elements which comprise the program, and values which the program
 305 works with.  Between these are the variables in their various scopes
 306 which hold the values.
 307
 308 ### Values
 309
 310 Values can be numbers, which we represent as multi-precision
 311 fractions, strings, Booleans and labels.  When analysing the program
 312 we also need to allow for places where no value is meaningful
 313 (`Vnone`) and where we don't know what type to expect yet (`Vunknown`
 314 which can be anything and `Vnolabel` which can be anything except a
 315 label).  A 2 character 'tail' is included in each value as the scanner
 316 wants to parse that from the end of numbers and we need somewhere to
 317 put it.  It is currently ignored but one day might allow for
 318 e.g. "imaginary" numbers.
 319
 320 Values are never shared, they are always copied when used, and freed
 321 when no longer needed.
 322
 323 When propagating type information around the program, we need to
 324 determine if two types are compatible, where `Vunknown` is compatible
 325 which anything, and `Vnolabel` is compatible with anything except a
 326 label.  A separate funtion to encode this rule will simplify some code
 327 later.
 328
 329 When assigning command line arguments to variable, we need to be able
 330 to parse each type from a string.
 331
 332 ###### includes
 333         #include <gmp.h>
 334         #include "string.h"
 335         #include "number.h"
 336
 337 ###### libs
 338         myLDLIBS := libnumber.o libstring.o -lgmp
 339         LDLIBS := $(filter-out $(myLDLIBS),$(LDLIBS)) $(myLDLIBS)
 340
 341 ###### ast
 342         struct value {
 343                 enum vtype {Vnolabel, Vunknown, Vnone, Vstr, Vnum, Vbool, Vlabel} vtype;
 344                 union {
 345                         struct text str;
 346                         mpq_t num;
 347                         int bool;
 348                         void *label;
 349                 };
 350                 char tail[2];
 351         };
 352
 353 ###### ast functions
 354         static void free_value(struct value v)
 355         {
 356                 switch (v.vtype) {
 357                 case Vnone:
 358                 case Vnolabel:
 359                 case Vunknown: break;
 360                 case Vstr: free(v.str.txt); break;
 361                 case Vnum: mpq_clear(v.num); break;
 362                 case Vlabel:
 363                 case Vbool: break;
 364                 }
 365         }
 366
 367         static int vtype_compat(enum vtype require, enum vtype have)
 368         {
 369                 switch (require) {
 370                 case Vnolabel:
 371                         return have != Vlabel;
 372                 case Vunknown:
 373                         return 1;
 374                 default:
 375                         return have == Vunknown || require == have;
 376                 }
 377         }
 378
 379 ###### value functions
 380
 381         static void val_init(struct value *val, enum vtype type)
 382         {
 383                 val->vtype = type;
 384                 switch(type) {
 385                 case Vnone:abort();
 386                 case Vnolabel:
 387                 case Vunknown: break;
 388                 case Vnum:
 389                         mpq_init(val->num); break;
 390                 case Vstr:
 391                         val->str.txt = malloc(1);
 392                         val->str.len = 0;
 393                         break;
 394                 case Vbool:
 395                         val->bool = 0;
 396                         break;
 397                 case Vlabel:
 398                         val->label = val;
 399                         break;
 400                 }
 401         }
 402
 403         static struct value dup_value(struct value v)
 404         {
 405                 struct value rv;
 406                 rv.vtype = v.vtype;
 407                 switch (rv.vtype) {
 408                 case Vnone:
 409                 case Vnolabel:
 410                 case Vunknown: break;
 411                 case Vlabel:
 412                         rv.label = v.label;
 413                         break;
 414                 case Vbool:
 415                         rv.bool = v.bool;
 416                         break;
 417                 case Vnum:
 418                         mpq_init(rv.num);
 419                         mpq_set(rv.num, v.num);
 420                         break;
 421                 case Vstr:
 422                         rv.str.len = v.str.len;
 423                         rv.str.txt = malloc(rv.str.len);
 424                         memcpy(rv.str.txt, v.str.txt, v.str.len);
 425                         break;
 426                 }
 427                 return rv;
 428         }
 429
 430         static int value_cmp(struct value left, struct value right)
 431         {
 432                 int cmp;
 433                 if (left.vtype != right.vtype)
 434                         return left.vtype - right.vtype;
 435                 switch (left.vtype) {
 436                 case Vlabel: cmp = left.label == right.label ? 0 : 1; break;
 437                 case Vnum: cmp = mpq_cmp(left.num, right.num); break;
 438                 case Vstr: cmp = text_cmp(left.str, right.str); break;
 439                 case Vbool: cmp = left.bool - right.bool; break;
 440                 case Vnone:
 441                 case Vnolabel:
 442                 case Vunknown: cmp = 0;
 443                 }
 444                 return cmp;
 445         }
 446
 447         static struct text text_join(struct text a, struct text b)
 448         {
 449                 struct text rv;
 450                 rv.len = a.len + b.len;
 451                 rv.txt = malloc(rv.len);
 452                 memcpy(rv.txt, a.txt, a.len);
 453                 memcpy(rv.txt+a.len, b.txt, b.len);
 454                 return rv;
 455         }
 456
 457         static void print_value(struct value v)
 458         {
 459                 switch (v.vtype) {
 460                 case Vunknown:
 461                         printf("*Unknown*"); break;
 462                 case Vnone:
 463                 case Vnolabel:
 464                         printf("*no-value*"); break;
 465                 case Vlabel:
 466                         printf("*label-%p*", v.label); break;
 467                 case Vstr:
 468                         printf("%.*s", v.str.len, v.str.txt); break;
 469                 case Vbool:
 470                         printf("%s", v.bool ? "True":"False"); break;
 471                 case Vnum:
 472                         {
 473                         mpf_t fl;
 474                         mpf_init2(fl, 20);
 475                         mpf_set_q(fl, v.num);
 476                         gmp_printf("%Fg", fl);
 477                         mpf_clear(fl);
 478                         break;
 479                         }
 480                 }
 481         }
 482
 483         static int parse_value(struct value *vl, char *arg)
 484         {
 485                 struct text tx;
 486                 int neg = 0;
 487                 switch(vl->vtype) {
 488                 case Vnolabel:
 489                 case Vlabel:
 490                 case Vunknown:
 491                 case Vnone:
 492                         return 0;
 493                 case Vstr:
 494                         vl->str.len = strlen(arg);
 495                         vl->str.txt = malloc(vl->str.len);
 496                         memcpy(vl->str.txt, arg, vl->str.len);
 497                         break;
 498                 case Vnum:
 499                         if (*arg == '-') {
 500                                 neg = 1;
 501                                 arg++;
 502                         }
 503                         tx.txt = arg; tx.len = strlen(tx.txt);
 504                         if (number_parse(vl->num, vl->tail, tx) == 0)
 505                                 mpq_init(vl->num);
 506                         else if (neg)
 507                                 mpq_neg(vl->num, vl->num);
 508                         break;
 509                 case Vbool:
 510                         if (strcasecmp(arg, "true") == 0 ||
 511                             strcmp(arg, "1") == 0)
 512                                 vl->bool = 1;
 513                         else if (strcasecmp(arg, "false") == 0 ||
 514                                  strcmp(arg, "0") == 0)
 515                                 vl->bool = 0;
 516                         else {
 517                                 printf("Bad bool: %s\n", arg);
 518                                 return 0;
 519                         }
 520                         break;
 521                 }
 522                 return 1;
 523         }
 524
 525 ### Variables
 526
 527 Variables are scoped named values.  We store the names in a linked
 528 list of "bindings" sorted lexically, and use sequential search and
 529 insertion sort.
 530
 531 ###### ast
 532
 533         struct binding {
 534                 struct text name;
 535                 struct binding *next;   // in lexical order
 536                 ## binding fields
 537         };
 538
 539 This linked list is stored in the parse context so that "reduce"
 540 functions can find or add variables, and so the analysis phase can
 541 ensure that every variable gets a type.
 542
 543 ###### parse context
 544
 545         struct binding *varlist;  // In lexical order
 546
 547 ###### ast functions
 548
 549         static struct binding *find_binding(struct parse_context *c, struct text s)
 550         {
 551                 struct binding **l = &c->varlist;
 552                 struct binding *n;
 553                 int cmp = 1;
 554
 555                 while (*l &&
 556                         (cmp = text_cmp((*l)->name, s)) < 0)
 557                                 l = & (*l)->next;
 558                 if (cmp == 0)
 559                         return *l;
 560                 n = calloc(1, sizeof(*n));
 561                 n->name = s;
 562                 n->next = *l;
 563                 *l = n;
 564                 return n;
 565         }
 566
 567 Each name can be linked to multiple variables defined in different
 568 scopes.  Each scope starts where the name is declared and continues
 569 until the end of the containing code block.  Scopes of a given name
 570 cannot nest, so a declaration while a name is in-scope is an error.
 571
 572 ###### binding fields
 573         struct variable *var;
 574
 575 ###### ast
 576         struct variable {
 577                 struct variable *previous;
 578                 struct value val;
 579                 struct binding *name;
 580                 ## variable fields
 581         };
 582
 583 While the naming seems strange, we include local constants in the
 584 definition of variables.  A name declared `var := value` can
 585 subsequently be changed, but a name declared `var ::= value` cannot -
 586 it is constant
 587
 588 ###### variable fields
 589         int constant;
 590
 591 Scopes in parallel branches can be partially merged.  More
 592 specifically, if a given name is declared in both branches of an
 593 if/else then it's scope is a candidate for merging.  Similarly if
 594 every branch of an exhaustive switch (e.g. has an "else" clause)
 595 declares a given name, then the scopes from the branches are
 596 candidates for merging.
 597
 598 Note that names declared inside a loop (which is only parallel to
 599 itself) are never visible after the loop.  Similarly names defined in
 600 scopes which are not parallel, such as those started by `for` and
 601 `switch`, are never visible after the scope.  Only variable defined in
 602 both `then` and `else` (including the implicit then after an `if`, and
 603 excluding `then` used with `for`) and in all `case`s and `else` of a
 604 `switch` or `while` can be visible beyond the `if`/`switch`/`while`.
 605
 606 Labels, which are a bit like variables, follow different rules.
 607 Labels are not explicitly declared, but if an undeclared name appears
 608 in a context where a label is legal, that effectively declares the
 609 name as a label.  The declaration remains in force (or in scope) at
 610 least to the end of the immediately containing block and conditionally
 611 in any larger containing block which does not declare the name in some
 612 other way.  Importantly, the conditional scope extension happens even
 613 if the label is only used in parallel branch of a conditional -- when
 614 used in one branch it is treated as having been declared in all
 615 branches.
 616
 617 Merge candidates are tentatively visible beyond the end of the
 618 branching statement which creates them.  If the name is used, the
 619 merge is affirmed and they become a single variable visible at the
 620 outer layer.  If not - if it is redeclared first - the merge lapses.
 621
 622 To track scopes we have an extra stack, implemented as a linked list,
 623 which roughly parallels the parse stack and which is used exclusively
 624 for scoping.  When a new scope is opened, a new frame is pushed and
 625 the child-count of the parent frame is incremented.  This child-count
 626 is used to distinguish between the first of a set of parallel scopes,
 627 in which declared variables must not be in scope, and subsequent
 628 branches, whether they must already be conditionally scoped.
 629
 630 To push a new frame *before* any code in the frame is parsed, we need a
 631 grammar reduction.  This is most easily achieved with a grammar
 632 element which derives the empty string, and created the new scope when
 633 it is recognized.  This can be placed, for example, between a keyword
 634 like "if" and the code following it.
 635
 636 ###### ast
 637         struct scope {
 638                 struct scope *parent;
 639                 int child_count;
 640         };
 641
 642 ###### parse context
 643         int scope_depth;
 644         struct scope *scope_stack;
 645
 646 ###### ast functions
 647         static void scope_pop(struct parse_context *c)
 648         {
 649                 struct scope *s = c->scope_stack;
 650
 651                 c->scope_stack = s->parent;
 652                 free(s);
 653                 c->scope_depth -= 1;
 654         }
 655
 656         static void scope_push(struct parse_context *c)
 657         {
 658                 struct scope *s = calloc(1, sizeof(*s));
 659                 if (c->scope_stack)
 660                         c->scope_stack->child_count += 1;
 661                 s->parent = c->scope_stack;
 662                 c->scope_stack = s;
 663                 c->scope_depth += 1;
 664         }
 665
 666 ###### Grammar
 667
 668         $void
 669         OpenScope -> ${ scope_push(config2context(config)); }$
 670
 671
 672 Each variable records a scope depth and is in one of four states:
 673
 674 - "in scope".  This is the case between the declaration of the
 675   variable and the end of the containing block, and also between
 676   the usage with affirms a merge and the end of the block.
 677
 678   The scope depth is not greater than the current parse context scope
 679   nest depth.  When the block of that depth closes, the state will
 680   change.  To achieve this, all "in scope" variables are linked
 681   together as a stack in nesting order.
 682
 683 - "pending".  The "in scope" block has closed, but other parallel
 684   scopes are still being processed.  So far, every parallel block at
 685   the same level that has closed has declared the name.
 686
 687   The scope depth is the depth of the last parallel block that
 688   enclosed the declaration, and that has closed.
 689
 690 - "conditionally in scope".  The "in scope" block and all parallel
 691   scopes have closed, and no further mention of the name has been
 692   seen.  This state includes a secondary nest depth which records the
 693   outermost scope seen since the variable became conditionally in
 694   scope.  If a use of the name is found, the variable becomes "in
 695   scope" and that secondary depth becomes the recorded scope depth.
 696   If the name is declared as a new variable, the old variable becomes
 697   "out of scope" and the recorded scope depth stays unchanged.
 698
 699 - "out of scope".  The variable is neither in scope nor conditionally
 700   in scope.  It is permanently out of scope now and can be removed from
 701   the "in scope" stack.
 702
 703
 704 ###### variable fields
 705         int depth, min_depth;
 706         enum { OutScope, PendingScope, CondScope, InScope } scope;
 707         struct variable *in_scope;
 708
 709 ###### parse context
 710
 711         struct variable *in_scope;
 712
 713 All variables with the same name are linked together using the
 714 'previous' link.  Those variable that have
 715 been affirmatively merged all have a 'merged' pointer that points to
 716 one primary variable - the most recently declared instance. When
 717 merging variables, we need to also adjust the 'merged' pointer on any
 718 other variables that had previously been merged with the one that will
 719 no longer be primary.
 720
 721 ###### variable fields
 722         struct variable *merged;
 723
 724 ###### ast functions
 725
 726         static void variable_merge(struct variable *primary, struct variable *secondary)
 727         {
 728                 struct variable *v;
 729
 730                 if (primary->merged)
 731                         // shouldn't happen
 732                         primary = primary->merged;
 733
 734                 for (v = primary->previous; v; v=v->previous)
 735                         if (v == secondary || v == secondary->merged ||
 736                             v->merged == secondary ||
 737                             (v->merged && v->merged == secondary->merged)) {
 738                                 v->scope = OutScope;
 739                                 v->merged = primary;
 740                         }
 741         }
 742
 743 ###### free context
 744
 745         while (context.varlist) {
 746                 struct binding *b = context.varlist;
 747                 struct variable *v = b->var;
 748                 context.varlist = b->next;
 749                 free(b);
 750                 while (v) {
 751                         struct variable *t = v;
 752
 753                         v = t->previous;
 754                         free_value(t->val);
 755                         free(t);
 756                 }
 757         }
 758
 759 #### Manipulating Bindings
 760
 761 When a name is conditionally visible, a new declaration discards the
 762 old binding - the condition lapses.  Conversely a usage of the name
 763 affirms the visibility and extends it to the end of the containing
 764 block - i.e. the block that contains both the original declaration and
 765 the latest usage.  This is determined from `min_depth`.  When a
 766 conditionally visible variable gets affirmed like this, it is also
 767 merged with other conditionally visible variables with the same name.
 768
 769 When we parse a variable declaration we either signal an error if the
 770 name is currently bound, or create a new variable at the current nest
 771 depth if the name is unbound or bound to a conditionally scoped or
 772 pending-scope variable.  If the previous variable was conditionally
 773 scoped, it and its homonyms becomes out-of-scope.
 774
 775 When we parse a variable reference (including non-declarative
 776 assignment) we signal an error if the name is not bound or is bound to
 777 a pending-scope variable; update the scope if the name is bound to a
 778 conditionally scoped variable; or just proceed normally if the named
 779 variable is in scope.
 780
 781 When we exit a scope, any variables bound at this level are either
 782 marked out of scope or pending-scoped, depending on whether the
 783 scope was sequential or parallel.
 784
 785 When exiting a parallel scope we check if there are any variables that
 786 were previously pending and are still visible. If there are, then
 787 there weren't redeclared in the most recent scope, so they cannot be
 788 merged and must become out-of-scope.  If it is not the first of
 789 parallel scopes (based on `child_count`), we check that there was a
 790 previous binding that is still pending-scope.  If there isn't, the new
 791 variable must now be out-of-scope.
 792
 793 When exiting a sequential scope that immediately enclosed parallel
 794 scopes, we need to resolve any pending-scope variables.  If there was
 795 no `else` clause, and we cannot determine that the `switch` was exhaustive,
 796 we need to mark all pending-scope variable as out-of-scope.  Otherwise
 797 all pending-scope variables become conditionally scoped.
 798
 799 ###### ast
 800         enum closetype { CloseSequential, CloseParallel, CloseElse };
 801
 802 ###### ast functions
 803
 804         static struct variable *var_decl(struct parse_context *c, struct text s)
 805         {
 806                 struct binding *b = find_binding(c, s);
 807                 struct variable *v = b->var;
 808
 809                 switch (v ? v->scope : OutScope) {
 810                 case InScope:
 811                         /* Signal error ... once I build error signalling support */
 812                         return NULL;
 813                 case CondScope:
 814                         for (;
 815                              v && v->scope == CondScope;
 816                              v = v->previous)
 817                                 v->scope = OutScope;
 818                         break;
 819                 default: break;
 820                 }
 821                 v = calloc(1, sizeof(*v));
 822                 v->previous = b->var;
 823                 b->var = v;
 824                 v->name = b;
 825                 v->min_depth = v->depth = c->scope_depth;
 826                 v->scope = InScope;
 827                 v->in_scope = c->in_scope;
 828                 c->in_scope = v;
 829                 val_init(&v->val, Vunknown);
 830                 return v;
 831         }
 832
 833         static struct variable *var_ref(struct parse_context *c, struct text s)
 834         {
 835                 struct binding *b = find_binding(c, s);
 836                 struct variable *v = b->var;
 837                 struct variable *v2;
 838
 839                 switch (v ? v->scope : OutScope) {
 840                 case OutScope:
 841                 case PendingScope:
 842                         /* Signal an error - once that is possible */
 843                         return NULL;
 844                 case CondScope:
 845                         /* All CondScope variables of this name need to be merged
 846                          * and become InScope
 847                          */
 848                         v->depth = v->min_depth;
 849                         v->scope = InScope;
 850                         for (v2 = v->previous;
 851                              v2 && v2->scope == CondScope;
 852                              v2 = v2->previous)
 853                                 variable_merge(v, v2);
 854                         break;
 855                 case InScope:
 856                         break;
 857                 }
 858                 return v;
 859         }
 860
 861         static void var_block_close(struct parse_context *c, enum closetype ct)
 862         {
 863                 /* close of all variables that are in_scope */
 864                 struct variable *v, **vp, *v2;
 865
 866                 scope_pop(c);
 867                 for (vp = &c->in_scope;
 868                      v = *vp, v && v->depth > c->scope_depth && v->min_depth > c->scope_depth;
 869                      ) {
 870                         switch (ct) {
 871                         case CloseElse:
 872                         case CloseParallel: /* handle PendingScope */
 873                                 switch(v->scope) {
 874                                 case InScope:
 875                                 case CondScope:
 876                                         if (c->scope_stack->child_count == 1)
 877                                                 v->scope = PendingScope;
 878                                         else if (v->previous &&
 879                                                  v->previous->scope == PendingScope)
 880                                                 v->scope = PendingScope;
 881                                         else if (v->val.vtype == Vlabel)
 882                                                 v->scope = PendingScope;
 883                                         else if (v->name->var == v)
 884                                                 v->scope = OutScope;
 885                                         if (ct == CloseElse) {
 886                                                 /* All Pending variables with this name
 887                                                  * are now Conditional */
 888                                                 for (v2 = v;
 889                                                      v2 && v2->scope == PendingScope;
 890                                                      v2 = v2->previous)
 891                                                         v2->scope = CondScope;
 892                                         }
 893                                         break;
 894                                 case PendingScope:
 895                                         for (v2 = v;
 896                                              v2 && v2->scope == PendingScope;
 897                                              v2 = v2->previous)
 898                                                 if (v2->val.vtype != Vlabel)
 899                                                         v2->scope = OutScope;
 900                                         break;
 901                                 case OutScope: break;
 902                                 }
 903                                 break;
 904                         case CloseSequential:
 905                                 if (v->val.vtype == Vlabel)
 906                                         v->scope = PendingScope;
 907                                 switch (v->scope) {
 908                                 case InScope:
 909                                         v->scope = OutScope;
 910                                         break;
 911                                 case PendingScope:
 912                                         /* There was no 'else', so we can only become
 913                                          * conditional if we know the cases were exhaustive,
 914                                          * and that doesn't mean anything yet.
 915                                          * So only labels become conditional..
 916                                          */
 917                                         for (v2 = v;
 918                                              v2 && v2->scope == PendingScope;
 919                                              v2 = v2->previous)
 920                                                 if (v2->val.vtype == Vlabel) {
 921                                                         v2->scope = CondScope;
 922                                                         v2->min_depth = c->scope_depth;
 923                                                 } else
 924                                                         v2->scope = OutScope;
 925                                         break;
 926                                 case CondScope:
 927                                 case OutScope: break;
 928                                 }
 929                                 break;
 930                         }
 931                         if (v->scope == OutScope)
 932                                 *vp = v->in_scope;
 933                         else
 934                                 vp = &v->in_scope;
 935                 }
 936         }
 937
 938 ### Executables
 939
 940 Executables can be lots of different things.  In many cases an
 941 executable is just an operation combined with one or two other
 942 executables.  This allows for expressions and lists etc.  Other times
 943 an executable is something quite specific like a constant or variable
 944 name.  So we define a `struct exec` to be a general executable with a
 945 type, and a `struct binode` which is a subclass of `exec` and forms a
 946 node in a binary tree and holding an operation. There will be other
 947 subclasses, and to access these we need to be able to `cast` the
 948 `exec` into the various other types.
 949
 950 ###### macros
 951         #define cast(structname, pointer) ({            \
 952                 const typeof( ((struct structname *)0)->type) *__mptr = &(pointer)->type; \
 953                 if (__mptr && *__mptr != X##structname) abort();                \
 954                 (struct structname *)( (char *)__mptr);})
 955
 956         #define new(structname) ({                      \
 957                 struct structname *__ptr = ((struct structname *)calloc(1,sizeof(struct structname))); \
 958                 __ptr->type = X##structname;            \
 959                 __ptr;})
 960
 961 ###### ast
 962         enum exec_types {
 963                 Xbinode,
 964                 ## exec type
 965         };
 966         struct exec {
 967                 enum exec_types type;
 968         };
 969         struct binode {
 970                 struct exec;
 971                 enum Btype {
 972                         ## Binode types
 973                 } op;
 974                 struct exec *left, *right;
 975         };
 976
 977 Each different type of `exec` node needs a number of functions
 978 defined, a bit like methods.  We must be able to be able to free it,
 979 print it, analyse it and execute it.  Once we have specific `exec`
 980 types we will need to parse them too.  Let's take this a bit more
 981 slowly.
 982
 983 #### Freeing
 984
 985 The parser generator requires a `free_foo` function for each struct
 986 that stores attributes and they will be `exec`s of subtypes there-of.
 987 So we need `free_exec` which can handle all the subtypes, and we need
 988 `free_binode`.
 989
 990 ###### ast functions
 991
 992         static void free_binode(struct binode *b)
 993         {
 994                 if (!b)
 995                         return;
 996                 free_exec(b->left);
 997                 free_exec(b->right);
 998                 free(b);
 999         }
1000
1001 ###### core functions
1002         static void free_exec(struct exec *e)
1003         {
1004                 if (!e)
1005                         return;
1006                 switch(e->type) {
1007                         ## free exec cases
1008                 }
1009         }
1010
1011 ###### forward decls
1012
1013         static void free_exec(struct exec *e);
1014
1015 ###### free exec cases
1016         case Xbinode: free_binode(cast(binode, e)); break;
1017
1018 #### Printing
1019
1020 Printing an `exec` requires that we know the current indent level for
1021 printing line-oriented components.  As will become clear later, we
1022 also want to know what sort of bracketing to use.
1023
1024 ###### ast functions
1025
1026         static void do_indent(int i, char *str)
1027         {
1028                 while (i--)
1029                         printf("    ");
1030                 printf("%s", str);
1031         }
1032
1033 ###### core functions
1034         static void print_binode(struct binode *b, int indent, int bracket)
1035         {
1036                 struct binode *b2;
1037                 switch(b->op) {
1038                 ## print binode cases
1039                 }
1040         }
1041
1042         static void print_exec(struct exec *e, int indent, int bracket)
1043         {
1044                 switch (e->type) {
1045                 case Xbinode:
1046                         print_binode(cast(binode, e), indent, bracket); break;
1047                 ## print exec cases
1048                 }
1049         }
1050
1051 ###### forward decls
1052
1053         static void print_exec(struct exec *e, int indent, int bracket);
1054
1055 #### Analysing
1056
1057 As discussed, analysis involves propagating type requirements around
1058 the program and looking for errors.
1059
1060 So `propagate_types` is passed a type that the `exec` is expected to return,
1061 and returns the type that it does return, either of which can be `Vunknown`.
1062 An `ok` flag is passed by reference. It is set to `0` when an error is
1063 found, and `2` when any change is made.  If it remains unchanged at
1064 `1`, then no more propagation is needed.
1065
1066 ###### core functions
1067
1068         static enum vtype propagate_types(struct exec *prog, enum vtype type,
1069                                           int *ok)
1070         {
1071                 enum vtype t;
1072
1073                 if (!prog)
1074                         return Vnone;
1075
1076                 switch (prog->type) {
1077                 case Xbinode:
1078                 {
1079                         struct binode *b = cast(binode, prog);
1080                         switch (b->op) {
1081                         ## propagate binode cases
1082                         }
1083                         break;
1084                 }
1085                 ## propagate exec cases
1086                 }
1087                 return Vnone;
1088         }
1089
1090 #### Interpreting
1091
1092 Interpreting an `exec` doesn't require anything but the `exec`.  State
1093 is stored in variables and each variable will be directly linked from
1094 within the `exec` tree.  The exception to this is the whole `program`
1095 which needs to look at command line arguments.  The `program` will be
1096 interpreted separately.
1097
1098 Each `exec` can return a value, which may be `Vnone` but shouldn't be `Vunknown`.
1099
1100 ###### core functions
1101
1102         static struct value interp_exec(struct exec *e)
1103         {
1104                 struct value rv;
1105                 rv.vtype = Vnone;
1106                 if (!e)
1107                         return rv;
1108
1109                 switch(e->type) {
1110                 case Xbinode:
1111                 {
1112                         struct binode *b = cast(binode, e);
1113                         struct value left, right;
1114                         left.vtype = right.vtype = Vnone;
1115                         switch (b->op) {
1116                         ## interp binode cases
1117                         }
1118                         free_value(left); free_value(right);
1119                         break;
1120                 }
1121                 ## interp exec cases
1122                 }
1123                 return rv;
1124         }
1125
1126 ## Language elements
1127
1128 Each language element needs to be parsed, printed, analysed,
1129 interpreted, and freed.  There are several, so let's just start with
1130 the easy ones and work our way up.
1131
1132 ### Values
1133
1134 We have already met values as separate objects.  When manifest
1135 constants appear in the program text that must result in an executable
1136 which has a constant value.  So the `val` structure embeds a value in
1137 an executable.
1138
1139 ###### exec type
1140         Xval,
1141
1142 ###### ast
1143         struct val {
1144                 struct exec;
1145                 struct value val;
1146         };
1147
1148 ###### Grammar
1149
1150         $*val
1151         Value ->  True ${
1152                         $0 = new(val);
1153                         $0->val.vtype = Vbool;
1154                         $0->val.bool = 1;
1155                         }$
1156                 | False ${
1157                         $0 = new(val);
1158                         $0->val.vtype = Vbool;
1159                         $0->val.bool = 0;
1160                         }$
1161                 | NUMBER ${
1162                         $0 = new(val);
1163                         $0->val.vtype = Vnum;
1164                         if (number_parse($0->val.num, $0->val.tail, $1.txt) == 0)
1165                                 mpq_init($0->val.num);
1166                         }$
1167                 | STRING ${
1168                         $0 = new(val);
1169                         $0->val.vtype = Vstr;
1170                         string_parse(&$1, '\\', &$0->val.str, $0->val.tail);
1171                         }$
1172                 | MULTI_STRING ${
1173                         $0 = new(val);
1174                         $0->val.vtype = Vstr;
1175                         string_parse(&$1, '\\', &$0->val.str, $0->val.tail);
1176                         }$
1177
1178 ###### print exec cases
1179         case Xval:
1180         {
1181                 struct val *v = cast(val, e);
1182                 if (v->val.vtype == Vstr)
1183                         printf("\"");
1184                 print_value(v->val);
1185                 if (v->val.vtype == Vstr)
1186                         printf("\"");
1187                 break;
1188         }
1189
1190 ###### propagate exec cases
1191                 case Xval:
1192                 {
1193                         struct val *val = cast(val, prog);
1194                         if (!vtype_compat(type, val->val.vtype))
1195                                 *ok = 0;
1196                         return val->val.vtype;
1197                 }
1198
1199 ###### interp exec cases
1200         case Xval:
1201                 return dup_value(cast(val, e)->val);
1202
1203 ###### ast functions
1204         static void free_val(struct val *v)
1205         {
1206                 if (!v)
1207                         return;
1208                 free_value(v->val);
1209                 free(v);
1210         }
1211
1212 ###### free exec cases
1213         case Xval: free_val(cast(val, e)); break;
1214
1215 ###### ast functions
1216         // Move all nodes from 'b' to 'rv', reversing the order.
1217         // In 'b' 'left' is a list, and 'right' is the last node.
1218         // In 'rv', left' is the first node and 'right' is a list.
1219         static struct binode *reorder_bilist(struct binode *b)
1220         {
1221                 struct binode *rv = NULL;
1222
1223                 while (b) {
1224                         struct exec *t = b->right;
1225                         b->right = rv;
1226                         rv = b;
1227                         if (b->left)
1228                                 b = cast(binode, b->left);
1229                         else
1230                                 b = NULL;
1231                         rv->left = t;
1232                 }
1233                 return rv;
1234         }
1235
1236 ### Variables
1237
1238 Just as we used as `val` to wrap a value into an `exec`, we similarly
1239 need a `var` to wrap a `variable` into an exec.  While each `val`
1240 contained a copy of the value, each `var` hold a link to the variable
1241 because it really is the same variable no matter where it appears.
1242 When a variable is used, we need to remember to follow the `->merged`
1243 link to find the primary instance.
1244
1245 ###### exec type
1246         Xvar,
1247
1248 ###### ast
1249         struct var {
1250                 struct exec;
1251                 struct variable *var;
1252         };
1253
1254 ###### Grammar
1255
1256         $*var
1257         VariableDecl -> IDENTIFIER := ${ {
1258                 struct variable *v = var_decl(config2context(config), $1.txt);
1259                 $0 = new(var);
1260                 $0->var = v;
1261         } }$
1262             | IDENTIFIER ::= ${ {
1263                 struct variable *v = var_decl(config2context(config), $1.txt);
1264                 v->constant = 1;
1265                 $0 = new(var);
1266                 $0->var = v;
1267         } }$
1268
1269         Variable -> IDENTIFIER ${ {
1270                 struct variable *v = var_ref(config2context(config), $1.txt);
1271                 if (v == NULL) {
1272                         /* This might be a label - allocate a var just in case */
1273                         v = var_decl(config2context(config), $1.txt);
1274                         if (v)
1275                                 val_init(&v->val, Vlabel);
1276                 }
1277                 $0 = new(var);
1278                 $0->var = v;
1279         } }$
1280
1281 ###### print exec cases
1282         case Xvar:
1283         {
1284                 struct var *v = cast(var, e);
1285                 if (v->var) {
1286                         struct binding *b = v->var->name;
1287                         printf("%.*s", b->name.len, b->name.txt);
1288                 }
1289                 break;
1290         }
1291
1292 ###### propagate exec cases
1293
1294         case Xvar:
1295         {
1296                 struct var *var = cast(var, prog);
1297                 struct variable *v = var->var;
1298                 if (!v) {
1299                         *ok = 0;
1300                         return Vnone;
1301                 }
1302                 if (v->merged)
1303                         v = v->merged;
1304                 if (v->val.vtype == Vunknown) {
1305                         if (type > Vunknown && *ok != 0) {
1306                                 val_init(&v->val, type);
1307                                 *ok = 2;
1308                         }
1309                         return type;
1310                 }
1311                 if (!vtype_compat(type, v->val.vtype))
1312                         *ok = 0;
1313                 if (type <= Vunknown)
1314                         return v->val.vtype;
1315                 return type;
1316         }
1317
1318 ###### interp exec cases
1319         case Xvar:
1320         {
1321                 struct var *var = cast(var, e);
1322                 struct variable *v = var->var;
1323
1324                 if (v->merged)
1325                         v = v->merged;
1326                 return dup_value(v->val);
1327         }
1328
1329 ###### ast functions
1330
1331         static void free_var(struct var *v)
1332         {
1333                 free(v);
1334         }
1335
1336 ###### free exec cases
1337         case Xvar: free_var(cast(var, e)); break;
1338
1339 ### Expressions: Boolean
1340
1341 Our first user of the `binode` will be expressions, and particularly
1342 Boolean expressions.  As I haven't implemented precedence in the
1343 parser generator yet, we need different names from each precedence
1344 level used by expressions.  The outer most or lowest level precedence
1345 are Boolean `or` `and`, and `not` which form an `Expression` out of `BTerm`s
1346 and `BFact`s.
1347
1348 ###### Binode types
1349         And,
1350         Or,
1351         Not,
1352
1353 ####### Grammar
1354
1355         $*binode
1356         Expression -> Expression or BTerm ${
1357                         $0 = new(binode);
1358                         $0->op = Or;
1359                         $0->left = $<1;
1360                         $0->right = $<3;
1361                 }$
1362                 | BTerm ${ $0 = $<1; }$
1363
1364         BTerm -> BTerm and BFact ${
1365                         $0 = new(binode);
1366                         $0->op = And;
1367                         $0->left = $<1;
1368                         $0->right = $<3;
1369                 }$
1370                 | BFact ${ $0 = $<1; }$
1371
1372         BFact -> not BFact ${
1373                         $0 = new(binode);
1374                         $0->op = Not;
1375                         $0->right = $<2;
1376                         }$
1377                 ## other BFact
1378
1379 ###### print binode cases
1380         case And:
1381                 print_exec(b->left, -1, 0);
1382                 printf(" and ");
1383                 print_exec(b->right, -1, 0);
1384                 break;
1385         case Or:
1386                 print_exec(b->left, -1, 0);
1387                 printf(" or ");
1388                 print_exec(b->right, -1, 0);
1389                 break;
1390         case Not:
1391                 printf("not ");
1392                 print_exec(b->right, -1, 0);
1393                 break;
1394
1395 ###### propagate binode cases
1396         case And:
1397         case Or:
1398         case Not:
1399                 /* both must be Vbool, result is Vbool */
1400                 propagate_types(b->left, Vbool, ok);
1401                 propagate_types(b->right, Vbool, ok);
1402                 if (type != Vbool && type > Vunknown)
1403                         *ok = 0;
1404                 return Vbool;
1405
1406 ###### interp binode cases
1407         case And:
1408                 rv = interp_exec(b->left);
1409                 right = interp_exec(b->right);
1410                 rv.bool = rv.bool && right.bool;
1411                 break;
1412         case Or:
1413                 rv = interp_exec(b->left);
1414                 right = interp_exec(b->right);
1415                 rv.bool = rv.bool || right.bool;
1416                 break;
1417         case Not:
1418                 rv = interp_exec(b->right);
1419                 rv.bool = !rv.bool;
1420                 break;
1421
1422 ### Expressions: Comparison
1423
1424 Of slightly higher precedence that Boolean expressions are
1425 Comparisons.
1426 A comparison takes arguments of any type, but the two types must be
1427 the same.
1428
1429 To simplify the parsing we introduce an `eop` which can return an
1430 expression operator.
1431
1432 ###### ast
1433         struct eop {
1434                 enum Btype op;
1435         };
1436
1437 ###### ast functions
1438         static void free_eop(struct eop *e)
1439         {
1440                 if (e)
1441                         free(e);
1442         }
1443
1444 ###### Binode types
1445         Less,
1446         Gtr,
1447         LessEq,
1448         GtrEq,
1449         Eql,
1450         NEql,
1451
1452 ###### other BFact
1453         | Expr CMPop Expr ${
1454                         $0 = new(binode);
1455                         $0->op = $2.op;
1456                         $0->left = $<1;
1457                         $0->right = $<3;
1458                 }$
1459         | Expr ${ $0 = $<1; }$
1460
1461 ###### Grammar
1462
1463         $eop
1464         CMPop ->   < ${ $0.op = Less; }$
1465                 |  > ${ $0.op = Gtr; }$
1466                 |  <= ${ $0.op = LessEq; }$
1467                 |  >= ${ $0.op = GtrEq; }$
1468                 |  == ${ $0.op = Eql; }$
1469                 |  != ${ $0.op = NEql; }$
1470
1471 ###### print binode cases
1472
1473         case Less:
1474         case LessEq:
1475         case Gtr:
1476         case GtrEq:
1477         case Eql:
1478         case NEql:
1479                 print_exec(b->left, -1, 0);
1480                 switch(b->op) {
1481                 case Less:   printf(" < "); break;
1482                 case LessEq: printf(" <= "); break;
1483                 case Gtr:    printf(" > "); break;
1484                 case GtrEq:  printf(" >= "); break;
1485                 case Eql:    printf(" == "); break;
1486                 case NEql:   printf(" != "); break;
1487                 default: abort();
1488                 }
1489                 print_exec(b->right, -1, 0);
1490                 break;
1491
1492 ###### propagate binode cases
1493         case Less:
1494         case LessEq:
1495         case Gtr:
1496         case GtrEq:
1497         case Eql:
1498         case NEql:
1499                 /* Both must match but not labels, result is Vbool */
1500                 t = propagate_types(b->left, Vnolabel, ok);
1501                 if (t > Vunknown)
1502                         propagate_types(b->right, t, ok);
1503                 else {
1504                         t = propagate_types(b->right, Vnolabel, ok);
1505                         if (t > Vunknown)
1506                                 t = propagate_types(b->left, t, ok);
1507                 }
1508                 if (!vtype_compat(type, Vbool))
1509                         *ok = 0;
1510                 return Vbool;
1511
1512 ###### interp binode cases
1513         case Less:
1514         case LessEq:
1515         case Gtr:
1516         case GtrEq:
1517         case Eql:
1518         case NEql:
1519         {
1520                 int cmp;
1521                 left = interp_exec(b->left);
1522                 right = interp_exec(b->right);
1523                 cmp = value_cmp(left, right);
1524                 rv.vtype = Vbool;
1525                 switch (b->op) {
1526                 case Less:      rv.bool = cmp <  0; break;
1527                 case LessEq:    rv.bool = cmp <= 0; break;
1528                 case Gtr:       rv.bool = cmp >  0; break;
1529                 case GtrEq:     rv.bool = cmp >= 0; break;
1530                 case Eql:       rv.bool = cmp == 0; break;
1531                 case NEql:      rv.bool = cmp != 0; break;
1532                 default: rv.bool = 0; break;
1533                 }
1534                 break;
1535         }
1536
1537 ### Expressions: The rest
1538
1539 The remaining expressions with the highest precedence are arithmetic
1540 and string concatenation.  There are `Expr`, `Term`, and `Factor`.
1541 The `Factor` is where the `Value` and `Variable` that we already have
1542 are included.
1543
1544 `+` and `-` are both infix and prefix operations (where they are
1545 absolute value and negation).  These have different operator names.
1546
1547 We also have a 'Bracket' operator which records where parentheses were
1548 found.  This make it easy to reproduce these when printing.  Once
1549 precedence is handled better I might be able to discard this.
1550
1551 ###### Binode types
1552         Plus, Minus,
1553         Times, Divide,
1554         Concat,
1555         Absolute, Negate,
1556         Bracket,
1557
1558 ###### Grammar
1559
1560         $*binode
1561         Expr -> Expr Eop Term ${
1562                         $0 = new(binode);
1563                         $0->op = $2.op;
1564                         $0->left = $<1;
1565                         $0->right = $<3;
1566                 }$
1567                 | Term ${ $0 = $<1; }$
1568
1569         Term -> Term Top Factor ${
1570                         $0 = new(binode);
1571                         $0->op = $2.op;
1572                         $0->left = $<1;
1573                         $0->right = $<3;
1574                 }$
1575                 | Factor ${ $0 = $<1; }$
1576
1577         Factor -> ( Expression ) ${
1578                         $0 = new(binode);
1579                         $0->op = Bracket;
1580                         $0->right = $<2;
1581                 }$
1582                 | Uop Factor ${
1583                         $0 = new(binode);
1584                         $0->op = $1.op;
1585                         $0->right = $<2;
1586                 }$
1587                 | Value ${ $0 = (struct binode *)$<1; }$
1588                 | Variable ${ $0 = (struct binode *)$<1; }$
1589
1590         $eop
1591         Eop ->    + ${ $0.op = Plus; }$
1592                 | - ${ $0.op = Minus; }$
1593
1594         Uop ->    + ${ $0.op = Absolute; }$
1595                 | - ${ $0.op = Negate; }$
1596
1597         Top ->    * ${ $0.op = Times; }$
1598                 | / ${ $0.op = Divide; }$
1599                 | ++ ${ $0.op = Concat; }$
1600
1601 ###### print binode cases
1602         case Plus:
1603         case Minus:
1604         case Times:
1605         case Divide:
1606         case Concat:
1607                 print_exec(b->left, indent, 0);
1608                 switch(b->op) {
1609                 case Plus:   printf(" + "); break;
1610                 case Minus:  printf(" - "); break;
1611                 case Times:  printf(" * "); break;
1612                 case Divide: printf(" / "); break;
1613                 case Concat: printf(" ++ "); break;
1614                 default: abort();
1615                 }
1616                 print_exec(b->right, indent, 0);
1617                 break;
1618         case Absolute:
1619                 printf("+");
1620                 print_exec(b->right, indent, 0);
1621                 break;
1622         case Negate:
1623                 printf("-");
1624                 print_exec(b->right, indent, 0);
1625                 break;
1626         case Bracket:
1627                 printf("(");
1628                 print_exec(b->right, indent, 0);
1629                 printf(")");
1630                 break;
1631
1632 ###### propagate binode cases
1633         case Plus:
1634         case Minus:
1635         case Times:
1636         case Divide:
1637                 /* both must be numbers, result is Vnum */
1638         case Absolute:
1639         case Negate:
1640                 /* as propagate_types ignores a NULL,
1641                  * unary ops fit here too */
1642                 propagate_types(b->left, Vnum, ok);
1643                 propagate_types(b->right, Vnum, ok);
1644                 if (!vtype_compat(type, Vnum))
1645                         *ok = 0;
1646                 return Vnum;
1647
1648         case Concat:
1649                 /* both must be Vstr, result is Vstr */
1650                 propagate_types(b->left, Vstr, ok);
1651                 propagate_types(b->right, Vstr, ok);
1652                 if (!vtype_compat(type, Vstr))
1653                         *ok = 0;
1654                 return Vstr;
1655
1656         case Bracket:
1657                 return propagate_types(b->right, type, ok);
1658
1659 ###### interp binode cases
1660
1661         case Plus:
1662                 rv = interp_exec(b->left);
1663                 right = interp_exec(b->right);
1664                 mpq_add(rv.num, rv.num, right.num);
1665                 break;
1666         case Minus:
1667                 rv = interp_exec(b->left);
1668                 right = interp_exec(b->right);
1669                 mpq_sub(rv.num, rv.num, right.num);
1670                 break;
1671         case Times:
1672                 rv = interp_exec(b->left);
1673                 right = interp_exec(b->right);
1674                 mpq_mul(rv.num, rv.num, right.num);
1675                 break;
1676         case Divide:
1677                 rv = interp_exec(b->left);
1678                 right = interp_exec(b->right);
1679                 mpq_div(rv.num, rv.num, right.num);
1680                 break;
1681         case Negate:
1682                 rv = interp_exec(b->right);
1683                 mpq_neg(rv.num, rv.num);
1684                 break;
1685         case Absolute:
1686                 rv = interp_exec(b->right);
1687                 mpq_abs(rv.num, rv.num);
1688                 break;
1689         case Bracket:
1690                 rv = interp_exec(b->right);
1691                 break;
1692         case Concat:
1693                 left = interp_exec(b->left);
1694                 right = interp_exec(b->right);
1695                 rv.vtype = Vstr;
1696                 rv.str = text_join(left.str, right.str);
1697                 break;
1698
1699 ### Blocks, Statements, and Statement lists.
1700
1701 Now that we have expressions out of the way we need to turn to
1702 statements.  There are simple statements and more complex statements.
1703 Simple statements do not contain newlines, complex statements do.
1704
1705 Statements often come in sequences and we have corresponding simple
1706 statement lists and complex statement lists.
1707 The former comprise only simple statements separated by semicolons.
1708 The later comprise complex statements and simple statement lists.  They are
1709 separated by newlines.  Thus the semicolon is only used to separate
1710 simple statements on the one line.  This may be overly restrictive,
1711 but I'm not sure I every want a complex statement to share a line with
1712 anything else.
1713
1714 Note that a simple statement list can still use multiple lines if
1715 subsequent lines are indented, so
1716
1717 ###### Example: wrapped simple statement list
1718
1719         a = b; c = d;
1720            e = f; print g
1721
1722 is a single simple statement list.  This might allow room for
1723 confusion, so I'm not set on it yet.
1724
1725 A simple statement list needs no extra syntax.  A complex statement
1726 list has two syntactic forms.  It can be enclosed in braces (much like
1727 C blocks), or it can be introduced by a colon and continue until an
1728 unindented newline (much like Python blocks).  With this extra syntax
1729 it is referred to as a block.
1730
1731 Note that a block does not have to include any newlines if it only
1732 contains simple statements.  So both of:
1733
1734         if condition: a=b; d=f
1735
1736         if condition { a=b; print f }
1737
1738 are valid.
1739
1740 In either case the list is constructed from a `binode` list with
1741 `Block` as the operator.  When parsing the list it is most convenient
1742 to append to the end, so a list is a list and a statement.  When using
1743 the list it is more convenient to consider a list to be a statement
1744 and a list.  So we need a function to re-order a list.
1745 `reorder_bilist` serves this purpose.
1746
1747 The only stand-alone statement we introduce at this stage is `pass`
1748 which does nothing and is represented as a `NULL` pointer in a `Block`
1749 list.
1750
1751 ###### Binode types
1752         Block,
1753
1754 ###### Grammar
1755
1756         $void
1757         OptNL -> Newlines
1758                 |
1759
1760         Newlines -> NEWLINE
1761                 | Newlines NEWLINE
1762
1763         $*binode
1764         Open -> {
1765                 | NEWLINE {
1766         Close -> }
1767                 | NEWLINE }
1768         Block -> Open Statementlist Close ${ $0 = $<2; }$
1769                 | Open Newlines Statementlist Close ${ $0 = $<3; }$
1770                 | Open SimpleStatements } ${ $0 = reorder_bilist($<2); }$
1771                 | Open Newlines SimpleStatements } ${ $0 = reorder_bilist($<3); }$
1772                 | : Statementlist ${ $0 = $<2; }$
1773                 | : SimpleStatements ${ $0 = reorder_bilist($<2); }$
1774
1775         Statementlist -> ComplexStatements ${ $0 = reorder_bilist($<1); }$
1776
1777         ComplexStatements -> ComplexStatements ComplexStatement ${
1778                 $0 = new(binode);
1779                 $0->op = Block;
1780                 $0->left = $<1;
1781                 $0->right = $<2;
1782                 }$
1783                 | ComplexStatements NEWLINE ${ $0 = $<1; }$
1784                 | ComplexStatement ${
1785                 $0 = new(binode);
1786                 $0->op = Block;
1787                 $0->left = NULL;
1788                 $0->right = $<1;
1789                 }$
1790
1791         $*exec
1792         ComplexStatement -> SimpleStatements NEWLINE ${
1793                         $0 = reorder_bilist($<1);
1794                         }$
1795                 ## ComplexStatement Grammar
1796
1797         $*binode
1798         SimpleStatements -> SimpleStatements ; SimpleStatement ${
1799                         $0 = new(binode);
1800                         $0->op = Block;
1801                         $0->left = $<1;
1802                         $0->right = $<3;
1803                         }$
1804                 | SimpleStatement ${
1805                         $0 = new(binode);
1806                         $0->op = Block;
1807                         $0->left = NULL;
1808                         $0->right = $<1;
1809                         }$
1810                 | SimpleStatements ; ${ $0 = $<1; }$
1811
1812         SimpleStatement -> pass ${ $0 = NULL; }$
1813                 ## SimpleStatement Grammar
1814
1815 ###### print binode cases
1816         case Block:
1817                 if (indent < 0) {
1818                         // simple statement
1819                         if (b->left == NULL)
1820                                 printf("pass");
1821                         else
1822                                 print_exec(b->left, indent, 0);
1823                         if (b->right) {
1824                                 printf("; ");
1825                                 print_exec(b->right, indent, 0);
1826                         }
1827                 } else {
1828                         // block, one per line
1829                         if (b->left == NULL)
1830                                 do_indent(indent, "pass\n");
1831                         else
1832                                 print_exec(b->left, indent, bracket);
1833                         if (b->right)
1834                                 print_exec(b->right, indent, bracket);
1835                 }
1836                 break;
1837
1838 ###### propagate binode cases
1839         case Block:
1840         {
1841                 /* If any statement returns something other then Vnone
1842                  * then all such must return same type.
1843                  * As each statement may be Vnone or something else,
1844                  * we must always pass Vunknown down, otherwise an incorrect
1845                  * error might occur.
1846                  */
1847                 struct binode *e;
1848
1849                 for (e = b; e; e = cast(binode, e->right)) {
1850                         t = propagate_types(e->left, Vunknown, ok);
1851                         if (t != Vunknown && t != Vnone) {
1852                                 if (type == Vunknown)
1853                                         type = t;
1854                                 else if (t != type)
1855                                         *ok = 0;
1856                         }
1857                 }
1858                 return type;
1859         }
1860
1861 ###### interp binode cases
1862         case Block:
1863                 while (rv.vtype == Vnone &&
1864                        b) {
1865                         if (b->left)
1866                                 rv = interp_exec(b->left);
1867                         b = cast(binode, b->right);
1868                 }
1869                 break;
1870
1871 ### The Print statement
1872
1873 `print` is a simple statement that takes a comma-separated list of
1874 expressions and prints the values separated by spaces and terminated
1875 by a newline.  No control of formatting is possible.
1876
1877 `print` faces the same list-ordering issue as blocks, and uses the
1878 same solution.
1879
1880 ###### Binode types
1881         Print,
1882
1883 ###### SimpleStatement Grammar
1884
1885         | print ExpressionList ${
1886                 $0 = reorder_bilist($<2);
1887         }$
1888         | print ExpressionList , ${
1889                 $0 = new(binode);
1890                 $0->op = Print;
1891                 $0->right = NULL;
1892                 $0->left = $<2;
1893                 $0 = reorder_bilist($0);
1894         }$
1895         | print ${
1896                 $0 = new(binode);
1897                 $0->op = Print;
1898                 $0->right = NULL;
1899         }$
1900
1901 ###### Grammar
1902
1903         $*binode
1904         ExpressionList -> ExpressionList , Expression ${
1905                 $0 = new(binode);
1906                 $0->op = Print;
1907                 $0->left = $<1;
1908                 $0->right = $<3;
1909                 }$
1910                 | Expression ${
1911                         $0 = new(binode);
1912                         $0->op = Print;
1913                         $0->left = NULL;
1914                         $0->right = $<1;
1915                 }$
1916
1917 ###### print binode cases
1918
1919         case Print:
1920                 do_indent(indent, "print");
1921                 while (b) {
1922                         if (b->left) {
1923                                 printf(" ");
1924                                 print_exec(b->left, -1, 0);
1925                                 if (b->right)
1926                                         printf(",");
1927                         }
1928                         b = cast(binode, b->right);
1929                 }
1930                 if (indent >= 0)
1931                         printf("\n");
1932                 break;
1933
1934 ###### propagate binode cases
1935
1936         case Print:
1937                 /* don't care but all must be consistent */
1938                 propagate_types(b->left, Vnolabel, ok);
1939                 propagate_types(b->right, Vnolabel, ok);
1940                 break;
1941
1942 ###### interp binode cases
1943
1944         case Print:
1945         {
1946                 char sep = 0;
1947                 int eol = 1;
1948                 for ( ; b; b = cast(binode, b->right))
1949                         if (b->left) {
1950                                 if (sep)
1951                                         putchar(sep);
1952                                 left = interp_exec(b->left);
1953                                 print_value(left);
1954                                 free_value(left);
1955                                 if (b->right)
1956                                         sep = ' ';
1957                         } else if (sep)
1958                                 eol = 0;
1959                 left.vtype = Vnone;
1960                 if (eol)
1961                         printf("\n");
1962                 break;
1963         }
1964
1965 ###### Assignment statement
1966
1967 An assignment will assign a value to a variable, providing it hasn't
1968 be declared as a constant.  The analysis phase ensures that the type
1969 will be correct so the interpreter just needs to perform the
1970 calculation.  There is a form of assignment which declares a new
1971 variable as well as assigning a value.  If a name is assigned before
1972 it is declared, and error will be raised as the name is created as
1973 `Vlabel` and it is illegal to assign to such names.
1974
1975 ###### Binode types
1976         Assign,
1977         Declare,
1978
1979 ###### SimpleStatement Grammar
1980         | Variable = Expression ${ {
1981                         struct var *v = cast(var, $1);
1982
1983                         $0 = new(binode);
1984                         $0->op = Assign;
1985                         $0->left = $<1;
1986                         $0->right = $<3;
1987                         if (v->var && !v->var->constant) {
1988                                 /* FIXME error? */
1989                         }
1990                 } }$
1991         | VariableDecl Expression ${
1992                         $0 = new(binode);
1993                         $0->op = Declare;
1994                         $0->left = $<1;
1995                         $0->right =$<2;
1996                 }$
1997
1998 ###### print binode cases
1999
2000         case Assign:
2001                 do_indent(indent, "");
2002                 print_exec(b->left, indent, 0);
2003                 printf(" = ");
2004                 print_exec(b->right, indent, 0);
2005                 if (indent >= 0)
2006                         printf("\n");
2007                 break;
2008
2009         case Declare:
2010                 do_indent(indent, "");
2011                 print_exec(b->left, indent, 0);
2012                 if (cast(var, b->left)->var->constant)
2013                         printf(" ::= ");
2014                 else
2015                         printf(" := ");
2016                 print_exec(b->right, indent, 0);
2017                 if (indent >= 0)
2018                         printf("\n");
2019                 break;
2020
2021 ###### propagate binode cases
2022
2023         case Assign:
2024         case Declare:
2025                 /* Both must match and not be labels, result is Vnone */
2026                 t = propagate_types(b->left, Vnolabel, ok);
2027                 if (t > Vunknown)
2028                         propagate_types(b->right, t, ok);
2029                 else {
2030                         t = propagate_types(b->right, Vnolabel, ok);
2031                         if (t > Vunknown)
2032                                 t = propagate_types(b->left, t, ok);
2033                 }
2034                 return Vnone;
2035
2036                 break;
2037
2038 ###### interp binode cases
2039
2040         case Assign:
2041         case Declare:
2042         {
2043                 struct variable *v = cast(var, b->left)->var;
2044                 if (v->merged)
2045                         v = v->merged;
2046                 right = interp_exec(b->right);
2047                 free_value(v->val);
2048                 v->val = right;
2049                 right.vtype = Vunknown;
2050                 break;
2051         }
2052
2053 ### The `use` statement
2054
2055 The `use` statement is the last "simple" statement.  It is needed when
2056 the condition in a conditional statement is a block.  `use` works much
2057 like `return` in C, but only completes the `condition`, not the whole
2058 function.
2059
2060 ###### Binode types
2061         Use,
2062
2063 ###### SimpleStatement Grammar
2064         | use Expression ${
2065                 $0 = new(binode);
2066                 $0->op = Use;
2067                 $0->right = $<2;
2068         }$
2069
2070 ###### print binode cases
2071
2072         case Use:
2073                 do_indent(indent, "use ");
2074                 print_exec(b->right, -1, 0);
2075                 if (indent >= 0)
2076                         printf("\n");
2077                 break;
2078
2079 ###### propagate binode cases
2080
2081         case Use:
2082                 /* result matches value */
2083                 return propagate_types(b->right, type, ok);
2084
2085 ###### interp binode cases
2086
2087         case Use:
2088                 rv = interp_exec(b->right);
2089                 break;
2090
2091 ### The Conditional Statement
2092
2093 This is the biggy and currently the only complex statement.
2094 This subsumes `if`, `while`, `do/while`, `switch`, and some parts of
2095 `for`.  It is comprised of a number of parts, all of which are
2096 optional though set combinations apply.
2097
2098 If there is a `forpart`, it is executed first, only once.
2099 If there is a `dopart`, then it is executed repeatedly providing
2100 always that the `condpart` or `cond`, if present, does not return a non-True
2101 value.  `condpart` can fail to return any value if it simply executes
2102 to completion.  This is treated the same as returning True.
2103
2104 If there is a `thenpart` it will be executed whenever the `condpart`
2105 or `cond` returns True (or does not return any value), but this will happen
2106 *after* `dopart` (when present).
2107
2108 If `elsepart` is present it will be executed at most once when the
2109 condition returns `False` or some value that isn't `True` and isn't
2110 matched by any `casepart`.  If there are any `casepart`s, they will be
2111 executed when the condition returns a matching value.
2112
2113 The particular sorts of values allowed in case parts has not yet been
2114 determined in the language design, so nothing is prohibited.
2115
2116 The various blocks in this complex statement potentially provide scope
2117 for variables as described earlier.  Each such block must include the
2118 "OpenScope" nonterminal before parsing the block, and must call
2119 `var_block_close()` when closing the block.
2120
2121 The code following "`if`", "`switch`" and "`for`" does not get its own
2122 scope, but is in a scope covering the whole statement, so names
2123 declared there cannot be redeclared elsewhere.  Similarly the
2124 condition following "`while`" is in a scope the covers the body
2125 ("`do`" part) of the loop, and which does not allow conditional scope
2126 extension.  Code following "`then`" (both looping and non-looping),
2127 "`else`" and "`case`" each get their own local scope.
2128
2129 The `cond_statement` cannot fit into a `binode` so a new `exec` is
2130 defined.
2131
2132 ###### exec type
2133         Xcond_statement,
2134
2135 ###### ast
2136         struct casepart {
2137                 struct exec *value;
2138                 struct exec *action;
2139                 struct casepart *next;
2140         };
2141         struct cond_statement {
2142                 struct exec;
2143                 struct exec *forpart, *condpart, *dopart, *thenpart, *elsepart;
2144                 struct casepart *casepart;
2145         };
2146
2147 ###### ast functions
2148
2149         static void free_casepart(struct casepart *cp)
2150         {
2151                 while (cp) {
2152                         struct casepart *t;
2153                         free_exec(cp->value);
2154                         free_exec(cp->action);
2155                         t = cp->next;
2156                         free(cp);
2157                         cp = t;
2158                 }
2159         }
2160
2161         static void free_cond_statement(struct cond_statement *s)
2162         {
2163                 if (!s)
2164                         return;
2165                 free_exec(s->forpart);
2166                 free_exec(s->condpart);
2167                 free_exec(s->dopart);
2168                 free_exec(s->thenpart);
2169                 free_exec(s->elsepart);
2170                 free_casepart(s->casepart);
2171                 free(s);
2172         }
2173
2174 ###### free exec cases
2175         case Xcond_statement: free_cond_statement(cast(cond_statement, e)); break;
2176
2177 ###### ComplexStatement Grammar
2178         | CondStatement ${ $0 = $<1; }$
2179
2180 ###### Grammar
2181
2182         $*cond_statement
2183         // both ForThen and Whilepart open scopes, and CondSuffix only
2184         // closes one - so in the first branch here we have another to close.
2185         CondStatement -> ForThen WhilePart CondSuffix ${
2186                         $0 = $<3;
2187                         $0->forpart = $1.forpart; $1.forpart = NULL;
2188                         $0->thenpart = $1.thenpart; $1.thenpart = NULL;
2189                         $0->condpart = $2.condpart; $2.condpart = NULL;
2190                         $0->dopart = $2.dopart; $2.dopart = NULL;
2191                         var_block_close(config2context(config), CloseSequential);
2192                         }$
2193                 | WhilePart CondSuffix ${
2194                         $0 = $<2;
2195                         $0->condpart = $1.condpart; $1.condpart = NULL;
2196                         $0->dopart = $1.dopart; $1.dopart = NULL;
2197                         }$
2198                 | SwitchPart CondSuffix ${
2199                         $0 = $<2;
2200                         $0->condpart = $<1;
2201                         }$
2202                 | IfPart IfSuffix ${
2203                         $0 = $<2;
2204                         $0->condpart = $1.condpart; $1.condpart = NULL;
2205                         $0->thenpart = $1.thenpart; $1.thenpart = NULL;
2206                         // This is where we close an "if" statement
2207                         var_block_close(config2context(config), CloseSequential);
2208                         }$
2209
2210         CondSuffix -> IfSuffix ${
2211                         $0 = $<1;
2212                         // This is where we close scope of the whole
2213                         // "for" or "while" statement
2214                         var_block_close(config2context(config), CloseSequential);
2215                 }$
2216                 | CasePart CondSuffix ${
2217                         $0 = $<2;
2218                         $1->next = $0->casepart;
2219                         $0->casepart = $<1;
2220                 }$
2221
2222         $*casepart
2223         CasePart -> Newlines case Expression OpenScope Block ${
2224                         $0 = calloc(1,sizeof(struct casepart));
2225                         $0->value = $<3;
2226                         $0->action = $<5;
2227                         var_block_close(config2context(config), CloseParallel);
2228                 }$
2229                 | case Expression OpenScope Block ${
2230                         $0 = calloc(1,sizeof(struct casepart));
2231                         $0->value = $<2;
2232                         $0->action = $<4;
2233                         var_block_close(config2context(config), CloseParallel);
2234                 }$
2235
2236         $*cond_statement
2237         IfSuffix -> Newlines ${ $0 = new(cond_statement); }$
2238                 | Newlines else OpenScope Block ${
2239                         $0 = new(cond_statement);
2240                         $0->elsepart = $<4;
2241                         var_block_close(config2context(config), CloseElse);
2242                 }$
2243                 | else OpenScope Block ${
2244                         $0 = new(cond_statement);
2245                         $0->elsepart = $<3;
2246                         var_block_close(config2context(config), CloseElse);
2247                 }$
2248                 | Newlines else OpenScope CondStatement ${
2249                         $0 = new(cond_statement);
2250                         $0->elsepart = $<4;
2251                         var_block_close(config2context(config), CloseElse);
2252                 }$
2253                 | else OpenScope CondStatement ${
2254                         $0 = new(cond_statement);
2255                         $0->elsepart = $<3;
2256                         var_block_close(config2context(config), CloseElse);
2257                 }$
2258
2259
2260         $*exec
2261         // These scopes are closed in CondSuffix
2262         ForPart -> for OpenScope SimpleStatements ${
2263                         $0 = reorder_bilist($<3);
2264                 }$
2265                 |  for OpenScope Block ${
2266                         $0 = $<3;
2267                 }$
2268
2269         ThenPart -> then OpenScope SimpleStatements ${
2270                         $0 = reorder_bilist($<3);
2271                         var_block_close(config2context(config), CloseSequential);
2272                 }$
2273                 |  then OpenScope Block ${
2274                         $0 = $<3;
2275                         var_block_close(config2context(config), CloseSequential);
2276                 }$
2277
2278         ThenPartNL -> ThenPart OptNL ${
2279                         $0 = $<1;
2280                 }$
2281
2282         // This scope is closed in CondSuffix
2283         WhileHead -> while OpenScope Block ${
2284                 $0 = $<3;
2285                 }$
2286
2287         $cond_statement
2288         ForThen -> ForPart OptNL ThenPartNL ${
2289                         $0.forpart = $<1;
2290                         $0.thenpart = $<3;
2291                 }$
2292                 | ForPart OptNL ${
2293                         $0.forpart = $<1;
2294                 }$
2295
2296         // This scope is closed in CondSuffix
2297         WhilePart -> while OpenScope Expression Block ${
2298                         $0.type = Xcond_statement;
2299                         $0.condpart = $<3;
2300                         $0.dopart = $<4;
2301                 }$
2302                 | WhileHead OptNL do Block ${
2303                         $0.type = Xcond_statement;
2304                         $0.condpart = $<1;
2305                         $0.dopart = $<4;
2306                 }$
2307
2308         IfPart -> if OpenScope Expression OpenScope Block ${
2309                         $0.type = Xcond_statement;
2310                         $0.condpart = $<3;
2311                         $0.thenpart = $<5;
2312                         var_block_close(config2context(config), CloseParallel);
2313                 }$
2314                 | if OpenScope Block OptNL then OpenScope Block ${
2315                         $0.type = Xcond_statement;
2316                         $0.condpart = $<3;
2317                         $0.thenpart = $<7;
2318                         var_block_close(config2context(config), CloseParallel);
2319                 }$
2320
2321         $*exec
2322         // This scope is closed in CondSuffix
2323         SwitchPart -> switch OpenScope Expression ${
2324                         $0 = $<3;
2325                 }$
2326                 | switch OpenScope Block ${
2327                         $0 = $<3;
2328                 }$
2329
2330 ###### print exec cases
2331
2332         case Xcond_statement:
2333         {
2334                 struct cond_statement *cs = cast(cond_statement, e);
2335                 struct casepart *cp;
2336                 if (cs->forpart) {
2337                         do_indent(indent, "for");
2338                         if (bracket) printf(" {\n"); else printf(":\n");
2339                         print_exec(cs->forpart, indent+1, bracket);
2340                         if (cs->thenpart) {
2341                                 if (bracket)
2342                                         do_indent(indent, "} then {\n");
2343                                 else
2344                                         do_indent(indent, "then:\n");
2345                                 print_exec(cs->thenpart, indent+1, bracket);
2346                         }
2347                         if (bracket) do_indent(indent, "}\n");
2348                 }
2349                 if (cs->dopart) {
2350                         // a loop
2351                         if (cs->condpart && cs->condpart->type == Xbinode &&
2352                             cast(binode, cs->condpart)->op == Block) {
2353                                 if (bracket)
2354                                         do_indent(indent, "while {\n");
2355                                 else
2356                                         do_indent(indent, "while:\n");
2357                                 print_exec(cs->condpart, indent+1, bracket);
2358                                 if (bracket)
2359                                         do_indent(indent, "} do {\n");
2360                                 else
2361                                         do_indent(indent, "do:\n");
2362                                 print_exec(cs->dopart, indent+1, bracket);
2363                                 if (bracket)
2364                                         do_indent(indent, "}\n");
2365                         } else {
2366                                 do_indent(indent, "while ");
2367                                 print_exec(cs->condpart, 0, bracket);
2368                                 if (bracket)
2369                                         printf(" {\n");
2370                                 else
2371                                         printf(":\n");
2372                                 print_exec(cs->dopart, indent+1, bracket);
2373                                 if (bracket)
2374                                         do_indent(indent, "}\n");
2375                         }
2376                 } else {
2377                         // a condition
2378                         if (cs->casepart)
2379                                 do_indent(indent, "switch");
2380                         else
2381                                 do_indent(indent, "if");
2382                         if (cs->condpart && cs->condpart->type == Xbinode &&
2383                             cast(binode, cs->condpart)->op == Block) {
2384                                 if (bracket)
2385                                         printf(" {\n");
2386                                 else
2387                                         printf(":\n");
2388                                 print_exec(cs->condpart, indent+1, bracket);
2389                                 if (bracket)
2390                                         do_indent(indent, "}\n");
2391                                 if (cs->thenpart) {
2392                                         do_indent(indent, "then:\n");
2393                                         print_exec(cs->thenpart, indent+1, bracket);
2394                                 }
2395                         } else {
2396                                 printf(" ");
2397                                 print_exec(cs->condpart, 0, bracket);
2398                                 if (cs->thenpart) {
2399                                         if (bracket)
2400                                                 printf(" {\n");
2401                                         else
2402                                                 printf(":\n");
2403                                         print_exec(cs->thenpart, indent+1, bracket);
2404                                         if (bracket)
2405                                                 do_indent(indent, "}\n");
2406                                 } else
2407                                         printf("\n");
2408                         }
2409                 }
2410                 for (cp = cs->casepart; cp; cp = cp->next) {
2411                         do_indent(indent, "case ");
2412                         print_exec(cp->value, -1, 0);
2413                         if (bracket)
2414                                 printf(" {\n");
2415                         else
2416                                 printf(":\n");
2417                         print_exec(cp->action, indent+1, bracket);
2418                         if (bracket)
2419                                 do_indent(indent, "}\n");
2420                 }
2421                 if (cs->elsepart) {
2422                         do_indent(indent, "else");
2423                         if (bracket)
2424                                 printf(" {\n");
2425                         else
2426                                 printf(":\n");
2427                         print_exec(cs->elsepart, indent+1, bracket);
2428                         if (bracket)
2429                                 do_indent(indent, "}\n");
2430                 }
2431                 break;
2432         }
2433
2434 ###### propagate exec cases
2435         case Xcond_statement:
2436         {
2437                 // forpart and dopart must return Vnone
2438                 // condpart must be bool or match casepart->values
2439                 // thenpart, elsepart, casepart->action must match
2440                 // or be Vnone
2441                 struct cond_statement *cs = cast(cond_statement, prog);
2442                 struct casepart *c;
2443
2444                 t = propagate_types(cs->forpart, Vnone, ok);
2445                 if (!vtype_compat(Vnone, t))
2446                         *ok = 0;
2447                 t = propagate_types(cs->dopart, Vnone, ok);
2448                 if (!vtype_compat(Vnone, t))
2449                         *ok = 0;
2450                 if (cs->casepart == NULL)
2451                         propagate_types(cs->condpart, Vbool, ok);
2452                 else {
2453                         t = Vunknown;
2454                         for (c = cs->casepart;
2455                              c && (t == Vunknown); c = c->next)
2456                                 t = propagate_types(c->value, Vunknown, ok);
2457                         if (t == Vunknown && cs->condpart)
2458                                 t = propagate_types(cs->condpart, Vunknown, ok);
2459                         // Now we have a type (I hope) push it down
2460                         if (t != Vunknown) {
2461                                 for (c = cs->casepart; c; c = c->next)
2462                                         propagate_types(c->value, t, ok);
2463                                 propagate_types(cs->condpart, t, ok);
2464                         }
2465                 }
2466                 if (type == Vunknown || type == Vnone)
2467                         type = propagate_types(cs->thenpart, Vunknown, ok);
2468                 if (type == Vunknown || type == Vnone)
2469                         type = propagate_types(cs->elsepart, Vunknown, ok);
2470                 for (c = cs->casepart;
2471                      c && (type == Vunknown || type == Vnone);
2472                      c = c->next)
2473                         type = propagate_types(c->action, Vunknown, ok);
2474                 if (type != Vunknown && type != Vnone) {
2475                         propagate_types(cs->thenpart, type, ok);
2476                         propagate_types(cs->elsepart, type, ok);
2477                         for (c = cs->casepart; c ; c = c->next)
2478                                 propagate_types(c->action, type, ok);
2479                         return type;
2480                 } else
2481                         return Vunknown;
2482         }
2483
2484 ###### interp exec cases
2485         case Xcond_statement:
2486         {
2487                 struct value v, cnd;
2488                 struct casepart *cp;
2489                 struct cond_statement *c = cast(cond_statement, e);
2490                 if (c->forpart)
2491                         interp_exec(c->forpart);
2492                 do {
2493                         if (c->condpart)
2494                                 cnd = interp_exec(c->condpart);
2495                         else
2496                                 cnd.vtype = Vnone;
2497                         if (!(cnd.vtype == Vnone ||
2498                               (cnd.vtype == Vbool && cnd.bool != 0)))
2499                                 break;
2500                         if (c->dopart) {
2501                                 free_value(cnd);
2502                                 interp_exec(c->dopart);
2503                         }
2504                         if (c->thenpart) {
2505                                 v = interp_exec(c->thenpart);
2506                                 if (v.vtype != Vnone || !c->dopart)
2507                                         return v;
2508                                 free_value(v);
2509                         }
2510                 } while (c->dopart);
2511
2512                 for (cp = c->casepart; cp; cp = cp->next) {
2513                         v = interp_exec(cp->value);
2514                         if (value_cmp(v, cnd) == 0) {
2515                                 free_value(v);
2516                                 free_value(cnd);
2517                                 return interp_exec(cp->action);
2518                         }
2519                         free_value(v);
2520                 }
2521                 free_value(cnd);
2522                 if (c->elsepart)
2523                         return interp_exec(c->elsepart);
2524                 v.vtype = Vnone;
2525                 return v;
2526         }
2527
2528 ### Finally the whole program.
2529
2530 Somewhat reminiscent of Pascal a (current) Ocean program starts with
2531 the keyword "program" and a list of variable names which are assigned
2532 values from command line arguments.  Following this is a `block` which
2533 is the code to execute.
2534
2535 As this is the top level, several things are handled a bit
2536 differently.
2537 The whole program is not interpreted by `interp_exec` as that isn't
2538 passed the argument list which the program requires.  Similarly type
2539 analysis is a bit more interesting at this level.
2540
2541 ###### Binode types
2542         Program,
2543
2544 ###### Parser: grammar
2545
2546         $*binode
2547         Program -> program OpenScope Varlist Block OptNL ${
2548                 $0 = new(binode);
2549                 $0->op = Program;
2550                 $0->left = reorder_bilist($<3);
2551                 $0->right = $<4;
2552                 var_block_close(config2context(config), CloseSequential);
2553                 if (config2context(config)->scope_stack) abort();
2554         }$
2555
2556         Varlist -> Varlist ArgDecl ${
2557                         $0 = new(binode);
2558                         $0->op = Program;
2559                         $0->left = $<1;
2560                         $0->right = $<2;
2561                 }$
2562                 | ${ $0 = NULL; }$
2563
2564         $*var
2565         ArgDecl -> IDENTIFIER ${ {
2566                 struct variable *v = var_decl(config2context(config), $1.txt);
2567                 $0 = new(var);
2568                 $0->var = v;
2569         } }$
2570
2571         ## Grammar
2572
2573 ###### print binode cases
2574         case Program:
2575                 do_indent(indent, "program");
2576                 for (b2 = cast(binode, b->left); b2; b2 = cast(binode, b2->right)) {
2577                         printf(" ");
2578                         print_exec(b2->left, 0, 0);
2579                 }
2580                 if (bracket)
2581                         printf(" {\n");
2582                 else
2583                         printf(":\n");
2584                 print_exec(b->right, indent+1, bracket);
2585                 if (bracket)
2586                         do_indent(indent, "}\n");
2587                 break;
2588
2589 ###### propagate binode cases
2590         case Program: abort();
2591
2592 ###### core functions
2593
2594         static int analyse_prog(struct exec *prog, struct parse_context *c)
2595         {
2596                 struct binode *b = cast(binode, prog);
2597                 int ok = 1;
2598
2599                 do {
2600                         ok = 1;
2601                         propagate_types(b->right, Vnone, &ok);
2602                 } while (ok == 2);
2603                 if (!ok)
2604                         return 0;
2605
2606                 for (b = cast(binode, b->left); b; b = cast(binode, b->right)) {
2607                         struct var *v = cast(var, b->left);
2608                         if (v->var->val.vtype == Vunknown)
2609                                 val_init(&v->var->val, Vstr);
2610                 }
2611                 b = cast(binode, prog);
2612                 do {
2613                         ok = 1;
2614                         propagate_types(b->right, Vnone, &ok);
2615                 } while (ok == 2);
2616                 if (!ok)
2617                         return 0;
2618
2619                 /* Make sure everything is still consistent */
2620                 propagate_types(b->right, Vnone, &ok);
2621                 return !!ok;
2622         }
2623
2624         static void interp_prog(struct exec *prog, char **argv)
2625         {
2626                 struct binode *p = cast(binode, prog);
2627                 struct binode *al = cast(binode, p->left);
2628                 struct value v;
2629
2630                 while (al) {
2631                         struct var *v = cast(var, al->left);
2632                         struct value *vl = &v->var->val;
2633
2634                         if (argv[0] == NULL) {
2635                                 printf("Not enough args\n");
2636                                 exit(1);
2637                         }
2638                         al = cast(binode, al->right);
2639                         free_value(*vl);
2640                         if (!parse_value(vl, argv[0]))
2641                                 exit(1);
2642                         argv++;
2643                 }
2644                 v = interp_exec(p->right);
2645                 free_value(v);
2646         }
2647
2648 ###### interp binode cases
2649         case Program: abort();
2650
2651 ## And now to test it out.
2652
2653 Having a language requires having a "hello world" program. I'll
2654 provide a little more than that: a program that prints "Hello world"
2655 finds the GCD of two numbers, prints the first few elements of
2656 Fibonacci, and performs a binary search for a number.
2657
2658 ###### File: oceani.mk
2659         tests :: sayhello
2660         sayhello : oceani
2661                 @echo "===== TEST ====="
2662                 ./oceani --section "test: hello" oceani.mdc 55 33
2663
2664 ###### test: hello
2665
2666         program A B:
2667                 print "Hello World, what lovely oceans you have!"
2668                 /* When a variable is defined in both branches of an 'if',
2669                  * and used afterwards, the variables are merged.
2670                  */
2671                 if A > B:
2672                         bigger := "yes"
2673                 else:
2674                         bigger := "no"
2675                 print "Is", A, "bigger than", B,"? ", bigger
2676                 /* If a variable is not used after the 'if', no
2677                  * merge happens, so types can be different
2678                  */
2679                 if A * 2 > B:
2680                         double := "yes"
2681                         print A, "is more than twice", B, "?", double
2682                 else:
2683                         double := A*2
2684                         print "double", A, "is only", double
2685
2686                 a := A; b := B
2687                 if a > 0 and b > 0:
2688                         while a != b:
2689                                 if a < b:
2690                                         b = b - a
2691                                 else:
2692                                         a = a - b
2693                         print "GCD of", A, "and", B,"is", a
2694                 else if a <= 0:
2695                         print a, "is not positive, cannot calculate GCD"
2696                 else:
2697                         print b, "is not positive, cannot calculate GCD"
2698
2699                 for:
2700                         togo := 10
2701                         f1 := 1; f2 := 1;
2702                         print "Fibonacci:", f1,f2,
2703                 then togo = togo - 1
2704                 while togo > 0:
2705                         f3 := f1 + f2
2706                         print "", f3,
2707                         f1 = f2
2708                         f2 = f3
2709                 print ""
2710
2711                 /* Binary search... */
2712                 for:
2713                         lo:= 0; hi := 100
2714                         target := 77
2715                 while:
2716                         mid := (lo + hi) / 2
2717                         if mid == target:
2718                                 use Found
2719                         if mid < target:
2720                                 lo = mid
2721                         else:
2722                                 hi = mid
2723                         if hi - lo < 1:
2724                                 use GiveUp
2725
2726                 do: pass
2727                 case Found:
2728                         print "Yay, I found", target
2729                 case GiveUp:
2730                         print "Closest I found was", mid