   1 # Ocean Interpreter - Jamison Creek version
   2
   3 Ocean is intended to be a compiled language, so this interpreter is
   4 not targeted at being the final product.  It is, rather, an intermediate
   5 stage and fills that role in two distinct ways.
   6
   7 Firstly, it exists as a platform to experiment with the early language
   8 design.  An interpreter is easy to write and easy to get working, so
    9 the barrier to entry is lower if I aim to start with an interpreter.
  10
  11 Secondly, the plan for the Ocean compiler is to write it in the
  12 [Ocean language](http://ocean-lang.org).  To achieve this we naturally
  13 need some sort of boot-strap process and this interpreter - written in
  14 portable C - will fill that role.  It will be used to bootstrap the
  15 Ocean compiler.
  16
  17 Two features that are not needed to fill either of these roles are
  18 performance and completeness.  The interpreter only needs to be fast
  19 enough to run small test programs and occasionally to run the compiler
  20 on itself.  It only needs to be complete enough to test aspects of the
  21 design which are developed before the compiler is working, and to run
  22 the compiler on itself.  Any features not used by the compiler when
  23 compiling itself are superfluous.  They may be included anyway, but
  24 they may not.
  25
   26 Nonetheless, the interpreter should end up being reasonably complete,
   27 and any performance bottlenecks that appear and are easily fixed will
   28 be fixed.
  29
  30 ## Current version
  31
  32 This third version of the interpreter exists to test out some initial
  33 ideas relating to types.  Particularly it adds arrays (indexed from
  34 zero) and simple structures.  Basic control flow and variable scoping
  35 are already fairly well established, as are basic numerical and
  36 boolean operators.
  37
   38 Some operators have only recently been added, and so have not yet
   39 generated much experience.  These are "and then" and "or else" as
   40 short-circuit Boolean operators, and the "if ... else" ternary
   41 operator which selects between two expressions based on a third
   42 (which appears syntactically in the middle).
  43
  44 The "func" clause currently only allows a "main" function to be
  45 declared.  That will be extended when proper function support is added.
  46
   47 An element that is present purely to make a usable language, and
   48 without any expectation that it will remain, is the "print" statement
   49 which performs simple output.
  50
  51 The current scalar types are "number", "Boolean", and "string".
  52 Boolean will likely stay in its current form, the other two might, but
  53 could just as easily be changed.
  54
  55 ## Naming
  56
  57 Versions of the interpreter which obviously do not support a complete
  58 language will be named after creeks and streams.  This one is Jamison
  59 Creek.
  60
  61 Once we have something reasonably resembling a complete language, the
  62 names of rivers will be used.
  63 Early versions of the compiler will be named after seas.  Major
  64 releases of the compiler will be named after oceans.  Hopefully I will
  65 be finished once I get to the Pacific Ocean release.
  66
  67 ## Outline
  68
  69 As well as parsing and executing a program, the interpreter can print
  70 out the program from the parsed internal structure.  This is useful
  71 for validating the parsing.
  72 So the main requirements of the interpreter are:
  73
  74 - Parse the program, possibly with tracing,
  75 - Analyse the parsed program to ensure consistency,
  76 - Print the program,
  77 - Execute the "main" function in the program, if no parsing or
  78   consistency errors were found.
  79
  80 This is all performed by a single C program extracted with
  81 `parsergen`.
  82
   83 There will be two formats for printing the program: a default and one
   84 that uses bracketing.  The `--brackets` command line option selects
   85 the latter.  Normally the first code section found is used; however, an
   86 alternate section can be requested so that a file (such as this one)
   87 can contain multiple programs.  This is effected with the `--section`
   88 option.
  89
  90 This code must be compiled with `-fplan9-extensions` so that anonymous
  91 structures can be used.
  92
  93 ###### File: oceani.mk
  94
  95         myCFLAGS := -Wall -g -fplan9-extensions
  96         CFLAGS := $(filter-out $(myCFLAGS),$(CFLAGS)) $(myCFLAGS)
  97         myLDLIBS:= libparser.o libscanner.o libmdcode.o -licuuc
  98         LDLIBS := $(filter-out $(myLDLIBS),$(LDLIBS)) $(myLDLIBS)
  99         ## libs
 100         all :: $(LDLIBS) oceani
 101         oceani.c oceani.h : oceani.mdc parsergen
 102                 ./parsergen -o oceani --LALR --tag Parser oceani.mdc
 103         oceani.mk: oceani.mdc md2c
 104                 ./md2c oceani.mdc
 105
 106         oceani: oceani.o $(LDLIBS)
 107                 $(CC) $(CFLAGS) -o oceani oceani.o $(LDLIBS)
 108
 109 ###### Parser: header
 110         ## macros
 111         struct parse_context;
 112         ## ast
 113         struct parse_context {
 114                 struct token_config config;
 115                 char *file_name;
 116                 int parse_error;
 117                 struct exec *prog;
 118                 ## parse context
 119         };
 120
 121 ###### macros
 122
 123         #define container_of(ptr, type, member) ({                      \
 124                 const typeof( ((type *)0)->member ) *__mptr = (ptr);    \
 125                 (type *)( (char *)__mptr - offsetof(type,member) );})
 126
 127         #define config2context(_conf) container_of(_conf, struct parse_context, \
 128                 config)
 129
 130 ###### Parser: reduce
 131         struct parse_context *c = config2context(config);
 132
 133 ###### Parser: code
 134
 135         #include <unistd.h>
 136         #include <stdlib.h>
 137         #include <fcntl.h>
 138         #include <errno.h>
 139         #include <sys/mman.h>
 140         #include <string.h>
 141         #include <stdio.h>
 142         #include <locale.h>
 143         #include <malloc.h>
 144         #include "mdcode.h"
 145         #include "scanner.h"
 146         #include "parser.h"
 147
 148         ## includes
 149
 150         #include "oceani.h"
 151
 152         ## forward decls
 153         ## value functions
 154         ## ast functions
 155         ## core functions
 156
 157         #include <getopt.h>
 158         static char Usage[] =
 159                 "Usage: oceani --trace --print --noexec --brackets --section=SectionName prog.ocn\n";
 160         static const struct option long_options[] = {
 161                 {"trace",     0, NULL, 't'},
 162                 {"print",     0, NULL, 'p'},
 163                 {"noexec",    0, NULL, 'n'},
 164                 {"brackets",  0, NULL, 'b'},
 165                 {"section",   1, NULL, 's'},
 166                 {NULL,        0, NULL, 0},
 167         };
  168         const char *options = "tpnbs:";
 169
 170         static void pr_err(char *msg)                   // NOTEST
 171         {
 172                 fprintf(stderr, "%s\n", msg);           // NOTEST
 173         }                                               // NOTEST
 174
 175         int main(int argc, char *argv[])
 176         {
 177                 int fd;
 178                 int len;
 179                 char *file;
 180                 struct section *s, *ss;
 181                 char *section = NULL;
 182                 struct parse_context context = {
 183                         .config = {
 184                                 .ignored = (1 << TK_mark),
 185                                 .number_chars = ".,_+- ",
 186                                 .word_start = "_",
 187                                 .word_cont = "_",
 188                         },
 189                 };
 190                 int doprint=0, dotrace=0, doexec=1, brackets=0;
 191                 int opt;
 192                 while ((opt = getopt_long(argc, argv, options, long_options, NULL))
 193                        != -1) {
 194                         switch(opt) {
 195                         case 't': dotrace=1; break;
 196                         case 'p': doprint=1; break;
 197                         case 'n': doexec=0; break;
 198                         case 'b': brackets=1; break;
 199                         case 's': section = optarg; break;
  200                         default: fputs(Usage, stderr);
 201                                 exit(1);
 202                         }
 203                 }
 204                 if (optind >= argc) {
 205                         fprintf(stderr, "oceani: no input file given\n");
 206                         exit(1);
 207                 }
 208                 fd = open(argv[optind], O_RDONLY);
 209                 if (fd < 0) {
 210                         fprintf(stderr, "oceani: cannot open %s\n", argv[optind]);
 211                         exit(1);
 212                 }
 213                 context.file_name = argv[optind];
 214                 len = lseek(fd, 0, 2);
 215                 file = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
 216                 s = code_extract(file, file+len, pr_err);
 217                 if (!s) {
 218                         fprintf(stderr, "oceani: could not find any code in %s\n",
 219                                 argv[optind]);
 220                         exit(1);
 221                 }
 222
 223                 ## context initialization
 224
 225                 if (section) {
 226                         for (ss = s; ss; ss = ss->next) {
 227                                 struct text sec = ss->section;
 228                                 if (sec.len == strlen(section) &&
 229                                     strncmp(sec.txt, section, sec.len) == 0)
 230                                         break;
 231                         }
 232                         if (!ss) {
 233                                 fprintf(stderr, "oceani: cannot find section %s\n",
 234                                         section);
 235                                 exit(1);
 236                         }
 237                 } else
 238                         ss = s;                         // NOTEST
 239                 if (!ss->code) {
 240                         fprintf(stderr, "oceani: no code found in requested section\n");        // NOTEST
 241                         exit(1);                        // NOTEST
 242                 }
 243
 244                 parse_oceani(ss->code, &context.config, dotrace ? stderr : NULL);
 245
 246                 if (!context.prog) {
 247                         fprintf(stderr, "oceani: no main function found.\n");
 248                         context.parse_error = 1;
 249                 }
 250                 if (context.prog && !context.parse_error) {
 251                         if (!analyse_prog(context.prog, &context)) {
 252                                 fprintf(stderr, "oceani: type error in program - not running.\n");
 253                                 context.parse_error = 1;
 254                         }
 255                 }
 256                 if (context.prog && doprint) {
 257                         ## print const decls
 258                         ## print type decls
 259                         print_exec(context.prog, 0, brackets);
 260                 }
 261                 if (context.prog && doexec && !context.parse_error)
 262                         interp_prog(&context, context.prog, argc - optind, argv+optind);
 263                 free_exec(context.prog);
 264
 265                 while (s) {
 266                         struct section *t = s->next;
 267                         code_free(s->code);
 268                         free(s);
 269                         s = t;
 270                 }
 271                 ## free global vars
 272                 ## free context types
 273                 ## free context storage
 274                 exit(context.parse_error ? 1 : 0);
 275         }
 276
 277 ### Analysis
 278
 279 The four requirements of parse, analyse, print, interpret apply to
 280 each language element individually so that is how most of the code
 281 will be structured.
 282
 283 Three of the four are fairly self explanatory.  The one that requires
 284 a little explanation is the analysis step.
 285
 286 The current language design does not require the types of variables to
 287 be declared, but they must still have a single type.  Different
 288 operations impose different requirements on the variables, for example
 289 addition requires both arguments to be numeric, and assignment
 290 requires the variable on the left to have the same type as the
 291 expression on the right.
 292
 293 Analysis involves propagating these type requirements around and
 294 consequently setting the type of each variable.  If any requirements
 295 are violated (e.g. a string is compared with a number) or if a
 296 variable needs to have two different types, then an error is raised
 297 and the program will not run.
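
The propagation described above can be modelled in a few lines of C.  This is a stand-alone illustration under simplifying assumptions, not part of the extracted interpreter: `struct stype`, `struct svar` and `require_type()` are hypothetical names, and a type that is not yet known is represented by `NULL`.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the interpreter's types. */
struct stype { const char *name; };
static struct stype num_type = { "number" };
static struct stype str_type = { "string" };

/* A variable whose type is fixed by its first typed usage. */
struct svar {
	const char *name;
	struct stype *type;	/* NULL until some usage sets it */
};

/* Impose the requirement `want` on `v`.
 * Returns 1 if the requirement can be met, 0 on a conflict.
 */
static int require_type(struct svar *v, struct stype *want)
{
	if (!want)
		return 1;	/* this context imposes no requirement */
	if (!v->type) {
		v->type = want;	/* the first requirement sets the type */
		return 1;
	}
	if (v->type == want)
		return 1;
	fprintf(stderr, "error: %s has type %s but %s is required\n",
		v->name, v->type->name, want->name);
	return 0;
}
```

With `hello` first used as a string, `require_type(&hello, &str_type)` succeeds and records the type, and a later `require_type(&hello, &num_type)` reports the conflict.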
 298
  299 If the same variable is declared in both branches of an 'if/else', or
 300 in all cases of a 'switch' then the multiple instances may be merged
 301 into just one variable if the variable is referenced after the
 302 conditional statement.  When this happens, the types must naturally be
 303 consistent across all the branches.  When the variable is not used
 304 outside the if, the variables in the different branches are distinct
 305 and can be of different types.
 306
 307 Undeclared names may only appear in "use" statements and "case" expressions.
 308 These names are given a type of "label" and a unique value.
 309 This allows them to fill the role of a name in an enumerated type, which
 310 is useful for testing the `switch` statement.
 311
 312 As we will see, the condition part of a `while` statement can return
 313 either a Boolean or some other type.  This requires that the expected
 314 type that gets passed around comprises a type and a flag to indicate
 315 that `Tbool` is also permitted.
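
As a rough model of an expected type that carries such a flag, consider the sketch below.  It is a stand-alone illustration only: `struct stype`, the flag name `BOOL_OK` and the function `acceptable()` are hypothetical, and `NULL` stands for "no particular type expected".

```c
#include <stddef.h>

/* Hypothetical stand-ins for the interpreter's types. */
struct stype { const char *name; };
static struct stype bool_type = { "Boolean" };
static struct stype num_type = { "number" };
static struct stype str_type = { "string" };

/* Hypothetical flag: a Boolean is acceptable in addition to `want`. */
enum { BOOL_OK = 1 };

/* Does a value of type `have` satisfy the expectation (want, rules)? */
static int acceptable(struct stype *want, int rules, struct stype *have)
{
	if ((rules & BOOL_OK) && have == &bool_type)
		return 1;	/* the flag admits a Boolean */
	if (!want || !have)
		return 1;	/* an unknown type matches anything */
	return want == have;
}
```

So a `while` condition that expects a number with `BOOL_OK` set accepts either a number or a Boolean, but still rejects a string.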
 316
 317 As there are, as yet, no distinct types that are compatible, there
 318 isn't much subtlety in the analysis.  When we have distinct number
 319 types, this will become more interesting.
 320
 321 #### Error reporting
 322
 323 When analysis discovers an inconsistency it needs to report an error;
 324 just refusing to run the code ensures that the error doesn't cascade,
  325 but by itself it isn't very useful.  A clear understanding of the sorts
  326 of error messages that are useful will help guide the process of
  327 analysis.
 328
 329 At a simplistic level, the only sort of error that type analysis can
 330 report is that the type of some construct doesn't match a contextual
 331 requirement.  For example, in `4 + "hello"` the addition provides a
 332 contextual requirement for numbers, but `"hello"` is not a number.  In
 333 this particular example no further information is needed as the types
 334 are obvious from local information.  When a variable is involved that
 335 isn't the case.  It may be helpful to explain why the variable has a
 336 particular type, by indicating the location where the type was set,
 337 whether by declaration or usage.
 338
 339 Using a recursive-descent analysis we can easily detect a problem at
 340 multiple locations. In "`hello:= "there"; 4 + hello`" the addition
 341 will detect that one argument is not a number and the usage of `hello`
 342 will detect that a number was wanted, but not provided.  In this
 343 (early) version of the language, we will generate error reports at
 344 multiple locations, so the use of `hello` will report an error and
  345 explain where the value was set, and the addition will report an error
 346 and say why numbers are needed.  To be able to report locations for
 347 errors, each language element will need to record a file location
 348 (line and column) and each variable will need to record the language
 349 element where its type was set.  For now we will assume that each line
 350 of an error message indicates one location in the file, and up to 2
 351 types.  So we provide a `printf`-like function which takes a format, a
 352 location (a `struct exec` which has not yet been introduced), and 2
 353 types. "`%1`" reports the first type, "`%2`" reports the second.  We
 354 will need a function to print the location, once we know how that is
  355 stored.  As will be explained later, there are sometimes extra rules for
  356 type matching which might affect error messages, so we need to pass those
  357 in too.
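
The "`%1`"/"`%2`" convention can be tried in isolation with the sketch below.  It is a stand-alone illustration, not part of the extracted interpreter: the hypothetical helper `expand_types()` takes the type names as plain strings and writes into a caller-supplied buffer rather than printing to `stderr`, and it omits the file location.

```c
#include <stdio.h>
#include <string.h>

/* Expand "%1" and "%2" in `fmt` to the names `t1` and `t2`, writing
 * at most n-1 characters into `out`.  "%%" yields a literal '%' and
 * any other escape yields '?'.  Returns `out`.
 */
static char *expand_types(char *out, size_t n, const char *fmt,
			  const char *t1, const char *t2)
{
	size_t used = 0;

	if (n == 0)
		return out;
	out[0] = '\0';
	for (; *fmt; fmt++) {
		char one[2] = { 0, 0 };
		const char *piece;
		size_t len;

		if (*fmt != '%') {
			one[0] = *fmt;
			piece = one;
		} else {
			fmt++;
			switch (*fmt) {
			case '1': piece = t1; break;
			case '2': piece = t2; break;
			case '%': piece = "%"; break;
			case '\0': return out;	/* lone trailing '%' */
			default: piece = "?"; break;
			}
		}
		len = strlen(piece);
		if (used + len >= n)
			break;			/* truncate, don't overflow */
		memcpy(out + used, piece, len + 1);
		used += len;
	}
	return out;
}
```

For example, the format `"expected %1 but found %2"` with the names `number` and `string` expands to `expected number but found string`.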
 358
 359 As well as type errors, we sometimes need to report problems with
 360 tokens, which might be unexpected or might name a type that has not
 361 been defined.  For these we have `tok_err()` which reports an error
 362 with a given token.  Each of the error functions sets the flag in the
  363 context to indicate that parsing failed.
 364
 365 ###### forward decls
 366
 367         static void fput_loc(struct exec *loc, FILE *f);
 368
 369 ###### core functions
 370
 371         static void type_err(struct parse_context *c,
 372                              char *fmt, struct exec *loc,
 373                              struct type *t1, int rules, struct type *t2)
 374         {
 375                 fprintf(stderr, "%s:", c->file_name);
 376                 fput_loc(loc, stderr);
 377                 for (; *fmt ; fmt++) {
 378                         if (*fmt != '%') {
 379                                 fputc(*fmt, stderr);
 380                                 continue;
 381                         }
 382                         fmt++;
 383                         switch (*fmt) {
 384                         case '%': fputc(*fmt, stderr); break;   // NOTEST
 385                         default: fputc('?', stderr); break;     // NOTEST
 386                         case '1':
 387                                 type_print(t1, stderr);
 388                                 break;
 389                         case '2':
 390                                 type_print(t2, stderr);
 391                                 break;
 392                         ## format cases
 393                         }
 394                 }
 395                 fputs("\n", stderr);
 396                 c->parse_error = 1;
 397         }
 398
 399         static void tok_err(struct parse_context *c, char *fmt, struct token *t)
 400         {
 401                 fprintf(stderr, "%s:%d:%d: %s: %.*s\n", c->file_name, t->line, t->col, fmt,
 402                         t->txt.len, t->txt.txt);
 403                 c->parse_error = 1;
 404         }
 405
 406 ## Entities: declared and predeclared.
 407
 408 There are various "things" that the language and/or the interpreter
 409 needs to know about to parse and execute a program.  These include
 410 types, variables, values, and executable code.  These are all lumped
 411 together under the term "entities" (calling them "objects" would be
 412 confusing) and introduced here.  The following section will present the
 413 different specific code elements which comprise or manipulate these
 414 various entities.
 415
 416 ### Types
 417
 418 Values come in a wide range of types, with more likely to be added.
 419 Each type needs to be able to print its own values (for convenience at
 420 least) as well as to compare two values, at least for equality and
 421 possibly for order.  For now, values might need to be duplicated and
 422 freed, though eventually such manipulations will be better integrated
 423 into the language.
 424
 425 Rather than requiring every numeric type to support all numeric
  426 operations (add, multiply, etc.), we allow types to present themselves
  427 as one of a few standard types: integer, float, and fraction.  The
  428 existence of these conversion functions will eventually enable types to
  429 determine if they are compatible with other types, though such types
  430 have not yet been implemented.
 431
  432 Named types are stored in a simple linked list.  Objects of each type are
 433 "values" which are often passed around by value.
 434
 435 ###### ast
 436
 437         struct value {
 438                 union {
 439                         char ptr[1];
 440                         ## value union fields
 441                 };
 442         };
 443
 444         struct type {
 445                 struct text name;
 446                 struct type *next;
 447                 int size, align;
 448                 void (*init)(struct type *type, struct value *val);
 449                 void (*prepare_type)(struct parse_context *c, struct type *type, int parse_time);
 450                 void (*print)(struct type *type, struct value *val);
 451                 void (*print_type)(struct type *type, FILE *f);
 452                 int (*cmp_order)(struct type *t1, struct type *t2,
 453                                  struct value *v1, struct value *v2);
 454                 int (*cmp_eq)(struct type *t1, struct type *t2,
 455                               struct value *v1, struct value *v2);
 456                 void (*dup)(struct type *type, struct value *vold, struct value *vnew);
 457                 void (*free)(struct type *type, struct value *val);
 458                 void (*free_type)(struct type *t);
 459                 long long (*to_int)(struct value *v);
 460                 double (*to_float)(struct value *v);
 461                 int (*to_mpq)(mpq_t *q, struct value *v);
 462                 ## type functions
 463                 union {
 464                         ## type union fields
 465                 };
 466         };
 467
 468 ###### parse context
 469
 470         struct type *typelist;
 471
 472 ###### ast functions
 473
 474         static struct type *find_type(struct parse_context *c, struct text s)
 475         {
 476                 struct type *l = c->typelist;
 477
 478                 while (l &&
 479                        text_cmp(l->name, s) != 0)
 480                                 l = l->next;
 481                 return l;
 482         }
 483
 484         static struct type *add_type(struct parse_context *c, struct text s,
 485                                      struct type *proto)
 486         {
 487                 struct type *n;
 488
 489                 n = calloc(1, sizeof(*n));
 490                 *n = *proto;
 491                 n->name = s;
 492                 n->next = c->typelist;
 493                 c->typelist = n;
 494                 return n;
 495         }
 496
 497         static void free_type(struct type *t)
 498         {
 499                 /* The type is always a reference to something in the
 500                  * context, so we don't need to free anything.
 501                  */
 502         }
 503
 504         static void free_value(struct type *type, struct value *v)
 505         {
 506                 if (type && v) {
 507                         type->free(type, v);
 508                         memset(v, 0x5a, type->size);
 509                 }
 510         }
 511
 512         static void type_print(struct type *type, FILE *f)
 513         {
 514                 if (!type)
 515                         fputs("*unknown*type*", f);     // NOTEST
 516                 else if (type->name.len)
 517                         fprintf(f, "%.*s", type->name.len, type->name.txt);
 518                 else if (type->print_type)
 519                         type->print_type(type, f);
 520                 else
 521                         fputs("*invalid*type*", f);     // NOTEST
 522         }
 523
 524         static void val_init(struct type *type, struct value *val)
 525         {
 526                 if (type && type->init)
 527                         type->init(type, val);
 528         }
 529
 530         static void dup_value(struct type *type,
 531                               struct value *vold, struct value *vnew)
 532         {
 533                 if (type && type->dup)
 534                         type->dup(type, vold, vnew);
 535         }
 536
 537         static int value_cmp(struct type *tl, struct type *tr,
 538                              struct value *left, struct value *right)
 539         {
 540                 if (tl && tl->cmp_order)
 541                         return tl->cmp_order(tl, tr, left, right);
 542                 if (tl && tl->cmp_eq)                   // NOTEST
 543                         return tl->cmp_eq(tl, tr, left, right); // NOTEST
 544                 return -1;                              // NOTEST
 545         }
 546
 547         static void print_value(struct type *type, struct value *v)
 548         {
 549                 if (type && type->print)
 550                         type->print(type, v);
 551                 else
 552                         printf("*Unknown*");            // NOTEST
 553         }
 554
 555 ###### forward decls
 556
 557         static void free_value(struct type *type, struct value *v);
 558         static int type_compat(struct type *require, struct type *have, int rules);
 559         static void type_print(struct type *type, FILE *f);
 560         static void val_init(struct type *type, struct value *v);
 561         static void dup_value(struct type *type,
 562                               struct value *vold, struct value *vnew);
 563         static int value_cmp(struct type *tl, struct type *tr,
 564                              struct value *left, struct value *right);
 565         static void print_value(struct type *type, struct value *v);
 566
 567 ###### free context types
 568
 569         while (context.typelist) {
 570                 struct type *t = context.typelist;
 571
 572                 context.typelist = t->next;
 573                 if (t->free_type)
 574                         t->free_type(t);
 575                 free(t);
 576         }
 577
  578 Types can be specified for local variables, for fields in a structure,
 579 for formal parameters to functions, and possibly elsewhere.  Different
 580 rules may apply in different contexts.  As a minimum, a named type may
 581 always be used.  Currently the type of a formal parameter can be
 582 different from types in other contexts, so we have a separate grammar
 583 symbol for those.
 584
 585 ###### Grammar
 586
 587         $*type
 588         Type -> IDENTIFIER ${
 589                 $0 = find_type(c, $1.txt);
 590                 if (!$0) {
 591                         tok_err(c,
 592                                 "error: undefined type", &$1);
 593
 594                         $0 = Tnone;
 595                 }
 596         }$
 597         ## type grammar
 598
 599         FormalType -> Type ${ $0 = $<1; }$
 600         ## formal type grammar
 601
 602 #### Base Types
 603
 604 Values of the base types can be numbers, which we represent as
 605 multi-precision fractions, strings, Booleans and labels.  When
 606 analysing the program we also need to allow for places where no value
 607 is meaningful (type `Tnone`) and where we don't know what type to
 608 expect yet (type is `NULL`).
 609
  610 Values are never shared; they are always copied when used, and freed
 611 when no longer needed.
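
The copy-then-free discipline can be seen with a string value alone.  The sketch below is a stand-alone illustration, not part of the extracted interpreter: `struct stext`, `stext_dup()` and `stext_free()` are hypothetical names, and the text is assumed to be a pointer plus a length with no terminating NUL.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a string value: bytes plus a length. */
struct stext {
	char *txt;
	int len;
};

/* Give `to` its own private copy of the bytes in `from`.
 * Returns 0 on success, -1 if memory could not be allocated.
 */
static int stext_dup(const struct stext *from, struct stext *to)
{
	to->len = from->len;
	/* Always allocate at least one byte so txt is never NULL. */
	to->txt = malloc(from->len ? (size_t)from->len : 1);
	if (!to->txt)
		return -1;
	memcpy(to->txt, from->txt, (size_t)from->len);
	return 0;
}

/* Release a copy once it is no longer needed. */
static void stext_free(struct stext *t)
{
	free(t->txt);
	t->txt = NULL;
	t->len = 0;
}
```

Because each copy owns its storage, changing or freeing the original never affects the copy, at the cost of an allocation per use.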
 612
 613 When propagating type information around the program, we need to
 614 determine if two types are compatible, where type `NULL` is compatible
 615 with anything.  There are two special cases with type compatibility,
 616 both related to the Conditional Statement which will be described
 617 later.  In some cases a Boolean can be accepted as well as some other
 618 primary type, and in others any type is acceptable except a label (`Vlabel`).
 619 A separate function encoding these cases will simplify some code later.
 620
 621 ###### type functions
 622
 623         int (*compat)(struct type *this, struct type *other);
 624
 625 ###### ast functions
 626
 627         static int type_compat(struct type *require, struct type *have, int rules)
 628         {
 629                 if ((rules & Rboolok) && have == Tbool)
 630                         return 1;       // NOTEST
 631                 if ((rules & Rnolabel) && have == Tlabel)
 632                         return 0;       // NOTEST
 633                 if (!require || !have)
 634                         return 1;
 635
 636                 if (require->compat)
 637                         return require->compat(require, have);
 638
 639                 return require == have;
 640         }
 641
 642 ###### includes
 643         #include <gmp.h>
 644         #include "parse_string.h"
 645         #include "parse_number.h"
 646
 647 ###### libs
 648         myLDLIBS := libnumber.o libstring.o -lgmp
 649         LDLIBS := $(filter-out $(myLDLIBS),$(LDLIBS)) $(myLDLIBS)
 650
 651 ###### type union fields
 652         enum vtype {Vnone, Vstr, Vnum, Vbool, Vlabel} vtype;
 653
 654 ###### value union fields
 655         struct text str;
 656         mpq_t num;
 657         unsigned char bool;
 658         void *label;
 659
 660 ###### ast functions
 661         static void _free_value(struct type *type, struct value *v)
 662         {
 663                 if (!v)
 664                         return;         // NOTEST
 665                 switch (type->vtype) {
 666                 case Vnone: break;
 667                 case Vstr: free(v->str.txt); break;
 668                 case Vnum: mpq_clear(v->num); break;
 669                 case Vlabel:
 670                 case Vbool: break;
 671                 }
 672         }
 673
 674 ###### value functions
 675
 676         static void _val_init(struct type *type, struct value *val)
 677         {
 678                 switch(type->vtype) {
 679                 case Vnone:             // NOTEST
 680                         break;          // NOTEST
 681                 case Vnum:
 682                         mpq_init(val->num); break;
 683                 case Vstr:
 684                         val->str.txt = malloc(1);
 685                         val->str.len = 0;
 686                         break;
 687                 case Vbool:
 688                         val->bool = 0;
 689                         break;
 690                 case Vlabel:
 691                         val->label = NULL;
 692                         break;
 693                 }
 694         }
 695
 696         static void _dup_value(struct type *type,
 697                                struct value *vold, struct value *vnew)
 698         {
 699                 switch (type->vtype) {
 700                 case Vnone:             // NOTEST
 701                         break;          // NOTEST
 702                 case Vlabel:
 703                         vnew->label = vold->label;
 704                         break;
 705                 case Vbool:
 706                         vnew->bool = vold->bool;
 707                         break;
 708                 case Vnum:
 709                         mpq_init(vnew->num);
 710                         mpq_set(vnew->num, vold->num);
 711                         break;
 712                 case Vstr:
 713                         vnew->str.len = vold->str.len;
 714                         vnew->str.txt = malloc(vnew->str.len);
 715                         memcpy(vnew->str.txt, vold->str.txt, vnew->str.len);
 716                         break;
 717                 }
 718         }
 719
 720         static int _value_cmp(struct type *tl, struct type *tr,
 721                               struct value *left, struct value *right)
 722         {
 723                 int cmp;
 724                 if (tl != tr)
 725                         return tl - tr; // NOTEST
 726                 switch (tl->vtype) {
 727                 case Vlabel: cmp = left->label == right->label ? 0 : 1; break;
 728                 case Vnum: cmp = mpq_cmp(left->num, right->num); break;
 729                 case Vstr: cmp = text_cmp(left->str, right->str); break;
 730                 case Vbool: cmp = left->bool - right->bool; break;
 731                 case Vnone: cmp = 0;                    // NOTEST
 732                 }
 733                 return cmp;
 734         }
 735
 736         static void _print_value(struct type *type, struct value *v)
 737         {
 738                 switch (type->vtype) {
 739                 case Vnone:                             // NOTEST
 740                         printf("*no-value*"); break;    // NOTEST
 741                 case Vlabel:                            // NOTEST
 742                         printf("*label-%p*", v->label); break; // NOTEST
 743                 case Vstr:
 744                         printf("%.*s", v->str.len, v->str.txt); break;
 745                 case Vbool:
 746                         printf("%s", v->bool ? "True":"False"); break;
 747                 case Vnum:
 748                         {
 749                         mpf_t fl;
 750                         mpf_init2(fl, 20);
 751                         mpf_set_q(fl, v->num);
 752                         gmp_printf("%Fg", fl);
 753                         mpf_clear(fl);
 754                         break;
 755                         }
 756                 }
 757         }
 758
 759         static void _free_value(struct type *type, struct value *v);
 760
 761         static struct type base_prototype = {
 762                 .init = _val_init,
 763                 .print = _print_value,
 764                 .cmp_order = _value_cmp,
 765                 .cmp_eq = _value_cmp,
 766                 .dup = _dup_value,
 767                 .free = _free_value,
 768         };
 769
 770         static struct type *Tbool, *Tstr, *Tnum, *Tnone, *Tlabel;
 771
 772 ###### ast functions
 773         static struct type *add_base_type(struct parse_context *c, char *n,
 774                                           enum vtype vt, int size)
 775         {
 776                 struct text txt = { n, strlen(n) };
 777                 struct type *t;
 778
 779                 t = add_type(c, txt, &base_prototype);
 780                 t->vtype = vt;
 781                 t->size = size;
 782                 t->align = size > sizeof(void*) ? sizeof(void*) : size;
 783                 if (t->size & (t->align - 1))
 784                         t->size = (t->size | (t->align - 1)) + 1;       // NOTEST
 785                 return t;
 786         }
 787
 788 ###### context initialization
 789
 790         Tbool  = add_base_type(&context, "Boolean", Vbool, sizeof(char));
 791         Tstr   = add_base_type(&context, "string", Vstr, sizeof(struct text));
 792         Tnum   = add_base_type(&context, "number", Vnum, sizeof(mpq_t));
 793         Tnone  = add_base_type(&context, "none", Vnone, 0);
 794         Tlabel = add_base_type(&context, "label", Vlabel, sizeof(void*));
 795
 796 ### Variables
 797
 798 Variables are scoped named values.  We store the names in a linked list
 799 of "bindings" sorted in lexical order, and use sequential search and
 800 insertion sort.
 801
 802 ###### ast
 803
 804         struct binding {
 805                 struct text name;
 806                 struct binding *next;   // in lexical order
 807                 ## binding fields
 808         };
 809
 810 This linked list is stored in the parse context so that "reduce"
 811 functions can find or add variables, and so the analysis phase can
 812 ensure that every variable gets a type.
 813
 814 ###### parse context
 815
 816         struct binding *varlist;  // In lexical order
 817
 818 ###### ast functions
 819
 820         static struct binding *find_binding(struct parse_context *c, struct text s)
 821         {
 822                 struct binding **l = &c->varlist;
 823                 struct binding *n;
 824                 int cmp = 1;
 825
 826                 while (*l &&
 827                         (cmp = text_cmp((*l)->name, s)) < 0)
 828                                 l = & (*l)->next;
 829                 if (cmp == 0)
 830                         return *l;
 831                 n = calloc(1, sizeof(*n));
 832                 n->name = s;
 833                 n->next = *l;
 834                 *l = n;
 835                 return n;
 836         }
 837
 838 Each name can be linked to multiple variables defined in different
 839 scopes.  Each scope starts where the name is declared and continues
 840 until the end of the containing code block.  Scopes of a given name
 841 cannot nest, so a declaration while a name is in-scope is an error.
 842
 843 ###### binding fields
 844         struct variable *var;
 845
 846 ###### ast
 847         struct variable {
 848                 struct variable *previous;
 849                 struct type *type;
 850                 struct binding *name;
 851                 struct exec *where_decl;// where name was declared
 852                 struct exec *where_set; // where type was set
 853                 ## variable fields
 854         };
 855
 856 When a scope closes, the values of the variables might need to be freed.
 857 This happens in the context of some `struct exec` and each `exec` will
 858 need to know which variables need to be freed when it completes.
 859
 860 ####### exec fields
 861         struct variable *to_free;
 862
 863 ####### variable fields
 864         struct exec *cleanup_exec;
 865         struct variable *next_free;
 866
 867 ####### interp exec cleanup
 868         {
 869                 struct variable *v;
 870                 for (v = e->to_free; v; v = v->next_free) {
 871                         struct value *val = var_value(c, v);
 872                         free_value(v->type, val);
 873                 }
 874         }
 875
 876 ###### ast functions
 877         static void variable_unlink_exec(struct variable *v)
 878         {
 879                 struct variable **vp;
 880                 if (!v->cleanup_exec)
 881                         return;
 882                 for (vp = &v->cleanup_exec->to_free;
 883                     *vp; vp = &(*vp)->next_free) {
 884                         if (*vp != v)
 885                                 continue;
 886                         *vp = v->next_free;
 887                         v->cleanup_exec = NULL;
 888                         break;
 889                 }
 890         }
 891
 892 While the naming seems strange, we include local constants in the
 893 definition of variables.  A name declared `var := value` can
 894 subsequently be changed, but a name declared `var ::= value` cannot -
 895 it is constant.
 896
 897 ###### variable fields
 898         int constant;
 899
 900 Scopes in parallel branches can be partially merged.  More
 901 specifically, if a given name is declared in both branches of an
 902 if/else then its scope is a candidate for merging.  Similarly if
 903 every branch of an exhaustive switch (e.g. one with an "else" clause)
 904 declares a given name, then the scopes from the branches are
 905 candidates for merging.
 906
 907 Note that names declared inside a loop (which is only parallel to
 908 itself) are never visible after the loop.  Similarly names defined in
 909 scopes which are not parallel, such as those started by `for` and
 910 `switch`, are never visible after the scope.  Only variables defined in
 911 both `then` and `else` (including the implicit then after an `if`, and
 912 excluding `then` used with `for`) and in all `case`s and `else` of a
 913 `switch` or `while` can be visible beyond the `if`/`switch`/`while`.
 914
 915 Labels, which are a bit like variables, follow different rules.
 916 Labels are not explicitly declared, but if an undeclared name appears
 917 in a context where a label is legal, that effectively declares the
 918 name as a label.  The declaration remains in force (or in scope) at
 919 least to the end of the immediately containing block and conditionally
 920 in any larger containing block which does not declare the name in some
 921 other way.  Importantly, the conditional scope extension happens even
 922 if the label is only used in one parallel branch of a conditional --
 923 when used in one branch it is treated as having been declared in all
 924 branches.
 925
 926 Merge candidates are tentatively visible beyond the end of the
 927 branching statement which creates them.  If the name is used, the
 928 merge is affirmed and they become a single variable visible at the
 929 outer layer.  If not - if it is redeclared first - the merge lapses.
 930
 931 To track scopes we have an extra stack, implemented as a linked list,
 932 which roughly parallels the parse stack and which is used exclusively
 933 for scoping.  When a new scope is opened, a new frame is pushed and
 934 the child-count of the parent frame is incremented.  This child-count
 935 is used to distinguish between the first of a set of parallel scopes,
 936 in which declared variables must not be in scope, and subsequent
 937 branches, where they may already be conditionally scoped.
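
As a rough illustration of how the child-count separates the first parallel branch from later ones, here is a minimal stand-alone sketch.  The names `struct frame`, `frame_push()`, `frame_pop()` and `closed_first_branch()` are invented for this example and are not part of the interpreter; the sketch only assumes the push/increment behaviour described above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a scope frame; illustrative only. */
struct frame {
	struct frame *parent;
	int child_count;	/* number of child scopes opened so far */
};

/* Open a new scope below 'top' and count it in the parent. */
static struct frame *frame_push(struct frame *top)
{
	struct frame *f = calloc(1, sizeof(*f));

	assert(f != NULL);
	if (top)
		top->child_count += 1;
	f->parent = top;
	return f;
}

/* Close the scope 'top' and return its parent. */
static struct frame *frame_pop(struct frame *top)
{
	struct frame *parent = top->parent;

	free(top);
	return parent;
}

/* Once a branch has been popped its parent is on top again.
 * A count of 1 means the branch just closed was the first sibling.
 */
static int closed_first_branch(const struct frame *parent)
{
	return parent->child_count == 1;
}
```

Pushing an outer frame and then pushing and popping two branches in turn leaves the count at 1 after the first branch and 2 after the second, so only the first branch is treated as starting with a clean slate.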
 938
 939 To push a new frame *before* any code in the frame is parsed, we need a
 940 grammar reduction.  This is most easily achieved with a grammar
 941 element which derives the empty string, and creates the new scope when
 942 it is recognised.  This can be placed, for example, between a keyword
 943 like "if" and the code following it.
 944
 945 ###### ast
 946         struct scope {
 947                 struct scope *parent;
 948                 int child_count;
 949         };
 950
 951 ###### parse context
 952         int scope_depth;
 953         struct scope *scope_stack;
 954
 955 ###### ast functions
 956         static void scope_pop(struct parse_context *c)
 957         {
 958                 struct scope *s = c->scope_stack;
 959
 960                 c->scope_stack = s->parent;
 961                 free(s);
 962                 c->scope_depth -= 1;
 963         }
 964
 965         static void scope_push(struct parse_context *c)
 966         {
 967                 struct scope *s = calloc(1, sizeof(*s));
 968                 if (c->scope_stack)
 969                         c->scope_stack->child_count += 1;
 970                 s->parent = c->scope_stack;
 971                 c->scope_stack = s;
 972                 c->scope_depth += 1;
 973         }
 974
 975 ###### Grammar
 976
 977         $void
 978         OpenScope -> ${ scope_push(c); }$
 979
 980 Each variable records a scope depth and is in one of four states:
 981
 982 - "in scope".  This is the case between the declaration of the
 983   variable and the end of the containing block, and also between
 984   the usage which affirms a merge and the end of that block.
 985
 986   The scope depth is not greater than the current parse context scope
 987   nest depth.  When the block of that depth closes, the state will
 988   change.  To achieve this, all "in scope" variables are linked
 989   together as a stack in nesting order.
 990
 991 - "pending".  The "in scope" block has closed, but other parallel
 992   scopes are still being processed.  So far, every parallel block at
 993   the same level that has closed has declared the name.
 994
 995   The scope depth is the depth of the last parallel block that
 996   enclosed the declaration, and that has closed.
 997
 998 - "conditionally in scope".  The "in scope" block and all parallel
 999   scopes have closed, and no further mention of the name has been seen.
1000   This state includes a secondary nest depth (`min_depth`) which records
1001   the outermost scope seen since the variable became conditionally in
1002   scope.  If a use of the name is found, the variable becomes "in scope"
1003   and that secondary depth becomes the recorded scope depth.  If the
1004   name is declared as a new variable, the old variable becomes "out of
1005   scope" and the recorded scope depth stays unchanged.
1006
1007 - "out of scope".  The variable is neither in scope nor conditionally
1008   in scope.  It is permanently out of scope now and can be removed from
1009   the "in scope" stack.
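
The transitions triggered by a use or a redeclaration of a name can be sketched for a single variable as follows.  This is a simplified model, not the interpreter's code: `struct var_state`, `on_use()` and `on_redeclare()` are hypothetical names, merging of homonyms is left out, and block closing is not modelled at all.

```c
#include <assert.h>

enum scope_state { OutScope, PendingScope, CondScope, InScope };

/* Hypothetical, cut-down view of one variable's scope bookkeeping. */
struct var_state {
	enum scope_state scope;
	int depth;	/* recorded scope depth */
	int min_depth;	/* outermost depth seen while conditionally in scope */
};

/* The name is used.  Returns 1 if the use is legal, 0 if it is an error. */
static int on_use(struct var_state *v)
{
	switch (v->scope) {
	case InScope:
		return 1;
	case CondScope:
		/* The use affirms the variable at the outermost depth seen. */
		v->depth = v->min_depth;
		v->scope = InScope;
		return 1;
	default:
		/* Pending or out-of-scope names cannot be used. */
		return 0;
	}
}

/* The name is declared again.  Returns 1 if a new variable may be
 * created, 0 if the declaration is an error.
 */
static int on_redeclare(struct var_state *v)
{
	if (v->scope == InScope)
		return 0;
	if (v->scope == CondScope)
		v->scope = OutScope;	/* the condition lapses, depth unchanged */
	return 1;
}
```

A conditionally scoped variable with depth 3 and `min_depth` 1 therefore ends up "in scope" at depth 1 after a use, or "out of scope" with depth still 3 after a redeclaration.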
1010
1011 ###### variable fields
1012         int depth, min_depth;
1013         enum { OutScope, PendingScope, CondScope, InScope } scope;
1014         struct variable *in_scope;
1015
1016 ###### parse context
1017
1018         struct variable *in_scope;
1019
1020 All variables with the same name are linked together using the
1021 'previous' link.  Those variables that have been affirmatively merged all
1022 have a 'merged' pointer that points to one primary variable - the most
1023 recently declared instance.  When merging variables, we need to also
1024 adjust the 'merged' pointer on any other variables that had previously
1025 been merged with the one that will no longer be primary.
1026
1027 A variable that is no longer the most recent instance of a name may
1028 still have "pending" scope, if it might still be merged with the most
1029 recent instance.  These variables don't really belong in the
1030 "in_scope" list, but are not immediately removed when a new instance
1031 is found.  Instead, they are detected and ignored when considering the
1032 list of in_scope names.
1033
1034 The storage of the value of a variable will be described later.  For now
1035 we just need to know that when a variable goes out of scope, it might
1036 need to be freed.  For this we need to be able to find it, so assume that
1037 `var_value()` will provide that.
1038
1039 ###### variable fields
1040         struct variable *merged;
1041
1042 ###### ast functions
1043
1044         static void variable_merge(struct variable *primary, struct variable *secondary)
1045         {
1046                 struct variable *v;
1047
1048                 primary = primary->merged;
1049
1050                 for (v = primary->previous; v; v=v->previous)
1051                         if (v == secondary || v == secondary->merged ||
1052                             v->merged == secondary ||
1053                             v->merged == secondary->merged) {
1054                                 v->scope = OutScope;
1055                                 v->merged = primary;
1056                                 variable_unlink_exec(v);
1057                         }
1058         }
1059
1060 ###### forward decls
1061         static struct value *var_value(struct parse_context *c, struct variable *v);
1062
1063 ###### free global vars
1064
1065         while (context.varlist) {
1066                 struct binding *b = context.varlist;
1067                 struct variable *v = b->var;
1068                 context.varlist = b->next;
1069                 free(b);
1070                 while (v) {
1071                         struct variable *t = v;
1072
1073                         v = t->previous;
1074                         if (t->global) {
1075                                 free_value(t->type, var_value(&context, t));
1076                                 if (t->depth == 0)
1077                                         free_exec(t->where_decl);
1078                         }
1079                         free(t);
1080                 }
1081         }
1082
1083 #### Manipulating Bindings
1084
1085 When a name is conditionally visible, a new declaration discards the
1086 old binding - the condition lapses.  Conversely a usage of the name
1087 affirms the visibility and extends it to the end of the containing
1088 block - i.e. the block that contains both the original declaration and
1089 the latest usage.  This is determined from `min_depth`.  When a
1090 conditionally visible variable gets affirmed like this, it is also
1091 merged with other conditionally visible variables with the same name.
1092
1093 When we parse a variable declaration we either report an error if the
1094 name is currently bound, or create a new variable at the current nest
1095 depth if the name is unbound or bound to a conditionally scoped or
1096 pending-scope variable.  If the previous variable was conditionally
1097 scoped, it and its homonyms become out-of-scope.
1098
1099 When we parse a variable reference (including non-declarative assignment
1100 "foo = bar") we report an error if the name is not bound or is bound to
1101 a pending-scope variable; update the scope if the name is bound to a
1102 conditionally scoped variable; or just proceed normally if the named
1103 variable is in scope.
1104
1105 When we exit a scope, any variables bound at this level are either
1106 marked out of scope or pending-scoped, depending on whether the scope
1107 was sequential or parallel.  Here a "parallel" scope means the "then"
1108 or "else" part of a conditional, or any "case" or "else" branch of a
1109 switch.  Other scopes are "sequential".
1110
1111 When exiting a parallel scope we check if there are any variables that
1112 were previously pending and are still visible. If there are, then
1113 they weren't redeclared in the most recent scope, so they cannot be
1114 merged and must become out-of-scope.  If this is not the first of the
1115 parallel scopes (based on `child_count`), we check that there was a
1116 previous binding that is still pending-scope.  If there isn't, the new
1117 variable must now be out-of-scope.
1118
1119 When exiting a sequential scope that immediately enclosed parallel
1120 scopes, we need to resolve any pending-scope variables.  If there was
1121 no `else` clause, and we cannot determine that the `switch` was exhaustive,
1122 we need to mark all pending-scope variables as out-of-scope.  Otherwise
1123 all pending-scope variables become conditionally scoped.
1124
1125 ###### ast
1126         enum closetype { CloseSequential, CloseParallel, CloseElse };
1127
1128 ###### ast functions
1129
1130         static struct variable *var_decl(struct parse_context *c, struct text s)
1131         {
1132                 struct binding *b = find_binding(c, s);
1133                 struct variable *v = b->var;
1134
1135                 switch (v ? v->scope : OutScope) {
1136                 case InScope:
1137                         /* Caller will report the error */
1138                         return NULL;
1139                 case CondScope:
1140                         for (;
1141                              v && v->scope == CondScope;
1142                              v = v->previous)
1143                                 v->scope = OutScope;
1144                         break;
1145                 default: break;
1146                 }
1147                 v = calloc(1, sizeof(*v));
1148                 v->previous = b->var;
1149                 b->var = v;
1150                 v->name = b;
1151                 v->merged = v;
1152                 v->min_depth = v->depth = c->scope_depth;
1153                 v->scope = InScope;
1154                 v->in_scope = c->in_scope;
1155                 c->in_scope = v;
1156                 return v;
1157         }
1158
1159         static struct variable *var_ref(struct parse_context *c, struct text s)
1160         {
1161                 struct binding *b = find_binding(c, s);
1162                 struct variable *v = b->var;
1163                 struct variable *v2;
1164
1165                 switch (v ? v->scope : OutScope) {
1166                 case OutScope:
1167                 case PendingScope:
1168                         /* Caller will report the error */
1169                         return NULL;
1170                 case CondScope:
1171                         /* All CondScope variables of this name need to be merged
1172                          * and become InScope
1173                          */
1174                         v->depth = v->min_depth;
1175                         v->scope = InScope;
1176                         for (v2 = v->previous;
1177                              v2 && v2->scope == CondScope;
1178                              v2 = v2->previous)
1179                                 variable_merge(v, v2);
1180                         break;
1181                 case InScope:
1182                         break;
1183                 }
1184                 return v;
1185         }
1186
1187         static void var_block_close(struct parse_context *c, enum closetype ct,
1188                                     struct exec *e)
1189         {
1190                 /* Close off all variables that are in_scope.
1191                  * Some variables in c->in_scope may already be not-in-scope,
1192                  * such as when a PendingScope variable is hidden by a new
1193                  * variable with the same name.
1194                  * So we check for v->name->var != v and drop them.
1195                  * If we choose to make a variable OutScope, we drop it
1196                  * immediately too.
1197                  */
1198                 struct variable *v, **vp, *v2;
1199
1200                 scope_pop(c);
1201                 for (vp = &c->in_scope;
1202                      (v = *vp) && v->min_depth > c->scope_depth;
1203                      (v->scope == OutScope || v->name->var != v)
1204                      ? (*vp =  v->in_scope, 0)
1205                      : ( vp = &v->in_scope, 0)) {
1206                         v->min_depth = c->scope_depth;
1207                         if (v->name->var != v)
1208                                 /* This is still in scope, but we haven't just
1209                                  * closed the scope.
1210                                  */
1211                                 continue;
1212                         v->min_depth = c->scope_depth;
1213                         if (v->scope == InScope) {
1214                                 /* This variable gets cleaned up when 'e' finishes */
1215                                 variable_unlink_exec(v);
1216                                 v->cleanup_exec = e;
1217                                 v->next_free = e->to_free;
1218                                 e->to_free = v;
1219                         }
1220                         switch (ct) {
1221                         case CloseElse:
1222                         case CloseParallel: /* handle PendingScope */
1223                                 switch(v->scope) {
1224                                 case InScope:
1225                                 case CondScope:
1226                                         if (c->scope_stack->child_count == 1)
1227                                                 /* first among parallel branches */
1228                                                 v->scope = PendingScope;
1229                                         else if (v->previous &&
1230                                                  v->previous->scope == PendingScope)
1231                                                 /* all previous branches used name */
1232                                                 v->scope = PendingScope;
1233                                         else if (v->type == Tlabel)
1234                                                 /* Labels remain pending even when not used */
1235                                                 v->scope = PendingScope;        // UNTESTED
1236                                         else
1237                                                 v->scope = OutScope;
1238                                         if (ct == CloseElse) {
1239                                                 /* All Pending variables with this name
1240                                                  * are now Conditional */
1241                                                 for (v2 = v;
1242                                                      v2 && v2->scope == PendingScope;
1243                                                      v2 = v2->previous)
1244                                                         v2->scope = CondScope;
1245                                         }
1246                                         break;
1247                                 case PendingScope:
1248                                         /* Not possible as it would require
1249                                          * parallel scope to be nested immediately
1250                                          * in a parallel scope, and that never
1251                                          * happens.
1252                                          */                     // NOTEST
1253                                 case OutScope:
1254                                         /* Not possible as we already tested for
1255                                          * OutScope
1256                                          */
1257                                         abort();                // NOTEST
1258                                 }
1259                                 break;
1260                         case CloseSequential:
1261                                 if (v->type == Tlabel)
1262                                         v->scope = PendingScope;
1263                                 switch (v->scope) {
1264                                 case InScope:
1265                                         v->scope = OutScope;
1266                                         break;
1267                                 case PendingScope:
1268                                         /* There was no 'else', so we can only become
1269                                          * conditional if we know the cases were exhaustive,
1270                                          * and that doesn't mean anything yet.
1271                                          * So only labels become conditional.
1272                                          */
1273                                         for (v2 = v;
1274                                              v2 && v2->scope == PendingScope;
1275                                              v2 = v2->previous)
1276                                                 if (v2->type == Tlabel)
1277                                                         v2->scope = CondScope;
1278                                                 else
1279                                                         v2->scope = OutScope;
1280                                         break;
1281                                 case CondScope:
1282                                 case OutScope: break;
1283                                 }
1284                                 break;
1285                         }
1286                 }
1287         }
1288
1289 #### Storing Values
1290
1291 The value of a variable is stored separately from the variable, on an
1292 analogue of a stack frame.  There are (currently) two frames that can be
1293 active: a global frame which currently only stores constants, and a
1294 stacked frame which stores local variables.  Each variable knows if it
1295 is global or not, and what its index into the frame is.
1296
1297 Values in the global frame are known as soon as they are relevant, so
1298 the frame needs to be reallocated as it grows so it can store those
1299 values.  The local frame doesn't get values until the interpretation phase
1300 starts, so there is no need to allocate until the size is known.
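
A grow-on-demand frame of this kind might look roughly like the sketch below.  `struct frame_buf` and `frame_reserve()` are hypothetical names, and rounding the allocation up to the next multiple of 1024 bytes is an assumption made for the example rather than the exact policy used by the interpreter.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable frame; not the interpreter's parse context. */
struct frame_buf {
	char *data;
	size_t used;	/* bytes handed out so far */
	size_t alloc;	/* bytes actually allocated */
};

/* Make sure 'used' bytes are backed by storage.  Newly added bytes
 * are zeroed.  Returns 0 on success, -1 if allocation fails.
 */
static int frame_reserve(struct frame_buf *f)
{
	if (f->used > f->alloc) {
		size_t old = f->alloc;
		/* assumed policy: next multiple of 1024 above 'used' */
		size_t want = (f->used | 1023) + 1;
		char *p = realloc(f->data, want);

		if (!p)
			return -1;
		memset(p + old, 0, want - old);
		f->data = p;
		f->alloc = want;
	}
	return 0;
}
```

Because `realloc()` may move the block, any pointer obtained from the frame before a call that grows it should be treated as stale afterwards; positions within the frame stay valid.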
1301
1302 ###### variable fields
1303                 short frame_pos;
1304                 short global;
1305
1306 ###### parse context
1307
1308         short global_size, global_alloc;
1309         short local_size;
1310         void *global, *local;
1311
1312 ###### ast functions
1313
1314         static struct value *var_value(struct parse_context *c, struct variable *v)
1315         {
1316                 if (!v->global) {
1317                         if (!c->local || !v->type)
1318                                 return NULL;                    // NOTEST
1319                         if (v->frame_pos + v->type->size > c->local_size) {
1320                                 printf("INVALID frame_pos\n");  // NOTEST
1321                                 exit(2);                        // NOTEST
1322                         }
1323                         return c->local + v->frame_pos;
1324                 }
1325                 if (c->global_size > c->global_alloc) {
1326                         int old = c->global_alloc;
1327                         c->global_alloc = (c->global_size | 1023) + 1024;
1328                         c->global = realloc(c->global, c->global_alloc);
1329                         memset(c->global + old, 0, c->global_alloc - old);
1330                 }
1331                 return c->global + v->frame_pos;
1332         }
1333
1334         static struct value *global_alloc(struct parse_context *c, struct type *t,
1335                                           struct variable *v, struct value *init)
1336         {
1337                 struct value *ret;
1338                 struct variable scratch;
1339
1340                 if (t->prepare_type)
1341                         t->prepare_type(c, t, 1);       // NOTEST
1342
1343                 if (c->global_size & (t->align - 1))
1344                         c->global_size = (c->global_size + t->align) & ~(t->align-1);   // UNTESTED
1345                 if (!v) {
1346                         v = &scratch;
1347                         v->type = t;
1348                 }
1349                 v->frame_pos = c->global_size;
1350                 v->global = 1;
1351                 c->global_size += v->type->size;
1352                 ret = var_value(c, v);
1353                 if (init)
1354                         memcpy(ret, init, t->size);
1355                 else
1356                         val_init(t, ret);
1357                 return ret;
1358         }
1359
1360 As global values are found -- struct field initializers, labels etc --
1361 `global_alloc()` is called to record the value in the global frame.
1362
1363 When the program is fully parsed, we need to walk the list of variables
1364 to find any that weren't merged away and that aren't global, and to
1365 calculate the frame size and assign a frame position for each variable.
1366 For this we have `scope_finalize()`.
1367
1368 ###### ast functions
1369
1370         static void scope_finalize(struct parse_context *c)
1371         {
1372                 struct binding *b;
1373
1374                 for (b = c->varlist; b; b = b->next) {
1375                         struct variable *v;
1376                         for (v = b->var; v; v = v->previous) {
1377                                 struct type *t = v->type;
1378                                 if (v->merged != v)
1379                                         continue;
1380                                 if (v->global)
1381                                         continue;
1382                                 if (c->local_size & (t->align - 1))
1383                                         c->local_size = (c->local_size + t->align) & ~(t->align-1);
1384                                 v->frame_pos = c->local_size;
1385                                 c->local_size += v->type->size;
1386                         }
1387                 }
1388                 c->local = calloc(1, c->local_size);
1389         }
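
The offset arithmetic can be tried out in isolation.  The following sketch lays out a few slots in a frame; `struct slot`, `align_up()` and `layout_frame()` are invented for the illustration, and every alignment is assumed to be a power of two.  It rounds unconditionally, which gives the same positions as adjusting only when the offset is misaligned.

```c
#include <assert.h>
#include <stddef.h>

/* Round 'offset' up to the next multiple of 'align'.
 * 'align' is assumed to be a non-zero power of two.
 */
static size_t align_up(size_t offset, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (offset + align - 1) & ~(align - 1);
}

/* Hypothetical slot description; not the interpreter's struct variable. */
struct slot {
	size_t size, align;
	size_t frame_pos;	/* filled in by layout_frame() */
};

/* Give each slot an aligned position, in order, and return the
 * total frame size.
 */
static size_t layout_frame(struct slot *slots, size_t n)
{
	size_t frame_size = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		frame_size = align_up(frame_size, slots[i].align);
		slots[i].frame_pos = frame_size;
		frame_size += slots[i].size;
	}
	return frame_size;
}
```

For slots of size/alignment 1/1, 8/8 and 16/8 this assigns positions 0, 8 and 16, and reports a frame size of 32 bytes.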
1390
1391 ###### free context storage
1392         free(context.global);
1393         free(context.local);
1394
1395 ### Executables
1396
1397 Executables can be lots of different things.  In many cases an
1398 executable is just an operation combined with one or two other
1399 executables.  This allows for expressions and lists etc.  Other times an
1400 executable is something quite specific like a constant or variable name.
1401 So we define a `struct exec` to be a general executable with a type, and
1402 a `struct binode` which is a subclass of `exec`, forms a node in a
1403 binary tree, and holds an operation.  There will be other subclasses,
1404 and to access these we need to be able to `cast` the `exec` into the
1405 various other types.  The first field in any `struct exec` is the type
1406 from the `exec_types` enum.
1407
1408 ###### macros
1409         #define cast(structname, pointer) ({            \
1410                 const typeof( ((struct structname *)0)->type) *__mptr = &(pointer)->type; \
1411                 if (__mptr && *__mptr != X##structname) abort();                \
1412                 (struct structname *)( (char *)__mptr);})
1413
1414         #define new(structname) ({                                              \
1415                 struct structname *__ptr = ((struct structname *)calloc(1,sizeof(struct structname))); \
1416                 __ptr->type = X##structname;                                            \
1417                 __ptr->line = -1; __ptr->column = -1;                                   \
1418                 __ptr;})
1419
1420         #define new_pos(structname, token) ({                                           \
1421                 struct structname *__ptr = ((struct structname *)calloc(1,sizeof(struct structname))); \
1422                 __ptr->type = X##structname;                                            \
1423                 __ptr->line = token.line; __ptr->column = token.col;                    \
1424                 __ptr;})
1425
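The macros above depend on GNU C extensions (statement expressions and `typeof`).  The same idea can be sketched in plain C with a function doing the tag check.  Everything here (`struct node`, `cast_node()`, `as_leaf()`, `new_leaf()`) is a hypothetical miniature, not the interpreter's own code, and it embeds the base as a named first member rather than an anonymous one.

##### Example: a checked downcast

```c
#include <assert.h>
#include <stdlib.h>

enum node_type { Xleaf, Xpair };	/* hypothetical tags */

struct node {				/* plays the role of struct exec */
	enum node_type type;
};

struct leaf {
	struct node n;			/* base must be the first member */
	int value;
};

struct pair {
	struct node n;
	struct node *left, *right;
};

/* Checked downcast: abort if the tag does not match.  NULL passes through. */
static void *cast_node(struct node *p, enum node_type want)
{
	if (p && p->type != want)
		abort();
	return p;
}

#define as_leaf(p) ((struct leaf *)cast_node((p), Xleaf))
#define as_pair(p) ((struct pair *)cast_node((p), Xpair))

/* Allocate a zeroed leaf with its tag already set, much as new() does. */
static struct node *new_leaf(int value)
{
	struct leaf *l = calloc(1, sizeof(*l));

	assert(l != NULL);
	l->n.type = Xleaf;
	l->value = value;
	return &l->n;
}
```

Calling `as_pair()` on a node made by `new_leaf()` would abort, which is the whole point: a wrong cast fails loudly instead of reading the wrong fields.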
1426 ###### ast
1427         enum exec_types {
1428                 Xbinode,
1429                 ## exec type
1430         };
1431         struct exec {
1432                 enum exec_types type;
1433                 int line, column;
1434                 ## exec fields
1435         };
1436         struct binode {
1437                 struct exec;
1438                 enum Btype {
1439                         ## Binode types
1440                 } op;
1441                 struct exec *left, *right;
1442         };
1443
1444 ###### ast functions
1445
1446         static int __fput_loc(struct exec *loc, FILE *f)
1447         {
1448                 if (!loc)
1449                         return 0;
1450                 if (loc->line >= 0) {
1451                         fprintf(f, "%d:%d: ", loc->line, loc->column);
1452                         return 1;
1453                 }
1454                 if (loc->type == Xbinode)
1455                         return __fput_loc(cast(binode,loc)->left, f) ||
1456                                __fput_loc(cast(binode,loc)->right, f);  // NOTEST
1457                 return 0;                       // NOTEST
1458         }
1459         static void fput_loc(struct exec *loc, FILE *f)
1460         {
1461                 if (!__fput_loc(loc, f))
1462                         fprintf(f, "??:??: ");  // NOTEST
1463         }
1464
1465 Each different type of `exec` node needs a number of functions defined,
1466 a bit like methods.  We must be able to free it, print it, analyse it
1467 and execute it.  Once we have specific `exec` types we will need to
1468 parse them too.  Let's take this a bit more slowly.
1469
1470 #### Freeing
1471
1472 The parser generator requires a `free_foo` function for each struct
1473 that stores attributes, and these will often be `exec`s and subtypes
1474 thereof.  So we need `free_exec` which can handle all the subtypes,
1475 and we need `free_binode`.
1476
1477 ###### ast functions
1478
1479         static void free_binode(struct binode *b)
1480         {
1481                 if (!b)
1482                         return;
1483                 free_exec(b->left);
1484                 free_exec(b->right);
1485                 free(b);
1486         }
1487
1488 ###### core functions
1489         static void free_exec(struct exec *e)
1490         {
1491                 if (!e)
1492                         return;
1493                 switch(e->type) {
1494                         ## free exec cases
1495                 }
1496         }
1497
1498 ###### forward decls
1499
1500         static void free_exec(struct exec *e);
1501
1502 ###### free exec cases
1503         case Xbinode: free_binode(cast(binode, e)); break;
1504
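The division of labour between `free_exec` and `free_binode` can be seen in a small stand-alone sketch.  The node types, the `mk_` helpers and the `nodes_freed` counter are all hypothetical and exist only so the example can be checked.

##### Example: freeing a tree by tag

```c
#include <assert.h>
#include <stdlib.h>

enum node_type { Xleaf, Xpair };	/* hypothetical tags */

struct node { enum node_type type; };	/* base: the tag comes first */
struct leaf { struct node n; int value; };
struct pair { struct node n; struct node *left, *right; };

static int nodes_freed;		/* demo-only counter, not in the source */

static void free_node(struct node *e);

/* Subtype-specific free: release the children, then the node itself. */
static void free_pair(struct pair *p)
{
	if (!p)
		return;
	free_node(p->left);
	free_node(p->right);
	free(p);
	nodes_freed++;
}

/* Generic free: dispatch on the tag to the right subtype function. */
static void free_node(struct node *e)
{
	if (!e)
		return;
	switch (e->type) {
	case Xleaf:
		free(e);
		nodes_freed++;
		break;
	case Xpair:
		free_pair((struct pair *)e);
		break;
	}
}

static struct node *mk_leaf(int value)
{
	struct leaf *l = calloc(1, sizeof(*l));

	assert(l != NULL);
	l->n.type = Xleaf;
	l->value = value;
	return &l->n;
}

static struct node *mk_pair(struct node *left, struct node *right)
{
	struct pair *p = calloc(1, sizeof(*p));

	assert(p != NULL);
	p->n.type = Xpair;
	p->left = left;
	p->right = right;
	return &p->n;
}
```

Because both functions accept `NULL`, a partially built tree can be freed without any special cases.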
1505 #### Printing
1506
1507 Printing an `exec` requires that we know the current indent level for
1508 printing line-oriented components.  As will become clear later, we
1509 also want to know what sort of bracketing to use.
1510
1511 ###### ast functions
1512
1513         static void do_indent(int i, char *str)
1514         {
1515                 while (i--)
1516                         printf("    ");
1517                 printf("%s", str);
1518         }
1519
1520 ###### core functions
1521         static void print_binode(struct binode *b, int indent, int bracket)
1522         {
1523                 struct binode *b2;
1524                 switch(b->op) {
1525                 ## print binode cases
1526                 }
1527         }
1528
1529         static void print_exec(struct exec *e, int indent, int bracket)
1530         {
1531                 if (!e)
1532                         return;
1533                 switch (e->type) {
1534                 case Xbinode:
1535                         print_binode(cast(binode, e), indent, bracket); break;
1536                 ## print exec cases
1537                 }
1538                 if (e->to_free) {
1539                         struct variable *v;
1540                         do_indent(indent, "/* FREE");
1541                         for (v = e->to_free; v; v = v->next_free)
1542                                 printf(" %.*s(%c%d+%d)", v->name->name.len, v->name->name.txt,
1543                                        v->global ? 'G':'L',
1544                                        v->frame_pos, v->type ? v->type->size:0);
1545                         printf(" */\n");
1546                 }
1547         }
1548
1549 ###### forward decls
1550
1551         static void print_exec(struct exec *e, int indent, int bracket);
1552
1553 #### Analysing
1554
1555 As discussed, analysis involves propagating type requirements around the
1556 program and looking for errors.
1557
1558 So `propagate_types` is passed an expected type (being a `struct type`
1559 pointer together with some `val_rules` flags) that the `exec` is
1560 expected to return, and returns the type that it does return, either
1561 of which can be `NULL` signifying "unknown".  An `ok` flag is passed
1562 by reference. It is set to `0` when an error is found, and `2` when
1563 any change is made.  If it remains unchanged at `1`, then no more
1564 propagation is needed.
1565
1566 ###### ast
1567
1568         enum val_rules {Rnolabel = 1<<0, Rboolok = 1<<1, Rnoconstant = 1<<2};
1569
1570 ###### format cases
1571         case 'r':
1572                 if (rules & Rnolabel)
1573                         fputs(" (labels not permitted)", stderr);
1574                 break;
1575
1576 ###### core functions
1577
1578         static struct type *propagate_types(struct exec *prog, struct parse_context *c, int *ok,
1579                                             struct type *type, int rules);
1580         static struct type *__propagate_types(struct exec *prog, struct parse_context *c, int *ok,
1581                                               struct type *type, int rules)
1582         {
1583                 struct type *t;
1584
1585                 if (!prog)
1586                         return Tnone;
1587
1588                 switch (prog->type) {
1589                 case Xbinode:
1590                 {
1591                         struct binode *b = cast(binode, prog);
1592                         switch (b->op) {
1593                         ## propagate binode cases
1594                         }
1595                         break;
1596                 }
1597                 ## propagate exec cases
1598                 }
1599                 return Tnone;
1600         }
1601
1602         static struct type *propagate_types(struct exec *prog, struct parse_context *c, int *ok,
1603                                             struct type *type, int rules)
1604         {
1605                 struct type *ret = __propagate_types(prog, c, ok, type, rules);
1606
1607                 if (c->parse_error)
1608                         *ok = 0;
1609                 return ret;
1610         }
1611
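The `ok` protocol implies a caller that repeats the analysis until a pass changes nothing.  A toy version of that loop might look like the following.  The "same type as" constraint, `toy_pass()` and `toy_analyse()` are hypothetical; in particular, refusing to overwrite an error with `2` is an assumption of this sketch and not something taken from the interpreter.

##### Example: repeating propagation until stable

```c
#include <assert.h>
#include <stddef.h>

enum toy_type { T_UNKNOWN, T_NUM, T_STR };	/* hypothetical types */

struct toy_var {
	enum toy_type type;	/* T_UNKNOWN until deduced */
	int same_as;		/* index of a variable sharing our type, or -1 */
};

/* One pass over all constraints: *ok becomes 0 on error, 2 on any change. */
static void toy_pass(struct toy_var *v, size_t n, int *ok)
{
	size_t i;

	for (i = 0; i < n; i++) {
		struct toy_var *o;

		if (v[i].same_as < 0)
			continue;
		assert((size_t)v[i].same_as < n);
		o = &v[v[i].same_as];
		if (v[i].type == T_UNKNOWN && o->type != T_UNKNOWN) {
			v[i].type = o->type;
			if (*ok)		/* do not mask an earlier error */
				*ok = 2;
		} else if (o->type == T_UNKNOWN && v[i].type != T_UNKNOWN) {
			o->type = v[i].type;
			if (*ok)
				*ok = 2;
		} else if (v[i].type != o->type) {
			*ok = 0;
		}
	}
}

/* Repeat until a pass changes nothing.  Returns 1 on success, 0 on error. */
static int toy_analyse(struct toy_var *v, size_t n)
{
	int ok;

	do {
		ok = 1;
		toy_pass(v, n, &ok);
	} while (ok == 2);
	return ok;
}
```

Each change removes one unknown, so the loop must stop after a bounded number of passes.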
1612 #### Interpreting
1613
1614 Interpreting an `exec` doesn't require anything but the `exec`.  State
1615 is stored in variables and each variable will be directly linked from
1616 within the `exec` tree.  The exception to this is the `main` function
1617 which needs to look at command line arguments.  This function will be
1618 interpreted separately.
1619
1620 Each `exec` can return a value combined with a type in `struct lrval`.
1621 The type may be `Tnone` but must be non-NULL.  Some `exec`s will return
1622 the location of a value, which can be updated, in `lval`.  Others will
1623 set `lval` to NULL indicating that there is a value of appropriate type
1624 in `rval`.
1625
1626 ###### core functions
1627
1628         struct lrval {
1629                 struct type *type;
1630                 struct value rval, *lval;
1631         };
1632
1633         static struct lrval _interp_exec(struct parse_context *c, struct exec *e);
1634
1635         static struct value interp_exec(struct parse_context *c, struct exec *e,
1636                                         struct type **typeret)
1637         {
1638                 struct lrval ret = _interp_exec(c, e);
1639
1640                 if (!ret.type) abort();
1641                 if (typeret)
1642                         *typeret = ret.type;
1643                 if (ret.lval)
1644                         dup_value(ret.type, ret.lval, &ret.rval);
1645                 return ret.rval;
1646         }
1647
1648         static struct value *linterp_exec(struct parse_context *c, struct exec *e,
1649                                           struct type **typeret)
1650         {
1651                 struct lrval ret = _interp_exec(c, e);
1652
1653                 if (ret.lval)
1654                         *typeret = ret.type;
1655                 else
1656                         free_value(ret.type, &ret.rval);
1657                 return ret.lval;
1658         }
1659
1660         static struct lrval _interp_exec(struct parse_context *c, struct exec *e)
1661         {
1662                 struct lrval ret;
1663                 struct value rv = {}, *lrv = NULL;
1664                 struct type *rvtype;
1665
1666                 rvtype = ret.type = Tnone;
1667                 if (!e) {
1668                         ret.lval = lrv;
1669                         ret.rval = rv;
1670                         return ret;
1671                 }
1672
1673                 switch(e->type) {
1674                 case Xbinode:
1675                 {
1676                         struct binode *b = cast(binode, e);
1677                         struct value left, right, *lleft;
1678                         struct type *ltype, *rtype;
1679                         ltype = rtype = Tnone;
1680                         switch (b->op) {
1681                         ## interp binode cases
1682                         }
1683                         free_value(ltype, &left);
1684                         free_value(rtype, &right);
1685                         break;
1686                 }
1687                 ## interp exec cases
1688                 }
1689                 ret.lval = lrv;
1690                 ret.rval = rv;
1691                 ret.type = rvtype;
1692                 ## interp exec cleanup
1693                 return ret;
1694         }
1695
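The split between `lval` and `rval` can be shown with a miniature that has `int` as its only value type.  All of the names below are hypothetical.  A real `interp_exec()` duplicates the value with `dup_value()`, where this sketch simply copies an `int`.

##### Example: location or value

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of struct lrval. */
struct toy_lrval {
	int rval;	/* meaningful only when lval is NULL */
	int *lval;	/* non-NULL when the result is an updatable location */
};

static int storage[2] = { 10, 20 };	/* stands in for variable storage */

/* A variable reference yields a location. */
static struct toy_lrval eval_var(size_t slot)
{
	struct toy_lrval r = { 0, NULL };

	assert(slot < 2);
	r.lval = &storage[slot];
	return r;
}

/* A computed result has no location, only a value. */
static struct toy_lrval eval_sum(size_t a, size_t b)
{
	struct toy_lrval r = { 0, NULL };

	assert(a < 2 && b < 2);
	r.rval = storage[a] + storage[b];
	return r;
}

/* Like interp_exec(): always hand back a value, copying from lval if set. */
static int toy_value(struct toy_lrval r)
{
	return r.lval ? *r.lval : r.rval;
}

/* Like linterp_exec(): hand back the location, or NULL if there is none. */
static int *toy_location(struct toy_lrval r)
{
	return r.lval;
}
```

An assignment would use `toy_location()` on its left-hand side and `toy_value()` on its right.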
1696 ### Complex types
1697
1698 Now that we have the shape of the interpreter in place we can add some
1699 complex types and connect them in to the data structures and the
1700 different phases of parse, analyse, print, interpret.
1701
1702 Thus far we have arrays and structs.
1703
1704 #### Arrays
1705
1706 Arrays can be declared by giving a size and a type, as `[size]type`, so
1707 `freq:[26]number` declares `freq` to be an array of 26 numbers.  The
1708 size can be either a literal number, or a named constant.  Some day an
1709 arbitrary expression will be supported.
1710
1711 As a formal parameter to a function, the array can be declared with a
1712 new variable as the size: `name:[size::number]string`.  The `size`
1713 variable is set to the size of the array and must be a constant.  As
1714 `number` is the only supported type, it can be left out:
1715 `name:[size::]string`.
1716
1717 Arrays cannot be assigned.  When pointers are introduced we will also
1718 introduce array slices which can refer to part or all of an array -
1719 the assignment syntax will create a slice.  For now, an array can only
1720 ever be referenced by the name it is declared with.  It is likely that
1721 a "`copy`" primitive will eventually be defined which can be used to
1722 make a copy of an array with controllable recursive depth.
1723
1724 For now we have two sorts of array, those with fixed size either because
1725 it is given as a literal number or because it is a struct member (which
1726 cannot have a runtime-changing size), and those with a size that is
1727 determined at runtime - local variables with a const size.  The former
1728 have their size calculated at parse time, the latter at run time.
1729
1730 For the latter type, the `size` field of the type is the size of a
1731 pointer, and the array is reallocated every time it comes into scope.
1732
1733 We differentiate struct fields with a const size from local variables
1734 with a const size by whether they are prepared at parse time or not.
1735
1736 ###### type union fields
1737
1738         struct {
1739                 int unspec;     // size is unspecified - vsize must be set.
1740                 short size;
1741                 short static_size;
1742                 struct variable *vsize;
1743                 struct type *member;
1744         } array;
1745
1746 ###### value union fields
1747         void *array;  // used if not static_size
1748
1749 ###### value functions
1750
1751         static void array_prepare_type(struct parse_context *c, struct type *type,
1752                                        int parse_time)
1753         {
1754                 struct value *vsize;
1755                 mpz_t q;
1756                 if (!type->array.vsize || type->array.static_size)
1757                         return;
1758
1759                 vsize = var_value(c, type->array.vsize);
1760                 mpz_init(q);
1761                 mpz_tdiv_q(q, mpq_numref(vsize->num), mpq_denref(vsize->num));
1762                 type->array.size = mpz_get_si(q);
1763                 mpz_clear(q);
1764
1765                 if (parse_time) {
1766                         type->array.static_size = 1;
1767                         type->size = type->array.size * type->array.member->size;
1768                         type->align = type->array.member->align;
1769                 }
1770         }
1771
1772         static void array_init(struct type *type, struct value *val)
1773         {
1774                 int i;
1775                 void *ptr = val ? val->ptr : NULL;
1776
1777                 if (!val)
1778                         return;                         // NOTEST
1779                 if (!type->array.static_size) {
1780                         val->array = calloc(type->array.size,
1781                                             type->array.member->size);
1782                         ptr = val->array;
1783                 }
1784                 for (i = 0; i < type->array.size; i++) {
1785                         struct value *v;
1786                         v = (void*)ptr + i * type->array.member->size;
1787                         val_init(type->array.member, v);
1788                 }
1789         }
1790
1791         static void array_free(struct type *type, struct value *val)
1792         {
1793                 int i;
1794                 void *ptr = val->ptr;
1795
1796                 if (!type->array.static_size)
1797                         ptr = val->array;
1798                 for (i = 0; i < type->array.size; i++) {
1799                         struct value *v;
1800                         v = (void*)ptr + i * type->array.member->size;
1801                         free_value(type->array.member, v);
1802                 }
1803                 if (!type->array.static_size)
1804                         free(ptr);
1805         }
1806
1807         static int array_compat(struct type *require, struct type *have)
1808         {
1809                 if (have->compat != require->compat)
1810                         return 0;       // UNTESTED
1811                 /* Both are arrays, so we can look at details */
1812                 if (!type_compat(require->array.member, have->array.member, 0))
1813                         return 0;
1814                 if (have->array.unspec && require->array.unspec) {
1815                         if (have->array.vsize && require->array.vsize &&
1816                             have->array.vsize != require->array.vsize)  // UNTESTED
1817                                 /* sizes might not be the same */
1818                                 return 0;       // UNTESTED
1819                         return 1;
1820                 }
1821                 if (have->array.unspec || require->array.unspec)
1822                         return 1;       // UNTESTED
1823                 if (require->array.vsize == NULL && have->array.vsize == NULL)
1824                         return require->array.size == have->array.size;
1825
1826                 return require->array.vsize == have->array.vsize;       // UNTESTED
1827         }
1828
1829         static void array_print_type(struct type *type, FILE *f)
1830         {
1831                 fputs("[", f);
1832                 if (type->array.vsize) {
1833                         struct binding *b = type->array.vsize->name;
1834                         fprintf(f, "%.*s%s]", b->name.len, b->name.txt,
1835                                 type->array.unspec ? "::" : "");
1836                 } else
1837                         fprintf(f, "%d]", type->array.size);
1838                 type_print(type->array.member, f);
1839         }
1840
1841         static struct type array_prototype = {
1842                 .init = array_init,
1843                 .prepare_type = array_prepare_type,
1844                 .print_type = array_print_type,
1845                 .compat = array_compat,
1846                 .free = array_free,
1847                 .size = sizeof(void*),
1848                 .align = sizeof(void*),
1849         };
1850
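The two storage strategies, members held inside the value or members held in a separately allocated block, can be sketched with `int` as the only member type.  `union toy_value`, `TOY_INLINE_MAX` and the `toy_array_` functions are hypothetical simplifications and do not reproduce the interpreter's `struct value`.

##### Example: inline and allocated array storage

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_INLINE_MAX 16	/* hypothetical cap on inline members */

struct toy_array_type {
	int size;		/* number of members */
	int static_size;	/* non-zero: members live inside the value */
};

union toy_value {
	int inline_members[TOY_INLINE_MAX];	/* static-size case */
	int *array;				/* run-time-size case */
};

/* Find the first member, wherever it is stored. */
static int *toy_members(const struct toy_array_type *t, union toy_value *v)
{
	return t->static_size ? v->inline_members : v->array;
}

static void toy_array_init(const struct toy_array_type *t, union toy_value *v)
{
	assert(t->size > 0);
	if (t->static_size) {
		assert(t->size <= TOY_INLINE_MAX);
		memset(v->inline_members, 0, sizeof(int) * (size_t)t->size);
	} else {
		v->array = calloc((size_t)t->size, sizeof(int));
		assert(v->array != NULL);
	}
}

/* Only the run-time-size case owns a separate allocation. */
static void toy_array_free(const struct toy_array_type *t, union toy_value *v)
{
	if (!t->static_size) {
		free(v->array);
		v->array = NULL;
	}
}
```

Code that only reads or writes members can go through `toy_members()` and ignore which case it has.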
1851 ###### declare terminals
1852         $TERM [ ]
1853
1854 ###### type grammar
1855
1856         | [ NUMBER ] Type ${ {
1857                 char tail[3];
1858                 mpq_t num;
1859                 struct text noname = { "", 0 };
1860                 struct type *t;
1861
1862                 $0 = t = add_type(c, noname, &array_prototype);
1863                 t->array.member = $<4;
1864                 t->array.vsize = NULL;
1865                 if (number_parse(num, tail, $2.txt) == 0)
1866                         tok_err(c, "error: unrecognised number", &$2);
1867                 else if (tail[0])
1868                         tok_err(c, "error: unsupported number suffix", &$2);
1869                 else {
1870                         t->array.size = mpz_get_ui(mpq_numref(num));
1871                         if (mpz_cmp_ui(mpq_denref(num), 1) != 0) {
1872                                 tok_err(c, "error: array size must be an integer",
1873                                         &$2);
1874                         } else if (mpz_cmp_ui(mpq_numref(num), 1UL << 30) >= 0)
1875                                 tok_err(c, "error: array size is too large",
1876                                         &$2);
1877                         mpq_clear(num);
1878                 }
1879                 t->array.static_size = 1;
1880                 t->size = t->array.size * t->array.member->size;
1881                 t->align = t->array.member->align;
1882         } }$
1883
1884         | [ IDENTIFIER ] Type ${ {
1885                 struct variable *v = var_ref(c, $2.txt);
1886                 struct text noname = { "", 0 };
1887
1888                 if (!v)
1889                         tok_err(c, "error: name undeclared", &$2);
1890                 else if (!v->constant)
1891                         tok_err(c, "error: array size must be a constant", &$2);
1892
1893                 $0 = add_type(c, noname, &array_prototype);
1894                 $0->array.member = $<4;
1895                 $0->array.size = 0;
1896                 $0->array.vsize = v;
1897         } }$
1898
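The checks applied to a literal size in the `[ NUMBER ] Type` production can be sketched without GMP by treating the number as a numerator and denominator.  This assumes the rational is in lowest terms with a positive denominator.  `check_array_size()` and its error codes are hypothetical names.

##### Example: validating a literal array size

```c
#include <assert.h>

enum size_err { SIZE_OK, SIZE_NOT_INTEGER, SIZE_TOO_LARGE };

/*
 * num/den is assumed to be in lowest terms with den > 0, so the
 * value is an integer exactly when den == 1.
 */
static enum size_err check_array_size(unsigned long num, unsigned long den,
				      unsigned long *size)
{
	assert(den != 0);
	if (den != 1)
		return SIZE_NOT_INTEGER;
	if (num >= (1UL << 30))
		return SIZE_TOO_LARGE;
	*size = num;
	return SIZE_OK;
}
```

So `[26]number` is accepted, while a size of one half or of 2^30 is rejected.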
1899 ###### Grammar
1900         $*type
1901         OptType -> Type ${ $0 = $<1; }$
1902                 | ${ $0 = NULL; }$
1903
1904 ###### formal type grammar
1905
1906         | [ IDENTIFIER :: OptType ] Type ${ {
1907                 struct variable *v = var_decl(c, $ID.txt);
1908                 struct text noname = { "", 0 };
1909
1910                 v->type = $<OT;
1911                 v->constant = 1;
1912                 if (!v->type)
1913                         v->type = Tnum;
1914                 $0 = add_type(c, noname, &array_prototype);
1915                 $0->array.member = $<6;
1916                 $0->array.size = 0;
1917                 $0->array.unspec = 1;
1918                 $0->array.vsize = v;
1919         } }$
1920
1921 ###### Binode types
1922         Index,
1923
1924 ###### variable grammar
1925
1926         | Variable [ Expression ] ${ {
1927                 struct binode *b = new(binode);
1928                 b->op = Index;
1929                 b->left = $<1;
1930                 b->right = $<3;
1931                 $0 = b;
1932         } }$
1933
1934 ###### print binode cases
1935         case Index:
1936                 print_exec(b->left, -1, bracket);
1937                 printf("[");
1938                 print_exec(b->right, -1, bracket);
1939                 printf("]");
1940                 break;
1941
1942 ###### propagate binode cases
1943         case Index:
1944                 /* left must be an array, right must be a number,
1945                  * result is the member type of the array
1946                  */
1947                 propagate_types(b->right, c, ok, Tnum, 0);
1948                 t = propagate_types(b->left, c, ok, NULL, rules & Rnoconstant);
1949                 if (!t || t->compat != array_compat) {
1950                         type_err(c, "error: %1 cannot be indexed", prog, t, 0, NULL);
1951                         return NULL;
1952                 } else {
1953                         if (!type_compat(type, t->array.member, rules)) {
1954                                 type_err(c, "error: have %1 but need %2", prog,
1955                                          t->array.member, rules, type);
1956                         }
1957                         return t->array.member;
1958                 }
1959                 break;
1960
1961 ###### interp binode cases
1962         case Index: {
1963                 mpz_t q;
1964                 long i;
1965                 void *ptr;
1966
1967                 lleft = linterp_exec(c, b->left, &ltype);
1968                 right = interp_exec(c, b->right, &rtype);
1969                 mpz_init(q);
1970                 mpz_tdiv_q(q, mpq_numref(right.num), mpq_denref(right.num));
1971                 i = mpz_get_si(q);
1972                 mpz_clear(q);
1973
1974                 if (ltype->array.static_size)
1975                         ptr = lleft;
1976                 else
1977                         ptr = *(void**)lleft;
1978                 rvtype = ltype->array.member;
1979                 if (i >= 0 && i < ltype->array.size)
1980                         lrv = ptr + i * rvtype->size;
1981                 else
1982                         val_init(ltype->array.member, &rv);
1983                 ltype = NULL;
1984                 break;
1985         }
1986
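The index calculation truncates the rational index toward zero and then checks the bounds.  A sketch with plain `long` arithmetic follows.  `toy_trunc()` and `toy_index()` are hypothetical, the denominator is assumed positive, and returning `NULL` for an out-of-range index is a simplification: the code above instead yields a freshly initialised value with no location.

##### Example: bounds-checked indexing

```c
#include <assert.h>
#include <stddef.h>

/* Truncate the rational num/den toward zero, as mpz_tdiv_q() does. */
static long toy_trunc(long num, long den)
{
	assert(den > 0);
	return num / den;	/* C99 integer division truncates toward zero */
}

/* Return the address of the indexed member, or NULL when out of range. */
static int *toy_index(int *members, long size, long num, long den)
{
	long i = toy_trunc(num, den);

	if (i >= 0 && i < size)
		return &members[i];
	return NULL;
}
```

One consequence of truncating toward zero is that an index of -1/2 selects member 0 rather than being rejected.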
1987 #### Structs
1988
1989 A `struct` is a data-type that contains one or more other data-types.
1990 It differs from an array in that each member can be of a different
1991 type, and they are accessed by name rather than by number.  Thus you
1992 cannot choose an element by calculation; you need to know what you
1993 want up-front.
1994
1995 The language makes no promises about how a given structure will be
1996 stored in memory - it is free to rearrange fields to suit whatever
1997 criteria seem important.
1998
1999 Structs are declared separately from program code - they cannot be
2000 declared in-line in a variable declaration like arrays can.  A struct
2001 is given a name and this name is used to identify the type - the name
2002 is not prefixed by the word `struct` as it would be in C.
2003
2004 Structs are only treated as the same if they have the same name.
2005 Simply having the same fields in the same order is not enough.  This
2006 might change once we can create structure initializers from a list of
2007 values.
2008
2009 Each component datum is identified much like a variable is declared,
2010 with a name, one or two colons, and a type.  The type cannot be omitted
2011 as there is no opportunity to deduce the type from usage.  An initial
2012 value can be given following an equals sign, so
2013
2014 ##### Example: a struct type
2015
2016         struct complex:
2017                 x:number = 0
2018                 y:number = 0
2019
2020 would declare a type called "complex" which has two number fields,
2021 each initialised to zero.
2022
2023 Structs will need to be declared separately from the code that uses
2024 them, so we will need to be able to print out the declaration of a
2025 struct when reprinting the whole program.  So a `print_type_decl` type
2026 function will be needed.
2027
2028 ###### type union fields
2029
2030         struct {
2031                 int nfields;
2032                 struct field {
2033                         struct text name;
2034                         struct type *type;
2035                         struct value *init;
2036                         int offset;
2037                 } *fields;
2038         } structure;
2039
2040 ###### type functions
2041         void (*print_type_decl)(struct type *type, FILE *f);
2042
2043 ###### value functions
2044
2045         static void structure_init(struct type *type, struct value *val)
2046         {
2047                 int i;
2048
2049                 for (i = 0; i < type->structure.nfields; i++) {
2050                         struct value *v;
2051                         v = (void*) val->ptr + type->structure.fields[i].offset;
2052                         if (type->structure.fields[i].init)
2053                                 dup_value(type->structure.fields[i].type,
2054                                           type->structure.fields[i].init,
2055                                           v);
2056                         else
2057                                 val_init(type->structure.fields[i].type, v);
2058                 }
2059         }
2060
2061         static void structure_free(struct type *type, struct value *val)
2062         {
2063                 int i;
2064
2065                 for (i = 0; i < type->structure.nfields; i++) {
2066                         struct value *v;
2067                         v = (void*)val->ptr + type->structure.fields[i].offset;
2068                         free_value(type->structure.fields[i].type, v);
2069                 }
2070         }
2071
2072         static void structure_free_type(struct type *t)
2073         {
2074                 int i;
2075                 for (i = 0; i < t->structure.nfields; i++)
2076                         if (t->structure.fields[i].init) {
2077                                 free_value(t->structure.fields[i].type,
2078                                            t->structure.fields[i].init);
2079                         }
2080                 free(t->structure.fields);
2081         }
2082
2083         static struct type structure_prototype = {
2084                 .init = structure_init,
2085                 .free = structure_free,
2086                 .free_type = structure_free_type,
2087                 .print_type_decl = structure_print_type,
2088         };
2089
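Initialising a struct amounts to visiting each field's offset and storing either the declared initial value or the type's default.  A sketch with `int` as the only field type is below.  `struct toy_field` and `toy_struct_init()` are hypothetical, and zero is assumed to be the default.

##### Example: initialising fields by offset

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical field description: where it lives and how it starts. */
struct toy_field {
	size_t offset;		/* byte offset within the struct's storage */
	const int *init;	/* declared initial value, or NULL for the default */
};

/* Fill in every field of one struct value held at storage. */
static void toy_struct_init(const struct toy_field *fields, int nfields,
			    void *storage)
{
	int i;

	assert(storage != NULL);
	for (i = 0; i < nfields; i++) {
		int v = fields[i].init ? *fields[i].init : 0;

		memcpy((char *)storage + fields[i].offset, &v, sizeof(v));
	}
}
```

The initial values belong to the type, not to any one value, which is why `structure_free_type` is the place that releases them.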
2090 ###### exec type
2091         Xfieldref,
2092
2093 ###### ast
2094         struct fieldref {
2095                 struct exec;
2096                 struct exec *left;
2097                 int index;
2098                 struct text name;
2099         };
2100
2101 ###### free exec cases
2102         case Xfieldref:
2103                 free_exec(cast(fieldref, e)->left);
2104                 free(e);
2105                 break;
2106
2107 ###### declare terminals
2108         $TERM struct .
2109
2110 ###### variable grammar
2111
2112         | Variable . IDENTIFIER ${ {
2113                 struct fieldref *fr = new_pos(fieldref, $2);
2114                 fr->left = $<1;
2115                 fr->name = $3.txt;
2116                 fr->index = -2;
2117                 $0 = fr;
2118         } }$
2119
2120 ###### print exec cases
2121
2122         case Xfieldref:
2123         {
2124                 struct fieldref *f = cast(fieldref, e);
2125                 print_exec(f->left, -1, bracket);
2126                 printf(".%.*s", f->name.len, f->name.txt);
2127                 break;
2128         }
2129
2130 ###### ast functions
2131         static int find_struct_index(struct type *type, struct text field)
2132         {
2133                 int i;
2134                 for (i = 0; i < type->structure.nfields; i++)
2135                         if (text_cmp(type->structure.fields[i].name, field) == 0)
2136                                 return i;
2137                 return -1;
2138         }
2139
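A field reference starts with an index of `-2`, meaning "not looked up yet", and `find_struct_index` returns `-1` for "no such field".  The following sketch shows how the two sentinels keep the lookup from being repeated.  The `toy_` names and the `FIELD_` constants are hypothetical.  Only the `txt` and `len` members mirror the interpreter's `struct text`.

##### Example: resolving a field name once

```c
#include <assert.h>
#include <string.h>

struct toy_text { const char *txt; int len; };	/* not NUL-terminated */

enum { FIELD_UNRESOLVED = -2, FIELD_MISSING = -1 };

struct toy_fieldref {
	struct toy_text name;
	int index;		/* FIELD_UNRESOLVED until first looked up */
};

static int toy_text_eq(struct toy_text a, struct toy_text b)
{
	return a.len == b.len && memcmp(a.txt, b.txt, (size_t)a.len) == 0;
}

/* Linear search: the index of the named field, or FIELD_MISSING. */
static int toy_find_field(const struct toy_text *names, int n,
			  struct toy_text want)
{
	int i;

	assert(names != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (toy_text_eq(names[i], want))
			return i;
	return FIELD_MISSING;
}

/* Look the name up only while the index is still unresolved. */
static int toy_resolve(struct toy_fieldref *fr,
		       const struct toy_text *names, int n)
{
	if (fr->index == FIELD_UNRESOLVED)
		fr->index = toy_find_field(names, n, fr->name);
	return fr->index;
}
```

Since analysis may visit a node several times, a failed lookup that is remembered as `-1` is not searched for (or reported) a second time.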
2140 ###### propagate exec cases
2141
2142         case Xfieldref:
2143         {
2144                 struct fieldref *f = cast(fieldref, prog);
2145                 struct type *st = propagate_types(f->left, c, ok, NULL, 0);
2146
2147                 if (!st)
2148                         type_err(c, "error: unknown type for field access", f->left,    // UNTESTED
2149                                  NULL, 0, NULL);
2150                 else if (st->init != structure_init)
2151                         type_err(c, "error: field reference attempted on %1, not a struct",
2152                                  f->left, st, 0, NULL);
2153                 else if (f->index == -2) {
2154                         f->index = find_struct_index(st, f->name);
2155                         if (f->index < 0)
2156                                 type_err(c, "error: cannot find requested field in %1",
2157                                          f->left, st, 0, NULL);
2158                 }
2159                 if (f->index >= 0) {
2160                         struct type *ft = st->structure.fields[f->index].type;
2161                         if (!type_compat(type, ft, rules))
2162                                 type_err(c, "error: have %1 but need %2", prog,
2163                                          ft, rules, type);
2164                         return ft;
2165                 }
2166                 break;
2167         }
2168
2169 ###### interp exec cases
2170         case Xfieldref:
2171         {
2172                 struct fieldref *f = cast(fieldref, e);
2173                 struct type *ltype;
2174                 struct value *lleft = linterp_exec(c, f->left, &ltype);
2175                 lrv = (void*)lleft->ptr + ltype->structure.fields[f->index].offset;
2176                 rvtype = ltype->structure.fields[f->index].type;
2177                 break;
2178         }
2179
2180 ###### ast
2181         struct fieldlist {
2182                 struct fieldlist *prev;
2183                 struct field f;
2184         };
2185
2186 ###### ast functions
2187         static void free_fieldlist(struct fieldlist *f)
2188         {
2189                 if (!f)
2190                         return;
2191                 free_fieldlist(f->prev);
2192                 if (f->f.init) {
2193                         free_value(f->f.type, f->f.init);       // UNTESTED
2194                         free(f->f.init);        // UNTESTED
2195                 }
2196                 free(f);
2197         }
2198
2199 ###### top level grammar
2200         DeclareStruct -> struct IDENTIFIER FieldBlock Newlines ${ {
2201                         struct type *t =
2202                                 add_type(c, $2.txt, &structure_prototype);
2203                         int cnt = 0;
2204                         struct fieldlist *f;
2205
2206                         for (f = $3; f; f=f->prev)
2207                                 cnt += 1;
2208
2209                         t->structure.nfields = cnt;
2210                         t->structure.fields = calloc(cnt, sizeof(struct field));
2211                         f = $3;
2212                         while (cnt > 0) {
2213                                 int a = f->f.type->align;
2214                                 cnt -= 1;
2215                                 t->structure.fields[cnt] = f->f;
2216                                 if (t->size & (a-1))
2217                                         t->size = (t->size | (a-1)) + 1;
2218                                 t->structure.fields[cnt].offset = t->size;
2219                                 t->size += ((f->f.type->size - 1) | (a-1)) + 1;
2220                                 if (a > t->align)
2221                                         t->align = a;
2222                                 f->f.init = NULL;
2223                                 f = f->prev;
2224                         }
2225                 } }$
2226
2227         $*fieldlist
2228         FieldBlock -> { IN OptNL FieldLines OUT OptNL } ${ $0 = $<FL; }$
2229                 | { SimpleFieldList } ${ $0 = $<SFL; }$
2230                 | IN OptNL FieldLines OUT ${ $0 = $<FL; }$
2231                 | SimpleFieldList EOL ${ $0 = $<SFL; }$
2232
2233         FieldLines -> SimpleFieldList Newlines ${ $0 = $<SFL; }$
2234                 | FieldLines SimpleFieldList Newlines ${
2235                         $SFL->prev = $<FL;
2236                         $0 = $<SFL;
2237                 }$
2238
2239         SimpleFieldList -> Field ${ $0 = $<F; }$
2240                 | SimpleFieldList ; Field ${
2241                         $F->prev = $<SFL;
2242                         $0 = $<F;
2243                 }$
2244                 | SimpleFieldList ; ${
2245                         $0 = $<SFL;
2246                 }$
2247                 | ERROR ${ tok_err(c, "Syntax error in struct field", &$1); }$
2248
2249         Field -> IDENTIFIER : Type = Expression ${ {
2250                         int ok; // UNTESTED
2251
2252                         $0 = calloc(1, sizeof(struct fieldlist));
2253                         $0->f.name = $1.txt;
2254                         $0->f.type = $<3;
2255                         $0->f.init = NULL;
2256                         do {
2257                                 ok = 1;
2258                                 propagate_types($<5, c, &ok, $3, 0);
2259                         } while (ok == 2);
2260                         if (!ok)
2261                                 c->parse_error = 1;     // UNTESTED
2262                         else {
2263                                 struct value vl = interp_exec(c, $5, NULL);
2264                                 $0->f.init = global_alloc(c, $0->f.type, NULL, &vl);
2265                         }
2266                 } }$
2267                 | IDENTIFIER : Type ${
2268                         $0 = calloc(1, sizeof(struct fieldlist));
2269                         $0->f.name = $1.txt;
2270                         $0->f.type = $<3;
2271                         if ($0->f.type->prepare_type)
2272                                 $0->f.type->prepare_type(c, $0->f.type, 1);
2273                 }$
2274
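The offset arithmetic in the `DeclareStruct` action is compact, so here is a stand-alone sketch of it.  It assumes every alignment is a power of two, which is what the `a-1` masks rely on.  `struct fld`, `round_up()` and `layout()` are hypothetical names used only here, and the sketch simply walks an array in order rather than a `fieldlist`.

##### Example: struct layout sketch

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified field description: only what layout needs. */
struct fld {
	size_t size;	/* bytes occupied by the field's type */
	size_t align;	/* required alignment, assumed a power of two */
	size_t offset;	/* filled in by layout() */
};

/* Round 'n' up to the next multiple of 'align' (a power of two). */
static size_t round_up(size_t n, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	if (n & (align - 1))
		n = (n | (align - 1)) + 1;
	return n;
}

/* Assign an offset to each of 'cnt' fields, in array order.
 * The largest alignment seen is stored in *alignp and the total
 * size is returned. */
static size_t layout(struct fld *f, size_t cnt, size_t *alignp)
{
	size_t size = 0, align = 1, i;

	for (i = 0; i < cnt; i++) {
		size = round_up(size, f[i].align);
		f[i].offset = size;
		size += round_up(f[i].size, f[i].align);
		if (f[i].align > align)
			align = f[i].align;
	}
	*alignp = align;
	return size;
}
```

For fields of size and alignment 1, 8 and 4, visited in that order, this gives offsets 0, 8 and 16, a total size of 20 and an alignment of 8.  Like the grammar action, it does not pad the total out to the struct's own alignment.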
2275 ###### forward decls
2276         static void structure_print_type(struct type *t, FILE *f);
2277
2278 ###### value functions
2279         static void structure_print_type(struct type *t, FILE *f)       // UNTESTED
2280         {       // UNTESTED
2281                 int i;  // UNTESTED
2282
2283                 fprintf(f, "struct %.*s\n", t->name.len, t->name.txt);
2284
2285                 for (i = 0; i < t->structure.nfields; i++) {
2286                         struct field *fl = t->structure.fields + i;
2287                         fprintf(f, "    %.*s : ", fl->name.len, fl->name.txt);
2288                         type_print(fl->type, f);
2289                         if (fl->type->print && fl->init) {
2290                                 fprintf(f, " = ");
2291                                 if (fl->type == Tstr)
2292                                         fprintf(f, "\"");       // UNTESTED
2293                                 print_value(fl->type, fl->init);
2294                                 if (fl->type == Tstr)
2295                                         fprintf(f, "\"");       // UNTESTED
2296                         }
                        fprintf(f, "\n");
2298                 }
2299         }
2300
2301 ###### print type decls
2302         {       // UNTESTED
2303                 struct type *t; // UNTESTED
2304                 int target = -1;
2305
2306                 while (target != 0) {
2307                         int i = 0;
2308                         for (t = context.typelist; t ; t=t->next)
2309                                 if (t->print_type_decl) {
2310                                         i += 1;
2311                                         if (i == target)
2312                                                 break;
2313                                 }
2314
2315                         if (target == -1) {
2316                                 target = i;
2317                         } else {
2318                                 t->print_type_decl(t, stdout);
2319                                 target -= 1;
2320                         }
2321                 }
2322         }
2323
2324 ### Functions
2325
A function is a named chunk of code which can be passed parameters and
can return results.  Each function has an implicit type which includes
the set of parameters and the return value.  As yet these types cannot
be declared separately from the function itself.
2330
2331 In fact, only one function is currently possible - `main`.  `main` is
2332 passed an array of strings together with the size of the array, and
2333 doesn't return anything.  The strings are command line arguments.
2334
2335 The parameters can be specified either in parentheses as a list, such as
2336
2337 ##### Example: function 1
2338
2339         func main(av:[ac::number]string)
2340                 code block
2341
2342 or as an indented list of one parameter per line
2343
2344 ##### Example: function 2
2345
2346         func main
2347                 argv:[argc::number]string
2348         do
2349                 code block
2350
2351 For constructing these lists we use a `List` binode, which will be
2352 further detailed when Expression Lists are introduced.
2353
2354 ###### Binode types
2355         Func, List,
2356
2357 ###### Grammar
2358
2359         $TERM func main
2360
2361         $*binode
2362         MainFunction -> func main ( OpenScope Args ) Block Newlines ${
2363                         $0 = new(binode);
2364                         $0->op = Func;
2365                         $0->left = reorder_bilist($<Ar);
2366                         $0->right = $<Bl;
2367                         var_block_close(c, CloseSequential, $0);
2368                         if (c->scope_stack && !c->parse_error) abort();
2369                 }$
2370                 | func main IN OpenScope OptNL Args OUT OptNL do Block Newlines ${
2371                         $0 = new(binode);
2372                         $0->op = Func;
2373                         $0->left = reorder_bilist($<Ar);
2374                         $0->right = $<Bl;
2375                         var_block_close(c, CloseSequential, $0);
2376                         if (c->scope_stack && !c->parse_error) abort();
2377                 }$
2378                 | func main NEWLINE OpenScope OptNL do Block Newlines ${
2379                         $0 = new(binode);
2380                         $0->op = Func;
2381                         $0->left = NULL;
2382                         $0->right = $<Bl;
2383                         var_block_close(c, CloseSequential, $0);
2384                         if (c->scope_stack && !c->parse_error) abort();
2385                 }$
2386
2387         Args -> ${ $0 = NULL; }$
2388                 | Varlist ${ $0 = $<1; }$
2389                 | Varlist ; ${ $0 = $<1; }$
2390                 | Varlist NEWLINE ${ $0 = $<1; }$
2391
2392         Varlist -> Varlist ; ArgDecl ${ // UNTESTED
2393                         $0 = new(binode);
2394                         $0->op = List;
2395                         $0->left = $<Vl;
2396                         $0->right = $<AD;
2397                 }$
2398                 | ArgDecl ${
2399                         $0 = new(binode);
2400                         $0->op = List;
2401                         $0->left = NULL;
2402                         $0->right = $<AD;
2403                 }$
2404
2405         $*var
2406         ArgDecl -> IDENTIFIER : FormalType ${ {
2407                 struct variable *v = var_decl(c, $1.txt);
2408                 $0 = new(var);
2409                 $0->var = v;
2410                 v->type = $<FT;
2411         } }$
2412
2413 ## Executables: the elements of code
2414
2415 Each code element needs to be parsed, printed, analysed,
2416 interpreted, and freed.  There are several, so let's just start with
2417 the easy ones and work our way up.
2418
2419 ### Values
2420
2421 We have already met values as separate objects.  When manifest
2422 constants appear in the program text, that must result in an executable
2423 which has a constant value.  So the `val` structure embeds a value in
2424 an executable.
2425
2426 ###### exec type
2427         Xval,
2428
2429 ###### ast
2430         struct val {
2431                 struct exec;
2432                 struct type *vtype;
2433                 struct value val;
2434         };
2435
2436 ###### ast functions
2437         struct val *new_val(struct type *T, struct token tk)
2438         {
2439                 struct val *v = new_pos(val, tk);
2440                 v->vtype = T;
2441                 return v;
2442         }
2443
2444 ###### Grammar
2445
2446         $TERM True False
2447
2448         $*val
2449         Value ->  True ${
2450                         $0 = new_val(Tbool, $1);
2451                         $0->val.bool = 1;
2452                         }$
2453                 | False ${
2454                         $0 = new_val(Tbool, $1);
2455                         $0->val.bool = 0;
2456                         }$
2457                 | NUMBER ${
2458                         $0 = new_val(Tnum, $1);
2459                         {
2460                         char tail[3];
                        if (number_parse($0->val.num, tail, $1.txt) == 0)
                                mpq_init($0->val.num);  // UNTESTED
                        if (tail[0])
                                tok_err(c, "error: unsupported number suffix",
                                        &$1);
2466                         }
2467                         }$
2468                 | STRING ${
2469                         $0 = new_val(Tstr, $1);
2470                         {
2471                         char tail[3];
2472                         string_parse(&$1, '\\', &$0->val.str, tail);
2473                         if (tail[0])
2474                                 tok_err(c, "error: unsupported string suffix",
2475                                         &$1);
2476                         }
2477                         }$
2478                 | MULTI_STRING ${
2479                         $0 = new_val(Tstr, $1);
2480                         {
2481                         char tail[3];
2482                         string_parse(&$1, '\\', &$0->val.str, tail);
2483                         if (tail[0])
2484                                 tok_err(c, "error: unsupported string suffix",
2485                                         &$1);
2486                         }
2487                         }$
2488
2489 ###### print exec cases
2490         case Xval:
2491         {
2492                 struct val *v = cast(val, e);
2493                 if (v->vtype == Tstr)
2494                         printf("\"");
2495                 print_value(v->vtype, &v->val);
2496                 if (v->vtype == Tstr)
2497                         printf("\"");
2498                 break;
2499         }
2500
2501 ###### propagate exec cases
2502         case Xval:
2503         {
2504                 struct val *val = cast(val, prog);
2505                 if (!type_compat(type, val->vtype, rules))
2506                         type_err(c, "error: expected %1%r found %2",
2507                                    prog, type, rules, val->vtype);
2508                 return val->vtype;
2509         }
2510
2511 ###### interp exec cases
2512         case Xval:
2513                 rvtype = cast(val, e)->vtype;
2514                 dup_value(rvtype, &cast(val, e)->val, &rv);
2515                 break;
2516
2517 ###### ast functions
2518         static void free_val(struct val *v)
2519         {
2520                 if (v)
2521                         free_value(v->vtype, &v->val);
2522                 free(v);
2523         }
2524
2525 ###### free exec cases
2526         case Xval: free_val(cast(val, e)); break;
2527
2528 ###### ast functions
2529         // Move all nodes from 'b' to 'rv', reversing their order.
2530         // In 'b' 'left' is a list, and 'right' is the last node.
        // In 'rv', 'left' is the first node and 'right' is a list.
2532         static struct binode *reorder_bilist(struct binode *b)
2533         {
2534                 struct binode *rv = NULL;
2535
2536                 while (b) {
2537                         struct exec *t = b->right;
2538                         b->right = rv;
2539                         rv = b;
2540                         if (b->left)
2541                                 b = cast(binode, b->left);
2542                         else
2543                                 b = NULL;
2544                         rv->left = t;
2545                 }
2546                 return rv;
2547         }
2548
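To see what `reorder_bilist()` does to a list, it helps to strip away the `exec` machinery.  In the sketch below `struct cell`, `append()` and `reorder()` are hypothetical stand-ins: a cell has two untyped links, `append()` builds the list the way the grammar actions do, and `reorder()` follows the same steps as the function above.

##### Example: list reordering sketch

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a List binode: each link may point to
 * another cell or to an item, depending on the arrangement. */
struct cell {
	void *left, *right;
};

/* Append 'item' as the parser does: the list built so far goes on
 * the left and the newest item goes on the right. */
static struct cell *append(struct cell *list, void *item)
{
	struct cell *c = malloc(sizeof(*c));

	assert(c != NULL);	/* sketch only: no graceful failure path */
	c->left = list;
	c->right = item;
	return c;
}

/* Turn the parse-time arrangement into first/next form: afterwards
 * 'left' is an item and 'right' is the rest of the list. */
static struct cell *reorder(struct cell *b)
{
	struct cell *rv = NULL;

	while (b) {
		void *item = b->right;

		b->right = rv;		/* link to the cells already moved */
		rv = b;
		b = b->left;		/* NULL for the first cell parsed */
		rv->left = item;
	}
	return rv;
}
```

After appending three items and calling `reorder()`, walking the `right` links visits the items in the order they were appended.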
2549 ### Variables
2550
2551 Just as we used a `val` to wrap a value into an `exec`, we similarly
2552 need a `var` to wrap a `variable` into an exec.  While each `val`
2553 contained a copy of the value, each `var` holds a link to the variable
2554 because it really is the same variable no matter where it appears.
2555 When a variable is used, we need to remember to follow the `->merged`
2556 link to find the primary instance.
2557
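As a rough illustration of why the `->merged` link matters, here is a cut-down sketch.  It assumes that a primary variable's `merged` field points to itself, so a single hop always reaches the primary.  The `struct variable` shown has only the fields the sketch needs, and `var_init()`, `var_merge()` and `var_slot()` are hypothetical helpers, not functions from the interpreter.

##### Example: merged variable sketch

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cut-down variable: just the link and some storage. */
struct variable {
	struct variable *merged;	/* primary instance; assumed to
					 * point to itself when this is
					 * the primary */
	int value;
};

static void var_init(struct variable *v, int value)
{
	v->merged = v;
	v->value = value;
}

/* Record that 'v' is really the same variable as 'primary'. */
static void var_merge(struct variable *primary, struct variable *v)
{
	assert(primary != NULL && v != NULL);
	v->merged = primary->merged;
}

/* All reads and writes go through the primary instance. */
static int *var_slot(struct variable *v)
{
	return &v->merged->value;
}
```

Once two instances are merged, storing through either one is visible through the other, because both resolve to the same slot.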
2558 ###### exec type
2559         Xvar,
2560
2561 ###### ast
2562         struct var {
2563                 struct exec;
2564                 struct variable *var;
2565         };
2566
2567 ###### Grammar
2568
2569         $TERM : ::
2570
2571         $*var
2572         VariableDecl -> IDENTIFIER : ${ {
2573                 struct variable *v = var_decl(c, $1.txt);
2574                 $0 = new_pos(var, $1);
2575                 $0->var = v;
2576                 if (v)
2577                         v->where_decl = $0;
2578                 else {
2579                         v = var_ref(c, $1.txt);
2580                         $0->var = v;
2581                         type_err(c, "error: variable '%v' redeclared",
2582                                  $0, NULL, 0, NULL);
2583                         type_err(c, "info: this is where '%v' was first declared",
2584                                  v->where_decl, NULL, 0, NULL);
2585                 }
2586         } }$
2587             | IDENTIFIER :: ${ {
2588                 struct variable *v = var_decl(c, $1.txt);
2589                 $0 = new_pos(var, $1);
2590                 $0->var = v;
2591                 if (v) {
2592                         v->where_decl = $0;
2593                         v->constant = 1;
2594                 } else {
2595                         v = var_ref(c, $1.txt);
2596                         $0->var = v;
2597                         type_err(c, "error: variable '%v' redeclared",
2598                                  $0, NULL, 0, NULL);
2599                         type_err(c, "info: this is where '%v' was first declared",
2600                                  v->where_decl, NULL, 0, NULL);
2601                 }
2602         } }$
2603             | IDENTIFIER : Type ${ {
2604                 struct variable *v = var_decl(c, $1.txt);
2605                 $0 = new_pos(var, $1);
2606                 $0->var = v;
2607                 if (v) {
2608                         v->where_decl = $0;
2609                         v->where_set = $0;
2610                         v->type = $<Type;
2611                 } else {
2612                         v = var_ref(c, $1.txt);
2613                         $0->var = v;
2614                         type_err(c, "error: variable '%v' redeclared",
2615                                  $0, NULL, 0, NULL);
2616                         type_err(c, "info: this is where '%v' was first declared",
2617                                  v->where_decl, NULL, 0, NULL);
2618                 }
2619         } }$
2620             | IDENTIFIER :: Type ${ {
2621                 struct variable *v = var_decl(c, $1.txt);
2622                 $0 = new_pos(var, $1);
2623                 $0->var = v;
2624                 if (v) {
2625                         v->where_decl = $0;
2626                         v->where_set = $0;
2627                         v->type = $<Type;
2628                         v->constant = 1;
2629                 } else {
2630                         v = var_ref(c, $1.txt);
2631                         $0->var = v;
2632                         type_err(c, "error: variable '%v' redeclared",
2633                                  $0, NULL, 0, NULL);
2634                         type_err(c, "info: this is where '%v' was first declared",
2635                                  v->where_decl, NULL, 0, NULL);
2636                 }
2637         } }$
2638
2639         $*exec
2640         Variable -> IDENTIFIER ${ {
2641                 struct variable *v = var_ref(c, $1.txt);
2642                 $0 = new_pos(var, $1);
2643                 if (v == NULL) {
2644                         /* This might be a label - allocate a var just in case */
2645                         v = var_decl(c, $1.txt);
2646                         if (v) {
2647                                 v->type = Tnone;
2648                                 v->where_decl = $0;
2649                                 v->where_set = $0;
2650                         }
2651                 }
2652                 cast(var, $0)->var = v;
2653         } }$
2654         ## variable grammar
2655
2656 ###### print exec cases
2657         case Xvar:
2658         {
2659                 struct var *v = cast(var, e);
2660                 if (v->var) {
2661                         struct binding *b = v->var->name;
2662                         printf("%.*s", b->name.len, b->name.txt);
2663                 }
2664                 break;
2665         }
2666
2667 ###### format cases
2668         case 'v':
2669                 if (loc && loc->type == Xvar) {
2670                         struct var *v = cast(var, loc);
2671                         if (v->var) {
2672                                 struct binding *b = v->var->name;
2673                                 fprintf(stderr, "%.*s", b->name.len, b->name.txt);
2674                         } else
2675                                 fputs("???", stderr);   // NOTEST
2676                 } else
2677                         fputs("NOTVAR", stderr);        // NOTEST
2678                 break;
2679
2680 ###### propagate exec cases
2681
2682         case Xvar:
2683         {
2684                 struct var *var = cast(var, prog);
2685                 struct variable *v = var->var;
2686                 if (!v) {
2687                         type_err(c, "%d:BUG: no variable!!", prog, NULL, 0, NULL); // NOTEST
2688                         return Tnone;                                   // NOTEST
2689                 }
2690                 v = v->merged;
2691                 if (v->constant && (rules & Rnoconstant)) {
2692                         type_err(c, "error: Cannot assign to a constant: %v",
2693                                  prog, NULL, 0, NULL);
2694                         type_err(c, "info: name was defined as a constant here",
2695                                  v->where_decl, NULL, 0, NULL);
2696                         return v->type;
2697                 }
2698                 if (v->type == Tnone && v->where_decl == prog)
2699                         type_err(c, "error: variable used but not declared: %v",
2700                                  prog, NULL, 0, NULL);
2701                 if (v->type == NULL) {
2702                         if (type && *ok != 0) {
2703                                 v->type = type;
2704                                 v->where_set = prog;
2705                                 *ok = 2;
2706                         }
2707                         return type;
2708                 }
2709                 if (!type_compat(type, v->type, rules)) {
2710                         type_err(c, "error: expected %1%r but variable '%v' is %2", prog,
2711                                  type, rules, v->type);
2712                         type_err(c, "info: this is where '%v' was set to %1", v->where_set,
2713                                  v->type, rules, NULL);
2714                 }
2715                 if (!type)
2716                         return v->type;
2717                 return type;
2718         }
2719
2720 ###### interp exec cases
2721         case Xvar:
2722         {
2723                 struct var *var = cast(var, e);
2724                 struct variable *v = var->var;
2725
2726                 v = v->merged;
2727                 lrv = var_value(c, v);
2728                 rvtype = v->type;
2729                 break;
2730         }
2731
2732 ###### ast functions
2733
2734         static void free_var(struct var *v)
2735         {
2736                 free(v);
2737         }
2738
2739 ###### free exec cases
2740         case Xvar: free_var(cast(var, e)); break;
2741
2742 ### Expressions: Conditional
2743
2744 Our first user of the `binode` will be conditional expressions, which
2745 is a bit odd as they actually have three components.  That will be
2746 handled by having 2 binodes for each expression.  The conditional
2747 expression is the lowest precedence operator which is why we define it
2748 first - to start the precedence list.
2749
Conditional expressions are of the form "value `if` condition `else`
other_value".  They associate to the right, so everything to the right
of `else` is part of the else value, while only a higher-precedence
expression to the left of `if` is the if value.  Between `if` and
`else` there is no room for ambiguity, so a full conditional expression
is allowed in there.
2756
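The two-binode arrangement can be sketched without the rest of the interpreter.  Below, `struct node`, `make_cond()` and `eval()` are hypothetical and integers stand in for values.  The outer node holds the condition on its left and the inner node on its right, and the inner node holds the `if` value and the `else` value.

##### Example: conditional expression sketch

```c
#include <assert.h>
#include <stddef.h>

enum op { Const, CondExpr };

/* Hypothetical miniature of an expression tree node. */
struct node {
	enum op op;
	struct node *left, *right;
	int value;		/* used by Const nodes only */
};

/* Build "then_val if cond else else_val" from two nodes:
 * outer->left is the condition and outer->right is the inner node;
 * inner->left is the 'if' value and inner->right the 'else' value. */
static void make_cond(struct node *outer, struct node *inner,
		      struct node *then_val, struct node *cond,
		      struct node *else_val)
{
	outer->op = CondExpr;
	outer->left = cond;
	outer->right = inner;
	inner->op = CondExpr;
	inner->left = then_val;
	inner->right = else_val;
}

/* Evaluate a tree; only the selected branch is ever evaluated. */
static int eval(struct node *n)
{
	assert(n != NULL);
	if (n->op == CondExpr) {
		struct node *inner = n->right;

		return eval(n->left) ? eval(inner->left)
				     : eval(inner->right);
	}
	return n->value;
}
```

Because the else value is just another node, a right-nested conditional falls out naturally: the inner node's `right` can itself be the outer node of a further conditional.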
2757 ###### Binode types
2758         CondExpr,
2759
2760 ###### Grammar
2761
2762         $LEFT if $$ifelse
2763         ## expr precedence
2764
2765         $*exec
2766         Expression -> Expression if Expression else Expression $$ifelse ${ {
2767                         struct binode *b1 = new(binode);
2768                         struct binode *b2 = new(binode);
2769                         b1->op = CondExpr;
2770                         b1->left = $<3;
2771                         b1->right = b2;
2772                         b2->op = CondExpr;
2773                         b2->left = $<1;
2774                         b2->right = $<5;
2775                         $0 = b1;
2776                 } }$
2777                 ## expression grammar
2778
2779 ###### print binode cases
2780
2781         case CondExpr:
2782                 b2 = cast(binode, b->right);
2783                 if (bracket) printf("(");
2784                 print_exec(b2->left, -1, bracket);
2785                 printf(" if ");
2786                 print_exec(b->left, -1, bracket);
2787                 printf(" else ");
2788                 print_exec(b2->right, -1, bracket);
2789                 if (bracket) printf(")");
2790                 break;
2791
2792 ###### propagate binode cases
2793
2794         case CondExpr: {
2795                 /* cond must be Tbool, others must match */
2796                 struct binode *b2 = cast(binode, b->right);
2797                 struct type *t2;
2798
2799                 propagate_types(b->left, c, ok, Tbool, 0);
2800                 t = propagate_types(b2->left, c, ok, type, Rnolabel);
2801                 t2 = propagate_types(b2->right, c, ok, type ?: t, Rnolabel);
2802                 return t ?: t2;
2803         }
2804
2805 ###### interp binode cases
2806
2807         case CondExpr: {
2808                 struct binode *b2 = cast(binode, b->right);
2809                 left = interp_exec(c, b->left, &ltype);
2810                 if (left.bool)
2811                         rv = interp_exec(c, b2->left, &rvtype); // UNTESTED
2812                 else
2813                         rv = interp_exec(c, b2->right, &rvtype);
2814                 }
2815                 break;
2816
2817 ### Expression list
2818
2819 We take a brief detour, now that we have expressions, to describe lists
2820 of expressions.  These will be needed for function parameters and
2821 possibly other situations.  They seem generic enough to introduce here
2822 to be used elsewhere.
2823
An ExpressionList will use the `List` type of `binode`, building up at
the end.  Any place where they are used will probably call
`reorder_bilist()` to get a more normal first/next arrangement.
2827
2828 ###### declare terminals
2829         $TERM ,
2830
`List` execs have no implicit semantics, so they are never propagated or
interpreted.  They can be printed as a comma separated list, which is how
they are parsed.  Note they are also used for function formal parameter
lists.  In that case a separate function is used to print them.
2835
2836 ###### print binode cases
2837         case List:
2838                 while (b) {
2839                         printf(" ");
2840                         print_exec(b->left, -1, bracket);
2841                         if (b->right)
2842                                 printf(",");
2843                         b = cast(binode, b->right);
2844                 }
2845                 break;
2846
2847 ###### propagate binode cases
2848         case List: abort(); // NOTEST
2849 ###### interp binode cases
2850         case List: abort(); // NOTEST
2851
2852 ###### Grammar
2853
2854         $*binode
2855         ExpressionList -> ExpressionList , Expression ${
2856                         $0 = new(binode);
2857                         $0->op = List;
2858                         $0->left = $<1;
2859                         $0->right = $<3;
2860                 }$
2861                 | Expression ${
2862                         $0 = new(binode);
2863                         $0->op = List;
2864                         $0->left = NULL;
2865                         $0->right = $<1;
2866                 }$
2867
2868 ### Expressions: Boolean
2869
The next class of expressions to use the `binode` will be Boolean
expressions.  "`and then`" and "`or else`" are similar to `and` and `or`
and have the same corresponding precedence.  The difference is that they
don't evaluate the second expression if not necessary.
2874
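The difference between the plain and the short-circuit forms is easiest to see by counting evaluations.  In this sketch `struct operand`, `eval_operand()` and `eval_bool()` are hypothetical.  An operand records how many times it has been evaluated, so the effect of `and then` and `or else` skipping the right-hand side can be observed.

##### Example: short-circuit sketch

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum bop { And, AndThen, Or, OrElse };

/* Hypothetical operand: a boolean that counts its evaluations. */
struct operand {
	bool value;
	int evaluated;
};

static bool eval_operand(struct operand *o)
{
	assert(o != NULL);
	o->evaluated += 1;
	return o->value;
}

/* The left side is always evaluated; the right side is evaluated
 * unconditionally only for And and Or. */
static bool eval_bool(enum bop op, struct operand *l, struct operand *r)
{
	bool lv = eval_operand(l);
	bool rv;

	switch (op) {
	case And:
		rv = eval_operand(r);
		return lv && rv;
	case Or:
		rv = eval_operand(r);
		return lv || rv;
	case AndThen:
		return lv ? eval_operand(r) : false;
	case OrElse:
		return lv ? true : eval_operand(r);
	}
	return false;
}
```

With a false left operand, `And` still evaluates the right operand once while `AndThen` leaves it untouched; `Or` and `OrElse` behave the same way when the left operand is true.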
2875 ###### Binode types
2876         And,
2877         AndThen,
2878         Or,
2879         OrElse,
2880         Not,
2881
2882 ###### expr precedence
2883         $LEFT or
2884         $LEFT and
2885         $LEFT not
2886
2887 ###### expression grammar
2888                 | Expression or Expression ${ {
2889                         struct binode *b = new(binode);
2890                         b->op = Or;
2891                         b->left = $<1;
2892                         b->right = $<3;
2893                         $0 = b;
2894                 } }$
2895                 | Expression or else Expression ${ {
2896                         struct binode *b = new(binode);
2897                         b->op = OrElse;
2898                         b->left = $<1;
2899                         b->right = $<4;
2900                         $0 = b;
2901                 } }$
2902
2903                 | Expression and Expression ${ {
2904                         struct binode *b = new(binode);
2905                         b->op = And;
2906                         b->left = $<1;
2907                         b->right = $<3;
2908                         $0 = b;
2909                 } }$
2910                 | Expression and then Expression ${ {
2911                         struct binode *b = new(binode);
2912                         b->op = AndThen;
2913                         b->left = $<1;
2914                         b->right = $<4;
2915                         $0 = b;
2916                 } }$
2917
2918                 | not Expression ${ {
2919                         struct binode *b = new(binode);
2920                         b->op = Not;
2921                         b->right = $<2;
2922                         $0 = b;
2923                 } }$
2924
2925 ###### print binode cases
2926         case And:
2927                 if (bracket) printf("(");
2928                 print_exec(b->left, -1, bracket);
2929                 printf(" and ");
2930                 print_exec(b->right, -1, bracket);
2931                 if (bracket) printf(")");
2932                 break;
2933         case AndThen:
2934                 if (bracket) printf("(");
2935                 print_exec(b->left, -1, bracket);
2936                 printf(" and then ");
2937                 print_exec(b->right, -1, bracket);
2938                 if (bracket) printf(")");
2939                 break;
2940         case Or:
2941                 if (bracket) printf("(");
2942                 print_exec(b->left, -1, bracket);
2943                 printf(" or ");
2944                 print_exec(b->right, -1, bracket);
2945                 if (bracket) printf(")");
2946                 break;
2947         case OrElse:
2948                 if (bracket) printf("(");
2949                 print_exec(b->left, -1, bracket);
2950                 printf(" or else ");
2951                 print_exec(b->right, -1, bracket);
2952                 if (bracket) printf(")");
2953                 break;
2954         case Not:
2955                 if (bracket) printf("(");
2956                 printf("not ");
2957                 print_exec(b->right, -1, bracket);
2958                 if (bracket) printf(")");
2959                 break;
2960
2961 ###### propagate binode cases
2962         case And:
2963         case AndThen:
2964         case Or:
2965         case OrElse:
2966         case Not:
2967                 /* both must be Tbool, result is Tbool */
2968                 propagate_types(b->left, c, ok, Tbool, 0);
2969                 propagate_types(b->right, c, ok, Tbool, 0);
2970                 if (type && type != Tbool)
2971                         type_err(c, "error: %1 operation found where %2 expected", prog,
2972                                    Tbool, 0, type);
2973                 return Tbool;
2974
2975 ###### interp binode cases
2976         case And:
2977                 rv = interp_exec(c, b->left, &rvtype);
2978                 right = interp_exec(c, b->right, &rtype);
2979                 rv.bool = rv.bool && right.bool;
2980                 break;
2981         case AndThen:
2982                 rv = interp_exec(c, b->left, &rvtype);
2983                 if (rv.bool)
2984                         rv = interp_exec(c, b->right, NULL);
2985                 break;
2986         case Or:
2987                 rv = interp_exec(c, b->left, &rvtype);
2988                 right = interp_exec(c, b->right, &rtype);
2989                 rv.bool = rv.bool || right.bool;
2990                 break;
2991         case OrElse:
2992                 rv = interp_exec(c, b->left, &rvtype);
2993                 if (!rv.bool)
2994                         rv = interp_exec(c, b->right, NULL);
2995                 break;
2996         case Not:
2997                 rv = interp_exec(c, b->right, &rvtype);
2998                 rv.bool = !rv.bool;
2999                 break;
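
The difference between `and` and `and then` (and likewise `or` and
`or else`) is only in whether the right operand is always evaluated.
The following is a minimal stand-alone sketch of that difference, not
the interpreter's code: `operand`, `yes`, `no` and the `op_*` functions
are hypothetical names invented for the illustration, and a counter
stands in for any side effect of evaluation.

###### Example: short-circuit evaluation

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical operand: bumps *evals each time it is evaluated. */
typedef bool (*operand)(int *evals);

static bool yes(int *evals) { (*evals)++; return true; }
static bool no(int *evals)  { (*evals)++; return false; }

/* `and`: both sides are always evaluated. */
static bool op_and(operand l, operand r, int *evals)
{
	bool lv = l(evals);
	bool rv = r(evals);

	return lv && rv;
}

/* `and then`: the right side is evaluated only if the left is true. */
static bool op_and_then(operand l, operand r, int *evals)
{
	return l(evals) && r(evals);
}

/* `or else`: the right side is evaluated only if the left is false. */
static bool op_or_else(operand l, operand r, int *evals)
{
	return l(evals) || r(evals);
}
```

With a false left operand, `op_and` counts two evaluations while
`op_and_then` counts only one.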
3000
3001 ### Expressions: Comparison
3002
3003 Of slightly higher precedence than Boolean expressions are Comparisons.
3004 A comparison takes arguments of any comparable type, but the two types
3005 must be the same.
3006
3007 To simplify the parsing we introduce an `eop` which can record an
3008 expression operator, and the `CMPop` non-terminal will match one of them.
3009
3010 ###### ast
3011         struct eop {
3012                 enum Btype op;
3013         };
3014
3015 ###### ast functions
3016         static void free_eop(struct eop *e)
3017         {
3018                 if (e)
3019                         free(e);
3020         }
3021
3022 ###### Binode types
3023         Less,
3024         Gtr,
3025         LessEq,
3026         GtrEq,
3027         Eql,
3028         NEql,
3029
3030 ###### expr precedence
3031         $LEFT < > <= >= == != CMPop
3032
3033 ###### expression grammar
3034         | Expression CMPop Expression ${ {
3035                 struct binode *b = new(binode);
3036                 b->op = $2.op;
3037                 b->left = $<1;
3038                 b->right = $<3;
3039                 $0 = b;
3040         } }$
3041
3042 ###### Grammar
3043
3044         $eop
3045         CMPop ->   < ${ $0.op = Less; }$
3046                 |  > ${ $0.op = Gtr; }$
3047                 |  <= ${ $0.op = LessEq; }$
3048                 |  >= ${ $0.op = GtrEq; }$
3049                 |  == ${ $0.op = Eql; }$
3050                 |  != ${ $0.op = NEql; }$
3051
3052 ###### print binode cases
3053
3054         case Less:
3055         case LessEq:
3056         case Gtr:
3057         case GtrEq:
3058         case Eql:
3059         case NEql:
3060                 if (bracket) printf("(");
3061                 print_exec(b->left, -1, bracket);
3062                 switch(b->op) {
3063                 case Less:   printf(" < "); break;
3064                 case LessEq: printf(" <= "); break;
3065                 case Gtr:    printf(" > "); break;
3066                 case GtrEq:  printf(" >= "); break;
3067                 case Eql:    printf(" == "); break;
3068                 case NEql:   printf(" != "); break;
3069                 default: abort();               // NOTEST
3070                 }
3071                 print_exec(b->right, -1, bracket);
3072                 if (bracket) printf(")");
3073                 break;
3074
3075 ###### propagate binode cases
3076         case Less:
3077         case LessEq:
3078         case Gtr:
3079         case GtrEq:
3080         case Eql:
3081         case NEql:
3082                 /* Both must match but not be labels, result is Tbool */
3083                 t = propagate_types(b->left, c, ok, NULL, Rnolabel);
3084                 if (t)
3085                         propagate_types(b->right, c, ok, t, 0);
3086                 else {
3087                         t = propagate_types(b->right, c, ok, NULL, Rnolabel);   // UNTESTED
3088                         if (t)  // UNTESTED
3089                                 t = propagate_types(b->left, c, ok, t, 0);      // UNTESTED
3090                 }
3091                 if (!type_compat(type, Tbool, 0))
3092                         type_err(c, "error: Comparison returns %1 but %2 expected", prog,
3093                                     Tbool, rules, type);
3094                 return Tbool;
3095
3096 ###### interp binode cases
3097         case Less:
3098         case LessEq:
3099         case Gtr:
3100         case GtrEq:
3101         case Eql:
3102         case NEql:
3103         {
3104                 int cmp;
3105                 left = interp_exec(c, b->left, &ltype);
3106                 right = interp_exec(c, b->right, &rtype);
3107                 cmp = value_cmp(ltype, rtype, &left, &right);
3108                 rvtype = Tbool;
3109                 switch (b->op) {
3110                 case Less:      rv.bool = cmp <  0; break;
3111                 case LessEq:    rv.bool = cmp <= 0; break;
3112                 case Gtr:       rv.bool = cmp >  0; break;
3113                 case GtrEq:     rv.bool = cmp >= 0; break;
3114                 case Eql:       rv.bool = cmp == 0; break;
3115                 case NEql:      rv.bool = cmp != 0; break;
3116                 default:        rv.bool = 0; break;     // NOTEST
3117                 }
3118                 break;
3119         }
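
All six comparison operators are reduced to one three-way comparison
followed by a test of its sign.  A minimal stand-alone sketch of that
idea follows.  The names `enum cmp_op`, `cmp_result` and
`compare_strings` are assumptions made for the example, and `strcmp`
stands in for `value_cmp` so that the sketch compiles by itself.

###### Example: three-way comparison

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical operator names, mirroring the Binode types above. */
enum cmp_op { OP_LESS, OP_GTR, OP_LESS_EQ, OP_GTR_EQ, OP_EQL, OP_NEQL };

/* Turn a three-way result (<0, 0, >0) into the boolean that
 * the given operator should produce. */
static bool cmp_result(enum cmp_op op, int cmp)
{
	switch (op) {
	case OP_LESS:    return cmp <  0;
	case OP_LESS_EQ: return cmp <= 0;
	case OP_GTR:     return cmp >  0;
	case OP_GTR_EQ:  return cmp >= 0;
	case OP_EQL:     return cmp == 0;
	case OP_NEQL:    return cmp != 0;
	}
	return false;
}

/* Compare two C strings; strcmp() stands in for value_cmp(). */
static bool compare_strings(enum cmp_op op, const char *a, const char *b)
{
	return cmp_result(op, strcmp(a, b));
}
```

Supporting a new comparable type then only requires a three-way
comparison for that type; the operator table is unchanged.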
3120
3121 ### Expressions: The rest
3122
3123 The remaining expressions with the highest precedence are arithmetic,
3124 string concatenation, and string conversion.  String concatenation
3125 (`++`) has the same precedence as multiplication and division, but lower
3126 than the unary operators.
3127
3128 String conversion is a temporary feature until I get a better type
3129 system.  `$` is a prefix operator which expects a string and returns
3130 a number.
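
As a rough illustration of what such a conversion involves, here is a
stand-alone sketch.  It is not the interpreter's code: the interpreter
uses `number_parse` with arbitrary-precision rationals, whereas this
sketch uses `strtol` purely to stay self-contained, and the name
`string_to_num` is hypothetical.  It strips an optional leading `-`,
parses the digits, and treats any trailing text as an unsupported
suffix.  Overflow is not handled.

###### Example: string to number

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: convert a decimal string to a long.
 * Returns true on success and stores the result in *out.
 * Any trailing characters count as an unsupported suffix. */
static bool string_to_num(const char *s, long *out)
{
	char *end;
	bool neg = false;
	long v;

	if (*s == '-') {	/* strip the sign before parsing */
		neg = true;
		s++;
	}
	if (*s < '0' || *s > '9')
		return false;	/* no digits at all */
	v = strtol(s, &end, 10);
	if (*end != '\0')
		return false;	/* unsupported suffix */
	*out = neg ? -v : v;
	return true;
}
```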
3131
3132 `+` and `-` are both infix and prefix operators (as prefix operators they are
3133 absolute value and negation).  The two forms have different operator names.
3134
3135 We also have a 'Bracket' operator which records where parentheses were
3136 found.  This makes it easy to reproduce these when printing.  Possibly I
3137 should only insert brackets where needed for precedence.
3138
3139 ###### Binode types
3140         Plus, Minus,
3141         Times, Divide, Rem,
3142         Concat,
3143         Absolute, Negate,
3144         StringConv,
3145         Bracket,
3146
3147 ###### expr precedence
3148         $LEFT + - Eop
3149         $LEFT * / % ++ Top
3150         $LEFT Uop $
3151         $TERM ( )
3152
3153 ###### expression grammar
3154                 | Expression Eop Expression ${ {
3155                         struct binode *b = new(binode);
3156                         b->op = $2.op;
3157                         b->left = $<1;
3158                         b->right = $<3;
3159                         $0 = b;
3160                 } }$
3161
3162                 | Expression Top Expression ${ {
3163                         struct binode *b = new(binode);
3164                         b->op = $2.op;
3165                         b->left = $<1;
3166                         b->right = $<3;
3167                         $0 = b;
3168                 } }$
3169
3170                 | ( Expression ) ${ {
3171                         struct binode *b = new_pos(binode, $1);
3172                         b->op = Bracket;
3173                         b->right = $<2;
3174                         $0 = b;
3175                 } }$
3176                 | Uop Expression ${ {
3177                         struct binode *b = new(binode);
3178                         b->op = $1.op;
3179                         b->right = $<2;
3180                         $0 = b;
3181                 } }$
3182                 | Value ${ $0 = $<1; }$
3183                 | Variable ${ $0 = $<1; }$
3184
3185         $eop
3186         Eop ->    + ${ $0.op = Plus; }$
3187                 | - ${ $0.op = Minus; }$
3188
3189         Uop ->    + ${ $0.op = Absolute; }$
3190                 | - ${ $0.op = Negate; }$
3191                 | $ ${ $0.op = StringConv; }$
3192
3193         Top ->    * ${ $0.op = Times; }$
3194                 | / ${ $0.op = Divide; }$
3195                 | % ${ $0.op = Rem; }$
3196                 | ++ ${ $0.op = Concat; }$
3197
3198 ###### print binode cases
3199         case Plus:
3200         case Minus:
3201         case Times:
3202         case Divide:
3203         case Concat:
3204         case Rem:
3205                 if (bracket) printf("(");
3206                 print_exec(b->left, indent, bracket);
3207                 switch(b->op) {
3208                 case Plus:   fputs(" + ", stdout); break;
3209                 case Minus:  fputs(" - ", stdout); break;
3210                 case Times:  fputs(" * ", stdout); break;
3211                 case Divide: fputs(" / ", stdout); break;
3212                 case Rem:    fputs(" % ", stdout); break;
3213                 case Concat: fputs(" ++ ", stdout); break;
3214                 default: abort();       // NOTEST
3215                 }                       // NOTEST
3216                 print_exec(b->right, indent, bracket);
3217                 if (bracket) printf(")");
3218                 break;
3219         case Absolute:
3220         case Negate:
3221         case StringConv:
3222                 if (bracket) printf("(");
3223                 switch (b->op) {
3224                 case Absolute:   fputs("+", stdout); break;
3225                 case Negate:     fputs("-", stdout); break;
3226                 case StringConv: fputs("$", stdout); break;
3227                 default: abort();       // NOTEST
3228                 }                       // NOTEST
3229                 print_exec(b->right, indent, bracket);
3230                 if (bracket) printf(")");
3231                 break;
3232         case Bracket:
3233                 printf("(");
3234                 print_exec(b->right, indent, bracket);
3235                 printf(")");
3236                 break;
3237
3238 ###### propagate binode cases
3239         case Plus:
3240         case Minus:
3241         case Times:
3242         case Rem:
3243         case Divide:
3244                 /* both must be numbers, result is Tnum */
3245         case Absolute:
3246         case Negate:
3247                 /* as propagate_types ignores a NULL,
3248                  * unary ops fit here too */
3249                 propagate_types(b->left, c, ok, Tnum, 0);
3250                 propagate_types(b->right, c, ok, Tnum, 0);
3251                 if (!type_compat(type, Tnum, 0))
3252                         type_err(c, "error: Arithmetic returns %1 but %2 expected", prog,
3253                                    Tnum, rules, type);
3254                 return Tnum;
3255
3256         case Concat:
3257                 /* both must be Tstr, result is Tstr */
3258                 propagate_types(b->left, c, ok, Tstr, 0);
3259                 propagate_types(b->right, c, ok, Tstr, 0);
3260                 if (!type_compat(type, Tstr, 0))
3261                         type_err(c, "error: Concat returns %1 but %2 expected", prog,
3262                                    Tstr, rules, type);
3263                 return Tstr;
3264
3265         case StringConv:
3266                 /* op must be string, result is number */
3267                 propagate_types(b->right, c, ok, Tstr, 0);
3268                 if (!type_compat(type, Tnum, 0))
3269                         type_err(c,     // UNTESTED
3270                           "error: Can only convert string to number, not %1",
3271                                 prog, type, 0, NULL);
3272                 return Tnum;
3273
3274         case Bracket:
3275                 return propagate_types(b->right, c, ok, type, 0);
3276
3277 ###### interp binode cases
3278
3279         case Plus:
3280                 rv = interp_exec(c, b->left, &rvtype);
3281                 right = interp_exec(c, b->right, &rtype);
3282                 mpq_add(rv.num, rv.num, right.num);
3283                 break;
3284         case Minus:
3285                 rv = interp_exec(c, b->left, &rvtype);
3286                 right = interp_exec(c, b->right, &rtype);
3287                 mpq_sub(rv.num, rv.num, right.num);
3288                 break;
3289         case Times:
3290                 rv = interp_exec(c, b->left, &rvtype);
3291                 right = interp_exec(c, b->right, &rtype);
3292                 mpq_mul(rv.num, rv.num, right.num);
3293                 break;
3294         case Divide:
3295                 rv = interp_exec(c, b->left, &rvtype);
3296                 right = interp_exec(c, b->right, &rtype);
3297                 mpq_div(rv.num, rv.num, right.num);
3298                 break;
3299         case Rem: {
3300                 mpz_t l, r, rem;
3301
3302                 left = interp_exec(c, b->left, &ltype);
3303                 right = interp_exec(c, b->right, &rtype);
3304                 mpz_init(l); mpz_init(r); mpz_init(rem);
3305                 mpz_tdiv_q(l, mpq_numref(left.num), mpq_denref(left.num));
3306                 mpz_tdiv_q(r, mpq_numref(right.num), mpq_denref(right.num));
3307                 mpz_tdiv_r(rem, l, r);
3308                 val_init(Tnum, &rv);
3309                 mpq_set_z(rv.num, rem);
3310                 mpz_clear(r); mpz_clear(l); mpz_clear(rem);
3311                 rvtype = ltype;
3312                 break;
3313         }
3314         case Negate:
3315                 rv = interp_exec(c, b->right, &rvtype);
3316                 mpq_neg(rv.num, rv.num);
3317                 break;
3318         case Absolute:
3319                 rv = interp_exec(c, b->right, &rvtype);
3320                 mpq_abs(rv.num, rv.num);
3321                 break;
3322         case Bracket:
3323                 rv = interp_exec(c, b->right, &rvtype);
3324                 break;
3325         case Concat:
3326                 left = interp_exec(c, b->left, &ltype);
3327                 right = interp_exec(c, b->right, &rtype);
3328                 rvtype = Tstr;
3329                 rv.str = text_join(left.str, right.str);
3330                 break;
3331         case StringConv:
3332                 right = interp_exec(c, b->right, &rvtype);
3333                 rtype = Tstr;
3334                 rvtype = Tnum;
3335
3336                 struct text tx = right.str;
3337                 char tail[3];
3338                 int neg = 0;
3339                 if (tx.txt[0] == '-') {
3340                         neg = 1;        // UNTESTED
3341                         tx.txt++;       // UNTESTED
3342                         tx.len--;       // UNTESTED
3343                 }
3344                 if (number_parse(rv.num, tail, tx) == 0)
3345                         mpq_init(rv.num);       // UNTESTED
3346                 else if (neg)
3347                         mpq_neg(rv.num, rv.num);        // UNTESTED
3348                 if (tail[0])
3349                         printf("Unsupported suffix: %.*s\n", tx.len, tx.txt);   // UNTESTED
3350
3351                 break;
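
The `Rem` case above truncates both operands to integers (rounding
toward zero) and then takes a truncating remainder, so the result has
the sign of the left operand.  The following sketch shows the same
arithmetic on ordinary C types.  It is only an illustration: the
function name `rem_trunc` is hypothetical, `double` and `long` stand in
for the GMP types, and division by zero is simply asserted away.

###### Example: truncating remainder

```c
#include <assert.h>

/* Remainder in the style of the Rem case: truncate both operands
 * toward zero, then take the truncating remainder. */
static long rem_trunc(double left, double right)
{
	long l = (long)left;	/* conversion truncates toward zero */
	long r = (long)right;

	assert(r != 0);		/* the sketch does not handle division by zero */
	return l % r;		/* C99 '%' takes the sign of the dividend */
}
```

So `rem_trunc(7.5, 2.0)` is 1 and `rem_trunc(-7.5, 2.0)` is -1.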
3352
3353 ###### value functions
3354
3355         static struct text text_join(struct text a, struct text b)
3356         {
3357                 struct text rv;
3358                 rv.len = a.len + b.len;
3359                 rv.txt = malloc(rv.len);
3360                 memcpy(rv.txt, a.txt, a.len);
3361                 memcpy(rv.txt+a.len, b.txt, b.len);
3362                 return rv;
3363         }
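
Note that the joined text is length-counted and not NUL-terminated, and
that it is newly allocated so something must eventually free it.  The
sketch below shows this.  It repeats the joining logic under the name
`join` so that it compiles alone; the layout of `struct text` shown
here is an assumption (the real definition lives elsewhere in the
project) and `text_of` is a hypothetical helper.

###### Example: joining length-counted strings

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed layout of a length-counted string. */
struct text {
	char *txt;
	int len;
};

/* Same logic as text_join() above, repeated so the sketch compiles alone. */
static struct text join(struct text a, struct text b)
{
	struct text rv;

	rv.len = a.len + b.len;
	rv.txt = malloc(rv.len);
	assert(rv.txt != NULL);
	memcpy(rv.txt, a.txt, a.len);
	memcpy(rv.txt + a.len, b.txt, b.len);
	return rv;
}

/* Hypothetical helper: wrap a C string without copying it. */
static struct text text_of(const char *s)
{
	struct text t = { (char *)s, (int)strlen(s) };

	return t;
}
```

A result `j` can be printed with `printf("%.*s", j.len, j.txt)` and is
released with `free(j.txt)`.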
3364
3365 ### Blocks, Statements, and Statement lists.
3366
3367 Now that we have expressions out of the way we need to turn to
3368 statements.  There are simple statements and more complex statements.
3369 Simple statements do not contain (syntactic) newlines; complex statements do.
3370
3371 Statements often come in sequences and we have corresponding simple
3372 statement lists and complex statement lists.
3373 The former comprise only simple statements separated by semicolons.
3374 The latter comprise complex statements and simple statement lists.  They are
3375 separated by newlines.  Thus the semicolon is only used to separate
3376 simple statements on the one line.  This may be overly restrictive,
3377 but I'm not sure I ever want a complex statement to share a line with
3378 anything else.
3379
3380 Note that a simple statement list can still use multiple lines if
3381 subsequent lines are indented, so
3382
3383 ###### Example: wrapped simple statement list
3384
3385         a = b; c = d;
3386            e = f; print g
3387
3388 is a single simple statement list.  This might allow room for
3389 confusion, so I'm not set on it yet.
3390
3391 A simple statement list needs no extra syntax.  A complex statement
3392 list has two syntactic forms.  It can be enclosed in braces (much like
3393 C blocks), or it can be introduced by an indent and continue until an
3394 unindented newline (much like Python blocks).  With this extra syntax
3395 it is referred to as a block.
3396
3397 Note that a block does not have to include any newlines if it only
3398 contains simple statements.  So both of:
3399
3400         if condition: a=b; d=f
3401
3402         if condition { a=b; print f }
3403
3404 are valid.
3405
3406 In either case the list is constructed from a `binode` list with
3407 `Block` as the operator.  When parsing the list it is most convenient
3408 to append to the end, so a list is a list and a statement.  When using
3409 the list it is more convenient to consider a list to be a statement
3410 and a list.  So we need a function to re-order a list.
3411 `reorder_bilist` serves this purpose.
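
To make the re-ordering concrete, here is a sketch on a simplified
node type.  It is not the real `reorder_bilist`, which works on
`binode` and is defined elsewhere; `struct node` and `reorder` are
names assumed for the example.  The parser builds
`((NULL, s1), s2), s3` with each statement on the right, and the loop
turns that into `s1, (s2, (s3, NULL))` with each statement on the left.

###### Example: re-ordering a list

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a binode. */
struct node {
	struct node *left, *right;
	const char *name;	/* set only on leaf (statement) nodes */
};

/* Convert "list and a statement" into "statement and a list",
 * re-using the same nodes. */
static struct node *reorder(struct node *list)
{
	struct node *rv = NULL;

	while (list) {
		struct node *stmt = list->right;
		struct node *prev = list->left;

		list->left = stmt;	/* statement moves to the left */
		list->right = rv;	/* rest of the list hangs on the right */
		rv = list;
		list = prev;
	}
	return rv;
}
```

No nodes are allocated or freed; only the `left` and `right` pointers
are swapped around, so the cost is one pass over the list.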
3412
3413 The only stand-alone statement we introduce at this stage is `pass`
3414 which does nothing and is represented as a `NULL` pointer in a `Block`
3415 list.  Other stand-alone statements will follow once the infrastructure
3416 is in place.
3417
3418 ###### Binode types
3419         Block,
3420
3421 ###### Grammar
3422
3423         $TERM { } ;
3424
3425         $*binode
3426         Block -> { IN OptNL Statementlist OUT OptNL } ${ $0 = $<Sl; }$
3427                 | { SimpleStatements } ${ $0 = reorder_bilist($<SS); }$
3428                 | SimpleStatements ; ${ $0 = reorder_bilist($<SS); }$
3429                 | SimpleStatements EOL ${ $0 = reorder_bilist($<SS); }$
3430                 | IN OptNL Statementlist OUT ${ $0 = $<Sl; }$
3431
3432         OpenBlock -> OpenScope { IN OptNL Statementlist OUT OptNL } ${ $0 = $<Sl; }$
3433                 | OpenScope { SimpleStatements } ${ $0 = reorder_bilist($<SS); }$
3434                 | OpenScope SimpleStatements ; ${ $0 = reorder_bilist($<SS); }$
3435                 | OpenScope SimpleStatements EOL ${ $0 = reorder_bilist($<SS); }$
3436                 | IN OpenScope OptNL Statementlist OUT ${ $0 = $<Sl; }$
3437
3438         UseBlock -> { OpenScope IN OptNL Statementlist OUT OptNL } ${ $0 = $<Sl; }$
3439                 | { OpenScope SimpleStatements } ${ $0 = reorder_bilist($<SS); }$
3440                 | IN OpenScope OptNL Statementlist OUT ${ $0 = $<Sl; }$
3441
3442         ColonBlock -> { IN OptNL Statementlist OUT OptNL } ${ $0 = $<Sl; }$
3443                 | { SimpleStatements } ${ $0 = reorder_bilist($<SS); }$
3444                 | : SimpleStatements ; ${ $0 = reorder_bilist($<SS); }$
3445                 | : SimpleStatements EOL ${ $0 = reorder_bilist($<SS); }$
3446                 | : IN OptNL Statementlist OUT ${ $0 = $<Sl; }$
3447
3448         Statementlist -> ComplexStatements ${ $0 = reorder_bilist($<CS); }$
3449
3450         ComplexStatements -> ComplexStatements ComplexStatement ${
3451                         if ($2 == NULL) {
3452                                 $0 = $<1;
3453                         } else {
3454                                 $0 = new(binode);
3455                                 $0->op = Block;
3456                                 $0->left = $<1;
3457                                 $0->right = $<2;
3458                         }
3459                 }$
3460                 | ComplexStatement ${
3461                         if ($1 == NULL) {
3462                                 $0 = NULL;
3463                         } else {
3464                                 $0 = new(binode);
3465                                 $0->op = Block;
3466                                 $0->left = NULL;
3467                                 $0->right = $<1;
3468                         }
3469                 }$
3470
3471         $*exec
3472         ComplexStatement -> SimpleStatements Newlines ${
3473                         $0 = reorder_bilist($<SS);
3474                         }$
3475                 |  SimpleStatements ; Newlines ${
3476                         $0 = reorder_bilist($<SS);
3477                         }$
3478                 ## ComplexStatement Grammar
3479
3480         $*binode
3481         SimpleStatements -> SimpleStatements ; SimpleStatement ${
3482                         $0 = new(binode);
3483                         $0->op = Block;
3484                         $0->left = $<1;
3485                         $0->right = $<3;
3486                         }$
3487                 | SimpleStatement ${
3488                         $0 = new(binode);
3489                         $0->op = Block;
3490                         $0->left = NULL;
3491                         $0->right = $<1;
3492                         }$
3493
3494         $TERM pass
3495         SimpleStatement -> pass ${ $0 = NULL; }$
3496                 | ERROR ${ tok_err(c, "Syntax error in statement", &$1); }$
3497                 ## SimpleStatement Grammar
3498
3499 ###### print binode cases
3500         case Block:
3501                 if (indent < 0) {
3502                         // simple statement
3503                         if (b->left == NULL)    // UNTESTED
3504                                 printf("pass"); // UNTESTED
3505                         else
3506                                 print_exec(b->left, indent, bracket);   // UNTESTED
3507                         if (b->right) { // UNTESTED
3508                                 printf("; ");   // UNTESTED
3509                                 print_exec(b->right, indent, bracket);  // UNTESTED
3510                         }
3511                 } else {
3512                         // block, one per line
3513                         if (b->left == NULL)
3514                                 do_indent(indent, "pass\n");
3515                         else
3516                                 print_exec(b->left, indent, bracket);
3517                         if (b->right)
3518                                 print_exec(b->right, indent, bracket);
3519                 }
3520                 break;
3521
3522 ###### propagate binode cases
3523         case Block:
3524         {
3525                 /* If any statement returns something other than Tnone
3526                  * or Tbool then all such must return same type.
3527                  * As each statement may be Tnone or something else,
3528                  * we must always pass NULL (unknown) down, otherwise an incorrect
3529                  * error might occur.  We never return Tnone unless it is
3530                  * passed in.
3531                  */
3532                 struct binode *e;
3533
3534                 for (e = b; e; e = cast(binode, e->right)) {
3535                         t = propagate_types(e->left, c, ok, NULL, rules);
3536                         if ((rules & Rboolok) && t == Tbool)
3537                                 t = NULL;
3538                         if (t && t != Tnone && t != Tbool) {
3539                                 if (!type)
3540                                         type = t;
3541                                 else if (t != type)
3542                                         type_err(c, "error: expected %1%r, found %2",
3543                                                  e->left, type, rules, t);
3544                         }
3545                 }
3546                 return type;
3547         }
3548
3549 ###### interp binode cases
3550         case Block:
3551                 while (rvtype == Tnone &&
3552                        b) {
3553                         if (b->left)
3554                                 rv = interp_exec(c, b->left, &rvtype);
3555                         b = cast(binode, b->right);
3556                 }
3557                 break;
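
The loop above runs statements in order and stops as soon as one of
them yields something other than `Tnone`.  A stand-alone sketch of
that control flow follows.  Everything in it is hypothetical: a
statement is modelled as a function returning 0 for "no value", a
`NULL` entry models `pass`, and `run_block`, `bump` and `finish` are
names invented for the example.

###### Example: running a block

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical statement: returns 0 for "no value" (like Tnone),
 * or a non-zero result that ends the block early. */
typedef int (*stmt_fn)(int *counter);

static int run_block(const stmt_fn *stmts, size_t n, int *counter)
{
	int result = 0;

	for (size_t i = 0; i < n && result == 0; i++)
		if (stmts[i])		/* NULL models `pass` */
			result = stmts[i](counter);
	return result;
}

static int bump(int *counter)   { (*counter)++; return 0; }
static int finish(int *counter) { (void)counter; return 42; }
```

Running `{ bump, NULL, finish, bump }` returns 42 and calls `bump` only
once, because nothing after `finish` is executed.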
3558
3559 ### The Print statement
3560
3561 `print` is a simple statement that takes a comma-separated list of
3562 expressions and prints the values separated by spaces and terminated
3563 by a newline.  No control of formatting is possible.
3564
3565 `print` uses `ExpressionList` to collect the expressions and stores them
3566 on the `left` side of a `Print` binode, unless there is a trailing comma,
3567 in which case the list is stored on the `right` side and no trailing
3568 newline is printed.
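
The output rules can be sketched as follows.  This is only an
illustration under simplifying assumptions: the values are already
strings, the output goes to a caller-supplied buffer rather than
stdout, and `format_print` is a hypothetical name.

###### Example: print formatting

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format `n` items the way `print` does,
 * writing into `buf` (of `size` bytes) instead of stdout. */
static void format_print(char *buf, size_t size, const char *const *items,
			 size_t n, bool trailing_comma)
{
	size_t used = 0;

	assert(size > 0);
	buf[0] = '\0';
	for (size_t i = 0; i < n; i++) {
		/* a single space between items, none after the last */
		const char *sep = (i + 1 < n) ? " " : "";
		int w = snprintf(buf + used, size - used, "%s%s", items[i], sep);

		if (w < 0 || (size_t)w >= size - used)
			return;		/* out of room: stop early */
		used += (size_t)w;
	}
	if (!trailing_comma)	/* a trailing comma suppresses the newline */
		snprintf(buf + used, size - used, "\n");
}
```

With no items and no trailing comma the result is just a newline,
matching a bare `print`.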
3569
3570 ###### Binode types
3571         Print,
3572
3573 ###### expr precedence
3574         $TERM print
3575
3576 ###### SimpleStatement Grammar
3577
3578         | print ExpressionList ${
3579                 $0 = new(binode);
3580                 $0->op = Print;
3581                 $0->right = NULL;
3582                 $0->left = reorder_bilist($<EL);
3583         }$
3584         | print ExpressionList , ${ {
3585                 $0 = new(binode);
3586                 $0->op = Print;
3587                 $0->right = reorder_bilist($<EL);
3588                 $0->left = NULL;
3589         } }$
3590         | print ${
3591                 $0 = new(binode);
3592                 $0->op = Print;
3593                 $0->left = NULL;
3594                 $0->right = NULL;
3595         }$
3596
3597 ###### print binode cases
3598
3599         case Print:
3600                 do_indent(indent, "print");
3601                 if (b->right) {
3602                         print_exec(b->right, -1, bracket);
3603                         printf(",");
3604                 } else
3605                         print_exec(b->left, -1, bracket);
3606                 if (indent >= 0)
3607                         printf("\n");
3608                 break;
3609
3610 ###### propagate binode cases
3611
3612         case Print:
3613                 /* don't care but all must be consistent */
3614                 if (b->left)
3615                         b = cast(binode, b->left);
3616                 else
3617                         b = cast(binode, b->right);
3618                 while (b) {
3619                         propagate_types(b->left, c, ok, NULL, Rnolabel);
3620                         b = cast(binode, b->right);
3621                 }
3622                 break;
3623
3624 ###### interp binode cases
3625
3626         case Print:
3627         {
3628                 struct binode *b2 = cast(binode, b->left);
3629                 if (!b2)
3630                         b2 = cast(binode, b->right);
3631                 for (; b2; b2 = cast(binode, b2->right)) {
3632                         left = interp_exec(c, b2->left, &ltype);
3633                         print_value(ltype, &left);
3634                         free_value(ltype, &left);
3635                         if (b2->right)
3636                                 putchar(' ');
3637                 }
3638                 if (b->right == NULL)
3639                         printf("\n");
3640                 ltype = Tnone;
3641                 break;
3642         }
3643
3644 ### Assignment statement
3645
3646 An assignment will assign a value to a variable, providing it hasn't
3647 been declared as a constant.  The analysis phase ensures that the type
3648 will be correct so the interpreter just needs to perform the
3649 calculation.  There is a form of assignment which declares a new
3650 variable as well as assigning a value.  If a name is assigned before
3651 it is declared, an error will be raised, as the name is created as
3652 `Tlabel` and it is illegal to assign to such names.
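
At run time an assignment has to release whatever the variable
currently holds and then duplicate the new value into it, which is why
the type must support `dup`.  The sketch below shows that sequence for
a string value.  All the names (`struct strval`, `strval_free`,
`strval_dup`, `assign`) are hypothetical and much simpler than the
interpreter's `free_value` and `dup_value`.

###### Example: free then dup

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical string value owning its storage. */
struct strval {
	char *s;
};

static void strval_free(struct strval *v)
{
	free(v->s);
	v->s = NULL;
}

static void strval_dup(const struct strval *from, struct strval *to)
{
	size_t n = strlen(from->s) + 1;

	to->s = malloc(n);
	assert(to->s != NULL);
	memcpy(to->s, from->s, n);
}

/* Assignment: release what the variable held, then copy the new value in.
 * The two values must be distinct, or the copy would read freed memory. */
static void assign(struct strval *var, const struct strval *newval)
{
	assert(var != newval);
	strval_free(var);
	strval_dup(newval, var);
}
```

A declaration with an initial value has nothing to release first, so
it can simply move the freshly computed value into place.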
3653
3654 ###### Binode types
3655         Assign,
3656         Declare,
3657
3658 ###### declare terminals
3659         $TERM =
3660
3661 ###### SimpleStatement Grammar
3662         | Variable = Expression ${
3663                         $0 = new(binode);
3664                         $0->op = Assign;
3665                         $0->left = $<1;
3666                         $0->right = $<3;
3667                 }$
3668         | VariableDecl = Expression ${
3669                         $0 = new(binode);
3670                         $0->op = Declare;
3671                         $0->left = $<1;
3672                         $0->right =$<3;
3673                 }$
3674
3675         | VariableDecl ${
3676                         if ($1->var->where_set == NULL) {
3677                                 type_err(c,
3678                                          "Variable declared with no type or value: %v",
3679                                          $1, NULL, 0, NULL);
3680                         } else {
3681                                 $0 = new(binode);
3682                                 $0->op = Declare;
3683                                 $0->left = $<1;
3684                                 $0->right = NULL;
3685                         }
3686                 }$
3687
3688 ###### print binode cases
3689
3690         case Assign:
3691                 do_indent(indent, "");
3692                 print_exec(b->left, indent, bracket);
3693                 printf(" = ");
3694                 print_exec(b->right, indent, bracket);
3695                 if (indent >= 0)
3696                         printf("\n");
3697                 break;
3698
3699         case Declare:
3700                 {
3701                 struct variable *v = cast(var, b->left)->var;
3702                 do_indent(indent, "");
3703                 print_exec(b->left, indent, bracket);
3704                 if (cast(var, b->left)->var->constant) {
3705                         printf("::");
3706                         if (v->where_decl == v->where_set) {
3707                                 type_print(v->type, stdout);
3708                                 printf(" ");
3709                         }
3710                 } else {
3711                         printf(":");
3712                         if (v->where_decl == v->where_set) {
3713                                 type_print(v->type, stdout);
3714                                 printf(" ");
3715                         }
3716                 }
3717                 if (b->right) {
3718                         printf("= ");
3719                         print_exec(b->right, indent, bracket);
3720                 }
3721                 if (indent >= 0)
3722                         printf("\n");
3723                 }
3724                 break;
3725
3726 ###### propagate binode cases
3727
3728         case Assign:
3729         case Declare:
3730                 /* Both must match and not be labels,
3731                  * Type must support 'dup',
3732                  * For Assign, left must not be constant.
3733                  * result is Tnone
3734                  */
3735                 t = propagate_types(b->left, c, ok, NULL,
3736                                     Rnolabel | (b->op == Assign ? Rnoconstant : 0));
3737                 if (!b->right)
3738                         return Tnone;
3739
3740                 if (t) {
3741                         if (propagate_types(b->right, c, ok, t, 0) != t)
3742                                 if (b->left->type == Xvar)
3743                                         type_err(c, "info: variable '%v' was set as %1 here.",
3744                                                  cast(var, b->left)->var->where_set, t, rules, NULL);
3745                 } else {
3746                         t = propagate_types(b->right, c, ok, NULL, Rnolabel);
3747                         if (t)
3748                                 propagate_types(b->left, c, ok, t,
3749                                                 (b->op == Assign ? Rnoconstant : 0));
3750                 }
3751                 if (t && t->dup == NULL)
3752                         type_err(c, "error: cannot assign value of type %1", b, t, 0, NULL);
3753                 return Tnone;
3754
3755                 break;
3756
3757 ###### interp binode cases
3758
3759         case Assign:
3760                 lleft = linterp_exec(c, b->left, &ltype);
3761                 right = interp_exec(c, b->right, &rtype);
3762                 if (lleft) {
3763                         free_value(ltype, lleft);
3764                         dup_value(ltype, &right, lleft);
3765                         ltype = NULL;
3766                 }
3767                 break;
3768
3769         case Declare:
3770         {
3771                 struct variable *v = cast(var, b->left)->var;
3772                 struct value *val;
3773                 v = v->merged;
3774                 val = var_value(c, v);
3775                 if (v->type->prepare_type)
3776                         v->type->prepare_type(c, v->type, 0);
3777                 if (b->right) {
3778                         right = interp_exec(c, b->right, &rtype);
3779                         memcpy(val, &right, rtype->size);
3780                         rtype = Tnone;
3781                 } else {
3782                         val_init(v->type, val);
3783                 }
3784                 break;
3785         }
3786
3787 ### The `use` statement
3788
3789 The `use` statement is the last "simple" statement.  It is needed when
3790 the condition in a conditional statement is a block.  `use` works much
3791 like `return` in C, but only completes the `condition`, not the whole
3792 function.
3793
3794 ###### Binode types
3795         Use,
3796
3797 ###### expr precedence
3798         $TERM use
3799
3800 ###### SimpleStatement Grammar
3801         | use Expression ${
3802                 $0 = new_pos(binode, $1);
3803                 $0->op = Use;
3804                 $0->right = $<2;
3805                 if ($0->right->type == Xvar) {
3806                         struct var *v = cast(var, $0->right);
3807                         if (v->var->type == Tnone) {
3808                                 /* Convert this to a label */
3809                                 struct value *val;
3810
3811                                 v->var->type = Tlabel;
3812                                 val = global_alloc(c, Tlabel, v->var, NULL);
3813                                 val->label = val;
3814                         }
3815                 }
3816         }$
3817
3818 ###### print binode cases
3819
3820         case Use:
3821                 do_indent(indent, "use ");
3822                 print_exec(b->right, -1, bracket);
3823                 if (indent >= 0)
3824                         printf("\n");
3825                 break;
3826
3827 ###### propagate binode cases
3828
3829         case Use:
3830                 /* result matches value */
3831                 return propagate_types(b->right, c, ok, type, 0);
3832
3833 ###### interp binode cases
3834
3835         case Use:
3836                 rv = interp_exec(c, b->right, &rvtype);
3837                 break;
3838
3839 ### The Conditional Statement
3840
This is the big one, and currently the only complex statement.  It
subsumes `if`, `while`, `do/while`, `switch`, and some parts of `for`.
It comprises a number of parts, all of which are optional, though only
certain combinations are permitted.  Each part is (usually) a key word
(`then` is sometimes optional) followed by either an expression or a
code block, except the `casepart` which is a "key word and an
expression" followed by a code block.  The code-block option is valid
for all parts and, where an expression is also allowed, the code block
can use the `use` statement to report a value.  If the code block does
not report a value, the effect is similar to reporting `True`.
3851
3852 The `else` and `case` parts, as well as `then` when combined with
3853 `if`, can contain a `use` statement which will apply to some
3854 containing conditional statement. `for` parts, `do` parts and `then`
3855 parts used with `for` can never contain a `use`, except in some
3856 subordinate conditional statement.
3857
If there is a `forpart`, it is executed first, only once.
If there is a `dopart`, then it is executed repeatedly for as long as
the `condpart` or `cond`, if present, returns `True`.  `condpart` can
fail to return any value if it simply executes to completion.  This is
treated the same as returning `True`.
3863
3864 If there is a `thenpart` it will be executed whenever the `condpart`
3865 or `cond` returns True (or does not return any value), but this will happen
3866 *after* `dopart` (when present).
3867
3868 If `elsepart` is present it will be executed at most once when the
3869 condition returns `False` or some value that isn't `True` and isn't
3870 matched by any `casepart`.  If there are any `casepart`s, they will be
3871 executed when the condition returns a matching value.
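
The ordering rules above can be modelled with a small, self-contained C
sketch.  It is an illustration only and is separate from the
interpreter: every name in it (`sk_cond`, `sk_run`, `sk_countdown_cond`
and so on) is hypothetical, the parts are reduced to words appended to
a trace string, and case values are assumed to be plain integers.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model: none of these names exist in oceani. */
enum sk_kind { SK_NOVALUE, SK_TRUE, SK_FALSE, SK_VALUE };

struct sk_result {
	enum sk_kind kind;
	int value;		/* only meaningful for SK_VALUE */
};

struct sk_case {
	int value;		/* value the case matches */
	const char *label;	/* word recorded when the case runs */
};

struct sk_cond {
	int has_for, is_loop, has_then, has_else;
	struct sk_result (*cond)(void *state);	/* the condition */
	void *state;
	const struct sk_case *cases;
	int ncases;
};

/* Append one word to the space-separated trace, truncating if full. */
static void sk_emit(char *trace, size_t len, const char *word)
{
	if (trace[0])
		strncat(trace, " ", len - strlen(trace) - 1);
	strncat(trace, word, len - strlen(trace) - 1);
}

/* A condition that completes without a value counts as True. */
static int sk_is_true(struct sk_result r)
{
	return r.kind == SK_NOVALUE || r.kind == SK_TRUE;
}

/* Record in 'trace' the order in which the parts would run. */
void sk_run(const struct sk_cond *cs, char *trace, size_t len)
{
	struct sk_result r;
	int i;

	trace[0] = '\0';
	if (cs->has_for)
		sk_emit(trace, len, "for");
	if (cs->is_loop) {
		/* Each pass: condition, then "do", then "then". */
		while (sk_is_true(r = cs->cond(cs->state))) {
			sk_emit(trace, len, "do");
			if (cs->has_then)
				sk_emit(trace, len, "then");
		}
	} else {
		r = cs->cond(cs->state);
		if (sk_is_true(r)) {
			if (cs->has_then)
				sk_emit(trace, len, "then");
			return;		/* skip cases and else */
		}
	}
	for (i = 0; i < cs->ncases; i++)
		if (r.kind == SK_VALUE && r.value == cs->cases[i].value) {
			sk_emit(trace, len, cs->cases[i].label);
			return;
		}
	if (cs->has_else)
		sk_emit(trace, len, "else");
}

/* Example condition: True 'remaining' times, then 'final'. */
struct sk_countdown {
	int remaining;
	struct sk_result final;
};

struct sk_result sk_countdown_cond(void *state)
{
	struct sk_countdown *cd = state;
	struct sk_result yes = { SK_TRUE, 0 };

	if (cd->remaining > 0) {
		cd->remaining -= 1;
		return yes;
	}
	return cd->final;
}
```

With a loop whose condition is True twice and then False, and with
every part present, the trace reads `for do then do then else`.  If the
condition instead ends by reporting a value that a case matches, the
case label takes the place of `else`.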
3872
The particular sorts of values allowed in case parts have not yet been
determined in the language design, so nothing is prohibited.
3875
3876 The various blocks in this complex statement potentially provide scope
3877 for variables as described earlier.  Each such block must include the
3878 "OpenScope" nonterminal before parsing the block, and must call
3879 `var_block_close()` when closing the block.
3880
The code following "`if`", "`switch`" and "`for`" does not get its own
scope, but is in a scope covering the whole statement, so names
declared there cannot be redeclared elsewhere.  Similarly the
condition following "`while`" is in a scope that covers the body
("`do`" part) of the loop, and which does not allow conditional scope
extension.  Code following "`then`" (both looping and non-looping),
"`else`" and "`case`" each get their own local scope.
3888
The type requirements on the code block in a `whilepart` are quite
unusual.  It is allowed to return a value of some identifiable type, in
which case the loop aborts and an appropriate `casepart` is run, or it
can return a Boolean, in which case the loop either continues to the
`dopart` (on `True`) or aborts and runs the `elsepart` (on `False`).
This differs both from the `ifpart` code block, which is expected to
return a Boolean, and from the `switchpart` code block, which is
expected to return the same type as the casepart values.  The correct
analysis of the type of the `whilepart` code block is the reason for
the `Rboolok` flag which is passed to `propagate_types()`.
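
As a rough illustration of that rule, the C sketch below shows how a
flag in the spirit of `Rboolok` could relax a type check, and what the
loop does with each kind of result.  It is not oceani's own type
checking: the names `ty_accepts`, `ty_after_while` and `TY_BOOLOK`, the
bit value of the flag, and the small set of types are all assumptions
made for this example.

```c
/* Hypothetical names throughout; TY_BOOLOK stands in for Rboolok. */
enum ty_type { TY_NONE, TY_BOOL, TY_NUM, TY_STR };
enum ty_step { TY_RUN_DO, TY_RUN_ELSE, TY_TRY_CASES };

#define TY_BOOLOK 1	/* arbitrary bit chosen for this sketch */

/* May a block reporting 'have' stand where 'require' is expected? */
int ty_accepts(enum ty_type require, enum ty_type have, int rules)
{
	/* A while block may always answer with a Boolean. */
	if ((rules & TY_BOOLOK) && have == TY_BOOL)
		return 1;
	return require == have;
}

/* What the loop does once the while block has reported a value. */
enum ty_step ty_after_while(enum ty_type have, int truth)
{
	if (have == TY_NONE)
		return TY_RUN_DO;	/* no value is treated as True */
	if (have == TY_BOOL)
		return truth ? TY_RUN_DO : TY_RUN_ELSE;
	return TY_TRY_CASES;		/* abort loop, look for a casepart */
}
```

So when the case values are numbers, a `whilepart` block may report
either a number or a Boolean, while a `switchpart` block, checked
without the flag, may only report a number.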
3899
The `cond_statement` cannot fit into a `binode` so a new `exec` is
defined.  As there are two scopes which cover multiple parts - one for
the whole statement and one for "while" and "do" - and as we will use
the 'struct exec' to track scopes, we actually need two new types of
exec.  One is a `binode` for the looping part, and the rest is the
`cond_statement`.  The `cond_statement` will use an auxiliary `struct
casepart` to track a list of case parts.
3907
3908 ###### Binode types
3909         Loop
3910
3911 ###### exec type
3912         Xcond_statement,
3913
3914 ###### ast
3915         struct casepart {
3916                 struct exec *value;
3917                 struct exec *action;
3918                 struct casepart *next;
3919         };
3920         struct cond_statement {
3921                 struct exec;
3922                 struct exec *forpart, *condpart, *thenpart, *elsepart;
3923                 struct binode *looppart;
3924                 struct casepart *casepart;
3925         };
3926
3927 ###### ast functions
3928
3929         static void free_casepart(struct casepart *cp)
3930         {
3931                 while (cp) {
3932                         struct casepart *t;
3933                         free_exec(cp->value);
3934                         free_exec(cp->action);
3935                         t = cp->next;
3936                         free(cp);
3937                         cp = t;
3938                 }
3939         }
3940
3941         static void free_cond_statement(struct cond_statement *s)
3942         {
3943                 if (!s)
3944                         return;
3945                 free_exec(s->forpart);
3946                 free_exec(s->condpart);
3947                 free_exec(s->looppart);
3948                 free_exec(s->thenpart);
3949                 free_exec(s->elsepart);
3950                 free_casepart(s->casepart);
3951                 free(s);
3952         }
3953
3954 ###### free exec cases
3955         case Xcond_statement: free_cond_statement(cast(cond_statement, e)); break;
3956
3957 ###### ComplexStatement Grammar
3958         | CondStatement ${ $0 = $<1; }$
3959
3960 ###### expr precedence
3961         $TERM for then while do
3962         $TERM else
3963         $TERM switch case
3964
3965 ###### Grammar
3966
3967         $*cond_statement
3968         // A CondStatement must end with EOL, as does CondSuffix and
3969         // IfSuffix.
3970         // ForPart, ThenPart, SwitchPart, CasePart are non-empty and
3971         // may or may not end with EOL
3972         // WhilePart and IfPart include an appropriate Suffix
3973
        // ForPart, SwitchPart, and IfPart open scopes, so we have to close
        // them.  WhilePart opens and closes its own scope.
3976         CondStatement -> ForPart OptNL ThenPart OptNL WhilePart CondSuffix ${
3977                         $0 = $<CS;
3978                         $0->forpart = $<FP;
3979                         $0->thenpart = $<TP;
3980                         $0->looppart = $<WP;
3981                         var_block_close(c, CloseSequential, $0);
3982                         }$
3983                 | ForPart OptNL WhilePart CondSuffix ${
3984                         $0 = $<CS;
3985                         $0->forpart = $<FP;
3986                         $0->looppart = $<WP;
3987                         var_block_close(c, CloseSequential, $0);
3988                         }$
3989                 | WhilePart CondSuffix ${
3990                         $0 = $<CS;
3991                         $0->looppart = $<WP;
3992                         }$
3993                 | SwitchPart OptNL CasePart CondSuffix ${
3994                         $0 = $<CS;
3995                         $0->condpart = $<SP;
3996                         $CP->next = $0->casepart;
3997                         $0->casepart = $<CP;
3998                         var_block_close(c, CloseSequential, $0);
3999                         }$
4000                 | SwitchPart : IN OptNL CasePart CondSuffix OUT Newlines ${
4001                         $0 = $<CS;
4002                         $0->condpart = $<SP;
4003                         $CP->next = $0->casepart;
4004                         $0->casepart = $<CP;
4005                         var_block_close(c, CloseSequential, $0);
4006                         }$
4007                 | IfPart IfSuffix ${
4008                         $0 = $<IS;
4009                         $0->condpart = $IP.condpart; $IP.condpart = NULL;
4010                         $0->thenpart = $IP.thenpart; $IP.thenpart = NULL;
4011                         // This is where we close an "if" statement
4012                         var_block_close(c, CloseSequential, $0);
4013                         }$
4014
4015         CondSuffix -> IfSuffix ${
4016                         $0 = $<1;
4017                 }$
4018                 | Newlines CasePart CondSuffix ${
4019                         $0 = $<CS;
4020                         $CP->next = $0->casepart;
4021                         $0->casepart = $<CP;
4022                 }$
4023                 | CasePart CondSuffix ${
4024                         $0 = $<CS;
4025                         $CP->next = $0->casepart;
4026                         $0->casepart = $<CP;
4027                 }$
4028
4029         IfSuffix -> Newlines ${ $0 = new(cond_statement); }$
4030                 | Newlines ElsePart ${ $0 = $<EP; }$
4031                 | ElsePart ${$0 = $<EP; }$
4032
4033         ElsePart -> else OpenBlock Newlines ${
4034                         $0 = new(cond_statement);
4035                         $0->elsepart = $<OB;
4036                         var_block_close(c, CloseElse, $0->elsepart);
4037                 }$
4038                 | else OpenScope CondStatement ${
4039                         $0 = new(cond_statement);
4040                         $0->elsepart = $<CS;
4041                         var_block_close(c, CloseElse, $0->elsepart);
4042                 }$
4043
4044         $*casepart
4045         CasePart -> case Expression OpenScope ColonBlock ${
4046                         $0 = calloc(1,sizeof(struct casepart));
4047                         $0->value = $<Ex;
4048                         $0->action = $<Bl;
4049                         var_block_close(c, CloseParallel, $0->action);
4050                 }$
4051
4052         $*exec
4053         // These scopes are closed in CondStatement
4054         ForPart -> for OpenBlock ${
4055                         $0 = $<Bl;
4056                 }$
4057
4058         ThenPart -> then OpenBlock ${
4059                         $0 = $<OB;
4060                         var_block_close(c, CloseSequential, $0);
4061                 }$
4062
4063         $*binode
4064         // This scope is closed in CondStatement
4065         WhilePart -> while UseBlock OptNL do OpenBlock ${
4066                         $0 = new(binode);
4067                         $0->op = Loop;
4068                         $0->left = $<UB;
4069                         $0->right = $<OB;
4070                         var_block_close(c, CloseSequential, $0->right);
4071                         var_block_close(c, CloseSequential, $0);
4072                 }$
4073                 | while OpenScope Expression OpenScope ColonBlock ${
4074                         $0 = new(binode);
4075                         $0->op = Loop;
4076                         $0->left = $<Exp;
4077                         $0->right = $<CB;
4078                         var_block_close(c, CloseSequential, $0->right);
4079                         var_block_close(c, CloseSequential, $0);
4080                 }$
4081
4082         $cond_statement
4083         IfPart -> if UseBlock OptNL then OpenBlock ${
4084                         $0.condpart = $<UB;
4085                         $0.thenpart = $<OB;
4086                         var_block_close(c, CloseParallel, $0.thenpart);
4087                 }$
4088                 | if OpenScope Expression OpenScope ColonBlock ${
4089                         $0.condpart = $<Ex;
4090                         $0.thenpart = $<CB;
4091                         var_block_close(c, CloseParallel, $0.thenpart);
4092                 }$
4093                 | if OpenScope Expression OpenScope OptNL then Block ${
4094                         $0.condpart = $<Ex;
4095                         $0.thenpart = $<Bl;
4096                         var_block_close(c, CloseParallel, $0.thenpart);
4097                 }$
4098
4099         $*exec
4100         // This scope is closed in CondStatement
4101         SwitchPart -> switch OpenScope Expression ${
4102                         $0 = $<Ex;
4103                 }$
4104                 | switch UseBlock ${
4105                         $0 = $<Bl;
4106                 }$
4107
4108 ###### print binode cases
4109         case Loop:
4110                 if (b->left && b->left->type == Xbinode &&
4111                     cast(binode, b->left)->op == Block) {
4112                         if (bracket)
4113                                 do_indent(indent, "while {\n");
4114                         else
4115                                 do_indent(indent, "while\n");
4116                         print_exec(b->left, indent+1, bracket);
4117                         if (bracket)
4118                                 do_indent(indent, "} do {\n");
4119                         else
4120                                 do_indent(indent, "do\n");
4121                         print_exec(b->right, indent+1, bracket);
4122                         if (bracket)
4123                                 do_indent(indent, "}\n");
4124                 } else {
4125                         do_indent(indent, "while ");
4126                         print_exec(b->left, 0, bracket);
4127                         if (bracket)
4128                                 printf(" {\n");
4129                         else
4130                                 printf(":\n");
4131                         print_exec(b->right, indent+1, bracket);
4132                         if (bracket)
4133                                 do_indent(indent, "}\n");
4134                 }
4135                 break;
4136
4137 ###### print exec cases
4138
4139         case Xcond_statement:
4140         {
4141                 struct cond_statement *cs = cast(cond_statement, e);
4142                 struct casepart *cp;
4143                 if (cs->forpart) {
4144                         do_indent(indent, "for");
4145                         if (bracket) printf(" {\n"); else printf("\n");
4146                         print_exec(cs->forpart, indent+1, bracket);
4147                         if (cs->thenpart) {
4148                                 if (bracket)
4149                                         do_indent(indent, "} then {\n");
4150                                 else
4151                                         do_indent(indent, "then\n");
4152                                 print_exec(cs->thenpart, indent+1, bracket);
4153                         }
4154                         if (bracket) do_indent(indent, "}\n");
4155                 }
4156                 if (cs->looppart) {
4157                         print_exec(cs->looppart, indent, bracket);
4158                 } else {
4159                         // a condition
4160                         if (cs->casepart)
4161                                 do_indent(indent, "switch");
4162                         else
4163                                 do_indent(indent, "if");
4164                         if (cs->condpart && cs->condpart->type == Xbinode &&
4165                             cast(binode, cs->condpart)->op == Block) {
4166                                 if (bracket)
4167                                         printf(" {\n");
4168                                 else
4169                                         printf("\n");
4170                                 print_exec(cs->condpart, indent+1, bracket);
4171                                 if (bracket)
4172                                         do_indent(indent, "}\n");
4173                                 if (cs->thenpart) {
4174                                         do_indent(indent, "then\n");
4175                                         print_exec(cs->thenpart, indent+1, bracket);
4176                                 }
4177                         } else {
4178                                 printf(" ");
4179                                 print_exec(cs->condpart, 0, bracket);
4180                                 if (cs->thenpart) {
4181                                         if (bracket)
4182                                                 printf(" {\n");
4183                                         else
4184                                                 printf(":\n");
4185                                         print_exec(cs->thenpart, indent+1, bracket);
4186                                         if (bracket)
4187                                                 do_indent(indent, "}\n");
4188                                 } else
4189                                         printf("\n");
4190                         }
4191                 }
4192                 for (cp = cs->casepart; cp; cp = cp->next) {
4193                         do_indent(indent, "case ");
4194                         print_exec(cp->value, -1, 0);
4195                         if (bracket)
4196                                 printf(" {\n");
4197                         else
4198                                 printf(":\n");
4199                         print_exec(cp->action, indent+1, bracket);
4200                         if (bracket)
4201                                 do_indent(indent, "}\n");
4202                 }
4203                 if (cs->elsepart) {
4204                         do_indent(indent, "else");
4205                         if (bracket)
4206                                 printf(" {\n");
4207                         else
4208                                 printf("\n");
4209                         print_exec(cs->elsepart, indent+1, bracket);
4210                         if (bracket)
4211                                 do_indent(indent, "}\n");
4212                 }
4213                 break;
4214         }
4215
4216 ###### propagate binode cases
4217         case Loop:
4218                 t = propagate_types(b->right, c, ok, Tnone, 0);
4219                 if (!type_compat(Tnone, t, 0))
4220                         *ok = 0;        // UNTESTED
4221                 return propagate_types(b->left, c, ok, type, rules);
4222
4223 ###### propagate exec cases
4224         case Xcond_statement:
4225         {
                // forpart and looppart->right must return Tnone
                // thenpart must return Tnone if there is a looppart,
                // otherwise it is like elsepart.
                // condpart must:
                //    be bool if there is no casepart
                //    match casepart->values if there is a switchpart
                //    either be bool or match casepart->value if there
                //             is a whilepart
                // elsepart and casepart->action must match the return type
                //   expected of this statement.
4236                 struct cond_statement *cs = cast(cond_statement, prog);
4237                 struct casepart *cp;
4238
4239                 t = propagate_types(cs->forpart, c, ok, Tnone, 0);
4240                 if (!type_compat(Tnone, t, 0))
4241                         *ok = 0;        // UNTESTED
4242
4243                 if (cs->looppart) {
4244                         t = propagate_types(cs->thenpart, c, ok, Tnone, 0);
4245                         if (!type_compat(Tnone, t, 0))
4246                                 *ok = 0;        // UNTESTED
4247                 }
4248                 if (cs->casepart == NULL) {
4249                         propagate_types(cs->condpart, c, ok, Tbool, 0);
4250                         propagate_types(cs->looppart, c, ok, Tbool, 0);
4251                 } else {
4252                         /* Condpart must match case values, with bool permitted */
4253                         t = NULL;
4254                         for (cp = cs->casepart;
4255                              cp && !t; cp = cp->next)
4256                                 t = propagate_types(cp->value, c, ok, NULL, 0);
4257                         if (!t && cs->condpart)
4258                                 t = propagate_types(cs->condpart, c, ok, NULL, Rboolok);        // UNTESTED
4259                         if (!t && cs->looppart)
4260                                 t = propagate_types(cs->looppart, c, ok, NULL, Rboolok);        // UNTESTED
4261                         // Now we have a type (I hope) push it down
4262                         if (t) {
4263                                 for (cp = cs->casepart; cp; cp = cp->next)
4264                                         propagate_types(cp->value, c, ok, t, 0);
4265                                 propagate_types(cs->condpart, c, ok, t, Rboolok);
4266                                 propagate_types(cs->looppart, c, ok, t, Rboolok);
4267                         }
4268                 }
4269                 // (if)then, else, and case parts must return expected type.
4270                 if (!cs->looppart && !type)
4271                         type = propagate_types(cs->thenpart, c, ok, NULL, rules);
4272                 if (!type)
4273                         type = propagate_types(cs->elsepart, c, ok, NULL, rules);
4274                 for (cp = cs->casepart;
4275                      cp && !type;
4276                      cp = cp->next)     // UNTESTED
4277                         type = propagate_types(cp->action, c, ok, NULL, rules); // UNTESTED
4278                 if (type) {
4279                         if (!cs->looppart)
4280                                 propagate_types(cs->thenpart, c, ok, type, rules);
4281                         propagate_types(cs->elsepart, c, ok, type, rules);
4282                         for (cp = cs->casepart; cp ; cp = cp->next)
4283                                 propagate_types(cp->action, c, ok, type, rules);
4284                         return type;
4285                 } else
4286                         return NULL;
4287         }
4288
4289 ###### interp binode cases
4290         case Loop:
                // This just performs one iteration of the loop
4292                 rv = interp_exec(c, b->left, &rvtype);
4293                 if (rvtype == Tnone ||
4294                     (rvtype == Tbool && rv.bool != 0))
4295                         // cnd is Tnone or Tbool, doesn't need to be freed
4296                         interp_exec(c, b->right, NULL);
4297                 break;
4298
4299 ###### interp exec cases
4300         case Xcond_statement:
4301         {
4302                 struct value v, cnd;
4303                 struct type *vtype, *cndtype;
4304                 struct casepart *cp;
4305                 struct cond_statement *cs = cast(cond_statement, e);
4306
4307                 if (cs->forpart)
4308                         interp_exec(c, cs->forpart, NULL);
4309                 if (cs->looppart) {
4310                         while ((cnd = interp_exec(c, cs->looppart, &cndtype)),
4311                                cndtype == Tnone || (cndtype == Tbool && cnd.bool != 0))
4312                                 interp_exec(c, cs->thenpart, NULL);
4313                 } else {
4314                         cnd = interp_exec(c, cs->condpart, &cndtype);
4315                         if ((cndtype == Tnone ||
4316                             (cndtype == Tbool && cnd.bool != 0))) {
4317                                 // cnd is Tnone or Tbool, doesn't need to be freed
4318                                 rv = interp_exec(c, cs->thenpart, &rvtype);
4319                                 // skip else (and cases)
4320                                 goto Xcond_done;
4321                         }
4322                 }
4323                 for (cp = cs->casepart; cp; cp = cp->next) {
4324                         v = interp_exec(c, cp->value, &vtype);
4325                         if (value_cmp(cndtype, vtype, &v, &cnd) == 0) {
4326                                 free_value(vtype, &v);
4327                                 free_value(cndtype, &cnd);
4328                                 rv = interp_exec(c, cp->action, &rvtype);
4329                                 goto Xcond_done;
4330                         }
4331                         free_value(vtype, &v);
4332                 }
4333                 free_value(cndtype, &cnd);
4334                 if (cs->elsepart)
4335                         rv = interp_exec(c, cs->elsepart, &rvtype);
4336                 else
4337                         rvtype = Tnone;
4338         Xcond_done:
4339                 break;
4340         }
4341
4342 ### Top level structure
4343
4344 All the language elements so far can be used in various places.  Now
4345 it is time to clarify what those places are.
4346
At the top level of a file there will be a number of declarations.
Many of the things that can be declared haven't been described yet,
such as functions, procedures, imports, and probably more.
For now there are three sorts of things that can appear at the top
level.  They are predefined constants, `struct` types, and the `main`
function.  While the syntax will allow the `main` function to appear
multiple times, that will trigger an error if it is actually attempted.
4354
The productions for these declarations do not return anything.  They
store what was declared in the parse context.
4357
4358 ###### Parser: grammar
4359
4360         $void
4361         Ocean -> OptNL DeclarationList
4362
4363         ## declare terminals
4364
4365         OptNL ->
4366                 | OptNL NEWLINE
4367         Newlines -> NEWLINE
4368                 | Newlines NEWLINE
4369
4370         DeclarationList -> Declaration
4371                 | DeclarationList Declaration
4372
4373         Declaration -> ERROR Newlines ${
4374                         tok_err(c,      // UNTESTED
4375                                 "error: unhandled parse error", &$1);
4376                 }$
4377                 | DeclareConstant
4378                 | DeclareFunction
4379                 | DeclareStruct
4380
4381         ## top level grammar
4382
4383         ## Grammar
4384
4385 ### The `const` section
4386
As well as being defined in the code that uses them, constants
can be declared at the top level.  These have full-file scope, so they
are always `InScope`.  The value of a top level constant can be given
as an expression, and this is evaluated immediately rather than in the
later interpretation stage.  Once we add functions to the language, we
will need rules concerning which, if any, can be used to define a top
level constant.
4394
Constants are defined in a section that starts with the reserved word
`const` and then has a block with a list of assignment statements.
For syntactic consistency, these must use the double-colon syntax to
make it clear that they are constants.  A type can also be given; if
not, the type will be determined during analysis, as with other
constants.
4401
As constants are inserted at the head of a list, printing
them in the same order that they were read is not straightforward.
We take a quadratic approach here and count the number of constants
(variables of depth 0), then count down from there, each time
searching through for the Nth constant for decreasing N.
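
The counting approach can be sketched on its own with a plain linked
list.  This is a simplified model for illustration rather than the
interpreter's code: `struct cl_var`, `cl_push` and `cl_list_consts` are
hypothetical names, names are collected into a string instead of being
printed, and the count is taken in a separate pass rather than with a
`target == -1` first iteration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a variable on the in-scope list. */
struct cl_var {
	const char *name;
	int depth;		/* 0 marks a top level constant */
	struct cl_var *next;	/* plays the role of in_scope */
};

/* Insert at the head, so the list is in reverse declaration order. */
struct cl_var *cl_push(struct cl_var *head, struct cl_var *v)
{
	v->next = head;
	return v;
}

/*
 * Write the depth-0 names into 'out' in declaration order, separated
 * by spaces, and return how many there were.  Quadratic: one full
 * scan of the list for each constant.
 */
int cl_list_consts(const struct cl_var *head, char *out, size_t len)
{
	const struct cl_var *v;
	int total = 0, target;

	out[0] = '\0';
	for (v = head; v; v = v->next)
		if (v->depth == 0)
			total += 1;

	/* The oldest constant is the total-th match from the head. */
	for (target = total; target > 0; target--) {
		int i = 0;

		for (v = head; v; v = v->next)
			if (v->depth == 0 && ++i == target)
				break;
		/* v is non-NULL here because target <= total. */
		if (out[0])
			strncat(out, " ", len - strlen(out) - 1);
		strncat(out, v->name, len - strlen(out) - 1);
	}
	return total;
}
```

Pushing `pi`, a depth-1 variable, `e` and `zero` in that order leaves
`zero` at the head of the list, yet the names come out as `pi e zero`.
The cost is one scan per constant, which is acceptable for the small
number of constants expected.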
4407
4408 ###### top level grammar
4409
4410         $TERM const
4411
4412         DeclareConstant -> const { IN OptNL ConstList OUT OptNL } Newlines
4413                 | const { SimpleConstList } Newlines
4414                 | const IN OptNL ConstList OUT Newlines
4415                 | const SimpleConstList Newlines
4416
4417         ConstList -> ConstList SimpleConstLine
4418                 | SimpleConstLine
4419         SimpleConstList -> SimpleConstList ; Const
4420                 | Const
4421                 | SimpleConstList ;
4422         SimpleConstLine -> SimpleConstList Newlines
4423                 | ERROR Newlines ${ tok_err(c, "Syntax error in constant", &$1); }$
4424
4425         $*type
4426         CType -> Type   ${ $0 = $<1; }$
4427                 |       ${ $0 = NULL; }$
4428         $void
        Const -> IDENTIFIER :: CType = Expression ${ {
                int ok;
                struct variable *v;

                v = var_decl(c, $1.txt);
                if (v) {
                        struct var *var = new_pos(var, $1);
                        v->where_decl = var;
                        v->where_set = var;
                        var->var = v;
                        v->constant = 1;
                } else {
                        /* Look up the earlier declaration only to report it;
                         * 'v' stays NULL so the duplicate is not evaluated.
                         */
                        struct variable *vorig = var_ref(c, $1.txt);
                        tok_err(c, "error: name already declared", &$1);
                        type_err(c, "info: this is where '%v' was first declared",
                                 vorig->where_decl, NULL, 0, NULL);
                }
                do {
                        ok = 1;
                        propagate_types($5, c, &ok, $3, 0);
                } while (ok == 2);
                if (!ok)
                        c->parse_error = 1;
                else if (v) {
                        struct value res = interp_exec(c, $5, &v->type);
                        global_alloc(c, v->type, v, &res);
                }
        } }$

###### print const decls
        {
                struct variable *v;
                int target = -1;

                while (target != 0) {
                        int i = 0;
                        for (v = context.in_scope; v; v=v->in_scope)
                                if (v->depth == 0) {
                                        i += 1;
                                        if (i == target)
                                                break;
                                }

                        if (target == -1) {
                                if (i)
                                        printf("const\n");
                                target = i;
                        } else {
                                struct value *val = var_value(&context, v);
                                printf("    %.*s :: ", v->name->name.len, v->name->name.txt);
                                type_print(v->type, stdout);
                                printf(" = ");
                                if (v->type == Tstr)
                                        printf("\"");
                                print_value(v->type, val);
                                if (v->type == Tstr)
                                        printf("\"");
                                printf("\n");
                                target -= 1;
                        }
                }
        }

### Finally the whole `main` function.

An Ocean program can currently have only one function - `main` - and
that must exist.  It takes a single argument: an array of strings whose
size is supplied along with it.  Following this is a `block` which is
the code to execute.

As this is the top level, several things are handled a bit
differently.  The function is not interpreted by `interp_exec` as that
isn't passed the argument list which the program requires.  Similarly,
type analysis is a bit more interesting at this level.
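
The argument handling can be pictured as copying each C string from
`argv` into a length-counted string slot, once the size of the array is
known from `argc`.  The sketch below shows only that idea in standalone
C.  `struct counted_str`, `copy_args` and `free_args` are hypothetical
names; the interpreter itself works with its own value and type
structures instead.

###### Example: copying argv into counted strings

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical counted string; the interpreter has its own text type. */
struct counted_str {
        char *txt;
        size_t len;
};

/* Copy argc strings from argv into a freshly allocated array of
 * counted strings.  Returns NULL if any allocation fails.
 */
struct counted_str *copy_args(int argc, char **argv)
{
        struct counted_str *args;
        int i;

        assert(argc >= 0);
        /* Allocate at least one slot so an empty list is not an error. */
        args = calloc(argc ? argc : 1, sizeof(*args));
        if (!args)
                return NULL;
        for (i = 0; i < argc; i++) {
                args[i].len = strlen(argv[i]);
                args[i].txt = malloc(args[i].len + 1);
                if (!args[i].txt) {
                        /* Undo the copies made so far. */
                        while (i-- > 0)
                                free(args[i].txt);
                        free(args);
                        return NULL;
                }
                memcpy(args[i].txt, argv[i], args[i].len + 1);
        }
        return args;
}

/* Release everything allocated by copy_args(). */
void free_args(struct counted_str *args, int argc)
{
        int i;

        for (i = 0; i < argc; i++)
                free(args[i].txt);
        free(args);
}
```

Each slot owns its own copy of the text, so the result does not depend
on the lifetime of the original `argv` strings.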

###### top level grammar

        DeclareFunction -> MainFunction ${ {
                if (c->prog)
                        type_err(c, "\"main\" defined a second time",
                                 $1, NULL, 0, NULL);
                else
                        c->prog = $<1;
        } }$

###### print binode cases
        case Func:
                do_indent(indent, "func main(");
                for (b2 = cast(binode, b->left); b2; b2 = cast(binode, b2->right)) {
                        struct variable *v = cast(var, b2->left)->var;
                        printf(" ");
                        print_exec(b2->left, 0, 0);
                        printf(":");
                        type_print(v->type, stdout);
                }
                if (bracket)
                        printf(") {\n");
                else
                        printf(")\n");
                print_exec(b->right, indent+1, bracket);
                if (bracket)
                        do_indent(indent, "}\n");
                break;

###### propagate binode cases
        case Func: abort();             // NOTEST

###### core functions

        static int analyse_prog(struct exec *prog, struct parse_context *c)
        {
                struct binode *bp = cast(binode, prog);
                struct binode *b;
                int ok = 1;
                int arg = 0;
                struct type *argv_type;
                struct text argv_type_name = { " argv", 5 };

                if (!bp)
                        return 0;       // NOTEST

                argv_type = add_type(c, argv_type_name, &array_prototype);
                argv_type->array.member = Tstr;
                argv_type->array.unspec = 1;

                for (b = cast(binode, bp->left); b; b = cast(binode, b->right)) {
                        ok = 1;
                        switch (arg++) {
                        case 0: /* argv */
                                propagate_types(b->left, c, &ok, argv_type, 0);
                                break;
                        default: /* invalid */  // NOTEST
                                propagate_types(b->left, c, &ok, Tnone, 0);     // NOTEST
                        }
                }

                do {
                        ok = 1;
                        propagate_types(bp->right, c, &ok, Tnone, 0);
                } while (ok == 2);
                if (!ok)
                        return 0;

                /* Make sure everything is still consistent */
                propagate_types(bp->right, c, &ok, Tnone, 0);
                if (!ok)
                        return 0;       // UNTESTED
                scope_finalize(c);
                return 1;
        }

        static void interp_prog(struct parse_context *c, struct exec *prog,
                                int argc, char **argv)
        {
                struct binode *p = cast(binode, prog);
                struct binode *al;
                int anum = 0;
                struct value v;
                struct type *vtype;

                if (!prog)
                        return;         // NOTEST
                al = cast(binode, p->left);
                while (al) {
                        struct var *v = cast(var, al->left);
                        struct value *vl = var_value(c, v->var);
                        struct value arg;
                        struct type *t;
                        mpq_t argcq;
                        int i;

                        switch (anum++) {
                        case 0: /* argv */
                                t = v->var->type;
                                mpq_init(argcq);
                                mpq_set_ui(argcq, argc, 1);
                                memcpy(var_value(c, t->array.vsize), &argcq, sizeof(argcq));
                                t->prepare_type(c, t, 0);
                                array_init(v->var->type, vl);
                                for (i = 0; i < argc; i++) {
                                        struct value *vl2 = vl->array + i * v->var->type->array.member->size;

                                        arg.str.txt = argv[i];
                                        arg.str.len = strlen(argv[i]);
                                        free_value(Tstr, vl2);
                                        dup_value(Tstr, &arg, vl2);
                                }
                                break;
                        }
                        al = cast(binode, al->right);
                }
                v = interp_exec(c, p, &vtype);
                free_value(vtype, &v);
        }

###### interp binode cases
        case Func:
                rv = interp_exec(c, b->right, &rvtype);
                break;

## And now to test it out.

Having a language requires having a "hello world" program.  I'll
provide a little more than that: a program that prints "Hello world",
finds the GCD of two numbers, prints the first few elements of the
Fibonacci sequence, performs a binary search for a number, and does a
few other things which will likely grow as the language grows.
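
As a cross-check for the demo's GCD step, the same subtraction-based
loop can be written in C.  This is only an illustrative sketch:
`gcd_by_subtraction` is a hypothetical name, and it uses `long` rather
than Ocean's number type.

###### Example: GCD by repeated subtraction

```c
#include <assert.h>

/* Hypothetical helper mirroring the demo's loop: repeatedly subtract
 * the smaller value from the larger until both are equal.
 * Returns 0 if either input is not positive, the case in which the
 * demo prints a message instead of a result.
 */
long gcd_by_subtraction(long a, long b)
{
        if (a <= 0 || b <= 0)
                return 0;
        while (a != b) {
                if (a < b)
                        b = b - a;
                else
                        a = a - b;
        }
        assert(a > 0);
        return a;
}
```

For the arguments 55 and 33 that the `sayhello` rule passes,
`gcd_by_subtraction(55, 33)` returns 11.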

###### File: oceani.mk
        demos :: sayhello
        sayhello : oceani
                @echo "===== DEMO ====="
                ./oceani --section "demo: hello" oceani.mdc 55 33

###### demo: hello

        const
                pi ::= 3.141_592_6
                four ::= 2 + 2 ; five ::= 10/2
        const pie ::= "I like Pie";
                cake ::= "The cake is"
                  ++ " a lie"

        struct fred
                size:[four]number
                name:string
                alive:Boolean

        func main
                argv:[argc::]string
        do
                print "Hello World, what lovely oceans you have!"
                print "Are there", five, "?"
                print pi, pie, "but", cake

                A := $argv[1]; B := $argv[2]

                /* When a variable is defined in both branches of an 'if',
                 * and used afterwards, the variables are merged.
                 */
                if A > B:
                        bigger := "yes"
                else
                        bigger := "no"
                print "Is", A, "bigger than", B,"? ", bigger
                /* If a variable is not used after the 'if', no
                 * merge happens, so types can be different
                 */
                if A > B * 2:
                        double:string = "yes"
                        print A, "is more than twice", B, "?", double
                else
                        double := B*2
                        print "double", B, "is", double

                a : number
                a = A;
                b:number = B
                if a > 0 and then b > 0:
                        while a != b:
                                if a < b:
                                        b = b - a
                                else
                                        a = a - b
                        print "GCD of", A, "and", B,"is", a
                else if a <= 0:
                        print a, "is not positive, cannot calculate GCD"
                else
                        print b, "is not positive, cannot calculate GCD"

                for
                        togo := 10
                        f1 := 1; f2 := 1
                        print "Fibonacci:", f1,f2,
                then togo = togo - 1
                while togo > 0:
                        f3 := f1 + f2
                        print "", f3,
                        f1 = f2
                        f2 = f3
                print ""

                /* Binary search... */
                for
                        lo:= 0; hi := 100
                        target := 77
                while
                        mid := (lo + hi) / 2
                        if mid == target:
                                use Found
                        if mid < target:
                                lo = mid
                        else
                                hi = mid
                        if hi - lo < 1:
                                lo = mid
                                use GiveUp
                        use True
                do pass
                case Found:
                        print "Yay, I found", target
                case GiveUp:
                        print "Closest I found was", lo

                size::= 10
                list:[size]number
                list[0] = 1234
                // "middle square" PRNG.  Not particularly good, but one my
                // Dad taught me - the first one I ever heard of.
                for i:=1; then i = i + 1; while i < size:
                        n := list[i-1] * list[i-1]
                        list[i] = (n / 100) % 10 000

                print "Before sort:",
                for i:=0; then i = i + 1; while i < size:
                        print "", list[i],
                print

                for i := 1; then i=i+1; while i < size:
                        for j:=i-1; then j=j-1; while j >= 0:
                                if list[j] > list[j+1]:
                                        t:= list[j]
                                        list[j] = list[j+1]
                                        list[j+1] = t
                print " After sort:",
                for i:=0; then i = i + 1; while i < size:
                        print "", list[i],
                print

                if 1 == 2 then print "yes"; else print "no"

                bob:fred
                bob.name = "Hello"
                bob.alive = (bob.name == "Hello")
                print "bob", "is" if  bob.alive else "isn't", "alive"