ocean-lang.org Git - ocean/blob - csrc/oceani.mdc

   1 # Ocean Interpreter - Jamison Creek version
   2
   3 Ocean is intended to be a compiled language, so this interpreter is
   4 not targeted at being the final product.  It is, rather, an intermediate
   5 stage and fills that role in two distinct ways.
   6
   7 Firstly, it exists as a platform to experiment with the early language
   8 design.  An interpreter is easy to write and easy to get working, so
   9 the barrier for entry is lower if I aim to start with an interpreter.
  10
  11 Secondly, the plan for the Ocean compiler is to write it in the
  12 [Ocean language](http://ocean-lang.org).  To achieve this we naturally
  13 need some sort of boot-strap process and this interpreter - written in
  14 portable C - will fill that role.  It will be used to bootstrap the
  15 Ocean compiler.
  16
  17 Two features that are not needed to fill either of these roles are
  18 performance and completeness.  The interpreter only needs to be fast
  19 enough to run small test programs and occasionally to run the compiler
  20 on itself.  It only needs to be complete enough to test aspects of the
  21 design which are developed before the compiler is working, and to run
  22 the compiler on itself.  Any features not used by the compiler when
  23 compiling itself are superfluous.  They may be included anyway, but
  24 they may not.
  25
  26 Nonetheless, the interpreter should end up being reasonably complete,
  27 and any performance bottlenecks which appear and are easily fixed, will
  28 be.
  29
  30 ## Current version
  31
  32 This third version of the interpreter exists to test out some initial
  33 ideas relating to types.  Particularly it adds arrays (indexed from
  34 zero) and simple structures.  Basic control flow and variable scoping
  35 are already fairly well established, as are basic numerical and
  36 boolean operators.
  37
  38 Some operators that have only recently been added, and so have not
  39 generated all that much experience yet are "and then" and "or else" as
  40 short-circuit Boolean operators, and the "if ... else" trinary
  41 operator which can select between two expressions based on a third
  42 (which appears syntactically in the middle).
  43
  44 The "func" clause currently only allows a "main" function to be
  45 declared.  That will be extended when proper function support is added.
  46
  47 An element that is present purely to make a usable language, and
  48 without any expectation that they will remain, is the "print" statement
  49 which performs simple output.
  50
  51 The current scalar types are "number", "Boolean", and "string".
  52 Boolean will likely stay in its current form, the other two might, but
  53 could just as easily be changed.
  54
  55 ## Naming
  56
  57 Versions of the interpreter which obviously do not support a complete
  58 language will be named after creeks and streams.  This one is Jamison
  59 Creek.
  60
  61 Once we have something reasonably resembling a complete language, the
  62 names of rivers will be used.
  63 Early versions of the compiler will be named after seas.  Major
  64 releases of the compiler will be named after oceans.  Hopefully I will
  65 be finished once I get to the Pacific Ocean release.
  66
  67 ## Outline
  68
  69 As well as parsing and executing a program, the interpreter can print
  70 out the program from the parsed internal structure.  This is useful
  71 for validating the parsing.
  72 So the main requirements of the interpreter are:
  73
  74 - Parse the program, possibly with tracing,
  75 - Analyse the parsed program to ensure consistency,
  76 - Print the program,
  77 - Execute the "main" function in the program, if no parsing or
  78   consistency errors were found.
  79
  80 This is all performed by a single C program extracted with
  81 `parsergen`.
  82
  83 There will be two formats for printing the program: a default and one
  84 that uses bracketing.  So a `--bracket` command line option is needed
  85 for that.  Normally the first code section found is used, however an
  86 alternate section can be requested so that a file (such as this one)
  87 can contain multiple programs.  This is effected with the `--section`
  88 option.
  89
  90 This code must be compiled with `-fplan9-extensions` so that anonymous
  91 structures can be used.
  92
  93 ###### File: oceani.mk
  94
  95         myCFLAGS := -Wall -g -fplan9-extensions
  96         CFLAGS := $(filter-out $(myCFLAGS),$(CFLAGS)) $(myCFLAGS)
  97         myLDLIBS:= libparser.o libscanner.o libmdcode.o -licuuc
  98         LDLIBS := $(filter-out $(myLDLIBS),$(LDLIBS)) $(myLDLIBS)
  99         ## libs
 100         all :: $(LDLIBS) oceani
 101         oceani.c oceani.h : oceani.mdc parsergen
 102                 ./parsergen -o oceani --LALR --tag Parser oceani.mdc
 103         oceani.mk: oceani.mdc md2c
 104                 ./md2c oceani.mdc
 105
 106         oceani: oceani.o $(LDLIBS)
 107                 $(CC) $(CFLAGS) -o oceani oceani.o $(LDLIBS)
 108
 109 ###### Parser: header
 110         ## macros
 111         struct parse_context;
 112         ## ast
 113         struct parse_context {
 114                 struct token_config config;
 115                 char *file_name;
 116                 int parse_error;
 117                 ## parse context
 118         };
 119
 120 ###### macros
 121
 122         #define container_of(ptr, type, member) ({                      \
 123                 const typeof( ((type *)0)->member ) *__mptr = (ptr);    \
 124                 (type *)( (char *)__mptr - offsetof(type,member) );})
 125
 126         #define config2context(_conf) container_of(_conf, struct parse_context, \
 127                 config)
 128
 129 ###### Parser: reduce
 130         struct parse_context *c = config2context(config);
 131
 132 ###### Parser: code
 133         #define _GNU_SOURCE
 134         #include <unistd.h>
 135         #include <stdlib.h>
 136         #include <fcntl.h>
 137         #include <errno.h>
 138         #include <sys/mman.h>
 139         #include <string.h>
 140         #include <stdio.h>
 141         #include <locale.h>
 142         #include <malloc.h>
 143         #include "mdcode.h"
 144         #include "scanner.h"
 145         #include "parser.h"
 146
 147         ## includes
 148
 149         #include "oceani.h"
 150
 151         ## forward decls
 152         ## value functions
 153         ## ast functions
 154         ## core functions
 155
 156         #include <getopt.h>
 157         static char Usage[] =
 158                 "Usage: oceani --trace --print --noexec --brackets --section=SectionName prog.ocn\n";
 159         static const struct option long_options[] = {
 160                 {"trace",     0, NULL, 't'},
 161                 {"print",     0, NULL, 'p'},
 162                 {"noexec",    0, NULL, 'n'},
 163                 {"brackets",  0, NULL, 'b'},
 164                 {"section",   1, NULL, 's'},
 165                 {NULL,        0, NULL, 0},
 166         };
 167         const char *options = "tpnbs";
 168
 169         static void pr_err(char *msg)                   // NOTEST
 170         {
 171                 fprintf(stderr, "%s\n", msg);           // NOTEST
 172         }                                               // NOTEST
 173
 174         int main(int argc, char *argv[])
 175         {
 176                 int fd;
 177                 int len;
 178                 char *file;
 179                 struct section *s = NULL, *ss;
 180                 char *section = NULL;
 181                 struct parse_context context = {
 182                         .config = {
 183                                 .ignored = (1 << TK_mark),
 184                                 .number_chars = ".,_+- ",
 185                                 .word_start = "_",
 186                                 .word_cont = "_",
 187                         },
 188                 };
 189                 int doprint=0, dotrace=0, doexec=1, brackets=0;
 190                 int opt;
 191                 while ((opt = getopt_long(argc, argv, options, long_options, NULL))
 192                        != -1) {
 193                         switch(opt) {
 194                         case 't': dotrace=1; break;
 195                         case 'p': doprint=1; break;
 196                         case 'n': doexec=0; break;
 197                         case 'b': brackets=1; break;
 198                         case 's': section = optarg; break;
 199                         default: fprintf(stderr, Usage);
 200                                 exit(1);
 201                         }
 202                 }
 203                 if (optind >= argc) {
 204                         fprintf(stderr, "oceani: no input file given\n");
 205                         exit(1);
 206                 }
 207                 fd = open(argv[optind], O_RDONLY);
 208                 if (fd < 0) {
 209                         fprintf(stderr, "oceani: cannot open %s\n", argv[optind]);
 210                         exit(1);
 211                 }
 212                 context.file_name = argv[optind];
 213                 len = lseek(fd, 0, 2);
 214                 file = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
 215                 s = code_extract(file, file+len, pr_err);
 216                 if (!s) {
 217                         fprintf(stderr, "oceani: could not find any code in %s\n",
 218                                 argv[optind]);
 219                         exit(1);
 220                 }
 221
 222                 ## context initialization
 223
 224                 if (section) {
 225                         for (ss = s; ss; ss = ss->next) {
 226                                 struct text sec = ss->section;
 227                                 if (sec.len == strlen(section) &&
 228                                     strncmp(sec.txt, section, sec.len) == 0)
 229                                         break;
 230                         }
 231                         if (!ss) {
 232                                 fprintf(stderr, "oceani: cannot find section %s\n",
 233                                         section);
 234                                 goto cleanup;
 235                         }
 236                 } else
 237                         ss = s;                         // NOTEST
 238                 if (!ss->code) {
 239                         fprintf(stderr, "oceani: no code found in requested section\n");        // NOTEST
 240                         goto cleanup;                   // NOTEST
 241                 }
 242
 243                 parse_oceani(ss->code, &context.config, dotrace ? stderr : NULL);
 244
 245                 resolve_consts(&context);
 246                 prepare_types(&context);
 247                 if (!context.parse_error && !analyse_funcs(&context)) {
 248                         fprintf(stderr, "oceani: type error in program - not running.\n");
 249                         context.parse_error += 1;
 250                 }
 251
 252                 if (doprint) {
 253                         ## print const decls
 254                         ## print type decls
 255                         ## print func decls
 256                 }
 257                 if (doexec && !context.parse_error)
 258                         interp_main(&context, argc - optind, argv + optind);
 259         cleanup:
 260                 while (s) {
 261                         struct section *t = s->next;
 262                         code_free(s->code);
 263                         free(s);
 264                         s = t;
 265                 }
 266                 // FIXME parser should pop scope even on error
 267                 while (context.scope_depth > 0)
 268                         scope_pop(&context);
 269                 ## free global vars
 270                 ## free const decls
 271                 ## free context types
 272                 ## free context storage
 273                 exit(context.parse_error ? 1 : 0);
 274         }
 275
 276 ### Analysis
 277
 278 The four requirements of parse, analyse, print, interpret apply to
 279 each language element individually so that is how most of the code
 280 will be structured.
 281
 282 Three of the four are fairly self explanatory.  The one that requires
 283 a little explanation is the analysis step.
 284
 285 The current language design does not require the types of variables to
 286 be declared, but they must still have a single type.  Different
 287 operations impose different requirements on the variables, for example
 288 addition requires both arguments to be numeric, and assignment
 289 requires the variable on the left to have the same type as the
 290 expression on the right.
 291
 292 Analysis involves propagating these type requirements around and
 293 consequently setting the type of each variable.  If any requirements
 294 are violated (e.g. a string is compared with a number) or if a
 295 variable needs to have two different types, then an error is raised
 296 and the program will not run.
 297
 298 If the same variable is declared in both branchs of an 'if/else', or
 299 in all cases of a 'switch' then the multiple instances may be merged
 300 into just one variable if the variable is referenced after the
 301 conditional statement.  When this happens, the types must naturally be
 302 consistent across all the branches.  When the variable is not used
 303 outside the if, the variables in the different branches are distinct
 304 and can be of different types.
 305
 306 Undeclared names may only appear in "use" statements and "case" expressions.
 307 These names are given a type of "label" and a unique value.
 308 This allows them to fill the role of a name in an enumerated type, which
 309 is useful for testing the `switch` statement.
 310
 311 As we will see, the condition part of a `while` statement can return
 312 either a Boolean or some other type.  This requires that the expected
 313 type that gets passed around comprises a type and a flag to indicate
 314 that `Tbool` is also permitted.
 315
 316 As there are, as yet, no distinct types that are compatible, there
 317 isn't much subtlety in the analysis.  When we have distinct number
 318 types, this will become more interesting.
 319
 320 #### Error reporting
 321
 322 When analysis discovers an inconsistency it needs to report an error;
 323 just refusing to run the code ensures that the error doesn't cascade,
 324 but by itself it isn't very useful.  A clear understanding of the sort
 325 of error message that are useful will help guide the process of
 326 analysis.
 327
 328 At a simplistic level, the only sort of error that type analysis can
 329 report is that the type of some construct doesn't match a contextual
 330 requirement.  For example, in `4 + "hello"` the addition provides a
 331 contextual requirement for numbers, but `"hello"` is not a number.  In
 332 this particular example no further information is needed as the types
 333 are obvious from local information.  When a variable is involved that
 334 isn't the case.  It may be helpful to explain why the variable has a
 335 particular type, by indicating the location where the type was set,
 336 whether by declaration or usage.
 337
 338 Using a recursive-descent analysis we can easily detect a problem at
 339 multiple locations. In "`hello:= "there"; 4 + hello`" the addition
 340 will detect that one argument is not a number and the usage of `hello`
 341 will detect that a number was wanted, but not provided.  In this
 342 (early) version of the language, we will generate error reports at
 343 multiple locations, so the use of `hello` will report an error and
 344 explain were the value was set, and the addition will report an error
 345 and say why numbers are needed.  To be able to report locations for
 346 errors, each language element will need to record a file location
 347 (line and column) and each variable will need to record the language
 348 element where its type was set.  For now we will assume that each line
 349 of an error message indicates one location in the file, and up to 2
 350 types.  So we provide a `printf`-like function which takes a format, a
 351 location (a `struct exec` which has not yet been introduced), and 2
 352 types. "`%1`" reports the first type, "`%2`" reports the second.  We
 353 will need a function to print the location, once we know how that is
 354 stored. e As will be explained later, there are sometimes extra rules for
 355 type matching and they might affect error messages, we need to pass those
 356 in too.
 357
 358 As well as type errors, we sometimes need to report problems with
 359 tokens, which might be unexpected or might name a type that has not
 360 been defined.  For these we have `tok_err()` which reports an error
 361 with a given token.  Each of the error functions sets the flag in the
 362 context so indicate that parsing failed.
 363
 364 ###### forward decls
 365
 366         static void fput_loc(struct exec *loc, FILE *f);
 367         static void type_err(struct parse_context *c,
 368                              char *fmt, struct exec *loc,
 369                              struct type *t1, int rules, struct type *t2);
 370         static void tok_err(struct parse_context *c, char *fmt, struct token *t);
 371
 372 ###### core functions
 373
 374         static void type_err(struct parse_context *c,
 375                              char *fmt, struct exec *loc,
 376                              struct type *t1, int rules, struct type *t2)
 377         {
 378                 fprintf(stderr, "%s:", c->file_name);
 379                 fput_loc(loc, stderr);
 380                 for (; *fmt ; fmt++) {
 381                         if (*fmt != '%') {
 382                                 fputc(*fmt, stderr);
 383                                 continue;
 384                         }
 385                         fmt++;
 386                         switch (*fmt) {
 387                         case '%': fputc(*fmt, stderr); break;   // NOTEST
 388                         default: fputc('?', stderr); break;     // NOTEST
 389                         case '1':
 390                                 type_print(t1, stderr);
 391                                 break;
 392                         case '2':
 393                                 type_print(t2, stderr);
 394                                 break;
 395                         ## format cases
 396                         }
 397                 }
 398                 fputs("\n", stderr);
 399                 c->parse_error += 1;
 400         }
 401
 402         static void tok_err(struct parse_context *c, char *fmt, struct token *t)
 403         {
 404                 fprintf(stderr, "%s:%d:%d: %s: %.*s\n", c->file_name, t->line, t->col, fmt,
 405                         t->txt.len, t->txt.txt);
 406                 c->parse_error += 1;
 407         }
 408
 409 ## Entities: declared and predeclared.
 410
 411 There are various "things" that the language and/or the interpreter
 412 needs to know about to parse and execute a program.  These include
 413 types, variables, values, and executable code.  These are all lumped
 414 together under the term "entities" (calling them "objects" would be
 415 confusing) and introduced here.  The following section will present the
 416 different specific code elements which comprise or manipulate these
 417 various entities.
 418
 419 ### Executables
 420
 421 Executables can be lots of different things.  In many cases an
 422 executable is just an operation combined with one or two other
 423 executables.  This allows for expressions and lists etc.  Other times an
 424 executable is something quite specific like a constant or variable name.
 425 So we define a `struct exec` to be a general executable with a type, and
 426 a `struct binode` which is a subclass of `exec`, forms a node in a
 427 binary tree, and holds an operation.  There will be other subclasses,
 428 and to access these we need to be able to `cast` the `exec` into the
 429 various other types.  The first field in any `struct exec` is the type
 430 from the `exec_types` enum.
 431
 432 ###### macros
 433         #define cast(structname, pointer) ({            \
 434                 const typeof( ((struct structname *)0)->type) *__mptr = &(pointer)->type; \
 435                 if (__mptr && *__mptr != X##structname) abort();                \
 436                 (struct structname *)( (char *)__mptr);})
 437
 438         #define new(structname) ({                                              \
 439                 struct structname *__ptr = ((struct structname *)calloc(1,sizeof(struct structname))); \
 440                 __ptr->type = X##structname;                                            \
 441                 __ptr->line = -1; __ptr->column = -1;                                   \
 442                 __ptr;})
 443
 444         #define new_pos(structname, token) ({                                           \
 445                 struct structname *__ptr = ((struct structname *)calloc(1,sizeof(struct structname))); \
 446                 __ptr->type = X##structname;                                            \
 447                 __ptr->line = token.line; __ptr->column = token.col;                    \
 448                 __ptr;})
 449
 450 ###### ast
 451         enum exec_types {
 452                 Xbinode,
 453                 ## exec type
 454         };
 455         struct exec {
 456                 enum exec_types type;
 457                 int line, column;
 458                 ## exec fields
 459         };
 460         struct binode {
 461                 struct exec;
 462                 enum Btype {
 463                         ## Binode types
 464                 } op;
 465                 struct exec *left, *right;
 466         };
 467
 468 ###### ast functions
 469
 470         static int __fput_loc(struct exec *loc, FILE *f)
 471         {
 472                 if (!loc)
 473                         return 0;
 474                 if (loc->line >= 0) {
 475                         fprintf(f, "%d:%d: ", loc->line, loc->column);
 476                         return 1;
 477                 }
 478                 if (loc->type == Xbinode)
 479                         return __fput_loc(cast(binode,loc)->left, f) ||
 480                                __fput_loc(cast(binode,loc)->right, f);  // NOTEST
 481                 return 0;
 482         }
 483         static void fput_loc(struct exec *loc, FILE *f)
 484         {
 485                 if (!__fput_loc(loc, f))
 486                         fprintf(f, "??:??: ");
 487         }
 488
 489 Each different type of `exec` node needs a number of functions defined,
 490 a bit like methods.  We must be able to free it, print it, analyse it
 491 and execute it.  Once we have specific `exec` types we will need to
 492 parse them too.  Let's take this a bit more slowly.
 493
 494 #### Freeing
 495
 496 The parser generator requires a `free_foo` function for each struct
 497 that stores attributes and they will often be `exec`s and subtypes
 498 there-of.  So we need `free_exec` which can handle all the subtypes,
 499 and we need `free_binode`.
 500
 501 ###### ast functions
 502
 503         static void free_binode(struct binode *b)
 504         {
 505                 if (!b)
 506                         return;
 507                 free_exec(b->left);
 508                 free_exec(b->right);
 509                 free(b);
 510         }
 511
 512 ###### core functions
 513         static void free_exec(struct exec *e)
 514         {
 515                 if (!e)
 516                         return;
 517                 switch(e->type) {
 518                         ## free exec cases
 519                 }
 520         }
 521
 522 ###### forward decls
 523
 524         static void free_exec(struct exec *e);
 525
 526 ###### free exec cases
 527         case Xbinode: free_binode(cast(binode, e)); break;
 528
 529 #### Printing
 530
 531 Printing an `exec` requires that we know the current indent level for
 532 printing line-oriented components.  As will become clear later, we
 533 also want to know what sort of bracketing to use.
 534
 535 ###### ast functions
 536
 537         static void do_indent(int i, char *str)
 538         {
 539                 while (i-- > 0)
 540                         printf("    ");
 541                 printf("%s", str);
 542         }
 543
 544 ###### core functions
 545         static void print_binode(struct binode *b, int indent, int bracket)
 546         {
 547                 struct binode *b2;
 548                 switch(b->op) {
 549                 ## print binode cases
 550                 }
 551         }
 552
 553         static void print_exec(struct exec *e, int indent, int bracket)
 554         {
 555                 if (!e)
 556                         return;
 557                 switch (e->type) {
 558                 case Xbinode:
 559                         print_binode(cast(binode, e), indent, bracket); break;
 560                 ## print exec cases
 561                 }
 562                 if (e->to_free) {
 563                         struct variable *v;
 564                         do_indent(indent, "/* FREE");
 565                         for (v = e->to_free; v; v = v->next_free) {
 566                                 printf(" %.*s", v->name->name.len, v->name->name.txt);
 567                                 printf("[%d,%d]", v->scope_start, v->scope_end);
 568                                 if (v->frame_pos >= 0)
 569                                         printf("(%d+%d)", v->frame_pos,
 570                                                v->type ? v->type->size:0);
 571                         }
 572                         printf(" */\n");
 573                 }
 574         }
 575
 576 ###### forward decls
 577
 578         static void print_exec(struct exec *e, int indent, int bracket);
 579
 580 #### Analysing
 581
 582 As discussed, analysis involves propagating type requirements around the
 583 program and looking for errors.
 584
 585 So `propagate_types` is passed an expected type (being a `struct type`
 586 pointer together with some `val_rules` flags) that the `exec` is
 587 expected to return, and returns the type that it does return, either of
 588 which can be `NULL` signifying "unknown".  A `prop_err` flag set is
 589 passed by reference.  It has `Efail` set when an error is found, and
 590 `Eretry` when the type for some element is set via propagation.  If
 591 any expression cannot be evaluated immediately, `Enoconst` is set.
 592
 593 If it remains unchanged at `0`, then no more propagation is needed.
 594
 595 ###### ast
 596
 597         enum val_rules {Rnolabel = 1<<0, Rboolok = 1<<1, Rnoconstant = 1<<2};
 598         enum prop_err {Efail = 1<<0, Eretry = 1<<1, Enoconst = 1<<2};
 599
 600 ###### format cases
 601         case 'r':
 602                 if (rules & Rnolabel)
 603                         fputs(" (labels not permitted)", stderr);
 604                 break;
 605
 606 ###### forward decls
 607         static struct type *propagate_types(struct exec *prog, struct parse_context *c, enum prop_err *perr,
 608                                             struct type *type, int rules);
 609 ###### core functions
 610
 611         static struct type *__propagate_types(struct exec *prog, struct parse_context *c, enum prop_err *perr,
 612                                               struct type *type, int rules)
 613         {
 614                 struct type *t;
 615
 616                 if (!prog)
 617                         return Tnone;
 618
 619                 switch (prog->type) {
 620                 case Xbinode:
 621                 {
 622                         struct binode *b = cast(binode, prog);
 623                         switch (b->op) {
 624                         ## propagate binode cases
 625                         }
 626                         break;
 627                 }
 628                 ## propagate exec cases
 629                 }
 630                 return Tnone;
 631         }
 632
 633         static struct type *propagate_types(struct exec *prog, struct parse_context *c, enum prop_err *perr,
 634                                             struct type *type, int rules)
 635         {
 636                 int pre_err = c->parse_error;
 637                 struct type *ret = __propagate_types(prog, c, perr, type, rules);
 638
 639                 if (c->parse_error > pre_err)
 640                         *perr |= Efail;
 641                 return ret;
 642         }
 643
 644 #### Interpreting
 645
 646 Interpreting an `exec` doesn't require anything but the `exec`.  State
 647 is stored in variables and each variable will be directly linked from
 648 within the `exec` tree.  The exception to this is the `main` function
 649 which needs to look at command line arguments.  This function will be
 650 interpreted separately.
 651
 652 Each `exec` can return a value combined with a type in `struct lrval`.
 653 The type may be `Tnone` but must be non-NULL.  Some `exec`s will return
 654 the location of a value, which can be updated, in `lval`.  Others will
 655 set `lval` to NULL indicating that there is a value of appropriate type
 656 in `rval`.
 657
 658 ###### forward decls
 659         static struct value interp_exec(struct parse_context *c, struct exec *e,
 660                                         struct type **typeret);
 661 ###### core functions
 662
 663         struct lrval {
 664                 struct type *type;
 665                 struct value rval, *lval;
 666         };
 667
 668         /* If dest is passed, dtype must give the expected type, and
 669          * result can go there, in which case type is returned as NULL.
 670          */
 671         static struct lrval _interp_exec(struct parse_context *c, struct exec *e,
 672                                          struct value *dest, struct type *dtype);
 673
 674         static struct value interp_exec(struct parse_context *c, struct exec *e,
 675                                         struct type **typeret)
 676         {
 677                 struct lrval ret = _interp_exec(c, e, NULL, NULL);
 678
 679                 if (!ret.type) abort();
 680                 if (typeret)
 681                         *typeret = ret.type;
 682                 if (ret.lval)
 683                         dup_value(ret.type, ret.lval, &ret.rval);
 684                 return ret.rval;
 685         }
 686
 687         static struct value *linterp_exec(struct parse_context *c, struct exec *e,
 688                                           struct type **typeret)
 689         {
 690                 struct lrval ret = _interp_exec(c, e, NULL, NULL);
 691
 692                 if (!ret.type) abort();
 693                 if (ret.lval)
 694                         *typeret = ret.type;
 695                 else
 696                         free_value(ret.type, &ret.rval);
 697                 return ret.lval;
 698         }
 699
 700         /* dinterp_exec is used when the destination type is certain and
 701          * the value has a place to go.
 702          */
 703         static void dinterp_exec(struct parse_context *c, struct exec *e,
 704                                  struct value *dest, struct type *dtype,
 705                                  int need_free)
 706         {
 707                 struct lrval ret = _interp_exec(c, e, dest, dtype);
 708                 if (!ret.type)
 709                         return;
 710                 if (need_free)
 711                         free_value(dtype, dest);
 712                 if (ret.lval)
 713                         dup_value(dtype, ret.lval, dest);
 714                 else
 715                         memcpy(dest, &ret.rval, dtype->size);
 716         }
 717
 718         static struct lrval _interp_exec(struct parse_context *c, struct exec *e,
 719                                          struct value *dest, struct type *dtype)
 720         {
 721                 /* If the result is copied to dest, ret.type is set to NULL */
 722                 struct lrval ret;
 723                 struct value rv = {}, *lrv = NULL;
 724                 struct type *rvtype;
 725
 726                 rvtype = ret.type = Tnone;
 727                 if (!e) {
 728                         ret.lval = lrv;
 729                         ret.rval = rv;
 730                         return ret;
 731                 }
 732
 733                 switch(e->type) {
 734                 case Xbinode:
 735                 {
 736                         struct binode *b = cast(binode, e);
 737                         struct value left, right, *lleft;
 738                         struct type *ltype, *rtype;
 739                         ltype = rtype = Tnone;
 740                         switch (b->op) {
 741                         ## interp binode cases
 742                         }
 743                         free_value(ltype, &left);
 744                         free_value(rtype, &right);
 745                         break;
 746                 }
 747                 ## interp exec cases
 748                 }
 749                 if (rvtype) {
 750                         ret.lval = lrv;
 751                         ret.rval = rv;
 752                         ret.type = rvtype;
 753                 }
 754                 ## interp exec cleanup
 755                 return ret;
 756         }
 757
 758 ### Types
 759
 760 Values come in a wide range of types, with more likely to be added.
 761 Each type needs to be able to print its own values (for convenience at
 762 least) as well as to compare two values, at least for equality and
 763 possibly for order.  For now, values might need to be duplicated and
 764 freed, though eventually such manipulations will be better integrated
 765 into the language.
 766
 767 Rather than requiring every numeric type to support all numeric
 768 operations (add, multiply, etc), we allow types to be able to present
 769 as one of a few standard types: integer, float, and fraction.  The
 770 existence of these conversion functions eventually enable types to
 771 determine if they are compatible with other types, though such types
 772 have not yet been implemented.
 773
 774 Named type are stored in a simple linked list.  Objects of each type are
 775 "values" which are often passed around by value.
 776
 777 There are both explicitly named types, and anonymous types.  Anonymous
 778 cannot be accessed by name, but are used internally and have a name
 779 which might be reported in error messages.
 780
 781 ###### ast
 782
 783         struct value {
 784                 union {
 785                         char ptr[1];
 786                         ## value union fields
 787                 };
 788         };
 789
 790         struct type {
 791                 struct text name;
 792                 struct type *next;
 793                 struct token first_use;
 794                 int size, align;
 795                 int anon;
 796                 void (*init)(struct type *type, struct value *val);
 797                 int (*prepare_type)(struct parse_context *c, struct type *type, int parse_time);
 798                 void (*print)(struct type *type, struct value *val, FILE *f);
 799                 void (*print_type)(struct type *type, FILE *f);
 800                 int (*cmp_order)(struct type *t1, struct type *t2,
 801                                  struct value *v1, struct value *v2);
 802                 int (*cmp_eq)(struct type *t1, struct type *t2,
 803                               struct value *v1, struct value *v2);
 804                 void (*dup)(struct type *type, struct value *vold, struct value *vnew);
 805                 void (*free)(struct type *type, struct value *val);
 806                 void (*free_type)(struct type *t);
 807                 long long (*to_int)(struct value *v);
 808                 double (*to_float)(struct value *v);
 809                 int (*to_mpq)(mpq_t *q, struct value *v);
 810                 ## type functions
 811                 union {
 812                         ## type union fields
 813                 };
 814         };
 815
 816 ###### parse context
 817
 818         struct type *typelist;
 819
 820 ###### includes
 821         #include <stdarg.h>
 822
 823 ###### ast functions
 824
 825         static struct type *find_type(struct parse_context *c, struct text s)
 826         {
 827                 struct type *t = c->typelist;
 828
 829                 while (t && (t->anon ||
 830                              text_cmp(t->name, s) != 0))
 831                                 t = t->next;
 832                 return t;
 833         }
 834
 835         static struct type *_add_type(struct parse_context *c, struct text s,
 836                                      struct type *proto, int anon)
 837         {
 838                 struct type *n;
 839
 840                 n = calloc(1, sizeof(*n));
 841                 if (proto)
 842                         *n = *proto;
 843                 else
 844                         n->size = -1;
 845                 n->name = s;
 846                 n->anon = anon;
 847                 n->next = c->typelist;
 848                 c->typelist = n;
 849                 return n;
 850         }
 851
 852         static struct type *add_type(struct parse_context *c, struct text s,
 853                                       struct type *proto)
 854         {
 855                 return _add_type(c, s, proto, 0);
 856         }
 857
 858         static struct type *add_anon_type(struct parse_context *c,
 859                                           struct type *proto, char *name, ...)
 860         {
 861                 struct text t;
 862                 va_list ap;
 863
 864                 va_start(ap, name);
 865                 vasprintf(&t.txt, name, ap);
 866                 va_end(ap);
 867                 t.len = strlen(name);
 868                 return _add_type(c, t, proto, 1);
 869         }
 870
 871         static void free_type(struct type *t)
 872         {
 873                 /* The type is always a reference to something in the
 874                  * context, so we don't need to free anything.
 875                  */
 876         }
 877
 878         static void free_value(struct type *type, struct value *v)
 879         {
 880                 if (type && v) {
 881                         type->free(type, v);
 882                         memset(v, 0x5a, type->size);
 883                 }
 884         }
 885
 886         static void type_print(struct type *type, FILE *f)
 887         {
 888                 if (!type)
 889                         fputs("*unknown*type*", f);     // NOTEST
 890                 else if (type->name.len && !type->anon)
 891                         fprintf(f, "%.*s", type->name.len, type->name.txt);
 892                 else if (type->print_type)
 893                         type->print_type(type, f);
 894                 else
 895                         fputs("*invalid*type*", f);
 896         }
 897
 898         static void val_init(struct type *type, struct value *val)
 899         {
 900                 if (type && type->init)
 901                         type->init(type, val);
 902         }
 903
 904         static void dup_value(struct type *type,
 905                               struct value *vold, struct value *vnew)
 906         {
 907                 if (type && type->dup)
 908                         type->dup(type, vold, vnew);
 909         }
 910
 911         static int value_cmp(struct type *tl, struct type *tr,
 912                              struct value *left, struct value *right)
 913         {
 914                 if (tl && tl->cmp_order)
 915                         return tl->cmp_order(tl, tr, left, right);
 916                 if (tl && tl->cmp_eq)                   // NOTEST
 917                         return tl->cmp_eq(tl, tr, left, right); // NOTEST
 918                 return -1;                              // NOTEST
 919         }
 920
 921         static void print_value(struct type *type, struct value *v, FILE *f)
 922         {
 923                 if (type && type->print)
 924                         type->print(type, v, f);
 925                 else
 926                         fprintf(f, "*Unknown*");                // NOTEST
 927         }
 928
 929         static void prepare_types(struct parse_context *c)
 930         {
 931                 struct type *t;
 932                 int retry = 1;
 933                 enum { none, some, cannot } progress = none;
 934
 935                 while (retry) {
 936                         retry = 0;
 937
 938                         for (t = c->typelist; t; t = t->next) {
 939                                 if (t->size < 0)
 940                                         tok_err(c, "error: type used but not declared",
 941                                                  &t->first_use);
 942                                 if (t->size == 0 && t->prepare_type) {
 943                                         if (t->prepare_type(c, t, 1))
 944                                                 progress = some;
 945                                         else if (progress == cannot)
 946                                                 tok_err(c, "error: type has recursive definition",
 947                                                         &t->first_use);
 948                                         else
 949                                                 retry = 1;
 950                                 }
 951                         }
 952                         switch (progress) {
 953                         case cannot:
 954                                 retry = 0; break;
 955                         case none:
 956                                 progress = cannot; break;
 957                         case some:
 958                                 progress = none; break;
 959                         }
 960                 }
 961         }
 962
 963 ###### forward decls
 964
 965         static void free_value(struct type *type, struct value *v);
 966         static int type_compat(struct type *require, struct type *have, int rules);
 967         static void type_print(struct type *type, FILE *f);
 968         static void val_init(struct type *type, struct value *v);
 969         static void dup_value(struct type *type,
 970                               struct value *vold, struct value *vnew);
 971         static int value_cmp(struct type *tl, struct type *tr,
 972                              struct value *left, struct value *right);
 973         static void print_value(struct type *type, struct value *v, FILE *f);
 974
 975 ###### free context types
 976
 977         while (context.typelist) {
 978                 struct type *t = context.typelist;
 979
 980                 context.typelist = t->next;
 981                 if (t->free_type)
 982                         t->free_type(t);
 983                 if (t->anon)
 984                         free(t->name.txt);
 985                 free(t);
 986         }
 987
 988 Type can be specified for local variables, for fields in a structure,
 989 for formal parameters to functions, and possibly elsewhere.  Different
 990 rules may apply in different contexts.  As a minimum, a named type may
 991 always be used.  Currently the type of a formal parameter can be
 992 different from types in other contexts, so we have a separate grammar
 993 symbol for those.
 994
 995 ###### Grammar
 996
 997         $*type
 998         Type -> IDENTIFIER ${
 999                 $0 = find_type(c, $ID.txt);
1000                 if (!$0) {
1001                         $0 = add_type(c, $ID.txt, NULL);
1002                         $0->first_use = $ID;
1003                 }
1004         }$
1005         ## type grammar
1006
1007         FormalType -> Type ${ $0 = $<1; }$
1008         ## formal type grammar
1009
1010 #### Base Types
1011
1012 Values of the base types can be numbers, which we represent as
1013 multi-precision fractions, strings, Booleans and labels.  When
1014 analysing the program we also need to allow for places where no value
1015 is meaningful (type `Tnone`) and where we don't know what type to
1016 expect yet (type is `NULL`).
1017
1018 Values are never shared, they are always copied when used, and freed
1019 when no longer needed.
1020
1021 When propagating type information around the program, we need to
1022 determine if two types are compatible, where type `NULL` is compatible
1023 with anything.  There are two special cases with type compatibility,
1024 both related to the Conditional Statement which will be described
1025 later.  In some cases a Boolean can be accepted as well as some other
1026 primary type, and in others any type is acceptable except a label (`Vlabel`).
1027 A separate function encoding these cases will simplify some code later.
1028
1029 ###### type functions
1030
1031         int (*compat)(struct type *this, struct type *other);
1032
1033 ###### ast functions
1034
1035         static int type_compat(struct type *require, struct type *have, int rules)
1036         {
1037                 if ((rules & Rboolok) && have == Tbool)
1038                         return 1;       // NOTEST
1039                 if ((rules & Rnolabel) && have == Tlabel)
1040                         return 0;       // NOTEST
1041                 if (!require || !have)
1042                         return 1;
1043
1044                 if (require->compat)
1045                         return require->compat(require, have);
1046
1047                 return require == have;
1048         }
1049
1050 ###### includes
1051         #include <gmp.h>
1052         #include "parse_string.h"
1053         #include "parse_number.h"
1054
1055 ###### libs
1056         myLDLIBS := libnumber.o libstring.o -lgmp
1057         LDLIBS := $(filter-out $(myLDLIBS),$(LDLIBS)) $(myLDLIBS)
1058
1059 ###### type union fields
1060         enum vtype {Vnone, Vstr, Vnum, Vbool, Vlabel} vtype;
1061
1062 ###### value union fields
1063         struct text str;
1064         mpq_t num;
1065         unsigned char bool;
1066         void *label;
1067
1068 ###### ast functions
1069         static void _free_value(struct type *type, struct value *v)
1070         {
1071                 if (!v)
1072                         return;         // NOTEST
1073                 switch (type->vtype) {
1074                 case Vnone: break;
1075                 case Vstr: free(v->str.txt); break;
1076                 case Vnum: mpq_clear(v->num); break;
1077                 case Vlabel:
1078                 case Vbool: break;
1079                 }
1080         }
1081
1082 ###### value functions
1083
1084         static void _val_init(struct type *type, struct value *val)
1085         {
1086                 switch(type->vtype) {
1087                 case Vnone:             // NOTEST
1088                         break;          // NOTEST
1089                 case Vnum:
1090                         mpq_init(val->num); break;
1091                 case Vstr:
1092                         val->str.txt = malloc(1);
1093                         val->str.len = 0;
1094                         break;
1095                 case Vbool:
1096                         val->bool = 0;
1097                         break;
1098                 case Vlabel:
1099                         val->label = NULL;
1100                         break;
1101                 }
1102         }
1103
1104         static void _dup_value(struct type *type,
1105                                struct value *vold, struct value *vnew)
1106         {
1107                 switch (type->vtype) {
1108                 case Vnone:             // NOTEST
1109                         break;          // NOTEST
1110                 case Vlabel:
1111                         vnew->label = vold->label;
1112                         break;
1113                 case Vbool:
1114                         vnew->bool = vold->bool;
1115                         break;
1116                 case Vnum:
1117                         mpq_init(vnew->num);
1118                         mpq_set(vnew->num, vold->num);
1119                         break;
1120                 case Vstr:
1121                         vnew->str.len = vold->str.len;
1122                         vnew->str.txt = malloc(vnew->str.len);
1123                         memcpy(vnew->str.txt, vold->str.txt, vnew->str.len);
1124                         break;
1125                 }
1126         }
1127
1128         static int _value_cmp(struct type *tl, struct type *tr,
1129                               struct value *left, struct value *right)
1130         {
1131                 int cmp;
1132                 if (tl != tr)
1133                         return tl - tr; // NOTEST
1134                 switch (tl->vtype) {
1135                 case Vlabel: cmp = left->label == right->label ? 0 : 1; break;
1136                 case Vnum: cmp = mpq_cmp(left->num, right->num); break;
1137                 case Vstr: cmp = text_cmp(left->str, right->str); break;
1138                 case Vbool: cmp = left->bool - right->bool; break;
1139                 case Vnone: cmp = 0;                    // NOTEST
1140                 }
1141                 return cmp;
1142         }
1143
1144         static void _print_value(struct type *type, struct value *v, FILE *f)
1145         {
1146                 switch (type->vtype) {
1147                 case Vnone:                             // NOTEST
1148                         fprintf(f, "*no-value*"); break;        // NOTEST
1149                 case Vlabel:                            // NOTEST
1150                         fprintf(f, "*label-%p*", v->label); break; // NOTEST
1151                 case Vstr:
1152                         fprintf(f, "%.*s", v->str.len, v->str.txt); break;
1153                 case Vbool:
1154                         fprintf(f, "%s", v->bool ? "True":"False"); break;
1155                 case Vnum:
1156                         {
1157                         mpf_t fl;
1158                         mpf_init2(fl, 20);
1159                         mpf_set_q(fl, v->num);
1160                         gmp_fprintf(f, "%.10Fg", fl);
1161                         mpf_clear(fl);
1162                         break;
1163                         }
1164                 }
1165         }
1166
1167         static void _free_value(struct type *type, struct value *v);
1168
1169         static struct type base_prototype = {
1170                 .init = _val_init,
1171                 .print = _print_value,
1172                 .cmp_order = _value_cmp,
1173                 .cmp_eq = _value_cmp,
1174                 .dup = _dup_value,
1175                 .free = _free_value,
1176         };
1177
1178         static struct type *Tbool, *Tstr, *Tnum, *Tnone, *Tlabel;
1179
1180 ###### ast functions
1181         static struct type *add_base_type(struct parse_context *c, char *n,
1182                                           enum vtype vt, int size)
1183         {
1184                 struct text txt = { n, strlen(n) };
1185                 struct type *t;
1186
1187                 t = add_type(c, txt, &base_prototype);
1188                 t->vtype = vt;
1189                 t->size = size;
1190                 t->align = size > sizeof(void*) ? sizeof(void*) : size;
1191                 if (t->size & (t->align - 1))
1192                         t->size = (t->size | (t->align - 1)) + 1;       // NOTEST
1193                 return t;
1194         }
1195
1196 ###### context initialization
1197
1198         Tbool  = add_base_type(&context, "Boolean", Vbool, sizeof(char));
1199         Tstr   = add_base_type(&context, "string", Vstr, sizeof(struct text));
1200         Tnum   = add_base_type(&context, "number", Vnum, sizeof(mpq_t));
1201         Tnone  = add_base_type(&context, "none", Vnone, 0);
1202         Tlabel = add_base_type(&context, "label", Vlabel, sizeof(void*));
1203
1204 ##### Base Values
1205
1206 We have already met values as separate objects.  When manifest constants
1207 appear in the program text, that must result in an executable which has
1208 a constant value.  So the `val` structure embeds a value in an
1209 executable.
1210
1211 ###### exec type
1212         Xval,
1213
1214 ###### ast
1215         struct val {
1216                 struct exec;
1217                 struct type *vtype;
1218                 struct value val;
1219         };
1220
1221 ###### ast functions
1222         struct val *new_val(struct type *T, struct token tk)
1223         {
1224                 struct val *v = new_pos(val, tk);
1225                 v->vtype = T;
1226                 return v;
1227         }
1228
1229 ###### Grammar
1230
1231         $TERM True False
1232
1233         $*val
1234         Value ->  True ${
1235                 $0 = new_val(Tbool, $1);
1236                 $0->val.bool = 1;
1237         }$
1238         | False ${
1239                 $0 = new_val(Tbool, $1);
1240                 $0->val.bool = 0;
1241         }$
1242         | NUMBER ${ {
1243                 char tail[3];
1244                 $0 = new_val(Tnum, $1);
1245                 if (number_parse($0->val.num, tail, $1.txt) == 0)
1246                         mpq_init($0->val.num);  // UNTESTED
1247                         if (tail[0])
1248                                 tok_err(c, "error: unsupported number suffix",
1249                                         &$1);
1250         } }$
1251         | STRING ${ {
1252                 char tail[3];
1253                 $0 = new_val(Tstr, $1);
1254                 string_parse(&$1, '\\', &$0->val.str, tail);
1255                 if (tail[0])
1256                         tok_err(c, "error: unsupported string suffix",
1257                                 &$1);
1258         } }$
1259         | MULTI_STRING ${ {
1260                 char tail[3];
1261                 $0 = new_val(Tstr, $1);
1262                 string_parse(&$1, '\\', &$0->val.str, tail);
1263                 if (tail[0])
1264                         tok_err(c, "error: unsupported string suffix",
1265                                 &$1);
1266         } }$
1267
1268 ###### print exec cases
1269         case Xval:
1270         {
1271                 struct val *v = cast(val, e);
1272                 if (v->vtype == Tstr)
1273                         printf("\"");
1274                 // FIXME how to ensure numbers have same precision.
1275                 print_value(v->vtype, &v->val, stdout);
1276                 if (v->vtype == Tstr)
1277                         printf("\"");
1278                 break;
1279         }
1280
1281 ###### propagate exec cases
1282         case Xval:
1283         {
1284                 struct val *val = cast(val, prog);
1285                 if (!type_compat(type, val->vtype, rules))
1286                         type_err(c, "error: expected %1%r found %2",
1287                                    prog, type, rules, val->vtype);
1288                 return val->vtype;
1289         }
1290
1291 ###### interp exec cases
1292         case Xval:
1293                 rvtype = cast(val, e)->vtype;
1294                 dup_value(rvtype, &cast(val, e)->val, &rv);
1295                 break;
1296
1297 ###### ast functions
1298         static void free_val(struct val *v)
1299         {
1300                 if (v)
1301                         free_value(v->vtype, &v->val);
1302                 free(v);
1303         }
1304
1305 ###### free exec cases
1306         case Xval: free_val(cast(val, e)); break;
1307
1308 ###### ast functions
1309         // Move all nodes from 'b' to 'rv', reversing their order.
1310         // In 'b' 'left' is a list, and 'right' is the last node.
1311         // In 'rv', left' is the first node and 'right' is a list.
1312         static struct binode *reorder_bilist(struct binode *b)
1313         {
1314                 struct binode *rv = NULL;
1315
1316                 while (b) {
1317                         struct exec *t = b->right;
1318                         b->right = rv;
1319                         rv = b;
1320                         if (b->left)
1321                                 b = cast(binode, b->left);
1322                         else
1323                                 b = NULL;
1324                         rv->left = t;
1325                 }
1326                 return rv;
1327         }
1328
1329 ### Variables
1330
1331 Variables are scoped named values.  We store the names in a linked list
1332 of "bindings" sorted in lexical order, and use sequential search and
1333 insertion sort.
1334
1335 ###### ast
1336
1337         struct binding {
1338                 struct text name;
1339                 struct binding *next;   // in lexical order
1340                 ## binding fields
1341         };
1342
1343 This linked list is stored in the parse context so that "reduce"
1344 functions can find or add variables, and so the analysis phase can
1345 ensure that every variable gets a type.
1346
1347 ###### parse context
1348
1349         struct binding *varlist;  // In lexical order
1350
1351 ###### ast functions
1352
1353         static struct binding *find_binding(struct parse_context *c, struct text s)
1354         {
1355                 struct binding **l = &c->varlist;
1356                 struct binding *n;
1357                 int cmp = 1;
1358
1359                 while (*l &&
1360                         (cmp = text_cmp((*l)->name, s)) < 0)
1361                                 l = & (*l)->next;
1362                 if (cmp == 0)
1363                         return *l;
1364                 n = calloc(1, sizeof(*n));
1365                 n->name = s;
1366                 n->next = *l;
1367                 *l = n;
1368                 return n;
1369         }
1370
1371 Each name can be linked to multiple variables defined in different
1372 scopes.  Each scope starts where the name is declared and continues
1373 until the end of the containing code block.  Scopes of a given name
1374 cannot nest, so a declaration while a name is in-scope is an error.
1375
1376 ###### binding fields
1377         struct variable *var;
1378
1379 ###### ast
1380         struct variable {
1381                 struct variable *previous;
1382                 struct type *type;
1383                 struct binding *name;
1384                 struct exec *where_decl;// where name was declared
1385                 struct exec *where_set; // where type was set
1386                 ## variable fields
1387         };
1388
1389 When a scope closes, the values of the variables might need to be freed.
1390 This happens in the context of some `struct exec` and each `exec` will
1391 need to know which variables need to be freed when it completes.
1392
1393 ####### exec fields
1394         struct variable *to_free;
1395
1396 ####### variable fields
1397         struct exec *cleanup_exec;
1398         struct variable *next_free;
1399
1400 ####### interp exec cleanup
1401         {
1402                 struct variable *v;
1403                 for (v = e->to_free; v; v = v->next_free) {
1404                         struct value *val = var_value(c, v);
1405                         free_value(v->type, val);
1406                 }
1407         }
1408
1409 ###### ast functions
1410         static void variable_unlink_exec(struct variable *v)
1411         {
1412                 struct variable **vp;
1413                 if (!v->cleanup_exec)
1414                         return;
1415                 for (vp = &v->cleanup_exec->to_free;
1416                     *vp; vp = &(*vp)->next_free) {
1417                         if (*vp != v)
1418                                 continue;
1419                         *vp = v->next_free;
1420                         v->cleanup_exec = NULL;
1421                         break;
1422                 }
1423         }
1424
1425 While the naming seems strange, we include local constants in the
1426 definition of variables.  A name declared `var := value` can
1427 subsequently be changed, but a name declared `var ::= value` cannot -
1428 it is constant
1429
1430 ###### variable fields
1431         int constant;
1432
1433 Scopes in parallel branches can be partially merged.  More
1434 specifically, if a given name is declared in both branches of an
1435 if/else then its scope is a candidate for merging.  Similarly if
1436 every branch of an exhaustive switch (e.g. has an "else" clause)
1437 declares a given name, then the scopes from the branches are
1438 candidates for merging.
1439
1440 Note that names declared inside a loop (which is only parallel to
1441 itself) are never visible after the loop.  Similarly names defined in
1442 scopes which are not parallel, such as those started by `for` and
1443 `switch`, are never visible after the scope.  Only variables defined in
1444 both `then` and `else` (including the implicit then after an `if`, and
1445 excluding `then` used with `for`) and in all `case`s and `else` of a
1446 `switch` or `while` can be visible beyond the `if`/`switch`/`while`.
1447
1448 Labels, which are a bit like variables, follow different rules.
1449 Labels are not explicitly declared, but if an undeclared name appears
1450 in a context where a label is legal, that effectively declares the
1451 name as a label.  The declaration remains in force (or in scope) at
1452 least to the end of the immediately containing block and conditionally
1453 in any larger containing block which does not declare the name in some
1454 other way.  Importantly, the conditional scope extension happens even
1455 if the label is only used in one parallel branch of a conditional --
1456 when used in one branch it is treated as having been declared in all
1457 branches.
1458
1459 Merge candidates are tentatively visible beyond the end of the
1460 branching statement which creates them.  If the name is used, the
1461 merge is affirmed and they become a single variable visible at the
1462 outer layer.  If not - if it is redeclared first - the merge lapses.
1463
1464 To track scopes we have an extra stack, implemented as a linked list,
1465 which roughly parallels the parse stack and which is used exclusively
1466 for scoping.  When a new scope is opened, a new frame is pushed and
1467 the child-count of the parent frame is incremented.  This child-count
1468 is used to distinguish between the first of a set of parallel scopes,
1469 in which declared variables must not be in scope, and subsequent
1470 branches, whether they may already be conditionally scoped.
1471
1472 We need a total ordering of scopes so we can easily compare to variables
1473 to see if they are concurrently in scope.  To achieve this we record a
1474 `scope_count` which is actually a count of both beginnings and endings
1475 of scopes.  Then each variable has a record of the scope count where it
1476 enters scope, and where it leaves.
1477
1478 To push a new frame *before* any code in the frame is parsed, we need a
1479 grammar reduction.  This is most easily achieved with a grammar
1480 element which derives the empty string, and creates the new scope when
1481 it is recognised.  This can be placed, for example, between a keyword
1482 like "if" and the code following it.
1483
1484 ###### ast
1485         struct scope {
1486                 struct scope *parent;
1487                 int child_count;
1488         };
1489
1490 ###### parse context
1491         int scope_depth;
1492         int scope_count;
1493         struct scope *scope_stack;
1494
1495 ###### variable fields
1496         int scope_start, scope_end;
1497
1498 ###### ast functions
1499         static void scope_pop(struct parse_context *c)
1500         {
1501                 struct scope *s = c->scope_stack;
1502
1503                 c->scope_stack = s->parent;
1504                 free(s);
1505                 c->scope_depth -= 1;
1506                 c->scope_count += 1;
1507         }
1508
1509         static void scope_push(struct parse_context *c)
1510         {
1511                 struct scope *s = calloc(1, sizeof(*s));
1512                 if (c->scope_stack)
1513                         c->scope_stack->child_count += 1;
1514                 s->parent = c->scope_stack;
1515                 c->scope_stack = s;
1516                 c->scope_depth += 1;
1517                 c->scope_count += 1;
1518         }
1519
1520 ###### Grammar
1521
1522         $void
1523         OpenScope -> ${ scope_push(c); }$
1524
1525 Each variable records a scope depth and is in one of four states:
1526
1527 - "in scope".  This is the case between the declaration of the
1528   variable and the end of the containing block, and also between
1529   the usage with affirms a merge and the end of that block.
1530
1531   The scope depth is not greater than the current parse context scope
1532   nest depth.  When the block of that depth closes, the state will
1533   change.  To achieve this, all "in scope" variables are linked
1534   together as a stack in nesting order.
1535
1536 - "pending".  The "in scope" block has closed, but other parallel
1537   scopes are still being processed.  So far, every parallel block at
1538   the same level that has closed has declared the name.
1539
1540   The scope depth is the depth of the last parallel block that
1541   enclosed the declaration, and that has closed.
1542
1543 - "conditionally in scope".  The "in scope" block and all parallel
1544   scopes have closed, and no further mention of the name has been seen.
1545   This state includes a secondary nest depth (`min_depth`) which records
1546   the outermost scope seen since the variable became conditionally in
1547   scope.  If a use of the name is found, the variable becomes "in scope"
1548   and that secondary depth becomes the recorded scope depth.  If the
1549   name is declared as a new variable, the old variable becomes "out of
1550   scope" and the recorded scope depth stays unchanged.
1551
1552 - "out of scope".  The variable is neither in scope nor conditionally
1553   in scope.  It is permanently out of scope now and can be removed from
1554   the "in scope" stack.  When a variable becomes out-of-scope it is
1555   moved to a separate list (`out_scope`) of variables which have fully
1556   known scope.  This will be used at the end of each function to assign
1557   each variable a place in the stack frame.
1558
1559 ###### variable fields
1560         int depth, min_depth;
1561         enum { OutScope, PendingScope, CondScope, InScope } scope;
1562         struct variable *in_scope;
1563
1564 ###### parse context
1565
1566         struct variable *in_scope;
1567         struct variable *out_scope;
1568
1569 All variables with the same name are linked together using the
1570 'previous' link.  Those variable that have been affirmatively merged all
1571 have a 'merged' pointer that points to one primary variable - the most
1572 recently declared instance.  When merging variables, we need to also
1573 adjust the 'merged' pointer on any other variables that had previously
1574 been merged with the one that will no longer be primary.
1575
1576 A variable that is no longer the most recent instance of a name may
1577 still have "pending" scope, if it might still be merged with most
1578 recent instance.  These variables don't really belong in the
1579 "in_scope" list, but are not immediately removed when a new instance
1580 is found.  Instead, they are detected and ignored when considering the
1581 list of in_scope names.
1582
1583 The storage of the value of a variable will be described later.  For now
1584 we just need to know that when a variable goes out of scope, it might
1585 need to be freed.  For this we need to be able to find it, so assume that
1586 `var_value()` will provide that.
1587
1588 ###### variable fields
1589         struct variable *merged;
1590
1591 ###### ast functions
1592
1593         static void variable_merge(struct variable *primary, struct variable *secondary)
1594         {
1595                 struct variable *v;
1596
1597                 primary = primary->merged;
1598
1599                 for (v = primary->previous; v; v=v->previous)
1600                         if (v == secondary || v == secondary->merged ||
1601                             v->merged == secondary ||
1602                             v->merged == secondary->merged) {
1603                                 v->scope = OutScope;
1604                                 v->merged = primary;
1605                                 if (v->scope_start < primary->scope_start)
1606                                         primary->scope_start = v->scope_start;
1607                                 if (v->scope_end > primary->scope_end)
1608                                         primary->scope_end = v->scope_end;      // NOTEST
1609                                 variable_unlink_exec(v);
1610                         }
1611         }
1612
1613 ###### forward decls
1614         static struct value *var_value(struct parse_context *c, struct variable *v);
1615
1616 ###### free global vars
1617
1618         while (context.varlist) {
1619                 struct binding *b = context.varlist;
1620                 struct variable *v = b->var;
1621                 context.varlist = b->next;
1622                 free(b);
1623                 while (v) {
1624                         struct variable *next = v->previous;
1625
1626                         if (v->global && v->frame_pos >= 0) {
1627                                 free_value(v->type, var_value(&context, v));
1628                                 if (v->depth == 0 && v->type->free == function_free)
1629                                         // This is a function constant
1630                                         free_exec(v->where_decl);
1631                         }
1632                         free(v);
1633                         v = next;
1634                 }
1635         }
1636
1637 #### Manipulating Bindings
1638
1639 When a name is conditionally visible, a new declaration discards the old
1640 binding - the condition lapses.  Similarly when we reach the end of a
1641 function (outermost non-global scope) any conditional scope must lapse.
1642 Conversely a usage of the name affirms the visibility and extends it to
1643 the end of the containing block - i.e.  the block that contains both the
1644 original declaration and the latest usage.  This is determined from
1645 `min_depth`.  When a conditionally visible variable gets affirmed like
1646 this, it is also merged with other conditionally visible variables with
1647 the same name.
1648
1649 When we parse a variable declaration we either report an error if the
1650 name is currently bound, or create a new variable at the current nest
1651 depth if the name is unbound or bound to a conditionally scoped or
1652 pending-scope variable.  If the previous variable was conditionally
1653 scoped, it and its homonyms becomes out-of-scope.
1654
1655 When we parse a variable reference (including non-declarative assignment
1656 "foo = bar") we report an error if the name is not bound or is bound to
1657 a pending-scope variable; update the scope if the name is bound to a
1658 conditionally scoped variable; or just proceed normally if the named
1659 variable is in scope.
1660
1661 When we exit a scope, any variables bound at this level are either
1662 marked out of scope or pending-scoped, depending on whether the scope
1663 was sequential or parallel.  Here a "parallel" scope means the "then"
1664 or "else" part of a conditional, or any "case" or "else" branch of a
1665 switch.  Other scopes are "sequential".
1666
1667 When exiting a parallel scope we check if there are any variables that
1668 were previously pending and are still visible. If there are, then
1669 they weren't redeclared in the most recent scope, so they cannot be
1670 merged and must become out-of-scope.  If it is not the first of
1671 parallel scopes (based on `child_count`), we check that there was a
1672 previous binding that is still pending-scope.  If there isn't, the new
1673 variable must now be out-of-scope.
1674
1675 When exiting a sequential scope that immediately enclosed parallel
1676 scopes, we need to resolve any pending-scope variables.  If there was
1677 no `else` clause, and we cannot determine that the `switch` was exhaustive,
1678 we need to mark all pending-scope variable as out-of-scope.  Otherwise
1679 all pending-scope variables become conditionally scoped.
1680
1681 ###### ast
1682         enum closetype { CloseSequential, CloseFunction, CloseParallel, CloseElse };
1683
1684 ###### ast functions
1685
1686         static struct variable *var_decl(struct parse_context *c, struct text s)
1687         {
1688                 struct binding *b = find_binding(c, s);
1689                 struct variable *v = b->var;
1690
1691                 switch (v ? v->scope : OutScope) {
1692                 case InScope:
1693                         /* Caller will report the error */
1694                         return NULL;
1695                 case CondScope:
1696                         for (;
1697                              v && v->scope == CondScope;
1698                              v = v->previous)
1699                                 v->scope = OutScope;
1700                         break;
1701                 default: break;
1702                 }
1703                 v = calloc(1, sizeof(*v));
1704                 v->previous = b->var;
1705                 b->var = v;
1706                 v->name = b;
1707                 v->merged = v;
1708                 v->min_depth = v->depth = c->scope_depth;
1709                 v->scope = InScope;
1710                 v->in_scope = c->in_scope;
1711                 v->scope_start = c->scope_count;
1712                 c->in_scope = v;
1713                 ## variable init
1714                 return v;
1715         }
1716
1717         static struct variable *var_ref(struct parse_context *c, struct text s)
1718         {
1719                 struct binding *b = find_binding(c, s);
1720                 struct variable *v = b->var;
1721                 struct variable *v2;
1722
1723                 switch (v ? v->scope : OutScope) {
1724                 case OutScope:
1725                 case PendingScope:
1726                         /* Caller will report the error */
1727                         return NULL;
1728                 case CondScope:
1729                         /* All CondScope variables of this name need to be merged
1730                          * and become InScope
1731                          */
1732                         v->depth = v->min_depth;
1733                         v->scope = InScope;
1734                         for (v2 = v->previous;
1735                              v2 && v2->scope == CondScope;
1736                              v2 = v2->previous)
1737                                 variable_merge(v, v2);
1738                         break;
1739                 case InScope:
1740                         break;
1741                 }
1742                 return v;
1743         }
1744
1745         static int var_refile(struct parse_context *c, struct variable *v)
1746         {
1747                 /* Variable just went out of scope.  Add it to the out_scope
1748                  * list, sorted by ->scope_start
1749                  */
1750                 struct variable **vp = &c->out_scope;
1751                 while ((*vp) && (*vp)->scope_start < v->scope_start)
1752                         vp = &(*vp)->in_scope;
1753                 v->in_scope = *vp;
1754                 *vp = v;
1755                 return 0;
1756         }
1757
1758         static void var_block_close(struct parse_context *c, enum closetype ct,
1759                                     struct exec *e)
1760         {
1761                 /* Close off all variables that are in_scope.
1762                  * Some variables in c->scope may already be not-in-scope,
1763                  * such as when a PendingScope variable is hidden by a new
1764                  * variable with the same name.
1765                  * So we check for v->name->var != v and drop them.
1766                  * If we choose to make a variable OutScope, we drop it
1767                  * immediately too.
1768                  */
1769                 struct variable *v, **vp, *v2;
1770
1771                 scope_pop(c);
1772                 for (vp = &c->in_scope;
1773                      (v = *vp) && v->min_depth > c->scope_depth;
1774                      (v->scope == OutScope || v->name->var != v)
1775                      ? (*vp =  v->in_scope, var_refile(c, v))
1776                      : ( vp = &v->in_scope, 0)) {
1777                         v->min_depth = c->scope_depth;
1778                         if (v->name->var != v)
1779                                 /* This is still in scope, but we haven't just
1780                                  * closed the scope.
1781                                  */
1782                                 continue;
1783                         v->min_depth = c->scope_depth;
1784                         if (v->scope == InScope)
1785                                 v->scope_end = c->scope_count;
1786                         if (v->scope == InScope && e && !v->global) {
1787                                 /* This variable gets cleaned up when 'e' finishes */
1788                                 variable_unlink_exec(v);
1789                                 v->cleanup_exec = e;
1790                                 v->next_free = e->to_free;
1791                                 e->to_free = v;
1792                         }
1793                         switch (ct) {
1794                         case CloseElse:
1795                         case CloseParallel: /* handle PendingScope */
1796                                 switch(v->scope) {
1797                                 case InScope:
1798                                 case CondScope:
1799                                         if (c->scope_stack->child_count == 1)
1800                                                 /* first among parallel branches */
1801                                                 v->scope = PendingScope;
1802                                         else if (v->previous &&
1803                                                  v->previous->scope == PendingScope)
1804                                                 /* all previous branches used name */
1805                                                 v->scope = PendingScope;
1806                                         else if (v->type == Tlabel)
1807                                                 /* Labels remain pending even when not used */
1808                                                 v->scope = PendingScope;        // UNTESTED
1809                                         else
1810                                                 v->scope = OutScope;
1811                                         if (ct == CloseElse) {
1812                                                 /* All Pending variables with this name
1813                                                  * are now Conditional */
1814                                                 for (v2 = v;
1815                                                      v2 && v2->scope == PendingScope;
1816                                                      v2 = v2->previous)
1817                                                         v2->scope = CondScope;
1818                                         }
1819                                         break;
1820                                 case PendingScope:
1821                                         /* Not possible as it would require
1822                                          * parallel scope to be nested immediately
1823                                          * in a parallel scope, and that never
1824                                          * happens.
1825                                          */                     // NOTEST
1826                                 case OutScope:
1827                                         /* Not possible as we already tested for
1828                                          * OutScope
1829                                          */
1830                                         abort();                // NOTEST
1831                                 }
1832                                 break;
1833                         case CloseFunction:
1834                                 if (v->scope == CondScope)
1835                                         /* Condition cannot continue past end of function */
1836                                         v->scope = InScope;
1837                                 /* fallthrough */
1838                         case CloseSequential:
1839                                 if (v->type == Tlabel)
1840                                         v->scope = PendingScope;
1841                                 switch (v->scope) {
1842                                 case InScope:
1843                                         v->scope = OutScope;
1844                                         break;
1845                                 case PendingScope:
1846                                         /* There was no 'else', so we can only become
1847                                          * conditional if we know the cases were exhaustive,
1848                                          * and that doesn't mean anything yet.
1849                                          * So only labels become conditional..
1850                                          */
1851                                         for (v2 = v;
1852                                              v2 && v2->scope == PendingScope;
1853                                              v2 = v2->previous)
1854                                                 if (v2->type == Tlabel)
1855                                                         v2->scope = CondScope;
1856                                                 else
1857                                                         v2->scope = OutScope;
1858                                         break;
1859                                 case CondScope:
1860                                 case OutScope: break;
1861                                 }
1862                                 break;
1863                         }
1864                 }
1865         }
1866
1867 #### Storing Values
1868
1869 The value of a variable is store separately from the variable, on an
1870 analogue of a stack frame.  There are (currently) two frames that can be
1871 active.  A global frame which currently only stores constants, and a
1872 stacked frame which stores local variables.  Each variable knows if it
1873 is global or not, and what its index into the frame is.
1874
1875 Values in the global frame are known immediately they are relevant, so
1876 the frame needs to be reallocated as it grows so it can store those
1877 values.  The local frame doesn't get values until the interpreted phase
1878 is started, so there is no need to allocate until the size is known.
1879
1880 We initialize the `frame_pos` to an impossible value, so that we can
1881 tell if it was set or not later.
1882
1883 ###### variable fields
1884         short frame_pos;
1885         short global;
1886
1887 ###### variable init
1888         v->frame_pos = -1;
1889
1890 ###### parse context
1891
1892         short global_size, global_alloc;
1893         short local_size;
1894         void *global, *local;
1895
1896 ###### forward decls
1897         static struct value *global_alloc(struct parse_context *c, struct type *t,
1898                                           struct variable *v, struct value *init);
1899
1900 ###### ast functions
1901
1902         static struct value *var_value(struct parse_context *c, struct variable *v)
1903         {
1904                 if (!v->global) {
1905                         if (!c->local || !v->type)
1906                                 return NULL;    // UNTESTED
1907                         if (v->frame_pos + v->type->size > c->local_size) {
1908                                 printf("INVALID frame_pos\n");  // NOTEST
1909                                 exit(2);                        // NOTEST
1910                         }
1911                         return c->local + v->frame_pos;
1912                 }
1913                 if (c->global_size > c->global_alloc) {
1914                         int old = c->global_alloc;
1915                         c->global_alloc = (c->global_size | 1023) + 1024;
1916                         c->global = realloc(c->global, c->global_alloc);
1917                         memset(c->global + old, 0, c->global_alloc - old);
1918                 }
1919                 return c->global + v->frame_pos;
1920         }
1921
1922         static struct value *global_alloc(struct parse_context *c, struct type *t,
1923                                           struct variable *v, struct value *init)
1924         {
1925                 struct value *ret;
1926                 struct variable scratch;
1927
1928                 if (t->prepare_type)
1929                         t->prepare_type(c, t, 1);       // NOTEST
1930
1931                 if (c->global_size & (t->align - 1))
1932                         c->global_size = (c->global_size + t->align) & ~(t->align-1);   // NOTEST
1933                 if (!v) {
1934                         v = &scratch;
1935                         v->type = t;
1936                 }
1937                 v->frame_pos = c->global_size;
1938                 v->global = 1;
1939                 c->global_size += v->type->size;
1940                 ret = var_value(c, v);
1941                 if (init)
1942                         memcpy(ret, init, t->size);
1943                 else
1944                         val_init(t, ret);
1945                 return ret;
1946         }
1947
1948 As global values are found -- struct field initializers, labels etc --
1949 `global_alloc()` is called to record the value in the global frame.
1950
1951 When the program is fully parsed, each function is analysed, we need to
1952 walk the list of variables local to that function and assign them an
1953 offset in the stack frame.  For this we have `scope_finalize()`.
1954
1955 We keep the stack from dense by re-using space for between variables
1956 that are not in scope at the same time.  The `out_scope` list is sorted
1957 by `scope_start` and as we process a varible, we move it to an FIFO
1958 stack.  For each variable we consider, we first discard any from the
1959 stack anything that went out of scope before the new variable came in.
1960 Then we place the new variable just after the one at the top of the
1961 stack.
1962
1963 ###### ast functions
1964
1965         static void scope_finalize(struct parse_context *c, struct type *ft)
1966         {
1967                 int size = ft->function.local_size;
1968                 struct variable *next = ft->function.scope;
1969                 struct variable *done = NULL;
1970
1971                 while (next) {
1972                         struct variable *v = next;
1973                         struct type *t = v->type;
1974                         int pos;
1975                         next = v->in_scope;
1976                         if (v->merged != v)
1977                                 continue;
1978                         if (!t)
1979                                 continue;
1980                         if (v->frame_pos >= 0)
1981                                 continue;
1982                         while (done && done->scope_end < v->scope_start)
1983                                 done = done->in_scope;
1984                         if (done)
1985                                 pos = done->frame_pos + done->type->size;
1986                         else
1987                                 pos = ft->function.local_size;
1988                         if (pos & (t->align - 1))
1989                                 pos = (pos + t->align) & ~(t->align-1);
1990                         v->frame_pos = pos;
1991                         if (size < pos + v->type->size)
1992                                 size = pos + v->type->size;
1993                         v->in_scope = done;
1994                         done = v;
1995                 }
1996                 c->out_scope = NULL;
1997                 ft->function.local_size = size;
1998         }
1999
2000 ###### free context storage
2001         free(context.global);
2002
2003 #### Variables as executables
2004
2005 Just as we used a `val` to wrap a value into an `exec`, we similarly
2006 need a `var` to wrap a `variable` into an exec.  While each `val`
2007 contained a copy of the value, each `var` holds a link to the variable
2008 because it really is the same variable no matter where it appears.
2009 When a variable is used, we need to remember to follow the `->merged`
2010 link to find the primary instance.
2011
2012 When a variable is declared, it may or may not be given an explicit
2013 type.  We need to record which so that we can report the parsed code
2014 correctly.
2015
2016 ###### exec type
2017         Xvar,
2018
2019 ###### ast
2020         struct var {
2021                 struct exec;
2022                 struct variable *var;
2023         };
2024
2025 ###### variable fields
2026         int explicit_type;
2027
2028 ###### Grammar
2029
2030         $TERM : ::
2031
2032         $*var
2033         VariableDecl -> IDENTIFIER : ${ {
2034                 struct variable *v = var_decl(c, $1.txt);
2035                 $0 = new_pos(var, $1);
2036                 $0->var = v;
2037                 if (v)
2038                         v->where_decl = $0;
2039                 else {
2040                         v = var_ref(c, $1.txt);
2041                         $0->var = v;
2042                         type_err(c, "error: variable '%v' redeclared",
2043                                  $0, NULL, 0, NULL);
2044                         type_err(c, "info: this is where '%v' was first declared",
2045                                  v->where_decl, NULL, 0, NULL);
2046                 }
2047         } }$
2048         | IDENTIFIER :: ${ {
2049                 struct variable *v = var_decl(c, $1.txt);
2050                 $0 = new_pos(var, $1);
2051                 $0->var = v;
2052                 if (v) {
2053                         v->where_decl = $0;
2054                         v->constant = 1;
2055                 } else {
2056                         v = var_ref(c, $1.txt);
2057                         $0->var = v;
2058                         type_err(c, "error: variable '%v' redeclared",
2059                                  $0, NULL, 0, NULL);
2060                         type_err(c, "info: this is where '%v' was first declared",
2061                                  v->where_decl, NULL, 0, NULL);
2062                 }
2063         } }$
2064         | IDENTIFIER : Type ${ {
2065                 struct variable *v = var_decl(c, $1.txt);
2066                 $0 = new_pos(var, $1);
2067                 $0->var = v;
2068                 if (v) {
2069                         v->where_decl = $0;
2070                         v->where_set = $0;
2071                         v->type = $<Type;
2072                         v->explicit_type = 1;
2073                 } else {
2074                         v = var_ref(c, $1.txt);
2075                         $0->var = v;
2076                         type_err(c, "error: variable '%v' redeclared",
2077                                  $0, NULL, 0, NULL);
2078                         type_err(c, "info: this is where '%v' was first declared",
2079                                  v->where_decl, NULL, 0, NULL);
2080                 }
2081         } }$
2082         | IDENTIFIER :: Type ${ {
2083                 struct variable *v = var_decl(c, $1.txt);
2084                 $0 = new_pos(var, $1);
2085                 $0->var = v;
2086                 if (v) {
2087                         v->where_decl = $0;
2088                         v->where_set = $0;
2089                         v->type = $<Type;
2090                         v->constant = 1;
2091                         v->explicit_type = 1;
2092                 } else {
2093                         v = var_ref(c, $1.txt);
2094                         $0->var = v;
2095                         type_err(c, "error: variable '%v' redeclared",
2096                                  $0, NULL, 0, NULL);
2097                         type_err(c, "info: this is where '%v' was first declared",
2098                                  v->where_decl, NULL, 0, NULL);
2099                 }
2100         } }$
2101
2102         $*exec
2103         Variable -> IDENTIFIER ${ {
2104                 struct variable *v = var_ref(c, $1.txt);
2105                 $0 = new_pos(var, $1);
2106                 if (v == NULL) {
2107                         /* This might be a global const or a label
2108                          * Allocate a var with impossible type Tnone,
2109                          * which will be adjusted when we find out what it is,
2110                          * or will trigger an error.
2111                          */
2112                         v = var_decl(c, $1.txt);
2113                         if (v) {
2114                                 v->type = Tnone;
2115                                 v->where_decl = $0;
2116                                 v->where_set = $0;
2117                         }
2118                 }
2119                 cast(var, $0)->var = v;
2120         } }$
2121
2122 ###### print exec cases
2123         case Xvar:
2124         {
2125                 struct var *v = cast(var, e);
2126                 if (v->var) {
2127                         struct binding *b = v->var->name;
2128                         printf("%.*s", b->name.len, b->name.txt);
2129                 }
2130                 break;
2131         }
2132
2133 ###### format cases
2134         case 'v':
2135                 if (loc && loc->type == Xvar) {
2136                         struct var *v = cast(var, loc);
2137                         if (v->var) {
2138                                 struct binding *b = v->var->name;
2139                                 fprintf(stderr, "%.*s", b->name.len, b->name.txt);
2140                         } else
2141                                 fputs("???", stderr);   // NOTEST
2142                 } else
2143                         fputs("NOTVAR", stderr);
2144                 break;
2145
2146 ###### propagate exec cases
2147
2148         case Xvar:
2149         {
2150                 struct var *var = cast(var, prog);
2151                 struct variable *v = var->var;
2152                 if (!v) {
2153                         type_err(c, "%d:BUG: no variable!!", prog, NULL, 0, NULL); // NOTEST
2154                         return Tnone;                                   // NOTEST
2155                 }
2156                 v = v->merged;
2157                 if (v->constant && (rules & Rnoconstant)) {
2158                         type_err(c, "error: Cannot assign to a constant: %v",
2159                                  prog, NULL, 0, NULL);
2160                         type_err(c, "info: name was defined as a constant here",
2161                                  v->where_decl, NULL, 0, NULL);
2162                         return v->type;
2163                 }
2164                 if (v->type == Tnone && v->where_decl == prog)
2165                         type_err(c, "error: variable used but not declared: %v",
2166                                  prog, NULL, 0, NULL);
2167                 if (v->type == NULL) {
2168                         if (type && !(*perr & Efail)) {
2169                                 v->type = type;
2170                                 v->where_set = prog;
2171                                 *perr |= Eretry;
2172                         }
2173                 } else if (!type_compat(type, v->type, rules)) {
2174                         type_err(c, "error: expected %1%r but variable '%v' is %2", prog,
2175                                  type, rules, v->type);
2176                         type_err(c, "info: this is where '%v' was set to %1", v->where_set,
2177                                  v->type, rules, NULL);
2178                 }
2179                 if (!v->global || v->frame_pos < 0)
2180                         *perr |= Enoconst;
2181                 if (!type)
2182                         return v->type;
2183                 return type;
2184         }
2185
2186 ###### interp exec cases
2187         case Xvar:
2188         {
2189                 struct var *var = cast(var, e);
2190                 struct variable *v = var->var;
2191
2192                 v = v->merged;
2193                 lrv = var_value(c, v);
2194                 rvtype = v->type;
2195                 break;
2196         }
2197
2198 ###### ast functions
2199
2200         static void free_var(struct var *v)
2201         {
2202                 free(v);
2203         }
2204
2205 ###### free exec cases
2206         case Xvar: free_var(cast(var, e)); break;
2207
2208
2209 ### Complex types
2210
2211 Now that we have the shape of the interpreter in place we can add some
2212 complex types and connected them in to the data structures and the
2213 different phases of parse, analyse, print, interpret.
2214
2215 Being "complex" the language will naturally have syntax to access
2216 specifics of objects of these types.  These will fit into the grammar as
2217 "Terms" which are the things that are combined with various operators to
2218 form "Expression".  Where a Term is formed by some operation on another
2219 Term, the subordinate Term will always come first, so for example a
2220 member of an array will be expressed as the Term for the array followed
2221 by an index in square brackets.  The strict rule of using postfix
2222 operations makes precedence irrelevant within terms.  To provide a place
2223 to put the grammar for each terms of each type, we will start out by
2224 introducing the "Term" grammar production, with contains at least a
2225 simple "Value" (to be explained later).
2226
2227 ###### Grammar
2228         $*exec
2229         Term ->  Value ${ $0 = $<1; }$
2230         | Variable ${ $0 = $<1; }$
2231         ## term grammar
2232
2233 Thus far the complex types we have are arrays and structs.
2234
2235 #### Arrays
2236
2237 Arrays can be declared by giving a size and a type, as `[size]type' so
2238 `freq:[26]number` declares `freq` to be an array of 26 numbers.  The
2239 size can be either a literal number, or a named constant.  Some day an
2240 arbitrary expression will be supported.
2241
2242 As a formal parameter to a function, the array can be declared with a
2243 new variable as the size: `name:[size::number]string`.  The `size`
2244 variable is set to the size of the array and must be a constant.  As
2245 `number` is the only supported type, it can be left out:
2246 `name:[size::]string`.
2247
2248 Arrays cannot be assigned.  When pointers are introduced we will also
2249 introduce array slices which can refer to part or all of an array -
2250 the assignment syntax will create a slice.  For now, an array can only
2251 ever be referenced by the name it is declared with.  It is likely that
2252 a "`copy`" primitive will eventually be define which can be used to
2253 make a copy of an array with controllable recursive depth.
2254
2255 For now we have two sorts of array, those with fixed size either because
2256 it is given as a literal number or because it is a struct member (which
2257 cannot have a runtime-changing size), and those with a size that is
2258 determined at runtime - local variables with a const size.  The former
2259 have their size calculated at parse time, the latter at run time.
2260
2261 For the latter type, the `size` field of the type is the size of a
2262 pointer, and the array is reallocated every time it comes into scope.
2263
2264 We differentiate struct fields with a const size from local variables
2265 with a const size by whether they are prepared at parse time or not.
2266
2267 ###### type union fields
2268
2269         struct {
2270                 int unspec;     // size is unspecified - vsize must be set.
2271                 short size;
2272                 short static_size;
2273                 struct variable *vsize;
2274                 struct type *member;
2275         } array;
2276
2277 ###### value union fields
2278         void *array;  // used if not static_size
2279
2280 ###### value functions
2281
2282         static int array_prepare_type(struct parse_context *c, struct type *type,
2283                                        int parse_time)
2284         {
2285                 struct value *vsize;
2286                 mpz_t q;
2287                 if (type->array.static_size)
2288                         return 1;       // UNTESTED
2289                 if (type->array.unspec && parse_time)
2290                         return 1;       // UNTESTED
2291                 if (parse_time && type->array.vsize && !type->array.vsize->global)
2292                         return 1;       // UNTESTED
2293
2294                 if (type->array.vsize) {
2295                         vsize = var_value(c, type->array.vsize);
2296                         if (!vsize)
2297                                 return 1;       // UNTESTED
2298                         mpz_init(q);
2299                         mpz_tdiv_q(q, mpq_numref(vsize->num), mpq_denref(vsize->num));
2300                         type->array.size = mpz_get_si(q);
2301                         mpz_clear(q);
2302                 }
2303                 if (!parse_time)
2304                         return 1;
2305                 if (type->array.member->size <= 0)
2306                         return 0;
2307
2308                 type->array.static_size = 1;
2309                 type->size = type->array.size * type->array.member->size;
2310                 type->align = type->array.member->align;
2311
2312                 return 1;
2313         }
2314
2315         static void array_init(struct type *type, struct value *val)
2316         {
2317                 int i;
2318                 void *ptr = val->ptr;
2319
2320                 if (!val)
2321                         return;                         // NOTEST
2322                 if (!type->array.static_size) {
2323                         val->array = calloc(type->array.size,
2324                                             type->array.member->size);
2325                         ptr = val->array;
2326                 }
2327                 for (i = 0; i < type->array.size; i++) {
2328                         struct value *v;
2329                         v = (void*)ptr + i * type->array.member->size;
2330                         val_init(type->array.member, v);
2331                 }
2332         }
2333
2334         static void array_free(struct type *type, struct value *val)
2335         {
2336                 int i;
2337                 void *ptr = val->ptr;
2338
2339                 if (!type->array.static_size)
2340                         ptr = val->array;
2341                 for (i = 0; i < type->array.size; i++) {
2342                         struct value *v;
2343                         v = (void*)ptr + i * type->array.member->size;
2344                         free_value(type->array.member, v);
2345                 }
2346                 if (!type->array.static_size)
2347                         free(ptr);
2348         }
2349
2350         static int array_compat(struct type *require, struct type *have)
2351         {
2352                 if (have->compat != require->compat)
2353                         return 0;
2354                 /* Both are arrays, so we can look at details */
2355                 if (!type_compat(require->array.member, have->array.member, 0))
2356                         return 0;
2357                 if (have->array.unspec && require->array.unspec) {
2358                         if (have->array.vsize && require->array.vsize &&
2359                             have->array.vsize != require->array.vsize)  // UNTESTED
2360                                 /* sizes might not be the same */
2361                                 return 0;       // UNTESTED
2362                         return 1;
2363                 }
2364                 if (have->array.unspec || require->array.unspec)
2365                         return 1;       // UNTESTED
2366                 if (require->array.vsize == NULL && have->array.vsize == NULL)
2367                         return require->array.size == have->array.size;
2368
2369                 return require->array.vsize == have->array.vsize;       // UNTESTED
2370         }
2371
2372         static void array_print_type(struct type *type, FILE *f)
2373         {
2374                 fputs("[", f);
2375                 if (type->array.vsize) {
2376                         struct binding *b = type->array.vsize->name;
2377                         fprintf(f, "%.*s%s]", b->name.len, b->name.txt,
2378                                 type->array.unspec ? "::" : "");
2379                 } else if (type->array.size)
2380                         fprintf(f, "%d]", type->array.size);
2381                 else
2382                         fprintf(f, "]");
2383                 type_print(type->array.member, f);
2384         }
2385
2386         static struct type array_prototype = {
2387                 .init = array_init,
2388                 .prepare_type = array_prepare_type,
2389                 .print_type = array_print_type,
2390                 .compat = array_compat,
2391                 .free = array_free,
2392                 .size = sizeof(void*),
2393                 .align = sizeof(void*),
2394         };
2395
2396 ###### declare terminals
2397         $TERM [ ]
2398
2399 ###### type grammar
2400
2401         | [ NUMBER ] Type ${ {
2402                 char tail[3];
2403                 mpq_t num;
2404                 struct type *t;
2405                 int elements = 0;
2406
2407                 if (number_parse(num, tail, $2.txt) == 0)
2408                         tok_err(c, "error: unrecognised number", &$2);
2409                 else if (tail[0]) {
2410                         tok_err(c, "error: unsupported number suffix", &$2);
2411                         mpq_clear(num);
2412                 } else {
2413                         elements = mpz_get_ui(mpq_numref(num));
2414                         if (mpz_cmp_ui(mpq_denref(num), 1) != 0) {
2415                                 tok_err(c, "error: array size must be an integer",
2416                                         &$2);
2417                         } else if (mpz_cmp_ui(mpq_numref(num), 1UL << 30) >= 0)
2418                                 tok_err(c, "error: array size is too large",
2419                                         &$2);
2420                         mpq_clear(num);
2421                 }
2422
2423                 $0 = t = add_anon_type(c, &array_prototype, "array[%d]", elements );
2424                 t->array.size = elements;
2425                 t->array.member = $<4;
2426                 t->array.vsize = NULL;
2427         } }$
2428
2429         | [ IDENTIFIER ] Type ${ {
2430                 struct variable *v = var_ref(c, $2.txt);
2431
2432                 if (!v)
2433                         tok_err(c, "error: name undeclared", &$2);
2434                 else if (!v->constant)
2435                         tok_err(c, "error: array size must be a constant", &$2);
2436
2437                 $0 = add_anon_type(c, &array_prototype, "array[%.*s]", $2.txt.len, $2.txt.txt);
2438                 $0->array.member = $<4;
2439                 $0->array.size = 0;
2440                 $0->array.vsize = v;
2441         } }$
2442
2443 ###### Grammar
2444         $*type
2445         OptType -> Type ${ $0 = $<1; }$
2446                 | ${ $0 = NULL; }$
2447
2448 ###### formal type grammar
2449
2450         | [ IDENTIFIER :: OptType ] Type ${ {
2451                 struct variable *v = var_decl(c, $ID.txt);
2452
2453                 v->type = $<OT;
2454                 v->constant = 1;
2455                 if (!v->type)
2456                         v->type = Tnum;
2457                 $0 = add_anon_type(c, &array_prototype, "array[var]");
2458                 $0->array.member = $<6;
2459                 $0->array.size = 0;
2460                 $0->array.unspec = 1;
2461                 $0->array.vsize = v;
2462         } }$
2463
2464 ###### Binode types
2465         Index,
2466
2467 ###### term grammar
2468
2469         | Term [ Expression ] ${ {
2470                 struct binode *b = new(binode);
2471                 b->op = Index;
2472                 b->left = $<1;
2473                 b->right = $<3;
2474                 $0 = b;
2475         } }$
2476
2477 ###### print binode cases
2478         case Index:
2479                 print_exec(b->left, -1, bracket);
2480                 printf("[");
2481                 print_exec(b->right, -1, bracket);
2482                 printf("]");
2483                 break;
2484
2485 ###### propagate binode cases
2486         case Index:
2487                 /* left must be an array, right must be a number,
2488                  * result is the member type of the array
2489                  */
2490                 propagate_types(b->right, c, perr, Tnum, 0);
2491                 t = propagate_types(b->left, c, perr, NULL, rules & Rnoconstant);
2492                 if (!t || t->compat != array_compat) {
2493                         type_err(c, "error: %1 cannot be indexed", prog, t, 0, NULL);
2494                         return NULL;
2495                 } else {
2496                         if (!type_compat(type, t->array.member, rules)) {
2497                                 type_err(c, "error: have %1 but need %2", prog,
2498                                          t->array.member, rules, type);
2499                         }
2500                         return t->array.member;
2501                 }
2502                 break;
2503
2504 ###### interp binode cases
2505         case Index: {
2506                 mpz_t q;
2507                 long i;
2508                 void *ptr;
2509
2510                 lleft = linterp_exec(c, b->left, &ltype);
2511                 right = interp_exec(c, b->right, &rtype);
2512                 mpz_init(q);
2513                 mpz_tdiv_q(q, mpq_numref(right.num), mpq_denref(right.num));
2514                 i = mpz_get_si(q);
2515                 mpz_clear(q);
2516
2517                 if (ltype->array.static_size)
2518                         ptr = lleft;
2519                 else
2520                         ptr = *(void**)lleft;
2521                 rvtype = ltype->array.member;
2522                 if (i >= 0 && i < ltype->array.size)
2523                         lrv = ptr + i * rvtype->size;
2524                 else
2525                         val_init(ltype->array.member, &rv); // UNSAFE
2526                 ltype = NULL;
2527                 break;
2528         }
2529
2530 #### Structs
2531
2532 A `struct` is a data-type that contains one or more other data-types.
2533 It differs from an array in that each member can be of a different
2534 type, and they are accessed by name rather than by number.  Thus you
2535 cannot choose an element by calculation, you need to know what you
2536 want up-front.
2537
2538 The language makes no promises about how a given structure will be
2539 stored in memory - it is free to rearrange fields to suit whatever
2540 criteria seems important.
2541
2542 Structs are declared separately from program code - they cannot be
2543 declared in-line in a variable declaration like arrays can.  A struct
2544 is given a name and this name is used to identify the type - the name
2545 is not prefixed by the word `struct` as it would be in C.
2546
2547 Structs are only treated as the same if they have the same name.
2548 Simply having the same fields in the same order is not enough.  This
2549 might change once we can create structure initializers from a list of
2550 values.
2551
2552 Each component datum is identified much like a variable is declared,
2553 with a name, one or two colons, and a type.  The type cannot be omitted
2554 as there is no opportunity to deduce the type from usage.  An initial
2555 value can be given following an equals sign, so
2556
2557 ##### Example: a struct type
2558
2559         struct complex:
2560                 x:number = 0
2561                 y:number = 0
2562
2563 would declare a type called "complex" which has two number fields,
2564 each initialised to zero.
2565
2566 Struct will need to be declared separately from the code that uses
2567 them, so we will need to be able to print out the declaration of a
2568 struct when reprinting the whole program.  So a `print_type_decl` type
2569 function will be needed.
2570
2571 ###### type union fields
2572
2573         struct {
2574                 int nfields;
2575                 struct field {
2576                         struct text name;
2577                         struct type *type;
2578                         struct value *init;
2579                         int offset;
2580                 } *fields; // This is created when field_list is analysed.
2581                 struct fieldlist {
2582                         struct fieldlist *prev;
2583                         struct field f;
2584                         struct exec *init;
2585                 } *field_list; // This is created during parsing
2586         } structure;
2587
2588 ###### type functions
2589         void (*print_type_decl)(struct type *type, FILE *f);
2590
2591 ###### value functions
2592
2593         static void structure_init(struct type *type, struct value *val)
2594         {
2595                 int i;
2596
2597                 for (i = 0; i < type->structure.nfields; i++) {
2598                         struct value *v;
2599                         v = (void*) val->ptr + type->structure.fields[i].offset;
2600                         if (type->structure.fields[i].init)
2601                                 dup_value(type->structure.fields[i].type,
2602                                           type->structure.fields[i].init,
2603                                           v);
2604                         else
2605                                 val_init(type->structure.fields[i].type, v);
2606                 }
2607         }
2608
2609         static void structure_free(struct type *type, struct value *val)
2610         {
2611                 int i;
2612
2613                 for (i = 0; i < type->structure.nfields; i++) {
2614                         struct value *v;
2615                         v = (void*)val->ptr + type->structure.fields[i].offset;
2616                         free_value(type->structure.fields[i].type, v);
2617                 }
2618         }
2619
2620         static void free_fieldlist(struct fieldlist *f)
2621         {
2622                 if (!f)
2623                         return;
2624                 free_fieldlist(f->prev);
2625                 free_exec(f->init);
2626                 free(f);
2627         }
2628
2629         static void structure_free_type(struct type *t)
2630         {
2631                 int i;
2632                 for (i = 0; i < t->structure.nfields; i++)
2633                         if (t->structure.fields[i].init) {
2634                                 free_value(t->structure.fields[i].type,
2635                                            t->structure.fields[i].init);
2636                         }
2637                 free(t->structure.fields);
2638                 free_fieldlist(t->structure.field_list);
2639         }
2640
2641         static int structure_prepare_type(struct parse_context *c,
2642                                           struct type *t, int parse_time)
2643         {
2644                 int cnt = 0;
2645                 struct fieldlist *f;
2646
2647                 if (!parse_time || t->structure.fields)
2648                         return 1;
2649
2650                 for (f = t->structure.field_list; f; f=f->prev) {
2651                         enum prop_err perr;
2652                         cnt += 1;
2653
2654                         if (f->f.type->size <= 0)
2655                                 return 0;
2656                         if (f->f.type->prepare_type)
2657                                 f->f.type->prepare_type(c, f->f.type, parse_time);
2658
2659                         if (f->init == NULL)
2660                                 continue;
2661                         do {
2662                                 perr = 0;
2663                                 propagate_types(f->init, c, &perr, f->f.type, 0);
2664                         } while (perr & Eretry);
2665                         if (perr & Efail)
2666                                 c->parse_error += 1;    // NOTEST
2667                 }
2668
2669                 t->structure.nfields = cnt;
2670                 t->structure.fields = calloc(cnt, sizeof(struct field));
2671                 f = t->structure.field_list;
2672                 while (cnt > 0) {
2673                         int a = f->f.type->align;
2674                         cnt -= 1;
2675                         t->structure.fields[cnt] = f->f;
2676                         if (t->size & (a-1))
2677                                 t->size = (t->size | (a-1)) + 1;
2678                         t->structure.fields[cnt].offset = t->size;
2679                         t->size += ((f->f.type->size - 1) | (a-1)) + 1;
2680                         if (a > t->align)
2681                                 t->align = a;
2682
2683                         if (f->init && !c->parse_error) {
2684                                 struct value vl = interp_exec(c, f->init, NULL);
2685                                 t->structure.fields[cnt].init =
2686                                         global_alloc(c, f->f.type, NULL, &vl);
2687                         }
2688
2689                         f = f->prev;
2690                 }
2691                 return 1;
2692         }
2693
2694         static struct type structure_prototype = {
2695                 .init = structure_init,
2696                 .free = structure_free,
2697                 .free_type = structure_free_type,
2698                 .print_type_decl = structure_print_type,
2699                 .prepare_type = structure_prepare_type,
2700         };
2701
2702 ###### exec type
2703         Xfieldref,
2704
2705 ###### ast
2706         struct fieldref {
2707                 struct exec;
2708                 struct exec *left;
2709                 int index;
2710                 struct text name;
2711         };
2712
2713 ###### free exec cases
2714         case Xfieldref:
2715                 free_exec(cast(fieldref, e)->left);
2716                 free(e);
2717                 break;
2718
2719 ###### declare terminals
2720         $TERM struct .
2721
2722 ###### term grammar
2723
2724         | Term . IDENTIFIER ${ {
2725                 struct fieldref *fr = new_pos(fieldref, $2);
2726                 fr->left = $<1;
2727                 fr->name = $3.txt;
2728                 fr->index = -2;
2729                 $0 = fr;
2730         } }$
2731
2732 ###### print exec cases
2733
2734         case Xfieldref:
2735         {
2736                 struct fieldref *f = cast(fieldref, e);
2737                 print_exec(f->left, -1, bracket);
2738                 printf(".%.*s", f->name.len, f->name.txt);
2739                 break;
2740         }
2741
2742 ###### ast functions
2743         static int find_struct_index(struct type *type, struct text field)
2744         {
2745                 int i;
2746                 for (i = 0; i < type->structure.nfields; i++)
2747                         if (text_cmp(type->structure.fields[i].name, field) == 0)
2748                                 return i;
2749                 return -1;
2750         }
2751
2752 ###### propagate exec cases
2753
2754         case Xfieldref:
2755         {
2756                 struct fieldref *f = cast(fieldref, prog);
2757                 struct type *st = propagate_types(f->left, c, perr, NULL, 0);
2758
2759                 if (!st)
2760                         type_err(c, "error: unknown type for field access", f->left,    // UNTESTED
2761                                  NULL, 0, NULL);
2762                 else if (st->init != structure_init)
2763                         type_err(c, "error: field reference attempted on %1, not a struct",
2764                                  f->left, st, 0, NULL);
2765                 else if (f->index == -2) {
2766                         f->index = find_struct_index(st, f->name);
2767                         if (f->index < 0)
2768                                 type_err(c, "error: cannot find requested field in %1",
2769                                          f->left, st, 0, NULL);
2770                 }
2771                 if (f->index >= 0) {
2772                         struct type *ft = st->structure.fields[f->index].type;
2773                         if (!type_compat(type, ft, rules))
2774                                 type_err(c, "error: have %1 but need %2", prog,
2775                                          ft, rules, type);
2776                         return ft;
2777                 }
2778                 break;
2779         }
2780
2781 ###### interp exec cases
2782         case Xfieldref:
2783         {
2784                 struct fieldref *f = cast(fieldref, e);
2785                 struct type *ltype;
2786                 struct value *lleft = linterp_exec(c, f->left, &ltype);
2787                 lrv = (void*)lleft->ptr + ltype->structure.fields[f->index].offset;
2788                 rvtype = ltype->structure.fields[f->index].type;
2789                 break;
2790         }
2791
2792 ###### top level grammar
2793         DeclareStruct -> struct IDENTIFIER FieldBlock Newlines ${ {
2794                 struct type *t;
2795                 t = find_type(c, $ID.txt);
2796                 if (!t)
2797                         t = add_type(c, $ID.txt, &structure_prototype);
2798                 else if (t->size >= 0) {
2799                         tok_err(c, "error: type already declared", &$ID);
2800                         tok_err(c, "info: this is location of declartion", &t->first_use);
2801                         /* Create a new one - duplicate */
2802                         t = add_type(c, $ID.txt, &structure_prototype);
2803                 } else {
2804                         struct type tmp = *t;
2805                         *t = structure_prototype;
2806                         t->name = tmp.name;
2807                         t->next = tmp.next;
2808                 }
2809                 t->structure.field_list = $<FB;
2810                 t->first_use = $ID;
2811         } }$
2812
2813         $*fieldlist
2814         FieldBlock -> { IN OptNL FieldLines OUT OptNL } ${ $0 = $<FL; }$
2815         | { SimpleFieldList } ${ $0 = $<SFL; }$
2816         | IN OptNL FieldLines OUT ${ $0 = $<FL; }$
2817         | SimpleFieldList EOL ${ $0 = $<SFL; }$
2818
2819         FieldLines -> SimpleFieldList Newlines ${ $0 = $<SFL; }$
2820         | FieldLines SimpleFieldList Newlines ${
2821                 $SFL->prev = $<FL;
2822                 $0 = $<SFL;
2823         }$
2824
2825         SimpleFieldList -> Field ${ $0 = $<F; }$
2826         | SimpleFieldList ; Field ${
2827                 $F->prev = $<SFL;
2828                 $0 = $<F;
2829         }$
2830         | SimpleFieldList ; ${
2831                 $0 = $<SFL;
2832         }$
2833         | ERROR ${ tok_err(c, "Syntax error in struct field", &$1); }$
2834
2835         Field -> IDENTIFIER : Type = Expression ${ {
2836                 $0 = calloc(1, sizeof(struct fieldlist));
2837                 $0->f.name = $ID.txt;
2838                 $0->f.type = $<Type;
2839                 $0->f.init = NULL;
2840                 $0->init = $<Expr;
2841         } }$
2842         | IDENTIFIER : Type ${
2843                 $0 = calloc(1, sizeof(struct fieldlist));
2844                 $0->f.name = $ID.txt;
2845                 $0->f.type = $<Type;
2846         }$
2847
2848 ###### forward decls
2849         static void structure_print_type(struct type *t, FILE *f);
2850
2851 ###### value functions
2852         static void structure_print_type(struct type *t, FILE *f)
2853         {
2854                 int i;
2855
2856                 fprintf(f, "struct %.*s\n", t->name.len, t->name.txt);
2857
2858                 for (i = 0; i < t->structure.nfields; i++) {
2859                         struct field *fl = t->structure.fields + i;
2860                         fprintf(f, "    %.*s : ", fl->name.len, fl->name.txt);
2861                         type_print(fl->type, f);
2862                         if (fl->type->print && fl->init) {
2863                                 fprintf(f, " = ");
2864                                 if (fl->type == Tstr)
2865                                         fprintf(f, "\"");       // UNTESTED
2866                                 print_value(fl->type, fl->init, f);
2867                                 if (fl->type == Tstr)
2868                                         fprintf(f, "\"");       // UNTESTED
2869                         }
2870                         fprintf(f, "\n");
2871                 }
2872         }
2873
2874 ###### print type decls
2875         {
2876                 struct type *t;
2877                 int target = -1;
2878
2879                 while (target != 0) {
2880                         int i = 0;
2881                         for (t = context.typelist; t ; t=t->next)
2882                                 if (!t->anon && t->print_type_decl &&
2883                                     !t->check_args) {
2884                                         i += 1;
2885                                         if (i == target)
2886                                                 break;
2887                                 }
2888
2889                         if (target == -1) {
2890                                 target = i;
2891                         } else {
2892                                 t->print_type_decl(t, stdout);
2893                                 target -= 1;
2894                         }
2895                 }
2896         }
2897
2898 #### Functions
2899
2900 A function is a chunk of code which can be passed parameters and can
2901 return results.  Each function has a type which includes the set of
2902 parameters and the return value.  As yet these types cannot be declared
2903 separately from the function itself.
2904
2905 The parameters can be specified either in parentheses as a ';' separated
2906 list, such as
2907
2908 ##### Example: function 1
2909
2910         func main(av:[ac::number]string; env:[envc::number]string)
2911                 code block
2912
2913 or as an indented list of one parameter per line (though each line can
2914 be a ';' separated list)
2915
2916 ##### Example: function 2
2917
2918         func main
2919                 argv:[argc::number]string
2920                 env:[envc::number]string
2921         do
2922                 code block
2923
2924 In the first case a return type can follow the parentheses after a colon,
2925 in the second it is given on a line starting with the word `return`.
2926
2927 ##### Example: functions that return
2928
2929         func add(a:number; b:number): number
2930                 code block
2931
2932         func catenate
2933                 a: string
2934                 b: string
2935         return string
2936         do
2937                 code block
2938
2939 Rather than returning a type, the function can specify a set of local
2940 variables to return as a struct.  The values of these variables when the
2941 function exits will be provided to the caller.  For this the return type
2942 is replaced with a block of result declarations, either in parentheses
2943 or bracketed by `return` and `do`.
2944
2945 ##### Example: functions returning multiple variables
2946
2947         func to_cartesian(rho:number; theta:number):(x:number; y:number)
2948                 x = .....
2949                 y = .....
2950
2951         func to_polar
2952                 x:number; y:number
2953         return
2954                 rho:number
2955                 theta:number
2956         do
2957                 rho = ....
2958                 theta = ....
2959
2960 For constructing the lists we use a `List` binode, which will be
2961 further detailed when Expression Lists are introduced.
2962
2963 ###### type union fields
2964
2965         struct {
2966                 struct binode *params;
2967                 struct type *return_type;
2968                 struct variable *scope;
2969                 int inline_result;      // return value is at start of 'local'
2970                 int local_size;
2971         } function;
2972
2973 ###### value union fields
2974         struct exec *function;
2975
2976 ###### type functions
2977         void (*check_args)(struct parse_context *c, enum prop_err *perr,
2978                            struct type *require, struct exec *args);
2979
2980 ###### value functions
2981
2982         static void function_free(struct type *type, struct value *val)
2983         {
2984                 free_exec(val->function);
2985                 val->function = NULL;
2986         }
2987
2988         static int function_compat(struct type *require, struct type *have)
2989         {
2990                 // FIXME can I do anything here yet?
2991                 return 0;
2992         }
2993
2994         static void function_check_args(struct parse_context *c, enum prop_err *perr,
2995                                         struct type *require, struct exec *args)
2996         {
2997                 /* This should be 'compat', but we don't have a 'tuple' type to
2998                  * hold the type of 'args'
2999                  */
3000                 struct binode *arg = cast(binode, args);
3001                 struct binode *param = require->function.params;
3002
3003                 while (param) {
3004                         struct var *pv = cast(var, param->left);
3005                         if (!arg) {
3006                                 type_err(c, "error: insufficient arguments to function.",
3007                                          args, NULL, 0, NULL);
3008                                 break;
3009                         }
3010                         *perr = 0;
3011                         propagate_types(arg->left, c, perr, pv->var->type, 0);
3012                         param = cast(binode, param->right);
3013                         arg = cast(binode, arg->right);
3014                 }
3015                 if (arg)
3016                         type_err(c, "error: too many arguments to function.",
3017                                  args, NULL, 0, NULL);
3018         }
3019
3020         static void function_print(struct type *type, struct value *val, FILE *f)
3021         {
3022                 print_exec(val->function, 1, 0);
3023         }
3024
3025         static void function_print_type_decl(struct type *type, FILE *f)
3026         {
3027                 struct binode *b;
3028                 fprintf(f, "(");
3029                 for (b = type->function.params; b; b = cast(binode, b->right)) {
3030                         struct variable *v = cast(var, b->left)->var;
3031                         fprintf(f, "%.*s%s", v->name->name.len, v->name->name.txt,
3032                                 v->constant ? "::" : ":");
3033                         type_print(v->type, f);
3034                         if (b->right)
3035                                 fprintf(f, "; ");
3036                 }
3037                 fprintf(f, ")");
3038                 if (type->function.return_type != Tnone) {
3039                         fprintf(f, ":");
3040                         if (type->function.inline_result) {
3041                                 int i;
3042                                 struct type *t = type->function.return_type;
3043                                 fprintf(f, " (");
3044                                 for (i = 0; i < t->structure.nfields; i++) {
3045                                         struct field *fl = t->structure.fields + i;
3046                                         if (i)
3047                                                 fprintf(f, "; ");
3048                                         fprintf(f, "%.*s:", fl->name.len, fl->name.txt);
3049                                         type_print(fl->type, f);
3050                                 }
3051                                 fprintf(f, ")");
3052                         } else
3053                                 type_print(type->function.return_type, f);
3054                 }
3055                 fprintf(f, "\n");
3056         }
3057
3058         static void function_free_type(struct type *t)
3059         {
3060                 free_exec(t->function.params);
3061         }
3062
3063         static struct type function_prototype = {
3064                 .size = sizeof(void*),
3065                 .align = sizeof(void*),
3066                 .free = function_free,
3067                 .compat = function_compat,
3068                 .check_args = function_check_args,
3069                 .print = function_print,
3070                 .print_type_decl = function_print_type_decl,
3071                 .free_type = function_free_type,
3072         };
3073
3074 ###### declare terminals
3075
3076         $TERM func
3077
3078 ###### Binode types
3079         List,
3080
3081 ###### Grammar
3082
3083         $*variable
3084         FuncName -> IDENTIFIER ${ {
3085                 struct variable *v = var_decl(c, $1.txt);
3086                 struct var *e = new_pos(var, $1);
3087                 e->var = v;
3088                 if (v) {
3089                         v->where_decl = e;
3090                         $0 = v;
3091                 } else {
3092                         v = var_ref(c, $1.txt);
3093                         e->var = v;
3094                         type_err(c, "error: function '%v' redeclared",
3095                                 e, NULL, 0, NULL);
3096                         type_err(c, "info: this is where '%v' was first declared",
3097                                 v->where_decl, NULL, 0, NULL);
3098                         free_exec(e);
3099                 }
3100         } }$
3101
3102         $*binode
3103         Args -> ArgsLine NEWLINE ${ $0 = $<AL; }$
3104         | Args ArgsLine NEWLINE ${ {
3105                 struct binode *b = $<AL;
3106                 struct binode **bp = &b;
3107                 while (*bp)
3108                         bp = (struct binode **)&(*bp)->left;
3109                 *bp = $<A;
3110                 $0 = b;
3111         } }$
3112
3113         ArgsLine -> ${ $0 = NULL; }$
3114         | Varlist ${ $0 = $<1; }$
3115         | Varlist ; ${ $0 = $<1; }$
3116
3117         Varlist -> Varlist ; ArgDecl ${
3118                 $0 = new(binode);
3119                 $0->op = List;
3120                 $0->left = $<Vl;
3121                 $0->right = $<AD;
3122         }$
3123         | ArgDecl ${
3124                 $0 = new(binode);
3125                 $0->op = List;
3126                 $0->left = NULL;
3127                 $0->right = $<AD;
3128         }$
3129
3130         $*var
3131         ArgDecl -> IDENTIFIER : FormalType ${ {
3132                 struct variable *v = var_decl(c, $1.txt);
3133                 $0 = new(var);
3134                 $0->var = v;
3135                 v->type = $<FT;
3136         } }$
3137
3138 ##### Function calls
3139
3140 A function call can appear either as an expression or as a statement.
3141 We use a new 'Funcall' binode type to link the function with a list of
3142 arguments, form with the 'List' nodes.
3143
3144 We have already seen the "Term" which is how a function call can appear
3145 in an expression.  To parse a function call into a statement we include
3146 it in the "SimpleStatement Grammar" which will be described later.
3147
3148 ###### Binode types
3149         Funcall,
3150
3151 ###### term grammar
3152         | Term ( ExpressionList ) ${ {
3153                 struct binode *b = new(binode);
3154                 b->op = Funcall;
3155                 b->left = $<T;
3156                 b->right = reorder_bilist($<EL);
3157                 $0 = b;
3158         } }$
3159         | Term ( ) ${ {
3160                 struct binode *b = new(binode);
3161                 b->op = Funcall;
3162                 b->left = $<T;
3163                 b->right = NULL;
3164                 $0 = b;
3165         } }$
3166
3167 ###### SimpleStatement Grammar
3168
3169         | Term ( ExpressionList ) ${ {
3170                 struct binode *b = new(binode);
3171                 b->op = Funcall;
3172                 b->left = $<T;
3173                 b->right = reorder_bilist($<EL);
3174                 $0 = b;
3175         } }$
3176
3177 ###### print binode cases
3178
3179         case Funcall:
3180                 do_indent(indent, "");
3181                 print_exec(b->left, -1, bracket);
3182                 printf("(");
3183                 for (b = cast(binode, b->right); b; b = cast(binode, b->right)) {
3184                         if (b->left) {
3185                                 printf(" ");
3186                                 print_exec(b->left, -1, bracket);
3187                                 if (b->right)
3188                                         printf(",");
3189                         }
3190                 }
3191                 printf(")");
3192                 if (indent >= 0)
3193                         printf("\n");
3194                 break;
3195
3196 ###### propagate binode cases
3197
3198         case Funcall: {
3199                 /* Every arg must match formal parameter, and result
3200                  * is return type of function
3201                  */
3202                 struct binode *args = cast(binode, b->right);
3203                 struct var *v = cast(var, b->left);
3204
3205                 if (!v->var->type || v->var->type->check_args == NULL) {
3206                         type_err(c, "error: attempt to call a non-function.",
3207                                  prog, NULL, 0, NULL);
3208                         return NULL;
3209                 }
3210                 *perr |= Enoconst;
3211                 v->var->type->check_args(c, perr, v->var->type, args);
3212                 return v->var->type->function.return_type;
3213         }
3214
3215 ###### interp binode cases
3216
3217         case Funcall: {
3218                 struct var *v = cast(var, b->left);
3219                 struct type *t = v->var->type;
3220                 void *oldlocal = c->local;
3221                 int old_size = c->local_size;
3222                 void *local = calloc(1, t->function.local_size);
3223                 struct value *fbody = var_value(c, v->var);
3224                 struct binode *arg = cast(binode, b->right);
3225                 struct binode *param = t->function.params;
3226
3227                 while (param) {
3228                         struct var *pv = cast(var, param->left);
3229                         struct type *vtype = NULL;
3230                         struct value val = interp_exec(c, arg->left, &vtype);
3231                         struct value *lval;
3232                         c->local = local; c->local_size = t->function.local_size;
3233                         lval = var_value(c, pv->var);
3234                         c->local = oldlocal; c->local_size = old_size;
3235                         memcpy(lval, &val, vtype->size);
3236                         param = cast(binode, param->right);
3237                         arg = cast(binode, arg->right);
3238                 }
3239                 c->local = local; c->local_size = t->function.local_size;
3240                 if (t->function.inline_result && dtype) {
3241                         _interp_exec(c, fbody->function, NULL, NULL);
3242                         memcpy(dest, local, dtype->size);
3243                         rvtype = ret.type = NULL;
3244                 } else
3245                         rv = interp_exec(c, fbody->function, &rvtype);
3246                 c->local = oldlocal; c->local_size = old_size;
3247                 free(local);
3248                 break;
3249         }
3250
3251 ## Complex executables: statements and expressions
3252
3253 Now that we have types and values and variables and most of the basic
3254 Terms which provide access to these, we can explore the more complex
3255 code that combine all of these to get useful work done.  Specifically
3256 statements and expressions.
3257
3258 Expressions are various combinations of Terms.  We will use operator
3259 precedence to ensure correct parsing.  The simplest Expression is just a
3260 Term - others will follow.
3261
3262 ###### Grammar
3263
3264         $*exec
3265         Expression -> Term ${ $0 = $<Term; }$
3266         ## expression grammar
3267
3268 ### Expressions: Conditional
3269
3270 Our first user of the `binode` will be conditional expressions, which
3271 is a bit odd as they actually have three components.  That will be
3272 handled by having 2 binodes for each expression.  The conditional
3273 expression is the lowest precedence operator which is why we define it
3274 first - to start the precedence list.
3275
3276 Conditional expressions are of the form "value `if` condition `else`
3277 other_value".  They associate to the right, so everything to the right
3278 of `else` is part of an else value, while only a higher-precedence to
3279 the left of `if` is the if values.  Between `if` and `else` there is no
3280 room for ambiguity, so a full conditional expression is allowed in
3281 there.
3282
3283 ###### Binode types
3284         CondExpr,
3285
3286 ###### declare terminals
3287
3288         $LEFT if $$ifelse
3289
3290 ###### expression grammar
3291
3292         | Expression if Expression else Expression $$ifelse ${ {
3293                 struct binode *b1 = new(binode);
3294                 struct binode *b2 = new(binode);
3295                 b1->op = CondExpr;
3296                 b1->left = $<3;
3297                 b1->right = b2;
3298                 b2->op = CondExpr;
3299                 b2->left = $<1;
3300                 b2->right = $<5;
3301                 $0 = b1;
3302         } }$
3303
3304 ###### print binode cases
3305
3306         case CondExpr:
3307                 b2 = cast(binode, b->right);
3308                 if (bracket) printf("(");
3309                 print_exec(b2->left, -1, bracket);
3310                 printf(" if ");
3311                 print_exec(b->left, -1, bracket);
3312                 printf(" else ");
3313                 print_exec(b2->right, -1, bracket);
3314                 if (bracket) printf(")");
3315                 break;
3316
3317 ###### propagate binode cases
3318
3319         case CondExpr: {
3320                 /* cond must be Tbool, others must match */
3321                 struct binode *b2 = cast(binode, b->right);
3322                 struct type *t2;
3323
3324                 propagate_types(b->left, c, perr, Tbool, 0);
3325                 t = propagate_types(b2->left, c, perr, type, Rnolabel);
3326                 t2 = propagate_types(b2->right, c, perr, type ?: t, Rnolabel);
3327                 return t ?: t2;
3328         }
3329
3330 ###### interp binode cases
3331
3332         case CondExpr: {
3333                 struct binode *b2 = cast(binode, b->right);
3334                 left = interp_exec(c, b->left, &ltype);
3335                 if (left.bool)
3336                         rv = interp_exec(c, b2->left, &rvtype); // UNTESTED
3337                 else
3338                         rv = interp_exec(c, b2->right, &rvtype);
3339                 }
3340                 break;
3341
3342 ### Expression list
3343
3344 We take a brief detour, now that we have expressions, to describe lists
3345 of expressions.  These will be needed for function parameters and
3346 possibly other situations.  They seem generic enough to introduce here
3347 to be used elsewhere.
3348
3349 And ExpressionList will use the `List` type of `binode`, building up at
3350 the end.  And place where they are used will probably call
3351 `reorder_bilist()` to get a more normal first/next arrangement.
3352
3353 ###### declare terminals
3354         $TERM ,
3355
3356 `List` execs have no implicit semantics, so they are never propagated or
3357 interpreted.  The can be printed as a comma separate list, which is how
3358 they are parsed.  Note they are also used for function formal parameter
3359 lists.  In that case a separate function is used to print them.
3360
3361 ###### print binode cases
3362         case List:
3363                 while (b) {
3364                         printf(" ");
3365                         print_exec(b->left, -1, bracket);
3366                         if (b->right)
3367                                 printf(",");
3368                         b = cast(binode, b->right);
3369                 }
3370                 break;
3371
3372 ###### propagate binode cases
3373         case List: abort(); // NOTEST
3374 ###### interp binode cases
3375         case List: abort(); // NOTEST
3376
3377 ###### Grammar
3378
3379         $*binode
3380         ExpressionList -> ExpressionList , Expression ${
3381                 $0 = new(binode);
3382                 $0->op = List;
3383                 $0->left = $<1;
3384                 $0->right = $<3;
3385         }$
3386         | Expression ${
3387                 $0 = new(binode);
3388                 $0->op = List;
3389                 $0->left = NULL;
3390                 $0->right = $<1;
3391         }$
3392
3393 ### Expressions: Boolean
3394
3395 The next class of expressions to use the `binode` will be Boolean
3396 expressions.  "`and then`" and "`or else`" are similar to `and` and `or`
3397 have same corresponding precendence.  The difference is that they don't
3398 evaluate the second expression if not necessary.
3399
3400 ###### Binode types
3401         And,
3402         AndThen,
3403         Or,
3404         OrElse,
3405         Not,
3406
3407 ###### declare terminals
3408         $LEFT or
3409         $LEFT and
3410         $LEFT not
3411
3412 ###### expression grammar
3413         | Expression or Expression ${ {
3414                 struct binode *b = new(binode);
3415                 b->op = Or;
3416                 b->left = $<1;
3417                 b->right = $<3;
3418                 $0 = b;
3419         } }$
3420         | Expression or else Expression ${ {
3421                 struct binode *b = new(binode);
3422                 b->op = OrElse;
3423                 b->left = $<1;
3424                 b->right = $<4;
3425                 $0 = b;
3426         } }$
3427
3428         | Expression and Expression ${ {
3429                 struct binode *b = new(binode);
3430                 b->op = And;
3431                 b->left = $<1;
3432                 b->right = $<3;
3433                 $0 = b;
3434         } }$
3435         | Expression and then Expression ${ {
3436                 struct binode *b = new(binode);
3437                 b->op = AndThen;
3438                 b->left = $<1;
3439                 b->right = $<4;
3440                 $0 = b;
3441         } }$
3442
3443         | not Expression ${ {
3444                 struct binode *b = new(binode);
3445                 b->op = Not;
3446                 b->right = $<2;
3447                 $0 = b;
3448         } }$
3449
3450 ###### print binode cases
3451         case And:
3452                 if (bracket) printf("(");
3453                 print_exec(b->left, -1, bracket);
3454                 printf(" and ");
3455                 print_exec(b->right, -1, bracket);
3456                 if (bracket) printf(")");
3457                 break;
3458         case AndThen:
3459                 if (bracket) printf("(");
3460                 print_exec(b->left, -1, bracket);
3461                 printf(" and then ");
3462                 print_exec(b->right, -1, bracket);
3463                 if (bracket) printf(")");
3464                 break;
3465         case Or:
3466                 if (bracket) printf("(");
3467                 print_exec(b->left, -1, bracket);
3468                 printf(" or ");
3469                 print_exec(b->right, -1, bracket);
3470                 if (bracket) printf(")");
3471                 break;
3472         case OrElse:
3473                 if (bracket) printf("(");
3474                 print_exec(b->left, -1, bracket);
3475                 printf(" or else ");
3476                 print_exec(b->right, -1, bracket);
3477                 if (bracket) printf(")");
3478                 break;
3479         case Not:
3480                 if (bracket) printf("(");
3481                 printf("not ");
3482                 print_exec(b->right, -1, bracket);
3483                 if (bracket) printf(")");
3484                 break;
3485
3486 ###### propagate binode cases
3487         case And:
3488         case AndThen:
3489         case Or:
3490         case OrElse:
3491         case Not:
3492                 /* both must be Tbool, result is Tbool */
3493                 propagate_types(b->left, c, perr, Tbool, 0);
3494                 propagate_types(b->right, c, perr, Tbool, 0);
3495                 if (type && type != Tbool)
3496                         type_err(c, "error: %1 operation found where %2 expected", prog,
3497                                    Tbool, 0, type);
3498                 return Tbool;
3499
3500 ###### interp binode cases
3501         case And:
3502                 rv = interp_exec(c, b->left, &rvtype);
3503                 right = interp_exec(c, b->right, &rtype);
3504                 rv.bool = rv.bool && right.bool;
3505                 break;
3506         case AndThen:
3507                 rv = interp_exec(c, b->left, &rvtype);
3508                 if (rv.bool)
3509                         rv = interp_exec(c, b->right, NULL);
3510                 break;
3511         case Or:
3512                 rv = interp_exec(c, b->left, &rvtype);
3513                 right = interp_exec(c, b->right, &rtype);
3514                 rv.bool = rv.bool || right.bool;
3515                 break;
3516         case OrElse:
3517                 rv = interp_exec(c, b->left, &rvtype);
3518                 if (!rv.bool)
3519                         rv = interp_exec(c, b->right, NULL);
3520                 break;
3521         case Not:
3522                 rv = interp_exec(c, b->right, &rvtype);
3523                 rv.bool = !rv.bool;
3524                 break;
3525
3526 ### Expressions: Comparison
3527
3528 Of slightly higher precedence that Boolean expressions are Comparisons.
3529 A comparison takes arguments of any comparable type, but the two types
3530 must be the same.
3531
3532 To simplify the parsing we introduce an `eop` which can record an
3533 expression operator, and the `CMPop` non-terminal will match one of them.
3534
3535 ###### ast
3536         struct eop {
3537                 enum Btype op;
3538         };
3539
3540 ###### ast functions
3541         static void free_eop(struct eop *e)
3542         {
3543                 if (e)
3544                         free(e);
3545         }
3546
3547 ###### Binode types
3548         Less,
3549         Gtr,
3550         LessEq,
3551         GtrEq,
3552         Eql,
3553         NEql,
3554
3555 ###### declare terminals
3556         $LEFT < > <= >= == != CMPop
3557
3558 ###### expression grammar
3559         | Expression CMPop Expression ${ {
3560                 struct binode *b = new(binode);
3561                 b->op = $2.op;
3562                 b->left = $<1;
3563                 b->right = $<3;
3564                 $0 = b;
3565         } }$
3566
3567 ###### Grammar
3568
3569         $eop
3570         CMPop ->  < ${ $0.op = Less; }$
3571         |         > ${ $0.op = Gtr; }$
3572         |         <= ${ $0.op = LessEq; }$
3573         |         >= ${ $0.op = GtrEq; }$
3574         |         == ${ $0.op = Eql; }$
3575         |         != ${ $0.op = NEql; }$
3576
3577 ###### print binode cases
3578
3579         case Less:
3580         case LessEq:
3581         case Gtr:
3582         case GtrEq:
3583         case Eql:
3584         case NEql:
3585                 if (bracket) printf("(");
3586                 print_exec(b->left, -1, bracket);
3587                 switch(b->op) {
3588                 case Less:   printf(" < "); break;
3589                 case LessEq: printf(" <= "); break;
3590                 case Gtr:    printf(" > "); break;
3591                 case GtrEq:  printf(" >= "); break;
3592                 case Eql:    printf(" == "); break;
3593                 case NEql:   printf(" != "); break;
3594                 default: abort();               // NOTEST
3595                 }
3596                 print_exec(b->right, -1, bracket);
3597                 if (bracket) printf(")");
3598                 break;
3599
3600 ###### propagate binode cases
3601         case Less:
3602         case LessEq:
3603         case Gtr:
3604         case GtrEq:
3605         case Eql:
3606         case NEql:
3607                 /* Both must match but not be labels, result is Tbool */
3608                 t = propagate_types(b->left, c, perr, NULL, Rnolabel);
3609                 if (t)
3610                         propagate_types(b->right, c, perr, t, 0);
3611                 else {
3612                         t = propagate_types(b->right, c, perr, NULL, Rnolabel); // UNTESTED
3613                         if (t)  // UNTESTED
3614                                 t = propagate_types(b->left, c, perr, t, 0);    // UNTESTED
3615                 }
3616                 if (!type_compat(type, Tbool, 0))
3617                         type_err(c, "error: Comparison returns %1 but %2 expected", prog,
3618                                     Tbool, rules, type);
3619                 return Tbool;
3620
3621 ###### interp binode cases
3622         case Less:
3623         case LessEq:
3624         case Gtr:
3625         case GtrEq:
3626         case Eql:
3627         case NEql:
3628         {
3629                 int cmp;
3630                 left = interp_exec(c, b->left, &ltype);
3631                 right = interp_exec(c, b->right, &rtype);
3632                 cmp = value_cmp(ltype, rtype, &left, &right);
3633                 rvtype = Tbool;
3634                 switch (b->op) {
3635                 case Less:      rv.bool = cmp <  0; break;
3636                 case LessEq:    rv.bool = cmp <= 0; break;
3637                 case Gtr:       rv.bool = cmp >  0; break;
3638                 case GtrEq:     rv.bool = cmp >= 0; break;
3639                 case Eql:       rv.bool = cmp == 0; break;
3640                 case NEql:      rv.bool = cmp != 0; break;
3641                 default:        rv.bool = 0; break;     // NOTEST
3642                 }
3643                 break;
3644         }
3645
3646 ### Expressions: Arithmetic etc.
3647
3648 The remaining expressions with the highest precedence are arithmetic,
3649 string concatenation, and string conversion.  String concatenation
3650 (`++`) has the same precedence as multiplication and division, but lower
3651 than the uniary.
3652
3653 String conversion is a temporary feature until I get a better type
3654 system.  `$` is a prefix operator which expects a string and returns
3655 a number.
3656
3657 `+` and `-` are both infix and prefix operations (where they are
3658 absolute value and negation).  These have different operator names.
3659
3660 We also have a 'Bracket' operator which records where parentheses were
3661 found.  This makes it easy to reproduce these when printing.  Possibly I
3662 should only insert brackets were needed for precedence.  Putting
3663 parentheses around an expression converts it into a Term,
3664
3665 ###### Binode types
3666         Plus, Minus,
3667         Times, Divide, Rem,
3668         Concat,
3669         Absolute, Negate,
3670         StringConv,
3671         Bracket,
3672
3673 ###### declare terminals
3674         $LEFT + - Eop
3675         $LEFT * / % ++ Top
3676         $LEFT Uop $
3677         $TERM ( )
3678
3679 ###### expression grammar
3680         | Expression Eop Expression ${ {
3681                 struct binode *b = new(binode);
3682                 b->op = $2.op;
3683                 b->left = $<1;
3684                 b->right = $<3;
3685                 $0 = b;
3686         } }$
3687
3688         | Expression Top Expression ${ {
3689                 struct binode *b = new(binode);
3690                 b->op = $2.op;
3691                 b->left = $<1;
3692                 b->right = $<3;
3693                 $0 = b;
3694         } }$
3695
3696         | Uop Expression ${ {
3697                 struct binode *b = new(binode);
3698                 b->op = $1.op;
3699                 b->right = $<2;
3700                 $0 = b;
3701         } }$
3702
3703 ###### term grammar
3704
3705         | ( Expression ) ${ {
3706                 struct binode *b = new_pos(binode, $1);
3707                 b->op = Bracket;
3708                 b->right = $<2;
3709                 $0 = b;
3710         } }$
3711
3712 ###### Grammar
3713
3714         $eop
3715         Eop ->   + ${ $0.op = Plus; }$
3716         |        - ${ $0.op = Minus; }$
3717
3718         Uop ->   + ${ $0.op = Absolute; }$
3719         |        - ${ $0.op = Negate; }$
3720         |        $ ${ $0.op = StringConv; }$
3721
3722         Top ->   * ${ $0.op = Times; }$
3723         |        / ${ $0.op = Divide; }$
3724         |        % ${ $0.op = Rem; }$
3725         |        ++ ${ $0.op = Concat; }$
3726
3727 ###### print binode cases
3728         case Plus:
3729         case Minus:
3730         case Times:
3731         case Divide:
3732         case Concat:
3733         case Rem:
3734                 if (bracket) printf("(");
3735                 print_exec(b->left, indent, bracket);
3736                 switch(b->op) {
3737                 case Plus:   fputs(" + ", stdout); break;
3738                 case Minus:  fputs(" - ", stdout); break;
3739                 case Times:  fputs(" * ", stdout); break;
3740                 case Divide: fputs(" / ", stdout); break;
3741                 case Rem:    fputs(" % ", stdout); break;
3742                 case Concat: fputs(" ++ ", stdout); break;
3743                 default: abort();       // NOTEST
3744                 }                       // NOTEST
3745                 print_exec(b->right, indent, bracket);
3746                 if (bracket) printf(")");
3747                 break;
3748         case Absolute:
3749         case Negate:
3750         case StringConv:
3751                 if (bracket) printf("(");
3752                 switch (b->op) {
3753                 case Absolute:   fputs("+", stdout); break;
3754                 case Negate:     fputs("-", stdout); break;
3755                 case StringConv: fputs("$", stdout); break;
3756                 default: abort();       // NOTEST
3757                 }                       // NOTEST
3758                 print_exec(b->right, indent, bracket);
3759                 if (bracket) printf(")");
3760                 break;
3761         case Bracket:
3762                 printf("(");
3763                 print_exec(b->right, indent, bracket);
3764                 printf(")");
3765                 break;
3766
3767 ###### propagate binode cases
3768         case Plus:
3769         case Minus:
3770         case Times:
3771         case Rem:
3772         case Divide:
3773                 /* both must be numbers, result is Tnum */
3774         case Absolute:
3775         case Negate:
3776                 /* as propagate_types ignores a NULL,
3777                  * unary ops fit here too */
3778                 propagate_types(b->left, c, perr, Tnum, 0);
3779                 propagate_types(b->right, c, perr, Tnum, 0);
3780                 if (!type_compat(type, Tnum, 0))
3781                         type_err(c, "error: Arithmetic returns %1 but %2 expected", prog,
3782                                    Tnum, rules, type);
3783                 return Tnum;
3784
3785         case Concat:
3786                 /* both must be Tstr, result is Tstr */
3787                 propagate_types(b->left, c, perr, Tstr, 0);
3788                 propagate_types(b->right, c, perr, Tstr, 0);
3789                 if (!type_compat(type, Tstr, 0))
3790                         type_err(c, "error: Concat returns %1 but %2 expected", prog,
3791                                    Tstr, rules, type);
3792                 return Tstr;
3793
3794         case StringConv:
3795                 /* op must be string, result is number */
3796                 propagate_types(b->left, c, perr, Tstr, 0);
3797                 if (!type_compat(type, Tnum, 0))
3798                         type_err(c,     // UNTESTED
3799                           "error: Can only convert string to number, not %1",
3800                                 prog, type, 0, NULL);
3801                 return Tnum;
3802
3803         case Bracket:
3804                 return propagate_types(b->right, c, perr, type, 0);
3805
3806 ###### interp binode cases
3807
3808         case Plus:
3809                 rv = interp_exec(c, b->left, &rvtype);
3810                 right = interp_exec(c, b->right, &rtype);
3811                 mpq_add(rv.num, rv.num, right.num);
3812                 break;
3813         case Minus:
3814                 rv = interp_exec(c, b->left, &rvtype);
3815                 right = interp_exec(c, b->right, &rtype);
3816                 mpq_sub(rv.num, rv.num, right.num);
3817                 break;
3818         case Times:
3819                 rv = interp_exec(c, b->left, &rvtype);
3820                 right = interp_exec(c, b->right, &rtype);
3821                 mpq_mul(rv.num, rv.num, right.num);
3822                 break;
3823         case Divide:
3824                 rv = interp_exec(c, b->left, &rvtype);
3825                 right = interp_exec(c, b->right, &rtype);
3826                 mpq_div(rv.num, rv.num, right.num);
3827                 break;
3828         case Rem: {
3829                 mpz_t l, r, rem;
3830
3831                 left = interp_exec(c, b->left, &ltype);
3832                 right = interp_exec(c, b->right, &rtype);
3833                 mpz_init(l); mpz_init(r); mpz_init(rem);
3834                 mpz_tdiv_q(l, mpq_numref(left.num), mpq_denref(left.num));
3835                 mpz_tdiv_q(r, mpq_numref(right.num), mpq_denref(right.num));
3836                 mpz_tdiv_r(rem, l, r);
3837                 val_init(Tnum, &rv);
3838                 mpq_set_z(rv.num, rem);
3839                 mpz_clear(r); mpz_clear(l); mpz_clear(rem);
3840                 rvtype = ltype;
3841                 break;
3842         }
3843         case Negate:
3844                 rv = interp_exec(c, b->right, &rvtype);
3845                 mpq_neg(rv.num, rv.num);
3846                 break;
3847         case Absolute:
3848                 rv = interp_exec(c, b->right, &rvtype);
3849                 mpq_abs(rv.num, rv.num);
3850                 break;
3851         case Bracket:
3852                 rv = interp_exec(c, b->right, &rvtype);
3853                 break;
3854         case Concat:
3855                 left = interp_exec(c, b->left, &ltype);
3856                 right = interp_exec(c, b->right, &rtype);
3857                 rvtype = Tstr;
3858                 rv.str = text_join(left.str, right.str);
3859                 break;
3860         case StringConv:
3861                 right = interp_exec(c, b->right, &rvtype);
3862                 rtype = Tstr;
3863                 rvtype = Tnum;
3864
3865                 struct text tx = right.str;
3866                 char tail[3];
3867                 int neg = 0;
3868                 if (tx.txt[0] == '-') {
3869                         neg = 1;        // UNTESTED
3870                         tx.txt++;       // UNTESTED
3871                         tx.len--;       // UNTESTED
3872                 }
3873                 if (number_parse(rv.num, tail, tx) == 0)
3874                         mpq_init(rv.num);       // UNTESTED
3875                 else if (neg)
3876                         mpq_neg(rv.num, rv.num);        // UNTESTED
3877                 if (tail[0])
3878                         printf("Unsupported suffix: %.*s\n", tx.len, tx.txt);   // UNTESTED
3879
3880                 break;
3881
3882 ###### value functions
3883
3884         static struct text text_join(struct text a, struct text b)
3885         {
3886                 struct text rv;
3887                 rv.len = a.len + b.len;
3888                 rv.txt = malloc(rv.len);
3889                 memcpy(rv.txt, a.txt, a.len);
3890                 memcpy(rv.txt+a.len, b.txt, b.len);
3891                 return rv;
3892         }
3893
3894 ### Blocks, Statements, and Statement lists.
3895
3896 Now that we have expressions out of the way we need to turn to
3897 statements.  There are simple statements and more complex statements.
3898 Simple statements do not contain (syntactic) newlines, complex statements do.
3899
3900 Statements often come in sequences and we have corresponding simple
3901 statement lists and complex statement lists.
3902 The former comprise only simple statements separated by semicolons.
3903 The later comprise complex statements and simple statement lists.  They are
3904 separated by newlines.  Thus the semicolon is only used to separate
3905 simple statements on the one line.  This may be overly restrictive,
3906 but I'm not sure I ever want a complex statement to share a line with
3907 anything else.
3908
3909 Note that a simple statement list can still use multiple lines if
3910 subsequent lines are indented, so
3911
3912 ###### Example: wrapped simple statement list
3913
3914         a = b; c = d;
3915            e = f; print g
3916
3917 is a single simple statement list.  This might allow room for
3918 confusion, so I'm not set on it yet.
3919
3920 A simple statement list needs no extra syntax.  A complex statement
3921 list has two syntactic forms.  It can be enclosed in braces (much like
3922 C blocks), or it can be introduced by an indent and continue until an
3923 unindented newline (much like Python blocks).  With this extra syntax
3924 it is referred to as a block.
3925
3926 Note that a block does not have to include any newlines if it only
3927 contains simple statements.  So both of:
3928
3929         if condition: a=b; d=f
3930
3931         if condition { a=b; print f }
3932
3933 are valid.
3934
3935 In either case the list is constructed from a `binode` list with
3936 `Block` as the operator.  When parsing the list it is most convenient
3937 to append to the end, so a list is a list and a statement.  When using
3938 the list it is more convenient to consider a list to be a statement
3939 and a list.  So we need a function to re-order a list.
3940 `reorder_bilist` serves this purpose.
3941
3942 The only stand-alone statement we introduce at this stage is `pass`
3943 which does nothing and is represented as a `NULL` pointer in a `Block`
3944 list.  Other stand-alone statements will follow once the infrastructure
3945 is in-place.
3946
3947 As many statements will use binodes, we declare a binode pointer 'b' in
3948 the common header for all reductions to use.
3949
3950 ###### Parser: reduce
3951         struct binode *b;
3952
3953 ###### Binode types
3954         Block,
3955
3956 ###### Grammar
3957
3958         $TERM { } ;
3959
3960         $*binode
3961         Block -> { IN OptNL Statementlist OUT OptNL } ${ $0 = $<Sl; }$
3962         |        { SimpleStatements } ${ $0 = reorder_bilist($<SS); }$
3963         |        SimpleStatements ; ${ $0 = reorder_bilist($<SS); }$
3964         |        SimpleStatements EOL ${ $0 = reorder_bilist($<SS); }$
3965         |        IN OptNL Statementlist OUT ${ $0 = $<Sl; }$
3966
3967         OpenBlock -> OpenScope { IN OptNL Statementlist OUT OptNL } ${ $0 = $<Sl; }$
3968         |        OpenScope { SimpleStatements } ${ $0 = reorder_bilist($<SS); }$
3969         |        OpenScope SimpleStatements ; ${ $0 = reorder_bilist($<SS); }$
3970         |        OpenScope SimpleStatements EOL ${ $0 = reorder_bilist($<SS); }$
3971         |        IN OpenScope OptNL Statementlist OUT ${ $0 = $<Sl; }$
3972
3973         UseBlock -> { OpenScope IN OptNL Statementlist OUT OptNL } ${ $0 = $<Sl; }$
3974         |        { OpenScope SimpleStatements } ${ $0 = reorder_bilist($<SS); }$
3975         |        IN OpenScope OptNL Statementlist OUT ${ $0 = $<Sl; }$
3976
3977         ColonBlock -> { IN OptNL Statementlist OUT OptNL } ${ $0 = $<Sl; }$
3978         |        { SimpleStatements } ${ $0 = reorder_bilist($<SS); }$
3979         |        : SimpleStatements ; ${ $0 = reorder_bilist($<SS); }$
3980         |        : SimpleStatements EOL ${ $0 = reorder_bilist($<SS); }$
3981         |        : IN OptNL Statementlist OUT ${ $0 = $<Sl; }$
3982
3983         Statementlist -> ComplexStatements ${ $0 = reorder_bilist($<CS); }$
3984
3985         ComplexStatements -> ComplexStatements ComplexStatement ${
3986                 if ($2 == NULL) {
3987                         $0 = $<1;
3988                 } else {
3989                         $0 = new(binode);
3990                         $0->op = Block;
3991                         $0->left = $<1;
3992                         $0->right = $<2;
3993                 }
3994         }$
3995         | ComplexStatement ${
3996                 if ($1 == NULL) {
3997                         $0 = NULL;
3998                 } else {
3999                         $0 = new(binode);
4000                         $0->op = Block;
4001                         $0->left = NULL;
4002                         $0->right = $<1;
4003                 }
4004         }$
4005
4006         $*exec
4007         ComplexStatement -> SimpleStatements Newlines ${
4008                 $0 = reorder_bilist($<SS);
4009         }$
4010         |  SimpleStatements ; Newlines ${
4011                 $0 = reorder_bilist($<SS);
4012         }$
4013         ## ComplexStatement Grammar
4014
4015         $*binode
4016         SimpleStatements -> SimpleStatements ; SimpleStatement ${
4017                 $0 = new(binode);
4018                 $0->op = Block;
4019                 $0->left = $<1;
4020                 $0->right = $<3;
4021         }$
4022         | SimpleStatement ${
4023                 $0 = new(binode);
4024                 $0->op = Block;
4025                 $0->left = NULL;
4026                 $0->right = $<1;
4027         }$
4028
4029         $TERM pass
4030         $*exec
4031         SimpleStatement -> pass ${ $0 = NULL; }$
4032         | ERROR ${ tok_err(c, "Syntax error in statement", &$1); }$
4033         ## SimpleStatement Grammar
4034
4035 ###### print binode cases
4036         case Block:
4037                 if (indent < 0) {
4038                         // simple statement
4039                         if (b->left == NULL)    // UNTESTED
4040                                 printf("pass"); // UNTESTED
4041                         else
4042                                 print_exec(b->left, indent, bracket);   // UNTESTED
4043                         if (b->right) { // UNTESTED
4044                                 printf("; ");   // UNTESTED
4045                                 print_exec(b->right, indent, bracket);  // UNTESTED
4046                         }
4047                 } else {
4048                         // block, one per line
4049                         if (b->left == NULL)
4050                                 do_indent(indent, "pass\n");
4051                         else
4052                                 print_exec(b->left, indent, bracket);
4053                         if (b->right)
4054                                 print_exec(b->right, indent, bracket);
4055                 }
4056                 break;
4057
4058 ###### propagate binode cases
4059         case Block:
4060         {
4061                 /* If any statement returns something other than Tnone
4062                  * or Tbool then all such must return same type.
4063                  * As each statement may be Tnone or something else,
4064                  * we must always pass NULL (unknown) down, otherwise an incorrect
4065                  * error might occur.  We never return Tnone unless it is
4066                  * passed in.
4067                  */
4068                 struct binode *e;
4069
4070                 for (e = b; e; e = cast(binode, e->right)) {
4071                         t = propagate_types(e->left, c, perr, NULL, rules);
4072                         if ((rules & Rboolok) && (t == Tbool || t == Tnone))
4073                                 t = NULL;
4074                         if (t == Tnone && e->right)
4075                                 /* Only the final statement *must* return a value
4076                                  * when not Rboolok
4077                                  */
4078                                 t = NULL;
4079                         if (t) {
4080                                 if (!type)
4081                                         type = t;
4082                                 else if (t != type)
4083                                         type_err(c, "error: expected %1%r, found %2",
4084                                                  e->left, type, rules, t);
4085                         }
4086                 }
4087                 return type;
4088         }
4089
4090 ###### interp binode cases
4091         case Block:
4092                 while (rvtype == Tnone &&
4093                        b) {
4094                         if (b->left)
4095                                 rv = interp_exec(c, b->left, &rvtype);
4096                         b = cast(binode, b->right);
4097                 }
4098                 break;
4099
4100 ### The Print statement
4101
4102 `print` is a simple statement that takes a comma-separated list of
4103 expressions and prints the values separated by spaces and terminated
4104 by a newline.  No control of formatting is possible.
4105
4106 `print` uses `ExpressionList` to collect the expressions and stores them
4107 on the left side of a `Print` binode unlessthere is a trailing comma
4108 when the list is stored on the `right` side and no trailing newline is
4109 printed.
4110
4111 ###### Binode types
4112         Print,
4113
4114 ##### declare terminals
4115         $TERM print
4116
4117 ###### SimpleStatement Grammar
4118
4119         | print ExpressionList ${
4120                 $0 = b = new(binode);
4121                 b->op = Print;
4122                 b->right = NULL;
4123                 b->left = reorder_bilist($<EL);
4124         }$
4125         | print ExpressionList , ${ {
4126                 $0 = b = new(binode);
4127                 b->op = Print;
4128                 b->right = reorder_bilist($<EL);
4129                 b->left = NULL;
4130         } }$
4131         | print ${
4132                 $0 = b = new(binode);
4133                 b->op = Print;
4134                 b->left = NULL;
4135                 b->right = NULL;
4136         }$
4137
4138 ###### print binode cases
4139
4140         case Print:
4141                 do_indent(indent, "print");
4142                 if (b->right) {
4143                         print_exec(b->right, -1, bracket);
4144                         printf(",");
4145                 } else
4146                         print_exec(b->left, -1, bracket);
4147                 if (indent >= 0)
4148                         printf("\n");
4149                 break;
4150
4151 ###### propagate binode cases
4152
4153         case Print:
4154                 /* don't care but all must be consistent */
4155                 if (b->left)
4156                         b = cast(binode, b->left);
4157                 else
4158                         b = cast(binode, b->right);
4159                 while (b) {
4160                         propagate_types(b->left, c, perr, NULL, Rnolabel);
4161                         b = cast(binode, b->right);
4162                 }
4163                 break;
4164
4165 ###### interp binode cases
4166
4167         case Print:
4168         {
4169                 struct binode *b2 = cast(binode, b->left);
4170                 if (!b2)
4171                         b2 = cast(binode, b->right);
4172                 for (; b2; b2 = cast(binode, b2->right)) {
4173                         left = interp_exec(c, b2->left, &ltype);
4174                         print_value(ltype, &left, stdout);
4175                         free_value(ltype, &left);
4176                         if (b2->right)
4177                                 putchar(' ');
4178                 }
4179                 if (b->right == NULL)
4180                         printf("\n");
4181                 ltype = Tnone;
4182                 break;
4183         }
4184
4185 ###### Assignment statement
4186
4187 An assignment will assign a value to a variable, providing it hasn't
4188 been declared as a constant.  The analysis phase ensures that the type
4189 will be correct so the interpreter just needs to perform the
4190 calculation.  There is a form of assignment which declares a new
4191 variable as well as assigning a value.  If a name is assigned before
4192 it is declared, and error will be raised as the name is created as
4193 `Tlabel` and it is illegal to assign to such names.
4194
4195 ###### Binode types
4196         Assign,
4197         Declare,
4198
4199 ###### declare terminals
4200         $TERM =
4201
4202 ###### SimpleStatement Grammar
4203         | Term = Expression ${
4204                 $0 = b= new(binode);
4205                 b->op = Assign;
4206                 b->left = $<1;
4207                 b->right = $<3;
4208         }$
4209         | VariableDecl = Expression ${
4210                 $0 = b= new(binode);
4211                 b->op = Declare;
4212                 b->left = $<1;
4213                 b->right =$<3;
4214         }$
4215
4216         | VariableDecl ${
4217                 if ($1->var->where_set == NULL) {
4218                         type_err(c,
4219                                  "Variable declared with no type or value: %v",
4220                                  $1, NULL, 0, NULL);
4221                         free_var($1);
4222                 } else {
4223                         $0 = b = new(binode);
4224                         b->op = Declare;
4225                         b->left = $<1;
4226                         b->right = NULL;
4227                 }
4228         }$
4229
4230 ###### print binode cases
4231
4232         case Assign:
4233                 do_indent(indent, "");
4234                 print_exec(b->left, indent, bracket);
4235                 printf(" = ");
4236                 print_exec(b->right, indent, bracket);
4237                 if (indent >= 0)
4238                         printf("\n");
4239                 break;
4240
4241         case Declare:
4242                 {
4243                 struct variable *v = cast(var, b->left)->var;
4244                 do_indent(indent, "");
4245                 print_exec(b->left, indent, bracket);
4246                 if (cast(var, b->left)->var->constant) {
4247                         printf("::");
4248                         if (v->explicit_type) {
4249                                 type_print(v->type, stdout);
4250                                 printf(" ");
4251                         }
4252                 } else {
4253                         printf(":");
4254                         if (v->explicit_type) {
4255                                 type_print(v->type, stdout);
4256                                 printf(" ");
4257                         }
4258                 }
4259                 if (b->right) {
4260                         printf("= ");
4261                         print_exec(b->right, indent, bracket);
4262                 }
4263                 if (indent >= 0)
4264                         printf("\n");
4265                 }
4266                 break;
4267
4268 ###### propagate binode cases
4269
4270         case Assign:
4271         case Declare:
4272                 /* Both must match and not be labels,
4273                  * Type must support 'dup',
4274                  * For Assign, left must not be constant.
4275                  * result is Tnone
4276                  */
4277                 t = propagate_types(b->left, c, perr, NULL,
4278                                     Rnolabel | (b->op == Assign ? Rnoconstant : 0));
4279                 if (!b->right)
4280                         return Tnone;
4281
4282                 if (t) {
4283                         if (propagate_types(b->right, c, perr, t, 0) != t)
4284                                 if (b->left->type == Xvar)
4285                                         type_err(c, "info: variable '%v' was set as %1 here.",
4286                                                  cast(var, b->left)->var->where_set, t, rules, NULL);
4287                 } else {
4288                         t = propagate_types(b->right, c, perr, NULL, Rnolabel);
4289                         if (t)
4290                                 propagate_types(b->left, c, perr, t,
4291                                                 (b->op == Assign ? Rnoconstant : 0));
4292                 }
4293                 if (t && t->dup == NULL && t->name.txt[0] != ' ') // HACK
4294                         type_err(c, "error: cannot assign value of type %1", b, t, 0, NULL);
4295                 return Tnone;
4296
4297                 break;
4298
4299 ###### interp binode cases
4300
4301         case Assign:
4302                 lleft = linterp_exec(c, b->left, &ltype);
4303                 if (lleft)
4304                         dinterp_exec(c, b->right, lleft, ltype, 1);
4305                 ltype = Tnone;
4306                 break;
4307
4308         case Declare:
4309         {
4310                 struct variable *v = cast(var, b->left)->var;
4311                 struct value *val;
4312                 v = v->merged;
4313                 val = var_value(c, v);
4314                 if (v->type->prepare_type)
4315                         v->type->prepare_type(c, v->type, 0);
4316                 if (b->right)
4317                         dinterp_exec(c, b->right, val, v->type, 0);
4318                 else
4319                         val_init(v->type, val);
4320                 break;
4321         }
4322
4323 ### The `use` statement
4324
4325 The `use` statement is the last "simple" statement.  It is needed when a
4326 statement block can return a value.  This includes the body of a
4327 function which has a return type, and the "condition" code blocks in
4328 `if`, `while`, and `switch` statements.
4329
4330 ###### Binode types
4331         Use,
4332
4333 ###### declare terminals
4334         $TERM use
4335
4336 ###### SimpleStatement Grammar
4337         | use Expression ${
4338                 $0 = b = new_pos(binode, $1);
4339                 b->op = Use;
4340                 b->right = $<2;
4341                 if (b->right->type == Xvar) {
4342                         struct var *v = cast(var, b->right);
4343                         if (v->var->type == Tnone) {
4344                                 /* Convert this to a label */
4345                                 struct value *val;
4346
4347                                 v->var->type = Tlabel;
4348                                 val = global_alloc(c, Tlabel, v->var, NULL);
4349                                 val->label = val;
4350                         }
4351                 }
4352         }$
4353
4354 ###### print binode cases
4355
4356         case Use:
4357                 do_indent(indent, "use ");
4358                 print_exec(b->right, -1, bracket);
4359                 if (indent >= 0)
4360                         printf("\n");
4361                 break;
4362
4363 ###### propagate binode cases
4364
4365         case Use:
4366                 /* result matches value */
4367                 return propagate_types(b->right, c, perr, type, 0);
4368
4369 ###### interp binode cases
4370
4371         case Use:
4372                 rv = interp_exec(c, b->right, &rvtype);
4373                 break;
4374
4375 ### The Conditional Statement
4376
4377 This is the biggy and currently the only complex statement.  This
4378 subsumes `if`, `while`, `do/while`, `switch`, and some parts of `for`.
4379 It is comprised of a number of parts, all of which are optional though
4380 set combinations apply.  Each part is (usually) a key word (`then` is
4381 sometimes optional) followed by either an expression or a code block,
4382 except the `casepart` which is a "key word and an expression" followed
4383 by a code block.  The code-block option is valid for all parts and,
4384 where an expression is also allowed, the code block can use the `use`
4385 statement to report a value.  If the code block does not report a value
4386 the effect is similar to reporting `True`.
4387
4388 The `else` and `case` parts, as well as `then` when combined with
4389 `if`, can contain a `use` statement which will apply to some
4390 containing conditional statement. `for` parts, `do` parts and `then`
4391 parts used with `for` can never contain a `use`, except in some
4392 subordinate conditional statement.
4393
4394 If there is a `forpart`, it is executed first, only once.
4395 If there is a `dopart`, then it is executed repeatedly providing
4396 always that the `condpart` or `cond`, if present, does not return a non-True
4397 value.  `condpart` can fail to return any value if it simply executes
4398 to completion.  This is treated the same as returning `True`.
4399
4400 If there is a `thenpart` it will be executed whenever the `condpart`
4401 or `cond` returns True (or does not return any value), but this will happen
4402 *after* `dopart` (when present).
4403
4404 If `elsepart` is present it will be executed at most once when the
4405 condition returns `False` or some value that isn't `True` and isn't
4406 matched by any `casepart`.  If there are any `casepart`s, they will be
4407 executed when the condition returns a matching value.
4408
4409 The particular sorts of values allowed in case parts has not yet been
4410 determined in the language design, so nothing is prohibited.
4411
4412 The various blocks in this complex statement potentially provide scope
4413 for variables as described earlier.  Each such block must include the
4414 "OpenScope" nonterminal before parsing the block, and must call
4415 `var_block_close()` when closing the block.
4416
4417 The code following "`if`", "`switch`" and "`for`" does not get its own
4418 scope, but is in a scope covering the whole statement, so names
4419 declared there cannot be redeclared elsewhere.  Similarly the
4420 condition following "`while`" is in a scope the covers the body
4421 ("`do`" part) of the loop, and which does not allow conditional scope
4422 extension.  Code following "`then`" (both looping and non-looping),
4423 "`else`" and "`case`" each get their own local scope.
4424
4425 The type requirements on the code block in a `whilepart` are quite
4426 unusal.  It is allowed to return a value of some identifiable type, in
4427 which case the loop aborts and an appropriate `casepart` is run, or it
4428 can return a Boolean, in which case the loop either continues to the
4429 `dopart` (on `True`) or aborts and runs the `elsepart` (on `False`).
4430 This is different both from the `ifpart` code block which is expected to
4431 return a Boolean, or the `switchpart` code block which is expected to
4432 return the same type as the casepart values.  The correct analysis of
4433 the type of the `whilepart` code block is the reason for the
4434 `Rboolok` flag which is passed to `propagate_types()`.
4435
4436 The `cond_statement` cannot fit into a `binode` so a new `exec` is
4437 defined.  As there are two scopes which cover multiple parts - one for
4438 the whole statement and one for "while" and "do" - and as we will use
4439 the 'struct exec' to track scopes, we actually need two new types of
4440 exec.  One is a `binode` for the looping part, the rest is the
4441 `cond_statement`.  The `cond_statement` will use an auxilliary `struct
4442 casepart` to track a list of case parts.
4443
4444 ###### Binode types
4445         Loop
4446
4447 ###### exec type
4448         Xcond_statement,
4449
4450 ###### ast
4451         struct casepart {
4452                 struct exec *value;
4453                 struct exec *action;
4454                 struct casepart *next;
4455         };
4456         struct cond_statement {
4457                 struct exec;
4458                 struct exec *forpart, *condpart, *thenpart, *elsepart;
4459                 struct binode *looppart;
4460                 struct casepart *casepart;
4461         };
4462
4463 ###### ast functions
4464
4465         static void free_casepart(struct casepart *cp)
4466         {
4467                 while (cp) {
4468                         struct casepart *t;
4469                         free_exec(cp->value);
4470                         free_exec(cp->action);
4471                         t = cp->next;
4472                         free(cp);
4473                         cp = t;
4474                 }
4475         }
4476
4477         static void free_cond_statement(struct cond_statement *s)
4478         {
4479                 if (!s)
4480                         return;
4481                 free_exec(s->forpart);
4482                 free_exec(s->condpart);
4483                 free_exec(s->looppart);
4484                 free_exec(s->thenpart);
4485                 free_exec(s->elsepart);
4486                 free_casepart(s->casepart);
4487                 free(s);
4488         }
4489
4490 ###### free exec cases
4491         case Xcond_statement: free_cond_statement(cast(cond_statement, e)); break;
4492
4493 ###### ComplexStatement Grammar
4494         | CondStatement ${ $0 = $<1; }$
4495
4496 ###### declare terminals
4497         $TERM for then while do
4498         $TERM else
4499         $TERM switch case
4500
4501 ###### Grammar
4502
4503         $*cond_statement
4504         // A CondStatement must end with EOL, as does CondSuffix and
4505         // IfSuffix.
4506         // ForPart, ThenPart, SwitchPart, CasePart are non-empty and
4507         // may or may not end with EOL
4508         // WhilePart and IfPart include an appropriate Suffix
4509
4510         // ForPart, SwitchPart, and IfPart open scopes, o we have to close
4511         // them.  WhilePart opens and closes its own scope.
4512         CondStatement -> ForPart OptNL ThenPart OptNL WhilePart CondSuffix ${
4513                 $0 = $<CS;
4514                 $0->forpart = $<FP;
4515                 $0->thenpart = $<TP;
4516                 $0->looppart = $<WP;
4517                 var_block_close(c, CloseSequential, $0);
4518         }$
4519         | ForPart OptNL WhilePart CondSuffix ${
4520                 $0 = $<CS;
4521                 $0->forpart = $<FP;
4522                 $0->looppart = $<WP;
4523                 var_block_close(c, CloseSequential, $0);
4524         }$
4525         | WhilePart CondSuffix ${
4526                 $0 = $<CS;
4527                 $0->looppart = $<WP;
4528         }$
4529         | SwitchPart OptNL CasePart CondSuffix ${
4530                 $0 = $<CS;
4531                 $0->condpart = $<SP;
4532                 $CP->next = $0->casepart;
4533                 $0->casepart = $<CP;
4534                 var_block_close(c, CloseSequential, $0);
4535         }$
4536         | SwitchPart : IN OptNL CasePart CondSuffix OUT Newlines ${
4537                 $0 = $<CS;
4538                 $0->condpart = $<SP;
4539                 $CP->next = $0->casepart;
4540                 $0->casepart = $<CP;
4541                 var_block_close(c, CloseSequential, $0);
4542         }$
4543         | IfPart IfSuffix ${
4544                 $0 = $<IS;
4545                 $0->condpart = $IP.condpart; $IP.condpart = NULL;
4546                 $0->thenpart = $IP.thenpart; $IP.thenpart = NULL;
4547                 // This is where we close an "if" statement
4548                 var_block_close(c, CloseSequential, $0);
4549         }$
4550
4551         CondSuffix -> IfSuffix ${
4552                 $0 = $<1;
4553         }$
4554         | Newlines CasePart CondSuffix ${
4555                 $0 = $<CS;
4556                 $CP->next = $0->casepart;
4557                 $0->casepart = $<CP;
4558         }$
4559         | CasePart CondSuffix ${
4560                 $0 = $<CS;
4561                 $CP->next = $0->casepart;
4562                 $0->casepart = $<CP;
4563         }$
4564
4565         IfSuffix -> Newlines ${ $0 = new(cond_statement); }$
4566         | Newlines ElsePart ${ $0 = $<EP; }$
4567         | ElsePart ${$0 = $<EP; }$
4568
4569         ElsePart -> else OpenBlock Newlines ${
4570                 $0 = new(cond_statement);
4571                 $0->elsepart = $<OB;
4572                 var_block_close(c, CloseElse, $0->elsepart);
4573         }$
4574         | else OpenScope CondStatement ${
4575                 $0 = new(cond_statement);
4576                 $0->elsepart = $<CS;
4577                 var_block_close(c, CloseElse, $0->elsepart);
4578         }$
4579
4580         $*casepart
4581         CasePart -> case Expression OpenScope ColonBlock ${
4582                 $0 = calloc(1,sizeof(struct casepart));
4583                 $0->value = $<Ex;
4584                 $0->action = $<Bl;
4585                 var_block_close(c, CloseParallel, $0->action);
4586         }$
4587
4588         $*exec
4589         // These scopes are closed in CondStatement
4590         ForPart -> for OpenBlock ${
4591                 $0 = $<Bl;
4592         }$
4593
4594         ThenPart -> then OpenBlock ${
4595                 $0 = $<OB;
4596                 var_block_close(c, CloseSequential, $0);
4597         }$
4598
4599         $*binode
4600         // This scope is closed in CondStatement
4601         WhilePart -> while UseBlock OptNL do OpenBlock ${
4602                 $0 = new(binode);
4603                 $0->op = Loop;
4604                 $0->left = $<UB;
4605                 $0->right = $<OB;
4606                 var_block_close(c, CloseSequential, $0->right);
4607                 var_block_close(c, CloseSequential, $0);
4608         }$
4609         | while OpenScope Expression OpenScope ColonBlock ${
4610                 $0 = new(binode);
4611                 $0->op = Loop;
4612                 $0->left = $<Exp;
4613                 $0->right = $<CB;
4614                 var_block_close(c, CloseSequential, $0->right);
4615                 var_block_close(c, CloseSequential, $0);
4616         }$
4617
4618         $cond_statement
4619         IfPart -> if UseBlock OptNL then OpenBlock ${
4620                 $0.condpart = $<UB;
4621                 $0.thenpart = $<OB;
4622                 var_block_close(c, CloseParallel, $0.thenpart);
4623         }$
4624         | if OpenScope Expression OpenScope ColonBlock ${
4625                 $0.condpart = $<Ex;
4626                 $0.thenpart = $<CB;
4627                 var_block_close(c, CloseParallel, $0.thenpart);
4628         }$
4629         | if OpenScope Expression OpenScope OptNL then Block ${
4630                 $0.condpart = $<Ex;
4631                 $0.thenpart = $<Bl;
4632                 var_block_close(c, CloseParallel, $0.thenpart);
4633         }$
4634
4635         $*exec
4636         // This scope is closed in CondStatement
4637         SwitchPart -> switch OpenScope Expression ${
4638                 $0 = $<Ex;
4639         }$
4640         | switch UseBlock ${
4641                 $0 = $<Bl;
4642         }$
4643
4644 ###### print binode cases
4645         case Loop:
4646                 if (b->left && b->left->type == Xbinode &&
4647                     cast(binode, b->left)->op == Block) {
4648                         if (bracket)
4649                                 do_indent(indent, "while {\n");
4650                         else
4651                                 do_indent(indent, "while\n");
4652                         print_exec(b->left, indent+1, bracket);
4653                         if (bracket)
4654                                 do_indent(indent, "} do {\n");
4655                         else
4656                                 do_indent(indent, "do\n");
4657                         print_exec(b->right, indent+1, bracket);
4658                         if (bracket)
4659                                 do_indent(indent, "}\n");
4660                 } else {
4661                         do_indent(indent, "while ");
4662                         print_exec(b->left, 0, bracket);
4663                         if (bracket)
4664                                 printf(" {\n");
4665                         else
4666                                 printf(":\n");
4667                         print_exec(b->right, indent+1, bracket);
4668                         if (bracket)
4669                                 do_indent(indent, "}\n");
4670                 }
4671                 break;
4672
4673 ###### print exec cases
4674
4675         case Xcond_statement:
4676         {
4677                 struct cond_statement *cs = cast(cond_statement, e);
4678                 struct casepart *cp;
4679                 if (cs->forpart) {
4680                         do_indent(indent, "for");
4681                         if (bracket) printf(" {\n"); else printf("\n");
4682                         print_exec(cs->forpart, indent+1, bracket);
4683                         if (cs->thenpart) {
4684                                 if (bracket)
4685                                         do_indent(indent, "} then {\n");
4686                                 else
4687                                         do_indent(indent, "then\n");
4688                                 print_exec(cs->thenpart, indent+1, bracket);
4689                         }
4690                         if (bracket) do_indent(indent, "}\n");
4691                 }
4692                 if (cs->looppart) {
4693                         print_exec(cs->looppart, indent, bracket);
4694                 } else {
4695                         // a condition
4696                         if (cs->casepart)
4697                                 do_indent(indent, "switch");
4698                         else
4699                                 do_indent(indent, "if");
4700                         if (cs->condpart && cs->condpart->type == Xbinode &&
4701                             cast(binode, cs->condpart)->op == Block) {
4702                                 if (bracket)
4703                                         printf(" {\n");
4704                                 else
4705                                         printf("\n");
4706                                 print_exec(cs->condpart, indent+1, bracket);
4707                                 if (bracket)
4708                                         do_indent(indent, "}\n");
4709                                 if (cs->thenpart) {
4710                                         do_indent(indent, "then\n");
4711                                         print_exec(cs->thenpart, indent+1, bracket);
4712                                 }
4713                         } else {
4714                                 printf(" ");
4715                                 print_exec(cs->condpart, 0, bracket);
4716                                 if (cs->thenpart) {
4717                                         if (bracket)
4718                                                 printf(" {\n");
4719                                         else
4720                                                 printf(":\n");
4721                                         print_exec(cs->thenpart, indent+1, bracket);
4722                                         if (bracket)
4723                                                 do_indent(indent, "}\n");
4724                                 } else
4725                                         printf("\n");
4726                         }
4727                 }
4728                 for (cp = cs->casepart; cp; cp = cp->next) {
4729                         do_indent(indent, "case ");
4730                         print_exec(cp->value, -1, 0);
4731                         if (bracket)
4732                                 printf(" {\n");
4733                         else
4734                                 printf(":\n");
4735                         print_exec(cp->action, indent+1, bracket);
4736                         if (bracket)
4737                                 do_indent(indent, "}\n");
4738                 }
4739                 if (cs->elsepart) {
4740                         do_indent(indent, "else");
4741                         if (bracket)
4742                                 printf(" {\n");
4743                         else
4744                                 printf("\n");
4745                         print_exec(cs->elsepart, indent+1, bracket);
4746                         if (bracket)
4747                                 do_indent(indent, "}\n");
4748                 }
4749                 break;
4750         }
4751
4752 ###### propagate binode cases
4753         case Loop:
4754                 t = propagate_types(b->right, c, perr, Tnone, 0);
4755                 if (!type_compat(Tnone, t, 0))
4756                         *perr |= Efail; // UNTESTED
4757                 return propagate_types(b->left, c, perr, type, rules);
4758
4759 ###### propagate exec cases
4760         case Xcond_statement:
4761         {
4762                 // forpart and looppart->right must return Tnone
4763                 // thenpart must return Tnone if there is a loopart,
4764                 // otherwise it is like elsepart.
4765                 // condpart must:
4766                 //    be bool if there is no casepart
4767                 //    match casepart->values if there is a switchpart
4768                 //    either be bool or match casepart->value if there
4769                 //             is a whilepart
4770                 // elsepart and casepart->action must match the return type
4771                 //   expected of this statement.
4772                 struct cond_statement *cs = cast(cond_statement, prog);
4773                 struct casepart *cp;
4774
4775                 t = propagate_types(cs->forpart, c, perr, Tnone, 0);
4776                 if (!type_compat(Tnone, t, 0))
4777                         *perr |= Efail; // UNTESTED
4778
4779                 if (cs->looppart) {
4780                         t = propagate_types(cs->thenpart, c, perr, Tnone, 0);
4781                         if (!type_compat(Tnone, t, 0))
4782                                 *perr |= Efail; // UNTESTED
4783                 }
4784                 if (cs->casepart == NULL) {
4785                         propagate_types(cs->condpart, c, perr, Tbool, 0);
4786                         propagate_types(cs->looppart, c, perr, Tbool, 0);
4787                 } else {
4788                         /* Condpart must match case values, with bool permitted */
4789                         t = NULL;
4790                         for (cp = cs->casepart;
4791                              cp && !t; cp = cp->next)
4792                                 t = propagate_types(cp->value, c, perr, NULL, 0);
4793                         if (!t && cs->condpart)
4794                                 t = propagate_types(cs->condpart, c, perr, NULL, Rboolok);      // UNTESTED
4795                         if (!t && cs->looppart)
4796                                 t = propagate_types(cs->looppart, c, perr, NULL, Rboolok);      // UNTESTED
4797                         // Now we have a type (I hope) push it down
4798                         if (t) {
4799                                 for (cp = cs->casepart; cp; cp = cp->next)
4800                                         propagate_types(cp->value, c, perr, t, 0);
4801                                 propagate_types(cs->condpart, c, perr, t, Rboolok);
4802                                 propagate_types(cs->looppart, c, perr, t, Rboolok);
4803                         }
4804                 }
4805                 // (if)then, else, and case parts must return expected type.
4806                 if (!cs->looppart && !type)
4807                         type = propagate_types(cs->thenpart, c, perr, NULL, rules);
4808                 if (!type)
4809                         type = propagate_types(cs->elsepart, c, perr, NULL, rules);
4810                 for (cp = cs->casepart;
4811                      cp && !type;
4812                      cp = cp->next)     // UNTESTED
4813                         type = propagate_types(cp->action, c, perr, NULL, rules);       // UNTESTED
4814                 if (type) {
4815                         if (!cs->looppart)
4816                                 propagate_types(cs->thenpart, c, perr, type, rules);
4817                         propagate_types(cs->elsepart, c, perr, type, rules);
4818                         for (cp = cs->casepart; cp ; cp = cp->next)
4819                                 propagate_types(cp->action, c, perr, type, rules);
4820                         return type;
4821                 } else
4822                         return NULL;
4823         }
4824
4825 ###### interp binode cases
4826         case Loop:
4827                 // This just performs one iterration of the loop
4828                 rv = interp_exec(c, b->left, &rvtype);
4829                 if (rvtype == Tnone ||
4830                     (rvtype == Tbool && rv.bool != 0))
4831                         // rvtype is Tnone or Tbool, doesn't need to be freed
4832                         interp_exec(c, b->right, NULL);
4833                 break;
4834
4835 ###### interp exec cases
4836         case Xcond_statement:
4837         {
4838                 struct value v, cnd;
4839                 struct type *vtype, *cndtype;
4840                 struct casepart *cp;
4841                 struct cond_statement *cs = cast(cond_statement, e);
4842
4843                 if (cs->forpart)
4844                         interp_exec(c, cs->forpart, NULL);
4845                 if (cs->looppart) {
4846                         while ((cnd = interp_exec(c, cs->looppart, &cndtype)),
4847                                cndtype == Tnone || (cndtype == Tbool && cnd.bool != 0))
4848                                 interp_exec(c, cs->thenpart, NULL);
4849                 } else {
4850                         cnd = interp_exec(c, cs->condpart, &cndtype);
4851                         if ((cndtype == Tnone ||
4852                             (cndtype == Tbool && cnd.bool != 0))) {
4853                                 // cnd is Tnone or Tbool, doesn't need to be freed
4854                                 rv = interp_exec(c, cs->thenpart, &rvtype);
4855                                 // skip else (and cases)
4856                                 goto Xcond_done;
4857                         }
4858                 }
4859                 for (cp = cs->casepart; cp; cp = cp->next) {
4860                         v = interp_exec(c, cp->value, &vtype);
4861                         if (value_cmp(cndtype, vtype, &v, &cnd) == 0) {
4862                                 free_value(vtype, &v);
4863                                 free_value(cndtype, &cnd);
4864                                 rv = interp_exec(c, cp->action, &rvtype);
4865                                 goto Xcond_done;
4866                         }
4867                         free_value(vtype, &v);
4868                 }
4869                 free_value(cndtype, &cnd);
4870                 if (cs->elsepart)
4871                         rv = interp_exec(c, cs->elsepart, &rvtype);
4872                 else
4873                         rvtype = Tnone;
4874         Xcond_done:
4875                 break;
4876         }
4877
4878 ### Top level structure
4879
4880 All the language elements so far can be used in various places.  Now
4881 it is time to clarify what those places are.
4882
4883 At the top level of a file there will be a number of declarations.
4884 Many of the things that can be declared haven't been described yet,
4885 such as functions, procedures, imports, and probably more.
4886 For now there are two sorts of things that can appear at the top
4887 level.  They are predefined constants, `struct` types, and the `main`
4888 function.  While the syntax will allow the `main` function to appear
4889 multiple times, that will trigger an error if it is actually attempted.
4890
4891 The various declarations do not return anything.  They store the
4892 various declarations in the parse context.
4893
4894 ###### Parser: grammar
4895
4896         $void
4897         Ocean -> OptNL DeclarationList
4898
4899         ## declare terminals
4900
4901         OptNL ->
4902         | OptNL NEWLINE
4903
4904         Newlines -> NEWLINE
4905         | Newlines NEWLINE
4906
4907         DeclarationList -> Declaration
4908         | DeclarationList Declaration
4909
4910         Declaration -> ERROR Newlines ${
4911                 tok_err(c,      // UNTESTED
4912                         "error: unhandled parse error", &$1);
4913         }$
4914         | DeclareConstant
4915         | DeclareFunction
4916         | DeclareStruct
4917
4918         ## top level grammar
4919
4920         ## Grammar
4921
4922 ### The `const` section
4923
4924 As well as being defined in with the code that uses them, constants can
4925 be declared at the top level.  These have full-file scope, so they are
4926 always `InScope`, even before(!) they have been declared.  The value of
4927 a top level constant can be given as an expression, and this is
4928 evaluated after parsing and before execution.
4929
4930 A function call can be used to evaluate a constant, but it will not have
4931 access to any program state, once such statement becomes meaningful.
4932 e.g.  arguments and filesystem will not be visible.
4933
4934 Constants are defined in a section that starts with the reserved word
4935 `const` and then has a block with a list of assignment statements.
4936 For syntactic consistency, these must use the double-colon syntax to
4937 make it clear that they are constants.  Type can also be given: if
4938 not, the type will be determined during analysis, as with other
4939 constants.
4940
4941 ###### parse context
4942         struct binode *constlist;
4943
4944 ###### top level grammar
4945
4946         $TERM const
4947
4948         DeclareConstant -> const { IN OptNL ConstList OUT OptNL } Newlines
4949         | const { SimpleConstList } Newlines
4950         | const IN OptNL ConstList OUT Newlines
4951         | const SimpleConstList Newlines
4952
4953         ConstList -> ConstList SimpleConstLine
4954         | SimpleConstLine
4955
4956         SimpleConstList -> SimpleConstList ; Const
4957         | Const
4958         | SimpleConstList ;
4959
4960         SimpleConstLine -> SimpleConstList Newlines
4961         | ERROR Newlines ${ tok_err(c, "Syntax error in constant", &$1); }$
4962
4963         $*type
4964         CType -> Type   ${ $0 = $<1; }$
4965         |               ${ $0 = NULL; }$
4966
4967         $void
4968         Const -> IDENTIFIER :: CType = Expression ${ {
4969                 struct variable *v;
4970                 struct binode *bl, *bv;
4971                 struct var *var = new_pos(var, $ID);
4972
4973                 v = var_decl(c, $ID.txt);
4974                 if (v) {
4975                         v->where_decl = var;
4976                         v->where_set = var;
4977                         v->type = $<CT;
4978                         v->constant = 1;
4979                         v->global = 1;
4980                 } else {
4981                         v = var_ref(c, $1.txt);
4982                         if (v->type == Tnone) {
4983                                 v->where_decl = var;
4984                                 v->where_set = var;
4985                                 v->type = $<CT;
4986                                 v->constant = 1;
4987                                 v->global = 1;
4988                         } else {
4989                                 tok_err(c, "error: name already declared", &$1);
4990                                 type_err(c, "info: this is where '%v' was first declared",
4991                                          v->where_decl, NULL, 0, NULL);
4992                         }
4993                 }
4994                 var->var = v;
4995
4996                 bv = new(binode);
4997                 bv->op = Declare;
4998                 bv->left = var;
4999                 bv->right= $<Exp;
5000
5001                 bl = new(binode);
5002                 bl->op = List;
5003                 bl->left = c->constlist;
5004                 bl->right = bv;
5005                 c->constlist = bl;
5006         } }$
5007
5008 ###### core functions
5009         static void resolve_consts(struct parse_context *c)
5010         {
5011                 struct binode *b;
5012                 int retry = 1;
5013                 enum { none, some, cannot } progress = none;
5014
5015                 c->constlist = reorder_bilist(c->constlist);
5016                 while (retry) {
5017                         retry = 0;
5018                         for (b = cast(binode, c->constlist); b;
5019                              b = cast(binode, b->right)) {
5020                                 enum prop_err perr;
5021                                 struct binode *vb = cast(binode, b->left);
5022                                 struct var *v = cast(var, vb->left);
5023                                 if (v->var->frame_pos >= 0)
5024                                         continue;
5025                                 do {
5026                                         perr = 0;
5027                                         propagate_types(vb->right, c, &perr,
5028                                                         v->var->type, 0);
5029                                 } while (perr & Eretry);
5030                                 if (perr & Efail)
5031                                         c->parse_error += 1;
5032                                 else if (!(perr & Enoconst)) {
5033                                         progress = some;
5034                                         struct value res = interp_exec(
5035                                                 c, vb->right, &v->var->type);
5036                                         global_alloc(c, v->var->type, v->var, &res);
5037                                 } else {
5038                                         if (progress == cannot)
5039                                                 type_err(c, "error: const %v cannot be resolved.",
5040                                                          v, NULL, 0, NULL);
5041                                         else
5042                                                 retry = 1;
5043                                 }
5044                         }
5045                         switch (progress) {
5046                         case cannot:
5047                                 retry = 0; break;
5048                         case none:
5049                                 progress = cannot; break;
5050                         case some:
5051                                 progress = none; break;
5052                         }
5053                 }
5054         }
5055
5056 ###### print const decls
5057         {
5058                 struct binode *b;
5059                 int first = 1;
5060
5061                 for (b = cast(binode, context.constlist); b;
5062                      b = cast(binode, b->right)) {
5063                         struct binode *vb = cast(binode, b->left);
5064                         struct var *vr = cast(var, vb->left);
5065                         struct variable *v = vr->var;
5066
5067                         if (first)
5068                                 printf("const\n");
5069                         first = 0;
5070
5071                         printf("    %.*s :: ", v->name->name.len, v->name->name.txt);
5072                         type_print(v->type, stdout);
5073                         printf(" = ");
5074                         print_exec(vb->right, -1, 0);
5075                         printf("\n");
5076                 }
5077         }
5078
5079 ###### free const decls
5080         free_binode(context.constlist);
5081
5082 ### Function declarations
5083
5084 The code in an Ocean program is all stored in function declarations.
5085 One of the functions must be named `main` and it must accept an array of
5086 strings as a parameter - the command line arguments.
5087
5088 As this is the top level, several things are handled a bit differently.
5089 The function is not interpreted by `interp_exec` as that isn't passed
5090 the argument list which the program requires.  Similarly type analysis
5091 is a bit more interesting at this level.
5092
5093 ###### ast functions
5094
5095         static struct type *handle_results(struct parse_context *c,
5096                                            struct binode *results)
5097         {
5098                 /* Create a 'struct' type from the results list, which
5099                  * is a list for 'struct var'
5100                  */
5101                 struct type *t = add_anon_type(c, &structure_prototype,
5102                                                " function result");
5103                 int cnt = 0;
5104                 struct binode *b;
5105
5106                 for (b = results; b; b = cast(binode, b->right))
5107                         cnt += 1;
5108                 t->structure.nfields = cnt;
5109                 t->structure.fields = calloc(cnt, sizeof(struct field));
5110                 cnt = 0;
5111                 for (b = results; b; b = cast(binode, b->right)) {
5112                         struct var *v = cast(var, b->left);
5113                         struct field *f = &t->structure.fields[cnt++];
5114                         int a = v->var->type->align;
5115                         f->name = v->var->name->name;
5116                         f->type = v->var->type;
5117                         f->init = NULL;
5118                         f->offset = t->size;
5119                         v->var->frame_pos = f->offset;
5120                         t->size += ((f->type->size - 1) | (a-1)) + 1;
5121                         if (a > t->align)
5122                                 t->align = a;
5123                         variable_unlink_exec(v->var);
5124                 }
5125                 free_binode(results);
5126                 return t;
5127         }
5128
5129         static struct variable *declare_function(struct parse_context *c,
5130                                                 struct variable *name,
5131                                                 struct binode *args,
5132                                                 struct type *ret,
5133                                                 struct binode *results,
5134                                                 struct exec *code)
5135         {
5136                 if (name) {
5137                         struct value fn = {.function = code};
5138                         struct type *t;
5139                         var_block_close(c, CloseFunction, code);
5140                         t = add_anon_type(c, &function_prototype,
5141                                           "func %.*s", name->name->name.len,
5142                                           name->name->name.txt);
5143                         name->type = t;
5144                         t->function.params = reorder_bilist(args);
5145                         if (!ret) {
5146                                 ret = handle_results(c, reorder_bilist(results));
5147                                 t->function.inline_result = 1;
5148                                 t->function.local_size = ret->size;
5149                         }
5150                         t->function.return_type = ret;
5151                         global_alloc(c, t, name, &fn);
5152                         name->type->function.scope = c->out_scope;
5153                 } else {
5154                         free_binode(args);
5155                         free_type(ret);
5156                         free_exec(code);
5157                         var_block_close(c, CloseFunction, NULL);
5158                 }
5159                 c->out_scope = NULL;
5160                 return name;
5161         }
5162
5163 ###### declare terminals
5164         $TERM return
5165
5166 ###### top level grammar
5167
5168         $*variable
5169         DeclareFunction -> func FuncName ( OpenScope ArgsLine ) Block Newlines ${
5170                 $0 = declare_function(c, $<FN, $<Ar, Tnone, NULL, $<Bl);
5171         }$
5172         | func FuncName IN OpenScope Args OUT OptNL do Block Newlines ${
5173                 $0 = declare_function(c, $<FN, $<Ar, Tnone, NULL, $<Bl);
5174         }$
5175         | func FuncName NEWLINE OpenScope OptNL do Block Newlines ${
5176                 $0 = declare_function(c, $<FN, NULL, Tnone, NULL, $<Bl);
5177         }$
5178         | func FuncName ( OpenScope ArgsLine ) : Type Block Newlines ${
5179                 $0 = declare_function(c, $<FN, $<Ar, $<Ty, NULL, $<Bl);
5180         }$
5181         | func FuncName ( OpenScope ArgsLine ) : ( ArgsLine ) Block Newlines ${
5182                 $0 = declare_function(c, $<FN, $<AL, NULL, $<AL2, $<Bl);
5183         }$
5184         | func FuncName IN OpenScope Args OUT OptNL return Type Newlines do Block Newlines ${
5185                 $0 = declare_function(c, $<FN, $<Ar, $<Ty, NULL, $<Bl);
5186         }$
5187         | func FuncName NEWLINE OpenScope return Type Newlines do Block Newlines ${
5188                 $0 = declare_function(c, $<FN, NULL, $<Ty, NULL, $<Bl);
5189         }$
5190         | func FuncName IN OpenScope Args OUT OptNL return IN Args OUT OptNL do Block Newlines ${
5191                 $0 = declare_function(c, $<FN, $<Ar, NULL, $<Ar2, $<Bl);
5192         }$
5193         | func FuncName NEWLINE OpenScope return IN Args OUT OptNL do Block Newlines ${
5194                 $0 = declare_function(c, $<FN, NULL, NULL, $<Ar, $<Bl);
5195         }$
5196
5197 ###### print func decls
5198         {
5199                 struct variable *v;
5200                 int target = -1;
5201
5202                 while (target != 0) {
5203                         int i = 0;
5204                         for (v = context.in_scope; v; v=v->in_scope)
5205                                 if (v->depth == 0 && v->type && v->type->check_args) {
5206                                         i += 1;
5207                                         if (i == target)
5208                                                 break;
5209                                 }
5210
5211                         if (target == -1) {
5212                                 target = i;
5213                         } else {
5214                                 struct value *val = var_value(&context, v);
5215                                 printf("func %.*s", v->name->name.len, v->name->name.txt);
5216                                 v->type->print_type_decl(v->type, stdout);
5217                                 if (brackets)
5218                                         print_exec(val->function, 0, brackets);
5219                                 else
5220                                         print_value(v->type, val, stdout);
5221                                 printf("/* frame size %d */\n", v->type->function.local_size);
5222                                 target -= 1;
5223                         }
5224                 }
5225         }
5226
5227 ###### core functions
5228
5229         static int analyse_funcs(struct parse_context *c)
5230         {
5231                 struct variable *v;
5232                 int all_ok = 1;
5233                 for (v = c->in_scope; v; v = v->in_scope) {
5234                         struct value *val;
5235                         struct type *ret;
5236                         enum prop_err perr;
5237                         if (v->depth != 0 || !v->type || !v->type->check_args)
5238                                 continue;
5239                         ret = v->type->function.inline_result ?
5240                                 Tnone : v->type->function.return_type;
5241                         val = var_value(c, v);
5242                         do {
5243                                 perr = 0;
5244                                 propagate_types(val->function, c, &perr, ret, 0);
5245                         } while (!(perr & Efail) && (perr & Eretry));
5246                         if (!(perr & Efail))
5247                                 /* Make sure everything is still consistent */
5248                                 propagate_types(val->function, c, &perr, ret, 0);
5249                         if (perr & Efail)
5250                                 all_ok = 0;
5251                         if (!v->type->function.inline_result &&
5252                             !v->type->function.return_type->dup) {
5253                                 type_err(c, "error: function cannot return value of type %1",
5254                                          v->where_decl, v->type->function.return_type, 0, NULL);
5255                         }
5256
5257                         scope_finalize(c, v->type);
5258                 }
5259                 return all_ok;
5260         }
5261
5262         static int analyse_main(struct type *type, struct parse_context *c)
5263         {
5264                 struct binode *bp = type->function.params;
5265                 struct binode *b;
5266                 enum prop_err perr;
5267                 int arg = 0;
5268                 struct type *argv_type;
5269
5270                 argv_type = add_anon_type(c, &array_prototype, "argv");
5271                 argv_type->array.member = Tstr;
5272                 argv_type->array.unspec = 1;
5273
5274                 for (b = bp; b; b = cast(binode, b->right)) {
5275                         perr = 0;
5276                         switch (arg++) {
5277                         case 0: /* argv */
5278                                 propagate_types(b->left, c, &perr, argv_type, 0);
5279                                 break;
5280                         default: /* invalid */  // NOTEST
5281                                 propagate_types(b->left, c, &perr, Tnone, 0);   // NOTEST
5282                         }
5283                         if (perr & Efail)
5284                                 c->parse_error += 1;
5285                 }
5286
5287                 return !c->parse_error;
5288         }
5289
5290         static void interp_main(struct parse_context *c, int argc, char **argv)
5291         {
5292                 struct value *progp = NULL;
5293                 struct text main_name = { "main", 4 };
5294                 struct variable *mainv;
5295                 struct binode *al;
5296                 int anum = 0;
5297                 struct value v;
5298                 struct type *vtype;
5299
5300                 mainv = var_ref(c, main_name);
5301                 if (mainv)
5302                         progp = var_value(c, mainv);
5303                 if (!progp || !progp->function) {
5304                         fprintf(stderr, "oceani: no main function found.\n");
5305                         c->parse_error += 1;
5306                         return;
5307                 }
5308                 if (!analyse_main(mainv->type, c)) {
5309                         fprintf(stderr, "oceani: main has wrong type.\n");
5310                         c->parse_error += 1;
5311                         return;
5312                 }
5313                 al = mainv->type->function.params;
5314
5315                 c->local_size = mainv->type->function.local_size;
5316                 c->local = calloc(1, c->local_size);
5317                 while (al) {
5318                         struct var *v = cast(var, al->left);
5319                         struct value *vl = var_value(c, v->var);
5320                         struct value arg;
5321                         struct type *t;
5322                         mpq_t argcq;
5323                         int i;
5324
5325                         switch (anum++) {
5326                         case 0: /* argv */
5327                                 t = v->var->type;
5328                                 mpq_init(argcq);
5329                                 mpq_set_ui(argcq, argc, 1);
5330                                 memcpy(var_value(c, t->array.vsize), &argcq, sizeof(argcq));
5331                                 t->prepare_type(c, t, 0);
5332                                 array_init(v->var->type, vl);
5333                                 for (i = 0; i < argc; i++) {
5334                                         struct value *vl2 = vl->array + i * v->var->type->array.member->size;
5335
5336                                         arg.str.txt = argv[i];
5337                                         arg.str.len = strlen(argv[i]);
5338                                         free_value(Tstr, vl2);
5339                                         dup_value(Tstr, &arg, vl2);
5340                                 }
5341                                 break;
5342                         }
5343                         al = cast(binode, al->right);
5344                 }
5345                 v = interp_exec(c, progp->function, &vtype);
5346                 free_value(vtype, &v);
5347                 free(c->local);
5348                 c->local = NULL;
5349         }
5350
5351 ###### ast functions
5352         void free_variable(struct variable *v)
5353         {
5354         }
5355
5356 ## And now to test it out.
5357
5358 Having a language requires having a "hello world" program.  I'll
5359 provide a little more than that: a program that prints "Hello world"
5360 finds the GCD of two numbers, prints the first few elements of
5361 Fibonacci, performs a binary search for a number, and a few other
5362 things which will likely grow as the languages grows.
5363
5364 ###### File: oceani.mk
5365         demos :: sayhello
5366         sayhello : oceani
5367                 @echo "===== DEMO ====="
5368                 ./oceani --section "demo: hello" oceani.mdc 55 33
5369
5370 ###### demo: hello
5371
5372         const
5373                 pi ::= 3.141_592_6
5374                 four ::= 2 + 2 ; five ::= 10/2
5375         const pie ::= "I like Pie";
5376                 cake ::= "The cake is"
5377                   ++ " a lie"
5378
5379         struct fred
5380                 size:[four]number
5381                 name:string
5382                 alive:Boolean
5383
5384         func main(argv:[argc::]string)
5385                 print "Hello World, what lovely oceans you have!"
5386                 print "Are there", five, "?"
5387                 print pi, pie, "but", cake
5388
5389                 A := $argv[1]; B := $argv[2]
5390
5391                 /* When a variable is defined in both branches of an 'if',
5392                  * and used afterwards, the variables are merged.
5393                  */
5394                 if A > B:
5395                         bigger := "yes"
5396                 else
5397                         bigger := "no"
5398                 print "Is", A, "bigger than", B,"? ", bigger
5399                 /* If a variable is not used after the 'if', no
5400                  * merge happens, so types can be different
5401                  */
5402                 if A > B * 2:
5403                         double:string = "yes"
5404                         print A, "is more than twice", B, "?", double
5405                 else
5406                         double := B*2
5407                         print "double", B, "is", double
5408
5409                 a : number
5410                 a = A;
5411                 b:number = B
5412                 if a > 0 and then b > 0:
5413                         while a != b:
5414                                 if a < b:
5415                                         b = b - a
5416                                 else
5417                                         a = a - b
5418                         print "GCD of", A, "and", B,"is", a
5419                 else if a <= 0:
5420                         print a, "is not positive, cannot calculate GCD"
5421                 else
5422                         print b, "is not positive, cannot calculate GCD"
5423
5424                 for
5425                         togo := 10
5426                         f1 := 1; f2 := 1
5427                         print "Fibonacci:", f1,f2,
5428                 then togo = togo - 1
5429                 while togo > 0:
5430                         f3 := f1 + f2
5431                         print "", f3,
5432                         f1 = f2
5433                         f2 = f3
5434                 print ""
5435
5436                 /* Binary search... */
5437                 for
5438                         lo:= 0; hi := 100
5439                         target := 77
5440                 while
5441                         mid := (lo + hi) / 2
5442                         if mid == target:
5443                                 use Found
5444                         if mid < target:
5445                                 lo = mid
5446                         else
5447                                 hi = mid
5448                         if hi - lo < 1:
5449                                 lo = mid
5450                                 use GiveUp
5451                         use True
5452                 do pass
5453                 case Found:
5454                         print "Yay, I found", target
5455                 case GiveUp:
5456                         print "Closest I found was", lo
5457
5458                 size::= 10
5459                 list:[size]number
5460                 list[0] = 1234
5461                 // "middle square" PRNG.  Not particularly good, but one my
5462                 // Dad taught me - the first one I ever heard of.
5463                 for i:=1; then i = i + 1; while i < size:
5464                         n := list[i-1] * list[i-1]
5465                         list[i] = (n / 100) % 10 000
5466
5467                 print "Before sort:",
5468                 for i:=0; then i = i + 1; while i < size:
5469                         print "", list[i],
5470                 print
5471
5472                 for i := 1; then i=i+1; while i < size:
5473                         for j:=i-1; then j=j-1; while j >= 0:
5474                                 if list[j] > list[j+1]:
5475                                         t:= list[j]
5476                                         list[j] = list[j+1]
5477                                         list[j+1] = t
5478                 print " After sort:",
5479                 for i:=0; then i = i + 1; while i < size:
5480                         print "", list[i],
5481                 print
5482
5483                 if 1 == 2 then print "yes"; else print "no"
5484
5485                 bob:fred
5486                 bob.name = "Hello"
5487                 bob.alive = (bob.name == "Hello")
5488                 print "bob", "is" if  bob.alive else "isn't", "alive"