ocean-lang.org Git - ocean/blob - csrc/oceani.mdc

   1 # Ocean Interpreter - Jamison Creek version
   2
   3 Ocean is intended to be a compiled language, so this interpreter is
   4 not targeted at being the final product.  It is, rather, an intermediate
   5 stage and fills that role in two distinct ways.
   6
   7 Firstly, it exists as a platform to experiment with the early language
   8 design.  An interpreter is easy to write and easy to get working, so
   9 the barrier for entry is lower if I aim to start with an interpreter.
  10
  11 Secondly, the plan for the Ocean compiler is to write it in the
  12 [Ocean language](http://ocean-lang.org).  To achieve this we naturally
  13 need some sort of boot-strap process and this interpreter - written in
  14 portable C - will fill that role.  It will be used to bootstrap the
  15 Ocean compiler.
  16
  17 Two features that are not needed to fill either of these roles are
  18 performance and completeness.  The interpreter only needs to be fast
  19 enough to run small test programs and occasionally to run the compiler
  20 on itself.  It only needs to be complete enough to test aspects of the
  21 design which are developed before the compiler is working, and to run
  22 the compiler on itself.  Any features not used by the compiler when
  23 compiling itself are superfluous.  They may be included anyway, but
  24 they may not.
  25
  26 Nonetheless, the interpreter should end up being reasonably complete,
  27 and any performance bottlenecks which appear and are easily fixed, will
  28 be.
  29
  30 ## Current version
  31
  32 This third version of the interpreter exists to test out some initial
  33 ideas relating to types.  Particularly it adds arrays (indexed from
  34 zero) and simple structures.  Basic control flow and variable scoping
  35 are already fairly well established, as are basic numerical and
  36 boolean operators.
  37
  38 Some operators that have only recently been added, and so have not
  39 generated all that much experience yet are "and then" and "or else" as
  40 short-circuit Boolean operators, and the "if ... else" trinary
  41 operator which can select between two expressions based on a third
  42 (which appears syntactically in the middle).
  43
  44 The "func" clause currently only allows a "main" function to be
  45 declared.  That will be extended when proper function support is added.
  46
  47 An element that is present purely to make a usable language, and
  48 without any expectation that they will remain, is the "print" statement
  49 which performs simple output.
  50
  51 The current scalar types are "number", "Boolean", and "string".
  52 Boolean will likely stay in its current form, the other two might, but
  53 could just as easily be changed.
  54
  55 ## Naming
  56
  57 Versions of the interpreter which obviously do not support a complete
  58 language will be named after creeks and streams.  This one is Jamison
  59 Creek.
  60
  61 Once we have something reasonably resembling a complete language, the
  62 names of rivers will be used.
  63 Early versions of the compiler will be named after seas.  Major
  64 releases of the compiler will be named after oceans.  Hopefully I will
  65 be finished once I get to the Pacific Ocean release.
  66
  67 ## Outline
  68
  69 As well as parsing and executing a program, the interpreter can print
  70 out the program from the parsed internal structure.  This is useful
  71 for validating the parsing.
  72 So the main requirements of the interpreter are:
  73
  74 - Parse the program, possibly with tracing,
  75 - Analyse the parsed program to ensure consistency,
  76 - Print the program,
  77 - Execute the "main" function in the program, if no parsing or
  78   consistency errors were found.
  79
  80 This is all performed by a single C program extracted with
  81 `parsergen`.
  82
  83 There will be two formats for printing the program: a default and one
  84 that uses bracketing.  So a `--bracket` command line option is needed
  85 for that.  Normally the first code section found is used, however an
  86 alternate section can be requested so that a file (such as this one)
  87 can contain multiple programs.  This is effected with the `--section`
  88 option.
  89
  90 This code must be compiled with `-fplan9-extensions` so that anonymous
  91 structures can be used.
  92
  93 ###### File: oceani.mk
  94
  95         myCFLAGS := -Wall -g -fplan9-extensions
  96         CFLAGS := $(filter-out $(myCFLAGS),$(CFLAGS)) $(myCFLAGS)
  97         myLDLIBS:= libparser.o libscanner.o libmdcode.o -licuuc
  98         LDLIBS := $(filter-out $(myLDLIBS),$(LDLIBS)) $(myLDLIBS)
  99         ## libs
 100         all :: $(LDLIBS) oceani
 101         oceani.c oceani.h : oceani.mdc parsergen
 102                 ./parsergen -o oceani --LALR --tag Parser oceani.mdc
 103         oceani.mk: oceani.mdc md2c
 104                 ./md2c oceani.mdc
 105
 106         oceani: oceani.o $(LDLIBS)
 107                 $(CC) $(CFLAGS) -o oceani oceani.o $(LDLIBS)
 108
 109 ###### Parser: header
 110         ## macros
 111         struct parse_context;
 112         ## ast
 113         struct parse_context {
 114                 struct token_config config;
 115                 char *file_name;
 116                 int parse_error;
 117                 ## parse context
 118         };
 119
 120 ###### macros
 121
 122         #define container_of(ptr, type, member) ({                      \
 123                 const typeof( ((type *)0)->member ) *__mptr = (ptr);    \
 124                 (type *)( (char *)__mptr - offsetof(type,member) );})
 125
 126         #define config2context(_conf) container_of(_conf, struct parse_context, \
 127                 config)
 128
 129 ###### Parser: reduce
 130         struct parse_context *c = config2context(config);
 131
 132 ###### Parser: code
 133         #define _GNU_SOURCE
 134         #include <unistd.h>
 135         #include <stdlib.h>
 136         #include <fcntl.h>
 137         #include <errno.h>
 138         #include <sys/mman.h>
 139         #include <string.h>
 140         #include <stdio.h>
 141         #include <locale.h>
 142         #include <malloc.h>
 143         #include "mdcode.h"
 144         #include "scanner.h"
 145         #include "parser.h"
 146
 147         ## includes
 148
 149         #include "oceani.h"
 150
 151         ## forward decls
 152         ## value functions
 153         ## ast functions
 154         ## core functions
 155
 156         #include <getopt.h>
 157         static char Usage[] =
 158                 "Usage: oceani --trace --print --noexec --brackets --section=SectionName prog.ocn\n";
 159         static const struct option long_options[] = {
 160                 {"trace",     0, NULL, 't'},
 161                 {"print",     0, NULL, 'p'},
 162                 {"noexec",    0, NULL, 'n'},
 163                 {"brackets",  0, NULL, 'b'},
 164                 {"section",   1, NULL, 's'},
 165                 {NULL,        0, NULL, 0},
 166         };
 167         const char *options = "tpnbs";
 168
 169         static void pr_err(char *msg)                   // NOTEST
 170         {
 171                 fprintf(stderr, "%s\n", msg);           // NOTEST
 172         }                                               // NOTEST
 173
 174         int main(int argc, char *argv[])
 175         {
 176                 int fd;
 177                 int len;
 178                 char *file;
 179                 struct section *s = NULL, *ss;
 180                 char *section = NULL;
 181                 struct parse_context context = {
 182                         .config = {
 183                                 .ignored = (1 << TK_mark),
 184                                 .number_chars = ".,_+- ",
 185                                 .word_start = "_",
 186                                 .word_cont = "_",
 187                         },
 188                 };
 189                 int doprint=0, dotrace=0, doexec=1, brackets=0;
 190                 int opt;
 191                 while ((opt = getopt_long(argc, argv, options, long_options, NULL))
 192                        != -1) {
 193                         switch(opt) {
 194                         case 't': dotrace=1; break;
 195                         case 'p': doprint=1; break;
 196                         case 'n': doexec=0; break;
 197                         case 'b': brackets=1; break;
 198                         case 's': section = optarg; break;
 199                         default: fprintf(stderr, Usage);
 200                                 exit(1);
 201                         }
 202                 }
 203                 if (optind >= argc) {
 204                         fprintf(stderr, "oceani: no input file given\n");
 205                         exit(1);
 206                 }
 207                 fd = open(argv[optind], O_RDONLY);
 208                 if (fd < 0) {
 209                         fprintf(stderr, "oceani: cannot open %s\n", argv[optind]);
 210                         exit(1);
 211                 }
 212                 context.file_name = argv[optind];
 213                 len = lseek(fd, 0, 2);
 214                 file = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
 215                 s = code_extract(file, file+len, pr_err);
 216                 if (!s) {
 217                         fprintf(stderr, "oceani: could not find any code in %s\n",
 218                                 argv[optind]);
 219                         exit(1);
 220                 }
 221
 222                 ## context initialization
 223
 224                 if (section) {
 225                         for (ss = s; ss; ss = ss->next) {
 226                                 struct text sec = ss->section;
 227                                 if (sec.len == strlen(section) &&
 228                                     strncmp(sec.txt, section, sec.len) == 0)
 229                                         break;
 230                         }
 231                         if (!ss) {
 232                                 fprintf(stderr, "oceani: cannot find section %s\n",
 233                                         section);
 234                                 goto cleanup;
 235                         }
 236                 } else
 237                         ss = s;                         // NOTEST
 238                 if (!ss->code) {
 239                         fprintf(stderr, "oceani: no code found in requested section\n");        // NOTEST
 240                         goto cleanup;                   // NOTEST
 241                 }
 242
 243                 parse_oceani(ss->code, &context.config, dotrace ? stderr : NULL);
 244
 245                 if (!context.parse_error && !analyse_funcs(&context)) {
 246                         fprintf(stderr, "oceani: type error in program - not running.\n");
 247                         context.parse_error = 1;
 248                 }
 249
 250                 if (doprint) {
 251                         ## print const decls
 252                         ## print type decls
 253                         ## print func decls
 254                 }
 255                 if (doexec && !context.parse_error)
 256                         interp_main(&context, argc - optind, argv + optind);
 257         cleanup:
 258                 while (s) {
 259                         struct section *t = s->next;
 260                         code_free(s->code);
 261                         free(s);
 262                         s = t;
 263                 }
 264                 // FIXME parser should pop scope even on error
 265                 while (context.scope_depth > 0)
 266                         scope_pop(&context);
 267                 ## free global vars
 268                 ## free context types
 269                 ## free context storage
 270                 exit(context.parse_error ? 1 : 0);
 271         }
 272
 273 ### Analysis
 274
 275 The four requirements of parse, analyse, print, interpret apply to
 276 each language element individually so that is how most of the code
 277 will be structured.
 278
 279 Three of the four are fairly self explanatory.  The one that requires
 280 a little explanation is the analysis step.
 281
 282 The current language design does not require the types of variables to
 283 be declared, but they must still have a single type.  Different
 284 operations impose different requirements on the variables, for example
 285 addition requires both arguments to be numeric, and assignment
 286 requires the variable on the left to have the same type as the
 287 expression on the right.
 288
 289 Analysis involves propagating these type requirements around and
 290 consequently setting the type of each variable.  If any requirements
 291 are violated (e.g. a string is compared with a number) or if a
 292 variable needs to have two different types, then an error is raised
 293 and the program will not run.
 294
 295 If the same variable is declared in both branchs of an 'if/else', or
 296 in all cases of a 'switch' then the multiple instances may be merged
 297 into just one variable if the variable is referenced after the
 298 conditional statement.  When this happens, the types must naturally be
 299 consistent across all the branches.  When the variable is not used
 300 outside the if, the variables in the different branches are distinct
 301 and can be of different types.
 302
 303 Undeclared names may only appear in "use" statements and "case" expressions.
 304 These names are given a type of "label" and a unique value.
 305 This allows them to fill the role of a name in an enumerated type, which
 306 is useful for testing the `switch` statement.
 307
 308 As we will see, the condition part of a `while` statement can return
 309 either a Boolean or some other type.  This requires that the expected
 310 type that gets passed around comprises a type and a flag to indicate
 311 that `Tbool` is also permitted.
 312
 313 As there are, as yet, no distinct types that are compatible, there
 314 isn't much subtlety in the analysis.  When we have distinct number
 315 types, this will become more interesting.
 316
 317 #### Error reporting
 318
 319 When analysis discovers an inconsistency it needs to report an error;
 320 just refusing to run the code ensures that the error doesn't cascade,
 321 but by itself it isn't very useful.  A clear understanding of the sort
 322 of error message that are useful will help guide the process of
 323 analysis.
 324
 325 At a simplistic level, the only sort of error that type analysis can
 326 report is that the type of some construct doesn't match a contextual
 327 requirement.  For example, in `4 + "hello"` the addition provides a
 328 contextual requirement for numbers, but `"hello"` is not a number.  In
 329 this particular example no further information is needed as the types
 330 are obvious from local information.  When a variable is involved that
 331 isn't the case.  It may be helpful to explain why the variable has a
 332 particular type, by indicating the location where the type was set,
 333 whether by declaration or usage.
 334
 335 Using a recursive-descent analysis we can easily detect a problem at
 336 multiple locations. In "`hello:= "there"; 4 + hello`" the addition
 337 will detect that one argument is not a number and the usage of `hello`
 338 will detect that a number was wanted, but not provided.  In this
 339 (early) version of the language, we will generate error reports at
 340 multiple locations, so the use of `hello` will report an error and
 341 explain were the value was set, and the addition will report an error
 342 and say why numbers are needed.  To be able to report locations for
 343 errors, each language element will need to record a file location
 344 (line and column) and each variable will need to record the language
 345 element where its type was set.  For now we will assume that each line
 346 of an error message indicates one location in the file, and up to 2
 347 types.  So we provide a `printf`-like function which takes a format, a
 348 location (a `struct exec` which has not yet been introduced), and 2
 349 types. "`%1`" reports the first type, "`%2`" reports the second.  We
 350 will need a function to print the location, once we know how that is
 351 stored. e As will be explained later, there are sometimes extra rules for
 352 type matching and they might affect error messages, we need to pass those
 353 in too.
 354
 355 As well as type errors, we sometimes need to report problems with
 356 tokens, which might be unexpected or might name a type that has not
 357 been defined.  For these we have `tok_err()` which reports an error
 358 with a given token.  Each of the error functions sets the flag in the
 359 context so indicate that parsing failed.
 360
 361 ###### forward decls
 362
 363         static void fput_loc(struct exec *loc, FILE *f);
 364         static void type_err(struct parse_context *c,
 365                              char *fmt, struct exec *loc,
 366                              struct type *t1, int rules, struct type *t2);
 367
 368 ###### core functions
 369
 370         static void type_err(struct parse_context *c,
 371                              char *fmt, struct exec *loc,
 372                              struct type *t1, int rules, struct type *t2)
 373         {
 374                 fprintf(stderr, "%s:", c->file_name);
 375                 fput_loc(loc, stderr);
 376                 for (; *fmt ; fmt++) {
 377                         if (*fmt != '%') {
 378                                 fputc(*fmt, stderr);
 379                                 continue;
 380                         }
 381                         fmt++;
 382                         switch (*fmt) {
 383                         case '%': fputc(*fmt, stderr); break;   // NOTEST
 384                         default: fputc('?', stderr); break;     // NOTEST
 385                         case '1':
 386                                 type_print(t1, stderr);
 387                                 break;
 388                         case '2':
 389                                 type_print(t2, stderr);
 390                                 break;
 391                         ## format cases
 392                         }
 393                 }
 394                 fputs("\n", stderr);
 395                 c->parse_error = 1;
 396         }
 397
 398         static void tok_err(struct parse_context *c, char *fmt, struct token *t)
 399         {
 400                 fprintf(stderr, "%s:%d:%d: %s: %.*s\n", c->file_name, t->line, t->col, fmt,
 401                         t->txt.len, t->txt.txt);
 402                 c->parse_error = 1;
 403         }
 404
 405 ## Entities: declared and predeclared.
 406
 407 There are various "things" that the language and/or the interpreter
 408 needs to know about to parse and execute a program.  These include
 409 types, variables, values, and executable code.  These are all lumped
 410 together under the term "entities" (calling them "objects" would be
 411 confusing) and introduced here.  The following section will present the
 412 different specific code elements which comprise or manipulate these
 413 various entities.
 414
 415 ### Executables
 416
 417 Executables can be lots of different things.  In many cases an
 418 executable is just an operation combined with one or two other
 419 executables.  This allows for expressions and lists etc.  Other times an
 420 executable is something quite specific like a constant or variable name.
 421 So we define a `struct exec` to be a general executable with a type, and
 422 a `struct binode` which is a subclass of `exec`, forms a node in a
 423 binary tree, and holds an operation.  There will be other subclasses,
 424 and to access these we need to be able to `cast` the `exec` into the
 425 various other types.  The first field in any `struct exec` is the type
 426 from the `exec_types` enum.
 427
 428 ###### macros
 429         #define cast(structname, pointer) ({            \
 430                 const typeof( ((struct structname *)0)->type) *__mptr = &(pointer)->type; \
 431                 if (__mptr && *__mptr != X##structname) abort();                \
 432                 (struct structname *)( (char *)__mptr);})
 433
 434         #define new(structname) ({                                              \
 435                 struct structname *__ptr = ((struct structname *)calloc(1,sizeof(struct structname))); \
 436                 __ptr->type = X##structname;                                            \
 437                 __ptr->line = -1; __ptr->column = -1;                                   \
 438                 __ptr;})
 439
 440         #define new_pos(structname, token) ({                                           \
 441                 struct structname *__ptr = ((struct structname *)calloc(1,sizeof(struct structname))); \
 442                 __ptr->type = X##structname;                                            \
 443                 __ptr->line = token.line; __ptr->column = token.col;                    \
 444                 __ptr;})
 445
 446 ###### ast
 447         enum exec_types {
 448                 Xbinode,
 449                 ## exec type
 450         };
 451         struct exec {
 452                 enum exec_types type;
 453                 int line, column;
 454                 ## exec fields
 455         };
 456         struct binode {
 457                 struct exec;
 458                 enum Btype {
 459                         ## Binode types
 460                 } op;
 461                 struct exec *left, *right;
 462         };
 463
 464 ###### ast functions
 465
 466         static int __fput_loc(struct exec *loc, FILE *f)
 467         {
 468                 if (!loc)
 469                         return 0;
 470                 if (loc->line >= 0) {
 471                         fprintf(f, "%d:%d: ", loc->line, loc->column);
 472                         return 1;
 473                 }
 474                 if (loc->type == Xbinode)
 475                         return __fput_loc(cast(binode,loc)->left, f) ||
 476                                __fput_loc(cast(binode,loc)->right, f);  // NOTEST
 477                 return 0;
 478         }
 479         static void fput_loc(struct exec *loc, FILE *f)
 480         {
 481                 if (!__fput_loc(loc, f))
 482                         fprintf(f, "??:??: ");
 483         }
 484
 485 Each different type of `exec` node needs a number of functions defined,
 486 a bit like methods.  We must be able to free it, print it, analyse it
 487 and execute it.  Once we have specific `exec` types we will need to
 488 parse them too.  Let's take this a bit more slowly.
 489
 490 #### Freeing
 491
 492 The parser generator requires a `free_foo` function for each struct
 493 that stores attributes and they will often be `exec`s and subtypes
 494 there-of.  So we need `free_exec` which can handle all the subtypes,
 495 and we need `free_binode`.
 496
 497 ###### ast functions
 498
 499         static void free_binode(struct binode *b)
 500         {
 501                 if (!b)
 502                         return;
 503                 free_exec(b->left);
 504                 free_exec(b->right);
 505                 free(b);
 506         }
 507
 508 ###### core functions
 509         static void free_exec(struct exec *e)
 510         {
 511                 if (!e)
 512                         return;
 513                 switch(e->type) {
 514                         ## free exec cases
 515                 }
 516         }
 517
 518 ###### forward decls
 519
 520         static void free_exec(struct exec *e);
 521
 522 ###### free exec cases
 523         case Xbinode: free_binode(cast(binode, e)); break;
 524
 525 #### Printing
 526
 527 Printing an `exec` requires that we know the current indent level for
 528 printing line-oriented components.  As will become clear later, we
 529 also want to know what sort of bracketing to use.
 530
 531 ###### ast functions
 532
 533         static void do_indent(int i, char *str)
 534         {
 535                 while (i-- > 0)
 536                         printf("    ");
 537                 printf("%s", str);
 538         }
 539
 540 ###### core functions
 541         static void print_binode(struct binode *b, int indent, int bracket)
 542         {
 543                 struct binode *b2;
 544                 switch(b->op) {
 545                 ## print binode cases
 546                 }
 547         }
 548
 549         static void print_exec(struct exec *e, int indent, int bracket)
 550         {
 551                 if (!e)
 552                         return;
 553                 switch (e->type) {
 554                 case Xbinode:
 555                         print_binode(cast(binode, e), indent, bracket); break;
 556                 ## print exec cases
 557                 }
 558                 if (e->to_free) {
 559                         struct variable *v;
 560                         do_indent(indent, "/* FREE");
 561                         for (v = e->to_free; v; v = v->next_free) {
 562                                 printf(" %.*s", v->name->name.len, v->name->name.txt);
 563                                 printf("[%d,%d]", v->scope_start, v->scope_end);
 564                                 if (v->frame_pos >= 0)
 565                                         printf("(%d+%d)", v->frame_pos,
 566                                                v->type ? v->type->size:0);
 567                         }
 568                         printf(" */\n");
 569                 }
 570         }
 571
 572 ###### forward decls
 573
 574         static void print_exec(struct exec *e, int indent, int bracket);
 575
 576 #### Analysing
 577
 578 As discussed, analysis involves propagating type requirements around the
 579 program and looking for errors.
 580
 581 So `propagate_types` is passed an expected type (being a `struct type`
 582 pointer together with some `val_rules` flags) that the `exec` is
 583 expected to return, and returns the type that it does return, either
 584 of which can be `NULL` signifying "unknown".  An `ok` flag is passed
 585 by reference. It is set to `0` when an error is found, and `2` when
 586 any change is made.  If it remains unchanged at `1`, then no more
 587 propagation is needed.
 588
 589 ###### ast
 590
 591         enum val_rules {Rnolabel = 1<<0, Rboolok = 1<<1, Rnoconstant = 1<<2};
 592
 593 ###### format cases
 594         case 'r':
 595                 if (rules & Rnolabel)
 596                         fputs(" (labels not permitted)", stderr);
 597                 break;
 598
 599 ###### forward decls
 600         static struct type *propagate_types(struct exec *prog, struct parse_context *c, int *ok,
 601                                             struct type *type, int rules);
 602 ###### core functions
 603
 604         static struct type *__propagate_types(struct exec *prog, struct parse_context *c, int *ok,
 605                                               struct type *type, int rules)
 606         {
 607                 struct type *t;
 608
 609                 if (!prog)
 610                         return Tnone;
 611
 612                 switch (prog->type) {
 613                 case Xbinode:
 614                 {
 615                         struct binode *b = cast(binode, prog);
 616                         switch (b->op) {
 617                         ## propagate binode cases
 618                         }
 619                         break;
 620                 }
 621                 ## propagate exec cases
 622                 }
 623                 return Tnone;
 624         }
 625
 626         static struct type *propagate_types(struct exec *prog, struct parse_context *c, int *ok,
 627                                             struct type *type, int rules)
 628         {
 629                 struct type *ret = __propagate_types(prog, c, ok, type, rules);
 630
 631                 if (c->parse_error)
 632                         *ok = 0;
 633                 return ret;
 634         }
 635
 636 #### Interpreting
 637
 638 Interpreting an `exec` doesn't require anything but the `exec`.  State
 639 is stored in variables and each variable will be directly linked from
 640 within the `exec` tree.  The exception to this is the `main` function
 641 which needs to look at command line arguments.  This function will be
 642 interpreted separately.
 643
 644 Each `exec` can return a value combined with a type in `struct lrval`.
 645 The type may be `Tnone` but must be non-NULL.  Some `exec`s will return
 646 the location of a value, which can be updated, in `lval`.  Others will
 647 set `lval` to NULL indicating that there is a value of appropriate type
 648 in `rval`.
 649
 650 ###### forward decls
 651         static struct value interp_exec(struct parse_context *c, struct exec *e,
 652                                         struct type **typeret);
 653 ###### core functions
 654
 655         struct lrval {
 656                 struct type *type;
 657                 struct value rval, *lval;
 658         };
 659
 660         /* If dest is passed, dtype must give the expected type, and
 661          * result can go there, in which case type is returned as NULL.
 662          */
 663         static struct lrval _interp_exec(struct parse_context *c, struct exec *e,
 664                                          struct value *dest, struct type *dtype);
 665
 666         static struct value interp_exec(struct parse_context *c, struct exec *e,
 667                                         struct type **typeret)
 668         {
 669                 struct lrval ret = _interp_exec(c, e, NULL, NULL);
 670
 671                 if (!ret.type) abort();
 672                 if (typeret)
 673                         *typeret = ret.type;
 674                 if (ret.lval)
 675                         dup_value(ret.type, ret.lval, &ret.rval);
 676                 return ret.rval;
 677         }
 678
 679         static struct value *linterp_exec(struct parse_context *c, struct exec *e,
 680                                           struct type **typeret)
 681         {
 682                 struct lrval ret = _interp_exec(c, e, NULL, NULL);
 683
 684                 if (!ret.type) abort();
 685                 if (ret.lval)
 686                         *typeret = ret.type;
 687                 else
 688                         free_value(ret.type, &ret.rval);
 689                 return ret.lval;
 690         }
 691
 692         /* dinterp_exec is used when the destination type is certain and
 693          * the value has a place to go.
 694          */
 695         static void dinterp_exec(struct parse_context *c, struct exec *e,
 696                                  struct value *dest, struct type *dtype,
 697                                  int need_free)
 698         {
 699                 struct lrval ret = _interp_exec(c, e, dest, dtype);
 700                 if (!ret.type)
 701                         return;
 702                 if (need_free)
 703                         free_value(dtype, dest);
 704                 if (ret.lval)
 705                         dup_value(dtype, ret.lval, dest);
 706                 else
 707                         memcpy(dest, &ret.rval, dtype->size);
 708         }
 709
 710         static struct lrval _interp_exec(struct parse_context *c, struct exec *e,
 711                                          struct value *dest, struct type *dtype)
 712         {
 713                 /* If the result is copied to dest, ret.type is set to NULL */
 714                 struct lrval ret;
 715                 struct value rv = {}, *lrv = NULL;
 716                 struct type *rvtype;
 717
 718                 rvtype = ret.type = Tnone;
 719                 if (!e) {
 720                         ret.lval = lrv;
 721                         ret.rval = rv;
 722                         return ret;
 723                 }
 724
 725                 switch(e->type) {
 726                 case Xbinode:
 727                 {
 728                         struct binode *b = cast(binode, e);
 729                         struct value left, right, *lleft;
 730                         struct type *ltype, *rtype;
 731                         ltype = rtype = Tnone;
 732                         switch (b->op) {
 733                         ## interp binode cases
 734                         }
 735                         free_value(ltype, &left);
 736                         free_value(rtype, &right);
 737                         break;
 738                 }
 739                 ## interp exec cases
 740                 }
 741                 if (rvtype) {
 742                         ret.lval = lrv;
 743                         ret.rval = rv;
 744                         ret.type = rvtype;
 745                 }
 746                 ## interp exec cleanup
 747                 return ret;
 748         }
 749
 750 ### Types
 751
 752 Values come in a wide range of types, with more likely to be added.
 753 Each type needs to be able to print its own values (for convenience at
 754 least) as well as to compare two values, at least for equality and
 755 possibly for order.  For now, values might need to be duplicated and
 756 freed, though eventually such manipulations will be better integrated
 757 into the language.
 758
 759 Rather than requiring every numeric type to support all numeric
 760 operations (add, multiply, etc), we allow types to be able to present
 761 as one of a few standard types: integer, float, and fraction.  The
 762 existence of these conversion functions eventually enable types to
 763 determine if they are compatible with other types, though such types
 764 have not yet been implemented.
 765
 766 Named type are stored in a simple linked list.  Objects of each type are
 767 "values" which are often passed around by value.
 768
 769 There are both explicitly named types, and anonymous types.  Anonymous
 770 cannot be accessed by name, but are used internally and have a name
 771 which might be reported in error messages.
 772
 773 ###### ast
 774
 775         struct value {
 776                 union {
 777                         char ptr[1];
 778                         ## value union fields
 779                 };
 780         };
 781
 782         struct type {
 783                 struct text name;
 784                 struct type *next;
 785                 int size, align;
 786                 int anon;
 787                 void (*init)(struct type *type, struct value *val);
 788                 void (*prepare_type)(struct parse_context *c, struct type *type, int parse_time);
 789                 void (*print)(struct type *type, struct value *val, FILE *f);
 790                 void (*print_type)(struct type *type, FILE *f);
 791                 int (*cmp_order)(struct type *t1, struct type *t2,
 792                                  struct value *v1, struct value *v2);
 793                 int (*cmp_eq)(struct type *t1, struct type *t2,
 794                               struct value *v1, struct value *v2);
 795                 void (*dup)(struct type *type, struct value *vold, struct value *vnew);
 796                 void (*free)(struct type *type, struct value *val);
 797                 void (*free_type)(struct type *t);
 798                 long long (*to_int)(struct value *v);
 799                 double (*to_float)(struct value *v);
 800                 int (*to_mpq)(mpq_t *q, struct value *v);
 801                 ## type functions
 802                 union {
 803                         ## type union fields
 804                 };
 805         };
 806
 807 ###### parse context
 808
 809         struct type *typelist;
 810
 811 ###### includes
 812         #include <stdarg.h>
 813
 814 ###### ast functions
 815
 816         static struct type *find_type(struct parse_context *c, struct text s)
 817         {
 818                 struct type *t = c->typelist;
 819
 820                 while (t && (t->anon ||
 821                              text_cmp(t->name, s) != 0))
 822                                 t = t->next;
 823                 return t;
 824         }
 825
 826         static struct type *_add_type(struct parse_context *c, struct text s,
 827                                      struct type *proto, int anon)
 828         {
 829                 struct type *n;
 830
 831                 n = calloc(1, sizeof(*n));
 832                 *n = *proto;
 833                 n->name = s;
 834                 n->anon = anon;
 835                 n->next = c->typelist;
 836                 c->typelist = n;
 837                 return n;
 838         }
 839
 840         static struct type *add_type(struct parse_context *c, struct text s,
 841                                       struct type *proto)
 842         {
 843                 return _add_type(c, s, proto, 0);
 844         }
 845
 846         static struct type *add_anon_type(struct parse_context *c,
 847                                           struct type *proto, char *name, ...)
 848         {
 849                 struct text t;
 850                 va_list ap;
 851
 852                 va_start(ap, name);
 853                 vasprintf(&t.txt, name, ap);
 854                 va_end(ap);
 855                 t.len = strlen(name);
 856                 return _add_type(c, t, proto, 1);
 857         }
 858
 859         static void free_type(struct type *t)
 860         {
 861                 /* The type is always a reference to something in the
 862                  * context, so we don't need to free anything.
 863                  */
 864         }
 865
 866         static void free_value(struct type *type, struct value *v)
 867         {
 868                 if (type && v) {
 869                         type->free(type, v);
 870                         memset(v, 0x5a, type->size);
 871                 }
 872         }
 873
 874         static void type_print(struct type *type, FILE *f)
 875         {
 876                 if (!type)
 877                         fputs("*unknown*type*", f);     // NOTEST
 878                 else if (type->name.len && !type->anon)
 879                         fprintf(f, "%.*s", type->name.len, type->name.txt);
 880                 else if (type->print_type)
 881                         type->print_type(type, f);
 882                 else
 883                         fputs("*invalid*type*", f);
 884         }
 885
 886         static void val_init(struct type *type, struct value *val)
 887         {
 888                 if (type && type->init)
 889                         type->init(type, val);
 890         }
 891
 892         static void dup_value(struct type *type,
 893                               struct value *vold, struct value *vnew)
 894         {
 895                 if (type && type->dup)
 896                         type->dup(type, vold, vnew);
 897         }
 898
 899         static int value_cmp(struct type *tl, struct type *tr,
 900                              struct value *left, struct value *right)
 901         {
 902                 if (tl && tl->cmp_order)
 903                         return tl->cmp_order(tl, tr, left, right);
 904                 if (tl && tl->cmp_eq)                   // NOTEST
 905                         return tl->cmp_eq(tl, tr, left, right); // NOTEST
 906                 return -1;                              // NOTEST
 907         }
 908
 909         static void print_value(struct type *type, struct value *v, FILE *f)
 910         {
 911                 if (type && type->print)
 912                         type->print(type, v, f);
 913                 else
 914                         fprintf(f, "*Unknown*");                // NOTEST
 915         }
 916
 917 ###### forward decls
 918
 919         static void free_value(struct type *type, struct value *v);
 920         static int type_compat(struct type *require, struct type *have, int rules);
 921         static void type_print(struct type *type, FILE *f);
 922         static void val_init(struct type *type, struct value *v);
 923         static void dup_value(struct type *type,
 924                               struct value *vold, struct value *vnew);
 925         static int value_cmp(struct type *tl, struct type *tr,
 926                              struct value *left, struct value *right);
 927         static void print_value(struct type *type, struct value *v, FILE *f);
 928
 929 ###### free context types
 930
 931         while (context.typelist) {
 932                 struct type *t = context.typelist;
 933
 934                 context.typelist = t->next;
 935                 if (t->free_type)
 936                         t->free_type(t);
 937                 if (t->anon)
 938                         free(t->name.txt);
 939                 free(t);
 940         }
 941
 942 Type can be specified for local variables, for fields in a structure,
 943 for formal parameters to functions, and possibly elsewhere.  Different
 944 rules may apply in different contexts.  As a minimum, a named type may
 945 always be used.  Currently the type of a formal parameter can be
 946 different from types in other contexts, so we have a separate grammar
 947 symbol for those.
 948
 949 ###### Grammar
 950
 951         $*type
 952         Type -> IDENTIFIER ${
 953                 $0 = find_type(c, $1.txt);
 954                 if (!$0) {
 955                         tok_err(c,
 956                                 "error: undefined type", &$1);
 957
 958                         $0 = Tnone;
 959                 }
 960         }$
 961         ## type grammar
 962
 963         FormalType -> Type ${ $0 = $<1; }$
 964         ## formal type grammar
 965
 966 #### Base Types
 967
 968 Values of the base types can be numbers, which we represent as
 969 multi-precision fractions, strings, Booleans and labels.  When
 970 analysing the program we also need to allow for places where no value
 971 is meaningful (type `Tnone`) and where we don't know what type to
 972 expect yet (type is `NULL`).
 973
 974 Values are never shared, they are always copied when used, and freed
 975 when no longer needed.
 976
 977 When propagating type information around the program, we need to
 978 determine if two types are compatible, where type `NULL` is compatible
 979 with anything.  There are two special cases with type compatibility,
 980 both related to the Conditional Statement which will be described
 981 later.  In some cases a Boolean can be accepted as well as some other
 982 primary type, and in others any type is acceptable except a label (`Vlabel`).
 983 A separate function encoding these cases will simplify some code later.
 984
 985 ###### type functions
 986
 987         int (*compat)(struct type *this, struct type *other);
 988
 989 ###### ast functions
 990
 991         static int type_compat(struct type *require, struct type *have, int rules)
 992         {
 993                 if ((rules & Rboolok) && have == Tbool)
 994                         return 1;       // NOTEST
 995                 if ((rules & Rnolabel) && have == Tlabel)
 996                         return 0;       // NOTEST
 997                 if (!require || !have)
 998                         return 1;
 999
1000                 if (require->compat)
1001                         return require->compat(require, have);
1002
1003                 return require == have;
1004         }
1005
1006 ###### includes
1007         #include <gmp.h>
1008         #include "parse_string.h"
1009         #include "parse_number.h"
1010
1011 ###### libs
1012         myLDLIBS := libnumber.o libstring.o -lgmp
1013         LDLIBS := $(filter-out $(myLDLIBS),$(LDLIBS)) $(myLDLIBS)
1014
1015 ###### type union fields
1016         enum vtype {Vnone, Vstr, Vnum, Vbool, Vlabel} vtype;
1017
1018 ###### value union fields
1019         struct text str;
1020         mpq_t num;
1021         unsigned char bool;
1022         void *label;
1023
1024 ###### ast functions
1025         static void _free_value(struct type *type, struct value *v)
1026         {
1027                 if (!v)
1028                         return;         // NOTEST
1029                 switch (type->vtype) {
1030                 case Vnone: break;
1031                 case Vstr: free(v->str.txt); break;
1032                 case Vnum: mpq_clear(v->num); break;
1033                 case Vlabel:
1034                 case Vbool: break;
1035                 }
1036         }
1037
1038 ###### value functions
1039
1040         static void _val_init(struct type *type, struct value *val)
1041         {
1042                 switch(type->vtype) {
1043                 case Vnone:             // NOTEST
1044                         break;          // NOTEST
1045                 case Vnum:
1046                         mpq_init(val->num); break;
1047                 case Vstr:
1048                         val->str.txt = malloc(1);
1049                         val->str.len = 0;
1050                         break;
1051                 case Vbool:
1052                         val->bool = 0;
1053                         break;
1054                 case Vlabel:
1055                         val->label = NULL;
1056                         break;
1057                 }
1058         }
1059
1060         static void _dup_value(struct type *type,
1061                                struct value *vold, struct value *vnew)
1062         {
1063                 switch (type->vtype) {
1064                 case Vnone:             // NOTEST
1065                         break;          // NOTEST
1066                 case Vlabel:
1067                         vnew->label = vold->label;
1068                         break;
1069                 case Vbool:
1070                         vnew->bool = vold->bool;
1071                         break;
1072                 case Vnum:
1073                         mpq_init(vnew->num);
1074                         mpq_set(vnew->num, vold->num);
1075                         break;
1076                 case Vstr:
1077                         vnew->str.len = vold->str.len;
1078                         vnew->str.txt = malloc(vnew->str.len);
1079                         memcpy(vnew->str.txt, vold->str.txt, vnew->str.len);
1080                         break;
1081                 }
1082         }
1083
1084         static int _value_cmp(struct type *tl, struct type *tr,
1085                               struct value *left, struct value *right)
1086         {
1087                 int cmp;
1088                 if (tl != tr)
1089                         return tl - tr; // NOTEST
1090                 switch (tl->vtype) {
1091                 case Vlabel: cmp = left->label == right->label ? 0 : 1; break;
1092                 case Vnum: cmp = mpq_cmp(left->num, right->num); break;
1093                 case Vstr: cmp = text_cmp(left->str, right->str); break;
1094                 case Vbool: cmp = left->bool - right->bool; break;
1095                 case Vnone: cmp = 0;                    // NOTEST
1096                 }
1097                 return cmp;
1098         }
1099
1100         static void _print_value(struct type *type, struct value *v, FILE *f)
1101         {
1102                 switch (type->vtype) {
1103                 case Vnone:                             // NOTEST
1104                         fprintf(f, "*no-value*"); break;        // NOTEST
1105                 case Vlabel:                            // NOTEST
1106                         fprintf(f, "*label-%p*", v->label); break; // NOTEST
1107                 case Vstr:
1108                         fprintf(f, "%.*s", v->str.len, v->str.txt); break;
1109                 case Vbool:
1110                         fprintf(f, "%s", v->bool ? "True":"False"); break;
1111                 case Vnum:
1112                         {
1113                         mpf_t fl;
1114                         mpf_init2(fl, 20);
1115                         mpf_set_q(fl, v->num);
1116                         gmp_fprintf(f, "%Fg", fl);
1117                         mpf_clear(fl);
1118                         break;
1119                         }
1120                 }
1121         }
1122
1123         static void _free_value(struct type *type, struct value *v);
1124
1125         static struct type base_prototype = {
1126                 .init = _val_init,
1127                 .print = _print_value,
1128                 .cmp_order = _value_cmp,
1129                 .cmp_eq = _value_cmp,
1130                 .dup = _dup_value,
1131                 .free = _free_value,
1132         };
1133
1134         static struct type *Tbool, *Tstr, *Tnum, *Tnone, *Tlabel;
1135
1136 ###### ast functions
1137         static struct type *add_base_type(struct parse_context *c, char *n,
1138                                           enum vtype vt, int size)
1139         {
1140                 struct text txt = { n, strlen(n) };
1141                 struct type *t;
1142
1143                 t = add_type(c, txt, &base_prototype);
1144                 t->vtype = vt;
1145                 t->size = size;
1146                 t->align = size > sizeof(void*) ? sizeof(void*) : size;
1147                 if (t->size & (t->align - 1))
1148                         t->size = (t->size | (t->align - 1)) + 1;       // NOTEST
1149                 return t;
1150         }
1151
1152 ###### context initialization
1153
1154         Tbool  = add_base_type(&context, "Boolean", Vbool, sizeof(char));
1155         Tstr   = add_base_type(&context, "string", Vstr, sizeof(struct text));
1156         Tnum   = add_base_type(&context, "number", Vnum, sizeof(mpq_t));
1157         Tnone  = add_base_type(&context, "none", Vnone, 0);
1158         Tlabel = add_base_type(&context, "label", Vlabel, sizeof(void*));
1159
1160 ##### Base Values
1161
1162 We have already met values as separate objects.  When manifest constants
1163 appear in the program text, that must result in an executable which has
1164 a constant value.  So the `val` structure embeds a value in an
1165 executable.
1166
1167 ###### exec type
1168         Xval,
1169
1170 ###### ast
1171         struct val {
1172                 struct exec;
1173                 struct type *vtype;
1174                 struct value val;
1175         };
1176
1177 ###### ast functions
1178         struct val *new_val(struct type *T, struct token tk)
1179         {
1180                 struct val *v = new_pos(val, tk);
1181                 v->vtype = T;
1182                 return v;
1183         }
1184
1185 ###### Grammar
1186
1187         $TERM True False
1188
1189         $*val
1190         Value ->  True ${
1191                 $0 = new_val(Tbool, $1);
1192                 $0->val.bool = 1;
1193         }$
1194         | False ${
1195                 $0 = new_val(Tbool, $1);
1196                 $0->val.bool = 0;
1197         }$
1198         | NUMBER ${ {
1199                 char tail[3];
1200                 $0 = new_val(Tnum, $1);
1201                 if (number_parse($0->val.num, tail, $1.txt) == 0)
1202                         mpq_init($0->val.num);  // UNTESTED
1203                         if (tail[0])
1204                                 tok_err(c, "error: unsupported number suffix",
1205                                         &$1);
1206         } }$
1207         | STRING ${ {
1208                 char tail[3];
1209                 $0 = new_val(Tstr, $1);
1210                 string_parse(&$1, '\\', &$0->val.str, tail);
1211                 if (tail[0])
1212                         tok_err(c, "error: unsupported string suffix",
1213                                 &$1);
1214         } }$
1215         | MULTI_STRING ${ {
1216                 char tail[3];
1217                 $0 = new_val(Tstr, $1);
1218                 string_parse(&$1, '\\', &$0->val.str, tail);
1219                 if (tail[0])
1220                         tok_err(c, "error: unsupported string suffix",
1221                                 &$1);
1222         } }$
1223
1224 ###### print exec cases
1225         case Xval:
1226         {
1227                 struct val *v = cast(val, e);
1228                 if (v->vtype == Tstr)
1229                         printf("\"");
1230                 print_value(v->vtype, &v->val, stdout);
1231                 if (v->vtype == Tstr)
1232                         printf("\"");
1233                 break;
1234         }
1235
1236 ###### propagate exec cases
1237         case Xval:
1238         {
1239                 struct val *val = cast(val, prog);
1240                 if (!type_compat(type, val->vtype, rules))
1241                         type_err(c, "error: expected %1%r found %2",
1242                                    prog, type, rules, val->vtype);
1243                 return val->vtype;
1244         }
1245
1246 ###### interp exec cases
1247         case Xval:
1248                 rvtype = cast(val, e)->vtype;
1249                 dup_value(rvtype, &cast(val, e)->val, &rv);
1250                 break;
1251
1252 ###### ast functions
1253         static void free_val(struct val *v)
1254         {
1255                 if (v)
1256                         free_value(v->vtype, &v->val);
1257                 free(v);
1258         }
1259
1260 ###### free exec cases
1261         case Xval: free_val(cast(val, e)); break;
1262
1263 ###### ast functions
1264         // Move all nodes from 'b' to 'rv', reversing their order.
1265         // In 'b' 'left' is a list, and 'right' is the last node.
1266         // In 'rv', left' is the first node and 'right' is a list.
1267         static struct binode *reorder_bilist(struct binode *b)
1268         {
1269                 struct binode *rv = NULL;
1270
1271                 while (b) {
1272                         struct exec *t = b->right;
1273                         b->right = rv;
1274                         rv = b;
1275                         if (b->left)
1276                                 b = cast(binode, b->left);
1277                         else
1278                                 b = NULL;
1279                         rv->left = t;
1280                 }
1281                 return rv;
1282         }
1283
1284 ### Variables
1285
1286 Variables are scoped named values.  We store the names in a linked list
1287 of "bindings" sorted in lexical order, and use sequential search and
1288 insertion sort.
1289
1290 ###### ast
1291
1292         struct binding {
1293                 struct text name;
1294                 struct binding *next;   // in lexical order
1295                 ## binding fields
1296         };
1297
1298 This linked list is stored in the parse context so that "reduce"
1299 functions can find or add variables, and so the analysis phase can
1300 ensure that every variable gets a type.
1301
1302 ###### parse context
1303
1304         struct binding *varlist;  // In lexical order
1305
1306 ###### ast functions
1307
1308         static struct binding *find_binding(struct parse_context *c, struct text s)
1309         {
1310                 struct binding **l = &c->varlist;
1311                 struct binding *n;
1312                 int cmp = 1;
1313
1314                 while (*l &&
1315                         (cmp = text_cmp((*l)->name, s)) < 0)
1316                                 l = & (*l)->next;
1317                 if (cmp == 0)
1318                         return *l;
1319                 n = calloc(1, sizeof(*n));
1320                 n->name = s;
1321                 n->next = *l;
1322                 *l = n;
1323                 return n;
1324         }
1325
1326 Each name can be linked to multiple variables defined in different
1327 scopes.  Each scope starts where the name is declared and continues
1328 until the end of the containing code block.  Scopes of a given name
1329 cannot nest, so a declaration while a name is in-scope is an error.
1330
1331 ###### binding fields
1332         struct variable *var;
1333
1334 ###### ast
1335         struct variable {
1336                 struct variable *previous;
1337                 struct type *type;
1338                 struct binding *name;
1339                 struct exec *where_decl;// where name was declared
1340                 struct exec *where_set; // where type was set
1341                 ## variable fields
1342         };
1343
1344 When a scope closes, the values of the variables might need to be freed.
1345 This happens in the context of some `struct exec` and each `exec` will
1346 need to know which variables need to be freed when it completes.
1347
1348 ####### exec fields
1349         struct variable *to_free;
1350
1351 ####### variable fields
1352         struct exec *cleanup_exec;
1353         struct variable *next_free;
1354
1355 ####### interp exec cleanup
1356         {
1357                 struct variable *v;
1358                 for (v = e->to_free; v; v = v->next_free) {
1359                         struct value *val = var_value(c, v);
1360                         free_value(v->type, val);
1361                 }
1362         }
1363
1364 ###### ast functions
1365         static void variable_unlink_exec(struct variable *v)
1366         {
1367                 struct variable **vp;
1368                 if (!v->cleanup_exec)
1369                         return;
1370                 for (vp = &v->cleanup_exec->to_free;
1371                     *vp; vp = &(*vp)->next_free) {
1372                         if (*vp != v)
1373                                 continue;
1374                         *vp = v->next_free;
1375                         v->cleanup_exec = NULL;
1376                         break;
1377                 }
1378         }
1379
1380 While the naming seems strange, we include local constants in the
1381 definition of variables.  A name declared `var := value` can
1382 subsequently be changed, but a name declared `var ::= value` cannot -
1383 it is constant
1384
1385 ###### variable fields
1386         int constant;
1387
1388 Scopes in parallel branches can be partially merged.  More
1389 specifically, if a given name is declared in both branches of an
1390 if/else then its scope is a candidate for merging.  Similarly if
1391 every branch of an exhaustive switch (e.g. has an "else" clause)
1392 declares a given name, then the scopes from the branches are
1393 candidates for merging.
1394
1395 Note that names declared inside a loop (which is only parallel to
1396 itself) are never visible after the loop.  Similarly names defined in
1397 scopes which are not parallel, such as those started by `for` and
1398 `switch`, are never visible after the scope.  Only variables defined in
1399 both `then` and `else` (including the implicit then after an `if`, and
1400 excluding `then` used with `for`) and in all `case`s and `else` of a
1401 `switch` or `while` can be visible beyond the `if`/`switch`/`while`.
1402
1403 Labels, which are a bit like variables, follow different rules.
1404 Labels are not explicitly declared, but if an undeclared name appears
1405 in a context where a label is legal, that effectively declares the
1406 name as a label.  The declaration remains in force (or in scope) at
1407 least to the end of the immediately containing block and conditionally
1408 in any larger containing block which does not declare the name in some
1409 other way.  Importantly, the conditional scope extension happens even
1410 if the label is only used in one parallel branch of a conditional --
1411 when used in one branch it is treated as having been declared in all
1412 branches.
1413
1414 Merge candidates are tentatively visible beyond the end of the
1415 branching statement which creates them.  If the name is used, the
1416 merge is affirmed and they become a single variable visible at the
1417 outer layer.  If not - if it is redeclared first - the merge lapses.
1418
1419 To track scopes we have an extra stack, implemented as a linked list,
1420 which roughly parallels the parse stack and which is used exclusively
1421 for scoping.  When a new scope is opened, a new frame is pushed and
1422 the child-count of the parent frame is incremented.  This child-count
1423 is used to distinguish between the first of a set of parallel scopes,
1424 in which declared variables must not be in scope, and subsequent
1425 branches, whether they may already be conditionally scoped.
1426
1427 We need a total ordering of scopes so we can easily compare to variables
1428 to see if they are concurrently in scope.  To achieve this we record a
1429 `scope_count` which is actually a count of both beginnings and endings
1430 of scopes.  Then each variable has a record of the scope count where it
1431 enters scope, and where it leaves.
1432
1433 To push a new frame *before* any code in the frame is parsed, we need a
1434 grammar reduction.  This is most easily achieved with a grammar
1435 element which derives the empty string, and creates the new scope when
1436 it is recognised.  This can be placed, for example, between a keyword
1437 like "if" and the code following it.
1438
1439 ###### ast
1440         struct scope {
1441                 struct scope *parent;
1442                 int child_count;
1443         };
1444
1445 ###### parse context
1446         int scope_depth;
1447         int scope_count;
1448         struct scope *scope_stack;
1449
1450 ###### variable fields
1451         int scope_start, scope_end;
1452
1453 ###### ast functions
1454         static void scope_pop(struct parse_context *c)
1455         {
1456                 struct scope *s = c->scope_stack;
1457
1458                 c->scope_stack = s->parent;
1459                 free(s);
1460                 c->scope_depth -= 1;
1461                 c->scope_count += 1;
1462         }
1463
1464         static void scope_push(struct parse_context *c)
1465         {
1466                 struct scope *s = calloc(1, sizeof(*s));
1467                 if (c->scope_stack)
1468                         c->scope_stack->child_count += 1;
1469                 s->parent = c->scope_stack;
1470                 c->scope_stack = s;
1471                 c->scope_depth += 1;
1472                 c->scope_count += 1;
1473         }
1474
1475 ###### Grammar
1476
1477         $void
1478         OpenScope -> ${ scope_push(c); }$
1479
1480 Each variable records a scope depth and is in one of four states:
1481
1482 - "in scope".  This is the case between the declaration of the
1483   variable and the end of the containing block, and also between
1484   the usage with affirms a merge and the end of that block.
1485
1486   The scope depth is not greater than the current parse context scope
1487   nest depth.  When the block of that depth closes, the state will
1488   change.  To achieve this, all "in scope" variables are linked
1489   together as a stack in nesting order.
1490
1491 - "pending".  The "in scope" block has closed, but other parallel
1492   scopes are still being processed.  So far, every parallel block at
1493   the same level that has closed has declared the name.
1494
1495   The scope depth is the depth of the last parallel block that
1496   enclosed the declaration, and that has closed.
1497
1498 - "conditionally in scope".  The "in scope" block and all parallel
1499   scopes have closed, and no further mention of the name has been seen.
1500   This state includes a secondary nest depth (`min_depth`) which records
1501   the outermost scope seen since the variable became conditionally in
1502   scope.  If a use of the name is found, the variable becomes "in scope"
1503   and that secondary depth becomes the recorded scope depth.  If the
1504   name is declared as a new variable, the old variable becomes "out of
1505   scope" and the recorded scope depth stays unchanged.
1506
1507 - "out of scope".  The variable is neither in scope nor conditionally
1508   in scope.  It is permanently out of scope now and can be removed from
1509   the "in scope" stack.  When a variable becomes out-of-scope it is
1510   moved to a separate list (`out_scope`) of variables which have fully
1511   known scope.  This will be used at the end of each function to assign
1512   each variable a place in the stack frame.
1513
1514 ###### variable fields
1515         int depth, min_depth;
1516         enum { OutScope, PendingScope, CondScope, InScope } scope;
1517         struct variable *in_scope;
1518
1519 ###### parse context
1520
1521         struct variable *in_scope;
1522         struct variable *out_scope;
1523
1524 All variables with the same name are linked together using the
1525 'previous' link.  Those variable that have been affirmatively merged all
1526 have a 'merged' pointer that points to one primary variable - the most
1527 recently declared instance.  When merging variables, we need to also
1528 adjust the 'merged' pointer on any other variables that had previously
1529 been merged with the one that will no longer be primary.
1530
1531 A variable that is no longer the most recent instance of a name may
1532 still have "pending" scope, if it might still be merged with most
1533 recent instance.  These variables don't really belong in the
1534 "in_scope" list, but are not immediately removed when a new instance
1535 is found.  Instead, they are detected and ignored when considering the
1536 list of in_scope names.
1537
1538 The storage of the value of a variable will be described later.  For now
1539 we just need to know that when a variable goes out of scope, it might
1540 need to be freed.  For this we need to be able to find it, so assume that
1541 `var_value()` will provide that.
1542
1543 ###### variable fields
1544         struct variable *merged;
1545
1546 ###### ast functions
1547
1548         static void variable_merge(struct variable *primary, struct variable *secondary)
1549         {
1550                 struct variable *v;
1551
1552                 primary = primary->merged;
1553
1554                 for (v = primary->previous; v; v=v->previous)
1555                         if (v == secondary || v == secondary->merged ||
1556                             v->merged == secondary ||
1557                             v->merged == secondary->merged) {
1558                                 v->scope = OutScope;
1559                                 v->merged = primary;
1560                                 if (v->scope_start < primary->scope_start)
1561                                         primary->scope_start = v->scope_start;
1562                                 if (v->scope_end > primary->scope_end)
1563                                         primary->scope_end = v->scope_end;      // NOTEST
1564                                 variable_unlink_exec(v);
1565                         }
1566         }
1567
1568 ###### forward decls
1569         static struct value *var_value(struct parse_context *c, struct variable *v);
1570
1571 ###### free global vars
1572
1573         while (context.varlist) {
1574                 struct binding *b = context.varlist;
1575                 struct variable *v = b->var;
1576                 context.varlist = b->next;
1577                 free(b);
1578                 while (v) {
1579                         struct variable *next = v->previous;
1580
1581                         if (v->global) {
1582                                 free_value(v->type, var_value(&context, v));
1583                                 if (v->depth == 0)
1584                                         // This is a global constant
1585                                         free_exec(v->where_decl);
1586                         }
1587                         free(v);
1588                         v = next;
1589                 }
1590         }
1591
1592 #### Manipulating Bindings
1593
1594 When a name is conditionally visible, a new declaration discards the old
1595 binding - the condition lapses.  Similarly when we reach the end of a
1596 function (outermost non-global scope) any conditional scope must lapse.
1597 Conversely a usage of the name affirms the visibility and extends it to
1598 the end of the containing block - i.e.  the block that contains both the
1599 original declaration and the latest usage.  This is determined from
1600 `min_depth`.  When a conditionally visible variable gets affirmed like
1601 this, it is also merged with other conditionally visible variables with
1602 the same name.
1603
1604 When we parse a variable declaration we either report an error if the
1605 name is currently bound, or create a new variable at the current nest
1606 depth if the name is unbound or bound to a conditionally scoped or
1607 pending-scope variable.  If the previous variable was conditionally
1608 scoped, it and its homonyms becomes out-of-scope.
1609
1610 When we parse a variable reference (including non-declarative assignment
1611 "foo = bar") we report an error if the name is not bound or is bound to
1612 a pending-scope variable; update the scope if the name is bound to a
1613 conditionally scoped variable; or just proceed normally if the named
1614 variable is in scope.
1615
1616 When we exit a scope, any variables bound at this level are either
1617 marked out of scope or pending-scoped, depending on whether the scope
1618 was sequential or parallel.  Here a "parallel" scope means the "then"
1619 or "else" part of a conditional, or any "case" or "else" branch of a
1620 switch.  Other scopes are "sequential".
1621
1622 When exiting a parallel scope we check if there are any variables that
1623 were previously pending and are still visible. If there are, then
1624 they weren't redeclared in the most recent scope, so they cannot be
1625 merged and must become out-of-scope.  If it is not the first of
1626 parallel scopes (based on `child_count`), we check that there was a
1627 previous binding that is still pending-scope.  If there isn't, the new
1628 variable must now be out-of-scope.
1629
1630 When exiting a sequential scope that immediately enclosed parallel
1631 scopes, we need to resolve any pending-scope variables.  If there was
1632 no `else` clause, and we cannot determine that the `switch` was exhaustive,
1633 we need to mark all pending-scope variable as out-of-scope.  Otherwise
1634 all pending-scope variables become conditionally scoped.
1635
1636 ###### ast
1637         enum closetype { CloseSequential, CloseFunction, CloseParallel, CloseElse };
1638
1639 ###### ast functions
1640
1641         static struct variable *var_decl(struct parse_context *c, struct text s)
1642         {
1643                 struct binding *b = find_binding(c, s);
1644                 struct variable *v = b->var;
1645
1646                 switch (v ? v->scope : OutScope) {
1647                 case InScope:
1648                         /* Caller will report the error */
1649                         return NULL;
1650                 case CondScope:
1651                         for (;
1652                              v && v->scope == CondScope;
1653                              v = v->previous)
1654                                 v->scope = OutScope;
1655                         break;
1656                 default: break;
1657                 }
1658                 v = calloc(1, sizeof(*v));
1659                 v->previous = b->var;
1660                 b->var = v;
1661                 v->name = b;
1662                 v->merged = v;
1663                 v->min_depth = v->depth = c->scope_depth;
1664                 v->scope = InScope;
1665                 v->in_scope = c->in_scope;
1666                 v->scope_start = c->scope_count;
1667                 c->in_scope = v;
1668                 ## variable init
1669                 return v;
1670         }
1671
1672         static struct variable *var_ref(struct parse_context *c, struct text s)
1673         {
1674                 struct binding *b = find_binding(c, s);
1675                 struct variable *v = b->var;
1676                 struct variable *v2;
1677
1678                 switch (v ? v->scope : OutScope) {
1679                 case OutScope:
1680                 case PendingScope:
1681                         /* Caller will report the error */
1682                         return NULL;
1683                 case CondScope:
1684                         /* All CondScope variables of this name need to be merged
1685                          * and become InScope
1686                          */
1687                         v->depth = v->min_depth;
1688                         v->scope = InScope;
1689                         for (v2 = v->previous;
1690                              v2 && v2->scope == CondScope;
1691                              v2 = v2->previous)
1692                                 variable_merge(v, v2);
1693                         break;
1694                 case InScope:
1695                         break;
1696                 }
1697                 return v;
1698         }
1699
1700         static int var_refile(struct parse_context *c, struct variable *v)
1701         {
1702                 /* Variable just went out of scope.  Add it to the out_scope
1703                  * list, sorted by ->scope_start
1704                  */
1705                 struct variable **vp = &c->out_scope;
1706                 while ((*vp) && (*vp)->scope_start < v->scope_start)
1707                         vp = &(*vp)->in_scope;
1708                 v->in_scope = *vp;
1709                 *vp = v;
1710                 return 0;
1711         }
1712
1713         static void var_block_close(struct parse_context *c, enum closetype ct,
1714                                     struct exec *e)
1715         {
1716                 /* Close off all variables that are in_scope.
1717                  * Some variables in c->scope may already be not-in-scope,
1718                  * such as when a PendingScope variable is hidden by a new
1719                  * variable with the same name.
1720                  * So we check for v->name->var != v and drop them.
1721                  * If we choose to make a variable OutScope, we drop it
1722                  * immediately too.
1723                  */
1724                 struct variable *v, **vp, *v2;
1725
1726                 scope_pop(c);
1727                 for (vp = &c->in_scope;
1728                      (v = *vp) && v->min_depth > c->scope_depth;
1729                      (v->scope == OutScope || v->name->var != v)
1730                      ? (*vp =  v->in_scope, var_refile(c, v))
1731                      : ( vp = &v->in_scope, 0)) {
1732                         v->min_depth = c->scope_depth;
1733                         if (v->name->var != v)
1734                                 /* This is still in scope, but we haven't just
1735                                  * closed the scope.
1736                                  */
1737                                 continue;
1738                         v->min_depth = c->scope_depth;
1739                         if (v->scope == InScope)
1740                                 v->scope_end = c->scope_count;
1741                         if (v->scope == InScope && e && !v->global) {
1742                                 /* This variable gets cleaned up when 'e' finishes */
1743                                 variable_unlink_exec(v);
1744                                 v->cleanup_exec = e;
1745                                 v->next_free = e->to_free;
1746                                 e->to_free = v;
1747                         }
1748                         switch (ct) {
1749                         case CloseElse:
1750                         case CloseParallel: /* handle PendingScope */
1751                                 switch(v->scope) {
1752                                 case InScope:
1753                                 case CondScope:
1754                                         if (c->scope_stack->child_count == 1)
1755                                                 /* first among parallel branches */
1756                                                 v->scope = PendingScope;
1757                                         else if (v->previous &&
1758                                                  v->previous->scope == PendingScope)
1759                                                 /* all previous branches used name */
1760                                                 v->scope = PendingScope;
1761                                         else if (v->type == Tlabel)
1762                                                 /* Labels remain pending even when not used */
1763                                                 v->scope = PendingScope;        // UNTESTED
1764                                         else
1765                                                 v->scope = OutScope;
1766                                         if (ct == CloseElse) {
1767                                                 /* All Pending variables with this name
1768                                                  * are now Conditional */
1769                                                 for (v2 = v;
1770                                                      v2 && v2->scope == PendingScope;
1771                                                      v2 = v2->previous)
1772                                                         v2->scope = CondScope;
1773                                         }
1774                                         break;
1775                                 case PendingScope:
1776                                         /* Not possible as it would require
1777                                          * parallel scope to be nested immediately
1778                                          * in a parallel scope, and that never
1779                                          * happens.
1780                                          */                     // NOTEST
1781                                 case OutScope:
1782                                         /* Not possible as we already tested for
1783                                          * OutScope
1784                                          */
1785                                         abort();                // NOTEST
1786                                 }
1787                                 break;
1788                         case CloseFunction:
1789                                 if (v->scope == CondScope)
1790                                         /* Condition cannot continue past end of function */
1791                                         v->scope = InScope;
1792                                 /* fallthrough */
1793                         case CloseSequential:
1794                                 if (v->type == Tlabel)
1795                                         v->scope = PendingScope;
1796                                 switch (v->scope) {
1797                                 case InScope:
1798                                         v->scope = OutScope;
1799                                         break;
1800                                 case PendingScope:
1801                                         /* There was no 'else', so we can only become
1802                                          * conditional if we know the cases were exhaustive,
1803                                          * and that doesn't mean anything yet.
1804                                          * So only labels become conditional..
1805                                          */
1806                                         for (v2 = v;
1807                                              v2 && v2->scope == PendingScope;
1808                                              v2 = v2->previous)
1809                                                 if (v2->type == Tlabel)
1810                                                         v2->scope = CondScope;
1811                                                 else
1812                                                         v2->scope = OutScope;
1813                                         break;
1814                                 case CondScope:
1815                                 case OutScope: break;
1816                                 }
1817                                 break;
1818                         }
1819                 }
1820         }
1821
1822 #### Storing Values
1823
1824 The value of a variable is store separately from the variable, on an
1825 analogue of a stack frame.  There are (currently) two frames that can be
1826 active.  A global frame which currently only stores constants, and a
1827 stacked frame which stores local variables.  Each variable knows if it
1828 is global or not, and what its index into the frame is.
1829
1830 Values in the global frame are known immediately they are relevant, so
1831 the frame needs to be reallocated as it grows so it can store those
1832 values.  The local frame doesn't get values until the interpreted phase
1833 is started, so there is no need to allocate until the size is known.
1834
1835 We initialize the `frame_pos` to an impossible value, so that we can
1836 tell if it was set or not later.
1837
1838 ###### variable fields
1839         short frame_pos;
1840         short global;
1841
1842 ###### variable init
1843         v->frame_pos = -1;
1844
1845 ###### parse context
1846
1847         short global_size, global_alloc;
1848         short local_size;
1849         void *global, *local;
1850
1851 ###### forward decls
1852         static struct value *global_alloc(struct parse_context *c, struct type *t,
1853                                           struct variable *v, struct value *init);
1854
1855 ###### ast functions
1856
1857         static struct value *var_value(struct parse_context *c, struct variable *v)
1858         {
1859                 if (!v->global) {
1860                         if (!c->local || !v->type)
1861                                 return NULL;                    // NOTEST
1862                         if (v->frame_pos + v->type->size > c->local_size) {
1863                                 printf("INVALID frame_pos\n");  // NOTEST
1864                                 exit(2);                        // NOTEST
1865                         }
1866                         return c->local + v->frame_pos;
1867                 }
1868                 if (c->global_size > c->global_alloc) {
1869                         int old = c->global_alloc;
1870                         c->global_alloc = (c->global_size | 1023) + 1024;
1871                         c->global = realloc(c->global, c->global_alloc);
1872                         memset(c->global + old, 0, c->global_alloc - old);
1873                 }
1874                 return c->global + v->frame_pos;
1875         }
1876
1877         static struct value *global_alloc(struct parse_context *c, struct type *t,
1878                                           struct variable *v, struct value *init)
1879         {
1880                 struct value *ret;
1881                 struct variable scratch;
1882
1883                 if (t->prepare_type)
1884                         t->prepare_type(c, t, 1);       // NOTEST
1885
1886                 if (c->global_size & (t->align - 1))
1887                         c->global_size = (c->global_size + t->align) & ~(t->align-1);
1888                 if (!v) {
1889                         v = &scratch;
1890                         v->type = t;
1891                 }
1892                 v->frame_pos = c->global_size;
1893                 v->global = 1;
1894                 c->global_size += v->type->size;
1895                 ret = var_value(c, v);
1896                 if (init)
1897                         memcpy(ret, init, t->size);
1898                 else
1899                         val_init(t, ret);
1900                 return ret;
1901         }
1902
1903 As global values are found -- struct field initializers, labels etc --
1904 `global_alloc()` is called to record the value in the global frame.
1905
1906 When the program is fully parsed, each function is analysed, we need to
1907 walk the list of variables local to that function and assign them an
1908 offset in the stack frame.  For this we have `scope_finalize()`.
1909
1910 We keep the stack from dense by re-using space for between variables
1911 that are not in scope at the same time.  The `out_scope` list is sorted
1912 by `scope_start` and as we process a varible, we move it to an FIFO
1913 stack.  For each variable we consider, we first discard any from the
1914 stack anything that went out of scope before the new variable came in.
1915 Then we place the new variable just after the one at the top of the
1916 stack.
1917
1918 ###### ast functions
1919
1920         static void scope_finalize(struct parse_context *c, struct type *ft)
1921         {
1922                 int size = ft->function.local_size;
1923                 struct variable *next = ft->function.scope;
1924                 struct variable *done = NULL;
1925
1926                 while (next) {
1927                         struct variable *v = next;
1928                         struct type *t = v->type;
1929                         int pos;
1930                         next = v->in_scope;
1931                         if (v->merged != v)
1932                                 continue;
1933                         if (!t)
1934                                 continue;
1935                         if (v->frame_pos >= 0)
1936                                 continue;
1937                         while (done && done->scope_end < v->scope_start)
1938                                 done = done->in_scope;
1939                         if (done)
1940                                 pos = done->frame_pos + done->type->size;
1941                         else
1942                                 pos = ft->function.local_size;
1943                         if (pos & (t->align - 1))
1944                                 pos = (pos + t->align) & ~(t->align-1);
1945                         v->frame_pos = pos;
1946                         if (size < pos + v->type->size)
1947                                 size = pos + v->type->size;
1948                         v->in_scope = done;
1949                         done = v;
1950                 }
1951                 c->out_scope = NULL;
1952                 ft->function.local_size = size;
1953         }
1954
1955 ###### free context storage
1956         free(context.global);
1957
1958 #### Variables as executables
1959
1960 Just as we used a `val` to wrap a value into an `exec`, we similarly
1961 need a `var` to wrap a `variable` into an exec.  While each `val`
1962 contained a copy of the value, each `var` holds a link to the variable
1963 because it really is the same variable no matter where it appears.
1964 When a variable is used, we need to remember to follow the `->merged`
1965 link to find the primary instance.
1966
1967 When a variable is declared, it may or may not be given an explicit
1968 type.  We need to record which so that we can report the parsed code
1969 correctly.
1970
1971 ###### exec type
1972         Xvar,
1973
1974 ###### ast
1975         struct var {
1976                 struct exec;
1977                 struct variable *var;
1978         };
1979
1980 ###### variable fields
1981         int explicit_type;
1982
1983 ###### Grammar
1984
1985         $TERM : ::
1986
1987         $*var
1988         VariableDecl -> IDENTIFIER : ${ {
1989                 struct variable *v = var_decl(c, $1.txt);
1990                 $0 = new_pos(var, $1);
1991                 $0->var = v;
1992                 if (v)
1993                         v->where_decl = $0;
1994                 else {
1995                         v = var_ref(c, $1.txt);
1996                         $0->var = v;
1997                         type_err(c, "error: variable '%v' redeclared",
1998                                  $0, NULL, 0, NULL);
1999                         type_err(c, "info: this is where '%v' was first declared",
2000                                  v->where_decl, NULL, 0, NULL);
2001                 }
2002         } }$
2003         | IDENTIFIER :: ${ {
2004                 struct variable *v = var_decl(c, $1.txt);
2005                 $0 = new_pos(var, $1);
2006                 $0->var = v;
2007                 if (v) {
2008                         v->where_decl = $0;
2009                         v->constant = 1;
2010                 } else {
2011                         v = var_ref(c, $1.txt);
2012                         $0->var = v;
2013                         type_err(c, "error: variable '%v' redeclared",
2014                                  $0, NULL, 0, NULL);
2015                         type_err(c, "info: this is where '%v' was first declared",
2016                                  v->where_decl, NULL, 0, NULL);
2017                 }
2018         } }$
2019         | IDENTIFIER : Type ${ {
2020                 struct variable *v = var_decl(c, $1.txt);
2021                 $0 = new_pos(var, $1);
2022                 $0->var = v;
2023                 if (v) {
2024                         v->where_decl = $0;
2025                         v->where_set = $0;
2026                         v->type = $<Type;
2027                         v->explicit_type = 1;
2028                 } else {
2029                         v = var_ref(c, $1.txt);
2030                         $0->var = v;
2031                         type_err(c, "error: variable '%v' redeclared",
2032                                  $0, NULL, 0, NULL);
2033                         type_err(c, "info: this is where '%v' was first declared",
2034                                  v->where_decl, NULL, 0, NULL);
2035                 }
2036         } }$
2037         | IDENTIFIER :: Type ${ {
2038                 struct variable *v = var_decl(c, $1.txt);
2039                 $0 = new_pos(var, $1);
2040                 $0->var = v;
2041                 if (v) {
2042                         v->where_decl = $0;
2043                         v->where_set = $0;
2044                         v->type = $<Type;
2045                         v->constant = 1;
2046                         v->explicit_type = 1;
2047                 } else {
2048                         v = var_ref(c, $1.txt);
2049                         $0->var = v;
2050                         type_err(c, "error: variable '%v' redeclared",
2051                                  $0, NULL, 0, NULL);
2052                         type_err(c, "info: this is where '%v' was first declared",
2053                                  v->where_decl, NULL, 0, NULL);
2054                 }
2055         } }$
2056
2057         $*exec
2058         Variable -> IDENTIFIER ${ {
2059                 struct variable *v = var_ref(c, $1.txt);
2060                 $0 = new_pos(var, $1);
2061                 if (v == NULL) {
2062                         /* This might be a label - allocate a var just in case */
2063                         v = var_decl(c, $1.txt);
2064                         if (v) {
2065                                 v->type = Tnone;
2066                                 v->where_decl = $0;
2067                                 v->where_set = $0;
2068                         }
2069                 }
2070                 cast(var, $0)->var = v;
2071         } }$
2072
2073 ###### print exec cases
2074         case Xvar:
2075         {
2076                 struct var *v = cast(var, e);
2077                 if (v->var) {
2078                         struct binding *b = v->var->name;
2079                         printf("%.*s", b->name.len, b->name.txt);
2080                 }
2081                 break;
2082         }
2083
2084 ###### format cases
2085         case 'v':
2086                 if (loc && loc->type == Xvar) {
2087                         struct var *v = cast(var, loc);
2088                         if (v->var) {
2089                                 struct binding *b = v->var->name;
2090                                 fprintf(stderr, "%.*s", b->name.len, b->name.txt);
2091                         } else
2092                                 fputs("???", stderr);   // NOTEST
2093                 } else
2094                         fputs("NOTVAR", stderr);
2095                 break;
2096
2097 ###### propagate exec cases
2098
2099         case Xvar:
2100         {
2101                 struct var *var = cast(var, prog);
2102                 struct variable *v = var->var;
2103                 if (!v) {
2104                         type_err(c, "%d:BUG: no variable!!", prog, NULL, 0, NULL); // NOTEST
2105                         return Tnone;                                   // NOTEST
2106                 }
2107                 v = v->merged;
2108                 if (v->constant && (rules & Rnoconstant)) {
2109                         type_err(c, "error: Cannot assign to a constant: %v",
2110                                  prog, NULL, 0, NULL);
2111                         type_err(c, "info: name was defined as a constant here",
2112                                  v->where_decl, NULL, 0, NULL);
2113                         return v->type;
2114                 }
2115                 if (v->type == Tnone && v->where_decl == prog)
2116                         type_err(c, "error: variable used but not declared: %v",
2117                                  prog, NULL, 0, NULL);
2118                 if (v->type == NULL) {
2119                         if (type && *ok != 0) {
2120                                 v->type = type;
2121                                 v->where_set = prog;
2122                                 *ok = 2;
2123                         }
2124                         return type;
2125                 }
2126                 if (!type_compat(type, v->type, rules)) {
2127                         type_err(c, "error: expected %1%r but variable '%v' is %2", prog,
2128                                  type, rules, v->type);
2129                         type_err(c, "info: this is where '%v' was set to %1", v->where_set,
2130                                  v->type, rules, NULL);
2131                 }
2132                 if (!type)
2133                         return v->type;
2134                 return type;
2135         }
2136
2137 ###### interp exec cases
2138         case Xvar:
2139         {
2140                 struct var *var = cast(var, e);
2141                 struct variable *v = var->var;
2142
2143                 v = v->merged;
2144                 lrv = var_value(c, v);
2145                 rvtype = v->type;
2146                 break;
2147         }
2148
2149 ###### ast functions
2150
2151         static void free_var(struct var *v)
2152         {
2153                 free(v);
2154         }
2155
2156 ###### free exec cases
2157         case Xvar: free_var(cast(var, e)); break;
2158
2159
2160 ### Complex types
2161
2162 Now that we have the shape of the interpreter in place we can add some
2163 complex types and connected them in to the data structures and the
2164 different phases of parse, analyse, print, interpret.
2165
2166 Being "complex" the language will naturally have syntax to access
2167 specifics of objects of these types.  These will fit into the grammar as
2168 "Terms" which are the things that are combined with various operators to
2169 form "Expression".  Where a Term is formed by some operation on another
2170 Term, the subordinate Term will always come first, so for example a
2171 member of an array will be expressed as the Term for the array followed
2172 by an index in square brackets.  The strict rule of using postfix
2173 operations makes precedence irrelevant within terms.  To provide a place
2174 to put the grammar for each terms of each type, we will start out by
2175 introducing the "Term" grammar production, with contains at least a
2176 simple "Value" (to be explained later).
2177
2178 ###### Grammar
2179         $*exec
2180         Term ->  Value ${ $0 = $<1; }$
2181         | Variable ${ $0 = $<1; }$
2182         ## term grammar
2183
2184 Thus far the complex types we have are arrays and structs.
2185
2186 #### Arrays
2187
2188 Arrays can be declared by giving a size and a type, as `[size]type' so
2189 `freq:[26]number` declares `freq` to be an array of 26 numbers.  The
2190 size can be either a literal number, or a named constant.  Some day an
2191 arbitrary expression will be supported.
2192
2193 As a formal parameter to a function, the array can be declared with a
2194 new variable as the size: `name:[size::number]string`.  The `size`
2195 variable is set to the size of the array and must be a constant.  As
2196 `number` is the only supported type, it can be left out:
2197 `name:[size::]string`.
2198
2199 Arrays cannot be assigned.  When pointers are introduced we will also
2200 introduce array slices which can refer to part or all of an array -
2201 the assignment syntax will create a slice.  For now, an array can only
2202 ever be referenced by the name it is declared with.  It is likely that
2203 a "`copy`" primitive will eventually be define which can be used to
2204 make a copy of an array with controllable recursive depth.
2205
2206 For now we have two sorts of array, those with fixed size either because
2207 it is given as a literal number or because it is a struct member (which
2208 cannot have a runtime-changing size), and those with a size that is
2209 determined at runtime - local variables with a const size.  The former
2210 have their size calculated at parse time, the latter at run time.
2211
2212 For the latter type, the `size` field of the type is the size of a
2213 pointer, and the array is reallocated every time it comes into scope.
2214
2215 We differentiate struct fields with a const size from local variables
2216 with a const size by whether they are prepared at parse time or not.
2217
2218 ###### type union fields
2219
2220         struct {
2221                 int unspec;     // size is unspecified - vsize must be set.
2222                 short size;
2223                 short static_size;
2224                 struct variable *vsize;
2225                 struct type *member;
2226         } array;
2227
2228 ###### value union fields
2229         void *array;  // used if not static_size
2230
2231 ###### value functions
2232
2233         static void array_prepare_type(struct parse_context *c, struct type *type,
2234                                        int parse_time)
2235         {
2236                 struct value *vsize;
2237                 mpz_t q;
2238                 if (type->array.static_size)
2239                         return; // NOTEST
2240                 if (type->array.unspec && parse_time)
2241                         return; // NOTEST
2242
2243                 if (type->array.vsize) {
2244                         vsize = var_value(c, type->array.vsize);
2245                         if (!vsize)
2246                                 return; // NOTEST
2247                         mpz_init(q);
2248                         mpz_tdiv_q(q, mpq_numref(vsize->num), mpq_denref(vsize->num));
2249                         type->array.size = mpz_get_si(q);
2250                         mpz_clear(q);
2251                 }
2252
2253                 if (parse_time && type->array.member->size) {
2254                         type->array.static_size = 1;
2255                         type->size = type->array.size * type->array.member->size;
2256                         type->align = type->array.member->align;
2257                 }
2258         }
2259
2260         static void array_init(struct type *type, struct value *val)
2261         {
2262                 int i;
2263                 void *ptr = val->ptr;
2264
2265                 if (!val)
2266                         return;                         // NOTEST
2267                 if (!type->array.static_size) {
2268                         val->array = calloc(type->array.size,
2269                                             type->array.member->size);
2270                         ptr = val->array;
2271                 }
2272                 for (i = 0; i < type->array.size; i++) {
2273                         struct value *v;
2274                         v = (void*)ptr + i * type->array.member->size;
2275                         val_init(type->array.member, v);
2276                 }
2277         }
2278
2279         static void array_free(struct type *type, struct value *val)
2280         {
2281                 int i;
2282                 void *ptr = val->ptr;
2283
2284                 if (!type->array.static_size)
2285                         ptr = val->array;
2286                 for (i = 0; i < type->array.size; i++) {
2287                         struct value *v;
2288                         v = (void*)ptr + i * type->array.member->size;
2289                         free_value(type->array.member, v);
2290                 }
2291                 if (!type->array.static_size)
2292                         free(ptr);
2293         }
2294
2295         static int array_compat(struct type *require, struct type *have)
2296         {
2297                 if (have->compat != require->compat)
2298                         return 0;
2299                 /* Both are arrays, so we can look at details */
2300                 if (!type_compat(require->array.member, have->array.member, 0))
2301                         return 0;
2302                 if (have->array.unspec && require->array.unspec) {
2303                         if (have->array.vsize && require->array.vsize &&
2304                             have->array.vsize != require->array.vsize)  // UNTESTED
2305                                 /* sizes might not be the same */
2306                                 return 0;       // UNTESTED
2307                         return 1;
2308                 }
2309                 if (have->array.unspec || require->array.unspec)
2310                         return 1;       // UNTESTED
2311                 if (require->array.vsize == NULL && have->array.vsize == NULL)
2312                         return require->array.size == have->array.size;
2313
2314                 return require->array.vsize == have->array.vsize;       // UNTESTED
2315         }
2316
2317         static void array_print_type(struct type *type, FILE *f)
2318         {
2319                 fputs("[", f);
2320                 if (type->array.vsize) {
2321                         struct binding *b = type->array.vsize->name;
2322                         fprintf(f, "%.*s%s]", b->name.len, b->name.txt,
2323                                 type->array.unspec ? "::" : "");
2324                 } else if (type->array.size)
2325                         fprintf(f, "%d]", type->array.size);
2326                 else
2327                         fprintf(f, "]");
2328                 type_print(type->array.member, f);
2329         }
2330
2331         static struct type array_prototype = {
2332                 .init = array_init,
2333                 .prepare_type = array_prepare_type,
2334                 .print_type = array_print_type,
2335                 .compat = array_compat,
2336                 .free = array_free,
2337                 .size = sizeof(void*),
2338                 .align = sizeof(void*),
2339         };
2340
2341 ###### declare terminals
2342         $TERM [ ]
2343
2344 ###### type grammar
2345
2346         | [ NUMBER ] Type ${ {
2347                 char tail[3];
2348                 mpq_t num;
2349                 struct type *t;
2350                 int elements = 0;
2351
2352                 if (number_parse(num, tail, $2.txt) == 0)
2353                         tok_err(c, "error: unrecognised number", &$2);
2354                 else if (tail[0]) {
2355                         tok_err(c, "error: unsupported number suffix", &$2);
2356                         mpq_clear(num);
2357                 } else {
2358                         elements = mpz_get_ui(mpq_numref(num));
2359                         if (mpz_cmp_ui(mpq_denref(num), 1) != 0) {
2360                                 tok_err(c, "error: array size must be an integer",
2361                                         &$2);
2362                         } else if (mpz_cmp_ui(mpq_numref(num), 1UL << 30) >= 0)
2363                                 tok_err(c, "error: array size is too large",
2364                                         &$2);
2365                         mpq_clear(num);
2366                 }
2367
2368                 $0 = t = add_anon_type(c, &array_prototype, "array[%d]", elements );
2369                 t->array.size = elements;
2370                 t->array.member = $<4;
2371                 t->array.vsize = NULL;
2372         } }$
2373
2374         | [ IDENTIFIER ] Type ${ {
2375                 struct variable *v = var_ref(c, $2.txt);
2376
2377                 if (!v)
2378                         tok_err(c, "error: name undeclared", &$2);
2379                 else if (!v->constant)
2380                         tok_err(c, "error: array size must be a constant", &$2);
2381
2382                 $0 = add_anon_type(c, &array_prototype, "array[%.*s]", $2.txt.len, $2.txt.txt);
2383                 $0->array.member = $<4;
2384                 $0->array.size = 0;
2385                 $0->array.vsize = v;
2386         } }$
2387
2388 ###### Grammar
2389         $*type
2390         OptType -> Type ${ $0 = $<1; }$
2391                 | ${ $0 = NULL; }$
2392
2393 ###### formal type grammar
2394
2395         | [ IDENTIFIER :: OptType ] Type ${ {
2396                 struct variable *v = var_decl(c, $ID.txt);
2397
2398                 v->type = $<OT;
2399                 v->constant = 1;
2400                 if (!v->type)
2401                         v->type = Tnum;
2402                 $0 = add_anon_type(c, &array_prototype, "array[var]");
2403                 $0->array.member = $<6;
2404                 $0->array.size = 0;
2405                 $0->array.unspec = 1;
2406                 $0->array.vsize = v;
2407         } }$
2408
2409 ###### Binode types
2410         Index,
2411
2412 ###### term grammar
2413
2414         | Term [ Expression ] ${ {
2415                 struct binode *b = new(binode);
2416                 b->op = Index;
2417                 b->left = $<1;
2418                 b->right = $<3;
2419                 $0 = b;
2420         } }$
2421
2422 ###### print binode cases
2423         case Index:
2424                 print_exec(b->left, -1, bracket);
2425                 printf("[");
2426                 print_exec(b->right, -1, bracket);
2427                 printf("]");
2428                 break;
2429
2430 ###### propagate binode cases
2431         case Index:
2432                 /* left must be an array, right must be a number,
2433                  * result is the member type of the array
2434                  */
2435                 propagate_types(b->right, c, ok, Tnum, 0);
2436                 t = propagate_types(b->left, c, ok, NULL, rules & Rnoconstant);
2437                 if (!t || t->compat != array_compat) {
2438                         type_err(c, "error: %1 cannot be indexed", prog, t, 0, NULL);
2439                         return NULL;
2440                 } else {
2441                         if (!type_compat(type, t->array.member, rules)) {
2442                                 type_err(c, "error: have %1 but need %2", prog,
2443                                          t->array.member, rules, type);
2444                         }
2445                         return t->array.member;
2446                 }
2447                 break;
2448
2449 ###### interp binode cases
2450         case Index: {
2451                 mpz_t q;
2452                 long i;
2453                 void *ptr;
2454
2455                 lleft = linterp_exec(c, b->left, &ltype);
2456                 right = interp_exec(c, b->right, &rtype);
2457                 mpz_init(q);
2458                 mpz_tdiv_q(q, mpq_numref(right.num), mpq_denref(right.num));
2459                 i = mpz_get_si(q);
2460                 mpz_clear(q);
2461
2462                 if (ltype->array.static_size)
2463                         ptr = lleft;
2464                 else
2465                         ptr = *(void**)lleft;
2466                 rvtype = ltype->array.member;
2467                 if (i >= 0 && i < ltype->array.size)
2468                         lrv = ptr + i * rvtype->size;
2469                 else
2470                         val_init(ltype->array.member, &rv); // UNSAFE
2471                 ltype = NULL;
2472                 break;
2473         }
2474
2475 #### Structs
2476
2477 A `struct` is a data-type that contains one or more other data-types.
2478 It differs from an array in that each member can be of a different
2479 type, and they are accessed by name rather than by number.  Thus you
2480 cannot choose an element by calculation, you need to know what you
2481 want up-front.
2482
2483 The language makes no promises about how a given structure will be
2484 stored in memory - it is free to rearrange fields to suit whatever
2485 criteria seems important.
2486
2487 Structs are declared separately from program code - they cannot be
2488 declared in-line in a variable declaration like arrays can.  A struct
2489 is given a name and this name is used to identify the type - the name
2490 is not prefixed by the word `struct` as it would be in C.
2491
2492 Structs are only treated as the same if they have the same name.
2493 Simply having the same fields in the same order is not enough.  This
2494 might change once we can create structure initializers from a list of
2495 values.
2496
2497 Each component datum is identified much like a variable is declared,
2498 with a name, one or two colons, and a type.  The type cannot be omitted
2499 as there is no opportunity to deduce the type from usage.  An initial
2500 value can be given following an equals sign, so
2501
2502 ##### Example: a struct type
2503
2504         struct complex:
2505                 x:number = 0
2506                 y:number = 0
2507
2508 would declare a type called "complex" which has two number fields,
2509 each initialised to zero.
2510
2511 Struct will need to be declared separately from the code that uses
2512 them, so we will need to be able to print out the declaration of a
2513 struct when reprinting the whole program.  So a `print_type_decl` type
2514 function will be needed.
2515
2516 ###### type union fields
2517
2518         struct {
2519                 int nfields;
2520                 struct field {
2521                         struct text name;
2522                         struct type *type;
2523                         struct value *init;
2524                         int offset;
2525                 } *fields; // This is created when field_list is analysed.
2526                 struct fieldlist {
2527                         struct fieldlist *prev;
2528                         struct field f;
2529                         struct exec *init;
2530                 } *field_list; // This is created during parsing
2531         } structure;
2532
2533 ###### type functions
2534         void (*print_type_decl)(struct type *type, FILE *f);
2535
2536 ###### value functions
2537
2538         static void structure_init(struct type *type, struct value *val)
2539         {
2540                 int i;
2541
2542                 for (i = 0; i < type->structure.nfields; i++) {
2543                         struct value *v;
2544                         v = (void*) val->ptr + type->structure.fields[i].offset;
2545                         if (type->structure.fields[i].init)
2546                                 dup_value(type->structure.fields[i].type,
2547                                           type->structure.fields[i].init,
2548                                           v);
2549                         else
2550                                 val_init(type->structure.fields[i].type, v);
2551                 }
2552         }
2553
2554         static void structure_free(struct type *type, struct value *val)
2555         {
2556                 int i;
2557
2558                 for (i = 0; i < type->structure.nfields; i++) {
2559                         struct value *v;
2560                         v = (void*)val->ptr + type->structure.fields[i].offset;
2561                         free_value(type->structure.fields[i].type, v);
2562                 }
2563         }
2564
2565         static void free_fieldlist(struct fieldlist *f)
2566         {
2567                 if (!f)
2568                         return;
2569                 free_fieldlist(f->prev);
2570                 free_exec(f->init);
2571                 free(f);
2572         }
2573
2574         static void structure_free_type(struct type *t)
2575         {
2576                 int i;
2577                 for (i = 0; i < t->structure.nfields; i++)
2578                         if (t->structure.fields[i].init) {
2579                                 free_value(t->structure.fields[i].type,
2580                                            t->structure.fields[i].init);
2581                         }
2582                 free(t->structure.fields);
2583                 free_fieldlist(t->structure.field_list);
2584         }
2585
2586         static void structure_prepare_type(struct parse_context *c,
2587                                            struct type *t, int parse_time)
2588         {
2589                 int cnt = 0;
2590                 struct fieldlist *f;
2591
2592                 if (!parse_time || t->structure.fields)
2593                         return;
2594
2595                 for (f = t->structure.field_list; f; f=f->prev) {
2596                         int ok;
2597                         cnt += 1;
2598
2599                         if (f->f.type->prepare_type)
2600                                 f->f.type->prepare_type(c, f->f.type, 1);
2601                         if (f->init == NULL)
2602                                 continue;
2603                         do {
2604                                 ok = 1;
2605                                 propagate_types(f->init, c, &ok, f->f.type, 0);
2606                         } while (ok == 2);
2607                         if (!ok)
2608                                 c->parse_error = 1;     // NOTEST
2609                 }
2610
2611                 t->structure.nfields = cnt;
2612                 t->structure.fields = calloc(cnt, sizeof(struct field));
2613                 f = t->structure.field_list;
2614                 while (cnt > 0) {
2615                         int a = f->f.type->align;
2616                         cnt -= 1;
2617                         t->structure.fields[cnt] = f->f;
2618                         if (t->size & (a-1))
2619                                 t->size = (t->size | (a-1)) + 1;
2620                         t->structure.fields[cnt].offset = t->size;
2621                         t->size += ((f->f.type->size - 1) | (a-1)) + 1;
2622                         if (a > t->align)
2623                                 t->align = a;
2624
2625                         if (f->init && !c->parse_error) {
2626                                 struct value vl = interp_exec(c, f->init, NULL);
2627                                 t->structure.fields[cnt].init =
2628                                         global_alloc(c, f->f.type, NULL, &vl);
2629                         }
2630
2631                         f = f->prev;
2632                 }
2633         }
2634
2635         static struct type structure_prototype = {
2636                 .init = structure_init,
2637                 .free = structure_free,
2638                 .free_type = structure_free_type,
2639                 .print_type_decl = structure_print_type,
2640                 .prepare_type = structure_prepare_type,
2641         };
2642
2643 ###### exec type
2644         Xfieldref,
2645
2646 ###### ast
2647         struct fieldref {
2648                 struct exec;
2649                 struct exec *left;
2650                 int index;
2651                 struct text name;
2652         };
2653
2654 ###### free exec cases
2655         case Xfieldref:
2656                 free_exec(cast(fieldref, e)->left);
2657                 free(e);
2658                 break;
2659
2660 ###### declare terminals
2661         $TERM struct .
2662
2663 ###### term grammar
2664
2665         | Term . IDENTIFIER ${ {
2666                 struct fieldref *fr = new_pos(fieldref, $2);
2667                 fr->left = $<1;
2668                 fr->name = $3.txt;
2669                 fr->index = -2;
2670                 $0 = fr;
2671         } }$
2672
2673 ###### print exec cases
2674
2675         case Xfieldref:
2676         {
2677                 struct fieldref *f = cast(fieldref, e);
2678                 print_exec(f->left, -1, bracket);
2679                 printf(".%.*s", f->name.len, f->name.txt);
2680                 break;
2681         }
2682
2683 ###### ast functions
2684         static int find_struct_index(struct type *type, struct text field)
2685         {
2686                 int i;
2687                 for (i = 0; i < type->structure.nfields; i++)
2688                         if (text_cmp(type->structure.fields[i].name, field) == 0)
2689                                 return i;
2690                 return -1;
2691         }
2692
2693 ###### propagate exec cases
2694
2695         case Xfieldref:
2696         {
2697                 struct fieldref *f = cast(fieldref, prog);
2698                 struct type *st = propagate_types(f->left, c, ok, NULL, 0);
2699
2700                 if (!st)
2701                         type_err(c, "error: unknown type for field access", f->left,    // UNTESTED
2702                                  NULL, 0, NULL);
2703                 else if (st->init != structure_init)
2704                         type_err(c, "error: field reference attempted on %1, not a struct",
2705                                  f->left, st, 0, NULL);
2706                 else if (f->index == -2) {
2707                         f->index = find_struct_index(st, f->name);
2708                         if (f->index < 0)
2709                                 type_err(c, "error: cannot find requested field in %1",
2710                                          f->left, st, 0, NULL);
2711                 }
2712                 if (f->index >= 0) {
2713                         struct type *ft = st->structure.fields[f->index].type;
2714                         if (!type_compat(type, ft, rules))
2715                                 type_err(c, "error: have %1 but need %2", prog,
2716                                          ft, rules, type);
2717                         return ft;
2718                 }
2719                 break;
2720         }
2721
2722 ###### interp exec cases
2723         case Xfieldref:
2724         {
2725                 struct fieldref *f = cast(fieldref, e);
2726                 struct type *ltype;
2727                 struct value *lleft = linterp_exec(c, f->left, &ltype);
2728                 lrv = (void*)lleft->ptr + ltype->structure.fields[f->index].offset;
2729                 rvtype = ltype->structure.fields[f->index].type;
2730                 break;
2731         }
2732
2733 ###### top level grammar
2734         DeclareStruct -> struct IDENTIFIER FieldBlock Newlines ${ {
2735                 struct type *t =
2736                         add_type(c, $2.txt, &structure_prototype);
2737                 t->structure.field_list = $<FB;
2738                 if (t->prepare_type)
2739                         t->prepare_type(c, t, 1);
2740
2741         } }$
2742
2743         $*fieldlist
2744         FieldBlock -> { IN OptNL FieldLines OUT OptNL } ${ $0 = $<FL; }$
2745         | { SimpleFieldList } ${ $0 = $<SFL; }$
2746         | IN OptNL FieldLines OUT ${ $0 = $<FL; }$
2747         | SimpleFieldList EOL ${ $0 = $<SFL; }$
2748
2749         FieldLines -> SimpleFieldList Newlines ${ $0 = $<SFL; }$
2750         | FieldLines SimpleFieldList Newlines ${
2751                 $SFL->prev = $<FL;
2752                 $0 = $<SFL;
2753         }$
2754
2755         SimpleFieldList -> Field ${ $0 = $<F; }$
2756         | SimpleFieldList ; Field ${
2757                 $F->prev = $<SFL;
2758                 $0 = $<F;
2759         }$
2760         | SimpleFieldList ; ${
2761                 $0 = $<SFL;
2762         }$
2763         | ERROR ${ tok_err(c, "Syntax error in struct field", &$1); }$
2764
2765         Field -> IDENTIFIER : Type = Expression ${ {
2766                 $0 = calloc(1, sizeof(struct fieldlist));
2767                 $0->f.name = $ID.txt;
2768                 $0->f.type = $<Type;
2769                 $0->f.init = NULL;
2770                 $0->init = $<Expr;
2771         } }$
2772         | IDENTIFIER : Type ${
2773                 $0 = calloc(1, sizeof(struct fieldlist));
2774                 $0->f.name = $ID.txt;
2775                 $0->f.type = $<Type;
2776         }$
2777
2778 ###### forward decls
2779         static void structure_print_type(struct type *t, FILE *f);
2780
2781 ###### value functions
2782         static void structure_print_type(struct type *t, FILE *f)
2783         {
2784                 int i;
2785
2786                 fprintf(f, "struct %.*s\n", t->name.len, t->name.txt);
2787
2788                 for (i = 0; i < t->structure.nfields; i++) {
2789                         struct field *fl = t->structure.fields + i;
2790                         fprintf(f, "    %.*s : ", fl->name.len, fl->name.txt);
2791                         type_print(fl->type, f);
2792                         if (fl->type->print && fl->init) {
2793                                 fprintf(f, " = ");
2794                                 if (fl->type == Tstr)
2795                                         fprintf(f, "\"");       // UNTESTED
2796                                 print_value(fl->type, fl->init, f);
2797                                 if (fl->type == Tstr)
2798                                         fprintf(f, "\"");       // UNTESTED
2799                         }
2800                         fprintf(f, "\n");
2801                 }
2802         }
2803
2804 ###### print type decls
2805         {
2806                 struct type *t;
2807                 int target = -1;
2808
2809                 while (target != 0) {
2810                         int i = 0;
2811                         for (t = context.typelist; t ; t=t->next)
2812                                 if (!t->anon && t->print_type_decl &&
2813                                     !t->check_args) {
2814                                         i += 1;
2815                                         if (i == target)
2816                                                 break;
2817                                 }
2818
2819                         if (target == -1) {
2820                                 target = i;
2821                         } else {
2822                                 t->print_type_decl(t, stdout);
2823                                 target -= 1;
2824                         }
2825                 }
2826         }
2827
2828 #### Functions
2829
2830 A function is a chunk of code which can be passed parameters and can
2831 return results.  Each function has a type which includes the set of
2832 parameters and the return value.  As yet these types cannot be declared
2833 separately from the function itself.
2834
2835 The parameters can be specified either in parentheses as a ';' separated
2836 list, such as
2837
2838 ##### Example: function 1
2839
2840         func main(av:[ac::number]string; env:[envc::number]string)
2841                 code block
2842
2843 or as an indented list of one parameter per line (though each line can
2844 be a ';' separated list)
2845
2846 ##### Example: function 2
2847
2848         func main
2849                 argv:[argc::number]string
2850                 env:[envc::number]string
2851         do
2852                 code block
2853
2854 In the first case a return type can follow the parentheses after a colon,
2855 in the second it is given on a line starting with the word `return`.
2856
2857 ##### Example: functions that return
2858
2859         func add(a:number; b:number): number
2860                 code block
2861
2862         func catenate
2863                 a: string
2864                 b: string
2865         return string
2866         do
2867                 code block
2868
2869 Rather than returning a type, the function can specify a set of local
2870 variables to return as a struct.  The values of these variables when the
2871 function exits will be provided to the caller.  For this the return type
2872 is replaced with a block of result declarations, either in parentheses
2873 or bracketed by `return` and `do`.
2874
2875 ##### Example: functions returning multiple variables
2876
2877         func to_cartesian(rho:number; theta:number):(x:number; y:number)
2878                 x = .....
2879                 y = .....
2880
2881         func to_polar
2882                 x:number; y:number
2883         return
2884                 rho:number
2885                 theta:number
2886         do
2887                 rho = ....
2888                 theta = ....
2889
2890 For constructing the lists we use a `List` binode, which will be
2891 further detailed when Expression Lists are introduced.
2892
2893 ###### type union fields
2894
2895         struct {
2896                 struct binode *params;
2897                 struct type *return_type;
2898                 struct variable *scope;
2899                 int inline_result;      // return value is at start of 'local'
2900                 int local_size;
2901         } function;
2902
2903 ###### value union fields
2904         struct exec *function;
2905
2906 ###### type functions
2907         void (*check_args)(struct parse_context *c, int *ok,
2908                            struct type *require, struct exec *args);
2909
2910 ###### value functions
2911
2912         static void function_free(struct type *type, struct value *val)
2913         {
2914                 free_exec(val->function);
2915                 val->function = NULL;
2916         }
2917
2918         static int function_compat(struct type *require, struct type *have)
2919         {
2920                 // FIXME can I do anything here yet?
2921                 return 0;
2922         }
2923
2924         static void function_check_args(struct parse_context *c, int *ok,
2925                                         struct type *require, struct exec *args)
2926         {
2927                 /* This should be 'compat', but we don't have a 'tuple' type to
2928                  * hold the type of 'args'
2929                  */
2930                 struct binode *arg = cast(binode, args);
2931                 struct binode *param = require->function.params;
2932
2933                 while (param) {
2934                         struct var *pv = cast(var, param->left);
2935                         if (!arg) {
2936                                 type_err(c, "error: insufficient arguments to function.",
2937                                          args, NULL, 0, NULL);
2938                                 break;
2939                         }
2940                         *ok = 1;
2941                         propagate_types(arg->left, c, ok, pv->var->type, 0);
2942                         param = cast(binode, param->right);
2943                         arg = cast(binode, arg->right);
2944                 }
2945                 if (arg)
2946                         type_err(c, "error: too many arguments to function.",
2947                                  args, NULL, 0, NULL);
2948         }
2949
2950         static void function_print(struct type *type, struct value *val, FILE *f)
2951         {
2952                 print_exec(val->function, 1, 0);
2953         }
2954
2955         static void function_print_type_decl(struct type *type, FILE *f)
2956         {
2957                 struct binode *b;
2958                 fprintf(f, "(");
2959                 for (b = type->function.params; b; b = cast(binode, b->right)) {
2960                         struct variable *v = cast(var, b->left)->var;
2961                         fprintf(f, "%.*s%s", v->name->name.len, v->name->name.txt,
2962                                 v->constant ? "::" : ":");
2963                         type_print(v->type, f);
2964                         if (b->right)
2965                                 fprintf(f, "; ");
2966                 }
2967                 fprintf(f, ")");
2968                 if (type->function.return_type != Tnone) {
2969                         fprintf(f, ":");
2970                         if (type->function.inline_result) {
2971                                 int i;
2972                                 struct type *t = type->function.return_type;
2973                                 fprintf(f, " (");
2974                                 for (i = 0; i < t->structure.nfields; i++) {
2975                                         struct field *fl = t->structure.fields + i;
2976                                         if (i)
2977                                                 fprintf(f, "; ");
2978                                         fprintf(f, "%.*s:", fl->name.len, fl->name.txt);
2979                                         type_print(fl->type, f);
2980                                 }
2981                                 fprintf(f, ")");
2982                         } else
2983                                 type_print(type->function.return_type, f);
2984                 }
2985                 fprintf(f, "\n");
2986         }
2987
2988         static void function_free_type(struct type *t)
2989         {
2990                 free_exec(t->function.params);
2991         }
2992
2993         static struct type function_prototype = {
2994                 .size = sizeof(void*),
2995                 .align = sizeof(void*),
2996                 .free = function_free,
2997                 .compat = function_compat,
2998                 .check_args = function_check_args,
2999                 .print = function_print,
3000                 .print_type_decl = function_print_type_decl,
3001                 .free_type = function_free_type,
3002         };
3003
3004 ###### declare terminals
3005
3006         $TERM func
3007
3008 ###### Binode types
3009         List,
3010
3011 ###### Grammar
3012
3013         $*variable
3014         FuncName -> IDENTIFIER ${ {
3015                 struct variable *v = var_decl(c, $1.txt);
3016                 struct var *e = new_pos(var, $1);
3017                 e->var = v;
3018                 if (v) {
3019                         v->where_decl = e;
3020                         $0 = v;
3021                 } else {
3022                         v = var_ref(c, $1.txt);
3023                         e->var = v;
3024                         type_err(c, "error: function '%v' redeclared",
3025                                 e, NULL, 0, NULL);
3026                         type_err(c, "info: this is where '%v' was first declared",
3027                                 v->where_decl, NULL, 0, NULL);
3028                         free_exec(e);
3029                 }
3030         } }$
3031
3032         $*binode
3033         Args -> ArgsLine NEWLINE ${ $0 = $<AL; }$
3034         | Args ArgsLine NEWLINE ${ {
3035                 struct binode *b = $<AL;
3036                 struct binode **bp = &b;
3037                 while (*bp)
3038                         bp = (struct binode **)&(*bp)->left;
3039                 *bp = $<A;
3040                 $0 = b;
3041         } }$
3042
3043         ArgsLine -> ${ $0 = NULL; }$
3044         | Varlist ${ $0 = $<1; }$
3045         | Varlist ; ${ $0 = $<1; }$
3046
3047         Varlist -> Varlist ; ArgDecl ${
3048                 $0 = new(binode);
3049                 $0->op = List;
3050                 $0->left = $<Vl;
3051                 $0->right = $<AD;
3052         }$
3053         | ArgDecl ${
3054                 $0 = new(binode);
3055                 $0->op = List;
3056                 $0->left = NULL;
3057                 $0->right = $<AD;
3058         }$
3059
3060         $*var
3061         ArgDecl -> IDENTIFIER : FormalType ${ {
3062                 struct variable *v = var_decl(c, $1.txt);
3063                 $0 = new(var);
3064                 $0->var = v;
3065                 v->type = $<FT;
3066         } }$
3067
3068 ##### Function calls
3069
3070 A function call can appear either as an expression or as a statement.
3071 We use a new 'Funcall' binode type to link the function with a list of
3072 arguments, form with the 'List' nodes.
3073
3074 We have already seen the "Term" which is how a function call can appear
3075 in an expression.  To parse a function call into a statement we include
3076 it in the "SimpleStatement Grammar" which will be described later.
3077
3078 ###### Binode types
3079         Funcall,
3080
3081 ###### term grammar
3082         | Term ( ExpressionList ) ${ {
3083                 struct binode *b = new(binode);
3084                 b->op = Funcall;
3085                 b->left = $<T;
3086                 b->right = reorder_bilist($<EL);
3087                 $0 = b;
3088         } }$
3089         | Term ( ) ${ {
3090                 struct binode *b = new(binode);
3091                 b->op = Funcall;
3092                 b->left = $<T;
3093                 b->right = NULL;
3094                 $0 = b;
3095         } }$
3096
3097 ###### SimpleStatement Grammar
3098
3099         | Term ( ExpressionList ) ${ {
3100                 struct binode *b = new(binode);
3101                 b->op = Funcall;
3102                 b->left = $<T;
3103                 b->right = reorder_bilist($<EL);
3104                 $0 = b;
3105         } }$
3106
3107 ###### print binode cases
3108
3109         case Funcall:
3110                 do_indent(indent, "");
3111                 print_exec(b->left, -1, bracket);
3112                 printf("(");
3113                 for (b = cast(binode, b->right); b; b = cast(binode, b->right)) {
3114                         if (b->left) {
3115                                 printf(" ");
3116                                 print_exec(b->left, -1, bracket);
3117                                 if (b->right)
3118                                         printf(",");
3119                         }
3120                 }
3121                 printf(")");
3122                 if (indent >= 0)
3123                         printf("\n");
3124                 break;
3125
3126 ###### propagate binode cases
3127
3128         case Funcall: {
3129                 /* Every arg must match formal parameter, and result
3130                  * is return type of function
3131                  */
3132                 struct binode *args = cast(binode, b->right);
3133                 struct var *v = cast(var, b->left);
3134
3135                 if (!v->var->type || v->var->type->check_args == NULL) {
3136                         type_err(c, "error: attempt to call a non-function.",
3137                                  prog, NULL, 0, NULL);
3138                         return NULL;
3139                 }
3140                 v->var->type->check_args(c, ok, v->var->type, args);
3141                 return v->var->type->function.return_type;
3142         }
3143
3144 ###### interp binode cases
3145
3146         case Funcall: {
3147                 struct var *v = cast(var, b->left);
3148                 struct type *t = v->var->type;
3149                 void *oldlocal = c->local;
3150                 int old_size = c->local_size;
3151                 void *local = calloc(1, t->function.local_size);
3152                 struct value *fbody = var_value(c, v->var);
3153                 struct binode *arg = cast(binode, b->right);
3154                 struct binode *param = t->function.params;
3155
3156                 while (param) {
3157                         struct var *pv = cast(var, param->left);
3158                         struct type *vtype = NULL;
3159                         struct value val = interp_exec(c, arg->left, &vtype);
3160                         struct value *lval;
3161                         c->local = local; c->local_size = t->function.local_size;
3162                         lval = var_value(c, pv->var);
3163                         c->local = oldlocal; c->local_size = old_size;
3164                         memcpy(lval, &val, vtype->size);
3165                         param = cast(binode, param->right);
3166                         arg = cast(binode, arg->right);
3167                 }
3168                 c->local = local; c->local_size = t->function.local_size;
3169                 if (t->function.inline_result && dtype) {
3170                         _interp_exec(c, fbody->function, NULL, NULL);
3171                         memcpy(dest, local, dtype->size);
3172                         rvtype = ret.type = NULL;
3173                 } else
3174                         rv = interp_exec(c, fbody->function, &rvtype);
3175                 c->local = oldlocal; c->local_size = old_size;
3176                 free(local);
3177                 break;
3178         }
3179
3180 ## Complex executables: statements and expressions
3181
3182 Now that we have types and values and variables and most of the basic
3183 Terms which provide access to these, we can explore the more complex
3184 code that combine all of these to get useful work done.  Specifically
3185 statements and expressions.
3186
3187 Expressions are various combinations of Terms.  We will use operator
3188 precedence to ensure correct parsing.  The simplest Expression is just a
3189 Term - others will follow.
3190
3191 ###### Grammar
3192
3193         $*exec
3194         Expression -> Term ${ $0 = $<Term; }$
3195         ## expression grammar
3196
3197 ### Expressions: Conditional
3198
3199 Our first user of the `binode` will be conditional expressions, which
3200 is a bit odd as they actually have three components.  That will be
3201 handled by having 2 binodes for each expression.  The conditional
3202 expression is the lowest precedence operator which is why we define it
3203 first - to start the precedence list.
3204
3205 Conditional expressions are of the form "value `if` condition `else`
3206 other_value".  They associate to the right, so everything to the right
3207 of `else` is part of an else value, while only a higher-precedence to
3208 the left of `if` is the if values.  Between `if` and `else` there is no
3209 room for ambiguity, so a full conditional expression is allowed in
3210 there.
3211
3212 ###### Binode types
3213         CondExpr,
3214
3215 ###### declare terminals
3216
3217         $LEFT if $$ifelse
3218
3219 ###### expression grammar
3220
3221         | Expression if Expression else Expression $$ifelse ${ {
3222                 struct binode *b1 = new(binode);
3223                 struct binode *b2 = new(binode);
3224                 b1->op = CondExpr;
3225                 b1->left = $<3;
3226                 b1->right = b2;
3227                 b2->op = CondExpr;
3228                 b2->left = $<1;
3229                 b2->right = $<5;
3230                 $0 = b1;
3231         } }$
3232
3233 ###### print binode cases
3234
3235         case CondExpr:
3236                 b2 = cast(binode, b->right);
3237                 if (bracket) printf("(");
3238                 print_exec(b2->left, -1, bracket);
3239                 printf(" if ");
3240                 print_exec(b->left, -1, bracket);
3241                 printf(" else ");
3242                 print_exec(b2->right, -1, bracket);
3243                 if (bracket) printf(")");
3244                 break;
3245
3246 ###### propagate binode cases
3247
3248         case CondExpr: {
3249                 /* cond must be Tbool, others must match */
3250                 struct binode *b2 = cast(binode, b->right);
3251                 struct type *t2;
3252
3253                 propagate_types(b->left, c, ok, Tbool, 0);
3254                 t = propagate_types(b2->left, c, ok, type, Rnolabel);
3255                 t2 = propagate_types(b2->right, c, ok, type ?: t, Rnolabel);
3256                 return t ?: t2;
3257         }
3258
3259 ###### interp binode cases
3260
3261         case CondExpr: {
3262                 struct binode *b2 = cast(binode, b->right);
3263                 left = interp_exec(c, b->left, &ltype);
3264                 if (left.bool)
3265                         rv = interp_exec(c, b2->left, &rvtype); // UNTESTED
3266                 else
3267                         rv = interp_exec(c, b2->right, &rvtype);
3268                 }
3269                 break;
3270
3271 ### Expression list
3272
3273 We take a brief detour, now that we have expressions, to describe lists
3274 of expressions.  These will be needed for function parameters and
3275 possibly other situations.  They seem generic enough to introduce here
3276 to be used elsewhere.
3277
3278 And ExpressionList will use the `List` type of `binode`, building up at
3279 the end.  And place where they are used will probably call
3280 `reorder_bilist()` to get a more normal first/next arrangement.
3281
3282 ###### declare terminals
3283         $TERM ,
3284
3285 `List` execs have no implicit semantics, so they are never propagated or
3286 interpreted.  The can be printed as a comma separate list, which is how
3287 they are parsed.  Note they are also used for function formal parameter
3288 lists.  In that case a separate function is used to print them.
3289
3290 ###### print binode cases
3291         case List:
3292                 while (b) {
3293                         printf(" ");
3294                         print_exec(b->left, -1, bracket);
3295                         if (b->right)
3296                                 printf(",");
3297                         b = cast(binode, b->right);
3298                 }
3299                 break;
3300
3301 ###### propagate binode cases
3302         case List: abort(); // NOTEST
3303 ###### interp binode cases
3304         case List: abort(); // NOTEST
3305
3306 ###### Grammar
3307
3308         $*binode
3309         ExpressionList -> ExpressionList , Expression ${
3310                 $0 = new(binode);
3311                 $0->op = List;
3312                 $0->left = $<1;
3313                 $0->right = $<3;
3314         }$
3315         | Expression ${
3316                 $0 = new(binode);
3317                 $0->op = List;
3318                 $0->left = NULL;
3319                 $0->right = $<1;
3320         }$
3321
3322 ### Expressions: Boolean
3323
3324 The next class of expressions to use the `binode` will be Boolean
3325 expressions.  "`and then`" and "`or else`" are similar to `and` and `or`
3326 have same corresponding precendence.  The difference is that they don't
3327 evaluate the second expression if not necessary.
3328
3329 ###### Binode types
3330         And,
3331         AndThen,
3332         Or,
3333         OrElse,
3334         Not,
3335
3336 ###### declare terminals
3337         $LEFT or
3338         $LEFT and
3339         $LEFT not
3340
3341 ###### expression grammar
3342         | Expression or Expression ${ {
3343                 struct binode *b = new(binode);
3344                 b->op = Or;
3345                 b->left = $<1;
3346                 b->right = $<3;
3347                 $0 = b;
3348         } }$
3349         | Expression or else Expression ${ {
3350                 struct binode *b = new(binode);
3351                 b->op = OrElse;
3352                 b->left = $<1;
3353                 b->right = $<4;
3354                 $0 = b;
3355         } }$
3356
3357         | Expression and Expression ${ {
3358                 struct binode *b = new(binode);
3359                 b->op = And;
3360                 b->left = $<1;
3361                 b->right = $<3;
3362                 $0 = b;
3363         } }$
3364         | Expression and then Expression ${ {
3365                 struct binode *b = new(binode);
3366                 b->op = AndThen;
3367                 b->left = $<1;
3368                 b->right = $<4;
3369                 $0 = b;
3370         } }$
3371
3372         | not Expression ${ {
3373                 struct binode *b = new(binode);
3374                 b->op = Not;
3375                 b->right = $<2;
3376                 $0 = b;
3377         } }$
3378
3379 ###### print binode cases
3380         case And:
3381                 if (bracket) printf("(");
3382                 print_exec(b->left, -1, bracket);
3383                 printf(" and ");
3384                 print_exec(b->right, -1, bracket);
3385                 if (bracket) printf(")");
3386                 break;
3387         case AndThen:
3388                 if (bracket) printf("(");
3389                 print_exec(b->left, -1, bracket);
3390                 printf(" and then ");
3391                 print_exec(b->right, -1, bracket);
3392                 if (bracket) printf(")");
3393                 break;
3394         case Or:
3395                 if (bracket) printf("(");
3396                 print_exec(b->left, -1, bracket);
3397                 printf(" or ");
3398                 print_exec(b->right, -1, bracket);
3399                 if (bracket) printf(")");
3400                 break;
3401         case OrElse:
3402                 if (bracket) printf("(");
3403                 print_exec(b->left, -1, bracket);
3404                 printf(" or else ");
3405                 print_exec(b->right, -1, bracket);
3406                 if (bracket) printf(")");
3407                 break;
3408         case Not:
3409                 if (bracket) printf("(");
3410                 printf("not ");
3411                 print_exec(b->right, -1, bracket);
3412                 if (bracket) printf(")");
3413                 break;
3414
3415 ###### propagate binode cases
3416         case And:
3417         case AndThen:
3418         case Or:
3419         case OrElse:
3420         case Not:
3421                 /* both must be Tbool, result is Tbool */
3422                 propagate_types(b->left, c, ok, Tbool, 0);
3423                 propagate_types(b->right, c, ok, Tbool, 0);
3424                 if (type && type != Tbool)
3425                         type_err(c, "error: %1 operation found where %2 expected", prog,
3426                                    Tbool, 0, type);
3427                 return Tbool;
3428
3429 ###### interp binode cases
3430         case And:
3431                 rv = interp_exec(c, b->left, &rvtype);
3432                 right = interp_exec(c, b->right, &rtype);
3433                 rv.bool = rv.bool && right.bool;
3434                 break;
3435         case AndThen:
3436                 rv = interp_exec(c, b->left, &rvtype);
3437                 if (rv.bool)
3438                         rv = interp_exec(c, b->right, NULL);
3439                 break;
3440         case Or:
3441                 rv = interp_exec(c, b->left, &rvtype);
3442                 right = interp_exec(c, b->right, &rtype);
3443                 rv.bool = rv.bool || right.bool;
3444                 break;
3445         case OrElse:
3446                 rv = interp_exec(c, b->left, &rvtype);
3447                 if (!rv.bool)
3448                         rv = interp_exec(c, b->right, NULL);
3449                 break;
3450         case Not:
3451                 rv = interp_exec(c, b->right, &rvtype);
3452                 rv.bool = !rv.bool;
3453                 break;
3454
3455 ### Expressions: Comparison
3456
3457 Of slightly higher precedence that Boolean expressions are Comparisons.
3458 A comparison takes arguments of any comparable type, but the two types
3459 must be the same.
3460
3461 To simplify the parsing we introduce an `eop` which can record an
3462 expression operator, and the `CMPop` non-terminal will match one of them.
3463
3464 ###### ast
3465         struct eop {
3466                 enum Btype op;
3467         };
3468
3469 ###### ast functions
3470         static void free_eop(struct eop *e)
3471         {
3472                 if (e)
3473                         free(e);
3474         }
3475
3476 ###### Binode types
3477         Less,
3478         Gtr,
3479         LessEq,
3480         GtrEq,
3481         Eql,
3482         NEql,
3483
3484 ###### declare terminals
3485         $LEFT < > <= >= == != CMPop
3486
3487 ###### expression grammar
3488         | Expression CMPop Expression ${ {
3489                 struct binode *b = new(binode);
3490                 b->op = $2.op;
3491                 b->left = $<1;
3492                 b->right = $<3;
3493                 $0 = b;
3494         } }$
3495
3496 ###### Grammar
3497
3498         $eop
3499         CMPop ->  < ${ $0.op = Less; }$
3500         |         > ${ $0.op = Gtr; }$
3501         |         <= ${ $0.op = LessEq; }$
3502         |         >= ${ $0.op = GtrEq; }$
3503         |         == ${ $0.op = Eql; }$
3504         |         != ${ $0.op = NEql; }$
3505
3506 ###### print binode cases
3507
3508         case Less:
3509         case LessEq:
3510         case Gtr:
3511         case GtrEq:
3512         case Eql:
3513         case NEql:
3514                 if (bracket) printf("(");
3515                 print_exec(b->left, -1, bracket);
3516                 switch(b->op) {
3517                 case Less:   printf(" < "); break;
3518                 case LessEq: printf(" <= "); break;
3519                 case Gtr:    printf(" > "); break;
3520                 case GtrEq:  printf(" >= "); break;
3521                 case Eql:    printf(" == "); break;
3522                 case NEql:   printf(" != "); break;
3523                 default: abort();               // NOTEST
3524                 }
3525                 print_exec(b->right, -1, bracket);
3526                 if (bracket) printf(")");
3527                 break;
3528
3529 ###### propagate binode cases
3530         case Less:
3531         case LessEq:
3532         case Gtr:
3533         case GtrEq:
3534         case Eql:
3535         case NEql:
3536                 /* Both must match but not be labels, result is Tbool */
3537                 t = propagate_types(b->left, c, ok, NULL, Rnolabel);
3538                 if (t)
3539                         propagate_types(b->right, c, ok, t, 0);
3540                 else {
3541                         t = propagate_types(b->right, c, ok, NULL, Rnolabel);   // UNTESTED
3542                         if (t)  // UNTESTED
3543                                 t = propagate_types(b->left, c, ok, t, 0);      // UNTESTED
3544                 }
3545                 if (!type_compat(type, Tbool, 0))
3546                         type_err(c, "error: Comparison returns %1 but %2 expected", prog,
3547                                     Tbool, rules, type);
3548                 return Tbool;
3549
3550 ###### interp binode cases
3551         case Less:
3552         case LessEq:
3553         case Gtr:
3554         case GtrEq:
3555         case Eql:
3556         case NEql:
3557         {
3558                 int cmp;
3559                 left = interp_exec(c, b->left, &ltype);
3560                 right = interp_exec(c, b->right, &rtype);
3561                 cmp = value_cmp(ltype, rtype, &left, &right);
3562                 rvtype = Tbool;
3563                 switch (b->op) {
3564                 case Less:      rv.bool = cmp <  0; break;
3565                 case LessEq:    rv.bool = cmp <= 0; break;
3566                 case Gtr:       rv.bool = cmp >  0; break;
3567                 case GtrEq:     rv.bool = cmp >= 0; break;
3568                 case Eql:       rv.bool = cmp == 0; break;
3569                 case NEql:      rv.bool = cmp != 0; break;
3570                 default:        rv.bool = 0; break;     // NOTEST
3571                 }
3572                 break;
3573         }
3574
3575 ### Expressions: Arithmetic etc.
3576
3577 The remaining expressions with the highest precedence are arithmetic,
3578 string concatenation, and string conversion.  String concatenation
3579 (`++`) has the same precedence as multiplication and division, but lower
3580 than the uniary.
3581
3582 String conversion is a temporary feature until I get a better type
3583 system.  `$` is a prefix operator which expects a string and returns
3584 a number.
3585
3586 `+` and `-` are both infix and prefix operations (where they are
3587 absolute value and negation).  These have different operator names.
3588
3589 We also have a 'Bracket' operator which records where parentheses were
3590 found.  This makes it easy to reproduce these when printing.  Possibly I
3591 should only insert brackets were needed for precedence.  Putting
3592 parentheses around an expression converts it into a Term,
3593
3594 ###### Binode types
3595         Plus, Minus,
3596         Times, Divide, Rem,
3597         Concat,
3598         Absolute, Negate,
3599         StringConv,
3600         Bracket,
3601
3602 ###### declare terminals
3603         $LEFT + - Eop
3604         $LEFT * / % ++ Top
3605         $LEFT Uop $
3606         $TERM ( )
3607
3608 ###### expression grammar
3609         | Expression Eop Expression ${ {
3610                 struct binode *b = new(binode);
3611                 b->op = $2.op;
3612                 b->left = $<1;
3613                 b->right = $<3;
3614                 $0 = b;
3615         } }$
3616
3617         | Expression Top Expression ${ {
3618                 struct binode *b = new(binode);
3619                 b->op = $2.op;
3620                 b->left = $<1;
3621                 b->right = $<3;
3622                 $0 = b;
3623         } }$
3624
3625         | Uop Expression ${ {
3626                 struct binode *b = new(binode);
3627                 b->op = $1.op;
3628                 b->right = $<2;
3629                 $0 = b;
3630         } }$
3631
3632 ###### term grammar
3633
3634         | ( Expression ) ${ {
3635                 struct binode *b = new_pos(binode, $1);
3636                 b->op = Bracket;
3637                 b->right = $<2;
3638                 $0 = b;
3639         } }$
3640
3641 ###### Grammar
3642
3643         $eop
3644         Eop ->   + ${ $0.op = Plus; }$
3645         |        - ${ $0.op = Minus; }$
3646
3647         Uop ->   + ${ $0.op = Absolute; }$
3648         |        - ${ $0.op = Negate; }$
3649         |        $ ${ $0.op = StringConv; }$
3650
3651         Top ->   * ${ $0.op = Times; }$
3652         |        / ${ $0.op = Divide; }$
3653         |        % ${ $0.op = Rem; }$
3654         |        ++ ${ $0.op = Concat; }$
3655
3656 ###### print binode cases
3657         case Plus:
3658         case Minus:
3659         case Times:
3660         case Divide:
3661         case Concat:
3662         case Rem:
3663                 if (bracket) printf("(");
3664                 print_exec(b->left, indent, bracket);
3665                 switch(b->op) {
3666                 case Plus:   fputs(" + ", stdout); break;
3667                 case Minus:  fputs(" - ", stdout); break;
3668                 case Times:  fputs(" * ", stdout); break;
3669                 case Divide: fputs(" / ", stdout); break;
3670                 case Rem:    fputs(" % ", stdout); break;
3671                 case Concat: fputs(" ++ ", stdout); break;
3672                 default: abort();       // NOTEST
3673                 }                       // NOTEST
3674                 print_exec(b->right, indent, bracket);
3675                 if (bracket) printf(")");
3676                 break;
3677         case Absolute:
3678         case Negate:
3679         case StringConv:
3680                 if (bracket) printf("(");
3681                 switch (b->op) {
3682                 case Absolute:   fputs("+", stdout); break;
3683                 case Negate:     fputs("-", stdout); break;
3684                 case StringConv: fputs("$", stdout); break;
3685                 default: abort();       // NOTEST
3686                 }                       // NOTEST
3687                 print_exec(b->right, indent, bracket);
3688                 if (bracket) printf(")");
3689                 break;
3690         case Bracket:
3691                 printf("(");
3692                 print_exec(b->right, indent, bracket);
3693                 printf(")");
3694                 break;
3695
3696 ###### propagate binode cases
3697         case Plus:
3698         case Minus:
3699         case Times:
3700         case Rem:
3701         case Divide:
3702                 /* both must be numbers, result is Tnum */
3703         case Absolute:
3704         case Negate:
3705                 /* as propagate_types ignores a NULL,
3706                  * unary ops fit here too */
3707                 propagate_types(b->left, c, ok, Tnum, 0);
3708                 propagate_types(b->right, c, ok, Tnum, 0);
3709                 if (!type_compat(type, Tnum, 0))
3710                         type_err(c, "error: Arithmetic returns %1 but %2 expected", prog,
3711                                    Tnum, rules, type);
3712                 return Tnum;
3713
3714         case Concat:
3715                 /* both must be Tstr, result is Tstr */
3716                 propagate_types(b->left, c, ok, Tstr, 0);
3717                 propagate_types(b->right, c, ok, Tstr, 0);
3718                 if (!type_compat(type, Tstr, 0))
3719                         type_err(c, "error: Concat returns %1 but %2 expected", prog,
3720                                    Tstr, rules, type);
3721                 return Tstr;
3722
3723         case StringConv:
3724                 /* op must be string, result is number */
3725                 propagate_types(b->left, c, ok, Tstr, 0);
3726                 if (!type_compat(type, Tnum, 0))
3727                         type_err(c,     // UNTESTED
3728                           "error: Can only convert string to number, not %1",
3729                                 prog, type, 0, NULL);
3730                 return Tnum;
3731
3732         case Bracket:
3733                 return propagate_types(b->right, c, ok, type, 0);
3734
3735 ###### interp binode cases
3736
3737         case Plus:
3738                 rv = interp_exec(c, b->left, &rvtype);
3739                 right = interp_exec(c, b->right, &rtype);
3740                 mpq_add(rv.num, rv.num, right.num);
3741                 break;
3742         case Minus:
3743                 rv = interp_exec(c, b->left, &rvtype);
3744                 right = interp_exec(c, b->right, &rtype);
3745                 mpq_sub(rv.num, rv.num, right.num);
3746                 break;
3747         case Times:
3748                 rv = interp_exec(c, b->left, &rvtype);
3749                 right = interp_exec(c, b->right, &rtype);
3750                 mpq_mul(rv.num, rv.num, right.num);
3751                 break;
3752         case Divide:
3753                 rv = interp_exec(c, b->left, &rvtype);
3754                 right = interp_exec(c, b->right, &rtype);
3755                 mpq_div(rv.num, rv.num, right.num);
3756                 break;
3757         case Rem: {
3758                 mpz_t l, r, rem;
3759
3760                 left = interp_exec(c, b->left, &ltype);
3761                 right = interp_exec(c, b->right, &rtype);
3762                 mpz_init(l); mpz_init(r); mpz_init(rem);
3763                 mpz_tdiv_q(l, mpq_numref(left.num), mpq_denref(left.num));
3764                 mpz_tdiv_q(r, mpq_numref(right.num), mpq_denref(right.num));
3765                 mpz_tdiv_r(rem, l, r);
3766                 val_init(Tnum, &rv);
3767                 mpq_set_z(rv.num, rem);
3768                 mpz_clear(r); mpz_clear(l); mpz_clear(rem);
3769                 rvtype = ltype;
3770                 break;
3771         }
3772         case Negate:
3773                 rv = interp_exec(c, b->right, &rvtype);
3774                 mpq_neg(rv.num, rv.num);
3775                 break;
3776         case Absolute:
3777                 rv = interp_exec(c, b->right, &rvtype);
3778                 mpq_abs(rv.num, rv.num);
3779                 break;
3780         case Bracket:
3781                 rv = interp_exec(c, b->right, &rvtype);
3782                 break;
3783         case Concat:
3784                 left = interp_exec(c, b->left, &ltype);
3785                 right = interp_exec(c, b->right, &rtype);
3786                 rvtype = Tstr;
3787                 rv.str = text_join(left.str, right.str);
3788                 break;
3789         case StringConv:
3790                 right = interp_exec(c, b->right, &rvtype);
3791                 rtype = Tstr;
3792                 rvtype = Tnum;
3793
3794                 struct text tx = right.str;
3795                 char tail[3];
3796                 int neg = 0;
3797                 if (tx.txt[0] == '-') {
3798                         neg = 1;        // UNTESTED
3799                         tx.txt++;       // UNTESTED
3800                         tx.len--;       // UNTESTED
3801                 }
3802                 if (number_parse(rv.num, tail, tx) == 0)
3803                         mpq_init(rv.num);       // UNTESTED
3804                 else if (neg)
3805                         mpq_neg(rv.num, rv.num);        // UNTESTED
3806                 if (tail[0])
3807                         printf("Unsupported suffix: %.*s\n", tx.len, tx.txt);   // UNTESTED
3808
3809                 break;
3810
3811 ###### value functions
3812
3813         static struct text text_join(struct text a, struct text b)
3814         {
3815                 struct text rv;
3816                 rv.len = a.len + b.len;
3817                 rv.txt = malloc(rv.len);
3818                 memcpy(rv.txt, a.txt, a.len);
3819                 memcpy(rv.txt+a.len, b.txt, b.len);
3820                 return rv;
3821         }
3822
3823 ### Blocks, Statements, and Statement lists.
3824
3825 Now that we have expressions out of the way we need to turn to
3826 statements.  There are simple statements and more complex statements.
3827 Simple statements do not contain (syntactic) newlines, complex statements do.
3828
3829 Statements often come in sequences and we have corresponding simple
3830 statement lists and complex statement lists.
3831 The former comprise only simple statements separated by semicolons.
3832 The later comprise complex statements and simple statement lists.  They are
3833 separated by newlines.  Thus the semicolon is only used to separate
3834 simple statements on the one line.  This may be overly restrictive,
3835 but I'm not sure I ever want a complex statement to share a line with
3836 anything else.
3837
3838 Note that a simple statement list can still use multiple lines if
3839 subsequent lines are indented, so
3840
3841 ###### Example: wrapped simple statement list
3842
3843         a = b; c = d;
3844            e = f; print g
3845
3846 is a single simple statement list.  This might allow room for
3847 confusion, so I'm not set on it yet.
3848
3849 A simple statement list needs no extra syntax.  A complex statement
3850 list has two syntactic forms.  It can be enclosed in braces (much like
3851 C blocks), or it can be introduced by an indent and continue until an
3852 unindented newline (much like Python blocks).  With this extra syntax
3853 it is referred to as a block.
3854
3855 Note that a block does not have to include any newlines if it only
3856 contains simple statements.  So both of:
3857
3858         if condition: a=b; d=f
3859
3860         if condition { a=b; print f }
3861
3862 are valid.
3863
3864 In either case the list is constructed from a `binode` list with
3865 `Block` as the operator.  When parsing the list it is most convenient
3866 to append to the end, so a list is a list and a statement.  When using
3867 the list it is more convenient to consider a list to be a statement
3868 and a list.  So we need a function to re-order a list.
3869 `reorder_bilist` serves this purpose.
3870
3871 The only stand-alone statement we introduce at this stage is `pass`
3872 which does nothing and is represented as a `NULL` pointer in a `Block`
3873 list.  Other stand-alone statements will follow once the infrastructure
3874 is in-place.
3875
3876 As many statements will use binodes, we declare a binode pointer 'b' in
3877 the common header for all reductions to use.
3878
3879 ###### Parser: reduce
3880         struct binode *b;
3881
3882 ###### Binode types
3883         Block,
3884
3885 ###### Grammar
3886
3887         $TERM { } ;
3888
3889         $*binode
3890         Block -> { IN OptNL Statementlist OUT OptNL } ${ $0 = $<Sl; }$
3891         |        { SimpleStatements } ${ $0 = reorder_bilist($<SS); }$
3892         |        SimpleStatements ; ${ $0 = reorder_bilist($<SS); }$
3893         |        SimpleStatements EOL ${ $0 = reorder_bilist($<SS); }$
3894         |        IN OptNL Statementlist OUT ${ $0 = $<Sl; }$
3895
3896         OpenBlock -> OpenScope { IN OptNL Statementlist OUT OptNL } ${ $0 = $<Sl; }$
3897         |        OpenScope { SimpleStatements } ${ $0 = reorder_bilist($<SS); }$
3898         |        OpenScope SimpleStatements ; ${ $0 = reorder_bilist($<SS); }$
3899         |        OpenScope SimpleStatements EOL ${ $0 = reorder_bilist($<SS); }$
3900         |        IN OpenScope OptNL Statementlist OUT ${ $0 = $<Sl; }$
3901
3902         UseBlock -> { OpenScope IN OptNL Statementlist OUT OptNL } ${ $0 = $<Sl; }$
3903         |        { OpenScope SimpleStatements } ${ $0 = reorder_bilist($<SS); }$
3904         |        IN OpenScope OptNL Statementlist OUT ${ $0 = $<Sl; }$
3905
3906         ColonBlock -> { IN OptNL Statementlist OUT OptNL } ${ $0 = $<Sl; }$
3907         |        { SimpleStatements } ${ $0 = reorder_bilist($<SS); }$
3908         |        : SimpleStatements ; ${ $0 = reorder_bilist($<SS); }$
3909         |        : SimpleStatements EOL ${ $0 = reorder_bilist($<SS); }$
3910         |        : IN OptNL Statementlist OUT ${ $0 = $<Sl; }$
3911
3912         Statementlist -> ComplexStatements ${ $0 = reorder_bilist($<CS); }$
3913
3914         ComplexStatements -> ComplexStatements ComplexStatement ${
3915                 if ($2 == NULL) {
3916                         $0 = $<1;
3917                 } else {
3918                         $0 = new(binode);
3919                         $0->op = Block;
3920                         $0->left = $<1;
3921                         $0->right = $<2;
3922                 }
3923         }$
3924         | ComplexStatement ${
3925                 if ($1 == NULL) {
3926                         $0 = NULL;
3927                 } else {
3928                         $0 = new(binode);
3929                         $0->op = Block;
3930                         $0->left = NULL;
3931                         $0->right = $<1;
3932                 }
3933         }$
3934
3935         $*exec
3936         ComplexStatement -> SimpleStatements Newlines ${
3937                 $0 = reorder_bilist($<SS);
3938         }$
3939         |  SimpleStatements ; Newlines ${
3940                 $0 = reorder_bilist($<SS);
3941         }$
3942         ## ComplexStatement Grammar
3943
3944         $*binode
3945         SimpleStatements -> SimpleStatements ; SimpleStatement ${
3946                 $0 = new(binode);
3947                 $0->op = Block;
3948                 $0->left = $<1;
3949                 $0->right = $<3;
3950         }$
3951         | SimpleStatement ${
3952                 $0 = new(binode);
3953                 $0->op = Block;
3954                 $0->left = NULL;
3955                 $0->right = $<1;
3956         }$
3957
3958         $TERM pass
3959         $*exec
3960         SimpleStatement -> pass ${ $0 = NULL; }$
3961         | ERROR ${ tok_err(c, "Syntax error in statement", &$1); }$
3962         ## SimpleStatement Grammar
3963
3964 ###### print binode cases
3965         case Block:
3966                 if (indent < 0) {
3967                         // simple statement
3968                         if (b->left == NULL)    // UNTESTED
3969                                 printf("pass"); // UNTESTED
3970                         else
3971                                 print_exec(b->left, indent, bracket);   // UNTESTED
3972                         if (b->right) { // UNTESTED
3973                                 printf("; ");   // UNTESTED
3974                                 print_exec(b->right, indent, bracket);  // UNTESTED
3975                         }
3976                 } else {
3977                         // block, one per line
3978                         if (b->left == NULL)
3979                                 do_indent(indent, "pass\n");
3980                         else
3981                                 print_exec(b->left, indent, bracket);
3982                         if (b->right)
3983                                 print_exec(b->right, indent, bracket);
3984                 }
3985                 break;
3986
3987 ###### propagate binode cases
3988         case Block:
3989         {
3990                 /* If any statement returns something other than Tnone
3991                  * or Tbool then all such must return same type.
3992                  * As each statement may be Tnone or something else,
3993                  * we must always pass NULL (unknown) down, otherwise an incorrect
3994                  * error might occur.  We never return Tnone unless it is
3995                  * passed in.
3996                  */
3997                 struct binode *e;
3998
3999                 for (e = b; e; e = cast(binode, e->right)) {
4000                         t = propagate_types(e->left, c, ok, NULL, rules);
4001                         if ((rules & Rboolok) && (t == Tbool || t == Tnone))
4002                                 t = NULL;
4003                         if (t == Tnone && e->right)
4004                                 /* Only the final statement *must* return a value
4005                                  * when not Rboolok
4006                                  */
4007                                 t = NULL;
4008                         if (t) {
4009                                 if (!type)
4010                                         type = t;
4011                                 else if (t != type)
4012                                         type_err(c, "error: expected %1%r, found %2",
4013                                                  e->left, type, rules, t);
4014                         }
4015                 }
4016                 return type;
4017         }
4018
4019 ###### interp binode cases
4020         case Block:
4021                 while (rvtype == Tnone &&
4022                        b) {
4023                         if (b->left)
4024                                 rv = interp_exec(c, b->left, &rvtype);
4025                         b = cast(binode, b->right);
4026                 }
4027                 break;
4028
4029 ### The Print statement
4030
4031 `print` is a simple statement that takes a comma-separated list of
4032 expressions and prints the values separated by spaces and terminated
4033 by a newline.  No control of formatting is possible.
4034
4035 `print` uses `ExpressionList` to collect the expressions and stores them
4036 on the left side of a `Print` binode unlessthere is a trailing comma
4037 when the list is stored on the `right` side and no trailing newline is
4038 printed.
4039
4040 ###### Binode types
4041         Print,
4042
4043 ##### declare terminals
4044         $TERM print
4045
4046 ###### SimpleStatement Grammar
4047
4048         | print ExpressionList ${
4049                 $0 = b = new(binode);
4050                 b->op = Print;
4051                 b->right = NULL;
4052                 b->left = reorder_bilist($<EL);
4053         }$
4054         | print ExpressionList , ${ {
4055                 $0 = b = new(binode);
4056                 b->op = Print;
4057                 b->right = reorder_bilist($<EL);
4058                 b->left = NULL;
4059         } }$
4060         | print ${
4061                 $0 = b = new(binode);
4062                 b->op = Print;
4063                 b->left = NULL;
4064                 b->right = NULL;
4065         }$
4066
4067 ###### print binode cases
4068
4069         case Print:
4070                 do_indent(indent, "print");
4071                 if (b->right) {
4072                         print_exec(b->right, -1, bracket);
4073                         printf(",");
4074                 } else
4075                         print_exec(b->left, -1, bracket);
4076                 if (indent >= 0)
4077                         printf("\n");
4078                 break;
4079
4080 ###### propagate binode cases
4081
4082         case Print:
4083                 /* don't care but all must be consistent */
4084                 if (b->left)
4085                         b = cast(binode, b->left);
4086                 else
4087                         b = cast(binode, b->right);
4088                 while (b) {
4089                         propagate_types(b->left, c, ok, NULL, Rnolabel);
4090                         b = cast(binode, b->right);
4091                 }
4092                 break;
4093
4094 ###### interp binode cases
4095
4096         case Print:
4097         {
4098                 struct binode *b2 = cast(binode, b->left);
4099                 if (!b2)
4100                         b2 = cast(binode, b->right);
4101                 for (; b2; b2 = cast(binode, b2->right)) {
4102                         left = interp_exec(c, b2->left, &ltype);
4103                         print_value(ltype, &left, stdout);
4104                         free_value(ltype, &left);
4105                         if (b2->right)
4106                                 putchar(' ');
4107                 }
4108                 if (b->right == NULL)
4109                         printf("\n");
4110                 ltype = Tnone;
4111                 break;
4112         }
4113
4114 ###### Assignment statement
4115
4116 An assignment will assign a value to a variable, providing it hasn't
4117 been declared as a constant.  The analysis phase ensures that the type
4118 will be correct so the interpreter just needs to perform the
4119 calculation.  There is a form of assignment which declares a new
4120 variable as well as assigning a value.  If a name is assigned before
4121 it is declared, and error will be raised as the name is created as
4122 `Tlabel` and it is illegal to assign to such names.
4123
4124 ###### Binode types
4125         Assign,
4126         Declare,
4127
4128 ###### declare terminals
4129         $TERM =
4130
4131 ###### SimpleStatement Grammar
4132         | Term = Expression ${
4133                 $0 = b= new(binode);
4134                 b->op = Assign;
4135                 b->left = $<1;
4136                 b->right = $<3;
4137         }$
4138         | VariableDecl = Expression ${
4139                 $0 = b= new(binode);
4140                 b->op = Declare;
4141                 b->left = $<1;
4142                 b->right =$<3;
4143         }$
4144
4145         | VariableDecl ${
4146                 if ($1->var->where_set == NULL) {
4147                         type_err(c,
4148                                  "Variable declared with no type or value: %v",
4149                                  $1, NULL, 0, NULL);
4150                         free_var($1);
4151                 } else {
4152                         $0 = b = new(binode);
4153                         b->op = Declare;
4154                         b->left = $<1;
4155                         b->right = NULL;
4156                 }
4157         }$
4158
4159 ###### print binode cases
4160
4161         case Assign:
4162                 do_indent(indent, "");
4163                 print_exec(b->left, indent, bracket);
4164                 printf(" = ");
4165                 print_exec(b->right, indent, bracket);
4166                 if (indent >= 0)
4167                         printf("\n");
4168                 break;
4169
4170         case Declare:
4171                 {
4172                 struct variable *v = cast(var, b->left)->var;
4173                 do_indent(indent, "");
4174                 print_exec(b->left, indent, bracket);
4175                 if (cast(var, b->left)->var->constant) {
4176                         printf("::");
4177                         if (v->explicit_type) {
4178                                 type_print(v->type, stdout);
4179                                 printf(" ");
4180                         }
4181                 } else {
4182                         printf(":");
4183                         if (v->explicit_type) {
4184                                 type_print(v->type, stdout);
4185                                 printf(" ");
4186                         }
4187                 }
4188                 if (b->right) {
4189                         printf("= ");
4190                         print_exec(b->right, indent, bracket);
4191                 }
4192                 if (indent >= 0)
4193                         printf("\n");
4194                 }
4195                 break;
4196
4197 ###### propagate binode cases
4198
4199         case Assign:
4200         case Declare:
4201                 /* Both must match and not be labels,
4202                  * Type must support 'dup',
4203                  * For Assign, left must not be constant.
4204                  * result is Tnone
4205                  */
4206                 t = propagate_types(b->left, c, ok, NULL,
4207                                     Rnolabel | (b->op == Assign ? Rnoconstant : 0));
4208                 if (!b->right)
4209                         return Tnone;
4210
4211                 if (t) {
4212                         if (propagate_types(b->right, c, ok, t, 0) != t)
4213                                 if (b->left->type == Xvar)
4214                                         type_err(c, "info: variable '%v' was set as %1 here.",
4215                                                  cast(var, b->left)->var->where_set, t, rules, NULL);
4216                 } else {
4217                         t = propagate_types(b->right, c, ok, NULL, Rnolabel);
4218                         if (t)
4219                                 propagate_types(b->left, c, ok, t,
4220                                                 (b->op == Assign ? Rnoconstant : 0));
4221                 }
4222                 if (t && t->dup == NULL && t->name.txt[0] != ' ') // HACK
4223                         type_err(c, "error: cannot assign value of type %1", b, t, 0, NULL);
4224                 return Tnone;
4225
4226                 break;
4227
4228 ###### interp binode cases
4229
4230         case Assign:
4231                 lleft = linterp_exec(c, b->left, &ltype);
4232                 if (lleft)
4233                         dinterp_exec(c, b->right, lleft, ltype, 1);
4234                 ltype = Tnone;
4235                 break;
4236
4237         case Declare:
4238         {
4239                 struct variable *v = cast(var, b->left)->var;
4240                 struct value *val;
4241                 v = v->merged;
4242                 val = var_value(c, v);
4243                 if (v->type->prepare_type)
4244                         v->type->prepare_type(c, v->type, 0);
4245                 if (b->right)
4246                         dinterp_exec(c, b->right, val, v->type, 0);
4247                 else
4248                         val_init(v->type, val);
4249                 break;
4250         }
4251
4252 ### The `use` statement
4253
4254 The `use` statement is the last "simple" statement.  It is needed when a
4255 statement block can return a value.  This includes the body of a
4256 function which has a return type, and the "condition" code blocks in
4257 `if`, `while`, and `switch` statements.
4258
4259 ###### Binode types
4260         Use,
4261
4262 ###### declare terminals
4263         $TERM use
4264
4265 ###### SimpleStatement Grammar
4266         | use Expression ${
4267                 $0 = b = new_pos(binode, $1);
4268                 b->op = Use;
4269                 b->right = $<2;
4270                 if (b->right->type == Xvar) {
4271                         struct var *v = cast(var, b->right);
4272                         if (v->var->type == Tnone) {
4273                                 /* Convert this to a label */
4274                                 struct value *val;
4275
4276                                 v->var->type = Tlabel;
4277                                 val = global_alloc(c, Tlabel, v->var, NULL);
4278                                 val->label = val;
4279                         }
4280                 }
4281         }$
4282
4283 ###### print binode cases
4284
4285         case Use:
4286                 do_indent(indent, "use ");
4287                 print_exec(b->right, -1, bracket);
4288                 if (indent >= 0)
4289                         printf("\n");
4290                 break;
4291
4292 ###### propagate binode cases
4293
4294         case Use:
4295                 /* result matches value */
4296                 return propagate_types(b->right, c, ok, type, 0);
4297
4298 ###### interp binode cases
4299
4300         case Use:
4301                 rv = interp_exec(c, b->right, &rvtype);
4302                 break;
4303
4304 ### The Conditional Statement
4305
4306 This is the biggy and currently the only complex statement.  This
4307 subsumes `if`, `while`, `do/while`, `switch`, and some parts of `for`.
4308 It is comprised of a number of parts, all of which are optional though
4309 set combinations apply.  Each part is (usually) a key word (`then` is
4310 sometimes optional) followed by either an expression or a code block,
4311 except the `casepart` which is a "key word and an expression" followed
4312 by a code block.  The code-block option is valid for all parts and,
4313 where an expression is also allowed, the code block can use the `use`
4314 statement to report a value.  If the code block does not report a value
4315 the effect is similar to reporting `True`.
4316
4317 The `else` and `case` parts, as well as `then` when combined with
4318 `if`, can contain a `use` statement which will apply to some
4319 containing conditional statement. `for` parts, `do` parts and `then`
4320 parts used with `for` can never contain a `use`, except in some
4321 subordinate conditional statement.
4322
4323 If there is a `forpart`, it is executed first, only once.
4324 If there is a `dopart`, then it is executed repeatedly providing
4325 always that the `condpart` or `cond`, if present, does not return a non-True
4326 value.  `condpart` can fail to return any value if it simply executes
4327 to completion.  This is treated the same as returning `True`.
4328
4329 If there is a `thenpart` it will be executed whenever the `condpart`
4330 or `cond` returns True (or does not return any value), but this will happen
4331 *after* `dopart` (when present).
4332
4333 If `elsepart` is present it will be executed at most once when the
4334 condition returns `False` or some value that isn't `True` and isn't
4335 matched by any `casepart`.  If there are any `casepart`s, they will be
4336 executed when the condition returns a matching value.
4337
4338 The particular sorts of values allowed in case parts has not yet been
4339 determined in the language design, so nothing is prohibited.
4340
4341 The various blocks in this complex statement potentially provide scope
4342 for variables as described earlier.  Each such block must include the
4343 "OpenScope" nonterminal before parsing the block, and must call
4344 `var_block_close()` when closing the block.
4345
4346 The code following "`if`", "`switch`" and "`for`" does not get its own
4347 scope, but is in a scope covering the whole statement, so names
4348 declared there cannot be redeclared elsewhere.  Similarly the
4349 condition following "`while`" is in a scope the covers the body
4350 ("`do`" part) of the loop, and which does not allow conditional scope
4351 extension.  Code following "`then`" (both looping and non-looping),
4352 "`else`" and "`case`" each get their own local scope.
4353
4354 The type requirements on the code block in a `whilepart` are quite
4355 unusal.  It is allowed to return a value of some identifiable type, in
4356 which case the loop aborts and an appropriate `casepart` is run, or it
4357 can return a Boolean, in which case the loop either continues to the
4358 `dopart` (on `True`) or aborts and runs the `elsepart` (on `False`).
4359 This is different both from the `ifpart` code block which is expected to
4360 return a Boolean, or the `switchpart` code block which is expected to
4361 return the same type as the casepart values.  The correct analysis of
4362 the type of the `whilepart` code block is the reason for the
4363 `Rboolok` flag which is passed to `propagate_types()`.
4364
4365 The `cond_statement` cannot fit into a `binode` so a new `exec` is
4366 defined.  As there are two scopes which cover multiple parts - one for
4367 the whole statement and one for "while" and "do" - and as we will use
4368 the 'struct exec' to track scopes, we actually need two new types of
4369 exec.  One is a `binode` for the looping part, the rest is the
4370 `cond_statement`.  The `cond_statement` will use an auxilliary `struct
4371 casepart` to track a list of case parts.
4372
4373 ###### Binode types
4374         Loop
4375
4376 ###### exec type
4377         Xcond_statement,
4378
4379 ###### ast
4380         struct casepart {
4381                 struct exec *value;
4382                 struct exec *action;
4383                 struct casepart *next;
4384         };
4385         struct cond_statement {
4386                 struct exec;
4387                 struct exec *forpart, *condpart, *thenpart, *elsepart;
4388                 struct binode *looppart;
4389                 struct casepart *casepart;
4390         };
4391
4392 ###### ast functions
4393
4394         static void free_casepart(struct casepart *cp)
4395         {
4396                 while (cp) {
4397                         struct casepart *t;
4398                         free_exec(cp->value);
4399                         free_exec(cp->action);
4400                         t = cp->next;
4401                         free(cp);
4402                         cp = t;
4403                 }
4404         }
4405
4406         static void free_cond_statement(struct cond_statement *s)
4407         {
4408                 if (!s)
4409                         return;
4410                 free_exec(s->forpart);
4411                 free_exec(s->condpart);
4412                 free_exec(s->looppart);
4413                 free_exec(s->thenpart);
4414                 free_exec(s->elsepart);
4415                 free_casepart(s->casepart);
4416                 free(s);
4417         }
4418
4419 ###### free exec cases
4420         case Xcond_statement: free_cond_statement(cast(cond_statement, e)); break;
4421
4422 ###### ComplexStatement Grammar
4423         | CondStatement ${ $0 = $<1; }$
4424
4425 ###### declare terminals
4426         $TERM for then while do
4427         $TERM else
4428         $TERM switch case
4429
4430 ###### Grammar
4431
4432         $*cond_statement
4433         // A CondStatement must end with EOL, as does CondSuffix and
4434         // IfSuffix.
4435         // ForPart, ThenPart, SwitchPart, CasePart are non-empty and
4436         // may or may not end with EOL
4437         // WhilePart and IfPart include an appropriate Suffix
4438
4439         // ForPart, SwitchPart, and IfPart open scopes, o we have to close
4440         // them.  WhilePart opens and closes its own scope.
4441         CondStatement -> ForPart OptNL ThenPart OptNL WhilePart CondSuffix ${
4442                 $0 = $<CS;
4443                 $0->forpart = $<FP;
4444                 $0->thenpart = $<TP;
4445                 $0->looppart = $<WP;
4446                 var_block_close(c, CloseSequential, $0);
4447         }$
4448         | ForPart OptNL WhilePart CondSuffix ${
4449                 $0 = $<CS;
4450                 $0->forpart = $<FP;
4451                 $0->looppart = $<WP;
4452                 var_block_close(c, CloseSequential, $0);
4453         }$
4454         | WhilePart CondSuffix ${
4455                 $0 = $<CS;
4456                 $0->looppart = $<WP;
4457         }$
4458         | SwitchPart OptNL CasePart CondSuffix ${
4459                 $0 = $<CS;
4460                 $0->condpart = $<SP;
4461                 $CP->next = $0->casepart;
4462                 $0->casepart = $<CP;
4463                 var_block_close(c, CloseSequential, $0);
4464         }$
4465         | SwitchPart : IN OptNL CasePart CondSuffix OUT Newlines ${
4466                 $0 = $<CS;
4467                 $0->condpart = $<SP;
4468                 $CP->next = $0->casepart;
4469                 $0->casepart = $<CP;
4470                 var_block_close(c, CloseSequential, $0);
4471         }$
4472         | IfPart IfSuffix ${
4473                 $0 = $<IS;
4474                 $0->condpart = $IP.condpart; $IP.condpart = NULL;
4475                 $0->thenpart = $IP.thenpart; $IP.thenpart = NULL;
4476                 // This is where we close an "if" statement
4477                 var_block_close(c, CloseSequential, $0);
4478         }$
4479
4480         CondSuffix -> IfSuffix ${
4481                 $0 = $<1;
4482         }$
4483         | Newlines CasePart CondSuffix ${
4484                 $0 = $<CS;
4485                 $CP->next = $0->casepart;
4486                 $0->casepart = $<CP;
4487         }$
4488         | CasePart CondSuffix ${
4489                 $0 = $<CS;
4490                 $CP->next = $0->casepart;
4491                 $0->casepart = $<CP;
4492         }$
4493
4494         IfSuffix -> Newlines ${ $0 = new(cond_statement); }$
4495         | Newlines ElsePart ${ $0 = $<EP; }$
4496         | ElsePart ${$0 = $<EP; }$
4497
4498         ElsePart -> else OpenBlock Newlines ${
4499                 $0 = new(cond_statement);
4500                 $0->elsepart = $<OB;
4501                 var_block_close(c, CloseElse, $0->elsepart);
4502         }$
4503         | else OpenScope CondStatement ${
4504                 $0 = new(cond_statement);
4505                 $0->elsepart = $<CS;
4506                 var_block_close(c, CloseElse, $0->elsepart);
4507         }$
4508
4509         $*casepart
4510         CasePart -> case Expression OpenScope ColonBlock ${
4511                 $0 = calloc(1,sizeof(struct casepart));
4512                 $0->value = $<Ex;
4513                 $0->action = $<Bl;
4514                 var_block_close(c, CloseParallel, $0->action);
4515         }$
4516
4517         $*exec
4518         // These scopes are closed in CondStatement
4519         ForPart -> for OpenBlock ${
4520                 $0 = $<Bl;
4521         }$
4522
4523         ThenPart -> then OpenBlock ${
4524                 $0 = $<OB;
4525                 var_block_close(c, CloseSequential, $0);
4526         }$
4527
4528         $*binode
4529         // This scope is closed in CondStatement
4530         WhilePart -> while UseBlock OptNL do OpenBlock ${
4531                 $0 = new(binode);
4532                 $0->op = Loop;
4533                 $0->left = $<UB;
4534                 $0->right = $<OB;
4535                 var_block_close(c, CloseSequential, $0->right);
4536                 var_block_close(c, CloseSequential, $0);
4537         }$
4538         | while OpenScope Expression OpenScope ColonBlock ${
4539                 $0 = new(binode);
4540                 $0->op = Loop;
4541                 $0->left = $<Exp;
4542                 $0->right = $<CB;
4543                 var_block_close(c, CloseSequential, $0->right);
4544                 var_block_close(c, CloseSequential, $0);
4545         }$
4546
4547         $cond_statement
4548         IfPart -> if UseBlock OptNL then OpenBlock ${
4549                 $0.condpart = $<UB;
4550                 $0.thenpart = $<OB;
4551                 var_block_close(c, CloseParallel, $0.thenpart);
4552         }$
4553         | if OpenScope Expression OpenScope ColonBlock ${
4554                 $0.condpart = $<Ex;
4555                 $0.thenpart = $<CB;
4556                 var_block_close(c, CloseParallel, $0.thenpart);
4557         }$
4558         | if OpenScope Expression OpenScope OptNL then Block ${
4559                 $0.condpart = $<Ex;
4560                 $0.thenpart = $<Bl;
4561                 var_block_close(c, CloseParallel, $0.thenpart);
4562         }$
4563
4564         $*exec
4565         // This scope is closed in CondStatement
4566         SwitchPart -> switch OpenScope Expression ${
4567                 $0 = $<Ex;
4568         }$
4569         | switch UseBlock ${
4570                 $0 = $<Bl;
4571         }$
4572
4573 ###### print binode cases
4574         case Loop:
4575                 if (b->left && b->left->type == Xbinode &&
4576                     cast(binode, b->left)->op == Block) {
4577                         if (bracket)
4578                                 do_indent(indent, "while {\n");
4579                         else
4580                                 do_indent(indent, "while\n");
4581                         print_exec(b->left, indent+1, bracket);
4582                         if (bracket)
4583                                 do_indent(indent, "} do {\n");
4584                         else
4585                                 do_indent(indent, "do\n");
4586                         print_exec(b->right, indent+1, bracket);
4587                         if (bracket)
4588                                 do_indent(indent, "}\n");
4589                 } else {
4590                         do_indent(indent, "while ");
4591                         print_exec(b->left, 0, bracket);
4592                         if (bracket)
4593                                 printf(" {\n");
4594                         else
4595                                 printf(":\n");
4596                         print_exec(b->right, indent+1, bracket);
4597                         if (bracket)
4598                                 do_indent(indent, "}\n");
4599                 }
4600                 break;
4601
4602 ###### print exec cases
4603
4604         case Xcond_statement:
4605         {
4606                 struct cond_statement *cs = cast(cond_statement, e);
4607                 struct casepart *cp;
4608                 if (cs->forpart) {
4609                         do_indent(indent, "for");
4610                         if (bracket) printf(" {\n"); else printf("\n");
4611                         print_exec(cs->forpart, indent+1, bracket);
4612                         if (cs->thenpart) {
4613                                 if (bracket)
4614                                         do_indent(indent, "} then {\n");
4615                                 else
4616                                         do_indent(indent, "then\n");
4617                                 print_exec(cs->thenpart, indent+1, bracket);
4618                         }
4619                         if (bracket) do_indent(indent, "}\n");
4620                 }
4621                 if (cs->looppart) {
4622                         print_exec(cs->looppart, indent, bracket);
4623                 } else {
4624                         // a condition
4625                         if (cs->casepart)
4626                                 do_indent(indent, "switch");
4627                         else
4628                                 do_indent(indent, "if");
4629                         if (cs->condpart && cs->condpart->type == Xbinode &&
4630                             cast(binode, cs->condpart)->op == Block) {
4631                                 if (bracket)
4632                                         printf(" {\n");
4633                                 else
4634                                         printf("\n");
4635                                 print_exec(cs->condpart, indent+1, bracket);
4636                                 if (bracket)
4637                                         do_indent(indent, "}\n");
4638                                 if (cs->thenpart) {
4639                                         do_indent(indent, "then\n");
4640                                         print_exec(cs->thenpart, indent+1, bracket);
4641                                 }
4642                         } else {
4643                                 printf(" ");
4644                                 print_exec(cs->condpart, 0, bracket);
4645                                 if (cs->thenpart) {
4646                                         if (bracket)
4647                                                 printf(" {\n");
4648                                         else
4649                                                 printf(":\n");
4650                                         print_exec(cs->thenpart, indent+1, bracket);
4651                                         if (bracket)
4652                                                 do_indent(indent, "}\n");
4653                                 } else
4654                                         printf("\n");
4655                         }
4656                 }
4657                 for (cp = cs->casepart; cp; cp = cp->next) {
4658                         do_indent(indent, "case ");
4659                         print_exec(cp->value, -1, 0);
4660                         if (bracket)
4661                                 printf(" {\n");
4662                         else
4663                                 printf(":\n");
4664                         print_exec(cp->action, indent+1, bracket);
4665                         if (bracket)
4666                                 do_indent(indent, "}\n");
4667                 }
4668                 if (cs->elsepart) {
4669                         do_indent(indent, "else");
4670                         if (bracket)
4671                                 printf(" {\n");
4672                         else
4673                                 printf("\n");
4674                         print_exec(cs->elsepart, indent+1, bracket);
4675                         if (bracket)
4676                                 do_indent(indent, "}\n");
4677                 }
4678                 break;
4679         }
4680
4681 ###### propagate binode cases
4682         case Loop:
4683                 t = propagate_types(b->right, c, ok, Tnone, 0);
4684                 if (!type_compat(Tnone, t, 0))
4685                         *ok = 0;        // UNTESTED
4686                 return propagate_types(b->left, c, ok, type, rules);
4687
4688 ###### propagate exec cases
4689         case Xcond_statement:
4690         {
4691                 // forpart and looppart->right must return Tnone
4692                 // thenpart must return Tnone if there is a loopart,
4693                 // otherwise it is like elsepart.
4694                 // condpart must:
4695                 //    be bool if there is no casepart
4696                 //    match casepart->values if there is a switchpart
4697                 //    either be bool or match casepart->value if there
4698                 //             is a whilepart
4699                 // elsepart and casepart->action must match the return type
4700                 //   expected of this statement.
4701                 struct cond_statement *cs = cast(cond_statement, prog);
4702                 struct casepart *cp;
4703
4704                 t = propagate_types(cs->forpart, c, ok, Tnone, 0);
4705                 if (!type_compat(Tnone, t, 0))
4706                         *ok = 0;        // UNTESTED
4707
4708                 if (cs->looppart) {
4709                         t = propagate_types(cs->thenpart, c, ok, Tnone, 0);
4710                         if (!type_compat(Tnone, t, 0))
4711                                 *ok = 0;        // UNTESTED
4712                 }
4713                 if (cs->casepart == NULL) {
4714                         propagate_types(cs->condpart, c, ok, Tbool, 0);
4715                         propagate_types(cs->looppart, c, ok, Tbool, 0);
4716                 } else {
4717                         /* Condpart must match case values, with bool permitted */
4718                         t = NULL;
4719                         for (cp = cs->casepart;
4720                              cp && !t; cp = cp->next)
4721                                 t = propagate_types(cp->value, c, ok, NULL, 0);
4722                         if (!t && cs->condpart)
4723                                 t = propagate_types(cs->condpart, c, ok, NULL, Rboolok);        // UNTESTED
4724                         if (!t && cs->looppart)
4725                                 t = propagate_types(cs->looppart, c, ok, NULL, Rboolok);        // UNTESTED
4726                         // Now we have a type (I hope) push it down
4727                         if (t) {
4728                                 for (cp = cs->casepart; cp; cp = cp->next)
4729                                         propagate_types(cp->value, c, ok, t, 0);
4730                                 propagate_types(cs->condpart, c, ok, t, Rboolok);
4731                                 propagate_types(cs->looppart, c, ok, t, Rboolok);
4732                         }
4733                 }
4734                 // (if)then, else, and case parts must return expected type.
4735                 if (!cs->looppart && !type)
4736                         type = propagate_types(cs->thenpart, c, ok, NULL, rules);
4737                 if (!type)
4738                         type = propagate_types(cs->elsepart, c, ok, NULL, rules);
4739                 for (cp = cs->casepart;
4740                      cp && !type;
4741                      cp = cp->next)     // UNTESTED
4742                         type = propagate_types(cp->action, c, ok, NULL, rules); // UNTESTED
4743                 if (type) {
4744                         if (!cs->looppart)
4745                                 propagate_types(cs->thenpart, c, ok, type, rules);
4746                         propagate_types(cs->elsepart, c, ok, type, rules);
4747                         for (cp = cs->casepart; cp ; cp = cp->next)
4748                                 propagate_types(cp->action, c, ok, type, rules);
4749                         return type;
4750                 } else
4751                         return NULL;
4752         }
4753
4754 ###### interp binode cases
4755         case Loop:
4756                 // This just performs one iterration of the loop
4757                 rv = interp_exec(c, b->left, &rvtype);
4758                 if (rvtype == Tnone ||
4759                     (rvtype == Tbool && rv.bool != 0))
4760                         // rvtype is Tnone or Tbool, doesn't need to be freed
4761                         interp_exec(c, b->right, NULL);
4762                 break;
4763
4764 ###### interp exec cases
4765         case Xcond_statement:
4766         {
4767                 struct value v, cnd;
4768                 struct type *vtype, *cndtype;
4769                 struct casepart *cp;
4770                 struct cond_statement *cs = cast(cond_statement, e);
4771
4772                 if (cs->forpart)
4773                         interp_exec(c, cs->forpart, NULL);
4774                 if (cs->looppart) {
4775                         while ((cnd = interp_exec(c, cs->looppart, &cndtype)),
4776                                cndtype == Tnone || (cndtype == Tbool && cnd.bool != 0))
4777                                 interp_exec(c, cs->thenpart, NULL);
4778                 } else {
4779                         cnd = interp_exec(c, cs->condpart, &cndtype);
4780                         if ((cndtype == Tnone ||
4781                             (cndtype == Tbool && cnd.bool != 0))) {
4782                                 // cnd is Tnone or Tbool, doesn't need to be freed
4783                                 rv = interp_exec(c, cs->thenpart, &rvtype);
4784                                 // skip else (and cases)
4785                                 goto Xcond_done;
4786                         }
4787                 }
4788                 for (cp = cs->casepart; cp; cp = cp->next) {
4789                         v = interp_exec(c, cp->value, &vtype);
4790                         if (value_cmp(cndtype, vtype, &v, &cnd) == 0) {
4791                                 free_value(vtype, &v);
4792                                 free_value(cndtype, &cnd);
4793                                 rv = interp_exec(c, cp->action, &rvtype);
4794                                 goto Xcond_done;
4795                         }
4796                         free_value(vtype, &v);
4797                 }
4798                 free_value(cndtype, &cnd);
4799                 if (cs->elsepart)
4800                         rv = interp_exec(c, cs->elsepart, &rvtype);
4801                 else
4802                         rvtype = Tnone;
4803         Xcond_done:
4804                 break;
4805         }
4806
4807 ### Top level structure
4808
4809 All the language elements so far can be used in various places.  Now
4810 it is time to clarify what those places are.
4811
4812 At the top level of a file there will be a number of declarations.
4813 Many of the things that can be declared haven't been described yet,
4814 such as functions, procedures, imports, and probably more.
4815 For now there are two sorts of things that can appear at the top
4816 level.  They are predefined constants, `struct` types, and the `main`
4817 function.  While the syntax will allow the `main` function to appear
4818 multiple times, that will trigger an error if it is actually attempted.
4819
4820 The various declarations do not return anything.  They store the
4821 various declarations in the parse context.
4822
4823 ###### Parser: grammar
4824
4825         $void
4826         Ocean -> OptNL DeclarationList
4827
4828         ## declare terminals
4829
4830         OptNL ->
4831         | OptNL NEWLINE
4832
4833         Newlines -> NEWLINE
4834         | Newlines NEWLINE
4835
4836         DeclarationList -> Declaration
4837         | DeclarationList Declaration
4838
4839         Declaration -> ERROR Newlines ${
4840                 tok_err(c,      // UNTESTED
4841                         "error: unhandled parse error", &$1);
4842         }$
4843         | DeclareConstant
4844         | DeclareFunction
4845         | DeclareStruct
4846
4847         ## top level grammar
4848
4849         ## Grammar
4850
4851 ### The `const` section
4852
4853 As well as being defined in with the code that uses them, constants
4854 can be declared at the top level.  These have full-file scope, so they
4855 are always `InScope`.  The value of a top level constant can be given
4856 as an expression, and this is evaluated immediately rather than in the
4857 later interpretation stage.  Once we add functions to the language, we
4858 will need rules concern which, if any, can be used to define a top
4859 level constant.
4860
4861 Constants are defined in a section that starts with the reserved word
4862 `const` and then has a block with a list of assignment statements.
4863 For syntactic consistency, these must use the double-colon syntax to
4864 make it clear that they are constants.  Type can also be given: if
4865 not, the type will be determined during analysis, as with other
4866 constants.
4867
4868 As the types constants are inserted at the head of a list, printing
4869 them in the same order that they were read is not straight forward.
4870 We take a quadratic approach here and count the number of constants
4871 (variables of depth 0), then count down from there, each time
4872 searching through for the Nth constant for decreasing N.
4873
4874 ###### top level grammar
4875
4876         $TERM const
4877
4878         DeclareConstant -> const { IN OptNL ConstList OUT OptNL } Newlines
4879         | const { SimpleConstList } Newlines
4880         | const IN OptNL ConstList OUT Newlines
4881         | const SimpleConstList Newlines
4882
4883         ConstList -> ConstList SimpleConstLine
4884         | SimpleConstLine
4885
4886         SimpleConstList -> SimpleConstList ; Const
4887         | Const
4888         | SimpleConstList ;
4889
4890         SimpleConstLine -> SimpleConstList Newlines
4891         | ERROR Newlines ${ tok_err(c, "Syntax error in constant", &$1); }$
4892
4893         $*type
4894         CType -> Type   ${ $0 = $<1; }$
4895         |               ${ $0 = NULL; }$
4896
4897         $void
4898         Const -> IDENTIFIER :: CType = Expression ${ {
4899                 int ok;
4900                 struct variable *v;
4901
4902                 v = var_decl(c, $1.txt);
4903                 if (v) {
4904                         struct var *var = new_pos(var, $1);
4905                         v->where_decl = var;
4906                         v->where_set = var;
4907                         var->var = v;
4908                         v->constant = 1;
4909                         v->global = 1;
4910                 } else {
4911                         struct variable *vorig = var_ref(c, $1.txt);
4912                         tok_err(c, "error: name already declared", &$1);
4913                         type_err(c, "info: this is where '%v' was first declared",
4914                                  vorig->where_decl, NULL, 0, NULL);
4915                 }
4916                 do {
4917                         ok = 1;
4918                         propagate_types($5, c, &ok, $3, 0);
4919                 } while (ok == 2);
4920                 if (!ok)
4921                         c->parse_error = 1;
4922                 else if (v) {
4923                         struct value res = interp_exec(c, $5, &v->type);
4924                         global_alloc(c, v->type, v, &res);
4925                 }
4926         } }$
4927
4928 ###### print const decls
4929         {
4930                 struct variable *v;
4931                 int target = -1;
4932
4933                 while (target != 0) {
4934                         int i = 0;
4935                         for (v = context.in_scope; v; v=v->in_scope)
4936                                 if (v->depth == 0 && v->constant) {
4937                                         i += 1;
4938                                         if (i == target)
4939                                                 break;
4940                                 }
4941
4942                         if (target == -1) {
4943                                 if (i)
4944                                         printf("const\n");
4945                                 target = i;
4946                         } else {
4947                                 struct value *val = var_value(&context, v);
4948                                 printf("    %.*s :: ", v->name->name.len, v->name->name.txt);
4949                                 type_print(v->type, stdout);
4950                                 printf(" = ");
4951                                 if (v->type == Tstr)
4952                                         printf("\"");
4953                                 print_value(v->type, val, stdout);
4954                                 if (v->type == Tstr)
4955                                         printf("\"");
4956                                 printf("\n");
4957                                 target -= 1;
4958                         }
4959                 }
4960         }
4961
4962 ### Function declarations
4963
4964 The code in an Ocean program is all stored in function declarations.
4965 One of the functions must be named `main` and it must accept an array of
4966 strings as a parameter - the command line arguments.
4967
4968 As this is the top level, several things are handled a bit differently.
4969 The function is not interpreted by `interp_exec` as that isn't passed
4970 the argument list which the program requires.  Similarly type analysis
4971 is a bit more interesting at this level.
4972
4973 ###### ast functions
4974
4975         static struct type *handle_results(struct parse_context *c,
4976                                            struct binode *results)
4977         {
4978                 /* Create a 'struct' type from the results list, which
4979                  * is a list for 'struct var'
4980                  */
4981                 struct type *t = add_anon_type(c, &structure_prototype,
4982                                                " function result");
4983                 int cnt = 0;
4984                 struct binode *b;
4985
4986                 for (b = results; b; b = cast(binode, b->right))
4987                         cnt += 1;
4988                 t->structure.nfields = cnt;
4989                 t->structure.fields = calloc(cnt, sizeof(struct field));
4990                 cnt = 0;
4991                 for (b = results; b; b = cast(binode, b->right)) {
4992                         struct var *v = cast(var, b->left);
4993                         struct field *f = &t->structure.fields[cnt++];
4994                         int a = v->var->type->align;
4995                         f->name = v->var->name->name;
4996                         f->type = v->var->type;
4997                         f->init = NULL;
4998                         f->offset = t->size;
4999                         v->var->frame_pos = f->offset;
5000                         t->size += ((f->type->size - 1) | (a-1)) + 1;
5001                         if (a > t->align)
5002                                 t->align = a;
5003                         variable_unlink_exec(v->var);
5004                 }
5005                 free_binode(results);
5006                 return t;
5007         }
5008
5009         static struct variable *declare_function(struct parse_context *c,
5010                                                 struct variable *name,
5011                                                 struct binode *args,
5012                                                 struct type *ret,
5013                                                 struct binode *results,
5014                                                 struct exec *code)
5015         {
5016                 if (name) {
5017                         struct value fn = {.function = code};
5018                         struct type *t;
5019                         var_block_close(c, CloseFunction, code);
5020                         t = add_anon_type(c, &function_prototype,
5021                                           "func %.*s", name->name->name.len,
5022                                           name->name->name.txt);
5023                         name->type = t;
5024                         t->function.params = reorder_bilist(args);
5025                         if (!ret) {
5026                                 ret = handle_results(c, reorder_bilist(results));
5027                                 t->function.inline_result = 1;
5028                                 t->function.local_size = ret->size;
5029                         }
5030                         t->function.return_type = ret;
5031                         global_alloc(c, t, name, &fn);
5032                         name->type->function.scope = c->out_scope;
5033                 } else {
5034                         free_binode(args);
5035                         free_type(ret);
5036                         free_exec(code);
5037                         var_block_close(c, CloseFunction, NULL);
5038                 }
5039                 c->out_scope = NULL;
5040                 return name;
5041         }
5042
5043 ###### declare terminals
5044         $TERM return
5045
5046 ###### top level grammar
5047
5048         $*variable
5049         DeclareFunction -> func FuncName ( OpenScope ArgsLine ) Block Newlines ${
5050                 $0 = declare_function(c, $<FN, $<Ar, Tnone, NULL, $<Bl);
5051         }$
5052         | func FuncName IN OpenScope Args OUT OptNL do Block Newlines ${
5053                 $0 = declare_function(c, $<FN, $<Ar, Tnone, NULL, $<Bl);
5054         }$
5055         | func FuncName NEWLINE OpenScope OptNL do Block Newlines ${
5056                 $0 = declare_function(c, $<FN, NULL, Tnone, NULL, $<Bl);
5057         }$
5058         | func FuncName ( OpenScope ArgsLine ) : Type Block Newlines ${
5059                 $0 = declare_function(c, $<FN, $<Ar, $<Ty, NULL, $<Bl);
5060         }$
5061         | func FuncName ( OpenScope ArgsLine ) : ( ArgsLine ) Block Newlines ${
5062                 $0 = declare_function(c, $<FN, $<AL, NULL, $<AL2, $<Bl);
5063         }$
5064         | func FuncName IN OpenScope Args OUT OptNL return Type Newlines do Block Newlines ${
5065                 $0 = declare_function(c, $<FN, $<Ar, $<Ty, NULL, $<Bl);
5066         }$
5067         | func FuncName NEWLINE OpenScope return Type Newlines do Block Newlines ${
5068                 $0 = declare_function(c, $<FN, NULL, $<Ty, NULL, $<Bl);
5069         }$
5070         | func FuncName IN OpenScope Args OUT OptNL return IN Args OUT OptNL do Block Newlines ${
5071                 $0 = declare_function(c, $<FN, $<Ar, NULL, $<Ar2, $<Bl);
5072         }$
5073         | func FuncName NEWLINE OpenScope return IN Args OUT OptNL do Block Newlines ${
5074                 $0 = declare_function(c, $<FN, NULL, NULL, $<Ar, $<Bl);
5075         }$
5076
5077 ###### print func decls
5078         {
5079                 struct variable *v;
5080                 int target = -1;
5081
5082                 while (target != 0) {
5083                         int i = 0;
5084                         for (v = context.in_scope; v; v=v->in_scope)
5085                                 if (v->depth == 0 && v->type && v->type->check_args) {
5086                                         i += 1;
5087                                         if (i == target)
5088                                                 break;
5089                                 }
5090
5091                         if (target == -1) {
5092                                 target = i;
5093                         } else {
5094                                 struct value *val = var_value(&context, v);
5095                                 printf("func %.*s", v->name->name.len, v->name->name.txt);
5096                                 v->type->print_type_decl(v->type, stdout);
5097                                 if (brackets)
5098                                         print_exec(val->function, 0, brackets);
5099                                 else
5100                                         print_value(v->type, val, stdout);
5101                                 printf("/* frame size %d */\n", v->type->function.local_size);
5102                                 target -= 1;
5103                         }
5104                 }
5105         }
5106
5107 ###### core functions
5108
5109         static int analyse_funcs(struct parse_context *c)
5110         {
5111                 struct variable *v;
5112                 int all_ok = 1;
5113                 for (v = c->in_scope; v; v = v->in_scope) {
5114                         struct value *val;
5115                         struct type *ret;
5116                         int ok = 1;
5117                         if (v->depth != 0 || !v->type || !v->type->check_args)
5118                                 continue;
5119                         ret = v->type->function.inline_result ?
5120                                 Tnone : v->type->function.return_type;
5121                         val = var_value(c, v);
5122                         do {
5123                                 ok = 1;
5124                                 propagate_types(val->function, c, &ok, ret, 0);
5125                         } while (ok == 2);
5126                         if (ok)
5127                                 /* Make sure everything is still consistent */
5128                                 propagate_types(val->function, c, &ok, ret, 0);
5129                         if (!ok)
5130                                 all_ok = 0;
5131                         if (!v->type->function.inline_result &&
5132                             !v->type->function.return_type->dup) {
5133                                 type_err(c, "error: function cannot return value of type %1",
5134                                          v->where_decl, v->type->function.return_type, 0, NULL);
5135                         }
5136
5137                         scope_finalize(c, v->type);
5138                 }
5139                 return all_ok;
5140         }
5141
5142         static int analyse_main(struct type *type, struct parse_context *c)
5143         {
5144                 struct binode *bp = type->function.params;
5145                 struct binode *b;
5146                 int ok = 1;
5147                 int arg = 0;
5148                 struct type *argv_type;
5149
5150                 argv_type = add_anon_type(c, &array_prototype, "argv");
5151                 argv_type->array.member = Tstr;
5152                 argv_type->array.unspec = 1;
5153
5154                 for (b = bp; b; b = cast(binode, b->right)) {
5155                         ok = 1;
5156                         switch (arg++) {
5157                         case 0: /* argv */
5158                                 propagate_types(b->left, c, &ok, argv_type, 0);
5159                                 break;
5160                         default: /* invalid */  // NOTEST
5161                                 propagate_types(b->left, c, &ok, Tnone, 0);     // NOTEST
5162                         }
5163                         if (!ok)
5164                                 c->parse_error = 1;
5165                 }
5166
5167                 return !c->parse_error;
5168         }
5169
5170         static void interp_main(struct parse_context *c, int argc, char **argv)
5171         {
5172                 struct value *progp = NULL;
5173                 struct text main_name = { "main", 4 };
5174                 struct variable *mainv;
5175                 struct binode *al;
5176                 int anum = 0;
5177                 struct value v;
5178                 struct type *vtype;
5179
5180                 mainv = var_ref(c, main_name);
5181                 if (mainv)
5182                         progp = var_value(c, mainv);
5183                 if (!progp || !progp->function) {
5184                         fprintf(stderr, "oceani: no main function found.\n");
5185                         c->parse_error = 1;
5186                         return;
5187                 }
5188                 if (!analyse_main(mainv->type, c)) {
5189                         fprintf(stderr, "oceani: main has wrong type.\n");
5190                         c->parse_error = 1;
5191                         return;
5192                 }
5193                 al = mainv->type->function.params;
5194
5195                 c->local_size = mainv->type->function.local_size;
5196                 c->local = calloc(1, c->local_size);
5197                 while (al) {
5198                         struct var *v = cast(var, al->left);
5199                         struct value *vl = var_value(c, v->var);
5200                         struct value arg;
5201                         struct type *t;
5202                         mpq_t argcq;
5203                         int i;
5204
5205                         switch (anum++) {
5206                         case 0: /* argv */
5207                                 t = v->var->type;
5208                                 mpq_init(argcq);
5209                                 mpq_set_ui(argcq, argc, 1);
5210                                 memcpy(var_value(c, t->array.vsize), &argcq, sizeof(argcq));
5211                                 t->prepare_type(c, t, 0);
5212                                 array_init(v->var->type, vl);
5213                                 for (i = 0; i < argc; i++) {
5214                                         struct value *vl2 = vl->array + i * v->var->type->array.member->size;
5215
5216                                         arg.str.txt = argv[i];
5217                                         arg.str.len = strlen(argv[i]);
5218                                         free_value(Tstr, vl2);
5219                                         dup_value(Tstr, &arg, vl2);
5220                                 }
5221                                 break;
5222                         }
5223                         al = cast(binode, al->right);
5224                 }
5225                 v = interp_exec(c, progp->function, &vtype);
5226                 free_value(vtype, &v);
5227                 free(c->local);
5228                 c->local = NULL;
5229         }
5230
5231 ###### ast functions
5232         void free_variable(struct variable *v)
5233         {
5234         }
5235
5236 ## And now to test it out.
5237
5238 Having a language requires having a "hello world" program.  I'll
5239 provide a little more than that: a program that prints "Hello world"
5240 finds the GCD of two numbers, prints the first few elements of
5241 Fibonacci, performs a binary search for a number, and a few other
5242 things which will likely grow as the languages grows.
5243
5244 ###### File: oceani.mk
5245         demos :: sayhello
5246         sayhello : oceani
5247                 @echo "===== DEMO ====="
5248                 ./oceani --section "demo: hello" oceani.mdc 55 33
5249
5250 ###### demo: hello
5251
5252         const
5253                 pi ::= 3.141_592_6
5254                 four ::= 2 + 2 ; five ::= 10/2
5255         const pie ::= "I like Pie";
5256                 cake ::= "The cake is"
5257                   ++ " a lie"
5258
5259         struct fred
5260                 size:[four]number
5261                 name:string
5262                 alive:Boolean
5263
5264         func main(argv:[argc::]string)
5265                 print "Hello World, what lovely oceans you have!"
5266                 print "Are there", five, "?"
5267                 print pi, pie, "but", cake
5268
5269                 A := $argv[1]; B := $argv[2]
5270
5271                 /* When a variable is defined in both branches of an 'if',
5272                  * and used afterwards, the variables are merged.
5273                  */
5274                 if A > B:
5275                         bigger := "yes"
5276                 else
5277                         bigger := "no"
5278                 print "Is", A, "bigger than", B,"? ", bigger
5279                 /* If a variable is not used after the 'if', no
5280                  * merge happens, so types can be different
5281                  */
5282                 if A > B * 2:
5283                         double:string = "yes"
5284                         print A, "is more than twice", B, "?", double
5285                 else
5286                         double := B*2
5287                         print "double", B, "is", double
5288
5289                 a : number
5290                 a = A;
5291                 b:number = B
5292                 if a > 0 and then b > 0:
5293                         while a != b:
5294                                 if a < b:
5295                                         b = b - a
5296                                 else
5297                                         a = a - b
5298                         print "GCD of", A, "and", B,"is", a
5299                 else if a <= 0:
5300                         print a, "is not positive, cannot calculate GCD"
5301                 else
5302                         print b, "is not positive, cannot calculate GCD"
5303
5304                 for
5305                         togo := 10
5306                         f1 := 1; f2 := 1
5307                         print "Fibonacci:", f1,f2,
5308                 then togo = togo - 1
5309                 while togo > 0:
5310                         f3 := f1 + f2
5311                         print "", f3,
5312                         f1 = f2
5313                         f2 = f3
5314                 print ""
5315
5316                 /* Binary search... */
5317                 for
5318                         lo:= 0; hi := 100
5319                         target := 77
5320                 while
5321                         mid := (lo + hi) / 2
5322                         if mid == target:
5323                                 use Found
5324                         if mid < target:
5325                                 lo = mid
5326                         else
5327                                 hi = mid
5328                         if hi - lo < 1:
5329                                 lo = mid
5330                                 use GiveUp
5331                         use True
5332                 do pass
5333                 case Found:
5334                         print "Yay, I found", target
5335                 case GiveUp:
5336                         print "Closest I found was", lo
5337
5338                 size::= 10
5339                 list:[size]number
5340                 list[0] = 1234
5341                 // "middle square" PRNG.  Not particularly good, but one my
5342                 // Dad taught me - the first one I ever heard of.
5343                 for i:=1; then i = i + 1; while i < size:
5344                         n := list[i-1] * list[i-1]
5345                         list[i] = (n / 100) % 10 000
5346
5347                 print "Before sort:",
5348                 for i:=0; then i = i + 1; while i < size:
5349                         print "", list[i],
5350                 print
5351
5352                 for i := 1; then i=i+1; while i < size:
5353                         for j:=i-1; then j=j-1; while j >= 0:
5354                                 if list[j] > list[j+1]:
5355                                         t:= list[j]
5356                                         list[j] = list[j+1]
5357                                         list[j+1] = t
5358                 print " After sort:",
5359                 for i:=0; then i = i + 1; while i < size:
5360                         print "", list[i],
5361                 print
5362
5363                 if 1 == 2 then print "yes"; else print "no"
5364
5365                 bob:fred
5366                 bob.name = "Hello"
5367                 bob.alive = (bob.name == "Hello")
5368                 print "bob", "is" if  bob.alive else "isn't", "alive"