ocean-lang.org Git - ocean/blob - csrc/oceani.mdc

   1 # Ocean Interpreter - Jamison Creek version
   2
   3 Ocean is intended to be a compiled language, so this interpreter is
   4 not targeted at being the final product.  It is, rather, an intermediate
   5 stage and fills that role in two distinct ways.
   6
   7 Firstly, it exists as a platform to experiment with the early language
   8 design.  An interpreter is easy to write and easy to get working, so
   9 the barrier for entry is lower if I aim to start with an interpreter.
  10
  11 Secondly, the plan for the Ocean compiler is to write it in the
  12 [Ocean language](http://ocean-lang.org).  To achieve this we naturally
  13 need some sort of boot-strap process and this interpreter - written in
  14 portable C - will fill that role.  It will be used to bootstrap the
  15 Ocean compiler.
  16
  17 Two features that are not needed to fill either of these roles are
  18 performance and completeness.  The interpreter only needs to be fast
  19 enough to run small test programs and occasionally to run the compiler
  20 on itself.  It only needs to be complete enough to test aspects of the
  21 design which are developed before the compiler is working, and to run
  22 the compiler on itself.  Any features not used by the compiler when
  23 compiling itself are superfluous.  They may be included anyway, but
  24 they may not.
  25
  26 Nonetheless, the interpreter should end up being reasonably complete,
  27 and any performance bottlenecks which appear and are easily fixed, will
  28 be.
  29
  30 ## Current version
  31
  32 This third version of the interpreter exists to test out some initial
  33 ideas relating to types.  Particularly it adds arrays (indexed from
  34 zero) and simple structures.  Basic control flow and variable scoping
  35 are already fairly well established, as are basic numerical and
  36 boolean operators.
  37
  38 Some operators that have only recently been added, and so have not
  39 generated all that much experience yet are "and then" and "or else" as
  40 short-circuit Boolean operators, and the "if ... else" trinary
  41 operator which can select between two expressions based on a third
  42 (which appears syntactically in the middle).
  43
  44 The "func" clause currently only allows a "main" function to be
  45 declared.  That will be extended when proper function support is added.
  46
  47 An element that is present purely to make a usable language, and
  48 without any expectation that they will remain, is the "print" statement
  49 which performs simple output.
  50
  51 The current scalar types are "number", "Boolean", and "string".
  52 Boolean will likely stay in its current form, the other two might, but
  53 could just as easily be changed.
  54
  55 ## Naming
  56
  57 Versions of the interpreter which obviously do not support a complete
  58 language will be named after creeks and streams.  This one is Jamison
  59 Creek.
  60
  61 Once we have something reasonably resembling a complete language, the
  62 names of rivers will be used.
  63 Early versions of the compiler will be named after seas.  Major
  64 releases of the compiler will be named after oceans.  Hopefully I will
  65 be finished once I get to the Pacific Ocean release.
  66
  67 ## Outline
  68
  69 As well as parsing and executing a program, the interpreter can print
  70 out the program from the parsed internal structure.  This is useful
  71 for validating the parsing.
  72 So the main requirements of the interpreter are:
  73
  74 - Parse the program, possibly with tracing,
  75 - Analyse the parsed program to ensure consistency,
  76 - Print the program,
  77 - Execute the "main" function in the program, if no parsing or
  78   consistency errors were found.
  79
  80 This is all performed by a single C program extracted with
  81 `parsergen`.
  82
  83 There will be two formats for printing the program: a default and one
  84 that uses bracketing.  So a `--bracket` command line option is needed
  85 for that.  Normally the first code section found is used, however an
  86 alternate section can be requested so that a file (such as this one)
  87 can contain multiple programs.  This is effected with the `--section`
  88 option.
  89
  90 This code must be compiled with `-fplan9-extensions` so that anonymous
  91 structures can be used.
  92
  93 ###### File: oceani.mk
  94
  95         myCFLAGS := -Wall -g -fplan9-extensions
  96         CFLAGS := $(filter-out $(myCFLAGS),$(CFLAGS)) $(myCFLAGS)
  97         myLDLIBS:= libparser.o libscanner.o libmdcode.o -licuuc
  98         LDLIBS := $(filter-out $(myLDLIBS),$(LDLIBS)) $(myLDLIBS)
  99         ## libs
 100         all :: $(LDLIBS) oceani
 101         oceani.c oceani.h : oceani.mdc parsergen
 102                 ./parsergen -o oceani --LALR --tag Parser oceani.mdc
 103         oceani.mk: oceani.mdc md2c
 104                 ./md2c oceani.mdc
 105
 106         oceani: oceani.o $(LDLIBS)
 107                 $(CC) $(CFLAGS) -o oceani oceani.o $(LDLIBS)
 108
 109 ###### Parser: header
 110         ## macros
 111         struct parse_context;
 112         ## ast
 113         struct parse_context {
 114                 struct token_config config;
 115                 char *file_name;
 116                 int parse_error;
 117                 ## parse context
 118         };
 119
 120 ###### macros
 121
 122         #define container_of(ptr, type, member) ({                      \
 123                 const typeof( ((type *)0)->member ) *__mptr = (ptr);    \
 124                 (type *)( (char *)__mptr - offsetof(type,member) );})
 125
 126         #define config2context(_conf) container_of(_conf, struct parse_context, \
 127                 config)
 128
 129 ###### Parser: reduce
 130         struct parse_context *c = config2context(config);
 131
 132 ###### Parser: code
 133         #define _GNU_SOURCE
 134         #include <unistd.h>
 135         #include <stdlib.h>
 136         #include <fcntl.h>
 137         #include <errno.h>
 138         #include <sys/mman.h>
 139         #include <string.h>
 140         #include <stdio.h>
 141         #include <locale.h>
 142         #include <malloc.h>
 143         #include "mdcode.h"
 144         #include "scanner.h"
 145         #include "parser.h"
 146
 147         ## includes
 148
 149         #include "oceani.h"
 150
 151         ## forward decls
 152         ## value functions
 153         ## ast functions
 154         ## core functions
 155
 156         #include <getopt.h>
 157         static char Usage[] =
 158                 "Usage: oceani --trace --print --noexec --brackets --section=SectionName prog.ocn\n";
 159         static const struct option long_options[] = {
 160                 {"trace",     0, NULL, 't'},
 161                 {"print",     0, NULL, 'p'},
 162                 {"noexec",    0, NULL, 'n'},
 163                 {"brackets",  0, NULL, 'b'},
 164                 {"section",   1, NULL, 's'},
 165                 {NULL,        0, NULL, 0},
 166         };
 167         const char *options = "tpnbs";
 168
 169         static void pr_err(char *msg)                   // NOTEST
 170         {
 171                 fprintf(stderr, "%s\n", msg);           // NOTEST
 172         }                                               // NOTEST
 173
 174         int main(int argc, char *argv[])
 175         {
 176                 int fd;
 177                 int len;
 178                 char *file;
 179                 struct section *s = NULL, *ss;
 180                 char *section = NULL;
 181                 struct parse_context context = {
 182                         .config = {
 183                                 .ignored = (1 << TK_mark),
 184                                 .number_chars = ".,_+- ",
 185                                 .word_start = "_",
 186                                 .word_cont = "_",
 187                         },
 188                 };
 189                 int doprint=0, dotrace=0, doexec=1, brackets=0;
 190                 int opt;
 191                 while ((opt = getopt_long(argc, argv, options, long_options, NULL))
 192                        != -1) {
 193                         switch(opt) {
 194                         case 't': dotrace=1; break;
 195                         case 'p': doprint=1; break;
 196                         case 'n': doexec=0; break;
 197                         case 'b': brackets=1; break;
 198                         case 's': section = optarg; break;
 199                         default: fprintf(stderr, Usage);
 200                                 exit(1);
 201                         }
 202                 }
 203                 if (optind >= argc) {
 204                         fprintf(stderr, "oceani: no input file given\n");
 205                         exit(1);
 206                 }
 207                 fd = open(argv[optind], O_RDONLY);
 208                 if (fd < 0) {
 209                         fprintf(stderr, "oceani: cannot open %s\n", argv[optind]);
 210                         exit(1);
 211                 }
 212                 context.file_name = argv[optind];
 213                 len = lseek(fd, 0, 2);
 214                 file = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
 215                 s = code_extract(file, file+len, pr_err);
 216                 if (!s) {
 217                         fprintf(stderr, "oceani: could not find any code in %s\n",
 218                                 argv[optind]);
 219                         exit(1);
 220                 }
 221
 222                 ## context initialization
 223
 224                 if (section) {
 225                         for (ss = s; ss; ss = ss->next) {
 226                                 struct text sec = ss->section;
 227                                 if (sec.len == strlen(section) &&
 228                                     strncmp(sec.txt, section, sec.len) == 0)
 229                                         break;
 230                         }
 231                         if (!ss) {
 232                                 fprintf(stderr, "oceani: cannot find section %s\n",
 233                                         section);
 234                                 goto cleanup;
 235                         }
 236                 } else
 237                         ss = s;                         // NOTEST
 238                 if (!ss->code) {
 239                         fprintf(stderr, "oceani: no code found in requested section\n");        // NOTEST
 240                         goto cleanup;                   // NOTEST
 241                 }
 242
 243                 parse_oceani(ss->code, &context.config, dotrace ? stderr : NULL);
 244
 245                 resolve_consts(&context);
 246                 prepare_types(&context);
 247                 if (!context.parse_error && !analyse_funcs(&context)) {
 248                         fprintf(stderr, "oceani: type error in program - not running.\n");
 249                         context.parse_error += 1;
 250                 }
 251
 252                 if (doprint) {
 253                         ## print const decls
 254                         ## print type decls
 255                         ## print func decls
 256                 }
 257                 if (doexec && !context.parse_error)
 258                         interp_main(&context, argc - optind, argv + optind);
 259         cleanup:
 260                 while (s) {
 261                         struct section *t = s->next;
 262                         code_free(s->code);
 263                         free(s);
 264                         s = t;
 265                 }
 266                 // FIXME parser should pop scope even on error
 267                 while (context.scope_depth > 0)
 268                         scope_pop(&context);
 269                 ## free global vars
 270                 ## free const decls
 271                 ## free context types
 272                 ## free context storage
 273                 exit(context.parse_error ? 1 : 0);
 274         }
 275
 276 ### Analysis
 277
 278 The four requirements of parse, analyse, print, interpret apply to
 279 each language element individually so that is how most of the code
 280 will be structured.
 281
 282 Three of the four are fairly self explanatory.  The one that requires
 283 a little explanation is the analysis step.
 284
 285 The current language design does not require the types of variables to
 286 be declared, but they must still have a single type.  Different
 287 operations impose different requirements on the variables, for example
 288 addition requires both arguments to be numeric, and assignment
 289 requires the variable on the left to have the same type as the
 290 expression on the right.
 291
 292 Analysis involves propagating these type requirements around and
 293 consequently setting the type of each variable.  If any requirements
 294 are violated (e.g. a string is compared with a number) or if a
 295 variable needs to have two different types, then an error is raised
 296 and the program will not run.
 297
 298 If the same variable is declared in both branchs of an 'if/else', or
 299 in all cases of a 'switch' then the multiple instances may be merged
 300 into just one variable if the variable is referenced after the
 301 conditional statement.  When this happens, the types must naturally be
 302 consistent across all the branches.  When the variable is not used
 303 outside the if, the variables in the different branches are distinct
 304 and can be of different types.
 305
 306 Undeclared names may only appear in "use" statements and "case" expressions.
 307 These names are given a type of "label" and a unique value.
 308 This allows them to fill the role of a name in an enumerated type, which
 309 is useful for testing the `switch` statement.
 310
 311 As we will see, the condition part of a `while` statement can return
 312 either a Boolean or some other type.  This requires that the expected
 313 type that gets passed around comprises a type and a flag to indicate
 314 that `Tbool` is also permitted.
 315
 316 As there are, as yet, no distinct types that are compatible, there
 317 isn't much subtlety in the analysis.  When we have distinct number
 318 types, this will become more interesting.
 319
 320 #### Error reporting
 321
 322 When analysis discovers an inconsistency it needs to report an error;
 323 just refusing to run the code ensures that the error doesn't cascade,
 324 but by itself it isn't very useful.  A clear understanding of the sort
 325 of error message that are useful will help guide the process of
 326 analysis.
 327
 328 At a simplistic level, the only sort of error that type analysis can
 329 report is that the type of some construct doesn't match a contextual
 330 requirement.  For example, in `4 + "hello"` the addition provides a
 331 contextual requirement for numbers, but `"hello"` is not a number.  In
 332 this particular example no further information is needed as the types
 333 are obvious from local information.  When a variable is involved that
 334 isn't the case.  It may be helpful to explain why the variable has a
 335 particular type, by indicating the location where the type was set,
 336 whether by declaration or usage.
 337
 338 Using a recursive-descent analysis we can easily detect a problem at
 339 multiple locations. In "`hello:= "there"; 4 + hello`" the addition
 340 will detect that one argument is not a number and the usage of `hello`
 341 will detect that a number was wanted, but not provided.  In this
 342 (early) version of the language, we will generate error reports at
 343 multiple locations, so the use of `hello` will report an error and
 344 explain were the value was set, and the addition will report an error
 345 and say why numbers are needed.  To be able to report locations for
 346 errors, each language element will need to record a file location
 347 (line and column) and each variable will need to record the language
 348 element where its type was set.  For now we will assume that each line
 349 of an error message indicates one location in the file, and up to 2
 350 types.  So we provide a `printf`-like function which takes a format, a
 351 location (a `struct exec` which has not yet been introduced), and 2
 352 types. "`%1`" reports the first type, "`%2`" reports the second.  We
 353 will need a function to print the location, once we know how that is
 354 stored. e As will be explained later, there are sometimes extra rules for
 355 type matching and they might affect error messages, we need to pass those
 356 in too.
 357
 358 As well as type errors, we sometimes need to report problems with
 359 tokens, which might be unexpected or might name a type that has not
 360 been defined.  For these we have `tok_err()` which reports an error
 361 with a given token.  Each of the error functions sets the flag in the
 362 context so indicate that parsing failed.
 363
 364 ###### forward decls
 365
 366         static void fput_loc(struct exec *loc, FILE *f);
 367         static void type_err(struct parse_context *c,
 368                              char *fmt, struct exec *loc,
 369                              struct type *t1, int rules, struct type *t2);
 370         static void tok_err(struct parse_context *c, char *fmt, struct token *t);
 371
 372 ###### core functions
 373
 374         static void type_err(struct parse_context *c,
 375                              char *fmt, struct exec *loc,
 376                              struct type *t1, int rules, struct type *t2)
 377         {
 378                 fprintf(stderr, "%s:", c->file_name);
 379                 fput_loc(loc, stderr);
 380                 for (; *fmt ; fmt++) {
 381                         if (*fmt != '%') {
 382                                 fputc(*fmt, stderr);
 383                                 continue;
 384                         }
 385                         fmt++;
 386                         switch (*fmt) {
 387                         case '%': fputc(*fmt, stderr); break;   // NOTEST
 388                         default: fputc('?', stderr); break;     // NOTEST
 389                         case '1':
 390                                 type_print(t1, stderr);
 391                                 break;
 392                         case '2':
 393                                 type_print(t2, stderr);
 394                                 break;
 395                         ## format cases
 396                         }
 397                 }
 398                 fputs("\n", stderr);
 399                 c->parse_error += 1;
 400         }
 401
 402         static void tok_err(struct parse_context *c, char *fmt, struct token *t)
 403         {
 404                 fprintf(stderr, "%s:%d:%d: %s: %.*s\n", c->file_name, t->line, t->col, fmt,
 405                         t->txt.len, t->txt.txt);
 406                 c->parse_error += 1;
 407         }
 408
 409 ## Entities: declared and predeclared.
 410
 411 There are various "things" that the language and/or the interpreter
 412 needs to know about to parse and execute a program.  These include
 413 types, variables, values, and executable code.  These are all lumped
 414 together under the term "entities" (calling them "objects" would be
 415 confusing) and introduced here.  The following section will present the
 416 different specific code elements which comprise or manipulate these
 417 various entities.
 418
 419 ### Executables
 420
 421 Executables can be lots of different things.  In many cases an
 422 executable is just an operation combined with one or two other
 423 executables.  This allows for expressions and lists etc.  Other times an
 424 executable is something quite specific like a constant or variable name.
 425 So we define a `struct exec` to be a general executable with a type, and
 426 a `struct binode` which is a subclass of `exec`, forms a node in a
 427 binary tree, and holds an operation.  There will be other subclasses,
 428 and to access these we need to be able to `cast` the `exec` into the
 429 various other types.  The first field in any `struct exec` is the type
 430 from the `exec_types` enum.
 431
 432 ###### macros
 433         #define cast(structname, pointer) ({            \
 434                 const typeof( ((struct structname *)0)->type) *__mptr = &(pointer)->type; \
 435                 if (__mptr && *__mptr != X##structname) abort();                \
 436                 (struct structname *)( (char *)__mptr);})
 437
 438         #define new(structname) ({                                              \
 439                 struct structname *__ptr = ((struct structname *)calloc(1,sizeof(struct structname))); \
 440                 __ptr->type = X##structname;                                            \
 441                 __ptr->line = -1; __ptr->column = -1;                                   \
 442                 __ptr;})
 443
 444         #define new_pos(structname, token) ({                                           \
 445                 struct structname *__ptr = ((struct structname *)calloc(1,sizeof(struct structname))); \
 446                 __ptr->type = X##structname;                                            \
 447                 __ptr->line = token.line; __ptr->column = token.col;                    \
 448                 __ptr;})
 449
 450 ###### ast
 451         enum exec_types {
 452                 Xbinode,
 453                 ## exec type
 454         };
 455         struct exec {
 456                 enum exec_types type;
 457                 int line, column;
 458                 ## exec fields
 459         };
 460         struct binode {
 461                 struct exec;
 462                 enum Btype {
 463                         ## Binode types
 464                 } op;
 465                 struct exec *left, *right;
 466         };
 467
 468 ###### ast functions
 469
 470         static int __fput_loc(struct exec *loc, FILE *f)
 471         {
 472                 if (!loc)
 473                         return 0;
 474                 if (loc->line >= 0) {
 475                         fprintf(f, "%d:%d: ", loc->line, loc->column);
 476                         return 1;
 477                 }
 478                 if (loc->type == Xbinode)
 479                         return __fput_loc(cast(binode,loc)->left, f) ||
 480                                __fput_loc(cast(binode,loc)->right, f);  // NOTEST
 481                 return 0;
 482         }
 483         static void fput_loc(struct exec *loc, FILE *f)
 484         {
 485                 if (!__fput_loc(loc, f))
 486                         fprintf(f, "??:??: ");
 487         }
 488
 489 Each different type of `exec` node needs a number of functions defined,
 490 a bit like methods.  We must be able to free it, print it, analyse it
 491 and execute it.  Once we have specific `exec` types we will need to
 492 parse them too.  Let's take this a bit more slowly.
 493
 494 #### Freeing
 495
 496 The parser generator requires a `free_foo` function for each struct
 497 that stores attributes and they will often be `exec`s and subtypes
 498 there-of.  So we need `free_exec` which can handle all the subtypes,
 499 and we need `free_binode`.
 500
 501 ###### ast functions
 502
 503         static void free_binode(struct binode *b)
 504         {
 505                 if (!b)
 506                         return;
 507                 free_exec(b->left);
 508                 free_exec(b->right);
 509                 free(b);
 510         }
 511
 512 ###### core functions
 513         static void free_exec(struct exec *e)
 514         {
 515                 if (!e)
 516                         return;
 517                 switch(e->type) {
 518                         ## free exec cases
 519                 }
 520         }
 521
 522 ###### forward decls
 523
 524         static void free_exec(struct exec *e);
 525
 526 ###### free exec cases
 527         case Xbinode: free_binode(cast(binode, e)); break;
 528
 529 #### Printing
 530
 531 Printing an `exec` requires that we know the current indent level for
 532 printing line-oriented components.  As will become clear later, we
 533 also want to know what sort of bracketing to use.
 534
 535 ###### ast functions
 536
 537         static void do_indent(int i, char *str)
 538         {
 539                 while (i-- > 0)
 540                         printf("    ");
 541                 printf("%s", str);
 542         }
 543
 544 ###### core functions
 545         static void print_binode(struct binode *b, int indent, int bracket)
 546         {
 547                 struct binode *b2;
 548                 switch(b->op) {
 549                 ## print binode cases
 550                 }
 551         }
 552
 553         static void print_exec(struct exec *e, int indent, int bracket)
 554         {
 555                 if (!e)
 556                         return;
 557                 switch (e->type) {
 558                 case Xbinode:
 559                         print_binode(cast(binode, e), indent, bracket); break;
 560                 ## print exec cases
 561                 }
 562                 if (e->to_free) {
 563                         struct variable *v;
 564                         do_indent(indent, "/* FREE");
 565                         for (v = e->to_free; v; v = v->next_free) {
 566                                 printf(" %.*s", v->name->name.len, v->name->name.txt);
 567                                 printf("[%d,%d]", v->scope_start, v->scope_end);
 568                                 if (v->frame_pos >= 0)
 569                                         printf("(%d+%d)", v->frame_pos,
 570                                                v->type ? v->type->size:0);
 571                         }
 572                         printf(" */\n");
 573                 }
 574         }
 575
 576 ###### forward decls
 577
 578         static void print_exec(struct exec *e, int indent, int bracket);
 579
 580 #### Analysing
 581
 582 As discussed, analysis involves propagating type requirements around the
 583 program and looking for errors.
 584
 585 So `propagate_types` is passed an expected type (being a `struct type`
 586 pointer together with some `val_rules` flags) that the `exec` is
 587 expected to return, and returns the type that it does return, either of
 588 which can be `NULL` signifying "unknown".  A `prop_err` flag set is
 589 passed by reference.  It has `Efail` set when an error is found, and
 590 `Eretry` when the type for some element is set via propagation.  If
 591 any expression cannot be evaluated immediately, `Enoconst` is set.
 592 If the expression can be copied, `Emaycopy` is set.
 593
 594 If it remains unchanged at `0`, then no more propagation is needed.
 595
 596 ###### ast
 597
 598         enum val_rules {Rnolabel = 1<<0, Rboolok = 1<<1, Rnoconstant = 1<<2};
 599         enum prop_err {Efail = 1<<0, Eretry = 1<<1, Enoconst = 1<<2,
 600                        Emaycopy = 1<<3};
 601
 602 ###### format cases
 603         case 'r':
 604                 if (rules & Rnolabel)
 605                         fputs(" (labels not permitted)", stderr);
 606                 break;
 607
 608 ###### forward decls
 609         static struct type *propagate_types(struct exec *prog, struct parse_context *c, enum prop_err *perr,
 610                                             struct type *type, int rules);
 611 ###### core functions
 612
 613         static struct type *__propagate_types(struct exec *prog, struct parse_context *c, enum prop_err *perr,
 614                                               struct type *type, int rules)
 615         {
 616                 struct type *t;
 617
 618                 if (!prog)
 619                         return Tnone;
 620
 621                 switch (prog->type) {
 622                 case Xbinode:
 623                 {
 624                         struct binode *b = cast(binode, prog);
 625                         switch (b->op) {
 626                         ## propagate binode cases
 627                         }
 628                         break;
 629                 }
 630                 ## propagate exec cases
 631                 }
 632                 return Tnone;
 633         }
 634
 635         static struct type *propagate_types(struct exec *prog, struct parse_context *c, enum prop_err *perr,
 636                                             struct type *type, int rules)
 637         {
 638                 int pre_err = c->parse_error;
 639                 struct type *ret = __propagate_types(prog, c, perr, type, rules);
 640
 641                 if (c->parse_error > pre_err)
 642                         *perr |= Efail;
 643                 return ret;
 644         }
 645
 646 #### Interpreting
 647
 648 Interpreting an `exec` doesn't require anything but the `exec`.  State
 649 is stored in variables and each variable will be directly linked from
 650 within the `exec` tree.  The exception to this is the `main` function
 651 which needs to look at command line arguments.  This function will be
 652 interpreted separately.
 653
 654 Each `exec` can return a value combined with a type in `struct lrval`.
 655 The type may be `Tnone` but must be non-NULL.  Some `exec`s will return
 656 the location of a value, which can be updated, in `lval`.  Others will
 657 set `lval` to NULL indicating that there is a value of appropriate type
 658 in `rval`.
 659
 660 ###### forward decls
 661         static struct value interp_exec(struct parse_context *c, struct exec *e,
 662                                         struct type **typeret);
 663 ###### core functions
 664
 665         struct lrval {
 666                 struct type *type;
 667                 struct value rval, *lval;
 668         };
 669
 670         /* If dest is passed, dtype must give the expected type, and
 671          * result can go there, in which case type is returned as NULL.
 672          */
 673         static struct lrval _interp_exec(struct parse_context *c, struct exec *e,
 674                                          struct value *dest, struct type *dtype);
 675
 676         static struct value interp_exec(struct parse_context *c, struct exec *e,
 677                                         struct type **typeret)
 678         {
 679                 struct lrval ret = _interp_exec(c, e, NULL, NULL);
 680
 681                 if (!ret.type) abort();
 682                 if (typeret)
 683                         *typeret = ret.type;
 684                 if (ret.lval)
 685                         dup_value(ret.type, ret.lval, &ret.rval);
 686                 return ret.rval;
 687         }
 688
 689         static struct value *linterp_exec(struct parse_context *c, struct exec *e,
 690                                           struct type **typeret)
 691         {
 692                 struct lrval ret = _interp_exec(c, e, NULL, NULL);
 693
 694                 if (!ret.type) abort();
 695                 if (ret.lval)
 696                         *typeret = ret.type;
 697                 else
 698                         free_value(ret.type, &ret.rval);
 699                 return ret.lval;
 700         }
 701
 702         /* dinterp_exec is used when the destination type is certain and
 703          * the value has a place to go.
 704          */
 705         static void dinterp_exec(struct parse_context *c, struct exec *e,
 706                                  struct value *dest, struct type *dtype,
 707                                  int need_free)
 708         {
 709                 struct lrval ret = _interp_exec(c, e, dest, dtype);
 710                 if (!ret.type)
 711                         return;
 712                 if (need_free)
 713                         free_value(dtype, dest);
 714                 if (ret.lval)
 715                         dup_value(dtype, ret.lval, dest);
 716                 else
 717                         memcpy(dest, &ret.rval, dtype->size);
 718         }
 719
 720         static struct lrval _interp_exec(struct parse_context *c, struct exec *e,
 721                                          struct value *dest, struct type *dtype)
 722         {
 723                 /* If the result is copied to dest, ret.type is set to NULL */
 724                 struct lrval ret;
 725                 struct value rv = {}, *lrv = NULL;
 726                 struct type *rvtype;
 727
 728                 rvtype = ret.type = Tnone;
 729                 if (!e) {
 730                         ret.lval = lrv;
 731                         ret.rval = rv;
 732                         return ret;
 733                 }
 734
 735                 switch(e->type) {
 736                 case Xbinode:
 737                 {
 738                         struct binode *b = cast(binode, e);
 739                         struct value left, right, *lleft;
 740                         struct type *ltype, *rtype;
 741                         ltype = rtype = Tnone;
 742                         switch (b->op) {
 743                         ## interp binode cases
 744                         }
 745                         free_value(ltype, &left);
 746                         free_value(rtype, &right);
 747                         break;
 748                 }
 749                 ## interp exec cases
 750                 }
 751                 if (rvtype) {
 752                         ret.lval = lrv;
 753                         ret.rval = rv;
 754                         ret.type = rvtype;
 755                 }
 756                 ## interp exec cleanup
 757                 return ret;
 758         }
 759
 760 ### Types
 761
 762 Values come in a wide range of types, with more likely to be added.
 763 Each type needs to be able to print its own values (for convenience at
 764 least) as well as to compare two values, at least for equality and
 765 possibly for order.  For now, values might need to be duplicated and
 766 freed, though eventually such manipulations will be better integrated
 767 into the language.
 768
 769 Rather than requiring every numeric type to support all numeric
 770 operations (add, multiply, etc), we allow types to be able to present
 771 as one of a few standard types: integer, float, and fraction.  The
 772 existence of these conversion functions eventually enable types to
 773 determine if they are compatible with other types, though such types
 774 have not yet been implemented.
 775
 776 Named type are stored in a simple linked list.  Objects of each type are
 777 "values" which are often passed around by value.
 778
 779 There are both explicitly named types, and anonymous types.  Anonymous
 780 cannot be accessed by name, but are used internally and have a name
 781 which might be reported in error messages.
 782
 783 ###### ast
 784
 785         struct value {
 786                 union {
 787                         char ptr[1];
 788                         ## value union fields
 789                 };
 790         };
 791
 792         struct type {
 793                 struct text name;
 794                 struct type *next;
 795                 struct token first_use;
 796                 int size, align;
 797                 int anon;
 798                 void (*init)(struct type *type, struct value *val);
 799                 int (*prepare_type)(struct parse_context *c, struct type *type, int parse_time);
 800                 void (*print)(struct type *type, struct value *val, FILE *f);
 801                 void (*print_type)(struct type *type, FILE *f);
 802                 int (*cmp_order)(struct type *t1, struct type *t2,
 803                                  struct value *v1, struct value *v2);
 804                 int (*cmp_eq)(struct type *t1, struct type *t2,
 805                               struct value *v1, struct value *v2);
 806                 void (*dup)(struct type *type, struct value *vold, struct value *vnew);
 807                 int (*test)(struct type *type, struct value *val);
 808                 void (*free)(struct type *type, struct value *val);
 809                 void (*free_type)(struct type *t);
 810                 long long (*to_int)(struct value *v);
 811                 double (*to_float)(struct value *v);
 812                 int (*to_mpq)(mpq_t *q, struct value *v);
 813                 ## type functions
 814                 union {
 815                         ## type union fields
 816                 };
 817         };
 818
 819 ###### parse context
 820
 821         struct type *typelist;
 822
 823 ###### includes
 824         #include <stdarg.h>
 825
 826 ###### ast functions
 827
 828         static struct type *find_type(struct parse_context *c, struct text s)
 829         {
 830                 struct type *t = c->typelist;
 831
 832                 while (t && (t->anon ||
 833                              text_cmp(t->name, s) != 0))
 834                                 t = t->next;
 835                 return t;
 836         }
 837
 838         static struct type *_add_type(struct parse_context *c, struct text s,
 839                                      struct type *proto, int anon)
 840         {
 841                 struct type *n;
 842
 843                 n = calloc(1, sizeof(*n));
 844                 if (proto)
 845                         *n = *proto;
 846                 else
 847                         n->size = -1;
 848                 n->name = s;
 849                 n->anon = anon;
 850                 n->next = c->typelist;
 851                 c->typelist = n;
 852                 return n;
 853         }
 854
 855         static struct type *add_type(struct parse_context *c, struct text s,
 856                                       struct type *proto)
 857         {
 858                 return _add_type(c, s, proto, 0);
 859         }
 860
 861         static struct type *add_anon_type(struct parse_context *c,
 862                                           struct type *proto, char *name, ...)
 863         {
 864                 struct text t;
 865                 va_list ap;
 866
 867                 va_start(ap, name);
 868                 vasprintf(&t.txt, name, ap);
 869                 va_end(ap);
 870                 t.len = strlen(name);
 871                 return _add_type(c, t, proto, 1);
 872         }
 873
 874         static void free_type(struct type *t)
 875         {
 876                 /* The type is always a reference to something in the
 877                  * context, so we don't need to free anything.
 878                  */
 879         }
 880
 881         static void free_value(struct type *type, struct value *v)
 882         {
 883                 if (type && v) {
 884                         type->free(type, v);
 885                         memset(v, 0x5a, type->size);
 886                 }
 887         }
 888
 889         static void type_print(struct type *type, FILE *f)
 890         {
 891                 if (!type)
 892                         fputs("*unknown*type*", f);     // NOTEST
 893                 else if (type->name.len && !type->anon)
 894                         fprintf(f, "%.*s", type->name.len, type->name.txt);
 895                 else if (type->print_type)
 896                         type->print_type(type, f);
 897                 else
 898                         fputs("*invalid*type*", f);
 899         }
 900
 901         static void val_init(struct type *type, struct value *val)
 902         {
 903                 if (type && type->init)
 904                         type->init(type, val);
 905         }
 906
 907         static void dup_value(struct type *type,
 908                               struct value *vold, struct value *vnew)
 909         {
 910                 if (type && type->dup)
 911                         type->dup(type, vold, vnew);
 912         }
 913
 914         static int value_cmp(struct type *tl, struct type *tr,
 915                              struct value *left, struct value *right)
 916         {
 917                 if (tl && tl->cmp_order)
 918                         return tl->cmp_order(tl, tr, left, right);
 919                 if (tl && tl->cmp_eq)                   // NOTEST
 920                         return tl->cmp_eq(tl, tr, left, right); // NOTEST
 921                 return -1;                              // NOTEST
 922         }
 923
 924         static void print_value(struct type *type, struct value *v, FILE *f)
 925         {
 926                 if (type && type->print)
 927                         type->print(type, v, f);
 928                 else
 929                         fprintf(f, "*Unknown*");                // NOTEST
 930         }
 931
 932         static void prepare_types(struct parse_context *c)
 933         {
 934                 struct type *t;
 935                 int retry = 1;
 936                 enum { none, some, cannot } progress = none;
 937
 938                 while (retry) {
 939                         retry = 0;
 940
 941                         for (t = c->typelist; t; t = t->next) {
 942                                 if (t->size < 0)
 943                                         tok_err(c, "error: type used but not declared",
 944                                                  &t->first_use);
 945                                 if (t->size == 0 && t->prepare_type) {
 946                                         if (t->prepare_type(c, t, 1))
 947                                                 progress = some;
 948                                         else if (progress == cannot)
 949                                                 tok_err(c, "error: type has recursive definition",
 950                                                         &t->first_use);
 951                                         else
 952                                                 retry = 1;
 953                                 }
 954                         }
 955                         switch (progress) {
 956                         case cannot:
 957                                 retry = 0; break;
 958                         case none:
 959                                 progress = cannot; break;
 960                         case some:
 961                                 progress = none; break;
 962                         }
 963                 }
 964         }
 965
 966 ###### forward decls
 967
 968         static void free_value(struct type *type, struct value *v);
 969         static int type_compat(struct type *require, struct type *have, int rules);
 970         static void type_print(struct type *type, FILE *f);
 971         static void val_init(struct type *type, struct value *v);
 972         static void dup_value(struct type *type,
 973                               struct value *vold, struct value *vnew);
 974         static int value_cmp(struct type *tl, struct type *tr,
 975                              struct value *left, struct value *right);
 976         static void print_value(struct type *type, struct value *v, FILE *f);
 977
 978 ###### free context types
 979
 980         while (context.typelist) {
 981                 struct type *t = context.typelist;
 982
 983                 context.typelist = t->next;
 984                 if (t->free_type)
 985                         t->free_type(t);
 986                 if (t->anon)
 987                         free(t->name.txt);
 988                 free(t);
 989         }
 990
 991 Type can be specified for local variables, for fields in a structure,
 992 for formal parameters to functions, and possibly elsewhere.  Different
 993 rules may apply in different contexts.  As a minimum, a named type may
 994 always be used.  Currently the type of a formal parameter can be
 995 different from types in other contexts, so we have a separate grammar
 996 symbol for those.
 997
 998 ###### Grammar
 999
1000         $*type
1001         Type -> IDENTIFIER ${
1002                 $0 = find_type(c, $ID.txt);
1003                 if (!$0) {
1004                         $0 = add_type(c, $ID.txt, NULL);
1005                         $0->first_use = $ID;
1006                 }
1007         }$
1008         ## type grammar
1009
1010         FormalType -> Type ${ $0 = $<1; }$
1011         ## formal type grammar
1012
1013 #### Base Types
1014
1015 Values of the base types can be numbers, which we represent as
1016 multi-precision fractions, strings, Booleans and labels.  When
1017 analysing the program we also need to allow for places where no value
1018 is meaningful (type `Tnone`) and where we don't know what type to
1019 expect yet (type is `NULL`).
1020
1021 Values are never shared, they are always copied when used, and freed
1022 when no longer needed.
1023
1024 When propagating type information around the program, we need to
1025 determine if two types are compatible, where type `NULL` is compatible
1026 with anything.  There are two special cases with type compatibility,
1027 both related to the Conditional Statement which will be described
1028 later.  In some cases a Boolean can be accepted as well as some other
1029 primary type, and in others any type is acceptable except a label (`Vlabel`).
1030 A separate function encoding these cases will simplify some code later.
1031
1032 ###### type functions
1033
1034         int (*compat)(struct type *this, struct type *other);
1035
1036 ###### ast functions
1037
1038         static int type_compat(struct type *require, struct type *have, int rules)
1039         {
1040                 if ((rules & Rboolok) && have == Tbool)
1041                         return 1;       // NOTEST
1042                 if ((rules & Rnolabel) && have == Tlabel)
1043                         return 0;       // NOTEST
1044                 if (!require || !have)
1045                         return 1;
1046
1047                 if (require->compat)
1048                         return require->compat(require, have);
1049
1050                 return require == have;
1051         }
1052
1053 ###### includes
1054         #include <gmp.h>
1055         #include "parse_string.h"
1056         #include "parse_number.h"
1057
1058 ###### libs
1059         myLDLIBS := libnumber.o libstring.o -lgmp
1060         LDLIBS := $(filter-out $(myLDLIBS),$(LDLIBS)) $(myLDLIBS)
1061
1062 ###### type union fields
1063         enum vtype {Vnone, Vstr, Vnum, Vbool, Vlabel} vtype;
1064
1065 ###### value union fields
1066         struct text str;
1067         mpq_t num;
1068         unsigned char bool;
1069         void *label;
1070
1071 ###### ast functions
1072         static void _free_value(struct type *type, struct value *v)
1073         {
1074                 if (!v)
1075                         return;         // NOTEST
1076                 switch (type->vtype) {
1077                 case Vnone: break;
1078                 case Vstr: free(v->str.txt); break;
1079                 case Vnum: mpq_clear(v->num); break;
1080                 case Vlabel:
1081                 case Vbool: break;
1082                 }
1083         }
1084
1085 ###### value functions
1086
1087         static void _val_init(struct type *type, struct value *val)
1088         {
1089                 switch(type->vtype) {
1090                 case Vnone:             // NOTEST
1091                         break;          // NOTEST
1092                 case Vnum:
1093                         mpq_init(val->num); break;
1094                 case Vstr:
1095                         val->str.txt = malloc(1);
1096                         val->str.len = 0;
1097                         break;
1098                 case Vbool:
1099                         val->bool = 0;
1100                         break;
1101                 case Vlabel:
1102                         val->label = NULL;
1103                         break;
1104                 }
1105         }
1106
1107         static void _dup_value(struct type *type,
1108                                struct value *vold, struct value *vnew)
1109         {
1110                 switch (type->vtype) {
1111                 case Vnone:             // NOTEST
1112                         break;          // NOTEST
1113                 case Vlabel:
1114                         vnew->label = vold->label;
1115                         break;
1116                 case Vbool:
1117                         vnew->bool = vold->bool;
1118                         break;
1119                 case Vnum:
1120                         mpq_init(vnew->num);
1121                         mpq_set(vnew->num, vold->num);
1122                         break;
1123                 case Vstr:
1124                         vnew->str.len = vold->str.len;
1125                         vnew->str.txt = malloc(vnew->str.len);
1126                         memcpy(vnew->str.txt, vold->str.txt, vnew->str.len);
1127                         break;
1128                 }
1129         }
1130
1131         static int _value_cmp(struct type *tl, struct type *tr,
1132                               struct value *left, struct value *right)
1133         {
1134                 int cmp;
1135                 if (tl != tr)
1136                         return tl - tr; // NOTEST
1137                 switch (tl->vtype) {
1138                 case Vlabel: cmp = left->label == right->label ? 0 : 1; break;
1139                 case Vnum: cmp = mpq_cmp(left->num, right->num); break;
1140                 case Vstr: cmp = text_cmp(left->str, right->str); break;
1141                 case Vbool: cmp = left->bool - right->bool; break;
1142                 case Vnone: cmp = 0;                    // NOTEST
1143                 }
1144                 return cmp;
1145         }
1146
1147         static void _print_value(struct type *type, struct value *v, FILE *f)
1148         {
1149                 switch (type->vtype) {
1150                 case Vnone:                             // NOTEST
1151                         fprintf(f, "*no-value*"); break;        // NOTEST
1152                 case Vlabel:                            // NOTEST
1153                         fprintf(f, "*label-%p*", v->label); break; // NOTEST
1154                 case Vstr:
1155                         fprintf(f, "%.*s", v->str.len, v->str.txt); break;
1156                 case Vbool:
1157                         fprintf(f, "%s", v->bool ? "True":"False"); break;
1158                 case Vnum:
1159                         {
1160                         mpf_t fl;
1161                         mpf_init2(fl, 20);
1162                         mpf_set_q(fl, v->num);
1163                         gmp_fprintf(f, "%.10Fg", fl);
1164                         mpf_clear(fl);
1165                         break;
1166                         }
1167                 }
1168         }
1169
1170         static void _free_value(struct type *type, struct value *v);
1171
1172         static int bool_test(struct type *type, struct value *v)
1173         {
1174                 return v->bool;
1175         }
1176
1177         static struct type base_prototype = {
1178                 .init = _val_init,
1179                 .print = _print_value,
1180                 .cmp_order = _value_cmp,
1181                 .cmp_eq = _value_cmp,
1182                 .dup = _dup_value,
1183                 .free = _free_value,
1184         };
1185
1186         static struct type *Tbool, *Tstr, *Tnum, *Tnone, *Tlabel;
1187
1188 ###### ast functions
1189         static struct type *add_base_type(struct parse_context *c, char *n,
1190                                           enum vtype vt, int size)
1191         {
1192                 struct text txt = { n, strlen(n) };
1193                 struct type *t;
1194
1195                 t = add_type(c, txt, &base_prototype);
1196                 t->vtype = vt;
1197                 t->size = size;
1198                 t->align = size > sizeof(void*) ? sizeof(void*) : size;
1199                 if (t->size & (t->align - 1))
1200                         t->size = (t->size | (t->align - 1)) + 1;       // NOTEST
1201                 return t;
1202         }
1203
1204 ###### context initialization
1205
1206         Tbool  = add_base_type(&context, "Boolean", Vbool, sizeof(char));
1207         Tbool->test = bool_test;
1208         Tstr   = add_base_type(&context, "string", Vstr, sizeof(struct text));
1209         Tnum   = add_base_type(&context, "number", Vnum, sizeof(mpq_t));
1210         Tnone  = add_base_type(&context, "none", Vnone, 0);
1211         Tlabel = add_base_type(&context, "label", Vlabel, sizeof(void*));
1212
1213 ##### Base Values
1214
1215 We have already met values as separate objects.  When manifest constants
1216 appear in the program text, that must result in an executable which has
1217 a constant value.  So the `val` structure embeds a value in an
1218 executable.
1219
1220 ###### exec type
1221         Xval,
1222
1223 ###### ast
1224         struct val {
1225                 struct exec;
1226                 struct type *vtype;
1227                 struct value val;
1228         };
1229
1230 ###### ast functions
1231         struct val *new_val(struct type *T, struct token tk)
1232         {
1233                 struct val *v = new_pos(val, tk);
1234                 v->vtype = T;
1235                 return v;
1236         }
1237
1238 ###### Grammar
1239
1240         $TERM True False
1241
1242         $*val
1243         Value ->  True ${
1244                 $0 = new_val(Tbool, $1);
1245                 $0->val.bool = 1;
1246         }$
1247         | False ${
1248                 $0 = new_val(Tbool, $1);
1249                 $0->val.bool = 0;
1250         }$
1251         | NUMBER ${ {
1252                 char tail[3];
1253                 $0 = new_val(Tnum, $1);
1254                 if (number_parse($0->val.num, tail, $1.txt) == 0)
1255                         mpq_init($0->val.num);  // UNTESTED
1256                         if (tail[0])
1257                                 tok_err(c, "error: unsupported number suffix",
1258                                         &$1);
1259         } }$
1260         | STRING ${ {
1261                 char tail[3];
1262                 $0 = new_val(Tstr, $1);
1263                 string_parse(&$1, '\\', &$0->val.str, tail);
1264                 if (tail[0])
1265                         tok_err(c, "error: unsupported string suffix",
1266                                 &$1);
1267         } }$
1268         | MULTI_STRING ${ {
1269                 char tail[3];
1270                 $0 = new_val(Tstr, $1);
1271                 string_parse(&$1, '\\', &$0->val.str, tail);
1272                 if (tail[0])
1273                         tok_err(c, "error: unsupported string suffix",
1274                                 &$1);
1275         } }$
1276
1277 ###### print exec cases
1278         case Xval:
1279         {
1280                 struct val *v = cast(val, e);
1281                 if (v->vtype == Tstr)
1282                         printf("\"");
1283                 // FIXME how to ensure numbers have same precision.
1284                 print_value(v->vtype, &v->val, stdout);
1285                 if (v->vtype == Tstr)
1286                         printf("\"");
1287                 break;
1288         }
1289
1290 ###### propagate exec cases
1291         case Xval:
1292         {
1293                 struct val *val = cast(val, prog);
1294                 if (!type_compat(type, val->vtype, rules))
1295                         type_err(c, "error: expected %1%r found %2",
1296                                    prog, type, rules, val->vtype);
1297                 return val->vtype;
1298         }
1299
1300 ###### interp exec cases
1301         case Xval:
1302                 rvtype = cast(val, e)->vtype;
1303                 dup_value(rvtype, &cast(val, e)->val, &rv);
1304                 break;
1305
1306 ###### ast functions
1307         static void free_val(struct val *v)
1308         {
1309                 if (v)
1310                         free_value(v->vtype, &v->val);
1311                 free(v);
1312         }
1313
1314 ###### free exec cases
1315         case Xval: free_val(cast(val, e)); break;
1316
1317 ###### ast functions
1318         // Move all nodes from 'b' to 'rv', reversing their order.
1319         // In 'b' 'left' is a list, and 'right' is the last node.
1320         // In 'rv', left' is the first node and 'right' is a list.
1321         static struct binode *reorder_bilist(struct binode *b)
1322         {
1323                 struct binode *rv = NULL;
1324
1325                 while (b) {
1326                         struct exec *t = b->right;
1327                         b->right = rv;
1328                         rv = b;
1329                         if (b->left)
1330                                 b = cast(binode, b->left);
1331                         else
1332                                 b = NULL;
1333                         rv->left = t;
1334                 }
1335                 return rv;
1336         }
1337
1338 ### Variables
1339
1340 Variables are scoped named values.  We store the names in a linked list
1341 of "bindings" sorted in lexical order, and use sequential search and
1342 insertion sort.
1343
1344 ###### ast
1345
1346         struct binding {
1347                 struct text name;
1348                 struct binding *next;   // in lexical order
1349                 ## binding fields
1350         };
1351
1352 This linked list is stored in the parse context so that "reduce"
1353 functions can find or add variables, and so the analysis phase can
1354 ensure that every variable gets a type.
1355
1356 ###### parse context
1357
1358         struct binding *varlist;  // In lexical order
1359
1360 ###### ast functions
1361
1362         static struct binding *find_binding(struct parse_context *c, struct text s)
1363         {
1364                 struct binding **l = &c->varlist;
1365                 struct binding *n;
1366                 int cmp = 1;
1367
1368                 while (*l &&
1369                         (cmp = text_cmp((*l)->name, s)) < 0)
1370                                 l = & (*l)->next;
1371                 if (cmp == 0)
1372                         return *l;
1373                 n = calloc(1, sizeof(*n));
1374                 n->name = s;
1375                 n->next = *l;
1376                 *l = n;
1377                 return n;
1378         }
1379
1380 Each name can be linked to multiple variables defined in different
1381 scopes.  Each scope starts where the name is declared and continues
1382 until the end of the containing code block.  Scopes of a given name
1383 cannot nest, so a declaration while a name is in-scope is an error.
1384
1385 ###### binding fields
1386         struct variable *var;
1387
1388 ###### ast
1389         struct variable {
1390                 struct variable *previous;
1391                 struct type *type;
1392                 struct binding *name;
1393                 struct exec *where_decl;// where name was declared
1394                 struct exec *where_set; // where type was set
1395                 ## variable fields
1396         };
1397
1398 When a scope closes, the values of the variables might need to be freed.
1399 This happens in the context of some `struct exec` and each `exec` will
1400 need to know which variables need to be freed when it completes.
1401
1402 ####### exec fields
1403         struct variable *to_free;
1404
1405 ####### variable fields
1406         struct exec *cleanup_exec;
1407         struct variable *next_free;
1408
1409 ####### interp exec cleanup
1410         {
1411                 struct variable *v;
1412                 for (v = e->to_free; v; v = v->next_free) {
1413                         struct value *val = var_value(c, v);
1414                         free_value(v->type, val);
1415                 }
1416         }
1417
1418 ###### ast functions
1419         static void variable_unlink_exec(struct variable *v)
1420         {
1421                 struct variable **vp;
1422                 if (!v->cleanup_exec)
1423                         return;
1424                 for (vp = &v->cleanup_exec->to_free;
1425                     *vp; vp = &(*vp)->next_free) {
1426                         if (*vp != v)
1427                                 continue;
1428                         *vp = v->next_free;
1429                         v->cleanup_exec = NULL;
1430                         break;
1431                 }
1432         }
1433
1434 While the naming seems strange, we include local constants in the
1435 definition of variables.  A name declared `var := value` can
1436 subsequently be changed, but a name declared `var ::= value` cannot -
1437 it is constant
1438
1439 ###### variable fields
1440         int constant;
1441
1442 Scopes in parallel branches can be partially merged.  More
1443 specifically, if a given name is declared in both branches of an
1444 if/else then its scope is a candidate for merging.  Similarly if
1445 every branch of an exhaustive switch (e.g. has an "else" clause)
1446 declares a given name, then the scopes from the branches are
1447 candidates for merging.
1448
1449 Note that names declared inside a loop (which is only parallel to
1450 itself) are never visible after the loop.  Similarly names defined in
1451 scopes which are not parallel, such as those started by `for` and
1452 `switch`, are never visible after the scope.  Only variables defined in
1453 both `then` and `else` (including the implicit then after an `if`, and
1454 excluding `then` used with `for`) and in all `case`s and `else` of a
1455 `switch` or `while` can be visible beyond the `if`/`switch`/`while`.
1456
1457 Labels, which are a bit like variables, follow different rules.
1458 Labels are not explicitly declared, but if an undeclared name appears
1459 in a context where a label is legal, that effectively declares the
1460 name as a label.  The declaration remains in force (or in scope) at
1461 least to the end of the immediately containing block and conditionally
1462 in any larger containing block which does not declare the name in some
1463 other way.  Importantly, the conditional scope extension happens even
1464 if the label is only used in one parallel branch of a conditional --
1465 when used in one branch it is treated as having been declared in all
1466 branches.
1467
1468 Merge candidates are tentatively visible beyond the end of the
1469 branching statement which creates them.  If the name is used, the
1470 merge is affirmed and they become a single variable visible at the
1471 outer layer.  If not - if it is redeclared first - the merge lapses.
1472
1473 To track scopes we have an extra stack, implemented as a linked list,
1474 which roughly parallels the parse stack and which is used exclusively
1475 for scoping.  When a new scope is opened, a new frame is pushed and
1476 the child-count of the parent frame is incremented.  This child-count
1477 is used to distinguish between the first of a set of parallel scopes,
1478 in which declared variables must not be in scope, and subsequent
1479 branches, whether they may already be conditionally scoped.
1480
1481 We need a total ordering of scopes so we can easily compare to variables
1482 to see if they are concurrently in scope.  To achieve this we record a
1483 `scope_count` which is actually a count of both beginnings and endings
1484 of scopes.  Then each variable has a record of the scope count where it
1485 enters scope, and where it leaves.
1486
1487 To push a new frame *before* any code in the frame is parsed, we need a
1488 grammar reduction.  This is most easily achieved with a grammar
1489 element which derives the empty string, and creates the new scope when
1490 it is recognised.  This can be placed, for example, between a keyword
1491 like "if" and the code following it.
1492
1493 ###### ast
1494         struct scope {
1495                 struct scope *parent;
1496                 int child_count;
1497         };
1498
1499 ###### parse context
1500         int scope_depth;
1501         int scope_count;
1502         struct scope *scope_stack;
1503
1504 ###### variable fields
1505         int scope_start, scope_end;
1506
1507 ###### ast functions
1508         static void scope_pop(struct parse_context *c)
1509         {
1510                 struct scope *s = c->scope_stack;
1511
1512                 c->scope_stack = s->parent;
1513                 free(s);
1514                 c->scope_depth -= 1;
1515                 c->scope_count += 1;
1516         }
1517
1518         static void scope_push(struct parse_context *c)
1519         {
1520                 struct scope *s = calloc(1, sizeof(*s));
1521                 if (c->scope_stack)
1522                         c->scope_stack->child_count += 1;
1523                 s->parent = c->scope_stack;
1524                 c->scope_stack = s;
1525                 c->scope_depth += 1;
1526                 c->scope_count += 1;
1527         }
1528
1529 ###### Grammar
1530
1531         $void
1532         OpenScope -> ${ scope_push(c); }$
1533
1534 Each variable records a scope depth and is in one of four states:
1535
1536 - "in scope".  This is the case between the declaration of the
1537   variable and the end of the containing block, and also between
1538   the usage with affirms a merge and the end of that block.
1539
1540   The scope depth is not greater than the current parse context scope
1541   nest depth.  When the block of that depth closes, the state will
1542   change.  To achieve this, all "in scope" variables are linked
1543   together as a stack in nesting order.
1544
1545 - "pending".  The "in scope" block has closed, but other parallel
1546   scopes are still being processed.  So far, every parallel block at
1547   the same level that has closed has declared the name.
1548
1549   The scope depth is the depth of the last parallel block that
1550   enclosed the declaration, and that has closed.
1551
1552 - "conditionally in scope".  The "in scope" block and all parallel
1553   scopes have closed, and no further mention of the name has been seen.
1554   This state includes a secondary nest depth (`min_depth`) which records
1555   the outermost scope seen since the variable became conditionally in
1556   scope.  If a use of the name is found, the variable becomes "in scope"
1557   and that secondary depth becomes the recorded scope depth.  If the
1558   name is declared as a new variable, the old variable becomes "out of
1559   scope" and the recorded scope depth stays unchanged.
1560
1561 - "out of scope".  The variable is neither in scope nor conditionally
1562   in scope.  It is permanently out of scope now and can be removed from
1563   the "in scope" stack.  When a variable becomes out-of-scope it is
1564   moved to a separate list (`out_scope`) of variables which have fully
1565   known scope.  This will be used at the end of each function to assign
1566   each variable a place in the stack frame.
1567
1568 ###### variable fields
1569         int depth, min_depth;
1570         enum { OutScope, PendingScope, CondScope, InScope } scope;
1571         struct variable *in_scope;
1572
1573 ###### parse context
1574
1575         struct variable *in_scope;
1576         struct variable *out_scope;
1577
1578 All variables with the same name are linked together using the
1579 'previous' link.  Those variable that have been affirmatively merged all
1580 have a 'merged' pointer that points to one primary variable - the most
1581 recently declared instance.  When merging variables, we need to also
1582 adjust the 'merged' pointer on any other variables that had previously
1583 been merged with the one that will no longer be primary.
1584
1585 A variable that is no longer the most recent instance of a name may
1586 still have "pending" scope, if it might still be merged with most
1587 recent instance.  These variables don't really belong in the
1588 "in_scope" list, but are not immediately removed when a new instance
1589 is found.  Instead, they are detected and ignored when considering the
1590 list of in_scope names.
1591
1592 The storage of the value of a variable will be described later.  For now
1593 we just need to know that when a variable goes out of scope, it might
1594 need to be freed.  For this we need to be able to find it, so assume that
1595 `var_value()` will provide that.
1596
1597 ###### variable fields
1598         struct variable *merged;
1599
1600 ###### ast functions
1601
1602         static void variable_merge(struct variable *primary, struct variable *secondary)
1603         {
1604                 struct variable *v;
1605
1606                 primary = primary->merged;
1607
1608                 for (v = primary->previous; v; v=v->previous)
1609                         if (v == secondary || v == secondary->merged ||
1610                             v->merged == secondary ||
1611                             v->merged == secondary->merged) {
1612                                 v->scope = OutScope;
1613                                 v->merged = primary;
1614                                 if (v->scope_start < primary->scope_start)
1615                                         primary->scope_start = v->scope_start;
1616                                 if (v->scope_end > primary->scope_end)
1617                                         primary->scope_end = v->scope_end;      // NOTEST
1618                                 variable_unlink_exec(v);
1619                         }
1620         }
1621
1622 ###### forward decls
1623         static struct value *var_value(struct parse_context *c, struct variable *v);
1624
1625 ###### free global vars
1626
1627         while (context.varlist) {
1628                 struct binding *b = context.varlist;
1629                 struct variable *v = b->var;
1630                 context.varlist = b->next;
1631                 free(b);
1632                 while (v) {
1633                         struct variable *next = v->previous;
1634
1635                         if (v->global && v->frame_pos >= 0) {
1636                                 free_value(v->type, var_value(&context, v));
1637                                 if (v->depth == 0 && v->type->free == function_free)
1638                                         // This is a function constant
1639                                         free_exec(v->where_decl);
1640                         }
1641                         free(v);
1642                         v = next;
1643                 }
1644         }
1645
1646 #### Manipulating Bindings
1647
1648 When a name is conditionally visible, a new declaration discards the old
1649 binding - the condition lapses.  Similarly when we reach the end of a
1650 function (outermost non-global scope) any conditional scope must lapse.
1651 Conversely a usage of the name affirms the visibility and extends it to
1652 the end of the containing block - i.e.  the block that contains both the
1653 original declaration and the latest usage.  This is determined from
1654 `min_depth`.  When a conditionally visible variable gets affirmed like
1655 this, it is also merged with other conditionally visible variables with
1656 the same name.
1657
1658 When we parse a variable declaration we either report an error if the
1659 name is currently bound, or create a new variable at the current nest
1660 depth if the name is unbound or bound to a conditionally scoped or
1661 pending-scope variable.  If the previous variable was conditionally
1662 scoped, it and its homonyms becomes out-of-scope.
1663
1664 When we parse a variable reference (including non-declarative assignment
1665 "foo = bar") we report an error if the name is not bound or is bound to
1666 a pending-scope variable; update the scope if the name is bound to a
1667 conditionally scoped variable; or just proceed normally if the named
1668 variable is in scope.
1669
1670 When we exit a scope, any variables bound at this level are either
1671 marked out of scope or pending-scoped, depending on whether the scope
1672 was sequential or parallel.  Here a "parallel" scope means the "then"
1673 or "else" part of a conditional, or any "case" or "else" branch of a
1674 switch.  Other scopes are "sequential".
1675
1676 When exiting a parallel scope we check if there are any variables that
1677 were previously pending and are still visible. If there are, then
1678 they weren't redeclared in the most recent scope, so they cannot be
1679 merged and must become out-of-scope.  If it is not the first of
1680 parallel scopes (based on `child_count`), we check that there was a
1681 previous binding that is still pending-scope.  If there isn't, the new
1682 variable must now be out-of-scope.
1683
1684 When exiting a sequential scope that immediately enclosed parallel
1685 scopes, we need to resolve any pending-scope variables.  If there was
1686 no `else` clause, and we cannot determine that the `switch` was exhaustive,
1687 we need to mark all pending-scope variable as out-of-scope.  Otherwise
1688 all pending-scope variables become conditionally scoped.
1689
1690 ###### ast
1691         enum closetype { CloseSequential, CloseFunction, CloseParallel, CloseElse };
1692
1693 ###### ast functions
1694
1695         static struct variable *var_decl(struct parse_context *c, struct text s)
1696         {
1697                 struct binding *b = find_binding(c, s);
1698                 struct variable *v = b->var;
1699
1700                 switch (v ? v->scope : OutScope) {
1701                 case InScope:
1702                         /* Caller will report the error */
1703                         return NULL;
1704                 case CondScope:
1705                         for (;
1706                              v && v->scope == CondScope;
1707                              v = v->previous)
1708                                 v->scope = OutScope;
1709                         break;
1710                 default: break;
1711                 }
1712                 v = calloc(1, sizeof(*v));
1713                 v->previous = b->var;
1714                 b->var = v;
1715                 v->name = b;
1716                 v->merged = v;
1717                 v->min_depth = v->depth = c->scope_depth;
1718                 v->scope = InScope;
1719                 v->in_scope = c->in_scope;
1720                 v->scope_start = c->scope_count;
1721                 c->in_scope = v;
1722                 ## variable init
1723                 return v;
1724         }
1725
1726         static struct variable *var_ref(struct parse_context *c, struct text s)
1727         {
1728                 struct binding *b = find_binding(c, s);
1729                 struct variable *v = b->var;
1730                 struct variable *v2;
1731
1732                 switch (v ? v->scope : OutScope) {
1733                 case OutScope:
1734                 case PendingScope:
1735                         /* Caller will report the error */
1736                         return NULL;
1737                 case CondScope:
1738                         /* All CondScope variables of this name need to be merged
1739                          * and become InScope
1740                          */
1741                         v->depth = v->min_depth;
1742                         v->scope = InScope;
1743                         for (v2 = v->previous;
1744                              v2 && v2->scope == CondScope;
1745                              v2 = v2->previous)
1746                                 variable_merge(v, v2);
1747                         break;
1748                 case InScope:
1749                         break;
1750                 }
1751                 return v;
1752         }
1753
1754         static int var_refile(struct parse_context *c, struct variable *v)
1755         {
1756                 /* Variable just went out of scope.  Add it to the out_scope
1757                  * list, sorted by ->scope_start
1758                  */
1759                 struct variable **vp = &c->out_scope;
1760                 while ((*vp) && (*vp)->scope_start < v->scope_start)
1761                         vp = &(*vp)->in_scope;
1762                 v->in_scope = *vp;
1763                 *vp = v;
1764                 return 0;
1765         }
1766
1767         static void var_block_close(struct parse_context *c, enum closetype ct,
1768                                     struct exec *e)
1769         {
1770                 /* Close off all variables that are in_scope.
1771                  * Some variables in c->scope may already be not-in-scope,
1772                  * such as when a PendingScope variable is hidden by a new
1773                  * variable with the same name.
1774                  * So we check for v->name->var != v and drop them.
1775                  * If we choose to make a variable OutScope, we drop it
1776                  * immediately too.
1777                  */
1778                 struct variable *v, **vp, *v2;
1779
1780                 scope_pop(c);
1781                 for (vp = &c->in_scope;
1782                      (v = *vp) && v->min_depth > c->scope_depth;
1783                      (v->scope == OutScope || v->name->var != v)
1784                      ? (*vp =  v->in_scope, var_refile(c, v))
1785                      : ( vp = &v->in_scope, 0)) {
1786                         v->min_depth = c->scope_depth;
1787                         if (v->name->var != v)
1788                                 /* This is still in scope, but we haven't just
1789                                  * closed the scope.
1790                                  */
1791                                 continue;
1792                         v->min_depth = c->scope_depth;
1793                         if (v->scope == InScope)
1794                                 v->scope_end = c->scope_count;
1795                         if (v->scope == InScope && e && !v->global) {
1796                                 /* This variable gets cleaned up when 'e' finishes */
1797                                 variable_unlink_exec(v);
1798                                 v->cleanup_exec = e;
1799                                 v->next_free = e->to_free;
1800                                 e->to_free = v;
1801                         }
1802                         switch (ct) {
1803                         case CloseElse:
1804                         case CloseParallel: /* handle PendingScope */
1805                                 switch(v->scope) {
1806                                 case InScope:
1807                                 case CondScope:
1808                                         if (c->scope_stack->child_count == 1)
1809                                                 /* first among parallel branches */
1810                                                 v->scope = PendingScope;
1811                                         else if (v->previous &&
1812                                                  v->previous->scope == PendingScope)
1813                                                 /* all previous branches used name */
1814                                                 v->scope = PendingScope;
1815                                         else if (v->type == Tlabel)
1816                                                 /* Labels remain pending even when not used */
1817                                                 v->scope = PendingScope;        // UNTESTED
1818                                         else
1819                                                 v->scope = OutScope;
1820                                         if (ct == CloseElse) {
1821                                                 /* All Pending variables with this name
1822                                                  * are now Conditional */
1823                                                 for (v2 = v;
1824                                                      v2 && v2->scope == PendingScope;
1825                                                      v2 = v2->previous)
1826                                                         v2->scope = CondScope;
1827                                         }
1828                                         break;
1829                                 case PendingScope:
1830                                         /* Not possible as it would require
1831                                          * parallel scope to be nested immediately
1832                                          * in a parallel scope, and that never
1833                                          * happens.
1834                                          */                     // NOTEST
1835                                 case OutScope:
1836                                         /* Not possible as we already tested for
1837                                          * OutScope
1838                                          */
1839                                         abort();                // NOTEST
1840                                 }
1841                                 break;
1842                         case CloseFunction:
1843                                 if (v->scope == CondScope)
1844                                         /* Condition cannot continue past end of function */
1845                                         v->scope = InScope;
1846                                 /* fallthrough */
1847                         case CloseSequential:
1848                                 if (v->type == Tlabel)
1849                                         v->scope = PendingScope;
1850                                 switch (v->scope) {
1851                                 case InScope:
1852                                         v->scope = OutScope;
1853                                         break;
1854                                 case PendingScope:
1855                                         /* There was no 'else', so we can only become
1856                                          * conditional if we know the cases were exhaustive,
1857                                          * and that doesn't mean anything yet.
1858                                          * So only labels become conditional..
1859                                          */
1860                                         for (v2 = v;
1861                                              v2 && v2->scope == PendingScope;
1862                                              v2 = v2->previous)
1863                                                 if (v2->type == Tlabel)
1864                                                         v2->scope = CondScope;
1865                                                 else
1866                                                         v2->scope = OutScope;
1867                                         break;
1868                                 case CondScope:
1869                                 case OutScope: break;
1870                                 }
1871                                 break;
1872                         }
1873                 }
1874         }
1875
1876 #### Storing Values
1877
1878 The value of a variable is store separately from the variable, on an
1879 analogue of a stack frame.  There are (currently) two frames that can be
1880 active.  A global frame which currently only stores constants, and a
1881 stacked frame which stores local variables.  Each variable knows if it
1882 is global or not, and what its index into the frame is.
1883
1884 Values in the global frame are known immediately they are relevant, so
1885 the frame needs to be reallocated as it grows so it can store those
1886 values.  The local frame doesn't get values until the interpreted phase
1887 is started, so there is no need to allocate until the size is known.
1888
1889 We initialize the `frame_pos` to an impossible value, so that we can
1890 tell if it was set or not later.
1891
1892 ###### variable fields
1893         short frame_pos;
1894         short global;
1895
1896 ###### variable init
1897         v->frame_pos = -1;
1898
1899 ###### parse context
1900
1901         short global_size, global_alloc;
1902         short local_size;
1903         void *global, *local;
1904
1905 ###### forward decls
1906         static struct value *global_alloc(struct parse_context *c, struct type *t,
1907                                           struct variable *v, struct value *init);
1908
1909 ###### ast functions
1910
1911         static struct value *var_value(struct parse_context *c, struct variable *v)
1912         {
1913                 if (!v->global) {
1914                         if (!c->local || !v->type)
1915                                 return NULL;    // UNTESTED
1916                         if (v->frame_pos + v->type->size > c->local_size) {
1917                                 printf("INVALID frame_pos\n");  // NOTEST
1918                                 exit(2);                        // NOTEST
1919                         }
1920                         return c->local + v->frame_pos;
1921                 }
1922                 if (c->global_size > c->global_alloc) {
1923                         int old = c->global_alloc;
1924                         c->global_alloc = (c->global_size | 1023) + 1024;
1925                         c->global = realloc(c->global, c->global_alloc);
1926                         memset(c->global + old, 0, c->global_alloc - old);
1927                 }
1928                 return c->global + v->frame_pos;
1929         }
1930
1931         static struct value *global_alloc(struct parse_context *c, struct type *t,
1932                                           struct variable *v, struct value *init)
1933         {
1934                 struct value *ret;
1935                 struct variable scratch;
1936
1937                 if (t->prepare_type)
1938                         t->prepare_type(c, t, 1);       // NOTEST
1939
1940                 if (c->global_size & (t->align - 1))
1941                         c->global_size = (c->global_size + t->align) & ~(t->align-1);   // NOTEST
1942                 if (!v) {
1943                         v = &scratch;
1944                         v->type = t;
1945                 }
1946                 v->frame_pos = c->global_size;
1947                 v->global = 1;
1948                 c->global_size += v->type->size;
1949                 ret = var_value(c, v);
1950                 if (init)
1951                         memcpy(ret, init, t->size);
1952                 else
1953                         val_init(t, ret);
1954                 return ret;
1955         }
1956
1957 As global values are found -- struct field initializers, labels etc --
1958 `global_alloc()` is called to record the value in the global frame.
1959
1960 When the program is fully parsed, each function is analysed, we need to
1961 walk the list of variables local to that function and assign them an
1962 offset in the stack frame.  For this we have `scope_finalize()`.
1963
1964 We keep the stack from dense by re-using space for between variables
1965 that are not in scope at the same time.  The `out_scope` list is sorted
1966 by `scope_start` and as we process a varible, we move it to an FIFO
1967 stack.  For each variable we consider, we first discard any from the
1968 stack anything that went out of scope before the new variable came in.
1969 Then we place the new variable just after the one at the top of the
1970 stack.
1971
1972 ###### ast functions
1973
1974         static void scope_finalize(struct parse_context *c, struct type *ft)
1975         {
1976                 int size = ft->function.local_size;
1977                 struct variable *next = ft->function.scope;
1978                 struct variable *done = NULL;
1979
1980                 while (next) {
1981                         struct variable *v = next;
1982                         struct type *t = v->type;
1983                         int pos;
1984                         next = v->in_scope;
1985                         if (v->merged != v)
1986                                 continue;
1987                         if (!t)
1988                                 continue;
1989                         if (v->frame_pos >= 0)
1990                                 continue;
1991                         while (done && done->scope_end < v->scope_start)
1992                                 done = done->in_scope;
1993                         if (done)
1994                                 pos = done->frame_pos + done->type->size;
1995                         else
1996                                 pos = ft->function.local_size;
1997                         if (pos & (t->align - 1))
1998                                 pos = (pos + t->align) & ~(t->align-1);
1999                         v->frame_pos = pos;
2000                         if (size < pos + v->type->size)
2001                                 size = pos + v->type->size;
2002                         v->in_scope = done;
2003                         done = v;
2004                 }
2005                 c->out_scope = NULL;
2006                 ft->function.local_size = size;
2007         }
2008
2009 ###### free context storage
2010         free(context.global);
2011
2012 #### Variables as executables
2013
2014 Just as we used a `val` to wrap a value into an `exec`, we similarly
2015 need a `var` to wrap a `variable` into an exec.  While each `val`
2016 contained a copy of the value, each `var` holds a link to the variable
2017 because it really is the same variable no matter where it appears.
2018 When a variable is used, we need to remember to follow the `->merged`
2019 link to find the primary instance.
2020
2021 When a variable is declared, it may or may not be given an explicit
2022 type.  We need to record which so that we can report the parsed code
2023 correctly.
2024
2025 ###### exec type
2026         Xvar,
2027
2028 ###### ast
2029         struct var {
2030                 struct exec;
2031                 struct variable *var;
2032         };
2033
2034 ###### variable fields
2035         int explicit_type;
2036
2037 ###### Grammar
2038
2039         $TERM : ::
2040
2041         $*var
2042         VariableDecl -> IDENTIFIER : ${ {
2043                 struct variable *v = var_decl(c, $1.txt);
2044                 $0 = new_pos(var, $1);
2045                 $0->var = v;
2046                 if (v)
2047                         v->where_decl = $0;
2048                 else {
2049                         v = var_ref(c, $1.txt);
2050                         $0->var = v;
2051                         type_err(c, "error: variable '%v' redeclared",
2052                                  $0, NULL, 0, NULL);
2053                         type_err(c, "info: this is where '%v' was first declared",
2054                                  v->where_decl, NULL, 0, NULL);
2055                 }
2056         } }$
2057         | IDENTIFIER :: ${ {
2058                 struct variable *v = var_decl(c, $1.txt);
2059                 $0 = new_pos(var, $1);
2060                 $0->var = v;
2061                 if (v) {
2062                         v->where_decl = $0;
2063                         v->constant = 1;
2064                 } else {
2065                         v = var_ref(c, $1.txt);
2066                         $0->var = v;
2067                         type_err(c, "error: variable '%v' redeclared",
2068                                  $0, NULL, 0, NULL);
2069                         type_err(c, "info: this is where '%v' was first declared",
2070                                  v->where_decl, NULL, 0, NULL);
2071                 }
2072         } }$
2073         | IDENTIFIER : Type ${ {
2074                 struct variable *v = var_decl(c, $1.txt);
2075                 $0 = new_pos(var, $1);
2076                 $0->var = v;
2077                 if (v) {
2078                         v->where_decl = $0;
2079                         v->where_set = $0;
2080                         v->type = $<Type;
2081                         v->explicit_type = 1;
2082                 } else {
2083                         v = var_ref(c, $1.txt);
2084                         $0->var = v;
2085                         type_err(c, "error: variable '%v' redeclared",
2086                                  $0, NULL, 0, NULL);
2087                         type_err(c, "info: this is where '%v' was first declared",
2088                                  v->where_decl, NULL, 0, NULL);
2089                 }
2090         } }$
2091         | IDENTIFIER :: Type ${ {
2092                 struct variable *v = var_decl(c, $1.txt);
2093                 $0 = new_pos(var, $1);
2094                 $0->var = v;
2095                 if (v) {
2096                         v->where_decl = $0;
2097                         v->where_set = $0;
2098                         v->type = $<Type;
2099                         v->constant = 1;
2100                         v->explicit_type = 1;
2101                 } else {
2102                         v = var_ref(c, $1.txt);
2103                         $0->var = v;
2104                         type_err(c, "error: variable '%v' redeclared",
2105                                  $0, NULL, 0, NULL);
2106                         type_err(c, "info: this is where '%v' was first declared",
2107                                  v->where_decl, NULL, 0, NULL);
2108                 }
2109         } }$
2110
2111         $*exec
2112         Variable -> IDENTIFIER ${ {
2113                 struct variable *v = var_ref(c, $1.txt);
2114                 $0 = new_pos(var, $1);
2115                 if (v == NULL) {
2116                         /* This might be a global const or a label
2117                          * Allocate a var with impossible type Tnone,
2118                          * which will be adjusted when we find out what it is,
2119                          * or will trigger an error.
2120                          */
2121                         v = var_decl(c, $1.txt);
2122                         if (v) {
2123                                 v->type = Tnone;
2124                                 v->where_decl = $0;
2125                                 v->where_set = $0;
2126                         }
2127                 }
2128                 cast(var, $0)->var = v;
2129         } }$
2130
2131 ###### print exec cases
2132         case Xvar:
2133         {
2134                 struct var *v = cast(var, e);
2135                 if (v->var) {
2136                         struct binding *b = v->var->name;
2137                         printf("%.*s", b->name.len, b->name.txt);
2138                 }
2139                 break;
2140         }
2141
2142 ###### format cases
2143         case 'v':
2144                 if (loc && loc->type == Xvar) {
2145                         struct var *v = cast(var, loc);
2146                         if (v->var) {
2147                                 struct binding *b = v->var->name;
2148                                 fprintf(stderr, "%.*s", b->name.len, b->name.txt);
2149                         } else
2150                                 fputs("???", stderr);   // NOTEST
2151                 } else
2152                         fputs("NOTVAR", stderr);
2153                 break;
2154
2155 ###### propagate exec cases
2156
2157         case Xvar:
2158         {
2159                 struct var *var = cast(var, prog);
2160                 struct variable *v = var->var;
2161                 if (!v) {
2162                         type_err(c, "%d:BUG: no variable!!", prog, NULL, 0, NULL); // NOTEST
2163                         return Tnone;                                   // NOTEST
2164                 }
2165                 v = v->merged;
2166                 if (v->constant && (rules & Rnoconstant)) {
2167                         type_err(c, "error: Cannot assign to a constant: %v",
2168                                  prog, NULL, 0, NULL);
2169                         type_err(c, "info: name was defined as a constant here",
2170                                  v->where_decl, NULL, 0, NULL);
2171                         return v->type;
2172                 }
2173                 if (v->type == Tnone && v->where_decl == prog)
2174                         type_err(c, "error: variable used but not declared: %v",
2175                                  prog, NULL, 0, NULL);
2176                 if (v->type == NULL) {
2177                         if (type && !(*perr & Efail)) {
2178                                 v->type = type;
2179                                 v->where_set = prog;
2180                                 *perr |= Eretry;
2181                         }
2182                 } else if (!type_compat(type, v->type, rules)) {
2183                         type_err(c, "error: expected %1%r but variable '%v' is %2", prog,
2184                                  type, rules, v->type);
2185                         type_err(c, "info: this is where '%v' was set to %1", v->where_set,
2186                                  v->type, rules, NULL);
2187                 }
2188                 if (!v->global || v->frame_pos < 0)
2189                         *perr |= Enoconst;
2190                 if (!type)
2191                         return v->type;
2192                 return type;
2193         }
2194
2195 ###### interp exec cases
2196         case Xvar:
2197         {
2198                 struct var *var = cast(var, e);
2199                 struct variable *v = var->var;
2200
2201                 v = v->merged;
2202                 lrv = var_value(c, v);
2203                 rvtype = v->type;
2204                 break;
2205         }
2206
2207 ###### ast functions
2208
2209         static void free_var(struct var *v)
2210         {
2211                 free(v);
2212         }
2213
2214 ###### free exec cases
2215         case Xvar: free_var(cast(var, e)); break;
2216
2217
2218 ### Complex types
2219
2220 Now that we have the shape of the interpreter in place we can add some
2221 complex types and connected them in to the data structures and the
2222 different phases of parse, analyse, print, interpret.
2223
2224 Being "complex" the language will naturally have syntax to access
2225 specifics of objects of these types.  These will fit into the grammar as
2226 "Terms" which are the things that are combined with various operators to
2227 form "Expression".  Where a Term is formed by some operation on another
2228 Term, the subordinate Term will always come first, so for example a
2229 member of an array will be expressed as the Term for the array followed
2230 by an index in square brackets.  The strict rule of using postfix
2231 operations makes precedence irrelevant within terms.  To provide a place
2232 to put the grammar for each terms of each type, we will start out by
2233 introducing the "Term" grammar production, with contains at least a
2234 simple "Value" (to be explained later).
2235
2236 ###### Grammar
2237         $*exec
2238         Term ->  Value ${ $0 = $<1; }$
2239         | Variable ${ $0 = $<1; }$
2240         ## term grammar
2241
2242 Thus far the complex types we have are arrays and structs.
2243
2244 #### Arrays
2245
2246 Arrays can be declared by giving a size and a type, as `[size]type' so
2247 `freq:[26]number` declares `freq` to be an array of 26 numbers.  The
2248 size can be either a literal number, or a named constant.  Some day an
2249 arbitrary expression will be supported.
2250
2251 As a formal parameter to a function, the array can be declared with a
2252 new variable as the size: `name:[size::number]string`.  The `size`
2253 variable is set to the size of the array and must be a constant.  As
2254 `number` is the only supported type, it can be left out:
2255 `name:[size::]string`.
2256
2257 Arrays cannot be assigned.  When pointers are introduced we will also
2258 introduce array slices which can refer to part or all of an array -
2259 the assignment syntax will create a slice.  For now, an array can only
2260 ever be referenced by the name it is declared with.  It is likely that
2261 a "`copy`" primitive will eventually be define which can be used to
2262 make a copy of an array with controllable recursive depth.
2263
2264 For now we have two sorts of array, those with fixed size either because
2265 it is given as a literal number or because it is a struct member (which
2266 cannot have a runtime-changing size), and those with a size that is
2267 determined at runtime - local variables with a const size.  The former
2268 have their size calculated at parse time, the latter at run time.
2269
2270 For the latter type, the `size` field of the type is the size of a
2271 pointer, and the array is reallocated every time it comes into scope.
2272
2273 We differentiate struct fields with a const size from local variables
2274 with a const size by whether they are prepared at parse time or not.
2275
2276 ###### type union fields
2277
2278         struct {
2279                 int unspec;     // size is unspecified - vsize must be set.
2280                 short size;
2281                 short static_size;
2282                 struct variable *vsize;
2283                 struct type *member;
2284         } array;
2285
2286 ###### value union fields
2287         void *array;  // used if not static_size
2288
2289 ###### value functions
2290
2291         static int array_prepare_type(struct parse_context *c, struct type *type,
2292                                        int parse_time)
2293         {
2294                 struct value *vsize;
2295                 mpz_t q;
2296                 if (type->array.static_size)
2297                         return 1;       // UNTESTED
2298                 if (type->array.unspec && parse_time)
2299                         return 1;       // UNTESTED
2300                 if (parse_time && type->array.vsize && !type->array.vsize->global)
2301                         return 1;       // UNTESTED
2302
2303                 if (type->array.vsize) {
2304                         vsize = var_value(c, type->array.vsize);
2305                         if (!vsize)
2306                                 return 1;       // UNTESTED
2307                         mpz_init(q);
2308                         mpz_tdiv_q(q, mpq_numref(vsize->num), mpq_denref(vsize->num));
2309                         type->array.size = mpz_get_si(q);
2310                         mpz_clear(q);
2311                 }
2312                 if (!parse_time)
2313                         return 1;
2314                 if (type->array.member->size <= 0)
2315                         return 0;       // UNTESTED
2316
2317                 type->array.static_size = 1;
2318                 type->size = type->array.size * type->array.member->size;
2319                 type->align = type->array.member->align;
2320
2321                 return 1;
2322         }
2323
2324         static void array_init(struct type *type, struct value *val)
2325         {
2326                 int i;
2327                 void *ptr = val->ptr;
2328
2329                 if (!val)
2330                         return;                         // NOTEST
2331                 if (!type->array.static_size) {
2332                         val->array = calloc(type->array.size,
2333                                             type->array.member->size);
2334                         ptr = val->array;
2335                 }
2336                 for (i = 0; i < type->array.size; i++) {
2337                         struct value *v;
2338                         v = (void*)ptr + i * type->array.member->size;
2339                         val_init(type->array.member, v);
2340                 }
2341         }
2342
2343         static void array_free(struct type *type, struct value *val)
2344         {
2345                 int i;
2346                 void *ptr = val->ptr;
2347
2348                 if (!type->array.static_size)
2349                         ptr = val->array;
2350                 for (i = 0; i < type->array.size; i++) {
2351                         struct value *v;
2352                         v = (void*)ptr + i * type->array.member->size;
2353                         free_value(type->array.member, v);
2354                 }
2355                 if (!type->array.static_size)
2356                         free(ptr);
2357         }
2358
2359         static int array_compat(struct type *require, struct type *have)
2360         {
2361                 if (have->compat != require->compat)
2362                         return 0;
2363                 /* Both are arrays, so we can look at details */
2364                 if (!type_compat(require->array.member, have->array.member, 0))
2365                         return 0;
2366                 if (have->array.unspec && require->array.unspec) {
2367                         if (have->array.vsize && require->array.vsize &&
2368                             have->array.vsize != require->array.vsize)  // UNTESTED
2369                                 /* sizes might not be the same */
2370                                 return 0;       // UNTESTED
2371                         return 1;
2372                 }
2373                 if (have->array.unspec || require->array.unspec)
2374                         return 1;       // UNTESTED
2375                 if (require->array.vsize == NULL && have->array.vsize == NULL)
2376                         return require->array.size == have->array.size;
2377
2378                 return require->array.vsize == have->array.vsize;       // UNTESTED
2379         }
2380
2381         static void array_print_type(struct type *type, FILE *f)
2382         {
2383                 fputs("[", f);
2384                 if (type->array.vsize) {
2385                         struct binding *b = type->array.vsize->name;
2386                         fprintf(f, "%.*s%s]", b->name.len, b->name.txt,
2387                                 type->array.unspec ? "::" : "");
2388                 } else if (type->array.size)
2389                         fprintf(f, "%d]", type->array.size);
2390                 else
2391                         fprintf(f, "]");
2392                 type_print(type->array.member, f);
2393         }
2394
2395         static struct type array_prototype = {
2396                 .init = array_init,
2397                 .prepare_type = array_prepare_type,
2398                 .print_type = array_print_type,
2399                 .compat = array_compat,
2400                 .free = array_free,
2401                 .size = sizeof(void*),
2402                 .align = sizeof(void*),
2403         };
2404
2405 ###### declare terminals
2406         $TERM [ ]
2407
2408 ###### type grammar
2409
2410         | [ NUMBER ] Type ${ {
2411                 char tail[3];
2412                 mpq_t num;
2413                 struct type *t;
2414                 int elements = 0;
2415
2416                 if (number_parse(num, tail, $2.txt) == 0)
2417                         tok_err(c, "error: unrecognised number", &$2);
2418                 else if (tail[0]) {
2419                         tok_err(c, "error: unsupported number suffix", &$2);
2420                         mpq_clear(num);
2421                 } else {
2422                         elements = mpz_get_ui(mpq_numref(num));
2423                         if (mpz_cmp_ui(mpq_denref(num), 1) != 0) {
2424                                 tok_err(c, "error: array size must be an integer",
2425                                         &$2);
2426                         } else if (mpz_cmp_ui(mpq_numref(num), 1UL << 30) >= 0)
2427                                 tok_err(c, "error: array size is too large",
2428                                         &$2);
2429                         mpq_clear(num);
2430                 }
2431
2432                 $0 = t = add_anon_type(c, &array_prototype, "array[%d]", elements );
2433                 t->array.size = elements;
2434                 t->array.member = $<4;
2435                 t->array.vsize = NULL;
2436         } }$
2437
2438         | [ IDENTIFIER ] Type ${ {
2439                 struct variable *v = var_ref(c, $2.txt);
2440
2441                 if (!v)
2442                         tok_err(c, "error: name undeclared", &$2);
2443                 else if (!v->constant)
2444                         tok_err(c, "error: array size must be a constant", &$2);
2445
2446                 $0 = add_anon_type(c, &array_prototype, "array[%.*s]", $2.txt.len, $2.txt.txt);
2447                 $0->array.member = $<4;
2448                 $0->array.size = 0;
2449                 $0->array.vsize = v;
2450         } }$
2451
2452 ###### Grammar
2453         $*type
2454         OptType -> Type ${ $0 = $<1; }$
2455                 | ${ $0 = NULL; }$
2456
2457 ###### formal type grammar
2458
2459         | [ IDENTIFIER :: OptType ] Type ${ {
2460                 struct variable *v = var_decl(c, $ID.txt);
2461
2462                 v->type = $<OT;
2463                 v->constant = 1;
2464                 if (!v->type)
2465                         v->type = Tnum;
2466                 $0 = add_anon_type(c, &array_prototype, "array[var]");
2467                 $0->array.member = $<6;
2468                 $0->array.size = 0;
2469                 $0->array.unspec = 1;
2470                 $0->array.vsize = v;
2471         } }$
2472
2473 ###### Binode types
2474         Index,
2475
2476 ###### term grammar
2477
2478         | Term [ Expression ] ${ {
2479                 struct binode *b = new(binode);
2480                 b->op = Index;
2481                 b->left = $<1;
2482                 b->right = $<3;
2483                 $0 = b;
2484         } }$
2485
2486 ###### print binode cases
2487         case Index:
2488                 print_exec(b->left, -1, bracket);
2489                 printf("[");
2490                 print_exec(b->right, -1, bracket);
2491                 printf("]");
2492                 break;
2493
2494 ###### propagate binode cases
2495         case Index:
2496                 /* left must be an array, right must be a number,
2497                  * result is the member type of the array
2498                  */
2499                 propagate_types(b->right, c, perr, Tnum, 0);
2500                 t = propagate_types(b->left, c, perr, NULL, rules & Rnoconstant);
2501                 if (!t || t->compat != array_compat) {
2502                         type_err(c, "error: %1 cannot be indexed", prog, t, 0, NULL);
2503                         return NULL;
2504                 } else {
2505                         if (!type_compat(type, t->array.member, rules)) {
2506                                 type_err(c, "error: have %1 but need %2", prog,
2507                                          t->array.member, rules, type);
2508                         }
2509                         return t->array.member;
2510                 }
2511                 break;
2512
2513 ###### interp binode cases
2514         case Index: {
2515                 mpz_t q;
2516                 long i;
2517                 void *ptr;
2518
2519                 lleft = linterp_exec(c, b->left, &ltype);
2520                 right = interp_exec(c, b->right, &rtype);
2521                 mpz_init(q);
2522                 mpz_tdiv_q(q, mpq_numref(right.num), mpq_denref(right.num));
2523                 i = mpz_get_si(q);
2524                 mpz_clear(q);
2525
2526                 if (ltype->array.static_size)
2527                         ptr = lleft;
2528                 else
2529                         ptr = *(void**)lleft;
2530                 rvtype = ltype->array.member;
2531                 if (i >= 0 && i < ltype->array.size)
2532                         lrv = ptr + i * rvtype->size;
2533                 else
2534                         val_init(ltype->array.member, &rv); // UNSAFE
2535                 ltype = NULL;
2536                 break;
2537         }
2538
2539 #### Structs
2540
2541 A `struct` is a data-type that contains one or more other data-types.
2542 It differs from an array in that each member can be of a different
2543 type, and they are accessed by name rather than by number.  Thus you
2544 cannot choose an element by calculation, you need to know what you
2545 want up-front.
2546
2547 The language makes no promises about how a given structure will be
2548 stored in memory - it is free to rearrange fields to suit whatever
2549 criteria seems important.
2550
2551 Structs are declared separately from program code - they cannot be
2552 declared in-line in a variable declaration like arrays can.  A struct
2553 is given a name and this name is used to identify the type - the name
2554 is not prefixed by the word `struct` as it would be in C.
2555
2556 Structs are only treated as the same if they have the same name.
2557 Simply having the same fields in the same order is not enough.  This
2558 might change once we can create structure initializers from a list of
2559 values.
2560
2561 Each component datum is identified much like a variable is declared,
2562 with a name, one or two colons, and a type.  The type cannot be omitted
2563 as there is no opportunity to deduce the type from usage.  An initial
2564 value can be given following an equals sign, so
2565
2566 ##### Example: a struct type
2567
2568         struct complex:
2569                 x:number = 0
2570                 y:number = 0
2571
2572 would declare a type called "complex" which has two number fields,
2573 each initialised to zero.
2574
2575 Struct will need to be declared separately from the code that uses
2576 them, so we will need to be able to print out the declaration of a
2577 struct when reprinting the whole program.  So a `print_type_decl` type
2578 function will be needed.
2579
2580 ###### type union fields
2581
2582         struct {
2583                 int nfields;
2584                 struct field {
2585                         struct text name;
2586                         struct type *type;
2587                         struct value *init;
2588                         int offset;
2589                 } *fields; // This is created when field_list is analysed.
2590                 struct fieldlist {
2591                         struct fieldlist *prev;
2592                         struct field f;
2593                         struct exec *init;
2594                 } *field_list; // This is created during parsing
2595         } structure;
2596
2597 ###### type functions
2598         void (*print_type_decl)(struct type *type, FILE *f);
2599
2600 ###### value functions
2601
2602         static void structure_init(struct type *type, struct value *val)
2603         {
2604                 int i;
2605
2606                 for (i = 0; i < type->structure.nfields; i++) {
2607                         struct value *v;
2608                         v = (void*) val->ptr + type->structure.fields[i].offset;
2609                         if (type->structure.fields[i].init)
2610                                 dup_value(type->structure.fields[i].type,
2611                                           type->structure.fields[i].init,
2612                                           v);
2613                         else
2614                                 val_init(type->structure.fields[i].type, v);
2615                 }
2616         }
2617
2618         static void structure_free(struct type *type, struct value *val)
2619         {
2620                 int i;
2621
2622                 for (i = 0; i < type->structure.nfields; i++) {
2623                         struct value *v;
2624                         v = (void*)val->ptr + type->structure.fields[i].offset;
2625                         free_value(type->structure.fields[i].type, v);
2626                 }
2627         }
2628
2629         static void free_fieldlist(struct fieldlist *f)
2630         {
2631                 if (!f)
2632                         return;
2633                 free_fieldlist(f->prev);
2634                 free_exec(f->init);
2635                 free(f);
2636         }
2637
2638         static void structure_free_type(struct type *t)
2639         {
2640                 int i;
2641                 for (i = 0; i < t->structure.nfields; i++)
2642                         if (t->structure.fields[i].init) {
2643                                 free_value(t->structure.fields[i].type,
2644                                            t->structure.fields[i].init);
2645                         }
2646                 free(t->structure.fields);
2647                 free_fieldlist(t->structure.field_list);
2648         }
2649
2650         static int structure_prepare_type(struct parse_context *c,
2651                                           struct type *t, int parse_time)
2652         {
2653                 int cnt = 0;
2654                 struct fieldlist *f;
2655
2656                 if (!parse_time || t->structure.fields)
2657                         return 1;
2658
2659                 for (f = t->structure.field_list; f; f=f->prev) {
2660                         enum prop_err perr;
2661                         cnt += 1;
2662
2663                         if (f->f.type->size <= 0)
2664                                 return 0;
2665                         if (f->f.type->prepare_type)
2666                                 f->f.type->prepare_type(c, f->f.type, parse_time);
2667
2668                         if (f->init == NULL)
2669                                 continue;
2670                         do {
2671                                 perr = 0;
2672                                 propagate_types(f->init, c, &perr, f->f.type, 0);
2673                         } while (perr & Eretry);
2674                         if (perr & Efail)
2675                                 c->parse_error += 1;    // NOTEST
2676                 }
2677
2678                 t->structure.nfields = cnt;
2679                 t->structure.fields = calloc(cnt, sizeof(struct field));
2680                 f = t->structure.field_list;
2681                 while (cnt > 0) {
2682                         int a = f->f.type->align;
2683                         cnt -= 1;
2684                         t->structure.fields[cnt] = f->f;
2685                         if (t->size & (a-1))
2686                                 t->size = (t->size | (a-1)) + 1;
2687                         t->structure.fields[cnt].offset = t->size;
2688                         t->size += ((f->f.type->size - 1) | (a-1)) + 1;
2689                         if (a > t->align)
2690                                 t->align = a;
2691
2692                         if (f->init && !c->parse_error) {
2693                                 struct value vl = interp_exec(c, f->init, NULL);
2694                                 t->structure.fields[cnt].init =
2695                                         global_alloc(c, f->f.type, NULL, &vl);
2696                         }
2697
2698                         f = f->prev;
2699                 }
2700                 return 1;
2701         }
2702
2703         static struct type structure_prototype = {
2704                 .init = structure_init,
2705                 .free = structure_free,
2706                 .free_type = structure_free_type,
2707                 .print_type_decl = structure_print_type,
2708                 .prepare_type = structure_prepare_type,
2709         };
2710
2711 ###### exec type
2712         Xfieldref,
2713
2714 ###### ast
2715         struct fieldref {
2716                 struct exec;
2717                 struct exec *left;
2718                 int index;
2719                 struct text name;
2720         };
2721
2722 ###### free exec cases
2723         case Xfieldref:
2724                 free_exec(cast(fieldref, e)->left);
2725                 free(e);
2726                 break;
2727
2728 ###### declare terminals
2729         $TERM struct .
2730
2731 ###### term grammar
2732
2733         | Term . IDENTIFIER ${ {
2734                 struct fieldref *fr = new_pos(fieldref, $2);
2735                 fr->left = $<1;
2736                 fr->name = $3.txt;
2737                 fr->index = -2;
2738                 $0 = fr;
2739         } }$
2740
2741 ###### print exec cases
2742
2743         case Xfieldref:
2744         {
2745                 struct fieldref *f = cast(fieldref, e);
2746                 print_exec(f->left, -1, bracket);
2747                 printf(".%.*s", f->name.len, f->name.txt);
2748                 break;
2749         }
2750
2751 ###### ast functions
2752         static int find_struct_index(struct type *type, struct text field)
2753         {
2754                 int i;
2755                 for (i = 0; i < type->structure.nfields; i++)
2756                         if (text_cmp(type->structure.fields[i].name, field) == 0)
2757                                 return i;
2758                 return -1;
2759         }
2760
2761 ###### propagate exec cases
2762
2763         case Xfieldref:
2764         {
2765                 struct fieldref *f = cast(fieldref, prog);
2766                 struct type *st = propagate_types(f->left, c, perr, NULL, 0);
2767
2768                 if (!st)
2769                         type_err(c, "error: unknown type for field access", f->left,    // UNTESTED
2770                                  NULL, 0, NULL);
2771                 else if (st->init != structure_init)
2772                         type_err(c, "error: field reference attempted on %1, not a struct",
2773                                  f->left, st, 0, NULL);
2774                 else if (f->index == -2) {
2775                         f->index = find_struct_index(st, f->name);
2776                         if (f->index < 0)
2777                                 type_err(c, "error: cannot find requested field in %1",
2778                                          f->left, st, 0, NULL);
2779                 }
2780                 if (f->index >= 0) {
2781                         struct type *ft = st->structure.fields[f->index].type;
2782                         if (!type_compat(type, ft, rules))
2783                                 type_err(c, "error: have %1 but need %2", prog,
2784                                          ft, rules, type);
2785                         return ft;
2786                 }
2787                 break;
2788         }
2789
2790 ###### interp exec cases
2791         case Xfieldref:
2792         {
2793                 struct fieldref *f = cast(fieldref, e);
2794                 struct type *ltype;
2795                 struct value *lleft = linterp_exec(c, f->left, &ltype);
2796                 lrv = (void*)lleft->ptr + ltype->structure.fields[f->index].offset;
2797                 rvtype = ltype->structure.fields[f->index].type;
2798                 break;
2799         }
2800
2801 ###### top level grammar
2802         DeclareStruct -> struct IDENTIFIER FieldBlock Newlines ${ {
2803                 struct type *t;
2804                 t = find_type(c, $ID.txt);
2805                 if (!t)
2806                         t = add_type(c, $ID.txt, &structure_prototype);
2807                 else if (t->size >= 0) {
2808                         tok_err(c, "error: type already declared", &$ID);
2809                         tok_err(c, "info: this is location of declartion", &t->first_use);
2810                         /* Create a new one - duplicate */
2811                         t = add_type(c, $ID.txt, &structure_prototype);
2812                 } else {
2813                         struct type tmp = *t;
2814                         *t = structure_prototype;
2815                         t->name = tmp.name;
2816                         t->next = tmp.next;
2817                 }
2818                 t->structure.field_list = $<FB;
2819                 t->first_use = $ID;
2820         } }$
2821
2822         $*fieldlist
2823         FieldBlock -> { IN OptNL FieldLines OUT OptNL } ${ $0 = $<FL; }$
2824         | { SimpleFieldList } ${ $0 = $<SFL; }$
2825         | IN OptNL FieldLines OUT ${ $0 = $<FL; }$
2826         | SimpleFieldList EOL ${ $0 = $<SFL; }$
2827
2828         FieldLines -> SimpleFieldList Newlines ${ $0 = $<SFL; }$
2829         | FieldLines SimpleFieldList Newlines ${
2830                 $SFL->prev = $<FL;
2831                 $0 = $<SFL;
2832         }$
2833
2834         SimpleFieldList -> Field ${ $0 = $<F; }$
2835         | SimpleFieldList ; Field ${
2836                 $F->prev = $<SFL;
2837                 $0 = $<F;
2838         }$
2839         | SimpleFieldList ; ${
2840                 $0 = $<SFL;
2841         }$
2842         | ERROR ${ tok_err(c, "Syntax error in struct field", &$1); }$
2843
2844         Field -> IDENTIFIER : Type = Expression ${ {
2845                 $0 = calloc(1, sizeof(struct fieldlist));
2846                 $0->f.name = $ID.txt;
2847                 $0->f.type = $<Type;
2848                 $0->f.init = NULL;
2849                 $0->init = $<Expr;
2850         } }$
2851         | IDENTIFIER : Type ${
2852                 $0 = calloc(1, sizeof(struct fieldlist));
2853                 $0->f.name = $ID.txt;
2854                 $0->f.type = $<Type;
2855         }$
2856
2857 ###### forward decls
2858         static void structure_print_type(struct type *t, FILE *f);
2859
2860 ###### value functions
2861         static void structure_print_type(struct type *t, FILE *f)
2862         {
2863                 int i;
2864
2865                 fprintf(f, "struct %.*s\n", t->name.len, t->name.txt);
2866
2867                 for (i = 0; i < t->structure.nfields; i++) {
2868                         struct field *fl = t->structure.fields + i;
2869                         fprintf(f, "    %.*s : ", fl->name.len, fl->name.txt);
2870                         type_print(fl->type, f);
2871                         if (fl->type->print && fl->init) {
2872                                 fprintf(f, " = ");
2873                                 if (fl->type == Tstr)
2874                                         fprintf(f, "\"");       // UNTESTED
2875                                 print_value(fl->type, fl->init, f);
2876                                 if (fl->type == Tstr)
2877                                         fprintf(f, "\"");       // UNTESTED
2878                         }
2879                         fprintf(f, "\n");
2880                 }
2881         }
2882
2883 ###### print type decls
2884         {
2885                 struct type *t;
2886                 int target = -1;
2887
2888                 while (target != 0) {
2889                         int i = 0;
2890                         for (t = context.typelist; t ; t=t->next)
2891                                 if (!t->anon && t->print_type_decl &&
2892                                     !t->check_args) {
2893                                         i += 1;
2894                                         if (i == target)
2895                                                 break;
2896                                 }
2897
2898                         if (target == -1) {
2899                                 target = i;
2900                         } else {
2901                                 t->print_type_decl(t, stdout);
2902                                 target -= 1;
2903                         }
2904                 }
2905         }
2906
2907 #### Functions
2908
2909 A function is a chunk of code which can be passed parameters and can
2910 return results.  Each function has a type which includes the set of
2911 parameters and the return value.  As yet these types cannot be declared
2912 separately from the function itself.
2913
2914 The parameters can be specified either in parentheses as a ';' separated
2915 list, such as
2916
2917 ##### Example: function 1
2918
2919         func main(av:[ac::number]string; env:[envc::number]string)
2920                 code block
2921
2922 or as an indented list of one parameter per line (though each line can
2923 be a ';' separated list)
2924
2925 ##### Example: function 2
2926
2927         func main
2928                 argv:[argc::number]string
2929                 env:[envc::number]string
2930         do
2931                 code block
2932
2933 In the first case a return type can follow the parentheses after a colon,
2934 in the second it is given on a line starting with the word `return`.
2935
2936 ##### Example: functions that return
2937
2938         func add(a:number; b:number): number
2939                 code block
2940
2941         func catenate
2942                 a: string
2943                 b: string
2944         return string
2945         do
2946                 code block
2947
2948 Rather than returning a type, the function can specify a set of local
2949 variables to return as a struct.  The values of these variables when the
2950 function exits will be provided to the caller.  For this the return type
2951 is replaced with a block of result declarations, either in parentheses
2952 or bracketed by `return` and `do`.
2953
2954 ##### Example: functions returning multiple variables
2955
2956         func to_cartesian(rho:number; theta:number):(x:number; y:number)
2957                 x = .....
2958                 y = .....
2959
2960         func to_polar
2961                 x:number; y:number
2962         return
2963                 rho:number
2964                 theta:number
2965         do
2966                 rho = ....
2967                 theta = ....
2968
2969 For constructing the lists we use a `List` binode, which will be
2970 further detailed when Expression Lists are introduced.
2971
2972 ###### type union fields
2973
2974         struct {
2975                 struct binode *params;
2976                 struct type *return_type;
2977                 struct variable *scope;
2978                 int inline_result;      // return value is at start of 'local'
2979                 int local_size;
2980         } function;
2981
2982 ###### value union fields
2983         struct exec *function;
2984
2985 ###### type functions
2986         void (*check_args)(struct parse_context *c, enum prop_err *perr,
2987                            struct type *require, struct exec *args);
2988
2989 ###### value functions
2990
2991         static void function_free(struct type *type, struct value *val)
2992         {
2993                 free_exec(val->function);
2994                 val->function = NULL;
2995         }
2996
2997         static int function_compat(struct type *require, struct type *have)
2998         {
2999                 // FIXME can I do anything here yet?
3000                 return 0;
3001         }
3002
3003         static void function_check_args(struct parse_context *c, enum prop_err *perr,
3004                                         struct type *require, struct exec *args)
3005         {
3006                 /* This should be 'compat', but we don't have a 'tuple' type to
3007                  * hold the type of 'args'
3008                  */
3009                 struct binode *arg = cast(binode, args);
3010                 struct binode *param = require->function.params;
3011
3012                 while (param) {
3013                         struct var *pv = cast(var, param->left);
3014                         if (!arg) {
3015                                 type_err(c, "error: insufficient arguments to function.",
3016                                          args, NULL, 0, NULL);
3017                                 break;
3018                         }
3019                         *perr = 0;
3020                         propagate_types(arg->left, c, perr, pv->var->type, 0);
3021                         param = cast(binode, param->right);
3022                         arg = cast(binode, arg->right);
3023                 }
3024                 if (arg)
3025                         type_err(c, "error: too many arguments to function.",
3026                                  args, NULL, 0, NULL);
3027         }
3028
3029         static void function_print(struct type *type, struct value *val, FILE *f)
3030         {
3031                 print_exec(val->function, 1, 0);
3032         }
3033
3034         static void function_print_type_decl(struct type *type, FILE *f)
3035         {
3036                 struct binode *b;
3037                 fprintf(f, "(");
3038                 for (b = type->function.params; b; b = cast(binode, b->right)) {
3039                         struct variable *v = cast(var, b->left)->var;
3040                         fprintf(f, "%.*s%s", v->name->name.len, v->name->name.txt,
3041                                 v->constant ? "::" : ":");
3042                         type_print(v->type, f);
3043                         if (b->right)
3044                                 fprintf(f, "; ");
3045                 }
3046                 fprintf(f, ")");
3047                 if (type->function.return_type != Tnone) {
3048                         fprintf(f, ":");
3049                         if (type->function.inline_result) {
3050                                 int i;
3051                                 struct type *t = type->function.return_type;
3052                                 fprintf(f, " (");
3053                                 for (i = 0; i < t->structure.nfields; i++) {
3054                                         struct field *fl = t->structure.fields + i;
3055                                         if (i)
3056                                                 fprintf(f, "; ");
3057                                         fprintf(f, "%.*s:", fl->name.len, fl->name.txt);
3058                                         type_print(fl->type, f);
3059                                 }
3060                                 fprintf(f, ")");
3061                         } else
3062                                 type_print(type->function.return_type, f);
3063                 }
3064                 fprintf(f, "\n");
3065         }
3066
3067         static void function_free_type(struct type *t)
3068         {
3069                 free_exec(t->function.params);
3070         }
3071
3072         static struct type function_prototype = {
3073                 .size = sizeof(void*),
3074                 .align = sizeof(void*),
3075                 .free = function_free,
3076                 .compat = function_compat,
3077                 .check_args = function_check_args,
3078                 .print = function_print,
3079                 .print_type_decl = function_print_type_decl,
3080                 .free_type = function_free_type,
3081         };
3082
3083 ###### declare terminals
3084
3085         $TERM func
3086
3087 ###### Binode types
3088         List,
3089
3090 ###### Grammar
3091
3092         $*variable
3093         FuncName -> IDENTIFIER ${ {
3094                 struct variable *v = var_decl(c, $1.txt);
3095                 struct var *e = new_pos(var, $1);
3096                 e->var = v;
3097                 if (v) {
3098                         v->where_decl = e;
3099                         $0 = v;
3100                 } else {
3101                         v = var_ref(c, $1.txt);
3102                         e->var = v;
3103                         type_err(c, "error: function '%v' redeclared",
3104                                 e, NULL, 0, NULL);
3105                         type_err(c, "info: this is where '%v' was first declared",
3106                                 v->where_decl, NULL, 0, NULL);
3107                         free_exec(e);
3108                 }
3109         } }$
3110
3111         $*binode
3112         Args -> ArgsLine NEWLINE ${ $0 = $<AL; }$
3113         | Args ArgsLine NEWLINE ${ {
3114                 struct binode *b = $<AL;
3115                 struct binode **bp = &b;
3116                 while (*bp)
3117                         bp = (struct binode **)&(*bp)->left;
3118                 *bp = $<A;
3119                 $0 = b;
3120         } }$
3121
3122         ArgsLine -> ${ $0 = NULL; }$
3123         | Varlist ${ $0 = $<1; }$
3124         | Varlist ; ${ $0 = $<1; }$
3125
3126         Varlist -> Varlist ; ArgDecl ${
3127                 $0 = new(binode);
3128                 $0->op = List;
3129                 $0->left = $<Vl;
3130                 $0->right = $<AD;
3131         }$
3132         | ArgDecl ${
3133                 $0 = new(binode);
3134                 $0->op = List;
3135                 $0->left = NULL;
3136                 $0->right = $<AD;
3137         }$
3138
3139         $*var
3140         ArgDecl -> IDENTIFIER : FormalType ${ {
3141                 struct variable *v = var_decl(c, $1.txt);
3142                 $0 = new(var);
3143                 $0->var = v;
3144                 v->type = $<FT;
3145         } }$
3146
3147 ##### Function calls
3148
3149 A function call can appear either as an expression or as a statement.
3150 We use a new 'Funcall' binode type to link the function with a list of
3151 arguments, form with the 'List' nodes.
3152
3153 We have already seen the "Term" which is how a function call can appear
3154 in an expression.  To parse a function call into a statement we include
3155 it in the "SimpleStatement Grammar" which will be described later.
3156
3157 ###### Binode types
3158         Funcall,
3159
3160 ###### term grammar
3161         | Term ( ExpressionList ) ${ {
3162                 struct binode *b = new(binode);
3163                 b->op = Funcall;
3164                 b->left = $<T;
3165                 b->right = reorder_bilist($<EL);
3166                 $0 = b;
3167         } }$
3168         | Term ( ) ${ {
3169                 struct binode *b = new(binode);
3170                 b->op = Funcall;
3171                 b->left = $<T;
3172                 b->right = NULL;
3173                 $0 = b;
3174         } }$
3175
3176 ###### SimpleStatement Grammar
3177
3178         | Term ( ExpressionList ) ${ {
3179                 struct binode *b = new(binode);
3180                 b->op = Funcall;
3181                 b->left = $<T;
3182                 b->right = reorder_bilist($<EL);
3183                 $0 = b;
3184         } }$
3185
3186 ###### print binode cases
3187
3188         case Funcall:
3189                 do_indent(indent, "");
3190                 print_exec(b->left, -1, bracket);
3191                 printf("(");
3192                 for (b = cast(binode, b->right); b; b = cast(binode, b->right)) {
3193                         if (b->left) {
3194                                 printf(" ");
3195                                 print_exec(b->left, -1, bracket);
3196                                 if (b->right)
3197                                         printf(",");
3198                         }
3199                 }
3200                 printf(")");
3201                 if (indent >= 0)
3202                         printf("\n");
3203                 break;
3204
3205 ###### propagate binode cases
3206
3207         case Funcall: {
3208                 /* Every arg must match formal parameter, and result
3209                  * is return type of function
3210                  */
3211                 struct binode *args = cast(binode, b->right);
3212                 struct var *v = cast(var, b->left);
3213
3214                 if (!v->var->type || v->var->type->check_args == NULL) {
3215                         type_err(c, "error: attempt to call a non-function.",
3216                                  prog, NULL, 0, NULL);
3217                         return NULL;
3218                 }
3219                 *perr |= Enoconst;
3220                 v->var->type->check_args(c, perr, v->var->type, args);
3221                 if (v->var->type->function.inline_result)
3222                         *perr |= Emaycopy;
3223                 return v->var->type->function.return_type;
3224         }
3225
3226 ###### interp binode cases
3227
3228         case Funcall: {
3229                 struct var *v = cast(var, b->left);
3230                 struct type *t = v->var->type;
3231                 void *oldlocal = c->local;
3232                 int old_size = c->local_size;
3233                 void *local = calloc(1, t->function.local_size);
3234                 struct value *fbody = var_value(c, v->var);
3235                 struct binode *arg = cast(binode, b->right);
3236                 struct binode *param = t->function.params;
3237
3238                 while (param) {
3239                         struct var *pv = cast(var, param->left);
3240                         struct type *vtype = NULL;
3241                         struct value val = interp_exec(c, arg->left, &vtype);
3242                         struct value *lval;
3243                         c->local = local; c->local_size = t->function.local_size;
3244                         lval = var_value(c, pv->var);
3245                         c->local = oldlocal; c->local_size = old_size;
3246                         memcpy(lval, &val, vtype->size);
3247                         param = cast(binode, param->right);
3248                         arg = cast(binode, arg->right);
3249                 }
3250                 c->local = local; c->local_size = t->function.local_size;
3251                 if (t->function.inline_result && dtype) {
3252                         _interp_exec(c, fbody->function, NULL, NULL);
3253                         memcpy(dest, local, dtype->size);
3254                         rvtype = ret.type = NULL;
3255                 } else
3256                         rv = interp_exec(c, fbody->function, &rvtype);
3257                 c->local = oldlocal; c->local_size = old_size;
3258                 free(local);
3259                 break;
3260         }
3261
3262 ## Complex executables: statements and expressions
3263
3264 Now that we have types and values and variables and most of the basic
3265 Terms which provide access to these, we can explore the more complex
3266 code that combine all of these to get useful work done.  Specifically
3267 statements and expressions.
3268
3269 Expressions are various combinations of Terms.  We will use operator
3270 precedence to ensure correct parsing.  The simplest Expression is just a
3271 Term - others will follow.
3272
3273 ###### Grammar
3274
3275         $*exec
3276         Expression -> Term ${ $0 = $<Term; }$
3277         ## expression grammar
3278
3279 ### Expressions: Conditional
3280
3281 Our first user of the `binode` will be conditional expressions, which
3282 is a bit odd as they actually have three components.  That will be
3283 handled by having 2 binodes for each expression.  The conditional
3284 expression is the lowest precedence operator which is why we define it
3285 first - to start the precedence list.
3286
3287 Conditional expressions are of the form "value `if` condition `else`
3288 other_value".  They associate to the right, so everything to the right
3289 of `else` is part of an else value, while only a higher-precedence to
3290 the left of `if` is the if values.  Between `if` and `else` there is no
3291 room for ambiguity, so a full conditional expression is allowed in
3292 there.
3293
3294 ###### Binode types
3295         CondExpr,
3296
3297 ###### declare terminals
3298
3299         $LEFT if $$ifelse
3300
3301 ###### expression grammar
3302
3303         | Expression if Expression else Expression $$ifelse ${ {
3304                 struct binode *b1 = new(binode);
3305                 struct binode *b2 = new(binode);
3306                 b1->op = CondExpr;
3307                 b1->left = $<3;
3308                 b1->right = b2;
3309                 b2->op = CondExpr;
3310                 b2->left = $<1;
3311                 b2->right = $<5;
3312                 $0 = b1;
3313         } }$
3314
3315 ###### print binode cases
3316
3317         case CondExpr:
3318                 b2 = cast(binode, b->right);
3319                 if (bracket) printf("(");
3320                 print_exec(b2->left, -1, bracket);
3321                 printf(" if ");
3322                 print_exec(b->left, -1, bracket);
3323                 printf(" else ");
3324                 print_exec(b2->right, -1, bracket);
3325                 if (bracket) printf(")");
3326                 break;
3327
3328 ###### propagate binode cases
3329
3330         case CondExpr: {
3331                 /* cond must be Tbool, others must match */
3332                 struct binode *b2 = cast(binode, b->right);
3333                 struct type *t2;
3334
3335                 propagate_types(b->left, c, perr, Tbool, 0);
3336                 t = propagate_types(b2->left, c, perr, type, Rnolabel);
3337                 t2 = propagate_types(b2->right, c, perr, type ?: t, Rnolabel);
3338                 return t ?: t2;
3339         }
3340
3341 ###### interp binode cases
3342
3343         case CondExpr: {
3344                 struct binode *b2 = cast(binode, b->right);
3345                 left = interp_exec(c, b->left, &ltype);
3346                 if (left.bool)
3347                         rv = interp_exec(c, b2->left, &rvtype); // UNTESTED
3348                 else
3349                         rv = interp_exec(c, b2->right, &rvtype);
3350                 }
3351                 break;
3352
3353 ### Expression list
3354
3355 We take a brief detour, now that we have expressions, to describe lists
3356 of expressions.  These will be needed for function parameters and
3357 possibly other situations.  They seem generic enough to introduce here
3358 to be used elsewhere.
3359
3360 And ExpressionList will use the `List` type of `binode`, building up at
3361 the end.  And place where they are used will probably call
3362 `reorder_bilist()` to get a more normal first/next arrangement.
3363
3364 ###### declare terminals
3365         $TERM ,
3366
3367 `List` execs have no implicit semantics, so they are never propagated or
3368 interpreted.  The can be printed as a comma separate list, which is how
3369 they are parsed.  Note they are also used for function formal parameter
3370 lists.  In that case a separate function is used to print them.
3371
3372 ###### print binode cases
3373         case List:
3374                 while (b) {
3375                         printf(" ");
3376                         print_exec(b->left, -1, bracket);
3377                         if (b->right)
3378                                 printf(",");
3379                         b = cast(binode, b->right);
3380                 }
3381                 break;
3382
3383 ###### propagate binode cases
3384         case List: abort(); // NOTEST
3385 ###### interp binode cases
3386         case List: abort(); // NOTEST
3387
3388 ###### Grammar
3389
3390         $*binode
3391         ExpressionList -> ExpressionList , Expression ${
3392                 $0 = new(binode);
3393                 $0->op = List;
3394                 $0->left = $<1;
3395                 $0->right = $<3;
3396         }$
3397         | Expression ${
3398                 $0 = new(binode);
3399                 $0->op = List;
3400                 $0->left = NULL;
3401                 $0->right = $<1;
3402         }$
3403
3404 ### Expressions: Boolean
3405
3406 The next class of expressions to use the `binode` will be Boolean
3407 expressions.  "`and then`" and "`or else`" are similar to `and` and `or`
3408 have same corresponding precendence.  The difference is that they don't
3409 evaluate the second expression if not necessary.
3410
3411 ###### Binode types
3412         And,
3413         AndThen,
3414         Or,
3415         OrElse,
3416         Not,
3417
3418 ###### declare terminals
3419         $LEFT or
3420         $LEFT and
3421         $LEFT not
3422
3423 ###### expression grammar
3424         | Expression or Expression ${ {
3425                 struct binode *b = new(binode);
3426                 b->op = Or;
3427                 b->left = $<1;
3428                 b->right = $<3;
3429                 $0 = b;
3430         } }$
3431         | Expression or else Expression ${ {
3432                 struct binode *b = new(binode);
3433                 b->op = OrElse;
3434                 b->left = $<1;
3435                 b->right = $<4;
3436                 $0 = b;
3437         } }$
3438
3439         | Expression and Expression ${ {
3440                 struct binode *b = new(binode);
3441                 b->op = And;
3442                 b->left = $<1;
3443                 b->right = $<3;
3444                 $0 = b;
3445         } }$
3446         | Expression and then Expression ${ {
3447                 struct binode *b = new(binode);
3448                 b->op = AndThen;
3449                 b->left = $<1;
3450                 b->right = $<4;
3451                 $0 = b;
3452         } }$
3453
3454         | not Expression ${ {
3455                 struct binode *b = new(binode);
3456                 b->op = Not;
3457                 b->right = $<2;
3458                 $0 = b;
3459         } }$
3460
3461 ###### print binode cases
3462         case And:
3463                 if (bracket) printf("(");
3464                 print_exec(b->left, -1, bracket);
3465                 printf(" and ");
3466                 print_exec(b->right, -1, bracket);
3467                 if (bracket) printf(")");
3468                 break;
3469         case AndThen:
3470                 if (bracket) printf("(");
3471                 print_exec(b->left, -1, bracket);
3472                 printf(" and then ");
3473                 print_exec(b->right, -1, bracket);
3474                 if (bracket) printf(")");
3475                 break;
3476         case Or:
3477                 if (bracket) printf("(");
3478                 print_exec(b->left, -1, bracket);
3479                 printf(" or ");
3480                 print_exec(b->right, -1, bracket);
3481                 if (bracket) printf(")");
3482                 break;
3483         case OrElse:
3484                 if (bracket) printf("(");
3485                 print_exec(b->left, -1, bracket);
3486                 printf(" or else ");
3487                 print_exec(b->right, -1, bracket);
3488                 if (bracket) printf(")");
3489                 break;
3490         case Not:
3491                 if (bracket) printf("(");
3492                 printf("not ");
3493                 print_exec(b->right, -1, bracket);
3494                 if (bracket) printf(")");
3495                 break;
3496
3497 ###### propagate binode cases
3498         case And:
3499         case AndThen:
3500         case Or:
3501         case OrElse:
3502         case Not:
3503                 /* both must be Tbool, result is Tbool */
3504                 propagate_types(b->left, c, perr, Tbool, 0);
3505                 propagate_types(b->right, c, perr, Tbool, 0);
3506                 if (type && type != Tbool)
3507                         type_err(c, "error: %1 operation found where %2 expected", prog,
3508                                    Tbool, 0, type);
3509                 return Tbool;
3510
3511 ###### interp binode cases
3512         case And:
3513                 rv = interp_exec(c, b->left, &rvtype);
3514                 right = interp_exec(c, b->right, &rtype);
3515                 rv.bool = rv.bool && right.bool;
3516                 break;
3517         case AndThen:
3518                 rv = interp_exec(c, b->left, &rvtype);
3519                 if (rv.bool)
3520                         rv = interp_exec(c, b->right, NULL);
3521                 break;
3522         case Or:
3523                 rv = interp_exec(c, b->left, &rvtype);
3524                 right = interp_exec(c, b->right, &rtype);
3525                 rv.bool = rv.bool || right.bool;
3526                 break;
3527         case OrElse:
3528                 rv = interp_exec(c, b->left, &rvtype);
3529                 if (!rv.bool)
3530                         rv = interp_exec(c, b->right, NULL);
3531                 break;
3532         case Not:
3533                 rv = interp_exec(c, b->right, &rvtype);
3534                 rv.bool = !rv.bool;
3535                 break;
3536
3537 ### Expressions: Comparison
3538
3539 Of slightly higher precedence that Boolean expressions are Comparisons.
3540 A comparison takes arguments of any comparable type, but the two types
3541 must be the same.
3542
3543 To simplify the parsing we introduce an `eop` which can record an
3544 expression operator, and the `CMPop` non-terminal will match one of them.
3545
3546 ###### ast
3547         struct eop {
3548                 enum Btype op;
3549         };
3550
3551 ###### ast functions
3552         static void free_eop(struct eop *e)
3553         {
3554                 if (e)
3555                         free(e);
3556         }
3557
3558 ###### Binode types
3559         Less,
3560         Gtr,
3561         LessEq,
3562         GtrEq,
3563         Eql,
3564         NEql,
3565
3566 ###### declare terminals
3567         $LEFT < > <= >= == != CMPop
3568
3569 ###### expression grammar
3570         | Expression CMPop Expression ${ {
3571                 struct binode *b = new(binode);
3572                 b->op = $2.op;
3573                 b->left = $<1;
3574                 b->right = $<3;
3575                 $0 = b;
3576         } }$
3577
3578 ###### Grammar
3579
3580         $eop
3581         CMPop ->  < ${ $0.op = Less; }$
3582         |         > ${ $0.op = Gtr; }$
3583         |         <= ${ $0.op = LessEq; }$
3584         |         >= ${ $0.op = GtrEq; }$
3585         |         == ${ $0.op = Eql; }$
3586         |         != ${ $0.op = NEql; }$
3587
3588 ###### print binode cases
3589
3590         case Less:
3591         case LessEq:
3592         case Gtr:
3593         case GtrEq:
3594         case Eql:
3595         case NEql:
3596                 if (bracket) printf("(");
3597                 print_exec(b->left, -1, bracket);
3598                 switch(b->op) {
3599                 case Less:   printf(" < "); break;
3600                 case LessEq: printf(" <= "); break;
3601                 case Gtr:    printf(" > "); break;
3602                 case GtrEq:  printf(" >= "); break;
3603                 case Eql:    printf(" == "); break;
3604                 case NEql:   printf(" != "); break;
3605                 default: abort();               // NOTEST
3606                 }
3607                 print_exec(b->right, -1, bracket);
3608                 if (bracket) printf(")");
3609                 break;
3610
3611 ###### propagate binode cases
3612         case Less:
3613         case LessEq:
3614         case Gtr:
3615         case GtrEq:
3616         case Eql:
3617         case NEql:
3618                 /* Both must match but not be labels, result is Tbool */
3619                 t = propagate_types(b->left, c, perr, NULL, Rnolabel);
3620                 if (t)
3621                         propagate_types(b->right, c, perr, t, 0);
3622                 else {
3623                         t = propagate_types(b->right, c, perr, NULL, Rnolabel); // UNTESTED
3624                         if (t)  // UNTESTED
3625                                 t = propagate_types(b->left, c, perr, t, 0);    // UNTESTED
3626                 }
3627                 if (!type_compat(type, Tbool, 0))
3628                         type_err(c, "error: Comparison returns %1 but %2 expected", prog,
3629                                     Tbool, rules, type);
3630                 return Tbool;
3631
3632 ###### interp binode cases
3633         case Less:
3634         case LessEq:
3635         case Gtr:
3636         case GtrEq:
3637         case Eql:
3638         case NEql:
3639         {
3640                 int cmp;
3641                 left = interp_exec(c, b->left, &ltype);
3642                 right = interp_exec(c, b->right, &rtype);
3643                 cmp = value_cmp(ltype, rtype, &left, &right);
3644                 rvtype = Tbool;
3645                 switch (b->op) {
3646                 case Less:      rv.bool = cmp <  0; break;
3647                 case LessEq:    rv.bool = cmp <= 0; break;
3648                 case Gtr:       rv.bool = cmp >  0; break;
3649                 case GtrEq:     rv.bool = cmp >= 0; break;
3650                 case Eql:       rv.bool = cmp == 0; break;
3651                 case NEql:      rv.bool = cmp != 0; break;
3652                 default:        rv.bool = 0; break;     // NOTEST
3653                 }
3654                 break;
3655         }
3656
3657 ### Expressions: Arithmetic etc.
3658
3659 The remaining expressions with the highest precedence are arithmetic,
3660 string concatenation, string conversion, and testing.  String concatenation
3661 (`++`) has the same precedence as multiplication and division, but lower
3662 than the uniary.
3663
3664 Testing comes in two forms.  A single question mark (`?`) is a uniary
3665 operator which converts come types into Boolean.  The general meaning is
3666 "is this a value value" and there will be more uses as the language
3667 develops.  A double questionmark (`??`) is a binary operator (Choose),
3668 with same precedence as multiplication, which returns the LHS if it
3669 tests successfully, else returns the RHS.
3670
3671 String conversion is a temporary feature until I get a better type
3672 system.  `$` is a prefix operator which expects a string and returns
3673 a number.
3674
3675 `+` and `-` are both infix and prefix operations (where they are
3676 absolute value and negation).  These have different operator names.
3677
3678 We also have a 'Bracket' operator which records where parentheses were
3679 found.  This makes it easy to reproduce these when printing.  Possibly I
3680 should only insert brackets were needed for precedence.  Putting
3681 parentheses around an expression converts it into a Term,
3682
3683 ###### Binode types
3684         Plus, Minus,
3685         Times, Divide, Rem,
3686         Concat, Choose,
3687         Absolute, Negate, Test,
3688         StringConv,
3689         Bracket,
3690
3691 ###### declare terminals
3692         $LEFT + - Eop
3693         $LEFT * / % ++ ?? Top
3694         $LEFT Uop $ ?
3695         $TERM ( )
3696
3697 ###### expression grammar
3698         | Expression Eop Expression ${ {
3699                 struct binode *b = new(binode);
3700                 b->op = $2.op;
3701                 b->left = $<1;
3702                 b->right = $<3;
3703                 $0 = b;
3704         } }$
3705
3706         | Expression Top Expression ${ {
3707                 struct binode *b = new(binode);
3708                 b->op = $2.op;
3709                 b->left = $<1;
3710                 b->right = $<3;
3711                 $0 = b;
3712         } }$
3713
3714         | Uop Expression ${ {
3715                 struct binode *b = new(binode);
3716                 b->op = $1.op;
3717                 b->right = $<2;
3718                 $0 = b;
3719         } }$
3720
3721 ###### term grammar
3722
3723         | ( Expression ) ${ {
3724                 struct binode *b = new_pos(binode, $1);
3725                 b->op = Bracket;
3726                 b->right = $<2;
3727                 $0 = b;
3728         } }$
3729
3730 ###### Grammar
3731
3732         $eop
3733         Eop ->   + ${ $0.op = Plus; }$
3734         |        - ${ $0.op = Minus; }$
3735
3736         Uop ->   + ${ $0.op = Absolute; }$
3737         |        - ${ $0.op = Negate; }$
3738         |        $ ${ $0.op = StringConv; }$
3739         |        ? ${ $0.op = Test; }$
3740
3741         Top ->   * ${ $0.op = Times; }$
3742         |        / ${ $0.op = Divide; }$
3743         |        % ${ $0.op = Rem; }$
3744         |        ++ ${ $0.op = Concat; }$
3745         |        ?? ${ $0.op = Choose; }$
3746
3747 ###### print binode cases
3748         case Plus:
3749         case Minus:
3750         case Times:
3751         case Divide:
3752         case Concat:
3753         case Rem:
3754         case Choose:
3755                 if (bracket) printf("(");
3756                 print_exec(b->left, indent, bracket);
3757                 switch(b->op) {
3758                 case Plus:   fputs(" + ", stdout); break;
3759                 case Minus:  fputs(" - ", stdout); break;
3760                 case Times:  fputs(" * ", stdout); break;
3761                 case Divide: fputs(" / ", stdout); break;
3762                 case Rem:    fputs(" % ", stdout); break;
3763                 case Concat: fputs(" ++ ", stdout); break;
3764                 case Choose: fputs(" ?? ", stdout); break;
3765                 default: abort();       // NOTEST
3766                 }                       // NOTEST
3767                 print_exec(b->right, indent, bracket);
3768                 if (bracket) printf(")");
3769                 break;
3770         case Absolute:
3771         case Negate:
3772         case StringConv:
3773         case Test:
3774                 if (bracket) printf("(");
3775                 switch (b->op) {
3776                 case Absolute:   fputs("+", stdout); break;
3777                 case Negate:     fputs("-", stdout); break;
3778                 case StringConv: fputs("$", stdout); break;
3779                 case Test:       fputs("?", stdout); break;
3780                 default: abort();       // NOTEST
3781                 }                       // NOTEST
3782                 print_exec(b->right, indent, bracket);
3783                 if (bracket) printf(")");
3784                 break;
3785         case Bracket:
3786                 printf("(");
3787                 print_exec(b->right, indent, bracket);
3788                 printf(")");
3789                 break;
3790
3791 ###### propagate binode cases
3792         case Plus:
3793         case Minus:
3794         case Times:
3795         case Rem:
3796         case Divide:
3797                 /* both must be numbers, result is Tnum */
3798         case Absolute:
3799         case Negate:
3800                 /* as propagate_types ignores a NULL,
3801                  * unary ops fit here too */
3802                 propagate_types(b->left, c, perr, Tnum, 0);
3803                 propagate_types(b->right, c, perr, Tnum, 0);
3804                 if (!type_compat(type, Tnum, 0))
3805                         type_err(c, "error: Arithmetic returns %1 but %2 expected", prog,
3806                                    Tnum, rules, type);
3807                 return Tnum;
3808
3809         case Concat:
3810                 /* both must be Tstr, result is Tstr */
3811                 propagate_types(b->left, c, perr, Tstr, 0);
3812                 propagate_types(b->right, c, perr, Tstr, 0);
3813                 if (!type_compat(type, Tstr, 0))
3814                         type_err(c, "error: Concat returns %1 but %2 expected", prog,
3815                                    Tstr, rules, type);
3816                 return Tstr;
3817
3818         case StringConv:
3819                 /* op must be string, result is number */
3820                 propagate_types(b->left, c, perr, Tstr, 0);
3821                 if (!type_compat(type, Tnum, 0))
3822                         type_err(c,     // UNTESTED
3823                           "error: Can only convert string to number, not %1",
3824                                 prog, type, 0, NULL);
3825                 return Tnum;
3826
3827         case Test:
3828                 /* LHS must support ->test, result is Tbool */
3829                 t = propagate_types(b->right, c, perr, NULL, 0);
3830                 if (!t || !t->test)
3831                         type_err(c, "error: '?' requires a testable value, not %1",
3832                                  prog, t, 0, NULL);
3833                 return Tbool;
3834
3835         case Choose:
3836                 /* LHS and RHS must match and are returned. Must support
3837                  * ->test
3838                  */
3839                 t = propagate_types(b->left, c, perr, type, rules);
3840                 t = propagate_types(b->right, c, perr, t, rules);
3841                 if (t && t->test == NULL)
3842                         type_err(c, "error: \"??\" requires a testable value, not %1",
3843                                  prog, t, 0, NULL);
3844                 return t;
3845
3846         case Bracket:
3847                 return propagate_types(b->right, c, perr, type, 0);
3848
3849 ###### interp binode cases
3850
3851         case Plus:
3852                 rv = interp_exec(c, b->left, &rvtype);
3853                 right = interp_exec(c, b->right, &rtype);
3854                 mpq_add(rv.num, rv.num, right.num);
3855                 break;
3856         case Minus:
3857                 rv = interp_exec(c, b->left, &rvtype);
3858                 right = interp_exec(c, b->right, &rtype);
3859                 mpq_sub(rv.num, rv.num, right.num);
3860                 break;
3861         case Times:
3862                 rv = interp_exec(c, b->left, &rvtype);
3863                 right = interp_exec(c, b->right, &rtype);
3864                 mpq_mul(rv.num, rv.num, right.num);
3865                 break;
3866         case Divide:
3867                 rv = interp_exec(c, b->left, &rvtype);
3868                 right = interp_exec(c, b->right, &rtype);
3869                 mpq_div(rv.num, rv.num, right.num);
3870                 break;
3871         case Rem: {
3872                 mpz_t l, r, rem;
3873
3874                 left = interp_exec(c, b->left, &ltype);
3875                 right = interp_exec(c, b->right, &rtype);
3876                 mpz_init(l); mpz_init(r); mpz_init(rem);
3877                 mpz_tdiv_q(l, mpq_numref(left.num), mpq_denref(left.num));
3878                 mpz_tdiv_q(r, mpq_numref(right.num), mpq_denref(right.num));
3879                 mpz_tdiv_r(rem, l, r);
3880                 val_init(Tnum, &rv);
3881                 mpq_set_z(rv.num, rem);
3882                 mpz_clear(r); mpz_clear(l); mpz_clear(rem);
3883                 rvtype = ltype;
3884                 break;
3885         }
3886         case Negate:
3887                 rv = interp_exec(c, b->right, &rvtype);
3888                 mpq_neg(rv.num, rv.num);
3889                 break;
3890         case Absolute:
3891                 rv = interp_exec(c, b->right, &rvtype);
3892                 mpq_abs(rv.num, rv.num);
3893                 break;
3894         case Bracket:
3895                 rv = interp_exec(c, b->right, &rvtype);
3896                 break;
3897         case Concat:
3898                 left = interp_exec(c, b->left, &ltype);
3899                 right = interp_exec(c, b->right, &rtype);
3900                 rvtype = Tstr;
3901                 rv.str = text_join(left.str, right.str);
3902                 break;
3903         case StringConv:
3904                 right = interp_exec(c, b->right, &rvtype);
3905                 rtype = Tstr;
3906                 rvtype = Tnum;
3907
3908                 struct text tx = right.str;
3909                 char tail[3];
3910                 int neg = 0;
3911                 if (tx.txt[0] == '-') {
3912                         neg = 1;        // UNTESTED
3913                         tx.txt++;       // UNTESTED
3914                         tx.len--;       // UNTESTED
3915                 }
3916                 if (number_parse(rv.num, tail, tx) == 0)
3917                         mpq_init(rv.num);       // UNTESTED
3918                 else if (neg)
3919                         mpq_neg(rv.num, rv.num);        // UNTESTED
3920                 if (tail[0])
3921                         printf("Unsupported suffix: %.*s\n", tx.len, tx.txt);   // UNTESTED
3922
3923                 break;
3924         case Test:
3925                 right = interp_exec(c, b->right, &rtype);
3926                 rvtype = Tbool;
3927                 rv.bool = !!rtype->test(rtype, &right);
3928                 break;
3929         case Choose:
3930                 left = interp_exec(c, b->left, &ltype);
3931                 if (ltype->test(ltype, &left)) {
3932                         rv = left;
3933                         rvtype = ltype;
3934                         ltype = NULL;
3935                 } else
3936                         rv = interp_exec(c, b->right, &rvtype);
3937                 break;
3938
3939 ###### value functions
3940
3941         static struct text text_join(struct text a, struct text b)
3942         {
3943                 struct text rv;
3944                 rv.len = a.len + b.len;
3945                 rv.txt = malloc(rv.len);
3946                 memcpy(rv.txt, a.txt, a.len);
3947                 memcpy(rv.txt+a.len, b.txt, b.len);
3948                 return rv;
3949         }
3950
3951 ### Blocks, Statements, and Statement lists.
3952
3953 Now that we have expressions out of the way we need to turn to
3954 statements.  There are simple statements and more complex statements.
3955 Simple statements do not contain (syntactic) newlines, complex statements do.
3956
3957 Statements often come in sequences and we have corresponding simple
3958 statement lists and complex statement lists.
3959 The former comprise only simple statements separated by semicolons.
3960 The later comprise complex statements and simple statement lists.  They are
3961 separated by newlines.  Thus the semicolon is only used to separate
3962 simple statements on the one line.  This may be overly restrictive,
3963 but I'm not sure I ever want a complex statement to share a line with
3964 anything else.
3965
3966 Note that a simple statement list can still use multiple lines if
3967 subsequent lines are indented, so
3968
3969 ###### Example: wrapped simple statement list
3970
3971         a = b; c = d;
3972            e = f; print g
3973
3974 is a single simple statement list.  This might allow room for
3975 confusion, so I'm not set on it yet.
3976
3977 A simple statement list needs no extra syntax.  A complex statement
3978 list has two syntactic forms.  It can be enclosed in braces (much like
3979 C blocks), or it can be introduced by an indent and continue until an
3980 unindented newline (much like Python blocks).  With this extra syntax
3981 it is referred to as a block.
3982
3983 Note that a block does not have to include any newlines if it only
3984 contains simple statements.  So both of:
3985
3986         if condition: a=b; d=f
3987
3988         if condition { a=b; print f }
3989
3990 are valid.
3991
3992 In either case the list is constructed from a `binode` list with
3993 `Block` as the operator.  When parsing the list it is most convenient
3994 to append to the end, so a list is a list and a statement.  When using
3995 the list it is more convenient to consider a list to be a statement
3996 and a list.  So we need a function to re-order a list.
3997 `reorder_bilist` serves this purpose.
3998
3999 The only stand-alone statement we introduce at this stage is `pass`
4000 which does nothing and is represented as a `NULL` pointer in a `Block`
4001 list.  Other stand-alone statements will follow once the infrastructure
4002 is in-place.
4003
4004 As many statements will use binodes, we declare a binode pointer 'b' in
4005 the common header for all reductions to use.
4006
4007 ###### Parser: reduce
4008         struct binode *b;
4009
4010 ###### Binode types
4011         Block,
4012
4013 ###### Grammar
4014
4015         $TERM { } ;
4016
4017         $*binode
4018         Block -> { IN OptNL Statementlist OUT OptNL } ${ $0 = $<Sl; }$
4019         |        { SimpleStatements } ${ $0 = reorder_bilist($<SS); }$
4020         |        SimpleStatements ; ${ $0 = reorder_bilist($<SS); }$
4021         |        SimpleStatements EOL ${ $0 = reorder_bilist($<SS); }$
4022         |        IN OptNL Statementlist OUT ${ $0 = $<Sl; }$
4023
4024         OpenBlock -> OpenScope { IN OptNL Statementlist OUT OptNL } ${ $0 = $<Sl; }$
4025         |        OpenScope { SimpleStatements } ${ $0 = reorder_bilist($<SS); }$
4026         |        OpenScope SimpleStatements ; ${ $0 = reorder_bilist($<SS); }$
4027         |        OpenScope SimpleStatements EOL ${ $0 = reorder_bilist($<SS); }$
4028         |        IN OpenScope OptNL Statementlist OUT ${ $0 = $<Sl; }$
4029
4030         UseBlock -> { OpenScope IN OptNL Statementlist OUT OptNL } ${ $0 = $<Sl; }$
4031         |        { OpenScope SimpleStatements } ${ $0 = reorder_bilist($<SS); }$
4032         |        IN OpenScope OptNL Statementlist OUT ${ $0 = $<Sl; }$
4033
4034         ColonBlock -> { IN OptNL Statementlist OUT OptNL } ${ $0 = $<Sl; }$
4035         |        { SimpleStatements } ${ $0 = reorder_bilist($<SS); }$
4036         |        : SimpleStatements ; ${ $0 = reorder_bilist($<SS); }$
4037         |        : SimpleStatements EOL ${ $0 = reorder_bilist($<SS); }$
4038         |        : IN OptNL Statementlist OUT ${ $0 = $<Sl; }$
4039
4040         Statementlist -> ComplexStatements ${ $0 = reorder_bilist($<CS); }$
4041
4042         ComplexStatements -> ComplexStatements ComplexStatement ${
4043                 if ($2 == NULL) {
4044                         $0 = $<1;
4045                 } else {
4046                         $0 = new(binode);
4047                         $0->op = Block;
4048                         $0->left = $<1;
4049                         $0->right = $<2;
4050                 }
4051         }$
4052         | ComplexStatement ${
4053                 if ($1 == NULL) {
4054                         $0 = NULL;
4055                 } else {
4056                         $0 = new(binode);
4057                         $0->op = Block;
4058                         $0->left = NULL;
4059                         $0->right = $<1;
4060                 }
4061         }$
4062
4063         $*exec
4064         ComplexStatement -> SimpleStatements Newlines ${
4065                 $0 = reorder_bilist($<SS);
4066         }$
4067         |  SimpleStatements ; Newlines ${
4068                 $0 = reorder_bilist($<SS);
4069         }$
4070         ## ComplexStatement Grammar
4071
4072         $*binode
4073         SimpleStatements -> SimpleStatements ; SimpleStatement ${
4074                 $0 = new(binode);
4075                 $0->op = Block;
4076                 $0->left = $<1;
4077                 $0->right = $<3;
4078         }$
4079         | SimpleStatement ${
4080                 $0 = new(binode);
4081                 $0->op = Block;
4082                 $0->left = NULL;
4083                 $0->right = $<1;
4084         }$
4085
4086         $TERM pass
4087         $*exec
4088         SimpleStatement -> pass ${ $0 = NULL; }$
4089         | ERROR ${ tok_err(c, "Syntax error in statement", &$1); }$
4090         ## SimpleStatement Grammar
4091
4092 ###### print binode cases
4093         case Block:
4094                 if (indent < 0) {
4095                         // simple statement
4096                         if (b->left == NULL)    // UNTESTED
4097                                 printf("pass"); // UNTESTED
4098                         else
4099                                 print_exec(b->left, indent, bracket);   // UNTESTED
4100                         if (b->right) { // UNTESTED
4101                                 printf("; ");   // UNTESTED
4102                                 print_exec(b->right, indent, bracket);  // UNTESTED
4103                         }
4104                 } else {
4105                         // block, one per line
4106                         if (b->left == NULL)
4107                                 do_indent(indent, "pass\n");
4108                         else
4109                                 print_exec(b->left, indent, bracket);
4110                         if (b->right)
4111                                 print_exec(b->right, indent, bracket);
4112                 }
4113                 break;
4114
4115 ###### propagate binode cases
4116         case Block:
4117         {
4118                 /* If any statement returns something other than Tnone
4119                  * or Tbool then all such must return same type.
4120                  * As each statement may be Tnone or something else,
4121                  * we must always pass NULL (unknown) down, otherwise an incorrect
4122                  * error might occur.  We never return Tnone unless it is
4123                  * passed in.
4124                  */
4125                 struct binode *e;
4126
4127                 for (e = b; e; e = cast(binode, e->right)) {
4128                         t = propagate_types(e->left, c, perr, NULL, rules);
4129                         if ((rules & Rboolok) && (t == Tbool || t == Tnone))
4130                                 t = NULL;
4131                         if (t == Tnone && e->right)
4132                                 /* Only the final statement *must* return a value
4133                                  * when not Rboolok
4134                                  */
4135                                 t = NULL;
4136                         if (t) {
4137                                 if (!type)
4138                                         type = t;
4139                                 else if (t != type)
4140                                         type_err(c, "error: expected %1%r, found %2",
4141                                                  e->left, type, rules, t);
4142                         }
4143                 }
4144                 return type;
4145         }
4146
4147 ###### interp binode cases
4148         case Block:
4149                 while (rvtype == Tnone &&
4150                        b) {
4151                         if (b->left)
4152                                 rv = interp_exec(c, b->left, &rvtype);
4153                         b = cast(binode, b->right);
4154                 }
4155                 break;
4156
4157 ### The Print statement
4158
4159 `print` is a simple statement that takes a comma-separated list of
4160 expressions and prints the values separated by spaces and terminated
4161 by a newline.  No control of formatting is possible.
4162
4163 `print` uses `ExpressionList` to collect the expressions and stores them
4164 on the left side of a `Print` binode unlessthere is a trailing comma
4165 when the list is stored on the `right` side and no trailing newline is
4166 printed.
4167
4168 ###### Binode types
4169         Print,
4170
4171 ##### declare terminals
4172         $TERM print
4173
4174 ###### SimpleStatement Grammar
4175
4176         | print ExpressionList ${
4177                 $0 = b = new(binode);
4178                 b->op = Print;
4179                 b->right = NULL;
4180                 b->left = reorder_bilist($<EL);
4181         }$
4182         | print ExpressionList , ${ {
4183                 $0 = b = new(binode);
4184                 b->op = Print;
4185                 b->right = reorder_bilist($<EL);
4186                 b->left = NULL;
4187         } }$
4188         | print ${
4189                 $0 = b = new(binode);
4190                 b->op = Print;
4191                 b->left = NULL;
4192                 b->right = NULL;
4193         }$
4194
4195 ###### print binode cases
4196
4197         case Print:
4198                 do_indent(indent, "print");
4199                 if (b->right) {
4200                         print_exec(b->right, -1, bracket);
4201                         printf(",");
4202                 } else
4203                         print_exec(b->left, -1, bracket);
4204                 if (indent >= 0)
4205                         printf("\n");
4206                 break;
4207
4208 ###### propagate binode cases
4209
4210         case Print:
4211                 /* don't care but all must be consistent */
4212                 if (b->left)
4213                         b = cast(binode, b->left);
4214                 else
4215                         b = cast(binode, b->right);
4216                 while (b) {
4217                         propagate_types(b->left, c, perr, NULL, Rnolabel);
4218                         b = cast(binode, b->right);
4219                 }
4220                 break;
4221
4222 ###### interp binode cases
4223
4224         case Print:
4225         {
4226                 struct binode *b2 = cast(binode, b->left);
4227                 if (!b2)
4228                         b2 = cast(binode, b->right);
4229                 for (; b2; b2 = cast(binode, b2->right)) {
4230                         left = interp_exec(c, b2->left, &ltype);
4231                         print_value(ltype, &left, stdout);
4232                         free_value(ltype, &left);
4233                         if (b2->right)
4234                                 putchar(' ');
4235                 }
4236                 if (b->right == NULL)
4237                         printf("\n");
4238                 ltype = Tnone;
4239                 break;
4240         }
4241
4242 ###### Assignment statement
4243
4244 An assignment will assign a value to a variable, providing it hasn't
4245 been declared as a constant.  The analysis phase ensures that the type
4246 will be correct so the interpreter just needs to perform the
4247 calculation.  There is a form of assignment which declares a new
4248 variable as well as assigning a value.  If a name is assigned before
4249 it is declared, and error will be raised as the name is created as
4250 `Tlabel` and it is illegal to assign to such names.
4251
4252 ###### Binode types
4253         Assign,
4254         Declare,
4255
4256 ###### declare terminals
4257         $TERM =
4258
4259 ###### SimpleStatement Grammar
4260         | Term = Expression ${
4261                 $0 = b= new(binode);
4262                 b->op = Assign;
4263                 b->left = $<1;
4264                 b->right = $<3;
4265         }$
4266         | VariableDecl = Expression ${
4267                 $0 = b= new(binode);
4268                 b->op = Declare;
4269                 b->left = $<1;
4270                 b->right =$<3;
4271         }$
4272
4273         | VariableDecl ${
4274                 if ($1->var->where_set == NULL) {
4275                         type_err(c,
4276                                  "Variable declared with no type or value: %v",
4277                                  $1, NULL, 0, NULL);
4278                         free_var($1);
4279                 } else {
4280                         $0 = b = new(binode);
4281                         b->op = Declare;
4282                         b->left = $<1;
4283                         b->right = NULL;
4284                 }
4285         }$
4286
4287 ###### print binode cases
4288
4289         case Assign:
4290                 do_indent(indent, "");
4291                 print_exec(b->left, indent, bracket);
4292                 printf(" = ");
4293                 print_exec(b->right, indent, bracket);
4294                 if (indent >= 0)
4295                         printf("\n");
4296                 break;
4297
4298         case Declare:
4299                 {
4300                 struct variable *v = cast(var, b->left)->var;
4301                 do_indent(indent, "");
4302                 print_exec(b->left, indent, bracket);
4303                 if (cast(var, b->left)->var->constant) {
4304                         printf("::");
4305                         if (v->explicit_type) {
4306                                 type_print(v->type, stdout);
4307                                 printf(" ");
4308                         }
4309                 } else {
4310                         printf(":");
4311                         if (v->explicit_type) {
4312                                 type_print(v->type, stdout);
4313                                 printf(" ");
4314                         }
4315                 }
4316                 if (b->right) {
4317                         printf("= ");
4318                         print_exec(b->right, indent, bracket);
4319                 }
4320                 if (indent >= 0)
4321                         printf("\n");
4322                 }
4323                 break;
4324
4325 ###### propagate binode cases
4326
4327         case Assign:
4328         case Declare:
4329                 /* Both must match and not be labels,
4330                  * Type must support 'dup',
4331                  * For Assign, left must not be constant.
4332                  * result is Tnone
4333                  */
4334                 t = propagate_types(b->left, c, perr, NULL,
4335                                     Rnolabel | (b->op == Assign ? Rnoconstant : 0));
4336                 if (!b->right)
4337                         return Tnone;
4338
4339                 if (t) {
4340                         if (propagate_types(b->right, c, perr, t, 0) != t)
4341                                 if (b->left->type == Xvar)
4342                                         type_err(c, "info: variable '%v' was set as %1 here.",
4343                                                  cast(var, b->left)->var->where_set, t, rules, NULL);
4344                 } else {
4345                         t = propagate_types(b->right, c, perr, NULL, Rnolabel);
4346                         if (t)
4347                                 propagate_types(b->left, c, perr, t,
4348                                                 (b->op == Assign ? Rnoconstant : 0));
4349                 }
4350                 if (t && t->dup == NULL && !(*perr & Emaycopy))
4351                         type_err(c, "error: cannot assign value of type %1", b, t, 0, NULL);
4352                 return Tnone;
4353
4354                 break;
4355
4356 ###### interp binode cases
4357
4358         case Assign:
4359                 lleft = linterp_exec(c, b->left, &ltype);
4360                 if (lleft)
4361                         dinterp_exec(c, b->right, lleft, ltype, 1);
4362                 ltype = Tnone;
4363                 break;
4364
4365         case Declare:
4366         {
4367                 struct variable *v = cast(var, b->left)->var;
4368                 struct value *val;
4369                 v = v->merged;
4370                 val = var_value(c, v);
4371                 if (v->type->prepare_type)
4372                         v->type->prepare_type(c, v->type, 0);
4373                 if (b->right)
4374                         dinterp_exec(c, b->right, val, v->type, 0);
4375                 else
4376                         val_init(v->type, val);
4377                 break;
4378         }
4379
4380 ### The `use` statement
4381
4382 The `use` statement is the last "simple" statement.  It is needed when a
4383 statement block can return a value.  This includes the body of a
4384 function which has a return type, and the "condition" code blocks in
4385 `if`, `while`, and `switch` statements.
4386
4387 ###### Binode types
4388         Use,
4389
4390 ###### declare terminals
4391         $TERM use
4392
4393 ###### SimpleStatement Grammar
4394         | use Expression ${
4395                 $0 = b = new_pos(binode, $1);
4396                 b->op = Use;
4397                 b->right = $<2;
4398                 if (b->right->type == Xvar) {
4399                         struct var *v = cast(var, b->right);
4400                         if (v->var->type == Tnone) {
4401                                 /* Convert this to a label */
4402                                 struct value *val;
4403
4404                                 v->var->type = Tlabel;
4405                                 val = global_alloc(c, Tlabel, v->var, NULL);
4406                                 val->label = val;
4407                         }
4408                 }
4409         }$
4410
4411 ###### print binode cases
4412
4413         case Use:
4414                 do_indent(indent, "use ");
4415                 print_exec(b->right, -1, bracket);
4416                 if (indent >= 0)
4417                         printf("\n");
4418                 break;
4419
4420 ###### propagate binode cases
4421
4422         case Use:
4423                 /* result matches value */
4424                 return propagate_types(b->right, c, perr, type, 0);
4425
4426 ###### interp binode cases
4427
4428         case Use:
4429                 rv = interp_exec(c, b->right, &rvtype);
4430                 break;
4431
4432 ### The Conditional Statement
4433
4434 This is the biggy and currently the only complex statement.  This
4435 subsumes `if`, `while`, `do/while`, `switch`, and some parts of `for`.
4436 It is comprised of a number of parts, all of which are optional though
4437 set combinations apply.  Each part is (usually) a key word (`then` is
4438 sometimes optional) followed by either an expression or a code block,
4439 except the `casepart` which is a "key word and an expression" followed
4440 by a code block.  The code-block option is valid for all parts and,
4441 where an expression is also allowed, the code block can use the `use`
4442 statement to report a value.  If the code block does not report a value
4443 the effect is similar to reporting `True`.
4444
4445 The `else` and `case` parts, as well as `then` when combined with
4446 `if`, can contain a `use` statement which will apply to some
4447 containing conditional statement. `for` parts, `do` parts and `then`
4448 parts used with `for` can never contain a `use`, except in some
4449 subordinate conditional statement.
4450
4451 If there is a `forpart`, it is executed first, only once.
4452 If there is a `dopart`, then it is executed repeatedly providing
4453 always that the `condpart` or `cond`, if present, does not return a non-True
4454 value.  `condpart` can fail to return any value if it simply executes
4455 to completion.  This is treated the same as returning `True`.
4456
4457 If there is a `thenpart` it will be executed whenever the `condpart`
4458 or `cond` returns True (or does not return any value), but this will happen
4459 *after* `dopart` (when present).
4460
4461 If `elsepart` is present it will be executed at most once when the
4462 condition returns `False` or some value that isn't `True` and isn't
4463 matched by any `casepart`.  If there are any `casepart`s, they will be
4464 executed when the condition returns a matching value.
4465
4466 The particular sorts of values allowed in case parts has not yet been
4467 determined in the language design, so nothing is prohibited.
4468
4469 The various blocks in this complex statement potentially provide scope
4470 for variables as described earlier.  Each such block must include the
4471 "OpenScope" nonterminal before parsing the block, and must call
4472 `var_block_close()` when closing the block.
4473
4474 The code following "`if`", "`switch`" and "`for`" does not get its own
4475 scope, but is in a scope covering the whole statement, so names
4476 declared there cannot be redeclared elsewhere.  Similarly the
4477 condition following "`while`" is in a scope the covers the body
4478 ("`do`" part) of the loop, and which does not allow conditional scope
4479 extension.  Code following "`then`" (both looping and non-looping),
4480 "`else`" and "`case`" each get their own local scope.
4481
4482 The type requirements on the code block in a `whilepart` are quite
4483 unusal.  It is allowed to return a value of some identifiable type, in
4484 which case the loop aborts and an appropriate `casepart` is run, or it
4485 can return a Boolean, in which case the loop either continues to the
4486 `dopart` (on `True`) or aborts and runs the `elsepart` (on `False`).
4487 This is different both from the `ifpart` code block which is expected to
4488 return a Boolean, or the `switchpart` code block which is expected to
4489 return the same type as the casepart values.  The correct analysis of
4490 the type of the `whilepart` code block is the reason for the
4491 `Rboolok` flag which is passed to `propagate_types()`.
4492
4493 The `cond_statement` cannot fit into a `binode` so a new `exec` is
4494 defined.  As there are two scopes which cover multiple parts - one for
4495 the whole statement and one for "while" and "do" - and as we will use
4496 the 'struct exec' to track scopes, we actually need two new types of
4497 exec.  One is a `binode` for the looping part, the rest is the
4498 `cond_statement`.  The `cond_statement` will use an auxilliary `struct
4499 casepart` to track a list of case parts.
4500
4501 ###### Binode types
4502         Loop
4503
4504 ###### exec type
4505         Xcond_statement,
4506
4507 ###### ast
4508         struct casepart {
4509                 struct exec *value;
4510                 struct exec *action;
4511                 struct casepart *next;
4512         };
4513         struct cond_statement {
4514                 struct exec;
4515                 struct exec *forpart, *condpart, *thenpart, *elsepart;
4516                 struct binode *looppart;
4517                 struct casepart *casepart;
4518         };
4519
4520 ###### ast functions
4521
4522         static void free_casepart(struct casepart *cp)
4523         {
4524                 while (cp) {
4525                         struct casepart *t;
4526                         free_exec(cp->value);
4527                         free_exec(cp->action);
4528                         t = cp->next;
4529                         free(cp);
4530                         cp = t;
4531                 }
4532         }
4533
4534         static void free_cond_statement(struct cond_statement *s)
4535         {
4536                 if (!s)
4537                         return;
4538                 free_exec(s->forpart);
4539                 free_exec(s->condpart);
4540                 free_exec(s->looppart);
4541                 free_exec(s->thenpart);
4542                 free_exec(s->elsepart);
4543                 free_casepart(s->casepart);
4544                 free(s);
4545         }
4546
4547 ###### free exec cases
4548         case Xcond_statement: free_cond_statement(cast(cond_statement, e)); break;
4549
4550 ###### ComplexStatement Grammar
4551         | CondStatement ${ $0 = $<1; }$
4552
4553 ###### declare terminals
4554         $TERM for then while do
4555         $TERM else
4556         $TERM switch case
4557
4558 ###### Grammar
4559
4560         $*cond_statement
4561         // A CondStatement must end with EOL, as does CondSuffix and
4562         // IfSuffix.
4563         // ForPart, ThenPart, SwitchPart, CasePart are non-empty and
4564         // may or may not end with EOL
4565         // WhilePart and IfPart include an appropriate Suffix
4566
4567         // ForPart, SwitchPart, and IfPart open scopes, o we have to close
4568         // them.  WhilePart opens and closes its own scope.
4569         CondStatement -> ForPart OptNL ThenPart OptNL WhilePart CondSuffix ${
4570                 $0 = $<CS;
4571                 $0->forpart = $<FP;
4572                 $0->thenpart = $<TP;
4573                 $0->looppart = $<WP;
4574                 var_block_close(c, CloseSequential, $0);
4575         }$
4576         | ForPart OptNL WhilePart CondSuffix ${
4577                 $0 = $<CS;
4578                 $0->forpart = $<FP;
4579                 $0->looppart = $<WP;
4580                 var_block_close(c, CloseSequential, $0);
4581         }$
4582         | WhilePart CondSuffix ${
4583                 $0 = $<CS;
4584                 $0->looppart = $<WP;
4585         }$
4586         | SwitchPart OptNL CasePart CondSuffix ${
4587                 $0 = $<CS;
4588                 $0->condpart = $<SP;
4589                 $CP->next = $0->casepart;
4590                 $0->casepart = $<CP;
4591                 var_block_close(c, CloseSequential, $0);
4592         }$
4593         | SwitchPart : IN OptNL CasePart CondSuffix OUT Newlines ${
4594                 $0 = $<CS;
4595                 $0->condpart = $<SP;
4596                 $CP->next = $0->casepart;
4597                 $0->casepart = $<CP;
4598                 var_block_close(c, CloseSequential, $0);
4599         }$
4600         | IfPart IfSuffix ${
4601                 $0 = $<IS;
4602                 $0->condpart = $IP.condpart; $IP.condpart = NULL;
4603                 $0->thenpart = $IP.thenpart; $IP.thenpart = NULL;
4604                 // This is where we close an "if" statement
4605                 var_block_close(c, CloseSequential, $0);
4606         }$
4607
4608         CondSuffix -> IfSuffix ${
4609                 $0 = $<1;
4610         }$
4611         | Newlines CasePart CondSuffix ${
4612                 $0 = $<CS;
4613                 $CP->next = $0->casepart;
4614                 $0->casepart = $<CP;
4615         }$
4616         | CasePart CondSuffix ${
4617                 $0 = $<CS;
4618                 $CP->next = $0->casepart;
4619                 $0->casepart = $<CP;
4620         }$
4621
4622         IfSuffix -> Newlines ${ $0 = new(cond_statement); }$
4623         | Newlines ElsePart ${ $0 = $<EP; }$
4624         | ElsePart ${$0 = $<EP; }$
4625
4626         ElsePart -> else OpenBlock Newlines ${
4627                 $0 = new(cond_statement);
4628                 $0->elsepart = $<OB;
4629                 var_block_close(c, CloseElse, $0->elsepart);
4630         }$
4631         | else OpenScope CondStatement ${
4632                 $0 = new(cond_statement);
4633                 $0->elsepart = $<CS;
4634                 var_block_close(c, CloseElse, $0->elsepart);
4635         }$
4636
4637         $*casepart
4638         CasePart -> case Expression OpenScope ColonBlock ${
4639                 $0 = calloc(1,sizeof(struct casepart));
4640                 $0->value = $<Ex;
4641                 $0->action = $<Bl;
4642                 var_block_close(c, CloseParallel, $0->action);
4643         }$
4644
4645         $*exec
4646         // These scopes are closed in CondStatement
4647         ForPart -> for OpenBlock ${
4648                 $0 = $<Bl;
4649         }$
4650
4651         ThenPart -> then OpenBlock ${
4652                 $0 = $<OB;
4653                 var_block_close(c, CloseSequential, $0);
4654         }$
4655
4656         $*binode
4657         // This scope is closed in CondStatement
4658         WhilePart -> while UseBlock OptNL do OpenBlock ${
4659                 $0 = new(binode);
4660                 $0->op = Loop;
4661                 $0->left = $<UB;
4662                 $0->right = $<OB;
4663                 var_block_close(c, CloseSequential, $0->right);
4664                 var_block_close(c, CloseSequential, $0);
4665         }$
4666         | while OpenScope Expression OpenScope ColonBlock ${
4667                 $0 = new(binode);
4668                 $0->op = Loop;
4669                 $0->left = $<Exp;
4670                 $0->right = $<CB;
4671                 var_block_close(c, CloseSequential, $0->right);
4672                 var_block_close(c, CloseSequential, $0);
4673         }$
4674
4675         $cond_statement
4676         IfPart -> if UseBlock OptNL then OpenBlock ${
4677                 $0.condpart = $<UB;
4678                 $0.thenpart = $<OB;
4679                 var_block_close(c, CloseParallel, $0.thenpart);
4680         }$
4681         | if OpenScope Expression OpenScope ColonBlock ${
4682                 $0.condpart = $<Ex;
4683                 $0.thenpart = $<CB;
4684                 var_block_close(c, CloseParallel, $0.thenpart);
4685         }$
4686         | if OpenScope Expression OpenScope OptNL then Block ${
4687                 $0.condpart = $<Ex;
4688                 $0.thenpart = $<Bl;
4689                 var_block_close(c, CloseParallel, $0.thenpart);
4690         }$
4691
4692         $*exec
4693         // This scope is closed in CondStatement
4694         SwitchPart -> switch OpenScope Expression ${
4695                 $0 = $<Ex;
4696         }$
4697         | switch UseBlock ${
4698                 $0 = $<Bl;
4699         }$
4700
4701 ###### print binode cases
4702         case Loop:
4703                 if (b->left && b->left->type == Xbinode &&
4704                     cast(binode, b->left)->op == Block) {
4705                         if (bracket)
4706                                 do_indent(indent, "while {\n");
4707                         else
4708                                 do_indent(indent, "while\n");
4709                         print_exec(b->left, indent+1, bracket);
4710                         if (bracket)
4711                                 do_indent(indent, "} do {\n");
4712                         else
4713                                 do_indent(indent, "do\n");
4714                         print_exec(b->right, indent+1, bracket);
4715                         if (bracket)
4716                                 do_indent(indent, "}\n");
4717                 } else {
4718                         do_indent(indent, "while ");
4719                         print_exec(b->left, 0, bracket);
4720                         if (bracket)
4721                                 printf(" {\n");
4722                         else
4723                                 printf(":\n");
4724                         print_exec(b->right, indent+1, bracket);
4725                         if (bracket)
4726                                 do_indent(indent, "}\n");
4727                 }
4728                 break;
4729
4730 ###### print exec cases
4731
4732         case Xcond_statement:
4733         {
4734                 struct cond_statement *cs = cast(cond_statement, e);
4735                 struct casepart *cp;
4736                 if (cs->forpart) {
4737                         do_indent(indent, "for");
4738                         if (bracket) printf(" {\n"); else printf("\n");
4739                         print_exec(cs->forpart, indent+1, bracket);
4740                         if (cs->thenpart) {
4741                                 if (bracket)
4742                                         do_indent(indent, "} then {\n");
4743                                 else
4744                                         do_indent(indent, "then\n");
4745                                 print_exec(cs->thenpart, indent+1, bracket);
4746                         }
4747                         if (bracket) do_indent(indent, "}\n");
4748                 }
4749                 if (cs->looppart) {
4750                         print_exec(cs->looppart, indent, bracket);
4751                 } else {
4752                         // a condition
4753                         if (cs->casepart)
4754                                 do_indent(indent, "switch");
4755                         else
4756                                 do_indent(indent, "if");
4757                         if (cs->condpart && cs->condpart->type == Xbinode &&
4758                             cast(binode, cs->condpart)->op == Block) {
4759                                 if (bracket)
4760                                         printf(" {\n");
4761                                 else
4762                                         printf("\n");
4763                                 print_exec(cs->condpart, indent+1, bracket);
4764                                 if (bracket)
4765                                         do_indent(indent, "}\n");
4766                                 if (cs->thenpart) {
4767                                         do_indent(indent, "then\n");
4768                                         print_exec(cs->thenpart, indent+1, bracket);
4769                                 }
4770                         } else {
4771                                 printf(" ");
4772                                 print_exec(cs->condpart, 0, bracket);
4773                                 if (cs->thenpart) {
4774                                         if (bracket)
4775                                                 printf(" {\n");
4776                                         else
4777                                                 printf(":\n");
4778                                         print_exec(cs->thenpart, indent+1, bracket);
4779                                         if (bracket)
4780                                                 do_indent(indent, "}\n");
4781                                 } else
4782                                         printf("\n");
4783                         }
4784                 }
4785                 for (cp = cs->casepart; cp; cp = cp->next) {
4786                         do_indent(indent, "case ");
4787                         print_exec(cp->value, -1, 0);
4788                         if (bracket)
4789                                 printf(" {\n");
4790                         else
4791                                 printf(":\n");
4792                         print_exec(cp->action, indent+1, bracket);
4793                         if (bracket)
4794                                 do_indent(indent, "}\n");
4795                 }
4796                 if (cs->elsepart) {
4797                         do_indent(indent, "else");
4798                         if (bracket)
4799                                 printf(" {\n");
4800                         else
4801                                 printf("\n");
4802                         print_exec(cs->elsepart, indent+1, bracket);
4803                         if (bracket)
4804                                 do_indent(indent, "}\n");
4805                 }
4806                 break;
4807         }
4808
4809 ###### propagate binode cases
4810         case Loop:
4811                 t = propagate_types(b->right, c, perr, Tnone, 0);
4812                 if (!type_compat(Tnone, t, 0))
4813                         *perr |= Efail; // UNTESTED
4814                 return propagate_types(b->left, c, perr, type, rules);
4815
4816 ###### propagate exec cases
4817         case Xcond_statement:
4818         {
4819                 // forpart and looppart->right must return Tnone
4820                 // thenpart must return Tnone if there is a loopart,
4821                 // otherwise it is like elsepart.
4822                 // condpart must:
4823                 //    be bool if there is no casepart
4824                 //    match casepart->values if there is a switchpart
4825                 //    either be bool or match casepart->value if there
4826                 //             is a whilepart
4827                 // elsepart and casepart->action must match the return type
4828                 //   expected of this statement.
4829                 struct cond_statement *cs = cast(cond_statement, prog);
4830                 struct casepart *cp;
4831
4832                 t = propagate_types(cs->forpart, c, perr, Tnone, 0);
4833                 if (!type_compat(Tnone, t, 0))
4834                         *perr |= Efail; // UNTESTED
4835
4836                 if (cs->looppart) {
4837                         t = propagate_types(cs->thenpart, c, perr, Tnone, 0);
4838                         if (!type_compat(Tnone, t, 0))
4839                                 *perr |= Efail; // UNTESTED
4840                 }
4841                 if (cs->casepart == NULL) {
4842                         propagate_types(cs->condpart, c, perr, Tbool, 0);
4843                         propagate_types(cs->looppart, c, perr, Tbool, 0);
4844                 } else {
4845                         /* Condpart must match case values, with bool permitted */
4846                         t = NULL;
4847                         for (cp = cs->casepart;
4848                              cp && !t; cp = cp->next)
4849                                 t = propagate_types(cp->value, c, perr, NULL, 0);
4850                         if (!t && cs->condpart)
4851                                 t = propagate_types(cs->condpart, c, perr, NULL, Rboolok);      // UNTESTED
4852                         if (!t && cs->looppart)
4853                                 t = propagate_types(cs->looppart, c, perr, NULL, Rboolok);      // UNTESTED
4854                         // Now we have a type (I hope) push it down
4855                         if (t) {
4856                                 for (cp = cs->casepart; cp; cp = cp->next)
4857                                         propagate_types(cp->value, c, perr, t, 0);
4858                                 propagate_types(cs->condpart, c, perr, t, Rboolok);
4859                                 propagate_types(cs->looppart, c, perr, t, Rboolok);
4860                         }
4861                 }
4862                 // (if)then, else, and case parts must return expected type.
4863                 if (!cs->looppart && !type)
4864                         type = propagate_types(cs->thenpart, c, perr, NULL, rules);
4865                 if (!type)
4866                         type = propagate_types(cs->elsepart, c, perr, NULL, rules);
4867                 for (cp = cs->casepart;
4868                      cp && !type;
4869                      cp = cp->next)     // UNTESTED
4870                         type = propagate_types(cp->action, c, perr, NULL, rules);       // UNTESTED
4871                 if (type) {
4872                         if (!cs->looppart)
4873                                 propagate_types(cs->thenpart, c, perr, type, rules);
4874                         propagate_types(cs->elsepart, c, perr, type, rules);
4875                         for (cp = cs->casepart; cp ; cp = cp->next)
4876                                 propagate_types(cp->action, c, perr, type, rules);
4877                         return type;
4878                 } else
4879                         return NULL;
4880         }
4881
4882 ###### interp binode cases
4883         case Loop:
4884                 // This just performs one iterration of the loop
4885                 rv = interp_exec(c, b->left, &rvtype);
4886                 if (rvtype == Tnone ||
4887                     (rvtype == Tbool && rv.bool != 0))
4888                         // rvtype is Tnone or Tbool, doesn't need to be freed
4889                         interp_exec(c, b->right, NULL);
4890                 break;
4891
4892 ###### interp exec cases
4893         case Xcond_statement:
4894         {
4895                 struct value v, cnd;
4896                 struct type *vtype, *cndtype;
4897                 struct casepart *cp;
4898                 struct cond_statement *cs = cast(cond_statement, e);
4899
4900                 if (cs->forpart)
4901                         interp_exec(c, cs->forpart, NULL);
4902                 if (cs->looppart) {
4903                         while ((cnd = interp_exec(c, cs->looppart, &cndtype)),
4904                                cndtype == Tnone || (cndtype == Tbool && cnd.bool != 0))
4905                                 interp_exec(c, cs->thenpart, NULL);
4906                 } else {
4907                         cnd = interp_exec(c, cs->condpart, &cndtype);
4908                         if ((cndtype == Tnone ||
4909                             (cndtype == Tbool && cnd.bool != 0))) {
4910                                 // cnd is Tnone or Tbool, doesn't need to be freed
4911                                 rv = interp_exec(c, cs->thenpart, &rvtype);
4912                                 // skip else (and cases)
4913                                 goto Xcond_done;
4914                         }
4915                 }
4916                 for (cp = cs->casepart; cp; cp = cp->next) {
4917                         v = interp_exec(c, cp->value, &vtype);
4918                         if (value_cmp(cndtype, vtype, &v, &cnd) == 0) {
4919                                 free_value(vtype, &v);
4920                                 free_value(cndtype, &cnd);
4921                                 rv = interp_exec(c, cp->action, &rvtype);
4922                                 goto Xcond_done;
4923                         }
4924                         free_value(vtype, &v);
4925                 }
4926                 free_value(cndtype, &cnd);
4927                 if (cs->elsepart)
4928                         rv = interp_exec(c, cs->elsepart, &rvtype);
4929                 else
4930                         rvtype = Tnone;
4931         Xcond_done:
4932                 break;
4933         }
4934
4935 ### Top level structure
4936
4937 All the language elements so far can be used in various places.  Now
4938 it is time to clarify what those places are.
4939
4940 At the top level of a file there will be a number of declarations.
4941 Many of the things that can be declared haven't been described yet,
4942 such as functions, procedures, imports, and probably more.
4943 For now there are two sorts of things that can appear at the top
4944 level.  They are predefined constants, `struct` types, and the `main`
4945 function.  While the syntax will allow the `main` function to appear
4946 multiple times, that will trigger an error if it is actually attempted.
4947
4948 The various declarations do not return anything.  They store the
4949 various declarations in the parse context.
4950
4951 ###### Parser: grammar
4952
4953         $void
4954         Ocean -> OptNL DeclarationList
4955
4956         ## declare terminals
4957
4958         OptNL ->
4959         | OptNL NEWLINE
4960
4961         Newlines -> NEWLINE
4962         | Newlines NEWLINE
4963
4964         DeclarationList -> Declaration
4965         | DeclarationList Declaration
4966
4967         Declaration -> ERROR Newlines ${
4968                 tok_err(c,      // UNTESTED
4969                         "error: unhandled parse error", &$1);
4970         }$
4971         | DeclareConstant
4972         | DeclareFunction
4973         | DeclareStruct
4974
4975         ## top level grammar
4976
4977         ## Grammar
4978
4979 ### The `const` section
4980
4981 As well as being defined in with the code that uses them, constants can
4982 be declared at the top level.  These have full-file scope, so they are
4983 always `InScope`, even before(!) they have been declared.  The value of
4984 a top level constant can be given as an expression, and this is
4985 evaluated after parsing and before execution.
4986
4987 A function call can be used to evaluate a constant, but it will not have
4988 access to any program state, once such statement becomes meaningful.
4989 e.g.  arguments and filesystem will not be visible.
4990
4991 Constants are defined in a section that starts with the reserved word
4992 `const` and then has a block with a list of assignment statements.
4993 For syntactic consistency, these must use the double-colon syntax to
4994 make it clear that they are constants.  Type can also be given: if
4995 not, the type will be determined during analysis, as with other
4996 constants.
4997
4998 ###### parse context
4999         struct binode *constlist;
5000
5001 ###### top level grammar
5002
5003         $TERM const
5004
5005         DeclareConstant -> const { IN OptNL ConstList OUT OptNL } Newlines
5006         | const { SimpleConstList } Newlines
5007         | const IN OptNL ConstList OUT Newlines
5008         | const SimpleConstList Newlines
5009
5010         ConstList -> ConstList SimpleConstLine
5011         | SimpleConstLine
5012
5013         SimpleConstList -> SimpleConstList ; Const
5014         | Const
5015         | SimpleConstList ;
5016
5017         SimpleConstLine -> SimpleConstList Newlines
5018         | ERROR Newlines ${ tok_err(c, "Syntax error in constant", &$1); }$
5019
5020         $*type
5021         CType -> Type   ${ $0 = $<1; }$
5022         |               ${ $0 = NULL; }$
5023
5024         $void
5025         Const -> IDENTIFIER :: CType = Expression ${ {
5026                 struct variable *v;
5027                 struct binode *bl, *bv;
5028                 struct var *var = new_pos(var, $ID);
5029
5030                 v = var_decl(c, $ID.txt);
5031                 if (v) {
5032                         v->where_decl = var;
5033                         v->where_set = var;
5034                         v->type = $<CT;
5035                         v->constant = 1;
5036                         v->global = 1;
5037                 } else {
5038                         v = var_ref(c, $1.txt);
5039                         if (v->type == Tnone) {
5040                                 v->where_decl = var;
5041                                 v->where_set = var;
5042                                 v->type = $<CT;
5043                                 v->constant = 1;
5044                                 v->global = 1;
5045                         } else {
5046                                 tok_err(c, "error: name already declared", &$1);
5047                                 type_err(c, "info: this is where '%v' was first declared",
5048                                          v->where_decl, NULL, 0, NULL);
5049                         }
5050                 }
5051                 var->var = v;
5052
5053                 bv = new(binode);
5054                 bv->op = Declare;
5055                 bv->left = var;
5056                 bv->right= $<Exp;
5057
5058                 bl = new(binode);
5059                 bl->op = List;
5060                 bl->left = c->constlist;
5061                 bl->right = bv;
5062                 c->constlist = bl;
5063         } }$
5064
5065 ###### core functions
5066         static void resolve_consts(struct parse_context *c)
5067         {
5068                 struct binode *b;
5069                 int retry = 1;
5070                 enum { none, some, cannot } progress = none;
5071
5072                 c->constlist = reorder_bilist(c->constlist);
5073                 while (retry) {
5074                         retry = 0;
5075                         for (b = cast(binode, c->constlist); b;
5076                              b = cast(binode, b->right)) {
5077                                 enum prop_err perr;
5078                                 struct binode *vb = cast(binode, b->left);
5079                                 struct var *v = cast(var, vb->left);
5080                                 if (v->var->frame_pos >= 0)
5081                                         continue;
5082                                 do {
5083                                         perr = 0;
5084                                         propagate_types(vb->right, c, &perr,
5085                                                         v->var->type, 0);
5086                                 } while (perr & Eretry);
5087                                 if (perr & Efail)
5088                                         c->parse_error += 1;
5089                                 else if (!(perr & Enoconst)) {
5090                                         progress = some;
5091                                         struct value res = interp_exec(
5092                                                 c, vb->right, &v->var->type);
5093                                         global_alloc(c, v->var->type, v->var, &res);
5094                                 } else {
5095                                         if (progress == cannot)
5096                                                 type_err(c, "error: const %v cannot be resolved.",
5097                                                          v, NULL, 0, NULL);
5098                                         else
5099                                                 retry = 1;
5100                                 }
5101                         }
5102                         switch (progress) {
5103                         case cannot:
5104                                 retry = 0; break;
5105                         case none:
5106                                 progress = cannot; break;
5107                         case some:
5108                                 progress = none; break;
5109                         }
5110                 }
5111         }
5112
5113 ###### print const decls
5114         {
5115                 struct binode *b;
5116                 int first = 1;
5117
5118                 for (b = cast(binode, context.constlist); b;
5119                      b = cast(binode, b->right)) {
5120                         struct binode *vb = cast(binode, b->left);
5121                         struct var *vr = cast(var, vb->left);
5122                         struct variable *v = vr->var;
5123
5124                         if (first)
5125                                 printf("const\n");
5126                         first = 0;
5127
5128                         printf("    %.*s :: ", v->name->name.len, v->name->name.txt);
5129                         type_print(v->type, stdout);
5130                         printf(" = ");
5131                         print_exec(vb->right, -1, 0);
5132                         printf("\n");
5133                 }
5134         }
5135
5136 ###### free const decls
5137         free_binode(context.constlist);
5138
5139 ### Function declarations
5140
5141 The code in an Ocean program is all stored in function declarations.
5142 One of the functions must be named `main` and it must accept an array of
5143 strings as a parameter - the command line arguments.
5144
5145 As this is the top level, several things are handled a bit differently.
5146 The function is not interpreted by `interp_exec` as that isn't passed
5147 the argument list which the program requires.  Similarly type analysis
5148 is a bit more interesting at this level.
5149
5150 ###### ast functions
5151
5152         static struct type *handle_results(struct parse_context *c,
5153                                            struct binode *results)
5154         {
5155                 /* Create a 'struct' type from the results list, which
5156                  * is a list for 'struct var'
5157                  */
5158                 struct type *t = add_anon_type(c, &structure_prototype,
5159                                                "function result");
5160                 int cnt = 0;
5161                 struct binode *b;
5162
5163                 for (b = results; b; b = cast(binode, b->right))
5164                         cnt += 1;
5165                 t->structure.nfields = cnt;
5166                 t->structure.fields = calloc(cnt, sizeof(struct field));
5167                 cnt = 0;
5168                 for (b = results; b; b = cast(binode, b->right)) {
5169                         struct var *v = cast(var, b->left);
5170                         struct field *f = &t->structure.fields[cnt++];
5171                         int a = v->var->type->align;
5172                         f->name = v->var->name->name;
5173                         f->type = v->var->type;
5174                         f->init = NULL;
5175                         f->offset = t->size;
5176                         v->var->frame_pos = f->offset;
5177                         t->size += ((f->type->size - 1) | (a-1)) + 1;
5178                         if (a > t->align)
5179                                 t->align = a;
5180                         variable_unlink_exec(v->var);
5181                 }
5182                 free_binode(results);
5183                 return t;
5184         }
5185
5186         static struct variable *declare_function(struct parse_context *c,
5187                                                 struct variable *name,
5188                                                 struct binode *args,
5189                                                 struct type *ret,
5190                                                 struct binode *results,
5191                                                 struct exec *code)
5192         {
5193                 if (name) {
5194                         struct value fn = {.function = code};
5195                         struct type *t;
5196                         var_block_close(c, CloseFunction, code);
5197                         t = add_anon_type(c, &function_prototype,
5198                                           "func %.*s", name->name->name.len,
5199                                           name->name->name.txt);
5200                         name->type = t;
5201                         t->function.params = reorder_bilist(args);
5202                         if (!ret) {
5203                                 ret = handle_results(c, reorder_bilist(results));
5204                                 t->function.inline_result = 1;
5205                                 t->function.local_size = ret->size;
5206                         }
5207                         t->function.return_type = ret;
5208                         global_alloc(c, t, name, &fn);
5209                         name->type->function.scope = c->out_scope;
5210                 } else {
5211                         free_binode(args);
5212                         free_type(ret);
5213                         free_exec(code);
5214                         var_block_close(c, CloseFunction, NULL);
5215                 }
5216                 c->out_scope = NULL;
5217                 return name;
5218         }
5219
5220 ###### declare terminals
5221         $TERM return
5222
5223 ###### top level grammar
5224
5225         $*variable
5226         DeclareFunction -> func FuncName ( OpenScope ArgsLine ) Block Newlines ${
5227                 $0 = declare_function(c, $<FN, $<Ar, Tnone, NULL, $<Bl);
5228         }$
5229         | func FuncName IN OpenScope Args OUT OptNL do Block Newlines ${
5230                 $0 = declare_function(c, $<FN, $<Ar, Tnone, NULL, $<Bl);
5231         }$
5232         | func FuncName NEWLINE OpenScope OptNL do Block Newlines ${
5233                 $0 = declare_function(c, $<FN, NULL, Tnone, NULL, $<Bl);
5234         }$
5235         | func FuncName ( OpenScope ArgsLine ) : Type Block Newlines ${
5236                 $0 = declare_function(c, $<FN, $<Ar, $<Ty, NULL, $<Bl);
5237         }$
5238         | func FuncName ( OpenScope ArgsLine ) : ( ArgsLine ) Block Newlines ${
5239                 $0 = declare_function(c, $<FN, $<AL, NULL, $<AL2, $<Bl);
5240         }$
5241         | func FuncName IN OpenScope Args OUT OptNL return Type Newlines do Block Newlines ${
5242                 $0 = declare_function(c, $<FN, $<Ar, $<Ty, NULL, $<Bl);
5243         }$
5244         | func FuncName NEWLINE OpenScope return Type Newlines do Block Newlines ${
5245                 $0 = declare_function(c, $<FN, NULL, $<Ty, NULL, $<Bl);
5246         }$
5247         | func FuncName IN OpenScope Args OUT OptNL return IN Args OUT OptNL do Block Newlines ${
5248                 $0 = declare_function(c, $<FN, $<Ar, NULL, $<Ar2, $<Bl);
5249         }$
5250         | func FuncName NEWLINE OpenScope return IN Args OUT OptNL do Block Newlines ${
5251                 $0 = declare_function(c, $<FN, NULL, NULL, $<Ar, $<Bl);
5252         }$
5253
5254 ###### print func decls
5255         {
5256                 struct variable *v;
5257                 int target = -1;
5258
5259                 while (target != 0) {
5260                         int i = 0;
5261                         for (v = context.in_scope; v; v=v->in_scope)
5262                                 if (v->depth == 0 && v->type && v->type->check_args) {
5263                                         i += 1;
5264                                         if (i == target)
5265                                                 break;
5266                                 }
5267
5268                         if (target == -1) {
5269                                 target = i;
5270                         } else {
5271                                 struct value *val = var_value(&context, v);
5272                                 printf("func %.*s", v->name->name.len, v->name->name.txt);
5273                                 v->type->print_type_decl(v->type, stdout);
5274                                 if (brackets)
5275                                         print_exec(val->function, 0, brackets);
5276                                 else
5277                                         print_value(v->type, val, stdout);
5278                                 printf("/* frame size %d */\n", v->type->function.local_size);
5279                                 target -= 1;
5280                         }
5281                 }
5282         }
5283
5284 ###### core functions
5285
5286         static int analyse_funcs(struct parse_context *c)
5287         {
5288                 struct variable *v;
5289                 int all_ok = 1;
5290                 for (v = c->in_scope; v; v = v->in_scope) {
5291                         struct value *val;
5292                         struct type *ret;
5293                         enum prop_err perr;
5294                         if (v->depth != 0 || !v->type || !v->type->check_args)
5295                                 continue;
5296                         ret = v->type->function.inline_result ?
5297                                 Tnone : v->type->function.return_type;
5298                         val = var_value(c, v);
5299                         do {
5300                                 perr = 0;
5301                                 propagate_types(val->function, c, &perr, ret, 0);
5302                         } while (!(perr & Efail) && (perr & Eretry));
5303                         if (!(perr & Efail))
5304                                 /* Make sure everything is still consistent */
5305                                 propagate_types(val->function, c, &perr, ret, 0);
5306                         if (perr & Efail)
5307                                 all_ok = 0;
5308                         if (!v->type->function.inline_result &&
5309                             !v->type->function.return_type->dup) {
5310                                 type_err(c, "error: function cannot return value of type %1",
5311                                          v->where_decl, v->type->function.return_type, 0, NULL);
5312                         }
5313
5314                         scope_finalize(c, v->type);
5315                 }
5316                 return all_ok;
5317         }
5318
5319         static int analyse_main(struct type *type, struct parse_context *c)
5320         {
5321                 struct binode *bp = type->function.params;
5322                 struct binode *b;
5323                 enum prop_err perr;
5324                 int arg = 0;
5325                 struct type *argv_type;
5326
5327                 argv_type = add_anon_type(c, &array_prototype, "argv");
5328                 argv_type->array.member = Tstr;
5329                 argv_type->array.unspec = 1;
5330
5331                 for (b = bp; b; b = cast(binode, b->right)) {
5332                         perr = 0;
5333                         switch (arg++) {
5334                         case 0: /* argv */
5335                                 propagate_types(b->left, c, &perr, argv_type, 0);
5336                                 break;
5337                         default: /* invalid */  // NOTEST
5338                                 propagate_types(b->left, c, &perr, Tnone, 0);   // NOTEST
5339                         }
5340                         if (perr & Efail)
5341                                 c->parse_error += 1;
5342                 }
5343
5344                 return !c->parse_error;
5345         }
5346
5347         static void interp_main(struct parse_context *c, int argc, char **argv)
5348         {
5349                 struct value *progp = NULL;
5350                 struct text main_name = { "main", 4 };
5351                 struct variable *mainv;
5352                 struct binode *al;
5353                 int anum = 0;
5354                 struct value v;
5355                 struct type *vtype;
5356
5357                 mainv = var_ref(c, main_name);
5358                 if (mainv)
5359                         progp = var_value(c, mainv);
5360                 if (!progp || !progp->function) {
5361                         fprintf(stderr, "oceani: no main function found.\n");
5362                         c->parse_error += 1;
5363                         return;
5364                 }
5365                 if (!analyse_main(mainv->type, c)) {
5366                         fprintf(stderr, "oceani: main has wrong type.\n");
5367                         c->parse_error += 1;
5368                         return;
5369                 }
5370                 al = mainv->type->function.params;
5371
5372                 c->local_size = mainv->type->function.local_size;
5373                 c->local = calloc(1, c->local_size);
5374                 while (al) {
5375                         struct var *v = cast(var, al->left);
5376                         struct value *vl = var_value(c, v->var);
5377                         struct value arg;
5378                         struct type *t;
5379                         mpq_t argcq;
5380                         int i;
5381
5382                         switch (anum++) {
5383                         case 0: /* argv */
5384                                 t = v->var->type;
5385                                 mpq_init(argcq);
5386                                 mpq_set_ui(argcq, argc, 1);
5387                                 memcpy(var_value(c, t->array.vsize), &argcq, sizeof(argcq));
5388                                 t->prepare_type(c, t, 0);
5389                                 array_init(v->var->type, vl);
5390                                 for (i = 0; i < argc; i++) {
5391                                         struct value *vl2 = vl->array + i * v->var->type->array.member->size;
5392
5393                                         arg.str.txt = argv[i];
5394                                         arg.str.len = strlen(argv[i]);
5395                                         free_value(Tstr, vl2);
5396                                         dup_value(Tstr, &arg, vl2);
5397                                 }
5398                                 break;
5399                         }
5400                         al = cast(binode, al->right);
5401                 }
5402                 v = interp_exec(c, progp->function, &vtype);
5403                 free_value(vtype, &v);
5404                 free(c->local);
5405                 c->local = NULL;
5406         }
5407
5408 ###### ast functions
5409         void free_variable(struct variable *v)
5410         {
5411         }
5412
5413 ## And now to test it out.
5414
5415 Having a language requires having a "hello world" program.  I'll
5416 provide a little more than that: a program that prints "Hello world"
5417 finds the GCD of two numbers, prints the first few elements of
5418 Fibonacci, performs a binary search for a number, and a few other
5419 things which will likely grow as the languages grows.
5420
5421 ###### File: oceani.mk
5422         demos :: sayhello
5423         sayhello : oceani
5424                 @echo "===== DEMO ====="
5425                 ./oceani --section "demo: hello" oceani.mdc 55 33
5426
5427 ###### demo: hello
5428
5429         const
5430                 pi ::= 3.141_592_6
5431                 four ::= 2 + 2 ; five ::= 10/2
5432         const pie ::= "I like Pie";
5433                 cake ::= "The cake is"
5434                   ++ " a lie"
5435
5436         struct fred
5437                 size:[four]number
5438                 name:string
5439                 alive:Boolean
5440
5441         func main(argv:[argc::]string)
5442                 print "Hello World, what lovely oceans you have!"
5443                 print "Are there", five, "?"
5444                 print pi, pie, "but", cake
5445
5446                 A := $argv[1]; B := $argv[2]
5447
5448                 /* When a variable is defined in both branches of an 'if',
5449                  * and used afterwards, the variables are merged.
5450                  */
5451                 if A > B:
5452                         bigger := "yes"
5453                 else
5454                         bigger := "no"
5455                 print "Is", A, "bigger than", B,"? ", bigger
5456                 /* If a variable is not used after the 'if', no
5457                  * merge happens, so types can be different
5458                  */
5459                 if A > B * 2:
5460                         double:string = "yes"
5461                         print A, "is more than twice", B, "?", double
5462                 else
5463                         double := B*2
5464                         print "double", B, "is", double
5465
5466                 a : number
5467                 a = A;
5468                 b:number = B
5469                 if a > 0 and then b > 0:
5470                         while a != b:
5471                                 if a < b:
5472                                         b = b - a
5473                                 else
5474                                         a = a - b
5475                         print "GCD of", A, "and", B,"is", a
5476                 else if a <= 0:
5477                         print a, "is not positive, cannot calculate GCD"
5478                 else
5479                         print b, "is not positive, cannot calculate GCD"
5480
5481                 for
5482                         togo := 10
5483                         f1 := 1; f2 := 1
5484                         print "Fibonacci:", f1,f2,
5485                 then togo = togo - 1
5486                 while togo > 0:
5487                         f3 := f1 + f2
5488                         print "", f3,
5489                         f1 = f2
5490                         f2 = f3
5491                 print ""
5492
5493                 /* Binary search... */
5494                 for
5495                         lo:= 0; hi := 100
5496                         target := 77
5497                 while
5498                         mid := (lo + hi) / 2
5499                         if mid == target:
5500                                 use Found
5501                         if mid < target:
5502                                 lo = mid
5503                         else
5504                                 hi = mid
5505                         if hi - lo < 1:
5506                                 lo = mid
5507                                 use GiveUp
5508                         use True
5509                 do pass
5510                 case Found:
5511                         print "Yay, I found", target
5512                 case GiveUp:
5513                         print "Closest I found was", lo
5514
5515                 size::= 10
5516                 list:[size]number
5517                 list[0] = 1234
5518                 // "middle square" PRNG.  Not particularly good, but one my
5519                 // Dad taught me - the first one I ever heard of.
5520                 for i:=1; then i = i + 1; while i < size:
5521                         n := list[i-1] * list[i-1]
5522                         list[i] = (n / 100) % 10 000
5523
5524                 print "Before sort:",
5525                 for i:=0; then i = i + 1; while i < size:
5526                         print "", list[i],
5527                 print
5528
5529                 for i := 1; then i=i+1; while i < size:
5530                         for j:=i-1; then j=j-1; while j >= 0:
5531                                 if list[j] > list[j+1]:
5532                                         t:= list[j]
5533                                         list[j] = list[j+1]
5534                                         list[j+1] = t
5535                 print " After sort:",
5536                 for i:=0; then i = i + 1; while i < size:
5537                         print "", list[i],
5538                 print
5539
5540                 if 1 == 2 then print "yes"; else print "no"
5541
5542                 bob:fred
5543                 bob.name = "Hello"
5544                 bob.alive = (bob.name == "Hello")
5545                 print "bob", "is" if  bob.alive else "isn't", "alive"