ocean-lang.org Git - ocean/blob - csrc/oceani.mdc

   1 # Ocean Interpreter - Jamison Creek version
   2
   3 Ocean is intended to be a compiled language, so this interpreter is
   4 not targeted at being the final product.  It is, rather, an intermediate
   5 stage and fills that role in two distinct ways.
   6
   7 Firstly, it exists as a platform to experiment with the early language
   8 design.  An interpreter is easy to write and easy to get working, so
   9 the barrier for entry is lower if I aim to start with an interpreter.
  10
  11 Secondly, the plan for the Ocean compiler is to write it in the
  12 [Ocean language](http://ocean-lang.org).  To achieve this we naturally
  13 need some sort of boot-strap process and this interpreter - written in
  14 portable C - will fill that role.  It will be used to bootstrap the
  15 Ocean compiler.
  16
  17 Two features that are not needed to fill either of these roles are
  18 performance and completeness.  The interpreter only needs to be fast
  19 enough to run small test programs and occasionally to run the compiler
  20 on itself.  It only needs to be complete enough to test aspects of the
  21 design which are developed before the compiler is working, and to run
  22 the compiler on itself.  Any features not used by the compiler when
  23 compiling itself are superfluous.  They may be included anyway, but
  24 they may not.
  25
  26 Nonetheless, the interpreter should end up being reasonably complete,
  27 and any performance bottlenecks which appear and are easily fixed, will
  28 be.
  29
  30 ## Current version
  31
  32 This third version of the interpreter exists to test out some initial
  33 ideas relating to types.  Particularly it adds arrays (indexed from
  34 zero) and simple structures.  Basic control flow and variable scoping
  35 are already fairly well established, as are basic numerical and
  36 boolean operators.
  37
  38 Some operators that have only recently been added, and so have not
  39 generated all that much experience yet are "and then" and "or else" as
  40 short-circuit Boolean operators, and the "if ... else" trinary
  41 operator which can select between two expressions based on a third
  42 (which appears syntactically in the middle).
  43
  44 The "func" clause currently only allows a "main" function to be
  45 declared.  That will be extended when proper function support is added.
  46
  47 An element that is present purely to make a usable language, and
  48 without any expectation that they will remain, is the "print" statement
  49 which performs simple output.
  50
  51 The current scalar types are "number", "Boolean", and "string".
  52 Boolean will likely stay in its current form, the other two might, but
  53 could just as easily be changed.
  54
  55 ## Naming
  56
  57 Versions of the interpreter which obviously do not support a complete
  58 language will be named after creeks and streams.  This one is Jamison
  59 Creek.
  60
  61 Once we have something reasonably resembling a complete language, the
  62 names of rivers will be used.
  63 Early versions of the compiler will be named after seas.  Major
  64 releases of the compiler will be named after oceans.  Hopefully I will
  65 be finished once I get to the Pacific Ocean release.
  66
  67 ## Outline
  68
  69 As well as parsing and executing a program, the interpreter can print
  70 out the program from the parsed internal structure.  This is useful
  71 for validating the parsing.
  72 So the main requirements of the interpreter are:
  73
  74 - Parse the program, possibly with tracing,
  75 - Analyse the parsed program to ensure consistency,
  76 - Print the program,
  77 - Execute the "main" function in the program, if no parsing or
  78   consistency errors were found.
  79
  80 This is all performed by a single C program extracted with
  81 `parsergen`.
  82
  83 There will be two formats for printing the program: a default and one
  84 that uses bracketing.  So a `--bracket` command line option is needed
  85 for that.  Normally the first code section found is used, however an
  86 alternate section can be requested so that a file (such as this one)
  87 can contain multiple programs.  This is effected with the `--section`
  88 option.
  89
  90 This code must be compiled with `-fplan9-extensions` so that anonymous
  91 structures can be used.
  92
  93 ###### File: oceani.mk
  94
  95         myCFLAGS := -Wall -g -fplan9-extensions
  96         CFLAGS := $(filter-out $(myCFLAGS),$(CFLAGS)) $(myCFLAGS)
  97         myLDLIBS:= libparser.o libscanner.o libmdcode.o -licuuc
  98         LDLIBS := $(filter-out $(myLDLIBS),$(LDLIBS)) $(myLDLIBS)
  99         ## libs
 100         all :: $(LDLIBS) oceani
 101         oceani.c oceani.h : oceani.mdc parsergen
 102                 ./parsergen -o oceani --LALR --tag Parser oceani.mdc
 103         oceani.mk: oceani.mdc md2c
 104                 ./md2c oceani.mdc
 105
 106         oceani: oceani.o $(LDLIBS)
 107                 $(CC) $(CFLAGS) -o oceani oceani.o $(LDLIBS)
 108
 109 ###### Parser: header
 110         ## macros
 111         struct parse_context;
 112         ## ast
 113         struct parse_context {
 114                 struct token_config config;
 115                 char *file_name;
 116                 int parse_error;
 117                 ## parse context
 118         };
 119
 120 ###### macros
 121
 122         #define container_of(ptr, type, member) ({                      \
 123                 const typeof( ((type *)0)->member ) *__mptr = (ptr);    \
 124                 (type *)( (char *)__mptr - offsetof(type,member) );})
 125
 126         #define config2context(_conf) container_of(_conf, struct parse_context, \
 127                 config)
 128
 129 ###### Parser: reduce
 130         struct parse_context *c = config2context(config);
 131
 132 ###### Parser: code
 133         #define _GNU_SOURCE
 134         #include <unistd.h>
 135         #include <stdlib.h>
 136         #include <fcntl.h>
 137         #include <errno.h>
 138         #include <sys/mman.h>
 139         #include <string.h>
 140         #include <stdio.h>
 141         #include <locale.h>
 142         #include <malloc.h>
 143         #include "mdcode.h"
 144         #include "scanner.h"
 145         #include "parser.h"
 146
 147         ## includes
 148
 149         #include "oceani.h"
 150
 151         ## forward decls
 152         ## value functions
 153         ## ast functions
 154         ## core functions
 155
 156         #include <getopt.h>
 157         static char Usage[] =
 158                 "Usage: oceani --trace --print --noexec --brackets --section=SectionName prog.ocn\n";
 159         static const struct option long_options[] = {
 160                 {"trace",     0, NULL, 't'},
 161                 {"print",     0, NULL, 'p'},
 162                 {"noexec",    0, NULL, 'n'},
 163                 {"brackets",  0, NULL, 'b'},
 164                 {"section",   1, NULL, 's'},
 165                 {NULL,        0, NULL, 0},
 166         };
 167         const char *options = "tpnbs";
 168
 169         static void pr_err(char *msg)                   // NOTEST
 170         {
 171                 fprintf(stderr, "%s\n", msg);           // NOTEST
 172         }                                               // NOTEST
 173
 174         int main(int argc, char *argv[])
 175         {
 176                 int fd;
 177                 int len;
 178                 char *file;
 179                 struct section *s = NULL, *ss;
 180                 char *section = NULL;
 181                 struct parse_context context = {
 182                         .config = {
 183                                 .ignored = (1 << TK_mark),
 184                                 .number_chars = ".,_+- ",
 185                                 .word_start = "_",
 186                                 .word_cont = "_",
 187                         },
 188                 };
 189                 int doprint=0, dotrace=0, doexec=1, brackets=0;
 190                 int opt;
 191                 while ((opt = getopt_long(argc, argv, options, long_options, NULL))
 192                        != -1) {
 193                         switch(opt) {
 194                         case 't': dotrace=1; break;
 195                         case 'p': doprint=1; break;
 196                         case 'n': doexec=0; break;
 197                         case 'b': brackets=1; break;
 198                         case 's': section = optarg; break;
 199                         default: fprintf(stderr, Usage);
 200                                 exit(1);
 201                         }
 202                 }
 203                 if (optind >= argc) {
 204                         fprintf(stderr, "oceani: no input file given\n");
 205                         exit(1);
 206                 }
 207                 fd = open(argv[optind], O_RDONLY);
 208                 if (fd < 0) {
 209                         fprintf(stderr, "oceani: cannot open %s\n", argv[optind]);
 210                         exit(1);
 211                 }
 212                 context.file_name = argv[optind];
 213                 len = lseek(fd, 0, 2);
 214                 file = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
 215                 s = code_extract(file, file+len, pr_err);
 216                 if (!s) {
 217                         fprintf(stderr, "oceani: could not find any code in %s\n",
 218                                 argv[optind]);
 219                         exit(1);
 220                 }
 221
 222                 ## context initialization
 223
 224                 if (section) {
 225                         for (ss = s; ss; ss = ss->next) {
 226                                 struct text sec = ss->section;
 227                                 if (sec.len == strlen(section) &&
 228                                     strncmp(sec.txt, section, sec.len) == 0)
 229                                         break;
 230                         }
 231                         if (!ss) {
 232                                 fprintf(stderr, "oceani: cannot find section %s\n",
 233                                         section);
 234                                 goto cleanup;
 235                         }
 236                 } else
 237                         ss = s;                         // NOTEST
 238                 if (!ss->code) {
 239                         fprintf(stderr, "oceani: no code found in requested section\n");        // NOTEST
 240                         goto cleanup;                   // NOTEST
 241                 }
 242
 243                 parse_oceani(ss->code, &context.config, dotrace ? stderr : NULL);
 244
 245                 if (!context.parse_error && !analyse_funcs(&context)) {
 246                         fprintf(stderr, "oceani: type error in program - not running.\n");
 247                         context.parse_error = 1;
 248                 }
 249
 250                 if (doprint) {
 251                         ## print const decls
 252                         ## print type decls
 253                         ## print func decls
 254                 }
 255                 if (doexec && !context.parse_error)
 256                         interp_main(&context, argc - optind, argv + optind);
 257         cleanup:
 258                 while (s) {
 259                         struct section *t = s->next;
 260                         code_free(s->code);
 261                         free(s);
 262                         s = t;
 263                 }
 264                 // FIXME parser should pop scope even on error
 265                 while (context.scope_depth > 0)
 266                         scope_pop(&context);
 267                 ## free global vars
 268                 ## free context types
 269                 ## free context storage
 270                 exit(context.parse_error ? 1 : 0);
 271         }
 272
 273 ### Analysis
 274
 275 The four requirements of parse, analyse, print, interpret apply to
 276 each language element individually so that is how most of the code
 277 will be structured.
 278
 279 Three of the four are fairly self explanatory.  The one that requires
 280 a little explanation is the analysis step.
 281
 282 The current language design does not require the types of variables to
 283 be declared, but they must still have a single type.  Different
 284 operations impose different requirements on the variables, for example
 285 addition requires both arguments to be numeric, and assignment
 286 requires the variable on the left to have the same type as the
 287 expression on the right.
 288
 289 Analysis involves propagating these type requirements around and
 290 consequently setting the type of each variable.  If any requirements
 291 are violated (e.g. a string is compared with a number) or if a
 292 variable needs to have two different types, then an error is raised
 293 and the program will not run.
 294
 295 If the same variable is declared in both branchs of an 'if/else', or
 296 in all cases of a 'switch' then the multiple instances may be merged
 297 into just one variable if the variable is referenced after the
 298 conditional statement.  When this happens, the types must naturally be
 299 consistent across all the branches.  When the variable is not used
 300 outside the if, the variables in the different branches are distinct
 301 and can be of different types.
 302
 303 Undeclared names may only appear in "use" statements and "case" expressions.
 304 These names are given a type of "label" and a unique value.
 305 This allows them to fill the role of a name in an enumerated type, which
 306 is useful for testing the `switch` statement.
 307
 308 As we will see, the condition part of a `while` statement can return
 309 either a Boolean or some other type.  This requires that the expected
 310 type that gets passed around comprises a type and a flag to indicate
 311 that `Tbool` is also permitted.
 312
 313 As there are, as yet, no distinct types that are compatible, there
 314 isn't much subtlety in the analysis.  When we have distinct number
 315 types, this will become more interesting.
 316
 317 #### Error reporting
 318
 319 When analysis discovers an inconsistency it needs to report an error;
 320 just refusing to run the code ensures that the error doesn't cascade,
 321 but by itself it isn't very useful.  A clear understanding of the sort
 322 of error message that are useful will help guide the process of
 323 analysis.
 324
 325 At a simplistic level, the only sort of error that type analysis can
 326 report is that the type of some construct doesn't match a contextual
 327 requirement.  For example, in `4 + "hello"` the addition provides a
 328 contextual requirement for numbers, but `"hello"` is not a number.  In
 329 this particular example no further information is needed as the types
 330 are obvious from local information.  When a variable is involved that
 331 isn't the case.  It may be helpful to explain why the variable has a
 332 particular type, by indicating the location where the type was set,
 333 whether by declaration or usage.
 334
 335 Using a recursive-descent analysis we can easily detect a problem at
 336 multiple locations. In "`hello:= "there"; 4 + hello`" the addition
 337 will detect that one argument is not a number and the usage of `hello`
 338 will detect that a number was wanted, but not provided.  In this
 339 (early) version of the language, we will generate error reports at
 340 multiple locations, so the use of `hello` will report an error and
 341 explain were the value was set, and the addition will report an error
 342 and say why numbers are needed.  To be able to report locations for
 343 errors, each language element will need to record a file location
 344 (line and column) and each variable will need to record the language
 345 element where its type was set.  For now we will assume that each line
 346 of an error message indicates one location in the file, and up to 2
 347 types.  So we provide a `printf`-like function which takes a format, a
 348 location (a `struct exec` which has not yet been introduced), and 2
 349 types. "`%1`" reports the first type, "`%2`" reports the second.  We
 350 will need a function to print the location, once we know how that is
 351 stored. e As will be explained later, there are sometimes extra rules for
 352 type matching and they might affect error messages, we need to pass those
 353 in too.
 354
 355 As well as type errors, we sometimes need to report problems with
 356 tokens, which might be unexpected or might name a type that has not
 357 been defined.  For these we have `tok_err()` which reports an error
 358 with a given token.  Each of the error functions sets the flag in the
 359 context so indicate that parsing failed.
 360
 361 ###### forward decls
 362
 363         static void fput_loc(struct exec *loc, FILE *f);
 364         static void type_err(struct parse_context *c,
 365                              char *fmt, struct exec *loc,
 366                              struct type *t1, int rules, struct type *t2);
 367
 368 ###### core functions
 369
 370         static void type_err(struct parse_context *c,
 371                              char *fmt, struct exec *loc,
 372                              struct type *t1, int rules, struct type *t2)
 373         {
 374                 fprintf(stderr, "%s:", c->file_name);
 375                 fput_loc(loc, stderr);
 376                 for (; *fmt ; fmt++) {
 377                         if (*fmt != '%') {
 378                                 fputc(*fmt, stderr);
 379                                 continue;
 380                         }
 381                         fmt++;
 382                         switch (*fmt) {
 383                         case '%': fputc(*fmt, stderr); break;   // NOTEST
 384                         default: fputc('?', stderr); break;     // NOTEST
 385                         case '1':
 386                                 type_print(t1, stderr);
 387                                 break;
 388                         case '2':
 389                                 type_print(t2, stderr);
 390                                 break;
 391                         ## format cases
 392                         }
 393                 }
 394                 fputs("\n", stderr);
 395                 c->parse_error = 1;
 396         }
 397
 398         static void tok_err(struct parse_context *c, char *fmt, struct token *t)
 399         {
 400                 fprintf(stderr, "%s:%d:%d: %s: %.*s\n", c->file_name, t->line, t->col, fmt,
 401                         t->txt.len, t->txt.txt);
 402                 c->parse_error = 1;
 403         }
 404
 405 ## Entities: declared and predeclared.
 406
 407 There are various "things" that the language and/or the interpreter
 408 needs to know about to parse and execute a program.  These include
 409 types, variables, values, and executable code.  These are all lumped
 410 together under the term "entities" (calling them "objects" would be
 411 confusing) and introduced here.  The following section will present the
 412 different specific code elements which comprise or manipulate these
 413 various entities.
 414
 415 ### Executables
 416
 417 Executables can be lots of different things.  In many cases an
 418 executable is just an operation combined with one or two other
 419 executables.  This allows for expressions and lists etc.  Other times an
 420 executable is something quite specific like a constant or variable name.
 421 So we define a `struct exec` to be a general executable with a type, and
 422 a `struct binode` which is a subclass of `exec`, forms a node in a
 423 binary tree, and holds an operation.  There will be other subclasses,
 424 and to access these we need to be able to `cast` the `exec` into the
 425 various other types.  The first field in any `struct exec` is the type
 426 from the `exec_types` enum.
 427
 428 ###### macros
 429         #define cast(structname, pointer) ({            \
 430                 const typeof( ((struct structname *)0)->type) *__mptr = &(pointer)->type; \
 431                 if (__mptr && *__mptr != X##structname) abort();                \
 432                 (struct structname *)( (char *)__mptr);})
 433
 434         #define new(structname) ({                                              \
 435                 struct structname *__ptr = ((struct structname *)calloc(1,sizeof(struct structname))); \
 436                 __ptr->type = X##structname;                                            \
 437                 __ptr->line = -1; __ptr->column = -1;                                   \
 438                 __ptr;})
 439
 440         #define new_pos(structname, token) ({                                           \
 441                 struct structname *__ptr = ((struct structname *)calloc(1,sizeof(struct structname))); \
 442                 __ptr->type = X##structname;                                            \
 443                 __ptr->line = token.line; __ptr->column = token.col;                    \
 444                 __ptr;})
 445
 446 ###### ast
 447         enum exec_types {
 448                 Xbinode,
 449                 ## exec type
 450         };
 451         struct exec {
 452                 enum exec_types type;
 453                 int line, column;
 454                 ## exec fields
 455         };
 456         struct binode {
 457                 struct exec;
 458                 enum Btype {
 459                         ## Binode types
 460                 } op;
 461                 struct exec *left, *right;
 462         };
 463
 464 ###### ast functions
 465
 466         static int __fput_loc(struct exec *loc, FILE *f)
 467         {
 468                 if (!loc)
 469                         return 0;
 470                 if (loc->line >= 0) {
 471                         fprintf(f, "%d:%d: ", loc->line, loc->column);
 472                         return 1;
 473                 }
 474                 if (loc->type == Xbinode)
 475                         return __fput_loc(cast(binode,loc)->left, f) ||
 476                                __fput_loc(cast(binode,loc)->right, f);  // NOTEST
 477                 return 0;
 478         }
 479         static void fput_loc(struct exec *loc, FILE *f)
 480         {
 481                 if (!__fput_loc(loc, f))
 482                         fprintf(f, "??:??: ");
 483         }
 484
 485 Each different type of `exec` node needs a number of functions defined,
 486 a bit like methods.  We must be able to free it, print it, analyse it
 487 and execute it.  Once we have specific `exec` types we will need to
 488 parse them too.  Let's take this a bit more slowly.
 489
 490 #### Freeing
 491
 492 The parser generator requires a `free_foo` function for each struct
 493 that stores attributes and they will often be `exec`s and subtypes
 494 there-of.  So we need `free_exec` which can handle all the subtypes,
 495 and we need `free_binode`.
 496
 497 ###### ast functions
 498
 499         static void free_binode(struct binode *b)
 500         {
 501                 if (!b)
 502                         return;
 503                 free_exec(b->left);
 504                 free_exec(b->right);
 505                 free(b);
 506         }
 507
 508 ###### core functions
 509         static void free_exec(struct exec *e)
 510         {
 511                 if (!e)
 512                         return;
 513                 switch(e->type) {
 514                         ## free exec cases
 515                 }
 516         }
 517
 518 ###### forward decls
 519
 520         static void free_exec(struct exec *e);
 521
 522 ###### free exec cases
 523         case Xbinode: free_binode(cast(binode, e)); break;
 524
 525 #### Printing
 526
 527 Printing an `exec` requires that we know the current indent level for
 528 printing line-oriented components.  As will become clear later, we
 529 also want to know what sort of bracketing to use.
 530
 531 ###### ast functions
 532
 533         static void do_indent(int i, char *str)
 534         {
 535                 while (i-- > 0)
 536                         printf("    ");
 537                 printf("%s", str);
 538         }
 539
 540 ###### core functions
 541         static void print_binode(struct binode *b, int indent, int bracket)
 542         {
 543                 struct binode *b2;
 544                 switch(b->op) {
 545                 ## print binode cases
 546                 }
 547         }
 548
 549         static void print_exec(struct exec *e, int indent, int bracket)
 550         {
 551                 if (!e)
 552                         return;
 553                 switch (e->type) {
 554                 case Xbinode:
 555                         print_binode(cast(binode, e), indent, bracket); break;
 556                 ## print exec cases
 557                 }
 558                 if (e->to_free) {
 559                         struct variable *v;
 560                         do_indent(indent, "/* FREE");
 561                         for (v = e->to_free; v; v = v->next_free) {
 562                                 printf(" %.*s", v->name->name.len, v->name->name.txt);
 563                                 printf("[%d,%d]", v->scope_start, v->scope_end);
 564                                 if (v->frame_pos >= 0)
 565                                         printf("(%d+%d)", v->frame_pos,
 566                                                v->type ? v->type->size:0);
 567                         }
 568                         printf(" */\n");
 569                 }
 570         }
 571
 572 ###### forward decls
 573
 574         static void print_exec(struct exec *e, int indent, int bracket);
 575
 576 #### Analysing
 577
 578 As discussed, analysis involves propagating type requirements around the
 579 program and looking for errors.
 580
 581 So `propagate_types` is passed an expected type (being a `struct type`
 582 pointer together with some `val_rules` flags) that the `exec` is
 583 expected to return, and returns the type that it does return, either
 584 of which can be `NULL` signifying "unknown".  An `ok` flag is passed
 585 by reference. It is set to `0` when an error is found, and `2` when
 586 any change is made.  If it remains unchanged at `1`, then no more
 587 propagation is needed.
 588
 589 ###### ast
 590
 591         enum val_rules {Rnolabel = 1<<0, Rboolok = 1<<1, Rnoconstant = 1<<2};
 592
 593 ###### format cases
 594         case 'r':
 595                 if (rules & Rnolabel)
 596                         fputs(" (labels not permitted)", stderr);
 597                 break;
 598
 599 ###### forward decls
 600         static struct type *propagate_types(struct exec *prog, struct parse_context *c, int *ok,
 601                                             struct type *type, int rules);
 602 ###### core functions
 603
 604         static struct type *__propagate_types(struct exec *prog, struct parse_context *c, int *ok,
 605                                               struct type *type, int rules)
 606         {
 607                 struct type *t;
 608
 609                 if (!prog)
 610                         return Tnone;
 611
 612                 switch (prog->type) {
 613                 case Xbinode:
 614                 {
 615                         struct binode *b = cast(binode, prog);
 616                         switch (b->op) {
 617                         ## propagate binode cases
 618                         }
 619                         break;
 620                 }
 621                 ## propagate exec cases
 622                 }
 623                 return Tnone;
 624         }
 625
 626         static struct type *propagate_types(struct exec *prog, struct parse_context *c, int *ok,
 627                                             struct type *type, int rules)
 628         {
 629                 struct type *ret = __propagate_types(prog, c, ok, type, rules);
 630
 631                 if (c->parse_error)
 632                         *ok = 0;
 633                 return ret;
 634         }
 635
 636 #### Interpreting
 637
 638 Interpreting an `exec` doesn't require anything but the `exec`.  State
 639 is stored in variables and each variable will be directly linked from
 640 within the `exec` tree.  The exception to this is the `main` function
 641 which needs to look at command line arguments.  This function will be
 642 interpreted separately.
 643
 644 Each `exec` can return a value combined with a type in `struct lrval`.
 645 The type may be `Tnone` but must be non-NULL.  Some `exec`s will return
 646 the location of a value, which can be updated, in `lval`.  Others will
 647 set `lval` to NULL indicating that there is a value of appropriate type
 648 in `rval`.
 649
 650 ###### core functions
 651
 652         struct lrval {
 653                 struct type *type;
 654                 struct value rval, *lval;
 655         };
 656
 657         /* If dest is passed, dtype must give the expected type, and
 658          * result can go there, in which case type is returned as NULL.
 659          */
 660         static struct lrval _interp_exec(struct parse_context *c, struct exec *e,
 661                                          struct value *dest, struct type *dtype);
 662
 663         static struct value interp_exec(struct parse_context *c, struct exec *e,
 664                                         struct type **typeret)
 665         {
 666                 struct lrval ret = _interp_exec(c, e, NULL, NULL);
 667
 668                 if (!ret.type) abort();
 669                 if (typeret)
 670                         *typeret = ret.type;
 671                 if (ret.lval)
 672                         dup_value(ret.type, ret.lval, &ret.rval);
 673                 return ret.rval;
 674         }
 675
 676         static struct value *linterp_exec(struct parse_context *c, struct exec *e,
 677                                           struct type **typeret)
 678         {
 679                 struct lrval ret = _interp_exec(c, e, NULL, NULL);
 680
 681                 if (!ret.type) abort();
 682                 if (ret.lval)
 683                         *typeret = ret.type;
 684                 else
 685                         free_value(ret.type, &ret.rval);
 686                 return ret.lval;
 687         }
 688
 689         /* dinterp_exec is used when the destination type is certain and
 690          * the value has a place to go.
 691          */
 692         static void dinterp_exec(struct parse_context *c, struct exec *e,
 693                                  struct value *dest, struct type *dtype,
 694                                  int need_free)
 695         {
 696                 struct lrval ret = _interp_exec(c, e, dest, dtype);
 697                 if (!ret.type)
 698                         return;
 699                 if (need_free)
 700                         free_value(dtype, dest);
 701                 if (ret.lval)
 702                         dup_value(dtype, ret.lval, dest);
 703                 else
 704                         memcpy(dest, &ret.rval, dtype->size);
 705         }
 706
 707         static struct lrval _interp_exec(struct parse_context *c, struct exec *e,
 708                                          struct value *dest, struct type *dtype)
 709         {
 710                 /* If the result is copied to dest, ret.type is set to NULL */
 711                 struct lrval ret;
 712                 struct value rv = {}, *lrv = NULL;
 713                 struct type *rvtype;
 714
 715                 rvtype = ret.type = Tnone;
 716                 if (!e) {
 717                         ret.lval = lrv;
 718                         ret.rval = rv;
 719                         return ret;
 720                 }
 721
 722                 switch(e->type) {
 723                 case Xbinode:
 724                 {
 725                         struct binode *b = cast(binode, e);
 726                         struct value left, right, *lleft;
 727                         struct type *ltype, *rtype;
 728                         ltype = rtype = Tnone;
 729                         switch (b->op) {
 730                         ## interp binode cases
 731                         }
 732                         free_value(ltype, &left);
 733                         free_value(rtype, &right);
 734                         break;
 735                 }
 736                 ## interp exec cases
 737                 }
 738                 if (rvtype) {
 739                         ret.lval = lrv;
 740                         ret.rval = rv;
 741                         ret.type = rvtype;
 742                 }
 743                 ## interp exec cleanup
 744                 return ret;
 745         }
 746
 747 ### Types
 748
 749 Values come in a wide range of types, with more likely to be added.
 750 Each type needs to be able to print its own values (for convenience at
 751 least) as well as to compare two values, at least for equality and
 752 possibly for order.  For now, values might need to be duplicated and
 753 freed, though eventually such manipulations will be better integrated
 754 into the language.
 755
 756 Rather than requiring every numeric type to support all numeric
 757 operations (add, multiply, etc), we allow types to be able to present
 758 as one of a few standard types: integer, float, and fraction.  The
 759 existence of these conversion functions eventually enable types to
 760 determine if they are compatible with other types, though such types
 761 have not yet been implemented.
 762
 763 Named type are stored in a simple linked list.  Objects of each type are
 764 "values" which are often passed around by value.
 765
 766 There are both explicitly named types, and anonymous types.  Anonymous
 767 cannot be accessed by name, but are used internally and have a name
 768 which might be reported in error messages.
 769
 770 ###### ast
 771
 772         struct value {
 773                 union {
 774                         char ptr[1];
 775                         ## value union fields
 776                 };
 777         };
 778
 779         struct type {
 780                 struct text name;
 781                 struct type *next;
 782                 int size, align;
 783                 int anon;
 784                 void (*init)(struct type *type, struct value *val);
 785                 void (*prepare_type)(struct parse_context *c, struct type *type, int parse_time);
 786                 void (*print)(struct type *type, struct value *val, FILE *f);
 787                 void (*print_type)(struct type *type, FILE *f);
 788                 int (*cmp_order)(struct type *t1, struct type *t2,
 789                                  struct value *v1, struct value *v2);
 790                 int (*cmp_eq)(struct type *t1, struct type *t2,
 791                               struct value *v1, struct value *v2);
 792                 void (*dup)(struct type *type, struct value *vold, struct value *vnew);
 793                 void (*free)(struct type *type, struct value *val);
 794                 void (*free_type)(struct type *t);
 795                 long long (*to_int)(struct value *v);
 796                 double (*to_float)(struct value *v);
 797                 int (*to_mpq)(mpq_t *q, struct value *v);
 798                 ## type functions
 799                 union {
 800                         ## type union fields
 801                 };
 802         };
 803
 804 ###### parse context
 805
 806         struct type *typelist;
 807
 808 ###### includes
 809         #include <stdarg.h>
 810
 811 ###### ast functions
 812
 813         static struct type *find_type(struct parse_context *c, struct text s)
 814         {
 815                 struct type *t = c->typelist;
 816
 817                 while (t && (t->anon ||
 818                              text_cmp(t->name, s) != 0))
 819                                 t = t->next;
 820                 return t;
 821         }
 822
 823         static struct type *_add_type(struct parse_context *c, struct text s,
 824                                      struct type *proto, int anon)
 825         {
 826                 struct type *n;
 827
 828                 n = calloc(1, sizeof(*n));
 829                 *n = *proto;
 830                 n->name = s;
 831                 n->anon = anon;
 832                 n->next = c->typelist;
 833                 c->typelist = n;
 834                 return n;
 835         }
 836
 837         static struct type *add_type(struct parse_context *c, struct text s,
 838                                       struct type *proto)
 839         {
 840                 return _add_type(c, s, proto, 0);
 841         }
 842
 843         static struct type *add_anon_type(struct parse_context *c,
 844                                           struct type *proto, char *name, ...)
 845         {
 846                 struct text t;
 847                 va_list ap;
 848
 849                 va_start(ap, name);
 850                 vasprintf(&t.txt, name, ap);
 851                 va_end(ap);
 852                 t.len = strlen(name);
 853                 return _add_type(c, t, proto, 1);
 854         }
 855
 856         static void free_type(struct type *t)
 857         {
 858                 /* The type is always a reference to something in the
 859                  * context, so we don't need to free anything.
 860                  */
 861         }
 862
 863         static void free_value(struct type *type, struct value *v)
 864         {
 865                 if (type && v) {
 866                         type->free(type, v);
 867                         memset(v, 0x5a, type->size);
 868                 }
 869         }
 870
 871         static void type_print(struct type *type, FILE *f)
 872         {
 873                 if (!type)
 874                         fputs("*unknown*type*", f);     // NOTEST
 875                 else if (type->name.len && !type->anon)
 876                         fprintf(f, "%.*s", type->name.len, type->name.txt);
 877                 else if (type->print_type)
 878                         type->print_type(type, f);
 879                 else
 880                         fputs("*invalid*type*", f);
 881         }
 882
 883         static void val_init(struct type *type, struct value *val)
 884         {
 885                 if (type && type->init)
 886                         type->init(type, val);
 887         }
 888
 889         static void dup_value(struct type *type,
 890                               struct value *vold, struct value *vnew)
 891         {
 892                 if (type && type->dup)
 893                         type->dup(type, vold, vnew);
 894         }
 895
 896         static int value_cmp(struct type *tl, struct type *tr,
 897                              struct value *left, struct value *right)
 898         {
 899                 if (tl && tl->cmp_order)
 900                         return tl->cmp_order(tl, tr, left, right);
 901                 if (tl && tl->cmp_eq)                   // NOTEST
 902                         return tl->cmp_eq(tl, tr, left, right); // NOTEST
 903                 return -1;                              // NOTEST
 904         }
 905
 906         static void print_value(struct type *type, struct value *v, FILE *f)
 907         {
 908                 if (type && type->print)
 909                         type->print(type, v, f);
 910                 else
 911                         fprintf(f, "*Unknown*");                // NOTEST
 912         }
 913
 914 ###### forward decls
 915
 916         static void free_value(struct type *type, struct value *v);
 917         static int type_compat(struct type *require, struct type *have, int rules);
 918         static void type_print(struct type *type, FILE *f);
 919         static void val_init(struct type *type, struct value *v);
 920         static void dup_value(struct type *type,
 921                               struct value *vold, struct value *vnew);
 922         static int value_cmp(struct type *tl, struct type *tr,
 923                              struct value *left, struct value *right);
 924         static void print_value(struct type *type, struct value *v, FILE *f);
 925
 926 ###### free context types
 927
 928         while (context.typelist) {
 929                 struct type *t = context.typelist;
 930
 931                 context.typelist = t->next;
 932                 if (t->free_type)
 933                         t->free_type(t);
 934                 if (t->anon)
 935                         free(t->name.txt);
 936                 free(t);
 937         }
 938
 939 Type can be specified for local variables, for fields in a structure,
 940 for formal parameters to functions, and possibly elsewhere.  Different
 941 rules may apply in different contexts.  As a minimum, a named type may
 942 always be used.  Currently the type of a formal parameter can be
 943 different from types in other contexts, so we have a separate grammar
 944 symbol for those.
 945
 946 ###### Grammar
 947
 948         $*type
 949         Type -> IDENTIFIER ${
 950                 $0 = find_type(c, $1.txt);
 951                 if (!$0) {
 952                         tok_err(c,
 953                                 "error: undefined type", &$1);
 954
 955                         $0 = Tnone;
 956                 }
 957         }$
 958         ## type grammar
 959
 960         FormalType -> Type ${ $0 = $<1; }$
 961         ## formal type grammar
 962
 963 #### Base Types
 964
 965 Values of the base types can be numbers, which we represent as
 966 multi-precision fractions, strings, Booleans and labels.  When
 967 analysing the program we also need to allow for places where no value
 968 is meaningful (type `Tnone`) and where we don't know what type to
 969 expect yet (type is `NULL`).
 970
 971 Values are never shared, they are always copied when used, and freed
 972 when no longer needed.
 973
 974 When propagating type information around the program, we need to
 975 determine if two types are compatible, where type `NULL` is compatible
 976 with anything.  There are two special cases with type compatibility,
 977 both related to the Conditional Statement which will be described
 978 later.  In some cases a Boolean can be accepted as well as some other
 979 primary type, and in others any type is acceptable except a label (`Vlabel`).
 980 A separate function encoding these cases will simplify some code later.
 981
 982 ###### type functions
 983
 984         int (*compat)(struct type *this, struct type *other);
 985
 986 ###### ast functions
 987
 988         static int type_compat(struct type *require, struct type *have, int rules)
 989         {
 990                 if ((rules & Rboolok) && have == Tbool)
 991                         return 1;       // NOTEST
 992                 if ((rules & Rnolabel) && have == Tlabel)
 993                         return 0;       // NOTEST
 994                 if (!require || !have)
 995                         return 1;
 996
 997                 if (require->compat)
 998                         return require->compat(require, have);
 999
1000                 return require == have;
1001         }
1002
1003 ###### includes
1004         #include <gmp.h>
1005         #include "parse_string.h"
1006         #include "parse_number.h"
1007
1008 ###### libs
1009         myLDLIBS := libnumber.o libstring.o -lgmp
1010         LDLIBS := $(filter-out $(myLDLIBS),$(LDLIBS)) $(myLDLIBS)
1011
1012 ###### type union fields
1013         enum vtype {Vnone, Vstr, Vnum, Vbool, Vlabel} vtype;
1014
1015 ###### value union fields
1016         struct text str;
1017         mpq_t num;
1018         unsigned char bool;
1019         void *label;
1020
1021 ###### ast functions
1022         static void _free_value(struct type *type, struct value *v)
1023         {
1024                 if (!v)
1025                         return;         // NOTEST
1026                 switch (type->vtype) {
1027                 case Vnone: break;
1028                 case Vstr: free(v->str.txt); break;
1029                 case Vnum: mpq_clear(v->num); break;
1030                 case Vlabel:
1031                 case Vbool: break;
1032                 }
1033         }
1034
1035 ###### value functions
1036
1037         static void _val_init(struct type *type, struct value *val)
1038         {
1039                 switch(type->vtype) {
1040                 case Vnone:             // NOTEST
1041                         break;          // NOTEST
1042                 case Vnum:
1043                         mpq_init(val->num); break;
1044                 case Vstr:
1045                         val->str.txt = malloc(1);
1046                         val->str.len = 0;
1047                         break;
1048                 case Vbool:
1049                         val->bool = 0;
1050                         break;
1051                 case Vlabel:
1052                         val->label = NULL;
1053                         break;
1054                 }
1055         }
1056
1057         static void _dup_value(struct type *type,
1058                                struct value *vold, struct value *vnew)
1059         {
1060                 switch (type->vtype) {
1061                 case Vnone:             // NOTEST
1062                         break;          // NOTEST
1063                 case Vlabel:
1064                         vnew->label = vold->label;
1065                         break;
1066                 case Vbool:
1067                         vnew->bool = vold->bool;
1068                         break;
1069                 case Vnum:
1070                         mpq_init(vnew->num);
1071                         mpq_set(vnew->num, vold->num);
1072                         break;
1073                 case Vstr:
1074                         vnew->str.len = vold->str.len;
1075                         vnew->str.txt = malloc(vnew->str.len);
1076                         memcpy(vnew->str.txt, vold->str.txt, vnew->str.len);
1077                         break;
1078                 }
1079         }
1080
1081         static int _value_cmp(struct type *tl, struct type *tr,
1082                               struct value *left, struct value *right)
1083         {
1084                 int cmp;
1085                 if (tl != tr)
1086                         return tl - tr; // NOTEST
1087                 switch (tl->vtype) {
1088                 case Vlabel: cmp = left->label == right->label ? 0 : 1; break;
1089                 case Vnum: cmp = mpq_cmp(left->num, right->num); break;
1090                 case Vstr: cmp = text_cmp(left->str, right->str); break;
1091                 case Vbool: cmp = left->bool - right->bool; break;
1092                 case Vnone: cmp = 0;                    // NOTEST
1093                 }
1094                 return cmp;
1095         }
1096
1097         static void _print_value(struct type *type, struct value *v, FILE *f)
1098         {
1099                 switch (type->vtype) {
1100                 case Vnone:                             // NOTEST
1101                         fprintf(f, "*no-value*"); break;        // NOTEST
1102                 case Vlabel:                            // NOTEST
1103                         fprintf(f, "*label-%p*", v->label); break; // NOTEST
1104                 case Vstr:
1105                         fprintf(f, "%.*s", v->str.len, v->str.txt); break;
1106                 case Vbool:
1107                         fprintf(f, "%s", v->bool ? "True":"False"); break;
1108                 case Vnum:
1109                         {
1110                         mpf_t fl;
1111                         mpf_init2(fl, 20);
1112                         mpf_set_q(fl, v->num);
1113                         gmp_fprintf(f, "%Fg", fl);
1114                         mpf_clear(fl);
1115                         break;
1116                         }
1117                 }
1118         }
1119
1120         static void _free_value(struct type *type, struct value *v);
1121
1122         static struct type base_prototype = {
1123                 .init = _val_init,
1124                 .print = _print_value,
1125                 .cmp_order = _value_cmp,
1126                 .cmp_eq = _value_cmp,
1127                 .dup = _dup_value,
1128                 .free = _free_value,
1129         };
1130
1131         static struct type *Tbool, *Tstr, *Tnum, *Tnone, *Tlabel;
1132
1133 ###### ast functions
1134         static struct type *add_base_type(struct parse_context *c, char *n,
1135                                           enum vtype vt, int size)
1136         {
1137                 struct text txt = { n, strlen(n) };
1138                 struct type *t;
1139
1140                 t = add_type(c, txt, &base_prototype);
1141                 t->vtype = vt;
1142                 t->size = size;
1143                 t->align = size > sizeof(void*) ? sizeof(void*) : size;
1144                 if (t->size & (t->align - 1))
1145                         t->size = (t->size | (t->align - 1)) + 1;       // NOTEST
1146                 return t;
1147         }
1148
1149 ###### context initialization
1150
1151         Tbool  = add_base_type(&context, "Boolean", Vbool, sizeof(char));
1152         Tstr   = add_base_type(&context, "string", Vstr, sizeof(struct text));
1153         Tnum   = add_base_type(&context, "number", Vnum, sizeof(mpq_t));
1154         Tnone  = add_base_type(&context, "none", Vnone, 0);
1155         Tlabel = add_base_type(&context, "label", Vlabel, sizeof(void*));
1156
1157 ##### Base Values
1158
1159 We have already met values as separate objects.  When manifest constants
1160 appear in the program text, that must result in an executable which has
1161 a constant value.  So the `val` structure embeds a value in an
1162 executable.
1163
1164 ###### exec type
1165         Xval,
1166
1167 ###### ast
1168         struct val {
1169                 struct exec;
1170                 struct type *vtype;
1171                 struct value val;
1172         };
1173
1174 ###### ast functions
1175         struct val *new_val(struct type *T, struct token tk)
1176         {
1177                 struct val *v = new_pos(val, tk);
1178                 v->vtype = T;
1179                 return v;
1180         }
1181
1182 ###### Grammar
1183
1184         $TERM True False
1185
1186         $*val
1187         Value ->  True ${
1188                 $0 = new_val(Tbool, $1);
1189                 $0->val.bool = 1;
1190         }$
1191         | False ${
1192                 $0 = new_val(Tbool, $1);
1193                 $0->val.bool = 0;
1194         }$
1195         | NUMBER ${ {
1196                 char tail[3];
1197                 $0 = new_val(Tnum, $1);
1198                 if (number_parse($0->val.num, tail, $1.txt) == 0)
1199                         mpq_init($0->val.num);  // UNTESTED
1200                         if (tail[0])
1201                                 tok_err(c, "error: unsupported number suffix",
1202                                         &$1);
1203         } }$
1204         | STRING ${ {
1205                 char tail[3];
1206                 $0 = new_val(Tstr, $1);
1207                 string_parse(&$1, '\\', &$0->val.str, tail);
1208                 if (tail[0])
1209                         tok_err(c, "error: unsupported string suffix",
1210                                 &$1);
1211         } }$
1212         | MULTI_STRING ${ {
1213                 char tail[3];
1214                 $0 = new_val(Tstr, $1);
1215                 string_parse(&$1, '\\', &$0->val.str, tail);
1216                 if (tail[0])
1217                         tok_err(c, "error: unsupported string suffix",
1218                                 &$1);
1219         } }$
1220
1221 ###### print exec cases
1222         case Xval:
1223         {
1224                 struct val *v = cast(val, e);
1225                 if (v->vtype == Tstr)
1226                         printf("\"");
1227                 print_value(v->vtype, &v->val, stdout);
1228                 if (v->vtype == Tstr)
1229                         printf("\"");
1230                 break;
1231         }
1232
1233 ###### propagate exec cases
1234         case Xval:
1235         {
1236                 struct val *val = cast(val, prog);
1237                 if (!type_compat(type, val->vtype, rules))
1238                         type_err(c, "error: expected %1%r found %2",
1239                                    prog, type, rules, val->vtype);
1240                 return val->vtype;
1241         }
1242
1243 ###### interp exec cases
1244         case Xval:
1245                 rvtype = cast(val, e)->vtype;
1246                 dup_value(rvtype, &cast(val, e)->val, &rv);
1247                 break;
1248
1249 ###### ast functions
1250         static void free_val(struct val *v)
1251         {
1252                 if (v)
1253                         free_value(v->vtype, &v->val);
1254                 free(v);
1255         }
1256
1257 ###### free exec cases
1258         case Xval: free_val(cast(val, e)); break;
1259
1260 ###### ast functions
1261         // Move all nodes from 'b' to 'rv', reversing their order.
1262         // In 'b' 'left' is a list, and 'right' is the last node.
1263         // In 'rv', left' is the first node and 'right' is a list.
1264         static struct binode *reorder_bilist(struct binode *b)
1265         {
1266                 struct binode *rv = NULL;
1267
1268                 while (b) {
1269                         struct exec *t = b->right;
1270                         b->right = rv;
1271                         rv = b;
1272                         if (b->left)
1273                                 b = cast(binode, b->left);
1274                         else
1275                                 b = NULL;
1276                         rv->left = t;
1277                 }
1278                 return rv;
1279         }
1280
1281 ### Variables
1282
1283 Variables are scoped named values.  We store the names in a linked list
1284 of "bindings" sorted in lexical order, and use sequential search and
1285 insertion sort.
1286
1287 ###### ast
1288
1289         struct binding {
1290                 struct text name;
1291                 struct binding *next;   // in lexical order
1292                 ## binding fields
1293         };
1294
1295 This linked list is stored in the parse context so that "reduce"
1296 functions can find or add variables, and so the analysis phase can
1297 ensure that every variable gets a type.
1298
1299 ###### parse context
1300
1301         struct binding *varlist;  // In lexical order
1302
1303 ###### ast functions
1304
1305         static struct binding *find_binding(struct parse_context *c, struct text s)
1306         {
1307                 struct binding **l = &c->varlist;
1308                 struct binding *n;
1309                 int cmp = 1;
1310
1311                 while (*l &&
1312                         (cmp = text_cmp((*l)->name, s)) < 0)
1313                                 l = & (*l)->next;
1314                 if (cmp == 0)
1315                         return *l;
1316                 n = calloc(1, sizeof(*n));
1317                 n->name = s;
1318                 n->next = *l;
1319                 *l = n;
1320                 return n;
1321         }
1322
1323 Each name can be linked to multiple variables defined in different
1324 scopes.  Each scope starts where the name is declared and continues
1325 until the end of the containing code block.  Scopes of a given name
1326 cannot nest, so a declaration while a name is in-scope is an error.
1327
1328 ###### binding fields
1329         struct variable *var;
1330
1331 ###### ast
1332         struct variable {
1333                 struct variable *previous;
1334                 struct type *type;
1335                 struct binding *name;
1336                 struct exec *where_decl;// where name was declared
1337                 struct exec *where_set; // where type was set
1338                 ## variable fields
1339         };
1340
1341 When a scope closes, the values of the variables might need to be freed.
1342 This happens in the context of some `struct exec` and each `exec` will
1343 need to know which variables need to be freed when it completes.
1344
1345 ####### exec fields
1346         struct variable *to_free;
1347
1348 ####### variable fields
1349         struct exec *cleanup_exec;
1350         struct variable *next_free;
1351
1352 ####### interp exec cleanup
1353         {
1354                 struct variable *v;
1355                 for (v = e->to_free; v; v = v->next_free) {
1356                         struct value *val = var_value(c, v);
1357                         free_value(v->type, val);
1358                 }
1359         }
1360
1361 ###### ast functions
1362         static void variable_unlink_exec(struct variable *v)
1363         {
1364                 struct variable **vp;
1365                 if (!v->cleanup_exec)
1366                         return;
1367                 for (vp = &v->cleanup_exec->to_free;
1368                     *vp; vp = &(*vp)->next_free) {
1369                         if (*vp != v)
1370                                 continue;
1371                         *vp = v->next_free;
1372                         v->cleanup_exec = NULL;
1373                         break;
1374                 }
1375         }
1376
1377 While the naming seems strange, we include local constants in the
1378 definition of variables.  A name declared `var := value` can
1379 subsequently be changed, but a name declared `var ::= value` cannot -
1380 it is constant
1381
1382 ###### variable fields
1383         int constant;
1384
1385 Scopes in parallel branches can be partially merged.  More
1386 specifically, if a given name is declared in both branches of an
1387 if/else then its scope is a candidate for merging.  Similarly if
1388 every branch of an exhaustive switch (e.g. has an "else" clause)
1389 declares a given name, then the scopes from the branches are
1390 candidates for merging.
1391
1392 Note that names declared inside a loop (which is only parallel to
1393 itself) are never visible after the loop.  Similarly names defined in
1394 scopes which are not parallel, such as those started by `for` and
1395 `switch`, are never visible after the scope.  Only variables defined in
1396 both `then` and `else` (including the implicit then after an `if`, and
1397 excluding `then` used with `for`) and in all `case`s and `else` of a
1398 `switch` or `while` can be visible beyond the `if`/`switch`/`while`.
1399
1400 Labels, which are a bit like variables, follow different rules.
1401 Labels are not explicitly declared, but if an undeclared name appears
1402 in a context where a label is legal, that effectively declares the
1403 name as a label.  The declaration remains in force (or in scope) at
1404 least to the end of the immediately containing block and conditionally
1405 in any larger containing block which does not declare the name in some
1406 other way.  Importantly, the conditional scope extension happens even
1407 if the label is only used in one parallel branch of a conditional --
1408 when used in one branch it is treated as having been declared in all
1409 branches.
1410
1411 Merge candidates are tentatively visible beyond the end of the
1412 branching statement which creates them.  If the name is used, the
1413 merge is affirmed and they become a single variable visible at the
1414 outer layer.  If not - if it is redeclared first - the merge lapses.
1415
1416 To track scopes we have an extra stack, implemented as a linked list,
1417 which roughly parallels the parse stack and which is used exclusively
1418 for scoping.  When a new scope is opened, a new frame is pushed and
1419 the child-count of the parent frame is incremented.  This child-count
1420 is used to distinguish between the first of a set of parallel scopes,
1421 in which declared variables must not be in scope, and subsequent
1422 branches, whether they may already be conditionally scoped.
1423
1424 We need a total ordering of scopes so we can easily compare to variables
1425 to see if they are concurrently in scope.  To achieve this we record a
1426 `scope_count` which is actually a count of both beginnings and endings
1427 of scopes.  Then each variable has a record of the scope count where it
1428 enters scope, and where it leaves.
1429
1430 To push a new frame *before* any code in the frame is parsed, we need a
1431 grammar reduction.  This is most easily achieved with a grammar
1432 element which derives the empty string, and creates the new scope when
1433 it is recognised.  This can be placed, for example, between a keyword
1434 like "if" and the code following it.
1435
1436 ###### ast
1437         struct scope {
1438                 struct scope *parent;
1439                 int child_count;
1440         };
1441
1442 ###### parse context
1443         int scope_depth;
1444         int scope_count;
1445         struct scope *scope_stack;
1446
1447 ###### variable fields
1448         int scope_start, scope_end;
1449
1450 ###### ast functions
1451         static void scope_pop(struct parse_context *c)
1452         {
1453                 struct scope *s = c->scope_stack;
1454
1455                 c->scope_stack = s->parent;
1456                 free(s);
1457                 c->scope_depth -= 1;
1458                 c->scope_count += 1;
1459         }
1460
1461         static void scope_push(struct parse_context *c)
1462         {
1463                 struct scope *s = calloc(1, sizeof(*s));
1464                 if (c->scope_stack)
1465                         c->scope_stack->child_count += 1;
1466                 s->parent = c->scope_stack;
1467                 c->scope_stack = s;
1468                 c->scope_depth += 1;
1469                 c->scope_count += 1;
1470         }
1471
1472 ###### Grammar
1473
1474         $void
1475         OpenScope -> ${ scope_push(c); }$
1476
1477 Each variable records a scope depth and is in one of four states:
1478
1479 - "in scope".  This is the case between the declaration of the
1480   variable and the end of the containing block, and also between
1481   the usage with affirms a merge and the end of that block.
1482
1483   The scope depth is not greater than the current parse context scope
1484   nest depth.  When the block of that depth closes, the state will
1485   change.  To achieve this, all "in scope" variables are linked
1486   together as a stack in nesting order.
1487
1488 - "pending".  The "in scope" block has closed, but other parallel
1489   scopes are still being processed.  So far, every parallel block at
1490   the same level that has closed has declared the name.
1491
1492   The scope depth is the depth of the last parallel block that
1493   enclosed the declaration, and that has closed.
1494
1495 - "conditionally in scope".  The "in scope" block and all parallel
1496   scopes have closed, and no further mention of the name has been seen.
1497   This state includes a secondary nest depth (`min_depth`) which records
1498   the outermost scope seen since the variable became conditionally in
1499   scope.  If a use of the name is found, the variable becomes "in scope"
1500   and that secondary depth becomes the recorded scope depth.  If the
1501   name is declared as a new variable, the old variable becomes "out of
1502   scope" and the recorded scope depth stays unchanged.
1503
1504 - "out of scope".  The variable is neither in scope nor conditionally
1505   in scope.  It is permanently out of scope now and can be removed from
1506   the "in scope" stack.  When a variable becomes out-of-scope it is
1507   moved to a separate list (`out_scope`) of variables which have fully
1508   known scope.  This will be used at the end of each function to assign
1509   each variable a place in the stack frame.
1510
1511 ###### variable fields
1512         int depth, min_depth;
1513         enum { OutScope, PendingScope, CondScope, InScope } scope;
1514         struct variable *in_scope;
1515
1516 ###### parse context
1517
1518         struct variable *in_scope;
1519         struct variable *out_scope;
1520
1521 All variables with the same name are linked together using the
1522 'previous' link.  Those variable that have been affirmatively merged all
1523 have a 'merged' pointer that points to one primary variable - the most
1524 recently declared instance.  When merging variables, we need to also
1525 adjust the 'merged' pointer on any other variables that had previously
1526 been merged with the one that will no longer be primary.
1527
1528 A variable that is no longer the most recent instance of a name may
1529 still have "pending" scope, if it might still be merged with most
1530 recent instance.  These variables don't really belong in the
1531 "in_scope" list, but are not immediately removed when a new instance
1532 is found.  Instead, they are detected and ignored when considering the
1533 list of in_scope names.
1534
1535 The storage of the value of a variable will be described later.  For now
1536 we just need to know that when a variable goes out of scope, it might
1537 need to be freed.  For this we need to be able to find it, so assume that
1538 `var_value()` will provide that.
1539
1540 ###### variable fields
1541         struct variable *merged;
1542
1543 ###### ast functions
1544
1545         static void variable_merge(struct variable *primary, struct variable *secondary)
1546         {
1547                 struct variable *v;
1548
1549                 primary = primary->merged;
1550
1551                 for (v = primary->previous; v; v=v->previous)
1552                         if (v == secondary || v == secondary->merged ||
1553                             v->merged == secondary ||
1554                             v->merged == secondary->merged) {
1555                                 v->scope = OutScope;
1556                                 v->merged = primary;
1557                                 if (v->scope_start < primary->scope_start)
1558                                         primary->scope_start = v->scope_start;
1559                                 if (v->scope_end > primary->scope_end)
1560                                         primary->scope_end = v->scope_end;      // NOTEST
1561                                 variable_unlink_exec(v);
1562                         }
1563         }
1564
1565 ###### forward decls
1566         static struct value *var_value(struct parse_context *c, struct variable *v);
1567
1568 ###### free global vars
1569
1570         while (context.varlist) {
1571                 struct binding *b = context.varlist;
1572                 struct variable *v = b->var;
1573                 context.varlist = b->next;
1574                 free(b);
1575                 while (v) {
1576                         struct variable *next = v->previous;
1577
1578                         if (v->global) {
1579                                 free_value(v->type, var_value(&context, v));
1580                                 if (v->depth == 0)
1581                                         // This is a global constant
1582                                         free_exec(v->where_decl);
1583                         }
1584                         free(v);
1585                         v = next;
1586                 }
1587         }
1588
1589 #### Manipulating Bindings
1590
1591 When a name is conditionally visible, a new declaration discards the old
1592 binding - the condition lapses.  Similarly when we reach the end of a
1593 function (outermost non-global scope) any conditional scope must lapse.
1594 Conversely a usage of the name affirms the visibility and extends it to
1595 the end of the containing block - i.e.  the block that contains both the
1596 original declaration and the latest usage.  This is determined from
1597 `min_depth`.  When a conditionally visible variable gets affirmed like
1598 this, it is also merged with other conditionally visible variables with
1599 the same name.
1600
1601 When we parse a variable declaration we either report an error if the
1602 name is currently bound, or create a new variable at the current nest
1603 depth if the name is unbound or bound to a conditionally scoped or
1604 pending-scope variable.  If the previous variable was conditionally
1605 scoped, it and its homonyms becomes out-of-scope.
1606
1607 When we parse a variable reference (including non-declarative assignment
1608 "foo = bar") we report an error if the name is not bound or is bound to
1609 a pending-scope variable; update the scope if the name is bound to a
1610 conditionally scoped variable; or just proceed normally if the named
1611 variable is in scope.
1612
1613 When we exit a scope, any variables bound at this level are either
1614 marked out of scope or pending-scoped, depending on whether the scope
1615 was sequential or parallel.  Here a "parallel" scope means the "then"
1616 or "else" part of a conditional, or any "case" or "else" branch of a
1617 switch.  Other scopes are "sequential".
1618
1619 When exiting a parallel scope we check if there are any variables that
1620 were previously pending and are still visible. If there are, then
1621 they weren't redeclared in the most recent scope, so they cannot be
1622 merged and must become out-of-scope.  If it is not the first of
1623 parallel scopes (based on `child_count`), we check that there was a
1624 previous binding that is still pending-scope.  If there isn't, the new
1625 variable must now be out-of-scope.
1626
1627 When exiting a sequential scope that immediately enclosed parallel
1628 scopes, we need to resolve any pending-scope variables.  If there was
1629 no `else` clause, and we cannot determine that the `switch` was exhaustive,
1630 we need to mark all pending-scope variable as out-of-scope.  Otherwise
1631 all pending-scope variables become conditionally scoped.
1632
1633 ###### ast
1634         enum closetype { CloseSequential, CloseFunction, CloseParallel, CloseElse };
1635
1636 ###### ast functions
1637
1638         static struct variable *var_decl(struct parse_context *c, struct text s)
1639         {
1640                 struct binding *b = find_binding(c, s);
1641                 struct variable *v = b->var;
1642
1643                 switch (v ? v->scope : OutScope) {
1644                 case InScope:
1645                         /* Caller will report the error */
1646                         return NULL;
1647                 case CondScope:
1648                         for (;
1649                              v && v->scope == CondScope;
1650                              v = v->previous)
1651                                 v->scope = OutScope;
1652                         break;
1653                 default: break;
1654                 }
1655                 v = calloc(1, sizeof(*v));
1656                 v->previous = b->var;
1657                 b->var = v;
1658                 v->name = b;
1659                 v->merged = v;
1660                 v->min_depth = v->depth = c->scope_depth;
1661                 v->scope = InScope;
1662                 v->in_scope = c->in_scope;
1663                 v->scope_start = c->scope_count;
1664                 c->in_scope = v;
1665                 ## variable init
1666                 return v;
1667         }
1668
1669         static struct variable *var_ref(struct parse_context *c, struct text s)
1670         {
1671                 struct binding *b = find_binding(c, s);
1672                 struct variable *v = b->var;
1673                 struct variable *v2;
1674
1675                 switch (v ? v->scope : OutScope) {
1676                 case OutScope:
1677                 case PendingScope:
1678                         /* Caller will report the error */
1679                         return NULL;
1680                 case CondScope:
1681                         /* All CondScope variables of this name need to be merged
1682                          * and become InScope
1683                          */
1684                         v->depth = v->min_depth;
1685                         v->scope = InScope;
1686                         for (v2 = v->previous;
1687                              v2 && v2->scope == CondScope;
1688                              v2 = v2->previous)
1689                                 variable_merge(v, v2);
1690                         break;
1691                 case InScope:
1692                         break;
1693                 }
1694                 return v;
1695         }
1696
1697         static int var_refile(struct parse_context *c, struct variable *v)
1698         {
1699                 /* Variable just went out of scope.  Add it to the out_scope
1700                  * list, sorted by ->scope_start
1701                  */
1702                 struct variable **vp = &c->out_scope;
1703                 while ((*vp) && (*vp)->scope_start < v->scope_start)
1704                         vp = &(*vp)->in_scope;
1705                 v->in_scope = *vp;
1706                 *vp = v;
1707                 return 0;
1708         }
1709
1710         static void var_block_close(struct parse_context *c, enum closetype ct,
1711                                     struct exec *e)
1712         {
1713                 /* Close off all variables that are in_scope.
1714                  * Some variables in c->scope may already be not-in-scope,
1715                  * such as when a PendingScope variable is hidden by a new
1716                  * variable with the same name.
1717                  * So we check for v->name->var != v and drop them.
1718                  * If we choose to make a variable OutScope, we drop it
1719                  * immediately too.
1720                  */
1721                 struct variable *v, **vp, *v2;
1722
1723                 scope_pop(c);
1724                 for (vp = &c->in_scope;
1725                      (v = *vp) && v->min_depth > c->scope_depth;
1726                      (v->scope == OutScope || v->name->var != v)
1727                      ? (*vp =  v->in_scope, var_refile(c, v))
1728                      : ( vp = &v->in_scope, 0)) {
1729                         v->min_depth = c->scope_depth;
1730                         if (v->name->var != v)
1731                                 /* This is still in scope, but we haven't just
1732                                  * closed the scope.
1733                                  */
1734                                 continue;
1735                         v->min_depth = c->scope_depth;
1736                         if (v->scope == InScope)
1737                                 v->scope_end = c->scope_count;
1738                         if (v->scope == InScope && e && !v->global) {
1739                                 /* This variable gets cleaned up when 'e' finishes */
1740                                 variable_unlink_exec(v);
1741                                 v->cleanup_exec = e;
1742                                 v->next_free = e->to_free;
1743                                 e->to_free = v;
1744                         }
1745                         switch (ct) {
1746                         case CloseElse:
1747                         case CloseParallel: /* handle PendingScope */
1748                                 switch(v->scope) {
1749                                 case InScope:
1750                                 case CondScope:
1751                                         if (c->scope_stack->child_count == 1)
1752                                                 /* first among parallel branches */
1753                                                 v->scope = PendingScope;
1754                                         else if (v->previous &&
1755                                                  v->previous->scope == PendingScope)
1756                                                 /* all previous branches used name */
1757                                                 v->scope = PendingScope;
1758                                         else if (v->type == Tlabel)
1759                                                 /* Labels remain pending even when not used */
1760                                                 v->scope = PendingScope;        // UNTESTED
1761                                         else
1762                                                 v->scope = OutScope;
1763                                         if (ct == CloseElse) {
1764                                                 /* All Pending variables with this name
1765                                                  * are now Conditional */
1766                                                 for (v2 = v;
1767                                                      v2 && v2->scope == PendingScope;
1768                                                      v2 = v2->previous)
1769                                                         v2->scope = CondScope;
1770                                         }
1771                                         break;
1772                                 case PendingScope:
1773                                         /* Not possible as it would require
1774                                          * parallel scope to be nested immediately
1775                                          * in a parallel scope, and that never
1776                                          * happens.
1777                                          */                     // NOTEST
1778                                 case OutScope:
1779                                         /* Not possible as we already tested for
1780                                          * OutScope
1781                                          */
1782                                         abort();                // NOTEST
1783                                 }
1784                                 break;
1785                         case CloseFunction:
1786                                 if (v->scope == CondScope)
1787                                         /* Condition cannot continue past end of function */
1788                                         v->scope = InScope;
1789                                 /* fallthrough */
1790                         case CloseSequential:
1791                                 if (v->type == Tlabel)
1792                                         v->scope = PendingScope;
1793                                 switch (v->scope) {
1794                                 case InScope:
1795                                         v->scope = OutScope;
1796                                         break;
1797                                 case PendingScope:
1798                                         /* There was no 'else', so we can only become
1799                                          * conditional if we know the cases were exhaustive,
1800                                          * and that doesn't mean anything yet.
1801                                          * So only labels become conditional..
1802                                          */
1803                                         for (v2 = v;
1804                                              v2 && v2->scope == PendingScope;
1805                                              v2 = v2->previous)
1806                                                 if (v2->type == Tlabel)
1807                                                         v2->scope = CondScope;
1808                                                 else
1809                                                         v2->scope = OutScope;
1810                                         break;
1811                                 case CondScope:
1812                                 case OutScope: break;
1813                                 }
1814                                 break;
1815                         }
1816                 }
1817         }
1818
1819 #### Storing Values
1820
1821 The value of a variable is store separately from the variable, on an
1822 analogue of a stack frame.  There are (currently) two frames that can be
1823 active.  A global frame which currently only stores constants, and a
1824 stacked frame which stores local variables.  Each variable knows if it
1825 is global or not, and what its index into the frame is.
1826
1827 Values in the global frame are known immediately they are relevant, so
1828 the frame needs to be reallocated as it grows so it can store those
1829 values.  The local frame doesn't get values until the interpreted phase
1830 is started, so there is no need to allocate until the size is known.
1831
1832 We initialize the `frame_pos` to an impossible value, so that we can
1833 tell if it was set or not later.
1834
1835 ###### variable fields
1836         short frame_pos;
1837         short global;
1838
1839 ###### variable init
1840         v->frame_pos = -1;
1841
1842 ###### parse context
1843
1844         short global_size, global_alloc;
1845         short local_size;
1846         void *global, *local;
1847
1848 ###### ast functions
1849
1850         static struct value *var_value(struct parse_context *c, struct variable *v)
1851         {
1852                 if (!v->global) {
1853                         if (!c->local || !v->type)
1854                                 return NULL;                    // NOTEST
1855                         if (v->frame_pos + v->type->size > c->local_size) {
1856                                 printf("INVALID frame_pos\n");  // NOTEST
1857                                 exit(2);                        // NOTEST
1858                         }
1859                         return c->local + v->frame_pos;
1860                 }
1861                 if (c->global_size > c->global_alloc) {
1862                         int old = c->global_alloc;
1863                         c->global_alloc = (c->global_size | 1023) + 1024;
1864                         c->global = realloc(c->global, c->global_alloc);
1865                         memset(c->global + old, 0, c->global_alloc - old);
1866                 }
1867                 return c->global + v->frame_pos;
1868         }
1869
1870         static struct value *global_alloc(struct parse_context *c, struct type *t,
1871                                           struct variable *v, struct value *init)
1872         {
1873                 struct value *ret;
1874                 struct variable scratch;
1875
1876                 if (t->prepare_type)
1877                         t->prepare_type(c, t, 1);       // NOTEST
1878
1879                 if (c->global_size & (t->align - 1))
1880                         c->global_size = (c->global_size + t->align) & ~(t->align-1);
1881                 if (!v) {
1882                         v = &scratch;
1883                         v->type = t;
1884                 }
1885                 v->frame_pos = c->global_size;
1886                 v->global = 1;
1887                 c->global_size += v->type->size;
1888                 ret = var_value(c, v);
1889                 if (init)
1890                         memcpy(ret, init, t->size);
1891                 else
1892                         val_init(t, ret);
1893                 return ret;
1894         }
1895
1896 As global values are found -- struct field initializers, labels etc --
1897 `global_alloc()` is called to record the value in the global frame.
1898
1899 When the program is fully parsed, each function is analysed, we need to
1900 walk the list of variables local to that function and assign them an
1901 offset in the stack frame.  For this we have `scope_finalize()`.
1902
1903 We keep the stack from dense by re-using space for between variables
1904 that are not in scope at the same time.  The `out_scope` list is sorted
1905 by `scope_start` and as we process a varible, we move it to an FIFO
1906 stack.  For each variable we consider, we first discard any from the
1907 stack anything that went out of scope before the new variable came in.
1908 Then we place the new variable just after the one at the top of the
1909 stack.
1910
1911 ###### ast functions
1912
1913         static void scope_finalize(struct parse_context *c, struct type *ft)
1914         {
1915                 int size = ft->function.local_size;
1916                 struct variable *next = ft->function.scope;
1917                 struct variable *done = NULL;
1918
1919                 while (next) {
1920                         struct variable *v = next;
1921                         struct type *t = v->type;
1922                         int pos;
1923                         next = v->in_scope;
1924                         if (v->merged != v)
1925                                 continue;
1926                         if (!t)
1927                                 continue;
1928                         if (v->frame_pos >= 0)
1929                                 continue;
1930                         while (done && done->scope_end < v->scope_start)
1931                                 done = done->in_scope;
1932                         if (done)
1933                                 pos = done->frame_pos + done->type->size;
1934                         else
1935                                 pos = ft->function.local_size;
1936                         if (pos & (t->align - 1))
1937                                 pos = (pos + t->align) & ~(t->align-1);
1938                         v->frame_pos = pos;
1939                         if (size < pos + v->type->size)
1940                                 size = pos + v->type->size;
1941                         v->in_scope = done;
1942                         done = v;
1943                 }
1944                 c->out_scope = NULL;
1945                 ft->function.local_size = size;
1946         }
1947
1948 ###### free context storage
1949         free(context.global);
1950
1951 #### Variables as executables
1952
1953 Just as we used a `val` to wrap a value into an `exec`, we similarly
1954 need a `var` to wrap a `variable` into an exec.  While each `val`
1955 contained a copy of the value, each `var` holds a link to the variable
1956 because it really is the same variable no matter where it appears.
1957 When a variable is used, we need to remember to follow the `->merged`
1958 link to find the primary instance.
1959
1960 When a variable is declared, it may or may not be given an explicit
1961 type.  We need to record which so that we can report the parsed code
1962 correctly.
1963
1964 ###### exec type
1965         Xvar,
1966
1967 ###### ast
1968         struct var {
1969                 struct exec;
1970                 struct variable *var;
1971         };
1972
1973 ###### variable fields
1974         int explicit_type;
1975
1976 ###### Grammar
1977
1978         $TERM : ::
1979
1980         $*var
1981         VariableDecl -> IDENTIFIER : ${ {
1982                 struct variable *v = var_decl(c, $1.txt);
1983                 $0 = new_pos(var, $1);
1984                 $0->var = v;
1985                 if (v)
1986                         v->where_decl = $0;
1987                 else {
1988                         v = var_ref(c, $1.txt);
1989                         $0->var = v;
1990                         type_err(c, "error: variable '%v' redeclared",
1991                                  $0, NULL, 0, NULL);
1992                         type_err(c, "info: this is where '%v' was first declared",
1993                                  v->where_decl, NULL, 0, NULL);
1994                 }
1995         } }$
1996         | IDENTIFIER :: ${ {
1997                 struct variable *v = var_decl(c, $1.txt);
1998                 $0 = new_pos(var, $1);
1999                 $0->var = v;
2000                 if (v) {
2001                         v->where_decl = $0;
2002                         v->constant = 1;
2003                 } else {
2004                         v = var_ref(c, $1.txt);
2005                         $0->var = v;
2006                         type_err(c, "error: variable '%v' redeclared",
2007                                  $0, NULL, 0, NULL);
2008                         type_err(c, "info: this is where '%v' was first declared",
2009                                  v->where_decl, NULL, 0, NULL);
2010                 }
2011         } }$
2012         | IDENTIFIER : Type ${ {
2013                 struct variable *v = var_decl(c, $1.txt);
2014                 $0 = new_pos(var, $1);
2015                 $0->var = v;
2016                 if (v) {
2017                         v->where_decl = $0;
2018                         v->where_set = $0;
2019                         v->type = $<Type;
2020                         v->explicit_type = 1;
2021                 } else {
2022                         v = var_ref(c, $1.txt);
2023                         $0->var = v;
2024                         type_err(c, "error: variable '%v' redeclared",
2025                                  $0, NULL, 0, NULL);
2026                         type_err(c, "info: this is where '%v' was first declared",
2027                                  v->where_decl, NULL, 0, NULL);
2028                 }
2029         } }$
2030         | IDENTIFIER :: Type ${ {
2031                 struct variable *v = var_decl(c, $1.txt);
2032                 $0 = new_pos(var, $1);
2033                 $0->var = v;
2034                 if (v) {
2035                         v->where_decl = $0;
2036                         v->where_set = $0;
2037                         v->type = $<Type;
2038                         v->constant = 1;
2039                         v->explicit_type = 1;
2040                 } else {
2041                         v = var_ref(c, $1.txt);
2042                         $0->var = v;
2043                         type_err(c, "error: variable '%v' redeclared",
2044                                  $0, NULL, 0, NULL);
2045                         type_err(c, "info: this is where '%v' was first declared",
2046                                  v->where_decl, NULL, 0, NULL);
2047                 }
2048         } }$
2049
2050         $*exec
2051         Variable -> IDENTIFIER ${ {
2052                 struct variable *v = var_ref(c, $1.txt);
2053                 $0 = new_pos(var, $1);
2054                 if (v == NULL) {
2055                         /* This might be a label - allocate a var just in case */
2056                         v = var_decl(c, $1.txt);
2057                         if (v) {
2058                                 v->type = Tnone;
2059                                 v->where_decl = $0;
2060                                 v->where_set = $0;
2061                         }
2062                 }
2063                 cast(var, $0)->var = v;
2064         } }$
2065
2066 ###### print exec cases
2067         case Xvar:
2068         {
2069                 struct var *v = cast(var, e);
2070                 if (v->var) {
2071                         struct binding *b = v->var->name;
2072                         printf("%.*s", b->name.len, b->name.txt);
2073                 }
2074                 break;
2075         }
2076
2077 ###### format cases
2078         case 'v':
2079                 if (loc && loc->type == Xvar) {
2080                         struct var *v = cast(var, loc);
2081                         if (v->var) {
2082                                 struct binding *b = v->var->name;
2083                                 fprintf(stderr, "%.*s", b->name.len, b->name.txt);
2084                         } else
2085                                 fputs("???", stderr);   // NOTEST
2086                 } else
2087                         fputs("NOTVAR", stderr);
2088                 break;
2089
2090 ###### propagate exec cases
2091
2092         case Xvar:
2093         {
2094                 struct var *var = cast(var, prog);
2095                 struct variable *v = var->var;
2096                 if (!v) {
2097                         type_err(c, "%d:BUG: no variable!!", prog, NULL, 0, NULL); // NOTEST
2098                         return Tnone;                                   // NOTEST
2099                 }
2100                 v = v->merged;
2101                 if (v->constant && (rules & Rnoconstant)) {
2102                         type_err(c, "error: Cannot assign to a constant: %v",
2103                                  prog, NULL, 0, NULL);
2104                         type_err(c, "info: name was defined as a constant here",
2105                                  v->where_decl, NULL, 0, NULL);
2106                         return v->type;
2107                 }
2108                 if (v->type == Tnone && v->where_decl == prog)
2109                         type_err(c, "error: variable used but not declared: %v",
2110                                  prog, NULL, 0, NULL);
2111                 if (v->type == NULL) {
2112                         if (type && *ok != 0) {
2113                                 v->type = type;
2114                                 v->where_set = prog;
2115                                 *ok = 2;
2116                         }
2117                         return type;
2118                 }
2119                 if (!type_compat(type, v->type, rules)) {
2120                         type_err(c, "error: expected %1%r but variable '%v' is %2", prog,
2121                                  type, rules, v->type);
2122                         type_err(c, "info: this is where '%v' was set to %1", v->where_set,
2123                                  v->type, rules, NULL);
2124                 }
2125                 if (!type)
2126                         return v->type;
2127                 return type;
2128         }
2129
2130 ###### interp exec cases
2131         case Xvar:
2132         {
2133                 struct var *var = cast(var, e);
2134                 struct variable *v = var->var;
2135
2136                 v = v->merged;
2137                 lrv = var_value(c, v);
2138                 rvtype = v->type;
2139                 break;
2140         }
2141
2142 ###### ast functions
2143
2144         static void free_var(struct var *v)
2145         {
2146                 free(v);
2147         }
2148
2149 ###### free exec cases
2150         case Xvar: free_var(cast(var, e)); break;
2151
2152
2153 ### Complex types
2154
2155 Now that we have the shape of the interpreter in place we can add some
2156 complex types and connected them in to the data structures and the
2157 different phases of parse, analyse, print, interpret.
2158
2159 Being "complex" the language will naturally have syntax to access
2160 specifics of objects of these types.  These will fit into the grammar as
2161 "Terms" which are the things that are combined with various operators to
2162 form "Expression".  Where a Term is formed by some operation on another
2163 Term, the subordinate Term will always come first, so for example a
2164 member of an array will be expressed as the Term for the array followed
2165 by an index in square brackets.  The strict rule of using postfix
2166 operations makes precedence irrelevant within terms.  To provide a place
2167 to put the grammar for each terms of each type, we will start out by
2168 introducing the "Term" grammar production, with contains at least a
2169 simple "Value" (to be explained later).
2170
2171 ###### Grammar
2172         $*exec
2173         Term ->  Value ${ $0 = $<1; }$
2174         | Variable ${ $0 = $<1; }$
2175         ## term grammar
2176
2177 Thus far the complex types we have are arrays and structs.
2178
2179 #### Arrays
2180
2181 Arrays can be declared by giving a size and a type, as `[size]type' so
2182 `freq:[26]number` declares `freq` to be an array of 26 numbers.  The
2183 size can be either a literal number, or a named constant.  Some day an
2184 arbitrary expression will be supported.
2185
2186 As a formal parameter to a function, the array can be declared with a
2187 new variable as the size: `name:[size::number]string`.  The `size`
2188 variable is set to the size of the array and must be a constant.  As
2189 `number` is the only supported type, it can be left out:
2190 `name:[size::]string`.
2191
2192 Arrays cannot be assigned.  When pointers are introduced we will also
2193 introduce array slices which can refer to part or all of an array -
2194 the assignment syntax will create a slice.  For now, an array can only
2195 ever be referenced by the name it is declared with.  It is likely that
2196 a "`copy`" primitive will eventually be define which can be used to
2197 make a copy of an array with controllable recursive depth.
2198
2199 For now we have two sorts of array, those with fixed size either because
2200 it is given as a literal number or because it is a struct member (which
2201 cannot have a runtime-changing size), and those with a size that is
2202 determined at runtime - local variables with a const size.  The former
2203 have their size calculated at parse time, the latter at run time.
2204
2205 For the latter type, the `size` field of the type is the size of a
2206 pointer, and the array is reallocated every time it comes into scope.
2207
2208 We differentiate struct fields with a const size from local variables
2209 with a const size by whether they are prepared at parse time or not.
2210
2211 ###### type union fields
2212
2213         struct {
2214                 int unspec;     // size is unspecified - vsize must be set.
2215                 short size;
2216                 short static_size;
2217                 struct variable *vsize;
2218                 struct type *member;
2219         } array;
2220
2221 ###### value union fields
2222         void *array;  // used if not static_size
2223
2224 ###### value functions
2225
2226         static void array_prepare_type(struct parse_context *c, struct type *type,
2227                                        int parse_time)
2228         {
2229                 struct value *vsize;
2230                 mpz_t q;
2231                 if (!type->array.vsize || type->array.static_size)
2232                         return;
2233
2234                 vsize = var_value(c, type->array.vsize);
2235                 mpz_init(q);
2236                 mpz_tdiv_q(q, mpq_numref(vsize->num), mpq_denref(vsize->num));
2237                 type->array.size = mpz_get_si(q);
2238                 mpz_clear(q);
2239
2240                 if (parse_time) {
2241                         type->array.static_size = 1;
2242                         type->size = type->array.size * type->array.member->size;
2243                         type->align = type->array.member->align;
2244                 }
2245         }
2246
2247         static void array_init(struct type *type, struct value *val)
2248         {
2249                 int i;
2250                 void *ptr = val->ptr;
2251
2252                 if (!val)
2253                         return;                         // NOTEST
2254                 if (!type->array.static_size) {
2255                         val->array = calloc(type->array.size,
2256                                             type->array.member->size);
2257                         ptr = val->array;
2258                 }
2259                 for (i = 0; i < type->array.size; i++) {
2260                         struct value *v;
2261                         v = (void*)ptr + i * type->array.member->size;
2262                         val_init(type->array.member, v);
2263                 }
2264         }
2265
2266         static void array_free(struct type *type, struct value *val)
2267         {
2268                 int i;
2269                 void *ptr = val->ptr;
2270
2271                 if (!type->array.static_size)
2272                         ptr = val->array;
2273                 for (i = 0; i < type->array.size; i++) {
2274                         struct value *v;
2275                         v = (void*)ptr + i * type->array.member->size;
2276                         free_value(type->array.member, v);
2277                 }
2278                 if (!type->array.static_size)
2279                         free(ptr);
2280         }
2281
2282         static int array_compat(struct type *require, struct type *have)
2283         {
2284                 if (have->compat != require->compat)
2285                         return 0;
2286                 /* Both are arrays, so we can look at details */
2287                 if (!type_compat(require->array.member, have->array.member, 0))
2288                         return 0;
2289                 if (have->array.unspec && require->array.unspec) {
2290                         if (have->array.vsize && require->array.vsize &&
2291                             have->array.vsize != require->array.vsize)  // UNTESTED
2292                                 /* sizes might not be the same */
2293                                 return 0;       // UNTESTED
2294                         return 1;
2295                 }
2296                 if (have->array.unspec || require->array.unspec)
2297                         return 1;       // UNTESTED
2298                 if (require->array.vsize == NULL && have->array.vsize == NULL)
2299                         return require->array.size == have->array.size;
2300
2301                 return require->array.vsize == have->array.vsize;       // UNTESTED
2302         }
2303
2304         static void array_print_type(struct type *type, FILE *f)
2305         {
2306                 fputs("[", f);
2307                 if (type->array.vsize) {
2308                         struct binding *b = type->array.vsize->name;
2309                         fprintf(f, "%.*s%s]", b->name.len, b->name.txt,
2310                                 type->array.unspec ? "::" : "");
2311                 } else if (type->array.size)
2312                         fprintf(f, "%d]", type->array.size);
2313                 else
2314                         fprintf(f, "]");
2315                 type_print(type->array.member, f);
2316         }
2317
2318         static struct type array_prototype = {
2319                 .init = array_init,
2320                 .prepare_type = array_prepare_type,
2321                 .print_type = array_print_type,
2322                 .compat = array_compat,
2323                 .free = array_free,
2324                 .size = sizeof(void*),
2325                 .align = sizeof(void*),
2326         };
2327
2328 ###### declare terminals
2329         $TERM [ ]
2330
2331 ###### type grammar
2332
2333         | [ NUMBER ] Type ${ {
2334                 char tail[3];
2335                 mpq_t num;
2336                 struct type *t;
2337                 int elements = 0;
2338
2339                 if (number_parse(num, tail, $2.txt) == 0)
2340                         tok_err(c, "error: unrecognised number", &$2);
2341                 else if (tail[0]) {
2342                         tok_err(c, "error: unsupported number suffix", &$2);
2343                         mpq_clear(num);
2344                 } else {
2345                         elements = mpz_get_ui(mpq_numref(num));
2346                         if (mpz_cmp_ui(mpq_denref(num), 1) != 0) {
2347                                 tok_err(c, "error: array size must be an integer",
2348                                         &$2);
2349                         } else if (mpz_cmp_ui(mpq_numref(num), 1UL << 30) >= 0)
2350                                 tok_err(c, "error: array size is too large",
2351                                         &$2);
2352                         mpq_clear(num);
2353                 }
2354
2355                 $0 = t = add_anon_type(c, &array_prototype, "array[%d]", elements );
2356                 t->array.size = elements;
2357                 t->array.member = $<4;
2358                 t->array.vsize = NULL;
2359                 t->array.static_size = 1;
2360                 t->size = t->array.size * t->array.member->size;
2361                 t->align = t->array.member->align;
2362         } }$
2363
2364         | [ IDENTIFIER ] Type ${ {
2365                 struct variable *v = var_ref(c, $2.txt);
2366
2367                 if (!v)
2368                         tok_err(c, "error: name undeclared", &$2);
2369                 else if (!v->constant)
2370                         tok_err(c, "error: array size must be a constant", &$2);
2371
2372                 $0 = add_anon_type(c, &array_prototype, "array[%.*s]", $2.txt.len, $2.txt.txt);
2373                 $0->array.member = $<4;
2374                 $0->array.size = 0;
2375                 $0->array.vsize = v;
2376         } }$
2377
2378 ###### Grammar
2379         $*type
2380         OptType -> Type ${ $0 = $<1; }$
2381                 | ${ $0 = NULL; }$
2382
2383 ###### formal type grammar
2384
2385         | [ IDENTIFIER :: OptType ] Type ${ {
2386                 struct variable *v = var_decl(c, $ID.txt);
2387
2388                 v->type = $<OT;
2389                 v->constant = 1;
2390                 if (!v->type)
2391                         v->type = Tnum;
2392                 $0 = add_anon_type(c, &array_prototype, "array[var]");
2393                 $0->array.member = $<6;
2394                 $0->array.size = 0;
2395                 $0->array.unspec = 1;
2396                 $0->array.vsize = v;
2397         } }$
2398
2399 ###### Binode types
2400         Index,
2401
2402 ###### term grammar
2403
2404         | Term [ Expression ] ${ {
2405                 struct binode *b = new(binode);
2406                 b->op = Index;
2407                 b->left = $<1;
2408                 b->right = $<3;
2409                 $0 = b;
2410         } }$
2411
2412 ###### print binode cases
2413         case Index:
2414                 print_exec(b->left, -1, bracket);
2415                 printf("[");
2416                 print_exec(b->right, -1, bracket);
2417                 printf("]");
2418                 break;
2419
2420 ###### propagate binode cases
2421         case Index:
2422                 /* left must be an array, right must be a number,
2423                  * result is the member type of the array
2424                  */
2425                 propagate_types(b->right, c, ok, Tnum, 0);
2426                 t = propagate_types(b->left, c, ok, NULL, rules & Rnoconstant);
2427                 if (!t || t->compat != array_compat) {
2428                         type_err(c, "error: %1 cannot be indexed", prog, t, 0, NULL);
2429                         return NULL;
2430                 } else {
2431                         if (!type_compat(type, t->array.member, rules)) {
2432                                 type_err(c, "error: have %1 but need %2", prog,
2433                                          t->array.member, rules, type);
2434                         }
2435                         return t->array.member;
2436                 }
2437                 break;
2438
2439 ###### interp binode cases
2440         case Index: {
2441                 mpz_t q;
2442                 long i;
2443                 void *ptr;
2444
2445                 lleft = linterp_exec(c, b->left, &ltype);
2446                 right = interp_exec(c, b->right, &rtype);
2447                 mpz_init(q);
2448                 mpz_tdiv_q(q, mpq_numref(right.num), mpq_denref(right.num));
2449                 i = mpz_get_si(q);
2450                 mpz_clear(q);
2451
2452                 if (ltype->array.static_size)
2453                         ptr = lleft;
2454                 else
2455                         ptr = *(void**)lleft;
2456                 rvtype = ltype->array.member;
2457                 if (i >= 0 && i < ltype->array.size)
2458                         lrv = ptr + i * rvtype->size;
2459                 else
2460                         val_init(ltype->array.member, &rv); // UNSAFE
2461                 ltype = NULL;
2462                 break;
2463         }
2464
2465 #### Structs
2466
2467 A `struct` is a data-type that contains one or more other data-types.
2468 It differs from an array in that each member can be of a different
2469 type, and they are accessed by name rather than by number.  Thus you
2470 cannot choose an element by calculation, you need to know what you
2471 want up-front.
2472
2473 The language makes no promises about how a given structure will be
2474 stored in memory - it is free to rearrange fields to suit whatever
2475 criteria seems important.
2476
2477 Structs are declared separately from program code - they cannot be
2478 declared in-line in a variable declaration like arrays can.  A struct
2479 is given a name and this name is used to identify the type - the name
2480 is not prefixed by the word `struct` as it would be in C.
2481
2482 Structs are only treated as the same if they have the same name.
2483 Simply having the same fields in the same order is not enough.  This
2484 might change once we can create structure initializers from a list of
2485 values.
2486
2487 Each component datum is identified much like a variable is declared,
2488 with a name, one or two colons, and a type.  The type cannot be omitted
2489 as there is no opportunity to deduce the type from usage.  An initial
2490 value can be given following an equals sign, so
2491
2492 ##### Example: a struct type
2493
2494         struct complex:
2495                 x:number = 0
2496                 y:number = 0
2497
2498 would declare a type called "complex" which has two number fields,
2499 each initialised to zero.
2500
2501 Struct will need to be declared separately from the code that uses
2502 them, so we will need to be able to print out the declaration of a
2503 struct when reprinting the whole program.  So a `print_type_decl` type
2504 function will be needed.
2505
2506 ###### type union fields
2507
2508         struct {
2509                 int nfields;
2510                 struct field {
2511                         struct text name;
2512                         struct type *type;
2513                         struct value *init;
2514                         int offset;
2515                 } *fields;
2516         } structure;
2517
2518 ###### type functions
2519         void (*print_type_decl)(struct type *type, FILE *f);
2520
2521 ###### value functions
2522
2523         static void structure_init(struct type *type, struct value *val)
2524         {
2525                 int i;
2526
2527                 for (i = 0; i < type->structure.nfields; i++) {
2528                         struct value *v;
2529                         v = (void*) val->ptr + type->structure.fields[i].offset;
2530                         if (type->structure.fields[i].init)
2531                                 dup_value(type->structure.fields[i].type,
2532                                           type->structure.fields[i].init,
2533                                           v);
2534                         else
2535                                 val_init(type->structure.fields[i].type, v);
2536                 }
2537         }
2538
2539         static void structure_free(struct type *type, struct value *val)
2540         {
2541                 int i;
2542
2543                 for (i = 0; i < type->structure.nfields; i++) {
2544                         struct value *v;
2545                         v = (void*)val->ptr + type->structure.fields[i].offset;
2546                         free_value(type->structure.fields[i].type, v);
2547                 }
2548         }
2549
2550         static void structure_free_type(struct type *t)
2551         {
2552                 int i;
2553                 for (i = 0; i < t->structure.nfields; i++)
2554                         if (t->structure.fields[i].init) {
2555                                 free_value(t->structure.fields[i].type,
2556                                            t->structure.fields[i].init);
2557                         }
2558                 free(t->structure.fields);
2559         }
2560
2561         static struct type structure_prototype = {
2562                 .init = structure_init,
2563                 .free = structure_free,
2564                 .free_type = structure_free_type,
2565                 .print_type_decl = structure_print_type,
2566         };
2567
2568 ###### exec type
2569         Xfieldref,
2570
2571 ###### ast
2572         struct fieldref {
2573                 struct exec;
2574                 struct exec *left;
2575                 int index;
2576                 struct text name;
2577         };
2578
2579 ###### free exec cases
2580         case Xfieldref:
2581                 free_exec(cast(fieldref, e)->left);
2582                 free(e);
2583                 break;
2584
2585 ###### declare terminals
2586         $TERM struct .
2587
2588 ###### term grammar
2589
2590         | Term . IDENTIFIER ${ {
2591                 struct fieldref *fr = new_pos(fieldref, $2);
2592                 fr->left = $<1;
2593                 fr->name = $3.txt;
2594                 fr->index = -2;
2595                 $0 = fr;
2596         } }$
2597
2598 ###### print exec cases
2599
2600         case Xfieldref:
2601         {
2602                 struct fieldref *f = cast(fieldref, e);
2603                 print_exec(f->left, -1, bracket);
2604                 printf(".%.*s", f->name.len, f->name.txt);
2605                 break;
2606         }
2607
2608 ###### ast functions
2609         static int find_struct_index(struct type *type, struct text field)
2610         {
2611                 int i;
2612                 for (i = 0; i < type->structure.nfields; i++)
2613                         if (text_cmp(type->structure.fields[i].name, field) == 0)
2614                                 return i;
2615                 return -1;
2616         }
2617
2618 ###### propagate exec cases
2619
2620         case Xfieldref:
2621         {
2622                 struct fieldref *f = cast(fieldref, prog);
2623                 struct type *st = propagate_types(f->left, c, ok, NULL, 0);
2624
2625                 if (!st)
2626                         type_err(c, "error: unknown type for field access", f->left,    // UNTESTED
2627                                  NULL, 0, NULL);
2628                 else if (st->init != structure_init)
2629                         type_err(c, "error: field reference attempted on %1, not a struct",
2630                                  f->left, st, 0, NULL);
2631                 else if (f->index == -2) {
2632                         f->index = find_struct_index(st, f->name);
2633                         if (f->index < 0)
2634                                 type_err(c, "error: cannot find requested field in %1",
2635                                          f->left, st, 0, NULL);
2636                 }
2637                 if (f->index >= 0) {
2638                         struct type *ft = st->structure.fields[f->index].type;
2639                         if (!type_compat(type, ft, rules))
2640                                 type_err(c, "error: have %1 but need %2", prog,
2641                                          ft, rules, type);
2642                         return ft;
2643                 }
2644                 break;
2645         }
2646
2647 ###### interp exec cases
2648         case Xfieldref:
2649         {
2650                 struct fieldref *f = cast(fieldref, e);
2651                 struct type *ltype;
2652                 struct value *lleft = linterp_exec(c, f->left, &ltype);
2653                 lrv = (void*)lleft->ptr + ltype->structure.fields[f->index].offset;
2654                 rvtype = ltype->structure.fields[f->index].type;
2655                 break;
2656         }
2657
2658 ###### ast
2659         struct fieldlist {
2660                 struct fieldlist *prev;
2661                 struct field f;
2662         };
2663
2664 ###### ast functions
2665         static void free_fieldlist(struct fieldlist *f)
2666         {
2667                 if (!f)
2668                         return;
2669                 free_fieldlist(f->prev);
2670                 if (f->f.init) {
2671                         free_value(f->f.type, f->f.init);       // UNTESTED
2672                         free(f->f.init);        // UNTESTED
2673                 }
2674                 free(f);
2675         }
2676
2677 ###### top level grammar
2678         DeclareStruct -> struct IDENTIFIER FieldBlock Newlines ${ {
2679                 struct type *t =
2680                         add_type(c, $2.txt, &structure_prototype);
2681                 int cnt = 0;
2682                 struct fieldlist *f;
2683
2684                 for (f = $3; f; f=f->prev)
2685                         cnt += 1;
2686
2687                 t->structure.nfields = cnt;
2688                 t->structure.fields = calloc(cnt, sizeof(struct field));
2689                 f = $3;
2690                 while (cnt > 0) {
2691                         int a = f->f.type->align;
2692                         cnt -= 1;
2693                         t->structure.fields[cnt] = f->f;
2694                         if (t->size & (a-1))
2695                                 t->size = (t->size | (a-1)) + 1;
2696                         t->structure.fields[cnt].offset = t->size;
2697                         t->size += ((f->f.type->size - 1) | (a-1)) + 1;
2698                         if (a > t->align)
2699                                 t->align = a;
2700                         f->f.init = NULL;
2701                         f = f->prev;
2702                 }
2703         } }$
2704
2705         $*fieldlist
2706         FieldBlock -> { IN OptNL FieldLines OUT OptNL } ${ $0 = $<FL; }$
2707         | { SimpleFieldList } ${ $0 = $<SFL; }$
2708         | IN OptNL FieldLines OUT ${ $0 = $<FL; }$
2709         | SimpleFieldList EOL ${ $0 = $<SFL; }$
2710
2711         FieldLines -> SimpleFieldList Newlines ${ $0 = $<SFL; }$
2712         | FieldLines SimpleFieldList Newlines ${
2713                 $SFL->prev = $<FL;
2714                 $0 = $<SFL;
2715         }$
2716
2717         SimpleFieldList -> Field ${ $0 = $<F; }$
2718         | SimpleFieldList ; Field ${
2719                 $F->prev = $<SFL;
2720                 $0 = $<F;
2721         }$
2722         | SimpleFieldList ; ${
2723                 $0 = $<SFL;
2724         }$
2725         | ERROR ${ tok_err(c, "Syntax error in struct field", &$1); }$
2726
2727         Field -> IDENTIFIER : Type = Expression ${ {
2728                 int ok;
2729
2730                 $0 = calloc(1, sizeof(struct fieldlist));
2731                 $0->f.name = $1.txt;
2732                 $0->f.type = $<3;
2733                 $0->f.init = NULL;
2734                 do {
2735                         ok = 1;
2736                         propagate_types($<5, c, &ok, $3, 0);
2737                 } while (ok == 2);
2738                 if (!ok)
2739                         c->parse_error = 1;     // UNTESTED
2740                 else {
2741                         struct value vl = interp_exec(c, $5, NULL);
2742                         $0->f.init = global_alloc(c, $0->f.type, NULL, &vl);
2743                 }
2744         } }$
2745         | IDENTIFIER : Type ${
2746                 $0 = calloc(1, sizeof(struct fieldlist));
2747                 $0->f.name = $1.txt;
2748                 $0->f.type = $<3;
2749                 if ($0->f.type->prepare_type)
2750                         $0->f.type->prepare_type(c, $0->f.type, 1);
2751         }$
2752
2753 ###### forward decls
2754         static void structure_print_type(struct type *t, FILE *f);
2755
2756 ###### value functions
2757         static void structure_print_type(struct type *t, FILE *f)
2758         {
2759                 int i;
2760
2761                 fprintf(f, "struct %.*s\n", t->name.len, t->name.txt);
2762
2763                 for (i = 0; i < t->structure.nfields; i++) {
2764                         struct field *fl = t->structure.fields + i;
2765                         fprintf(f, "    %.*s : ", fl->name.len, fl->name.txt);
2766                         type_print(fl->type, f);
2767                         if (fl->type->print && fl->init) {
2768                                 fprintf(f, " = ");
2769                                 if (fl->type == Tstr)
2770                                         fprintf(f, "\"");       // UNTESTED
2771                                 print_value(fl->type, fl->init, f);
2772                                 if (fl->type == Tstr)
2773                                         fprintf(f, "\"");       // UNTESTED
2774                         }
2775                         fprintf(f, "\n");
2776                 }
2777         }
2778
2779 ###### print type decls
2780         {
2781                 struct type *t;
2782                 int target = -1;
2783
2784                 while (target != 0) {
2785                         int i = 0;
2786                         for (t = context.typelist; t ; t=t->next)
2787                                 if (!t->anon && t->print_type_decl &&
2788                                     !t->check_args) {
2789                                         i += 1;
2790                                         if (i == target)
2791                                                 break;
2792                                 }
2793
2794                         if (target == -1) {
2795                                 target = i;
2796                         } else {
2797                                 t->print_type_decl(t, stdout);
2798                                 target -= 1;
2799                         }
2800                 }
2801         }
2802
2803 #### Functions
2804
2805 A function is a chunk of code which can be passed parameters and can
2806 return results.  Each function has a type which includes the set of
2807 parameters and the return value.  As yet these types cannot be declared
2808 separately from the function itself.
2809
2810 The parameters can be specified either in parentheses as a ';' separated
2811 list, such as
2812
2813 ##### Example: function 1
2814
2815         func main(av:[ac::number]string; env:[envc::number]string)
2816                 code block
2817
2818 or as an indented list of one parameter per line (though each line can
2819 be a ';' separated list)
2820
2821 ##### Example: function 2
2822
2823         func main
2824                 argv:[argc::number]string
2825                 env:[envc::number]string
2826         do
2827                 code block
2828
2829 In the first case a return type can follow the parentheses after a colon,
2830 in the second it is given on a line starting with the word `return`.
2831
2832 ##### Example: functions that return
2833
2834         func add(a:number; b:number): number
2835                 code block
2836
2837         func catenate
2838                 a: string
2839                 b: string
2840         return string
2841         do
2842                 code block
2843
2844 Rather than returning a type, the function can specify a set of local
2845 variables to return as a struct.  The values of these variables when the
2846 function exits will be provided to the caller.  For this the return type
2847 is replaced with a block of result declarations, either in parentheses
2848 or bracketed by `return` and `do`.
2849
2850 ##### Example: functions returning multiple variables
2851
2852         func to_cartesian(rho:number; theta:number):(x:number; y:number)
2853                 x = .....
2854                 y = .....
2855
2856         func to_polar
2857                 x:number; y:number
2858         return
2859                 rho:number
2860                 theta:number
2861         do
2862                 rho = ....
2863                 theta = ....
2864
2865 For constructing the lists we use a `List` binode, which will be
2866 further detailed when Expression Lists are introduced.
2867
2868 ###### type union fields
2869
2870         struct {
2871                 struct binode *params;
2872                 struct type *return_type;
2873                 struct variable *scope;
2874                 int inline_result;      // return value is at start of 'local'
2875                 int local_size;
2876         } function;
2877
2878 ###### value union fields
2879         struct exec *function;
2880
2881 ###### type functions
2882         void (*check_args)(struct parse_context *c, int *ok,
2883                            struct type *require, struct exec *args);
2884
2885 ###### value functions
2886
2887         static void function_free(struct type *type, struct value *val)
2888         {
2889                 free_exec(val->function);
2890                 val->function = NULL;
2891         }
2892
2893         static int function_compat(struct type *require, struct type *have)
2894         {
2895                 // FIXME can I do anything here yet?
2896                 return 0;
2897         }
2898
2899         static void function_check_args(struct parse_context *c, int *ok,
2900                                         struct type *require, struct exec *args)
2901         {
2902                 /* This should be 'compat', but we don't have a 'tuple' type to
2903                  * hold the type of 'args'
2904                  */
2905                 struct binode *arg = cast(binode, args);
2906                 struct binode *param = require->function.params;
2907
2908                 while (param) {
2909                         struct var *pv = cast(var, param->left);
2910                         if (!arg) {
2911                                 type_err(c, "error: insufficient arguments to function.",
2912                                          args, NULL, 0, NULL);
2913                                 break;
2914                         }
2915                         *ok = 1;
2916                         propagate_types(arg->left, c, ok, pv->var->type, 0);
2917                         param = cast(binode, param->right);
2918                         arg = cast(binode, arg->right);
2919                 }
2920                 if (arg)
2921                         type_err(c, "error: too many arguments to function.",
2922                                  args, NULL, 0, NULL);
2923         }
2924
2925         static void function_print(struct type *type, struct value *val, FILE *f)
2926         {
2927                 print_exec(val->function, 1, 0);
2928         }
2929
2930         static void function_print_type_decl(struct type *type, FILE *f)
2931         {
2932                 struct binode *b;
2933                 fprintf(f, "(");
2934                 for (b = type->function.params; b; b = cast(binode, b->right)) {
2935                         struct variable *v = cast(var, b->left)->var;
2936                         fprintf(f, "%.*s%s", v->name->name.len, v->name->name.txt,
2937                                 v->constant ? "::" : ":");
2938                         type_print(v->type, f);
2939                         if (b->right)
2940                                 fprintf(f, "; ");
2941                 }
2942                 fprintf(f, ")");
2943                 if (type->function.return_type != Tnone) {
2944                         fprintf(f, ":");
2945                         if (type->function.inline_result) {
2946                                 int i;
2947                                 struct type *t = type->function.return_type;
2948                                 fprintf(f, " (");
2949                                 for (i = 0; i < t->structure.nfields; i++) {
2950                                         struct field *fl = t->structure.fields + i;
2951                                         if (i)
2952                                                 fprintf(f, "; ");
2953                                         fprintf(f, "%.*s:", fl->name.len, fl->name.txt);
2954                                         type_print(fl->type, f);
2955                                 }
2956                                 fprintf(f, ")");
2957                         } else
2958                                 type_print(type->function.return_type, f);
2959                 }
2960                 fprintf(f, "\n");
2961         }
2962
2963         static void function_free_type(struct type *t)
2964         {
2965                 free_exec(t->function.params);
2966         }
2967
2968         static struct type function_prototype = {
2969                 .size = sizeof(void*),
2970                 .align = sizeof(void*),
2971                 .free = function_free,
2972                 .compat = function_compat,
2973                 .check_args = function_check_args,
2974                 .print = function_print,
2975                 .print_type_decl = function_print_type_decl,
2976                 .free_type = function_free_type,
2977         };
2978
2979 ###### declare terminals
2980
2981         $TERM func
2982
2983 ###### Binode types
2984         List,
2985
2986 ###### Grammar
2987
2988         $*variable
2989         FuncName -> IDENTIFIER ${ {
2990                 struct variable *v = var_decl(c, $1.txt);
2991                 struct var *e = new_pos(var, $1);
2992                 e->var = v;
2993                 if (v) {
2994                         v->where_decl = e;
2995                         $0 = v;
2996                 } else {
2997                         v = var_ref(c, $1.txt);
2998                         e->var = v;
2999                         type_err(c, "error: function '%v' redeclared",
3000                                 e, NULL, 0, NULL);
3001                         type_err(c, "info: this is where '%v' was first declared",
3002                                 v->where_decl, NULL, 0, NULL);
3003                         free_exec(e);
3004                 }
3005         } }$
3006
3007         $*binode
3008         Args -> ArgsLine NEWLINE ${ $0 = $<AL; }$
3009         | Args ArgsLine NEWLINE ${ {
3010                 struct binode *b = $<AL;
3011                 struct binode **bp = &b;
3012                 while (*bp)
3013                         bp = (struct binode **)&(*bp)->left;
3014                 *bp = $<A;
3015                 $0 = b;
3016         } }$
3017
3018         ArgsLine -> ${ $0 = NULL; }$
3019         | Varlist ${ $0 = $<1; }$
3020         | Varlist ; ${ $0 = $<1; }$
3021
3022         Varlist -> Varlist ; ArgDecl ${
3023                 $0 = new(binode);
3024                 $0->op = List;
3025                 $0->left = $<Vl;
3026                 $0->right = $<AD;
3027         }$
3028         | ArgDecl ${
3029                 $0 = new(binode);
3030                 $0->op = List;
3031                 $0->left = NULL;
3032                 $0->right = $<AD;
3033         }$
3034
3035         $*var
3036         ArgDecl -> IDENTIFIER : FormalType ${ {
3037                 struct variable *v = var_decl(c, $1.txt);
3038                 $0 = new(var);
3039                 $0->var = v;
3040                 v->type = $<FT;
3041         } }$
3042
3043 ##### Function calls
3044
3045 A function call can appear either as an expression or as a statement.
3046 We use a new 'Funcall' binode type to link the function with a list of
3047 arguments, form with the 'List' nodes.
3048
3049 We have already seen the "Term" which is how a function call can appear
3050 in an expression.  To parse a function call into a statement we include
3051 it in the "SimpleStatement Grammar" which will be described later.
3052
3053 ###### Binode types
3054         Funcall,
3055
3056 ###### term grammar
3057         | Term ( ExpressionList ) ${ {
3058                 struct binode *b = new(binode);
3059                 b->op = Funcall;
3060                 b->left = $<T;
3061                 b->right = reorder_bilist($<EL);
3062                 $0 = b;
3063         } }$
3064         | Term ( ) ${ {
3065                 struct binode *b = new(binode);
3066                 b->op = Funcall;
3067                 b->left = $<T;
3068                 b->right = NULL;
3069                 $0 = b;
3070         } }$
3071
3072 ###### SimpleStatement Grammar
3073
3074         | Term ( ExpressionList ) ${ {
3075                 struct binode *b = new(binode);
3076                 b->op = Funcall;
3077                 b->left = $<T;
3078                 b->right = reorder_bilist($<EL);
3079                 $0 = b;
3080         } }$
3081
3082 ###### print binode cases
3083
3084         case Funcall:
3085                 do_indent(indent, "");
3086                 print_exec(b->left, -1, bracket);
3087                 printf("(");
3088                 for (b = cast(binode, b->right); b; b = cast(binode, b->right)) {
3089                         if (b->left) {
3090                                 printf(" ");
3091                                 print_exec(b->left, -1, bracket);
3092                                 if (b->right)
3093                                         printf(",");
3094                         }
3095                 }
3096                 printf(")");
3097                 if (indent >= 0)
3098                         printf("\n");
3099                 break;
3100
3101 ###### propagate binode cases
3102
3103         case Funcall: {
3104                 /* Every arg must match formal parameter, and result
3105                  * is return type of function
3106                  */
3107                 struct binode *args = cast(binode, b->right);
3108                 struct var *v = cast(var, b->left);
3109
3110                 if (!v->var->type || v->var->type->check_args == NULL) {
3111                         type_err(c, "error: attempt to call a non-function.",
3112                                  prog, NULL, 0, NULL);
3113                         return NULL;
3114                 }
3115                 v->var->type->check_args(c, ok, v->var->type, args);
3116                 return v->var->type->function.return_type;
3117         }
3118
3119 ###### interp binode cases
3120
3121         case Funcall: {
3122                 struct var *v = cast(var, b->left);
3123                 struct type *t = v->var->type;
3124                 void *oldlocal = c->local;
3125                 int old_size = c->local_size;
3126                 void *local = calloc(1, t->function.local_size);
3127                 struct value *fbody = var_value(c, v->var);
3128                 struct binode *arg = cast(binode, b->right);
3129                 struct binode *param = t->function.params;
3130
3131                 while (param) {
3132                         struct var *pv = cast(var, param->left);
3133                         struct type *vtype = NULL;
3134                         struct value val = interp_exec(c, arg->left, &vtype);
3135                         struct value *lval;
3136                         c->local = local; c->local_size = t->function.local_size;
3137                         lval = var_value(c, pv->var);
3138                         c->local = oldlocal; c->local_size = old_size;
3139                         memcpy(lval, &val, vtype->size);
3140                         param = cast(binode, param->right);
3141                         arg = cast(binode, arg->right);
3142                 }
3143                 c->local = local; c->local_size = t->function.local_size;
3144                 if (t->function.inline_result && dtype) {
3145                         _interp_exec(c, fbody->function, NULL, NULL);
3146                         memcpy(dest, local, dtype->size);
3147                         rvtype = ret.type = NULL;
3148                 } else
3149                         rv = interp_exec(c, fbody->function, &rvtype);
3150                 c->local = oldlocal; c->local_size = old_size;
3151                 free(local);
3152                 break;
3153         }
3154
3155 ## Complex executables: statements and expressions
3156
3157 Now that we have types and values and variables and most of the basic
3158 Terms which provide access to these, we can explore the more complex
3159 code that combine all of these to get useful work done.  Specifically
3160 statements and expressions.
3161
3162 Expressions are various combinations of Terms.  We will use operator
3163 precedence to ensure correct parsing.  The simplest Expression is just a
3164 Term - others will follow.
3165
3166 ###### Grammar
3167
3168         $*exec
3169         Expression -> Term ${ $0 = $<Term; }$
3170         ## expression grammar
3171
3172 ### Expressions: Conditional
3173
3174 Our first user of the `binode` will be conditional expressions, which
3175 is a bit odd as they actually have three components.  That will be
3176 handled by having 2 binodes for each expression.  The conditional
3177 expression is the lowest precedence operator which is why we define it
3178 first - to start the precedence list.
3179
3180 Conditional expressions are of the form "value `if` condition `else`
3181 other_value".  They associate to the right, so everything to the right
3182 of `else` is part of an else value, while only a higher-precedence to
3183 the left of `if` is the if values.  Between `if` and `else` there is no
3184 room for ambiguity, so a full conditional expression is allowed in
3185 there.
3186
3187 ###### Binode types
3188         CondExpr,
3189
3190 ###### declare terminals
3191
3192         $LEFT if $$ifelse
3193
3194 ###### expression grammar
3195
3196         | Expression if Expression else Expression $$ifelse ${ {
3197                 struct binode *b1 = new(binode);
3198                 struct binode *b2 = new(binode);
3199                 b1->op = CondExpr;
3200                 b1->left = $<3;
3201                 b1->right = b2;
3202                 b2->op = CondExpr;
3203                 b2->left = $<1;
3204                 b2->right = $<5;
3205                 $0 = b1;
3206         } }$
3207
3208 ###### print binode cases
3209
3210         case CondExpr:
3211                 b2 = cast(binode, b->right);
3212                 if (bracket) printf("(");
3213                 print_exec(b2->left, -1, bracket);
3214                 printf(" if ");
3215                 print_exec(b->left, -1, bracket);
3216                 printf(" else ");
3217                 print_exec(b2->right, -1, bracket);
3218                 if (bracket) printf(")");
3219                 break;
3220
3221 ###### propagate binode cases
3222
3223         case CondExpr: {
3224                 /* cond must be Tbool, others must match */
3225                 struct binode *b2 = cast(binode, b->right);
3226                 struct type *t2;
3227
3228                 propagate_types(b->left, c, ok, Tbool, 0);
3229                 t = propagate_types(b2->left, c, ok, type, Rnolabel);
3230                 t2 = propagate_types(b2->right, c, ok, type ?: t, Rnolabel);
3231                 return t ?: t2;
3232         }
3233
3234 ###### interp binode cases
3235
3236         case CondExpr: {
3237                 struct binode *b2 = cast(binode, b->right);
3238                 left = interp_exec(c, b->left, &ltype);
3239                 if (left.bool)
3240                         rv = interp_exec(c, b2->left, &rvtype); // UNTESTED
3241                 else
3242                         rv = interp_exec(c, b2->right, &rvtype);
3243                 }
3244                 break;
3245
3246 ### Expression list
3247
3248 We take a brief detour, now that we have expressions, to describe lists
3249 of expressions.  These will be needed for function parameters and
3250 possibly other situations.  They seem generic enough to introduce here
3251 to be used elsewhere.
3252
3253 And ExpressionList will use the `List` type of `binode`, building up at
3254 the end.  And place where they are used will probably call
3255 `reorder_bilist()` to get a more normal first/next arrangement.
3256
3257 ###### declare terminals
3258         $TERM ,
3259
3260 `List` execs have no implicit semantics, so they are never propagated or
3261 interpreted.  The can be printed as a comma separate list, which is how
3262 they are parsed.  Note they are also used for function formal parameter
3263 lists.  In that case a separate function is used to print them.
3264
3265 ###### print binode cases
3266         case List:
3267                 while (b) {
3268                         printf(" ");
3269                         print_exec(b->left, -1, bracket);
3270                         if (b->right)
3271                                 printf(",");
3272                         b = cast(binode, b->right);
3273                 }
3274                 break;
3275
3276 ###### propagate binode cases
3277         case List: abort(); // NOTEST
3278 ###### interp binode cases
3279         case List: abort(); // NOTEST
3280
3281 ###### Grammar
3282
3283         $*binode
3284         ExpressionList -> ExpressionList , Expression ${
3285                 $0 = new(binode);
3286                 $0->op = List;
3287                 $0->left = $<1;
3288                 $0->right = $<3;
3289         }$
3290         | Expression ${
3291                 $0 = new(binode);
3292                 $0->op = List;
3293                 $0->left = NULL;
3294                 $0->right = $<1;
3295         }$
3296
3297 ### Expressions: Boolean
3298
3299 The next class of expressions to use the `binode` will be Boolean
3300 expressions.  "`and then`" and "`or else`" are similar to `and` and `or`
3301 have same corresponding precendence.  The difference is that they don't
3302 evaluate the second expression if not necessary.
3303
3304 ###### Binode types
3305         And,
3306         AndThen,
3307         Or,
3308         OrElse,
3309         Not,
3310
3311 ###### declare terminals
3312         $LEFT or
3313         $LEFT and
3314         $LEFT not
3315
3316 ###### expression grammar
3317         | Expression or Expression ${ {
3318                 struct binode *b = new(binode);
3319                 b->op = Or;
3320                 b->left = $<1;
3321                 b->right = $<3;
3322                 $0 = b;
3323         } }$
3324         | Expression or else Expression ${ {
3325                 struct binode *b = new(binode);
3326                 b->op = OrElse;
3327                 b->left = $<1;
3328                 b->right = $<4;
3329                 $0 = b;
3330         } }$
3331
3332         | Expression and Expression ${ {
3333                 struct binode *b = new(binode);
3334                 b->op = And;
3335                 b->left = $<1;
3336                 b->right = $<3;
3337                 $0 = b;
3338         } }$
3339         | Expression and then Expression ${ {
3340                 struct binode *b = new(binode);
3341                 b->op = AndThen;
3342                 b->left = $<1;
3343                 b->right = $<4;
3344                 $0 = b;
3345         } }$
3346
3347         | not Expression ${ {
3348                 struct binode *b = new(binode);
3349                 b->op = Not;
3350                 b->right = $<2;
3351                 $0 = b;
3352         } }$
3353
3354 ###### print binode cases
3355         case And:
3356                 if (bracket) printf("(");
3357                 print_exec(b->left, -1, bracket);
3358                 printf(" and ");
3359                 print_exec(b->right, -1, bracket);
3360                 if (bracket) printf(")");
3361                 break;
3362         case AndThen:
3363                 if (bracket) printf("(");
3364                 print_exec(b->left, -1, bracket);
3365                 printf(" and then ");
3366                 print_exec(b->right, -1, bracket);
3367                 if (bracket) printf(")");
3368                 break;
3369         case Or:
3370                 if (bracket) printf("(");
3371                 print_exec(b->left, -1, bracket);
3372                 printf(" or ");
3373                 print_exec(b->right, -1, bracket);
3374                 if (bracket) printf(")");
3375                 break;
3376         case OrElse:
3377                 if (bracket) printf("(");
3378                 print_exec(b->left, -1, bracket);
3379                 printf(" or else ");
3380                 print_exec(b->right, -1, bracket);
3381                 if (bracket) printf(")");
3382                 break;
3383         case Not:
3384                 if (bracket) printf("(");
3385                 printf("not ");
3386                 print_exec(b->right, -1, bracket);
3387                 if (bracket) printf(")");
3388                 break;
3389
3390 ###### propagate binode cases
3391         case And:
3392         case AndThen:
3393         case Or:
3394         case OrElse:
3395         case Not:
3396                 /* both must be Tbool, result is Tbool */
3397                 propagate_types(b->left, c, ok, Tbool, 0);
3398                 propagate_types(b->right, c, ok, Tbool, 0);
3399                 if (type && type != Tbool)
3400                         type_err(c, "error: %1 operation found where %2 expected", prog,
3401                                    Tbool, 0, type);
3402                 return Tbool;
3403
3404 ###### interp binode cases
3405         case And:
3406                 rv = interp_exec(c, b->left, &rvtype);
3407                 right = interp_exec(c, b->right, &rtype);
3408                 rv.bool = rv.bool && right.bool;
3409                 break;
3410         case AndThen:
3411                 rv = interp_exec(c, b->left, &rvtype);
3412                 if (rv.bool)
3413                         rv = interp_exec(c, b->right, NULL);
3414                 break;
3415         case Or:
3416                 rv = interp_exec(c, b->left, &rvtype);
3417                 right = interp_exec(c, b->right, &rtype);
3418                 rv.bool = rv.bool || right.bool;
3419                 break;
3420         case OrElse:
3421                 rv = interp_exec(c, b->left, &rvtype);
3422                 if (!rv.bool)
3423                         rv = interp_exec(c, b->right, NULL);
3424                 break;
3425         case Not:
3426                 rv = interp_exec(c, b->right, &rvtype);
3427                 rv.bool = !rv.bool;
3428                 break;
3429
3430 ### Expressions: Comparison
3431
3432 Of slightly higher precedence that Boolean expressions are Comparisons.
3433 A comparison takes arguments of any comparable type, but the two types
3434 must be the same.
3435
3436 To simplify the parsing we introduce an `eop` which can record an
3437 expression operator, and the `CMPop` non-terminal will match one of them.
3438
3439 ###### ast
3440         struct eop {
3441                 enum Btype op;
3442         };
3443
3444 ###### ast functions
3445         static void free_eop(struct eop *e)
3446         {
3447                 if (e)
3448                         free(e);
3449         }
3450
3451 ###### Binode types
3452         Less,
3453         Gtr,
3454         LessEq,
3455         GtrEq,
3456         Eql,
3457         NEql,
3458
3459 ###### declare terminals
3460         $LEFT < > <= >= == != CMPop
3461
3462 ###### expression grammar
3463         | Expression CMPop Expression ${ {
3464                 struct binode *b = new(binode);
3465                 b->op = $2.op;
3466                 b->left = $<1;
3467                 b->right = $<3;
3468                 $0 = b;
3469         } }$
3470
3471 ###### Grammar
3472
3473         $eop
3474         CMPop ->  < ${ $0.op = Less; }$
3475         |         > ${ $0.op = Gtr; }$
3476         |         <= ${ $0.op = LessEq; }$
3477         |         >= ${ $0.op = GtrEq; }$
3478         |         == ${ $0.op = Eql; }$
3479         |         != ${ $0.op = NEql; }$
3480
3481 ###### print binode cases
3482
3483         case Less:
3484         case LessEq:
3485         case Gtr:
3486         case GtrEq:
3487         case Eql:
3488         case NEql:
3489                 if (bracket) printf("(");
3490                 print_exec(b->left, -1, bracket);
3491                 switch(b->op) {
3492                 case Less:   printf(" < "); break;
3493                 case LessEq: printf(" <= "); break;
3494                 case Gtr:    printf(" > "); break;
3495                 case GtrEq:  printf(" >= "); break;
3496                 case Eql:    printf(" == "); break;
3497                 case NEql:   printf(" != "); break;
3498                 default: abort();               // NOTEST
3499                 }
3500                 print_exec(b->right, -1, bracket);
3501                 if (bracket) printf(")");
3502                 break;
3503
3504 ###### propagate binode cases
3505         case Less:
3506         case LessEq:
3507         case Gtr:
3508         case GtrEq:
3509         case Eql:
3510         case NEql:
3511                 /* Both must match but not be labels, result is Tbool */
3512                 t = propagate_types(b->left, c, ok, NULL, Rnolabel);
3513                 if (t)
3514                         propagate_types(b->right, c, ok, t, 0);
3515                 else {
3516                         t = propagate_types(b->right, c, ok, NULL, Rnolabel);   // UNTESTED
3517                         if (t)  // UNTESTED
3518                                 t = propagate_types(b->left, c, ok, t, 0);      // UNTESTED
3519                 }
3520                 if (!type_compat(type, Tbool, 0))
3521                         type_err(c, "error: Comparison returns %1 but %2 expected", prog,
3522                                     Tbool, rules, type);
3523                 return Tbool;
3524
3525 ###### interp binode cases
3526         case Less:
3527         case LessEq:
3528         case Gtr:
3529         case GtrEq:
3530         case Eql:
3531         case NEql:
3532         {
3533                 int cmp;
3534                 left = interp_exec(c, b->left, &ltype);
3535                 right = interp_exec(c, b->right, &rtype);
3536                 cmp = value_cmp(ltype, rtype, &left, &right);
3537                 rvtype = Tbool;
3538                 switch (b->op) {
3539                 case Less:      rv.bool = cmp <  0; break;
3540                 case LessEq:    rv.bool = cmp <= 0; break;
3541                 case Gtr:       rv.bool = cmp >  0; break;
3542                 case GtrEq:     rv.bool = cmp >= 0; break;
3543                 case Eql:       rv.bool = cmp == 0; break;
3544                 case NEql:      rv.bool = cmp != 0; break;
3545                 default:        rv.bool = 0; break;     // NOTEST
3546                 }
3547                 break;
3548         }
3549
3550 ### Expressions: Arithmetic etc.
3551
3552 The remaining expressions with the highest precedence are arithmetic,
3553 string concatenation, and string conversion.  String concatenation
3554 (`++`) has the same precedence as multiplication and division, but lower
3555 than the uniary.
3556
3557 String conversion is a temporary feature until I get a better type
3558 system.  `$` is a prefix operator which expects a string and returns
3559 a number.
3560
3561 `+` and `-` are both infix and prefix operations (where they are
3562 absolute value and negation).  These have different operator names.
3563
3564 We also have a 'Bracket' operator which records where parentheses were
3565 found.  This makes it easy to reproduce these when printing.  Possibly I
3566 should only insert brackets were needed for precedence.  Putting
3567 parentheses around an expression converts it into a Term,
3568
3569 ###### Binode types
3570         Plus, Minus,
3571         Times, Divide, Rem,
3572         Concat,
3573         Absolute, Negate,
3574         StringConv,
3575         Bracket,
3576
3577 ###### declare terminals
3578         $LEFT + - Eop
3579         $LEFT * / % ++ Top
3580         $LEFT Uop $
3581         $TERM ( )
3582
3583 ###### expression grammar
3584         | Expression Eop Expression ${ {
3585                 struct binode *b = new(binode);
3586                 b->op = $2.op;
3587                 b->left = $<1;
3588                 b->right = $<3;
3589                 $0 = b;
3590         } }$
3591
3592         | Expression Top Expression ${ {
3593                 struct binode *b = new(binode);
3594                 b->op = $2.op;
3595                 b->left = $<1;
3596                 b->right = $<3;
3597                 $0 = b;
3598         } }$
3599
3600         | Uop Expression ${ {
3601                 struct binode *b = new(binode);
3602                 b->op = $1.op;
3603                 b->right = $<2;
3604                 $0 = b;
3605         } }$
3606
3607 ###### term grammar
3608
3609         | ( Expression ) ${ {
3610                 struct binode *b = new_pos(binode, $1);
3611                 b->op = Bracket;
3612                 b->right = $<2;
3613                 $0 = b;
3614         } }$
3615
3616 ###### Grammar
3617
3618         $eop
3619         Eop ->   + ${ $0.op = Plus; }$
3620         |        - ${ $0.op = Minus; }$
3621
3622         Uop ->   + ${ $0.op = Absolute; }$
3623         |        - ${ $0.op = Negate; }$
3624         |        $ ${ $0.op = StringConv; }$
3625
3626         Top ->   * ${ $0.op = Times; }$
3627         |        / ${ $0.op = Divide; }$
3628         |        % ${ $0.op = Rem; }$
3629         |        ++ ${ $0.op = Concat; }$
3630
3631 ###### print binode cases
3632         case Plus:
3633         case Minus:
3634         case Times:
3635         case Divide:
3636         case Concat:
3637         case Rem:
3638                 if (bracket) printf("(");
3639                 print_exec(b->left, indent, bracket);
3640                 switch(b->op) {
3641                 case Plus:   fputs(" + ", stdout); break;
3642                 case Minus:  fputs(" - ", stdout); break;
3643                 case Times:  fputs(" * ", stdout); break;
3644                 case Divide: fputs(" / ", stdout); break;
3645                 case Rem:    fputs(" % ", stdout); break;
3646                 case Concat: fputs(" ++ ", stdout); break;
3647                 default: abort();       // NOTEST
3648                 }                       // NOTEST
3649                 print_exec(b->right, indent, bracket);
3650                 if (bracket) printf(")");
3651                 break;
3652         case Absolute:
3653         case Negate:
3654         case StringConv:
3655                 if (bracket) printf("(");
3656                 switch (b->op) {
3657                 case Absolute:   fputs("+", stdout); break;
3658                 case Negate:     fputs("-", stdout); break;
3659                 case StringConv: fputs("$", stdout); break;
3660                 default: abort();       // NOTEST
3661                 }                       // NOTEST
3662                 print_exec(b->right, indent, bracket);
3663                 if (bracket) printf(")");
3664                 break;
3665         case Bracket:
3666                 printf("(");
3667                 print_exec(b->right, indent, bracket);
3668                 printf(")");
3669                 break;
3670
3671 ###### propagate binode cases
3672         case Plus:
3673         case Minus:
3674         case Times:
3675         case Rem:
3676         case Divide:
3677                 /* both must be numbers, result is Tnum */
3678         case Absolute:
3679         case Negate:
3680                 /* as propagate_types ignores a NULL,
3681                  * unary ops fit here too */
3682                 propagate_types(b->left, c, ok, Tnum, 0);
3683                 propagate_types(b->right, c, ok, Tnum, 0);
3684                 if (!type_compat(type, Tnum, 0))
3685                         type_err(c, "error: Arithmetic returns %1 but %2 expected", prog,
3686                                    Tnum, rules, type);
3687                 return Tnum;
3688
3689         case Concat:
3690                 /* both must be Tstr, result is Tstr */
3691                 propagate_types(b->left, c, ok, Tstr, 0);
3692                 propagate_types(b->right, c, ok, Tstr, 0);
3693                 if (!type_compat(type, Tstr, 0))
3694                         type_err(c, "error: Concat returns %1 but %2 expected", prog,
3695                                    Tstr, rules, type);
3696                 return Tstr;
3697
3698         case StringConv:
3699                 /* op must be string, result is number */
3700                 propagate_types(b->left, c, ok, Tstr, 0);
3701                 if (!type_compat(type, Tnum, 0))
3702                         type_err(c,     // UNTESTED
3703                           "error: Can only convert string to number, not %1",
3704                                 prog, type, 0, NULL);
3705                 return Tnum;
3706
3707         case Bracket:
3708                 return propagate_types(b->right, c, ok, type, 0);
3709
3710 ###### interp binode cases
3711
3712         case Plus:
3713                 rv = interp_exec(c, b->left, &rvtype);
3714                 right = interp_exec(c, b->right, &rtype);
3715                 mpq_add(rv.num, rv.num, right.num);
3716                 break;
3717         case Minus:
3718                 rv = interp_exec(c, b->left, &rvtype);
3719                 right = interp_exec(c, b->right, &rtype);
3720                 mpq_sub(rv.num, rv.num, right.num);
3721                 break;
3722         case Times:
3723                 rv = interp_exec(c, b->left, &rvtype);
3724                 right = interp_exec(c, b->right, &rtype);
3725                 mpq_mul(rv.num, rv.num, right.num);
3726                 break;
3727         case Divide:
3728                 rv = interp_exec(c, b->left, &rvtype);
3729                 right = interp_exec(c, b->right, &rtype);
3730                 mpq_div(rv.num, rv.num, right.num);
3731                 break;
3732         case Rem: {
3733                 mpz_t l, r, rem;
3734
3735                 left = interp_exec(c, b->left, &ltype);
3736                 right = interp_exec(c, b->right, &rtype);
3737                 mpz_init(l); mpz_init(r); mpz_init(rem);
3738                 mpz_tdiv_q(l, mpq_numref(left.num), mpq_denref(left.num));
3739                 mpz_tdiv_q(r, mpq_numref(right.num), mpq_denref(right.num));
3740                 mpz_tdiv_r(rem, l, r);
3741                 val_init(Tnum, &rv);
3742                 mpq_set_z(rv.num, rem);
3743                 mpz_clear(r); mpz_clear(l); mpz_clear(rem);
3744                 rvtype = ltype;
3745                 break;
3746         }
3747         case Negate:
3748                 rv = interp_exec(c, b->right, &rvtype);
3749                 mpq_neg(rv.num, rv.num);
3750                 break;
3751         case Absolute:
3752                 rv = interp_exec(c, b->right, &rvtype);
3753                 mpq_abs(rv.num, rv.num);
3754                 break;
3755         case Bracket:
3756                 rv = interp_exec(c, b->right, &rvtype);
3757                 break;
3758         case Concat:
3759                 left = interp_exec(c, b->left, &ltype);
3760                 right = interp_exec(c, b->right, &rtype);
3761                 rvtype = Tstr;
3762                 rv.str = text_join(left.str, right.str);
3763                 break;
3764         case StringConv:
3765                 right = interp_exec(c, b->right, &rvtype);
3766                 rtype = Tstr;
3767                 rvtype = Tnum;
3768
3769                 struct text tx = right.str;
3770                 char tail[3];
3771                 int neg = 0;
3772                 if (tx.txt[0] == '-') {
3773                         neg = 1;        // UNTESTED
3774                         tx.txt++;       // UNTESTED
3775                         tx.len--;       // UNTESTED
3776                 }
3777                 if (number_parse(rv.num, tail, tx) == 0)
3778                         mpq_init(rv.num);       // UNTESTED
3779                 else if (neg)
3780                         mpq_neg(rv.num, rv.num);        // UNTESTED
3781                 if (tail[0])
3782                         printf("Unsupported suffix: %.*s\n", tx.len, tx.txt);   // UNTESTED
3783
3784                 break;
3785
3786 ###### value functions
3787
3788         static struct text text_join(struct text a, struct text b)
3789         {
3790                 struct text rv;
3791                 rv.len = a.len + b.len;
3792                 rv.txt = malloc(rv.len);
3793                 memcpy(rv.txt, a.txt, a.len);
3794                 memcpy(rv.txt+a.len, b.txt, b.len);
3795                 return rv;
3796         }
3797
3798 ### Blocks, Statements, and Statement lists.
3799
3800 Now that we have expressions out of the way we need to turn to
3801 statements.  There are simple statements and more complex statements.
3802 Simple statements do not contain (syntactic) newlines, complex statements do.
3803
3804 Statements often come in sequences and we have corresponding simple
3805 statement lists and complex statement lists.
3806 The former comprise only simple statements separated by semicolons.
3807 The later comprise complex statements and simple statement lists.  They are
3808 separated by newlines.  Thus the semicolon is only used to separate
3809 simple statements on the one line.  This may be overly restrictive,
3810 but I'm not sure I ever want a complex statement to share a line with
3811 anything else.
3812
3813 Note that a simple statement list can still use multiple lines if
3814 subsequent lines are indented, so
3815
3816 ###### Example: wrapped simple statement list
3817
3818         a = b; c = d;
3819            e = f; print g
3820
3821 is a single simple statement list.  This might allow room for
3822 confusion, so I'm not set on it yet.
3823
3824 A simple statement list needs no extra syntax.  A complex statement
3825 list has two syntactic forms.  It can be enclosed in braces (much like
3826 C blocks), or it can be introduced by an indent and continue until an
3827 unindented newline (much like Python blocks).  With this extra syntax
3828 it is referred to as a block.
3829
3830 Note that a block does not have to include any newlines if it only
3831 contains simple statements.  So both of:
3832
3833         if condition: a=b; d=f
3834
3835         if condition { a=b; print f }
3836
3837 are valid.
3838
3839 In either case the list is constructed from a `binode` list with
3840 `Block` as the operator.  When parsing the list it is most convenient
3841 to append to the end, so a list is a list and a statement.  When using
3842 the list it is more convenient to consider a list to be a statement
3843 and a list.  So we need a function to re-order a list.
3844 `reorder_bilist` serves this purpose.
3845
3846 The only stand-alone statement we introduce at this stage is `pass`
3847 which does nothing and is represented as a `NULL` pointer in a `Block`
3848 list.  Other stand-alone statements will follow once the infrastructure
3849 is in-place.
3850
3851 As many statements will use binodes, we declare a binode pointer 'b' in
3852 the common header for all reductions to use.
3853
3854 ###### Parser: reduce
3855         struct binode *b;
3856
3857 ###### Binode types
3858         Block,
3859
3860 ###### Grammar
3861
3862         $TERM { } ;
3863
3864         $*binode
3865         Block -> { IN OptNL Statementlist OUT OptNL } ${ $0 = $<Sl; }$
3866         |        { SimpleStatements } ${ $0 = reorder_bilist($<SS); }$
3867         |        SimpleStatements ; ${ $0 = reorder_bilist($<SS); }$
3868         |        SimpleStatements EOL ${ $0 = reorder_bilist($<SS); }$
3869         |        IN OptNL Statementlist OUT ${ $0 = $<Sl; }$
3870
3871         OpenBlock -> OpenScope { IN OptNL Statementlist OUT OptNL } ${ $0 = $<Sl; }$
3872         |        OpenScope { SimpleStatements } ${ $0 = reorder_bilist($<SS); }$
3873         |        OpenScope SimpleStatements ; ${ $0 = reorder_bilist($<SS); }$
3874         |        OpenScope SimpleStatements EOL ${ $0 = reorder_bilist($<SS); }$
3875         |        IN OpenScope OptNL Statementlist OUT ${ $0 = $<Sl; }$
3876
3877         UseBlock -> { OpenScope IN OptNL Statementlist OUT OptNL } ${ $0 = $<Sl; }$
3878         |        { OpenScope SimpleStatements } ${ $0 = reorder_bilist($<SS); }$
3879         |        IN OpenScope OptNL Statementlist OUT ${ $0 = $<Sl; }$
3880
3881         ColonBlock -> { IN OptNL Statementlist OUT OptNL } ${ $0 = $<Sl; }$
3882         |        { SimpleStatements } ${ $0 = reorder_bilist($<SS); }$
3883         |        : SimpleStatements ; ${ $0 = reorder_bilist($<SS); }$
3884         |        : SimpleStatements EOL ${ $0 = reorder_bilist($<SS); }$
3885         |        : IN OptNL Statementlist OUT ${ $0 = $<Sl; }$
3886
3887         Statementlist -> ComplexStatements ${ $0 = reorder_bilist($<CS); }$
3888
3889         ComplexStatements -> ComplexStatements ComplexStatement ${
3890                 if ($2 == NULL) {
3891                         $0 = $<1;
3892                 } else {
3893                         $0 = new(binode);
3894                         $0->op = Block;
3895                         $0->left = $<1;
3896                         $0->right = $<2;
3897                 }
3898         }$
3899         | ComplexStatement ${
3900                 if ($1 == NULL) {
3901                         $0 = NULL;
3902                 } else {
3903                         $0 = new(binode);
3904                         $0->op = Block;
3905                         $0->left = NULL;
3906                         $0->right = $<1;
3907                 }
3908         }$
3909
3910         $*exec
3911         ComplexStatement -> SimpleStatements Newlines ${
3912                 $0 = reorder_bilist($<SS);
3913         }$
3914         |  SimpleStatements ; Newlines ${
3915                 $0 = reorder_bilist($<SS);
3916         }$
3917         ## ComplexStatement Grammar
3918
3919         $*binode
3920         SimpleStatements -> SimpleStatements ; SimpleStatement ${
3921                 $0 = new(binode);
3922                 $0->op = Block;
3923                 $0->left = $<1;
3924                 $0->right = $<3;
3925         }$
3926         | SimpleStatement ${
3927                 $0 = new(binode);
3928                 $0->op = Block;
3929                 $0->left = NULL;
3930                 $0->right = $<1;
3931         }$
3932
3933         $TERM pass
3934         $*exec
3935         SimpleStatement -> pass ${ $0 = NULL; }$
3936         | ERROR ${ tok_err(c, "Syntax error in statement", &$1); }$
3937         ## SimpleStatement Grammar
3938
3939 ###### print binode cases
3940         case Block:
3941                 if (indent < 0) {
3942                         // simple statement
3943                         if (b->left == NULL)    // UNTESTED
3944                                 printf("pass"); // UNTESTED
3945                         else
3946                                 print_exec(b->left, indent, bracket);   // UNTESTED
3947                         if (b->right) { // UNTESTED
3948                                 printf("; ");   // UNTESTED
3949                                 print_exec(b->right, indent, bracket);  // UNTESTED
3950                         }
3951                 } else {
3952                         // block, one per line
3953                         if (b->left == NULL)
3954                                 do_indent(indent, "pass\n");
3955                         else
3956                                 print_exec(b->left, indent, bracket);
3957                         if (b->right)
3958                                 print_exec(b->right, indent, bracket);
3959                 }
3960                 break;
3961
3962 ###### propagate binode cases
3963         case Block:
3964         {
3965                 /* If any statement returns something other than Tnone
3966                  * or Tbool then all such must return same type.
3967                  * As each statement may be Tnone or something else,
3968                  * we must always pass NULL (unknown) down, otherwise an incorrect
3969                  * error might occur.  We never return Tnone unless it is
3970                  * passed in.
3971                  */
3972                 struct binode *e;
3973
3974                 for (e = b; e; e = cast(binode, e->right)) {
3975                         t = propagate_types(e->left, c, ok, NULL, rules);
3976                         if ((rules & Rboolok) && (t == Tbool || t == Tnone))
3977                                 t = NULL;
3978                         if (t == Tnone && e->right)
3979                                 /* Only the final statement *must* return a value
3980                                  * when not Rboolok
3981                                  */
3982                                 t = NULL;
3983                         if (t) {
3984                                 if (!type)
3985                                         type = t;
3986                                 else if (t != type)
3987                                         type_err(c, "error: expected %1%r, found %2",
3988                                                  e->left, type, rules, t);
3989                         }
3990                 }
3991                 return type;
3992         }
3993
3994 ###### interp binode cases
3995         case Block:
3996                 while (rvtype == Tnone &&
3997                        b) {
3998                         if (b->left)
3999                                 rv = interp_exec(c, b->left, &rvtype);
4000                         b = cast(binode, b->right);
4001                 }
4002                 break;
4003
4004 ### The Print statement
4005
4006 `print` is a simple statement that takes a comma-separated list of
4007 expressions and prints the values separated by spaces and terminated
4008 by a newline.  No control of formatting is possible.
4009
4010 `print` uses `ExpressionList` to collect the expressions and stores them
4011 on the left side of a `Print` binode unlessthere is a trailing comma
4012 when the list is stored on the `right` side and no trailing newline is
4013 printed.
4014
4015 ###### Binode types
4016         Print,
4017
4018 ##### declare terminals
4019         $TERM print
4020
4021 ###### SimpleStatement Grammar
4022
4023         | print ExpressionList ${
4024                 $0 = b = new(binode);
4025                 b->op = Print;
4026                 b->right = NULL;
4027                 b->left = reorder_bilist($<EL);
4028         }$
4029         | print ExpressionList , ${ {
4030                 $0 = b = new(binode);
4031                 b->op = Print;
4032                 b->right = reorder_bilist($<EL);
4033                 b->left = NULL;
4034         } }$
4035         | print ${
4036                 $0 = b = new(binode);
4037                 b->op = Print;
4038                 b->left = NULL;
4039                 b->right = NULL;
4040         }$
4041
4042 ###### print binode cases
4043
4044         case Print:
4045                 do_indent(indent, "print");
4046                 if (b->right) {
4047                         print_exec(b->right, -1, bracket);
4048                         printf(",");
4049                 } else
4050                         print_exec(b->left, -1, bracket);
4051                 if (indent >= 0)
4052                         printf("\n");
4053                 break;
4054
4055 ###### propagate binode cases
4056
4057         case Print:
4058                 /* don't care but all must be consistent */
4059                 if (b->left)
4060                         b = cast(binode, b->left);
4061                 else
4062                         b = cast(binode, b->right);
4063                 while (b) {
4064                         propagate_types(b->left, c, ok, NULL, Rnolabel);
4065                         b = cast(binode, b->right);
4066                 }
4067                 break;
4068
4069 ###### interp binode cases
4070
4071         case Print:
4072         {
4073                 struct binode *b2 = cast(binode, b->left);
4074                 if (!b2)
4075                         b2 = cast(binode, b->right);
4076                 for (; b2; b2 = cast(binode, b2->right)) {
4077                         left = interp_exec(c, b2->left, &ltype);
4078                         print_value(ltype, &left, stdout);
4079                         free_value(ltype, &left);
4080                         if (b2->right)
4081                                 putchar(' ');
4082                 }
4083                 if (b->right == NULL)
4084                         printf("\n");
4085                 ltype = Tnone;
4086                 break;
4087         }
4088
4089 ###### Assignment statement
4090
4091 An assignment will assign a value to a variable, providing it hasn't
4092 been declared as a constant.  The analysis phase ensures that the type
4093 will be correct so the interpreter just needs to perform the
4094 calculation.  There is a form of assignment which declares a new
4095 variable as well as assigning a value.  If a name is assigned before
4096 it is declared, and error will be raised as the name is created as
4097 `Tlabel` and it is illegal to assign to such names.
4098
4099 ###### Binode types
4100         Assign,
4101         Declare,
4102
4103 ###### declare terminals
4104         $TERM =
4105
4106 ###### SimpleStatement Grammar
4107         | Term = Expression ${
4108                 $0 = b= new(binode);
4109                 b->op = Assign;
4110                 b->left = $<1;
4111                 b->right = $<3;
4112         }$
4113         | VariableDecl = Expression ${
4114                 $0 = b= new(binode);
4115                 b->op = Declare;
4116                 b->left = $<1;
4117                 b->right =$<3;
4118         }$
4119
4120         | VariableDecl ${
4121                 if ($1->var->where_set == NULL) {
4122                         type_err(c,
4123                                  "Variable declared with no type or value: %v",
4124                                  $1, NULL, 0, NULL);
4125                         free_var($1);
4126                 } else {
4127                         $0 = b = new(binode);
4128                         b->op = Declare;
4129                         b->left = $<1;
4130                         b->right = NULL;
4131                 }
4132         }$
4133
4134 ###### print binode cases
4135
4136         case Assign:
4137                 do_indent(indent, "");
4138                 print_exec(b->left, indent, bracket);
4139                 printf(" = ");
4140                 print_exec(b->right, indent, bracket);
4141                 if (indent >= 0)
4142                         printf("\n");
4143                 break;
4144
4145         case Declare:
4146                 {
4147                 struct variable *v = cast(var, b->left)->var;
4148                 do_indent(indent, "");
4149                 print_exec(b->left, indent, bracket);
4150                 if (cast(var, b->left)->var->constant) {
4151                         printf("::");
4152                         if (v->explicit_type) {
4153                                 type_print(v->type, stdout);
4154                                 printf(" ");
4155                         }
4156                 } else {
4157                         printf(":");
4158                         if (v->explicit_type) {
4159                                 type_print(v->type, stdout);
4160                                 printf(" ");
4161                         }
4162                 }
4163                 if (b->right) {
4164                         printf("= ");
4165                         print_exec(b->right, indent, bracket);
4166                 }
4167                 if (indent >= 0)
4168                         printf("\n");
4169                 }
4170                 break;
4171
4172 ###### propagate binode cases
4173
4174         case Assign:
4175         case Declare:
4176                 /* Both must match and not be labels,
4177                  * Type must support 'dup',
4178                  * For Assign, left must not be constant.
4179                  * result is Tnone
4180                  */
4181                 t = propagate_types(b->left, c, ok, NULL,
4182                                     Rnolabel | (b->op == Assign ? Rnoconstant : 0));
4183                 if (!b->right)
4184                         return Tnone;
4185
4186                 if (t) {
4187                         if (propagate_types(b->right, c, ok, t, 0) != t)
4188                                 if (b->left->type == Xvar)
4189                                         type_err(c, "info: variable '%v' was set as %1 here.",
4190                                                  cast(var, b->left)->var->where_set, t, rules, NULL);
4191                 } else {
4192                         t = propagate_types(b->right, c, ok, NULL, Rnolabel);
4193                         if (t)
4194                                 propagate_types(b->left, c, ok, t,
4195                                                 (b->op == Assign ? Rnoconstant : 0));
4196                 }
4197                 if (t && t->dup == NULL && t->name.txt[0] != ' ') // HACK
4198                         type_err(c, "error: cannot assign value of type %1", b, t, 0, NULL);
4199                 return Tnone;
4200
4201                 break;
4202
4203 ###### interp binode cases
4204
4205         case Assign:
4206                 lleft = linterp_exec(c, b->left, &ltype);
4207                 if (lleft)
4208                         dinterp_exec(c, b->right, lleft, ltype, 1);
4209                 ltype = Tnone;
4210                 break;
4211
4212         case Declare:
4213         {
4214                 struct variable *v = cast(var, b->left)->var;
4215                 struct value *val;
4216                 v = v->merged;
4217                 val = var_value(c, v);
4218                 if (v->type->prepare_type)
4219                         v->type->prepare_type(c, v->type, 0);
4220                 if (b->right)
4221                         dinterp_exec(c, b->right, val, v->type, 0);
4222                 else
4223                         val_init(v->type, val);
4224                 break;
4225         }
4226
4227 ### The `use` statement
4228
4229 The `use` statement is the last "simple" statement.  It is needed when a
4230 statement block can return a value.  This includes the body of a
4231 function which has a return type, and the "condition" code blocks in
4232 `if`, `while`, and `switch` statements.
4233
4234 ###### Binode types
4235         Use,
4236
4237 ###### declare terminals
4238         $TERM use
4239
4240 ###### SimpleStatement Grammar
4241         | use Expression ${
4242                 $0 = b = new_pos(binode, $1);
4243                 b->op = Use;
4244                 b->right = $<2;
4245                 if (b->right->type == Xvar) {
4246                         struct var *v = cast(var, b->right);
4247                         if (v->var->type == Tnone) {
4248                                 /* Convert this to a label */
4249                                 struct value *val;
4250
4251                                 v->var->type = Tlabel;
4252                                 val = global_alloc(c, Tlabel, v->var, NULL);
4253                                 val->label = val;
4254                         }
4255                 }
4256         }$
4257
4258 ###### print binode cases
4259
4260         case Use:
4261                 do_indent(indent, "use ");
4262                 print_exec(b->right, -1, bracket);
4263                 if (indent >= 0)
4264                         printf("\n");
4265                 break;
4266
4267 ###### propagate binode cases
4268
4269         case Use:
4270                 /* result matches value */
4271                 return propagate_types(b->right, c, ok, type, 0);
4272
4273 ###### interp binode cases
4274
4275         case Use:
4276                 rv = interp_exec(c, b->right, &rvtype);
4277                 break;
4278
4279 ### The Conditional Statement
4280
4281 This is the biggy and currently the only complex statement.  This
4282 subsumes `if`, `while`, `do/while`, `switch`, and some parts of `for`.
4283 It is comprised of a number of parts, all of which are optional though
4284 set combinations apply.  Each part is (usually) a key word (`then` is
4285 sometimes optional) followed by either an expression or a code block,
4286 except the `casepart` which is a "key word and an expression" followed
4287 by a code block.  The code-block option is valid for all parts and,
4288 where an expression is also allowed, the code block can use the `use`
4289 statement to report a value.  If the code block does not report a value
4290 the effect is similar to reporting `True`.
4291
4292 The `else` and `case` parts, as well as `then` when combined with
4293 `if`, can contain a `use` statement which will apply to some
4294 containing conditional statement. `for` parts, `do` parts and `then`
4295 parts used with `for` can never contain a `use`, except in some
4296 subordinate conditional statement.
4297
4298 If there is a `forpart`, it is executed first, only once.
4299 If there is a `dopart`, then it is executed repeatedly providing
4300 always that the `condpart` or `cond`, if present, does not return a non-True
4301 value.  `condpart` can fail to return any value if it simply executes
4302 to completion.  This is treated the same as returning `True`.
4303
4304 If there is a `thenpart` it will be executed whenever the `condpart`
4305 or `cond` returns True (or does not return any value), but this will happen
4306 *after* `dopart` (when present).
4307
4308 If `elsepart` is present it will be executed at most once when the
4309 condition returns `False` or some value that isn't `True` and isn't
4310 matched by any `casepart`.  If there are any `casepart`s, they will be
4311 executed when the condition returns a matching value.
4312
4313 The particular sorts of values allowed in case parts has not yet been
4314 determined in the language design, so nothing is prohibited.
4315
4316 The various blocks in this complex statement potentially provide scope
4317 for variables as described earlier.  Each such block must include the
4318 "OpenScope" nonterminal before parsing the block, and must call
4319 `var_block_close()` when closing the block.
4320
4321 The code following "`if`", "`switch`" and "`for`" does not get its own
4322 scope, but is in a scope covering the whole statement, so names
4323 declared there cannot be redeclared elsewhere.  Similarly the
4324 condition following "`while`" is in a scope the covers the body
4325 ("`do`" part) of the loop, and which does not allow conditional scope
4326 extension.  Code following "`then`" (both looping and non-looping),
4327 "`else`" and "`case`" each get their own local scope.
4328
4329 The type requirements on the code block in a `whilepart` are quite
4330 unusal.  It is allowed to return a value of some identifiable type, in
4331 which case the loop aborts and an appropriate `casepart` is run, or it
4332 can return a Boolean, in which case the loop either continues to the
4333 `dopart` (on `True`) or aborts and runs the `elsepart` (on `False`).
4334 This is different both from the `ifpart` code block which is expected to
4335 return a Boolean, or the `switchpart` code block which is expected to
4336 return the same type as the casepart values.  The correct analysis of
4337 the type of the `whilepart` code block is the reason for the
4338 `Rboolok` flag which is passed to `propagate_types()`.
4339
4340 The `cond_statement` cannot fit into a `binode` so a new `exec` is
4341 defined.  As there are two scopes which cover multiple parts - one for
4342 the whole statement and one for "while" and "do" - and as we will use
4343 the 'struct exec' to track scopes, we actually need two new types of
4344 exec.  One is a `binode` for the looping part, the rest is the
4345 `cond_statement`.  The `cond_statement` will use an auxilliary `struct
4346 casepart` to track a list of case parts.
4347
4348 ###### Binode types
4349         Loop
4350
4351 ###### exec type
4352         Xcond_statement,
4353
4354 ###### ast
4355         struct casepart {
4356                 struct exec *value;
4357                 struct exec *action;
4358                 struct casepart *next;
4359         };
4360         struct cond_statement {
4361                 struct exec;
4362                 struct exec *forpart, *condpart, *thenpart, *elsepart;
4363                 struct binode *looppart;
4364                 struct casepart *casepart;
4365         };
4366
4367 ###### ast functions
4368
4369         static void free_casepart(struct casepart *cp)
4370         {
4371                 while (cp) {
4372                         struct casepart *t;
4373                         free_exec(cp->value);
4374                         free_exec(cp->action);
4375                         t = cp->next;
4376                         free(cp);
4377                         cp = t;
4378                 }
4379         }
4380
4381         static void free_cond_statement(struct cond_statement *s)
4382         {
4383                 if (!s)
4384                         return;
4385                 free_exec(s->forpart);
4386                 free_exec(s->condpart);
4387                 free_exec(s->looppart);
4388                 free_exec(s->thenpart);
4389                 free_exec(s->elsepart);
4390                 free_casepart(s->casepart);
4391                 free(s);
4392         }
4393
4394 ###### free exec cases
4395         case Xcond_statement: free_cond_statement(cast(cond_statement, e)); break;
4396
4397 ###### ComplexStatement Grammar
4398         | CondStatement ${ $0 = $<1; }$
4399
4400 ###### declare terminals
4401         $TERM for then while do
4402         $TERM else
4403         $TERM switch case
4404
4405 ###### Grammar
4406
4407         $*cond_statement
4408         // A CondStatement must end with EOL, as does CondSuffix and
4409         // IfSuffix.
4410         // ForPart, ThenPart, SwitchPart, CasePart are non-empty and
4411         // may or may not end with EOL
4412         // WhilePart and IfPart include an appropriate Suffix
4413
4414         // ForPart, SwitchPart, and IfPart open scopes, o we have to close
4415         // them.  WhilePart opens and closes its own scope.
4416         CondStatement -> ForPart OptNL ThenPart OptNL WhilePart CondSuffix ${
4417                 $0 = $<CS;
4418                 $0->forpart = $<FP;
4419                 $0->thenpart = $<TP;
4420                 $0->looppart = $<WP;
4421                 var_block_close(c, CloseSequential, $0);
4422         }$
4423         | ForPart OptNL WhilePart CondSuffix ${
4424                 $0 = $<CS;
4425                 $0->forpart = $<FP;
4426                 $0->looppart = $<WP;
4427                 var_block_close(c, CloseSequential, $0);
4428         }$
4429         | WhilePart CondSuffix ${
4430                 $0 = $<CS;
4431                 $0->looppart = $<WP;
4432         }$
4433         | SwitchPart OptNL CasePart CondSuffix ${
4434                 $0 = $<CS;
4435                 $0->condpart = $<SP;
4436                 $CP->next = $0->casepart;
4437                 $0->casepart = $<CP;
4438                 var_block_close(c, CloseSequential, $0);
4439         }$
4440         | SwitchPart : IN OptNL CasePart CondSuffix OUT Newlines ${
4441                 $0 = $<CS;
4442                 $0->condpart = $<SP;
4443                 $CP->next = $0->casepart;
4444                 $0->casepart = $<CP;
4445                 var_block_close(c, CloseSequential, $0);
4446         }$
4447         | IfPart IfSuffix ${
4448                 $0 = $<IS;
4449                 $0->condpart = $IP.condpart; $IP.condpart = NULL;
4450                 $0->thenpart = $IP.thenpart; $IP.thenpart = NULL;
4451                 // This is where we close an "if" statement
4452                 var_block_close(c, CloseSequential, $0);
4453         }$
4454
4455         CondSuffix -> IfSuffix ${
4456                 $0 = $<1;
4457         }$
4458         | Newlines CasePart CondSuffix ${
4459                 $0 = $<CS;
4460                 $CP->next = $0->casepart;
4461                 $0->casepart = $<CP;
4462         }$
4463         | CasePart CondSuffix ${
4464                 $0 = $<CS;
4465                 $CP->next = $0->casepart;
4466                 $0->casepart = $<CP;
4467         }$
4468
4469         IfSuffix -> Newlines ${ $0 = new(cond_statement); }$
4470         | Newlines ElsePart ${ $0 = $<EP; }$
4471         | ElsePart ${$0 = $<EP; }$
4472
4473         ElsePart -> else OpenBlock Newlines ${
4474                 $0 = new(cond_statement);
4475                 $0->elsepart = $<OB;
4476                 var_block_close(c, CloseElse, $0->elsepart);
4477         }$
4478         | else OpenScope CondStatement ${
4479                 $0 = new(cond_statement);
4480                 $0->elsepart = $<CS;
4481                 var_block_close(c, CloseElse, $0->elsepart);
4482         }$
4483
4484         $*casepart
4485         CasePart -> case Expression OpenScope ColonBlock ${
4486                 $0 = calloc(1,sizeof(struct casepart));
4487                 $0->value = $<Ex;
4488                 $0->action = $<Bl;
4489                 var_block_close(c, CloseParallel, $0->action);
4490         }$
4491
4492         $*exec
4493         // These scopes are closed in CondStatement
4494         ForPart -> for OpenBlock ${
4495                 $0 = $<Bl;
4496         }$
4497
4498         ThenPart -> then OpenBlock ${
4499                 $0 = $<OB;
4500                 var_block_close(c, CloseSequential, $0);
4501         }$
4502
4503         $*binode
4504         // This scope is closed in CondStatement
4505         WhilePart -> while UseBlock OptNL do OpenBlock ${
4506                 $0 = new(binode);
4507                 $0->op = Loop;
4508                 $0->left = $<UB;
4509                 $0->right = $<OB;
4510                 var_block_close(c, CloseSequential, $0->right);
4511                 var_block_close(c, CloseSequential, $0);
4512         }$
4513         | while OpenScope Expression OpenScope ColonBlock ${
4514                 $0 = new(binode);
4515                 $0->op = Loop;
4516                 $0->left = $<Exp;
4517                 $0->right = $<CB;
4518                 var_block_close(c, CloseSequential, $0->right);
4519                 var_block_close(c, CloseSequential, $0);
4520         }$
4521
4522         $cond_statement
4523         IfPart -> if UseBlock OptNL then OpenBlock ${
4524                 $0.condpart = $<UB;
4525                 $0.thenpart = $<OB;
4526                 var_block_close(c, CloseParallel, $0.thenpart);
4527         }$
4528         | if OpenScope Expression OpenScope ColonBlock ${
4529                 $0.condpart = $<Ex;
4530                 $0.thenpart = $<CB;
4531                 var_block_close(c, CloseParallel, $0.thenpart);
4532         }$
4533         | if OpenScope Expression OpenScope OptNL then Block ${
4534                 $0.condpart = $<Ex;
4535                 $0.thenpart = $<Bl;
4536                 var_block_close(c, CloseParallel, $0.thenpart);
4537         }$
4538
4539         $*exec
4540         // This scope is closed in CondStatement
4541         SwitchPart -> switch OpenScope Expression ${
4542                 $0 = $<Ex;
4543         }$
4544         | switch UseBlock ${
4545                 $0 = $<Bl;
4546         }$
4547
4548 ###### print binode cases
4549         case Loop:
4550                 if (b->left && b->left->type == Xbinode &&
4551                     cast(binode, b->left)->op == Block) {
4552                         if (bracket)
4553                                 do_indent(indent, "while {\n");
4554                         else
4555                                 do_indent(indent, "while\n");
4556                         print_exec(b->left, indent+1, bracket);
4557                         if (bracket)
4558                                 do_indent(indent, "} do {\n");
4559                         else
4560                                 do_indent(indent, "do\n");
4561                         print_exec(b->right, indent+1, bracket);
4562                         if (bracket)
4563                                 do_indent(indent, "}\n");
4564                 } else {
4565                         do_indent(indent, "while ");
4566                         print_exec(b->left, 0, bracket);
4567                         if (bracket)
4568                                 printf(" {\n");
4569                         else
4570                                 printf(":\n");
4571                         print_exec(b->right, indent+1, bracket);
4572                         if (bracket)
4573                                 do_indent(indent, "}\n");
4574                 }
4575                 break;
4576
4577 ###### print exec cases
4578
4579         case Xcond_statement:
4580         {
4581                 struct cond_statement *cs = cast(cond_statement, e);
4582                 struct casepart *cp;
4583                 if (cs->forpart) {
4584                         do_indent(indent, "for");
4585                         if (bracket) printf(" {\n"); else printf("\n");
4586                         print_exec(cs->forpart, indent+1, bracket);
4587                         if (cs->thenpart) {
4588                                 if (bracket)
4589                                         do_indent(indent, "} then {\n");
4590                                 else
4591                                         do_indent(indent, "then\n");
4592                                 print_exec(cs->thenpart, indent+1, bracket);
4593                         }
4594                         if (bracket) do_indent(indent, "}\n");
4595                 }
4596                 if (cs->looppart) {
4597                         print_exec(cs->looppart, indent, bracket);
4598                 } else {
4599                         // a condition
4600                         if (cs->casepart)
4601                                 do_indent(indent, "switch");
4602                         else
4603                                 do_indent(indent, "if");
4604                         if (cs->condpart && cs->condpart->type == Xbinode &&
4605                             cast(binode, cs->condpart)->op == Block) {
4606                                 if (bracket)
4607                                         printf(" {\n");
4608                                 else
4609                                         printf("\n");
4610                                 print_exec(cs->condpart, indent+1, bracket);
4611                                 if (bracket)
4612                                         do_indent(indent, "}\n");
4613                                 if (cs->thenpart) {
4614                                         do_indent(indent, "then\n");
4615                                         print_exec(cs->thenpart, indent+1, bracket);
4616                                 }
4617                         } else {
4618                                 printf(" ");
4619                                 print_exec(cs->condpart, 0, bracket);
4620                                 if (cs->thenpart) {
4621                                         if (bracket)
4622                                                 printf(" {\n");
4623                                         else
4624                                                 printf(":\n");
4625                                         print_exec(cs->thenpart, indent+1, bracket);
4626                                         if (bracket)
4627                                                 do_indent(indent, "}\n");
4628                                 } else
4629                                         printf("\n");
4630                         }
4631                 }
4632                 for (cp = cs->casepart; cp; cp = cp->next) {
4633                         do_indent(indent, "case ");
4634                         print_exec(cp->value, -1, 0);
4635                         if (bracket)
4636                                 printf(" {\n");
4637                         else
4638                                 printf(":\n");
4639                         print_exec(cp->action, indent+1, bracket);
4640                         if (bracket)
4641                                 do_indent(indent, "}\n");
4642                 }
4643                 if (cs->elsepart) {
4644                         do_indent(indent, "else");
4645                         if (bracket)
4646                                 printf(" {\n");
4647                         else
4648                                 printf("\n");
4649                         print_exec(cs->elsepart, indent+1, bracket);
4650                         if (bracket)
4651                                 do_indent(indent, "}\n");
4652                 }
4653                 break;
4654         }
4655
4656 ###### propagate binode cases
4657         case Loop:
4658                 t = propagate_types(b->right, c, ok, Tnone, 0);
4659                 if (!type_compat(Tnone, t, 0))
4660                         *ok = 0;        // UNTESTED
4661                 return propagate_types(b->left, c, ok, type, rules);
4662
4663 ###### propagate exec cases
4664         case Xcond_statement:
4665         {
4666                 // forpart and looppart->right must return Tnone
4667                 // thenpart must return Tnone if there is a loopart,
4668                 // otherwise it is like elsepart.
4669                 // condpart must:
4670                 //    be bool if there is no casepart
4671                 //    match casepart->values if there is a switchpart
4672                 //    either be bool or match casepart->value if there
4673                 //             is a whilepart
4674                 // elsepart and casepart->action must match the return type
4675                 //   expected of this statement.
4676                 struct cond_statement *cs = cast(cond_statement, prog);
4677                 struct casepart *cp;
4678
4679                 t = propagate_types(cs->forpart, c, ok, Tnone, 0);
4680                 if (!type_compat(Tnone, t, 0))
4681                         *ok = 0;        // UNTESTED
4682
4683                 if (cs->looppart) {
4684                         t = propagate_types(cs->thenpart, c, ok, Tnone, 0);
4685                         if (!type_compat(Tnone, t, 0))
4686                                 *ok = 0;        // UNTESTED
4687                 }
4688                 if (cs->casepart == NULL) {
4689                         propagate_types(cs->condpart, c, ok, Tbool, 0);
4690                         propagate_types(cs->looppart, c, ok, Tbool, 0);
4691                 } else {
4692                         /* Condpart must match case values, with bool permitted */
4693                         t = NULL;
4694                         for (cp = cs->casepart;
4695                              cp && !t; cp = cp->next)
4696                                 t = propagate_types(cp->value, c, ok, NULL, 0);
4697                         if (!t && cs->condpart)
4698                                 t = propagate_types(cs->condpart, c, ok, NULL, Rboolok);        // UNTESTED
4699                         if (!t && cs->looppart)
4700                                 t = propagate_types(cs->looppart, c, ok, NULL, Rboolok);        // UNTESTED
4701                         // Now we have a type (I hope) push it down
4702                         if (t) {
4703                                 for (cp = cs->casepart; cp; cp = cp->next)
4704                                         propagate_types(cp->value, c, ok, t, 0);
4705                                 propagate_types(cs->condpart, c, ok, t, Rboolok);
4706                                 propagate_types(cs->looppart, c, ok, t, Rboolok);
4707                         }
4708                 }
4709                 // (if)then, else, and case parts must return expected type.
4710                 if (!cs->looppart && !type)
4711                         type = propagate_types(cs->thenpart, c, ok, NULL, rules);
4712                 if (!type)
4713                         type = propagate_types(cs->elsepart, c, ok, NULL, rules);
4714                 for (cp = cs->casepart;
4715                      cp && !type;
4716                      cp = cp->next)     // UNTESTED
4717                         type = propagate_types(cp->action, c, ok, NULL, rules); // UNTESTED
4718                 if (type) {
4719                         if (!cs->looppart)
4720                                 propagate_types(cs->thenpart, c, ok, type, rules);
4721                         propagate_types(cs->elsepart, c, ok, type, rules);
4722                         for (cp = cs->casepart; cp ; cp = cp->next)
4723                                 propagate_types(cp->action, c, ok, type, rules);
4724                         return type;
4725                 } else
4726                         return NULL;
4727         }
4728
4729 ###### interp binode cases
4730         case Loop:
4731                 // This just performs one iterration of the loop
4732                 rv = interp_exec(c, b->left, &rvtype);
4733                 if (rvtype == Tnone ||
4734                     (rvtype == Tbool && rv.bool != 0))
4735                         // rvtype is Tnone or Tbool, doesn't need to be freed
4736                         interp_exec(c, b->right, NULL);
4737                 break;
4738
4739 ###### interp exec cases
4740         case Xcond_statement:
4741         {
4742                 struct value v, cnd;
4743                 struct type *vtype, *cndtype;
4744                 struct casepart *cp;
4745                 struct cond_statement *cs = cast(cond_statement, e);
4746
4747                 if (cs->forpart)
4748                         interp_exec(c, cs->forpart, NULL);
4749                 if (cs->looppart) {
4750                         while ((cnd = interp_exec(c, cs->looppart, &cndtype)),
4751                                cndtype == Tnone || (cndtype == Tbool && cnd.bool != 0))
4752                                 interp_exec(c, cs->thenpart, NULL);
4753                 } else {
4754                         cnd = interp_exec(c, cs->condpart, &cndtype);
4755                         if ((cndtype == Tnone ||
4756                             (cndtype == Tbool && cnd.bool != 0))) {
4757                                 // cnd is Tnone or Tbool, doesn't need to be freed
4758                                 rv = interp_exec(c, cs->thenpart, &rvtype);
4759                                 // skip else (and cases)
4760                                 goto Xcond_done;
4761                         }
4762                 }
4763                 for (cp = cs->casepart; cp; cp = cp->next) {
4764                         v = interp_exec(c, cp->value, &vtype);
4765                         if (value_cmp(cndtype, vtype, &v, &cnd) == 0) {
4766                                 free_value(vtype, &v);
4767                                 free_value(cndtype, &cnd);
4768                                 rv = interp_exec(c, cp->action, &rvtype);
4769                                 goto Xcond_done;
4770                         }
4771                         free_value(vtype, &v);
4772                 }
4773                 free_value(cndtype, &cnd);
4774                 if (cs->elsepart)
4775                         rv = interp_exec(c, cs->elsepart, &rvtype);
4776                 else
4777                         rvtype = Tnone;
4778         Xcond_done:
4779                 break;
4780         }
4781
4782 ### Top level structure
4783
4784 All the language elements so far can be used in various places.  Now
4785 it is time to clarify what those places are.
4786
4787 At the top level of a file there will be a number of declarations.
4788 Many of the things that can be declared haven't been described yet,
4789 such as functions, procedures, imports, and probably more.
4790 For now there are two sorts of things that can appear at the top
4791 level.  They are predefined constants, `struct` types, and the `main`
4792 function.  While the syntax will allow the `main` function to appear
4793 multiple times, that will trigger an error if it is actually attempted.
4794
4795 The various declarations do not return anything.  They store the
4796 various declarations in the parse context.
4797
4798 ###### Parser: grammar
4799
4800         $void
4801         Ocean -> OptNL DeclarationList
4802
4803         ## declare terminals
4804
4805         OptNL ->
4806         | OptNL NEWLINE
4807
4808         Newlines -> NEWLINE
4809         | Newlines NEWLINE
4810
4811         DeclarationList -> Declaration
4812         | DeclarationList Declaration
4813
4814         Declaration -> ERROR Newlines ${
4815                 tok_err(c,      // UNTESTED
4816                         "error: unhandled parse error", &$1);
4817         }$
4818         | DeclareConstant
4819         | DeclareFunction
4820         | DeclareStruct
4821
4822         ## top level grammar
4823
4824         ## Grammar
4825
4826 ### The `const` section
4827
4828 As well as being defined in with the code that uses them, constants
4829 can be declared at the top level.  These have full-file scope, so they
4830 are always `InScope`.  The value of a top level constant can be given
4831 as an expression, and this is evaluated immediately rather than in the
4832 later interpretation stage.  Once we add functions to the language, we
4833 will need rules concern which, if any, can be used to define a top
4834 level constant.
4835
4836 Constants are defined in a section that starts with the reserved word
4837 `const` and then has a block with a list of assignment statements.
4838 For syntactic consistency, these must use the double-colon syntax to
4839 make it clear that they are constants.  Type can also be given: if
4840 not, the type will be determined during analysis, as with other
4841 constants.
4842
4843 As the types constants are inserted at the head of a list, printing
4844 them in the same order that they were read is not straight forward.
4845 We take a quadratic approach here and count the number of constants
4846 (variables of depth 0), then count down from there, each time
4847 searching through for the Nth constant for decreasing N.
4848
4849 ###### top level grammar
4850
4851         $TERM const
4852
4853         DeclareConstant -> const { IN OptNL ConstList OUT OptNL } Newlines
4854         | const { SimpleConstList } Newlines
4855         | const IN OptNL ConstList OUT Newlines
4856         | const SimpleConstList Newlines
4857
4858         ConstList -> ConstList SimpleConstLine
4859         | SimpleConstLine
4860
4861         SimpleConstList -> SimpleConstList ; Const
4862         | Const
4863         | SimpleConstList ;
4864
4865         SimpleConstLine -> SimpleConstList Newlines
4866         | ERROR Newlines ${ tok_err(c, "Syntax error in constant", &$1); }$
4867
4868         $*type
4869         CType -> Type   ${ $0 = $<1; }$
4870         |               ${ $0 = NULL; }$
4871
4872         $void
4873         Const -> IDENTIFIER :: CType = Expression ${ {
4874                 int ok;
4875                 struct variable *v;
4876
4877                 v = var_decl(c, $1.txt);
4878                 if (v) {
4879                         struct var *var = new_pos(var, $1);
4880                         v->where_decl = var;
4881                         v->where_set = var;
4882                         var->var = v;
4883                         v->constant = 1;
4884                         v->global = 1;
4885                 } else {
4886                         struct variable *vorig = var_ref(c, $1.txt);
4887                         tok_err(c, "error: name already declared", &$1);
4888                         type_err(c, "info: this is where '%v' was first declared",
4889                                  vorig->where_decl, NULL, 0, NULL);
4890                 }
4891                 do {
4892                         ok = 1;
4893                         propagate_types($5, c, &ok, $3, 0);
4894                 } while (ok == 2);
4895                 if (!ok)
4896                         c->parse_error = 1;
4897                 else if (v) {
4898                         struct value res = interp_exec(c, $5, &v->type);
4899                         global_alloc(c, v->type, v, &res);
4900                 }
4901         } }$
4902
4903 ###### print const decls
4904         {
4905                 struct variable *v;
4906                 int target = -1;
4907
4908                 while (target != 0) {
4909                         int i = 0;
4910                         for (v = context.in_scope; v; v=v->in_scope)
4911                                 if (v->depth == 0 && v->constant) {
4912                                         i += 1;
4913                                         if (i == target)
4914                                                 break;
4915                                 }
4916
4917                         if (target == -1) {
4918                                 if (i)
4919                                         printf("const\n");
4920                                 target = i;
4921                         } else {
4922                                 struct value *val = var_value(&context, v);
4923                                 printf("    %.*s :: ", v->name->name.len, v->name->name.txt);
4924                                 type_print(v->type, stdout);
4925                                 printf(" = ");
4926                                 if (v->type == Tstr)
4927                                         printf("\"");
4928                                 print_value(v->type, val, stdout);
4929                                 if (v->type == Tstr)
4930                                         printf("\"");
4931                                 printf("\n");
4932                                 target -= 1;
4933                         }
4934                 }
4935         }
4936
4937 ### Function declarations
4938
4939 The code in an Ocean program is all stored in function declarations.
4940 One of the functions must be named `main` and it must accept an array of
4941 strings as a parameter - the command line arguments.
4942
4943 As this is the top level, several things are handled a bit differently.
4944 The function is not interpreted by `interp_exec` as that isn't passed
4945 the argument list which the program requires.  Similarly type analysis
4946 is a bit more interesting at this level.
4947
4948 ###### ast functions
4949
4950         static struct type *handle_results(struct parse_context *c,
4951                                            struct binode *results)
4952         {
4953                 /* Create a 'struct' type from the results list, which
4954                  * is a list for 'struct var'
4955                  */
4956                 struct type *t = add_anon_type(c, &structure_prototype,
4957                                                " function result");
4958                 int cnt = 0;
4959                 struct binode *b;
4960
4961                 for (b = results; b; b = cast(binode, b->right))
4962                         cnt += 1;
4963                 t->structure.nfields = cnt;
4964                 t->structure.fields = calloc(cnt, sizeof(struct field));
4965                 cnt = 0;
4966                 for (b = results; b; b = cast(binode, b->right)) {
4967                         struct var *v = cast(var, b->left);
4968                         struct field *f = &t->structure.fields[cnt++];
4969                         int a = v->var->type->align;
4970                         f->name = v->var->name->name;
4971                         f->type = v->var->type;
4972                         f->init = NULL;
4973                         f->offset = t->size;
4974                         v->var->frame_pos = f->offset;
4975                         t->size += ((f->type->size - 1) | (a-1)) + 1;
4976                         if (a > t->align)
4977                                 t->align = a;
4978                         variable_unlink_exec(v->var);
4979                 }
4980                 free_binode(results);
4981                 return t;
4982         }
4983
4984         static struct variable *declare_function(struct parse_context *c,
4985                                                 struct variable *name,
4986                                                 struct binode *args,
4987                                                 struct type *ret,
4988                                                 struct binode *results,
4989                                                 struct exec *code)
4990         {
4991                 if (name) {
4992                         struct value fn = {.function = code};
4993                         struct type *t;
4994                         var_block_close(c, CloseFunction, code);
4995                         t = add_anon_type(c, &function_prototype,
4996                                           "func %.*s", name->name->name.len,
4997                                           name->name->name.txt);
4998                         name->type = t;
4999                         t->function.params = reorder_bilist(args);
5000                         if (!ret) {
5001                                 ret = handle_results(c, reorder_bilist(results));
5002                                 t->function.inline_result = 1;
5003                                 t->function.local_size = ret->size;
5004                         }
5005                         t->function.return_type = ret;
5006                         global_alloc(c, t, name, &fn);
5007                         name->type->function.scope = c->out_scope;
5008                 } else {
5009                         free_binode(args);
5010                         free_type(ret);
5011                         free_exec(code);
5012                         var_block_close(c, CloseFunction, NULL);
5013                 }
5014                 c->out_scope = NULL;
5015                 return name;
5016         }
5017
5018 ###### declare terminals
5019         $TERM return
5020
5021 ###### top level grammar
5022
5023         $*variable
5024         DeclareFunction -> func FuncName ( OpenScope ArgsLine ) Block Newlines ${
5025                 $0 = declare_function(c, $<FN, $<Ar, Tnone, NULL, $<Bl);
5026         }$
5027         | func FuncName IN OpenScope Args OUT OptNL do Block Newlines ${
5028                 $0 = declare_function(c, $<FN, $<Ar, Tnone, NULL, $<Bl);
5029         }$
5030         | func FuncName NEWLINE OpenScope OptNL do Block Newlines ${
5031                 $0 = declare_function(c, $<FN, NULL, Tnone, NULL, $<Bl);
5032         }$
5033         | func FuncName ( OpenScope ArgsLine ) : Type Block Newlines ${
5034                 $0 = declare_function(c, $<FN, $<Ar, $<Ty, NULL, $<Bl);
5035         }$
5036         | func FuncName ( OpenScope ArgsLine ) : ( ArgsLine ) Block Newlines ${
5037                 $0 = declare_function(c, $<FN, $<AL, NULL, $<AL2, $<Bl);
5038         }$
5039         | func FuncName IN OpenScope Args OUT OptNL return Type Newlines do Block Newlines ${
5040                 $0 = declare_function(c, $<FN, $<Ar, $<Ty, NULL, $<Bl);
5041         }$
5042         | func FuncName NEWLINE OpenScope return Type Newlines do Block Newlines ${
5043                 $0 = declare_function(c, $<FN, NULL, $<Ty, NULL, $<Bl);
5044         }$
5045         | func FuncName IN OpenScope Args OUT OptNL return IN Args OUT OptNL do Block Newlines ${
5046                 $0 = declare_function(c, $<FN, $<Ar, NULL, $<Ar2, $<Bl);
5047         }$
5048         | func FuncName NEWLINE OpenScope return IN Args OUT OptNL do Block Newlines ${
5049                 $0 = declare_function(c, $<FN, NULL, NULL, $<Ar, $<Bl);
5050         }$
5051
5052 ###### print func decls
5053         {
5054                 struct variable *v;
5055                 int target = -1;
5056
5057                 while (target != 0) {
5058                         int i = 0;
5059                         for (v = context.in_scope; v; v=v->in_scope)
5060                                 if (v->depth == 0 && v->type && v->type->check_args) {
5061                                         i += 1;
5062                                         if (i == target)
5063                                                 break;
5064                                 }
5065
5066                         if (target == -1) {
5067                                 target = i;
5068                         } else {
5069                                 struct value *val = var_value(&context, v);
5070                                 printf("func %.*s", v->name->name.len, v->name->name.txt);
5071                                 v->type->print_type_decl(v->type, stdout);
5072                                 if (brackets)
5073                                         print_exec(val->function, 0, brackets);
5074                                 else
5075                                         print_value(v->type, val, stdout);
5076                                 printf("/* frame size %d */\n", v->type->function.local_size);
5077                                 target -= 1;
5078                         }
5079                 }
5080         }
5081
5082 ###### core functions
5083
5084         static int analyse_funcs(struct parse_context *c)
5085         {
5086                 struct variable *v;
5087                 int all_ok = 1;
5088                 for (v = c->in_scope; v; v = v->in_scope) {
5089                         struct value *val;
5090                         struct type *ret;
5091                         int ok = 1;
5092                         if (v->depth != 0 || !v->type || !v->type->check_args)
5093                                 continue;
5094                         ret = v->type->function.inline_result ?
5095                                 Tnone : v->type->function.return_type;
5096                         val = var_value(c, v);
5097                         do {
5098                                 ok = 1;
5099                                 propagate_types(val->function, c, &ok, ret, 0);
5100                         } while (ok == 2);
5101                         if (ok)
5102                                 /* Make sure everything is still consistent */
5103                                 propagate_types(val->function, c, &ok, ret, 0);
5104                         if (!ok)
5105                                 all_ok = 0;
5106                         if (!v->type->function.inline_result &&
5107                             !v->type->function.return_type->dup) {
5108                                 type_err(c, "error: function cannot return value of type %1",
5109                                          v->where_decl, v->type->function.return_type, 0, NULL);
5110                         }
5111
5112                         scope_finalize(c, v->type);
5113                 }
5114                 return all_ok;
5115         }
5116
5117         static int analyse_main(struct type *type, struct parse_context *c)
5118         {
5119                 struct binode *bp = type->function.params;
5120                 struct binode *b;
5121                 int ok = 1;
5122                 int arg = 0;
5123                 struct type *argv_type;
5124
5125                 argv_type = add_anon_type(c, &array_prototype, "argv");
5126                 argv_type->array.member = Tstr;
5127                 argv_type->array.unspec = 1;
5128
5129                 for (b = bp; b; b = cast(binode, b->right)) {
5130                         ok = 1;
5131                         switch (arg++) {
5132                         case 0: /* argv */
5133                                 propagate_types(b->left, c, &ok, argv_type, 0);
5134                                 break;
5135                         default: /* invalid */  // NOTEST
5136                                 propagate_types(b->left, c, &ok, Tnone, 0);     // NOTEST
5137                         }
5138                         if (!ok)
5139                                 c->parse_error = 1;
5140                 }
5141
5142                 return !c->parse_error;
5143         }
5144
5145         static void interp_main(struct parse_context *c, int argc, char **argv)
5146         {
5147                 struct value *progp = NULL;
5148                 struct text main_name = { "main", 4 };
5149                 struct variable *mainv;
5150                 struct binode *al;
5151                 int anum = 0;
5152                 struct value v;
5153                 struct type *vtype;
5154
5155                 mainv = var_ref(c, main_name);
5156                 if (mainv)
5157                         progp = var_value(c, mainv);
5158                 if (!progp || !progp->function) {
5159                         fprintf(stderr, "oceani: no main function found.\n");
5160                         c->parse_error = 1;
5161                         return;
5162                 }
5163                 if (!analyse_main(mainv->type, c)) {
5164                         fprintf(stderr, "oceani: main has wrong type.\n");
5165                         c->parse_error = 1;
5166                         return;
5167                 }
5168                 al = mainv->type->function.params;
5169
5170                 c->local_size = mainv->type->function.local_size;
5171                 c->local = calloc(1, c->local_size);
5172                 while (al) {
5173                         struct var *v = cast(var, al->left);
5174                         struct value *vl = var_value(c, v->var);
5175                         struct value arg;
5176                         struct type *t;
5177                         mpq_t argcq;
5178                         int i;
5179
5180                         switch (anum++) {
5181                         case 0: /* argv */
5182                                 t = v->var->type;
5183                                 mpq_init(argcq);
5184                                 mpq_set_ui(argcq, argc, 1);
5185                                 memcpy(var_value(c, t->array.vsize), &argcq, sizeof(argcq));
5186                                 t->prepare_type(c, t, 0);
5187                                 array_init(v->var->type, vl);
5188                                 for (i = 0; i < argc; i++) {
5189                                         struct value *vl2 = vl->array + i * v->var->type->array.member->size;
5190
5191                                         arg.str.txt = argv[i];
5192                                         arg.str.len = strlen(argv[i]);
5193                                         free_value(Tstr, vl2);
5194                                         dup_value(Tstr, &arg, vl2);
5195                                 }
5196                                 break;
5197                         }
5198                         al = cast(binode, al->right);
5199                 }
5200                 v = interp_exec(c, progp->function, &vtype);
5201                 free_value(vtype, &v);
5202                 free(c->local);
5203                 c->local = NULL;
5204         }
5205
5206 ###### ast functions
5207         void free_variable(struct variable *v)
5208         {
5209         }
5210
5211 ## And now to test it out.
5212
5213 Having a language requires having a "hello world" program.  I'll
5214 provide a little more than that: a program that prints "Hello world"
5215 finds the GCD of two numbers, prints the first few elements of
5216 Fibonacci, performs a binary search for a number, and a few other
5217 things which will likely grow as the languages grows.
5218
5219 ###### File: oceani.mk
5220         demos :: sayhello
5221         sayhello : oceani
5222                 @echo "===== DEMO ====="
5223                 ./oceani --section "demo: hello" oceani.mdc 55 33
5224
5225 ###### demo: hello
5226
5227         const
5228                 pi ::= 3.141_592_6
5229                 four ::= 2 + 2 ; five ::= 10/2
5230         const pie ::= "I like Pie";
5231                 cake ::= "The cake is"
5232                   ++ " a lie"
5233
5234         struct fred
5235                 size:[four]number
5236                 name:string
5237                 alive:Boolean
5238
5239         func main(argv:[argc::]string)
5240                 print "Hello World, what lovely oceans you have!"
5241                 print "Are there", five, "?"
5242                 print pi, pie, "but", cake
5243
5244                 A := $argv[1]; B := $argv[2]
5245
5246                 /* When a variable is defined in both branches of an 'if',
5247                  * and used afterwards, the variables are merged.
5248                  */
5249                 if A > B:
5250                         bigger := "yes"
5251                 else
5252                         bigger := "no"
5253                 print "Is", A, "bigger than", B,"? ", bigger
5254                 /* If a variable is not used after the 'if', no
5255                  * merge happens, so types can be different
5256                  */
5257                 if A > B * 2:
5258                         double:string = "yes"
5259                         print A, "is more than twice", B, "?", double
5260                 else
5261                         double := B*2
5262                         print "double", B, "is", double
5263
5264                 a : number
5265                 a = A;
5266                 b:number = B
5267                 if a > 0 and then b > 0:
5268                         while a != b:
5269                                 if a < b:
5270                                         b = b - a
5271                                 else
5272                                         a = a - b
5273                         print "GCD of", A, "and", B,"is", a
5274                 else if a <= 0:
5275                         print a, "is not positive, cannot calculate GCD"
5276                 else
5277                         print b, "is not positive, cannot calculate GCD"
5278
5279                 for
5280                         togo := 10
5281                         f1 := 1; f2 := 1
5282                         print "Fibonacci:", f1,f2,
5283                 then togo = togo - 1
5284                 while togo > 0:
5285                         f3 := f1 + f2
5286                         print "", f3,
5287                         f1 = f2
5288                         f2 = f3
5289                 print ""
5290
5291                 /* Binary search... */
5292                 for
5293                         lo:= 0; hi := 100
5294                         target := 77
5295                 while
5296                         mid := (lo + hi) / 2
5297                         if mid == target:
5298                                 use Found
5299                         if mid < target:
5300                                 lo = mid
5301                         else
5302                                 hi = mid
5303                         if hi - lo < 1:
5304                                 lo = mid
5305                                 use GiveUp
5306                         use True
5307                 do pass
5308                 case Found:
5309                         print "Yay, I found", target
5310                 case GiveUp:
5311                         print "Closest I found was", lo
5312
5313                 size::= 10
5314                 list:[size]number
5315                 list[0] = 1234
5316                 // "middle square" PRNG.  Not particularly good, but one my
5317                 // Dad taught me - the first one I ever heard of.
5318                 for i:=1; then i = i + 1; while i < size:
5319                         n := list[i-1] * list[i-1]
5320                         list[i] = (n / 100) % 10 000
5321
5322                 print "Before sort:",
5323                 for i:=0; then i = i + 1; while i < size:
5324                         print "", list[i],
5325                 print
5326
5327                 for i := 1; then i=i+1; while i < size:
5328                         for j:=i-1; then j=j-1; while j >= 0:
5329                                 if list[j] > list[j+1]:
5330                                         t:= list[j]
5331                                         list[j] = list[j+1]
5332                                         list[j+1] = t
5333                 print " After sort:",
5334                 for i:=0; then i = i + 1; while i < size:
5335                         print "", list[i],
5336                 print
5337
5338                 if 1 == 2 then print "yes"; else print "no"
5339
5340                 bob:fred
5341                 bob.name = "Hello"
5342                 bob.alive = (bob.name == "Hello")
5343                 print "bob", "is" if  bob.alive else "isn't", "alive"