ocean-lang.org Git - ocean/blob - csrc/oceani.mdc

   1 # Ocean Interpreter - Jamison Creek version
   2
   3 Ocean is intended to be a compiled language, so this interpreter is
   4 not targeted at being the final product.  It is, rather, an intermediate
   5 stage and fills that role in two distinct ways.
   6
   7 Firstly, it exists as a platform to experiment with the early language
   8 design.  An interpreter is easy to write and easy to get working, so
   9 the barrier for entry is lower if I aim to start with an interpreter.
  10
  11 Secondly, the plan for the Ocean compiler is to write it in the
  12 [Ocean language](http://ocean-lang.org).  To achieve this we naturally
  13 need some sort of boot-strap process and this interpreter - written in
  14 portable C - will fill that role.  It will be used to bootstrap the
  15 Ocean compiler.
  16
  17 Two features that are not needed to fill either of these roles are
  18 performance and completeness.  The interpreter only needs to be fast
  19 enough to run small test programs and occasionally to run the compiler
  20 on itself.  It only needs to be complete enough to test aspects of the
  21 design which are developed before the compiler is working, and to run
  22 the compiler on itself.  Any features not used by the compiler when
  23 compiling itself are superfluous.  They may be included anyway, but
  24 they may not.
  25
  26 Nonetheless, the interpreter should end up being reasonably complete,
  27 and any performance bottlenecks which appear and are easily fixed, will
  28 be.
  29
  30 ## Current version
  31
  32 This third version of the interpreter exists to test out some initial
  33 ideas relating to types.  Particularly it adds arrays (indexed from
  34 zero) and simple structures.  Basic control flow and variable scoping
  35 are already fairly well established, as are basic numerical and
  36 boolean operators.
  37
  38 Some operators that have only recently been added, and so have not
  39 generated all that much experience yet are "and then" and "or else" as
  40 short-circuit Boolean operators, and the "if ... else" trinary
  41 operator which can select between two expressions based on a third
  42 (which appears syntactically in the middle).
  43
  44 The "func" clause currently only allows a "main" function to be
  45 declared.  That will be extended when proper function support is added.
  46
  47 An element that is present purely to make a usable language, and
  48 without any expectation that they will remain, is the "print" statement
  49 which performs simple output.
  50
  51 The current scalar types are "number", "Boolean", and "string".
  52 Boolean will likely stay in its current form, the other two might, but
  53 could just as easily be changed.
  54
  55 ## Naming
  56
  57 Versions of the interpreter which obviously do not support a complete
  58 language will be named after creeks and streams.  This one is Jamison
  59 Creek.
  60
  61 Once we have something reasonably resembling a complete language, the
  62 names of rivers will be used.
  63 Early versions of the compiler will be named after seas.  Major
  64 releases of the compiler will be named after oceans.  Hopefully I will
  65 be finished once I get to the Pacific Ocean release.
  66
  67 ## Outline
  68
  69 As well as parsing and executing a program, the interpreter can print
  70 out the program from the parsed internal structure.  This is useful
  71 for validating the parsing.
  72 So the main requirements of the interpreter are:
  73
  74 - Parse the program, possibly with tracing,
  75 - Analyse the parsed program to ensure consistency,
  76 - Print the program,
  77 - Execute the "main" function in the program, if no parsing or
  78   consistency errors were found.
  79
  80 This is all performed by a single C program extracted with
  81 `parsergen`.
  82
  83 There will be two formats for printing the program: a default and one
  84 that uses bracketing.  So a `--bracket` command line option is needed
  85 for that.  Normally the first code section found is used, however an
  86 alternate section can be requested so that a file (such as this one)
  87 can contain multiple programs.  This is effected with the `--section`
  88 option.
  89
  90 This code must be compiled with `-fplan9-extensions` so that anonymous
  91 structures can be used.
  92
  93 ###### File: oceani.mk
  94
  95         myCFLAGS := -Wall -g -fplan9-extensions
  96         CFLAGS := $(filter-out $(myCFLAGS),$(CFLAGS)) $(myCFLAGS)
  97         myLDLIBS:= libparser.o libscanner.o libmdcode.o -licuuc
  98         LDLIBS := $(filter-out $(myLDLIBS),$(LDLIBS)) $(myLDLIBS)
  99         ## libs
 100         all :: $(LDLIBS) oceani
 101         oceani.c oceani.h : oceani.mdc parsergen
 102                 ./parsergen -o oceani --LALR --tag Parser oceani.mdc
 103         oceani.mk: oceani.mdc md2c
 104                 ./md2c oceani.mdc
 105
 106         oceani: oceani.o $(LDLIBS)
 107                 $(CC) $(CFLAGS) -o oceani oceani.o $(LDLIBS)
 108
 109 ###### Parser: header
 110         ## macros
 111         struct parse_context;
 112         ## ast
 113         struct parse_context {
 114                 struct token_config config;
 115                 char *file_name;
 116                 int parse_error;
 117                 ## parse context
 118         };
 119
 120 ###### macros
 121
 122         #define container_of(ptr, type, member) ({                      \
 123                 const typeof( ((type *)0)->member ) *__mptr = (ptr);    \
 124                 (type *)( (char *)__mptr - offsetof(type,member) );})
 125
 126         #define config2context(_conf) container_of(_conf, struct parse_context, \
 127                 config)
 128
 129 ###### Parser: reduce
 130         struct parse_context *c = config2context(config);
 131
 132 ###### Parser: code
 133         #define _GNU_SOURCE
 134         #include <unistd.h>
 135         #include <stdlib.h>
 136         #include <fcntl.h>
 137         #include <errno.h>
 138         #include <sys/mman.h>
 139         #include <string.h>
 140         #include <stdio.h>
 141         #include <locale.h>
 142         #include <malloc.h>
 143         #include "mdcode.h"
 144         #include "scanner.h"
 145         #include "parser.h"
 146
 147         ## includes
 148
 149         #include "oceani.h"
 150
 151         ## forward decls
 152         ## value functions
 153         ## ast functions
 154         ## core functions
 155
 156         #include <getopt.h>
 157         static char Usage[] =
 158                 "Usage: oceani --trace --print --noexec --brackets --section=SectionName prog.ocn\n";
 159         static const struct option long_options[] = {
 160                 {"trace",     0, NULL, 't'},
 161                 {"print",     0, NULL, 'p'},
 162                 {"noexec",    0, NULL, 'n'},
 163                 {"brackets",  0, NULL, 'b'},
 164                 {"section",   1, NULL, 's'},
 165                 {NULL,        0, NULL, 0},
 166         };
 167         const char *options = "tpnbs";
 168
 169         static void pr_err(char *msg)                   // NOTEST
 170         {
 171                 fprintf(stderr, "%s\n", msg);           // NOTEST
 172         }                                               // NOTEST
 173
 174         int main(int argc, char *argv[])
 175         {
 176                 int fd;
 177                 int len;
 178                 char *file;
 179                 struct section *s = NULL, *ss;
 180                 char *section = NULL;
 181                 struct parse_context context = {
 182                         .config = {
 183                                 .ignored = (1 << TK_mark),
 184                                 .number_chars = ".,_+- ",
 185                                 .word_start = "_",
 186                                 .word_cont = "_",
 187                         },
 188                 };
 189                 int doprint=0, dotrace=0, doexec=1, brackets=0;
 190                 int opt;
 191                 while ((opt = getopt_long(argc, argv, options, long_options, NULL))
 192                        != -1) {
 193                         switch(opt) {
 194                         case 't': dotrace=1; break;
 195                         case 'p': doprint=1; break;
 196                         case 'n': doexec=0; break;
 197                         case 'b': brackets=1; break;
 198                         case 's': section = optarg; break;
 199                         default: fprintf(stderr, Usage);
 200                                 exit(1);
 201                         }
 202                 }
 203                 if (optind >= argc) {
 204                         fprintf(stderr, "oceani: no input file given\n");
 205                         exit(1);
 206                 }
 207                 fd = open(argv[optind], O_RDONLY);
 208                 if (fd < 0) {
 209                         fprintf(stderr, "oceani: cannot open %s\n", argv[optind]);
 210                         exit(1);
 211                 }
 212                 context.file_name = argv[optind];
 213                 len = lseek(fd, 0, 2);
 214                 file = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
 215                 s = code_extract(file, file+len, pr_err);
 216                 if (!s) {
 217                         fprintf(stderr, "oceani: could not find any code in %s\n",
 218                                 argv[optind]);
 219                         exit(1);
 220                 }
 221
 222                 ## context initialization
 223
 224                 if (section) {
 225                         for (ss = s; ss; ss = ss->next) {
 226                                 struct text sec = ss->section;
 227                                 if (sec.len == strlen(section) &&
 228                                     strncmp(sec.txt, section, sec.len) == 0)
 229                                         break;
 230                         }
 231                         if (!ss) {
 232                                 fprintf(stderr, "oceani: cannot find section %s\n",
 233                                         section);
 234                                 goto cleanup;
 235                         }
 236                 } else
 237                         ss = s;                         // NOTEST
 238                 if (!ss->code) {
 239                         fprintf(stderr, "oceani: no code found in requested section\n");        // NOTEST
 240                         goto cleanup;                   // NOTEST
 241                 }
 242
 243                 parse_oceani(ss->code, &context.config, dotrace ? stderr : NULL);
 244
 245                 if (!context.parse_error && !analyse_funcs(&context)) {
 246                         fprintf(stderr, "oceani: type error in program - not running.\n");
 247                         context.parse_error = 1;
 248                 }
 249
 250                 if (doprint) {
 251                         ## print const decls
 252                         ## print type decls
 253                         ## print func decls
 254                 }
 255                 if (doexec && !context.parse_error)
 256                         interp_main(&context, argc - optind, argv + optind);
 257         cleanup:
 258                 while (s) {
 259                         struct section *t = s->next;
 260                         code_free(s->code);
 261                         free(s);
 262                         s = t;
 263                 }
 264                 // FIXME parser should pop scope even on error
 265                 while (context.scope_depth > 0)
 266                         scope_pop(&context);
 267                 ## free global vars
 268                 ## free context types
 269                 ## free context storage
 270                 exit(context.parse_error ? 1 : 0);
 271         }
 272
 273 ### Analysis
 274
 275 The four requirements of parse, analyse, print, interpret apply to
 276 each language element individually so that is how most of the code
 277 will be structured.
 278
 279 Three of the four are fairly self explanatory.  The one that requires
 280 a little explanation is the analysis step.
 281
 282 The current language design does not require the types of variables to
 283 be declared, but they must still have a single type.  Different
 284 operations impose different requirements on the variables, for example
 285 addition requires both arguments to be numeric, and assignment
 286 requires the variable on the left to have the same type as the
 287 expression on the right.
 288
 289 Analysis involves propagating these type requirements around and
 290 consequently setting the type of each variable.  If any requirements
 291 are violated (e.g. a string is compared with a number) or if a
 292 variable needs to have two different types, then an error is raised
 293 and the program will not run.
 294
 295 If the same variable is declared in both branchs of an 'if/else', or
 296 in all cases of a 'switch' then the multiple instances may be merged
 297 into just one variable if the variable is referenced after the
 298 conditional statement.  When this happens, the types must naturally be
 299 consistent across all the branches.  When the variable is not used
 300 outside the if, the variables in the different branches are distinct
 301 and can be of different types.
 302
 303 Undeclared names may only appear in "use" statements and "case" expressions.
 304 These names are given a type of "label" and a unique value.
 305 This allows them to fill the role of a name in an enumerated type, which
 306 is useful for testing the `switch` statement.
 307
 308 As we will see, the condition part of a `while` statement can return
 309 either a Boolean or some other type.  This requires that the expected
 310 type that gets passed around comprises a type and a flag to indicate
 311 that `Tbool` is also permitted.
 312
 313 As there are, as yet, no distinct types that are compatible, there
 314 isn't much subtlety in the analysis.  When we have distinct number
 315 types, this will become more interesting.
 316
 317 #### Error reporting
 318
 319 When analysis discovers an inconsistency it needs to report an error;
 320 just refusing to run the code ensures that the error doesn't cascade,
 321 but by itself it isn't very useful.  A clear understanding of the sort
 322 of error message that are useful will help guide the process of
 323 analysis.
 324
 325 At a simplistic level, the only sort of error that type analysis can
 326 report is that the type of some construct doesn't match a contextual
 327 requirement.  For example, in `4 + "hello"` the addition provides a
 328 contextual requirement for numbers, but `"hello"` is not a number.  In
 329 this particular example no further information is needed as the types
 330 are obvious from local information.  When a variable is involved that
 331 isn't the case.  It may be helpful to explain why the variable has a
 332 particular type, by indicating the location where the type was set,
 333 whether by declaration or usage.
 334
 335 Using a recursive-descent analysis we can easily detect a problem at
 336 multiple locations. In "`hello:= "there"; 4 + hello`" the addition
 337 will detect that one argument is not a number and the usage of `hello`
 338 will detect that a number was wanted, but not provided.  In this
 339 (early) version of the language, we will generate error reports at
 340 multiple locations, so the use of `hello` will report an error and
 341 explain were the value was set, and the addition will report an error
 342 and say why numbers are needed.  To be able to report locations for
 343 errors, each language element will need to record a file location
 344 (line and column) and each variable will need to record the language
 345 element where its type was set.  For now we will assume that each line
 346 of an error message indicates one location in the file, and up to 2
 347 types.  So we provide a `printf`-like function which takes a format, a
 348 location (a `struct exec` which has not yet been introduced), and 2
 349 types. "`%1`" reports the first type, "`%2`" reports the second.  We
 350 will need a function to print the location, once we know how that is
 351 stored. e As will be explained later, there are sometimes extra rules for
 352 type matching and they might affect error messages, we need to pass those
 353 in too.
 354
 355 As well as type errors, we sometimes need to report problems with
 356 tokens, which might be unexpected or might name a type that has not
 357 been defined.  For these we have `tok_err()` which reports an error
 358 with a given token.  Each of the error functions sets the flag in the
 359 context so indicate that parsing failed.
 360
 361 ###### forward decls
 362
 363         static void fput_loc(struct exec *loc, FILE *f);
 364         static void type_err(struct parse_context *c,
 365                              char *fmt, struct exec *loc,
 366                              struct type *t1, int rules, struct type *t2);
 367
 368 ###### core functions
 369
 370         static void type_err(struct parse_context *c,
 371                              char *fmt, struct exec *loc,
 372                              struct type *t1, int rules, struct type *t2)
 373         {
 374                 fprintf(stderr, "%s:", c->file_name);
 375                 fput_loc(loc, stderr);
 376                 for (; *fmt ; fmt++) {
 377                         if (*fmt != '%') {
 378                                 fputc(*fmt, stderr);
 379                                 continue;
 380                         }
 381                         fmt++;
 382                         switch (*fmt) {
 383                         case '%': fputc(*fmt, stderr); break;   // NOTEST
 384                         default: fputc('?', stderr); break;     // NOTEST
 385                         case '1':
 386                                 type_print(t1, stderr);
 387                                 break;
 388                         case '2':
 389                                 type_print(t2, stderr);
 390                                 break;
 391                         ## format cases
 392                         }
 393                 }
 394                 fputs("\n", stderr);
 395                 c->parse_error = 1;
 396         }
 397
 398         static void tok_err(struct parse_context *c, char *fmt, struct token *t)
 399         {
 400                 fprintf(stderr, "%s:%d:%d: %s: %.*s\n", c->file_name, t->line, t->col, fmt,
 401                         t->txt.len, t->txt.txt);
 402                 c->parse_error = 1;
 403         }
 404
 405 ## Entities: declared and predeclared.
 406
 407 There are various "things" that the language and/or the interpreter
 408 needs to know about to parse and execute a program.  These include
 409 types, variables, values, and executable code.  These are all lumped
 410 together under the term "entities" (calling them "objects" would be
 411 confusing) and introduced here.  The following section will present the
 412 different specific code elements which comprise or manipulate these
 413 various entities.
 414
 415 ### Executables
 416
 417 Executables can be lots of different things.  In many cases an
 418 executable is just an operation combined with one or two other
 419 executables.  This allows for expressions and lists etc.  Other times an
 420 executable is something quite specific like a constant or variable name.
 421 So we define a `struct exec` to be a general executable with a type, and
 422 a `struct binode` which is a subclass of `exec`, forms a node in a
 423 binary tree, and holds an operation.  There will be other subclasses,
 424 and to access these we need to be able to `cast` the `exec` into the
 425 various other types.  The first field in any `struct exec` is the type
 426 from the `exec_types` enum.
 427
 428 ###### macros
 429         #define cast(structname, pointer) ({            \
 430                 const typeof( ((struct structname *)0)->type) *__mptr = &(pointer)->type; \
 431                 if (__mptr && *__mptr != X##structname) abort();                \
 432                 (struct structname *)( (char *)__mptr);})
 433
 434         #define new(structname) ({                                              \
 435                 struct structname *__ptr = ((struct structname *)calloc(1,sizeof(struct structname))); \
 436                 __ptr->type = X##structname;                                            \
 437                 __ptr->line = -1; __ptr->column = -1;                                   \
 438                 __ptr;})
 439
 440         #define new_pos(structname, token) ({                                           \
 441                 struct structname *__ptr = ((struct structname *)calloc(1,sizeof(struct structname))); \
 442                 __ptr->type = X##structname;                                            \
 443                 __ptr->line = token.line; __ptr->column = token.col;                    \
 444                 __ptr;})
 445
 446 ###### ast
 447         enum exec_types {
 448                 Xbinode,
 449                 ## exec type
 450         };
 451         struct exec {
 452                 enum exec_types type;
 453                 int line, column;
 454                 ## exec fields
 455         };
 456         struct binode {
 457                 struct exec;
 458                 enum Btype {
 459                         ## Binode types
 460                 } op;
 461                 struct exec *left, *right;
 462         };
 463
 464 ###### ast functions
 465
 466         static int __fput_loc(struct exec *loc, FILE *f)
 467         {
 468                 if (!loc)
 469                         return 0;
 470                 if (loc->line >= 0) {
 471                         fprintf(f, "%d:%d: ", loc->line, loc->column);
 472                         return 1;
 473                 }
 474                 if (loc->type == Xbinode)
 475                         return __fput_loc(cast(binode,loc)->left, f) ||
 476                                __fput_loc(cast(binode,loc)->right, f);  // NOTEST
 477                 return 0;
 478         }
 479         static void fput_loc(struct exec *loc, FILE *f)
 480         {
 481                 if (!__fput_loc(loc, f))
 482                         fprintf(f, "??:??: ");
 483         }
 484
 485 Each different type of `exec` node needs a number of functions defined,
 486 a bit like methods.  We must be able to free it, print it, analyse it
 487 and execute it.  Once we have specific `exec` types we will need to
 488 parse them too.  Let's take this a bit more slowly.
 489
 490 #### Freeing
 491
 492 The parser generator requires a `free_foo` function for each struct
 493 that stores attributes and they will often be `exec`s and subtypes
 494 there-of.  So we need `free_exec` which can handle all the subtypes,
 495 and we need `free_binode`.
 496
 497 ###### ast functions
 498
 499         static void free_binode(struct binode *b)
 500         {
 501                 if (!b)
 502                         return;
 503                 free_exec(b->left);
 504                 free_exec(b->right);
 505                 free(b);
 506         }
 507
 508 ###### core functions
 509         static void free_exec(struct exec *e)
 510         {
 511                 if (!e)
 512                         return;
 513                 switch(e->type) {
 514                         ## free exec cases
 515                 }
 516         }
 517
 518 ###### forward decls
 519
 520         static void free_exec(struct exec *e);
 521
 522 ###### free exec cases
 523         case Xbinode: free_binode(cast(binode, e)); break;
 524
 525 #### Printing
 526
 527 Printing an `exec` requires that we know the current indent level for
 528 printing line-oriented components.  As will become clear later, we
 529 also want to know what sort of bracketing to use.
 530
 531 ###### ast functions
 532
 533         static void do_indent(int i, char *str)
 534         {
 535                 while (i-- > 0)
 536                         printf("    ");
 537                 printf("%s", str);
 538         }
 539
 540 ###### core functions
 541         static void print_binode(struct binode *b, int indent, int bracket)
 542         {
 543                 struct binode *b2;
 544                 switch(b->op) {
 545                 ## print binode cases
 546                 }
 547         }
 548
 549         static void print_exec(struct exec *e, int indent, int bracket)
 550         {
 551                 if (!e)
 552                         return;
 553                 switch (e->type) {
 554                 case Xbinode:
 555                         print_binode(cast(binode, e), indent, bracket); break;
 556                 ## print exec cases
 557                 }
 558                 if (e->to_free) {
 559                         struct variable *v;
 560                         do_indent(indent, "/* FREE");
 561                         for (v = e->to_free; v; v = v->next_free) {
 562                                 printf(" %.*s", v->name->name.len, v->name->name.txt);
 563                                 printf("[%d,%d]", v->scope_start, v->scope_end);
 564                                 if (v->frame_pos >= 0)
 565                                         printf("(%d+%d)", v->frame_pos,
 566                                                v->type ? v->type->size:0);
 567                         }
 568                         printf(" */\n");
 569                 }
 570         }
 571
 572 ###### forward decls
 573
 574         static void print_exec(struct exec *e, int indent, int bracket);
 575
 576 #### Analysing
 577
 578 As discussed, analysis involves propagating type requirements around the
 579 program and looking for errors.
 580
 581 So `propagate_types` is passed an expected type (being a `struct type`
 582 pointer together with some `val_rules` flags) that the `exec` is
 583 expected to return, and returns the type that it does return, either
 584 of which can be `NULL` signifying "unknown".  An `ok` flag is passed
 585 by reference. It is set to `0` when an error is found, and `2` when
 586 any change is made.  If it remains unchanged at `1`, then no more
 587 propagation is needed.
 588
 589 ###### ast
 590
 591         enum val_rules {Rnolabel = 1<<0, Rboolok = 1<<1, Rnoconstant = 1<<2};
 592
 593 ###### format cases
 594         case 'r':
 595                 if (rules & Rnolabel)
 596                         fputs(" (labels not permitted)", stderr);
 597                 break;
 598
 599 ###### forward decls
 600         static struct type *propagate_types(struct exec *prog, struct parse_context *c, int *ok,
 601                                             struct type *type, int rules);
 602 ###### core functions
 603
 604         static struct type *__propagate_types(struct exec *prog, struct parse_context *c, int *ok,
 605                                               struct type *type, int rules)
 606         {
 607                 struct type *t;
 608
 609                 if (!prog)
 610                         return Tnone;
 611
 612                 switch (prog->type) {
 613                 case Xbinode:
 614                 {
 615                         struct binode *b = cast(binode, prog);
 616                         switch (b->op) {
 617                         ## propagate binode cases
 618                         }
 619                         break;
 620                 }
 621                 ## propagate exec cases
 622                 }
 623                 return Tnone;
 624         }
 625
 626         static struct type *propagate_types(struct exec *prog, struct parse_context *c, int *ok,
 627                                             struct type *type, int rules)
 628         {
 629                 struct type *ret = __propagate_types(prog, c, ok, type, rules);
 630
 631                 if (c->parse_error)
 632                         *ok = 0;
 633                 return ret;
 634         }
 635
 636 #### Interpreting
 637
 638 Interpreting an `exec` doesn't require anything but the `exec`.  State
 639 is stored in variables and each variable will be directly linked from
 640 within the `exec` tree.  The exception to this is the `main` function
 641 which needs to look at command line arguments.  This function will be
 642 interpreted separately.
 643
 644 Each `exec` can return a value combined with a type in `struct lrval`.
 645 The type may be `Tnone` but must be non-NULL.  Some `exec`s will return
 646 the location of a value, which can be updated, in `lval`.  Others will
 647 set `lval` to NULL indicating that there is a value of appropriate type
 648 in `rval`.
 649
 650 ###### core functions
 651
 652         struct lrval {
 653                 struct type *type;
 654                 struct value rval, *lval;
 655         };
 656
 657         /* If dest is passed, dtype must give the expected type, and
 658          * result can go there, in which case type is returned as NULL.
 659          */
 660         static struct lrval _interp_exec(struct parse_context *c, struct exec *e,
 661                                          struct value *dest, struct type *dtype);
 662
 663         static struct value interp_exec(struct parse_context *c, struct exec *e,
 664                                         struct type **typeret)
 665         {
 666                 struct lrval ret = _interp_exec(c, e, NULL, NULL);
 667
 668                 if (!ret.type) abort();
 669                 if (typeret)
 670                         *typeret = ret.type;
 671                 if (ret.lval)
 672                         dup_value(ret.type, ret.lval, &ret.rval);
 673                 return ret.rval;
 674         }
 675
 676         static struct value *linterp_exec(struct parse_context *c, struct exec *e,
 677                                           struct type **typeret)
 678         {
 679                 struct lrval ret = _interp_exec(c, e, NULL, NULL);
 680
 681                 if (!ret.type) abort();
 682                 if (ret.lval)
 683                         *typeret = ret.type;
 684                 else
 685                         free_value(ret.type, &ret.rval);
 686                 return ret.lval;
 687         }
 688
 689         /* dinterp_exec is used when the destination type is certain and
 690          * the value has a place to go.
 691          */
 692         static void dinterp_exec(struct parse_context *c, struct exec *e,
 693                                  struct value *dest, struct type *dtype,
 694                                  int need_free)
 695         {
 696                 struct lrval ret = _interp_exec(c, e, dest, dtype);
 697                 if (!ret.type)
 698                         return;
 699                 if (need_free)
 700                         free_value(dtype, dest);
 701                 if (ret.lval)
 702                         dup_value(dtype, ret.lval, dest);
 703                 else
 704                         memcpy(dest, &ret.rval, dtype->size);
 705         }
 706
 707         static struct lrval _interp_exec(struct parse_context *c, struct exec *e,
 708                                          struct value *dest, struct type *dtype)
 709         {
 710                 /* If the result is copied to dest, ret.type is set to NULL */
 711                 struct lrval ret;
 712                 struct value rv = {}, *lrv = NULL;
 713                 struct type *rvtype;
 714
 715                 rvtype = ret.type = Tnone;
 716                 if (!e) {
 717                         ret.lval = lrv;
 718                         ret.rval = rv;
 719                         return ret;
 720                 }
 721
 722                 switch(e->type) {
 723                 case Xbinode:
 724                 {
 725                         struct binode *b = cast(binode, e);
 726                         struct value left, right, *lleft;
 727                         struct type *ltype, *rtype;
 728                         ltype = rtype = Tnone;
 729                         switch (b->op) {
 730                         ## interp binode cases
 731                         }
 732                         free_value(ltype, &left);
 733                         free_value(rtype, &right);
 734                         break;
 735                 }
 736                 ## interp exec cases
 737                 }
 738                 if (rvtype) {
 739                         ret.lval = lrv;
 740                         ret.rval = rv;
 741                         ret.type = rvtype;
 742                 }
 743                 ## interp exec cleanup
 744                 return ret;
 745         }
 746
 747 ### Types
 748
 749 Values come in a wide range of types, with more likely to be added.
 750 Each type needs to be able to print its own values (for convenience at
 751 least) as well as to compare two values, at least for equality and
 752 possibly for order.  For now, values might need to be duplicated and
 753 freed, though eventually such manipulations will be better integrated
 754 into the language.
 755
 756 Rather than requiring every numeric type to support all numeric
 757 operations (add, multiply, etc), we allow types to be able to present
 758 as one of a few standard types: integer, float, and fraction.  The
 759 existence of these conversion functions eventually enable types to
 760 determine if they are compatible with other types, though such types
 761 have not yet been implemented.
 762
 763 Named type are stored in a simple linked list.  Objects of each type are
 764 "values" which are often passed around by value.
 765
 766 There are both explicitly named types, and anonymous types.  Anonymous
 767 cannot be accessed by name, but are used internally and have a name
 768 which might be reported in error messages.
 769
 770 ###### ast
 771
 772         struct value {
 773                 union {
 774                         char ptr[1];
 775                         ## value union fields
 776                 };
 777         };
 778
 779         struct type {
 780                 struct text name;
 781                 struct type *next;
 782                 int size, align;
 783                 int anon;
 784                 void (*init)(struct type *type, struct value *val);
 785                 void (*prepare_type)(struct parse_context *c, struct type *type, int parse_time);
 786                 void (*print)(struct type *type, struct value *val, FILE *f);
 787                 void (*print_type)(struct type *type, FILE *f);
 788                 int (*cmp_order)(struct type *t1, struct type *t2,
 789                                  struct value *v1, struct value *v2);
 790                 int (*cmp_eq)(struct type *t1, struct type *t2,
 791                               struct value *v1, struct value *v2);
 792                 void (*dup)(struct type *type, struct value *vold, struct value *vnew);
 793                 void (*free)(struct type *type, struct value *val);
 794                 void (*free_type)(struct type *t);
 795                 long long (*to_int)(struct value *v);
 796                 double (*to_float)(struct value *v);
 797                 int (*to_mpq)(mpq_t *q, struct value *v);
 798                 ## type functions
 799                 union {
 800                         ## type union fields
 801                 };
 802         };
 803
 804 ###### parse context
 805
 806         struct type *typelist;
 807
 808 ###### includes
 809         #include <stdarg.h>
 810
 811 ###### ast functions
 812
 813         static struct type *find_type(struct parse_context *c, struct text s)
 814         {
 815                 struct type *t = c->typelist;
 816
 817                 while (t && (t->anon ||
 818                              text_cmp(t->name, s) != 0))
 819                                 t = t->next;
 820                 return t;
 821         }
 822
 823         static struct type *_add_type(struct parse_context *c, struct text s,
 824                                      struct type *proto, int anon)
 825         {
 826                 struct type *n;
 827
 828                 n = calloc(1, sizeof(*n));
 829                 *n = *proto;
 830                 n->name = s;
 831                 n->anon = anon;
 832                 n->next = c->typelist;
 833                 c->typelist = n;
 834                 return n;
 835         }
 836
 837         static struct type *add_type(struct parse_context *c, struct text s,
 838                                       struct type *proto)
 839         {
 840                 return _add_type(c, s, proto, 0);
 841         }
 842
 843         static struct type *add_anon_type(struct parse_context *c,
 844                                           struct type *proto, char *name, ...)
 845         {
 846                 struct text t;
 847                 va_list ap;
 848
 849                 va_start(ap, name);
 850                 vasprintf(&t.txt, name, ap);
 851                 va_end(ap);
 852                 t.len = strlen(name);
 853                 return _add_type(c, t, proto, 1);
 854         }
 855
 856         static void free_type(struct type *t)
 857         {
 858                 /* The type is always a reference to something in the
 859                  * context, so we don't need to free anything.
 860                  */
 861         }
 862
 863         static void free_value(struct type *type, struct value *v)
 864         {
 865                 if (type && v) {
 866                         type->free(type, v);
 867                         memset(v, 0x5a, type->size);
 868                 }
 869         }
 870
 871         static void type_print(struct type *type, FILE *f)
 872         {
 873                 if (!type)
 874                         fputs("*unknown*type*", f);     // NOTEST
 875                 else if (type->name.len && !type->anon)
 876                         fprintf(f, "%.*s", type->name.len, type->name.txt);
 877                 else if (type->print_type)
 878                         type->print_type(type, f);
 879                 else
 880                         fputs("*invalid*type*", f);
 881         }
 882
 883         static void val_init(struct type *type, struct value *val)
 884         {
 885                 if (type && type->init)
 886                         type->init(type, val);
 887         }
 888
 889         static void dup_value(struct type *type,
 890                               struct value *vold, struct value *vnew)
 891         {
 892                 if (type && type->dup)
 893                         type->dup(type, vold, vnew);
 894         }
 895
 896         static int value_cmp(struct type *tl, struct type *tr,
 897                              struct value *left, struct value *right)
 898         {
 899                 if (tl && tl->cmp_order)
 900                         return tl->cmp_order(tl, tr, left, right);
 901                 if (tl && tl->cmp_eq)                   // NOTEST
 902                         return tl->cmp_eq(tl, tr, left, right); // NOTEST
 903                 return -1;                              // NOTEST
 904         }
 905
 906         static void print_value(struct type *type, struct value *v, FILE *f)
 907         {
 908                 if (type && type->print)
 909                         type->print(type, v, f);
 910                 else
 911                         fprintf(f, "*Unknown*");                // NOTEST
 912         }
 913
 914 ###### forward decls
 915
 916         static void free_value(struct type *type, struct value *v);
 917         static int type_compat(struct type *require, struct type *have, int rules);
 918         static void type_print(struct type *type, FILE *f);
 919         static void val_init(struct type *type, struct value *v);
 920         static void dup_value(struct type *type,
 921                               struct value *vold, struct value *vnew);
 922         static int value_cmp(struct type *tl, struct type *tr,
 923                              struct value *left, struct value *right);
 924         static void print_value(struct type *type, struct value *v, FILE *f);
 925
 926 ###### free context types
 927
 928         while (context.typelist) {
 929                 struct type *t = context.typelist;
 930
 931                 context.typelist = t->next;
 932                 if (t->free_type)
 933                         t->free_type(t);
 934                 if (t->anon)
 935                         free(t->name.txt);
 936                 free(t);
 937         }
 938
 939 Type can be specified for local variables, for fields in a structure,
 940 for formal parameters to functions, and possibly elsewhere.  Different
 941 rules may apply in different contexts.  As a minimum, a named type may
 942 always be used.  Currently the type of a formal parameter can be
 943 different from types in other contexts, so we have a separate grammar
 944 symbol for those.
 945
 946 ###### Grammar
 947
 948         $*type
 949         Type -> IDENTIFIER ${
 950                 $0 = find_type(c, $1.txt);
 951                 if (!$0) {
 952                         tok_err(c,
 953                                 "error: undefined type", &$1);
 954
 955                         $0 = Tnone;
 956                 }
 957         }$
 958         ## type grammar
 959
 960         FormalType -> Type ${ $0 = $<1; }$
 961         ## formal type grammar
 962
 963 #### Base Types
 964
 965 Values of the base types can be numbers, which we represent as
 966 multi-precision fractions, strings, Booleans and labels.  When
 967 analysing the program we also need to allow for places where no value
 968 is meaningful (type `Tnone`) and where we don't know what type to
 969 expect yet (type is `NULL`).
 970
 971 Values are never shared, they are always copied when used, and freed
 972 when no longer needed.
 973
 974 When propagating type information around the program, we need to
 975 determine if two types are compatible, where type `NULL` is compatible
 976 with anything.  There are two special cases with type compatibility,
 977 both related to the Conditional Statement which will be described
 978 later.  In some cases a Boolean can be accepted as well as some other
 979 primary type, and in others any type is acceptable except a label (`Vlabel`).
 980 A separate function encoding these cases will simplify some code later.
 981
 982 ###### type functions
 983
 984         int (*compat)(struct type *this, struct type *other);
 985
 986 ###### ast functions
 987
 988         static int type_compat(struct type *require, struct type *have, int rules)
 989         {
 990                 if ((rules & Rboolok) && have == Tbool)
 991                         return 1;       // NOTEST
 992                 if ((rules & Rnolabel) && have == Tlabel)
 993                         return 0;       // NOTEST
 994                 if (!require || !have)
 995                         return 1;
 996
 997                 if (require->compat)
 998                         return require->compat(require, have);
 999
1000                 return require == have;
1001         }
1002
1003 ###### includes
1004         #include <gmp.h>
1005         #include "parse_string.h"
1006         #include "parse_number.h"
1007
1008 ###### libs
1009         myLDLIBS := libnumber.o libstring.o -lgmp
1010         LDLIBS := $(filter-out $(myLDLIBS),$(LDLIBS)) $(myLDLIBS)
1011
1012 ###### type union fields
1013         enum vtype {Vnone, Vstr, Vnum, Vbool, Vlabel} vtype;
1014
1015 ###### value union fields
1016         struct text str;
1017         mpq_t num;
1018         unsigned char bool;
1019         void *label;
1020
1021 ###### ast functions
1022         static void _free_value(struct type *type, struct value *v)
1023         {
1024                 if (!v)
1025                         return;         // NOTEST
1026                 switch (type->vtype) {
1027                 case Vnone: break;
1028                 case Vstr: free(v->str.txt); break;
1029                 case Vnum: mpq_clear(v->num); break;
1030                 case Vlabel:
1031                 case Vbool: break;
1032                 }
1033         }
1034
1035 ###### value functions
1036
1037         static void _val_init(struct type *type, struct value *val)
1038         {
1039                 switch(type->vtype) {
1040                 case Vnone:             // NOTEST
1041                         break;          // NOTEST
1042                 case Vnum:
1043                         mpq_init(val->num); break;
1044                 case Vstr:
1045                         val->str.txt = malloc(1);
1046                         val->str.len = 0;
1047                         break;
1048                 case Vbool:
1049                         val->bool = 0;
1050                         break;
1051                 case Vlabel:
1052                         val->label = NULL;
1053                         break;
1054                 }
1055         }
1056
1057         static void _dup_value(struct type *type,
1058                                struct value *vold, struct value *vnew)
1059         {
1060                 switch (type->vtype) {
1061                 case Vnone:             // NOTEST
1062                         break;          // NOTEST
1063                 case Vlabel:
1064                         vnew->label = vold->label;
1065                         break;
1066                 case Vbool:
1067                         vnew->bool = vold->bool;
1068                         break;
1069                 case Vnum:
1070                         mpq_init(vnew->num);
1071                         mpq_set(vnew->num, vold->num);
1072                         break;
1073                 case Vstr:
1074                         vnew->str.len = vold->str.len;
1075                         vnew->str.txt = malloc(vnew->str.len);
1076                         memcpy(vnew->str.txt, vold->str.txt, vnew->str.len);
1077                         break;
1078                 }
1079         }
1080
1081         static int _value_cmp(struct type *tl, struct type *tr,
1082                               struct value *left, struct value *right)
1083         {
1084                 int cmp;
1085                 if (tl != tr)
1086                         return tl - tr; // NOTEST
1087                 switch (tl->vtype) {
1088                 case Vlabel: cmp = left->label == right->label ? 0 : 1; break;
1089                 case Vnum: cmp = mpq_cmp(left->num, right->num); break;
1090                 case Vstr: cmp = text_cmp(left->str, right->str); break;
1091                 case Vbool: cmp = left->bool - right->bool; break;
1092                 case Vnone: cmp = 0;                    // NOTEST
1093                 }
1094                 return cmp;
1095         }
1096
1097         static void _print_value(struct type *type, struct value *v, FILE *f)
1098         {
1099                 switch (type->vtype) {
1100                 case Vnone:                             // NOTEST
1101                         fprintf(f, "*no-value*"); break;        // NOTEST
1102                 case Vlabel:                            // NOTEST
1103                         fprintf(f, "*label-%p*", v->label); break; // NOTEST
1104                 case Vstr:
1105                         fprintf(f, "%.*s", v->str.len, v->str.txt); break;
1106                 case Vbool:
1107                         fprintf(f, "%s", v->bool ? "True":"False"); break;
1108                 case Vnum:
1109                         {
1110                         mpf_t fl;
1111                         mpf_init2(fl, 20);
1112                         mpf_set_q(fl, v->num);
1113                         gmp_fprintf(f, "%Fg", fl);
1114                         mpf_clear(fl);
1115                         break;
1116                         }
1117                 }
1118         }
1119
1120         static void _free_value(struct type *type, struct value *v);
1121
1122         static struct type base_prototype = {
1123                 .init = _val_init,
1124                 .print = _print_value,
1125                 .cmp_order = _value_cmp,
1126                 .cmp_eq = _value_cmp,
1127                 .dup = _dup_value,
1128                 .free = _free_value,
1129         };
1130
1131         static struct type *Tbool, *Tstr, *Tnum, *Tnone, *Tlabel;
1132
1133 ###### ast functions
1134         static struct type *add_base_type(struct parse_context *c, char *n,
1135                                           enum vtype vt, int size)
1136         {
1137                 struct text txt = { n, strlen(n) };
1138                 struct type *t;
1139
1140                 t = add_type(c, txt, &base_prototype);
1141                 t->vtype = vt;
1142                 t->size = size;
1143                 t->align = size > sizeof(void*) ? sizeof(void*) : size;
1144                 if (t->size & (t->align - 1))
1145                         t->size = (t->size | (t->align - 1)) + 1;       // NOTEST
1146                 return t;
1147         }
1148
1149 ###### context initialization
1150
1151         Tbool  = add_base_type(&context, "Boolean", Vbool, sizeof(char));
1152         Tstr   = add_base_type(&context, "string", Vstr, sizeof(struct text));
1153         Tnum   = add_base_type(&context, "number", Vnum, sizeof(mpq_t));
1154         Tnone  = add_base_type(&context, "none", Vnone, 0);
1155         Tlabel = add_base_type(&context, "label", Vlabel, sizeof(void*));
1156
1157 ##### Base Values
1158
1159 We have already met values as separate objects.  When manifest constants
1160 appear in the program text, that must result in an executable which has
1161 a constant value.  So the `val` structure embeds a value in an
1162 executable.
1163
1164 ###### exec type
1165         Xval,
1166
1167 ###### ast
1168         struct val {
1169                 struct exec;
1170                 struct type *vtype;
1171                 struct value val;
1172         };
1173
1174 ###### ast functions
1175         struct val *new_val(struct type *T, struct token tk)
1176         {
1177                 struct val *v = new_pos(val, tk);
1178                 v->vtype = T;
1179                 return v;
1180         }
1181
1182 ###### Grammar
1183
1184         $TERM True False
1185
1186         $*val
1187         Value ->  True ${
1188                 $0 = new_val(Tbool, $1);
1189                 $0->val.bool = 1;
1190         }$
1191         | False ${
1192                 $0 = new_val(Tbool, $1);
1193                 $0->val.bool = 0;
1194         }$
1195         | NUMBER ${ {
1196                 char tail[3];
1197                 $0 = new_val(Tnum, $1);
1198                 if (number_parse($0->val.num, tail, $1.txt) == 0)
1199                         mpq_init($0->val.num);  // UNTESTED
1200                         if (tail[0])
1201                                 tok_err(c, "error: unsupported number suffix",
1202                                         &$1);
1203         } }$
1204         | STRING ${ {
1205                 char tail[3];
1206                 $0 = new_val(Tstr, $1);
1207                 string_parse(&$1, '\\', &$0->val.str, tail);
1208                 if (tail[0])
1209                         tok_err(c, "error: unsupported string suffix",
1210                                 &$1);
1211         } }$
1212         | MULTI_STRING ${ {
1213                 char tail[3];
1214                 $0 = new_val(Tstr, $1);
1215                 string_parse(&$1, '\\', &$0->val.str, tail);
1216                 if (tail[0])
1217                         tok_err(c, "error: unsupported string suffix",
1218                                 &$1);
1219         } }$
1220
1221 ###### print exec cases
1222         case Xval:
1223         {
1224                 struct val *v = cast(val, e);
1225                 if (v->vtype == Tstr)
1226                         printf("\"");
1227                 print_value(v->vtype, &v->val, stdout);
1228                 if (v->vtype == Tstr)
1229                         printf("\"");
1230                 break;
1231         }
1232
1233 ###### propagate exec cases
1234         case Xval:
1235         {
1236                 struct val *val = cast(val, prog);
1237                 if (!type_compat(type, val->vtype, rules))
1238                         type_err(c, "error: expected %1%r found %2",
1239                                    prog, type, rules, val->vtype);
1240                 return val->vtype;
1241         }
1242
1243 ###### interp exec cases
1244         case Xval:
1245                 rvtype = cast(val, e)->vtype;
1246                 dup_value(rvtype, &cast(val, e)->val, &rv);
1247                 break;
1248
1249 ###### ast functions
1250         static void free_val(struct val *v)
1251         {
1252                 if (v)
1253                         free_value(v->vtype, &v->val);
1254                 free(v);
1255         }
1256
1257 ###### free exec cases
1258         case Xval: free_val(cast(val, e)); break;
1259
1260 ###### ast functions
1261         // Move all nodes from 'b' to 'rv', reversing their order.
1262         // In 'b' 'left' is a list, and 'right' is the last node.
1263         // In 'rv', left' is the first node and 'right' is a list.
1264         static struct binode *reorder_bilist(struct binode *b)
1265         {
1266                 struct binode *rv = NULL;
1267
1268                 while (b) {
1269                         struct exec *t = b->right;
1270                         b->right = rv;
1271                         rv = b;
1272                         if (b->left)
1273                                 b = cast(binode, b->left);
1274                         else
1275                                 b = NULL;
1276                         rv->left = t;
1277                 }
1278                 return rv;
1279         }
1280
1281 ### Variables
1282
1283 Variables are scoped named values.  We store the names in a linked list
1284 of "bindings" sorted in lexical order, and use sequential search and
1285 insertion sort.
1286
1287 ###### ast
1288
1289         struct binding {
1290                 struct text name;
1291                 struct binding *next;   // in lexical order
1292                 ## binding fields
1293         };
1294
1295 This linked list is stored in the parse context so that "reduce"
1296 functions can find or add variables, and so the analysis phase can
1297 ensure that every variable gets a type.
1298
1299 ###### parse context
1300
1301         struct binding *varlist;  // In lexical order
1302
1303 ###### ast functions
1304
1305         static struct binding *find_binding(struct parse_context *c, struct text s)
1306         {
1307                 struct binding **l = &c->varlist;
1308                 struct binding *n;
1309                 int cmp = 1;
1310
1311                 while (*l &&
1312                         (cmp = text_cmp((*l)->name, s)) < 0)
1313                                 l = & (*l)->next;
1314                 if (cmp == 0)
1315                         return *l;
1316                 n = calloc(1, sizeof(*n));
1317                 n->name = s;
1318                 n->next = *l;
1319                 *l = n;
1320                 return n;
1321         }
1322
1323 Each name can be linked to multiple variables defined in different
1324 scopes.  Each scope starts where the name is declared and continues
1325 until the end of the containing code block.  Scopes of a given name
1326 cannot nest, so a declaration while a name is in-scope is an error.
1327
1328 ###### binding fields
1329         struct variable *var;
1330
1331 ###### ast
1332         struct variable {
1333                 struct variable *previous;
1334                 struct type *type;
1335                 struct binding *name;
1336                 struct exec *where_decl;// where name was declared
1337                 struct exec *where_set; // where type was set
1338                 ## variable fields
1339         };
1340
1341 When a scope closes, the values of the variables might need to be freed.
1342 This happens in the context of some `struct exec` and each `exec` will
1343 need to know which variables need to be freed when it completes.
1344
1345 ####### exec fields
1346         struct variable *to_free;
1347
1348 ####### variable fields
1349         struct exec *cleanup_exec;
1350         struct variable *next_free;
1351
1352 ####### interp exec cleanup
1353         {
1354                 struct variable *v;
1355                 for (v = e->to_free; v; v = v->next_free) {
1356                         struct value *val = var_value(c, v);
1357                         free_value(v->type, val);
1358                 }
1359         }
1360
1361 ###### ast functions
1362         static void variable_unlink_exec(struct variable *v)
1363         {
1364                 struct variable **vp;
1365                 if (!v->cleanup_exec)
1366                         return;
1367                 for (vp = &v->cleanup_exec->to_free;
1368                     *vp; vp = &(*vp)->next_free) {
1369                         if (*vp != v)
1370                                 continue;
1371                         *vp = v->next_free;
1372                         v->cleanup_exec = NULL;
1373                         break;
1374                 }
1375         }
1376
1377 While the naming seems strange, we include local constants in the
1378 definition of variables.  A name declared `var := value` can
1379 subsequently be changed, but a name declared `var ::= value` cannot -
1380 it is constant
1381
1382 ###### variable fields
1383         int constant;
1384
1385 Scopes in parallel branches can be partially merged.  More
1386 specifically, if a given name is declared in both branches of an
1387 if/else then its scope is a candidate for merging.  Similarly if
1388 every branch of an exhaustive switch (e.g. has an "else" clause)
1389 declares a given name, then the scopes from the branches are
1390 candidates for merging.
1391
1392 Note that names declared inside a loop (which is only parallel to
1393 itself) are never visible after the loop.  Similarly names defined in
1394 scopes which are not parallel, such as those started by `for` and
1395 `switch`, are never visible after the scope.  Only variables defined in
1396 both `then` and `else` (including the implicit then after an `if`, and
1397 excluding `then` used with `for`) and in all `case`s and `else` of a
1398 `switch` or `while` can be visible beyond the `if`/`switch`/`while`.
1399
1400 Labels, which are a bit like variables, follow different rules.
1401 Labels are not explicitly declared, but if an undeclared name appears
1402 in a context where a label is legal, that effectively declares the
1403 name as a label.  The declaration remains in force (or in scope) at
1404 least to the end of the immediately containing block and conditionally
1405 in any larger containing block which does not declare the name in some
1406 other way.  Importantly, the conditional scope extension happens even
1407 if the label is only used in one parallel branch of a conditional --
1408 when used in one branch it is treated as having been declared in all
1409 branches.
1410
1411 Merge candidates are tentatively visible beyond the end of the
1412 branching statement which creates them.  If the name is used, the
1413 merge is affirmed and they become a single variable visible at the
1414 outer layer.  If not - if it is redeclared first - the merge lapses.
1415
1416 To track scopes we have an extra stack, implemented as a linked list,
1417 which roughly parallels the parse stack and which is used exclusively
1418 for scoping.  When a new scope is opened, a new frame is pushed and
1419 the child-count of the parent frame is incremented.  This child-count
1420 is used to distinguish between the first of a set of parallel scopes,
1421 in which declared variables must not be in scope, and subsequent
1422 branches, whether they may already be conditionally scoped.
1423
1424 We need a total ordering of scopes so we can easily compare to variables
1425 to see if they are concurrently in scope.  To achieve this we record a
1426 `scope_count` which is actually a count of both beginnings and endings
1427 of scopes.  Then each variable has a record of the scope count where it
1428 enters scope, and where it leaves.
1429
1430 To push a new frame *before* any code in the frame is parsed, we need a
1431 grammar reduction.  This is most easily achieved with a grammar
1432 element which derives the empty string, and creates the new scope when
1433 it is recognised.  This can be placed, for example, between a keyword
1434 like "if" and the code following it.
1435
1436 ###### ast
1437         struct scope {
1438                 struct scope *parent;
1439                 int child_count;
1440         };
1441
1442 ###### parse context
1443         int scope_depth;
1444         int scope_count;
1445         struct scope *scope_stack;
1446
1447 ###### variable fields
1448         int scope_start, scope_end;
1449
1450 ###### ast functions
1451         static void scope_pop(struct parse_context *c)
1452         {
1453                 struct scope *s = c->scope_stack;
1454
1455                 c->scope_stack = s->parent;
1456                 free(s);
1457                 c->scope_depth -= 1;
1458                 c->scope_count += 1;
1459         }
1460
1461         static void scope_push(struct parse_context *c)
1462         {
1463                 struct scope *s = calloc(1, sizeof(*s));
1464                 if (c->scope_stack)
1465                         c->scope_stack->child_count += 1;
1466                 s->parent = c->scope_stack;
1467                 c->scope_stack = s;
1468                 c->scope_depth += 1;
1469                 c->scope_count += 1;
1470         }
1471
1472 ###### Grammar
1473
1474         $void
1475         OpenScope -> ${ scope_push(c); }$
1476
1477 Each variable records a scope depth and is in one of four states:
1478
1479 - "in scope".  This is the case between the declaration of the
1480   variable and the end of the containing block, and also between
1481   the usage with affirms a merge and the end of that block.
1482
1483   The scope depth is not greater than the current parse context scope
1484   nest depth.  When the block of that depth closes, the state will
1485   change.  To achieve this, all "in scope" variables are linked
1486   together as a stack in nesting order.
1487
1488 - "pending".  The "in scope" block has closed, but other parallel
1489   scopes are still being processed.  So far, every parallel block at
1490   the same level that has closed has declared the name.
1491
1492   The scope depth is the depth of the last parallel block that
1493   enclosed the declaration, and that has closed.
1494
1495 - "conditionally in scope".  The "in scope" block and all parallel
1496   scopes have closed, and no further mention of the name has been seen.
1497   This state includes a secondary nest depth (`min_depth`) which records
1498   the outermost scope seen since the variable became conditionally in
1499   scope.  If a use of the name is found, the variable becomes "in scope"
1500   and that secondary depth becomes the recorded scope depth.  If the
1501   name is declared as a new variable, the old variable becomes "out of
1502   scope" and the recorded scope depth stays unchanged.
1503
1504 - "out of scope".  The variable is neither in scope nor conditionally
1505   in scope.  It is permanently out of scope now and can be removed from
1506   the "in scope" stack.  When a variable becomes out-of-scope it is
1507   moved to a separate list (`out_scope`) of variables which have fully
1508   known scope.  This will be used at the end of each function to assign
1509   each variable a place in the stack frame.
1510
1511 ###### variable fields
1512         int depth, min_depth;
1513         enum { OutScope, PendingScope, CondScope, InScope } scope;
1514         struct variable *in_scope;
1515
1516 ###### parse context
1517
1518         struct variable *in_scope;
1519         struct variable *out_scope;
1520
1521 All variables with the same name are linked together using the
1522 'previous' link.  Those variable that have been affirmatively merged all
1523 have a 'merged' pointer that points to one primary variable - the most
1524 recently declared instance.  When merging variables, we need to also
1525 adjust the 'merged' pointer on any other variables that had previously
1526 been merged with the one that will no longer be primary.
1527
1528 A variable that is no longer the most recent instance of a name may
1529 still have "pending" scope, if it might still be merged with most
1530 recent instance.  These variables don't really belong in the
1531 "in_scope" list, but are not immediately removed when a new instance
1532 is found.  Instead, they are detected and ignored when considering the
1533 list of in_scope names.
1534
1535 The storage of the value of a variable will be described later.  For now
1536 we just need to know that when a variable goes out of scope, it might
1537 need to be freed.  For this we need to be able to find it, so assume that
1538 `var_value()` will provide that.
1539
1540 ###### variable fields
1541         struct variable *merged;
1542
1543 ###### ast functions
1544
1545         static void variable_merge(struct variable *primary, struct variable *secondary)
1546         {
1547                 struct variable *v;
1548
1549                 primary = primary->merged;
1550
1551                 for (v = primary->previous; v; v=v->previous)
1552                         if (v == secondary || v == secondary->merged ||
1553                             v->merged == secondary ||
1554                             v->merged == secondary->merged) {
1555                                 v->scope = OutScope;
1556                                 v->merged = primary;
1557                                 if (v->scope_start < primary->scope_start)
1558                                         primary->scope_start = v->scope_start;
1559                                 if (v->scope_end > primary->scope_end)
1560                                         primary->scope_end = v->scope_end;      // NOTEST
1561                                 variable_unlink_exec(v);
1562                         }
1563         }
1564
1565 ###### forward decls
1566         static struct value *var_value(struct parse_context *c, struct variable *v);
1567
1568 ###### free global vars
1569
1570         while (context.varlist) {
1571                 struct binding *b = context.varlist;
1572                 struct variable *v = b->var;
1573                 context.varlist = b->next;
1574                 free(b);
1575                 while (v) {
1576                         struct variable *next = v->previous;
1577
1578                         if (v->global) {
1579                                 free_value(v->type, var_value(&context, v));
1580                                 if (v->depth == 0)
1581                                         // This is a global constant
1582                                         free_exec(v->where_decl);
1583                         }
1584                         free(v);
1585                         v = next;
1586                 }
1587         }
1588
1589 #### Manipulating Bindings
1590
1591 When a name is conditionally visible, a new declaration discards the old
1592 binding - the condition lapses.  Similarly when we reach the end of a
1593 function (outermost non-global scope) any conditional scope must lapse.
1594 Conversely a usage of the name affirms the visibility and extends it to
1595 the end of the containing block - i.e.  the block that contains both the
1596 original declaration and the latest usage.  This is determined from
1597 `min_depth`.  When a conditionally visible variable gets affirmed like
1598 this, it is also merged with other conditionally visible variables with
1599 the same name.
1600
1601 When we parse a variable declaration we either report an error if the
1602 name is currently bound, or create a new variable at the current nest
1603 depth if the name is unbound or bound to a conditionally scoped or
1604 pending-scope variable.  If the previous variable was conditionally
1605 scoped, it and its homonyms becomes out-of-scope.
1606
1607 When we parse a variable reference (including non-declarative assignment
1608 "foo = bar") we report an error if the name is not bound or is bound to
1609 a pending-scope variable; update the scope if the name is bound to a
1610 conditionally scoped variable; or just proceed normally if the named
1611 variable is in scope.
1612
1613 When we exit a scope, any variables bound at this level are either
1614 marked out of scope or pending-scoped, depending on whether the scope
1615 was sequential or parallel.  Here a "parallel" scope means the "then"
1616 or "else" part of a conditional, or any "case" or "else" branch of a
1617 switch.  Other scopes are "sequential".
1618
1619 When exiting a parallel scope we check if there are any variables that
1620 were previously pending and are still visible. If there are, then
1621 they weren't redeclared in the most recent scope, so they cannot be
1622 merged and must become out-of-scope.  If it is not the first of
1623 parallel scopes (based on `child_count`), we check that there was a
1624 previous binding that is still pending-scope.  If there isn't, the new
1625 variable must now be out-of-scope.
1626
1627 When exiting a sequential scope that immediately enclosed parallel
1628 scopes, we need to resolve any pending-scope variables.  If there was
1629 no `else` clause, and we cannot determine that the `switch` was exhaustive,
1630 we need to mark all pending-scope variable as out-of-scope.  Otherwise
1631 all pending-scope variables become conditionally scoped.
1632
1633 ###### ast
1634         enum closetype { CloseSequential, CloseFunction, CloseParallel, CloseElse };
1635
1636 ###### ast functions
1637
1638         static struct variable *var_decl(struct parse_context *c, struct text s)
1639         {
1640                 struct binding *b = find_binding(c, s);
1641                 struct variable *v = b->var;
1642
1643                 switch (v ? v->scope : OutScope) {
1644                 case InScope:
1645                         /* Caller will report the error */
1646                         return NULL;
1647                 case CondScope:
1648                         for (;
1649                              v && v->scope == CondScope;
1650                              v = v->previous)
1651                                 v->scope = OutScope;
1652                         break;
1653                 default: break;
1654                 }
1655                 v = calloc(1, sizeof(*v));
1656                 v->previous = b->var;
1657                 b->var = v;
1658                 v->name = b;
1659                 v->merged = v;
1660                 v->min_depth = v->depth = c->scope_depth;
1661                 v->scope = InScope;
1662                 v->in_scope = c->in_scope;
1663                 v->scope_start = c->scope_count;
1664                 c->in_scope = v;
1665                 ## variable init
1666                 return v;
1667         }
1668
1669         static struct variable *var_ref(struct parse_context *c, struct text s)
1670         {
1671                 struct binding *b = find_binding(c, s);
1672                 struct variable *v = b->var;
1673                 struct variable *v2;
1674
1675                 switch (v ? v->scope : OutScope) {
1676                 case OutScope:
1677                 case PendingScope:
1678                         /* Caller will report the error */
1679                         return NULL;
1680                 case CondScope:
1681                         /* All CondScope variables of this name need to be merged
1682                          * and become InScope
1683                          */
1684                         v->depth = v->min_depth;
1685                         v->scope = InScope;
1686                         for (v2 = v->previous;
1687                              v2 && v2->scope == CondScope;
1688                              v2 = v2->previous)
1689                                 variable_merge(v, v2);
1690                         break;
1691                 case InScope:
1692                         break;
1693                 }
1694                 return v;
1695         }
1696
1697         static int var_refile(struct parse_context *c, struct variable *v)
1698         {
1699                 /* Variable just went out of scope.  Add it to the out_scope
1700                  * list, sorted by ->scope_start
1701                  */
1702                 struct variable **vp = &c->out_scope;
1703                 while ((*vp) && (*vp)->scope_start < v->scope_start)
1704                         vp = &(*vp)->in_scope;
1705                 v->in_scope = *vp;
1706                 *vp = v;
1707                 return 0;
1708         }
1709
1710         static void var_block_close(struct parse_context *c, enum closetype ct,
1711                                     struct exec *e)
1712         {
1713                 /* Close off all variables that are in_scope.
1714                  * Some variables in c->scope may already be not-in-scope,
1715                  * such as when a PendingScope variable is hidden by a new
1716                  * variable with the same name.
1717                  * So we check for v->name->var != v and drop them.
1718                  * If we choose to make a variable OutScope, we drop it
1719                  * immediately too.
1720                  */
1721                 struct variable *v, **vp, *v2;
1722
1723                 scope_pop(c);
1724                 for (vp = &c->in_scope;
1725                      (v = *vp) && v->min_depth > c->scope_depth;
1726                      (v->scope == OutScope || v->name->var != v)
1727                      ? (*vp =  v->in_scope, var_refile(c, v))
1728                      : ( vp = &v->in_scope, 0)) {
1729                         v->min_depth = c->scope_depth;
1730                         if (v->name->var != v)
1731                                 /* This is still in scope, but we haven't just
1732                                  * closed the scope.
1733                                  */
1734                                 continue;
1735                         v->min_depth = c->scope_depth;
1736                         if (v->scope == InScope)
1737                                 v->scope_end = c->scope_count;
1738                         if (v->scope == InScope && e && !v->global) {
1739                                 /* This variable gets cleaned up when 'e' finishes */
1740                                 variable_unlink_exec(v);
1741                                 v->cleanup_exec = e;
1742                                 v->next_free = e->to_free;
1743                                 e->to_free = v;
1744                         }
1745                         switch (ct) {
1746                         case CloseElse:
1747                         case CloseParallel: /* handle PendingScope */
1748                                 switch(v->scope) {
1749                                 case InScope:
1750                                 case CondScope:
1751                                         if (c->scope_stack->child_count == 1)
1752                                                 /* first among parallel branches */
1753                                                 v->scope = PendingScope;
1754                                         else if (v->previous &&
1755                                                  v->previous->scope == PendingScope)
1756                                                 /* all previous branches used name */
1757                                                 v->scope = PendingScope;
1758                                         else if (v->type == Tlabel)
1759                                                 /* Labels remain pending even when not used */
1760                                                 v->scope = PendingScope;        // UNTESTED
1761                                         else
1762                                                 v->scope = OutScope;
1763                                         if (ct == CloseElse) {
1764                                                 /* All Pending variables with this name
1765                                                  * are now Conditional */
1766                                                 for (v2 = v;
1767                                                      v2 && v2->scope == PendingScope;
1768                                                      v2 = v2->previous)
1769                                                         v2->scope = CondScope;
1770                                         }
1771                                         break;
1772                                 case PendingScope:
1773                                         /* Not possible as it would require
1774                                          * parallel scope to be nested immediately
1775                                          * in a parallel scope, and that never
1776                                          * happens.
1777                                          */                     // NOTEST
1778                                 case OutScope:
1779                                         /* Not possible as we already tested for
1780                                          * OutScope
1781                                          */
1782                                         abort();                // NOTEST
1783                                 }
1784                                 break;
1785                         case CloseFunction:
1786                                 if (v->scope == CondScope)
1787                                         /* Condition cannot continue past end of function */
1788                                         v->scope = InScope;
1789                                 /* fallthrough */
1790                         case CloseSequential:
1791                                 if (v->type == Tlabel)
1792                                         v->scope = PendingScope;
1793                                 switch (v->scope) {
1794                                 case InScope:
1795                                         v->scope = OutScope;
1796                                         break;
1797                                 case PendingScope:
1798                                         /* There was no 'else', so we can only become
1799                                          * conditional if we know the cases were exhaustive,
1800                                          * and that doesn't mean anything yet.
1801                                          * So only labels become conditional..
1802                                          */
1803                                         for (v2 = v;
1804                                              v2 && v2->scope == PendingScope;
1805                                              v2 = v2->previous)
1806                                                 if (v2->type == Tlabel)
1807                                                         v2->scope = CondScope;
1808                                                 else
1809                                                         v2->scope = OutScope;
1810                                         break;
1811                                 case CondScope:
1812                                 case OutScope: break;
1813                                 }
1814                                 break;
1815                         }
1816                 }
1817         }
1818
1819 #### Storing Values
1820
1821 The value of a variable is store separately from the variable, on an
1822 analogue of a stack frame.  There are (currently) two frames that can be
1823 active.  A global frame which currently only stores constants, and a
1824 stacked frame which stores local variables.  Each variable knows if it
1825 is global or not, and what its index into the frame is.
1826
1827 Values in the global frame are known immediately they are relevant, so
1828 the frame needs to be reallocated as it grows so it can store those
1829 values.  The local frame doesn't get values until the interpreted phase
1830 is started, so there is no need to allocate until the size is known.
1831
1832 We initialize the `frame_pos` to an impossible value, so that we can
1833 tell if it was set or not later.
1834
1835 ###### variable fields
1836         short frame_pos;
1837         short global;
1838
1839 ###### variable init
1840         v->frame_pos = -1;
1841
1842 ###### parse context
1843
1844         short global_size, global_alloc;
1845         short local_size;
1846         void *global, *local;
1847
1848 ###### ast functions
1849
1850         static struct value *var_value(struct parse_context *c, struct variable *v)
1851         {
1852                 if (!v->global) {
1853                         if (!c->local || !v->type)
1854                                 return NULL;                    // NOTEST
1855                         if (v->frame_pos + v->type->size > c->local_size) {
1856                                 printf("INVALID frame_pos\n");  // NOTEST
1857                                 exit(2);                        // NOTEST
1858                         }
1859                         return c->local + v->frame_pos;
1860                 }
1861                 if (c->global_size > c->global_alloc) {
1862                         int old = c->global_alloc;
1863                         c->global_alloc = (c->global_size | 1023) + 1024;
1864                         c->global = realloc(c->global, c->global_alloc);
1865                         memset(c->global + old, 0, c->global_alloc - old);
1866                 }
1867                 return c->global + v->frame_pos;
1868         }
1869
1870         static struct value *global_alloc(struct parse_context *c, struct type *t,
1871                                           struct variable *v, struct value *init)
1872         {
1873                 struct value *ret;
1874                 struct variable scratch;
1875
1876                 if (t->prepare_type)
1877                         t->prepare_type(c, t, 1);       // NOTEST
1878
1879                 if (c->global_size & (t->align - 1))
1880                         c->global_size = (c->global_size + t->align) & ~(t->align-1);
1881                 if (!v) {
1882                         v = &scratch;
1883                         v->type = t;
1884                 }
1885                 v->frame_pos = c->global_size;
1886                 v->global = 1;
1887                 c->global_size += v->type->size;
1888                 ret = var_value(c, v);
1889                 if (init)
1890                         memcpy(ret, init, t->size);
1891                 else
1892                         val_init(t, ret);
1893                 return ret;
1894         }
1895
1896 As global values are found -- struct field initializers, labels etc --
1897 `global_alloc()` is called to record the value in the global frame.
1898
1899 When the program is fully parsed, each function is analysed, we need to
1900 walk the list of variables local to that function and assign them an
1901 offset in the stack frame.  For this we have `scope_finalize()`.
1902
1903 We keep the stack from dense by re-using space for between variables
1904 that are not in scope at the same time.  The `out_scope` list is sorted
1905 by `scope_start` and as we process a varible, we move it to an FIFO
1906 stack.  For each variable we consider, we first discard any from the
1907 stack anything that went out of scope before the new variable came in.
1908 Then we place the new variable just after the one at the top of the
1909 stack.
1910
1911 ###### ast functions
1912
1913         static void scope_finalize(struct parse_context *c, struct type *ft)
1914         {
1915                 int size = ft->function.local_size;
1916                 struct variable *next = ft->function.scope;
1917                 struct variable *done = NULL;
1918
1919                 while (next) {
1920                         struct variable *v = next;
1921                         struct type *t = v->type;
1922                         int pos;
1923                         next = v->in_scope;
1924                         if (v->merged != v)
1925                                 continue;
1926                         if (!t)
1927                                 continue;
1928                         if (v->frame_pos >= 0)
1929                                 continue;
1930                         while (done && done->scope_end < v->scope_start)
1931                                 done = done->in_scope;
1932                         if (done)
1933                                 pos = done->frame_pos + done->type->size;
1934                         else
1935                                 pos = ft->function.local_size;
1936                         if (pos & (t->align - 1))
1937                                 pos = (pos + t->align) & ~(t->align-1);
1938                         v->frame_pos = pos;
1939                         if (size < pos + v->type->size)
1940                                 size = pos + v->type->size;
1941                         v->in_scope = done;
1942                         done = v;
1943                 }
1944                 c->out_scope = NULL;
1945                 ft->function.local_size = size;
1946         }
1947
1948 ###### free context storage
1949         free(context.global);
1950
1951 #### Variables as executables
1952
1953 Just as we used a `val` to wrap a value into an `exec`, we similarly
1954 need a `var` to wrap a `variable` into an exec.  While each `val`
1955 contained a copy of the value, each `var` holds a link to the variable
1956 because it really is the same variable no matter where it appears.
1957 When a variable is used, we need to remember to follow the `->merged`
1958 link to find the primary instance.
1959
1960 When a variable is declared, it may or may not be given an explicit
1961 type.  We need to record which so that we can report the parsed code
1962 correctly.
1963
1964 ###### exec type
1965         Xvar,
1966
1967 ###### ast
1968         struct var {
1969                 struct exec;
1970                 struct variable *var;
1971         };
1972
1973 ###### variable fields
1974         int explicit_type;
1975
1976 ###### Grammar
1977
1978         $TERM : ::
1979
1980         $*var
1981         VariableDecl -> IDENTIFIER : ${ {
1982                 struct variable *v = var_decl(c, $1.txt);
1983                 $0 = new_pos(var, $1);
1984                 $0->var = v;
1985                 if (v)
1986                         v->where_decl = $0;
1987                 else {
1988                         v = var_ref(c, $1.txt);
1989                         $0->var = v;
1990                         type_err(c, "error: variable '%v' redeclared",
1991                                  $0, NULL, 0, NULL);
1992                         type_err(c, "info: this is where '%v' was first declared",
1993                                  v->where_decl, NULL, 0, NULL);
1994                 }
1995         } }$
1996         | IDENTIFIER :: ${ {
1997                 struct variable *v = var_decl(c, $1.txt);
1998                 $0 = new_pos(var, $1);
1999                 $0->var = v;
2000                 if (v) {
2001                         v->where_decl = $0;
2002                         v->constant = 1;
2003                 } else {
2004                         v = var_ref(c, $1.txt);
2005                         $0->var = v;
2006                         type_err(c, "error: variable '%v' redeclared",
2007                                  $0, NULL, 0, NULL);
2008                         type_err(c, "info: this is where '%v' was first declared",
2009                                  v->where_decl, NULL, 0, NULL);
2010                 }
2011         } }$
2012         | IDENTIFIER : Type ${ {
2013                 struct variable *v = var_decl(c, $1.txt);
2014                 $0 = new_pos(var, $1);
2015                 $0->var = v;
2016                 if (v) {
2017                         v->where_decl = $0;
2018                         v->where_set = $0;
2019                         v->type = $<Type;
2020                         v->explicit_type = 1;
2021                 } else {
2022                         v = var_ref(c, $1.txt);
2023                         $0->var = v;
2024                         type_err(c, "error: variable '%v' redeclared",
2025                                  $0, NULL, 0, NULL);
2026                         type_err(c, "info: this is where '%v' was first declared",
2027                                  v->where_decl, NULL, 0, NULL);
2028                 }
2029         } }$
2030         | IDENTIFIER :: Type ${ {
2031                 struct variable *v = var_decl(c, $1.txt);
2032                 $0 = new_pos(var, $1);
2033                 $0->var = v;
2034                 if (v) {
2035                         v->where_decl = $0;
2036                         v->where_set = $0;
2037                         v->type = $<Type;
2038                         v->constant = 1;
2039                         v->explicit_type = 1;
2040                 } else {
2041                         v = var_ref(c, $1.txt);
2042                         $0->var = v;
2043                         type_err(c, "error: variable '%v' redeclared",
2044                                  $0, NULL, 0, NULL);
2045                         type_err(c, "info: this is where '%v' was first declared",
2046                                  v->where_decl, NULL, 0, NULL);
2047                 }
2048         } }$
2049
2050         $*exec
2051         Variable -> IDENTIFIER ${ {
2052                 struct variable *v = var_ref(c, $1.txt);
2053                 $0 = new_pos(var, $1);
2054                 if (v == NULL) {
2055                         /* This might be a label - allocate a var just in case */
2056                         v = var_decl(c, $1.txt);
2057                         if (v) {
2058                                 v->type = Tnone;
2059                                 v->where_decl = $0;
2060                                 v->where_set = $0;
2061                         }
2062                 }
2063                 cast(var, $0)->var = v;
2064         } }$
2065
2066 ###### print exec cases
2067         case Xvar:
2068         {
2069                 struct var *v = cast(var, e);
2070                 if (v->var) {
2071                         struct binding *b = v->var->name;
2072                         printf("%.*s", b->name.len, b->name.txt);
2073                 }
2074                 break;
2075         }
2076
2077 ###### format cases
2078         case 'v':
2079                 if (loc && loc->type == Xvar) {
2080                         struct var *v = cast(var, loc);
2081                         if (v->var) {
2082                                 struct binding *b = v->var->name;
2083                                 fprintf(stderr, "%.*s", b->name.len, b->name.txt);
2084                         } else
2085                                 fputs("???", stderr);   // NOTEST
2086                 } else
2087                         fputs("NOTVAR", stderr);
2088                 break;
2089
2090 ###### propagate exec cases
2091
2092         case Xvar:
2093         {
2094                 struct var *var = cast(var, prog);
2095                 struct variable *v = var->var;
2096                 if (!v) {
2097                         type_err(c, "%d:BUG: no variable!!", prog, NULL, 0, NULL); // NOTEST
2098                         return Tnone;                                   // NOTEST
2099                 }
2100                 v = v->merged;
2101                 if (v->constant && (rules & Rnoconstant)) {
2102                         type_err(c, "error: Cannot assign to a constant: %v",
2103                                  prog, NULL, 0, NULL);
2104                         type_err(c, "info: name was defined as a constant here",
2105                                  v->where_decl, NULL, 0, NULL);
2106                         return v->type;
2107                 }
2108                 if (v->type == Tnone && v->where_decl == prog)
2109                         type_err(c, "error: variable used but not declared: %v",
2110                                  prog, NULL, 0, NULL);
2111                 if (v->type == NULL) {
2112                         if (type && *ok != 0) {
2113                                 v->type = type;
2114                                 v->where_set = prog;
2115                                 *ok = 2;
2116                         }
2117                         return type;
2118                 }
2119                 if (!type_compat(type, v->type, rules)) {
2120                         type_err(c, "error: expected %1%r but variable '%v' is %2", prog,
2121                                  type, rules, v->type);
2122                         type_err(c, "info: this is where '%v' was set to %1", v->where_set,
2123                                  v->type, rules, NULL);
2124                 }
2125                 if (!type)
2126                         return v->type;
2127                 return type;
2128         }
2129
2130 ###### interp exec cases
2131         case Xvar:
2132         {
2133                 struct var *var = cast(var, e);
2134                 struct variable *v = var->var;
2135
2136                 v = v->merged;
2137                 lrv = var_value(c, v);
2138                 rvtype = v->type;
2139                 break;
2140         }
2141
2142 ###### ast functions
2143
2144         static void free_var(struct var *v)
2145         {
2146                 free(v);
2147         }
2148
2149 ###### free exec cases
2150         case Xvar: free_var(cast(var, e)); break;
2151
2152
2153 ### Complex types
2154
2155 Now that we have the shape of the interpreter in place we can add some
2156 complex types and connected them in to the data structures and the
2157 different phases of parse, analyse, print, interpret.
2158
2159 Being "complex" the language will naturally have syntax to access
2160 specifics of objects of these types.  These will fit into the grammar as
2161 "Terms" which are the things that are combined with various operators to
2162 form "Expression".  Where a Term is formed by some operation on another
2163 Term, the subordinate Term will always come first, so for example a
2164 member of an array will be expressed as the Term for the array followed
2165 by an index in square brackets.  The strict rule of using postfix
2166 operations makes precedence irrelevant within terms.  To provide a place
2167 to put the grammar for each terms of each type, we will start out by
2168 introducing the "Term" grammar production, with contains at least a
2169 simple "Value" (to be explained later).
2170
2171 ###### Grammar
2172         $*exec
2173         Term ->  Value ${ $0 = $<1; }$
2174         | Variable ${ $0 = $<1; }$
2175         ## term grammar
2176
2177 Thus far the complex types we have are arrays and structs.
2178
2179 #### Arrays
2180
2181 Arrays can be declared by giving a size and a type, as `[size]type' so
2182 `freq:[26]number` declares `freq` to be an array of 26 numbers.  The
2183 size can be either a literal number, or a named constant.  Some day an
2184 arbitrary expression will be supported.
2185
2186 As a formal parameter to a function, the array can be declared with a
2187 new variable as the size: `name:[size::number]string`.  The `size`
2188 variable is set to the size of the array and must be a constant.  As
2189 `number` is the only supported type, it can be left out:
2190 `name:[size::]string`.
2191
2192 Arrays cannot be assigned.  When pointers are introduced we will also
2193 introduce array slices which can refer to part or all of an array -
2194 the assignment syntax will create a slice.  For now, an array can only
2195 ever be referenced by the name it is declared with.  It is likely that
2196 a "`copy`" primitive will eventually be define which can be used to
2197 make a copy of an array with controllable recursive depth.
2198
2199 For now we have two sorts of array, those with fixed size either because
2200 it is given as a literal number or because it is a struct member (which
2201 cannot have a runtime-changing size), and those with a size that is
2202 determined at runtime - local variables with a const size.  The former
2203 have their size calculated at parse time, the latter at run time.
2204
2205 For the latter type, the `size` field of the type is the size of a
2206 pointer, and the array is reallocated every time it comes into scope.
2207
2208 We differentiate struct fields with a const size from local variables
2209 with a const size by whether they are prepared at parse time or not.
2210
2211 ###### type union fields
2212
2213         struct {
2214                 int unspec;     // size is unspecified - vsize must be set.
2215                 short size;
2216                 short static_size;
2217                 struct variable *vsize;
2218                 struct type *member;
2219         } array;
2220
2221 ###### value union fields
2222         void *array;  // used if not static_size
2223
2224 ###### value functions
2225
2226         static void array_prepare_type(struct parse_context *c, struct type *type,
2227                                        int parse_time)
2228         {
2229                 struct value *vsize;
2230                 mpz_t q;
2231                 if (type->array.static_size)
2232                         return; // NOTEST
2233                 if (type->array.unspec && parse_time)
2234                         return; // NOTEST
2235
2236                 if (type->array.vsize) {
2237                         vsize = var_value(c, type->array.vsize);
2238                         if (!vsize)
2239                                 return; // NOTEST
2240                         mpz_init(q);
2241                         mpz_tdiv_q(q, mpq_numref(vsize->num), mpq_denref(vsize->num));
2242                         type->array.size = mpz_get_si(q);
2243                         mpz_clear(q);
2244                 }
2245
2246                 if (parse_time && type->array.member->size) {
2247                         type->array.static_size = 1;
2248                         type->size = type->array.size * type->array.member->size;
2249                         type->align = type->array.member->align;
2250                 }
2251         }
2252
2253         static void array_init(struct type *type, struct value *val)
2254         {
2255                 int i;
2256                 void *ptr = val->ptr;
2257
2258                 if (!val)
2259                         return;                         // NOTEST
2260                 if (!type->array.static_size) {
2261                         val->array = calloc(type->array.size,
2262                                             type->array.member->size);
2263                         ptr = val->array;
2264                 }
2265                 for (i = 0; i < type->array.size; i++) {
2266                         struct value *v;
2267                         v = (void*)ptr + i * type->array.member->size;
2268                         val_init(type->array.member, v);
2269                 }
2270         }
2271
2272         static void array_free(struct type *type, struct value *val)
2273         {
2274                 int i;
2275                 void *ptr = val->ptr;
2276
2277                 if (!type->array.static_size)
2278                         ptr = val->array;
2279                 for (i = 0; i < type->array.size; i++) {
2280                         struct value *v;
2281                         v = (void*)ptr + i * type->array.member->size;
2282                         free_value(type->array.member, v);
2283                 }
2284                 if (!type->array.static_size)
2285                         free(ptr);
2286         }
2287
2288         static int array_compat(struct type *require, struct type *have)
2289         {
2290                 if (have->compat != require->compat)
2291                         return 0;
2292                 /* Both are arrays, so we can look at details */
2293                 if (!type_compat(require->array.member, have->array.member, 0))
2294                         return 0;
2295                 if (have->array.unspec && require->array.unspec) {
2296                         if (have->array.vsize && require->array.vsize &&
2297                             have->array.vsize != require->array.vsize)  // UNTESTED
2298                                 /* sizes might not be the same */
2299                                 return 0;       // UNTESTED
2300                         return 1;
2301                 }
2302                 if (have->array.unspec || require->array.unspec)
2303                         return 1;       // UNTESTED
2304                 if (require->array.vsize == NULL && have->array.vsize == NULL)
2305                         return require->array.size == have->array.size;
2306
2307                 return require->array.vsize == have->array.vsize;       // UNTESTED
2308         }
2309
2310         static void array_print_type(struct type *type, FILE *f)
2311         {
2312                 fputs("[", f);
2313                 if (type->array.vsize) {
2314                         struct binding *b = type->array.vsize->name;
2315                         fprintf(f, "%.*s%s]", b->name.len, b->name.txt,
2316                                 type->array.unspec ? "::" : "");
2317                 } else if (type->array.size)
2318                         fprintf(f, "%d]", type->array.size);
2319                 else
2320                         fprintf(f, "]");
2321                 type_print(type->array.member, f);
2322         }
2323
2324         static struct type array_prototype = {
2325                 .init = array_init,
2326                 .prepare_type = array_prepare_type,
2327                 .print_type = array_print_type,
2328                 .compat = array_compat,
2329                 .free = array_free,
2330                 .size = sizeof(void*),
2331                 .align = sizeof(void*),
2332         };
2333
2334 ###### declare terminals
2335         $TERM [ ]
2336
2337 ###### type grammar
2338
2339         | [ NUMBER ] Type ${ {
2340                 char tail[3];
2341                 mpq_t num;
2342                 struct type *t;
2343                 int elements = 0;
2344
2345                 if (number_parse(num, tail, $2.txt) == 0)
2346                         tok_err(c, "error: unrecognised number", &$2);
2347                 else if (tail[0]) {
2348                         tok_err(c, "error: unsupported number suffix", &$2);
2349                         mpq_clear(num);
2350                 } else {
2351                         elements = mpz_get_ui(mpq_numref(num));
2352                         if (mpz_cmp_ui(mpq_denref(num), 1) != 0) {
2353                                 tok_err(c, "error: array size must be an integer",
2354                                         &$2);
2355                         } else if (mpz_cmp_ui(mpq_numref(num), 1UL << 30) >= 0)
2356                                 tok_err(c, "error: array size is too large",
2357                                         &$2);
2358                         mpq_clear(num);
2359                 }
2360
2361                 $0 = t = add_anon_type(c, &array_prototype, "array[%d]", elements );
2362                 t->array.size = elements;
2363                 t->array.member = $<4;
2364                 t->array.vsize = NULL;
2365         } }$
2366
2367         | [ IDENTIFIER ] Type ${ {
2368                 struct variable *v = var_ref(c, $2.txt);
2369
2370                 if (!v)
2371                         tok_err(c, "error: name undeclared", &$2);
2372                 else if (!v->constant)
2373                         tok_err(c, "error: array size must be a constant", &$2);
2374
2375                 $0 = add_anon_type(c, &array_prototype, "array[%.*s]", $2.txt.len, $2.txt.txt);
2376                 $0->array.member = $<4;
2377                 $0->array.size = 0;
2378                 $0->array.vsize = v;
2379         } }$
2380
2381 ###### Grammar
2382         $*type
2383         OptType -> Type ${ $0 = $<1; }$
2384                 | ${ $0 = NULL; }$
2385
2386 ###### formal type grammar
2387
2388         | [ IDENTIFIER :: OptType ] Type ${ {
2389                 struct variable *v = var_decl(c, $ID.txt);
2390
2391                 v->type = $<OT;
2392                 v->constant = 1;
2393                 if (!v->type)
2394                         v->type = Tnum;
2395                 $0 = add_anon_type(c, &array_prototype, "array[var]");
2396                 $0->array.member = $<6;
2397                 $0->array.size = 0;
2398                 $0->array.unspec = 1;
2399                 $0->array.vsize = v;
2400         } }$
2401
2402 ###### Binode types
2403         Index,
2404
2405 ###### term grammar
2406
2407         | Term [ Expression ] ${ {
2408                 struct binode *b = new(binode);
2409                 b->op = Index;
2410                 b->left = $<1;
2411                 b->right = $<3;
2412                 $0 = b;
2413         } }$
2414
2415 ###### print binode cases
2416         case Index:
2417                 print_exec(b->left, -1, bracket);
2418                 printf("[");
2419                 print_exec(b->right, -1, bracket);
2420                 printf("]");
2421                 break;
2422
2423 ###### propagate binode cases
2424         case Index:
2425                 /* left must be an array, right must be a number,
2426                  * result is the member type of the array
2427                  */
2428                 propagate_types(b->right, c, ok, Tnum, 0);
2429                 t = propagate_types(b->left, c, ok, NULL, rules & Rnoconstant);
2430                 if (!t || t->compat != array_compat) {
2431                         type_err(c, "error: %1 cannot be indexed", prog, t, 0, NULL);
2432                         return NULL;
2433                 } else {
2434                         if (!type_compat(type, t->array.member, rules)) {
2435                                 type_err(c, "error: have %1 but need %2", prog,
2436                                          t->array.member, rules, type);
2437                         }
2438                         return t->array.member;
2439                 }
2440                 break;
2441
2442 ###### interp binode cases
2443         case Index: {
2444                 mpz_t q;
2445                 long i;
2446                 void *ptr;
2447
2448                 lleft = linterp_exec(c, b->left, &ltype);
2449                 right = interp_exec(c, b->right, &rtype);
2450                 mpz_init(q);
2451                 mpz_tdiv_q(q, mpq_numref(right.num), mpq_denref(right.num));
2452                 i = mpz_get_si(q);
2453                 mpz_clear(q);
2454
2455                 if (ltype->array.static_size)
2456                         ptr = lleft;
2457                 else
2458                         ptr = *(void**)lleft;
2459                 rvtype = ltype->array.member;
2460                 if (i >= 0 && i < ltype->array.size)
2461                         lrv = ptr + i * rvtype->size;
2462                 else
2463                         val_init(ltype->array.member, &rv); // UNSAFE
2464                 ltype = NULL;
2465                 break;
2466         }
2467
2468 #### Structs
2469
2470 A `struct` is a data-type that contains one or more other data-types.
2471 It differs from an array in that each member can be of a different
2472 type, and they are accessed by name rather than by number.  Thus you
2473 cannot choose an element by calculation, you need to know what you
2474 want up-front.
2475
2476 The language makes no promises about how a given structure will be
2477 stored in memory - it is free to rearrange fields to suit whatever
2478 criteria seems important.
2479
2480 Structs are declared separately from program code - they cannot be
2481 declared in-line in a variable declaration like arrays can.  A struct
2482 is given a name and this name is used to identify the type - the name
2483 is not prefixed by the word `struct` as it would be in C.
2484
2485 Structs are only treated as the same if they have the same name.
2486 Simply having the same fields in the same order is not enough.  This
2487 might change once we can create structure initializers from a list of
2488 values.
2489
2490 Each component datum is identified much like a variable is declared,
2491 with a name, one or two colons, and a type.  The type cannot be omitted
2492 as there is no opportunity to deduce the type from usage.  An initial
2493 value can be given following an equals sign, so
2494
2495 ##### Example: a struct type
2496
2497         struct complex:
2498                 x:number = 0
2499                 y:number = 0
2500
2501 would declare a type called "complex" which has two number fields,
2502 each initialised to zero.
2503
2504 Struct will need to be declared separately from the code that uses
2505 them, so we will need to be able to print out the declaration of a
2506 struct when reprinting the whole program.  So a `print_type_decl` type
2507 function will be needed.
2508
2509 ###### type union fields
2510
2511         struct {
2512                 int nfields;
2513                 struct field {
2514                         struct text name;
2515                         struct type *type;
2516                         struct value *init;
2517                         int offset;
2518                 } *fields;
2519         } structure;
2520
2521 ###### type functions
2522         void (*print_type_decl)(struct type *type, FILE *f);
2523
2524 ###### value functions
2525
2526         static void structure_init(struct type *type, struct value *val)
2527         {
2528                 int i;
2529
2530                 for (i = 0; i < type->structure.nfields; i++) {
2531                         struct value *v;
2532                         v = (void*) val->ptr + type->structure.fields[i].offset;
2533                         if (type->structure.fields[i].init)
2534                                 dup_value(type->structure.fields[i].type,
2535                                           type->structure.fields[i].init,
2536                                           v);
2537                         else
2538                                 val_init(type->structure.fields[i].type, v);
2539                 }
2540         }
2541
2542         static void structure_free(struct type *type, struct value *val)
2543         {
2544                 int i;
2545
2546                 for (i = 0; i < type->structure.nfields; i++) {
2547                         struct value *v;
2548                         v = (void*)val->ptr + type->structure.fields[i].offset;
2549                         free_value(type->structure.fields[i].type, v);
2550                 }
2551         }
2552
2553         static void structure_free_type(struct type *t)
2554         {
2555                 int i;
2556                 for (i = 0; i < t->structure.nfields; i++)
2557                         if (t->structure.fields[i].init) {
2558                                 free_value(t->structure.fields[i].type,
2559                                            t->structure.fields[i].init);
2560                         }
2561                 free(t->structure.fields);
2562         }
2563
2564         static struct type structure_prototype = {
2565                 .init = structure_init,
2566                 .free = structure_free,
2567                 .free_type = structure_free_type,
2568                 .print_type_decl = structure_print_type,
2569         };
2570
2571 ###### exec type
2572         Xfieldref,
2573
2574 ###### ast
2575         struct fieldref {
2576                 struct exec;
2577                 struct exec *left;
2578                 int index;
2579                 struct text name;
2580         };
2581
2582 ###### free exec cases
2583         case Xfieldref:
2584                 free_exec(cast(fieldref, e)->left);
2585                 free(e);
2586                 break;
2587
2588 ###### declare terminals
2589         $TERM struct .
2590
2591 ###### term grammar
2592
2593         | Term . IDENTIFIER ${ {
2594                 struct fieldref *fr = new_pos(fieldref, $2);
2595                 fr->left = $<1;
2596                 fr->name = $3.txt;
2597                 fr->index = -2;
2598                 $0 = fr;
2599         } }$
2600
2601 ###### print exec cases
2602
2603         case Xfieldref:
2604         {
2605                 struct fieldref *f = cast(fieldref, e);
2606                 print_exec(f->left, -1, bracket);
2607                 printf(".%.*s", f->name.len, f->name.txt);
2608                 break;
2609         }
2610
2611 ###### ast functions
2612         static int find_struct_index(struct type *type, struct text field)
2613         {
2614                 int i;
2615                 for (i = 0; i < type->structure.nfields; i++)
2616                         if (text_cmp(type->structure.fields[i].name, field) == 0)
2617                                 return i;
2618                 return -1;
2619         }
2620
2621 ###### propagate exec cases
2622
2623         case Xfieldref:
2624         {
2625                 struct fieldref *f = cast(fieldref, prog);
2626                 struct type *st = propagate_types(f->left, c, ok, NULL, 0);
2627
2628                 if (!st)
2629                         type_err(c, "error: unknown type for field access", f->left,    // UNTESTED
2630                                  NULL, 0, NULL);
2631                 else if (st->init != structure_init)
2632                         type_err(c, "error: field reference attempted on %1, not a struct",
2633                                  f->left, st, 0, NULL);
2634                 else if (f->index == -2) {
2635                         f->index = find_struct_index(st, f->name);
2636                         if (f->index < 0)
2637                                 type_err(c, "error: cannot find requested field in %1",
2638                                          f->left, st, 0, NULL);
2639                 }
2640                 if (f->index >= 0) {
2641                         struct type *ft = st->structure.fields[f->index].type;
2642                         if (!type_compat(type, ft, rules))
2643                                 type_err(c, "error: have %1 but need %2", prog,
2644                                          ft, rules, type);
2645                         return ft;
2646                 }
2647                 break;
2648         }
2649
2650 ###### interp exec cases
2651         case Xfieldref:
2652         {
2653                 struct fieldref *f = cast(fieldref, e);
2654                 struct type *ltype;
2655                 struct value *lleft = linterp_exec(c, f->left, &ltype);
2656                 lrv = (void*)lleft->ptr + ltype->structure.fields[f->index].offset;
2657                 rvtype = ltype->structure.fields[f->index].type;
2658                 break;
2659         }
2660
2661 ###### ast
2662         struct fieldlist {
2663                 struct fieldlist *prev;
2664                 struct field f;
2665         };
2666
2667 ###### ast functions
2668         static void free_fieldlist(struct fieldlist *f)
2669         {
2670                 if (!f)
2671                         return;
2672                 free_fieldlist(f->prev);
2673                 if (f->f.init) {
2674                         free_value(f->f.type, f->f.init);       // UNTESTED
2675                         free(f->f.init);        // UNTESTED
2676                 }
2677                 free(f);
2678         }
2679
2680 ###### top level grammar
2681         DeclareStruct -> struct IDENTIFIER FieldBlock Newlines ${ {
2682                 struct type *t =
2683                         add_type(c, $2.txt, &structure_prototype);
2684                 int cnt = 0;
2685                 struct fieldlist *f;
2686
2687                 for (f = $3; f; f=f->prev)
2688                         cnt += 1;
2689
2690                 t->structure.nfields = cnt;
2691                 t->structure.fields = calloc(cnt, sizeof(struct field));
2692                 f = $3;
2693                 while (cnt > 0) {
2694                         int a = f->f.type->align;
2695                         cnt -= 1;
2696                         t->structure.fields[cnt] = f->f;
2697                         if (t->size & (a-1))
2698                                 t->size = (t->size | (a-1)) + 1;
2699                         t->structure.fields[cnt].offset = t->size;
2700                         t->size += ((f->f.type->size - 1) | (a-1)) + 1;
2701                         if (a > t->align)
2702                                 t->align = a;
2703                         f->f.init = NULL;
2704                         f = f->prev;
2705                 }
2706         } }$
2707
2708         $*fieldlist
2709         FieldBlock -> { IN OptNL FieldLines OUT OptNL } ${ $0 = $<FL; }$
2710         | { SimpleFieldList } ${ $0 = $<SFL; }$
2711         | IN OptNL FieldLines OUT ${ $0 = $<FL; }$
2712         | SimpleFieldList EOL ${ $0 = $<SFL; }$
2713
2714         FieldLines -> SimpleFieldList Newlines ${ $0 = $<SFL; }$
2715         | FieldLines SimpleFieldList Newlines ${
2716                 $SFL->prev = $<FL;
2717                 $0 = $<SFL;
2718         }$
2719
2720         SimpleFieldList -> Field ${ $0 = $<F; }$
2721         | SimpleFieldList ; Field ${
2722                 $F->prev = $<SFL;
2723                 $0 = $<F;
2724         }$
2725         | SimpleFieldList ; ${
2726                 $0 = $<SFL;
2727         }$
2728         | ERROR ${ tok_err(c, "Syntax error in struct field", &$1); }$
2729
2730         Field -> IDENTIFIER : Type = Expression ${ {
2731                 int ok;
2732
2733                 $0 = calloc(1, sizeof(struct fieldlist));
2734                 $0->f.name = $1.txt;
2735                 $0->f.type = $<3;
2736                 $0->f.init = NULL;
2737                 do {
2738                         ok = 1;
2739                         propagate_types($<5, c, &ok, $3, 0);
2740                 } while (ok == 2);
2741                 if (!ok)
2742                         c->parse_error = 1;     // UNTESTED
2743                 else {
2744                         struct value vl = interp_exec(c, $5, NULL);
2745                         $0->f.init = global_alloc(c, $0->f.type, NULL, &vl);
2746                 }
2747         } }$
2748         | IDENTIFIER : Type ${
2749                 $0 = calloc(1, sizeof(struct fieldlist));
2750                 $0->f.name = $1.txt;
2751                 $0->f.type = $<3;
2752                 if ($0->f.type->prepare_type)
2753                         $0->f.type->prepare_type(c, $0->f.type, 1);
2754         }$
2755
2756 ###### forward decls
2757         static void structure_print_type(struct type *t, FILE *f);
2758
2759 ###### value functions
2760         static void structure_print_type(struct type *t, FILE *f)
2761         {
2762                 int i;
2763
2764                 fprintf(f, "struct %.*s\n", t->name.len, t->name.txt);
2765
2766                 for (i = 0; i < t->structure.nfields; i++) {
2767                         struct field *fl = t->structure.fields + i;
2768                         fprintf(f, "    %.*s : ", fl->name.len, fl->name.txt);
2769                         type_print(fl->type, f);
2770                         if (fl->type->print && fl->init) {
2771                                 fprintf(f, " = ");
2772                                 if (fl->type == Tstr)
2773                                         fprintf(f, "\"");       // UNTESTED
2774                                 print_value(fl->type, fl->init, f);
2775                                 if (fl->type == Tstr)
2776                                         fprintf(f, "\"");       // UNTESTED
2777                         }
2778                         fprintf(f, "\n");
2779                 }
2780         }
2781
2782 ###### print type decls
2783         {
2784                 struct type *t;
2785                 int target = -1;
2786
2787                 while (target != 0) {
2788                         int i = 0;
2789                         for (t = context.typelist; t ; t=t->next)
2790                                 if (!t->anon && t->print_type_decl &&
2791                                     !t->check_args) {
2792                                         i += 1;
2793                                         if (i == target)
2794                                                 break;
2795                                 }
2796
2797                         if (target == -1) {
2798                                 target = i;
2799                         } else {
2800                                 t->print_type_decl(t, stdout);
2801                                 target -= 1;
2802                         }
2803                 }
2804         }
2805
2806 #### Functions
2807
2808 A function is a chunk of code which can be passed parameters and can
2809 return results.  Each function has a type which includes the set of
2810 parameters and the return value.  As yet these types cannot be declared
2811 separately from the function itself.
2812
2813 The parameters can be specified either in parentheses as a ';' separated
2814 list, such as
2815
2816 ##### Example: function 1
2817
2818         func main(av:[ac::number]string; env:[envc::number]string)
2819                 code block
2820
2821 or as an indented list of one parameter per line (though each line can
2822 be a ';' separated list)
2823
2824 ##### Example: function 2
2825
2826         func main
2827                 argv:[argc::number]string
2828                 env:[envc::number]string
2829         do
2830                 code block
2831
2832 In the first case a return type can follow the parentheses after a colon,
2833 in the second it is given on a line starting with the word `return`.
2834
2835 ##### Example: functions that return
2836
2837         func add(a:number; b:number): number
2838                 code block
2839
2840         func catenate
2841                 a: string
2842                 b: string
2843         return string
2844         do
2845                 code block
2846
2847 Rather than returning a type, the function can specify a set of local
2848 variables to return as a struct.  The values of these variables when the
2849 function exits will be provided to the caller.  For this the return type
2850 is replaced with a block of result declarations, either in parentheses
2851 or bracketed by `return` and `do`.
2852
2853 ##### Example: functions returning multiple variables
2854
2855         func to_cartesian(rho:number; theta:number):(x:number; y:number)
2856                 x = .....
2857                 y = .....
2858
2859         func to_polar
2860                 x:number; y:number
2861         return
2862                 rho:number
2863                 theta:number
2864         do
2865                 rho = ....
2866                 theta = ....
2867
2868 For constructing the lists we use a `List` binode, which will be
2869 further detailed when Expression Lists are introduced.
2870
2871 ###### type union fields
2872
2873         struct {
2874                 struct binode *params;
2875                 struct type *return_type;
2876                 struct variable *scope;
2877                 int inline_result;      // return value is at start of 'local'
2878                 int local_size;
2879         } function;
2880
2881 ###### value union fields
2882         struct exec *function;
2883
2884 ###### type functions
2885         void (*check_args)(struct parse_context *c, int *ok,
2886                            struct type *require, struct exec *args);
2887
2888 ###### value functions
2889
2890         static void function_free(struct type *type, struct value *val)
2891         {
2892                 free_exec(val->function);
2893                 val->function = NULL;
2894         }
2895
2896         static int function_compat(struct type *require, struct type *have)
2897         {
2898                 // FIXME can I do anything here yet?
2899                 return 0;
2900         }
2901
2902         static void function_check_args(struct parse_context *c, int *ok,
2903                                         struct type *require, struct exec *args)
2904         {
2905                 /* This should be 'compat', but we don't have a 'tuple' type to
2906                  * hold the type of 'args'
2907                  */
2908                 struct binode *arg = cast(binode, args);
2909                 struct binode *param = require->function.params;
2910
2911                 while (param) {
2912                         struct var *pv = cast(var, param->left);
2913                         if (!arg) {
2914                                 type_err(c, "error: insufficient arguments to function.",
2915                                          args, NULL, 0, NULL);
2916                                 break;
2917                         }
2918                         *ok = 1;
2919                         propagate_types(arg->left, c, ok, pv->var->type, 0);
2920                         param = cast(binode, param->right);
2921                         arg = cast(binode, arg->right);
2922                 }
2923                 if (arg)
2924                         type_err(c, "error: too many arguments to function.",
2925                                  args, NULL, 0, NULL);
2926         }
2927
2928         static void function_print(struct type *type, struct value *val, FILE *f)
2929         {
2930                 print_exec(val->function, 1, 0);
2931         }
2932
2933         static void function_print_type_decl(struct type *type, FILE *f)
2934         {
2935                 struct binode *b;
2936                 fprintf(f, "(");
2937                 for (b = type->function.params; b; b = cast(binode, b->right)) {
2938                         struct variable *v = cast(var, b->left)->var;
2939                         fprintf(f, "%.*s%s", v->name->name.len, v->name->name.txt,
2940                                 v->constant ? "::" : ":");
2941                         type_print(v->type, f);
2942                         if (b->right)
2943                                 fprintf(f, "; ");
2944                 }
2945                 fprintf(f, ")");
2946                 if (type->function.return_type != Tnone) {
2947                         fprintf(f, ":");
2948                         if (type->function.inline_result) {
2949                                 int i;
2950                                 struct type *t = type->function.return_type;
2951                                 fprintf(f, " (");
2952                                 for (i = 0; i < t->structure.nfields; i++) {
2953                                         struct field *fl = t->structure.fields + i;
2954                                         if (i)
2955                                                 fprintf(f, "; ");
2956                                         fprintf(f, "%.*s:", fl->name.len, fl->name.txt);
2957                                         type_print(fl->type, f);
2958                                 }
2959                                 fprintf(f, ")");
2960                         } else
2961                                 type_print(type->function.return_type, f);
2962                 }
2963                 fprintf(f, "\n");
2964         }
2965
2966         static void function_free_type(struct type *t)
2967         {
2968                 free_exec(t->function.params);
2969         }
2970
2971         static struct type function_prototype = {
2972                 .size = sizeof(void*),
2973                 .align = sizeof(void*),
2974                 .free = function_free,
2975                 .compat = function_compat,
2976                 .check_args = function_check_args,
2977                 .print = function_print,
2978                 .print_type_decl = function_print_type_decl,
2979                 .free_type = function_free_type,
2980         };
2981
2982 ###### declare terminals
2983
2984         $TERM func
2985
2986 ###### Binode types
2987         List,
2988
2989 ###### Grammar
2990
2991         $*variable
2992         FuncName -> IDENTIFIER ${ {
2993                 struct variable *v = var_decl(c, $1.txt);
2994                 struct var *e = new_pos(var, $1);
2995                 e->var = v;
2996                 if (v) {
2997                         v->where_decl = e;
2998                         $0 = v;
2999                 } else {
3000                         v = var_ref(c, $1.txt);
3001                         e->var = v;
3002                         type_err(c, "error: function '%v' redeclared",
3003                                 e, NULL, 0, NULL);
3004                         type_err(c, "info: this is where '%v' was first declared",
3005                                 v->where_decl, NULL, 0, NULL);
3006                         free_exec(e);
3007                 }
3008         } }$
3009
3010         $*binode
3011         Args -> ArgsLine NEWLINE ${ $0 = $<AL; }$
3012         | Args ArgsLine NEWLINE ${ {
3013                 struct binode *b = $<AL;
3014                 struct binode **bp = &b;
3015                 while (*bp)
3016                         bp = (struct binode **)&(*bp)->left;
3017                 *bp = $<A;
3018                 $0 = b;
3019         } }$
3020
3021         ArgsLine -> ${ $0 = NULL; }$
3022         | Varlist ${ $0 = $<1; }$
3023         | Varlist ; ${ $0 = $<1; }$
3024
3025         Varlist -> Varlist ; ArgDecl ${
3026                 $0 = new(binode);
3027                 $0->op = List;
3028                 $0->left = $<Vl;
3029                 $0->right = $<AD;
3030         }$
3031         | ArgDecl ${
3032                 $0 = new(binode);
3033                 $0->op = List;
3034                 $0->left = NULL;
3035                 $0->right = $<AD;
3036         }$
3037
3038         $*var
3039         ArgDecl -> IDENTIFIER : FormalType ${ {
3040                 struct variable *v = var_decl(c, $1.txt);
3041                 $0 = new(var);
3042                 $0->var = v;
3043                 v->type = $<FT;
3044         } }$
3045
3046 ##### Function calls
3047
3048 A function call can appear either as an expression or as a statement.
3049 We use a new 'Funcall' binode type to link the function with a list of
3050 arguments, form with the 'List' nodes.
3051
3052 We have already seen the "Term" which is how a function call can appear
3053 in an expression.  To parse a function call into a statement we include
3054 it in the "SimpleStatement Grammar" which will be described later.
3055
3056 ###### Binode types
3057         Funcall,
3058
3059 ###### term grammar
3060         | Term ( ExpressionList ) ${ {
3061                 struct binode *b = new(binode);
3062                 b->op = Funcall;
3063                 b->left = $<T;
3064                 b->right = reorder_bilist($<EL);
3065                 $0 = b;
3066         } }$
3067         | Term ( ) ${ {
3068                 struct binode *b = new(binode);
3069                 b->op = Funcall;
3070                 b->left = $<T;
3071                 b->right = NULL;
3072                 $0 = b;
3073         } }$
3074
3075 ###### SimpleStatement Grammar
3076
3077         | Term ( ExpressionList ) ${ {
3078                 struct binode *b = new(binode);
3079                 b->op = Funcall;
3080                 b->left = $<T;
3081                 b->right = reorder_bilist($<EL);
3082                 $0 = b;
3083         } }$
3084
3085 ###### print binode cases
3086
3087         case Funcall:
3088                 do_indent(indent, "");
3089                 print_exec(b->left, -1, bracket);
3090                 printf("(");
3091                 for (b = cast(binode, b->right); b; b = cast(binode, b->right)) {
3092                         if (b->left) {
3093                                 printf(" ");
3094                                 print_exec(b->left, -1, bracket);
3095                                 if (b->right)
3096                                         printf(",");
3097                         }
3098                 }
3099                 printf(")");
3100                 if (indent >= 0)
3101                         printf("\n");
3102                 break;
3103
3104 ###### propagate binode cases
3105
3106         case Funcall: {
3107                 /* Every arg must match formal parameter, and result
3108                  * is return type of function
3109                  */
3110                 struct binode *args = cast(binode, b->right);
3111                 struct var *v = cast(var, b->left);
3112
3113                 if (!v->var->type || v->var->type->check_args == NULL) {
3114                         type_err(c, "error: attempt to call a non-function.",
3115                                  prog, NULL, 0, NULL);
3116                         return NULL;
3117                 }
3118                 v->var->type->check_args(c, ok, v->var->type, args);
3119                 return v->var->type->function.return_type;
3120         }
3121
3122 ###### interp binode cases
3123
3124         case Funcall: {
3125                 struct var *v = cast(var, b->left);
3126                 struct type *t = v->var->type;
3127                 void *oldlocal = c->local;
3128                 int old_size = c->local_size;
3129                 void *local = calloc(1, t->function.local_size);
3130                 struct value *fbody = var_value(c, v->var);
3131                 struct binode *arg = cast(binode, b->right);
3132                 struct binode *param = t->function.params;
3133
3134                 while (param) {
3135                         struct var *pv = cast(var, param->left);
3136                         struct type *vtype = NULL;
3137                         struct value val = interp_exec(c, arg->left, &vtype);
3138                         struct value *lval;
3139                         c->local = local; c->local_size = t->function.local_size;
3140                         lval = var_value(c, pv->var);
3141                         c->local = oldlocal; c->local_size = old_size;
3142                         memcpy(lval, &val, vtype->size);
3143                         param = cast(binode, param->right);
3144                         arg = cast(binode, arg->right);
3145                 }
3146                 c->local = local; c->local_size = t->function.local_size;
3147                 if (t->function.inline_result && dtype) {
3148                         _interp_exec(c, fbody->function, NULL, NULL);
3149                         memcpy(dest, local, dtype->size);
3150                         rvtype = ret.type = NULL;
3151                 } else
3152                         rv = interp_exec(c, fbody->function, &rvtype);
3153                 c->local = oldlocal; c->local_size = old_size;
3154                 free(local);
3155                 break;
3156         }
3157
3158 ## Complex executables: statements and expressions
3159
3160 Now that we have types and values and variables and most of the basic
3161 Terms which provide access to these, we can explore the more complex
3162 code that combine all of these to get useful work done.  Specifically
3163 statements and expressions.
3164
3165 Expressions are various combinations of Terms.  We will use operator
3166 precedence to ensure correct parsing.  The simplest Expression is just a
3167 Term - others will follow.
3168
3169 ###### Grammar
3170
3171         $*exec
3172         Expression -> Term ${ $0 = $<Term; }$
3173         ## expression grammar
3174
3175 ### Expressions: Conditional
3176
3177 Our first user of the `binode` will be conditional expressions, which
3178 is a bit odd as they actually have three components.  That will be
3179 handled by having 2 binodes for each expression.  The conditional
3180 expression is the lowest precedence operator which is why we define it
3181 first - to start the precedence list.
3182
3183 Conditional expressions are of the form "value `if` condition `else`
3184 other_value".  They associate to the right, so everything to the right
3185 of `else` is part of an else value, while only a higher-precedence to
3186 the left of `if` is the if values.  Between `if` and `else` there is no
3187 room for ambiguity, so a full conditional expression is allowed in
3188 there.
3189
3190 ###### Binode types
3191         CondExpr,
3192
3193 ###### declare terminals
3194
3195         $LEFT if $$ifelse
3196
3197 ###### expression grammar
3198
3199         | Expression if Expression else Expression $$ifelse ${ {
3200                 struct binode *b1 = new(binode);
3201                 struct binode *b2 = new(binode);
3202                 b1->op = CondExpr;
3203                 b1->left = $<3;
3204                 b1->right = b2;
3205                 b2->op = CondExpr;
3206                 b2->left = $<1;
3207                 b2->right = $<5;
3208                 $0 = b1;
3209         } }$
3210
3211 ###### print binode cases
3212
3213         case CondExpr:
3214                 b2 = cast(binode, b->right);
3215                 if (bracket) printf("(");
3216                 print_exec(b2->left, -1, bracket);
3217                 printf(" if ");
3218                 print_exec(b->left, -1, bracket);
3219                 printf(" else ");
3220                 print_exec(b2->right, -1, bracket);
3221                 if (bracket) printf(")");
3222                 break;
3223
3224 ###### propagate binode cases
3225
3226         case CondExpr: {
3227                 /* cond must be Tbool, others must match */
3228                 struct binode *b2 = cast(binode, b->right);
3229                 struct type *t2;
3230
3231                 propagate_types(b->left, c, ok, Tbool, 0);
3232                 t = propagate_types(b2->left, c, ok, type, Rnolabel);
3233                 t2 = propagate_types(b2->right, c, ok, type ?: t, Rnolabel);
3234                 return t ?: t2;
3235         }
3236
3237 ###### interp binode cases
3238
3239         case CondExpr: {
3240                 struct binode *b2 = cast(binode, b->right);
3241                 left = interp_exec(c, b->left, &ltype);
3242                 if (left.bool)
3243                         rv = interp_exec(c, b2->left, &rvtype); // UNTESTED
3244                 else
3245                         rv = interp_exec(c, b2->right, &rvtype);
3246                 }
3247                 break;
3248
3249 ### Expression list
3250
3251 We take a brief detour, now that we have expressions, to describe lists
3252 of expressions.  These will be needed for function parameters and
3253 possibly other situations.  They seem generic enough to introduce here
3254 to be used elsewhere.
3255
3256 And ExpressionList will use the `List` type of `binode`, building up at
3257 the end.  And place where they are used will probably call
3258 `reorder_bilist()` to get a more normal first/next arrangement.
3259
3260 ###### declare terminals
3261         $TERM ,
3262
3263 `List` execs have no implicit semantics, so they are never propagated or
3264 interpreted.  The can be printed as a comma separate list, which is how
3265 they are parsed.  Note they are also used for function formal parameter
3266 lists.  In that case a separate function is used to print them.
3267
3268 ###### print binode cases
3269         case List:
3270                 while (b) {
3271                         printf(" ");
3272                         print_exec(b->left, -1, bracket);
3273                         if (b->right)
3274                                 printf(",");
3275                         b = cast(binode, b->right);
3276                 }
3277                 break;
3278
3279 ###### propagate binode cases
3280         case List: abort(); // NOTEST
3281 ###### interp binode cases
3282         case List: abort(); // NOTEST
3283
3284 ###### Grammar
3285
3286         $*binode
3287         ExpressionList -> ExpressionList , Expression ${
3288                 $0 = new(binode);
3289                 $0->op = List;
3290                 $0->left = $<1;
3291                 $0->right = $<3;
3292         }$
3293         | Expression ${
3294                 $0 = new(binode);
3295                 $0->op = List;
3296                 $0->left = NULL;
3297                 $0->right = $<1;
3298         }$
3299
3300 ### Expressions: Boolean
3301
3302 The next class of expressions to use the `binode` will be Boolean
3303 expressions.  "`and then`" and "`or else`" are similar to `and` and `or`
3304 have same corresponding precendence.  The difference is that they don't
3305 evaluate the second expression if not necessary.
3306
3307 ###### Binode types
3308         And,
3309         AndThen,
3310         Or,
3311         OrElse,
3312         Not,
3313
3314 ###### declare terminals
3315         $LEFT or
3316         $LEFT and
3317         $LEFT not
3318
3319 ###### expression grammar
3320         | Expression or Expression ${ {
3321                 struct binode *b = new(binode);
3322                 b->op = Or;
3323                 b->left = $<1;
3324                 b->right = $<3;
3325                 $0 = b;
3326         } }$
3327         | Expression or else Expression ${ {
3328                 struct binode *b = new(binode);
3329                 b->op = OrElse;
3330                 b->left = $<1;
3331                 b->right = $<4;
3332                 $0 = b;
3333         } }$
3334
3335         | Expression and Expression ${ {
3336                 struct binode *b = new(binode);
3337                 b->op = And;
3338                 b->left = $<1;
3339                 b->right = $<3;
3340                 $0 = b;
3341         } }$
3342         | Expression and then Expression ${ {
3343                 struct binode *b = new(binode);
3344                 b->op = AndThen;
3345                 b->left = $<1;
3346                 b->right = $<4;
3347                 $0 = b;
3348         } }$
3349
3350         | not Expression ${ {
3351                 struct binode *b = new(binode);
3352                 b->op = Not;
3353                 b->right = $<2;
3354                 $0 = b;
3355         } }$
3356
3357 ###### print binode cases
3358         case And:
3359                 if (bracket) printf("(");
3360                 print_exec(b->left, -1, bracket);
3361                 printf(" and ");
3362                 print_exec(b->right, -1, bracket);
3363                 if (bracket) printf(")");
3364                 break;
3365         case AndThen:
3366                 if (bracket) printf("(");
3367                 print_exec(b->left, -1, bracket);
3368                 printf(" and then ");
3369                 print_exec(b->right, -1, bracket);
3370                 if (bracket) printf(")");
3371                 break;
3372         case Or:
3373                 if (bracket) printf("(");
3374                 print_exec(b->left, -1, bracket);
3375                 printf(" or ");
3376                 print_exec(b->right, -1, bracket);
3377                 if (bracket) printf(")");
3378                 break;
3379         case OrElse:
3380                 if (bracket) printf("(");
3381                 print_exec(b->left, -1, bracket);
3382                 printf(" or else ");
3383                 print_exec(b->right, -1, bracket);
3384                 if (bracket) printf(")");
3385                 break;
3386         case Not:
3387                 if (bracket) printf("(");
3388                 printf("not ");
3389                 print_exec(b->right, -1, bracket);
3390                 if (bracket) printf(")");
3391                 break;
3392
3393 ###### propagate binode cases
3394         case And:
3395         case AndThen:
3396         case Or:
3397         case OrElse:
3398         case Not:
3399                 /* both must be Tbool, result is Tbool */
3400                 propagate_types(b->left, c, ok, Tbool, 0);
3401                 propagate_types(b->right, c, ok, Tbool, 0);
3402                 if (type && type != Tbool)
3403                         type_err(c, "error: %1 operation found where %2 expected", prog,
3404                                    Tbool, 0, type);
3405                 return Tbool;
3406
3407 ###### interp binode cases
3408         case And:
3409                 rv = interp_exec(c, b->left, &rvtype);
3410                 right = interp_exec(c, b->right, &rtype);
3411                 rv.bool = rv.bool && right.bool;
3412                 break;
3413         case AndThen:
3414                 rv = interp_exec(c, b->left, &rvtype);
3415                 if (rv.bool)
3416                         rv = interp_exec(c, b->right, NULL);
3417                 break;
3418         case Or:
3419                 rv = interp_exec(c, b->left, &rvtype);
3420                 right = interp_exec(c, b->right, &rtype);
3421                 rv.bool = rv.bool || right.bool;
3422                 break;
3423         case OrElse:
3424                 rv = interp_exec(c, b->left, &rvtype);
3425                 if (!rv.bool)
3426                         rv = interp_exec(c, b->right, NULL);
3427                 break;
3428         case Not:
3429                 rv = interp_exec(c, b->right, &rvtype);
3430                 rv.bool = !rv.bool;
3431                 break;
3432
3433 ### Expressions: Comparison
3434
3435 Of slightly higher precedence that Boolean expressions are Comparisons.
3436 A comparison takes arguments of any comparable type, but the two types
3437 must be the same.
3438
3439 To simplify the parsing we introduce an `eop` which can record an
3440 expression operator, and the `CMPop` non-terminal will match one of them.
3441
3442 ###### ast
3443         struct eop {
3444                 enum Btype op;
3445         };
3446
3447 ###### ast functions
3448         static void free_eop(struct eop *e)
3449         {
3450                 if (e)
3451                         free(e);
3452         }
3453
3454 ###### Binode types
3455         Less,
3456         Gtr,
3457         LessEq,
3458         GtrEq,
3459         Eql,
3460         NEql,
3461
3462 ###### declare terminals
3463         $LEFT < > <= >= == != CMPop
3464
3465 ###### expression grammar
3466         | Expression CMPop Expression ${ {
3467                 struct binode *b = new(binode);
3468                 b->op = $2.op;
3469                 b->left = $<1;
3470                 b->right = $<3;
3471                 $0 = b;
3472         } }$
3473
3474 ###### Grammar
3475
3476         $eop
3477         CMPop ->  < ${ $0.op = Less; }$
3478         |         > ${ $0.op = Gtr; }$
3479         |         <= ${ $0.op = LessEq; }$
3480         |         >= ${ $0.op = GtrEq; }$
3481         |         == ${ $0.op = Eql; }$
3482         |         != ${ $0.op = NEql; }$
3483
3484 ###### print binode cases
3485
3486         case Less:
3487         case LessEq:
3488         case Gtr:
3489         case GtrEq:
3490         case Eql:
3491         case NEql:
3492                 if (bracket) printf("(");
3493                 print_exec(b->left, -1, bracket);
3494                 switch(b->op) {
3495                 case Less:   printf(" < "); break;
3496                 case LessEq: printf(" <= "); break;
3497                 case Gtr:    printf(" > "); break;
3498                 case GtrEq:  printf(" >= "); break;
3499                 case Eql:    printf(" == "); break;
3500                 case NEql:   printf(" != "); break;
3501                 default: abort();               // NOTEST
3502                 }
3503                 print_exec(b->right, -1, bracket);
3504                 if (bracket) printf(")");
3505                 break;
3506
3507 ###### propagate binode cases
3508         case Less:
3509         case LessEq:
3510         case Gtr:
3511         case GtrEq:
3512         case Eql:
3513         case NEql:
3514                 /* Both must match but not be labels, result is Tbool */
3515                 t = propagate_types(b->left, c, ok, NULL, Rnolabel);
3516                 if (t)
3517                         propagate_types(b->right, c, ok, t, 0);
3518                 else {
3519                         t = propagate_types(b->right, c, ok, NULL, Rnolabel);   // UNTESTED
3520                         if (t)  // UNTESTED
3521                                 t = propagate_types(b->left, c, ok, t, 0);      // UNTESTED
3522                 }
3523                 if (!type_compat(type, Tbool, 0))
3524                         type_err(c, "error: Comparison returns %1 but %2 expected", prog,
3525                                     Tbool, rules, type);
3526                 return Tbool;
3527
3528 ###### interp binode cases
3529         case Less:
3530         case LessEq:
3531         case Gtr:
3532         case GtrEq:
3533         case Eql:
3534         case NEql:
3535         {
3536                 int cmp;
3537                 left = interp_exec(c, b->left, &ltype);
3538                 right = interp_exec(c, b->right, &rtype);
3539                 cmp = value_cmp(ltype, rtype, &left, &right);
3540                 rvtype = Tbool;
3541                 switch (b->op) {
3542                 case Less:      rv.bool = cmp <  0; break;
3543                 case LessEq:    rv.bool = cmp <= 0; break;
3544                 case Gtr:       rv.bool = cmp >  0; break;
3545                 case GtrEq:     rv.bool = cmp >= 0; break;
3546                 case Eql:       rv.bool = cmp == 0; break;
3547                 case NEql:      rv.bool = cmp != 0; break;
3548                 default:        rv.bool = 0; break;     // NOTEST
3549                 }
3550                 break;
3551         }
3552
3553 ### Expressions: Arithmetic etc.
3554
3555 The remaining expressions with the highest precedence are arithmetic,
3556 string concatenation, and string conversion.  String concatenation
3557 (`++`) has the same precedence as multiplication and division, but lower
3558 than the uniary.
3559
3560 String conversion is a temporary feature until I get a better type
3561 system.  `$` is a prefix operator which expects a string and returns
3562 a number.
3563
3564 `+` and `-` are both infix and prefix operations (where they are
3565 absolute value and negation).  These have different operator names.
3566
3567 We also have a 'Bracket' operator which records where parentheses were
3568 found.  This makes it easy to reproduce these when printing.  Possibly I
3569 should only insert brackets were needed for precedence.  Putting
3570 parentheses around an expression converts it into a Term,
3571
3572 ###### Binode types
3573         Plus, Minus,
3574         Times, Divide, Rem,
3575         Concat,
3576         Absolute, Negate,
3577         StringConv,
3578         Bracket,
3579
3580 ###### declare terminals
3581         $LEFT + - Eop
3582         $LEFT * / % ++ Top
3583         $LEFT Uop $
3584         $TERM ( )
3585
3586 ###### expression grammar
3587         | Expression Eop Expression ${ {
3588                 struct binode *b = new(binode);
3589                 b->op = $2.op;
3590                 b->left = $<1;
3591                 b->right = $<3;
3592                 $0 = b;
3593         } }$
3594
3595         | Expression Top Expression ${ {
3596                 struct binode *b = new(binode);
3597                 b->op = $2.op;
3598                 b->left = $<1;
3599                 b->right = $<3;
3600                 $0 = b;
3601         } }$
3602
3603         | Uop Expression ${ {
3604                 struct binode *b = new(binode);
3605                 b->op = $1.op;
3606                 b->right = $<2;
3607                 $0 = b;
3608         } }$
3609
3610 ###### term grammar
3611
3612         | ( Expression ) ${ {
3613                 struct binode *b = new_pos(binode, $1);
3614                 b->op = Bracket;
3615                 b->right = $<2;
3616                 $0 = b;
3617         } }$
3618
3619 ###### Grammar
3620
3621         $eop
3622         Eop ->   + ${ $0.op = Plus; }$
3623         |        - ${ $0.op = Minus; }$
3624
3625         Uop ->   + ${ $0.op = Absolute; }$
3626         |        - ${ $0.op = Negate; }$
3627         |        $ ${ $0.op = StringConv; }$
3628
3629         Top ->   * ${ $0.op = Times; }$
3630         |        / ${ $0.op = Divide; }$
3631         |        % ${ $0.op = Rem; }$
3632         |        ++ ${ $0.op = Concat; }$
3633
3634 ###### print binode cases
3635         case Plus:
3636         case Minus:
3637         case Times:
3638         case Divide:
3639         case Concat:
3640         case Rem:
3641                 if (bracket) printf("(");
3642                 print_exec(b->left, indent, bracket);
3643                 switch(b->op) {
3644                 case Plus:   fputs(" + ", stdout); break;
3645                 case Minus:  fputs(" - ", stdout); break;
3646                 case Times:  fputs(" * ", stdout); break;
3647                 case Divide: fputs(" / ", stdout); break;
3648                 case Rem:    fputs(" % ", stdout); break;
3649                 case Concat: fputs(" ++ ", stdout); break;
3650                 default: abort();       // NOTEST
3651                 }                       // NOTEST
3652                 print_exec(b->right, indent, bracket);
3653                 if (bracket) printf(")");
3654                 break;
3655         case Absolute:
3656         case Negate:
3657         case StringConv:
3658                 if (bracket) printf("(");
3659                 switch (b->op) {
3660                 case Absolute:   fputs("+", stdout); break;
3661                 case Negate:     fputs("-", stdout); break;
3662                 case StringConv: fputs("$", stdout); break;
3663                 default: abort();       // NOTEST
3664                 }                       // NOTEST
3665                 print_exec(b->right, indent, bracket);
3666                 if (bracket) printf(")");
3667                 break;
3668         case Bracket:
3669                 printf("(");
3670                 print_exec(b->right, indent, bracket);
3671                 printf(")");
3672                 break;
3673
3674 ###### propagate binode cases
3675         case Plus:
3676         case Minus:
3677         case Times:
3678         case Rem:
3679         case Divide:
3680                 /* both must be numbers, result is Tnum */
3681         case Absolute:
3682         case Negate:
3683                 /* as propagate_types ignores a NULL,
3684                  * unary ops fit here too */
3685                 propagate_types(b->left, c, ok, Tnum, 0);
3686                 propagate_types(b->right, c, ok, Tnum, 0);
3687                 if (!type_compat(type, Tnum, 0))
3688                         type_err(c, "error: Arithmetic returns %1 but %2 expected", prog,
3689                                    Tnum, rules, type);
3690                 return Tnum;
3691
3692         case Concat:
3693                 /* both must be Tstr, result is Tstr */
3694                 propagate_types(b->left, c, ok, Tstr, 0);
3695                 propagate_types(b->right, c, ok, Tstr, 0);
3696                 if (!type_compat(type, Tstr, 0))
3697                         type_err(c, "error: Concat returns %1 but %2 expected", prog,
3698                                    Tstr, rules, type);
3699                 return Tstr;
3700
3701         case StringConv:
3702                 /* op must be string, result is number */
3703                 propagate_types(b->left, c, ok, Tstr, 0);
3704                 if (!type_compat(type, Tnum, 0))
3705                         type_err(c,     // UNTESTED
3706                           "error: Can only convert string to number, not %1",
3707                                 prog, type, 0, NULL);
3708                 return Tnum;
3709
3710         case Bracket:
3711                 return propagate_types(b->right, c, ok, type, 0);
3712
3713 ###### interp binode cases
3714
3715         case Plus:
3716                 rv = interp_exec(c, b->left, &rvtype);
3717                 right = interp_exec(c, b->right, &rtype);
3718                 mpq_add(rv.num, rv.num, right.num);
3719                 break;
3720         case Minus:
3721                 rv = interp_exec(c, b->left, &rvtype);
3722                 right = interp_exec(c, b->right, &rtype);
3723                 mpq_sub(rv.num, rv.num, right.num);
3724                 break;
3725         case Times:
3726                 rv = interp_exec(c, b->left, &rvtype);
3727                 right = interp_exec(c, b->right, &rtype);
3728                 mpq_mul(rv.num, rv.num, right.num);
3729                 break;
3730         case Divide:
3731                 rv = interp_exec(c, b->left, &rvtype);
3732                 right = interp_exec(c, b->right, &rtype);
3733                 mpq_div(rv.num, rv.num, right.num);
3734                 break;
3735         case Rem: {
3736                 mpz_t l, r, rem;
3737
3738                 left = interp_exec(c, b->left, &ltype);
3739                 right = interp_exec(c, b->right, &rtype);
3740                 mpz_init(l); mpz_init(r); mpz_init(rem);
3741                 mpz_tdiv_q(l, mpq_numref(left.num), mpq_denref(left.num));
3742                 mpz_tdiv_q(r, mpq_numref(right.num), mpq_denref(right.num));
3743                 mpz_tdiv_r(rem, l, r);
3744                 val_init(Tnum, &rv);
3745                 mpq_set_z(rv.num, rem);
3746                 mpz_clear(r); mpz_clear(l); mpz_clear(rem);
3747                 rvtype = ltype;
3748                 break;
3749         }
3750         case Negate:
3751                 rv = interp_exec(c, b->right, &rvtype);
3752                 mpq_neg(rv.num, rv.num);
3753                 break;
3754         case Absolute:
3755                 rv = interp_exec(c, b->right, &rvtype);
3756                 mpq_abs(rv.num, rv.num);
3757                 break;
3758         case Bracket:
3759                 rv = interp_exec(c, b->right, &rvtype);
3760                 break;
3761         case Concat:
3762                 left = interp_exec(c, b->left, &ltype);
3763                 right = interp_exec(c, b->right, &rtype);
3764                 rvtype = Tstr;
3765                 rv.str = text_join(left.str, right.str);
3766                 break;
3767         case StringConv:
3768                 right = interp_exec(c, b->right, &rvtype);
3769                 rtype = Tstr;
3770                 rvtype = Tnum;
3771
3772                 struct text tx = right.str;
3773                 char tail[3];
3774                 int neg = 0;
3775                 if (tx.txt[0] == '-') {
3776                         neg = 1;        // UNTESTED
3777                         tx.txt++;       // UNTESTED
3778                         tx.len--;       // UNTESTED
3779                 }
3780                 if (number_parse(rv.num, tail, tx) == 0)
3781                         mpq_init(rv.num);       // UNTESTED
3782                 else if (neg)
3783                         mpq_neg(rv.num, rv.num);        // UNTESTED
3784                 if (tail[0])
3785                         printf("Unsupported suffix: %.*s\n", tx.len, tx.txt);   // UNTESTED
3786
3787                 break;
3788
3789 ###### value functions
3790
3791         static struct text text_join(struct text a, struct text b)
3792         {
3793                 struct text rv;
3794                 rv.len = a.len + b.len;
3795                 rv.txt = malloc(rv.len);
3796                 memcpy(rv.txt, a.txt, a.len);
3797                 memcpy(rv.txt+a.len, b.txt, b.len);
3798                 return rv;
3799         }
3800
3801 ### Blocks, Statements, and Statement lists.
3802
3803 Now that we have expressions out of the way we need to turn to
3804 statements.  There are simple statements and more complex statements.
3805 Simple statements do not contain (syntactic) newlines, complex statements do.
3806
3807 Statements often come in sequences and we have corresponding simple
3808 statement lists and complex statement lists.
3809 The former comprise only simple statements separated by semicolons.
3810 The later comprise complex statements and simple statement lists.  They are
3811 separated by newlines.  Thus the semicolon is only used to separate
3812 simple statements on the one line.  This may be overly restrictive,
3813 but I'm not sure I ever want a complex statement to share a line with
3814 anything else.
3815
3816 Note that a simple statement list can still use multiple lines if
3817 subsequent lines are indented, so
3818
3819 ###### Example: wrapped simple statement list
3820
3821         a = b; c = d;
3822            e = f; print g
3823
3824 is a single simple statement list.  This might allow room for
3825 confusion, so I'm not set on it yet.
3826
3827 A simple statement list needs no extra syntax.  A complex statement
3828 list has two syntactic forms.  It can be enclosed in braces (much like
3829 C blocks), or it can be introduced by an indent and continue until an
3830 unindented newline (much like Python blocks).  With this extra syntax
3831 it is referred to as a block.
3832
3833 Note that a block does not have to include any newlines if it only
3834 contains simple statements.  So both of:
3835
3836         if condition: a=b; d=f
3837
3838         if condition { a=b; print f }
3839
3840 are valid.
3841
3842 In either case the list is constructed from a `binode` list with
3843 `Block` as the operator.  When parsing the list it is most convenient
3844 to append to the end, so a list is a list and a statement.  When using
3845 the list it is more convenient to consider a list to be a statement
3846 and a list.  So we need a function to re-order a list.
3847 `reorder_bilist` serves this purpose.
3848
3849 The only stand-alone statement we introduce at this stage is `pass`
3850 which does nothing and is represented as a `NULL` pointer in a `Block`
3851 list.  Other stand-alone statements will follow once the infrastructure
3852 is in-place.
3853
3854 As many statements will use binodes, we declare a binode pointer 'b' in
3855 the common header for all reductions to use.
3856
3857 ###### Parser: reduce
3858         struct binode *b;
3859
3860 ###### Binode types
3861         Block,
3862
3863 ###### Grammar
3864
3865         $TERM { } ;
3866
3867         $*binode
3868         Block -> { IN OptNL Statementlist OUT OptNL } ${ $0 = $<Sl; }$
3869         |        { SimpleStatements } ${ $0 = reorder_bilist($<SS); }$
3870         |        SimpleStatements ; ${ $0 = reorder_bilist($<SS); }$
3871         |        SimpleStatements EOL ${ $0 = reorder_bilist($<SS); }$
3872         |        IN OptNL Statementlist OUT ${ $0 = $<Sl; }$
3873
3874         OpenBlock -> OpenScope { IN OptNL Statementlist OUT OptNL } ${ $0 = $<Sl; }$
3875         |        OpenScope { SimpleStatements } ${ $0 = reorder_bilist($<SS); }$
3876         |        OpenScope SimpleStatements ; ${ $0 = reorder_bilist($<SS); }$
3877         |        OpenScope SimpleStatements EOL ${ $0 = reorder_bilist($<SS); }$
3878         |        IN OpenScope OptNL Statementlist OUT ${ $0 = $<Sl; }$
3879
3880         UseBlock -> { OpenScope IN OptNL Statementlist OUT OptNL } ${ $0 = $<Sl; }$
3881         |        { OpenScope SimpleStatements } ${ $0 = reorder_bilist($<SS); }$
3882         |        IN OpenScope OptNL Statementlist OUT ${ $0 = $<Sl; }$
3883
3884         ColonBlock -> { IN OptNL Statementlist OUT OptNL } ${ $0 = $<Sl; }$
3885         |        { SimpleStatements } ${ $0 = reorder_bilist($<SS); }$
3886         |        : SimpleStatements ; ${ $0 = reorder_bilist($<SS); }$
3887         |        : SimpleStatements EOL ${ $0 = reorder_bilist($<SS); }$
3888         |        : IN OptNL Statementlist OUT ${ $0 = $<Sl; }$
3889
3890         Statementlist -> ComplexStatements ${ $0 = reorder_bilist($<CS); }$
3891
3892         ComplexStatements -> ComplexStatements ComplexStatement ${
3893                 if ($2 == NULL) {
3894                         $0 = $<1;
3895                 } else {
3896                         $0 = new(binode);
3897                         $0->op = Block;
3898                         $0->left = $<1;
3899                         $0->right = $<2;
3900                 }
3901         }$
3902         | ComplexStatement ${
3903                 if ($1 == NULL) {
3904                         $0 = NULL;
3905                 } else {
3906                         $0 = new(binode);
3907                         $0->op = Block;
3908                         $0->left = NULL;
3909                         $0->right = $<1;
3910                 }
3911         }$
3912
3913         $*exec
3914         ComplexStatement -> SimpleStatements Newlines ${
3915                 $0 = reorder_bilist($<SS);
3916         }$
3917         |  SimpleStatements ; Newlines ${
3918                 $0 = reorder_bilist($<SS);
3919         }$
3920         ## ComplexStatement Grammar
3921
3922         $*binode
3923         SimpleStatements -> SimpleStatements ; SimpleStatement ${
3924                 $0 = new(binode);
3925                 $0->op = Block;
3926                 $0->left = $<1;
3927                 $0->right = $<3;
3928         }$
3929         | SimpleStatement ${
3930                 $0 = new(binode);
3931                 $0->op = Block;
3932                 $0->left = NULL;
3933                 $0->right = $<1;
3934         }$
3935
3936         $TERM pass
3937         $*exec
3938         SimpleStatement -> pass ${ $0 = NULL; }$
3939         | ERROR ${ tok_err(c, "Syntax error in statement", &$1); }$
3940         ## SimpleStatement Grammar
3941
3942 ###### print binode cases
3943         case Block:
3944                 if (indent < 0) {
3945                         // simple statement
3946                         if (b->left == NULL)    // UNTESTED
3947                                 printf("pass"); // UNTESTED
3948                         else
3949                                 print_exec(b->left, indent, bracket);   // UNTESTED
3950                         if (b->right) { // UNTESTED
3951                                 printf("; ");   // UNTESTED
3952                                 print_exec(b->right, indent, bracket);  // UNTESTED
3953                         }
3954                 } else {
3955                         // block, one per line
3956                         if (b->left == NULL)
3957                                 do_indent(indent, "pass\n");
3958                         else
3959                                 print_exec(b->left, indent, bracket);
3960                         if (b->right)
3961                                 print_exec(b->right, indent, bracket);
3962                 }
3963                 break;
3964
3965 ###### propagate binode cases
3966         case Block:
3967         {
3968                 /* If any statement returns something other than Tnone
3969                  * or Tbool then all such must return same type.
3970                  * As each statement may be Tnone or something else,
3971                  * we must always pass NULL (unknown) down, otherwise an incorrect
3972                  * error might occur.  We never return Tnone unless it is
3973                  * passed in.
3974                  */
3975                 struct binode *e;
3976
3977                 for (e = b; e; e = cast(binode, e->right)) {
3978                         t = propagate_types(e->left, c, ok, NULL, rules);
3979                         if ((rules & Rboolok) && (t == Tbool || t == Tnone))
3980                                 t = NULL;
3981                         if (t == Tnone && e->right)
3982                                 /* Only the final statement *must* return a value
3983                                  * when not Rboolok
3984                                  */
3985                                 t = NULL;
3986                         if (t) {
3987                                 if (!type)
3988                                         type = t;
3989                                 else if (t != type)
3990                                         type_err(c, "error: expected %1%r, found %2",
3991                                                  e->left, type, rules, t);
3992                         }
3993                 }
3994                 return type;
3995         }
3996
3997 ###### interp binode cases
3998         case Block:
3999                 while (rvtype == Tnone &&
4000                        b) {
4001                         if (b->left)
4002                                 rv = interp_exec(c, b->left, &rvtype);
4003                         b = cast(binode, b->right);
4004                 }
4005                 break;
4006
4007 ### The Print statement
4008
4009 `print` is a simple statement that takes a comma-separated list of
4010 expressions and prints the values separated by spaces and terminated
4011 by a newline.  No control of formatting is possible.
4012
4013 `print` uses `ExpressionList` to collect the expressions and stores them
4014 on the left side of a `Print` binode unlessthere is a trailing comma
4015 when the list is stored on the `right` side and no trailing newline is
4016 printed.
4017
4018 ###### Binode types
4019         Print,
4020
4021 ##### declare terminals
4022         $TERM print
4023
4024 ###### SimpleStatement Grammar
4025
4026         | print ExpressionList ${
4027                 $0 = b = new(binode);
4028                 b->op = Print;
4029                 b->right = NULL;
4030                 b->left = reorder_bilist($<EL);
4031         }$
4032         | print ExpressionList , ${ {
4033                 $0 = b = new(binode);
4034                 b->op = Print;
4035                 b->right = reorder_bilist($<EL);
4036                 b->left = NULL;
4037         } }$
4038         | print ${
4039                 $0 = b = new(binode);
4040                 b->op = Print;
4041                 b->left = NULL;
4042                 b->right = NULL;
4043         }$
4044
4045 ###### print binode cases
4046
4047         case Print:
4048                 do_indent(indent, "print");
4049                 if (b->right) {
4050                         print_exec(b->right, -1, bracket);
4051                         printf(",");
4052                 } else
4053                         print_exec(b->left, -1, bracket);
4054                 if (indent >= 0)
4055                         printf("\n");
4056                 break;
4057
4058 ###### propagate binode cases
4059
4060         case Print:
4061                 /* don't care but all must be consistent */
4062                 if (b->left)
4063                         b = cast(binode, b->left);
4064                 else
4065                         b = cast(binode, b->right);
4066                 while (b) {
4067                         propagate_types(b->left, c, ok, NULL, Rnolabel);
4068                         b = cast(binode, b->right);
4069                 }
4070                 break;
4071
4072 ###### interp binode cases
4073
4074         case Print:
4075         {
4076                 struct binode *b2 = cast(binode, b->left);
4077                 if (!b2)
4078                         b2 = cast(binode, b->right);
4079                 for (; b2; b2 = cast(binode, b2->right)) {
4080                         left = interp_exec(c, b2->left, &ltype);
4081                         print_value(ltype, &left, stdout);
4082                         free_value(ltype, &left);
4083                         if (b2->right)
4084                                 putchar(' ');
4085                 }
4086                 if (b->right == NULL)
4087                         printf("\n");
4088                 ltype = Tnone;
4089                 break;
4090         }
4091
4092 ###### Assignment statement
4093
4094 An assignment will assign a value to a variable, providing it hasn't
4095 been declared as a constant.  The analysis phase ensures that the type
4096 will be correct so the interpreter just needs to perform the
4097 calculation.  There is a form of assignment which declares a new
4098 variable as well as assigning a value.  If a name is assigned before
4099 it is declared, and error will be raised as the name is created as
4100 `Tlabel` and it is illegal to assign to such names.
4101
4102 ###### Binode types
4103         Assign,
4104         Declare,
4105
4106 ###### declare terminals
4107         $TERM =
4108
4109 ###### SimpleStatement Grammar
4110         | Term = Expression ${
4111                 $0 = b= new(binode);
4112                 b->op = Assign;
4113                 b->left = $<1;
4114                 b->right = $<3;
4115         }$
4116         | VariableDecl = Expression ${
4117                 $0 = b= new(binode);
4118                 b->op = Declare;
4119                 b->left = $<1;
4120                 b->right =$<3;
4121         }$
4122
4123         | VariableDecl ${
4124                 if ($1->var->where_set == NULL) {
4125                         type_err(c,
4126                                  "Variable declared with no type or value: %v",
4127                                  $1, NULL, 0, NULL);
4128                         free_var($1);
4129                 } else {
4130                         $0 = b = new(binode);
4131                         b->op = Declare;
4132                         b->left = $<1;
4133                         b->right = NULL;
4134                 }
4135         }$
4136
4137 ###### print binode cases
4138
4139         case Assign:
4140                 do_indent(indent, "");
4141                 print_exec(b->left, indent, bracket);
4142                 printf(" = ");
4143                 print_exec(b->right, indent, bracket);
4144                 if (indent >= 0)
4145                         printf("\n");
4146                 break;
4147
4148         case Declare:
4149                 {
4150                 struct variable *v = cast(var, b->left)->var;
4151                 do_indent(indent, "");
4152                 print_exec(b->left, indent, bracket);
4153                 if (cast(var, b->left)->var->constant) {
4154                         printf("::");
4155                         if (v->explicit_type) {
4156                                 type_print(v->type, stdout);
4157                                 printf(" ");
4158                         }
4159                 } else {
4160                         printf(":");
4161                         if (v->explicit_type) {
4162                                 type_print(v->type, stdout);
4163                                 printf(" ");
4164                         }
4165                 }
4166                 if (b->right) {
4167                         printf("= ");
4168                         print_exec(b->right, indent, bracket);
4169                 }
4170                 if (indent >= 0)
4171                         printf("\n");
4172                 }
4173                 break;
4174
4175 ###### propagate binode cases
4176
4177         case Assign:
4178         case Declare:
4179                 /* Both must match and not be labels,
4180                  * Type must support 'dup',
4181                  * For Assign, left must not be constant.
4182                  * result is Tnone
4183                  */
4184                 t = propagate_types(b->left, c, ok, NULL,
4185                                     Rnolabel | (b->op == Assign ? Rnoconstant : 0));
4186                 if (!b->right)
4187                         return Tnone;
4188
4189                 if (t) {
4190                         if (propagate_types(b->right, c, ok, t, 0) != t)
4191                                 if (b->left->type == Xvar)
4192                                         type_err(c, "info: variable '%v' was set as %1 here.",
4193                                                  cast(var, b->left)->var->where_set, t, rules, NULL);
4194                 } else {
4195                         t = propagate_types(b->right, c, ok, NULL, Rnolabel);
4196                         if (t)
4197                                 propagate_types(b->left, c, ok, t,
4198                                                 (b->op == Assign ? Rnoconstant : 0));
4199                 }
4200                 if (t && t->dup == NULL && t->name.txt[0] != ' ') // HACK
4201                         type_err(c, "error: cannot assign value of type %1", b, t, 0, NULL);
4202                 return Tnone;
4203
4204                 break;
4205
4206 ###### interp binode cases
4207
4208         case Assign:
4209                 lleft = linterp_exec(c, b->left, &ltype);
4210                 if (lleft)
4211                         dinterp_exec(c, b->right, lleft, ltype, 1);
4212                 ltype = Tnone;
4213                 break;
4214
4215         case Declare:
4216         {
4217                 struct variable *v = cast(var, b->left)->var;
4218                 struct value *val;
4219                 v = v->merged;
4220                 val = var_value(c, v);
4221                 if (v->type->prepare_type)
4222                         v->type->prepare_type(c, v->type, 0);
4223                 if (b->right)
4224                         dinterp_exec(c, b->right, val, v->type, 0);
4225                 else
4226                         val_init(v->type, val);
4227                 break;
4228         }
4229
4230 ### The `use` statement
4231
4232 The `use` statement is the last "simple" statement.  It is needed when a
4233 statement block can return a value.  This includes the body of a
4234 function which has a return type, and the "condition" code blocks in
4235 `if`, `while`, and `switch` statements.
4236
4237 ###### Binode types
4238         Use,
4239
4240 ###### declare terminals
4241         $TERM use
4242
4243 ###### SimpleStatement Grammar
4244         | use Expression ${
4245                 $0 = b = new_pos(binode, $1);
4246                 b->op = Use;
4247                 b->right = $<2;
4248                 if (b->right->type == Xvar) {
4249                         struct var *v = cast(var, b->right);
4250                         if (v->var->type == Tnone) {
4251                                 /* Convert this to a label */
4252                                 struct value *val;
4253
4254                                 v->var->type = Tlabel;
4255                                 val = global_alloc(c, Tlabel, v->var, NULL);
4256                                 val->label = val;
4257                         }
4258                 }
4259         }$
4260
4261 ###### print binode cases
4262
4263         case Use:
4264                 do_indent(indent, "use ");
4265                 print_exec(b->right, -1, bracket);
4266                 if (indent >= 0)
4267                         printf("\n");
4268                 break;
4269
4270 ###### propagate binode cases
4271
4272         case Use:
4273                 /* result matches value */
4274                 return propagate_types(b->right, c, ok, type, 0);
4275
4276 ###### interp binode cases
4277
4278         case Use:
4279                 rv = interp_exec(c, b->right, &rvtype);
4280                 break;
4281
4282 ### The Conditional Statement
4283
4284 This is the biggy and currently the only complex statement.  This
4285 subsumes `if`, `while`, `do/while`, `switch`, and some parts of `for`.
4286 It is comprised of a number of parts, all of which are optional though
4287 set combinations apply.  Each part is (usually) a key word (`then` is
4288 sometimes optional) followed by either an expression or a code block,
4289 except the `casepart` which is a "key word and an expression" followed
4290 by a code block.  The code-block option is valid for all parts and,
4291 where an expression is also allowed, the code block can use the `use`
4292 statement to report a value.  If the code block does not report a value
4293 the effect is similar to reporting `True`.
4294
4295 The `else` and `case` parts, as well as `then` when combined with
4296 `if`, can contain a `use` statement which will apply to some
4297 containing conditional statement. `for` parts, `do` parts and `then`
4298 parts used with `for` can never contain a `use`, except in some
4299 subordinate conditional statement.
4300
4301 If there is a `forpart`, it is executed first, only once.
4302 If there is a `dopart`, then it is executed repeatedly providing
4303 always that the `condpart` or `cond`, if present, does not return a non-True
4304 value.  `condpart` can fail to return any value if it simply executes
4305 to completion.  This is treated the same as returning `True`.
4306
4307 If there is a `thenpart` it will be executed whenever the `condpart`
4308 or `cond` returns True (or does not return any value), but this will happen
4309 *after* `dopart` (when present).
4310
4311 If `elsepart` is present it will be executed at most once when the
4312 condition returns `False` or some value that isn't `True` and isn't
4313 matched by any `casepart`.  If there are any `casepart`s, they will be
4314 executed when the condition returns a matching value.
4315
4316 The particular sorts of values allowed in case parts has not yet been
4317 determined in the language design, so nothing is prohibited.
4318
4319 The various blocks in this complex statement potentially provide scope
4320 for variables as described earlier.  Each such block must include the
4321 "OpenScope" nonterminal before parsing the block, and must call
4322 `var_block_close()` when closing the block.
4323
4324 The code following "`if`", "`switch`" and "`for`" does not get its own
4325 scope, but is in a scope covering the whole statement, so names
4326 declared there cannot be redeclared elsewhere.  Similarly the
4327 condition following "`while`" is in a scope the covers the body
4328 ("`do`" part) of the loop, and which does not allow conditional scope
4329 extension.  Code following "`then`" (both looping and non-looping),
4330 "`else`" and "`case`" each get their own local scope.
4331
4332 The type requirements on the code block in a `whilepart` are quite
4333 unusal.  It is allowed to return a value of some identifiable type, in
4334 which case the loop aborts and an appropriate `casepart` is run, or it
4335 can return a Boolean, in which case the loop either continues to the
4336 `dopart` (on `True`) or aborts and runs the `elsepart` (on `False`).
4337 This is different both from the `ifpart` code block which is expected to
4338 return a Boolean, or the `switchpart` code block which is expected to
4339 return the same type as the casepart values.  The correct analysis of
4340 the type of the `whilepart` code block is the reason for the
4341 `Rboolok` flag which is passed to `propagate_types()`.
4342
4343 The `cond_statement` cannot fit into a `binode` so a new `exec` is
4344 defined.  As there are two scopes which cover multiple parts - one for
4345 the whole statement and one for "while" and "do" - and as we will use
4346 the 'struct exec' to track scopes, we actually need two new types of
4347 exec.  One is a `binode` for the looping part, the rest is the
4348 `cond_statement`.  The `cond_statement` will use an auxilliary `struct
4349 casepart` to track a list of case parts.
4350
4351 ###### Binode types
4352         Loop
4353
4354 ###### exec type
4355         Xcond_statement,
4356
4357 ###### ast
4358         struct casepart {
4359                 struct exec *value;
4360                 struct exec *action;
4361                 struct casepart *next;
4362         };
4363         struct cond_statement {
4364                 struct exec;
4365                 struct exec *forpart, *condpart, *thenpart, *elsepart;
4366                 struct binode *looppart;
4367                 struct casepart *casepart;
4368         };
4369
4370 ###### ast functions
4371
4372         static void free_casepart(struct casepart *cp)
4373         {
4374                 while (cp) {
4375                         struct casepart *t;
4376                         free_exec(cp->value);
4377                         free_exec(cp->action);
4378                         t = cp->next;
4379                         free(cp);
4380                         cp = t;
4381                 }
4382         }
4383
4384         static void free_cond_statement(struct cond_statement *s)
4385         {
4386                 if (!s)
4387                         return;
4388                 free_exec(s->forpart);
4389                 free_exec(s->condpart);
4390                 free_exec(s->looppart);
4391                 free_exec(s->thenpart);
4392                 free_exec(s->elsepart);
4393                 free_casepart(s->casepart);
4394                 free(s);
4395         }
4396
4397 ###### free exec cases
4398         case Xcond_statement: free_cond_statement(cast(cond_statement, e)); break;
4399
4400 ###### ComplexStatement Grammar
4401         | CondStatement ${ $0 = $<1; }$
4402
4403 ###### declare terminals
4404         $TERM for then while do
4405         $TERM else
4406         $TERM switch case
4407
4408 ###### Grammar
4409
4410         $*cond_statement
4411         // A CondStatement must end with EOL, as does CondSuffix and
4412         // IfSuffix.
4413         // ForPart, ThenPart, SwitchPart, CasePart are non-empty and
4414         // may or may not end with EOL
4415         // WhilePart and IfPart include an appropriate Suffix
4416
4417         // ForPart, SwitchPart, and IfPart open scopes, o we have to close
4418         // them.  WhilePart opens and closes its own scope.
4419         CondStatement -> ForPart OptNL ThenPart OptNL WhilePart CondSuffix ${
4420                 $0 = $<CS;
4421                 $0->forpart = $<FP;
4422                 $0->thenpart = $<TP;
4423                 $0->looppart = $<WP;
4424                 var_block_close(c, CloseSequential, $0);
4425         }$
4426         | ForPart OptNL WhilePart CondSuffix ${
4427                 $0 = $<CS;
4428                 $0->forpart = $<FP;
4429                 $0->looppart = $<WP;
4430                 var_block_close(c, CloseSequential, $0);
4431         }$
4432         | WhilePart CondSuffix ${
4433                 $0 = $<CS;
4434                 $0->looppart = $<WP;
4435         }$
4436         | SwitchPart OptNL CasePart CondSuffix ${
4437                 $0 = $<CS;
4438                 $0->condpart = $<SP;
4439                 $CP->next = $0->casepart;
4440                 $0->casepart = $<CP;
4441                 var_block_close(c, CloseSequential, $0);
4442         }$
4443         | SwitchPart : IN OptNL CasePart CondSuffix OUT Newlines ${
4444                 $0 = $<CS;
4445                 $0->condpart = $<SP;
4446                 $CP->next = $0->casepart;
4447                 $0->casepart = $<CP;
4448                 var_block_close(c, CloseSequential, $0);
4449         }$
4450         | IfPart IfSuffix ${
4451                 $0 = $<IS;
4452                 $0->condpart = $IP.condpart; $IP.condpart = NULL;
4453                 $0->thenpart = $IP.thenpart; $IP.thenpart = NULL;
4454                 // This is where we close an "if" statement
4455                 var_block_close(c, CloseSequential, $0);
4456         }$
4457
4458         CondSuffix -> IfSuffix ${
4459                 $0 = $<1;
4460         }$
4461         | Newlines CasePart CondSuffix ${
4462                 $0 = $<CS;
4463                 $CP->next = $0->casepart;
4464                 $0->casepart = $<CP;
4465         }$
4466         | CasePart CondSuffix ${
4467                 $0 = $<CS;
4468                 $CP->next = $0->casepart;
4469                 $0->casepart = $<CP;
4470         }$
4471
4472         IfSuffix -> Newlines ${ $0 = new(cond_statement); }$
4473         | Newlines ElsePart ${ $0 = $<EP; }$
4474         | ElsePart ${$0 = $<EP; }$
4475
4476         ElsePart -> else OpenBlock Newlines ${
4477                 $0 = new(cond_statement);
4478                 $0->elsepart = $<OB;
4479                 var_block_close(c, CloseElse, $0->elsepart);
4480         }$
4481         | else OpenScope CondStatement ${
4482                 $0 = new(cond_statement);
4483                 $0->elsepart = $<CS;
4484                 var_block_close(c, CloseElse, $0->elsepart);
4485         }$
4486
4487         $*casepart
4488         CasePart -> case Expression OpenScope ColonBlock ${
4489                 $0 = calloc(1,sizeof(struct casepart));
4490                 $0->value = $<Ex;
4491                 $0->action = $<Bl;
4492                 var_block_close(c, CloseParallel, $0->action);
4493         }$
4494
4495         $*exec
4496         // These scopes are closed in CondStatement
4497         ForPart -> for OpenBlock ${
4498                 $0 = $<Bl;
4499         }$
4500
4501         ThenPart -> then OpenBlock ${
4502                 $0 = $<OB;
4503                 var_block_close(c, CloseSequential, $0);
4504         }$
4505
4506         $*binode
4507         // This scope is closed in CondStatement
4508         WhilePart -> while UseBlock OptNL do OpenBlock ${
4509                 $0 = new(binode);
4510                 $0->op = Loop;
4511                 $0->left = $<UB;
4512                 $0->right = $<OB;
4513                 var_block_close(c, CloseSequential, $0->right);
4514                 var_block_close(c, CloseSequential, $0);
4515         }$
4516         | while OpenScope Expression OpenScope ColonBlock ${
4517                 $0 = new(binode);
4518                 $0->op = Loop;
4519                 $0->left = $<Exp;
4520                 $0->right = $<CB;
4521                 var_block_close(c, CloseSequential, $0->right);
4522                 var_block_close(c, CloseSequential, $0);
4523         }$
4524
4525         $cond_statement
4526         IfPart -> if UseBlock OptNL then OpenBlock ${
4527                 $0.condpart = $<UB;
4528                 $0.thenpart = $<OB;
4529                 var_block_close(c, CloseParallel, $0.thenpart);
4530         }$
4531         | if OpenScope Expression OpenScope ColonBlock ${
4532                 $0.condpart = $<Ex;
4533                 $0.thenpart = $<CB;
4534                 var_block_close(c, CloseParallel, $0.thenpart);
4535         }$
4536         | if OpenScope Expression OpenScope OptNL then Block ${
4537                 $0.condpart = $<Ex;
4538                 $0.thenpart = $<Bl;
4539                 var_block_close(c, CloseParallel, $0.thenpart);
4540         }$
4541
4542         $*exec
4543         // This scope is closed in CondStatement
4544         SwitchPart -> switch OpenScope Expression ${
4545                 $0 = $<Ex;
4546         }$
4547         | switch UseBlock ${
4548                 $0 = $<Bl;
4549         }$
4550
4551 ###### print binode cases
4552         case Loop:
4553                 if (b->left && b->left->type == Xbinode &&
4554                     cast(binode, b->left)->op == Block) {
4555                         if (bracket)
4556                                 do_indent(indent, "while {\n");
4557                         else
4558                                 do_indent(indent, "while\n");
4559                         print_exec(b->left, indent+1, bracket);
4560                         if (bracket)
4561                                 do_indent(indent, "} do {\n");
4562                         else
4563                                 do_indent(indent, "do\n");
4564                         print_exec(b->right, indent+1, bracket);
4565                         if (bracket)
4566                                 do_indent(indent, "}\n");
4567                 } else {
4568                         do_indent(indent, "while ");
4569                         print_exec(b->left, 0, bracket);
4570                         if (bracket)
4571                                 printf(" {\n");
4572                         else
4573                                 printf(":\n");
4574                         print_exec(b->right, indent+1, bracket);
4575                         if (bracket)
4576                                 do_indent(indent, "}\n");
4577                 }
4578                 break;
4579
4580 ###### print exec cases
4581
4582         case Xcond_statement:
4583         {
4584                 struct cond_statement *cs = cast(cond_statement, e);
4585                 struct casepart *cp;
4586                 if (cs->forpart) {
4587                         do_indent(indent, "for");
4588                         if (bracket) printf(" {\n"); else printf("\n");
4589                         print_exec(cs->forpart, indent+1, bracket);
4590                         if (cs->thenpart) {
4591                                 if (bracket)
4592                                         do_indent(indent, "} then {\n");
4593                                 else
4594                                         do_indent(indent, "then\n");
4595                                 print_exec(cs->thenpart, indent+1, bracket);
4596                         }
4597                         if (bracket) do_indent(indent, "}\n");
4598                 }
4599                 if (cs->looppart) {
4600                         print_exec(cs->looppart, indent, bracket);
4601                 } else {
4602                         // a condition
4603                         if (cs->casepart)
4604                                 do_indent(indent, "switch");
4605                         else
4606                                 do_indent(indent, "if");
4607                         if (cs->condpart && cs->condpart->type == Xbinode &&
4608                             cast(binode, cs->condpart)->op == Block) {
4609                                 if (bracket)
4610                                         printf(" {\n");
4611                                 else
4612                                         printf("\n");
4613                                 print_exec(cs->condpart, indent+1, bracket);
4614                                 if (bracket)
4615                                         do_indent(indent, "}\n");
4616                                 if (cs->thenpart) {
4617                                         do_indent(indent, "then\n");
4618                                         print_exec(cs->thenpart, indent+1, bracket);
4619                                 }
4620                         } else {
4621                                 printf(" ");
4622                                 print_exec(cs->condpart, 0, bracket);
4623                                 if (cs->thenpart) {
4624                                         if (bracket)
4625                                                 printf(" {\n");
4626                                         else
4627                                                 printf(":\n");
4628                                         print_exec(cs->thenpart, indent+1, bracket);
4629                                         if (bracket)
4630                                                 do_indent(indent, "}\n");
4631                                 } else
4632                                         printf("\n");
4633                         }
4634                 }
4635                 for (cp = cs->casepart; cp; cp = cp->next) {
4636                         do_indent(indent, "case ");
4637                         print_exec(cp->value, -1, 0);
4638                         if (bracket)
4639                                 printf(" {\n");
4640                         else
4641                                 printf(":\n");
4642                         print_exec(cp->action, indent+1, bracket);
4643                         if (bracket)
4644                                 do_indent(indent, "}\n");
4645                 }
4646                 if (cs->elsepart) {
4647                         do_indent(indent, "else");
4648                         if (bracket)
4649                                 printf(" {\n");
4650                         else
4651                                 printf("\n");
4652                         print_exec(cs->elsepart, indent+1, bracket);
4653                         if (bracket)
4654                                 do_indent(indent, "}\n");
4655                 }
4656                 break;
4657         }
4658
4659 ###### propagate binode cases
4660         case Loop:
4661                 t = propagate_types(b->right, c, ok, Tnone, 0);
4662                 if (!type_compat(Tnone, t, 0))
4663                         *ok = 0;        // UNTESTED
4664                 return propagate_types(b->left, c, ok, type, rules);
4665
4666 ###### propagate exec cases
4667         case Xcond_statement:
4668         {
4669                 // forpart and looppart->right must return Tnone
4670                 // thenpart must return Tnone if there is a loopart,
4671                 // otherwise it is like elsepart.
4672                 // condpart must:
4673                 //    be bool if there is no casepart
4674                 //    match casepart->values if there is a switchpart
4675                 //    either be bool or match casepart->value if there
4676                 //             is a whilepart
4677                 // elsepart and casepart->action must match the return type
4678                 //   expected of this statement.
4679                 struct cond_statement *cs = cast(cond_statement, prog);
4680                 struct casepart *cp;
4681
4682                 t = propagate_types(cs->forpart, c, ok, Tnone, 0);
4683                 if (!type_compat(Tnone, t, 0))
4684                         *ok = 0;        // UNTESTED
4685
4686                 if (cs->looppart) {
4687                         t = propagate_types(cs->thenpart, c, ok, Tnone, 0);
4688                         if (!type_compat(Tnone, t, 0))
4689                                 *ok = 0;        // UNTESTED
4690                 }
4691                 if (cs->casepart == NULL) {
4692                         propagate_types(cs->condpart, c, ok, Tbool, 0);
4693                         propagate_types(cs->looppart, c, ok, Tbool, 0);
4694                 } else {
4695                         /* Condpart must match case values, with bool permitted */
4696                         t = NULL;
4697                         for (cp = cs->casepart;
4698                              cp && !t; cp = cp->next)
4699                                 t = propagate_types(cp->value, c, ok, NULL, 0);
4700                         if (!t && cs->condpart)
4701                                 t = propagate_types(cs->condpart, c, ok, NULL, Rboolok);        // UNTESTED
4702                         if (!t && cs->looppart)
4703                                 t = propagate_types(cs->looppart, c, ok, NULL, Rboolok);        // UNTESTED
4704                         // Now we have a type (I hope) push it down
4705                         if (t) {
4706                                 for (cp = cs->casepart; cp; cp = cp->next)
4707                                         propagate_types(cp->value, c, ok, t, 0);
4708                                 propagate_types(cs->condpart, c, ok, t, Rboolok);
4709                                 propagate_types(cs->looppart, c, ok, t, Rboolok);
4710                         }
4711                 }
4712                 // (if)then, else, and case parts must return expected type.
4713                 if (!cs->looppart && !type)
4714                         type = propagate_types(cs->thenpart, c, ok, NULL, rules);
4715                 if (!type)
4716                         type = propagate_types(cs->elsepart, c, ok, NULL, rules);
4717                 for (cp = cs->casepart;
4718                      cp && !type;
4719                      cp = cp->next)     // UNTESTED
4720                         type = propagate_types(cp->action, c, ok, NULL, rules); // UNTESTED
4721                 if (type) {
4722                         if (!cs->looppart)
4723                                 propagate_types(cs->thenpart, c, ok, type, rules);
4724                         propagate_types(cs->elsepart, c, ok, type, rules);
4725                         for (cp = cs->casepart; cp ; cp = cp->next)
4726                                 propagate_types(cp->action, c, ok, type, rules);
4727                         return type;
4728                 } else
4729                         return NULL;
4730         }
4731
4732 ###### interp binode cases
4733         case Loop:
4734                 // This just performs one iterration of the loop
4735                 rv = interp_exec(c, b->left, &rvtype);
4736                 if (rvtype == Tnone ||
4737                     (rvtype == Tbool && rv.bool != 0))
4738                         // rvtype is Tnone or Tbool, doesn't need to be freed
4739                         interp_exec(c, b->right, NULL);
4740                 break;
4741
4742 ###### interp exec cases
4743         case Xcond_statement:
4744         {
4745                 struct value v, cnd;
4746                 struct type *vtype, *cndtype;
4747                 struct casepart *cp;
4748                 struct cond_statement *cs = cast(cond_statement, e);
4749
4750                 if (cs->forpart)
4751                         interp_exec(c, cs->forpart, NULL);
4752                 if (cs->looppart) {
4753                         while ((cnd = interp_exec(c, cs->looppart, &cndtype)),
4754                                cndtype == Tnone || (cndtype == Tbool && cnd.bool != 0))
4755                                 interp_exec(c, cs->thenpart, NULL);
4756                 } else {
4757                         cnd = interp_exec(c, cs->condpart, &cndtype);
4758                         if ((cndtype == Tnone ||
4759                             (cndtype == Tbool && cnd.bool != 0))) {
4760                                 // cnd is Tnone or Tbool, doesn't need to be freed
4761                                 rv = interp_exec(c, cs->thenpart, &rvtype);
4762                                 // skip else (and cases)
4763                                 goto Xcond_done;
4764                         }
4765                 }
4766                 for (cp = cs->casepart; cp; cp = cp->next) {
4767                         v = interp_exec(c, cp->value, &vtype);
4768                         if (value_cmp(cndtype, vtype, &v, &cnd) == 0) {
4769                                 free_value(vtype, &v);
4770                                 free_value(cndtype, &cnd);
4771                                 rv = interp_exec(c, cp->action, &rvtype);
4772                                 goto Xcond_done;
4773                         }
4774                         free_value(vtype, &v);
4775                 }
4776                 free_value(cndtype, &cnd);
4777                 if (cs->elsepart)
4778                         rv = interp_exec(c, cs->elsepart, &rvtype);
4779                 else
4780                         rvtype = Tnone;
4781         Xcond_done:
4782                 break;
4783         }
4784
4785 ### Top level structure
4786
4787 All the language elements so far can be used in various places.  Now
4788 it is time to clarify what those places are.
4789
4790 At the top level of a file there will be a number of declarations.
4791 Many of the things that can be declared haven't been described yet,
4792 such as functions, procedures, imports, and probably more.
4793 For now there are two sorts of things that can appear at the top
4794 level.  They are predefined constants, `struct` types, and the `main`
4795 function.  While the syntax will allow the `main` function to appear
4796 multiple times, that will trigger an error if it is actually attempted.
4797
4798 The various declarations do not return anything.  They store the
4799 various declarations in the parse context.
4800
4801 ###### Parser: grammar
4802
4803         $void
4804         Ocean -> OptNL DeclarationList
4805
4806         ## declare terminals
4807
4808         OptNL ->
4809         | OptNL NEWLINE
4810
4811         Newlines -> NEWLINE
4812         | Newlines NEWLINE
4813
4814         DeclarationList -> Declaration
4815         | DeclarationList Declaration
4816
4817         Declaration -> ERROR Newlines ${
4818                 tok_err(c,      // UNTESTED
4819                         "error: unhandled parse error", &$1);
4820         }$
4821         | DeclareConstant
4822         | DeclareFunction
4823         | DeclareStruct
4824
4825         ## top level grammar
4826
4827         ## Grammar
4828
4829 ### The `const` section
4830
4831 As well as being defined in with the code that uses them, constants
4832 can be declared at the top level.  These have full-file scope, so they
4833 are always `InScope`.  The value of a top level constant can be given
4834 as an expression, and this is evaluated immediately rather than in the
4835 later interpretation stage.  Once we add functions to the language, we
4836 will need rules concern which, if any, can be used to define a top
4837 level constant.
4838
4839 Constants are defined in a section that starts with the reserved word
4840 `const` and then has a block with a list of assignment statements.
4841 For syntactic consistency, these must use the double-colon syntax to
4842 make it clear that they are constants.  Type can also be given: if
4843 not, the type will be determined during analysis, as with other
4844 constants.
4845
4846 As the types constants are inserted at the head of a list, printing
4847 them in the same order that they were read is not straight forward.
4848 We take a quadratic approach here and count the number of constants
4849 (variables of depth 0), then count down from there, each time
4850 searching through for the Nth constant for decreasing N.
4851
4852 ###### top level grammar
4853
4854         $TERM const
4855
4856         DeclareConstant -> const { IN OptNL ConstList OUT OptNL } Newlines
4857         | const { SimpleConstList } Newlines
4858         | const IN OptNL ConstList OUT Newlines
4859         | const SimpleConstList Newlines
4860
4861         ConstList -> ConstList SimpleConstLine
4862         | SimpleConstLine
4863
4864         SimpleConstList -> SimpleConstList ; Const
4865         | Const
4866         | SimpleConstList ;
4867
4868         SimpleConstLine -> SimpleConstList Newlines
4869         | ERROR Newlines ${ tok_err(c, "Syntax error in constant", &$1); }$
4870
4871         $*type
4872         CType -> Type   ${ $0 = $<1; }$
4873         |               ${ $0 = NULL; }$
4874
4875         $void
4876         Const -> IDENTIFIER :: CType = Expression ${ {
4877                 int ok;
4878                 struct variable *v;
4879
4880                 v = var_decl(c, $1.txt);
4881                 if (v) {
4882                         struct var *var = new_pos(var, $1);
4883                         v->where_decl = var;
4884                         v->where_set = var;
4885                         var->var = v;
4886                         v->constant = 1;
4887                         v->global = 1;
4888                 } else {
4889                         struct variable *vorig = var_ref(c, $1.txt);
4890                         tok_err(c, "error: name already declared", &$1);
4891                         type_err(c, "info: this is where '%v' was first declared",
4892                                  vorig->where_decl, NULL, 0, NULL);
4893                 }
4894                 do {
4895                         ok = 1;
4896                         propagate_types($5, c, &ok, $3, 0);
4897                 } while (ok == 2);
4898                 if (!ok)
4899                         c->parse_error = 1;
4900                 else if (v) {
4901                         struct value res = interp_exec(c, $5, &v->type);
4902                         global_alloc(c, v->type, v, &res);
4903                 }
4904         } }$
4905
4906 ###### print const decls
4907         {
4908                 struct variable *v;
4909                 int target = -1;
4910
4911                 while (target != 0) {
4912                         int i = 0;
4913                         for (v = context.in_scope; v; v=v->in_scope)
4914                                 if (v->depth == 0 && v->constant) {
4915                                         i += 1;
4916                                         if (i == target)
4917                                                 break;
4918                                 }
4919
4920                         if (target == -1) {
4921                                 if (i)
4922                                         printf("const\n");
4923                                 target = i;
4924                         } else {
4925                                 struct value *val = var_value(&context, v);
4926                                 printf("    %.*s :: ", v->name->name.len, v->name->name.txt);
4927                                 type_print(v->type, stdout);
4928                                 printf(" = ");
4929                                 if (v->type == Tstr)
4930                                         printf("\"");
4931                                 print_value(v->type, val, stdout);
4932                                 if (v->type == Tstr)
4933                                         printf("\"");
4934                                 printf("\n");
4935                                 target -= 1;
4936                         }
4937                 }
4938         }
4939
4940 ### Function declarations
4941
4942 The code in an Ocean program is all stored in function declarations.
4943 One of the functions must be named `main` and it must accept an array of
4944 strings as a parameter - the command line arguments.
4945
4946 As this is the top level, several things are handled a bit differently.
4947 The function is not interpreted by `interp_exec` as that isn't passed
4948 the argument list which the program requires.  Similarly type analysis
4949 is a bit more interesting at this level.
4950
4951 ###### ast functions
4952
4953         static struct type *handle_results(struct parse_context *c,
4954                                            struct binode *results)
4955         {
4956                 /* Create a 'struct' type from the results list, which
4957                  * is a list for 'struct var'
4958                  */
4959                 struct type *t = add_anon_type(c, &structure_prototype,
4960                                                " function result");
4961                 int cnt = 0;
4962                 struct binode *b;
4963
4964                 for (b = results; b; b = cast(binode, b->right))
4965                         cnt += 1;
4966                 t->structure.nfields = cnt;
4967                 t->structure.fields = calloc(cnt, sizeof(struct field));
4968                 cnt = 0;
4969                 for (b = results; b; b = cast(binode, b->right)) {
4970                         struct var *v = cast(var, b->left);
4971                         struct field *f = &t->structure.fields[cnt++];
4972                         int a = v->var->type->align;
4973                         f->name = v->var->name->name;
4974                         f->type = v->var->type;
4975                         f->init = NULL;
4976                         f->offset = t->size;
4977                         v->var->frame_pos = f->offset;
4978                         t->size += ((f->type->size - 1) | (a-1)) + 1;
4979                         if (a > t->align)
4980                                 t->align = a;
4981                         variable_unlink_exec(v->var);
4982                 }
4983                 free_binode(results);
4984                 return t;
4985         }
4986
4987         static struct variable *declare_function(struct parse_context *c,
4988                                                 struct variable *name,
4989                                                 struct binode *args,
4990                                                 struct type *ret,
4991                                                 struct binode *results,
4992                                                 struct exec *code)
4993         {
4994                 if (name) {
4995                         struct value fn = {.function = code};
4996                         struct type *t;
4997                         var_block_close(c, CloseFunction, code);
4998                         t = add_anon_type(c, &function_prototype,
4999                                           "func %.*s", name->name->name.len,
5000                                           name->name->name.txt);
5001                         name->type = t;
5002                         t->function.params = reorder_bilist(args);
5003                         if (!ret) {
5004                                 ret = handle_results(c, reorder_bilist(results));
5005                                 t->function.inline_result = 1;
5006                                 t->function.local_size = ret->size;
5007                         }
5008                         t->function.return_type = ret;
5009                         global_alloc(c, t, name, &fn);
5010                         name->type->function.scope = c->out_scope;
5011                 } else {
5012                         free_binode(args);
5013                         free_type(ret);
5014                         free_exec(code);
5015                         var_block_close(c, CloseFunction, NULL);
5016                 }
5017                 c->out_scope = NULL;
5018                 return name;
5019         }
5020
5021 ###### declare terminals
5022         $TERM return
5023
5024 ###### top level grammar
5025
5026         $*variable
5027         DeclareFunction -> func FuncName ( OpenScope ArgsLine ) Block Newlines ${
5028                 $0 = declare_function(c, $<FN, $<Ar, Tnone, NULL, $<Bl);
5029         }$
5030         | func FuncName IN OpenScope Args OUT OptNL do Block Newlines ${
5031                 $0 = declare_function(c, $<FN, $<Ar, Tnone, NULL, $<Bl);
5032         }$
5033         | func FuncName NEWLINE OpenScope OptNL do Block Newlines ${
5034                 $0 = declare_function(c, $<FN, NULL, Tnone, NULL, $<Bl);
5035         }$
5036         | func FuncName ( OpenScope ArgsLine ) : Type Block Newlines ${
5037                 $0 = declare_function(c, $<FN, $<Ar, $<Ty, NULL, $<Bl);
5038         }$
5039         | func FuncName ( OpenScope ArgsLine ) : ( ArgsLine ) Block Newlines ${
5040                 $0 = declare_function(c, $<FN, $<AL, NULL, $<AL2, $<Bl);
5041         }$
5042         | func FuncName IN OpenScope Args OUT OptNL return Type Newlines do Block Newlines ${
5043                 $0 = declare_function(c, $<FN, $<Ar, $<Ty, NULL, $<Bl);
5044         }$
5045         | func FuncName NEWLINE OpenScope return Type Newlines do Block Newlines ${
5046                 $0 = declare_function(c, $<FN, NULL, $<Ty, NULL, $<Bl);
5047         }$
5048         | func FuncName IN OpenScope Args OUT OptNL return IN Args OUT OptNL do Block Newlines ${
5049                 $0 = declare_function(c, $<FN, $<Ar, NULL, $<Ar2, $<Bl);
5050         }$
5051         | func FuncName NEWLINE OpenScope return IN Args OUT OptNL do Block Newlines ${
5052                 $0 = declare_function(c, $<FN, NULL, NULL, $<Ar, $<Bl);
5053         }$
5054
5055 ###### print func decls
5056         {
5057                 struct variable *v;
5058                 int target = -1;
5059
5060                 while (target != 0) {
5061                         int i = 0;
5062                         for (v = context.in_scope; v; v=v->in_scope)
5063                                 if (v->depth == 0 && v->type && v->type->check_args) {
5064                                         i += 1;
5065                                         if (i == target)
5066                                                 break;
5067                                 }
5068
5069                         if (target == -1) {
5070                                 target = i;
5071                         } else {
5072                                 struct value *val = var_value(&context, v);
5073                                 printf("func %.*s", v->name->name.len, v->name->name.txt);
5074                                 v->type->print_type_decl(v->type, stdout);
5075                                 if (brackets)
5076                                         print_exec(val->function, 0, brackets);
5077                                 else
5078                                         print_value(v->type, val, stdout);
5079                                 printf("/* frame size %d */\n", v->type->function.local_size);
5080                                 target -= 1;
5081                         }
5082                 }
5083         }
5084
5085 ###### core functions
5086
5087         static int analyse_funcs(struct parse_context *c)
5088         {
5089                 struct variable *v;
5090                 int all_ok = 1;
5091                 for (v = c->in_scope; v; v = v->in_scope) {
5092                         struct value *val;
5093                         struct type *ret;
5094                         int ok = 1;
5095                         if (v->depth != 0 || !v->type || !v->type->check_args)
5096                                 continue;
5097                         ret = v->type->function.inline_result ?
5098                                 Tnone : v->type->function.return_type;
5099                         val = var_value(c, v);
5100                         do {
5101                                 ok = 1;
5102                                 propagate_types(val->function, c, &ok, ret, 0);
5103                         } while (ok == 2);
5104                         if (ok)
5105                                 /* Make sure everything is still consistent */
5106                                 propagate_types(val->function, c, &ok, ret, 0);
5107                         if (!ok)
5108                                 all_ok = 0;
5109                         if (!v->type->function.inline_result &&
5110                             !v->type->function.return_type->dup) {
5111                                 type_err(c, "error: function cannot return value of type %1",
5112                                          v->where_decl, v->type->function.return_type, 0, NULL);
5113                         }
5114
5115                         scope_finalize(c, v->type);
5116                 }
5117                 return all_ok;
5118         }
5119
5120         static int analyse_main(struct type *type, struct parse_context *c)
5121         {
5122                 struct binode *bp = type->function.params;
5123                 struct binode *b;
5124                 int ok = 1;
5125                 int arg = 0;
5126                 struct type *argv_type;
5127
5128                 argv_type = add_anon_type(c, &array_prototype, "argv");
5129                 argv_type->array.member = Tstr;
5130                 argv_type->array.unspec = 1;
5131
5132                 for (b = bp; b; b = cast(binode, b->right)) {
5133                         ok = 1;
5134                         switch (arg++) {
5135                         case 0: /* argv */
5136                                 propagate_types(b->left, c, &ok, argv_type, 0);
5137                                 break;
5138                         default: /* invalid */  // NOTEST
5139                                 propagate_types(b->left, c, &ok, Tnone, 0);     // NOTEST
5140                         }
5141                         if (!ok)
5142                                 c->parse_error = 1;
5143                 }
5144
5145                 return !c->parse_error;
5146         }
5147
5148         static void interp_main(struct parse_context *c, int argc, char **argv)
5149         {
5150                 struct value *progp = NULL;
5151                 struct text main_name = { "main", 4 };
5152                 struct variable *mainv;
5153                 struct binode *al;
5154                 int anum = 0;
5155                 struct value v;
5156                 struct type *vtype;
5157
5158                 mainv = var_ref(c, main_name);
5159                 if (mainv)
5160                         progp = var_value(c, mainv);
5161                 if (!progp || !progp->function) {
5162                         fprintf(stderr, "oceani: no main function found.\n");
5163                         c->parse_error = 1;
5164                         return;
5165                 }
5166                 if (!analyse_main(mainv->type, c)) {
5167                         fprintf(stderr, "oceani: main has wrong type.\n");
5168                         c->parse_error = 1;
5169                         return;
5170                 }
5171                 al = mainv->type->function.params;
5172
5173                 c->local_size = mainv->type->function.local_size;
5174                 c->local = calloc(1, c->local_size);
5175                 while (al) {
5176                         struct var *v = cast(var, al->left);
5177                         struct value *vl = var_value(c, v->var);
5178                         struct value arg;
5179                         struct type *t;
5180                         mpq_t argcq;
5181                         int i;
5182
5183                         switch (anum++) {
5184                         case 0: /* argv */
5185                                 t = v->var->type;
5186                                 mpq_init(argcq);
5187                                 mpq_set_ui(argcq, argc, 1);
5188                                 memcpy(var_value(c, t->array.vsize), &argcq, sizeof(argcq));
5189                                 t->prepare_type(c, t, 0);
5190                                 array_init(v->var->type, vl);
5191                                 for (i = 0; i < argc; i++) {
5192                                         struct value *vl2 = vl->array + i * v->var->type->array.member->size;
5193
5194                                         arg.str.txt = argv[i];
5195                                         arg.str.len = strlen(argv[i]);
5196                                         free_value(Tstr, vl2);
5197                                         dup_value(Tstr, &arg, vl2);
5198                                 }
5199                                 break;
5200                         }
5201                         al = cast(binode, al->right);
5202                 }
5203                 v = interp_exec(c, progp->function, &vtype);
5204                 free_value(vtype, &v);
5205                 free(c->local);
5206                 c->local = NULL;
5207         }
5208
5209 ###### ast functions
5210         void free_variable(struct variable *v)
5211         {
5212         }
5213
5214 ## And now to test it out.
5215
5216 Having a language requires having a "hello world" program.  I'll
5217 provide a little more than that: a program that prints "Hello world"
5218 finds the GCD of two numbers, prints the first few elements of
5219 Fibonacci, performs a binary search for a number, and a few other
5220 things which will likely grow as the languages grows.
5221
5222 ###### File: oceani.mk
5223         demos :: sayhello
5224         sayhello : oceani
5225                 @echo "===== DEMO ====="
5226                 ./oceani --section "demo: hello" oceani.mdc 55 33
5227
5228 ###### demo: hello
5229
5230         const
5231                 pi ::= 3.141_592_6
5232                 four ::= 2 + 2 ; five ::= 10/2
5233         const pie ::= "I like Pie";
5234                 cake ::= "The cake is"
5235                   ++ " a lie"
5236
5237         struct fred
5238                 size:[four]number
5239                 name:string
5240                 alive:Boolean
5241
5242         func main(argv:[argc::]string)
5243                 print "Hello World, what lovely oceans you have!"
5244                 print "Are there", five, "?"
5245                 print pi, pie, "but", cake
5246
5247                 A := $argv[1]; B := $argv[2]
5248
5249                 /* When a variable is defined in both branches of an 'if',
5250                  * and used afterwards, the variables are merged.
5251                  */
5252                 if A > B:
5253                         bigger := "yes"
5254                 else
5255                         bigger := "no"
5256                 print "Is", A, "bigger than", B,"? ", bigger
5257                 /* If a variable is not used after the 'if', no
5258                  * merge happens, so types can be different
5259                  */
5260                 if A > B * 2:
5261                         double:string = "yes"
5262                         print A, "is more than twice", B, "?", double
5263                 else
5264                         double := B*2
5265                         print "double", B, "is", double
5266
5267                 a : number
5268                 a = A;
5269                 b:number = B
5270                 if a > 0 and then b > 0:
5271                         while a != b:
5272                                 if a < b:
5273                                         b = b - a
5274                                 else
5275                                         a = a - b
5276                         print "GCD of", A, "and", B,"is", a
5277                 else if a <= 0:
5278                         print a, "is not positive, cannot calculate GCD"
5279                 else
5280                         print b, "is not positive, cannot calculate GCD"
5281
5282                 for
5283                         togo := 10
5284                         f1 := 1; f2 := 1
5285                         print "Fibonacci:", f1,f2,
5286                 then togo = togo - 1
5287                 while togo > 0:
5288                         f3 := f1 + f2
5289                         print "", f3,
5290                         f1 = f2
5291                         f2 = f3
5292                 print ""
5293
5294                 /* Binary search... */
5295                 for
5296                         lo:= 0; hi := 100
5297                         target := 77
5298                 while
5299                         mid := (lo + hi) / 2
5300                         if mid == target:
5301                                 use Found
5302                         if mid < target:
5303                                 lo = mid
5304                         else
5305                                 hi = mid
5306                         if hi - lo < 1:
5307                                 lo = mid
5308                                 use GiveUp
5309                         use True
5310                 do pass
5311                 case Found:
5312                         print "Yay, I found", target
5313                 case GiveUp:
5314                         print "Closest I found was", lo
5315
5316                 size::= 10
5317                 list:[size]number
5318                 list[0] = 1234
5319                 // "middle square" PRNG.  Not particularly good, but one my
5320                 // Dad taught me - the first one I ever heard of.
5321                 for i:=1; then i = i + 1; while i < size:
5322                         n := list[i-1] * list[i-1]
5323                         list[i] = (n / 100) % 10 000
5324
5325                 print "Before sort:",
5326                 for i:=0; then i = i + 1; while i < size:
5327                         print "", list[i],
5328                 print
5329
5330                 for i := 1; then i=i+1; while i < size:
5331                         for j:=i-1; then j=j-1; while j >= 0:
5332                                 if list[j] > list[j+1]:
5333                                         t:= list[j]
5334                                         list[j] = list[j+1]
5335                                         list[j+1] = t
5336                 print " After sort:",
5337                 for i:=0; then i = i + 1; while i < size:
5338                         print "", list[i],
5339                 print
5340
5341                 if 1 == 2 then print "yes"; else print "no"
5342
5343                 bob:fred
5344                 bob.name = "Hello"
5345                 bob.alive = (bob.name == "Hello")
5346                 print "bob", "is" if  bob.alive else "isn't", "alive"