ocean-lang.org Git - ocean/blob - csrc/oceani.mdc

   1 # Ocean Interpreter - Jamison Creek version
   2
   3 Ocean is intended to be a compiled language, so this interpreter is
   4 not targeted at being the final product.  It is, rather, an intermediate
   5 stage and fills that role in two distinct ways.
   6
   7 Firstly, it exists as a platform to experiment with the early language
   8 design.  An interpreter is easy to write and easy to get working, so
   9 the barrier for entry is lower if I aim to start with an interpreter.
  10
  11 Secondly, the plan for the Ocean compiler is to write it in the
  12 [Ocean language](http://ocean-lang.org).  To achieve this we naturally
  13 need some sort of boot-strap process and this interpreter - written in
  14 portable C - will fill that role.  It will be used to bootstrap the
  15 Ocean compiler.
  16
  17 Two features that are not needed to fill either of these roles are
  18 performance and completeness.  The interpreter only needs to be fast
  19 enough to run small test programs and occasionally to run the compiler
  20 on itself.  It only needs to be complete enough to test aspects of the
  21 design which are developed before the compiler is working, and to run
  22 the compiler on itself.  Any features not used by the compiler when
  23 compiling itself are superfluous.  They may be included anyway, but
  24 they may not.
  25
  26 Nonetheless, the interpreter should end up being reasonably complete,
  27 and any performance bottlenecks which appear and are easily fixed, will
  28 be.
  29
  30 ## Current version
  31
  32 This third version of the interpreter exists to test out some initial
  33 ideas relating to types.  Particularly it adds arrays (indexed from
  34 zero) and simple structures.  Basic control flow and variable scoping
  35 are already fairly well established, as are basic numerical and
  36 boolean operators.
  37
  38 Some operators that have only recently been added, and so have not
  39 generated all that much experience yet are "and then" and "or else" as
  40 short-circuit Boolean operators, and the "if ... else" trinary
  41 operator which can select between two expressions based on a third
  42 (which appears syntactically in the middle).
  43
  44 The "func" clause currently only allows a "main" function to be
  45 declared.  That will be extended when proper function support is added.
  46
  47 An element that is present purely to make a usable language, and
  48 without any expectation that they will remain, is the "print" statement
  49 which performs simple output.
  50
  51 The current scalar types are "number", "Boolean", and "string".
  52 Boolean will likely stay in its current form, the other two might, but
  53 could just as easily be changed.
  54
  55 ## Naming
  56
  57 Versions of the interpreter which obviously do not support a complete
  58 language will be named after creeks and streams.  This one is Jamison
  59 Creek.
  60
  61 Once we have something reasonably resembling a complete language, the
  62 names of rivers will be used.
  63 Early versions of the compiler will be named after seas.  Major
  64 releases of the compiler will be named after oceans.  Hopefully I will
  65 be finished once I get to the Pacific Ocean release.
  66
  67 ## Outline
  68
  69 As well as parsing and executing a program, the interpreter can print
  70 out the program from the parsed internal structure.  This is useful
  71 for validating the parsing.
  72 So the main requirements of the interpreter are:
  73
  74 - Parse the program, possibly with tracing,
  75 - Analyse the parsed program to ensure consistency,
  76 - Print the program,
  77 - Execute the "main" function in the program, if no parsing or
  78   consistency errors were found.
  79
  80 This is all performed by a single C program extracted with
  81 `parsergen`.
  82
  83 There will be two formats for printing the program: a default and one
  84 that uses bracketing.  So a `--bracket` command line option is needed
  85 for that.  Normally the first code section found is used, however an
  86 alternate section can be requested so that a file (such as this one)
  87 can contain multiple programs.  This is effected with the `--section`
  88 option.
  89
  90 This code must be compiled with `-fplan9-extensions` so that anonymous
  91 structures can be used.
  92
  93 ###### File: oceani.mk
  94
  95         myCFLAGS := -Wall -g -fplan9-extensions
  96         CFLAGS := $(filter-out $(myCFLAGS),$(CFLAGS)) $(myCFLAGS)
  97         myLDLIBS:= libparser.o libscanner.o libmdcode.o -licuuc
  98         LDLIBS := $(filter-out $(myLDLIBS),$(LDLIBS)) $(myLDLIBS)
  99         ## libs
 100         all :: $(LDLIBS) oceani
 101         oceani.c oceani.h : oceani.mdc parsergen
 102                 ./parsergen -o oceani --LALR --tag Parser oceani.mdc
 103         oceani.mk: oceani.mdc md2c
 104                 ./md2c oceani.mdc
 105
 106         oceani: oceani.o $(LDLIBS)
 107                 $(CC) $(CFLAGS) -o oceani oceani.o $(LDLIBS)
 108
 109 ###### Parser: header
 110         ## macros
 111         struct parse_context;
 112         ## ast
 113         struct parse_context {
 114                 struct token_config config;
 115                 char *file_name;
 116                 int parse_error;
 117                 ## parse context
 118         };
 119
 120 ###### macros
 121
 122         #define container_of(ptr, type, member) ({                      \
 123                 const typeof( ((type *)0)->member ) *__mptr = (ptr);    \
 124                 (type *)( (char *)__mptr - offsetof(type,member) );})
 125
 126         #define config2context(_conf) container_of(_conf, struct parse_context, \
 127                 config)
 128
 129 ###### Parser: reduce
 130         struct parse_context *c = config2context(config);
 131
 132 ###### Parser: code
 133         #define _GNU_SOURCE
 134         #include <unistd.h>
 135         #include <stdlib.h>
 136         #include <fcntl.h>
 137         #include <errno.h>
 138         #include <sys/mman.h>
 139         #include <string.h>
 140         #include <stdio.h>
 141         #include <locale.h>
 142         #include <malloc.h>
 143         #include "mdcode.h"
 144         #include "scanner.h"
 145         #include "parser.h"
 146
 147         ## includes
 148
 149         #include "oceani.h"
 150
 151         ## forward decls
 152         ## value functions
 153         ## ast functions
 154         ## core functions
 155
 156         #include <getopt.h>
 157         static char Usage[] =
 158                 "Usage: oceani --trace --print --noexec --brackets --section=SectionName prog.ocn\n";
 159         static const struct option long_options[] = {
 160                 {"trace",     0, NULL, 't'},
 161                 {"print",     0, NULL, 'p'},
 162                 {"noexec",    0, NULL, 'n'},
 163                 {"brackets",  0, NULL, 'b'},
 164                 {"section",   1, NULL, 's'},
 165                 {NULL,        0, NULL, 0},
 166         };
 167         const char *options = "tpnbs";
 168
 169         static void pr_err(char *msg)                   // NOTEST
 170         {
 171                 fprintf(stderr, "%s\n", msg);           // NOTEST
 172         }                                               // NOTEST
 173
 174         int main(int argc, char *argv[])
 175         {
 176                 int fd;
 177                 int len;
 178                 char *file;
 179                 struct section *s = NULL, *ss;
 180                 char *section = NULL;
 181                 struct parse_context context = {
 182                         .config = {
 183                                 .ignored = (1 << TK_mark),
 184                                 .number_chars = ".,_+- ",
 185                                 .word_start = "_",
 186                                 .word_cont = "_",
 187                         },
 188                 };
 189                 int doprint=0, dotrace=0, doexec=1, brackets=0;
 190                 int opt;
 191                 while ((opt = getopt_long(argc, argv, options, long_options, NULL))
 192                        != -1) {
 193                         switch(opt) {
 194                         case 't': dotrace=1; break;
 195                         case 'p': doprint=1; break;
 196                         case 'n': doexec=0; break;
 197                         case 'b': brackets=1; break;
 198                         case 's': section = optarg; break;
 199                         default: fprintf(stderr, Usage);
 200                                 exit(1);
 201                         }
 202                 }
 203                 if (optind >= argc) {
 204                         fprintf(stderr, "oceani: no input file given\n");
 205                         exit(1);
 206                 }
 207                 fd = open(argv[optind], O_RDONLY);
 208                 if (fd < 0) {
 209                         fprintf(stderr, "oceani: cannot open %s\n", argv[optind]);
 210                         exit(1);
 211                 }
 212                 context.file_name = argv[optind];
 213                 len = lseek(fd, 0, 2);
 214                 file = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
 215                 s = code_extract(file, file+len, pr_err);
 216                 if (!s) {
 217                         fprintf(stderr, "oceani: could not find any code in %s\n",
 218                                 argv[optind]);
 219                         exit(1);
 220                 }
 221
 222                 ## context initialization
 223
 224                 if (section) {
 225                         for (ss = s; ss; ss = ss->next) {
 226                                 struct text sec = ss->section;
 227                                 if (sec.len == strlen(section) &&
 228                                     strncmp(sec.txt, section, sec.len) == 0)
 229                                         break;
 230                         }
 231                         if (!ss) {
 232                                 fprintf(stderr, "oceani: cannot find section %s\n",
 233                                         section);
 234                                 goto cleanup;
 235                         }
 236                 } else
 237                         ss = s;                         // NOTEST
 238                 if (!ss->code) {
 239                         fprintf(stderr, "oceani: no code found in requested section\n");        // NOTEST
 240                         goto cleanup;                   // NOTEST
 241                 }
 242
 243                 parse_oceani(ss->code, &context.config, dotrace ? stderr : NULL);
 244
 245                 resolve_consts(&context);
 246                 prepare_types(&context);
 247                 if (!context.parse_error && !analyse_funcs(&context)) {
 248                         fprintf(stderr, "oceani: type error in program - not running.\n");
 249                         context.parse_error += 1;
 250                 }
 251
 252                 if (doprint) {
 253                         ## print const decls
 254                         ## print type decls
 255                         ## print func decls
 256                 }
 257                 if (doexec && !context.parse_error)
 258                         interp_main(&context, argc - optind, argv + optind);
 259         cleanup:
 260                 while (s) {
 261                         struct section *t = s->next;
 262                         code_free(s->code);
 263                         free(s);
 264                         s = t;
 265                 }
 266                 // FIXME parser should pop scope even on error
 267                 while (context.scope_depth > 0)
 268                         scope_pop(&context);
 269                 ## free global vars
 270                 ## free const decls
 271                 ## free context types
 272                 ## free context storage
 273                 exit(context.parse_error ? 1 : 0);
 274         }
 275
 276 ### Analysis
 277
 278 The four requirements of parse, analyse, print, interpret apply to
 279 each language element individually so that is how most of the code
 280 will be structured.
 281
 282 Three of the four are fairly self explanatory.  The one that requires
 283 a little explanation is the analysis step.
 284
 285 The current language design does not require the types of variables to
 286 be declared, but they must still have a single type.  Different
 287 operations impose different requirements on the variables, for example
 288 addition requires both arguments to be numeric, and assignment
 289 requires the variable on the left to have the same type as the
 290 expression on the right.
 291
 292 Analysis involves propagating these type requirements around and
 293 consequently setting the type of each variable.  If any requirements
 294 are violated (e.g. a string is compared with a number) or if a
 295 variable needs to have two different types, then an error is raised
 296 and the program will not run.
 297
 298 If the same variable is declared in both branchs of an 'if/else', or
 299 in all cases of a 'switch' then the multiple instances may be merged
 300 into just one variable if the variable is referenced after the
 301 conditional statement.  When this happens, the types must naturally be
 302 consistent across all the branches.  When the variable is not used
 303 outside the if, the variables in the different branches are distinct
 304 and can be of different types.
 305
 306 Undeclared names may only appear in "use" statements and "case" expressions.
 307 These names are given a type of "label" and a unique value.
 308 This allows them to fill the role of a name in an enumerated type, which
 309 is useful for testing the `switch` statement.
 310
 311 As we will see, the condition part of a `while` statement can return
 312 either a Boolean or some other type.  This requires that the expected
 313 type that gets passed around comprises a type and a flag to indicate
 314 that `Tbool` is also permitted.
 315
 316 As there are, as yet, no distinct types that are compatible, there
 317 isn't much subtlety in the analysis.  When we have distinct number
 318 types, this will become more interesting.
 319
 320 #### Error reporting
 321
 322 When analysis discovers an inconsistency it needs to report an error;
 323 just refusing to run the code ensures that the error doesn't cascade,
 324 but by itself it isn't very useful.  A clear understanding of the sort
 325 of error message that are useful will help guide the process of
 326 analysis.
 327
 328 At a simplistic level, the only sort of error that type analysis can
 329 report is that the type of some construct doesn't match a contextual
 330 requirement.  For example, in `4 + "hello"` the addition provides a
 331 contextual requirement for numbers, but `"hello"` is not a number.  In
 332 this particular example no further information is needed as the types
 333 are obvious from local information.  When a variable is involved that
 334 isn't the case.  It may be helpful to explain why the variable has a
 335 particular type, by indicating the location where the type was set,
 336 whether by declaration or usage.
 337
 338 Using a recursive-descent analysis we can easily detect a problem at
 339 multiple locations. In "`hello:= "there"; 4 + hello`" the addition
 340 will detect that one argument is not a number and the usage of `hello`
 341 will detect that a number was wanted, but not provided.  In this
 342 (early) version of the language, we will generate error reports at
 343 multiple locations, so the use of `hello` will report an error and
 344 explain were the value was set, and the addition will report an error
 345 and say why numbers are needed.  To be able to report locations for
 346 errors, each language element will need to record a file location
 347 (line and column) and each variable will need to record the language
 348 element where its type was set.  For now we will assume that each line
 349 of an error message indicates one location in the file, and up to 2
 350 types.  So we provide a `printf`-like function which takes a format, a
 351 location (a `struct exec` which has not yet been introduced), and 2
 352 types. "`%1`" reports the first type, "`%2`" reports the second.  We
 353 will need a function to print the location, once we know how that is
 354 stored. e As will be explained later, there are sometimes extra rules for
 355 type matching and they might affect error messages, we need to pass those
 356 in too.
 357
 358 As well as type errors, we sometimes need to report problems with
 359 tokens, which might be unexpected or might name a type that has not
 360 been defined.  For these we have `tok_err()` which reports an error
 361 with a given token.  Each of the error functions sets the flag in the
 362 context so indicate that parsing failed.
 363
 364 ###### forward decls
 365
 366         static void fput_loc(struct exec *loc, FILE *f);
 367         static void type_err(struct parse_context *c,
 368                              char *fmt, struct exec *loc,
 369                              struct type *t1, int rules, struct type *t2);
 370         static void tok_err(struct parse_context *c, char *fmt, struct token *t);
 371
 372 ###### core functions
 373
 374         static void type_err(struct parse_context *c,
 375                              char *fmt, struct exec *loc,
 376                              struct type *t1, int rules, struct type *t2)
 377         {
 378                 fprintf(stderr, "%s:", c->file_name);
 379                 fput_loc(loc, stderr);
 380                 for (; *fmt ; fmt++) {
 381                         if (*fmt != '%') {
 382                                 fputc(*fmt, stderr);
 383                                 continue;
 384                         }
 385                         fmt++;
 386                         switch (*fmt) {
 387                         case '%': fputc(*fmt, stderr); break;   // NOTEST
 388                         default: fputc('?', stderr); break;     // NOTEST
 389                         case '1':
 390                                 type_print(t1, stderr);
 391                                 break;
 392                         case '2':
 393                                 type_print(t2, stderr);
 394                                 break;
 395                         ## format cases
 396                         }
 397                 }
 398                 fputs("\n", stderr);
 399                 c->parse_error += 1;
 400         }
 401
 402         static void tok_err(struct parse_context *c, char *fmt, struct token *t)
 403         {
 404                 fprintf(stderr, "%s:%d:%d: %s: %.*s\n", c->file_name, t->line, t->col, fmt,
 405                         t->txt.len, t->txt.txt);
 406                 c->parse_error += 1;
 407         }
 408
 409 ## Entities: declared and predeclared.
 410
 411 There are various "things" that the language and/or the interpreter
 412 needs to know about to parse and execute a program.  These include
 413 types, variables, values, and executable code.  These are all lumped
 414 together under the term "entities" (calling them "objects" would be
 415 confusing) and introduced here.  The following section will present the
 416 different specific code elements which comprise or manipulate these
 417 various entities.
 418
 419 ### Executables
 420
 421 Executables can be lots of different things.  In many cases an
 422 executable is just an operation combined with one or two other
 423 executables.  This allows for expressions and lists etc.  Other times an
 424 executable is something quite specific like a constant or variable name.
 425 So we define a `struct exec` to be a general executable with a type, and
 426 a `struct binode` which is a subclass of `exec`, forms a node in a
 427 binary tree, and holds an operation.  There will be other subclasses,
 428 and to access these we need to be able to `cast` the `exec` into the
 429 various other types.  The first field in any `struct exec` is the type
 430 from the `exec_types` enum.
 431
 432 ###### macros
 433         #define cast(structname, pointer) ({            \
 434                 const typeof( ((struct structname *)0)->type) *__mptr = &(pointer)->type; \
 435                 if (__mptr && *__mptr != X##structname) abort();                \
 436                 (struct structname *)( (char *)__mptr);})
 437
 438         #define new(structname) ({                                              \
 439                 struct structname *__ptr = ((struct structname *)calloc(1,sizeof(struct structname))); \
 440                 __ptr->type = X##structname;                                            \
 441                 __ptr->line = -1; __ptr->column = -1;                                   \
 442                 __ptr;})
 443
 444         #define new_pos(structname, token) ({                                           \
 445                 struct structname *__ptr = ((struct structname *)calloc(1,sizeof(struct structname))); \
 446                 __ptr->type = X##structname;                                            \
 447                 __ptr->line = token.line; __ptr->column = token.col;                    \
 448                 __ptr;})
 449
 450 ###### ast
 451         enum exec_types {
 452                 Xbinode,
 453                 ## exec type
 454         };
 455         struct exec {
 456                 enum exec_types type;
 457                 int line, column;
 458                 ## exec fields
 459         };
 460         struct binode {
 461                 struct exec;
 462                 enum Btype {
 463                         ## Binode types
 464                 } op;
 465                 struct exec *left, *right;
 466         };
 467
 468 ###### ast functions
 469
 470         static int __fput_loc(struct exec *loc, FILE *f)
 471         {
 472                 if (!loc)
 473                         return 0;
 474                 if (loc->line >= 0) {
 475                         fprintf(f, "%d:%d: ", loc->line, loc->column);
 476                         return 1;
 477                 }
 478                 if (loc->type == Xbinode)
 479                         return __fput_loc(cast(binode,loc)->left, f) ||
 480                                __fput_loc(cast(binode,loc)->right, f);  // NOTEST
 481                 return 0;
 482         }
 483         static void fput_loc(struct exec *loc, FILE *f)
 484         {
 485                 if (!__fput_loc(loc, f))
 486                         fprintf(f, "??:??: ");
 487         }
 488
 489 Each different type of `exec` node needs a number of functions defined,
 490 a bit like methods.  We must be able to free it, print it, analyse it
 491 and execute it.  Once we have specific `exec` types we will need to
 492 parse them too.  Let's take this a bit more slowly.
 493
 494 #### Freeing
 495
 496 The parser generator requires a `free_foo` function for each struct
 497 that stores attributes and they will often be `exec`s and subtypes
 498 there-of.  So we need `free_exec` which can handle all the subtypes,
 499 and we need `free_binode`.
 500
 501 ###### ast functions
 502
 503         static void free_binode(struct binode *b)
 504         {
 505                 if (!b)
 506                         return;
 507                 free_exec(b->left);
 508                 free_exec(b->right);
 509                 free(b);
 510         }
 511
 512 ###### core functions
 513         static void free_exec(struct exec *e)
 514         {
 515                 if (!e)
 516                         return;
 517                 switch(e->type) {
 518                         ## free exec cases
 519                 }
 520         }
 521
 522 ###### forward decls
 523
 524         static void free_exec(struct exec *e);
 525
 526 ###### free exec cases
 527         case Xbinode: free_binode(cast(binode, e)); break;
 528
 529 #### Printing
 530
 531 Printing an `exec` requires that we know the current indent level for
 532 printing line-oriented components.  As will become clear later, we
 533 also want to know what sort of bracketing to use.
 534
 535 ###### ast functions
 536
 537         static void do_indent(int i, char *str)
 538         {
 539                 while (i-- > 0)
 540                         printf("    ");
 541                 printf("%s", str);
 542         }
 543
 544 ###### core functions
 545         static void print_binode(struct binode *b, int indent, int bracket)
 546         {
 547                 struct binode *b2;
 548                 switch(b->op) {
 549                 ## print binode cases
 550                 }
 551         }
 552
 553         static void print_exec(struct exec *e, int indent, int bracket)
 554         {
 555                 if (!e)
 556                         return;
 557                 switch (e->type) {
 558                 case Xbinode:
 559                         print_binode(cast(binode, e), indent, bracket); break;
 560                 ## print exec cases
 561                 }
 562                 if (e->to_free) {
 563                         struct variable *v;
 564                         do_indent(indent, "/* FREE");
 565                         for (v = e->to_free; v; v = v->next_free) {
 566                                 printf(" %.*s", v->name->name.len, v->name->name.txt);
 567                                 printf("[%d,%d]", v->scope_start, v->scope_end);
 568                                 if (v->frame_pos >= 0)
 569                                         printf("(%d+%d)", v->frame_pos,
 570                                                v->type ? v->type->size:0);
 571                         }
 572                         printf(" */\n");
 573                 }
 574         }
 575
 576 ###### forward decls
 577
 578         static void print_exec(struct exec *e, int indent, int bracket);
 579
 580 #### Analysing
 581
 582 As discussed, analysis involves propagating type requirements around the
 583 program and looking for errors.
 584
 585 So `propagate_types` is passed an expected type (being a `struct type`
 586 pointer together with some `val_rules` flags) that the `exec` is
 587 expected to return, and returns the type that it does return, either of
 588 which can be `NULL` signifying "unknown".  A `prop_err` flag set is
 589 passed by reference.  It has `Efail` set when an error is found, and
 590 `Eretry` when the type for some element is set via propagation.  If
 591 any expression cannot be evaluated immediately, `Enoconst` is set.
 592 If the expression can be copied, `Emaycopy` is set.
 593
 594 If it remains unchanged at `0`, then no more propagation is needed.
 595
 596 ###### ast
 597
 598         enum val_rules {Rnolabel = 1<<0, Rboolok = 1<<1, Rnoconstant = 1<<2};
 599         enum prop_err {Efail = 1<<0, Eretry = 1<<1, Enoconst = 1<<2,
 600                        Emaycopy = 1<<3};
 601
 602 ###### format cases
 603         case 'r':
 604                 if (rules & Rnolabel)
 605                         fputs(" (labels not permitted)", stderr);
 606                 break;
 607
 608 ###### forward decls
 609         static struct type *propagate_types(struct exec *prog, struct parse_context *c, enum prop_err *perr,
 610                                             struct type *type, int rules);
 611 ###### core functions
 612
 613         static struct type *__propagate_types(struct exec *prog, struct parse_context *c, enum prop_err *perr,
 614                                               struct type *type, int rules)
 615         {
 616                 struct type *t;
 617
 618                 if (!prog)
 619                         return Tnone;
 620
 621                 switch (prog->type) {
 622                 case Xbinode:
 623                 {
 624                         struct binode *b = cast(binode, prog);
 625                         switch (b->op) {
 626                         ## propagate binode cases
 627                         }
 628                         break;
 629                 }
 630                 ## propagate exec cases
 631                 }
 632                 return Tnone;
 633         }
 634
 635         static struct type *propagate_types(struct exec *prog, struct parse_context *c, enum prop_err *perr,
 636                                             struct type *type, int rules)
 637         {
 638                 int pre_err = c->parse_error;
 639                 struct type *ret = __propagate_types(prog, c, perr, type, rules);
 640
 641                 if (c->parse_error > pre_err)
 642                         *perr |= Efail;
 643                 return ret;
 644         }
 645
 646 #### Interpreting
 647
 648 Interpreting an `exec` doesn't require anything but the `exec`.  State
 649 is stored in variables and each variable will be directly linked from
 650 within the `exec` tree.  The exception to this is the `main` function
 651 which needs to look at command line arguments.  This function will be
 652 interpreted separately.
 653
 654 Each `exec` can return a value combined with a type in `struct lrval`.
 655 The type may be `Tnone` but must be non-NULL.  Some `exec`s will return
 656 the location of a value, which can be updated, in `lval`.  Others will
 657 set `lval` to NULL indicating that there is a value of appropriate type
 658 in `rval`.
 659
 660 ###### forward decls
 661         static struct value interp_exec(struct parse_context *c, struct exec *e,
 662                                         struct type **typeret);
 663 ###### core functions
 664
 665         struct lrval {
 666                 struct type *type;
 667                 struct value rval, *lval;
 668         };
 669
 670         /* If dest is passed, dtype must give the expected type, and
 671          * result can go there, in which case type is returned as NULL.
 672          */
 673         static struct lrval _interp_exec(struct parse_context *c, struct exec *e,
 674                                          struct value *dest, struct type *dtype);
 675
 676         static struct value interp_exec(struct parse_context *c, struct exec *e,
 677                                         struct type **typeret)
 678         {
 679                 struct lrval ret = _interp_exec(c, e, NULL, NULL);
 680
 681                 if (!ret.type) abort();
 682                 if (typeret)
 683                         *typeret = ret.type;
 684                 if (ret.lval)
 685                         dup_value(ret.type, ret.lval, &ret.rval);
 686                 return ret.rval;
 687         }
 688
 689         static struct value *linterp_exec(struct parse_context *c, struct exec *e,
 690                                           struct type **typeret)
 691         {
 692                 struct lrval ret = _interp_exec(c, e, NULL, NULL);
 693
 694                 if (!ret.type) abort();
 695                 if (ret.lval)
 696                         *typeret = ret.type;
 697                 else
 698                         free_value(ret.type, &ret.rval);
 699                 return ret.lval;
 700         }
 701
 702         /* dinterp_exec is used when the destination type is certain and
 703          * the value has a place to go.
 704          */
 705         static void dinterp_exec(struct parse_context *c, struct exec *e,
 706                                  struct value *dest, struct type *dtype,
 707                                  int need_free)
 708         {
 709                 struct lrval ret = _interp_exec(c, e, dest, dtype);
 710                 if (!ret.type)
 711                         return;
 712                 if (need_free)
 713                         free_value(dtype, dest);
 714                 if (ret.lval)
 715                         dup_value(dtype, ret.lval, dest);
 716                 else
 717                         memcpy(dest, &ret.rval, dtype->size);
 718         }
 719
 720         static struct lrval _interp_exec(struct parse_context *c, struct exec *e,
 721                                          struct value *dest, struct type *dtype)
 722         {
 723                 /* If the result is copied to dest, ret.type is set to NULL */
 724                 struct lrval ret;
 725                 struct value rv = {}, *lrv = NULL;
 726                 struct type *rvtype;
 727
 728                 rvtype = ret.type = Tnone;
 729                 if (!e) {
 730                         ret.lval = lrv;
 731                         ret.rval = rv;
 732                         return ret;
 733                 }
 734
 735                 switch(e->type) {
 736                 case Xbinode:
 737                 {
 738                         struct binode *b = cast(binode, e);
 739                         struct value left, right, *lleft;
 740                         struct type *ltype, *rtype;
 741                         ltype = rtype = Tnone;
 742                         switch (b->op) {
 743                         ## interp binode cases
 744                         }
 745                         free_value(ltype, &left);
 746                         free_value(rtype, &right);
 747                         break;
 748                 }
 749                 ## interp exec cases
 750                 }
 751                 if (rvtype) {
 752                         ret.lval = lrv;
 753                         ret.rval = rv;
 754                         ret.type = rvtype;
 755                 }
 756                 ## interp exec cleanup
 757                 return ret;
 758         }
 759
 760 ### Types
 761
 762 Values come in a wide range of types, with more likely to be added.
 763 Each type needs to be able to print its own values (for convenience at
 764 least) as well as to compare two values, at least for equality and
 765 possibly for order.  For now, values might need to be duplicated and
 766 freed, though eventually such manipulations will be better integrated
 767 into the language.
 768
 769 Rather than requiring every numeric type to support all numeric
 770 operations (add, multiply, etc), we allow types to be able to present
 771 as one of a few standard types: integer, float, and fraction.  The
 772 existence of these conversion functions eventually enable types to
 773 determine if they are compatible with other types, though such types
 774 have not yet been implemented.
 775
 776 Named type are stored in a simple linked list.  Objects of each type are
 777 "values" which are often passed around by value.
 778
 779 There are both explicitly named types, and anonymous types.  Anonymous
 780 cannot be accessed by name, but are used internally and have a name
 781 which might be reported in error messages.
 782
 783 ###### ast
 784
 785         struct value {
 786                 union {
 787                         char ptr[1];
 788                         ## value union fields
 789                 };
 790         };
 791
 792         struct type {
 793                 struct text name;
 794                 struct type *next;
 795                 struct token first_use;
 796                 int size, align;
 797                 int anon;
 798                 void (*init)(struct type *type, struct value *val);
 799                 int (*prepare_type)(struct parse_context *c, struct type *type, int parse_time);
 800                 void (*print)(struct type *type, struct value *val, FILE *f);
 801                 void (*print_type)(struct type *type, FILE *f);
 802                 int (*cmp_order)(struct type *t1, struct type *t2,
 803                                  struct value *v1, struct value *v2);
 804                 int (*cmp_eq)(struct type *t1, struct type *t2,
 805                               struct value *v1, struct value *v2);
 806                 void (*dup)(struct type *type, struct value *vold, struct value *vnew);
 807                 void (*free)(struct type *type, struct value *val);
 808                 void (*free_type)(struct type *t);
 809                 long long (*to_int)(struct value *v);
 810                 double (*to_float)(struct value *v);
 811                 int (*to_mpq)(mpq_t *q, struct value *v);
 812                 ## type functions
 813                 union {
 814                         ## type union fields
 815                 };
 816         };
 817
 818 ###### parse context
 819
 820         struct type *typelist;
 821
 822 ###### includes
 823         #include <stdarg.h>
 824
 825 ###### ast functions
 826
 827         static struct type *find_type(struct parse_context *c, struct text s)
 828         {
 829                 struct type *t = c->typelist;
 830
 831                 while (t && (t->anon ||
 832                              text_cmp(t->name, s) != 0))
 833                                 t = t->next;
 834                 return t;
 835         }
 836
 837         static struct type *_add_type(struct parse_context *c, struct text s,
 838                                      struct type *proto, int anon)
 839         {
 840                 struct type *n;
 841
 842                 n = calloc(1, sizeof(*n));
 843                 if (proto)
 844                         *n = *proto;
 845                 else
 846                         n->size = -1;
 847                 n->name = s;
 848                 n->anon = anon;
 849                 n->next = c->typelist;
 850                 c->typelist = n;
 851                 return n;
 852         }
 853
 854         static struct type *add_type(struct parse_context *c, struct text s,
 855                                       struct type *proto)
 856         {
 857                 return _add_type(c, s, proto, 0);
 858         }
 859
 860         static struct type *add_anon_type(struct parse_context *c,
 861                                           struct type *proto, char *name, ...)
 862         {
 863                 struct text t;
 864                 va_list ap;
 865
 866                 va_start(ap, name);
 867                 vasprintf(&t.txt, name, ap);
 868                 va_end(ap);
 869                 t.len = strlen(name);
 870                 return _add_type(c, t, proto, 1);
 871         }
 872
 873         static void free_type(struct type *t)
 874         {
 875                 /* The type is always a reference to something in the
 876                  * context, so we don't need to free anything.
 877                  */
 878         }
 879
 880         static void free_value(struct type *type, struct value *v)
 881         {
 882                 if (type && v) {
 883                         type->free(type, v);
 884                         memset(v, 0x5a, type->size);
 885                 }
 886         }
 887
 888         static void type_print(struct type *type, FILE *f)
 889         {
 890                 if (!type)
 891                         fputs("*unknown*type*", f);     // NOTEST
 892                 else if (type->name.len && !type->anon)
 893                         fprintf(f, "%.*s", type->name.len, type->name.txt);
 894                 else if (type->print_type)
 895                         type->print_type(type, f);
 896                 else
 897                         fputs("*invalid*type*", f);
 898         }
 899
 900         static void val_init(struct type *type, struct value *val)
 901         {
 902                 if (type && type->init)
 903                         type->init(type, val);
 904         }
 905
 906         static void dup_value(struct type *type,
 907                               struct value *vold, struct value *vnew)
 908         {
 909                 if (type && type->dup)
 910                         type->dup(type, vold, vnew);
 911         }
 912
 913         static int value_cmp(struct type *tl, struct type *tr,
 914                              struct value *left, struct value *right)
 915         {
 916                 if (tl && tl->cmp_order)
 917                         return tl->cmp_order(tl, tr, left, right);
 918                 if (tl && tl->cmp_eq)                   // NOTEST
 919                         return tl->cmp_eq(tl, tr, left, right); // NOTEST
 920                 return -1;                              // NOTEST
 921         }
 922
 923         static void print_value(struct type *type, struct value *v, FILE *f)
 924         {
 925                 if (type && type->print)
 926                         type->print(type, v, f);
 927                 else
 928                         fprintf(f, "*Unknown*");                // NOTEST
 929         }
 930
 931         static void prepare_types(struct parse_context *c)
 932         {
 933                 struct type *t;
 934                 int retry = 1;
 935                 enum { none, some, cannot } progress = none;
 936
 937                 while (retry) {
 938                         retry = 0;
 939
 940                         for (t = c->typelist; t; t = t->next) {
 941                                 if (t->size < 0)
 942                                         tok_err(c, "error: type used but not declared",
 943                                                  &t->first_use);
 944                                 if (t->size == 0 && t->prepare_type) {
 945                                         if (t->prepare_type(c, t, 1))
 946                                                 progress = some;
 947                                         else if (progress == cannot)
 948                                                 tok_err(c, "error: type has recursive definition",
 949                                                         &t->first_use);
 950                                         else
 951                                                 retry = 1;
 952                                 }
 953                         }
 954                         switch (progress) {
 955                         case cannot:
 956                                 retry = 0; break;
 957                         case none:
 958                                 progress = cannot; break;
 959                         case some:
 960                                 progress = none; break;
 961                         }
 962                 }
 963         }
 964
 965 ###### forward decls
 966
 967         static void free_value(struct type *type, struct value *v);
 968         static int type_compat(struct type *require, struct type *have, int rules);
 969         static void type_print(struct type *type, FILE *f);
 970         static void val_init(struct type *type, struct value *v);
 971         static void dup_value(struct type *type,
 972                               struct value *vold, struct value *vnew);
 973         static int value_cmp(struct type *tl, struct type *tr,
 974                              struct value *left, struct value *right);
 975         static void print_value(struct type *type, struct value *v, FILE *f);
 976
 977 ###### free context types
 978
 979         while (context.typelist) {
 980                 struct type *t = context.typelist;
 981
 982                 context.typelist = t->next;
 983                 if (t->free_type)
 984                         t->free_type(t);
 985                 if (t->anon)
 986                         free(t->name.txt);
 987                 free(t);
 988         }
 989
 990 Type can be specified for local variables, for fields in a structure,
 991 for formal parameters to functions, and possibly elsewhere.  Different
 992 rules may apply in different contexts.  As a minimum, a named type may
 993 always be used.  Currently the type of a formal parameter can be
 994 different from types in other contexts, so we have a separate grammar
 995 symbol for those.
 996
 997 ###### Grammar
 998
 999         $*type
1000         Type -> IDENTIFIER ${
1001                 $0 = find_type(c, $ID.txt);
1002                 if (!$0) {
1003                         $0 = add_type(c, $ID.txt, NULL);
1004                         $0->first_use = $ID;
1005                 }
1006         }$
1007         ## type grammar
1008
1009         FormalType -> Type ${ $0 = $<1; }$
1010         ## formal type grammar
1011
1012 #### Base Types
1013
1014 Values of the base types can be numbers, which we represent as
1015 multi-precision fractions, strings, Booleans and labels.  When
1016 analysing the program we also need to allow for places where no value
1017 is meaningful (type `Tnone`) and where we don't know what type to
1018 expect yet (type is `NULL`).
1019
1020 Values are never shared, they are always copied when used, and freed
1021 when no longer needed.
1022
1023 When propagating type information around the program, we need to
1024 determine if two types are compatible, where type `NULL` is compatible
1025 with anything.  There are two special cases with type compatibility,
1026 both related to the Conditional Statement which will be described
1027 later.  In some cases a Boolean can be accepted as well as some other
1028 primary type, and in others any type is acceptable except a label (`Vlabel`).
1029 A separate function encoding these cases will simplify some code later.
1030
1031 ###### type functions
1032
1033         int (*compat)(struct type *this, struct type *other);
1034
1035 ###### ast functions
1036
1037         static int type_compat(struct type *require, struct type *have, int rules)
1038         {
1039                 if ((rules & Rboolok) && have == Tbool)
1040                         return 1;       // NOTEST
1041                 if ((rules & Rnolabel) && have == Tlabel)
1042                         return 0;       // NOTEST
1043                 if (!require || !have)
1044                         return 1;
1045
1046                 if (require->compat)
1047                         return require->compat(require, have);
1048
1049                 return require == have;
1050         }
1051
1052 ###### includes
1053         #include <gmp.h>
1054         #include "parse_string.h"
1055         #include "parse_number.h"
1056
1057 ###### libs
1058         myLDLIBS := libnumber.o libstring.o -lgmp
1059         LDLIBS := $(filter-out $(myLDLIBS),$(LDLIBS)) $(myLDLIBS)
1060
1061 ###### type union fields
1062         enum vtype {Vnone, Vstr, Vnum, Vbool, Vlabel} vtype;
1063
1064 ###### value union fields
1065         struct text str;
1066         mpq_t num;
1067         unsigned char bool;
1068         void *label;
1069
1070 ###### ast functions
1071         static void _free_value(struct type *type, struct value *v)
1072         {
1073                 if (!v)
1074                         return;         // NOTEST
1075                 switch (type->vtype) {
1076                 case Vnone: break;
1077                 case Vstr: free(v->str.txt); break;
1078                 case Vnum: mpq_clear(v->num); break;
1079                 case Vlabel:
1080                 case Vbool: break;
1081                 }
1082         }
1083
1084 ###### value functions
1085
1086         static void _val_init(struct type *type, struct value *val)
1087         {
1088                 switch(type->vtype) {
1089                 case Vnone:             // NOTEST
1090                         break;          // NOTEST
1091                 case Vnum:
1092                         mpq_init(val->num); break;
1093                 case Vstr:
1094                         val->str.txt = malloc(1);
1095                         val->str.len = 0;
1096                         break;
1097                 case Vbool:
1098                         val->bool = 0;
1099                         break;
1100                 case Vlabel:
1101                         val->label = NULL;
1102                         break;
1103                 }
1104         }
1105
1106         static void _dup_value(struct type *type,
1107                                struct value *vold, struct value *vnew)
1108         {
1109                 switch (type->vtype) {
1110                 case Vnone:             // NOTEST
1111                         break;          // NOTEST
1112                 case Vlabel:
1113                         vnew->label = vold->label;
1114                         break;
1115                 case Vbool:
1116                         vnew->bool = vold->bool;
1117                         break;
1118                 case Vnum:
1119                         mpq_init(vnew->num);
1120                         mpq_set(vnew->num, vold->num);
1121                         break;
1122                 case Vstr:
1123                         vnew->str.len = vold->str.len;
1124                         vnew->str.txt = malloc(vnew->str.len);
1125                         memcpy(vnew->str.txt, vold->str.txt, vnew->str.len);
1126                         break;
1127                 }
1128         }
1129
1130         static int _value_cmp(struct type *tl, struct type *tr,
1131                               struct value *left, struct value *right)
1132         {
1133                 int cmp;
1134                 if (tl != tr)
1135                         return tl - tr; // NOTEST
1136                 switch (tl->vtype) {
1137                 case Vlabel: cmp = left->label == right->label ? 0 : 1; break;
1138                 case Vnum: cmp = mpq_cmp(left->num, right->num); break;
1139                 case Vstr: cmp = text_cmp(left->str, right->str); break;
1140                 case Vbool: cmp = left->bool - right->bool; break;
1141                 case Vnone: cmp = 0;                    // NOTEST
1142                 }
1143                 return cmp;
1144         }
1145
1146         static void _print_value(struct type *type, struct value *v, FILE *f)
1147         {
1148                 switch (type->vtype) {
1149                 case Vnone:                             // NOTEST
1150                         fprintf(f, "*no-value*"); break;        // NOTEST
1151                 case Vlabel:                            // NOTEST
1152                         fprintf(f, "*label-%p*", v->label); break; // NOTEST
1153                 case Vstr:
1154                         fprintf(f, "%.*s", v->str.len, v->str.txt); break;
1155                 case Vbool:
1156                         fprintf(f, "%s", v->bool ? "True":"False"); break;
1157                 case Vnum:
1158                         {
1159                         mpf_t fl;
1160                         mpf_init2(fl, 20);
1161                         mpf_set_q(fl, v->num);
1162                         gmp_fprintf(f, "%.10Fg", fl);
1163                         mpf_clear(fl);
1164                         break;
1165                         }
1166                 }
1167         }
1168
1169         static void _free_value(struct type *type, struct value *v);
1170
1171         static struct type base_prototype = {
1172                 .init = _val_init,
1173                 .print = _print_value,
1174                 .cmp_order = _value_cmp,
1175                 .cmp_eq = _value_cmp,
1176                 .dup = _dup_value,
1177                 .free = _free_value,
1178         };
1179
1180         static struct type *Tbool, *Tstr, *Tnum, *Tnone, *Tlabel;
1181
1182 ###### ast functions
1183         static struct type *add_base_type(struct parse_context *c, char *n,
1184                                           enum vtype vt, int size)
1185         {
1186                 struct text txt = { n, strlen(n) };
1187                 struct type *t;
1188
1189                 t = add_type(c, txt, &base_prototype);
1190                 t->vtype = vt;
1191                 t->size = size;
1192                 t->align = size > sizeof(void*) ? sizeof(void*) : size;
1193                 if (t->size & (t->align - 1))
1194                         t->size = (t->size | (t->align - 1)) + 1;       // NOTEST
1195                 return t;
1196         }
1197
1198 ###### context initialization
1199
1200         Tbool  = add_base_type(&context, "Boolean", Vbool, sizeof(char));
1201         Tstr   = add_base_type(&context, "string", Vstr, sizeof(struct text));
1202         Tnum   = add_base_type(&context, "number", Vnum, sizeof(mpq_t));
1203         Tnone  = add_base_type(&context, "none", Vnone, 0);
1204         Tlabel = add_base_type(&context, "label", Vlabel, sizeof(void*));
1205
1206 ##### Base Values
1207
1208 We have already met values as separate objects.  When manifest constants
1209 appear in the program text, that must result in an executable which has
1210 a constant value.  So the `val` structure embeds a value in an
1211 executable.
1212
1213 ###### exec type
1214         Xval,
1215
1216 ###### ast
1217         struct val {
1218                 struct exec;
1219                 struct type *vtype;
1220                 struct value val;
1221         };
1222
1223 ###### ast functions
1224         struct val *new_val(struct type *T, struct token tk)
1225         {
1226                 struct val *v = new_pos(val, tk);
1227                 v->vtype = T;
1228                 return v;
1229         }
1230
1231 ###### Grammar
1232
1233         $TERM True False
1234
1235         $*val
1236         Value ->  True ${
1237                 $0 = new_val(Tbool, $1);
1238                 $0->val.bool = 1;
1239         }$
1240         | False ${
1241                 $0 = new_val(Tbool, $1);
1242                 $0->val.bool = 0;
1243         }$
1244         | NUMBER ${ {
1245                 char tail[3];
1246                 $0 = new_val(Tnum, $1);
1247                 if (number_parse($0->val.num, tail, $1.txt) == 0)
1248                         mpq_init($0->val.num);  // UNTESTED
1249                         if (tail[0])
1250                                 tok_err(c, "error: unsupported number suffix",
1251                                         &$1);
1252         } }$
1253         | STRING ${ {
1254                 char tail[3];
1255                 $0 = new_val(Tstr, $1);
1256                 string_parse(&$1, '\\', &$0->val.str, tail);
1257                 if (tail[0])
1258                         tok_err(c, "error: unsupported string suffix",
1259                                 &$1);
1260         } }$
1261         | MULTI_STRING ${ {
1262                 char tail[3];
1263                 $0 = new_val(Tstr, $1);
1264                 string_parse(&$1, '\\', &$0->val.str, tail);
1265                 if (tail[0])
1266                         tok_err(c, "error: unsupported string suffix",
1267                                 &$1);
1268         } }$
1269
1270 ###### print exec cases
1271         case Xval:
1272         {
1273                 struct val *v = cast(val, e);
1274                 if (v->vtype == Tstr)
1275                         printf("\"");
1276                 // FIXME how to ensure numbers have same precision.
1277                 print_value(v->vtype, &v->val, stdout);
1278                 if (v->vtype == Tstr)
1279                         printf("\"");
1280                 break;
1281         }
1282
1283 ###### propagate exec cases
1284         case Xval:
1285         {
1286                 struct val *val = cast(val, prog);
1287                 if (!type_compat(type, val->vtype, rules))
1288                         type_err(c, "error: expected %1%r found %2",
1289                                    prog, type, rules, val->vtype);
1290                 return val->vtype;
1291         }
1292
1293 ###### interp exec cases
1294         case Xval:
1295                 rvtype = cast(val, e)->vtype;
1296                 dup_value(rvtype, &cast(val, e)->val, &rv);
1297                 break;
1298
1299 ###### ast functions
1300         static void free_val(struct val *v)
1301         {
1302                 if (v)
1303                         free_value(v->vtype, &v->val);
1304                 free(v);
1305         }
1306
1307 ###### free exec cases
1308         case Xval: free_val(cast(val, e)); break;
1309
1310 ###### ast functions
1311         // Move all nodes from 'b' to 'rv', reversing their order.
1312         // In 'b' 'left' is a list, and 'right' is the last node.
1313         // In 'rv', left' is the first node and 'right' is a list.
1314         static struct binode *reorder_bilist(struct binode *b)
1315         {
1316                 struct binode *rv = NULL;
1317
1318                 while (b) {
1319                         struct exec *t = b->right;
1320                         b->right = rv;
1321                         rv = b;
1322                         if (b->left)
1323                                 b = cast(binode, b->left);
1324                         else
1325                                 b = NULL;
1326                         rv->left = t;
1327                 }
1328                 return rv;
1329         }
1330
1331 ### Variables
1332
1333 Variables are scoped named values.  We store the names in a linked list
1334 of "bindings" sorted in lexical order, and use sequential search and
1335 insertion sort.
1336
1337 ###### ast
1338
1339         struct binding {
1340                 struct text name;
1341                 struct binding *next;   // in lexical order
1342                 ## binding fields
1343         };
1344
1345 This linked list is stored in the parse context so that "reduce"
1346 functions can find or add variables, and so the analysis phase can
1347 ensure that every variable gets a type.
1348
1349 ###### parse context
1350
1351         struct binding *varlist;  // In lexical order
1352
1353 ###### ast functions
1354
1355         static struct binding *find_binding(struct parse_context *c, struct text s)
1356         {
1357                 struct binding **l = &c->varlist;
1358                 struct binding *n;
1359                 int cmp = 1;
1360
1361                 while (*l &&
1362                         (cmp = text_cmp((*l)->name, s)) < 0)
1363                                 l = & (*l)->next;
1364                 if (cmp == 0)
1365                         return *l;
1366                 n = calloc(1, sizeof(*n));
1367                 n->name = s;
1368                 n->next = *l;
1369                 *l = n;
1370                 return n;
1371         }
1372
1373 Each name can be linked to multiple variables defined in different
1374 scopes.  Each scope starts where the name is declared and continues
1375 until the end of the containing code block.  Scopes of a given name
1376 cannot nest, so a declaration while a name is in-scope is an error.
1377
1378 ###### binding fields
1379         struct variable *var;
1380
1381 ###### ast
1382         struct variable {
1383                 struct variable *previous;
1384                 struct type *type;
1385                 struct binding *name;
1386                 struct exec *where_decl;// where name was declared
1387                 struct exec *where_set; // where type was set
1388                 ## variable fields
1389         };
1390
1391 When a scope closes, the values of the variables might need to be freed.
1392 This happens in the context of some `struct exec` and each `exec` will
1393 need to know which variables need to be freed when it completes.
1394
1395 ####### exec fields
1396         struct variable *to_free;
1397
1398 ####### variable fields
1399         struct exec *cleanup_exec;
1400         struct variable *next_free;
1401
1402 ####### interp exec cleanup
1403         {
1404                 struct variable *v;
1405                 for (v = e->to_free; v; v = v->next_free) {
1406                         struct value *val = var_value(c, v);
1407                         free_value(v->type, val);
1408                 }
1409         }
1410
1411 ###### ast functions
1412         static void variable_unlink_exec(struct variable *v)
1413         {
1414                 struct variable **vp;
1415                 if (!v->cleanup_exec)
1416                         return;
1417                 for (vp = &v->cleanup_exec->to_free;
1418                     *vp; vp = &(*vp)->next_free) {
1419                         if (*vp != v)
1420                                 continue;
1421                         *vp = v->next_free;
1422                         v->cleanup_exec = NULL;
1423                         break;
1424                 }
1425         }
1426
1427 While the naming seems strange, we include local constants in the
1428 definition of variables.  A name declared `var := value` can
1429 subsequently be changed, but a name declared `var ::= value` cannot -
1430 it is constant
1431
1432 ###### variable fields
1433         int constant;
1434
1435 Scopes in parallel branches can be partially merged.  More
1436 specifically, if a given name is declared in both branches of an
1437 if/else then its scope is a candidate for merging.  Similarly if
1438 every branch of an exhaustive switch (e.g. has an "else" clause)
1439 declares a given name, then the scopes from the branches are
1440 candidates for merging.
1441
1442 Note that names declared inside a loop (which is only parallel to
1443 itself) are never visible after the loop.  Similarly names defined in
1444 scopes which are not parallel, such as those started by `for` and
1445 `switch`, are never visible after the scope.  Only variables defined in
1446 both `then` and `else` (including the implicit then after an `if`, and
1447 excluding `then` used with `for`) and in all `case`s and `else` of a
1448 `switch` or `while` can be visible beyond the `if`/`switch`/`while`.
1449
1450 Labels, which are a bit like variables, follow different rules.
1451 Labels are not explicitly declared, but if an undeclared name appears
1452 in a context where a label is legal, that effectively declares the
1453 name as a label.  The declaration remains in force (or in scope) at
1454 least to the end of the immediately containing block and conditionally
1455 in any larger containing block which does not declare the name in some
1456 other way.  Importantly, the conditional scope extension happens even
1457 if the label is only used in one parallel branch of a conditional --
1458 when used in one branch it is treated as having been declared in all
1459 branches.
1460
1461 Merge candidates are tentatively visible beyond the end of the
1462 branching statement which creates them.  If the name is used, the
1463 merge is affirmed and they become a single variable visible at the
1464 outer layer.  If not - if it is redeclared first - the merge lapses.
1465
1466 To track scopes we have an extra stack, implemented as a linked list,
1467 which roughly parallels the parse stack and which is used exclusively
1468 for scoping.  When a new scope is opened, a new frame is pushed and
1469 the child-count of the parent frame is incremented.  This child-count
1470 is used to distinguish between the first of a set of parallel scopes,
1471 in which declared variables must not be in scope, and subsequent
1472 branches, whether they may already be conditionally scoped.
1473
1474 We need a total ordering of scopes so we can easily compare to variables
1475 to see if they are concurrently in scope.  To achieve this we record a
1476 `scope_count` which is actually a count of both beginnings and endings
1477 of scopes.  Then each variable has a record of the scope count where it
1478 enters scope, and where it leaves.
1479
1480 To push a new frame *before* any code in the frame is parsed, we need a
1481 grammar reduction.  This is most easily achieved with a grammar
1482 element which derives the empty string, and creates the new scope when
1483 it is recognised.  This can be placed, for example, between a keyword
1484 like "if" and the code following it.
1485
1486 ###### ast
1487         struct scope {
1488                 struct scope *parent;
1489                 int child_count;
1490         };
1491
1492 ###### parse context
1493         int scope_depth;
1494         int scope_count;
1495         struct scope *scope_stack;
1496
1497 ###### variable fields
1498         int scope_start, scope_end;
1499
1500 ###### ast functions
1501         static void scope_pop(struct parse_context *c)
1502         {
1503                 struct scope *s = c->scope_stack;
1504
1505                 c->scope_stack = s->parent;
1506                 free(s);
1507                 c->scope_depth -= 1;
1508                 c->scope_count += 1;
1509         }
1510
1511         static void scope_push(struct parse_context *c)
1512         {
1513                 struct scope *s = calloc(1, sizeof(*s));
1514                 if (c->scope_stack)
1515                         c->scope_stack->child_count += 1;
1516                 s->parent = c->scope_stack;
1517                 c->scope_stack = s;
1518                 c->scope_depth += 1;
1519                 c->scope_count += 1;
1520         }
1521
1522 ###### Grammar
1523
1524         $void
1525         OpenScope -> ${ scope_push(c); }$
1526
1527 Each variable records a scope depth and is in one of four states:
1528
1529 - "in scope".  This is the case between the declaration of the
1530   variable and the end of the containing block, and also between
1531   the usage with affirms a merge and the end of that block.
1532
1533   The scope depth is not greater than the current parse context scope
1534   nest depth.  When the block of that depth closes, the state will
1535   change.  To achieve this, all "in scope" variables are linked
1536   together as a stack in nesting order.
1537
1538 - "pending".  The "in scope" block has closed, but other parallel
1539   scopes are still being processed.  So far, every parallel block at
1540   the same level that has closed has declared the name.
1541
1542   The scope depth is the depth of the last parallel block that
1543   enclosed the declaration, and that has closed.
1544
1545 - "conditionally in scope".  The "in scope" block and all parallel
1546   scopes have closed, and no further mention of the name has been seen.
1547   This state includes a secondary nest depth (`min_depth`) which records
1548   the outermost scope seen since the variable became conditionally in
1549   scope.  If a use of the name is found, the variable becomes "in scope"
1550   and that secondary depth becomes the recorded scope depth.  If the
1551   name is declared as a new variable, the old variable becomes "out of
1552   scope" and the recorded scope depth stays unchanged.
1553
1554 - "out of scope".  The variable is neither in scope nor conditionally
1555   in scope.  It is permanently out of scope now and can be removed from
1556   the "in scope" stack.  When a variable becomes out-of-scope it is
1557   moved to a separate list (`out_scope`) of variables which have fully
1558   known scope.  This will be used at the end of each function to assign
1559   each variable a place in the stack frame.
1560
1561 ###### variable fields
1562         int depth, min_depth;
1563         enum { OutScope, PendingScope, CondScope, InScope } scope;
1564         struct variable *in_scope;
1565
1566 ###### parse context
1567
1568         struct variable *in_scope;
1569         struct variable *out_scope;
1570
1571 All variables with the same name are linked together using the
1572 'previous' link.  Those variable that have been affirmatively merged all
1573 have a 'merged' pointer that points to one primary variable - the most
1574 recently declared instance.  When merging variables, we need to also
1575 adjust the 'merged' pointer on any other variables that had previously
1576 been merged with the one that will no longer be primary.
1577
1578 A variable that is no longer the most recent instance of a name may
1579 still have "pending" scope, if it might still be merged with most
1580 recent instance.  These variables don't really belong in the
1581 "in_scope" list, but are not immediately removed when a new instance
1582 is found.  Instead, they are detected and ignored when considering the
1583 list of in_scope names.
1584
1585 The storage of the value of a variable will be described later.  For now
1586 we just need to know that when a variable goes out of scope, it might
1587 need to be freed.  For this we need to be able to find it, so assume that
1588 `var_value()` will provide that.
1589
1590 ###### variable fields
1591         struct variable *merged;
1592
1593 ###### ast functions
1594
1595         static void variable_merge(struct variable *primary, struct variable *secondary)
1596         {
1597                 struct variable *v;
1598
1599                 primary = primary->merged;
1600
1601                 for (v = primary->previous; v; v=v->previous)
1602                         if (v == secondary || v == secondary->merged ||
1603                             v->merged == secondary ||
1604                             v->merged == secondary->merged) {
1605                                 v->scope = OutScope;
1606                                 v->merged = primary;
1607                                 if (v->scope_start < primary->scope_start)
1608                                         primary->scope_start = v->scope_start;
1609                                 if (v->scope_end > primary->scope_end)
1610                                         primary->scope_end = v->scope_end;      // NOTEST
1611                                 variable_unlink_exec(v);
1612                         }
1613         }
1614
1615 ###### forward decls
1616         static struct value *var_value(struct parse_context *c, struct variable *v);
1617
1618 ###### free global vars
1619
1620         while (context.varlist) {
1621                 struct binding *b = context.varlist;
1622                 struct variable *v = b->var;
1623                 context.varlist = b->next;
1624                 free(b);
1625                 while (v) {
1626                         struct variable *next = v->previous;
1627
1628                         if (v->global && v->frame_pos >= 0) {
1629                                 free_value(v->type, var_value(&context, v));
1630                                 if (v->depth == 0 && v->type->free == function_free)
1631                                         // This is a function constant
1632                                         free_exec(v->where_decl);
1633                         }
1634                         free(v);
1635                         v = next;
1636                 }
1637         }
1638
1639 #### Manipulating Bindings
1640
1641 When a name is conditionally visible, a new declaration discards the old
1642 binding - the condition lapses.  Similarly when we reach the end of a
1643 function (outermost non-global scope) any conditional scope must lapse.
1644 Conversely a usage of the name affirms the visibility and extends it to
1645 the end of the containing block - i.e.  the block that contains both the
1646 original declaration and the latest usage.  This is determined from
1647 `min_depth`.  When a conditionally visible variable gets affirmed like
1648 this, it is also merged with other conditionally visible variables with
1649 the same name.
1650
1651 When we parse a variable declaration we either report an error if the
1652 name is currently bound, or create a new variable at the current nest
1653 depth if the name is unbound or bound to a conditionally scoped or
1654 pending-scope variable.  If the previous variable was conditionally
1655 scoped, it and its homonyms becomes out-of-scope.
1656
1657 When we parse a variable reference (including non-declarative assignment
1658 "foo = bar") we report an error if the name is not bound or is bound to
1659 a pending-scope variable; update the scope if the name is bound to a
1660 conditionally scoped variable; or just proceed normally if the named
1661 variable is in scope.
1662
1663 When we exit a scope, any variables bound at this level are either
1664 marked out of scope or pending-scoped, depending on whether the scope
1665 was sequential or parallel.  Here a "parallel" scope means the "then"
1666 or "else" part of a conditional, or any "case" or "else" branch of a
1667 switch.  Other scopes are "sequential".
1668
1669 When exiting a parallel scope we check if there are any variables that
1670 were previously pending and are still visible. If there are, then
1671 they weren't redeclared in the most recent scope, so they cannot be
1672 merged and must become out-of-scope.  If it is not the first of
1673 parallel scopes (based on `child_count`), we check that there was a
1674 previous binding that is still pending-scope.  If there isn't, the new
1675 variable must now be out-of-scope.
1676
1677 When exiting a sequential scope that immediately enclosed parallel
1678 scopes, we need to resolve any pending-scope variables.  If there was
1679 no `else` clause, and we cannot determine that the `switch` was exhaustive,
1680 we need to mark all pending-scope variable as out-of-scope.  Otherwise
1681 all pending-scope variables become conditionally scoped.
1682
1683 ###### ast
1684         enum closetype { CloseSequential, CloseFunction, CloseParallel, CloseElse };
1685
1686 ###### ast functions
1687
1688         static struct variable *var_decl(struct parse_context *c, struct text s)
1689         {
1690                 struct binding *b = find_binding(c, s);
1691                 struct variable *v = b->var;
1692
1693                 switch (v ? v->scope : OutScope) {
1694                 case InScope:
1695                         /* Caller will report the error */
1696                         return NULL;
1697                 case CondScope:
1698                         for (;
1699                              v && v->scope == CondScope;
1700                              v = v->previous)
1701                                 v->scope = OutScope;
1702                         break;
1703                 default: break;
1704                 }
1705                 v = calloc(1, sizeof(*v));
1706                 v->previous = b->var;
1707                 b->var = v;
1708                 v->name = b;
1709                 v->merged = v;
1710                 v->min_depth = v->depth = c->scope_depth;
1711                 v->scope = InScope;
1712                 v->in_scope = c->in_scope;
1713                 v->scope_start = c->scope_count;
1714                 c->in_scope = v;
1715                 ## variable init
1716                 return v;
1717         }
1718
1719         static struct variable *var_ref(struct parse_context *c, struct text s)
1720         {
1721                 struct binding *b = find_binding(c, s);
1722                 struct variable *v = b->var;
1723                 struct variable *v2;
1724
1725                 switch (v ? v->scope : OutScope) {
1726                 case OutScope:
1727                 case PendingScope:
1728                         /* Caller will report the error */
1729                         return NULL;
1730                 case CondScope:
1731                         /* All CondScope variables of this name need to be merged
1732                          * and become InScope
1733                          */
1734                         v->depth = v->min_depth;
1735                         v->scope = InScope;
1736                         for (v2 = v->previous;
1737                              v2 && v2->scope == CondScope;
1738                              v2 = v2->previous)
1739                                 variable_merge(v, v2);
1740                         break;
1741                 case InScope:
1742                         break;
1743                 }
1744                 return v;
1745         }
1746
1747         static int var_refile(struct parse_context *c, struct variable *v)
1748         {
1749                 /* Variable just went out of scope.  Add it to the out_scope
1750                  * list, sorted by ->scope_start
1751                  */
1752                 struct variable **vp = &c->out_scope;
1753                 while ((*vp) && (*vp)->scope_start < v->scope_start)
1754                         vp = &(*vp)->in_scope;
1755                 v->in_scope = *vp;
1756                 *vp = v;
1757                 return 0;
1758         }
1759
1760         static void var_block_close(struct parse_context *c, enum closetype ct,
1761                                     struct exec *e)
1762         {
1763                 /* Close off all variables that are in_scope.
1764                  * Some variables in c->scope may already be not-in-scope,
1765                  * such as when a PendingScope variable is hidden by a new
1766                  * variable with the same name.
1767                  * So we check for v->name->var != v and drop them.
1768                  * If we choose to make a variable OutScope, we drop it
1769                  * immediately too.
1770                  */
1771                 struct variable *v, **vp, *v2;
1772
1773                 scope_pop(c);
1774                 for (vp = &c->in_scope;
1775                      (v = *vp) && v->min_depth > c->scope_depth;
1776                      (v->scope == OutScope || v->name->var != v)
1777                      ? (*vp =  v->in_scope, var_refile(c, v))
1778                      : ( vp = &v->in_scope, 0)) {
1779                         v->min_depth = c->scope_depth;
1780                         if (v->name->var != v)
1781                                 /* This is still in scope, but we haven't just
1782                                  * closed the scope.
1783                                  */
1784                                 continue;
1785                         v->min_depth = c->scope_depth;
1786                         if (v->scope == InScope)
1787                                 v->scope_end = c->scope_count;
1788                         if (v->scope == InScope && e && !v->global) {
1789                                 /* This variable gets cleaned up when 'e' finishes */
1790                                 variable_unlink_exec(v);
1791                                 v->cleanup_exec = e;
1792                                 v->next_free = e->to_free;
1793                                 e->to_free = v;
1794                         }
1795                         switch (ct) {
1796                         case CloseElse:
1797                         case CloseParallel: /* handle PendingScope */
1798                                 switch(v->scope) {
1799                                 case InScope:
1800                                 case CondScope:
1801                                         if (c->scope_stack->child_count == 1)
1802                                                 /* first among parallel branches */
1803                                                 v->scope = PendingScope;
1804                                         else if (v->previous &&
1805                                                  v->previous->scope == PendingScope)
1806                                                 /* all previous branches used name */
1807                                                 v->scope = PendingScope;
1808                                         else if (v->type == Tlabel)
1809                                                 /* Labels remain pending even when not used */
1810                                                 v->scope = PendingScope;        // UNTESTED
1811                                         else
1812                                                 v->scope = OutScope;
1813                                         if (ct == CloseElse) {
1814                                                 /* All Pending variables with this name
1815                                                  * are now Conditional */
1816                                                 for (v2 = v;
1817                                                      v2 && v2->scope == PendingScope;
1818                                                      v2 = v2->previous)
1819                                                         v2->scope = CondScope;
1820                                         }
1821                                         break;
1822                                 case PendingScope:
1823                                         /* Not possible as it would require
1824                                          * parallel scope to be nested immediately
1825                                          * in a parallel scope, and that never
1826                                          * happens.
1827                                          */                     // NOTEST
1828                                 case OutScope:
1829                                         /* Not possible as we already tested for
1830                                          * OutScope
1831                                          */
1832                                         abort();                // NOTEST
1833                                 }
1834                                 break;
1835                         case CloseFunction:
1836                                 if (v->scope == CondScope)
1837                                         /* Condition cannot continue past end of function */
1838                                         v->scope = InScope;
1839                                 /* fallthrough */
1840                         case CloseSequential:
1841                                 if (v->type == Tlabel)
1842                                         v->scope = PendingScope;
1843                                 switch (v->scope) {
1844                                 case InScope:
1845                                         v->scope = OutScope;
1846                                         break;
1847                                 case PendingScope:
1848                                         /* There was no 'else', so we can only become
1849                                          * conditional if we know the cases were exhaustive,
1850                                          * and that doesn't mean anything yet.
1851                                          * So only labels become conditional..
1852                                          */
1853                                         for (v2 = v;
1854                                              v2 && v2->scope == PendingScope;
1855                                              v2 = v2->previous)
1856                                                 if (v2->type == Tlabel)
1857                                                         v2->scope = CondScope;
1858                                                 else
1859                                                         v2->scope = OutScope;
1860                                         break;
1861                                 case CondScope:
1862                                 case OutScope: break;
1863                                 }
1864                                 break;
1865                         }
1866                 }
1867         }
1868
1869 #### Storing Values
1870
1871 The value of a variable is store separately from the variable, on an
1872 analogue of a stack frame.  There are (currently) two frames that can be
1873 active.  A global frame which currently only stores constants, and a
1874 stacked frame which stores local variables.  Each variable knows if it
1875 is global or not, and what its index into the frame is.
1876
1877 Values in the global frame are known immediately they are relevant, so
1878 the frame needs to be reallocated as it grows so it can store those
1879 values.  The local frame doesn't get values until the interpreted phase
1880 is started, so there is no need to allocate until the size is known.
1881
1882 We initialize the `frame_pos` to an impossible value, so that we can
1883 tell if it was set or not later.
1884
1885 ###### variable fields
1886         short frame_pos;
1887         short global;
1888
1889 ###### variable init
1890         v->frame_pos = -1;
1891
1892 ###### parse context
1893
1894         short global_size, global_alloc;
1895         short local_size;
1896         void *global, *local;
1897
1898 ###### forward decls
1899         static struct value *global_alloc(struct parse_context *c, struct type *t,
1900                                           struct variable *v, struct value *init);
1901
1902 ###### ast functions
1903
1904         static struct value *var_value(struct parse_context *c, struct variable *v)
1905         {
1906                 if (!v->global) {
1907                         if (!c->local || !v->type)
1908                                 return NULL;    // UNTESTED
1909                         if (v->frame_pos + v->type->size > c->local_size) {
1910                                 printf("INVALID frame_pos\n");  // NOTEST
1911                                 exit(2);                        // NOTEST
1912                         }
1913                         return c->local + v->frame_pos;
1914                 }
1915                 if (c->global_size > c->global_alloc) {
1916                         int old = c->global_alloc;
1917                         c->global_alloc = (c->global_size | 1023) + 1024;
1918                         c->global = realloc(c->global, c->global_alloc);
1919                         memset(c->global + old, 0, c->global_alloc - old);
1920                 }
1921                 return c->global + v->frame_pos;
1922         }
1923
1924         static struct value *global_alloc(struct parse_context *c, struct type *t,
1925                                           struct variable *v, struct value *init)
1926         {
1927                 struct value *ret;
1928                 struct variable scratch;
1929
1930                 if (t->prepare_type)
1931                         t->prepare_type(c, t, 1);       // NOTEST
1932
1933                 if (c->global_size & (t->align - 1))
1934                         c->global_size = (c->global_size + t->align) & ~(t->align-1);   // NOTEST
1935                 if (!v) {
1936                         v = &scratch;
1937                         v->type = t;
1938                 }
1939                 v->frame_pos = c->global_size;
1940                 v->global = 1;
1941                 c->global_size += v->type->size;
1942                 ret = var_value(c, v);
1943                 if (init)
1944                         memcpy(ret, init, t->size);
1945                 else
1946                         val_init(t, ret);
1947                 return ret;
1948         }
1949
1950 As global values are found -- struct field initializers, labels etc --
1951 `global_alloc()` is called to record the value in the global frame.
1952
1953 When the program is fully parsed, each function is analysed, we need to
1954 walk the list of variables local to that function and assign them an
1955 offset in the stack frame.  For this we have `scope_finalize()`.
1956
1957 We keep the stack from dense by re-using space for between variables
1958 that are not in scope at the same time.  The `out_scope` list is sorted
1959 by `scope_start` and as we process a varible, we move it to an FIFO
1960 stack.  For each variable we consider, we first discard any from the
1961 stack anything that went out of scope before the new variable came in.
1962 Then we place the new variable just after the one at the top of the
1963 stack.
1964
1965 ###### ast functions
1966
1967         static void scope_finalize(struct parse_context *c, struct type *ft)
1968         {
1969                 int size = ft->function.local_size;
1970                 struct variable *next = ft->function.scope;
1971                 struct variable *done = NULL;
1972
1973                 while (next) {
1974                         struct variable *v = next;
1975                         struct type *t = v->type;
1976                         int pos;
1977                         next = v->in_scope;
1978                         if (v->merged != v)
1979                                 continue;
1980                         if (!t)
1981                                 continue;
1982                         if (v->frame_pos >= 0)
1983                                 continue;
1984                         while (done && done->scope_end < v->scope_start)
1985                                 done = done->in_scope;
1986                         if (done)
1987                                 pos = done->frame_pos + done->type->size;
1988                         else
1989                                 pos = ft->function.local_size;
1990                         if (pos & (t->align - 1))
1991                                 pos = (pos + t->align) & ~(t->align-1);
1992                         v->frame_pos = pos;
1993                         if (size < pos + v->type->size)
1994                                 size = pos + v->type->size;
1995                         v->in_scope = done;
1996                         done = v;
1997                 }
1998                 c->out_scope = NULL;
1999                 ft->function.local_size = size;
2000         }
2001
2002 ###### free context storage
2003         free(context.global);
2004
2005 #### Variables as executables
2006
2007 Just as we used a `val` to wrap a value into an `exec`, we similarly
2008 need a `var` to wrap a `variable` into an exec.  While each `val`
2009 contained a copy of the value, each `var` holds a link to the variable
2010 because it really is the same variable no matter where it appears.
2011 When a variable is used, we need to remember to follow the `->merged`
2012 link to find the primary instance.
2013
2014 When a variable is declared, it may or may not be given an explicit
2015 type.  We need to record which so that we can report the parsed code
2016 correctly.
2017
2018 ###### exec type
2019         Xvar,
2020
2021 ###### ast
2022         struct var {
2023                 struct exec;
2024                 struct variable *var;
2025         };
2026
2027 ###### variable fields
2028         int explicit_type;
2029
2030 ###### Grammar
2031
2032         $TERM : ::
2033
2034         $*var
2035         VariableDecl -> IDENTIFIER : ${ {
2036                 struct variable *v = var_decl(c, $1.txt);
2037                 $0 = new_pos(var, $1);
2038                 $0->var = v;
2039                 if (v)
2040                         v->where_decl = $0;
2041                 else {
2042                         v = var_ref(c, $1.txt);
2043                         $0->var = v;
2044                         type_err(c, "error: variable '%v' redeclared",
2045                                  $0, NULL, 0, NULL);
2046                         type_err(c, "info: this is where '%v' was first declared",
2047                                  v->where_decl, NULL, 0, NULL);
2048                 }
2049         } }$
2050         | IDENTIFIER :: ${ {
2051                 struct variable *v = var_decl(c, $1.txt);
2052                 $0 = new_pos(var, $1);
2053                 $0->var = v;
2054                 if (v) {
2055                         v->where_decl = $0;
2056                         v->constant = 1;
2057                 } else {
2058                         v = var_ref(c, $1.txt);
2059                         $0->var = v;
2060                         type_err(c, "error: variable '%v' redeclared",
2061                                  $0, NULL, 0, NULL);
2062                         type_err(c, "info: this is where '%v' was first declared",
2063                                  v->where_decl, NULL, 0, NULL);
2064                 }
2065         } }$
2066         | IDENTIFIER : Type ${ {
2067                 struct variable *v = var_decl(c, $1.txt);
2068                 $0 = new_pos(var, $1);
2069                 $0->var = v;
2070                 if (v) {
2071                         v->where_decl = $0;
2072                         v->where_set = $0;
2073                         v->type = $<Type;
2074                         v->explicit_type = 1;
2075                 } else {
2076                         v = var_ref(c, $1.txt);
2077                         $0->var = v;
2078                         type_err(c, "error: variable '%v' redeclared",
2079                                  $0, NULL, 0, NULL);
2080                         type_err(c, "info: this is where '%v' was first declared",
2081                                  v->where_decl, NULL, 0, NULL);
2082                 }
2083         } }$
2084         | IDENTIFIER :: Type ${ {
2085                 struct variable *v = var_decl(c, $1.txt);
2086                 $0 = new_pos(var, $1);
2087                 $0->var = v;
2088                 if (v) {
2089                         v->where_decl = $0;
2090                         v->where_set = $0;
2091                         v->type = $<Type;
2092                         v->constant = 1;
2093                         v->explicit_type = 1;
2094                 } else {
2095                         v = var_ref(c, $1.txt);
2096                         $0->var = v;
2097                         type_err(c, "error: variable '%v' redeclared",
2098                                  $0, NULL, 0, NULL);
2099                         type_err(c, "info: this is where '%v' was first declared",
2100                                  v->where_decl, NULL, 0, NULL);
2101                 }
2102         } }$
2103
2104         $*exec
2105         Variable -> IDENTIFIER ${ {
2106                 struct variable *v = var_ref(c, $1.txt);
2107                 $0 = new_pos(var, $1);
2108                 if (v == NULL) {
2109                         /* This might be a global const or a label
2110                          * Allocate a var with impossible type Tnone,
2111                          * which will be adjusted when we find out what it is,
2112                          * or will trigger an error.
2113                          */
2114                         v = var_decl(c, $1.txt);
2115                         if (v) {
2116                                 v->type = Tnone;
2117                                 v->where_decl = $0;
2118                                 v->where_set = $0;
2119                         }
2120                 }
2121                 cast(var, $0)->var = v;
2122         } }$
2123
2124 ###### print exec cases
2125         case Xvar:
2126         {
2127                 struct var *v = cast(var, e);
2128                 if (v->var) {
2129                         struct binding *b = v->var->name;
2130                         printf("%.*s", b->name.len, b->name.txt);
2131                 }
2132                 break;
2133         }
2134
2135 ###### format cases
2136         case 'v':
2137                 if (loc && loc->type == Xvar) {
2138                         struct var *v = cast(var, loc);
2139                         if (v->var) {
2140                                 struct binding *b = v->var->name;
2141                                 fprintf(stderr, "%.*s", b->name.len, b->name.txt);
2142                         } else
2143                                 fputs("???", stderr);   // NOTEST
2144                 } else
2145                         fputs("NOTVAR", stderr);
2146                 break;
2147
2148 ###### propagate exec cases
2149
2150         case Xvar:
2151         {
2152                 struct var *var = cast(var, prog);
2153                 struct variable *v = var->var;
2154                 if (!v) {
2155                         type_err(c, "%d:BUG: no variable!!", prog, NULL, 0, NULL); // NOTEST
2156                         return Tnone;                                   // NOTEST
2157                 }
2158                 v = v->merged;
2159                 if (v->constant && (rules & Rnoconstant)) {
2160                         type_err(c, "error: Cannot assign to a constant: %v",
2161                                  prog, NULL, 0, NULL);
2162                         type_err(c, "info: name was defined as a constant here",
2163                                  v->where_decl, NULL, 0, NULL);
2164                         return v->type;
2165                 }
2166                 if (v->type == Tnone && v->where_decl == prog)
2167                         type_err(c, "error: variable used but not declared: %v",
2168                                  prog, NULL, 0, NULL);
2169                 if (v->type == NULL) {
2170                         if (type && !(*perr & Efail)) {
2171                                 v->type = type;
2172                                 v->where_set = prog;
2173                                 *perr |= Eretry;
2174                         }
2175                 } else if (!type_compat(type, v->type, rules)) {
2176                         type_err(c, "error: expected %1%r but variable '%v' is %2", prog,
2177                                  type, rules, v->type);
2178                         type_err(c, "info: this is where '%v' was set to %1", v->where_set,
2179                                  v->type, rules, NULL);
2180                 }
2181                 if (!v->global || v->frame_pos < 0)
2182                         *perr |= Enoconst;
2183                 if (!type)
2184                         return v->type;
2185                 return type;
2186         }
2187
2188 ###### interp exec cases
2189         case Xvar:
2190         {
2191                 struct var *var = cast(var, e);
2192                 struct variable *v = var->var;
2193
2194                 v = v->merged;
2195                 lrv = var_value(c, v);
2196                 rvtype = v->type;
2197                 break;
2198         }
2199
2200 ###### ast functions
2201
2202         static void free_var(struct var *v)
2203         {
2204                 free(v);
2205         }
2206
2207 ###### free exec cases
2208         case Xvar: free_var(cast(var, e)); break;
2209
2210
2211 ### Complex types
2212
2213 Now that we have the shape of the interpreter in place we can add some
2214 complex types and connected them in to the data structures and the
2215 different phases of parse, analyse, print, interpret.
2216
2217 Being "complex" the language will naturally have syntax to access
2218 specifics of objects of these types.  These will fit into the grammar as
2219 "Terms" which are the things that are combined with various operators to
2220 form "Expression".  Where a Term is formed by some operation on another
2221 Term, the subordinate Term will always come first, so for example a
2222 member of an array will be expressed as the Term for the array followed
2223 by an index in square brackets.  The strict rule of using postfix
2224 operations makes precedence irrelevant within terms.  To provide a place
2225 to put the grammar for each terms of each type, we will start out by
2226 introducing the "Term" grammar production, with contains at least a
2227 simple "Value" (to be explained later).
2228
2229 ###### Grammar
2230         $*exec
2231         Term ->  Value ${ $0 = $<1; }$
2232         | Variable ${ $0 = $<1; }$
2233         ## term grammar
2234
2235 Thus far the complex types we have are arrays and structs.
2236
2237 #### Arrays
2238
2239 Arrays can be declared by giving a size and a type, as `[size]type' so
2240 `freq:[26]number` declares `freq` to be an array of 26 numbers.  The
2241 size can be either a literal number, or a named constant.  Some day an
2242 arbitrary expression will be supported.
2243
2244 As a formal parameter to a function, the array can be declared with a
2245 new variable as the size: `name:[size::number]string`.  The `size`
2246 variable is set to the size of the array and must be a constant.  As
2247 `number` is the only supported type, it can be left out:
2248 `name:[size::]string`.
2249
2250 Arrays cannot be assigned.  When pointers are introduced we will also
2251 introduce array slices which can refer to part or all of an array -
2252 the assignment syntax will create a slice.  For now, an array can only
2253 ever be referenced by the name it is declared with.  It is likely that
2254 a "`copy`" primitive will eventually be define which can be used to
2255 make a copy of an array with controllable recursive depth.
2256
2257 For now we have two sorts of array, those with fixed size either because
2258 it is given as a literal number or because it is a struct member (which
2259 cannot have a runtime-changing size), and those with a size that is
2260 determined at runtime - local variables with a const size.  The former
2261 have their size calculated at parse time, the latter at run time.
2262
2263 For the latter type, the `size` field of the type is the size of a
2264 pointer, and the array is reallocated every time it comes into scope.
2265
2266 We differentiate struct fields with a const size from local variables
2267 with a const size by whether they are prepared at parse time or not.
2268
2269 ###### type union fields
2270
2271         struct {
2272                 int unspec;     // size is unspecified - vsize must be set.
2273                 short size;
2274                 short static_size;
2275                 struct variable *vsize;
2276                 struct type *member;
2277         } array;
2278
2279 ###### value union fields
2280         void *array;  // used if not static_size
2281
2282 ###### value functions
2283
2284         static int array_prepare_type(struct parse_context *c, struct type *type,
2285                                        int parse_time)
2286         {
2287                 struct value *vsize;
2288                 mpz_t q;
2289                 if (type->array.static_size)
2290                         return 1;       // UNTESTED
2291                 if (type->array.unspec && parse_time)
2292                         return 1;       // UNTESTED
2293                 if (parse_time && type->array.vsize && !type->array.vsize->global)
2294                         return 1;       // UNTESTED
2295
2296                 if (type->array.vsize) {
2297                         vsize = var_value(c, type->array.vsize);
2298                         if (!vsize)
2299                                 return 1;       // UNTESTED
2300                         mpz_init(q);
2301                         mpz_tdiv_q(q, mpq_numref(vsize->num), mpq_denref(vsize->num));
2302                         type->array.size = mpz_get_si(q);
2303                         mpz_clear(q);
2304                 }
2305                 if (!parse_time)
2306                         return 1;
2307                 if (type->array.member->size <= 0)
2308                         return 0;
2309
2310                 type->array.static_size = 1;
2311                 type->size = type->array.size * type->array.member->size;
2312                 type->align = type->array.member->align;
2313
2314                 return 1;
2315         }
2316
2317         static void array_init(struct type *type, struct value *val)
2318         {
2319                 int i;
2320                 void *ptr = val->ptr;
2321
2322                 if (!val)
2323                         return;                         // NOTEST
2324                 if (!type->array.static_size) {
2325                         val->array = calloc(type->array.size,
2326                                             type->array.member->size);
2327                         ptr = val->array;
2328                 }
2329                 for (i = 0; i < type->array.size; i++) {
2330                         struct value *v;
2331                         v = (void*)ptr + i * type->array.member->size;
2332                         val_init(type->array.member, v);
2333                 }
2334         }
2335
2336         static void array_free(struct type *type, struct value *val)
2337         {
2338                 int i;
2339                 void *ptr = val->ptr;
2340
2341                 if (!type->array.static_size)
2342                         ptr = val->array;
2343                 for (i = 0; i < type->array.size; i++) {
2344                         struct value *v;
2345                         v = (void*)ptr + i * type->array.member->size;
2346                         free_value(type->array.member, v);
2347                 }
2348                 if (!type->array.static_size)
2349                         free(ptr);
2350         }
2351
2352         static int array_compat(struct type *require, struct type *have)
2353         {
2354                 if (have->compat != require->compat)
2355                         return 0;
2356                 /* Both are arrays, so we can look at details */
2357                 if (!type_compat(require->array.member, have->array.member, 0))
2358                         return 0;
2359                 if (have->array.unspec && require->array.unspec) {
2360                         if (have->array.vsize && require->array.vsize &&
2361                             have->array.vsize != require->array.vsize)  // UNTESTED
2362                                 /* sizes might not be the same */
2363                                 return 0;       // UNTESTED
2364                         return 1;
2365                 }
2366                 if (have->array.unspec || require->array.unspec)
2367                         return 1;       // UNTESTED
2368                 if (require->array.vsize == NULL && have->array.vsize == NULL)
2369                         return require->array.size == have->array.size;
2370
2371                 return require->array.vsize == have->array.vsize;       // UNTESTED
2372         }
2373
2374         static void array_print_type(struct type *type, FILE *f)
2375         {
2376                 fputs("[", f);
2377                 if (type->array.vsize) {
2378                         struct binding *b = type->array.vsize->name;
2379                         fprintf(f, "%.*s%s]", b->name.len, b->name.txt,
2380                                 type->array.unspec ? "::" : "");
2381                 } else if (type->array.size)
2382                         fprintf(f, "%d]", type->array.size);
2383                 else
2384                         fprintf(f, "]");
2385                 type_print(type->array.member, f);
2386         }
2387
2388         static struct type array_prototype = {
2389                 .init = array_init,
2390                 .prepare_type = array_prepare_type,
2391                 .print_type = array_print_type,
2392                 .compat = array_compat,
2393                 .free = array_free,
2394                 .size = sizeof(void*),
2395                 .align = sizeof(void*),
2396         };
2397
2398 ###### declare terminals
2399         $TERM [ ]
2400
2401 ###### type grammar
2402
2403         | [ NUMBER ] Type ${ {
2404                 char tail[3];
2405                 mpq_t num;
2406                 struct type *t;
2407                 int elements = 0;
2408
2409                 if (number_parse(num, tail, $2.txt) == 0)
2410                         tok_err(c, "error: unrecognised number", &$2);
2411                 else if (tail[0]) {
2412                         tok_err(c, "error: unsupported number suffix", &$2);
2413                         mpq_clear(num);
2414                 } else {
2415                         elements = mpz_get_ui(mpq_numref(num));
2416                         if (mpz_cmp_ui(mpq_denref(num), 1) != 0) {
2417                                 tok_err(c, "error: array size must be an integer",
2418                                         &$2);
2419                         } else if (mpz_cmp_ui(mpq_numref(num), 1UL << 30) >= 0)
2420                                 tok_err(c, "error: array size is too large",
2421                                         &$2);
2422                         mpq_clear(num);
2423                 }
2424
2425                 $0 = t = add_anon_type(c, &array_prototype, "array[%d]", elements );
2426                 t->array.size = elements;
2427                 t->array.member = $<4;
2428                 t->array.vsize = NULL;
2429         } }$
2430
2431         | [ IDENTIFIER ] Type ${ {
2432                 struct variable *v = var_ref(c, $2.txt);
2433
2434                 if (!v)
2435                         tok_err(c, "error: name undeclared", &$2);
2436                 else if (!v->constant)
2437                         tok_err(c, "error: array size must be a constant", &$2);
2438
2439                 $0 = add_anon_type(c, &array_prototype, "array[%.*s]", $2.txt.len, $2.txt.txt);
2440                 $0->array.member = $<4;
2441                 $0->array.size = 0;
2442                 $0->array.vsize = v;
2443         } }$
2444
2445 ###### Grammar
2446         $*type
2447         OptType -> Type ${ $0 = $<1; }$
2448                 | ${ $0 = NULL; }$
2449
2450 ###### formal type grammar
2451
2452         | [ IDENTIFIER :: OptType ] Type ${ {
2453                 struct variable *v = var_decl(c, $ID.txt);
2454
2455                 v->type = $<OT;
2456                 v->constant = 1;
2457                 if (!v->type)
2458                         v->type = Tnum;
2459                 $0 = add_anon_type(c, &array_prototype, "array[var]");
2460                 $0->array.member = $<6;
2461                 $0->array.size = 0;
2462                 $0->array.unspec = 1;
2463                 $0->array.vsize = v;
2464         } }$
2465
2466 ###### Binode types
2467         Index,
2468
2469 ###### term grammar
2470
2471         | Term [ Expression ] ${ {
2472                 struct binode *b = new(binode);
2473                 b->op = Index;
2474                 b->left = $<1;
2475                 b->right = $<3;
2476                 $0 = b;
2477         } }$
2478
2479 ###### print binode cases
2480         case Index:
2481                 print_exec(b->left, -1, bracket);
2482                 printf("[");
2483                 print_exec(b->right, -1, bracket);
2484                 printf("]");
2485                 break;
2486
2487 ###### propagate binode cases
2488         case Index:
2489                 /* left must be an array, right must be a number,
2490                  * result is the member type of the array
2491                  */
2492                 propagate_types(b->right, c, perr, Tnum, 0);
2493                 t = propagate_types(b->left, c, perr, NULL, rules & Rnoconstant);
2494                 if (!t || t->compat != array_compat) {
2495                         type_err(c, "error: %1 cannot be indexed", prog, t, 0, NULL);
2496                         return NULL;
2497                 } else {
2498                         if (!type_compat(type, t->array.member, rules)) {
2499                                 type_err(c, "error: have %1 but need %2", prog,
2500                                          t->array.member, rules, type);
2501                         }
2502                         return t->array.member;
2503                 }
2504                 break;
2505
2506 ###### interp binode cases
2507         case Index: {
2508                 mpz_t q;
2509                 long i;
2510                 void *ptr;
2511
2512                 lleft = linterp_exec(c, b->left, &ltype);
2513                 right = interp_exec(c, b->right, &rtype);
2514                 mpz_init(q);
2515                 mpz_tdiv_q(q, mpq_numref(right.num), mpq_denref(right.num));
2516                 i = mpz_get_si(q);
2517                 mpz_clear(q);
2518
2519                 if (ltype->array.static_size)
2520                         ptr = lleft;
2521                 else
2522                         ptr = *(void**)lleft;
2523                 rvtype = ltype->array.member;
2524                 if (i >= 0 && i < ltype->array.size)
2525                         lrv = ptr + i * rvtype->size;
2526                 else
2527                         val_init(ltype->array.member, &rv); // UNSAFE
2528                 ltype = NULL;
2529                 break;
2530         }
2531
2532 #### Structs
2533
2534 A `struct` is a data-type that contains one or more other data-types.
2535 It differs from an array in that each member can be of a different
2536 type, and they are accessed by name rather than by number.  Thus you
2537 cannot choose an element by calculation, you need to know what you
2538 want up-front.
2539
2540 The language makes no promises about how a given structure will be
2541 stored in memory - it is free to rearrange fields to suit whatever
2542 criteria seems important.
2543
2544 Structs are declared separately from program code - they cannot be
2545 declared in-line in a variable declaration like arrays can.  A struct
2546 is given a name and this name is used to identify the type - the name
2547 is not prefixed by the word `struct` as it would be in C.
2548
2549 Structs are only treated as the same if they have the same name.
2550 Simply having the same fields in the same order is not enough.  This
2551 might change once we can create structure initializers from a list of
2552 values.
2553
2554 Each component datum is identified much like a variable is declared,
2555 with a name, one or two colons, and a type.  The type cannot be omitted
2556 as there is no opportunity to deduce the type from usage.  An initial
2557 value can be given following an equals sign, so
2558
2559 ##### Example: a struct type
2560
2561         struct complex:
2562                 x:number = 0
2563                 y:number = 0
2564
2565 would declare a type called "complex" which has two number fields,
2566 each initialised to zero.
2567
2568 Struct will need to be declared separately from the code that uses
2569 them, so we will need to be able to print out the declaration of a
2570 struct when reprinting the whole program.  So a `print_type_decl` type
2571 function will be needed.
2572
2573 ###### type union fields
2574
2575         struct {
2576                 int nfields;
2577                 struct field {
2578                         struct text name;
2579                         struct type *type;
2580                         struct value *init;
2581                         int offset;
2582                 } *fields; // This is created when field_list is analysed.
2583                 struct fieldlist {
2584                         struct fieldlist *prev;
2585                         struct field f;
2586                         struct exec *init;
2587                 } *field_list; // This is created during parsing
2588         } structure;
2589
2590 ###### type functions
2591         void (*print_type_decl)(struct type *type, FILE *f);
2592
2593 ###### value functions
2594
2595         static void structure_init(struct type *type, struct value *val)
2596         {
2597                 int i;
2598
2599                 for (i = 0; i < type->structure.nfields; i++) {
2600                         struct value *v;
2601                         v = (void*) val->ptr + type->structure.fields[i].offset;
2602                         if (type->structure.fields[i].init)
2603                                 dup_value(type->structure.fields[i].type,
2604                                           type->structure.fields[i].init,
2605                                           v);
2606                         else
2607                                 val_init(type->structure.fields[i].type, v);
2608                 }
2609         }
2610
2611         static void structure_free(struct type *type, struct value *val)
2612         {
2613                 int i;
2614
2615                 for (i = 0; i < type->structure.nfields; i++) {
2616                         struct value *v;
2617                         v = (void*)val->ptr + type->structure.fields[i].offset;
2618                         free_value(type->structure.fields[i].type, v);
2619                 }
2620         }
2621
2622         static void free_fieldlist(struct fieldlist *f)
2623         {
2624                 if (!f)
2625                         return;
2626                 free_fieldlist(f->prev);
2627                 free_exec(f->init);
2628                 free(f);
2629         }
2630
2631         static void structure_free_type(struct type *t)
2632         {
2633                 int i;
2634                 for (i = 0; i < t->structure.nfields; i++)
2635                         if (t->structure.fields[i].init) {
2636                                 free_value(t->structure.fields[i].type,
2637                                            t->structure.fields[i].init);
2638                         }
2639                 free(t->structure.fields);
2640                 free_fieldlist(t->structure.field_list);
2641         }
2642
2643         static int structure_prepare_type(struct parse_context *c,
2644                                           struct type *t, int parse_time)
2645         {
2646                 int cnt = 0;
2647                 struct fieldlist *f;
2648
2649                 if (!parse_time || t->structure.fields)
2650                         return 1;
2651
2652                 for (f = t->structure.field_list; f; f=f->prev) {
2653                         enum prop_err perr;
2654                         cnt += 1;
2655
2656                         if (f->f.type->size <= 0)
2657                                 return 0;
2658                         if (f->f.type->prepare_type)
2659                                 f->f.type->prepare_type(c, f->f.type, parse_time);
2660
2661                         if (f->init == NULL)
2662                                 continue;
2663                         do {
2664                                 perr = 0;
2665                                 propagate_types(f->init, c, &perr, f->f.type, 0);
2666                         } while (perr & Eretry);
2667                         if (perr & Efail)
2668                                 c->parse_error += 1;    // NOTEST
2669                 }
2670
2671                 t->structure.nfields = cnt;
2672                 t->structure.fields = calloc(cnt, sizeof(struct field));
2673                 f = t->structure.field_list;
2674                 while (cnt > 0) {
2675                         int a = f->f.type->align;
2676                         cnt -= 1;
2677                         t->structure.fields[cnt] = f->f;
2678                         if (t->size & (a-1))
2679                                 t->size = (t->size | (a-1)) + 1;
2680                         t->structure.fields[cnt].offset = t->size;
2681                         t->size += ((f->f.type->size - 1) | (a-1)) + 1;
2682                         if (a > t->align)
2683                                 t->align = a;
2684
2685                         if (f->init && !c->parse_error) {
2686                                 struct value vl = interp_exec(c, f->init, NULL);
2687                                 t->structure.fields[cnt].init =
2688                                         global_alloc(c, f->f.type, NULL, &vl);
2689                         }
2690
2691                         f = f->prev;
2692                 }
2693                 return 1;
2694         }
2695
2696         static struct type structure_prototype = {
2697                 .init = structure_init,
2698                 .free = structure_free,
2699                 .free_type = structure_free_type,
2700                 .print_type_decl = structure_print_type,
2701                 .prepare_type = structure_prepare_type,
2702         };
2703
2704 ###### exec type
2705         Xfieldref,
2706
2707 ###### ast
2708         struct fieldref {
2709                 struct exec;
2710                 struct exec *left;
2711                 int index;
2712                 struct text name;
2713         };
2714
2715 ###### free exec cases
2716         case Xfieldref:
2717                 free_exec(cast(fieldref, e)->left);
2718                 free(e);
2719                 break;
2720
2721 ###### declare terminals
2722         $TERM struct .
2723
2724 ###### term grammar
2725
2726         | Term . IDENTIFIER ${ {
2727                 struct fieldref *fr = new_pos(fieldref, $2);
2728                 fr->left = $<1;
2729                 fr->name = $3.txt;
2730                 fr->index = -2;
2731                 $0 = fr;
2732         } }$
2733
2734 ###### print exec cases
2735
2736         case Xfieldref:
2737         {
2738                 struct fieldref *f = cast(fieldref, e);
2739                 print_exec(f->left, -1, bracket);
2740                 printf(".%.*s", f->name.len, f->name.txt);
2741                 break;
2742         }
2743
2744 ###### ast functions
2745         static int find_struct_index(struct type *type, struct text field)
2746         {
2747                 int i;
2748                 for (i = 0; i < type->structure.nfields; i++)
2749                         if (text_cmp(type->structure.fields[i].name, field) == 0)
2750                                 return i;
2751                 return -1;
2752         }
2753
2754 ###### propagate exec cases
2755
2756         case Xfieldref:
2757         {
2758                 struct fieldref *f = cast(fieldref, prog);
2759                 struct type *st = propagate_types(f->left, c, perr, NULL, 0);
2760
2761                 if (!st)
2762                         type_err(c, "error: unknown type for field access", f->left,    // UNTESTED
2763                                  NULL, 0, NULL);
2764                 else if (st->init != structure_init)
2765                         type_err(c, "error: field reference attempted on %1, not a struct",
2766                                  f->left, st, 0, NULL);
2767                 else if (f->index == -2) {
2768                         f->index = find_struct_index(st, f->name);
2769                         if (f->index < 0)
2770                                 type_err(c, "error: cannot find requested field in %1",
2771                                          f->left, st, 0, NULL);
2772                 }
2773                 if (f->index >= 0) {
2774                         struct type *ft = st->structure.fields[f->index].type;
2775                         if (!type_compat(type, ft, rules))
2776                                 type_err(c, "error: have %1 but need %2", prog,
2777                                          ft, rules, type);
2778                         return ft;
2779                 }
2780                 break;
2781         }
2782
2783 ###### interp exec cases
2784         case Xfieldref:
2785         {
2786                 struct fieldref *f = cast(fieldref, e);
2787                 struct type *ltype;
2788                 struct value *lleft = linterp_exec(c, f->left, &ltype);
2789                 lrv = (void*)lleft->ptr + ltype->structure.fields[f->index].offset;
2790                 rvtype = ltype->structure.fields[f->index].type;
2791                 break;
2792         }
2793
2794 ###### top level grammar
2795         DeclareStruct -> struct IDENTIFIER FieldBlock Newlines ${ {
2796                 struct type *t;
2797                 t = find_type(c, $ID.txt);
2798                 if (!t)
2799                         t = add_type(c, $ID.txt, &structure_prototype);
2800                 else if (t->size >= 0) {
2801                         tok_err(c, "error: type already declared", &$ID);
2802                         tok_err(c, "info: this is location of declartion", &t->first_use);
2803                         /* Create a new one - duplicate */
2804                         t = add_type(c, $ID.txt, &structure_prototype);
2805                 } else {
2806                         struct type tmp = *t;
2807                         *t = structure_prototype;
2808                         t->name = tmp.name;
2809                         t->next = tmp.next;
2810                 }
2811                 t->structure.field_list = $<FB;
2812                 t->first_use = $ID;
2813         } }$
2814
2815         $*fieldlist
2816         FieldBlock -> { IN OptNL FieldLines OUT OptNL } ${ $0 = $<FL; }$
2817         | { SimpleFieldList } ${ $0 = $<SFL; }$
2818         | IN OptNL FieldLines OUT ${ $0 = $<FL; }$
2819         | SimpleFieldList EOL ${ $0 = $<SFL; }$
2820
2821         FieldLines -> SimpleFieldList Newlines ${ $0 = $<SFL; }$
2822         | FieldLines SimpleFieldList Newlines ${
2823                 $SFL->prev = $<FL;
2824                 $0 = $<SFL;
2825         }$
2826
2827         SimpleFieldList -> Field ${ $0 = $<F; }$
2828         | SimpleFieldList ; Field ${
2829                 $F->prev = $<SFL;
2830                 $0 = $<F;
2831         }$
2832         | SimpleFieldList ; ${
2833                 $0 = $<SFL;
2834         }$
2835         | ERROR ${ tok_err(c, "Syntax error in struct field", &$1); }$
2836
2837         Field -> IDENTIFIER : Type = Expression ${ {
2838                 $0 = calloc(1, sizeof(struct fieldlist));
2839                 $0->f.name = $ID.txt;
2840                 $0->f.type = $<Type;
2841                 $0->f.init = NULL;
2842                 $0->init = $<Expr;
2843         } }$
2844         | IDENTIFIER : Type ${
2845                 $0 = calloc(1, sizeof(struct fieldlist));
2846                 $0->f.name = $ID.txt;
2847                 $0->f.type = $<Type;
2848         }$
2849
2850 ###### forward decls
2851         static void structure_print_type(struct type *t, FILE *f);
2852
2853 ###### value functions
2854         static void structure_print_type(struct type *t, FILE *f)
2855         {
2856                 int i;
2857
2858                 fprintf(f, "struct %.*s\n", t->name.len, t->name.txt);
2859
2860                 for (i = 0; i < t->structure.nfields; i++) {
2861                         struct field *fl = t->structure.fields + i;
2862                         fprintf(f, "    %.*s : ", fl->name.len, fl->name.txt);
2863                         type_print(fl->type, f);
2864                         if (fl->type->print && fl->init) {
2865                                 fprintf(f, " = ");
2866                                 if (fl->type == Tstr)
2867                                         fprintf(f, "\"");       // UNTESTED
2868                                 print_value(fl->type, fl->init, f);
2869                                 if (fl->type == Tstr)
2870                                         fprintf(f, "\"");       // UNTESTED
2871                         }
2872                         fprintf(f, "\n");
2873                 }
2874         }
2875
2876 ###### print type decls
2877         {
2878                 struct type *t;
2879                 int target = -1;
2880
2881                 while (target != 0) {
2882                         int i = 0;
2883                         for (t = context.typelist; t ; t=t->next)
2884                                 if (!t->anon && t->print_type_decl &&
2885                                     !t->check_args) {
2886                                         i += 1;
2887                                         if (i == target)
2888                                                 break;
2889                                 }
2890
2891                         if (target == -1) {
2892                                 target = i;
2893                         } else {
2894                                 t->print_type_decl(t, stdout);
2895                                 target -= 1;
2896                         }
2897                 }
2898         }
2899
2900 #### Functions
2901
2902 A function is a chunk of code which can be passed parameters and can
2903 return results.  Each function has a type which includes the set of
2904 parameters and the return value.  As yet these types cannot be declared
2905 separately from the function itself.
2906
2907 The parameters can be specified either in parentheses as a ';' separated
2908 list, such as
2909
2910 ##### Example: function 1
2911
2912         func main(av:[ac::number]string; env:[envc::number]string)
2913                 code block
2914
2915 or as an indented list of one parameter per line (though each line can
2916 be a ';' separated list)
2917
2918 ##### Example: function 2
2919
2920         func main
2921                 argv:[argc::number]string
2922                 env:[envc::number]string
2923         do
2924                 code block
2925
2926 In the first case a return type can follow the parentheses after a colon,
2927 in the second it is given on a line starting with the word `return`.
2928
2929 ##### Example: functions that return
2930
2931         func add(a:number; b:number): number
2932                 code block
2933
2934         func catenate
2935                 a: string
2936                 b: string
2937         return string
2938         do
2939                 code block
2940
2941 Rather than returning a type, the function can specify a set of local
2942 variables to return as a struct.  The values of these variables when the
2943 function exits will be provided to the caller.  For this the return type
2944 is replaced with a block of result declarations, either in parentheses
2945 or bracketed by `return` and `do`.
2946
2947 ##### Example: functions returning multiple variables
2948
2949         func to_cartesian(rho:number; theta:number):(x:number; y:number)
2950                 x = .....
2951                 y = .....
2952
2953         func to_polar
2954                 x:number; y:number
2955         return
2956                 rho:number
2957                 theta:number
2958         do
2959                 rho = ....
2960                 theta = ....
2961
2962 For constructing the lists we use a `List` binode, which will be
2963 further detailed when Expression Lists are introduced.
2964
2965 ###### type union fields
2966
2967         struct {
2968                 struct binode *params;
2969                 struct type *return_type;
2970                 struct variable *scope;
2971                 int inline_result;      // return value is at start of 'local'
2972                 int local_size;
2973         } function;
2974
2975 ###### value union fields
2976         struct exec *function;
2977
2978 ###### type functions
2979         void (*check_args)(struct parse_context *c, enum prop_err *perr,
2980                            struct type *require, struct exec *args);
2981
2982 ###### value functions
2983
2984         static void function_free(struct type *type, struct value *val)
2985         {
2986                 free_exec(val->function);
2987                 val->function = NULL;
2988         }
2989
2990         static int function_compat(struct type *require, struct type *have)
2991         {
2992                 // FIXME can I do anything here yet?
2993                 return 0;
2994         }
2995
2996         static void function_check_args(struct parse_context *c, enum prop_err *perr,
2997                                         struct type *require, struct exec *args)
2998         {
2999                 /* This should be 'compat', but we don't have a 'tuple' type to
3000                  * hold the type of 'args'
3001                  */
3002                 struct binode *arg = cast(binode, args);
3003                 struct binode *param = require->function.params;
3004
3005                 while (param) {
3006                         struct var *pv = cast(var, param->left);
3007                         if (!arg) {
3008                                 type_err(c, "error: insufficient arguments to function.",
3009                                          args, NULL, 0, NULL);
3010                                 break;
3011                         }
3012                         *perr = 0;
3013                         propagate_types(arg->left, c, perr, pv->var->type, 0);
3014                         param = cast(binode, param->right);
3015                         arg = cast(binode, arg->right);
3016                 }
3017                 if (arg)
3018                         type_err(c, "error: too many arguments to function.",
3019                                  args, NULL, 0, NULL);
3020         }
3021
3022         static void function_print(struct type *type, struct value *val, FILE *f)
3023         {
3024                 print_exec(val->function, 1, 0);
3025         }
3026
3027         static void function_print_type_decl(struct type *type, FILE *f)
3028         {
3029                 struct binode *b;
3030                 fprintf(f, "(");
3031                 for (b = type->function.params; b; b = cast(binode, b->right)) {
3032                         struct variable *v = cast(var, b->left)->var;
3033                         fprintf(f, "%.*s%s", v->name->name.len, v->name->name.txt,
3034                                 v->constant ? "::" : ":");
3035                         type_print(v->type, f);
3036                         if (b->right)
3037                                 fprintf(f, "; ");
3038                 }
3039                 fprintf(f, ")");
3040                 if (type->function.return_type != Tnone) {
3041                         fprintf(f, ":");
3042                         if (type->function.inline_result) {
3043                                 int i;
3044                                 struct type *t = type->function.return_type;
3045                                 fprintf(f, " (");
3046                                 for (i = 0; i < t->structure.nfields; i++) {
3047                                         struct field *fl = t->structure.fields + i;
3048                                         if (i)
3049                                                 fprintf(f, "; ");
3050                                         fprintf(f, "%.*s:", fl->name.len, fl->name.txt);
3051                                         type_print(fl->type, f);
3052                                 }
3053                                 fprintf(f, ")");
3054                         } else
3055                                 type_print(type->function.return_type, f);
3056                 }
3057                 fprintf(f, "\n");
3058         }
3059
3060         static void function_free_type(struct type *t)
3061         {
3062                 free_exec(t->function.params);
3063         }
3064
3065         static struct type function_prototype = {
3066                 .size = sizeof(void*),
3067                 .align = sizeof(void*),
3068                 .free = function_free,
3069                 .compat = function_compat,
3070                 .check_args = function_check_args,
3071                 .print = function_print,
3072                 .print_type_decl = function_print_type_decl,
3073                 .free_type = function_free_type,
3074         };
3075
3076 ###### declare terminals
3077
3078         $TERM func
3079
3080 ###### Binode types
3081         List,
3082
3083 ###### Grammar
3084
3085         $*variable
3086         FuncName -> IDENTIFIER ${ {
3087                 struct variable *v = var_decl(c, $1.txt);
3088                 struct var *e = new_pos(var, $1);
3089                 e->var = v;
3090                 if (v) {
3091                         v->where_decl = e;
3092                         $0 = v;
3093                 } else {
3094                         v = var_ref(c, $1.txt);
3095                         e->var = v;
3096                         type_err(c, "error: function '%v' redeclared",
3097                                 e, NULL, 0, NULL);
3098                         type_err(c, "info: this is where '%v' was first declared",
3099                                 v->where_decl, NULL, 0, NULL);
3100                         free_exec(e);
3101                 }
3102         } }$
3103
3104         $*binode
3105         Args -> ArgsLine NEWLINE ${ $0 = $<AL; }$
3106         | Args ArgsLine NEWLINE ${ {
3107                 struct binode *b = $<AL;
3108                 struct binode **bp = &b;
3109                 while (*bp)
3110                         bp = (struct binode **)&(*bp)->left;
3111                 *bp = $<A;
3112                 $0 = b;
3113         } }$
3114
3115         ArgsLine -> ${ $0 = NULL; }$
3116         | Varlist ${ $0 = $<1; }$
3117         | Varlist ; ${ $0 = $<1; }$
3118
3119         Varlist -> Varlist ; ArgDecl ${
3120                 $0 = new(binode);
3121                 $0->op = List;
3122                 $0->left = $<Vl;
3123                 $0->right = $<AD;
3124         }$
3125         | ArgDecl ${
3126                 $0 = new(binode);
3127                 $0->op = List;
3128                 $0->left = NULL;
3129                 $0->right = $<AD;
3130         }$
3131
3132         $*var
3133         ArgDecl -> IDENTIFIER : FormalType ${ {
3134                 struct variable *v = var_decl(c, $1.txt);
3135                 $0 = new(var);
3136                 $0->var = v;
3137                 v->type = $<FT;
3138         } }$
3139
3140 ##### Function calls
3141
3142 A function call can appear either as an expression or as a statement.
3143 We use a new 'Funcall' binode type to link the function with a list of
3144 arguments, form with the 'List' nodes.
3145
3146 We have already seen the "Term" which is how a function call can appear
3147 in an expression.  To parse a function call into a statement we include
3148 it in the "SimpleStatement Grammar" which will be described later.
3149
3150 ###### Binode types
3151         Funcall,
3152
3153 ###### term grammar
3154         | Term ( ExpressionList ) ${ {
3155                 struct binode *b = new(binode);
3156                 b->op = Funcall;
3157                 b->left = $<T;
3158                 b->right = reorder_bilist($<EL);
3159                 $0 = b;
3160         } }$
3161         | Term ( ) ${ {
3162                 struct binode *b = new(binode);
3163                 b->op = Funcall;
3164                 b->left = $<T;
3165                 b->right = NULL;
3166                 $0 = b;
3167         } }$
3168
3169 ###### SimpleStatement Grammar
3170
3171         | Term ( ExpressionList ) ${ {
3172                 struct binode *b = new(binode);
3173                 b->op = Funcall;
3174                 b->left = $<T;
3175                 b->right = reorder_bilist($<EL);
3176                 $0 = b;
3177         } }$
3178
3179 ###### print binode cases
3180
3181         case Funcall:
3182                 do_indent(indent, "");
3183                 print_exec(b->left, -1, bracket);
3184                 printf("(");
3185                 for (b = cast(binode, b->right); b; b = cast(binode, b->right)) {
3186                         if (b->left) {
3187                                 printf(" ");
3188                                 print_exec(b->left, -1, bracket);
3189                                 if (b->right)
3190                                         printf(",");
3191                         }
3192                 }
3193                 printf(")");
3194                 if (indent >= 0)
3195                         printf("\n");
3196                 break;
3197
3198 ###### propagate binode cases
3199
3200         case Funcall: {
3201                 /* Every arg must match formal parameter, and result
3202                  * is return type of function
3203                  */
3204                 struct binode *args = cast(binode, b->right);
3205                 struct var *v = cast(var, b->left);
3206
3207                 if (!v->var->type || v->var->type->check_args == NULL) {
3208                         type_err(c, "error: attempt to call a non-function.",
3209                                  prog, NULL, 0, NULL);
3210                         return NULL;
3211                 }
3212                 *perr |= Enoconst;
3213                 v->var->type->check_args(c, perr, v->var->type, args);
3214                 if (v->var->type->function.inline_result)
3215                         *perr |= Emaycopy;
3216                 return v->var->type->function.return_type;
3217         }
3218
3219 ###### interp binode cases
3220
3221         case Funcall: {
3222                 struct var *v = cast(var, b->left);
3223                 struct type *t = v->var->type;
3224                 void *oldlocal = c->local;
3225                 int old_size = c->local_size;
3226                 void *local = calloc(1, t->function.local_size);
3227                 struct value *fbody = var_value(c, v->var);
3228                 struct binode *arg = cast(binode, b->right);
3229                 struct binode *param = t->function.params;
3230
3231                 while (param) {
3232                         struct var *pv = cast(var, param->left);
3233                         struct type *vtype = NULL;
3234                         struct value val = interp_exec(c, arg->left, &vtype);
3235                         struct value *lval;
3236                         c->local = local; c->local_size = t->function.local_size;
3237                         lval = var_value(c, pv->var);
3238                         c->local = oldlocal; c->local_size = old_size;
3239                         memcpy(lval, &val, vtype->size);
3240                         param = cast(binode, param->right);
3241                         arg = cast(binode, arg->right);
3242                 }
3243                 c->local = local; c->local_size = t->function.local_size;
3244                 if (t->function.inline_result && dtype) {
3245                         _interp_exec(c, fbody->function, NULL, NULL);
3246                         memcpy(dest, local, dtype->size);
3247                         rvtype = ret.type = NULL;
3248                 } else
3249                         rv = interp_exec(c, fbody->function, &rvtype);
3250                 c->local = oldlocal; c->local_size = old_size;
3251                 free(local);
3252                 break;
3253         }
3254
3255 ## Complex executables: statements and expressions
3256
3257 Now that we have types and values and variables and most of the basic
3258 Terms which provide access to these, we can explore the more complex
3259 code that combine all of these to get useful work done.  Specifically
3260 statements and expressions.
3261
3262 Expressions are various combinations of Terms.  We will use operator
3263 precedence to ensure correct parsing.  The simplest Expression is just a
3264 Term - others will follow.
3265
3266 ###### Grammar
3267
3268         $*exec
3269         Expression -> Term ${ $0 = $<Term; }$
3270         ## expression grammar
3271
3272 ### Expressions: Conditional
3273
3274 Our first user of the `binode` will be conditional expressions, which
3275 is a bit odd as they actually have three components.  That will be
3276 handled by having 2 binodes for each expression.  The conditional
3277 expression is the lowest precedence operator which is why we define it
3278 first - to start the precedence list.
3279
3280 Conditional expressions are of the form "value `if` condition `else`
3281 other_value".  They associate to the right, so everything to the right
3282 of `else` is part of an else value, while only a higher-precedence to
3283 the left of `if` is the if values.  Between `if` and `else` there is no
3284 room for ambiguity, so a full conditional expression is allowed in
3285 there.
3286
3287 ###### Binode types
3288         CondExpr,
3289
3290 ###### declare terminals
3291
3292         $LEFT if $$ifelse
3293
3294 ###### expression grammar
3295
3296         | Expression if Expression else Expression $$ifelse ${ {
3297                 struct binode *b1 = new(binode);
3298                 struct binode *b2 = new(binode);
3299                 b1->op = CondExpr;
3300                 b1->left = $<3;
3301                 b1->right = b2;
3302                 b2->op = CondExpr;
3303                 b2->left = $<1;
3304                 b2->right = $<5;
3305                 $0 = b1;
3306         } }$
3307
3308 ###### print binode cases
3309
3310         case CondExpr:
3311                 b2 = cast(binode, b->right);
3312                 if (bracket) printf("(");
3313                 print_exec(b2->left, -1, bracket);
3314                 printf(" if ");
3315                 print_exec(b->left, -1, bracket);
3316                 printf(" else ");
3317                 print_exec(b2->right, -1, bracket);
3318                 if (bracket) printf(")");
3319                 break;
3320
3321 ###### propagate binode cases
3322
3323         case CondExpr: {
3324                 /* cond must be Tbool, others must match */
3325                 struct binode *b2 = cast(binode, b->right);
3326                 struct type *t2;
3327
3328                 propagate_types(b->left, c, perr, Tbool, 0);
3329                 t = propagate_types(b2->left, c, perr, type, Rnolabel);
3330                 t2 = propagate_types(b2->right, c, perr, type ?: t, Rnolabel);
3331                 return t ?: t2;
3332         }
3333
3334 ###### interp binode cases
3335
3336         case CondExpr: {
3337                 struct binode *b2 = cast(binode, b->right);
3338                 left = interp_exec(c, b->left, &ltype);
3339                 if (left.bool)
3340                         rv = interp_exec(c, b2->left, &rvtype); // UNTESTED
3341                 else
3342                         rv = interp_exec(c, b2->right, &rvtype);
3343                 }
3344                 break;
3345
3346 ### Expression list
3347
3348 We take a brief detour, now that we have expressions, to describe lists
3349 of expressions.  These will be needed for function parameters and
3350 possibly other situations.  They seem generic enough to introduce here
3351 to be used elsewhere.
3352
3353 And ExpressionList will use the `List` type of `binode`, building up at
3354 the end.  And place where they are used will probably call
3355 `reorder_bilist()` to get a more normal first/next arrangement.
3356
3357 ###### declare terminals
3358         $TERM ,
3359
3360 `List` execs have no implicit semantics, so they are never propagated or
3361 interpreted.  The can be printed as a comma separate list, which is how
3362 they are parsed.  Note they are also used for function formal parameter
3363 lists.  In that case a separate function is used to print them.
3364
3365 ###### print binode cases
3366         case List:
3367                 while (b) {
3368                         printf(" ");
3369                         print_exec(b->left, -1, bracket);
3370                         if (b->right)
3371                                 printf(",");
3372                         b = cast(binode, b->right);
3373                 }
3374                 break;
3375
3376 ###### propagate binode cases
3377         case List: abort(); // NOTEST
3378 ###### interp binode cases
3379         case List: abort(); // NOTEST
3380
3381 ###### Grammar
3382
3383         $*binode
3384         ExpressionList -> ExpressionList , Expression ${
3385                 $0 = new(binode);
3386                 $0->op = List;
3387                 $0->left = $<1;
3388                 $0->right = $<3;
3389         }$
3390         | Expression ${
3391                 $0 = new(binode);
3392                 $0->op = List;
3393                 $0->left = NULL;
3394                 $0->right = $<1;
3395         }$
3396
3397 ### Expressions: Boolean
3398
3399 The next class of expressions to use the `binode` will be Boolean
3400 expressions.  "`and then`" and "`or else`" are similar to `and` and `or`
3401 have same corresponding precendence.  The difference is that they don't
3402 evaluate the second expression if not necessary.
3403
3404 ###### Binode types
3405         And,
3406         AndThen,
3407         Or,
3408         OrElse,
3409         Not,
3410
3411 ###### declare terminals
3412         $LEFT or
3413         $LEFT and
3414         $LEFT not
3415
3416 ###### expression grammar
3417         | Expression or Expression ${ {
3418                 struct binode *b = new(binode);
3419                 b->op = Or;
3420                 b->left = $<1;
3421                 b->right = $<3;
3422                 $0 = b;
3423         } }$
3424         | Expression or else Expression ${ {
3425                 struct binode *b = new(binode);
3426                 b->op = OrElse;
3427                 b->left = $<1;
3428                 b->right = $<4;
3429                 $0 = b;
3430         } }$
3431
3432         | Expression and Expression ${ {
3433                 struct binode *b = new(binode);
3434                 b->op = And;
3435                 b->left = $<1;
3436                 b->right = $<3;
3437                 $0 = b;
3438         } }$
3439         | Expression and then Expression ${ {
3440                 struct binode *b = new(binode);
3441                 b->op = AndThen;
3442                 b->left = $<1;
3443                 b->right = $<4;
3444                 $0 = b;
3445         } }$
3446
3447         | not Expression ${ {
3448                 struct binode *b = new(binode);
3449                 b->op = Not;
3450                 b->right = $<2;
3451                 $0 = b;
3452         } }$
3453
3454 ###### print binode cases
3455         case And:
3456                 if (bracket) printf("(");
3457                 print_exec(b->left, -1, bracket);
3458                 printf(" and ");
3459                 print_exec(b->right, -1, bracket);
3460                 if (bracket) printf(")");
3461                 break;
3462         case AndThen:
3463                 if (bracket) printf("(");
3464                 print_exec(b->left, -1, bracket);
3465                 printf(" and then ");
3466                 print_exec(b->right, -1, bracket);
3467                 if (bracket) printf(")");
3468                 break;
3469         case Or:
3470                 if (bracket) printf("(");
3471                 print_exec(b->left, -1, bracket);
3472                 printf(" or ");
3473                 print_exec(b->right, -1, bracket);
3474                 if (bracket) printf(")");
3475                 break;
3476         case OrElse:
3477                 if (bracket) printf("(");
3478                 print_exec(b->left, -1, bracket);
3479                 printf(" or else ");
3480                 print_exec(b->right, -1, bracket);
3481                 if (bracket) printf(")");
3482                 break;
3483         case Not:
3484                 if (bracket) printf("(");
3485                 printf("not ");
3486                 print_exec(b->right, -1, bracket);
3487                 if (bracket) printf(")");
3488                 break;
3489
3490 ###### propagate binode cases
3491         case And:
3492         case AndThen:
3493         case Or:
3494         case OrElse:
3495         case Not:
3496                 /* both must be Tbool, result is Tbool */
3497                 propagate_types(b->left, c, perr, Tbool, 0);
3498                 propagate_types(b->right, c, perr, Tbool, 0);
3499                 if (type && type != Tbool)
3500                         type_err(c, "error: %1 operation found where %2 expected", prog,
3501                                    Tbool, 0, type);
3502                 return Tbool;
3503
3504 ###### interp binode cases
3505         case And:
3506                 rv = interp_exec(c, b->left, &rvtype);
3507                 right = interp_exec(c, b->right, &rtype);
3508                 rv.bool = rv.bool && right.bool;
3509                 break;
3510         case AndThen:
3511                 rv = interp_exec(c, b->left, &rvtype);
3512                 if (rv.bool)
3513                         rv = interp_exec(c, b->right, NULL);
3514                 break;
3515         case Or:
3516                 rv = interp_exec(c, b->left, &rvtype);
3517                 right = interp_exec(c, b->right, &rtype);
3518                 rv.bool = rv.bool || right.bool;
3519                 break;
3520         case OrElse:
3521                 rv = interp_exec(c, b->left, &rvtype);
3522                 if (!rv.bool)
3523                         rv = interp_exec(c, b->right, NULL);
3524                 break;
3525         case Not:
3526                 rv = interp_exec(c, b->right, &rvtype);
3527                 rv.bool = !rv.bool;
3528                 break;
3529
3530 ### Expressions: Comparison
3531
3532 Of slightly higher precedence that Boolean expressions are Comparisons.
3533 A comparison takes arguments of any comparable type, but the two types
3534 must be the same.
3535
3536 To simplify the parsing we introduce an `eop` which can record an
3537 expression operator, and the `CMPop` non-terminal will match one of them.
3538
3539 ###### ast
3540         struct eop {
3541                 enum Btype op;
3542         };
3543
3544 ###### ast functions
3545         static void free_eop(struct eop *e)
3546         {
3547                 if (e)
3548                         free(e);
3549         }
3550
3551 ###### Binode types
3552         Less,
3553         Gtr,
3554         LessEq,
3555         GtrEq,
3556         Eql,
3557         NEql,
3558
3559 ###### declare terminals
3560         $LEFT < > <= >= == != CMPop
3561
3562 ###### expression grammar
3563         | Expression CMPop Expression ${ {
3564                 struct binode *b = new(binode);
3565                 b->op = $2.op;
3566                 b->left = $<1;
3567                 b->right = $<3;
3568                 $0 = b;
3569         } }$
3570
3571 ###### Grammar
3572
3573         $eop
3574         CMPop ->  < ${ $0.op = Less; }$
3575         |         > ${ $0.op = Gtr; }$
3576         |         <= ${ $0.op = LessEq; }$
3577         |         >= ${ $0.op = GtrEq; }$
3578         |         == ${ $0.op = Eql; }$
3579         |         != ${ $0.op = NEql; }$
3580
3581 ###### print binode cases
3582
3583         case Less:
3584         case LessEq:
3585         case Gtr:
3586         case GtrEq:
3587         case Eql:
3588         case NEql:
3589                 if (bracket) printf("(");
3590                 print_exec(b->left, -1, bracket);
3591                 switch(b->op) {
3592                 case Less:   printf(" < "); break;
3593                 case LessEq: printf(" <= "); break;
3594                 case Gtr:    printf(" > "); break;
3595                 case GtrEq:  printf(" >= "); break;
3596                 case Eql:    printf(" == "); break;
3597                 case NEql:   printf(" != "); break;
3598                 default: abort();               // NOTEST
3599                 }
3600                 print_exec(b->right, -1, bracket);
3601                 if (bracket) printf(")");
3602                 break;
3603
3604 ###### propagate binode cases
3605         case Less:
3606         case LessEq:
3607         case Gtr:
3608         case GtrEq:
3609         case Eql:
3610         case NEql:
3611                 /* Both must match but not be labels, result is Tbool */
3612                 t = propagate_types(b->left, c, perr, NULL, Rnolabel);
3613                 if (t)
3614                         propagate_types(b->right, c, perr, t, 0);
3615                 else {
3616                         t = propagate_types(b->right, c, perr, NULL, Rnolabel); // UNTESTED
3617                         if (t)  // UNTESTED
3618                                 t = propagate_types(b->left, c, perr, t, 0);    // UNTESTED
3619                 }
3620                 if (!type_compat(type, Tbool, 0))
3621                         type_err(c, "error: Comparison returns %1 but %2 expected", prog,
3622                                     Tbool, rules, type);
3623                 return Tbool;
3624
3625 ###### interp binode cases
3626         case Less:
3627         case LessEq:
3628         case Gtr:
3629         case GtrEq:
3630         case Eql:
3631         case NEql:
3632         {
3633                 int cmp;
3634                 left = interp_exec(c, b->left, &ltype);
3635                 right = interp_exec(c, b->right, &rtype);
3636                 cmp = value_cmp(ltype, rtype, &left, &right);
3637                 rvtype = Tbool;
3638                 switch (b->op) {
3639                 case Less:      rv.bool = cmp <  0; break;
3640                 case LessEq:    rv.bool = cmp <= 0; break;
3641                 case Gtr:       rv.bool = cmp >  0; break;
3642                 case GtrEq:     rv.bool = cmp >= 0; break;
3643                 case Eql:       rv.bool = cmp == 0; break;
3644                 case NEql:      rv.bool = cmp != 0; break;
3645                 default:        rv.bool = 0; break;     // NOTEST
3646                 }
3647                 break;
3648         }
3649
3650 ### Expressions: Arithmetic etc.
3651
3652 The remaining expressions with the highest precedence are arithmetic,
3653 string concatenation, and string conversion.  String concatenation
3654 (`++`) has the same precedence as multiplication and division, but lower
3655 than the uniary.
3656
3657 String conversion is a temporary feature until I get a better type
3658 system.  `$` is a prefix operator which expects a string and returns
3659 a number.
3660
3661 `+` and `-` are both infix and prefix operations (where they are
3662 absolute value and negation).  These have different operator names.
3663
3664 We also have a 'Bracket' operator which records where parentheses were
3665 found.  This makes it easy to reproduce these when printing.  Possibly I
3666 should only insert brackets were needed for precedence.  Putting
3667 parentheses around an expression converts it into a Term,
3668
3669 ###### Binode types
3670         Plus, Minus,
3671         Times, Divide, Rem,
3672         Concat,
3673         Absolute, Negate,
3674         StringConv,
3675         Bracket,
3676
3677 ###### declare terminals
3678         $LEFT + - Eop
3679         $LEFT * / % ++ Top
3680         $LEFT Uop $
3681         $TERM ( )
3682
3683 ###### expression grammar
3684         | Expression Eop Expression ${ {
3685                 struct binode *b = new(binode);
3686                 b->op = $2.op;
3687                 b->left = $<1;
3688                 b->right = $<3;
3689                 $0 = b;
3690         } }$
3691
3692         | Expression Top Expression ${ {
3693                 struct binode *b = new(binode);
3694                 b->op = $2.op;
3695                 b->left = $<1;
3696                 b->right = $<3;
3697                 $0 = b;
3698         } }$
3699
3700         | Uop Expression ${ {
3701                 struct binode *b = new(binode);
3702                 b->op = $1.op;
3703                 b->right = $<2;
3704                 $0 = b;
3705         } }$
3706
3707 ###### term grammar
3708
3709         | ( Expression ) ${ {
3710                 struct binode *b = new_pos(binode, $1);
3711                 b->op = Bracket;
3712                 b->right = $<2;
3713                 $0 = b;
3714         } }$
3715
3716 ###### Grammar
3717
3718         $eop
3719         Eop ->   + ${ $0.op = Plus; }$
3720         |        - ${ $0.op = Minus; }$
3721
3722         Uop ->   + ${ $0.op = Absolute; }$
3723         |        - ${ $0.op = Negate; }$
3724         |        $ ${ $0.op = StringConv; }$
3725
3726         Top ->   * ${ $0.op = Times; }$
3727         |        / ${ $0.op = Divide; }$
3728         |        % ${ $0.op = Rem; }$
3729         |        ++ ${ $0.op = Concat; }$
3730
3731 ###### print binode cases
3732         case Plus:
3733         case Minus:
3734         case Times:
3735         case Divide:
3736         case Concat:
3737         case Rem:
3738                 if (bracket) printf("(");
3739                 print_exec(b->left, indent, bracket);
3740                 switch(b->op) {
3741                 case Plus:   fputs(" + ", stdout); break;
3742                 case Minus:  fputs(" - ", stdout); break;
3743                 case Times:  fputs(" * ", stdout); break;
3744                 case Divide: fputs(" / ", stdout); break;
3745                 case Rem:    fputs(" % ", stdout); break;
3746                 case Concat: fputs(" ++ ", stdout); break;
3747                 default: abort();       // NOTEST
3748                 }                       // NOTEST
3749                 print_exec(b->right, indent, bracket);
3750                 if (bracket) printf(")");
3751                 break;
3752         case Absolute:
3753         case Negate:
3754         case StringConv:
3755                 if (bracket) printf("(");
3756                 switch (b->op) {
3757                 case Absolute:   fputs("+", stdout); break;
3758                 case Negate:     fputs("-", stdout); break;
3759                 case StringConv: fputs("$", stdout); break;
3760                 default: abort();       // NOTEST
3761                 }                       // NOTEST
3762                 print_exec(b->right, indent, bracket);
3763                 if (bracket) printf(")");
3764                 break;
3765         case Bracket:
3766                 printf("(");
3767                 print_exec(b->right, indent, bracket);
3768                 printf(")");
3769                 break;
3770
3771 ###### propagate binode cases
3772         case Plus:
3773         case Minus:
3774         case Times:
3775         case Rem:
3776         case Divide:
3777                 /* both must be numbers, result is Tnum */
3778         case Absolute:
3779         case Negate:
3780                 /* as propagate_types ignores a NULL,
3781                  * unary ops fit here too */
3782                 propagate_types(b->left, c, perr, Tnum, 0);
3783                 propagate_types(b->right, c, perr, Tnum, 0);
3784                 if (!type_compat(type, Tnum, 0))
3785                         type_err(c, "error: Arithmetic returns %1 but %2 expected", prog,
3786                                    Tnum, rules, type);
3787                 return Tnum;
3788
3789         case Concat:
3790                 /* both must be Tstr, result is Tstr */
3791                 propagate_types(b->left, c, perr, Tstr, 0);
3792                 propagate_types(b->right, c, perr, Tstr, 0);
3793                 if (!type_compat(type, Tstr, 0))
3794                         type_err(c, "error: Concat returns %1 but %2 expected", prog,
3795                                    Tstr, rules, type);
3796                 return Tstr;
3797
3798         case StringConv:
3799                 /* op must be string, result is number */
3800                 propagate_types(b->left, c, perr, Tstr, 0);
3801                 if (!type_compat(type, Tnum, 0))
3802                         type_err(c,     // UNTESTED
3803                           "error: Can only convert string to number, not %1",
3804                                 prog, type, 0, NULL);
3805                 return Tnum;
3806
3807         case Bracket:
3808                 return propagate_types(b->right, c, perr, type, 0);
3809
3810 ###### interp binode cases
3811
3812         case Plus:
3813                 rv = interp_exec(c, b->left, &rvtype);
3814                 right = interp_exec(c, b->right, &rtype);
3815                 mpq_add(rv.num, rv.num, right.num);
3816                 break;
3817         case Minus:
3818                 rv = interp_exec(c, b->left, &rvtype);
3819                 right = interp_exec(c, b->right, &rtype);
3820                 mpq_sub(rv.num, rv.num, right.num);
3821                 break;
3822         case Times:
3823                 rv = interp_exec(c, b->left, &rvtype);
3824                 right = interp_exec(c, b->right, &rtype);
3825                 mpq_mul(rv.num, rv.num, right.num);
3826                 break;
3827         case Divide:
3828                 rv = interp_exec(c, b->left, &rvtype);
3829                 right = interp_exec(c, b->right, &rtype);
3830                 mpq_div(rv.num, rv.num, right.num);
3831                 break;
3832         case Rem: {
3833                 mpz_t l, r, rem;
3834
3835                 left = interp_exec(c, b->left, &ltype);
3836                 right = interp_exec(c, b->right, &rtype);
3837                 mpz_init(l); mpz_init(r); mpz_init(rem);
3838                 mpz_tdiv_q(l, mpq_numref(left.num), mpq_denref(left.num));
3839                 mpz_tdiv_q(r, mpq_numref(right.num), mpq_denref(right.num));
3840                 mpz_tdiv_r(rem, l, r);
3841                 val_init(Tnum, &rv);
3842                 mpq_set_z(rv.num, rem);
3843                 mpz_clear(r); mpz_clear(l); mpz_clear(rem);
3844                 rvtype = ltype;
3845                 break;
3846         }
3847         case Negate:
3848                 rv = interp_exec(c, b->right, &rvtype);
3849                 mpq_neg(rv.num, rv.num);
3850                 break;
3851         case Absolute:
3852                 rv = interp_exec(c, b->right, &rvtype);
3853                 mpq_abs(rv.num, rv.num);
3854                 break;
3855         case Bracket:
3856                 rv = interp_exec(c, b->right, &rvtype);
3857                 break;
3858         case Concat:
3859                 left = interp_exec(c, b->left, &ltype);
3860                 right = interp_exec(c, b->right, &rtype);
3861                 rvtype = Tstr;
3862                 rv.str = text_join(left.str, right.str);
3863                 break;
3864         case StringConv:
3865                 right = interp_exec(c, b->right, &rvtype);
3866                 rtype = Tstr;
3867                 rvtype = Tnum;
3868
3869                 struct text tx = right.str;
3870                 char tail[3];
3871                 int neg = 0;
3872                 if (tx.txt[0] == '-') {
3873                         neg = 1;        // UNTESTED
3874                         tx.txt++;       // UNTESTED
3875                         tx.len--;       // UNTESTED
3876                 }
3877                 if (number_parse(rv.num, tail, tx) == 0)
3878                         mpq_init(rv.num);       // UNTESTED
3879                 else if (neg)
3880                         mpq_neg(rv.num, rv.num);        // UNTESTED
3881                 if (tail[0])
3882                         printf("Unsupported suffix: %.*s\n", tx.len, tx.txt);   // UNTESTED
3883
3884                 break;
3885
3886 ###### value functions
3887
3888         static struct text text_join(struct text a, struct text b)
3889         {
3890                 struct text rv;
3891                 rv.len = a.len + b.len;
3892                 rv.txt = malloc(rv.len);
3893                 memcpy(rv.txt, a.txt, a.len);
3894                 memcpy(rv.txt+a.len, b.txt, b.len);
3895                 return rv;
3896         }
3897
3898 ### Blocks, Statements, and Statement lists.
3899
3900 Now that we have expressions out of the way we need to turn to
3901 statements.  There are simple statements and more complex statements.
3902 Simple statements do not contain (syntactic) newlines, complex statements do.
3903
3904 Statements often come in sequences and we have corresponding simple
3905 statement lists and complex statement lists.
3906 The former comprise only simple statements separated by semicolons.
3907 The later comprise complex statements and simple statement lists.  They are
3908 separated by newlines.  Thus the semicolon is only used to separate
3909 simple statements on the one line.  This may be overly restrictive,
3910 but I'm not sure I ever want a complex statement to share a line with
3911 anything else.
3912
3913 Note that a simple statement list can still use multiple lines if
3914 subsequent lines are indented, so
3915
3916 ###### Example: wrapped simple statement list
3917
3918         a = b; c = d;
3919            e = f; print g
3920
3921 is a single simple statement list.  This might allow room for
3922 confusion, so I'm not set on it yet.
3923
3924 A simple statement list needs no extra syntax.  A complex statement
3925 list has two syntactic forms.  It can be enclosed in braces (much like
3926 C blocks), or it can be introduced by an indent and continue until an
3927 unindented newline (much like Python blocks).  With this extra syntax
3928 it is referred to as a block.
3929
3930 Note that a block does not have to include any newlines if it only
3931 contains simple statements.  So both of:
3932
3933         if condition: a=b; d=f
3934
3935         if condition { a=b; print f }
3936
3937 are valid.
3938
3939 In either case the list is constructed from a `binode` list with
3940 `Block` as the operator.  When parsing the list it is most convenient
3941 to append to the end, so a list is a list and a statement.  When using
3942 the list it is more convenient to consider a list to be a statement
3943 and a list.  So we need a function to re-order a list.
3944 `reorder_bilist` serves this purpose.
3945
3946 The only stand-alone statement we introduce at this stage is `pass`
3947 which does nothing and is represented as a `NULL` pointer in a `Block`
3948 list.  Other stand-alone statements will follow once the infrastructure
3949 is in-place.
3950
3951 As many statements will use binodes, we declare a binode pointer 'b' in
3952 the common header for all reductions to use.
3953
3954 ###### Parser: reduce
3955         struct binode *b;
3956
3957 ###### Binode types
3958         Block,
3959
3960 ###### Grammar
3961
3962         $TERM { } ;
3963
3964         $*binode
3965         Block -> { IN OptNL Statementlist OUT OptNL } ${ $0 = $<Sl; }$
3966         |        { SimpleStatements } ${ $0 = reorder_bilist($<SS); }$
3967         |        SimpleStatements ; ${ $0 = reorder_bilist($<SS); }$
3968         |        SimpleStatements EOL ${ $0 = reorder_bilist($<SS); }$
3969         |        IN OptNL Statementlist OUT ${ $0 = $<Sl; }$
3970
3971         OpenBlock -> OpenScope { IN OptNL Statementlist OUT OptNL } ${ $0 = $<Sl; }$
3972         |        OpenScope { SimpleStatements } ${ $0 = reorder_bilist($<SS); }$
3973         |        OpenScope SimpleStatements ; ${ $0 = reorder_bilist($<SS); }$
3974         |        OpenScope SimpleStatements EOL ${ $0 = reorder_bilist($<SS); }$
3975         |        IN OpenScope OptNL Statementlist OUT ${ $0 = $<Sl; }$
3976
3977         UseBlock -> { OpenScope IN OptNL Statementlist OUT OptNL } ${ $0 = $<Sl; }$
3978         |        { OpenScope SimpleStatements } ${ $0 = reorder_bilist($<SS); }$
3979         |        IN OpenScope OptNL Statementlist OUT ${ $0 = $<Sl; }$
3980
3981         ColonBlock -> { IN OptNL Statementlist OUT OptNL } ${ $0 = $<Sl; }$
3982         |        { SimpleStatements } ${ $0 = reorder_bilist($<SS); }$
3983         |        : SimpleStatements ; ${ $0 = reorder_bilist($<SS); }$
3984         |        : SimpleStatements EOL ${ $0 = reorder_bilist($<SS); }$
3985         |        : IN OptNL Statementlist OUT ${ $0 = $<Sl; }$
3986
3987         Statementlist -> ComplexStatements ${ $0 = reorder_bilist($<CS); }$
3988
3989         ComplexStatements -> ComplexStatements ComplexStatement ${
3990                 if ($2 == NULL) {
3991                         $0 = $<1;
3992                 } else {
3993                         $0 = new(binode);
3994                         $0->op = Block;
3995                         $0->left = $<1;
3996                         $0->right = $<2;
3997                 }
3998         }$
3999         | ComplexStatement ${
4000                 if ($1 == NULL) {
4001                         $0 = NULL;
4002                 } else {
4003                         $0 = new(binode);
4004                         $0->op = Block;
4005                         $0->left = NULL;
4006                         $0->right = $<1;
4007                 }
4008         }$
4009
4010         $*exec
4011         ComplexStatement -> SimpleStatements Newlines ${
4012                 $0 = reorder_bilist($<SS);
4013         }$
4014         |  SimpleStatements ; Newlines ${
4015                 $0 = reorder_bilist($<SS);
4016         }$
4017         ## ComplexStatement Grammar
4018
4019         $*binode
4020         SimpleStatements -> SimpleStatements ; SimpleStatement ${
4021                 $0 = new(binode);
4022                 $0->op = Block;
4023                 $0->left = $<1;
4024                 $0->right = $<3;
4025         }$
4026         | SimpleStatement ${
4027                 $0 = new(binode);
4028                 $0->op = Block;
4029                 $0->left = NULL;
4030                 $0->right = $<1;
4031         }$
4032
4033         $TERM pass
4034         $*exec
4035         SimpleStatement -> pass ${ $0 = NULL; }$
4036         | ERROR ${ tok_err(c, "Syntax error in statement", &$1); }$
4037         ## SimpleStatement Grammar
4038
4039 ###### print binode cases
4040         case Block:
4041                 if (indent < 0) {
4042                         // simple statement
4043                         if (b->left == NULL)    // UNTESTED
4044                                 printf("pass"); // UNTESTED
4045                         else
4046                                 print_exec(b->left, indent, bracket);   // UNTESTED
4047                         if (b->right) { // UNTESTED
4048                                 printf("; ");   // UNTESTED
4049                                 print_exec(b->right, indent, bracket);  // UNTESTED
4050                         }
4051                 } else {
4052                         // block, one per line
4053                         if (b->left == NULL)
4054                                 do_indent(indent, "pass\n");
4055                         else
4056                                 print_exec(b->left, indent, bracket);
4057                         if (b->right)
4058                                 print_exec(b->right, indent, bracket);
4059                 }
4060                 break;
4061
4062 ###### propagate binode cases
4063         case Block:
4064         {
4065                 /* If any statement returns something other than Tnone
4066                  * or Tbool then all such must return same type.
4067                  * As each statement may be Tnone or something else,
4068                  * we must always pass NULL (unknown) down, otherwise an incorrect
4069                  * error might occur.  We never return Tnone unless it is
4070                  * passed in.
4071                  */
4072                 struct binode *e;
4073
4074                 for (e = b; e; e = cast(binode, e->right)) {
4075                         t = propagate_types(e->left, c, perr, NULL, rules);
4076                         if ((rules & Rboolok) && (t == Tbool || t == Tnone))
4077                                 t = NULL;
4078                         if (t == Tnone && e->right)
4079                                 /* Only the final statement *must* return a value
4080                                  * when not Rboolok
4081                                  */
4082                                 t = NULL;
4083                         if (t) {
4084                                 if (!type)
4085                                         type = t;
4086                                 else if (t != type)
4087                                         type_err(c, "error: expected %1%r, found %2",
4088                                                  e->left, type, rules, t);
4089                         }
4090                 }
4091                 return type;
4092         }
4093
4094 ###### interp binode cases
4095         case Block:
4096                 while (rvtype == Tnone &&
4097                        b) {
4098                         if (b->left)
4099                                 rv = interp_exec(c, b->left, &rvtype);
4100                         b = cast(binode, b->right);
4101                 }
4102                 break;
4103
4104 ### The Print statement
4105
4106 `print` is a simple statement that takes a comma-separated list of
4107 expressions and prints the values separated by spaces and terminated
4108 by a newline.  No control of formatting is possible.
4109
4110 `print` uses `ExpressionList` to collect the expressions and stores them
4111 on the left side of a `Print` binode unlessthere is a trailing comma
4112 when the list is stored on the `right` side and no trailing newline is
4113 printed.
4114
4115 ###### Binode types
4116         Print,
4117
4118 ##### declare terminals
4119         $TERM print
4120
4121 ###### SimpleStatement Grammar
4122
4123         | print ExpressionList ${
4124                 $0 = b = new(binode);
4125                 b->op = Print;
4126                 b->right = NULL;
4127                 b->left = reorder_bilist($<EL);
4128         }$
4129         | print ExpressionList , ${ {
4130                 $0 = b = new(binode);
4131                 b->op = Print;
4132                 b->right = reorder_bilist($<EL);
4133                 b->left = NULL;
4134         } }$
4135         | print ${
4136                 $0 = b = new(binode);
4137                 b->op = Print;
4138                 b->left = NULL;
4139                 b->right = NULL;
4140         }$
4141
4142 ###### print binode cases
4143
4144         case Print:
4145                 do_indent(indent, "print");
4146                 if (b->right) {
4147                         print_exec(b->right, -1, bracket);
4148                         printf(",");
4149                 } else
4150                         print_exec(b->left, -1, bracket);
4151                 if (indent >= 0)
4152                         printf("\n");
4153                 break;
4154
4155 ###### propagate binode cases
4156
4157         case Print:
4158                 /* don't care but all must be consistent */
4159                 if (b->left)
4160                         b = cast(binode, b->left);
4161                 else
4162                         b = cast(binode, b->right);
4163                 while (b) {
4164                         propagate_types(b->left, c, perr, NULL, Rnolabel);
4165                         b = cast(binode, b->right);
4166                 }
4167                 break;
4168
4169 ###### interp binode cases
4170
4171         case Print:
4172         {
4173                 struct binode *b2 = cast(binode, b->left);
4174                 if (!b2)
4175                         b2 = cast(binode, b->right);
4176                 for (; b2; b2 = cast(binode, b2->right)) {
4177                         left = interp_exec(c, b2->left, &ltype);
4178                         print_value(ltype, &left, stdout);
4179                         free_value(ltype, &left);
4180                         if (b2->right)
4181                                 putchar(' ');
4182                 }
4183                 if (b->right == NULL)
4184                         printf("\n");
4185                 ltype = Tnone;
4186                 break;
4187         }
4188
4189 ###### Assignment statement
4190
4191 An assignment will assign a value to a variable, providing it hasn't
4192 been declared as a constant.  The analysis phase ensures that the type
4193 will be correct so the interpreter just needs to perform the
4194 calculation.  There is a form of assignment which declares a new
4195 variable as well as assigning a value.  If a name is assigned before
4196 it is declared, and error will be raised as the name is created as
4197 `Tlabel` and it is illegal to assign to such names.
4198
4199 ###### Binode types
4200         Assign,
4201         Declare,
4202
4203 ###### declare terminals
4204         $TERM =
4205
4206 ###### SimpleStatement Grammar
4207         | Term = Expression ${
4208                 $0 = b= new(binode);
4209                 b->op = Assign;
4210                 b->left = $<1;
4211                 b->right = $<3;
4212         }$
4213         | VariableDecl = Expression ${
4214                 $0 = b= new(binode);
4215                 b->op = Declare;
4216                 b->left = $<1;
4217                 b->right =$<3;
4218         }$
4219
4220         | VariableDecl ${
4221                 if ($1->var->where_set == NULL) {
4222                         type_err(c,
4223                                  "Variable declared with no type or value: %v",
4224                                  $1, NULL, 0, NULL);
4225                         free_var($1);
4226                 } else {
4227                         $0 = b = new(binode);
4228                         b->op = Declare;
4229                         b->left = $<1;
4230                         b->right = NULL;
4231                 }
4232         }$
4233
4234 ###### print binode cases
4235
4236         case Assign:
4237                 do_indent(indent, "");
4238                 print_exec(b->left, indent, bracket);
4239                 printf(" = ");
4240                 print_exec(b->right, indent, bracket);
4241                 if (indent >= 0)
4242                         printf("\n");
4243                 break;
4244
4245         case Declare:
4246                 {
4247                 struct variable *v = cast(var, b->left)->var;
4248                 do_indent(indent, "");
4249                 print_exec(b->left, indent, bracket);
4250                 if (cast(var, b->left)->var->constant) {
4251                         printf("::");
4252                         if (v->explicit_type) {
4253                                 type_print(v->type, stdout);
4254                                 printf(" ");
4255                         }
4256                 } else {
4257                         printf(":");
4258                         if (v->explicit_type) {
4259                                 type_print(v->type, stdout);
4260                                 printf(" ");
4261                         }
4262                 }
4263                 if (b->right) {
4264                         printf("= ");
4265                         print_exec(b->right, indent, bracket);
4266                 }
4267                 if (indent >= 0)
4268                         printf("\n");
4269                 }
4270                 break;
4271
4272 ###### propagate binode cases
4273
4274         case Assign:
4275         case Declare:
4276                 /* Both must match and not be labels,
4277                  * Type must support 'dup',
4278                  * For Assign, left must not be constant.
4279                  * result is Tnone
4280                  */
4281                 t = propagate_types(b->left, c, perr, NULL,
4282                                     Rnolabel | (b->op == Assign ? Rnoconstant : 0));
4283                 if (!b->right)
4284                         return Tnone;
4285
4286                 if (t) {
4287                         if (propagate_types(b->right, c, perr, t, 0) != t)
4288                                 if (b->left->type == Xvar)
4289                                         type_err(c, "info: variable '%v' was set as %1 here.",
4290                                                  cast(var, b->left)->var->where_set, t, rules, NULL);
4291                 } else {
4292                         t = propagate_types(b->right, c, perr, NULL, Rnolabel);
4293                         if (t)
4294                                 propagate_types(b->left, c, perr, t,
4295                                                 (b->op == Assign ? Rnoconstant : 0));
4296                 }
4297                 if (t && t->dup == NULL && !(*perr & Emaycopy))
4298                         type_err(c, "error: cannot assign value of type %1", b, t, 0, NULL);
4299                 return Tnone;
4300
4301                 break;
4302
4303 ###### interp binode cases
4304
4305         case Assign:
4306                 lleft = linterp_exec(c, b->left, &ltype);
4307                 if (lleft)
4308                         dinterp_exec(c, b->right, lleft, ltype, 1);
4309                 ltype = Tnone;
4310                 break;
4311
4312         case Declare:
4313         {
4314                 struct variable *v = cast(var, b->left)->var;
4315                 struct value *val;
4316                 v = v->merged;
4317                 val = var_value(c, v);
4318                 if (v->type->prepare_type)
4319                         v->type->prepare_type(c, v->type, 0);
4320                 if (b->right)
4321                         dinterp_exec(c, b->right, val, v->type, 0);
4322                 else
4323                         val_init(v->type, val);
4324                 break;
4325         }
4326
4327 ### The `use` statement
4328
4329 The `use` statement is the last "simple" statement.  It is needed when a
4330 statement block can return a value.  This includes the body of a
4331 function which has a return type, and the "condition" code blocks in
4332 `if`, `while`, and `switch` statements.
4333
4334 ###### Binode types
4335         Use,
4336
4337 ###### declare terminals
4338         $TERM use
4339
4340 ###### SimpleStatement Grammar
4341         | use Expression ${
4342                 $0 = b = new_pos(binode, $1);
4343                 b->op = Use;
4344                 b->right = $<2;
4345                 if (b->right->type == Xvar) {
4346                         struct var *v = cast(var, b->right);
4347                         if (v->var->type == Tnone) {
4348                                 /* Convert this to a label */
4349                                 struct value *val;
4350
4351                                 v->var->type = Tlabel;
4352                                 val = global_alloc(c, Tlabel, v->var, NULL);
4353                                 val->label = val;
4354                         }
4355                 }
4356         }$
4357
4358 ###### print binode cases
4359
4360         case Use:
4361                 do_indent(indent, "use ");
4362                 print_exec(b->right, -1, bracket);
4363                 if (indent >= 0)
4364                         printf("\n");
4365                 break;
4366
4367 ###### propagate binode cases
4368
4369         case Use:
4370                 /* result matches value */
4371                 return propagate_types(b->right, c, perr, type, 0);
4372
4373 ###### interp binode cases
4374
4375         case Use:
4376                 rv = interp_exec(c, b->right, &rvtype);
4377                 break;
4378
4379 ### The Conditional Statement
4380
4381 This is the biggy and currently the only complex statement.  This
4382 subsumes `if`, `while`, `do/while`, `switch`, and some parts of `for`.
4383 It is comprised of a number of parts, all of which are optional though
4384 set combinations apply.  Each part is (usually) a key word (`then` is
4385 sometimes optional) followed by either an expression or a code block,
4386 except the `casepart` which is a "key word and an expression" followed
4387 by a code block.  The code-block option is valid for all parts and,
4388 where an expression is also allowed, the code block can use the `use`
4389 statement to report a value.  If the code block does not report a value
4390 the effect is similar to reporting `True`.
4391
4392 The `else` and `case` parts, as well as `then` when combined with
4393 `if`, can contain a `use` statement which will apply to some
4394 containing conditional statement. `for` parts, `do` parts and `then`
4395 parts used with `for` can never contain a `use`, except in some
4396 subordinate conditional statement.
4397
4398 If there is a `forpart`, it is executed first, only once.
4399 If there is a `dopart`, then it is executed repeatedly providing
4400 always that the `condpart` or `cond`, if present, does not return a non-True
4401 value.  `condpart` can fail to return any value if it simply executes
4402 to completion.  This is treated the same as returning `True`.
4403
4404 If there is a `thenpart` it will be executed whenever the `condpart`
4405 or `cond` returns True (or does not return any value), but this will happen
4406 *after* `dopart` (when present).
4407
4408 If `elsepart` is present it will be executed at most once when the
4409 condition returns `False` or some value that isn't `True` and isn't
4410 matched by any `casepart`.  If there are any `casepart`s, they will be
4411 executed when the condition returns a matching value.
4412
4413 The particular sorts of values allowed in case parts has not yet been
4414 determined in the language design, so nothing is prohibited.
4415
4416 The various blocks in this complex statement potentially provide scope
4417 for variables as described earlier.  Each such block must include the
4418 "OpenScope" nonterminal before parsing the block, and must call
4419 `var_block_close()` when closing the block.
4420
4421 The code following "`if`", "`switch`" and "`for`" does not get its own
4422 scope, but is in a scope covering the whole statement, so names
4423 declared there cannot be redeclared elsewhere.  Similarly the
4424 condition following "`while`" is in a scope the covers the body
4425 ("`do`" part) of the loop, and which does not allow conditional scope
4426 extension.  Code following "`then`" (both looping and non-looping),
4427 "`else`" and "`case`" each get their own local scope.
4428
4429 The type requirements on the code block in a `whilepart` are quite
4430 unusal.  It is allowed to return a value of some identifiable type, in
4431 which case the loop aborts and an appropriate `casepart` is run, or it
4432 can return a Boolean, in which case the loop either continues to the
4433 `dopart` (on `True`) or aborts and runs the `elsepart` (on `False`).
4434 This is different both from the `ifpart` code block which is expected to
4435 return a Boolean, or the `switchpart` code block which is expected to
4436 return the same type as the casepart values.  The correct analysis of
4437 the type of the `whilepart` code block is the reason for the
4438 `Rboolok` flag which is passed to `propagate_types()`.
4439
4440 The `cond_statement` cannot fit into a `binode` so a new `exec` is
4441 defined.  As there are two scopes which cover multiple parts - one for
4442 the whole statement and one for "while" and "do" - and as we will use
4443 the 'struct exec' to track scopes, we actually need two new types of
4444 exec.  One is a `binode` for the looping part, the rest is the
4445 `cond_statement`.  The `cond_statement` will use an auxilliary `struct
4446 casepart` to track a list of case parts.
4447
4448 ###### Binode types
4449         Loop
4450
4451 ###### exec type
4452         Xcond_statement,
4453
4454 ###### ast
4455         struct casepart {
4456                 struct exec *value;
4457                 struct exec *action;
4458                 struct casepart *next;
4459         };
4460         struct cond_statement {
4461                 struct exec;
4462                 struct exec *forpart, *condpart, *thenpart, *elsepart;
4463                 struct binode *looppart;
4464                 struct casepart *casepart;
4465         };
4466
4467 ###### ast functions
4468
4469         static void free_casepart(struct casepart *cp)
4470         {
4471                 while (cp) {
4472                         struct casepart *t;
4473                         free_exec(cp->value);
4474                         free_exec(cp->action);
4475                         t = cp->next;
4476                         free(cp);
4477                         cp = t;
4478                 }
4479         }
4480
4481         static void free_cond_statement(struct cond_statement *s)
4482         {
4483                 if (!s)
4484                         return;
4485                 free_exec(s->forpart);
4486                 free_exec(s->condpart);
4487                 free_exec(s->looppart);
4488                 free_exec(s->thenpart);
4489                 free_exec(s->elsepart);
4490                 free_casepart(s->casepart);
4491                 free(s);
4492         }
4493
4494 ###### free exec cases
4495         case Xcond_statement: free_cond_statement(cast(cond_statement, e)); break;
4496
4497 ###### ComplexStatement Grammar
4498         | CondStatement ${ $0 = $<1; }$
4499
4500 ###### declare terminals
4501         $TERM for then while do
4502         $TERM else
4503         $TERM switch case
4504
4505 ###### Grammar
4506
4507         $*cond_statement
4508         // A CondStatement must end with EOL, as does CondSuffix and
4509         // IfSuffix.
4510         // ForPart, ThenPart, SwitchPart, CasePart are non-empty and
4511         // may or may not end with EOL
4512         // WhilePart and IfPart include an appropriate Suffix
4513
4514         // ForPart, SwitchPart, and IfPart open scopes, o we have to close
4515         // them.  WhilePart opens and closes its own scope.
4516         CondStatement -> ForPart OptNL ThenPart OptNL WhilePart CondSuffix ${
4517                 $0 = $<CS;
4518                 $0->forpart = $<FP;
4519                 $0->thenpart = $<TP;
4520                 $0->looppart = $<WP;
4521                 var_block_close(c, CloseSequential, $0);
4522         }$
4523         | ForPart OptNL WhilePart CondSuffix ${
4524                 $0 = $<CS;
4525                 $0->forpart = $<FP;
4526                 $0->looppart = $<WP;
4527                 var_block_close(c, CloseSequential, $0);
4528         }$
4529         | WhilePart CondSuffix ${
4530                 $0 = $<CS;
4531                 $0->looppart = $<WP;
4532         }$
4533         | SwitchPart OptNL CasePart CondSuffix ${
4534                 $0 = $<CS;
4535                 $0->condpart = $<SP;
4536                 $CP->next = $0->casepart;
4537                 $0->casepart = $<CP;
4538                 var_block_close(c, CloseSequential, $0);
4539         }$
4540         | SwitchPart : IN OptNL CasePart CondSuffix OUT Newlines ${
4541                 $0 = $<CS;
4542                 $0->condpart = $<SP;
4543                 $CP->next = $0->casepart;
4544                 $0->casepart = $<CP;
4545                 var_block_close(c, CloseSequential, $0);
4546         }$
4547         | IfPart IfSuffix ${
4548                 $0 = $<IS;
4549                 $0->condpart = $IP.condpart; $IP.condpart = NULL;
4550                 $0->thenpart = $IP.thenpart; $IP.thenpart = NULL;
4551                 // This is where we close an "if" statement
4552                 var_block_close(c, CloseSequential, $0);
4553         }$
4554
4555         CondSuffix -> IfSuffix ${
4556                 $0 = $<1;
4557         }$
4558         | Newlines CasePart CondSuffix ${
4559                 $0 = $<CS;
4560                 $CP->next = $0->casepart;
4561                 $0->casepart = $<CP;
4562         }$
4563         | CasePart CondSuffix ${
4564                 $0 = $<CS;
4565                 $CP->next = $0->casepart;
4566                 $0->casepart = $<CP;
4567         }$
4568
4569         IfSuffix -> Newlines ${ $0 = new(cond_statement); }$
4570         | Newlines ElsePart ${ $0 = $<EP; }$
4571         | ElsePart ${$0 = $<EP; }$
4572
4573         ElsePart -> else OpenBlock Newlines ${
4574                 $0 = new(cond_statement);
4575                 $0->elsepart = $<OB;
4576                 var_block_close(c, CloseElse, $0->elsepart);
4577         }$
4578         | else OpenScope CondStatement ${
4579                 $0 = new(cond_statement);
4580                 $0->elsepart = $<CS;
4581                 var_block_close(c, CloseElse, $0->elsepart);
4582         }$
4583
4584         $*casepart
4585         CasePart -> case Expression OpenScope ColonBlock ${
4586                 $0 = calloc(1,sizeof(struct casepart));
4587                 $0->value = $<Ex;
4588                 $0->action = $<Bl;
4589                 var_block_close(c, CloseParallel, $0->action);
4590         }$
4591
4592         $*exec
4593         // These scopes are closed in CondStatement
4594         ForPart -> for OpenBlock ${
4595                 $0 = $<Bl;
4596         }$
4597
4598         ThenPart -> then OpenBlock ${
4599                 $0 = $<OB;
4600                 var_block_close(c, CloseSequential, $0);
4601         }$
4602
4603         $*binode
4604         // This scope is closed in CondStatement
4605         WhilePart -> while UseBlock OptNL do OpenBlock ${
4606                 $0 = new(binode);
4607                 $0->op = Loop;
4608                 $0->left = $<UB;
4609                 $0->right = $<OB;
4610                 var_block_close(c, CloseSequential, $0->right);
4611                 var_block_close(c, CloseSequential, $0);
4612         }$
4613         | while OpenScope Expression OpenScope ColonBlock ${
4614                 $0 = new(binode);
4615                 $0->op = Loop;
4616                 $0->left = $<Exp;
4617                 $0->right = $<CB;
4618                 var_block_close(c, CloseSequential, $0->right);
4619                 var_block_close(c, CloseSequential, $0);
4620         }$
4621
4622         $cond_statement
4623         IfPart -> if UseBlock OptNL then OpenBlock ${
4624                 $0.condpart = $<UB;
4625                 $0.thenpart = $<OB;
4626                 var_block_close(c, CloseParallel, $0.thenpart);
4627         }$
4628         | if OpenScope Expression OpenScope ColonBlock ${
4629                 $0.condpart = $<Ex;
4630                 $0.thenpart = $<CB;
4631                 var_block_close(c, CloseParallel, $0.thenpart);
4632         }$
4633         | if OpenScope Expression OpenScope OptNL then Block ${
4634                 $0.condpart = $<Ex;
4635                 $0.thenpart = $<Bl;
4636                 var_block_close(c, CloseParallel, $0.thenpart);
4637         }$
4638
4639         $*exec
4640         // This scope is closed in CondStatement
4641         SwitchPart -> switch OpenScope Expression ${
4642                 $0 = $<Ex;
4643         }$
4644         | switch UseBlock ${
4645                 $0 = $<Bl;
4646         }$
4647
4648 ###### print binode cases
4649         case Loop:
4650                 if (b->left && b->left->type == Xbinode &&
4651                     cast(binode, b->left)->op == Block) {
4652                         if (bracket)
4653                                 do_indent(indent, "while {\n");
4654                         else
4655                                 do_indent(indent, "while\n");
4656                         print_exec(b->left, indent+1, bracket);
4657                         if (bracket)
4658                                 do_indent(indent, "} do {\n");
4659                         else
4660                                 do_indent(indent, "do\n");
4661                         print_exec(b->right, indent+1, bracket);
4662                         if (bracket)
4663                                 do_indent(indent, "}\n");
4664                 } else {
4665                         do_indent(indent, "while ");
4666                         print_exec(b->left, 0, bracket);
4667                         if (bracket)
4668                                 printf(" {\n");
4669                         else
4670                                 printf(":\n");
4671                         print_exec(b->right, indent+1, bracket);
4672                         if (bracket)
4673                                 do_indent(indent, "}\n");
4674                 }
4675                 break;
4676
4677 ###### print exec cases
4678
4679         case Xcond_statement:
4680         {
4681                 struct cond_statement *cs = cast(cond_statement, e);
4682                 struct casepart *cp;
4683                 if (cs->forpart) {
4684                         do_indent(indent, "for");
4685                         if (bracket) printf(" {\n"); else printf("\n");
4686                         print_exec(cs->forpart, indent+1, bracket);
4687                         if (cs->thenpart) {
4688                                 if (bracket)
4689                                         do_indent(indent, "} then {\n");
4690                                 else
4691                                         do_indent(indent, "then\n");
4692                                 print_exec(cs->thenpart, indent+1, bracket);
4693                         }
4694                         if (bracket) do_indent(indent, "}\n");
4695                 }
4696                 if (cs->looppart) {
4697                         print_exec(cs->looppart, indent, bracket);
4698                 } else {
4699                         // a condition
4700                         if (cs->casepart)
4701                                 do_indent(indent, "switch");
4702                         else
4703                                 do_indent(indent, "if");
4704                         if (cs->condpart && cs->condpart->type == Xbinode &&
4705                             cast(binode, cs->condpart)->op == Block) {
4706                                 if (bracket)
4707                                         printf(" {\n");
4708                                 else
4709                                         printf("\n");
4710                                 print_exec(cs->condpart, indent+1, bracket);
4711                                 if (bracket)
4712                                         do_indent(indent, "}\n");
4713                                 if (cs->thenpart) {
4714                                         do_indent(indent, "then\n");
4715                                         print_exec(cs->thenpart, indent+1, bracket);
4716                                 }
4717                         } else {
4718                                 printf(" ");
4719                                 print_exec(cs->condpart, 0, bracket);
4720                                 if (cs->thenpart) {
4721                                         if (bracket)
4722                                                 printf(" {\n");
4723                                         else
4724                                                 printf(":\n");
4725                                         print_exec(cs->thenpart, indent+1, bracket);
4726                                         if (bracket)
4727                                                 do_indent(indent, "}\n");
4728                                 } else
4729                                         printf("\n");
4730                         }
4731                 }
4732                 for (cp = cs->casepart; cp; cp = cp->next) {
4733                         do_indent(indent, "case ");
4734                         print_exec(cp->value, -1, 0);
4735                         if (bracket)
4736                                 printf(" {\n");
4737                         else
4738                                 printf(":\n");
4739                         print_exec(cp->action, indent+1, bracket);
4740                         if (bracket)
4741                                 do_indent(indent, "}\n");
4742                 }
4743                 if (cs->elsepart) {
4744                         do_indent(indent, "else");
4745                         if (bracket)
4746                                 printf(" {\n");
4747                         else
4748                                 printf("\n");
4749                         print_exec(cs->elsepart, indent+1, bracket);
4750                         if (bracket)
4751                                 do_indent(indent, "}\n");
4752                 }
4753                 break;
4754         }
4755
4756 ###### propagate binode cases
4757         case Loop:
4758                 t = propagate_types(b->right, c, perr, Tnone, 0);
4759                 if (!type_compat(Tnone, t, 0))
4760                         *perr |= Efail; // UNTESTED
4761                 return propagate_types(b->left, c, perr, type, rules);
4762
4763 ###### propagate exec cases
4764         case Xcond_statement:
4765         {
4766                 // forpart and looppart->right must return Tnone
4767                 // thenpart must return Tnone if there is a loopart,
4768                 // otherwise it is like elsepart.
4769                 // condpart must:
4770                 //    be bool if there is no casepart
4771                 //    match casepart->values if there is a switchpart
4772                 //    either be bool or match casepart->value if there
4773                 //             is a whilepart
4774                 // elsepart and casepart->action must match the return type
4775                 //   expected of this statement.
4776                 struct cond_statement *cs = cast(cond_statement, prog);
4777                 struct casepart *cp;
4778
4779                 t = propagate_types(cs->forpart, c, perr, Tnone, 0);
4780                 if (!type_compat(Tnone, t, 0))
4781                         *perr |= Efail; // UNTESTED
4782
4783                 if (cs->looppart) {
4784                         t = propagate_types(cs->thenpart, c, perr, Tnone, 0);
4785                         if (!type_compat(Tnone, t, 0))
4786                                 *perr |= Efail; // UNTESTED
4787                 }
4788                 if (cs->casepart == NULL) {
4789                         propagate_types(cs->condpart, c, perr, Tbool, 0);
4790                         propagate_types(cs->looppart, c, perr, Tbool, 0);
4791                 } else {
4792                         /* Condpart must match case values, with bool permitted */
4793                         t = NULL;
4794                         for (cp = cs->casepart;
4795                              cp && !t; cp = cp->next)
4796                                 t = propagate_types(cp->value, c, perr, NULL, 0);
4797                         if (!t && cs->condpart)
4798                                 t = propagate_types(cs->condpart, c, perr, NULL, Rboolok);      // UNTESTED
4799                         if (!t && cs->looppart)
4800                                 t = propagate_types(cs->looppart, c, perr, NULL, Rboolok);      // UNTESTED
4801                         // Now we have a type (I hope) push it down
4802                         if (t) {
4803                                 for (cp = cs->casepart; cp; cp = cp->next)
4804                                         propagate_types(cp->value, c, perr, t, 0);
4805                                 propagate_types(cs->condpart, c, perr, t, Rboolok);
4806                                 propagate_types(cs->looppart, c, perr, t, Rboolok);
4807                         }
4808                 }
4809                 // (if)then, else, and case parts must return expected type.
4810                 if (!cs->looppart && !type)
4811                         type = propagate_types(cs->thenpart, c, perr, NULL, rules);
4812                 if (!type)
4813                         type = propagate_types(cs->elsepart, c, perr, NULL, rules);
4814                 for (cp = cs->casepart;
4815                      cp && !type;
4816                      cp = cp->next)     // UNTESTED
4817                         type = propagate_types(cp->action, c, perr, NULL, rules);       // UNTESTED
4818                 if (type) {
4819                         if (!cs->looppart)
4820                                 propagate_types(cs->thenpart, c, perr, type, rules);
4821                         propagate_types(cs->elsepart, c, perr, type, rules);
4822                         for (cp = cs->casepart; cp ; cp = cp->next)
4823                                 propagate_types(cp->action, c, perr, type, rules);
4824                         return type;
4825                 } else
4826                         return NULL;
4827         }
4828
4829 ###### interp binode cases
4830         case Loop:
4831                 // This just performs one iterration of the loop
4832                 rv = interp_exec(c, b->left, &rvtype);
4833                 if (rvtype == Tnone ||
4834                     (rvtype == Tbool && rv.bool != 0))
4835                         // rvtype is Tnone or Tbool, doesn't need to be freed
4836                         interp_exec(c, b->right, NULL);
4837                 break;
4838
4839 ###### interp exec cases
4840         case Xcond_statement:
4841         {
4842                 struct value v, cnd;
4843                 struct type *vtype, *cndtype;
4844                 struct casepart *cp;
4845                 struct cond_statement *cs = cast(cond_statement, e);
4846
4847                 if (cs->forpart)
4848                         interp_exec(c, cs->forpart, NULL);
4849                 if (cs->looppart) {
4850                         while ((cnd = interp_exec(c, cs->looppart, &cndtype)),
4851                                cndtype == Tnone || (cndtype == Tbool && cnd.bool != 0))
4852                                 interp_exec(c, cs->thenpart, NULL);
4853                 } else {
4854                         cnd = interp_exec(c, cs->condpart, &cndtype);
4855                         if ((cndtype == Tnone ||
4856                             (cndtype == Tbool && cnd.bool != 0))) {
4857                                 // cnd is Tnone or Tbool, doesn't need to be freed
4858                                 rv = interp_exec(c, cs->thenpart, &rvtype);
4859                                 // skip else (and cases)
4860                                 goto Xcond_done;
4861                         }
4862                 }
4863                 for (cp = cs->casepart; cp; cp = cp->next) {
4864                         v = interp_exec(c, cp->value, &vtype);
4865                         if (value_cmp(cndtype, vtype, &v, &cnd) == 0) {
4866                                 free_value(vtype, &v);
4867                                 free_value(cndtype, &cnd);
4868                                 rv = interp_exec(c, cp->action, &rvtype);
4869                                 goto Xcond_done;
4870                         }
4871                         free_value(vtype, &v);
4872                 }
4873                 free_value(cndtype, &cnd);
4874                 if (cs->elsepart)
4875                         rv = interp_exec(c, cs->elsepart, &rvtype);
4876                 else
4877                         rvtype = Tnone;
4878         Xcond_done:
4879                 break;
4880         }
4881
4882 ### Top level structure
4883
4884 All the language elements so far can be used in various places.  Now
4885 it is time to clarify what those places are.
4886
4887 At the top level of a file there will be a number of declarations.
4888 Many of the things that can be declared haven't been described yet,
4889 such as functions, procedures, imports, and probably more.
4890 For now there are two sorts of things that can appear at the top
4891 level.  They are predefined constants, `struct` types, and the `main`
4892 function.  While the syntax will allow the `main` function to appear
4893 multiple times, that will trigger an error if it is actually attempted.
4894
4895 The various declarations do not return anything.  They store the
4896 various declarations in the parse context.
4897
4898 ###### Parser: grammar
4899
4900         $void
4901         Ocean -> OptNL DeclarationList
4902
4903         ## declare terminals
4904
4905         OptNL ->
4906         | OptNL NEWLINE
4907
4908         Newlines -> NEWLINE
4909         | Newlines NEWLINE
4910
4911         DeclarationList -> Declaration
4912         | DeclarationList Declaration
4913
4914         Declaration -> ERROR Newlines ${
4915                 tok_err(c,      // UNTESTED
4916                         "error: unhandled parse error", &$1);
4917         }$
4918         | DeclareConstant
4919         | DeclareFunction
4920         | DeclareStruct
4921
4922         ## top level grammar
4923
4924         ## Grammar
4925
4926 ### The `const` section
4927
4928 As well as being defined in with the code that uses them, constants can
4929 be declared at the top level.  These have full-file scope, so they are
4930 always `InScope`, even before(!) they have been declared.  The value of
4931 a top level constant can be given as an expression, and this is
4932 evaluated after parsing and before execution.
4933
4934 A function call can be used to evaluate a constant, but it will not have
4935 access to any program state, once such statement becomes meaningful.
4936 e.g.  arguments and filesystem will not be visible.
4937
4938 Constants are defined in a section that starts with the reserved word
4939 `const` and then has a block with a list of assignment statements.
4940 For syntactic consistency, these must use the double-colon syntax to
4941 make it clear that they are constants.  Type can also be given: if
4942 not, the type will be determined during analysis, as with other
4943 constants.
4944
4945 ###### parse context
4946         struct binode *constlist;
4947
4948 ###### top level grammar
4949
4950         $TERM const
4951
4952         DeclareConstant -> const { IN OptNL ConstList OUT OptNL } Newlines
4953         | const { SimpleConstList } Newlines
4954         | const IN OptNL ConstList OUT Newlines
4955         | const SimpleConstList Newlines
4956
4957         ConstList -> ConstList SimpleConstLine
4958         | SimpleConstLine
4959
4960         SimpleConstList -> SimpleConstList ; Const
4961         | Const
4962         | SimpleConstList ;
4963
4964         SimpleConstLine -> SimpleConstList Newlines
4965         | ERROR Newlines ${ tok_err(c, "Syntax error in constant", &$1); }$
4966
4967         $*type
4968         CType -> Type   ${ $0 = $<1; }$
4969         |               ${ $0 = NULL; }$
4970
4971         $void
4972         Const -> IDENTIFIER :: CType = Expression ${ {
4973                 struct variable *v;
4974                 struct binode *bl, *bv;
4975                 struct var *var = new_pos(var, $ID);
4976
4977                 v = var_decl(c, $ID.txt);
4978                 if (v) {
4979                         v->where_decl = var;
4980                         v->where_set = var;
4981                         v->type = $<CT;
4982                         v->constant = 1;
4983                         v->global = 1;
4984                 } else {
4985                         v = var_ref(c, $1.txt);
4986                         if (v->type == Tnone) {
4987                                 v->where_decl = var;
4988                                 v->where_set = var;
4989                                 v->type = $<CT;
4990                                 v->constant = 1;
4991                                 v->global = 1;
4992                         } else {
4993                                 tok_err(c, "error: name already declared", &$1);
4994                                 type_err(c, "info: this is where '%v' was first declared",
4995                                          v->where_decl, NULL, 0, NULL);
4996                         }
4997                 }
4998                 var->var = v;
4999
5000                 bv = new(binode);
5001                 bv->op = Declare;
5002                 bv->left = var;
5003                 bv->right= $<Exp;
5004
5005                 bl = new(binode);
5006                 bl->op = List;
5007                 bl->left = c->constlist;
5008                 bl->right = bv;
5009                 c->constlist = bl;
5010         } }$
5011
5012 ###### core functions
5013         static void resolve_consts(struct parse_context *c)
5014         {
5015                 struct binode *b;
5016                 int retry = 1;
5017                 enum { none, some, cannot } progress = none;
5018
5019                 c->constlist = reorder_bilist(c->constlist);
5020                 while (retry) {
5021                         retry = 0;
5022                         for (b = cast(binode, c->constlist); b;
5023                              b = cast(binode, b->right)) {
5024                                 enum prop_err perr;
5025                                 struct binode *vb = cast(binode, b->left);
5026                                 struct var *v = cast(var, vb->left);
5027                                 if (v->var->frame_pos >= 0)
5028                                         continue;
5029                                 do {
5030                                         perr = 0;
5031                                         propagate_types(vb->right, c, &perr,
5032                                                         v->var->type, 0);
5033                                 } while (perr & Eretry);
5034                                 if (perr & Efail)
5035                                         c->parse_error += 1;
5036                                 else if (!(perr & Enoconst)) {
5037                                         progress = some;
5038                                         struct value res = interp_exec(
5039                                                 c, vb->right, &v->var->type);
5040                                         global_alloc(c, v->var->type, v->var, &res);
5041                                 } else {
5042                                         if (progress == cannot)
5043                                                 type_err(c, "error: const %v cannot be resolved.",
5044                                                          v, NULL, 0, NULL);
5045                                         else
5046                                                 retry = 1;
5047                                 }
5048                         }
5049                         switch (progress) {
5050                         case cannot:
5051                                 retry = 0; break;
5052                         case none:
5053                                 progress = cannot; break;
5054                         case some:
5055                                 progress = none; break;
5056                         }
5057                 }
5058         }
5059
5060 ###### print const decls
5061         {
5062                 struct binode *b;
5063                 int first = 1;
5064
5065                 for (b = cast(binode, context.constlist); b;
5066                      b = cast(binode, b->right)) {
5067                         struct binode *vb = cast(binode, b->left);
5068                         struct var *vr = cast(var, vb->left);
5069                         struct variable *v = vr->var;
5070
5071                         if (first)
5072                                 printf("const\n");
5073                         first = 0;
5074
5075                         printf("    %.*s :: ", v->name->name.len, v->name->name.txt);
5076                         type_print(v->type, stdout);
5077                         printf(" = ");
5078                         print_exec(vb->right, -1, 0);
5079                         printf("\n");
5080                 }
5081         }
5082
5083 ###### free const decls
5084         free_binode(context.constlist);
5085
5086 ### Function declarations
5087
5088 The code in an Ocean program is all stored in function declarations.
5089 One of the functions must be named `main` and it must accept an array of
5090 strings as a parameter - the command line arguments.
5091
5092 As this is the top level, several things are handled a bit differently.
5093 The function is not interpreted by `interp_exec` as that isn't passed
5094 the argument list which the program requires.  Similarly type analysis
5095 is a bit more interesting at this level.
5096
5097 ###### ast functions
5098
5099         static struct type *handle_results(struct parse_context *c,
5100                                            struct binode *results)
5101         {
5102                 /* Create a 'struct' type from the results list, which
5103                  * is a list for 'struct var'
5104                  */
5105                 struct type *t = add_anon_type(c, &structure_prototype,
5106                                                "function result");
5107                 int cnt = 0;
5108                 struct binode *b;
5109
5110                 for (b = results; b; b = cast(binode, b->right))
5111                         cnt += 1;
5112                 t->structure.nfields = cnt;
5113                 t->structure.fields = calloc(cnt, sizeof(struct field));
5114                 cnt = 0;
5115                 for (b = results; b; b = cast(binode, b->right)) {
5116                         struct var *v = cast(var, b->left);
5117                         struct field *f = &t->structure.fields[cnt++];
5118                         int a = v->var->type->align;
5119                         f->name = v->var->name->name;
5120                         f->type = v->var->type;
5121                         f->init = NULL;
5122                         f->offset = t->size;
5123                         v->var->frame_pos = f->offset;
5124                         t->size += ((f->type->size - 1) | (a-1)) + 1;
5125                         if (a > t->align)
5126                                 t->align = a;
5127                         variable_unlink_exec(v->var);
5128                 }
5129                 free_binode(results);
5130                 return t;
5131         }
5132
5133         static struct variable *declare_function(struct parse_context *c,
5134                                                 struct variable *name,
5135                                                 struct binode *args,
5136                                                 struct type *ret,
5137                                                 struct binode *results,
5138                                                 struct exec *code)
5139         {
5140                 if (name) {
5141                         struct value fn = {.function = code};
5142                         struct type *t;
5143                         var_block_close(c, CloseFunction, code);
5144                         t = add_anon_type(c, &function_prototype,
5145                                           "func %.*s", name->name->name.len,
5146                                           name->name->name.txt);
5147                         name->type = t;
5148                         t->function.params = reorder_bilist(args);
5149                         if (!ret) {
5150                                 ret = handle_results(c, reorder_bilist(results));
5151                                 t->function.inline_result = 1;
5152                                 t->function.local_size = ret->size;
5153                         }
5154                         t->function.return_type = ret;
5155                         global_alloc(c, t, name, &fn);
5156                         name->type->function.scope = c->out_scope;
5157                 } else {
5158                         free_binode(args);
5159                         free_type(ret);
5160                         free_exec(code);
5161                         var_block_close(c, CloseFunction, NULL);
5162                 }
5163                 c->out_scope = NULL;
5164                 return name;
5165         }
5166
5167 ###### declare terminals
5168         $TERM return
5169
5170 ###### top level grammar
5171
5172         $*variable
5173         DeclareFunction -> func FuncName ( OpenScope ArgsLine ) Block Newlines ${
5174                 $0 = declare_function(c, $<FN, $<Ar, Tnone, NULL, $<Bl);
5175         }$
5176         | func FuncName IN OpenScope Args OUT OptNL do Block Newlines ${
5177                 $0 = declare_function(c, $<FN, $<Ar, Tnone, NULL, $<Bl);
5178         }$
5179         | func FuncName NEWLINE OpenScope OptNL do Block Newlines ${
5180                 $0 = declare_function(c, $<FN, NULL, Tnone, NULL, $<Bl);
5181         }$
5182         | func FuncName ( OpenScope ArgsLine ) : Type Block Newlines ${
5183                 $0 = declare_function(c, $<FN, $<Ar, $<Ty, NULL, $<Bl);
5184         }$
5185         | func FuncName ( OpenScope ArgsLine ) : ( ArgsLine ) Block Newlines ${
5186                 $0 = declare_function(c, $<FN, $<AL, NULL, $<AL2, $<Bl);
5187         }$
5188         | func FuncName IN OpenScope Args OUT OptNL return Type Newlines do Block Newlines ${
5189                 $0 = declare_function(c, $<FN, $<Ar, $<Ty, NULL, $<Bl);
5190         }$
5191         | func FuncName NEWLINE OpenScope return Type Newlines do Block Newlines ${
5192                 $0 = declare_function(c, $<FN, NULL, $<Ty, NULL, $<Bl);
5193         }$
5194         | func FuncName IN OpenScope Args OUT OptNL return IN Args OUT OptNL do Block Newlines ${
5195                 $0 = declare_function(c, $<FN, $<Ar, NULL, $<Ar2, $<Bl);
5196         }$
5197         | func FuncName NEWLINE OpenScope return IN Args OUT OptNL do Block Newlines ${
5198                 $0 = declare_function(c, $<FN, NULL, NULL, $<Ar, $<Bl);
5199         }$
5200
5201 ###### print func decls
5202         {
5203                 struct variable *v;
5204                 int target = -1;
5205
5206                 while (target != 0) {
5207                         int i = 0;
5208                         for (v = context.in_scope; v; v=v->in_scope)
5209                                 if (v->depth == 0 && v->type && v->type->check_args) {
5210                                         i += 1;
5211                                         if (i == target)
5212                                                 break;
5213                                 }
5214
5215                         if (target == -1) {
5216                                 target = i;
5217                         } else {
5218                                 struct value *val = var_value(&context, v);
5219                                 printf("func %.*s", v->name->name.len, v->name->name.txt);
5220                                 v->type->print_type_decl(v->type, stdout);
5221                                 if (brackets)
5222                                         print_exec(val->function, 0, brackets);
5223                                 else
5224                                         print_value(v->type, val, stdout);
5225                                 printf("/* frame size %d */\n", v->type->function.local_size);
5226                                 target -= 1;
5227                         }
5228                 }
5229         }
5230
5231 ###### core functions
5232
5233         static int analyse_funcs(struct parse_context *c)
5234         {
5235                 struct variable *v;
5236                 int all_ok = 1;
5237                 for (v = c->in_scope; v; v = v->in_scope) {
5238                         struct value *val;
5239                         struct type *ret;
5240                         enum prop_err perr;
5241                         if (v->depth != 0 || !v->type || !v->type->check_args)
5242                                 continue;
5243                         ret = v->type->function.inline_result ?
5244                                 Tnone : v->type->function.return_type;
5245                         val = var_value(c, v);
5246                         do {
5247                                 perr = 0;
5248                                 propagate_types(val->function, c, &perr, ret, 0);
5249                         } while (!(perr & Efail) && (perr & Eretry));
5250                         if (!(perr & Efail))
5251                                 /* Make sure everything is still consistent */
5252                                 propagate_types(val->function, c, &perr, ret, 0);
5253                         if (perr & Efail)
5254                                 all_ok = 0;
5255                         if (!v->type->function.inline_result &&
5256                             !v->type->function.return_type->dup) {
5257                                 type_err(c, "error: function cannot return value of type %1",
5258                                          v->where_decl, v->type->function.return_type, 0, NULL);
5259                         }
5260
5261                         scope_finalize(c, v->type);
5262                 }
5263                 return all_ok;
5264         }
5265
5266         static int analyse_main(struct type *type, struct parse_context *c)
5267         {
5268                 struct binode *bp = type->function.params;
5269                 struct binode *b;
5270                 enum prop_err perr;
5271                 int arg = 0;
5272                 struct type *argv_type;
5273
5274                 argv_type = add_anon_type(c, &array_prototype, "argv");
5275                 argv_type->array.member = Tstr;
5276                 argv_type->array.unspec = 1;
5277
5278                 for (b = bp; b; b = cast(binode, b->right)) {
5279                         perr = 0;
5280                         switch (arg++) {
5281                         case 0: /* argv */
5282                                 propagate_types(b->left, c, &perr, argv_type, 0);
5283                                 break;
5284                         default: /* invalid */  // NOTEST
5285                                 propagate_types(b->left, c, &perr, Tnone, 0);   // NOTEST
5286                         }
5287                         if (perr & Efail)
5288                                 c->parse_error += 1;
5289                 }
5290
5291                 return !c->parse_error;
5292         }
5293
5294         static void interp_main(struct parse_context *c, int argc, char **argv)
5295         {
5296                 struct value *progp = NULL;
5297                 struct text main_name = { "main", 4 };
5298                 struct variable *mainv;
5299                 struct binode *al;
5300                 int anum = 0;
5301                 struct value v;
5302                 struct type *vtype;
5303
5304                 mainv = var_ref(c, main_name);
5305                 if (mainv)
5306                         progp = var_value(c, mainv);
5307                 if (!progp || !progp->function) {
5308                         fprintf(stderr, "oceani: no main function found.\n");
5309                         c->parse_error += 1;
5310                         return;
5311                 }
5312                 if (!analyse_main(mainv->type, c)) {
5313                         fprintf(stderr, "oceani: main has wrong type.\n");
5314                         c->parse_error += 1;
5315                         return;
5316                 }
5317                 al = mainv->type->function.params;
5318
5319                 c->local_size = mainv->type->function.local_size;
5320                 c->local = calloc(1, c->local_size);
5321                 while (al) {
5322                         struct var *v = cast(var, al->left);
5323                         struct value *vl = var_value(c, v->var);
5324                         struct value arg;
5325                         struct type *t;
5326                         mpq_t argcq;
5327                         int i;
5328
5329                         switch (anum++) {
5330                         case 0: /* argv */
5331                                 t = v->var->type;
5332                                 mpq_init(argcq);
5333                                 mpq_set_ui(argcq, argc, 1);
5334                                 memcpy(var_value(c, t->array.vsize), &argcq, sizeof(argcq));
5335                                 t->prepare_type(c, t, 0);
5336                                 array_init(v->var->type, vl);
5337                                 for (i = 0; i < argc; i++) {
5338                                         struct value *vl2 = vl->array + i * v->var->type->array.member->size;
5339
5340                                         arg.str.txt = argv[i];
5341                                         arg.str.len = strlen(argv[i]);
5342                                         free_value(Tstr, vl2);
5343                                         dup_value(Tstr, &arg, vl2);
5344                                 }
5345                                 break;
5346                         }
5347                         al = cast(binode, al->right);
5348                 }
5349                 v = interp_exec(c, progp->function, &vtype);
5350                 free_value(vtype, &v);
5351                 free(c->local);
5352                 c->local = NULL;
5353         }
5354
5355 ###### ast functions
5356         void free_variable(struct variable *v)
5357         {
5358         }
5359
5360 ## And now to test it out.
5361
5362 Having a language requires having a "hello world" program.  I'll
5363 provide a little more than that: a program that prints "Hello world"
5364 finds the GCD of two numbers, prints the first few elements of
5365 Fibonacci, performs a binary search for a number, and a few other
5366 things which will likely grow as the languages grows.
5367
5368 ###### File: oceani.mk
5369         demos :: sayhello
5370         sayhello : oceani
5371                 @echo "===== DEMO ====="
5372                 ./oceani --section "demo: hello" oceani.mdc 55 33
5373
5374 ###### demo: hello
5375
5376         const
5377                 pi ::= 3.141_592_6
5378                 four ::= 2 + 2 ; five ::= 10/2
5379         const pie ::= "I like Pie";
5380                 cake ::= "The cake is"
5381                   ++ " a lie"
5382
5383         struct fred
5384                 size:[four]number
5385                 name:string
5386                 alive:Boolean
5387
5388         func main(argv:[argc::]string)
5389                 print "Hello World, what lovely oceans you have!"
5390                 print "Are there", five, "?"
5391                 print pi, pie, "but", cake
5392
5393                 A := $argv[1]; B := $argv[2]
5394
5395                 /* When a variable is defined in both branches of an 'if',
5396                  * and used afterwards, the variables are merged.
5397                  */
5398                 if A > B:
5399                         bigger := "yes"
5400                 else
5401                         bigger := "no"
5402                 print "Is", A, "bigger than", B,"? ", bigger
5403                 /* If a variable is not used after the 'if', no
5404                  * merge happens, so types can be different
5405                  */
5406                 if A > B * 2:
5407                         double:string = "yes"
5408                         print A, "is more than twice", B, "?", double
5409                 else
5410                         double := B*2
5411                         print "double", B, "is", double
5412
5413                 a : number
5414                 a = A;
5415                 b:number = B
5416                 if a > 0 and then b > 0:
5417                         while a != b:
5418                                 if a < b:
5419                                         b = b - a
5420                                 else
5421                                         a = a - b
5422                         print "GCD of", A, "and", B,"is", a
5423                 else if a <= 0:
5424                         print a, "is not positive, cannot calculate GCD"
5425                 else
5426                         print b, "is not positive, cannot calculate GCD"
5427
5428                 for
5429                         togo := 10
5430                         f1 := 1; f2 := 1
5431                         print "Fibonacci:", f1,f2,
5432                 then togo = togo - 1
5433                 while togo > 0:
5434                         f3 := f1 + f2
5435                         print "", f3,
5436                         f1 = f2
5437                         f2 = f3
5438                 print ""
5439
5440                 /* Binary search... */
5441                 for
5442                         lo:= 0; hi := 100
5443                         target := 77
5444                 while
5445                         mid := (lo + hi) / 2
5446                         if mid == target:
5447                                 use Found
5448                         if mid < target:
5449                                 lo = mid
5450                         else
5451                                 hi = mid
5452                         if hi - lo < 1:
5453                                 lo = mid
5454                                 use GiveUp
5455                         use True
5456                 do pass
5457                 case Found:
5458                         print "Yay, I found", target
5459                 case GiveUp:
5460                         print "Closest I found was", lo
5461
5462                 size::= 10
5463                 list:[size]number
5464                 list[0] = 1234
5465                 // "middle square" PRNG.  Not particularly good, but one my
5466                 // Dad taught me - the first one I ever heard of.
5467                 for i:=1; then i = i + 1; while i < size:
5468                         n := list[i-1] * list[i-1]
5469                         list[i] = (n / 100) % 10 000
5470
5471                 print "Before sort:",
5472                 for i:=0; then i = i + 1; while i < size:
5473                         print "", list[i],
5474                 print
5475
5476                 for i := 1; then i=i+1; while i < size:
5477                         for j:=i-1; then j=j-1; while j >= 0:
5478                                 if list[j] > list[j+1]:
5479                                         t:= list[j]
5480                                         list[j] = list[j+1]
5481                                         list[j+1] = t
5482                 print " After sort:",
5483                 for i:=0; then i = i + 1; while i < size:
5484                         print "", list[i],
5485                 print
5486
5487                 if 1 == 2 then print "yes"; else print "no"
5488
5489                 bob:fred
5490                 bob.name = "Hello"
5491                 bob.alive = (bob.name == "Hello")
5492                 print "bob", "is" if  bob.alive else "isn't", "alive"