ocean-lang.org Git - ocean/blob - csrc/oceani.mdc

   1 # Ocean Interpreter - Jamison Creek version
   2
   3 Ocean is intended to be a compiled language, so this interpreter is
   4 not targeted at being the final product.  It is, rather, an intermediate
   5 stage and fills that role in two distinct ways.
   6
   7 Firstly, it exists as a platform to experiment with the early language
   8 design.  An interpreter is easy to write and easy to get working, so
   9 the barrier for entry is lower if I aim to start with an interpreter.
  10
  11 Secondly, the plan for the Ocean compiler is to write it in the
  12 [Ocean language](http://ocean-lang.org).  To achieve this we naturally
  13 need some sort of boot-strap process and this interpreter - written in
  14 portable C - will fill that role.  It will be used to bootstrap the
  15 Ocean compiler.
  16
  17 Two features that are not needed to fill either of these roles are
  18 performance and completeness.  The interpreter only needs to be fast
  19 enough to run small test programs and occasionally to run the compiler
  20 on itself.  It only needs to be complete enough to test aspects of the
  21 design which are developed before the compiler is working, and to run
  22 the compiler on itself.  Any features not used by the compiler when
  23 compiling itself are superfluous.  They may be included anyway, but
  24 they may not.
  25
  26 Nonetheless, the interpreter should end up being reasonably complete,
  27 and any performance bottlenecks which appear and are easily fixed, will
  28 be.
  29
  30 ## Current version
  31
  32 This third version of the interpreter exists to test out some initial
  33 ideas relating to types.  Particularly it adds arrays (indexed from
  34 zero) and simple structures.  Basic control flow and variable scoping
  35 are already fairly well established, as are basic numerical and
  36 boolean operators.
  37
  38 Some operators that have only recently been added, and so have not
  39 generated all that much experience yet are "and then" and "or else" as
  40 short-circuit Boolean operators, and the "if ... else" trinary
  41 operator which can select between two expressions based on a third
  42 (which appears syntactically in the middle).
  43
  44 The "func" clause currently only allows a "main" function to be
  45 declared.  That will be extended when proper function support is added.
  46
  47 An element that is present purely to make a usable language, and
  48 without any expectation that they will remain, is the "print" statement
  49 which performs simple output.
  50
  51 The current scalar types are "number", "Boolean", and "string".
  52 Boolean will likely stay in its current form, the other two might, but
  53 could just as easily be changed.
  54
  55 ## Naming
  56
  57 Versions of the interpreter which obviously do not support a complete
  58 language will be named after creeks and streams.  This one is Jamison
  59 Creek.
  60
  61 Once we have something reasonably resembling a complete language, the
  62 names of rivers will be used.
  63 Early versions of the compiler will be named after seas.  Major
  64 releases of the compiler will be named after oceans.  Hopefully I will
  65 be finished once I get to the Pacific Ocean release.
  66
  67 ## Outline
  68
  69 As well as parsing and executing a program, the interpreter can print
  70 out the program from the parsed internal structure.  This is useful
  71 for validating the parsing.
  72 So the main requirements of the interpreter are:
  73
  74 - Parse the program, possibly with tracing,
  75 - Analyse the parsed program to ensure consistency,
  76 - Print the program,
  77 - Execute the "main" function in the program, if no parsing or
  78   consistency errors were found.
  79
  80 This is all performed by a single C program extracted with
  81 `parsergen`.
  82
  83 There will be two formats for printing the program: a default and one
  84 that uses bracketing.  So a `--bracket` command line option is needed
  85 for that.  Normally the first code section found is used, however an
  86 alternate section can be requested so that a file (such as this one)
  87 can contain multiple programs.  This is effected with the `--section`
  88 option.
  89
  90 This code must be compiled with `-fplan9-extensions` so that anonymous
  91 structures can be used.
  92
  93 ###### File: oceani.mk
  94
  95         myCFLAGS := -Wall -g -fplan9-extensions
  96         CFLAGS := $(filter-out $(myCFLAGS),$(CFLAGS)) $(myCFLAGS)
  97         myLDLIBS:= libparser.o libscanner.o libmdcode.o -licuuc
  98         LDLIBS := $(filter-out $(myLDLIBS),$(LDLIBS)) $(myLDLIBS)
  99         ## libs
 100         all :: $(LDLIBS) oceani
 101         oceani.c oceani.h : oceani.mdc parsergen
 102                 ./parsergen -o oceani --LALR --tag Parser oceani.mdc
 103         oceani.mk: oceani.mdc md2c
 104                 ./md2c oceani.mdc
 105
 106         oceani: oceani.o $(LDLIBS)
 107                 $(CC) $(CFLAGS) -o oceani oceani.o $(LDLIBS)
 108
 109 ###### Parser: header
 110         ## macros
 111         struct parse_context;
 112         ## ast
 113         struct parse_context {
 114                 struct token_config config;
 115                 char *file_name;
 116                 int parse_error;
 117                 struct exec *prog;
 118                 ## parse context
 119         };
 120
 121 ###### macros
 122
 123         #define container_of(ptr, type, member) ({                      \
 124                 const typeof( ((type *)0)->member ) *__mptr = (ptr);    \
 125                 (type *)( (char *)__mptr - offsetof(type,member) );})
 126
 127         #define config2context(_conf) container_of(_conf, struct parse_context, \
 128                 config)
 129
 130 ###### Parser: reduce
 131         struct parse_context *c = config2context(config);
 132
 133 ###### Parser: code
 134
 135         #include <unistd.h>
 136         #include <stdlib.h>
 137         #include <fcntl.h>
 138         #include <errno.h>
 139         #include <sys/mman.h>
 140         #include <string.h>
 141         #include <stdio.h>
 142         #include <locale.h>
 143         #include <malloc.h>
 144         #include "mdcode.h"
 145         #include "scanner.h"
 146         #include "parser.h"
 147
 148         ## includes
 149
 150         #include "oceani.h"
 151
 152         ## forward decls
 153         ## value functions
 154         ## ast functions
 155         ## core functions
 156
 157         #include <getopt.h>
 158         static char Usage[] =
 159                 "Usage: oceani --trace --print --noexec --brackets --section=SectionName prog.ocn\n";
 160         static const struct option long_options[] = {
 161                 {"trace",     0, NULL, 't'},
 162                 {"print",     0, NULL, 'p'},
 163                 {"noexec",    0, NULL, 'n'},
 164                 {"brackets",  0, NULL, 'b'},
 165                 {"section",   1, NULL, 's'},
 166                 {NULL,        0, NULL, 0},
 167         };
 168         const char *options = "tpnbs";
 169
 170         static void pr_err(char *msg)                   // NOTEST
 171         {
 172                 fprintf(stderr, "%s\n", msg);           // NOTEST
 173         }                                               // NOTEST
 174
 175         int main(int argc, char *argv[])
 176         {
 177                 int fd;
 178                 int len;
 179                 char *file;
 180                 struct section *s, *ss;
 181                 char *section = NULL;
 182                 struct parse_context context = {
 183                         .config = {
 184                                 .ignored = (1 << TK_mark),
 185                                 .number_chars = ".,_+- ",
 186                                 .word_start = "_",
 187                                 .word_cont = "_",
 188                         },
 189                 };
 190                 int doprint=0, dotrace=0, doexec=1, brackets=0;
 191                 int opt;
 192                 while ((opt = getopt_long(argc, argv, options, long_options, NULL))
 193                        != -1) {
 194                         switch(opt) {
 195                         case 't': dotrace=1; break;
 196                         case 'p': doprint=1; break;
 197                         case 'n': doexec=0; break;
 198                         case 'b': brackets=1; break;
 199                         case 's': section = optarg; break;
 200                         default: fprintf(stderr, Usage);
 201                                 exit(1);
 202                         }
 203                 }
 204                 if (optind >= argc) {
 205                         fprintf(stderr, "oceani: no input file given\n");
 206                         exit(1);
 207                 }
 208                 fd = open(argv[optind], O_RDONLY);
 209                 if (fd < 0) {
 210                         fprintf(stderr, "oceani: cannot open %s\n", argv[optind]);
 211                         exit(1);
 212                 }
 213                 context.file_name = argv[optind];
 214                 len = lseek(fd, 0, 2);
 215                 file = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
 216                 s = code_extract(file, file+len, pr_err);
 217                 if (!s) {
 218                         fprintf(stderr, "oceani: could not find any code in %s\n",
 219                                 argv[optind]);
 220                         exit(1);
 221                 }
 222
 223                 ## context initialization
 224
 225                 if (section) {
 226                         for (ss = s; ss; ss = ss->next) {
 227                                 struct text sec = ss->section;
 228                                 if (sec.len == strlen(section) &&
 229                                     strncmp(sec.txt, section, sec.len) == 0)
 230                                         break;
 231                         }
 232                         if (!ss) {
 233                                 fprintf(stderr, "oceani: cannot find section %s\n",
 234                                         section);
 235                                 exit(1);
 236                         }
 237                 } else
 238                         ss = s;                         // NOTEST
 239                 parse_oceani(ss->code, &context.config, dotrace ? stderr : NULL);
 240
 241                 if (!context.prog) {
 242                         fprintf(stderr, "oceani: no main function found.\n");
 243                         context.parse_error = 1;
 244                 }
 245                 if (context.prog && doprint) {
 246                         ## print const decls
 247                         ## print type decls
 248                         print_exec(context.prog, 0, brackets);
 249                 }
 250                 if (context.prog && doexec && !context.parse_error) {
 251                         if (!analyse_prog(context.prog, &context)) {
 252                                 fprintf(stderr, "oceani: type error in program - not running.\n");
 253                                 exit(1);
 254                         }
 255                         interp_prog(&context, context.prog, argc - optind, argv+optind);
 256                 }
 257                 free_exec(context.prog);
 258
 259                 while (s) {
 260                         struct section *t = s->next;
 261                         code_free(s->code);
 262                         free(s);
 263                         s = t;
 264                 }
 265                 ## free context vars
 266                 ## free context types
 267                 ## free context storage
 268                 exit(context.parse_error ? 1 : 0);
 269         }
 270
 271 ### Analysis
 272
 273 The four requirements of parse, analyse, print, interpret apply to
 274 each language element individually so that is how most of the code
 275 will be structured.
 276
 277 Three of the four are fairly self explanatory.  The one that requires
 278 a little explanation is the analysis step.
 279
 280 The current language design does not require the types of variables to
 281 be declared, but they must still have a single type.  Different
 282 operations impose different requirements on the variables, for example
 283 addition requires both arguments to be numeric, and assignment
 284 requires the variable on the left to have the same type as the
 285 expression on the right.
 286
 287 Analysis involves propagating these type requirements around and
 288 consequently setting the type of each variable.  If any requirements
 289 are violated (e.g. a string is compared with a number) or if a
 290 variable needs to have two different types, then an error is raised
 291 and the program will not run.
 292
 293 If the same variable is declared in both branchs of an 'if/else', or
 294 in all cases of a 'switch' then the multiple instances may be merged
 295 into just one variable if the variable is referenced after the
 296 conditional statement.  When this happens, the types must naturally be
 297 consistent across all the branches.  When the variable is not used
 298 outside the if, the variables in the different branches are distinct
 299 and can be of different types.
 300
 301 Undeclared names may only appear in "use" statements and "case" expressions.
 302 These names are given a type of "label" and a unique value.
 303 This allows them to fill the role of a name in an enumerated type, which
 304 is useful for testing the `switch` statement.
 305
 306 As we will see, the condition part of a `while` statement can return
 307 either a Boolean or some other type.  This requires that the expected
 308 type that gets passed around comprises a type and a flag to indicate
 309 that `Tbool` is also permitted.
 310
 311 As there are, as yet, no distinct types that are compatible, there
 312 isn't much subtlety in the analysis.  When we have distinct number
 313 types, this will become more interesting.
 314
 315 #### Error reporting
 316
 317 When analysis discovers an inconsistency it needs to report an error;
 318 just refusing to run the code ensures that the error doesn't cascade,
 319 but by itself it isn't very useful.  A clear understanding of the sort
 320 of error message that are useful will help guide the process of
 321 analysis.
 322
 323 At a simplistic level, the only sort of error that type analysis can
 324 report is that the type of some construct doesn't match a contextual
 325 requirement.  For example, in `4 + "hello"` the addition provides a
 326 contextual requirement for numbers, but `"hello"` is not a number.  In
 327 this particular example no further information is needed as the types
 328 are obvious from local information.  When a variable is involved that
 329 isn't the case.  It may be helpful to explain why the variable has a
 330 particular type, by indicating the location where the type was set,
 331 whether by declaration or usage.
 332
 333 Using a recursive-descent analysis we can easily detect a problem at
 334 multiple locations. In "`hello:= "there"; 4 + hello`" the addition
 335 will detect that one argument is not a number and the usage of `hello`
 336 will detect that a number was wanted, but not provided.  In this
 337 (early) version of the language, we will generate error reports at
 338 multiple locations, so the use of `hello` will report an error and
 339 explain were the value was set, and the addition will report an error
 340 and say why numbers are needed.  To be able to report locations for
 341 errors, each language element will need to record a file location
 342 (line and column) and each variable will need to record the language
 343 element where its type was set.  For now we will assume that each line
 344 of an error message indicates one location in the file, and up to 2
 345 types.  So we provide a `printf`-like function which takes a format, a
 346 location (a `struct exec` which has not yet been introduced), and 2
 347 types. "`%1`" reports the first type, "`%2`" reports the second.  We
 348 will need a function to print the location, once we know how that is
 349 stored. e As will be explained later, there are sometimes extra rules for
 350 type matching and they might affect error messages, we need to pass those
 351 in too.
 352
 353 As well as type errors, we sometimes need to report problems with
 354 tokens, which might be unexpected or might name a type that has not
 355 been defined.  For these we have `tok_err()` which reports an error
 356 with a given token.  Each of the error functions sets the flag in the
 357 context so indicate that parsing failed.
 358
 359 ###### forward decls
 360
 361         static void fput_loc(struct exec *loc, FILE *f);
 362
 363 ###### core functions
 364
 365         static void type_err(struct parse_context *c,
 366                              char *fmt, struct exec *loc,
 367                              struct type *t1, int rules, struct type *t2)
 368         {
 369                 fprintf(stderr, "%s:", c->file_name);
 370                 fput_loc(loc, stderr);
 371                 for (; *fmt ; fmt++) {
 372                         if (*fmt != '%') {
 373                                 fputc(*fmt, stderr);
 374                                 continue;
 375                         }
 376                         fmt++;
 377                         switch (*fmt) {
 378                         case '%': fputc(*fmt, stderr); break;   // NOTEST
 379                         default: fputc('?', stderr); break;     // NOTEST
 380                         case '1':
 381                                 type_print(t1, stderr);
 382                                 break;
 383                         case '2':
 384                                 type_print(t2, stderr);
 385                                 break;
 386                         ## format cases
 387                         }
 388                 }
 389                 fputs("\n", stderr);
 390                 c->parse_error = 1;
 391         }
 392
 393         static void tok_err(struct parse_context *c, char *fmt, struct token *t)
 394         {
 395                 fprintf(stderr, "%s:%d:%d: %s: %.*s\n", c->file_name, t->line, t->col, fmt,
 396                         t->txt.len, t->txt.txt);
 397                 c->parse_error = 1;
 398         }
 399
 400 ## Entities: declared and predeclared.
 401
 402 There are various "things" that the language and/or the interpreter
 403 needs to know about to parse and execute a program.  These include
 404 types, variables, values, and executable code.  These are all lumped
 405 together under the term "entities" (calling them "objects" would be
 406 confusing) and introduced here.  The following section will present the
 407 different specific code elements which comprise or manipulate these
 408 various entities.
 409
 410 ### Types
 411
 412 Values come in a wide range of types, with more likely to be added.
 413 Each type needs to be able to print its own values (for convenience at
 414 least) as well as to compare two values, at least for equality and
 415 possibly for order.  For now, values might need to be duplicated and
 416 freed, though eventually such manipulations will be better integrated
 417 into the language.
 418
 419 Rather than requiring every numeric type to support all numeric
 420 operations (add, multiple, etc), we allow types to be able to present
 421 as one of a few standard types: integer, float, and fraction.  The
 422 existence of these conversion functions eventually enable types to
 423 determine if they are compatible with other types, though such types
 424 have not yet been implemented.
 425
 426 Named type are stored in a simple linked list.  Objects of each type are
 427 "values" which are often passed around by value.
 428
 429 ###### ast
 430
 431         struct value {
 432                 union {
 433                         char ptr[1];
 434                         ## value union fields
 435                 };
 436         };
 437
 438         struct type {
 439                 struct text name;
 440                 struct type *next;
 441                 int size, align;
 442                 void (*init)(struct type *type, struct value *val);
 443                 void (*prepare_type)(struct parse_context *c, struct type *type, int parse_time);
 444                 void (*print)(struct type *type, struct value *val);
 445                 void (*print_type)(struct type *type, FILE *f);
 446                 int (*cmp_order)(struct type *t1, struct type *t2,
 447                                  struct value *v1, struct value *v2);
 448                 int (*cmp_eq)(struct type *t1, struct type *t2,
 449                               struct value *v1, struct value *v2);
 450                 void (*dup)(struct type *type, struct value *vold, struct value *vnew);
 451                 void (*free)(struct type *type, struct value *val);
 452                 void (*free_type)(struct type *t);
 453                 long long (*to_int)(struct value *v);
 454                 double (*to_float)(struct value *v);
 455                 int (*to_mpq)(mpq_t *q, struct value *v);
 456                 ## type functions
 457                 union {
 458                         ## type union fields
 459                 };
 460         };
 461
 462 ###### parse context
 463
 464         struct type *typelist;
 465
 466 ###### ast functions
 467
 468         static struct type *find_type(struct parse_context *c, struct text s)
 469         {
 470                 struct type *l = c->typelist;
 471
 472                 while (l &&
 473                        text_cmp(l->name, s) != 0)
 474                                 l = l->next;
 475                 return l;
 476         }
 477
 478         static struct type *add_type(struct parse_context *c, struct text s,
 479                                      struct type *proto)
 480         {
 481                 struct type *n;
 482
 483                 n = calloc(1, sizeof(*n));
 484                 *n = *proto;
 485                 n->name = s;
 486                 n->next = c->typelist;
 487                 c->typelist = n;
 488                 return n;
 489         }
 490
 491         static void free_type(struct type *t)
 492         {
 493                 /* The type is always a reference to something in the
 494                  * context, so we don't need to free anything.
 495                  */
 496         }
 497
 498         static void free_value(struct type *type, struct value *v)
 499         {
 500                 if (type && v)
 501                         type->free(type, v);
 502         }
 503
 504         static void type_print(struct type *type, FILE *f)
 505         {
 506                 if (!type)
 507                         fputs("*unknown*type*", f);     // NOTEST
 508                 else if (type->name.len)
 509                         fprintf(f, "%.*s", type->name.len, type->name.txt);
 510                 else if (type->print_type)
 511                         type->print_type(type, f);
 512                 else
 513                         fputs("*invalid*type*", f);     // NOTEST
 514         }
 515
 516         static void val_init(struct type *type, struct value *val)
 517         {
 518                 if (type && type->init)
 519                         type->init(type, val);
 520         }
 521
 522         static void dup_value(struct type *type,
 523                               struct value *vold, struct value *vnew)
 524         {
 525                 if (type && type->dup)
 526                         type->dup(type, vold, vnew);
 527         }
 528
 529         static int value_cmp(struct type *tl, struct type *tr,
 530                              struct value *left, struct value *right)
 531         {
 532                 if (tl && tl->cmp_order)
 533                         return tl->cmp_order(tl, tr, left, right);
 534                 if (tl && tl->cmp_eq)                   // NOTEST
 535                         return tl->cmp_eq(tl, tr, left, right); // NOTEST
 536                 return -1;                              // NOTEST
 537         }
 538
 539         static void print_value(struct type *type, struct value *v)
 540         {
 541                 if (type && type->print)
 542                         type->print(type, v);
 543                 else
 544                         printf("*Unknown*");            // NOTEST
 545         }
 546
 547 ###### forward decls
 548
 549         static void free_value(struct type *type, struct value *v);
 550         static int type_compat(struct type *require, struct type *have, int rules);
 551         static void type_print(struct type *type, FILE *f);
 552         static void val_init(struct type *type, struct value *v);
 553         static void dup_value(struct type *type,
 554                               struct value *vold, struct value *vnew);
 555         static int value_cmp(struct type *tl, struct type *tr,
 556                              struct value *left, struct value *right);
 557         static void print_value(struct type *type, struct value *v);
 558
 559 ###### free context types
 560
 561         while (context.typelist) {
 562                 struct type *t = context.typelist;
 563
 564                 context.typelist = t->next;
 565                 if (t->free_type)
 566                         t->free_type(t);
 567                 free(t);
 568         }
 569
 570 Type can be specified for local variables, for fields in a structure,
 571 for formal parameters to functions, and possibly elsewhere.  Different
 572 rules may apply in different contexts.  As a minimum, a named type may
 573 always be used.  Currently the type of a formal parameter can be
 574 different from types in other contexts, so we have a separate grammar
 575 symbol for those.
 576
 577 ###### Grammar
 578
 579         $*type
 580         Type -> IDENTIFIER ${
 581                 $0 = find_type(c, $1.txt);
 582                 if (!$0) {
 583                         tok_err(c,
 584                                 "error: undefined type", &$1);
 585
 586                         $0 = Tnone;
 587                 }
 588         }$
 589         ## type grammar
 590
 591         FormalType -> Type ${ $0 = $<1; }$
 592         ## formal type grammar
 593
 594 #### Base Types
 595
 596 Values of the base types can be numbers, which we represent as
 597 multi-precision fractions, strings, Booleans and labels.  When
 598 analysing the program we also need to allow for places where no value
 599 is meaningful (type `Tnone`) and where we don't know what type to
 600 expect yet (type is `NULL`).
 601
 602 Values are never shared, they are always copied when used, and freed
 603 when no longer needed.
 604
 605 When propagating type information around the program, we need to
 606 determine if two types are compatible, where type `NULL` is compatible
 607 with anything.  There are two special cases with type compatibility,
 608 both related to the Conditional Statement which will be described
 609 later.  In some cases a Boolean can be accepted as well as some other
 610 primary type, and in others any type is acceptable except a label (`Vlabel`).
 611 A separate function encoding these cases will simplify some code later.
 612
 613 ###### type functions
 614
 615         int (*compat)(struct type *this, struct type *other);
 616
 617 ###### ast functions
 618
 619         static int type_compat(struct type *require, struct type *have, int rules)
 620         {
 621                 if ((rules & Rboolok) && have == Tbool)
 622                         return 1;       // NOTEST
 623                 if ((rules & Rnolabel) && have == Tlabel)
 624                         return 0;       // NOTEST
 625                 if (!require || !have)
 626                         return 1;
 627
 628                 if (require->compat)
 629                         return require->compat(require, have);
 630
 631                 return require == have;
 632         }
 633
 634 ###### includes
 635         #include <gmp.h>
 636         #include "parse_string.h"
 637         #include "parse_number.h"
 638
 639 ###### libs
 640         myLDLIBS := libnumber.o libstring.o -lgmp
 641         LDLIBS := $(filter-out $(myLDLIBS),$(LDLIBS)) $(myLDLIBS)
 642
 643 ###### type union fields
 644         enum vtype {Vnone, Vstr, Vnum, Vbool, Vlabel} vtype;
 645
 646 ###### value union fields
 647         struct text str;
 648         mpq_t num;
 649         unsigned char bool;
 650         void *label;
 651
 652 ###### ast functions
 653         static void _free_value(struct type *type, struct value *v)
 654         {
 655                 if (!v)
 656                         return;         // NOTEST
 657                 switch (type->vtype) {
 658                 case Vnone: break;
 659                 case Vstr: free(v->str.txt); break;
 660                 case Vnum: mpq_clear(v->num); break;
 661                 case Vlabel:
 662                 case Vbool: break;
 663                 }
 664         }
 665
 666 ###### value functions
 667
 668         static void _val_init(struct type *type, struct value *val)
 669         {
 670                 switch(type->vtype) {
 671                 case Vnone:             // NOTEST
 672                         break;          // NOTEST
 673                 case Vnum:
 674                         mpq_init(val->num); break;
 675                 case Vstr:
 676                         val->str.txt = malloc(1);
 677                         val->str.len = 0;
 678                         break;
 679                 case Vbool:
 680                         val->bool = 0;
 681                         break;
 682                 case Vlabel:
 683                         val->label = NULL;
 684                         break;
 685                 }
 686         }
 687
 688         static void _dup_value(struct type *type,
 689                                struct value *vold, struct value *vnew)
 690         {
 691                 switch (type->vtype) {
 692                 case Vnone:             // NOTEST
 693                         break;          // NOTEST
 694                 case Vlabel:
 695                         vnew->label = vold->label;
 696                         break;
 697                 case Vbool:
 698                         vnew->bool = vold->bool;
 699                         break;
 700                 case Vnum:
 701                         mpq_init(vnew->num);
 702                         mpq_set(vnew->num, vold->num);
 703                         break;
 704                 case Vstr:
 705                         vnew->str.len = vold->str.len;
 706                         vnew->str.txt = malloc(vnew->str.len);
 707                         memcpy(vnew->str.txt, vold->str.txt, vnew->str.len);
 708                         break;
 709                 }
 710         }
 711
 712         static int _value_cmp(struct type *tl, struct type *tr,
 713                               struct value *left, struct value *right)
 714         {
 715                 int cmp;
 716                 if (tl != tr)
 717                         return tl - tr; // NOTEST
 718                 switch (tl->vtype) {
 719                 case Vlabel: cmp = left->label == right->label ? 0 : 1; break;
 720                 case Vnum: cmp = mpq_cmp(left->num, right->num); break;
 721                 case Vstr: cmp = text_cmp(left->str, right->str); break;
 722                 case Vbool: cmp = left->bool - right->bool; break;
 723                 case Vnone: cmp = 0;                    // NOTEST
 724                 }
 725                 return cmp;
 726         }
 727
 728         static void _print_value(struct type *type, struct value *v)
 729         {
 730                 switch (type->vtype) {
 731                 case Vnone:                             // NOTEST
 732                         printf("*no-value*"); break;    // NOTEST
 733                 case Vlabel:                            // NOTEST
 734                         printf("*label-%p*", v->label); break; // NOTEST
 735                 case Vstr:
 736                         printf("%.*s", v->str.len, v->str.txt); break;
 737                 case Vbool:
 738                         printf("%s", v->bool ? "True":"False"); break;
 739                 case Vnum:
 740                         {
 741                         mpf_t fl;
 742                         mpf_init2(fl, 20);
 743                         mpf_set_q(fl, v->num);
 744                         gmp_printf("%Fg", fl);
 745                         mpf_clear(fl);
 746                         break;
 747                         }
 748                 }
 749         }
 750
 751         static void _free_value(struct type *type, struct value *v);
 752
 753         static struct type base_prototype = {
 754                 .init = _val_init,
 755                 .print = _print_value,
 756                 .cmp_order = _value_cmp,
 757                 .cmp_eq = _value_cmp,
 758                 .dup = _dup_value,
 759                 .free = _free_value,
 760         };
 761
 762         static struct type *Tbool, *Tstr, *Tnum, *Tnone, *Tlabel;
 763
 764 ###### ast functions
 765         static struct type *add_base_type(struct parse_context *c, char *n,
 766                                           enum vtype vt, int size)
 767         {
 768                 struct text txt = { n, strlen(n) };
 769                 struct type *t;
 770
 771                 t = add_type(c, txt, &base_prototype);
 772                 t->vtype = vt;
 773                 t->size = size;
 774                 t->align = size > sizeof(void*) ? sizeof(void*) : size;
 775                 if (t->size & (t->align - 1))
 776                         t->size = (t->size | (t->align - 1)) + 1;       // NOTEST
 777                 return t;
 778         }
 779
 780 ###### context initialization
 781
 782         Tbool  = add_base_type(&context, "Boolean", Vbool, sizeof(char));
 783         Tstr   = add_base_type(&context, "string", Vstr, sizeof(struct text));
 784         Tnum   = add_base_type(&context, "number", Vnum, sizeof(mpq_t));
 785         Tnone  = add_base_type(&context, "none", Vnone, 0);
 786         Tlabel = add_base_type(&context, "label", Vlabel, sizeof(void*));
 787
 788 ### Variables
 789
 790 Variables are scoped named values.  We store the names in a linked list
 791 of "bindings" sorted in lexical order, and use sequential search and
 792 insertion sort.
 793
 794 ###### ast
 795
 796         struct binding {
 797                 struct text name;
 798                 struct binding *next;   // in lexical order
 799                 ## binding fields
 800         };
 801
 802 This linked list is stored in the parse context so that "reduce"
 803 functions can find or add variables, and so the analysis phase can
 804 ensure that every variable gets a type.
 805
 806 ###### parse context
 807
 808         struct binding *varlist;  // In lexical order
 809
 810 ###### ast functions
 811
 812         static struct binding *find_binding(struct parse_context *c, struct text s)
 813         {
 814                 struct binding **l = &c->varlist;
 815                 struct binding *n;
 816                 int cmp = 1;
 817
 818                 while (*l &&
 819                         (cmp = text_cmp((*l)->name, s)) < 0)
 820                                 l = & (*l)->next;
 821                 if (cmp == 0)
 822                         return *l;
 823                 n = calloc(1, sizeof(*n));
 824                 n->name = s;
 825                 n->next = *l;
 826                 *l = n;
 827                 return n;
 828         }
 829
 830 Each name can be linked to multiple variables defined in different
 831 scopes.  Each scope starts where the name is declared and continues
 832 until the end of the containing code block.  Scopes of a given name
 833 cannot nest, so a declaration while a name is in-scope is an error.
 834
 835 ###### binding fields
 836         struct variable *var;
 837
 838 ###### ast
 839         struct variable {
 840                 struct variable *previous;
 841                 struct type *type;
 842                 struct binding *name;
 843                 struct exec *where_decl;// where name was declared
 844                 struct exec *where_set; // where type was set
 845                 ## variable fields
 846         };
 847
 848 While the naming seems strange, we include local constants in the
 849 definition of variables.  A name declared `var := value` can
 850 subsequently be changed, but a name declared `var ::= value` cannot -
 851 it is constant
 852
 853 ###### variable fields
 854         int constant;
 855
 856 Scopes in parallel branches can be partially merged.  More
 857 specifically, if a given name is declared in both branches of an
 858 if/else then its scope is a candidate for merging.  Similarly if
 859 every branch of an exhaustive switch (e.g. has an "else" clause)
 860 declares a given name, then the scopes from the branches are
 861 candidates for merging.
 862
 863 Note that names declared inside a loop (which is only parallel to
 864 itself) are never visible after the loop.  Similarly names defined in
 865 scopes which are not parallel, such as those started by `for` and
 866 `switch`, are never visible after the scope.  Only variables defined in
 867 both `then` and `else` (including the implicit then after an `if`, and
 868 excluding `then` used with `for`) and in all `case`s and `else` of a
 869 `switch` or `while` can be visible beyond the `if`/`switch`/`while`.
 870
 871 Labels, which are a bit like variables, follow different rules.
 872 Labels are not explicitly declared, but if an undeclared name appears
 873 in a context where a label is legal, that effectively declares the
 874 name as a label.  The declaration remains in force (or in scope) at
 875 least to the end of the immediately containing block and conditionally
 876 in any larger containing block which does not declare the name in some
 877 other way.  Importantly, the conditional scope extension happens even
 878 if the label is only used in one parallel branch of a conditional --
 879 when used in one branch it is treated as having been declared in all
 880 branches.
 881
 882 Merge candidates are tentatively visible beyond the end of the
 883 branching statement which creates them.  If the name is used, the
 884 merge is affirmed and they become a single variable visible at the
 885 outer layer.  If not - if it is redeclared first - the merge lapses.
 886
 887 To track scopes we have an extra stack, implemented as a linked list,
 888 which roughly parallels the parse stack and which is used exclusively
 889 for scoping.  When a new scope is opened, a new frame is pushed and
 890 the child-count of the parent frame is incremented.  This child-count
 891 is used to distinguish between the first of a set of parallel scopes,
 892 in which declared variables must not be in scope, and subsequent
 893 branches, whether they may already be conditionally scoped.
 894
 895 To push a new frame *before* any code in the frame is parsed, we need a
 896 grammar reduction.  This is most easily achieved with a grammar
 897 element which derives the empty string, and creates the new scope when
 898 it is recognised.  This can be placed, for example, between a keyword
 899 like "if" and the code following it.
 900
 901 ###### ast
 902         struct scope {
 903                 struct scope *parent;
 904                 int child_count;
 905         };
 906
 907 ###### parse context
 908         int scope_depth;
 909         struct scope *scope_stack;
 910
 911 ###### ast functions
 912         static void scope_pop(struct parse_context *c)
 913         {
 914                 struct scope *s = c->scope_stack;
 915
 916                 c->scope_stack = s->parent;
 917                 free(s);
 918                 c->scope_depth -= 1;
 919         }
 920
 921         static void scope_push(struct parse_context *c)
 922         {
 923                 struct scope *s = calloc(1, sizeof(*s));
 924                 if (c->scope_stack)
 925                         c->scope_stack->child_count += 1;
 926                 s->parent = c->scope_stack;
 927                 c->scope_stack = s;
 928                 c->scope_depth += 1;
 929         }
 930
 931 ###### Grammar
 932
 933         $void
 934         OpenScope -> ${ scope_push(c); }$
 935         ClosePara -> ${ var_block_close(c, CloseParallel); }$
 936
 937 Each variable records a scope depth and is in one of four states:
 938
 939 - "in scope".  This is the case between the declaration of the
 940   variable and the end of the containing block, and also between
 941   the usage with affirms a merge and the end of that block.
 942
 943   The scope depth is not greater than the current parse context scope
 944   nest depth.  When the block of that depth closes, the state will
 945   change.  To achieve this, all "in scope" variables are linked
 946   together as a stack in nesting order.
 947
 948 - "pending".  The "in scope" block has closed, but other parallel
 949   scopes are still being processed.  So far, every parallel block at
 950   the same level that has closed has declared the name.
 951
 952   The scope depth is the depth of the last parallel block that
 953   enclosed the declaration, and that has closed.
 954
 955 - "conditionally in scope".  The "in scope" block and all parallel
 956   scopes have closed, and no further mention of the name has been
 957   seen.  This state includes a secondary nest depth which records the
 958   outermost scope seen since the variable became conditionally in
 959   scope.  If a use of the name is found, the variable becomes "in
 960   scope" and that secondary depth becomes the recorded scope depth.
 961   If the name is declared as a new variable, the old variable becomes
 962   "out of scope" and the recorded scope depth stays unchanged.
 963
 964 - "out of scope".  The variable is neither in scope nor conditionally
 965   in scope.  It is permanently out of scope now and can be removed from
 966   the "in scope" stack.
 967
 968 ###### variable fields
 969         int depth, min_depth;
 970         enum { OutScope, PendingScope, CondScope, InScope } scope;
 971         struct variable *in_scope;
 972
 973 ###### parse context
 974
 975         struct variable *in_scope;
 976
 977 All variables with the same name are linked together using the
 978 'previous' link.  Those variable that have been affirmatively merged all
 979 have a 'merged' pointer that points to one primary variable - the most
 980 recently declared instance.  When merging variables, we need to also
 981 adjust the 'merged' pointer on any other variables that had previously
 982 been merged with the one that will no longer be primary.
 983
 984 A variable that is no longer the most recent instance of a name may
 985 still have "pending" scope, if it might still be merged with most
 986 recent instance.  These variables don't really belong in the
 987 "in_scope" list, but are not immediately removed when a new instance
 988 is found.  Instead, they are detected and ignored when considering the
 989 list of in_scope names.
 990
 991 The storage of the value of a variable will be described later.  For now
 992 we just need to know that when a variable goes out of scope, it might
 993 need to be freed.  For this we need to be able to find it, so assume that
 994 `var_value()` will provide that.
 995
 996 ###### variable fields
 997         struct variable *merged;
 998
 999 ###### ast functions
1000
1001         static void variable_merge(struct variable *primary, struct variable *secondary)
1002         {
1003                 struct variable *v;
1004
1005                 primary = primary->merged;
1006
1007                 for (v = primary->previous; v; v=v->previous)
1008                         if (v == secondary || v == secondary->merged ||
1009                             v->merged == secondary ||
1010                             v->merged == secondary->merged) {
1011                                 v->scope = OutScope;
1012                                 v->merged = primary;
1013                         }
1014         }
1015
1016 ###### forward decls
1017         static struct value *var_value(struct parse_context *c, struct variable *v);
1018
1019 ###### free context vars
1020
1021         while (context.varlist) {
1022                 struct binding *b = context.varlist;
1023                 struct variable *v = b->var;
1024                 context.varlist = b->next;
1025                 free(b);
1026                 while (v) {
1027                         struct variable *t = v;
1028
1029                         v = t->previous;
1030                         free_value(t->type, var_value(&context, t));
1031                         if (t->depth == 0)
1032                                 // This is a global constant
1033                                 free_exec(t->where_decl);
1034                         free(t);
1035                 }
1036         }
1037
1038 #### Manipulating Bindings
1039
1040 When a name is conditionally visible, a new declaration discards the
1041 old binding - the condition lapses.  Conversely a usage of the name
1042 affirms the visibility and extends it to the end of the containing
1043 block - i.e. the block that contains both the original declaration and
1044 the latest usage.  This is determined from `min_depth`.  When a
1045 conditionally visible variable gets affirmed like this, it is also
1046 merged with other conditionally visible variables with the same name.
1047
1048 When we parse a variable declaration we either report an error if the
1049 name is currently bound, or create a new variable at the current nest
1050 depth if the name is unbound or bound to a conditionally scoped or
1051 pending-scope variable.  If the previous variable was conditionally
1052 scoped, it and its homonyms becomes out-of-scope.
1053
1054 When we parse a variable reference (including non-declarative assignment
1055 "foo = bar") we report an error if the name is not bound or is bound to
1056 a pending-scope variable; update the scope if the name is bound to a
1057 conditionally scoped variable; or just proceed normally if the named
1058 variable is in scope.
1059
1060 When we exit a scope, any variables bound at this level are either
1061 marked out of scope or pending-scoped, depending on whether the scope
1062 was sequential or parallel.  Here a "parallel" scope means the "then"
1063 or "else" part of a conditional, or any "case" or "else" branch of a
1064 switch.  Other scopes are "sequential".
1065
1066 When exiting a parallel scope we check if there are any variables that
1067 were previously pending and are still visible. If there are, then
1068 there weren't redeclared in the most recent scope, so they cannot be
1069 merged and must become out-of-scope.  If it is not the first of
1070 parallel scopes (based on `child_count`), we check that there was a
1071 previous binding that is still pending-scope.  If there isn't, the new
1072 variable must now be out-of-scope.
1073
1074 When exiting a sequential scope that immediately enclosed parallel
1075 scopes, we need to resolve any pending-scope variables.  If there was
1076 no `else` clause, and we cannot determine that the `switch` was exhaustive,
1077 we need to mark all pending-scope variable as out-of-scope.  Otherwise
1078 all pending-scope variables become conditionally scoped.
1079
1080 ###### ast
1081         enum closetype { CloseSequential, CloseParallel, CloseElse };
1082
1083 ###### ast functions
1084
1085         static struct variable *var_decl(struct parse_context *c, struct text s)
1086         {
1087                 struct binding *b = find_binding(c, s);
1088                 struct variable *v = b->var;
1089
1090                 switch (v ? v->scope : OutScope) {
1091                 case InScope:
1092                         /* Caller will report the error */
1093                         return NULL;
1094                 case CondScope:
1095                         for (;
1096                              v && v->scope == CondScope;
1097                              v = v->previous)
1098                                 v->scope = OutScope;
1099                         break;
1100                 default: break;
1101                 }
1102                 v = calloc(1, sizeof(*v));
1103                 v->previous = b->var;
1104                 b->var = v;
1105                 v->name = b;
1106                 v->merged = v;
1107                 v->min_depth = v->depth = c->scope_depth;
1108                 v->scope = InScope;
1109                 v->in_scope = c->in_scope;
1110                 c->in_scope = v;
1111                 return v;
1112         }
1113
1114         static struct variable *var_ref(struct parse_context *c, struct text s)
1115         {
1116                 struct binding *b = find_binding(c, s);
1117                 struct variable *v = b->var;
1118                 struct variable *v2;
1119
1120                 switch (v ? v->scope : OutScope) {
1121                 case OutScope:
1122                 case PendingScope:
1123                         /* Caller will report the error */
1124                         return NULL;
1125                 case CondScope:
1126                         /* All CondScope variables of this name need to be merged
1127                          * and become InScope
1128                          */
1129                         v->depth = v->min_depth;
1130                         v->scope = InScope;
1131                         for (v2 = v->previous;
1132                              v2 && v2->scope == CondScope;
1133                              v2 = v2->previous)
1134                                 variable_merge(v, v2);
1135                         break;
1136                 case InScope:
1137                         break;
1138                 }
1139                 return v;
1140         }
1141
1142         static void var_block_close(struct parse_context *c, enum closetype ct)
1143         {
1144                 /* Close off all variables that are in_scope */
1145                 struct variable *v, **vp, *v2;
1146
1147                 scope_pop(c);
1148                 for (vp = &c->in_scope;
1149                      (v = *vp) && v->min_depth > c->scope_depth;
1150                      (v->scope == OutScope || v->name->var != v)
1151                      ? (*vp =  v->in_scope, 0)
1152                      : ( vp = &v->in_scope, 0)) {
1153                         if (v->name->var != v) {
1154                                 /* This is still in scope, but we haven't just
1155                                  * closed the scope.
1156                                  */
1157                                 continue;
1158                         }
1159                         switch (ct) {
1160                         case CloseElse:
1161                         case CloseParallel: /* handle PendingScope */
1162                                 switch(v->scope) {
1163                                 case InScope:
1164                                 case CondScope:
1165                                         if (c->scope_stack->child_count == 1)
1166                                                 v->scope = PendingScope;
1167                                         else if (v->previous &&
1168                                                  v->previous->scope == PendingScope)
1169                                                 v->scope = PendingScope;
1170                                         else if (v->type == Tlabel)     // UNTESTED
1171                                                 v->scope = PendingScope;        // UNTESTED
1172                                         else if (v->name->var == v)     // UNTESTED
1173                                                 v->scope = OutScope;    // UNTESTED
1174                                         if (ct == CloseElse) {
1175                                                 /* All Pending variables with this name
1176                                                  * are now Conditional */
1177                                                 for (v2 = v;
1178                                                      v2 && v2->scope == PendingScope;
1179                                                      v2 = v2->previous)
1180                                                         v2->scope = CondScope;
1181                                         }
1182                                         break;
1183                                 case PendingScope:
1184                                         for (v2 = v;
1185                                              v2 && v2->scope == PendingScope;
1186                                              v2 = v2->previous)
1187                                                 if (v2->type != Tlabel)
1188                                                         v2->scope = OutScope;
1189                                         break;
1190                                 case OutScope: break;   // UNTESTED
1191                                 }
1192                                 break;
1193                         case CloseSequential:
1194                                 if (v->type == Tlabel)
1195                                         v->scope = PendingScope;
1196                                 switch (v->scope) {
1197                                 case InScope:
1198                                         v->scope = OutScope;
1199                                         break;
1200                                 case PendingScope:
1201                                         /* There was no 'else', so we can only become
1202                                          * conditional if we know the cases were exhaustive,
1203                                          * and that doesn't mean anything yet.
1204                                          * So only labels become conditional..
1205                                          */
1206                                         for (v2 = v;
1207                                              v2 && v2->scope == PendingScope;
1208                                              v2 = v2->previous)
1209                                                 if (v2->type == Tlabel) {
1210                                                         v2->scope = CondScope;
1211                                                         v2->min_depth = c->scope_depth;
1212                                                 } else
1213                                                         v2->scope = OutScope;
1214                                         break;
1215                                 case CondScope:
1216                                 case OutScope: break;
1217                                 }
1218                                 break;
1219                         }
1220                 }
1221         }
1222
1223 #### Storing Values
1224
1225 The value of a variable is store separately from the variable, on an
1226 analogue of a stack frame.  There are (currently) two frames that can be
1227 active.  A global frame which currently only stores constants, and a
1228 stacked frame which stores local variables.  Each variable knows if it
1229 is global or not, and what its index into the frame is.
1230
1231 Values in the global frame are known immediately they are relevant, so
1232 the frame needs to be reallocated as it grows so it can store those
1233 values.  The local frame doesn't get values until the interpreted phase
1234 is started, so there is no need to allocate until the size is known.
1235
1236 ###### variable fields
1237                 short frame_pos;
1238                 short global;
1239
1240 ###### parse context
1241
1242         short global_size, global_alloc;
1243         short local_size;
1244         void *global, *local;
1245
1246 ###### ast functions
1247
1248         static struct value *var_value(struct parse_context *c, struct variable *v)
1249         {
1250                 if (!v->global) {
1251                         if (!c->local || !v->type)
1252                                 return NULL;
1253                         if (v->frame_pos + v->type->size > c->local_size) {
1254                                 printf("INVALID frame_pos\n");  // NOTEST
1255                                 exit(2);                        // NOTEST
1256                         }
1257                         return c->local + v->frame_pos;
1258                 }
1259                 if (c->global_size > c->global_alloc) {
1260                         int old = c->global_alloc;
1261                         c->global_alloc = (c->global_size | 1023) + 1024;
1262                         c->global = realloc(c->global, c->global_alloc);
1263                         memset(c->global + old, 0, c->global_alloc - old);
1264                 }
1265                 return c->global + v->frame_pos;
1266         }
1267
1268         static struct value *global_alloc(struct parse_context *c, struct type *t,
1269                                           struct variable *v, struct value *init)
1270         {
1271                 struct value *ret;
1272                 struct variable scratch;
1273
1274                 if (t->prepare_type)
1275                         t->prepare_type(c, t, 1);       // NOTEST
1276
1277                 if (c->global_size & (t->align - 1))
1278                         c->global_size = (c->global_size + t->align) & ~(t->align-1);   // UNTESTED
1279                 if (!v) {
1280                         v = &scratch;
1281                         v->type = t;
1282                 }
1283                 v->frame_pos = c->global_size;
1284                 v->global = 1;
1285                 c->global_size += v->type->size;
1286                 ret = var_value(c, v);
1287                 if (init)
1288                         memcpy(ret, init, t->size);
1289                 else
1290                         val_init(t, ret);
1291                 return ret;
1292         }
1293
1294 As global values are found -- struct field initializers, labels etc --
1295 `global_alloc()` is called to record the value in the global frame.
1296
1297 When the program is fully parsed, we need to walk the list of variables
1298 to find any that weren't merged away and that aren't global, and to
1299 calculate the frame size and assign a frame position for each variable.
1300 For this we have `scope_finalize()`.
1301
1302 ###### ast functions
1303
1304         static void scope_finalize(struct parse_context *c)
1305         {
1306                 struct binding *b;
1307
1308                 for (b = c->varlist; b; b = b->next) {
1309                         struct variable *v;
1310                         for (v = b->var; v; v = v->previous) {
1311                                 struct type *t = v->type;
1312                                 if (v->merged != v)
1313                                         continue;
1314                                 if (v->global)
1315                                         continue;
1316                                 if (c->local_size & (t->align - 1))
1317                                         c->local_size = (c->local_size + t->align) & ~(t->align-1);
1318                                 v->frame_pos = c->local_size;
1319                                 c->local_size += v->type->size;
1320                         }
1321                 }
1322                 c->local = calloc(1, c->local_size);
1323         }
1324
1325 ###### free context storage
1326         free(context.global);
1327         free(context.local);
1328
1329 ### Executables
1330
1331 Executables can be lots of different things.  In many cases an
1332 executable is just an operation combined with one or two other
1333 executables.  This allows for expressions and lists etc.  Other times an
1334 executable is something quite specific like a constant or variable name.
1335 So we define a `struct exec` to be a general executable with a type, and
1336 a `struct binode` which is a subclass of `exec`, forms a node in a
1337 binary tree, and holds an operation.  There will be other subclasses,
1338 and to access these we need to be able to `cast` the `exec` into the
1339 various other types.  The first field in any `struct exec` is the type
1340 from the `exec_types` enum.
1341
1342 ###### macros
1343         #define cast(structname, pointer) ({            \
1344                 const typeof( ((struct structname *)0)->type) *__mptr = &(pointer)->type; \
1345                 if (__mptr && *__mptr != X##structname) abort();                \
1346                 (struct structname *)( (char *)__mptr);})
1347
1348         #define new(structname) ({                                              \
1349                 struct structname *__ptr = ((struct structname *)calloc(1,sizeof(struct structname))); \
1350                 __ptr->type = X##structname;                                            \
1351                 __ptr->line = -1; __ptr->column = -1;                                   \
1352                 __ptr;})
1353
1354         #define new_pos(structname, token) ({                                           \
1355                 struct structname *__ptr = ((struct structname *)calloc(1,sizeof(struct structname))); \
1356                 __ptr->type = X##structname;                                            \
1357                 __ptr->line = token.line; __ptr->column = token.col;                    \
1358                 __ptr;})
1359
1360 ###### ast
1361         enum exec_types {
1362                 Xbinode,
1363                 ## exec type
1364         };
1365         struct exec {
1366                 enum exec_types type;
1367                 int line, column;
1368         };
1369         struct binode {
1370                 struct exec;
1371                 enum Btype {
1372                         ## Binode types
1373                 } op;
1374                 struct exec *left, *right;
1375         };
1376
1377 ###### ast functions
1378
1379         static int __fput_loc(struct exec *loc, FILE *f)
1380         {
1381                 if (!loc)
1382                         return 0;
1383                 if (loc->line >= 0) {
1384                         fprintf(f, "%d:%d: ", loc->line, loc->column);
1385                         return 1;
1386                 }
1387                 if (loc->type == Xbinode)
1388                         return __fput_loc(cast(binode,loc)->left, f) ||
1389                                __fput_loc(cast(binode,loc)->right, f);  // NOTEST
1390                 return 0;                       // NOTEST
1391         }
1392         static void fput_loc(struct exec *loc, FILE *f)
1393         {
1394                 if (!__fput_loc(loc, f))
1395                         fprintf(f, "??:??: ");  // NOTEST
1396         }
1397
1398 Each different type of `exec` node needs a number of functions defined,
1399 a bit like methods.  We must be able to free it, print it, analyse it
1400 and execute it.  Once we have specific `exec` types we will need to
1401 parse them too.  Let's take this a bit more slowly.
1402
1403 #### Freeing
1404
1405 The parser generator requires a `free_foo` function for each struct
1406 that stores attributes and they will often be `exec`s and subtypes
1407 there-of.  So we need `free_exec` which can handle all the subtypes,
1408 and we need `free_binode`.
1409
1410 ###### ast functions
1411
1412         static void free_binode(struct binode *b)
1413         {
1414                 if (!b)
1415                         return;
1416                 free_exec(b->left);
1417                 free_exec(b->right);
1418                 free(b);
1419         }
1420
1421 ###### core functions
1422         static void free_exec(struct exec *e)
1423         {
1424                 if (!e)
1425                         return;
1426                 switch(e->type) {
1427                         ## free exec cases
1428                 }
1429         }
1430
1431 ###### forward decls
1432
1433         static void free_exec(struct exec *e);
1434
1435 ###### free exec cases
1436         case Xbinode: free_binode(cast(binode, e)); break;
1437
1438 #### Printing
1439
1440 Printing an `exec` requires that we know the current indent level for
1441 printing line-oriented components.  As will become clear later, we
1442 also want to know what sort of bracketing to use.
1443
1444 ###### ast functions
1445
1446         static void do_indent(int i, char *str)
1447         {
1448                 while (i--)
1449                         printf("    ");
1450                 printf("%s", str);
1451         }
1452
1453 ###### core functions
1454         static void print_binode(struct binode *b, int indent, int bracket)
1455         {
1456                 struct binode *b2;
1457                 switch(b->op) {
1458                 ## print binode cases
1459                 }
1460         }
1461
1462         static void print_exec(struct exec *e, int indent, int bracket)
1463         {
1464                 if (!e)
1465                         return;         // NOTEST
1466                 switch (e->type) {
1467                 case Xbinode:
1468                         print_binode(cast(binode, e), indent, bracket); break;
1469                 ## print exec cases
1470                 }
1471         }
1472
1473 ###### forward decls
1474
1475         static void print_exec(struct exec *e, int indent, int bracket);
1476
1477 #### Analysing
1478
1479 As discussed, analysis involves propagating type requirements around the
1480 program and looking for errors.
1481
1482 So `propagate_types` is passed an expected type (being a `struct type`
1483 pointer together with some `val_rules` flags) that the `exec` is
1484 expected to return, and returns the type that it does return, either
1485 of which can be `NULL` signifying "unknown".  An `ok` flag is passed
1486 by reference. It is set to `0` when an error is found, and `2` when
1487 any change is made.  If it remains unchanged at `1`, then no more
1488 propagation is needed.
1489
1490 ###### ast
1491
1492         enum val_rules {Rnolabel = 1<<0, Rboolok = 1<<1, Rnoconstant = 2<<1};
1493
1494 ###### format cases
1495         case 'r':
1496                 if (rules & Rnolabel)
1497                         fputs(" (labels not permitted)", stderr);
1498                 break;
1499
1500 ###### core functions
1501
1502         static struct type *propagate_types(struct exec *prog, struct parse_context *c, int *ok,
1503                                             struct type *type, int rules);
1504         static struct type *__propagate_types(struct exec *prog, struct parse_context *c, int *ok,
1505                                               struct type *type, int rules)
1506         {
1507                 struct type *t;
1508
1509                 if (!prog)
1510                         return Tnone;
1511
1512                 switch (prog->type) {
1513                 case Xbinode:
1514                 {
1515                         struct binode *b = cast(binode, prog);
1516                         switch (b->op) {
1517                         ## propagate binode cases
1518                         }
1519                         break;
1520                 }
1521                 ## propagate exec cases
1522                 }
1523                 return Tnone;
1524         }
1525
1526         static struct type *propagate_types(struct exec *prog, struct parse_context *c, int *ok,
1527                                             struct type *type, int rules)
1528         {
1529                 struct type *ret = __propagate_types(prog, c, ok, type, rules);
1530
1531                 if (c->parse_error)
1532                         *ok = 0;
1533                 return ret;
1534         }
1535
1536 #### Interpreting
1537
1538 Interpreting an `exec` doesn't require anything but the `exec`.  State
1539 is stored in variables and each variable will be directly linked from
1540 within the `exec` tree.  The exception to this is the `main` function
1541 which needs to look at command line arguments.  This function will be
1542 interpreted separately.
1543
1544 Each `exec` can return a value combined with a type in `struct lrval`.
1545 The type may be `Tnone` but must be non-NULL.  Some `exec`s will return
1546 the location of a value, which can be updated, in `lval`.  Others will
1547 set `lval` to NULL indicating that there is a value of appropriate type
1548 in `rval`.
1549
1550 ###### core functions
1551
1552         struct lrval {
1553                 struct type *type;
1554                 struct value rval, *lval;
1555         };
1556
1557         static struct lrval _interp_exec(struct parse_context *c, struct exec *e);
1558
1559         static struct value interp_exec(struct parse_context *c, struct exec *e,
1560                                         struct type **typeret)
1561         {
1562                 struct lrval ret = _interp_exec(c, e);
1563
1564                 if (!ret.type) abort();
1565                 if (typeret)
1566                         *typeret = ret.type;
1567                 if (ret.lval)
1568                         dup_value(ret.type, ret.lval, &ret.rval);
1569                 return ret.rval;
1570         }
1571
1572         static struct value *linterp_exec(struct parse_context *c, struct exec *e,
1573                                           struct type **typeret)
1574         {
1575                 struct lrval ret = _interp_exec(c, e);
1576
1577                 if (ret.lval)
1578                         *typeret = ret.type;
1579                 else
1580                         free_value(ret.type, &ret.rval);
1581                 return ret.lval;
1582         }
1583
1584         static struct lrval _interp_exec(struct parse_context *c, struct exec *e)
1585         {
1586                 struct lrval ret;
1587                 struct value rv = {}, *lrv = NULL;
1588                 struct type *rvtype;
1589
1590                 rvtype = ret.type = Tnone;
1591                 if (!e) {
1592                         ret.lval = lrv; // UNTESTED
1593                         ret.rval = rv;  // UNTESTED
1594                         return ret;     // UNTESTED
1595                 }
1596
1597                 switch(e->type) {
1598                 case Xbinode:
1599                 {
1600                         struct binode *b = cast(binode, e);
1601                         struct value left, right, *lleft;
1602                         struct type *ltype, *rtype;
1603                         ltype = rtype = Tnone;
1604                         switch (b->op) {
1605                         ## interp binode cases
1606                         }
1607                         free_value(ltype, &left);
1608                         free_value(rtype, &right);
1609                         break;
1610                 }
1611                 ## interp exec cases
1612                 }
1613                 ret.lval = lrv;
1614                 ret.rval = rv;
1615                 ret.type = rvtype;
1616                 return ret;
1617         }
1618
1619 ### Complex types
1620
1621 Now that we have the shape of the interpreter in place we can add some
1622 complex types and connected them in to the data structures and the
1623 different phases of parse, analyse, print, interpret.
1624
1625 Thus far we have arrays and structs.
1626
1627 #### Arrays
1628
1629 Arrays can be declared by giving a size and a type, as `[size]type' so
1630 `freq:[26]number` declares `freq` to be an array of 26 numbers.  The
1631 size can be either a literal number, or a named constant.  Some day an
1632 arbitrary expression will be supported.
1633
1634 As a formal parameter to a function, the array can be declared with a
1635 new variable as the size: `name:[size::number]string`.  The `size`
1636 variable is set to the size of the array and must be a constant.  As
1637 `number` is the only supported type, it can be left out:
1638 `name:[size::]string`.
1639
1640 Arrays cannot be assigned.  When pointers are introduced we will also
1641 introduce array slices which can refer to part or all of an array -
1642 the assignment syntax will create a slice.  For now, an array can only
1643 ever be referenced by the name it is declared with.  It is likely that
1644 a "`copy`" primitive will eventually be define which can be used to
1645 make a copy of an array with controllable recursive depth.
1646
1647 For now we have two sorts of array, those with fixed size either because
1648 it is given as a literal number or because it is a struct member (which
1649 cannot have a runtime-changing size), and those with a size that is
1650 determined at runtime - local variables with a const size.  The former
1651 have their size calculated at parse time, the latter at run time.
1652
1653 For the latter type, the `size` field of the type is the size of a
1654 pointer, and the array is reallocated every time it comes into scope.
1655
1656 We differentiate struct fields with a const size from local variables
1657 with a const size by whether they are prepared at parse time or not.
1658
1659 ###### type union fields
1660
1661         struct {
1662                 int unspec;     // size is unspecified - vsize must be set.
1663                 short size;
1664                 short static_size;
1665                 struct variable *vsize;
1666                 struct type *member;
1667         } array;
1668
1669 ###### value union fields
1670         void *array;  // used if not static_size
1671
1672 ###### value functions
1673
1674         static void array_prepare_type(struct parse_context *c, struct type *type,
1675                                        int parse_time)
1676         {
1677                 struct value *vsize;
1678                 mpz_t q;
1679                 if (!type->array.vsize || type->array.static_size)
1680                         return;
1681
1682                 vsize = var_value(c, type->array.vsize);
1683                 mpz_init(q);
1684                 mpz_tdiv_q(q, mpq_numref(vsize->num), mpq_denref(vsize->num));
1685                 type->array.size = mpz_get_si(q);
1686                 mpz_clear(q);
1687
1688                 if (parse_time) {
1689                         type->array.static_size = 1;
1690                         type->size = type->array.size * type->array.member->size;
1691                         type->align = type->array.member->align;
1692                 }
1693         }
1694
1695         static void array_init(struct type *type, struct value *val)
1696         {
1697                 int i;
1698                 void *ptr = val->ptr;
1699
1700                 if (!val)
1701                         return;                         // NOTEST
1702                 if (!type->array.static_size) {
1703                         val->array = calloc(type->array.size,
1704                                             type->array.member->size);
1705                         ptr = val->array;
1706                 }
1707                 for (i = 0; i < type->array.size; i++) {
1708                         struct value *v;
1709                         v = (void*)ptr + i * type->array.member->size;
1710                         val_init(type->array.member, v);
1711                 }
1712         }
1713
1714         static void array_free(struct type *type, struct value *val)
1715         {
1716                 int i;
1717                 void *ptr = val->ptr;
1718
1719                 if (!type->array.static_size)
1720                         ptr = val->array;
1721                 for (i = 0; i < type->array.size; i++) {
1722                         struct value *v;
1723                         v = (void*)ptr + i * type->array.member->size;
1724                         free_value(type->array.member, v);
1725                 }
1726                 if (!type->array.static_size)
1727                         free(ptr);
1728         }
1729
1730         static int array_compat(struct type *require, struct type *have)
1731         {
1732                 if (have->compat != require->compat)
1733                         return 0;       // UNTESTED
1734                 /* Both are arrays, so we can look at details */
1735                 if (!type_compat(require->array.member, have->array.member, 0))
1736                         return 0;
1737                 if (have->array.unspec && require->array.unspec) {
1738                         if (have->array.vsize && require->array.vsize &&
1739                             have->array.vsize != require->array.vsize)  // UNTESTED
1740                                 /* sizes might not be the same */
1741                                 return 0;       // UNTESTED
1742                         return 1;
1743                 }
1744                 if (have->array.unspec || require->array.unspec)
1745                         return 1;       // UNTESTED
1746                 if (require->array.vsize == NULL && have->array.vsize == NULL)
1747                         return require->array.size == have->array.size;
1748
1749                 return require->array.vsize == have->array.vsize;       // UNTESTED
1750         }
1751
1752         static void array_print_type(struct type *type, FILE *f)
1753         {
1754                 fputs("[", f);
1755                 if (type->array.vsize) {
1756                         struct binding *b = type->array.vsize->name;
1757                         fprintf(f, "%.*s%s]", b->name.len, b->name.txt,
1758                                 type->array.unspec ? "::" : "");
1759                 } else
1760                         fprintf(f, "%d]", type->array.size);
1761                 type_print(type->array.member, f);
1762         }
1763
1764         static struct type array_prototype = {
1765                 .init = array_init,
1766                 .prepare_type = array_prepare_type,
1767                 .print_type = array_print_type,
1768                 .compat = array_compat,
1769                 .free = array_free,
1770                 .size = sizeof(void*),
1771                 .align = sizeof(void*),
1772         };
1773
1774 ###### declare terminals
1775         $TERM [ ]
1776
1777 ###### type grammar
1778
1779         | [ NUMBER ] Type ${ {
1780                 char tail[3];
1781                 mpq_t num;
1782                 struct text noname = { "", 0 };
1783                 struct type *t;
1784
1785                 $0 = t = add_type(c, noname, &array_prototype);
1786                 t->array.member = $<4;
1787                 t->array.vsize = NULL;
1788                 if (number_parse(num, tail, $2.txt) == 0)
1789                         tok_err(c, "error: unrecognised number", &$2);
1790                 else if (tail[0])
1791                         tok_err(c, "error: unsupported number suffix", &$2);
1792                 else {
1793                         t->array.size = mpz_get_ui(mpq_numref(num));
1794                         if (mpz_cmp_ui(mpq_denref(num), 1) != 0) {
1795                                 tok_err(c, "error: array size must be an integer",
1796                                         &$2);
1797                         } else if (mpz_cmp_ui(mpq_numref(num), 1UL << 30) >= 0)
1798                                 tok_err(c, "error: array size is too large",
1799                                         &$2);
1800                         mpq_clear(num);
1801                 }
1802                 t->array.static_size = 1;
1803                 t->size = t->array.size * t->array.member->size;
1804                 t->align = t->array.member->align;
1805         } }$
1806
1807         | [ IDENTIFIER ] Type ${ {
1808                 struct variable *v = var_ref(c, $2.txt);
1809                 struct text noname = { "", 0 };
1810
1811                 if (!v)
1812                         tok_err(c, "error: name undeclared", &$2);
1813                 else if (!v->constant)
1814                         tok_err(c, "error: array size must be a constant", &$2);
1815
1816                 $0 = add_type(c, noname, &array_prototype);
1817                 $0->array.member = $<4;
1818                 $0->array.size = 0;
1819                 $0->array.vsize = v;
1820         } }$
1821
1822 ###### Grammar
1823         $*type
1824         OptType -> Type ${ $0 = $<1; }$
1825                 | ${ $0 = NULL; }$
1826
1827 ###### formal type grammar
1828
1829         | [ IDENTIFIER :: OptType ] Type ${ {
1830                 struct variable *v = var_decl(c, $ID.txt);
1831                 struct text noname = { "", 0 };
1832
1833                 v->type = $<OT;
1834                 v->constant = 1;
1835                 if (!v->type)
1836                         v->type = Tnum;
1837                 $0 = add_type(c, noname, &array_prototype);
1838                 $0->array.member = $<6;
1839                 $0->array.size = 0;
1840                 $0->array.unspec = 1;
1841                 $0->array.vsize = v;
1842         } }$
1843
1844 ###### Binode types
1845         Index,
1846
1847 ###### variable grammar
1848
1849         | Variable [ Expression ] ${ {
1850                 struct binode *b = new(binode);
1851                 b->op = Index;
1852                 b->left = $<1;
1853                 b->right = $<3;
1854                 $0 = b;
1855         } }$
1856
1857 ###### print binode cases
1858         case Index:
1859                 print_exec(b->left, -1, bracket);
1860                 printf("[");
1861                 print_exec(b->right, -1, bracket);
1862                 printf("]");
1863                 break;
1864
1865 ###### propagate binode cases
1866         case Index:
1867                 /* left must be an array, right must be a number,
1868                  * result is the member type of the array
1869                  */
1870                 propagate_types(b->right, c, ok, Tnum, 0);
1871                 t = propagate_types(b->left, c, ok, NULL, rules & Rnoconstant);
1872                 if (!t || t->compat != array_compat) {
1873                         type_err(c, "error: %1 cannot be indexed", prog, t, 0, NULL);
1874                         return NULL;
1875                 } else {
1876                         if (!type_compat(type, t->array.member, rules)) {
1877                                 type_err(c, "error: have %1 but need %2", prog,
1878                                          t->array.member, rules, type);
1879                         }
1880                         return t->array.member;
1881                 }
1882                 break;
1883
1884 ###### interp binode cases
1885         case Index: {
1886                 mpz_t q;
1887                 long i;
1888                 void *ptr;
1889
1890                 lleft = linterp_exec(c, b->left, &ltype);
1891                 right = interp_exec(c, b->right, &rtype);
1892                 mpz_init(q);
1893                 mpz_tdiv_q(q, mpq_numref(right.num), mpq_denref(right.num));
1894                 i = mpz_get_si(q);
1895                 mpz_clear(q);
1896
1897                 if (ltype->array.static_size)
1898                         ptr = lleft;
1899                 else
1900                         ptr = *(void**)lleft;
1901                 rvtype = ltype->array.member;
1902                 if (i >= 0 && i < ltype->array.size)
1903                         lrv = ptr + i * rvtype->size;
1904                 else
1905                         val_init(ltype->array.member, &rv);
1906                 ltype = NULL;
1907                 break;
1908         }
1909
1910 #### Structs
1911
1912 A `struct` is a data-type that contains one or more other data-types.
1913 It differs from an array in that each member can be of a different
1914 type, and they are accessed by name rather than by number.  Thus you
1915 cannot choose an element by calculation, you need to know what you
1916 want up-front.
1917
1918 The language makes no promises about how a given structure will be
1919 stored in memory - it is free to rearrange fields to suit whatever
1920 criteria seems important.
1921
1922 Structs are declared separately from program code - they cannot be
1923 declared in-line in a variable declaration like arrays can.  A struct
1924 is given a name and this name is used to identify the type - the name
1925 is not prefixed by the word `struct` as it would be in C.
1926
1927 Structs are only treated as the same if they have the same name.
1928 Simply having the same fields in the same order is not enough.  This
1929 might change once we can create structure initializers from a list of
1930 values.
1931
1932 Each component datum is identified much like a variable is declared,
1933 with a name, one or two colons, and a type.  The type cannot be omitted
1934 as there is no opportunity to deduce the type from usage.  An initial
1935 value can be given following an equals sign, so
1936
1937 ##### Example: a struct type
1938
1939         struct complex:
1940                 x:number = 0
1941                 y:number = 0
1942
1943 would declare a type called "complex" which has two number fields,
1944 each initialised to zero.
1945
1946 Struct will need to be declared separately from the code that uses
1947 them, so we will need to be able to print out the declaration of a
1948 struct when reprinting the whole program.  So a `print_type_decl` type
1949 function will be needed.
1950
1951 ###### type union fields
1952
1953         struct {
1954                 int nfields;
1955                 struct field {
1956                         struct text name;
1957                         struct type *type;
1958                         struct value *init;
1959                         int offset;
1960                 } *fields;
1961         } structure;
1962
1963 ###### type functions
1964         void (*print_type_decl)(struct type *type, FILE *f);
1965
1966 ###### value functions
1967
1968         static void structure_init(struct type *type, struct value *val)
1969         {
1970                 int i;
1971
1972                 for (i = 0; i < type->structure.nfields; i++) {
1973                         struct value *v;
1974                         v = (void*) val->ptr + type->structure.fields[i].offset;
1975                         if (type->structure.fields[i].init)
1976                                 dup_value(type->structure.fields[i].type,
1977                                           type->structure.fields[i].init,
1978                                           v);
1979                         else
1980                                 val_init(type->structure.fields[i].type, v);
1981                 }
1982         }
1983
1984         static void structure_free(struct type *type, struct value *val)
1985         {
1986                 int i;
1987
1988                 for (i = 0; i < type->structure.nfields; i++) {
1989                         struct value *v;
1990                         v = (void*)val->ptr + type->structure.fields[i].offset;
1991                         free_value(type->structure.fields[i].type, v);
1992                 }
1993         }
1994
1995         static void structure_free_type(struct type *t)
1996         {
1997                 int i;
1998                 for (i = 0; i < t->structure.nfields; i++)
1999                         if (t->structure.fields[i].init) {
2000                                 free_value(t->structure.fields[i].type,
2001                                            t->structure.fields[i].init);
2002                         }
2003                 free(t->structure.fields);
2004         }
2005
2006         static struct type structure_prototype = {
2007                 .init = structure_init,
2008                 .free = structure_free,
2009                 .free_type = structure_free_type,
2010                 .print_type_decl = structure_print_type,
2011         };
2012
2013 ###### exec type
2014         Xfieldref,
2015
2016 ###### ast
2017         struct fieldref {
2018                 struct exec;
2019                 struct exec *left;
2020                 int index;
2021                 struct text name;
2022         };
2023
2024 ###### free exec cases
2025         case Xfieldref:
2026                 free_exec(cast(fieldref, e)->left);
2027                 free(e);
2028                 break;
2029
2030 ###### declare terminals
2031         $TERM struct .
2032
2033 ###### variable grammar
2034
2035         | Variable . IDENTIFIER ${ {
2036                 struct fieldref *fr = new_pos(fieldref, $2);
2037                 fr->left = $<1;
2038                 fr->name = $3.txt;
2039                 fr->index = -2;
2040                 $0 = fr;
2041         } }$
2042
2043 ###### print exec cases
2044
2045         case Xfieldref:
2046         {
2047                 struct fieldref *f = cast(fieldref, e);
2048                 print_exec(f->left, -1, bracket);
2049                 printf(".%.*s", f->name.len, f->name.txt);
2050                 break;
2051         }
2052
2053 ###### ast functions
2054         static int find_struct_index(struct type *type, struct text field)
2055         {
2056                 int i;
2057                 for (i = 0; i < type->structure.nfields; i++)
2058                         if (text_cmp(type->structure.fields[i].name, field) == 0)
2059                                 return i;
2060                 return -1;
2061         }
2062
2063 ###### propagate exec cases
2064
2065         case Xfieldref:
2066         {
2067                 struct fieldref *f = cast(fieldref, prog);
2068                 struct type *st = propagate_types(f->left, c, ok, NULL, 0);
2069
2070                 if (!st)
2071                         type_err(c, "error: unknown type for field access", f->left,    // UNTESTED
2072                                  NULL, 0, NULL);
2073                 else if (st->init != structure_init)
2074                         type_err(c, "error: field reference attempted on %1, not a struct",
2075                                  f->left, st, 0, NULL);
2076                 else if (f->index == -2) {
2077                         f->index = find_struct_index(st, f->name);
2078                         if (f->index < 0)
2079                                 type_err(c, "error: cannot find requested field in %1",
2080                                          f->left, st, 0, NULL);
2081                 }
2082                 if (f->index >= 0) {
2083                         struct type *ft = st->structure.fields[f->index].type;
2084                         if (!type_compat(type, ft, rules))
2085                                 type_err(c, "error: have %1 but need %2", prog,
2086                                          ft, rules, type);
2087                         return ft;
2088                 }
2089                 break;
2090         }
2091
2092 ###### interp exec cases
2093         case Xfieldref:
2094         {
2095                 struct fieldref *f = cast(fieldref, e);
2096                 struct type *ltype;
2097                 struct value *lleft = linterp_exec(c, f->left, &ltype);
2098                 lrv = (void*)lleft->ptr + ltype->structure.fields[f->index].offset;
2099                 rvtype = ltype->structure.fields[f->index].type;
2100                 break;
2101         }
2102
2103 ###### ast
2104         struct fieldlist {
2105                 struct fieldlist *prev;
2106                 struct field f;
2107         };
2108
2109 ###### ast functions
2110         static void free_fieldlist(struct fieldlist *f)
2111         {
2112                 if (!f)
2113                         return;
2114                 free_fieldlist(f->prev);
2115                 if (f->f.init) {
2116                         free_value(f->f.type, f->f.init);       // UNTESTED
2117                         free(f->f.init);        // UNTESTED
2118                 }
2119                 free(f);
2120         }
2121
2122 ###### top level grammar
2123         DeclareStruct -> struct IDENTIFIER FieldBlock Newlines ${ {
2124                         struct type *t =
2125                                 add_type(c, $2.txt, &structure_prototype);
2126                         int cnt = 0;
2127                         struct fieldlist *f;
2128
2129                         for (f = $3; f; f=f->prev)
2130                                 cnt += 1;
2131
2132                         t->structure.nfields = cnt;
2133                         t->structure.fields = calloc(cnt, sizeof(struct field));
2134                         f = $3;
2135                         while (cnt > 0) {
2136                                 int a = f->f.type->align;
2137                                 cnt -= 1;
2138                                 t->structure.fields[cnt] = f->f;
2139                                 if (t->size & (a-1))
2140                                         t->size = (t->size | (a-1)) + 1;
2141                                 t->structure.fields[cnt].offset = t->size;
2142                                 t->size += ((f->f.type->size - 1) | (a-1)) + 1;
2143                                 if (a > t->align)
2144                                         t->align = a;
2145                                 f->f.init = NULL;
2146                                 f = f->prev;
2147                         }
2148                 } }$
2149
2150         $*fieldlist
2151         FieldBlock -> { IN OptNL FieldLines OUT OptNL } ${ $0 = $<FL; }$
2152                 | { SimpleFieldList } ${ $0 = $<SFL; }$
2153                 | IN OptNL FieldLines OUT ${ $0 = $<FL; }$
2154                 | SimpleFieldList EOL ${ $0 = $<SFL; }$
2155
2156         FieldLines -> SimpleFieldList Newlines ${ $0 = $<SFL; }$
2157                 | FieldLines SimpleFieldList Newlines ${
2158                         $SFL->prev = $<FL;
2159                         $0 = $<SFL;
2160                 }$
2161
2162         SimpleFieldList -> Field ${ $0 = $<F; }$
2163                 | SimpleFieldList ; Field ${
2164                         $F->prev = $<SFL;
2165                         $0 = $<F;
2166                 }$
2167                 | SimpleFieldList ; ${
2168                         $0 = $<SFL;
2169                 }$
2170                 | ERROR ${ tok_err(c, "Syntax error in struct field", &$1); }$
2171
2172         Field -> IDENTIFIER : Type = Expression ${ {
2173                         int ok; // UNTESTED
2174
2175                         $0 = calloc(1, sizeof(struct fieldlist));
2176                         $0->f.name = $1.txt;
2177                         $0->f.type = $<3;
2178                         $0->f.init = NULL;
2179                         do {
2180                                 ok = 1;
2181                                 propagate_types($<5, c, &ok, $3, 0);
2182                         } while (ok == 2);
2183                         if (!ok)
2184                                 c->parse_error = 1;     // UNTESTED
2185                         else {
2186                                 struct value vl = interp_exec(c, $5, NULL);
2187                                 $0->f.init = global_alloc(c, $0->f.type, NULL, &vl);
2188                         }
2189                 } }$
2190                 | IDENTIFIER : Type ${
2191                         $0 = calloc(1, sizeof(struct fieldlist));
2192                         $0->f.name = $1.txt;
2193                         $0->f.type = $<3;
2194                         if ($0->f.type->prepare_type)
2195                                 $0->f.type->prepare_type(c, $0->f.type, 1);
2196                 }$
2197
2198 ###### forward decls
2199         static void structure_print_type(struct type *t, FILE *f);
2200
2201 ###### value functions
2202         static void structure_print_type(struct type *t, FILE *f)       // UNTESTED
2203         {       // UNTESTED
2204                 int i;  // UNTESTED
2205
2206                 fprintf(f, "struct %.*s\n", t->name.len, t->name.txt);
2207
2208                 for (i = 0; i < t->structure.nfields; i++) {
2209                         struct field *fl = t->structure.fields + i;
2210                         fprintf(f, "    %.*s : ", fl->name.len, fl->name.txt);
2211                         type_print(fl->type, f);
2212                         if (fl->type->print && fl->init) {
2213                                 fprintf(f, " = ");
2214                                 if (fl->type == Tstr)
2215                                         fprintf(f, "\"");       // UNTESTED
2216                                 print_value(fl->type, fl->init);
2217                                 if (fl->type == Tstr)
2218                                         fprintf(f, "\"");       // UNTESTED
2219                         }
2220                         printf("\n");
2221                 }
2222         }
2223
2224 ###### print type decls
2225         {       // UNTESTED
2226                 struct type *t; // UNTESTED
2227                 int target = -1;
2228
2229                 while (target != 0) {
2230                         int i = 0;
2231                         for (t = context.typelist; t ; t=t->next)
2232                                 if (t->print_type_decl) {
2233                                         i += 1;
2234                                         if (i == target)
2235                                                 break;
2236                                 }
2237
2238                         if (target == -1) {
2239                                 target = i;
2240                         } else {
2241                                 t->print_type_decl(t, stdout);
2242                                 target -= 1;
2243                         }
2244                 }
2245         }
2246
2247 ### Functions
2248
2249 A function is a named chunk of code which can be passed parameters and
2250 can return results.  Each function has an implicit type which includes
2251 the set of parameters and the return value.  As yet these types cannot
2252 be declared separate from the function itself.
2253
2254 In fact, only one function is currently possible - `main`.  `main` is
2255 passed an array of strings together with the size of the array, and
2256 doesn't return anything.  The strings are command line arguments.
2257
2258 The parameters can be specified either in parentheses as a list, such as
2259
2260 ##### Example: function 1
2261
2262         func main(av:[ac::number]string)
2263                 code block
2264
2265 or as an indented list of one parameter per line
2266
2267 ##### Example: function 2
2268
2269         func main
2270                 argv:[argc::number]string
2271         do
2272                 code block
2273
2274 ###### Binode types
2275         Func, List,
2276
2277 ###### Grammar
2278
2279         $TERM func main
2280
2281         $*binode
2282         MainFunction -> func main ( OpenScope Args ) Block Newlines ${
2283                         $0 = new(binode);
2284                         $0->op = Func;
2285                         $0->left = reorder_bilist($<Ar);
2286                         $0->right = $<Bl;
2287                         var_block_close(c, CloseSequential);
2288                         if (c->scope_stack && !c->parse_error) abort();
2289                 }$
2290                 | func main IN OpenScope OptNL Args OUT OptNL do Block Newlines ${
2291                         $0 = new(binode);
2292                         $0->op = Func;
2293                         $0->left = reorder_bilist($<Ar);
2294                         $0->right = $<Bl;
2295                         var_block_close(c, CloseSequential);
2296                         if (c->scope_stack && !c->parse_error) abort();
2297                 }$
2298                 | func main NEWLINE OpenScope OptNL do Block Newlines ${
2299                         $0 = new(binode);
2300                         $0->op = Func;
2301                         $0->left = NULL;
2302                         $0->right = $<Bl;
2303                         var_block_close(c, CloseSequential);
2304                         if (c->scope_stack && !c->parse_error) abort();
2305                 }$
2306
2307         Args -> ${ $0 = NULL; }$
2308                 | Varlist ${ $0 = $<1; }$
2309                 | Varlist ; ${ $0 = $<1; }$
2310                 | Varlist NEWLINE ${ $0 = $<1; }$
2311
2312         Varlist -> Varlist ; ArgDecl ${ // UNTESTED
2313                         $0 = new(binode);
2314                         $0->op = List;
2315                         $0->left = $<Vl;
2316                         $0->right = $<AD;
2317                 }$
2318                 | ArgDecl ${
2319                         $0 = new(binode);
2320                         $0->op = List;
2321                         $0->left = NULL;
2322                         $0->right = $<AD;
2323                 }$
2324
2325         $*var
2326         ArgDecl -> IDENTIFIER : FormalType ${ {
2327                 struct variable *v = var_decl(c, $1.txt);
2328                 $0 = new(var);
2329                 $0->var = v;
2330                 v->type = $<FT;
2331         } }$
2332
2333 ## Executables: the elements of code
2334
2335 Each code element needs to be parsed, printed, analysed,
2336 interpreted, and freed.  There are several, so let's just start with
2337 the easy ones and work our way up.
2338
2339 ### Values
2340
2341 We have already met values as separate objects.  When manifest
2342 constants appear in the program text, that must result in an executable
2343 which has a constant value.  So the `val` structure embeds a value in
2344 an executable.
2345
2346 ###### exec type
2347         Xval,
2348
2349 ###### ast
2350         struct val {
2351                 struct exec;
2352                 struct type *vtype;
2353                 struct value val;
2354         };
2355
2356 ###### ast functions
2357         struct val *new_val(struct type *T, struct token tk)
2358         {
2359                 struct val *v = new_pos(val, tk);
2360                 v->vtype = T;
2361                 return v;
2362         }
2363
2364 ###### Grammar
2365
2366         $TERM True False
2367
2368         $*val
2369         Value ->  True ${
2370                         $0 = new_val(Tbool, $1);
2371                         $0->val.bool = 1;
2372                         }$
2373                 | False ${
2374                         $0 = new_val(Tbool, $1);
2375                         $0->val.bool = 0;
2376                         }$
2377                 | NUMBER ${
2378                         $0 = new_val(Tnum, $1);
2379                         {
2380                         char tail[3];
2381                         if (number_parse($0->val.num, tail, $1.txt) == 0)
2382                                 mpq_init($0->val.num);  // UNTESTED
2383                                 if (tail[0])
2384                                         tok_err(c, "error: unsupported number suffix",
2385                                                 &$1);
2386                         }
2387                         }$
2388                 | STRING ${
2389                         $0 = new_val(Tstr, $1);
2390                         {
2391                         char tail[3];
2392                         string_parse(&$1, '\\', &$0->val.str, tail);
2393                         if (tail[0])
2394                                 tok_err(c, "error: unsupported string suffix",
2395                                         &$1);
2396                         }
2397                         }$
2398                 | MULTI_STRING ${
2399                         $0 = new_val(Tstr, $1);
2400                         {
2401                         char tail[3];
2402                         string_parse(&$1, '\\', &$0->val.str, tail);
2403                         if (tail[0])
2404                                 tok_err(c, "error: unsupported string suffix",
2405                                         &$1);
2406                         }
2407                         }$
2408
2409 ###### print exec cases
2410         case Xval:
2411         {
2412                 struct val *v = cast(val, e);
2413                 if (v->vtype == Tstr)
2414                         printf("\"");
2415                 print_value(v->vtype, &v->val);
2416                 if (v->vtype == Tstr)
2417                         printf("\"");
2418                 break;
2419         }
2420
2421 ###### propagate exec cases
2422         case Xval:
2423         {
2424                 struct val *val = cast(val, prog);
2425                 if (!type_compat(type, val->vtype, rules))
2426                         type_err(c, "error: expected %1%r found %2",
2427                                    prog, type, rules, val->vtype);
2428                 return val->vtype;
2429         }
2430
2431 ###### interp exec cases
2432         case Xval:
2433                 rvtype = cast(val, e)->vtype;
2434                 dup_value(rvtype, &cast(val, e)->val, &rv);
2435                 break;
2436
2437 ###### ast functions
2438         static void free_val(struct val *v)
2439         {
2440                 if (v)
2441                         free_value(v->vtype, &v->val);
2442                 free(v);
2443         }
2444
2445 ###### free exec cases
2446         case Xval: free_val(cast(val, e)); break;
2447
2448 ###### ast functions
2449         // Move all nodes from 'b' to 'rv', reversing their order.
2450         // In 'b' 'left' is a list, and 'right' is the last node.
2451         // In 'rv', left' is the first node and 'right' is a list.
2452         static struct binode *reorder_bilist(struct binode *b)
2453         {
2454                 struct binode *rv = NULL;
2455
2456                 while (b) {
2457                         struct exec *t = b->right;
2458                         b->right = rv;
2459                         rv = b;
2460                         if (b->left)
2461                                 b = cast(binode, b->left);
2462                         else
2463                                 b = NULL;
2464                         rv->left = t;
2465                 }
2466                 return rv;
2467         }
2468
2469 ### Variables
2470
2471 Just as we used a `val` to wrap a value into an `exec`, we similarly
2472 need a `var` to wrap a `variable` into an exec.  While each `val`
2473 contained a copy of the value, each `var` holds a link to the variable
2474 because it really is the same variable no matter where it appears.
2475 When a variable is used, we need to remember to follow the `->merged`
2476 link to find the primary instance.
2477
2478 ###### exec type
2479         Xvar,
2480
2481 ###### ast
2482         struct var {
2483                 struct exec;
2484                 struct variable *var;
2485         };
2486
2487 ###### Grammar
2488
2489         $TERM : ::
2490
2491         $*var
2492         VariableDecl -> IDENTIFIER : ${ {
2493                 struct variable *v = var_decl(c, $1.txt);
2494                 $0 = new_pos(var, $1);
2495                 $0->var = v;
2496                 if (v)
2497                         v->where_decl = $0;
2498                 else {
2499                         v = var_ref(c, $1.txt);
2500                         $0->var = v;
2501                         type_err(c, "error: variable '%v' redeclared",
2502                                  $0, NULL, 0, NULL);
2503                         type_err(c, "info: this is where '%v' was first declared",
2504                                  v->where_decl, NULL, 0, NULL);
2505                 }
2506         } }$
2507             | IDENTIFIER :: ${ {
2508                 struct variable *v = var_decl(c, $1.txt);
2509                 $0 = new_pos(var, $1);
2510                 $0->var = v;
2511                 if (v) {
2512                         v->where_decl = $0;
2513                         v->constant = 1;
2514                 } else {
2515                         v = var_ref(c, $1.txt);
2516                         $0->var = v;
2517                         type_err(c, "error: variable '%v' redeclared",
2518                                  $0, NULL, 0, NULL);
2519                         type_err(c, "info: this is where '%v' was first declared",
2520                                  v->where_decl, NULL, 0, NULL);
2521                 }
2522         } }$
2523             | IDENTIFIER : Type ${ {
2524                 struct variable *v = var_decl(c, $1.txt);
2525                 $0 = new_pos(var, $1);
2526                 $0->var = v;
2527                 if (v) {
2528                         v->where_decl = $0;
2529                         v->where_set = $0;
2530                         v->type = $<Type;
2531                 } else {
2532                         v = var_ref(c, $1.txt);
2533                         $0->var = v;
2534                         type_err(c, "error: variable '%v' redeclared",
2535                                  $0, NULL, 0, NULL);
2536                         type_err(c, "info: this is where '%v' was first declared",
2537                                  v->where_decl, NULL, 0, NULL);
2538                 }
2539         } }$
2540             | IDENTIFIER :: Type ${ {
2541                 struct variable *v = var_decl(c, $1.txt);
2542                 $0 = new_pos(var, $1);
2543                 $0->var = v;
2544                 if (v) {
2545                         v->where_decl = $0;
2546                         v->where_set = $0;
2547                         v->type = $<Type;
2548                         v->constant = 1;
2549                 } else {
2550                         v = var_ref(c, $1.txt);
2551                         $0->var = v;
2552                         type_err(c, "error: variable '%v' redeclared",
2553                                  $0, NULL, 0, NULL);
2554                         type_err(c, "info: this is where '%v' was first declared",
2555                                  v->where_decl, NULL, 0, NULL);
2556                 }
2557         } }$
2558
2559         $*exec
2560         Variable -> IDENTIFIER ${ {
2561                 struct variable *v = var_ref(c, $1.txt);
2562                 $0 = new_pos(var, $1);
2563                 if (v == NULL) {
2564                         /* This might be a label - allocate a var just in case */
2565                         v = var_decl(c, $1.txt);
2566                         if (v) {
2567                                 v->type = Tnone;
2568                                 v->where_decl = $0;
2569                                 v->where_set = $0;
2570                         }
2571                 }
2572                 cast(var, $0)->var = v;
2573         } }$
2574         ## variable grammar
2575
2576 ###### print exec cases
2577         case Xvar:
2578         {
2579                 struct var *v = cast(var, e);
2580                 if (v->var) {
2581                         struct binding *b = v->var->name;
2582                         printf("%.*s", b->name.len, b->name.txt);
2583                 }
2584                 break;
2585         }
2586
2587 ###### format cases
2588         case 'v':
2589                 if (loc && loc->type == Xvar) {
2590                         struct var *v = cast(var, loc);
2591                         if (v->var) {
2592                                 struct binding *b = v->var->name;
2593                                 fprintf(stderr, "%.*s", b->name.len, b->name.txt);
2594                         } else
2595                                 fputs("???", stderr);   // NOTEST
2596                 } else
2597                         fputs("NOTVAR", stderr);        // NOTEST
2598                 break;
2599
2600 ###### propagate exec cases
2601
2602         case Xvar:
2603         {
2604                 struct var *var = cast(var, prog);
2605                 struct variable *v = var->var;
2606                 if (!v) {
2607                         type_err(c, "%d:BUG: no variable!!", prog, NULL, 0, NULL); // NOTEST
2608                         return Tnone;                                   // NOTEST
2609                 }
2610                 v = v->merged;
2611                 if (v->constant && (rules & Rnoconstant)) {
2612                         type_err(c, "error: Cannot assign to a constant: %v",
2613                                  prog, NULL, 0, NULL);
2614                         type_err(c, "info: name was defined as a constant here",
2615                                  v->where_decl, NULL, 0, NULL);
2616                         return v->type;
2617                 }
2618                 if (v->type == Tnone && v->where_decl == prog)
2619                         type_err(c, "error: variable used but not declared: %v",
2620                                  prog, NULL, 0, NULL);
2621                 if (v->type == NULL) {
2622                         if (type && *ok != 0) {
2623                                 v->type = type;
2624                                 v->where_set = prog;
2625                                 *ok = 2;
2626                         }
2627                         return type;
2628                 }
2629                 if (!type_compat(type, v->type, rules)) {
2630                         type_err(c, "error: expected %1%r but variable '%v' is %2", prog,
2631                                  type, rules, v->type);
2632                         type_err(c, "info: this is where '%v' was set to %1", v->where_set,
2633                                  v->type, rules, NULL);
2634                 }
2635                 if (!type)
2636                         return v->type;
2637                 return type;
2638         }
2639
2640 ###### interp exec cases
2641         case Xvar:
2642         {
2643                 struct var *var = cast(var, e);
2644                 struct variable *v = var->var;
2645
2646                 v = v->merged;
2647                 lrv = var_value(c, v);
2648                 rvtype = v->type;
2649                 break;
2650         }
2651
2652 ###### ast functions
2653
2654         static void free_var(struct var *v)
2655         {
2656                 free(v);
2657         }
2658
2659 ###### free exec cases
2660         case Xvar: free_var(cast(var, e)); break;
2661
2662 ### Expressions: Conditional
2663
2664 Our first user of the `binode` will be conditional expressions, which
2665 is a bit odd as they actually have three components.  That will be
2666 handled by having 2 binodes for each expression.  The conditional
2667 expression is the lowest precedence operator which is why we define it
2668 first - to start the precedence list.
2669
2670 Conditional expressions are of the form "value `if` condition `else`
2671 other_value".  They associate to the right, so everything to the right
2672 of `else` is part of an else value, while only a higher-precedence to
2673 the left of `if` is the if values.  Between `if` and `else` there is no
2674 room for ambiguity, so a full conditional expression is allowed in
2675 there.
2676
2677 ###### Binode types
2678         CondExpr,
2679
2680 ###### Grammar
2681
2682         $LEFT if $$ifelse
2683         ## expr precedence
2684
2685         $*exec
2686         Expression -> Expression if Expression else Expression $$ifelse ${ {
2687                         struct binode *b1 = new(binode);
2688                         struct binode *b2 = new(binode);
2689                         b1->op = CondExpr;
2690                         b1->left = $<3;
2691                         b1->right = b2;
2692                         b2->op = CondExpr;
2693                         b2->left = $<1;
2694                         b2->right = $<5;
2695                         $0 = b1;
2696                 } }$
2697                 ## expression grammar
2698
2699 ###### print binode cases
2700
2701         case CondExpr:
2702                 b2 = cast(binode, b->right);
2703                 if (bracket) printf("(");
2704                 print_exec(b2->left, -1, bracket);
2705                 printf(" if ");
2706                 print_exec(b->left, -1, bracket);
2707                 printf(" else ");
2708                 print_exec(b2->right, -1, bracket);
2709                 if (bracket) printf(")");
2710                 break;
2711
2712 ###### propagate binode cases
2713
2714         case CondExpr: {
2715                 /* cond must be Tbool, others must match */
2716                 struct binode *b2 = cast(binode, b->right);
2717                 struct type *t2;
2718
2719                 propagate_types(b->left, c, ok, Tbool, 0);
2720                 t = propagate_types(b2->left, c, ok, type, Rnolabel);
2721                 t2 = propagate_types(b2->right, c, ok, type ?: t, Rnolabel);
2722                 return t ?: t2;
2723         }
2724
2725 ###### interp binode cases
2726
2727         case CondExpr: {
2728                 struct binode *b2 = cast(binode, b->right);
2729                 left = interp_exec(c, b->left, &ltype);
2730                 if (left.bool)
2731                         rv = interp_exec(c, b2->left, &rvtype); // UNTESTED
2732                 else
2733                         rv = interp_exec(c, b2->right, &rvtype);
2734                 }
2735                 break;
2736
2737 ### Expressions: Boolean
2738
2739 The next class of expressions to use the `binode` will be Boolean
2740 expressions.  "`and then`" and "`or else`" are similar to `and` and `or`
2741 have same corresponding precendence.  The difference is that they don't
2742 evaluate the second expression if not necessary.
2743
2744 ###### Binode types
2745         And,
2746         AndThen,
2747         Or,
2748         OrElse,
2749         Not,
2750
2751 ###### expr precedence
2752         $LEFT or
2753         $LEFT and
2754         $LEFT not
2755
2756 ###### expression grammar
2757                 | Expression or Expression ${ {
2758                         struct binode *b = new(binode);
2759                         b->op = Or;
2760                         b->left = $<1;
2761                         b->right = $<3;
2762                         $0 = b;
2763                 } }$
2764                 | Expression or else Expression ${ {
2765                         struct binode *b = new(binode);
2766                         b->op = OrElse;
2767                         b->left = $<1;
2768                         b->right = $<4;
2769                         $0 = b;
2770                 } }$
2771
2772                 | Expression and Expression ${ {
2773                         struct binode *b = new(binode);
2774                         b->op = And;
2775                         b->left = $<1;
2776                         b->right = $<3;
2777                         $0 = b;
2778                 } }$
2779                 | Expression and then Expression ${ {
2780                         struct binode *b = new(binode);
2781                         b->op = AndThen;
2782                         b->left = $<1;
2783                         b->right = $<4;
2784                         $0 = b;
2785                 } }$
2786
2787                 | not Expression ${ {
2788                         struct binode *b = new(binode);
2789                         b->op = Not;
2790                         b->right = $<2;
2791                         $0 = b;
2792                 } }$
2793
2794 ###### print binode cases
2795         case And:
2796                 if (bracket) printf("(");
2797                 print_exec(b->left, -1, bracket);
2798                 printf(" and ");
2799                 print_exec(b->right, -1, bracket);
2800                 if (bracket) printf(")");
2801                 break;
2802         case AndThen:
2803                 if (bracket) printf("(");
2804                 print_exec(b->left, -1, bracket);
2805                 printf(" and then ");
2806                 print_exec(b->right, -1, bracket);
2807                 if (bracket) printf(")");
2808                 break;
2809         case Or:
2810                 if (bracket) printf("(");
2811                 print_exec(b->left, -1, bracket);
2812                 printf(" or ");
2813                 print_exec(b->right, -1, bracket);
2814                 if (bracket) printf(")");
2815                 break;
2816         case OrElse:
2817                 if (bracket) printf("(");
2818                 print_exec(b->left, -1, bracket);
2819                 printf(" or else ");
2820                 print_exec(b->right, -1, bracket);
2821                 if (bracket) printf(")");
2822                 break;
2823         case Not:
2824                 if (bracket) printf("(");
2825                 printf("not ");
2826                 print_exec(b->right, -1, bracket);
2827                 if (bracket) printf(")");
2828                 break;
2829
2830 ###### propagate binode cases
2831         case And:
2832         case AndThen:
2833         case Or:
2834         case OrElse:
2835         case Not:
2836                 /* both must be Tbool, result is Tbool */
2837                 propagate_types(b->left, c, ok, Tbool, 0);
2838                 propagate_types(b->right, c, ok, Tbool, 0);
2839                 if (type && type != Tbool)
2840                         type_err(c, "error: %1 operation found where %2 expected", prog,
2841                                    Tbool, 0, type);
2842                 return Tbool;
2843
2844 ###### interp binode cases
2845         case And:
2846                 rv = interp_exec(c, b->left, &rvtype);
2847                 right = interp_exec(c, b->right, &rtype);
2848                 rv.bool = rv.bool && right.bool;
2849                 break;
2850         case AndThen:
2851                 rv = interp_exec(c, b->left, &rvtype);
2852                 if (rv.bool)
2853                         rv = interp_exec(c, b->right, NULL);
2854                 break;
2855         case Or:
2856                 rv = interp_exec(c, b->left, &rvtype);
2857                 right = interp_exec(c, b->right, &rtype);
2858                 rv.bool = rv.bool || right.bool;
2859                 break;
2860         case OrElse:
2861                 rv = interp_exec(c, b->left, &rvtype);
2862                 if (!rv.bool)
2863                         rv = interp_exec(c, b->right, NULL);
2864                 break;
2865         case Not:
2866                 rv = interp_exec(c, b->right, &rvtype);
2867                 rv.bool = !rv.bool;
2868                 break;
2869
2870 ### Expressions: Comparison
2871
2872 Of slightly higher precedence that Boolean expressions are Comparisons.
2873 A comparison takes arguments of any comparable type, but the two types
2874 must be the same.
2875
2876 To simplify the parsing we introduce an `eop` which can record an
2877 expression operator, and the `CMPop` non-terminal will match one of them.
2878
2879 ###### ast
2880         struct eop {
2881                 enum Btype op;
2882         };
2883
2884 ###### ast functions
2885         static void free_eop(struct eop *e)
2886         {
2887                 if (e)
2888                         free(e);
2889         }
2890
2891 ###### Binode types
2892         Less,
2893         Gtr,
2894         LessEq,
2895         GtrEq,
2896         Eql,
2897         NEql,
2898
2899 ###### expr precedence
2900         $LEFT < > <= >= == != CMPop
2901
2902 ###### expression grammar
2903         | Expression CMPop Expression ${ {
2904                 struct binode *b = new(binode);
2905                 b->op = $2.op;
2906                 b->left = $<1;
2907                 b->right = $<3;
2908                 $0 = b;
2909         } }$
2910
2911 ###### Grammar
2912
2913         $eop
2914         CMPop ->   < ${ $0.op = Less; }$
2915                 |  > ${ $0.op = Gtr; }$
2916                 |  <= ${ $0.op = LessEq; }$
2917                 |  >= ${ $0.op = GtrEq; }$
2918                 |  == ${ $0.op = Eql; }$
2919                 |  != ${ $0.op = NEql; }$
2920
2921 ###### print binode cases
2922
2923         case Less:
2924         case LessEq:
2925         case Gtr:
2926         case GtrEq:
2927         case Eql:
2928         case NEql:
2929                 if (bracket) printf("(");
2930                 print_exec(b->left, -1, bracket);
2931                 switch(b->op) {
2932                 case Less:   printf(" < "); break;
2933                 case LessEq: printf(" <= "); break;
2934                 case Gtr:    printf(" > "); break;
2935                 case GtrEq:  printf(" >= "); break;
2936                 case Eql:    printf(" == "); break;
2937                 case NEql:   printf(" != "); break;
2938                 default: abort();               // NOTEST
2939                 }
2940                 print_exec(b->right, -1, bracket);
2941                 if (bracket) printf(")");
2942                 break;
2943
2944 ###### propagate binode cases
2945         case Less:
2946         case LessEq:
2947         case Gtr:
2948         case GtrEq:
2949         case Eql:
2950         case NEql:
2951                 /* Both must match but not be labels, result is Tbool */
2952                 t = propagate_types(b->left, c, ok, NULL, Rnolabel);
2953                 if (t)
2954                         propagate_types(b->right, c, ok, t, 0);
2955                 else {
2956                         t = propagate_types(b->right, c, ok, NULL, Rnolabel);   // UNTESTED
2957                         if (t)  // UNTESTED
2958                                 t = propagate_types(b->left, c, ok, t, 0);      // UNTESTED
2959                 }
2960                 if (!type_compat(type, Tbool, 0))
2961                         type_err(c, "error: Comparison returns %1 but %2 expected", prog,
2962                                     Tbool, rules, type);
2963                 return Tbool;
2964
2965 ###### interp binode cases
2966         case Less:
2967         case LessEq:
2968         case Gtr:
2969         case GtrEq:
2970         case Eql:
2971         case NEql:
2972         {
2973                 int cmp;
2974                 left = interp_exec(c, b->left, &ltype);
2975                 right = interp_exec(c, b->right, &rtype);
2976                 cmp = value_cmp(ltype, rtype, &left, &right);
2977                 rvtype = Tbool;
2978                 switch (b->op) {
2979                 case Less:      rv.bool = cmp <  0; break;
2980                 case LessEq:    rv.bool = cmp <= 0; break;
2981                 case Gtr:       rv.bool = cmp >  0; break;
2982                 case GtrEq:     rv.bool = cmp >= 0; break;
2983                 case Eql:       rv.bool = cmp == 0; break;
2984                 case NEql:      rv.bool = cmp != 0; break;
2985                 default:        rv.bool = 0; break;     // NOTEST
2986                 }
2987                 break;
2988         }
2989
2990 ### Expressions: The rest
2991
2992 The remaining expressions with the highest precedence are arithmetic,
2993 string concatenation, and string conversion.  String concatenation
2994 (`++`) has the same precedence as multiplication and division, but lower
2995 than the uniary.
2996
2997 String conversion is a temporary feature until I get a better type
2998 system.  `$` is a prefix operator which expects a string and returns
2999 a number.
3000
3001 `+` and `-` are both infix and prefix operations (where they are
3002 absolute value and negation).  These have different operator names.
3003
3004 We also have a 'Bracket' operator which records where parentheses were
3005 found.  This makes it easy to reproduce these when printing.  Possibly I
3006 should only insert brackets were needed for precedence.
3007
3008 ###### Binode types
3009         Plus, Minus,
3010         Times, Divide, Rem,
3011         Concat,
3012         Absolute, Negate,
3013         StringConv,
3014         Bracket,
3015
3016 ###### expr precedence
3017         $LEFT + - Eop
3018         $LEFT * / % ++ Top
3019         $LEFT Uop $
3020         $TERM ( )
3021
3022 ###### expression grammar
3023                 | Expression Eop Expression ${ {
3024                         struct binode *b = new(binode);
3025                         b->op = $2.op;
3026                         b->left = $<1;
3027                         b->right = $<3;
3028                         $0 = b;
3029                 } }$
3030
3031                 | Expression Top Expression ${ {
3032                         struct binode *b = new(binode);
3033                         b->op = $2.op;
3034                         b->left = $<1;
3035                         b->right = $<3;
3036                         $0 = b;
3037                 } }$
3038
3039                 | ( Expression ) ${ {
3040                         struct binode *b = new_pos(binode, $1);
3041                         b->op = Bracket;
3042                         b->right = $<2;
3043                         $0 = b;
3044                 } }$
3045                 | Uop Expression ${ {
3046                         struct binode *b = new(binode);
3047                         b->op = $1.op;
3048                         b->right = $<2;
3049                         $0 = b;
3050                 } }$
3051                 | Value ${ $0 = $<1; }$
3052                 | Variable ${ $0 = $<1; }$
3053
3054         $eop
3055         Eop ->    + ${ $0.op = Plus; }$
3056                 | - ${ $0.op = Minus; }$
3057
3058         Uop ->    + ${ $0.op = Absolute; }$
3059                 | - ${ $0.op = Negate; }$
3060                 | $ ${ $0.op = StringConv; }$
3061
3062         Top ->    * ${ $0.op = Times; }$
3063                 | / ${ $0.op = Divide; }$
3064                 | % ${ $0.op = Rem; }$
3065                 | ++ ${ $0.op = Concat; }$
3066
3067 ###### print binode cases
3068         case Plus:
3069         case Minus:
3070         case Times:
3071         case Divide:
3072         case Concat:
3073         case Rem:
3074                 if (bracket) printf("(");
3075                 print_exec(b->left, indent, bracket);
3076                 switch(b->op) {
3077                 case Plus:   fputs(" + ", stdout); break;
3078                 case Minus:  fputs(" - ", stdout); break;
3079                 case Times:  fputs(" * ", stdout); break;
3080                 case Divide: fputs(" / ", stdout); break;
3081                 case Rem:    fputs(" % ", stdout); break;
3082                 case Concat: fputs(" ++ ", stdout); break;
3083                 default: abort();       // NOTEST
3084                 }                       // NOTEST
3085                 print_exec(b->right, indent, bracket);
3086                 if (bracket) printf(")");
3087                 break;
3088         case Absolute:
3089         case Negate:
3090         case StringConv:
3091                 if (bracket) printf("(");
3092                 switch (b->op) {
3093                 case Absolute:   fputs("+", stdout); break;
3094                 case Negate:     fputs("-", stdout); break;
3095                 case StringConv: fputs("$", stdout); break;
3096                 default: abort();       // NOTEST
3097                 }                       // NOTEST
3098                 print_exec(b->right, indent, bracket);
3099                 if (bracket) printf(")");
3100                 break;
3101         case Bracket:
3102                 printf("(");
3103                 print_exec(b->right, indent, bracket);
3104                 printf(")");
3105                 break;
3106
3107 ###### propagate binode cases
3108         case Plus:
3109         case Minus:
3110         case Times:
3111         case Rem:
3112         case Divide:
3113                 /* both must be numbers, result is Tnum */
3114         case Absolute:
3115         case Negate:
3116                 /* as propagate_types ignores a NULL,
3117                  * unary ops fit here too */
3118                 propagate_types(b->left, c, ok, Tnum, 0);
3119                 propagate_types(b->right, c, ok, Tnum, 0);
3120                 if (!type_compat(type, Tnum, 0))
3121                         type_err(c, "error: Arithmetic returns %1 but %2 expected", prog,
3122                                    Tnum, rules, type);
3123                 return Tnum;
3124
3125         case Concat:
3126                 /* both must be Tstr, result is Tstr */
3127                 propagate_types(b->left, c, ok, Tstr, 0);
3128                 propagate_types(b->right, c, ok, Tstr, 0);
3129                 if (!type_compat(type, Tstr, 0))
3130                         type_err(c, "error: Concat returns %1 but %2 expected", prog,
3131                                    Tstr, rules, type);
3132                 return Tstr;
3133
3134         case StringConv:
3135                 /* op must be string, result is number */
3136                 propagate_types(b->left, c, ok, Tstr, 0);
3137                 if (!type_compat(type, Tnum, 0))
3138                         type_err(c,     // UNTESTED
3139                           "error: Can only convert string to number, not %1",
3140                                 prog, type, 0, NULL);
3141                 return Tnum;
3142
3143         case Bracket:
3144                 return propagate_types(b->right, c, ok, type, 0);
3145
3146 ###### interp binode cases
3147
3148         case Plus:
3149                 rv = interp_exec(c, b->left, &rvtype);
3150                 right = interp_exec(c, b->right, &rtype);
3151                 mpq_add(rv.num, rv.num, right.num);
3152                 break;
3153         case Minus:
3154                 rv = interp_exec(c, b->left, &rvtype);
3155                 right = interp_exec(c, b->right, &rtype);
3156                 mpq_sub(rv.num, rv.num, right.num);
3157                 break;
3158         case Times:
3159                 rv = interp_exec(c, b->left, &rvtype);
3160                 right = interp_exec(c, b->right, &rtype);
3161                 mpq_mul(rv.num, rv.num, right.num);
3162                 break;
3163         case Divide:
3164                 rv = interp_exec(c, b->left, &rvtype);
3165                 right = interp_exec(c, b->right, &rtype);
3166                 mpq_div(rv.num, rv.num, right.num);
3167                 break;
3168         case Rem: {
3169                 mpz_t l, r, rem;
3170
3171                 left = interp_exec(c, b->left, &ltype);
3172                 right = interp_exec(c, b->right, &rtype);
3173                 mpz_init(l); mpz_init(r); mpz_init(rem);
3174                 mpz_tdiv_q(l, mpq_numref(left.num), mpq_denref(left.num));
3175                 mpz_tdiv_q(r, mpq_numref(right.num), mpq_denref(right.num));
3176                 mpz_tdiv_r(rem, l, r);
3177                 val_init(Tnum, &rv);
3178                 mpq_set_z(rv.num, rem);
3179                 mpz_clear(r); mpz_clear(l); mpz_clear(rem);
3180                 rvtype = ltype;
3181                 break;
3182         }
3183         case Negate:
3184                 rv = interp_exec(c, b->right, &rvtype);
3185                 mpq_neg(rv.num, rv.num);
3186                 break;
3187         case Absolute:
3188                 rv = interp_exec(c, b->right, &rvtype);
3189                 mpq_abs(rv.num, rv.num);
3190                 break;
3191         case Bracket:
3192                 rv = interp_exec(c, b->right, &rvtype);
3193                 break;
3194         case Concat:
3195                 left = interp_exec(c, b->left, &ltype);
3196                 right = interp_exec(c, b->right, &rtype);
3197                 rvtype = Tstr;
3198                 rv.str = text_join(left.str, right.str);
3199                 break;
3200         case StringConv:
3201                 right = interp_exec(c, b->right, &rvtype);
3202                 rtype = Tstr;
3203                 rvtype = Tnum;
3204
3205                 struct text tx = right.str;
3206                 char tail[3];
3207                 int neg = 0;
3208                 if (tx.txt[0] == '-') {
3209                         neg = 1;        // UNTESTED
3210                         tx.txt++;       // UNTESTED
3211                         tx.len--;       // UNTESTED
3212                 }
3213                 if (number_parse(rv.num, tail, tx) == 0)
3214                         mpq_init(rv.num);       // UNTESTED
3215                 else if (neg)
3216                         mpq_neg(rv.num, rv.num);        // UNTESTED
3217                 if (tail[0])
3218                         printf("Unsupported suffix: %.*s\n", tx.len, tx.txt);   // UNTESTED
3219
3220                 break;
3221
3222 ###### value functions
3223
3224         static struct text text_join(struct text a, struct text b)
3225         {
3226                 struct text rv;
3227                 rv.len = a.len + b.len;
3228                 rv.txt = malloc(rv.len);
3229                 memcpy(rv.txt, a.txt, a.len);
3230                 memcpy(rv.txt+a.len, b.txt, b.len);
3231                 return rv;
3232         }
3233
3234 ### Blocks, Statements, and Statement lists.
3235
3236 Now that we have expressions out of the way we need to turn to
3237 statements.  There are simple statements and more complex statements.
3238 Simple statements do not contain (syntactic) newlines, complex statements do.
3239
3240 Statements often come in sequences and we have corresponding simple
3241 statement lists and complex statement lists.
3242 The former comprise only simple statements separated by semicolons.
3243 The later comprise complex statements and simple statement lists.  They are
3244 separated by newlines.  Thus the semicolon is only used to separate
3245 simple statements on the one line.  This may be overly restrictive,
3246 but I'm not sure I ever want a complex statement to share a line with
3247 anything else.
3248
3249 Note that a simple statement list can still use multiple lines if
3250 subsequent lines are indented, so
3251
3252 ###### Example: wrapped simple statement list
3253
3254         a = b; c = d;
3255            e = f; print g
3256
3257 is a single simple statement list.  This might allow room for
3258 confusion, so I'm not set on it yet.
3259
3260 A simple statement list needs no extra syntax.  A complex statement
3261 list has two syntactic forms.  It can be enclosed in braces (much like
3262 C blocks), or it can be introduced by an indent and continue until an
3263 unindented newline (much like Python blocks).  With this extra syntax
3264 it is referred to as a block.
3265
3266 Note that a block does not have to include any newlines if it only
3267 contains simple statements.  So both of:
3268
3269         if condition: a=b; d=f
3270
3271         if condition { a=b; print f }
3272
3273 are valid.
3274
3275 In either case the list is constructed from a `binode` list with
3276 `Block` as the operator.  When parsing the list it is most convenient
3277 to append to the end, so a list is a list and a statement.  When using
3278 the list it is more convenient to consider a list to be a statement
3279 and a list.  So we need a function to re-order a list.
3280 `reorder_bilist` serves this purpose.
3281
3282 The only stand-alone statement we introduce at this stage is `pass`
3283 which does nothing and is represented as a `NULL` pointer in a `Block`
3284 list.  Other stand-alone statements will follow once the infrastructure
3285 is in-place.
3286
3287 ###### Binode types
3288         Block,
3289
3290 ###### Grammar
3291
3292         $TERM { } ;
3293
3294         $*binode
3295         Block -> { IN OptNL Statementlist OUT OptNL } ${ $0 = $<Sl; }$
3296                 | { SimpleStatements } ${ $0 = reorder_bilist($<SS); }$
3297                 | SimpleStatements ; ${ $0 = reorder_bilist($<SS); }$
3298                 | SimpleStatements EOL ${ $0 = reorder_bilist($<SS); }$
3299                 | IN OptNL Statementlist OUT ${ $0 = $<Sl; }$
3300
3301         OpenBlock -> OpenScope { IN OptNL Statementlist OUT OptNL } ${ $0 = $<Sl; }$
3302                 | OpenScope { SimpleStatements } ${ $0 = reorder_bilist($<SS); }$
3303                 | OpenScope SimpleStatements ; ${ $0 = reorder_bilist($<SS); }$
3304                 | OpenScope SimpleStatements EOL ${ $0 = reorder_bilist($<SS); }$
3305                 | IN OpenScope OptNL Statementlist OUT ${ $0 = $<Sl; }$
3306
3307         UseBlock -> { OpenScope IN OptNL Statementlist OUT OptNL } ${ $0 = $<Sl; }$
3308                 | { OpenScope SimpleStatements } ${ $0 = reorder_bilist($<SS); }$
3309                 | IN OpenScope OptNL Statementlist OUT ${ $0 = $<Sl; }$
3310
3311         ColonBlock -> { IN OptNL Statementlist OUT OptNL } ${ $0 = $<Sl; }$
3312                 | { SimpleStatements } ${ $0 = reorder_bilist($<SS); }$
3313                 | : SimpleStatements ; ${ $0 = reorder_bilist($<SS); }$
3314                 | : SimpleStatements EOL ${ $0 = reorder_bilist($<SS); }$
3315                 | : IN OptNL Statementlist OUT ${ $0 = $<Sl; }$
3316
3317         Statementlist -> ComplexStatements ${ $0 = reorder_bilist($<CS); }$
3318
3319         ComplexStatements -> ComplexStatements ComplexStatement ${
3320                         if ($2 == NULL) {
3321                                 $0 = $<1;
3322                         } else {
3323                                 $0 = new(binode);
3324                                 $0->op = Block;
3325                                 $0->left = $<1;
3326                                 $0->right = $<2;
3327                         }
3328                 }$
3329                 | ComplexStatement ${
3330                         if ($1 == NULL) {
3331                                 $0 = NULL;
3332                         } else {
3333                                 $0 = new(binode);
3334                                 $0->op = Block;
3335                                 $0->left = NULL;
3336                                 $0->right = $<1;
3337                         }
3338                 }$
3339
3340         $*exec
3341         ComplexStatement -> SimpleStatements Newlines ${
3342                         $0 = reorder_bilist($<SS);
3343                         }$
3344                 |  SimpleStatements ; Newlines ${
3345                         $0 = reorder_bilist($<SS);
3346                         }$
3347                 ## ComplexStatement Grammar
3348
3349         $*binode
3350         SimpleStatements -> SimpleStatements ; SimpleStatement ${
3351                         $0 = new(binode);
3352                         $0->op = Block;
3353                         $0->left = $<1;
3354                         $0->right = $<3;
3355                         }$
3356                 | SimpleStatement ${
3357                         $0 = new(binode);
3358                         $0->op = Block;
3359                         $0->left = NULL;
3360                         $0->right = $<1;
3361                         }$
3362
3363         $TERM pass
3364         SimpleStatement -> pass ${ $0 = NULL; }$
3365                 | ERROR ${ tok_err(c, "Syntax error in statement", &$1); }$
3366                 ## SimpleStatement Grammar
3367
3368 ###### print binode cases
3369         case Block:
3370                 if (indent < 0) {
3371                         // simple statement
3372                         if (b->left == NULL)    // UNTESTED
3373                                 printf("pass"); // UNTESTED
3374                         else
3375                                 print_exec(b->left, indent, bracket);   // UNTESTED
3376                         if (b->right) { // UNTESTED
3377                                 printf("; ");   // UNTESTED
3378                                 print_exec(b->right, indent, bracket);  // UNTESTED
3379                         }
3380                 } else {
3381                         // block, one per line
3382                         if (b->left == NULL)
3383                                 do_indent(indent, "pass\n");
3384                         else
3385                                 print_exec(b->left, indent, bracket);
3386                         if (b->right)
3387                                 print_exec(b->right, indent, bracket);
3388                 }
3389                 break;
3390
3391 ###### propagate binode cases
3392         case Block:
3393         {
3394                 /* If any statement returns something other than Tnone
3395                  * or Tbool then all such must return same type.
3396                  * As each statement may be Tnone or something else,
3397                  * we must always pass NULL (unknown) down, otherwise an incorrect
3398                  * error might occur.  We never return Tnone unless it is
3399                  * passed in.
3400                  */
3401                 struct binode *e;
3402
3403                 for (e = b; e; e = cast(binode, e->right)) {
3404                         t = propagate_types(e->left, c, ok, NULL, rules);
3405                         if ((rules & Rboolok) && t == Tbool)
3406                                 t = NULL;
3407                         if (t && t != Tnone && t != Tbool) {
3408                                 if (!type)
3409                                         type = t;
3410                                 else if (t != type)
3411                                         type_err(c, "error: expected %1%r, found %2",
3412                                                  e->left, type, rules, t);
3413                         }
3414                 }
3415                 return type;
3416         }
3417
3418 ###### interp binode cases
3419         case Block:
3420                 while (rvtype == Tnone &&
3421                        b) {
3422                         if (b->left)
3423                                 rv = interp_exec(c, b->left, &rvtype);
3424                         b = cast(binode, b->right);
3425                 }
3426                 break;
3427
3428 ### The Print statement
3429
3430 `print` is a simple statement that takes a comma-separated list of
3431 expressions and prints the values separated by spaces and terminated
3432 by a newline.  No control of formatting is possible.
3433
3434 `print` faces the same list-ordering issue as blocks, and uses the
3435 same solution.
3436
3437 ###### Binode types
3438         Print,
3439
3440 ##### expr precedence
3441         $TERM print ,
3442
3443 ###### SimpleStatement Grammar
3444
3445         | print ExpressionList ${
3446                 $0 = reorder_bilist($<2);
3447         }$
3448         | print ExpressionList , ${
3449                 $0 = new(binode);
3450                 $0->op = Print;
3451                 $0->right = NULL;
3452                 $0->left = $<2;
3453                 $0 = reorder_bilist($0);
3454         }$
3455         | print ${
3456                 $0 = new(binode);
3457                 $0->op = Print;
3458                 $0->right = NULL;
3459         }$
3460
3461 ###### Grammar
3462
3463         $*binode
3464         ExpressionList -> ExpressionList , Expression ${
3465                 $0 = new(binode);
3466                 $0->op = Print;
3467                 $0->left = $<1;
3468                 $0->right = $<3;
3469                 }$
3470                 | Expression ${
3471                         $0 = new(binode);
3472                         $0->op = Print;
3473                         $0->left = NULL;
3474                         $0->right = $<1;
3475                 }$
3476
3477 ###### print binode cases
3478
3479         case Print:
3480                 do_indent(indent, "print");
3481                 while (b) {
3482                         if (b->left) {
3483                                 printf(" ");
3484                                 print_exec(b->left, -1, bracket);
3485                                 if (b->right)
3486                                         printf(",");
3487                         }
3488                         b = cast(binode, b->right);
3489                 }
3490                 if (indent >= 0)
3491                         printf("\n");
3492                 break;
3493
3494 ###### propagate binode cases
3495
3496         case Print:
3497                 /* don't care but all must be consistent */
3498                 propagate_types(b->left, c, ok, NULL, Rnolabel);
3499                 propagate_types(b->right, c, ok, NULL, Rnolabel);
3500                 break;
3501
3502 ###### interp binode cases
3503
3504         case Print:
3505         {
3506                 char sep = 0;
3507                 int eol = 1;
3508                 for ( ; b; b = cast(binode, b->right))
3509                         if (b->left) {
3510                                 if (sep)
3511                                         putchar(sep);
3512                                 left = interp_exec(c, b->left, &ltype);
3513                                 print_value(ltype, &left);
3514                                 free_value(ltype, &left);
3515                                 if (b->right)
3516                                         sep = ' ';
3517                         } else if (sep)
3518                                 eol = 0;
3519                 ltype = Tnone;
3520                 if (eol)
3521                         printf("\n");
3522                 break;
3523         }
3524
3525 ###### Assignment statement
3526
3527 An assignment will assign a value to a variable, providing it hasn't
3528 been declared as a constant.  The analysis phase ensures that the type
3529 will be correct so the interpreter just needs to perform the
3530 calculation.  There is a form of assignment which declares a new
3531 variable as well as assigning a value.  If a name is assigned before
3532 it is declared, and error will be raised as the name is created as
3533 `Tlabel` and it is illegal to assign to such names.
3534
3535 ###### Binode types
3536         Assign,
3537         Declare,
3538
3539 ###### declare terminals
3540         $TERM =
3541
3542 ###### SimpleStatement Grammar
3543         | Variable = Expression ${
3544                         $0 = new(binode);
3545                         $0->op = Assign;
3546                         $0->left = $<1;
3547                         $0->right = $<3;
3548                 }$
3549         | VariableDecl = Expression ${
3550                         $0 = new(binode);
3551                         $0->op = Declare;
3552                         $0->left = $<1;
3553                         $0->right =$<3;
3554                 }$
3555
3556         | VariableDecl ${
3557                         if ($1->var->where_set == NULL) {
3558                                 type_err(c,
3559                                          "Variable declared with no type or value: %v",
3560                                          $1, NULL, 0, NULL);
3561                         } else {
3562                                 $0 = new(binode);
3563                                 $0->op = Declare;
3564                                 $0->left = $<1;
3565                                 $0->right = NULL;
3566                         }
3567                 }$
3568
3569 ###### print binode cases
3570
3571         case Assign:
3572                 do_indent(indent, "");
3573                 print_exec(b->left, indent, bracket);
3574                 printf(" = ");
3575                 print_exec(b->right, indent, bracket);
3576                 if (indent >= 0)
3577                         printf("\n");
3578                 break;
3579
3580         case Declare:
3581                 {
3582                 struct variable *v = cast(var, b->left)->var;
3583                 do_indent(indent, "");
3584                 print_exec(b->left, indent, bracket);
3585                 if (cast(var, b->left)->var->constant) {
3586                         if (v->where_decl == v->where_set) {
3587                                 printf("::");
3588                                 type_print(v->type, stdout);
3589                                 printf(" ");
3590                         } else
3591                                 printf(" ::");
3592                 } else {
3593                         if (v->where_decl == v->where_set) {
3594                                 printf(":");
3595                                 type_print(v->type, stdout);
3596                                 printf(" ");
3597                         } else
3598                                 printf(" :");
3599                 }
3600                 if (b->right) {
3601                         printf("= ");
3602                         print_exec(b->right, indent, bracket);
3603                 }
3604                 if (indent >= 0)
3605                         printf("\n");
3606                 }
3607                 break;
3608
3609 ###### propagate binode cases
3610
3611         case Assign:
3612         case Declare:
3613                 /* Both must match and not be labels,
3614                  * Type must support 'dup',
3615                  * For Assign, left must not be constant.
3616                  * result is Tnone
3617                  */
3618                 t = propagate_types(b->left, c, ok, NULL,
3619                                     Rnolabel | (b->op == Assign ? Rnoconstant : 0));
3620                 if (!b->right)
3621                         return Tnone;
3622
3623                 if (t) {
3624                         if (propagate_types(b->right, c, ok, t, 0) != t)
3625                                 if (b->left->type == Xvar)
3626                                         type_err(c, "info: variable '%v' was set as %1 here.",
3627                                                  cast(var, b->left)->var->where_set, t, rules, NULL);
3628                 } else {
3629                         t = propagate_types(b->right, c, ok, NULL, Rnolabel);
3630                         if (t)
3631                                 propagate_types(b->left, c, ok, t,
3632                                                 (b->op == Assign ? Rnoconstant : 0));
3633                 }
3634                 if (t && t->dup == NULL)
3635                         type_err(c, "error: cannot assign value of type %1", b, t, 0, NULL);
3636                 return Tnone;
3637
3638                 break;
3639
3640 ###### interp binode cases
3641
3642         case Assign:
3643                 lleft = linterp_exec(c, b->left, &ltype);
3644                 right = interp_exec(c, b->right, &rtype);
3645                 if (lleft) {
3646                         free_value(ltype, lleft);
3647                         dup_value(ltype, &right, lleft);
3648                         ltype = NULL;
3649                 }
3650                 break;
3651
3652         case Declare:
3653         {
3654                 struct variable *v = cast(var, b->left)->var;
3655                 struct value *val;
3656                 v = v->merged;
3657                 val = var_value(c, v);
3658                 free_value(v->type, val);
3659                 if (v->type->prepare_type)
3660                         v->type->prepare_type(c, v->type, 0);
3661                 if (b->right) {
3662                         right = interp_exec(c, b->right, &rtype);
3663                         memcpy(val, &right, rtype->size);
3664                         rtype = Tnone;
3665                 } else {
3666                         val_init(v->type, val);
3667                 }
3668                 break;
3669         }
3670
3671 ### The `use` statement
3672
3673 The `use` statement is the last "simple" statement.  It is needed when
3674 the condition in a conditional statement is a block.  `use` works much
3675 like `return` in C, but only completes the `condition`, not the whole
3676 function.
3677
3678 ###### Binode types
3679         Use,
3680
3681 ###### expr precedence
3682         $TERM use
3683
3684 ###### SimpleStatement Grammar
3685         | use Expression ${
3686                 $0 = new_pos(binode, $1);
3687                 $0->op = Use;
3688                 $0->right = $<2;
3689                 if ($0->right->type == Xvar) {
3690                         struct var *v = cast(var, $0->right);
3691                         if (v->var->type == Tnone) {
3692                                 /* Convert this to a label */
3693                                 struct value *val;
3694
3695                                 v->var->type = Tlabel;
3696                                 val = global_alloc(c, Tlabel, v->var, NULL);
3697                                 val->label = val;
3698                         }
3699                 }
3700         }$
3701
3702 ###### print binode cases
3703
3704         case Use:
3705                 do_indent(indent, "use ");
3706                 print_exec(b->right, -1, bracket);
3707                 if (indent >= 0)
3708                         printf("\n");
3709                 break;
3710
3711 ###### propagate binode cases
3712
3713         case Use:
3714                 /* result matches value */
3715                 return propagate_types(b->right, c, ok, type, 0);
3716
3717 ###### interp binode cases
3718
3719         case Use:
3720                 rv = interp_exec(c, b->right, &rvtype);
3721                 break;
3722
3723 ### The Conditional Statement
3724
3725 This is the biggy and currently the only complex statement.  This
3726 subsumes `if`, `while`, `do/while`, `switch`, and some parts of `for`.
3727 It is comprised of a number of parts, all of which are optional though
3728 set combinations apply.  Each part is (usually) a key word (`then` is
3729 sometimes optional) followed by either an expression or a code block,
3730 except the `casepart` which is a "key word and an expression" followed
3731 by a code block.  The code-block option is valid for all parts and,
3732 where an expression is also allowed, the code block can use the `use`
3733 statement to report a value.  If the code block does not report a value
3734 the effect is similar to reporting `True`.
3735
3736 The `else` and `case` parts, as well as `then` when combined with
3737 `if`, can contain a `use` statement which will apply to some
3738 containing conditional statement. `for` parts, `do` parts and `then`
3739 parts used with `for` can never contain a `use`, except in some
3740 subordinate conditional statement.
3741
3742 If there is a `forpart`, it is executed first, only once.
3743 If there is a `dopart`, then it is executed repeatedly providing
3744 always that the `condpart` or `cond`, if present, does not return a non-True
3745 value.  `condpart` can fail to return any value if it simply executes
3746 to completion.  This is treated the same as returning `True`.
3747
3748 If there is a `thenpart` it will be executed whenever the `condpart`
3749 or `cond` returns True (or does not return any value), but this will happen
3750 *after* `dopart` (when present).
3751
3752 If `elsepart` is present it will be executed at most once when the
3753 condition returns `False` or some value that isn't `True` and isn't
3754 matched by any `casepart`.  If there are any `casepart`s, they will be
3755 executed when the condition returns a matching value.
3756
3757 The particular sorts of values allowed in case parts has not yet been
3758 determined in the language design, so nothing is prohibited.
3759
3760 The various blocks in this complex statement potentially provide scope
3761 for variables as described earlier.  Each such block must include the
3762 "OpenScope" nonterminal before parsing the block, and must call
3763 `var_block_close()` when closing the block.
3764
3765 The code following "`if`", "`switch`" and "`for`" does not get its own
3766 scope, but is in a scope covering the whole statement, so names
3767 declared there cannot be redeclared elsewhere.  Similarly the
3768 condition following "`while`" is in a scope the covers the body
3769 ("`do`" part) of the loop, and which does not allow conditional scope
3770 extension.  Code following "`then`" (both looping and non-looping),
3771 "`else`" and "`case`" each get their own local scope.
3772
3773 The type requirements on the code block in a `whilepart` are quite
3774 unusal.  It is allowed to return a value of some identifiable type, in
3775 which case the loop aborts and an appropriate `casepart` is run, or it
3776 can return a Boolean, in which case the loop either continues to the
3777 `dopart` (on `True`) or aborts and runs the `elsepart` (on `False`).
3778 This is different both from the `ifpart` code block which is expected to
3779 return a Boolean, or the `switchpart` code block which is expected to
3780 return the same type as the casepart values.  The correct analysis of
3781 the type of the `whilepart` code block is the reason for the
3782 `Rboolok` flag which is passed to `propagate_types()`.
3783
3784 The `cond_statement` cannot fit into a `binode` so a new `exec` is
3785 defined.
3786
3787 ###### exec type
3788         Xcond_statement,
3789
3790 ###### ast
3791         struct casepart {
3792                 struct exec *value;
3793                 struct exec *action;
3794                 struct casepart *next;
3795         };
3796         struct cond_statement {
3797                 struct exec;
3798                 struct exec *forpart, *condpart, *dopart, *thenpart, *elsepart;
3799                 struct casepart *casepart;
3800         };
3801
3802 ###### ast functions
3803
3804         static void free_casepart(struct casepart *cp)
3805         {
3806                 while (cp) {
3807                         struct casepart *t;
3808                         free_exec(cp->value);
3809                         free_exec(cp->action);
3810                         t = cp->next;
3811                         free(cp);
3812                         cp = t;
3813                 }
3814         }
3815
3816         static void free_cond_statement(struct cond_statement *s)
3817         {
3818                 if (!s)
3819                         return;
3820                 free_exec(s->forpart);
3821                 free_exec(s->condpart);
3822                 free_exec(s->dopart);
3823                 free_exec(s->thenpart);
3824                 free_exec(s->elsepart);
3825                 free_casepart(s->casepart);
3826                 free(s);
3827         }
3828
3829 ###### free exec cases
3830         case Xcond_statement: free_cond_statement(cast(cond_statement, e)); break;
3831
3832 ###### ComplexStatement Grammar
3833         | CondStatement ${ $0 = $<1; }$
3834
3835 ###### expr precedence
3836         $TERM for then while do
3837         $TERM else
3838         $TERM switch case
3839
3840 ###### Grammar
3841
3842         $*cond_statement
3843         // A CondStatement must end with EOL, as does CondSuffix and
3844         // IfSuffix.
3845         // ForPart, ThenPart, SwitchPart, CasePart are non-empty and
3846         // may or may not end with EOL
3847         // WhilePart and IfPart include an appropriate Suffix
3848
3849         // Both ForPart and Whilepart open scopes, and CondSuffix only
3850         // closes one - so in the first branch here we have another to close.
3851         CondStatement -> ForPart OptNL ThenPart OptNL WhilePart CondSuffix ${
3852                         $0 = $<CS;
3853                         $0->forpart = $<FP;
3854                         $0->thenpart = $<TP;
3855                         $0->condpart = $WP.condpart; $WP.condpart = NULL;
3856                         $0->dopart = $WP.dopart; $WP.dopart = NULL;
3857                         var_block_close(c, CloseSequential);
3858                         }$
3859                 | ForPart OptNL WhilePart CondSuffix ${
3860                         $0 = $<CS;
3861                         $0->forpart = $<FP;
3862                         $0->condpart = $WP.condpart; $WP.condpart = NULL;
3863                         $0->dopart = $WP.dopart; $WP.dopart = NULL;
3864                         var_block_close(c, CloseSequential);
3865                         }$
3866                 | WhilePart CondSuffix ${
3867                         $0 = $<CS;
3868                         $0->condpart = $WP.condpart; $WP.condpart = NULL;
3869                         $0->dopart = $WP.dopart; $WP.dopart = NULL;
3870                         }$
3871                 | SwitchPart OptNL CasePart CondSuffix ${
3872                         $0 = $<CS;
3873                         $0->condpart = $<SP;
3874                         $CP->next = $0->casepart;
3875                         $0->casepart = $<CP;
3876                         var_block_close(c, CloseSequential);
3877                         }$
3878                 | SwitchPart : IN OptNL CasePart CondSuffix OUT Newlines ${
3879                         $0 = $<CS;
3880                         $0->condpart = $<SP;
3881                         $CP->next = $0->casepart;
3882                         $0->casepart = $<CP;
3883                         var_block_close(c, CloseSequential);
3884                         }$
3885                 | IfPart IfSuffix ${
3886                         $0 = $<IS;
3887                         $0->condpart = $IP.condpart; $IP.condpart = NULL;
3888                         $0->thenpart = $IP.thenpart; $IP.thenpart = NULL;
3889                         // This is where we close an "if" statement
3890                         var_block_close(c, CloseSequential);
3891                         }$
3892
3893         CondSuffix -> IfSuffix ${
3894                         $0 = $<1;
3895                 }$
3896                 | Newlines CasePart CondSuffix ${
3897                         $0 = $<CS;
3898                         $CP->next = $0->casepart;
3899                         $0->casepart = $<CP;
3900                 }$
3901                 | CasePart CondSuffix ${
3902                         $0 = $<CS;
3903                         $CP->next = $0->casepart;
3904                         $0->casepart = $<CP;
3905                 }$
3906
3907         IfSuffix -> Newlines ${ $0 = new(cond_statement); }$
3908                 | Newlines ElsePart ${ $0 = $<EP; }$
3909                 | ElsePart ${$0 = $<EP; }$
3910
3911         ElsePart -> else OpenBlock Newlines ${
3912                         $0 = new(cond_statement);
3913                         $0->elsepart = $<OB;
3914                         var_block_close(c, CloseElse);
3915                 }$
3916                 | else OpenScope CondStatement ${
3917                         $0 = new(cond_statement);
3918                         $0->elsepart = $<CS;
3919                         var_block_close(c, CloseElse);
3920                 }$
3921
3922         $*casepart
3923         CasePart -> case Expression OpenScope ColonBlock ${
3924                         $0 = calloc(1,sizeof(struct casepart));
3925                         $0->value = $<Ex;
3926                         $0->action = $<Bl;
3927                         var_block_close(c, CloseParallel);
3928                 }$
3929
3930         $*exec
3931         // These scopes are closed in CondStatement
3932         ForPart -> for OpenBlock ${
3933                         $0 = $<Bl;
3934                 }$
3935
3936         ThenPart -> then OpenBlock ${
3937                         $0 = $<OB;
3938                         var_block_close(c, CloseSequential);
3939                 }$
3940
3941         $cond_statement
3942         // This scope is closed in CondStatement
3943         WhilePart -> while UseBlock OptNL do Block ${
3944                         $0.condpart = $<UB;
3945                         $0.dopart = $<Bl;
3946                         var_block_close(c, CloseSequential);
3947                 }$
3948                 | while OpenScope Expression ColonBlock ${
3949                         $0.condpart = $<Exp;
3950                         $0.dopart = $<Bl;
3951                         var_block_close(c, CloseSequential);
3952                 }$
3953
3954         IfPart -> if UseBlock OptNL then OpenBlock ClosePara ${
3955                         $0.condpart = $<UB;
3956                         $0.thenpart = $<Bl;
3957                 }$
3958                 | if OpenScope Expression OpenScope ColonBlock ClosePara ${
3959                         $0.condpart = $<Ex;
3960                         $0.thenpart = $<Bl;
3961                 }$
3962                 | if OpenScope Expression OpenScope OptNL then Block ClosePara ${
3963                         $0.condpart = $<Ex;
3964                         $0.thenpart = $<Bl;
3965                 }$
3966
3967         $*exec
3968         // This scope is closed in CondStatement
3969         SwitchPart -> switch OpenScope Expression ${
3970                         $0 = $<Ex;
3971                 }$
3972                 | switch UseBlock ${
3973                         $0 = $<Bl;
3974                 }$
3975
3976 ###### print exec cases
3977
3978         case Xcond_statement:
3979         {
3980                 struct cond_statement *cs = cast(cond_statement, e);
3981                 struct casepart *cp;
3982                 if (cs->forpart) {
3983                         do_indent(indent, "for");
3984                         if (bracket) printf(" {\n"); else printf("\n");
3985                         print_exec(cs->forpart, indent+1, bracket);
3986                         if (cs->thenpart) {
3987                                 if (bracket)
3988                                         do_indent(indent, "} then {\n");
3989                                 else
3990                                         do_indent(indent, "then\n");
3991                                 print_exec(cs->thenpart, indent+1, bracket);
3992                         }
3993                         if (bracket) do_indent(indent, "}\n");
3994                 }
3995                 if (cs->dopart) {
3996                         // a loop
3997                         if (cs->condpart && cs->condpart->type == Xbinode &&
3998                             cast(binode, cs->condpart)->op == Block) {
3999                                 if (bracket)
4000                                         do_indent(indent, "while {\n");
4001                                 else
4002                                         do_indent(indent, "while\n");
4003                                 print_exec(cs->condpart, indent+1, bracket);
4004                                 if (bracket)
4005                                         do_indent(indent, "} do {\n");
4006                                 else
4007                                         do_indent(indent, "do\n");
4008                                 print_exec(cs->dopart, indent+1, bracket);
4009                                 if (bracket)
4010                                         do_indent(indent, "}\n");
4011                         } else {
4012                                 do_indent(indent, "while ");
4013                                 print_exec(cs->condpart, 0, bracket);
4014                                 if (bracket)
4015                                         printf(" {\n");
4016                                 else
4017                                         printf(":\n");
4018                                 print_exec(cs->dopart, indent+1, bracket);
4019                                 if (bracket)
4020                                         do_indent(indent, "}\n");
4021                         }
4022                 } else {
4023                         // a condition
4024                         if (cs->casepart)
4025                                 do_indent(indent, "switch");
4026                         else
4027                                 do_indent(indent, "if");
4028                         if (cs->condpart && cs->condpart->type == Xbinode &&
4029                             cast(binode, cs->condpart)->op == Block) {
4030                                 if (bracket)    // UNTESTED
4031                                         printf(" {\n"); // UNTESTED
4032                                 else
4033                                         printf(":\n");  // UNTESTED
4034                                 print_exec(cs->condpart, indent+1, bracket);    // UNTESTED
4035                                 if (bracket)    // UNTESTED
4036                                         do_indent(indent, "}\n");       // UNTESTED
4037                                 if (cs->thenpart) {     // UNTESTED
4038                                         do_indent(indent, "then:\n");   // UNTESTED
4039                                         print_exec(cs->thenpart, indent+1, bracket);    // UNTESTED
4040                                 }
4041                         } else {
4042                                 printf(" ");
4043                                 print_exec(cs->condpart, 0, bracket);
4044                                 if (cs->thenpart) {
4045                                         if (bracket)
4046                                                 printf(" {\n");
4047                                         else
4048                                                 printf(":\n");
4049                                         print_exec(cs->thenpart, indent+1, bracket);
4050                                         if (bracket)
4051                                                 do_indent(indent, "}\n");
4052                                 } else
4053                                         printf("\n");
4054                         }
4055                 }
4056                 for (cp = cs->casepart; cp; cp = cp->next) {
4057                         do_indent(indent, "case ");
4058                         print_exec(cp->value, -1, 0);
4059                         if (bracket)
4060                                 printf(" {\n");
4061                         else
4062                                 printf(":\n");
4063                         print_exec(cp->action, indent+1, bracket);
4064                         if (bracket)
4065                                 do_indent(indent, "}\n");
4066                 }
4067                 if (cs->elsepart) {
4068                         do_indent(indent, "else");
4069                         if (bracket)
4070                                 printf(" {\n");
4071                         else
4072                                 printf("\n");
4073                         print_exec(cs->elsepart, indent+1, bracket);
4074                         if (bracket)
4075                                 do_indent(indent, "}\n");
4076                 }
4077                 break;
4078         }
4079
4080 ###### propagate exec cases
4081         case Xcond_statement:
4082         {
4083                 // forpart and dopart must return Tnone
4084                 // thenpart must return Tnone if there is a dopart,
4085                 // otherwise it is like elsepart.
4086                 // condpart must:
4087                 //    be bool if there is no casepart
4088                 //    match casepart->values if there is a switchpart
4089                 //    either be bool or match casepart->value if there
4090                 //             is a whilepart
4091                 // elsepart and casepart->action must match the return type
4092                 //   expected of this statement.
4093                 struct cond_statement *cs = cast(cond_statement, prog);
4094                 struct casepart *cp;
4095
4096                 t = propagate_types(cs->forpart, c, ok, Tnone, 0);
4097                 if (!type_compat(Tnone, t, 0))
4098                         *ok = 0;        // UNTESTED
4099                 t = propagate_types(cs->dopart, c, ok, Tnone, 0);
4100                 if (!type_compat(Tnone, t, 0))
4101                         *ok = 0;        // UNTESTED
4102                 if (cs->dopart) {
4103                         t = propagate_types(cs->thenpart, c, ok, Tnone, 0);
4104                         if (!type_compat(Tnone, t, 0))
4105                                 *ok = 0;        // UNTESTED
4106                 }
4107                 if (cs->casepart == NULL)
4108                         propagate_types(cs->condpart, c, ok, Tbool, 0);
4109                 else {
4110                         /* Condpart must match case values, with bool permitted */
4111                         t = NULL;
4112                         for (cp = cs->casepart;
4113                              cp && !t; cp = cp->next)
4114                                 t = propagate_types(cp->value, c, ok, NULL, 0);
4115                         if (!t && cs->condpart)
4116                                 t = propagate_types(cs->condpart, c, ok, NULL, Rboolok);        // UNTESTED
4117                         // Now we have a type (I hope) push it down
4118                         if (t) {
4119                                 for (cp = cs->casepart; cp; cp = cp->next)
4120                                         propagate_types(cp->value, c, ok, t, 0);
4121                                 propagate_types(cs->condpart, c, ok, t, Rboolok);
4122                         }
4123                 }
4124                 // (if)then, else, and case parts must return expected type.
4125                 if (!cs->dopart && !type)
4126                         type = propagate_types(cs->thenpart, c, ok, NULL, rules);
4127                 if (!type)
4128                         type = propagate_types(cs->elsepart, c, ok, NULL, rules);
4129                 for (cp = cs->casepart;
4130                      cp && !type;
4131                      cp = cp->next)     // UNTESTED
4132                         type = propagate_types(cp->action, c, ok, NULL, rules); // UNTESTED
4133                 if (type) {
4134                         if (!cs->dopart)
4135                                 propagate_types(cs->thenpart, c, ok, type, rules);
4136                         propagate_types(cs->elsepart, c, ok, type, rules);
4137                         for (cp = cs->casepart; cp ; cp = cp->next)
4138                                 propagate_types(cp->action, c, ok, type, rules);
4139                         return type;
4140                 } else
4141                         return NULL;
4142         }
4143
4144 ###### interp exec cases
4145         case Xcond_statement:
4146         {
4147                 struct value v, cnd;
4148                 struct type *vtype, *cndtype;
4149                 struct casepart *cp;
4150                 struct cond_statement *cs = cast(cond_statement, e);
4151
4152                 if (cs->forpart)
4153                         interp_exec(c, cs->forpart, NULL);
4154                 do {
4155                         if (cs->condpart)
4156                                 cnd = interp_exec(c, cs->condpart, &cndtype);
4157                         else
4158                                 cndtype = Tnone;        // UNTESTED
4159                         if (!(cndtype == Tnone ||
4160                               (cndtype == Tbool && cnd.bool != 0)))
4161                                 break;
4162                         // cnd is Tnone or Tbool, doesn't need to be freed
4163                         if (cs->dopart)
4164                                 interp_exec(c, cs->dopart, NULL);
4165
4166                         if (cs->thenpart) {
4167                                 rv = interp_exec(c, cs->thenpart, &rvtype);
4168                                 if (rvtype != Tnone || !cs->dopart)
4169                                         goto Xcond_done;
4170                                 free_value(rvtype, &rv);
4171                                 rvtype = Tnone;
4172                         }
4173                 } while (cs->dopart);
4174
4175                 for (cp = cs->casepart; cp; cp = cp->next) {
4176                         v = interp_exec(c, cp->value, &vtype);
4177                         if (value_cmp(cndtype, vtype, &v, &cnd) == 0) {
4178                                 free_value(vtype, &v);
4179                                 free_value(cndtype, &cnd);
4180                                 rv = interp_exec(c, cp->action, &rvtype);
4181                                 goto Xcond_done;
4182                         }
4183                         free_value(vtype, &v);
4184                 }
4185                 free_value(cndtype, &cnd);
4186                 if (cs->elsepart)
4187                         rv = interp_exec(c, cs->elsepart, &rvtype);
4188                 else
4189                         rvtype = Tnone;
4190         Xcond_done:
4191                 break;
4192         }
4193
4194 ### Top level structure
4195
4196 All the language elements so far can be used in various places.  Now
4197 it is time to clarify what those places are.
4198
4199 At the top level of a file there will be a number of declarations.
4200 Many of the things that can be declared haven't been described yet,
4201 such as functions, procedures, imports, and probably more.
4202 For now there are two sorts of things that can appear at the top
4203 level.  They are predefined constants, `struct` types, and the `main`
4204 function.  While the syntax will allow the `main` function to appear
4205 multiple times, that will trigger an error if it is actually attempted.
4206
4207 The various declarations do not return anything.  They store the
4208 various declarations in the parse context.
4209
4210 ###### Parser: grammar
4211
4212         $void
4213         Ocean -> OptNL DeclarationList
4214
4215         ## declare terminals
4216
4217         OptNL ->
4218                 | OptNL NEWLINE
4219         Newlines -> NEWLINE
4220                 | Newlines NEWLINE
4221
4222         DeclarationList -> Declaration
4223                 | DeclarationList Declaration
4224
4225         Declaration -> ERROR Newlines ${
4226                         tok_err(c,      // UNTESTED
4227                                 "error: unhandled parse error", &$1);
4228                 }$
4229                 | DeclareConstant
4230                 | DeclareFunction
4231                 | DeclareStruct
4232
4233         ## top level grammar
4234
4235         ## Grammar
4236
4237 ### The `const` section
4238
4239 As well as being defined in with the code that uses them, constants
4240 can be declared at the top level.  These have full-file scope, so they
4241 are always `InScope`.  The value of a top level constant can be given
4242 as an expression, and this is evaluated immediately rather than in the
4243 later interpretation stage.  Once we add functions to the language, we
4244 will need rules concern which, if any, can be used to define a top
4245 level constant.
4246
4247 Constants are defined in a section that starts with the reserved word
4248 `const` and then has a block with a list of assignment statements.
4249 For syntactic consistency, these must use the double-colon syntax to
4250 make it clear that they are constants.  Type can also be given: if
4251 not, the type will be determined during analysis, as with other
4252 constants.
4253
4254 As the types constants are inserted at the head of a list, printing
4255 them in the same order that they were read is not straight forward.
4256 We take a quadratic approach here and count the number of constants
4257 (variables of depth 0), then count down from there, each time
4258 searching through for the Nth constant for decreasing N.
4259
4260 ###### top level grammar
4261
4262         $TERM const
4263
4264         DeclareConstant -> const { IN OptNL ConstList OUT OptNL } Newlines
4265                 | const { SimpleConstList } Newlines
4266                 | const IN OptNL ConstList OUT Newlines
4267                 | const SimpleConstList Newlines
4268
4269         ConstList -> ConstList SimpleConstLine
4270                 | SimpleConstLine
4271         SimpleConstList -> SimpleConstList ; Const
4272                 | Const
4273                 | SimpleConstList ;
4274         SimpleConstLine -> SimpleConstList Newlines
4275                 | ERROR Newlines ${ tok_err(c, "Syntax error in constant", &$1); }$
4276
4277         $*type
4278         CType -> Type   ${ $0 = $<1; }$
4279                 |       ${ $0 = NULL; }$
4280         $void
4281         Const -> IDENTIFIER :: CType = Expression ${ {
4282                 int ok;
4283                 struct variable *v;
4284
4285                 v = var_decl(c, $1.txt);
4286                 if (v) {
4287                         struct var *var = new_pos(var, $1);
4288                         v->where_decl = var;
4289                         v->where_set = var;
4290                         var->var = v;
4291                         v->constant = 1;
4292                 } else {
4293                         v = var_ref(c, $1.txt);
4294                         tok_err(c, "error: name already declared", &$1);
4295                         type_err(c, "info: this is where '%v' was first declared",
4296                                  v->where_decl, NULL, 0, NULL);
4297                 }
4298                 do {
4299                         ok = 1;
4300                         propagate_types($5, c, &ok, $3, 0);
4301                 } while (ok == 2);
4302                 if (!ok)
4303                         c->parse_error = 1;
4304                 else if (v) {
4305                         struct value res = interp_exec(c, $5, &v->type);
4306                         global_alloc(c, v->type, v, &res);
4307                 }
4308         } }$
4309
4310 ###### print const decls
4311         {
4312                 struct variable *v;
4313                 int target = -1;
4314
4315                 while (target != 0) {
4316                         int i = 0;
4317                         for (v = context.in_scope; v; v=v->in_scope)
4318                                 if (v->depth == 0) {
4319                                         i += 1;
4320                                         if (i == target)
4321                                                 break;
4322                                 }
4323
4324                         if (target == -1) {
4325                                 if (i)
4326                                         printf("const\n");
4327                                 target = i;
4328                         } else {
4329                                 struct value *val = var_value(&context, v);
4330                                 printf("    %.*s :: ", v->name->name.len, v->name->name.txt);
4331                                 type_print(v->type, stdout);
4332                                 printf(" = ");
4333                                 if (v->type == Tstr)
4334                                         printf("\"");
4335                                 print_value(v->type, val);
4336                                 if (v->type == Tstr)
4337                                         printf("\"");
4338                                 printf("\n");
4339                                 target -= 1;
4340                         }
4341                 }
4342         }
4343
4344 ### Finally the whole `main` function.
4345
4346 An Ocean program can currently have only one function - `main` - and
4347 that must exist.  It expects an array of strings with a provided size.
4348 Following this is a `block` which is the code to execute.
4349
4350 As this is the top level, several things are handled a bit
4351 differently.
4352 The function is not interpreted by `interp_exec` as that isn't
4353 passed the argument list which the program requires.  Similarly type
4354 analysis is a bit more interesting at this level.
4355
4356 ###### top level grammar
4357
4358         DeclareFunction -> MainFunction ${ {
4359                 if (c->prog)
4360                         type_err(c, "\"main\" defined a second time",
4361                                  $1, NULL, 0, NULL);
4362                 else
4363                         c->prog = $<1;
4364         } }$
4365
4366 ###### print binode cases
4367         case Func:
4368         case List:
4369                 do_indent(indent, "func main(");
4370                 for (b2 = cast(binode, b->left); b2; b2 = cast(binode, b2->right)) {
4371                         struct variable *v = cast(var, b2->left)->var;
4372                         printf(" ");
4373                         print_exec(b2->left, 0, 0);
4374                         printf(":");
4375                         type_print(v->type, stdout);
4376                 }
4377                 if (bracket)
4378                         printf(") {\n");
4379                 else
4380                         printf(")\n");
4381                 print_exec(b->right, indent+1, bracket);
4382                 if (bracket)
4383                         do_indent(indent, "}\n");
4384                 break;
4385
4386 ###### propagate binode cases
4387         case List:
4388         case Func: abort();             // NOTEST
4389
4390 ###### core functions
4391
4392         static int analyse_prog(struct exec *prog, struct parse_context *c)
4393         {
4394                 struct binode *bp = cast(binode, prog);
4395                 struct binode *b;
4396                 int ok = 1;
4397                 int arg = 0;
4398                 struct type *argv_type;
4399                 struct text argv_type_name = { " argv", 5 };
4400
4401                 if (!bp)
4402                         return 0;       // NOTEST
4403
4404                 argv_type = add_type(c, argv_type_name, &array_prototype);
4405                 argv_type->array.member = Tstr;
4406                 argv_type->array.unspec = 1;
4407
4408                 for (b = cast(binode, bp->left); b; b = cast(binode, b->right)) {
4409                         ok = 1;
4410                         switch (arg++) {
4411                         case 0: /* argv */
4412                                 propagate_types(b->left, c, &ok, argv_type, 0);
4413                                 break;
4414                         default: /* invalid */  // NOTEST
4415                                 propagate_types(b->left, c, &ok, Tnone, 0);     // NOTEST
4416                         }
4417                 }
4418
4419                 do {
4420                         ok = 1;
4421                         propagate_types(bp->right, c, &ok, Tnone, 0);
4422                 } while (ok == 2);
4423                 if (!ok)
4424                         return 0;
4425
4426                 /* Make sure everything is still consistent */
4427                 propagate_types(bp->right, c, &ok, Tnone, 0);
4428                 if (!ok)
4429                         return 0;       // UNTESTED
4430                 scope_finalize(c);
4431                 return 1;
4432         }
4433
4434         static void interp_prog(struct parse_context *c, struct exec *prog,
4435                                 int argc, char **argv)
4436         {
4437                 struct binode *p = cast(binode, prog);
4438                 struct binode *al;
4439                 int anum = 0;
4440                 struct value v;
4441                 struct type *vtype;
4442
4443                 if (!prog)
4444                         return;         // NOTEST
4445                 al = cast(binode, p->left);
4446                 while (al) {
4447                         struct var *v = cast(var, al->left);
4448                         struct value *vl = var_value(c, v->var);
4449                         struct value arg;
4450                         struct type *t;
4451                         mpq_t argcq;
4452                         int i;
4453
4454                         switch (anum++) {
4455                         case 0: /* argv */
4456                                 t = v->var->type;
4457                                 mpq_init(argcq);
4458                                 mpq_set_ui(argcq, argc, 1);
4459                                 memcpy(var_value(c, t->array.vsize), &argcq, sizeof(argcq));
4460                                 t->prepare_type(c, t, 0);
4461                                 array_init(v->var->type, vl);
4462                                 for (i = 0; i < argc; i++) {
4463                                         struct value *vl2 = vl->array + i * v->var->type->array.member->size;
4464
4465
4466                                         arg.str.txt = argv[i];
4467                                         arg.str.len = strlen(argv[i]);
4468                                         free_value(Tstr, vl2);
4469                                         dup_value(Tstr, &arg, vl2);
4470                                 }
4471                                 break;
4472                         }
4473                         al = cast(binode, al->right);
4474                 }
4475                 v = interp_exec(c, p->right, &vtype);
4476                 free_value(vtype, &v);
4477         }
4478
4479 ###### interp binode cases
4480         case List:
4481         case Func: abort();     // NOTEST
4482
4483 ## And now to test it out.
4484
4485 Having a language requires having a "hello world" program.  I'll
4486 provide a little more than that: a program that prints "Hello world"
4487 finds the GCD of two numbers, prints the first few elements of
4488 Fibonacci, performs a binary search for a number, and a few other
4489 things which will likely grow as the languages grows.
4490
4491 ###### File: oceani.mk
4492         demos :: sayhello
4493         sayhello : oceani
4494                 @echo "===== DEMO ====="
4495                 ./oceani --section "demo: hello" oceani.mdc 55 33
4496
4497 ###### demo: hello
4498
4499         const
4500                 pi ::= 3.141_592_6
4501                 four ::= 2 + 2 ; five ::= 10/2
4502         const pie ::= "I like Pie";
4503                 cake ::= "The cake is"
4504                   ++ " a lie"
4505
4506         struct fred
4507                 size:[four]number
4508                 name:string
4509                 alive:Boolean
4510
4511         func main
4512                 argv:[argc::]string
4513         do
4514                 print "Hello World, what lovely oceans you have!"
4515                 print "Are there", five, "?"
4516                 print pi, pie, "but", cake
4517
4518                 A := $argv[1]; B := $argv[2]
4519
4520                 /* When a variable is defined in both branches of an 'if',
4521                  * and used afterwards, the variables are merged.
4522                  */
4523                 if A > B:
4524                         bigger := "yes"
4525                 else
4526                         bigger := "no"
4527                 print "Is", A, "bigger than", B,"? ", bigger
4528                 /* If a variable is not used after the 'if', no
4529                  * merge happens, so types can be different
4530                  */
4531                 if A > B * 2:
4532                         double:string = "yes"
4533                         print A, "is more than twice", B, "?", double
4534                 else
4535                         double := B*2
4536                         print "double", B, "is", double
4537
4538                 a : number
4539                 a = A;
4540                 b:number = B
4541                 if a > 0 and then b > 0:
4542                         while a != b:
4543                                 if a < b:
4544                                         b = b - a
4545                                 else
4546                                         a = a - b
4547                         print "GCD of", A, "and", B,"is", a
4548                 else if a <= 0:
4549                         print a, "is not positive, cannot calculate GCD"
4550                 else
4551                         print b, "is not positive, cannot calculate GCD"
4552
4553                 for
4554                         togo := 10
4555                         f1 := 1; f2 := 1
4556                         print "Fibonacci:", f1,f2,
4557                 then togo = togo - 1
4558                 while togo > 0:
4559                         f3 := f1 + f2
4560                         print "", f3,
4561                         f1 = f2
4562                         f2 = f3
4563                 print ""
4564
4565                 /* Binary search... */
4566                 for
4567                         lo:= 0; hi := 100
4568                         target := 77
4569                 while
4570                         mid := (lo + hi) / 2
4571                         if mid == target:
4572                                 use Found
4573                         if mid < target:
4574                                 lo = mid
4575                         else
4576                                 hi = mid
4577                         if hi - lo < 1:
4578                                 lo = mid
4579                                 use GiveUp
4580                         use True
4581                 do pass
4582                 case Found:
4583                         print "Yay, I found", target
4584                 case GiveUp:
4585                         print "Closest I found was", lo
4586
4587                 size::= 10
4588                 list:[size]number
4589                 list[0] = 1234
4590                 // "middle square" PRNG.  Not particularly good, but one my
4591                 // Dad taught me - the first one I ever heard of.
4592                 for i:=1; then i = i + 1; while i < size:
4593                         n := list[i-1] * list[i-1]
4594                         list[i] = (n / 100) % 10 000
4595
4596                 print "Before sort:",
4597                 for i:=0; then i = i + 1; while i < size:
4598                         print "", list[i],
4599                 print
4600
4601                 for i := 1; then i=i+1; while i < size:
4602                         for j:=i-1; then j=j-1; while j >= 0:
4603                                 if list[j] > list[j+1]:
4604                                         t:= list[j]
4605                                         list[j] = list[j+1]
4606                                         list[j+1] = t
4607                 print " After sort:",
4608                 for i:=0; then i = i + 1; while i < size:
4609                         print "", list[i],
4610                 print
4611
4612                 if 1 == 2 then print "yes"; else print "no"
4613
4614                 bob:fred
4615                 bob.name = "Hello"
4616                 bob.alive = (bob.name == "Hello")
4617                 print "bob", "is" if  bob.alive else "isn't", "alive"