ocean-lang.org Git - ocean/blob - csrc/oceani.mdc

   1 # Ocean Interpreter - Jamison Creek version
   2
   3 Ocean is intended to be a compiled language, so this interpreter is
   4 not targeted at being the final product.  It is, rather, an intermediate
   5 stage and fills that role in two distinct ways.
   6
   7 Firstly, it exists as a platform to experiment with the early language
   8 design.  An interpreter is easy to write and easy to get working, so
   9 the barrier for entry is lower if I aim to start with an interpreter.
  10
  11 Secondly, the plan for the Ocean compiler is to write it in the
  12 [Ocean language](http://ocean-lang.org).  To achieve this we naturally
  13 need some sort of boot-strap process and this interpreter - written in
  14 portable C - will fill that role.  It will be used to bootstrap the
  15 Ocean compiler.
  16
  17 Two features that are not needed to fill either of these roles are
  18 performance and completeness.  The interpreter only needs to be fast
  19 enough to run small test programs and occasionally to run the compiler
  20 on itself.  It only needs to be complete enough to test aspects of the
  21 design which are developed before the compiler is working, and to run
  22 the compiler on itself.  Any features not used by the compiler when
  23 compiling itself are superfluous.  They may be included anyway, but
  24 they may not.
  25
  26 Nonetheless, the interpreter should end up being reasonably complete,
  27 and any performance bottlenecks which appear and are easily fixed, will
  28 be.
  29
  30 ## Current version
  31
  32 This third version of the interpreter exists to test out some initial
  33 ideas relating to types.  Particularly it adds arrays (indexed from
  34 zero) and simple structures.  Basic control flow and variable scoping
  35 are already fairly well established, as are basic numerical and
  36 boolean operators.
  37
  38 Some operators that have only recently been added, and so have not
  39 generated all that much experience yet are "and then" and "or else" as
  40 short-circuit Boolean operators, and the "if ... else" trinary
  41 operator which can select between two expressions based on a third
  42 (which appears syntactically in the middle).
  43
  44 The "func" clause currently only allows a "main" function to be
  45 declared.  That will be extended when proper function support is added.
  46
  47 An element that is present purely to make a usable language, and
  48 without any expectation that they will remain, is the "print" statement
  49 which performs simple output.
  50
  51 The current scalar types are "number", "Boolean", and "string".
  52 Boolean will likely stay in its current form, the other two might, but
  53 could just as easily be changed.
  54
  55 ## Naming
  56
  57 Versions of the interpreter which obviously do not support a complete
  58 language will be named after creeks and streams.  This one is Jamison
  59 Creek.
  60
  61 Once we have something reasonably resembling a complete language, the
  62 names of rivers will be used.
  63 Early versions of the compiler will be named after seas.  Major
  64 releases of the compiler will be named after oceans.  Hopefully I will
  65 be finished once I get to the Pacific Ocean release.
  66
  67 ## Outline
  68
  69 As well as parsing and executing a program, the interpreter can print
  70 out the program from the parsed internal structure.  This is useful
  71 for validating the parsing.
  72 So the main requirements of the interpreter are:
  73
  74 - Parse the program, possibly with tracing,
  75 - Analyse the parsed program to ensure consistency,
  76 - Print the program,
  77 - Execute the "main" function in the program, if no parsing or
  78   consistency errors were found.
  79
  80 This is all performed by a single C program extracted with
  81 `parsergen`.
  82
  83 There will be two formats for printing the program: a default and one
  84 that uses bracketing.  So a `--bracket` command line option is needed
  85 for that.  Normally the first code section found is used, however an
  86 alternate section can be requested so that a file (such as this one)
  87 can contain multiple programs.  This is effected with the `--section`
  88 option.
  89
  90 This code must be compiled with `-fplan9-extensions` so that anonymous
  91 structures can be used.
  92
  93 ###### File: oceani.mk
  94
  95         myCFLAGS := -Wall -g -fplan9-extensions
  96         CFLAGS := $(filter-out $(myCFLAGS),$(CFLAGS)) $(myCFLAGS)
  97         myLDLIBS:= libparser.o libscanner.o libmdcode.o -licuuc
  98         LDLIBS := $(filter-out $(myLDLIBS),$(LDLIBS)) $(myLDLIBS)
  99         ## libs
 100         all :: $(LDLIBS) oceani
 101         oceani.c oceani.h : oceani.mdc parsergen
 102                 ./parsergen -o oceani --LALR --tag Parser oceani.mdc
 103         oceani.mk: oceani.mdc md2c
 104                 ./md2c oceani.mdc
 105
 106         oceani: oceani.o $(LDLIBS)
 107                 $(CC) $(CFLAGS) -o oceani oceani.o $(LDLIBS)
 108
 109 ###### Parser: header
 110         ## macros
 111         struct parse_context;
 112         ## ast
 113         struct parse_context {
 114                 struct token_config config;
 115                 char *file_name;
 116                 int parse_error;
 117                 struct exec *prog;
 118                 ## parse context
 119         };
 120
 121 ###### macros
 122
 123         #define container_of(ptr, type, member) ({                      \
 124                 const typeof( ((type *)0)->member ) *__mptr = (ptr);    \
 125                 (type *)( (char *)__mptr - offsetof(type,member) );})
 126
 127         #define config2context(_conf) container_of(_conf, struct parse_context, \
 128                 config)
 129
 130 ###### Parser: reduce
 131         struct parse_context *c = config2context(config);
 132
 133 ###### Parser: code
 134
 135         #include <unistd.h>
 136         #include <stdlib.h>
 137         #include <fcntl.h>
 138         #include <errno.h>
 139         #include <sys/mman.h>
 140         #include <string.h>
 141         #include <stdio.h>
 142         #include <locale.h>
 143         #include <malloc.h>
 144         #include "mdcode.h"
 145         #include "scanner.h"
 146         #include "parser.h"
 147
 148         ## includes
 149
 150         #include "oceani.h"
 151
 152         ## forward decls
 153         ## value functions
 154         ## ast functions
 155         ## core functions
 156
 157         #include <getopt.h>
 158         static char Usage[] =
 159                 "Usage: oceani --trace --print --noexec --brackets --section=SectionName prog.ocn\n";
 160         static const struct option long_options[] = {
 161                 {"trace",     0, NULL, 't'},
 162                 {"print",     0, NULL, 'p'},
 163                 {"noexec",    0, NULL, 'n'},
 164                 {"brackets",  0, NULL, 'b'},
 165                 {"section",   1, NULL, 's'},
 166                 {NULL,        0, NULL, 0},
 167         };
 168         const char *options = "tpnbs";
 169
 170         static void pr_err(char *msg)                   // NOTEST
 171         {
 172                 fprintf(stderr, "%s\n", msg);           // NOTEST
 173         }                                               // NOTEST
 174
 175         int main(int argc, char *argv[])
 176         {
 177                 int fd;
 178                 int len;
 179                 char *file;
 180                 struct section *s, *ss;
 181                 char *section = NULL;
 182                 struct parse_context context = {
 183                         .config = {
 184                                 .ignored = (1 << TK_mark),
 185                                 .number_chars = ".,_+- ",
 186                                 .word_start = "_",
 187                                 .word_cont = "_",
 188                         },
 189                 };
 190                 int doprint=0, dotrace=0, doexec=1, brackets=0;
 191                 int opt;
 192                 while ((opt = getopt_long(argc, argv, options, long_options, NULL))
 193                        != -1) {
 194                         switch(opt) {
 195                         case 't': dotrace=1; break;
 196                         case 'p': doprint=1; break;
 197                         case 'n': doexec=0; break;
 198                         case 'b': brackets=1; break;
 199                         case 's': section = optarg; break;
 200                         default: fprintf(stderr, Usage);
 201                                 exit(1);
 202                         }
 203                 }
 204                 if (optind >= argc) {
 205                         fprintf(stderr, "oceani: no input file given\n");
 206                         exit(1);
 207                 }
 208                 fd = open(argv[optind], O_RDONLY);
 209                 if (fd < 0) {
 210                         fprintf(stderr, "oceani: cannot open %s\n", argv[optind]);
 211                         exit(1);
 212                 }
 213                 context.file_name = argv[optind];
 214                 len = lseek(fd, 0, 2);
 215                 file = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
 216                 s = code_extract(file, file+len, pr_err);
 217                 if (!s) {
 218                         fprintf(stderr, "oceani: could not find any code in %s\n",
 219                                 argv[optind]);
 220                         exit(1);
 221                 }
 222
 223                 ## context initialization
 224
 225                 if (section) {
 226                         for (ss = s; ss; ss = ss->next) {
 227                                 struct text sec = ss->section;
 228                                 if (sec.len == strlen(section) &&
 229                                     strncmp(sec.txt, section, sec.len) == 0)
 230                                         break;
 231                         }
 232                         if (!ss) {
 233                                 fprintf(stderr, "oceani: cannot find section %s\n",
 234                                         section);
 235                                 exit(1);
 236                         }
 237                 } else
 238                         ss = s;                         // NOTEST
 239                 parse_oceani(ss->code, &context.config, dotrace ? stderr : NULL);
 240
 241                 if (!context.prog) {
 242                         fprintf(stderr, "oceani: no main function found.\n");
 243                         context.parse_error = 1;
 244                 }
 245                 if (context.prog && doprint) {
 246                         ## print const decls
 247                         ## print type decls
 248                         print_exec(context.prog, 0, brackets);
 249                 }
 250                 if (context.prog && doexec && !context.parse_error) {
 251                         if (!analyse_prog(context.prog, &context)) {
 252                                 fprintf(stderr, "oceani: type error in program - not running.\n");
 253                                 exit(1);
 254                         }
 255                         interp_prog(&context, context.prog, argc - optind, argv+optind);
 256                 }
 257                 free_exec(context.prog);
 258
 259                 while (s) {
 260                         struct section *t = s->next;
 261                         code_free(s->code);
 262                         free(s);
 263                         s = t;
 264                 }
 265                 ## free context vars
 266                 ## free context types
 267                 ## free context storage
 268                 exit(context.parse_error ? 1 : 0);
 269         }
 270
 271 ### Analysis
 272
 273 The four requirements of parse, analyse, print, interpret apply to
 274 each language element individually so that is how most of the code
 275 will be structured.
 276
 277 Three of the four are fairly self explanatory.  The one that requires
 278 a little explanation is the analysis step.
 279
 280 The current language design does not require the types of variables to
 281 be declared, but they must still have a single type.  Different
 282 operations impose different requirements on the variables, for example
 283 addition requires both arguments to be numeric, and assignment
 284 requires the variable on the left to have the same type as the
 285 expression on the right.
 286
 287 Analysis involves propagating these type requirements around and
 288 consequently setting the type of each variable.  If any requirements
 289 are violated (e.g. a string is compared with a number) or if a
 290 variable needs to have two different types, then an error is raised
 291 and the program will not run.
 292
 293 If the same variable is declared in both branchs of an 'if/else', or
 294 in all cases of a 'switch' then the multiple instances may be merged
 295 into just one variable if the variable is referenced after the
 296 conditional statement.  When this happens, the types must naturally be
 297 consistent across all the branches.  When the variable is not used
 298 outside the if, the variables in the different branches are distinct
 299 and can be of different types.
 300
 301 Undeclared names may only appear in "use" statements and "case" expressions.
 302 These names are given a type of "label" and a unique value.
 303 This allows them to fill the role of a name in an enumerated type, which
 304 is useful for testing the `switch` statement.
 305
 306 As we will see, the condition part of a `while` statement can return
 307 either a Boolean or some other type.  This requires that the expected
 308 type that gets passed around comprises a type and a flag to indicate
 309 that `Tbool` is also permitted.
 310
 311 As there are, as yet, no distinct types that are compatible, there
 312 isn't much subtlety in the analysis.  When we have distinct number
 313 types, this will become more interesting.
 314
 315 #### Error reporting
 316
 317 When analysis discovers an inconsistency it needs to report an error;
 318 just refusing to run the code ensures that the error doesn't cascade,
 319 but by itself it isn't very useful.  A clear understanding of the sort
 320 of error message that are useful will help guide the process of
 321 analysis.
 322
 323 At a simplistic level, the only sort of error that type analysis can
 324 report is that the type of some construct doesn't match a contextual
 325 requirement.  For example, in `4 + "hello"` the addition provides a
 326 contextual requirement for numbers, but `"hello"` is not a number.  In
 327 this particular example no further information is needed as the types
 328 are obvious from local information.  When a variable is involved that
 329 isn't the case.  It may be helpful to explain why the variable has a
 330 particular type, by indicating the location where the type was set,
 331 whether by declaration or usage.
 332
 333 Using a recursive-descent analysis we can easily detect a problem at
 334 multiple locations. In "`hello:= "there"; 4 + hello`" the addition
 335 will detect that one argument is not a number and the usage of `hello`
 336 will detect that a number was wanted, but not provided.  In this
 337 (early) version of the language, we will generate error reports at
 338 multiple locations, so the use of `hello` will report an error and
 339 explain were the value was set, and the addition will report an error
 340 and say why numbers are needed.  To be able to report locations for
 341 errors, each language element will need to record a file location
 342 (line and column) and each variable will need to record the language
 343 element where its type was set.  For now we will assume that each line
 344 of an error message indicates one location in the file, and up to 2
 345 types.  So we provide a `printf`-like function which takes a format, a
 346 location (a `struct exec` which has not yet been introduced), and 2
 347 types. "`%1`" reports the first type, "`%2`" reports the second.  We
 348 will need a function to print the location, once we know how that is
 349 stored. e As will be explained later, there are sometimes extra rules for
 350 type matching and they might affect error messages, we need to pass those
 351 in too.
 352
 353 As well as type errors, we sometimes need to report problems with
 354 tokens, which might be unexpected or might name a type that has not
 355 been defined.  For these we have `tok_err()` which reports an error
 356 with a given token.  Each of the error functions sets the flag in the
 357 context so indicate that parsing failed.
 358
 359 ###### forward decls
 360
 361         static void fput_loc(struct exec *loc, FILE *f);
 362
 363 ###### core functions
 364
 365         static void type_err(struct parse_context *c,
 366                              char *fmt, struct exec *loc,
 367                              struct type *t1, int rules, struct type *t2)
 368         {
 369                 fprintf(stderr, "%s:", c->file_name);
 370                 fput_loc(loc, stderr);
 371                 for (; *fmt ; fmt++) {
 372                         if (*fmt != '%') {
 373                                 fputc(*fmt, stderr);
 374                                 continue;
 375                         }
 376                         fmt++;
 377                         switch (*fmt) {
 378                         case '%': fputc(*fmt, stderr); break;   // NOTEST
 379                         default: fputc('?', stderr); break;     // NOTEST
 380                         case '1':
 381                                 type_print(t1, stderr);
 382                                 break;
 383                         case '2':
 384                                 type_print(t2, stderr);
 385                                 break;
 386                         ## format cases
 387                         }
 388                 }
 389                 fputs("\n", stderr);
 390                 c->parse_error = 1;
 391         }
 392
 393         static void tok_err(struct parse_context *c, char *fmt, struct token *t)
 394         {
 395                 fprintf(stderr, "%s:%d:%d: %s: %.*s\n", c->file_name, t->line, t->col, fmt,
 396                         t->txt.len, t->txt.txt);
 397                 c->parse_error = 1;
 398         }
 399
 400 ## Entities: declared and predeclared.
 401
 402 There are various "things" that the language and/or the interpreter
 403 needs to know about to parse and execute a program.  These include
 404 types, variables, values, and executable code.  These are all lumped
 405 together under the term "entities" (calling them "objects" would be
 406 confusing) and introduced here.  The following section will present the
 407 different specific code elements which comprise or manipulate these
 408 various entities.
 409
 410 ### Types
 411
 412 Values come in a wide range of types, with more likely to be added.
 413 Each type needs to be able to print its own values (for convenience at
 414 least) as well as to compare two values, at least for equality and
 415 possibly for order.  For now, values might need to be duplicated and
 416 freed, though eventually such manipulations will be better integrated
 417 into the language.
 418
 419 Rather than requiring every numeric type to support all numeric
 420 operations (add, multiple, etc), we allow types to be able to present
 421 as one of a few standard types: integer, float, and fraction.  The
 422 existence of these conversion functions eventually enable types to
 423 determine if they are compatible with other types, though such types
 424 have not yet been implemented.
 425
 426 Named type are stored in a simple linked list.  Objects of each type are
 427 "values" which are often passed around by value.
 428
 429 ###### ast
 430
 431         struct value {
 432                 union {
 433                         char ptr[1];
 434                         ## value union fields
 435                 };
 436         };
 437
 438         struct type {
 439                 struct text name;
 440                 struct type *next;
 441                 int size, align;
 442                 void (*init)(struct type *type, struct value *val);
 443                 void (*prepare_type)(struct parse_context *c, struct type *type, int parse_time);
 444                 void (*print)(struct type *type, struct value *val);
 445                 void (*print_type)(struct type *type, FILE *f);
 446                 int (*cmp_order)(struct type *t1, struct type *t2,
 447                                  struct value *v1, struct value *v2);
 448                 int (*cmp_eq)(struct type *t1, struct type *t2,
 449                               struct value *v1, struct value *v2);
 450                 void (*dup)(struct type *type, struct value *vold, struct value *vnew);
 451                 void (*free)(struct type *type, struct value *val);
 452                 void (*free_type)(struct type *t);
 453                 long long (*to_int)(struct value *v);
 454                 double (*to_float)(struct value *v);
 455                 int (*to_mpq)(mpq_t *q, struct value *v);
 456                 ## type functions
 457                 union {
 458                         ## type union fields
 459                 };
 460         };
 461
 462 ###### parse context
 463
 464         struct type *typelist;
 465
 466 ###### ast functions
 467
 468         static struct type *find_type(struct parse_context *c, struct text s)
 469         {
 470                 struct type *l = c->typelist;
 471
 472                 while (l &&
 473                        text_cmp(l->name, s) != 0)
 474                                 l = l->next;
 475                 return l;
 476         }
 477
 478         static struct type *add_type(struct parse_context *c, struct text s,
 479                                      struct type *proto)
 480         {
 481                 struct type *n;
 482
 483                 n = calloc(1, sizeof(*n));
 484                 *n = *proto;
 485                 n->name = s;
 486                 n->next = c->typelist;
 487                 c->typelist = n;
 488                 return n;
 489         }
 490
 491         static void free_type(struct type *t)
 492         {
 493                 /* The type is always a reference to something in the
 494                  * context, so we don't need to free anything.
 495                  */
 496         }
 497
 498         static void free_value(struct type *type, struct value *v)
 499         {
 500                 if (type && v)
 501                         type->free(type, v);
 502         }
 503
 504         static void type_print(struct type *type, FILE *f)
 505         {
 506                 if (!type)
 507                         fputs("*unknown*type*", f);     // NOTEST
 508                 else if (type->name.len)
 509                         fprintf(f, "%.*s", type->name.len, type->name.txt);
 510                 else if (type->print_type)
 511                         type->print_type(type, f);
 512                 else
 513                         fputs("*invalid*type*", f);     // NOTEST
 514         }
 515
 516         static void val_init(struct type *type, struct value *val)
 517         {
 518                 if (type && type->init)
 519                         type->init(type, val);
 520         }
 521
 522         static void dup_value(struct type *type,
 523                               struct value *vold, struct value *vnew)
 524         {
 525                 if (type && type->dup)
 526                         type->dup(type, vold, vnew);
 527         }
 528
 529         static int value_cmp(struct type *tl, struct type *tr,
 530                              struct value *left, struct value *right)
 531         {
 532                 if (tl && tl->cmp_order)
 533                         return tl->cmp_order(tl, tr, left, right);
 534                 if (tl && tl->cmp_eq)                   // NOTEST
 535                         return tl->cmp_eq(tl, tr, left, right); // NOTEST
 536                 return -1;                              // NOTEST
 537         }
 538
 539         static void print_value(struct type *type, struct value *v)
 540         {
 541                 if (type && type->print)
 542                         type->print(type, v);
 543                 else
 544                         printf("*Unknown*");            // NOTEST
 545         }
 546
 547 ###### forward decls
 548
 549         static void free_value(struct type *type, struct value *v);
 550         static int type_compat(struct type *require, struct type *have, int rules);
 551         static void type_print(struct type *type, FILE *f);
 552         static void val_init(struct type *type, struct value *v);
 553         static void dup_value(struct type *type,
 554                               struct value *vold, struct value *vnew);
 555         static int value_cmp(struct type *tl, struct type *tr,
 556                              struct value *left, struct value *right);
 557         static void print_value(struct type *type, struct value *v);
 558
 559 ###### free context types
 560
 561         while (context.typelist) {
 562                 struct type *t = context.typelist;
 563
 564                 context.typelist = t->next;
 565                 if (t->free_type)
 566                         t->free_type(t);
 567                 free(t);
 568         }
 569
 570 Type can be specified for local variables, for fields in a structure,
 571 for formal parameters to functions, and possibly elsewhere.  Different
 572 rules may apply in different contexts.  As a minimum, a named type may
 573 always be used.  Currently the type of a formal parameter can be
 574 different from types in other contexts, so we have a separate grammar
 575 symbol for those.
 576
 577 ###### Grammar
 578
 579         $*type
 580         Type -> IDENTIFIER ${
 581                 $0 = find_type(c, $1.txt);
 582                 if (!$0) {
 583                         tok_err(c,
 584                                 "error: undefined type", &$1);
 585
 586                         $0 = Tnone;
 587                 }
 588         }$
 589         ## type grammar
 590
 591         FormalType -> Type ${ $0 = $<1; }$
 592         ## formal type grammar
 593
 594 #### Base Types
 595
 596 Values of the base types can be numbers, which we represent as
 597 multi-precision fractions, strings, Booleans and labels.  When
 598 analysing the program we also need to allow for places where no value
 599 is meaningful (type `Tnone`) and where we don't know what type to
 600 expect yet (type is `NULL`).
 601
 602 Values are never shared, they are always copied when used, and freed
 603 when no longer needed.
 604
 605 When propagating type information around the program, we need to
 606 determine if two types are compatible, where type `NULL` is compatible
 607 with anything.  There are two special cases with type compatibility,
 608 both related to the Conditional Statement which will be described
 609 later.  In some cases a Boolean can be accepted as well as some other
 610 primary type, and in others any type is acceptable except a label (`Vlabel`).
 611 A separate function encoding these cases will simplify some code later.
 612
 613 ###### type functions
 614
 615         int (*compat)(struct type *this, struct type *other);
 616
 617 ###### ast functions
 618
 619         static int type_compat(struct type *require, struct type *have, int rules)
 620         {
 621                 if ((rules & Rboolok) && have == Tbool)
 622                         return 1;       // NOTEST
 623                 if ((rules & Rnolabel) && have == Tlabel)
 624                         return 0;       // NOTEST
 625                 if (!require || !have)
 626                         return 1;
 627
 628                 if (require->compat)
 629                         return require->compat(require, have);
 630
 631                 return require == have;
 632         }
 633
 634 ###### includes
 635         #include <gmp.h>
 636         #include "parse_string.h"
 637         #include "parse_number.h"
 638
 639 ###### libs
 640         myLDLIBS := libnumber.o libstring.o -lgmp
 641         LDLIBS := $(filter-out $(myLDLIBS),$(LDLIBS)) $(myLDLIBS)
 642
 643 ###### type union fields
 644         enum vtype {Vnone, Vstr, Vnum, Vbool, Vlabel} vtype;
 645
 646 ###### value union fields
 647         struct text str;
 648         mpq_t num;
 649         unsigned char bool;
 650         void *label;
 651
 652 ###### ast functions
 653         static void _free_value(struct type *type, struct value *v)
 654         {
 655                 if (!v)
 656                         return;         // NOTEST
 657                 switch (type->vtype) {
 658                 case Vnone: break;
 659                 case Vstr: free(v->str.txt); break;
 660                 case Vnum: mpq_clear(v->num); break;
 661                 case Vlabel:
 662                 case Vbool: break;
 663                 }
 664         }
 665
 666 ###### value functions
 667
 668         static void _val_init(struct type *type, struct value *val)
 669         {
 670                 switch(type->vtype) {
 671                 case Vnone:             // NOTEST
 672                         break;          // NOTEST
 673                 case Vnum:
 674                         mpq_init(val->num); break;
 675                 case Vstr:
 676                         val->str.txt = malloc(1);
 677                         val->str.len = 0;
 678                         break;
 679                 case Vbool:
 680                         val->bool = 0;
 681                         break;
 682                 case Vlabel:
 683                         val->label = NULL;
 684                         break;
 685                 }
 686         }
 687
 688         static void _dup_value(struct type *type,
 689                                struct value *vold, struct value *vnew)
 690         {
 691                 switch (type->vtype) {
 692                 case Vnone:             // NOTEST
 693                         break;          // NOTEST
 694                 case Vlabel:
 695                         vnew->label = vold->label;
 696                         break;
 697                 case Vbool:
 698                         vnew->bool = vold->bool;
 699                         break;
 700                 case Vnum:
 701                         mpq_init(vnew->num);
 702                         mpq_set(vnew->num, vold->num);
 703                         break;
 704                 case Vstr:
 705                         vnew->str.len = vold->str.len;
 706                         vnew->str.txt = malloc(vnew->str.len);
 707                         memcpy(vnew->str.txt, vold->str.txt, vnew->str.len);
 708                         break;
 709                 }
 710         }
 711
 712         static int _value_cmp(struct type *tl, struct type *tr,
 713                               struct value *left, struct value *right)
 714         {
 715                 int cmp;
 716                 if (tl != tr)
 717                         return tl - tr; // NOTEST
 718                 switch (tl->vtype) {
 719                 case Vlabel: cmp = left->label == right->label ? 0 : 1; break;
 720                 case Vnum: cmp = mpq_cmp(left->num, right->num); break;
 721                 case Vstr: cmp = text_cmp(left->str, right->str); break;
 722                 case Vbool: cmp = left->bool - right->bool; break;
 723                 case Vnone: cmp = 0;                    // NOTEST
 724                 }
 725                 return cmp;
 726         }
 727
 728         static void _print_value(struct type *type, struct value *v)
 729         {
 730                 switch (type->vtype) {
 731                 case Vnone:                             // NOTEST
 732                         printf("*no-value*"); break;    // NOTEST
 733                 case Vlabel:                            // NOTEST
 734                         printf("*label-%p*", v->label); break; // NOTEST
 735                 case Vstr:
 736                         printf("%.*s", v->str.len, v->str.txt); break;
 737                 case Vbool:
 738                         printf("%s", v->bool ? "True":"False"); break;
 739                 case Vnum:
 740                         {
 741                         mpf_t fl;
 742                         mpf_init2(fl, 20);
 743                         mpf_set_q(fl, v->num);
 744                         gmp_printf("%Fg", fl);
 745                         mpf_clear(fl);
 746                         break;
 747                         }
 748                 }
 749         }
 750
 751         static void _free_value(struct type *type, struct value *v);
 752
 753         static struct type base_prototype = {
 754                 .init = _val_init,
 755                 .print = _print_value,
 756                 .cmp_order = _value_cmp,
 757                 .cmp_eq = _value_cmp,
 758                 .dup = _dup_value,
 759                 .free = _free_value,
 760         };
 761
 762         static struct type *Tbool, *Tstr, *Tnum, *Tnone, *Tlabel;
 763
 764 ###### ast functions
 765         static struct type *add_base_type(struct parse_context *c, char *n,
 766                                           enum vtype vt, int size)
 767         {
 768                 struct text txt = { n, strlen(n) };
 769                 struct type *t;
 770
 771                 t = add_type(c, txt, &base_prototype);
 772                 t->vtype = vt;
 773                 t->size = size;
 774                 t->align = size > sizeof(void*) ? sizeof(void*) : size;
 775                 if (t->size & (t->align - 1))
 776                         t->size = (t->size | (t->align - 1)) + 1;       // NOTEST
 777                 return t;
 778         }
 779
 780 ###### context initialization
 781
 782         Tbool  = add_base_type(&context, "Boolean", Vbool, sizeof(char));
 783         Tstr   = add_base_type(&context, "string", Vstr, sizeof(struct text));
 784         Tnum   = add_base_type(&context, "number", Vnum, sizeof(mpq_t));
 785         Tnone  = add_base_type(&context, "none", Vnone, 0);
 786         Tlabel = add_base_type(&context, "label", Vlabel, sizeof(void*));
 787
 788 ### Variables
 789
 790 Variables are scoped named values.  We store the names in a linked list
 791 of "bindings" sorted in lexical order, and use sequential search and
 792 insertion sort.
 793
 794 ###### ast
 795
 796         struct binding {
 797                 struct text name;
 798                 struct binding *next;   // in lexical order
 799                 ## binding fields
 800         };
 801
 802 This linked list is stored in the parse context so that "reduce"
 803 functions can find or add variables, and so the analysis phase can
 804 ensure that every variable gets a type.
 805
 806 ###### parse context
 807
 808         struct binding *varlist;  // In lexical order
 809
 810 ###### ast functions
 811
 812         static struct binding *find_binding(struct parse_context *c, struct text s)
 813         {
 814                 struct binding **l = &c->varlist;
 815                 struct binding *n;
 816                 int cmp = 1;
 817
 818                 while (*l &&
 819                         (cmp = text_cmp((*l)->name, s)) < 0)
 820                                 l = & (*l)->next;
 821                 if (cmp == 0)
 822                         return *l;
 823                 n = calloc(1, sizeof(*n));
 824                 n->name = s;
 825                 n->next = *l;
 826                 *l = n;
 827                 return n;
 828         }
 829
 830 Each name can be linked to multiple variables defined in different
 831 scopes.  Each scope starts where the name is declared and continues
 832 until the end of the containing code block.  Scopes of a given name
 833 cannot nest, so a declaration while a name is in-scope is an error.
 834
 835 ###### binding fields
 836         struct variable *var;
 837
 838 ###### ast
 839         struct variable {
 840                 struct variable *previous;
 841                 struct type *type;
 842                 struct binding *name;
 843                 struct exec *where_decl;// where name was declared
 844                 struct exec *where_set; // where type was set
 845                 ## variable fields
 846         };
 847
 848 While the naming seems strange, we include local constants in the
 849 definition of variables.  A name declared `var := value` can
 850 subsequently be changed, but a name declared `var ::= value` cannot -
 851 it is constant
 852
 853 ###### variable fields
 854         int constant;
 855
 856 Scopes in parallel branches can be partially merged.  More
 857 specifically, if a given name is declared in both branches of an
 858 if/else then its scope is a candidate for merging.  Similarly if
 859 every branch of an exhaustive switch (e.g. has an "else" clause)
 860 declares a given name, then the scopes from the branches are
 861 candidates for merging.
 862
 863 Note that names declared inside a loop (which is only parallel to
 864 itself) are never visible after the loop.  Similarly names defined in
 865 scopes which are not parallel, such as those started by `for` and
 866 `switch`, are never visible after the scope.  Only variables defined in
 867 both `then` and `else` (including the implicit then after an `if`, and
 868 excluding `then` used with `for`) and in all `case`s and `else` of a
 869 `switch` or `while` can be visible beyond the `if`/`switch`/`while`.
 870
 871 Labels, which are a bit like variables, follow different rules.
 872 Labels are not explicitly declared, but if an undeclared name appears
 873 in a context where a label is legal, that effectively declares the
 874 name as a label.  The declaration remains in force (or in scope) at
 875 least to the end of the immediately containing block and conditionally
 876 in any larger containing block which does not declare the name in some
 877 other way.  Importantly, the conditional scope extension happens even
 878 if the label is only used in one parallel branch of a conditional --
 879 when used in one branch it is treated as having been declared in all
 880 branches.
 881
 882 Merge candidates are tentatively visible beyond the end of the
 883 branching statement which creates them.  If the name is used, the
 884 merge is affirmed and they become a single variable visible at the
 885 outer layer.  If not - if it is redeclared first - the merge lapses.
 886
 887 To track scopes we have an extra stack, implemented as a linked list,
 888 which roughly parallels the parse stack and which is used exclusively
 889 for scoping.  When a new scope is opened, a new frame is pushed and
 890 the child-count of the parent frame is incremented.  This child-count
 891 is used to distinguish between the first of a set of parallel scopes,
 892 in which declared variables must not be in scope, and subsequent
 893 branches, whether they may already be conditionally scoped.
 894
 895 To push a new frame *before* any code in the frame is parsed, we need a
 896 grammar reduction.  This is most easily achieved with a grammar
 897 element which derives the empty string, and creates the new scope when
 898 it is recognised.  This can be placed, for example, between a keyword
 899 like "if" and the code following it.
 900
 901 ###### ast
 902         struct scope {
 903                 struct scope *parent;
 904                 int child_count;
 905         };
 906
 907 ###### parse context
 908         int scope_depth;
 909         struct scope *scope_stack;
 910
 911 ###### ast functions
 912         static void scope_pop(struct parse_context *c)
 913         {
 914                 struct scope *s = c->scope_stack;
 915
 916                 c->scope_stack = s->parent;
 917                 free(s);
 918                 c->scope_depth -= 1;
 919         }
 920
 921         static void scope_push(struct parse_context *c)
 922         {
 923                 struct scope *s = calloc(1, sizeof(*s));
 924                 if (c->scope_stack)
 925                         c->scope_stack->child_count += 1;
 926                 s->parent = c->scope_stack;
 927                 c->scope_stack = s;
 928                 c->scope_depth += 1;
 929         }
 930
 931 ###### Grammar
 932
 933         $void
 934         OpenScope -> ${ scope_push(c); }$
 935         ClosePara -> ${ var_block_close(c, CloseParallel); }$
 936
 937 Each variable records a scope depth and is in one of four states:
 938
 939 - "in scope".  This is the case between the declaration of the
 940   variable and the end of the containing block, and also between
 941   the usage with affirms a merge and the end of that block.
 942
 943   The scope depth is not greater than the current parse context scope
 944   nest depth.  When the block of that depth closes, the state will
 945   change.  To achieve this, all "in scope" variables are linked
 946   together as a stack in nesting order.
 947
 948 - "pending".  The "in scope" block has closed, but other parallel
 949   scopes are still being processed.  So far, every parallel block at
 950   the same level that has closed has declared the name.
 951
 952   The scope depth is the depth of the last parallel block that
 953   enclosed the declaration, and that has closed.
 954
 955 - "conditionally in scope".  The "in scope" block and all parallel
 956   scopes have closed, and no further mention of the name has been
 957   seen.  This state includes a secondary nest depth which records the
 958   outermost scope seen since the variable became conditionally in
 959   scope.  If a use of the name is found, the variable becomes "in
 960   scope" and that secondary depth becomes the recorded scope depth.
 961   If the name is declared as a new variable, the old variable becomes
 962   "out of scope" and the recorded scope depth stays unchanged.
 963
 964 - "out of scope".  The variable is neither in scope nor conditionally
 965   in scope.  It is permanently out of scope now and can be removed from
 966   the "in scope" stack.
 967
 968 ###### variable fields
 969         int depth, min_depth;
 970         enum { OutScope, PendingScope, CondScope, InScope } scope;
 971         struct variable *in_scope;
 972
 973 ###### parse context
 974
 975         struct variable *in_scope;
 976
 977 All variables with the same name are linked together using the
 978 'previous' link.  Those variable that have been affirmatively merged all
 979 have a 'merged' pointer that points to one primary variable - the most
 980 recently declared instance.  When merging variables, we need to also
 981 adjust the 'merged' pointer on any other variables that had previously
 982 been merged with the one that will no longer be primary.
 983
 984 A variable that is no longer the most recent instance of a name may
 985 still have "pending" scope, if it might still be merged with most
 986 recent instance.  These variables don't really belong in the
 987 "in_scope" list, but are not immediately removed when a new instance
 988 is found.  Instead, they are detected and ignored when considering the
 989 list of in_scope names.
 990
 991 The storage of the value of a variable will be described later.  For now
 992 we just need to know that when a variable goes out of scope, it might
 993 need to be freed.  For this we need to be able to find it, so assume that
 994 `var_value()` will provide that.
 995
 996 ###### variable fields
 997         struct variable *merged;
 998
 999 ###### ast functions
1000
1001         static void variable_merge(struct variable *primary, struct variable *secondary)
1002         {
1003                 struct variable *v;
1004
1005                 if (primary->merged)
1006                         // shouldn't happen
1007                         primary = primary->merged;      // NOTEST
1008
1009                 for (v = primary->previous; v; v=v->previous)
1010                         if (v == secondary || v == secondary->merged ||
1011                             v->merged == secondary ||
1012                             (v->merged && v->merged == secondary->merged)) {
1013                                 v->scope = OutScope;
1014                                 v->merged = primary;
1015                         }
1016         }
1017
1018 ###### forward decls
1019         static struct value *var_value(struct parse_context *c, struct variable *v);
1020
1021 ###### free context vars
1022
1023         while (context.varlist) {
1024                 struct binding *b = context.varlist;
1025                 struct variable *v = b->var;
1026                 context.varlist = b->next;
1027                 free(b);
1028                 while (v) {
1029                         struct variable *t = v;
1030
1031                         v = t->previous;
1032                         free_value(t->type, var_value(&context, t));
1033                         if (t->depth == 0)
1034                                 // This is a global constant
1035                                 free_exec(t->where_decl);
1036                         free(t);
1037                 }
1038         }
1039
1040 #### Manipulating Bindings
1041
1042 When a name is conditionally visible, a new declaration discards the
1043 old binding - the condition lapses.  Conversely a usage of the name
1044 affirms the visibility and extends it to the end of the containing
1045 block - i.e. the block that contains both the original declaration and
1046 the latest usage.  This is determined from `min_depth`.  When a
1047 conditionally visible variable gets affirmed like this, it is also
1048 merged with other conditionally visible variables with the same name.
1049
1050 When we parse a variable declaration we either report an error if the
1051 name is currently bound, or create a new variable at the current nest
1052 depth if the name is unbound or bound to a conditionally scoped or
1053 pending-scope variable.  If the previous variable was conditionally
1054 scoped, it and its homonyms becomes out-of-scope.
1055
1056 When we parse a variable reference (including non-declarative assignment
1057 "foo = bar") we report an error if the name is not bound or is bound to
1058 a pending-scope variable; update the scope if the name is bound to a
1059 conditionally scoped variable; or just proceed normally if the named
1060 variable is in scope.
1061
1062 When we exit a scope, any variables bound at this level are either
1063 marked out of scope or pending-scoped, depending on whether the scope
1064 was sequential or parallel.  Here a "parallel" scope means the "then"
1065 or "else" part of a conditional, or any "case" or "else" branch of a
1066 switch.  Other scopes are "sequential".
1067
1068 When exiting a parallel scope we check if there are any variables that
1069 were previously pending and are still visible. If there are, then
1070 there weren't redeclared in the most recent scope, so they cannot be
1071 merged and must become out-of-scope.  If it is not the first of
1072 parallel scopes (based on `child_count`), we check that there was a
1073 previous binding that is still pending-scope.  If there isn't, the new
1074 variable must now be out-of-scope.
1075
1076 When exiting a sequential scope that immediately enclosed parallel
1077 scopes, we need to resolve any pending-scope variables.  If there was
1078 no `else` clause, and we cannot determine that the `switch` was exhaustive,
1079 we need to mark all pending-scope variable as out-of-scope.  Otherwise
1080 all pending-scope variables become conditionally scoped.
1081
1082 ###### ast
1083         enum closetype { CloseSequential, CloseParallel, CloseElse };
1084
1085 ###### ast functions
1086
1087         static struct variable *var_decl(struct parse_context *c, struct text s)
1088         {
1089                 struct binding *b = find_binding(c, s);
1090                 struct variable *v = b->var;
1091
1092                 switch (v ? v->scope : OutScope) {
1093                 case InScope:
1094                         /* Caller will report the error */
1095                         return NULL;
1096                 case CondScope:
1097                         for (;
1098                              v && v->scope == CondScope;
1099                              v = v->previous)
1100                                 v->scope = OutScope;
1101                         break;
1102                 default: break;
1103                 }
1104                 v = calloc(1, sizeof(*v));
1105                 v->previous = b->var;
1106                 b->var = v;
1107                 v->name = b;
1108                 v->min_depth = v->depth = c->scope_depth;
1109                 v->scope = InScope;
1110                 v->in_scope = c->in_scope;
1111                 c->in_scope = v;
1112                 return v;
1113         }
1114
1115         static struct variable *var_ref(struct parse_context *c, struct text s)
1116         {
1117                 struct binding *b = find_binding(c, s);
1118                 struct variable *v = b->var;
1119                 struct variable *v2;
1120
1121                 switch (v ? v->scope : OutScope) {
1122                 case OutScope:
1123                 case PendingScope:
1124                         /* Caller will report the error */
1125                         return NULL;
1126                 case CondScope:
1127                         /* All CondScope variables of this name need to be merged
1128                          * and become InScope
1129                          */
1130                         v->depth = v->min_depth;
1131                         v->scope = InScope;
1132                         for (v2 = v->previous;
1133                              v2 && v2->scope == CondScope;
1134                              v2 = v2->previous)
1135                                 variable_merge(v, v2);
1136                         break;
1137                 case InScope:
1138                         break;
1139                 }
1140                 return v;
1141         }
1142
1143         static void var_block_close(struct parse_context *c, enum closetype ct)
1144         {
1145                 /* Close off all variables that are in_scope */
1146                 struct variable *v, **vp, *v2;
1147
1148                 scope_pop(c);
1149                 for (vp = &c->in_scope;
1150                      v = *vp, v && v->depth > c->scope_depth && v->min_depth > c->scope_depth;
1151                      ) {
1152                         if (v->name->var == v) switch (ct) {
1153                         case CloseElse:
1154                         case CloseParallel: /* handle PendingScope */
1155                                 switch(v->scope) {
1156                                 case InScope:
1157                                 case CondScope:
1158                                         if (c->scope_stack->child_count == 1)
1159                                                 v->scope = PendingScope;
1160                                         else if (v->previous &&
1161                                                  v->previous->scope == PendingScope)
1162                                                 v->scope = PendingScope;
1163                                         else if (v->type == Tlabel)     // UNTESTED
1164                                                 v->scope = PendingScope;        // UNTESTED
1165                                         else if (v->name->var == v)     // UNTESTED
1166                                                 v->scope = OutScope;    // UNTESTED
1167                                         if (ct == CloseElse) {
1168                                                 /* All Pending variables with this name
1169                                                  * are now Conditional */
1170                                                 for (v2 = v;
1171                                                      v2 && v2->scope == PendingScope;
1172                                                      v2 = v2->previous)
1173                                                         v2->scope = CondScope;
1174                                         }
1175                                         break;
1176                                 case PendingScope:
1177                                         for (v2 = v;
1178                                              v2 && v2->scope == PendingScope;
1179                                              v2 = v2->previous)
1180                                                 if (v2->type != Tlabel)
1181                                                         v2->scope = OutScope;
1182                                         break;
1183                                 case OutScope: break;   // UNTESTED
1184                                 }
1185                                 break;
1186                         case CloseSequential:
1187                                 if (v->type == Tlabel)
1188                                         v->scope = PendingScope;
1189                                 switch (v->scope) {
1190                                 case InScope:
1191                                         v->scope = OutScope;
1192                                         break;
1193                                 case PendingScope:
1194                                         /* There was no 'else', so we can only become
1195                                          * conditional if we know the cases were exhaustive,
1196                                          * and that doesn't mean anything yet.
1197                                          * So only labels become conditional..
1198                                          */
1199                                         for (v2 = v;
1200                                              v2 && v2->scope == PendingScope;
1201                                              v2 = v2->previous)
1202                                                 if (v2->type == Tlabel) {
1203                                                         v2->scope = CondScope;
1204                                                         v2->min_depth = c->scope_depth;
1205                                                 } else
1206                                                         v2->scope = OutScope;
1207                                         break;
1208                                 case CondScope:
1209                                 case OutScope: break;
1210                                 }
1211                                 break;
1212                         }
1213                         if (v->scope == OutScope || v->name->var != v)
1214                                 *vp = v->in_scope;
1215                         else
1216                                 vp = &v->in_scope;
1217                 }
1218         }
1219
1220 #### Storing Values
1221
1222 The value of a variable is store separately from the variable, on an
1223 analogue of a stack frame.  There are (currently) two frames that can be
1224 active.  A global frame which currently only stores constants, and a
1225 stacked frame which stores local variables.  Each variable knows if it
1226 is global or not, and what its index into the frame is.
1227
1228 Values in the global frame are known immediately they are relevant, so
1229 the frame needs to be reallocated as it grows so it can store those
1230 values.  The local frame doesn't get values until the interpreted phase
1231 is started, so there is no need to allocate until the size is known.
1232
1233 ###### variable fields
1234                 short frame_pos;
1235                 short global;
1236
1237 ###### parse context
1238
1239         short global_size, global_alloc;
1240         short local_size;
1241         void *global, *local;
1242
1243 ###### ast functions
1244
1245         static struct value *var_value(struct parse_context *c, struct variable *v)
1246         {
1247                 if (!v->global) {
1248                         if (!c->local || !v->type)
1249                                 return NULL;
1250                         if (v->frame_pos + v->type->size > c->local_size) {
1251                                 printf("INVALID frame_pos\n");  // NOTEST
1252                                 exit(2);                        // NOTEST
1253                         }
1254                         return c->local + v->frame_pos;
1255                 }
1256                 if (c->global_size > c->global_alloc) {
1257                         int old = c->global_alloc;
1258                         c->global_alloc = (c->global_size | 1023) + 1024;
1259                         c->global = realloc(c->global, c->global_alloc);
1260                         memset(c->global + old, 0, c->global_alloc - old);
1261                 }
1262                 return c->global + v->frame_pos;
1263         }
1264
1265         static struct value *global_alloc(struct parse_context *c, struct type *t,
1266                                           struct variable *v, struct value *init)
1267         {
1268                 struct value *ret;
1269                 struct variable scratch;
1270
1271                 if (t->prepare_type)
1272                         t->prepare_type(c, t, 1);       // NOTEST
1273
1274                 if (c->global_size & (t->align - 1))
1275                         c->global_size = (c->global_size + t->align) & ~(t->align-1);   // UNTESTED
1276                 if (!v) {
1277                         v = &scratch;
1278                         v->type = t;
1279                 }
1280                 v->frame_pos = c->global_size;
1281                 v->global = 1;
1282                 c->global_size += v->type->size;
1283                 ret = var_value(c, v);
1284                 if (init)
1285                         memcpy(ret, init, t->size);
1286                 else
1287                         val_init(t, ret);
1288                 return ret;
1289         }
1290
1291 As global values are found -- struct field initializers, labels etc --
1292 `global_alloc()` is called to record the value in the global frame.
1293
1294 When the program is fully parsed, we need to walk the list of variables
1295 to find any that weren't merged away and that aren't global, and to
1296 calculate the frame size and assign a frame position for each variable.
1297 For this we have `scope_finalize()`.
1298
1299 ###### ast functions
1300
1301         static void scope_finalize(struct parse_context *c)
1302         {
1303                 struct binding *b;
1304
1305                 for (b = c->varlist; b; b = b->next) {
1306                         struct variable *v;
1307                         for (v = b->var; v; v = v->previous) {
1308                                 struct type *t = v->type;
1309                                 if (v->merged && v->merged != v)
1310                                         continue;
1311                                 if (v->global)
1312                                         continue;
1313                                 if (c->local_size & (t->align - 1))
1314                                         c->local_size = (c->local_size + t->align) & ~(t->align-1);
1315                                 v->frame_pos = c->local_size;
1316                                 c->local_size += v->type->size;
1317                         }
1318                 }
1319                 c->local = calloc(1, c->local_size);
1320         }
1321
1322 ###### free context storage
1323         free(context.global);
1324         free(context.local);
1325
1326 ### Executables
1327
1328 Executables can be lots of different things.  In many cases an
1329 executable is just an operation combined with one or two other
1330 executables.  This allows for expressions and lists etc.  Other times an
1331 executable is something quite specific like a constant or variable name.
1332 So we define a `struct exec` to be a general executable with a type, and
1333 a `struct binode` which is a subclass of `exec`, forms a node in a
1334 binary tree, and holds an operation.  There will be other subclasses,
1335 and to access these we need to be able to `cast` the `exec` into the
1336 various other types.  The first field in any `struct exec` is the type
1337 from the `exec_types` enum.
1338
1339 ###### macros
1340         #define cast(structname, pointer) ({            \
1341                 const typeof( ((struct structname *)0)->type) *__mptr = &(pointer)->type; \
1342                 if (__mptr && *__mptr != X##structname) abort();                \
1343                 (struct structname *)( (char *)__mptr);})
1344
1345         #define new(structname) ({                                              \
1346                 struct structname *__ptr = ((struct structname *)calloc(1,sizeof(struct structname))); \
1347                 __ptr->type = X##structname;                                            \
1348                 __ptr->line = -1; __ptr->column = -1;                                   \
1349                 __ptr;})
1350
1351         #define new_pos(structname, token) ({                                           \
1352                 struct structname *__ptr = ((struct structname *)calloc(1,sizeof(struct structname))); \
1353                 __ptr->type = X##structname;                                            \
1354                 __ptr->line = token.line; __ptr->column = token.col;                    \
1355                 __ptr;})
1356
1357 ###### ast
1358         enum exec_types {
1359                 Xbinode,
1360                 ## exec type
1361         };
1362         struct exec {
1363                 enum exec_types type;
1364                 int line, column;
1365         };
1366         struct binode {
1367                 struct exec;
1368                 enum Btype {
1369                         ## Binode types
1370                 } op;
1371                 struct exec *left, *right;
1372         };
1373
1374 ###### ast functions
1375
1376         static int __fput_loc(struct exec *loc, FILE *f)
1377         {
1378                 if (!loc)
1379                         return 0;
1380                 if (loc->line >= 0) {
1381                         fprintf(f, "%d:%d: ", loc->line, loc->column);
1382                         return 1;
1383                 }
1384                 if (loc->type == Xbinode)
1385                         return __fput_loc(cast(binode,loc)->left, f) ||
1386                                __fput_loc(cast(binode,loc)->right, f);  // NOTEST
1387                 return 0;                       // NOTEST
1388         }
1389         static void fput_loc(struct exec *loc, FILE *f)
1390         {
1391                 if (!__fput_loc(loc, f))
1392                         fprintf(f, "??:??: ");  // NOTEST
1393         }
1394
1395 Each different type of `exec` node needs a number of functions defined,
1396 a bit like methods.  We must be able to free it, print it, analyse it
1397 and execute it.  Once we have specific `exec` types we will need to
1398 parse them too.  Let's take this a bit more slowly.
1399
1400 #### Freeing
1401
1402 The parser generator requires a `free_foo` function for each struct
1403 that stores attributes and they will often be `exec`s and subtypes
1404 there-of.  So we need `free_exec` which can handle all the subtypes,
1405 and we need `free_binode`.
1406
1407 ###### ast functions
1408
1409         static void free_binode(struct binode *b)
1410         {
1411                 if (!b)
1412                         return;
1413                 free_exec(b->left);
1414                 free_exec(b->right);
1415                 free(b);
1416         }
1417
1418 ###### core functions
1419         static void free_exec(struct exec *e)
1420         {
1421                 if (!e)
1422                         return;
1423                 switch(e->type) {
1424                         ## free exec cases
1425                 }
1426         }
1427
1428 ###### forward decls
1429
1430         static void free_exec(struct exec *e);
1431
1432 ###### free exec cases
1433         case Xbinode: free_binode(cast(binode, e)); break;
1434
1435 #### Printing
1436
1437 Printing an `exec` requires that we know the current indent level for
1438 printing line-oriented components.  As will become clear later, we
1439 also want to know what sort of bracketing to use.
1440
1441 ###### ast functions
1442
1443         static void do_indent(int i, char *str)
1444         {
1445                 while (i--)
1446                         printf("    ");
1447                 printf("%s", str);
1448         }
1449
1450 ###### core functions
1451         static void print_binode(struct binode *b, int indent, int bracket)
1452         {
1453                 struct binode *b2;
1454                 switch(b->op) {
1455                 ## print binode cases
1456                 }
1457         }
1458
1459         static void print_exec(struct exec *e, int indent, int bracket)
1460         {
1461                 if (!e)
1462                         return;         // NOTEST
1463                 switch (e->type) {
1464                 case Xbinode:
1465                         print_binode(cast(binode, e), indent, bracket); break;
1466                 ## print exec cases
1467                 }
1468         }
1469
1470 ###### forward decls
1471
1472         static void print_exec(struct exec *e, int indent, int bracket);
1473
1474 #### Analysing
1475
1476 As discussed, analysis involves propagating type requirements around the
1477 program and looking for errors.
1478
1479 So `propagate_types` is passed an expected type (being a `struct type`
1480 pointer together with some `val_rules` flags) that the `exec` is
1481 expected to return, and returns the type that it does return, either
1482 of which can be `NULL` signifying "unknown".  An `ok` flag is passed
1483 by reference. It is set to `0` when an error is found, and `2` when
1484 any change is made.  If it remains unchanged at `1`, then no more
1485 propagation is needed.
1486
1487 ###### ast
1488
1489         enum val_rules {Rnolabel = 1<<0, Rboolok = 1<<1, Rnoconstant = 2<<1};
1490
1491 ###### format cases
1492         case 'r':
1493                 if (rules & Rnolabel)
1494                         fputs(" (labels not permitted)", stderr);
1495                 break;
1496
1497 ###### core functions
1498
1499         static struct type *propagate_types(struct exec *prog, struct parse_context *c, int *ok,
1500                                             struct type *type, int rules);
1501         static struct type *__propagate_types(struct exec *prog, struct parse_context *c, int *ok,
1502                                               struct type *type, int rules)
1503         {
1504                 struct type *t;
1505
1506                 if (!prog)
1507                         return Tnone;
1508
1509                 switch (prog->type) {
1510                 case Xbinode:
1511                 {
1512                         struct binode *b = cast(binode, prog);
1513                         switch (b->op) {
1514                         ## propagate binode cases
1515                         }
1516                         break;
1517                 }
1518                 ## propagate exec cases
1519                 }
1520                 return Tnone;
1521         }
1522
1523         static struct type *propagate_types(struct exec *prog, struct parse_context *c, int *ok,
1524                                             struct type *type, int rules)
1525         {
1526                 struct type *ret = __propagate_types(prog, c, ok, type, rules);
1527
1528                 if (c->parse_error)
1529                         *ok = 0;
1530                 return ret;
1531         }
1532
1533 #### Interpreting
1534
1535 Interpreting an `exec` doesn't require anything but the `exec`.  State
1536 is stored in variables and each variable will be directly linked from
1537 within the `exec` tree.  The exception to this is the `main` function
1538 which needs to look at command line arguments.  This function will be
1539 interpreted separately.
1540
1541 Each `exec` can return a value combined with a type in `struct lrval`.
1542 The type may be `Tnone` but must be non-NULL.  Some `exec`s will return
1543 the location of a value, which can be updated, in `lval`.  Others will
1544 set `lval` to NULL indicating that there is a value of appropriate type
1545 in `rval`.
1546
1547 ###### core functions
1548
1549         struct lrval {
1550                 struct type *type;
1551                 struct value rval, *lval;
1552         };
1553
1554         static struct lrval _interp_exec(struct parse_context *c, struct exec *e);
1555
1556         static struct value interp_exec(struct parse_context *c, struct exec *e,
1557                                         struct type **typeret)
1558         {
1559                 struct lrval ret = _interp_exec(c, e);
1560
1561                 if (!ret.type) abort();
1562                 if (typeret)
1563                         *typeret = ret.type;
1564                 if (ret.lval)
1565                         dup_value(ret.type, ret.lval, &ret.rval);
1566                 return ret.rval;
1567         }
1568
1569         static struct value *linterp_exec(struct parse_context *c, struct exec *e,
1570                                           struct type **typeret)
1571         {
1572                 struct lrval ret = _interp_exec(c, e);
1573
1574                 if (ret.lval)
1575                         *typeret = ret.type;
1576                 else
1577                         free_value(ret.type, &ret.rval);
1578                 return ret.lval;
1579         }
1580
1581         static struct lrval _interp_exec(struct parse_context *c, struct exec *e)
1582         {
1583                 struct lrval ret;
1584                 struct value rv = {}, *lrv = NULL;
1585                 struct type *rvtype;
1586
1587                 rvtype = ret.type = Tnone;
1588                 if (!e) {
1589                         ret.lval = lrv; // UNTESTED
1590                         ret.rval = rv;  // UNTESTED
1591                         return ret;     // UNTESTED
1592                 }
1593
1594                 switch(e->type) {
1595                 case Xbinode:
1596                 {
1597                         struct binode *b = cast(binode, e);
1598                         struct value left, right, *lleft;
1599                         struct type *ltype, *rtype;
1600                         ltype = rtype = Tnone;
1601                         switch (b->op) {
1602                         ## interp binode cases
1603                         }
1604                         free_value(ltype, &left);
1605                         free_value(rtype, &right);
1606                         break;
1607                 }
1608                 ## interp exec cases
1609                 }
1610                 ret.lval = lrv;
1611                 ret.rval = rv;
1612                 ret.type = rvtype;
1613                 return ret;
1614         }
1615
1616 ### Complex types
1617
1618 Now that we have the shape of the interpreter in place we can add some
1619 complex types and connected them in to the data structures and the
1620 different phases of parse, analyse, print, interpret.
1621
1622 Thus far we have arrays and structs.
1623
1624 #### Arrays
1625
1626 Arrays can be declared by giving a size and a type, as `[size]type' so
1627 `freq:[26]number` declares `freq` to be an array of 26 numbers.  The
1628 size can be either a literal number, or a named constant.  Some day an
1629 arbitrary expression will be supported.
1630
1631 As a formal parameter to a function, the array can be declared with a
1632 new variable as the size: `name:[size::number]string`.  The `size`
1633 variable is set to the size of the array and must be a constant.  As
1634 `number` is the only supported type, it can be left out:
1635 `name:[size::]string`.
1636
1637 Arrays cannot be assigned.  When pointers are introduced we will also
1638 introduce array slices which can refer to part or all of an array -
1639 the assignment syntax will create a slice.  For now, an array can only
1640 ever be referenced by the name it is declared with.  It is likely that
1641 a "`copy`" primitive will eventually be define which can be used to
1642 make a copy of an array with controllable recursive depth.
1643
1644 For now we have two sorts of array, those with fixed size either because
1645 it is given as a literal number or because it is a struct member (which
1646 cannot have a runtime-changing size), and those with a size that is
1647 determined at runtime - local variables with a const size.  The former
1648 have their size calculated at parse time, the latter at run time.
1649
1650 For the latter type, the `size` field of the type is the size of a
1651 pointer, and the array is reallocated every time it comes into scope.
1652
1653 We differentiate struct fields with a const size from local variables
1654 with a const size by whether they are prepared at parse time or not.
1655
1656 ###### type union fields
1657
1658         struct {
1659                 int unspec;     // size is unspecified - vsize must be set.
1660                 short size;
1661                 short static_size;
1662                 struct variable *vsize;
1663                 struct type *member;
1664         } array;
1665
1666 ###### value union fields
1667         void *array;  // used if not static_size
1668
1669 ###### value functions
1670
1671         static void array_prepare_type(struct parse_context *c, struct type *type,
1672                                        int parse_time)
1673         {
1674                 struct value *vsize;
1675                 mpz_t q;
1676                 if (!type->array.vsize || type->array.static_size)
1677                         return;
1678
1679                 vsize = var_value(c, type->array.vsize);
1680                 mpz_init(q);
1681                 mpz_tdiv_q(q, mpq_numref(vsize->num), mpq_denref(vsize->num));
1682                 type->array.size = mpz_get_si(q);
1683                 mpz_clear(q);
1684
1685                 if (parse_time) {
1686                         type->array.static_size = 1;
1687                         type->size = type->array.size * type->array.member->size;
1688                         type->align = type->array.member->align;
1689                 }
1690         }
1691
1692         static void array_init(struct type *type, struct value *val)
1693         {
1694                 int i;
1695                 void *ptr = val->ptr;
1696
1697                 if (!val)
1698                         return;                         // NOTEST
1699                 if (!type->array.static_size) {
1700                         val->array = calloc(type->array.size,
1701                                             type->array.member->size);
1702                         ptr = val->array;
1703                 }
1704                 for (i = 0; i < type->array.size; i++) {
1705                         struct value *v;
1706                         v = (void*)ptr + i * type->array.member->size;
1707                         val_init(type->array.member, v);
1708                 }
1709         }
1710
1711         static void array_free(struct type *type, struct value *val)
1712         {
1713                 int i;
1714                 void *ptr = val->ptr;
1715
1716                 if (!type->array.static_size)
1717                         ptr = val->array;
1718                 for (i = 0; i < type->array.size; i++) {
1719                         struct value *v;
1720                         v = (void*)ptr + i * type->array.member->size;
1721                         free_value(type->array.member, v);
1722                 }
1723                 if (!type->array.static_size)
1724                         free(ptr);
1725         }
1726
1727         static int array_compat(struct type *require, struct type *have)
1728         {
1729                 if (have->compat != require->compat)
1730                         return 0;       // UNTESTED
1731                 /* Both are arrays, so we can look at details */
1732                 if (!type_compat(require->array.member, have->array.member, 0))
1733                         return 0;
1734                 if (have->array.unspec && require->array.unspec) {
1735                         if (have->array.vsize && require->array.vsize &&
1736                             have->array.vsize != require->array.vsize)  // UNTESTED
1737                                 /* sizes might not be the same */
1738                                 return 0;       // UNTESTED
1739                         return 1;
1740                 }
1741                 if (have->array.unspec || require->array.unspec)
1742                         return 1;       // UNTESTED
1743                 if (require->array.vsize == NULL && have->array.vsize == NULL)
1744                         return require->array.size == have->array.size;
1745
1746                 return require->array.vsize == have->array.vsize;       // UNTESTED
1747         }
1748
1749         static void array_print_type(struct type *type, FILE *f)
1750         {
1751                 fputs("[", f);
1752                 if (type->array.vsize) {
1753                         struct binding *b = type->array.vsize->name;
1754                         fprintf(f, "%.*s%s]", b->name.len, b->name.txt,
1755                                 type->array.unspec ? "::" : "");
1756                 } else
1757                         fprintf(f, "%d]", type->array.size);
1758                 type_print(type->array.member, f);
1759         }
1760
1761         static struct type array_prototype = {
1762                 .init = array_init,
1763                 .prepare_type = array_prepare_type,
1764                 .print_type = array_print_type,
1765                 .compat = array_compat,
1766                 .free = array_free,
1767                 .size = sizeof(void*),
1768                 .align = sizeof(void*),
1769         };
1770
1771 ###### declare terminals
1772         $TERM [ ]
1773
1774 ###### type grammar
1775
1776         | [ NUMBER ] Type ${ {
1777                 char tail[3];
1778                 mpq_t num;
1779                 struct text noname = { "", 0 };
1780                 struct type *t;
1781
1782                 $0 = t = add_type(c, noname, &array_prototype);
1783                 t->array.member = $<4;
1784                 t->array.vsize = NULL;
1785                 if (number_parse(num, tail, $2.txt) == 0)
1786                         tok_err(c, "error: unrecognised number", &$2);
1787                 else if (tail[0])
1788                         tok_err(c, "error: unsupported number suffix", &$2);
1789                 else {
1790                         t->array.size = mpz_get_ui(mpq_numref(num));
1791                         if (mpz_cmp_ui(mpq_denref(num), 1) != 0) {
1792                                 tok_err(c, "error: array size must be an integer",
1793                                         &$2);
1794                         } else if (mpz_cmp_ui(mpq_numref(num), 1UL << 30) >= 0)
1795                                 tok_err(c, "error: array size is too large",
1796                                         &$2);
1797                         mpq_clear(num);
1798                 }
1799                 t->array.static_size = 1;
1800                 t->size = t->array.size * t->array.member->size;
1801                 t->align = t->array.member->align;
1802         } }$
1803
1804         | [ IDENTIFIER ] Type ${ {
1805                 struct variable *v = var_ref(c, $2.txt);
1806                 struct text noname = { "", 0 };
1807
1808                 if (!v)
1809                         tok_err(c, "error: name undeclared", &$2);
1810                 else if (!v->constant)
1811                         tok_err(c, "error: array size must be a constant", &$2);
1812
1813                 $0 = add_type(c, noname, &array_prototype);
1814                 $0->array.member = $<4;
1815                 $0->array.size = 0;
1816                 $0->array.vsize = v;
1817         } }$
1818
1819 ###### Grammar
1820         $*type
1821         OptType -> Type ${ $0 = $<1; }$
1822                 | ${ $0 = NULL; }$
1823
1824 ###### formal type grammar
1825
1826         | [ IDENTIFIER :: OptType ] Type ${ {
1827                 struct variable *v = var_decl(c, $ID.txt);
1828                 struct text noname = { "", 0 };
1829
1830                 v->type = $<OT;
1831                 v->constant = 1;
1832                 if (!v->type)
1833                         v->type = Tnum;
1834                 $0 = add_type(c, noname, &array_prototype);
1835                 $0->array.member = $<6;
1836                 $0->array.size = 0;
1837                 $0->array.unspec = 1;
1838                 $0->array.vsize = v;
1839         } }$
1840
1841 ###### Binode types
1842         Index,
1843
1844 ###### variable grammar
1845
1846         | Variable [ Expression ] ${ {
1847                 struct binode *b = new(binode);
1848                 b->op = Index;
1849                 b->left = $<1;
1850                 b->right = $<3;
1851                 $0 = b;
1852         } }$
1853
1854 ###### print binode cases
1855         case Index:
1856                 print_exec(b->left, -1, bracket);
1857                 printf("[");
1858                 print_exec(b->right, -1, bracket);
1859                 printf("]");
1860                 break;
1861
1862 ###### propagate binode cases
1863         case Index:
1864                 /* left must be an array, right must be a number,
1865                  * result is the member type of the array
1866                  */
1867                 propagate_types(b->right, c, ok, Tnum, 0);
1868                 t = propagate_types(b->left, c, ok, NULL, rules & Rnoconstant);
1869                 if (!t || t->compat != array_compat) {
1870                         type_err(c, "error: %1 cannot be indexed", prog, t, 0, NULL);
1871                         return NULL;
1872                 } else {
1873                         if (!type_compat(type, t->array.member, rules)) {
1874                                 type_err(c, "error: have %1 but need %2", prog,
1875                                          t->array.member, rules, type);
1876                         }
1877                         return t->array.member;
1878                 }
1879                 break;
1880
1881 ###### interp binode cases
1882         case Index: {
1883                 mpz_t q;
1884                 long i;
1885                 void *ptr;
1886
1887                 lleft = linterp_exec(c, b->left, &ltype);
1888                 right = interp_exec(c, b->right, &rtype);
1889                 mpz_init(q);
1890                 mpz_tdiv_q(q, mpq_numref(right.num), mpq_denref(right.num));
1891                 i = mpz_get_si(q);
1892                 mpz_clear(q);
1893
1894                 if (ltype->array.static_size)
1895                         ptr = lleft;
1896                 else
1897                         ptr = *(void**)lleft;
1898                 rvtype = ltype->array.member;
1899                 if (i >= 0 && i < ltype->array.size)
1900                         lrv = ptr + i * rvtype->size;
1901                 else
1902                         val_init(ltype->array.member, &rv);
1903                 ltype = NULL;
1904                 break;
1905         }
1906
1907 #### Structs
1908
1909 A `struct` is a data-type that contains one or more other data-types.
1910 It differs from an array in that each member can be of a different
1911 type, and they are accessed by name rather than by number.  Thus you
1912 cannot choose an element by calculation, you need to know what you
1913 want up-front.
1914
1915 The language makes no promises about how a given structure will be
1916 stored in memory - it is free to rearrange fields to suit whatever
1917 criteria seems important.
1918
1919 Structs are declared separately from program code - they cannot be
1920 declared in-line in a variable declaration like arrays can.  A struct
1921 is given a name and this name is used to identify the type - the name
1922 is not prefixed by the word `struct` as it would be in C.
1923
1924 Structs are only treated as the same if they have the same name.
1925 Simply having the same fields in the same order is not enough.  This
1926 might change once we can create structure initializers from a list of
1927 values.
1928
1929 Each component datum is identified much like a variable is declared,
1930 with a name, one or two colons, and a type.  The type cannot be omitted
1931 as there is no opportunity to deduce the type from usage.  An initial
1932 value can be given following an equals sign, so
1933
1934 ##### Example: a struct type
1935
1936         struct complex:
1937                 x:number = 0
1938                 y:number = 0
1939
1940 would declare a type called "complex" which has two number fields,
1941 each initialised to zero.
1942
1943 Struct will need to be declared separately from the code that uses
1944 them, so we will need to be able to print out the declaration of a
1945 struct when reprinting the whole program.  So a `print_type_decl` type
1946 function will be needed.
1947
1948 ###### type union fields
1949
1950         struct {
1951                 int nfields;
1952                 struct field {
1953                         struct text name;
1954                         struct type *type;
1955                         struct value *init;
1956                         int offset;
1957                 } *fields;
1958         } structure;
1959
1960 ###### type functions
1961         void (*print_type_decl)(struct type *type, FILE *f);
1962
1963 ###### value functions
1964
1965         static void structure_init(struct type *type, struct value *val)
1966         {
1967                 int i;
1968
1969                 for (i = 0; i < type->structure.nfields; i++) {
1970                         struct value *v;
1971                         v = (void*) val->ptr + type->structure.fields[i].offset;
1972                         if (type->structure.fields[i].init)
1973                                 dup_value(type->structure.fields[i].type,
1974                                           type->structure.fields[i].init,
1975                                           v);
1976                         else
1977                                 val_init(type->structure.fields[i].type, v);
1978                 }
1979         }
1980
1981         static void structure_free(struct type *type, struct value *val)
1982         {
1983                 int i;
1984
1985                 for (i = 0; i < type->structure.nfields; i++) {
1986                         struct value *v;
1987                         v = (void*)val->ptr + type->structure.fields[i].offset;
1988                         free_value(type->structure.fields[i].type, v);
1989                 }
1990         }
1991
1992         static void structure_free_type(struct type *t)
1993         {
1994                 int i;
1995                 for (i = 0; i < t->structure.nfields; i++)
1996                         if (t->structure.fields[i].init) {
1997                                 free_value(t->structure.fields[i].type,
1998                                            t->structure.fields[i].init);
1999                         }
2000                 free(t->structure.fields);
2001         }
2002
2003         static struct type structure_prototype = {
2004                 .init = structure_init,
2005                 .free = structure_free,
2006                 .free_type = structure_free_type,
2007                 .print_type_decl = structure_print_type,
2008         };
2009
2010 ###### exec type
2011         Xfieldref,
2012
2013 ###### ast
2014         struct fieldref {
2015                 struct exec;
2016                 struct exec *left;
2017                 int index;
2018                 struct text name;
2019         };
2020
2021 ###### free exec cases
2022         case Xfieldref:
2023                 free_exec(cast(fieldref, e)->left);
2024                 free(e);
2025                 break;
2026
2027 ###### declare terminals
2028         $TERM struct .
2029
2030 ###### variable grammar
2031
2032         | Variable . IDENTIFIER ${ {
2033                 struct fieldref *fr = new_pos(fieldref, $2);
2034                 fr->left = $<1;
2035                 fr->name = $3.txt;
2036                 fr->index = -2;
2037                 $0 = fr;
2038         } }$
2039
2040 ###### print exec cases
2041
2042         case Xfieldref:
2043         {
2044                 struct fieldref *f = cast(fieldref, e);
2045                 print_exec(f->left, -1, bracket);
2046                 printf(".%.*s", f->name.len, f->name.txt);
2047                 break;
2048         }
2049
2050 ###### ast functions
2051         static int find_struct_index(struct type *type, struct text field)
2052         {
2053                 int i;
2054                 for (i = 0; i < type->structure.nfields; i++)
2055                         if (text_cmp(type->structure.fields[i].name, field) == 0)
2056                                 return i;
2057                 return -1;
2058         }
2059
2060 ###### propagate exec cases
2061
2062         case Xfieldref:
2063         {
2064                 struct fieldref *f = cast(fieldref, prog);
2065                 struct type *st = propagate_types(f->left, c, ok, NULL, 0);
2066
2067                 if (!st)
2068                         type_err(c, "error: unknown type for field access", f->left,    // UNTESTED
2069                                  NULL, 0, NULL);
2070                 else if (st->init != structure_init)
2071                         type_err(c, "error: field reference attempted on %1, not a struct",
2072                                  f->left, st, 0, NULL);
2073                 else if (f->index == -2) {
2074                         f->index = find_struct_index(st, f->name);
2075                         if (f->index < 0)
2076                                 type_err(c, "error: cannot find requested field in %1",
2077                                          f->left, st, 0, NULL);
2078                 }
2079                 if (f->index >= 0) {
2080                         struct type *ft = st->structure.fields[f->index].type;
2081                         if (!type_compat(type, ft, rules))
2082                                 type_err(c, "error: have %1 but need %2", prog,
2083                                          ft, rules, type);
2084                         return ft;
2085                 }
2086                 break;
2087         }
2088
2089 ###### interp exec cases
2090         case Xfieldref:
2091         {
2092                 struct fieldref *f = cast(fieldref, e);
2093                 struct type *ltype;
2094                 struct value *lleft = linterp_exec(c, f->left, &ltype);
2095                 lrv = (void*)lleft->ptr + ltype->structure.fields[f->index].offset;
2096                 rvtype = ltype->structure.fields[f->index].type;
2097                 break;
2098         }
2099
2100 ###### ast
2101         struct fieldlist {
2102                 struct fieldlist *prev;
2103                 struct field f;
2104         };
2105
2106 ###### ast functions
2107         static void free_fieldlist(struct fieldlist *f)
2108         {
2109                 if (!f)
2110                         return;
2111                 free_fieldlist(f->prev);
2112                 if (f->f.init) {
2113                         free_value(f->f.type, f->f.init);       // UNTESTED
2114                         free(f->f.init);        // UNTESTED
2115                 }
2116                 free(f);
2117         }
2118
2119 ###### top level grammar
2120         DeclareStruct -> struct IDENTIFIER FieldBlock Newlines ${ {
2121                         struct type *t =
2122                                 add_type(c, $2.txt, &structure_prototype);
2123                         int cnt = 0;
2124                         struct fieldlist *f;
2125
2126                         for (f = $3; f; f=f->prev)
2127                                 cnt += 1;
2128
2129                         t->structure.nfields = cnt;
2130                         t->structure.fields = calloc(cnt, sizeof(struct field));
2131                         f = $3;
2132                         while (cnt > 0) {
2133                                 int a = f->f.type->align;
2134                                 cnt -= 1;
2135                                 t->structure.fields[cnt] = f->f;
2136                                 if (t->size & (a-1))
2137                                         t->size = (t->size | (a-1)) + 1;
2138                                 t->structure.fields[cnt].offset = t->size;
2139                                 t->size += ((f->f.type->size - 1) | (a-1)) + 1;
2140                                 if (a > t->align)
2141                                         t->align = a;
2142                                 f->f.init = NULL;
2143                                 f = f->prev;
2144                         }
2145                 } }$
2146
2147         $*fieldlist
2148         FieldBlock -> { IN OptNL FieldLines OUT OptNL } ${ $0 = $<FL; }$
2149                 | { SimpleFieldList } ${ $0 = $<SFL; }$
2150                 | IN OptNL FieldLines OUT ${ $0 = $<FL; }$
2151                 | SimpleFieldList EOL ${ $0 = $<SFL; }$
2152
2153         FieldLines -> SimpleFieldList Newlines ${ $0 = $<SFL; }$
2154                 | FieldLines SimpleFieldList Newlines ${
2155                         $SFL->prev = $<FL;
2156                         $0 = $<SFL;
2157                 }$
2158
2159         SimpleFieldList -> Field ${ $0 = $<F; }$
2160                 | SimpleFieldList ; Field ${
2161                         $F->prev = $<SFL;
2162                         $0 = $<F;
2163                 }$
2164                 | SimpleFieldList ; ${
2165                         $0 = $<SFL;
2166                 }$
2167                 | ERROR ${ tok_err(c, "Syntax error in struct field", &$1); }$
2168
2169         Field -> IDENTIFIER : Type = Expression ${ {
2170                         int ok; // UNTESTED
2171
2172                         $0 = calloc(1, sizeof(struct fieldlist));
2173                         $0->f.name = $1.txt;
2174                         $0->f.type = $<3;
2175                         $0->f.init = NULL;
2176                         do {
2177                                 ok = 1;
2178                                 propagate_types($<5, c, &ok, $3, 0);
2179                         } while (ok == 2);
2180                         if (!ok)
2181                                 c->parse_error = 1;     // UNTESTED
2182                         else {
2183                                 struct value vl = interp_exec(c, $5, NULL);
2184                                 $0->f.init = global_alloc(c, $0->f.type, NULL, &vl);
2185                         }
2186                 } }$
2187                 | IDENTIFIER : Type ${
2188                         $0 = calloc(1, sizeof(struct fieldlist));
2189                         $0->f.name = $1.txt;
2190                         $0->f.type = $<3;
2191                         if ($0->f.type->prepare_type)
2192                                 $0->f.type->prepare_type(c, $0->f.type, 1);
2193                 }$
2194
2195 ###### forward decls
2196         static void structure_print_type(struct type *t, FILE *f);
2197
2198 ###### value functions
2199         static void structure_print_type(struct type *t, FILE *f)       // UNTESTED
2200         {       // UNTESTED
2201                 int i;  // UNTESTED
2202
2203                 fprintf(f, "struct %.*s\n", t->name.len, t->name.txt);
2204
2205                 for (i = 0; i < t->structure.nfields; i++) {
2206                         struct field *fl = t->structure.fields + i;
2207                         fprintf(f, "    %.*s : ", fl->name.len, fl->name.txt);
2208                         type_print(fl->type, f);
2209                         if (fl->type->print && fl->init) {
2210                                 fprintf(f, " = ");
2211                                 if (fl->type == Tstr)
2212                                         fprintf(f, "\"");       // UNTESTED
2213                                 print_value(fl->type, fl->init);
2214                                 if (fl->type == Tstr)
2215                                         fprintf(f, "\"");       // UNTESTED
2216                         }
2217                         printf("\n");
2218                 }
2219         }
2220
2221 ###### print type decls
2222         {       // UNTESTED
2223                 struct type *t; // UNTESTED
2224                 int target = -1;
2225
2226                 while (target != 0) {
2227                         int i = 0;
2228                         for (t = context.typelist; t ; t=t->next)
2229                                 if (t->print_type_decl) {
2230                                         i += 1;
2231                                         if (i == target)
2232                                                 break;
2233                                 }
2234
2235                         if (target == -1) {
2236                                 target = i;
2237                         } else {
2238                                 t->print_type_decl(t, stdout);
2239                                 target -= 1;
2240                         }
2241                 }
2242         }
2243
2244 ### Functions
2245
2246 A function is a named chunk of code which can be passed parameters and
2247 can return results.  Each function has an implicit type which includes
2248 the set of parameters and the return value.  As yet these types cannot
2249 be declared separate from the function itself.
2250
2251 In fact, only one function is currently possible - `main`.  `main` is
2252 passed an array of strings together with the size of the array, and
2253 doesn't return anything.  The strings are command line arguments.
2254
2255 The parameters can be specified either in parentheses as a list, such as
2256
2257 ##### Example: function 1
2258
2259         func main(av:[ac::number]string)
2260                 code block
2261
2262 or as an indented list of one parameter per line
2263
2264 ##### Example: function 2
2265
2266         func main
2267                 argv:[argc::number]string
2268         do
2269                 code block
2270
2271 ###### Binode types
2272         Func, List,
2273
2274 ###### Grammar
2275
2276         $TERM func main
2277
2278         $*binode
2279         MainFunction -> func main ( OpenScope Args ) Block Newlines ${
2280                         $0 = new(binode);
2281                         $0->op = Func;
2282                         $0->left = reorder_bilist($<Ar);
2283                         $0->right = $<Bl;
2284                         var_block_close(c, CloseSequential);
2285                         if (c->scope_stack && !c->parse_error) abort();
2286                 }$
2287                 | func main IN OpenScope OptNL Args OUT OptNL do Block Newlines ${
2288                         $0 = new(binode);
2289                         $0->op = Func;
2290                         $0->left = reorder_bilist($<Ar);
2291                         $0->right = $<Bl;
2292                         var_block_close(c, CloseSequential);
2293                         if (c->scope_stack && !c->parse_error) abort();
2294                 }$
2295                 | func main NEWLINE OpenScope OptNL do Block Newlines ${
2296                         $0 = new(binode);
2297                         $0->op = Func;
2298                         $0->left = NULL;
2299                         $0->right = $<Bl;
2300                         var_block_close(c, CloseSequential);
2301                         if (c->scope_stack && !c->parse_error) abort();
2302                 }$
2303
2304         Args -> ${ $0 = NULL; }$
2305                 | Varlist ${ $0 = $<1; }$
2306                 | Varlist ; ${ $0 = $<1; }$
2307                 | Varlist NEWLINE ${ $0 = $<1; }$
2308
2309         Varlist -> Varlist ; ArgDecl ${ // UNTESTED
2310                         $0 = new(binode);
2311                         $0->op = List;
2312                         $0->left = $<Vl;
2313                         $0->right = $<AD;
2314                 }$
2315                 | ArgDecl ${
2316                         $0 = new(binode);
2317                         $0->op = List;
2318                         $0->left = NULL;
2319                         $0->right = $<AD;
2320                 }$
2321
2322         $*var
2323         ArgDecl -> IDENTIFIER : FormalType ${ {
2324                 struct variable *v = var_decl(c, $1.txt);
2325                 $0 = new(var);
2326                 $0->var = v;
2327                 v->type = $<FT;
2328         } }$
2329
2330 ## Executables: the elements of code
2331
2332 Each code element needs to be parsed, printed, analysed,
2333 interpreted, and freed.  There are several, so let's just start with
2334 the easy ones and work our way up.
2335
2336 ### Values
2337
2338 We have already met values as separate objects.  When manifest
2339 constants appear in the program text, that must result in an executable
2340 which has a constant value.  So the `val` structure embeds a value in
2341 an executable.
2342
2343 ###### exec type
2344         Xval,
2345
2346 ###### ast
2347         struct val {
2348                 struct exec;
2349                 struct type *vtype;
2350                 struct value val;
2351         };
2352
2353 ###### ast functions
2354         struct val *new_val(struct type *T, struct token tk)
2355         {
2356                 struct val *v = new_pos(val, tk);
2357                 v->vtype = T;
2358                 return v;
2359         }
2360
2361 ###### Grammar
2362
2363         $TERM True False
2364
2365         $*val
2366         Value ->  True ${
2367                         $0 = new_val(Tbool, $1);
2368                         $0->val.bool = 1;
2369                         }$
2370                 | False ${
2371                         $0 = new_val(Tbool, $1);
2372                         $0->val.bool = 0;
2373                         }$
2374                 | NUMBER ${
2375                         $0 = new_val(Tnum, $1);
2376                         {
2377                         char tail[3];
2378                         if (number_parse($0->val.num, tail, $1.txt) == 0)
2379                                 mpq_init($0->val.num);  // UNTESTED
2380                                 if (tail[0])
2381                                         tok_err(c, "error: unsupported number suffix",
2382                                                 &$1);
2383                         }
2384                         }$
2385                 | STRING ${
2386                         $0 = new_val(Tstr, $1);
2387                         {
2388                         char tail[3];
2389                         string_parse(&$1, '\\', &$0->val.str, tail);
2390                         if (tail[0])
2391                                 tok_err(c, "error: unsupported string suffix",
2392                                         &$1);
2393                         }
2394                         }$
2395                 | MULTI_STRING ${
2396                         $0 = new_val(Tstr, $1);
2397                         {
2398                         char tail[3];
2399                         string_parse(&$1, '\\', &$0->val.str, tail);
2400                         if (tail[0])
2401                                 tok_err(c, "error: unsupported string suffix",
2402                                         &$1);
2403                         }
2404                         }$
2405
2406 ###### print exec cases
2407         case Xval:
2408         {
2409                 struct val *v = cast(val, e);
2410                 if (v->vtype == Tstr)
2411                         printf("\"");
2412                 print_value(v->vtype, &v->val);
2413                 if (v->vtype == Tstr)
2414                         printf("\"");
2415                 break;
2416         }
2417
2418 ###### propagate exec cases
2419         case Xval:
2420         {
2421                 struct val *val = cast(val, prog);
2422                 if (!type_compat(type, val->vtype, rules))
2423                         type_err(c, "error: expected %1%r found %2",
2424                                    prog, type, rules, val->vtype);
2425                 return val->vtype;
2426         }
2427
2428 ###### interp exec cases
2429         case Xval:
2430                 rvtype = cast(val, e)->vtype;
2431                 dup_value(rvtype, &cast(val, e)->val, &rv);
2432                 break;
2433
2434 ###### ast functions
2435         static void free_val(struct val *v)
2436         {
2437                 if (v)
2438                         free_value(v->vtype, &v->val);
2439                 free(v);
2440         }
2441
2442 ###### free exec cases
2443         case Xval: free_val(cast(val, e)); break;
2444
2445 ###### ast functions
2446         // Move all nodes from 'b' to 'rv', reversing their order.
2447         // In 'b' 'left' is a list, and 'right' is the last node.
2448         // In 'rv', left' is the first node and 'right' is a list.
2449         static struct binode *reorder_bilist(struct binode *b)
2450         {
2451                 struct binode *rv = NULL;
2452
2453                 while (b) {
2454                         struct exec *t = b->right;
2455                         b->right = rv;
2456                         rv = b;
2457                         if (b->left)
2458                                 b = cast(binode, b->left);
2459                         else
2460                                 b = NULL;
2461                         rv->left = t;
2462                 }
2463                 return rv;
2464         }
2465
2466 ### Variables
2467
2468 Just as we used a `val` to wrap a value into an `exec`, we similarly
2469 need a `var` to wrap a `variable` into an exec.  While each `val`
2470 contained a copy of the value, each `var` holds a link to the variable
2471 because it really is the same variable no matter where it appears.
2472 When a variable is used, we need to remember to follow the `->merged`
2473 link to find the primary instance.
2474
2475 ###### exec type
2476         Xvar,
2477
2478 ###### ast
2479         struct var {
2480                 struct exec;
2481                 struct variable *var;
2482         };
2483
2484 ###### Grammar
2485
2486         $TERM : ::
2487
2488         $*var
2489         VariableDecl -> IDENTIFIER : ${ {
2490                 struct variable *v = var_decl(c, $1.txt);
2491                 $0 = new_pos(var, $1);
2492                 $0->var = v;
2493                 if (v)
2494                         v->where_decl = $0;
2495                 else {
2496                         v = var_ref(c, $1.txt);
2497                         $0->var = v;
2498                         type_err(c, "error: variable '%v' redeclared",
2499                                  $0, NULL, 0, NULL);
2500                         type_err(c, "info: this is where '%v' was first declared",
2501                                  v->where_decl, NULL, 0, NULL);
2502                 }
2503         } }$
2504             | IDENTIFIER :: ${ {
2505                 struct variable *v = var_decl(c, $1.txt);
2506                 $0 = new_pos(var, $1);
2507                 $0->var = v;
2508                 if (v) {
2509                         v->where_decl = $0;
2510                         v->constant = 1;
2511                 } else {
2512                         v = var_ref(c, $1.txt);
2513                         $0->var = v;
2514                         type_err(c, "error: variable '%v' redeclared",
2515                                  $0, NULL, 0, NULL);
2516                         type_err(c, "info: this is where '%v' was first declared",
2517                                  v->where_decl, NULL, 0, NULL);
2518                 }
2519         } }$
2520             | IDENTIFIER : Type ${ {
2521                 struct variable *v = var_decl(c, $1.txt);
2522                 $0 = new_pos(var, $1);
2523                 $0->var = v;
2524                 if (v) {
2525                         v->where_decl = $0;
2526                         v->where_set = $0;
2527                         v->type = $<Type;
2528                 } else {
2529                         v = var_ref(c, $1.txt);
2530                         $0->var = v;
2531                         type_err(c, "error: variable '%v' redeclared",
2532                                  $0, NULL, 0, NULL);
2533                         type_err(c, "info: this is where '%v' was first declared",
2534                                  v->where_decl, NULL, 0, NULL);
2535                 }
2536         } }$
2537             | IDENTIFIER :: Type ${ {
2538                 struct variable *v = var_decl(c, $1.txt);
2539                 $0 = new_pos(var, $1);
2540                 $0->var = v;
2541                 if (v) {
2542                         v->where_decl = $0;
2543                         v->where_set = $0;
2544                         v->type = $<Type;
2545                         v->constant = 1;
2546                 } else {
2547                         v = var_ref(c, $1.txt);
2548                         $0->var = v;
2549                         type_err(c, "error: variable '%v' redeclared",
2550                                  $0, NULL, 0, NULL);
2551                         type_err(c, "info: this is where '%v' was first declared",
2552                                  v->where_decl, NULL, 0, NULL);
2553                 }
2554         } }$
2555
2556         $*exec
2557         Variable -> IDENTIFIER ${ {
2558                 struct variable *v = var_ref(c, $1.txt);
2559                 $0 = new_pos(var, $1);
2560                 if (v == NULL) {
2561                         /* This might be a label - allocate a var just in case */
2562                         v = var_decl(c, $1.txt);
2563                         if (v) {
2564                                 v->type = Tnone;
2565                                 v->where_decl = $0;
2566                                 v->where_set = $0;
2567                         }
2568                 }
2569                 cast(var, $0)->var = v;
2570         } }$
2571         ## variable grammar
2572
2573 ###### print exec cases
2574         case Xvar:
2575         {
2576                 struct var *v = cast(var, e);
2577                 if (v->var) {
2578                         struct binding *b = v->var->name;
2579                         printf("%.*s", b->name.len, b->name.txt);
2580                 }
2581                 break;
2582         }
2583
2584 ###### format cases
2585         case 'v':
2586                 if (loc && loc->type == Xvar) {
2587                         struct var *v = cast(var, loc);
2588                         if (v->var) {
2589                                 struct binding *b = v->var->name;
2590                                 fprintf(stderr, "%.*s", b->name.len, b->name.txt);
2591                         } else
2592                                 fputs("???", stderr);   // NOTEST
2593                 } else
2594                         fputs("NOTVAR", stderr);        // NOTEST
2595                 break;
2596
2597 ###### propagate exec cases
2598
2599         case Xvar:
2600         {
2601                 struct var *var = cast(var, prog);
2602                 struct variable *v = var->var;
2603                 if (!v) {
2604                         type_err(c, "%d:BUG: no variable!!", prog, NULL, 0, NULL); // NOTEST
2605                         return Tnone;                                   // NOTEST
2606                 }
2607                 if (v->merged)
2608                         v = v->merged;
2609                 if (v->constant && (rules & Rnoconstant)) {
2610                         type_err(c, "error: Cannot assign to a constant: %v",
2611                                  prog, NULL, 0, NULL);
2612                         type_err(c, "info: name was defined as a constant here",
2613                                  v->where_decl, NULL, 0, NULL);
2614                         return v->type;
2615                 }
2616                 if (v->type == Tnone && v->where_decl == prog)
2617                         type_err(c, "error: variable used but not declared: %v",
2618                                  prog, NULL, 0, NULL);
2619                 if (v->type == NULL) {
2620                         if (type && *ok != 0) {
2621                                 v->type = type;
2622                                 v->where_set = prog;
2623                                 *ok = 2;
2624                         }
2625                         return type;
2626                 }
2627                 if (!type_compat(type, v->type, rules)) {
2628                         type_err(c, "error: expected %1%r but variable '%v' is %2", prog,
2629                                  type, rules, v->type);
2630                         type_err(c, "info: this is where '%v' was set to %1", v->where_set,
2631                                  v->type, rules, NULL);
2632                 }
2633                 if (!type)
2634                         return v->type;
2635                 return type;
2636         }
2637
2638 ###### interp exec cases
2639         case Xvar:
2640         {
2641                 struct var *var = cast(var, e);
2642                 struct variable *v = var->var;
2643
2644                 if (v->merged)
2645                         v = v->merged;  // UNTESTED
2646                 lrv = var_value(c, v);
2647                 rvtype = v->type;
2648                 break;
2649         }
2650
2651 ###### ast functions
2652
2653         static void free_var(struct var *v)
2654         {
2655                 free(v);
2656         }
2657
2658 ###### free exec cases
2659         case Xvar: free_var(cast(var, e)); break;
2660
2661 ### Expressions: Conditional
2662
2663 Our first user of the `binode` will be conditional expressions, which
2664 is a bit odd as they actually have three components.  That will be
2665 handled by having 2 binodes for each expression.  The conditional
2666 expression is the lowest precedence operator which is why we define it
2667 first - to start the precedence list.
2668
2669 Conditional expressions are of the form "value `if` condition `else`
2670 other_value".  They associate to the right, so everything to the right
2671 of `else` is part of an else value, while only a higher-precedence to
2672 the left of `if` is the if values.  Between `if` and `else` there is no
2673 room for ambiguity, so a full conditional expression is allowed in
2674 there.
2675
2676 ###### Binode types
2677         CondExpr,
2678
2679 ###### Grammar
2680
2681         $LEFT if $$ifelse
2682         ## expr precedence
2683
2684         $*exec
2685         Expression -> Expression if Expression else Expression $$ifelse ${ {
2686                         struct binode *b1 = new(binode);
2687                         struct binode *b2 = new(binode);
2688                         b1->op = CondExpr;
2689                         b1->left = $<3;
2690                         b1->right = b2;
2691                         b2->op = CondExpr;
2692                         b2->left = $<1;
2693                         b2->right = $<5;
2694                         $0 = b1;
2695                 } }$
2696                 ## expression grammar
2697
2698 ###### print binode cases
2699
2700         case CondExpr:
2701                 b2 = cast(binode, b->right);
2702                 if (bracket) printf("(");
2703                 print_exec(b2->left, -1, bracket);
2704                 printf(" if ");
2705                 print_exec(b->left, -1, bracket);
2706                 printf(" else ");
2707                 print_exec(b2->right, -1, bracket);
2708                 if (bracket) printf(")");
2709                 break;
2710
2711 ###### propagate binode cases
2712
2713         case CondExpr: {
2714                 /* cond must be Tbool, others must match */
2715                 struct binode *b2 = cast(binode, b->right);
2716                 struct type *t2;
2717
2718                 propagate_types(b->left, c, ok, Tbool, 0);
2719                 t = propagate_types(b2->left, c, ok, type, Rnolabel);
2720                 t2 = propagate_types(b2->right, c, ok, type ?: t, Rnolabel);
2721                 return t ?: t2;
2722         }
2723
2724 ###### interp binode cases
2725
2726         case CondExpr: {
2727                 struct binode *b2 = cast(binode, b->right);
2728                 left = interp_exec(c, b->left, &ltype);
2729                 if (left.bool)
2730                         rv = interp_exec(c, b2->left, &rvtype); // UNTESTED
2731                 else
2732                         rv = interp_exec(c, b2->right, &rvtype);
2733                 }
2734                 break;
2735
2736 ### Expressions: Boolean
2737
2738 The next class of expressions to use the `binode` will be Boolean
2739 expressions.  "`and then`" and "`or else`" are similar to `and` and `or`
2740 have same corresponding precendence.  The difference is that they don't
2741 evaluate the second expression if not necessary.
2742
2743 ###### Binode types
2744         And,
2745         AndThen,
2746         Or,
2747         OrElse,
2748         Not,
2749
2750 ###### expr precedence
2751         $LEFT or
2752         $LEFT and
2753         $LEFT not
2754
2755 ###### expression grammar
2756                 | Expression or Expression ${ {
2757                         struct binode *b = new(binode);
2758                         b->op = Or;
2759                         b->left = $<1;
2760                         b->right = $<3;
2761                         $0 = b;
2762                 } }$
2763                 | Expression or else Expression ${ {
2764                         struct binode *b = new(binode);
2765                         b->op = OrElse;
2766                         b->left = $<1;
2767                         b->right = $<4;
2768                         $0 = b;
2769                 } }$
2770
2771                 | Expression and Expression ${ {
2772                         struct binode *b = new(binode);
2773                         b->op = And;
2774                         b->left = $<1;
2775                         b->right = $<3;
2776                         $0 = b;
2777                 } }$
2778                 | Expression and then Expression ${ {
2779                         struct binode *b = new(binode);
2780                         b->op = AndThen;
2781                         b->left = $<1;
2782                         b->right = $<4;
2783                         $0 = b;
2784                 } }$
2785
2786                 | not Expression ${ {
2787                         struct binode *b = new(binode);
2788                         b->op = Not;
2789                         b->right = $<2;
2790                         $0 = b;
2791                 } }$
2792
2793 ###### print binode cases
2794         case And:
2795                 if (bracket) printf("(");
2796                 print_exec(b->left, -1, bracket);
2797                 printf(" and ");
2798                 print_exec(b->right, -1, bracket);
2799                 if (bracket) printf(")");
2800                 break;
2801         case AndThen:
2802                 if (bracket) printf("(");
2803                 print_exec(b->left, -1, bracket);
2804                 printf(" and then ");
2805                 print_exec(b->right, -1, bracket);
2806                 if (bracket) printf(")");
2807                 break;
2808         case Or:
2809                 if (bracket) printf("(");
2810                 print_exec(b->left, -1, bracket);
2811                 printf(" or ");
2812                 print_exec(b->right, -1, bracket);
2813                 if (bracket) printf(")");
2814                 break;
2815         case OrElse:
2816                 if (bracket) printf("(");
2817                 print_exec(b->left, -1, bracket);
2818                 printf(" or else ");
2819                 print_exec(b->right, -1, bracket);
2820                 if (bracket) printf(")");
2821                 break;
2822         case Not:
2823                 if (bracket) printf("(");
2824                 printf("not ");
2825                 print_exec(b->right, -1, bracket);
2826                 if (bracket) printf(")");
2827                 break;
2828
2829 ###### propagate binode cases
2830         case And:
2831         case AndThen:
2832         case Or:
2833         case OrElse:
2834         case Not:
2835                 /* both must be Tbool, result is Tbool */
2836                 propagate_types(b->left, c, ok, Tbool, 0);
2837                 propagate_types(b->right, c, ok, Tbool, 0);
2838                 if (type && type != Tbool)
2839                         type_err(c, "error: %1 operation found where %2 expected", prog,
2840                                    Tbool, 0, type);
2841                 return Tbool;
2842
2843 ###### interp binode cases
2844         case And:
2845                 rv = interp_exec(c, b->left, &rvtype);
2846                 right = interp_exec(c, b->right, &rtype);
2847                 rv.bool = rv.bool && right.bool;
2848                 break;
2849         case AndThen:
2850                 rv = interp_exec(c, b->left, &rvtype);
2851                 if (rv.bool)
2852                         rv = interp_exec(c, b->right, NULL);
2853                 break;
2854         case Or:
2855                 rv = interp_exec(c, b->left, &rvtype);
2856                 right = interp_exec(c, b->right, &rtype);
2857                 rv.bool = rv.bool || right.bool;
2858                 break;
2859         case OrElse:
2860                 rv = interp_exec(c, b->left, &rvtype);
2861                 if (!rv.bool)
2862                         rv = interp_exec(c, b->right, NULL);
2863                 break;
2864         case Not:
2865                 rv = interp_exec(c, b->right, &rvtype);
2866                 rv.bool = !rv.bool;
2867                 break;
2868
2869 ### Expressions: Comparison
2870
2871 Of slightly higher precedence that Boolean expressions are Comparisons.
2872 A comparison takes arguments of any comparable type, but the two types
2873 must be the same.
2874
2875 To simplify the parsing we introduce an `eop` which can record an
2876 expression operator, and the `CMPop` non-terminal will match one of them.
2877
2878 ###### ast
2879         struct eop {
2880                 enum Btype op;
2881         };
2882
2883 ###### ast functions
2884         static void free_eop(struct eop *e)
2885         {
2886                 if (e)
2887                         free(e);
2888         }
2889
2890 ###### Binode types
2891         Less,
2892         Gtr,
2893         LessEq,
2894         GtrEq,
2895         Eql,
2896         NEql,
2897
2898 ###### expr precedence
2899         $LEFT < > <= >= == != CMPop
2900
2901 ###### expression grammar
2902         | Expression CMPop Expression ${ {
2903                 struct binode *b = new(binode);
2904                 b->op = $2.op;
2905                 b->left = $<1;
2906                 b->right = $<3;
2907                 $0 = b;
2908         } }$
2909
2910 ###### Grammar
2911
2912         $eop
2913         CMPop ->   < ${ $0.op = Less; }$
2914                 |  > ${ $0.op = Gtr; }$
2915                 |  <= ${ $0.op = LessEq; }$
2916                 |  >= ${ $0.op = GtrEq; }$
2917                 |  == ${ $0.op = Eql; }$
2918                 |  != ${ $0.op = NEql; }$
2919
2920 ###### print binode cases
2921
2922         case Less:
2923         case LessEq:
2924         case Gtr:
2925         case GtrEq:
2926         case Eql:
2927         case NEql:
2928                 if (bracket) printf("(");
2929                 print_exec(b->left, -1, bracket);
2930                 switch(b->op) {
2931                 case Less:   printf(" < "); break;
2932                 case LessEq: printf(" <= "); break;
2933                 case Gtr:    printf(" > "); break;
2934                 case GtrEq:  printf(" >= "); break;
2935                 case Eql:    printf(" == "); break;
2936                 case NEql:   printf(" != "); break;
2937                 default: abort();               // NOTEST
2938                 }
2939                 print_exec(b->right, -1, bracket);
2940                 if (bracket) printf(")");
2941                 break;
2942
2943 ###### propagate binode cases
2944         case Less:
2945         case LessEq:
2946         case Gtr:
2947         case GtrEq:
2948         case Eql:
2949         case NEql:
2950                 /* Both must match but not be labels, result is Tbool */
2951                 t = propagate_types(b->left, c, ok, NULL, Rnolabel);
2952                 if (t)
2953                         propagate_types(b->right, c, ok, t, 0);
2954                 else {
2955                         t = propagate_types(b->right, c, ok, NULL, Rnolabel);   // UNTESTED
2956                         if (t)  // UNTESTED
2957                                 t = propagate_types(b->left, c, ok, t, 0);      // UNTESTED
2958                 }
2959                 if (!type_compat(type, Tbool, 0))
2960                         type_err(c, "error: Comparison returns %1 but %2 expected", prog,
2961                                     Tbool, rules, type);
2962                 return Tbool;
2963
2964 ###### interp binode cases
2965         case Less:
2966         case LessEq:
2967         case Gtr:
2968         case GtrEq:
2969         case Eql:
2970         case NEql:
2971         {
2972                 int cmp;
2973                 left = interp_exec(c, b->left, &ltype);
2974                 right = interp_exec(c, b->right, &rtype);
2975                 cmp = value_cmp(ltype, rtype, &left, &right);
2976                 rvtype = Tbool;
2977                 switch (b->op) {
2978                 case Less:      rv.bool = cmp <  0; break;
2979                 case LessEq:    rv.bool = cmp <= 0; break;
2980                 case Gtr:       rv.bool = cmp >  0; break;
2981                 case GtrEq:     rv.bool = cmp >= 0; break;
2982                 case Eql:       rv.bool = cmp == 0; break;
2983                 case NEql:      rv.bool = cmp != 0; break;
2984                 default:        rv.bool = 0; break;     // NOTEST
2985                 }
2986                 break;
2987         }
2988
2989 ### Expressions: The rest
2990
2991 The remaining expressions with the highest precedence are arithmetic,
2992 string concatenation, and string conversion.  String concatenation
2993 (`++`) has the same precedence as multiplication and division, but lower
2994 than the uniary.
2995
2996 String conversion is a temporary feature until I get a better type
2997 system.  `$` is a prefix operator which expects a string and returns
2998 a number.
2999
3000 `+` and `-` are both infix and prefix operations (where they are
3001 absolute value and negation).  These have different operator names.
3002
3003 We also have a 'Bracket' operator which records where parentheses were
3004 found.  This makes it easy to reproduce these when printing.  Possibly I
3005 should only insert brackets were needed for precedence.
3006
3007 ###### Binode types
3008         Plus, Minus,
3009         Times, Divide, Rem,
3010         Concat,
3011         Absolute, Negate,
3012         StringConv,
3013         Bracket,
3014
3015 ###### expr precedence
3016         $LEFT + - Eop
3017         $LEFT * / % ++ Top
3018         $LEFT Uop $
3019         $TERM ( )
3020
3021 ###### expression grammar
3022                 | Expression Eop Expression ${ {
3023                         struct binode *b = new(binode);
3024                         b->op = $2.op;
3025                         b->left = $<1;
3026                         b->right = $<3;
3027                         $0 = b;
3028                 } }$
3029
3030                 | Expression Top Expression ${ {
3031                         struct binode *b = new(binode);
3032                         b->op = $2.op;
3033                         b->left = $<1;
3034                         b->right = $<3;
3035                         $0 = b;
3036                 } }$
3037
3038                 | ( Expression ) ${ {
3039                         struct binode *b = new_pos(binode, $1);
3040                         b->op = Bracket;
3041                         b->right = $<2;
3042                         $0 = b;
3043                 } }$
3044                 | Uop Expression ${ {
3045                         struct binode *b = new(binode);
3046                         b->op = $1.op;
3047                         b->right = $<2;
3048                         $0 = b;
3049                 } }$
3050                 | Value ${ $0 = $<1; }$
3051                 | Variable ${ $0 = $<1; }$
3052
3053         $eop
3054         Eop ->    + ${ $0.op = Plus; }$
3055                 | - ${ $0.op = Minus; }$
3056
3057         Uop ->    + ${ $0.op = Absolute; }$
3058                 | - ${ $0.op = Negate; }$
3059                 | $ ${ $0.op = StringConv; }$
3060
3061         Top ->    * ${ $0.op = Times; }$
3062                 | / ${ $0.op = Divide; }$
3063                 | % ${ $0.op = Rem; }$
3064                 | ++ ${ $0.op = Concat; }$
3065
3066 ###### print binode cases
3067         case Plus:
3068         case Minus:
3069         case Times:
3070         case Divide:
3071         case Concat:
3072         case Rem:
3073                 if (bracket) printf("(");
3074                 print_exec(b->left, indent, bracket);
3075                 switch(b->op) {
3076                 case Plus:   fputs(" + ", stdout); break;
3077                 case Minus:  fputs(" - ", stdout); break;
3078                 case Times:  fputs(" * ", stdout); break;
3079                 case Divide: fputs(" / ", stdout); break;
3080                 case Rem:    fputs(" % ", stdout); break;
3081                 case Concat: fputs(" ++ ", stdout); break;
3082                 default: abort();       // NOTEST
3083                 }                       // NOTEST
3084                 print_exec(b->right, indent, bracket);
3085                 if (bracket) printf(")");
3086                 break;
3087         case Absolute:
3088         case Negate:
3089         case StringConv:
3090                 if (bracket) printf("(");
3091                 switch (b->op) {
3092                 case Absolute:   fputs("+", stdout); break;
3093                 case Negate:     fputs("-", stdout); break;
3094                 case StringConv: fputs("$", stdout); break;
3095                 default: abort();       // NOTEST
3096                 }                       // NOTEST
3097                 print_exec(b->right, indent, bracket);
3098                 if (bracket) printf(")");
3099                 break;
3100         case Bracket:
3101                 printf("(");
3102                 print_exec(b->right, indent, bracket);
3103                 printf(")");
3104                 break;
3105
3106 ###### propagate binode cases
3107         case Plus:
3108         case Minus:
3109         case Times:
3110         case Rem:
3111         case Divide:
3112                 /* both must be numbers, result is Tnum */
3113         case Absolute:
3114         case Negate:
3115                 /* as propagate_types ignores a NULL,
3116                  * unary ops fit here too */
3117                 propagate_types(b->left, c, ok, Tnum, 0);
3118                 propagate_types(b->right, c, ok, Tnum, 0);
3119                 if (!type_compat(type, Tnum, 0))
3120                         type_err(c, "error: Arithmetic returns %1 but %2 expected", prog,
3121                                    Tnum, rules, type);
3122                 return Tnum;
3123
3124         case Concat:
3125                 /* both must be Tstr, result is Tstr */
3126                 propagate_types(b->left, c, ok, Tstr, 0);
3127                 propagate_types(b->right, c, ok, Tstr, 0);
3128                 if (!type_compat(type, Tstr, 0))
3129                         type_err(c, "error: Concat returns %1 but %2 expected", prog,
3130                                    Tstr, rules, type);
3131                 return Tstr;
3132
3133         case StringConv:
3134                 /* op must be string, result is number */
3135                 propagate_types(b->left, c, ok, Tstr, 0);
3136                 if (!type_compat(type, Tnum, 0))
3137                         type_err(c,     // UNTESTED
3138                           "error: Can only convert string to number, not %1",
3139                                 prog, type, 0, NULL);
3140                 return Tnum;
3141
3142         case Bracket:
3143                 return propagate_types(b->right, c, ok, type, 0);
3144
3145 ###### interp binode cases
3146
3147         case Plus:
3148                 rv = interp_exec(c, b->left, &rvtype);
3149                 right = interp_exec(c, b->right, &rtype);
3150                 mpq_add(rv.num, rv.num, right.num);
3151                 break;
3152         case Minus:
3153                 rv = interp_exec(c, b->left, &rvtype);
3154                 right = interp_exec(c, b->right, &rtype);
3155                 mpq_sub(rv.num, rv.num, right.num);
3156                 break;
3157         case Times:
3158                 rv = interp_exec(c, b->left, &rvtype);
3159                 right = interp_exec(c, b->right, &rtype);
3160                 mpq_mul(rv.num, rv.num, right.num);
3161                 break;
3162         case Divide:
3163                 rv = interp_exec(c, b->left, &rvtype);
3164                 right = interp_exec(c, b->right, &rtype);
3165                 mpq_div(rv.num, rv.num, right.num);
3166                 break;
3167         case Rem: {
3168                 mpz_t l, r, rem;
3169
3170                 left = interp_exec(c, b->left, &ltype);
3171                 right = interp_exec(c, b->right, &rtype);
3172                 mpz_init(l); mpz_init(r); mpz_init(rem);
3173                 mpz_tdiv_q(l, mpq_numref(left.num), mpq_denref(left.num));
3174                 mpz_tdiv_q(r, mpq_numref(right.num), mpq_denref(right.num));
3175                 mpz_tdiv_r(rem, l, r);
3176                 val_init(Tnum, &rv);
3177                 mpq_set_z(rv.num, rem);
3178                 mpz_clear(r); mpz_clear(l); mpz_clear(rem);
3179                 rvtype = ltype;
3180                 break;
3181         }
3182         case Negate:
3183                 rv = interp_exec(c, b->right, &rvtype);
3184                 mpq_neg(rv.num, rv.num);
3185                 break;
3186         case Absolute:
3187                 rv = interp_exec(c, b->right, &rvtype);
3188                 mpq_abs(rv.num, rv.num);
3189                 break;
3190         case Bracket:
3191                 rv = interp_exec(c, b->right, &rvtype);
3192                 break;
3193         case Concat:
3194                 left = interp_exec(c, b->left, &ltype);
3195                 right = interp_exec(c, b->right, &rtype);
3196                 rvtype = Tstr;
3197                 rv.str = text_join(left.str, right.str);
3198                 break;
3199         case StringConv:
3200                 right = interp_exec(c, b->right, &rvtype);
3201                 rtype = Tstr;
3202                 rvtype = Tnum;
3203
3204                 struct text tx = right.str;
3205                 char tail[3];
3206                 int neg = 0;
3207                 if (tx.txt[0] == '-') {
3208                         neg = 1;        // UNTESTED
3209                         tx.txt++;       // UNTESTED
3210                         tx.len--;       // UNTESTED
3211                 }
3212                 if (number_parse(rv.num, tail, tx) == 0)
3213                         mpq_init(rv.num);       // UNTESTED
3214                 else if (neg)
3215                         mpq_neg(rv.num, rv.num);        // UNTESTED
3216                 if (tail[0])
3217                         printf("Unsupported suffix: %.*s\n", tx.len, tx.txt);   // UNTESTED
3218
3219                 break;
3220
3221 ###### value functions
3222
3223         static struct text text_join(struct text a, struct text b)
3224         {
3225                 struct text rv;
3226                 rv.len = a.len + b.len;
3227                 rv.txt = malloc(rv.len);
3228                 memcpy(rv.txt, a.txt, a.len);
3229                 memcpy(rv.txt+a.len, b.txt, b.len);
3230                 return rv;
3231         }
3232
3233 ### Blocks, Statements, and Statement lists.
3234
3235 Now that we have expressions out of the way we need to turn to
3236 statements.  There are simple statements and more complex statements.
3237 Simple statements do not contain (syntactic) newlines, complex statements do.
3238
3239 Statements often come in sequences and we have corresponding simple
3240 statement lists and complex statement lists.
3241 The former comprise only simple statements separated by semicolons.
3242 The later comprise complex statements and simple statement lists.  They are
3243 separated by newlines.  Thus the semicolon is only used to separate
3244 simple statements on the one line.  This may be overly restrictive,
3245 but I'm not sure I ever want a complex statement to share a line with
3246 anything else.
3247
3248 Note that a simple statement list can still use multiple lines if
3249 subsequent lines are indented, so
3250
3251 ###### Example: wrapped simple statement list
3252
3253         a = b; c = d;
3254            e = f; print g
3255
3256 is a single simple statement list.  This might allow room for
3257 confusion, so I'm not set on it yet.
3258
3259 A simple statement list needs no extra syntax.  A complex statement
3260 list has two syntactic forms.  It can be enclosed in braces (much like
3261 C blocks), or it can be introduced by an indent and continue until an
3262 unindented newline (much like Python blocks).  With this extra syntax
3263 it is referred to as a block.
3264
3265 Note that a block does not have to include any newlines if it only
3266 contains simple statements.  So both of:
3267
3268         if condition: a=b; d=f
3269
3270         if condition { a=b; print f }
3271
3272 are valid.
3273
3274 In either case the list is constructed from a `binode` list with
3275 `Block` as the operator.  When parsing the list it is most convenient
3276 to append to the end, so a list is a list and a statement.  When using
3277 the list it is more convenient to consider a list to be a statement
3278 and a list.  So we need a function to re-order a list.
3279 `reorder_bilist` serves this purpose.
3280
3281 The only stand-alone statement we introduce at this stage is `pass`
3282 which does nothing and is represented as a `NULL` pointer in a `Block`
3283 list.  Other stand-alone statements will follow once the infrastructure
3284 is in-place.
3285
3286 ###### Binode types
3287         Block,
3288
3289 ###### Grammar
3290
3291         $TERM { } ;
3292
3293         $*binode
3294         Block -> { IN OptNL Statementlist OUT OptNL } ${ $0 = $<Sl; }$
3295                 | { SimpleStatements } ${ $0 = reorder_bilist($<SS); }$
3296                 | SimpleStatements ; ${ $0 = reorder_bilist($<SS); }$
3297                 | SimpleStatements EOL ${ $0 = reorder_bilist($<SS); }$
3298                 | IN OptNL Statementlist OUT ${ $0 = $<Sl; }$
3299
3300         OpenBlock -> OpenScope { IN OptNL Statementlist OUT OptNL } ${ $0 = $<Sl; }$
3301                 | OpenScope { SimpleStatements } ${ $0 = reorder_bilist($<SS); }$
3302                 | OpenScope SimpleStatements ; ${ $0 = reorder_bilist($<SS); }$
3303                 | OpenScope SimpleStatements EOL ${ $0 = reorder_bilist($<SS); }$
3304                 | IN OpenScope OptNL Statementlist OUT ${ $0 = $<Sl; }$
3305
3306         UseBlock -> { OpenScope IN OptNL Statementlist OUT OptNL } ${ $0 = $<Sl; }$
3307                 | { OpenScope SimpleStatements } ${ $0 = reorder_bilist($<SS); }$
3308                 | IN OpenScope OptNL Statementlist OUT ${ $0 = $<Sl; }$
3309
3310         ColonBlock -> { IN OptNL Statementlist OUT OptNL } ${ $0 = $<Sl; }$
3311                 | { SimpleStatements } ${ $0 = reorder_bilist($<SS); }$
3312                 | : SimpleStatements ; ${ $0 = reorder_bilist($<SS); }$
3313                 | : SimpleStatements EOL ${ $0 = reorder_bilist($<SS); }$
3314                 | : IN OptNL Statementlist OUT ${ $0 = $<Sl; }$
3315
3316         Statementlist -> ComplexStatements ${ $0 = reorder_bilist($<CS); }$
3317
3318         ComplexStatements -> ComplexStatements ComplexStatement ${
3319                         if ($2 == NULL) {
3320                                 $0 = $<1;
3321                         } else {
3322                                 $0 = new(binode);
3323                                 $0->op = Block;
3324                                 $0->left = $<1;
3325                                 $0->right = $<2;
3326                         }
3327                 }$
3328                 | ComplexStatement ${
3329                         if ($1 == NULL) {
3330                                 $0 = NULL;
3331                         } else {
3332                                 $0 = new(binode);
3333                                 $0->op = Block;
3334                                 $0->left = NULL;
3335                                 $0->right = $<1;
3336                         }
3337                 }$
3338
3339         $*exec
3340         ComplexStatement -> SimpleStatements Newlines ${
3341                         $0 = reorder_bilist($<SS);
3342                         }$
3343                 |  SimpleStatements ; Newlines ${
3344                         $0 = reorder_bilist($<SS);
3345                         }$
3346                 ## ComplexStatement Grammar
3347
3348         $*binode
3349         SimpleStatements -> SimpleStatements ; SimpleStatement ${
3350                         $0 = new(binode);
3351                         $0->op = Block;
3352                         $0->left = $<1;
3353                         $0->right = $<3;
3354                         }$
3355                 | SimpleStatement ${
3356                         $0 = new(binode);
3357                         $0->op = Block;
3358                         $0->left = NULL;
3359                         $0->right = $<1;
3360                         }$
3361
3362         $TERM pass
3363         SimpleStatement -> pass ${ $0 = NULL; }$
3364                 | ERROR ${ tok_err(c, "Syntax error in statement", &$1); }$
3365                 ## SimpleStatement Grammar
3366
3367 ###### print binode cases
3368         case Block:
3369                 if (indent < 0) {
3370                         // simple statement
3371                         if (b->left == NULL)    // UNTESTED
3372                                 printf("pass"); // UNTESTED
3373                         else
3374                                 print_exec(b->left, indent, bracket);   // UNTESTED
3375                         if (b->right) { // UNTESTED
3376                                 printf("; ");   // UNTESTED
3377                                 print_exec(b->right, indent, bracket);  // UNTESTED
3378                         }
3379                 } else {
3380                         // block, one per line
3381                         if (b->left == NULL)
3382                                 do_indent(indent, "pass\n");
3383                         else
3384                                 print_exec(b->left, indent, bracket);
3385                         if (b->right)
3386                                 print_exec(b->right, indent, bracket);
3387                 }
3388                 break;
3389
3390 ###### propagate binode cases
3391         case Block:
3392         {
3393                 /* If any statement returns something other than Tnone
3394                  * or Tbool then all such must return same type.
3395                  * As each statement may be Tnone or something else,
3396                  * we must always pass NULL (unknown) down, otherwise an incorrect
3397                  * error might occur.  We never return Tnone unless it is
3398                  * passed in.
3399                  */
3400                 struct binode *e;
3401
3402                 for (e = b; e; e = cast(binode, e->right)) {
3403                         t = propagate_types(e->left, c, ok, NULL, rules);
3404                         if ((rules & Rboolok) && t == Tbool)
3405                                 t = NULL;
3406                         if (t && t != Tnone && t != Tbool) {
3407                                 if (!type)
3408                                         type = t;
3409                                 else if (t != type)
3410                                         type_err(c, "error: expected %1%r, found %2",
3411                                                  e->left, type, rules, t);
3412                         }
3413                 }
3414                 return type;
3415         }
3416
3417 ###### interp binode cases
3418         case Block:
3419                 while (rvtype == Tnone &&
3420                        b) {
3421                         if (b->left)
3422                                 rv = interp_exec(c, b->left, &rvtype);
3423                         b = cast(binode, b->right);
3424                 }
3425                 break;
3426
3427 ### The Print statement
3428
3429 `print` is a simple statement that takes a comma-separated list of
3430 expressions and prints the values separated by spaces and terminated
3431 by a newline.  No control of formatting is possible.
3432
3433 `print` faces the same list-ordering issue as blocks, and uses the
3434 same solution.
3435
3436 ###### Binode types
3437         Print,
3438
3439 ##### expr precedence
3440         $TERM print ,
3441
3442 ###### SimpleStatement Grammar
3443
3444         | print ExpressionList ${
3445                 $0 = reorder_bilist($<2);
3446         }$
3447         | print ExpressionList , ${
3448                 $0 = new(binode);
3449                 $0->op = Print;
3450                 $0->right = NULL;
3451                 $0->left = $<2;
3452                 $0 = reorder_bilist($0);
3453         }$
3454         | print ${
3455                 $0 = new(binode);
3456                 $0->op = Print;
3457                 $0->right = NULL;
3458         }$
3459
3460 ###### Grammar
3461
3462         $*binode
3463         ExpressionList -> ExpressionList , Expression ${
3464                 $0 = new(binode);
3465                 $0->op = Print;
3466                 $0->left = $<1;
3467                 $0->right = $<3;
3468                 }$
3469                 | Expression ${
3470                         $0 = new(binode);
3471                         $0->op = Print;
3472                         $0->left = NULL;
3473                         $0->right = $<1;
3474                 }$
3475
3476 ###### print binode cases
3477
3478         case Print:
3479                 do_indent(indent, "print");
3480                 while (b) {
3481                         if (b->left) {
3482                                 printf(" ");
3483                                 print_exec(b->left, -1, bracket);
3484                                 if (b->right)
3485                                         printf(",");
3486                         }
3487                         b = cast(binode, b->right);
3488                 }
3489                 if (indent >= 0)
3490                         printf("\n");
3491                 break;
3492
3493 ###### propagate binode cases
3494
3495         case Print:
3496                 /* don't care but all must be consistent */
3497                 propagate_types(b->left, c, ok, NULL, Rnolabel);
3498                 propagate_types(b->right, c, ok, NULL, Rnolabel);
3499                 break;
3500
3501 ###### interp binode cases
3502
3503         case Print:
3504         {
3505                 char sep = 0;
3506                 int eol = 1;
3507                 for ( ; b; b = cast(binode, b->right))
3508                         if (b->left) {
3509                                 if (sep)
3510                                         putchar(sep);
3511                                 left = interp_exec(c, b->left, &ltype);
3512                                 print_value(ltype, &left);
3513                                 free_value(ltype, &left);
3514                                 if (b->right)
3515                                         sep = ' ';
3516                         } else if (sep)
3517                                 eol = 0;
3518                 ltype = Tnone;
3519                 if (eol)
3520                         printf("\n");
3521                 break;
3522         }
3523
3524 ###### Assignment statement
3525
3526 An assignment will assign a value to a variable, providing it hasn't
3527 been declared as a constant.  The analysis phase ensures that the type
3528 will be correct so the interpreter just needs to perform the
3529 calculation.  There is a form of assignment which declares a new
3530 variable as well as assigning a value.  If a name is assigned before
3531 it is declared, and error will be raised as the name is created as
3532 `Tlabel` and it is illegal to assign to such names.
3533
3534 ###### Binode types
3535         Assign,
3536         Declare,
3537
3538 ###### declare terminals
3539         $TERM =
3540
3541 ###### SimpleStatement Grammar
3542         | Variable = Expression ${
3543                         $0 = new(binode);
3544                         $0->op = Assign;
3545                         $0->left = $<1;
3546                         $0->right = $<3;
3547                 }$
3548         | VariableDecl = Expression ${
3549                         $0 = new(binode);
3550                         $0->op = Declare;
3551                         $0->left = $<1;
3552                         $0->right =$<3;
3553                 }$
3554
3555         | VariableDecl ${
3556                         if ($1->var->where_set == NULL) {
3557                                 type_err(c,
3558                                          "Variable declared with no type or value: %v",
3559                                          $1, NULL, 0, NULL);
3560                         } else {
3561                                 $0 = new(binode);
3562                                 $0->op = Declare;
3563                                 $0->left = $<1;
3564                                 $0->right = NULL;
3565                         }
3566                 }$
3567
3568 ###### print binode cases
3569
3570         case Assign:
3571                 do_indent(indent, "");
3572                 print_exec(b->left, indent, bracket);
3573                 printf(" = ");
3574                 print_exec(b->right, indent, bracket);
3575                 if (indent >= 0)
3576                         printf("\n");
3577                 break;
3578
3579         case Declare:
3580                 {
3581                 struct variable *v = cast(var, b->left)->var;
3582                 do_indent(indent, "");
3583                 print_exec(b->left, indent, bracket);
3584                 if (cast(var, b->left)->var->constant) {
3585                         if (v->where_decl == v->where_set) {
3586                                 printf("::");
3587                                 type_print(v->type, stdout);
3588                                 printf(" ");
3589                         } else
3590                                 printf(" ::");
3591                 } else {
3592                         if (v->where_decl == v->where_set) {
3593                                 printf(":");
3594                                 type_print(v->type, stdout);
3595                                 printf(" ");
3596                         } else
3597                                 printf(" :");
3598                 }
3599                 if (b->right) {
3600                         printf("= ");
3601                         print_exec(b->right, indent, bracket);
3602                 }
3603                 if (indent >= 0)
3604                         printf("\n");
3605                 }
3606                 break;
3607
3608 ###### propagate binode cases
3609
3610         case Assign:
3611         case Declare:
3612                 /* Both must match and not be labels,
3613                  * Type must support 'dup',
3614                  * For Assign, left must not be constant.
3615                  * result is Tnone
3616                  */
3617                 t = propagate_types(b->left, c, ok, NULL,
3618                                     Rnolabel | (b->op == Assign ? Rnoconstant : 0));
3619                 if (!b->right)
3620                         return Tnone;
3621
3622                 if (t) {
3623                         if (propagate_types(b->right, c, ok, t, 0) != t)
3624                                 if (b->left->type == Xvar)
3625                                         type_err(c, "info: variable '%v' was set as %1 here.",
3626                                                  cast(var, b->left)->var->where_set, t, rules, NULL);
3627                 } else {
3628                         t = propagate_types(b->right, c, ok, NULL, Rnolabel);
3629                         if (t)
3630                                 propagate_types(b->left, c, ok, t,
3631                                                 (b->op == Assign ? Rnoconstant : 0));
3632                 }
3633                 if (t && t->dup == NULL)
3634                         type_err(c, "error: cannot assign value of type %1", b, t, 0, NULL);
3635                 return Tnone;
3636
3637                 break;
3638
3639 ###### interp binode cases
3640
3641         case Assign:
3642                 lleft = linterp_exec(c, b->left, &ltype);
3643                 right = interp_exec(c, b->right, &rtype);
3644                 if (lleft) {
3645                         free_value(ltype, lleft);
3646                         dup_value(ltype, &right, lleft);
3647                         ltype = NULL;
3648                 }
3649                 break;
3650
3651         case Declare:
3652         {
3653                 struct variable *v = cast(var, b->left)->var;
3654                 struct value *val;
3655                 if (v->merged)
3656                         v = v->merged;
3657                 val = var_value(c, v);
3658                 free_value(v->type, val);
3659                 if (v->type->prepare_type)
3660                         v->type->prepare_type(c, v->type, 0);
3661                 if (b->right) {
3662                         right = interp_exec(c, b->right, &rtype);
3663                         memcpy(val, &right, rtype->size);
3664                         rtype = Tnone;
3665                 } else {
3666                         val_init(v->type, val);
3667                 }
3668                 break;
3669         }
3670
3671 ### The `use` statement
3672
3673 The `use` statement is the last "simple" statement.  It is needed when
3674 the condition in a conditional statement is a block.  `use` works much
3675 like `return` in C, but only completes the `condition`, not the whole
3676 function.
3677
3678 ###### Binode types
3679         Use,
3680
3681 ###### expr precedence
3682         $TERM use
3683
3684 ###### SimpleStatement Grammar
3685         | use Expression ${
3686                 $0 = new_pos(binode, $1);
3687                 $0->op = Use;
3688                 $0->right = $<2;
3689                 if ($0->right->type == Xvar) {
3690                         struct var *v = cast(var, $0->right);
3691                         if (v->var->type == Tnone) {
3692                                 /* Convert this to a label */
3693                                 struct value *val;
3694
3695                                 v->var->type = Tlabel;
3696                                 val = global_alloc(c, Tlabel, v->var, NULL);
3697                                 val->label = val;
3698                         }
3699                 }
3700         }$
3701
3702 ###### print binode cases
3703
3704         case Use:
3705                 do_indent(indent, "use ");
3706                 print_exec(b->right, -1, bracket);
3707                 if (indent >= 0)
3708                         printf("\n");
3709                 break;
3710
3711 ###### propagate binode cases
3712
3713         case Use:
3714                 /* result matches value */
3715                 return propagate_types(b->right, c, ok, type, 0);
3716
3717 ###### interp binode cases
3718
3719         case Use:
3720                 rv = interp_exec(c, b->right, &rvtype);
3721                 break;
3722
3723 ### The Conditional Statement
3724
3725 This is the biggy and currently the only complex statement.  This
3726 subsumes `if`, `while`, `do/while`, `switch`, and some parts of `for`.
3727 It is comprised of a number of parts, all of which are optional though
3728 set combinations apply.  Each part is (usually) a key word (`then` is
3729 sometimes optional) followed by either an expression or a code block,
3730 except the `casepart` which is a "key word and an expression" followed
3731 by a code block.  The code-block option is valid for all parts and,
3732 where an expression is also allowed, the code block can use the `use`
3733 statement to report a value.  If the code block does not report a value
3734 the effect is similar to reporting `True`.
3735
3736 The `else` and `case` parts, as well as `then` when combined with
3737 `if`, can contain a `use` statement which will apply to some
3738 containing conditional statement. `for` parts, `do` parts and `then`
3739 parts used with `for` can never contain a `use`, except in some
3740 subordinate conditional statement.
3741
3742 If there is a `forpart`, it is executed first, only once.
3743 If there is a `dopart`, then it is executed repeatedly providing
3744 always that the `condpart` or `cond`, if present, does not return a non-True
3745 value.  `condpart` can fail to return any value if it simply executes
3746 to completion.  This is treated the same as returning `True`.
3747
3748 If there is a `thenpart` it will be executed whenever the `condpart`
3749 or `cond` returns True (or does not return any value), but this will happen
3750 *after* `dopart` (when present).
3751
3752 If `elsepart` is present it will be executed at most once when the
3753 condition returns `False` or some value that isn't `True` and isn't
3754 matched by any `casepart`.  If there are any `casepart`s, they will be
3755 executed when the condition returns a matching value.
3756
3757 The particular sorts of values allowed in case parts has not yet been
3758 determined in the language design, so nothing is prohibited.
3759
3760 The various blocks in this complex statement potentially provide scope
3761 for variables as described earlier.  Each such block must include the
3762 "OpenScope" nonterminal before parsing the block, and must call
3763 `var_block_close()` when closing the block.
3764
3765 The code following "`if`", "`switch`" and "`for`" does not get its own
3766 scope, but is in a scope covering the whole statement, so names
3767 declared there cannot be redeclared elsewhere.  Similarly the
3768 condition following "`while`" is in a scope the covers the body
3769 ("`do`" part) of the loop, and which does not allow conditional scope
3770 extension.  Code following "`then`" (both looping and non-looping),
3771 "`else`" and "`case`" each get their own local scope.
3772
3773 The type requirements on the code block in a `whilepart` are quite
3774 unusal.  It is allowed to return a value of some identifiable type, in
3775 which case the loop aborts and an appropriate `casepart` is run, or it
3776 can return a Boolean, in which case the loop either continues to the
3777 `dopart` (on `True`) or aborts and runs the `elsepart` (on `False`).
3778 This is different both from the `ifpart` code block which is expected to
3779 return a Boolean, or the `switchpart` code block which is expected to
3780 return the same type as the casepart values.  The correct analysis of
3781 the type of the `whilepart` code block is the reason for the
3782 `Rboolok` flag which is passed to `propagate_types()`.
3783
3784 The `cond_statement` cannot fit into a `binode` so a new `exec` is
3785 defined.
3786
3787 ###### exec type
3788         Xcond_statement,
3789
3790 ###### ast
3791         struct casepart {
3792                 struct exec *value;
3793                 struct exec *action;
3794                 struct casepart *next;
3795         };
3796         struct cond_statement {
3797                 struct exec;
3798                 struct exec *forpart, *condpart, *dopart, *thenpart, *elsepart;
3799                 struct casepart *casepart;
3800         };
3801
3802 ###### ast functions
3803
3804         static void free_casepart(struct casepart *cp)
3805         {
3806                 while (cp) {
3807                         struct casepart *t;
3808                         free_exec(cp->value);
3809                         free_exec(cp->action);
3810                         t = cp->next;
3811                         free(cp);
3812                         cp = t;
3813                 }
3814         }
3815
3816         static void free_cond_statement(struct cond_statement *s)
3817         {
3818                 if (!s)
3819                         return;
3820                 free_exec(s->forpart);
3821                 free_exec(s->condpart);
3822                 free_exec(s->dopart);
3823                 free_exec(s->thenpart);
3824                 free_exec(s->elsepart);
3825                 free_casepart(s->casepart);
3826                 free(s);
3827         }
3828
3829 ###### free exec cases
3830         case Xcond_statement: free_cond_statement(cast(cond_statement, e)); break;
3831
3832 ###### ComplexStatement Grammar
3833         | CondStatement ${ $0 = $<1; }$
3834
3835 ###### expr precedence
3836         $TERM for then while do
3837         $TERM else
3838         $TERM switch case
3839
3840 ###### Grammar
3841
3842         $*cond_statement
3843         // A CondStatement must end with EOL, as does CondSuffix and
3844         // IfSuffix.
3845         // ForPart, ThenPart, SwitchPart, CasePart are non-empty and
3846         // may or may not end with EOL
3847         // WhilePart and IfPart include an appropriate Suffix
3848
3849         // Both ForPart and Whilepart open scopes, and CondSuffix only
3850         // closes one - so in the first branch here we have another to close.
3851         CondStatement -> ForPart OptNL ThenPart OptNL WhilePart CondSuffix ${
3852                         $0 = $<CS;
3853                         $0->forpart = $<FP;
3854                         $0->thenpart = $<TP;
3855                         $0->condpart = $WP.condpart; $WP.condpart = NULL;
3856                         $0->dopart = $WP.dopart; $WP.dopart = NULL;
3857                         var_block_close(c, CloseSequential);
3858                         }$
3859                 | ForPart OptNL WhilePart CondSuffix ${
3860                         $0 = $<CS;
3861                         $0->forpart = $<FP;
3862                         $0->condpart = $WP.condpart; $WP.condpart = NULL;
3863                         $0->dopart = $WP.dopart; $WP.dopart = NULL;
3864                         var_block_close(c, CloseSequential);
3865                         }$
3866                 | WhilePart CondSuffix ${
3867                         $0 = $<CS;
3868                         $0->condpart = $WP.condpart; $WP.condpart = NULL;
3869                         $0->dopart = $WP.dopart; $WP.dopart = NULL;
3870                         }$
3871                 | SwitchPart OptNL CasePart CondSuffix ${
3872                         $0 = $<CS;
3873                         $0->condpart = $<SP;
3874                         $CP->next = $0->casepart;
3875                         $0->casepart = $<CP;
3876                         }$
3877                 | SwitchPart : IN OptNL CasePart CondSuffix OUT Newlines ${
3878                         $0 = $<CS;
3879                         $0->condpart = $<SP;
3880                         $CP->next = $0->casepart;
3881                         $0->casepart = $<CP;
3882                         }$
3883                 | IfPart IfSuffix ${
3884                         $0 = $<IS;
3885                         $0->condpart = $IP.condpart; $IP.condpart = NULL;
3886                         $0->thenpart = $IP.thenpart; $IP.thenpart = NULL;
3887                         // This is where we close an "if" statement
3888                         var_block_close(c, CloseSequential);
3889                         }$
3890
3891         CondSuffix -> IfSuffix ${
3892                         $0 = $<1;
3893                         // This is where we close scope of the whole
3894                         // "for" or "while" statement
3895                         var_block_close(c, CloseSequential);
3896                 }$
3897                 | Newlines CasePart CondSuffix ${
3898                         $0 = $<CS;
3899                         $CP->next = $0->casepart;
3900                         $0->casepart = $<CP;
3901                 }$
3902                 | CasePart CondSuffix ${
3903                         $0 = $<CS;
3904                         $CP->next = $0->casepart;
3905                         $0->casepart = $<CP;
3906                 }$
3907
3908         IfSuffix -> Newlines ${ $0 = new(cond_statement); }$
3909                 | Newlines ElsePart ${ $0 = $<EP; }$
3910                 | ElsePart ${$0 = $<EP; }$
3911
3912         ElsePart -> else OpenBlock Newlines ${
3913                         $0 = new(cond_statement);
3914                         $0->elsepart = $<OB;
3915                         var_block_close(c, CloseElse);
3916                 }$
3917                 | else OpenScope CondStatement ${
3918                         $0 = new(cond_statement);
3919                         $0->elsepart = $<CS;
3920                         var_block_close(c, CloseElse);
3921                 }$
3922
3923         $*casepart
3924         CasePart -> case Expression OpenScope ColonBlock ${
3925                         $0 = calloc(1,sizeof(struct casepart));
3926                         $0->value = $<Ex;
3927                         $0->action = $<Bl;
3928                         var_block_close(c, CloseParallel);
3929                 }$
3930
3931         $*exec
3932         // These scopes are closed in CondSuffix
3933         ForPart -> for OpenBlock ${
3934                         $0 = $<Bl;
3935                 }$
3936
3937         ThenPart -> then OpenBlock ${
3938                         $0 = $<OB;
3939                         var_block_close(c, CloseSequential);
3940                 }$
3941
3942         $cond_statement
3943         // This scope is closed in CondSuffix
3944         WhilePart -> while UseBlock OptNL do Block ${
3945                         $0.condpart = $<UB;
3946                         $0.dopart = $<Bl;
3947                 }$
3948                 | while OpenScope Expression ColonBlock ${
3949                         $0.condpart = $<Exp;
3950                         $0.dopart = $<Bl;
3951                 }$
3952
3953         IfPart -> if UseBlock OptNL then OpenBlock ClosePara ${
3954                         $0.condpart = $<UB;
3955                         $0.thenpart = $<Bl;
3956                 }$
3957                 | if OpenScope Expression OpenScope ColonBlock ClosePara ${
3958                         $0.condpart = $<Ex;
3959                         $0.thenpart = $<Bl;
3960                 }$
3961                 | if OpenScope Expression OpenScope OptNL then Block ClosePara ${
3962                         $0.condpart = $<Ex;
3963                         $0.thenpart = $<Bl;
3964                 }$
3965
3966         $*exec
3967         // This scope is closed in CondSuffix
3968         SwitchPart -> switch OpenScope Expression ${
3969                         $0 = $<Ex;
3970                 }$
3971                 | switch UseBlock ${
3972                         $0 = $<Bl;
3973                 }$
3974
3975 ###### print exec cases
3976
3977         case Xcond_statement:
3978         {
3979                 struct cond_statement *cs = cast(cond_statement, e);
3980                 struct casepart *cp;
3981                 if (cs->forpart) {
3982                         do_indent(indent, "for");
3983                         if (bracket) printf(" {\n"); else printf("\n");
3984                         print_exec(cs->forpart, indent+1, bracket);
3985                         if (cs->thenpart) {
3986                                 if (bracket)
3987                                         do_indent(indent, "} then {\n");
3988                                 else
3989                                         do_indent(indent, "then\n");
3990                                 print_exec(cs->thenpart, indent+1, bracket);
3991                         }
3992                         if (bracket) do_indent(indent, "}\n");
3993                 }
3994                 if (cs->dopart) {
3995                         // a loop
3996                         if (cs->condpart && cs->condpart->type == Xbinode &&
3997                             cast(binode, cs->condpart)->op == Block) {
3998                                 if (bracket)
3999                                         do_indent(indent, "while {\n");
4000                                 else
4001                                         do_indent(indent, "while\n");
4002                                 print_exec(cs->condpart, indent+1, bracket);
4003                                 if (bracket)
4004                                         do_indent(indent, "} do {\n");
4005                                 else
4006                                         do_indent(indent, "do\n");
4007                                 print_exec(cs->dopart, indent+1, bracket);
4008                                 if (bracket)
4009                                         do_indent(indent, "}\n");
4010                         } else {
4011                                 do_indent(indent, "while ");
4012                                 print_exec(cs->condpart, 0, bracket);
4013                                 if (bracket)
4014                                         printf(" {\n");
4015                                 else
4016                                         printf(":\n");
4017                                 print_exec(cs->dopart, indent+1, bracket);
4018                                 if (bracket)
4019                                         do_indent(indent, "}\n");
4020                         }
4021                 } else {
4022                         // a condition
4023                         if (cs->casepart)
4024                                 do_indent(indent, "switch");
4025                         else
4026                                 do_indent(indent, "if");
4027                         if (cs->condpart && cs->condpart->type == Xbinode &&
4028                             cast(binode, cs->condpart)->op == Block) {
4029                                 if (bracket)    // UNTESTED
4030                                         printf(" {\n"); // UNTESTED
4031                                 else
4032                                         printf(":\n");  // UNTESTED
4033                                 print_exec(cs->condpart, indent+1, bracket);    // UNTESTED
4034                                 if (bracket)    // UNTESTED
4035                                         do_indent(indent, "}\n");       // UNTESTED
4036                                 if (cs->thenpart) {     // UNTESTED
4037                                         do_indent(indent, "then:\n");   // UNTESTED
4038                                         print_exec(cs->thenpart, indent+1, bracket);    // UNTESTED
4039                                 }
4040                         } else {
4041                                 printf(" ");
4042                                 print_exec(cs->condpart, 0, bracket);
4043                                 if (cs->thenpart) {
4044                                         if (bracket)
4045                                                 printf(" {\n");
4046                                         else
4047                                                 printf(":\n");
4048                                         print_exec(cs->thenpart, indent+1, bracket);
4049                                         if (bracket)
4050                                                 do_indent(indent, "}\n");
4051                                 } else
4052                                         printf("\n");
4053                         }
4054                 }
4055                 for (cp = cs->casepart; cp; cp = cp->next) {
4056                         do_indent(indent, "case ");
4057                         print_exec(cp->value, -1, 0);
4058                         if (bracket)
4059                                 printf(" {\n");
4060                         else
4061                                 printf(":\n");
4062                         print_exec(cp->action, indent+1, bracket);
4063                         if (bracket)
4064                                 do_indent(indent, "}\n");
4065                 }
4066                 if (cs->elsepart) {
4067                         do_indent(indent, "else");
4068                         if (bracket)
4069                                 printf(" {\n");
4070                         else
4071                                 printf("\n");
4072                         print_exec(cs->elsepart, indent+1, bracket);
4073                         if (bracket)
4074                                 do_indent(indent, "}\n");
4075                 }
4076                 break;
4077         }
4078
4079 ###### propagate exec cases
4080         case Xcond_statement:
4081         {
4082                 // forpart and dopart must return Tnone
4083                 // thenpart must return Tnone if there is a dopart,
4084                 // otherwise it is like elsepart.
4085                 // condpart must:
4086                 //    be bool if there is no casepart
4087                 //    match casepart->values if there is a switchpart
4088                 //    either be bool or match casepart->value if there
4089                 //             is a whilepart
4090                 // elsepart and casepart->action must match the return type
4091                 //   expected of this statement.
4092                 struct cond_statement *cs = cast(cond_statement, prog);
4093                 struct casepart *cp;
4094
4095                 t = propagate_types(cs->forpart, c, ok, Tnone, 0);
4096                 if (!type_compat(Tnone, t, 0))
4097                         *ok = 0;        // UNTESTED
4098                 t = propagate_types(cs->dopart, c, ok, Tnone, 0);
4099                 if (!type_compat(Tnone, t, 0))
4100                         *ok = 0;        // UNTESTED
4101                 if (cs->dopart) {
4102                         t = propagate_types(cs->thenpart, c, ok, Tnone, 0);
4103                         if (!type_compat(Tnone, t, 0))
4104                                 *ok = 0;        // UNTESTED
4105                 }
4106                 if (cs->casepart == NULL)
4107                         propagate_types(cs->condpart, c, ok, Tbool, 0);
4108                 else {
4109                         /* Condpart must match case values, with bool permitted */
4110                         t = NULL;
4111                         for (cp = cs->casepart;
4112                              cp && !t; cp = cp->next)
4113                                 t = propagate_types(cp->value, c, ok, NULL, 0);
4114                         if (!t && cs->condpart)
4115                                 t = propagate_types(cs->condpart, c, ok, NULL, Rboolok);        // UNTESTED
4116                         // Now we have a type (I hope) push it down
4117                         if (t) {
4118                                 for (cp = cs->casepart; cp; cp = cp->next)
4119                                         propagate_types(cp->value, c, ok, t, 0);
4120                                 propagate_types(cs->condpart, c, ok, t, Rboolok);
4121                         }
4122                 }
4123                 // (if)then, else, and case parts must return expected type.
4124                 if (!cs->dopart && !type)
4125                         type = propagate_types(cs->thenpart, c, ok, NULL, rules);
4126                 if (!type)
4127                         type = propagate_types(cs->elsepart, c, ok, NULL, rules);
4128                 for (cp = cs->casepart;
4129                      cp && !type;
4130                      cp = cp->next)     // UNTESTED
4131                         type = propagate_types(cp->action, c, ok, NULL, rules); // UNTESTED
4132                 if (type) {
4133                         if (!cs->dopart)
4134                                 propagate_types(cs->thenpart, c, ok, type, rules);
4135                         propagate_types(cs->elsepart, c, ok, type, rules);
4136                         for (cp = cs->casepart; cp ; cp = cp->next)
4137                                 propagate_types(cp->action, c, ok, type, rules);
4138                         return type;
4139                 } else
4140                         return NULL;
4141         }
4142
4143 ###### interp exec cases
4144         case Xcond_statement:
4145         {
4146                 struct value v, cnd;
4147                 struct type *vtype, *cndtype;
4148                 struct casepart *cp;
4149                 struct cond_statement *cs = cast(cond_statement, e);
4150
4151                 if (cs->forpart)
4152                         interp_exec(c, cs->forpart, NULL);
4153                 do {
4154                         if (cs->condpart)
4155                                 cnd = interp_exec(c, cs->condpart, &cndtype);
4156                         else
4157                                 cndtype = Tnone;        // UNTESTED
4158                         if (!(cndtype == Tnone ||
4159                               (cndtype == Tbool && cnd.bool != 0)))
4160                                 break;
4161                         // cnd is Tnone or Tbool, doesn't need to be freed
4162                         if (cs->dopart)
4163                                 interp_exec(c, cs->dopart, NULL);
4164
4165                         if (cs->thenpart) {
4166                                 rv = interp_exec(c, cs->thenpart, &rvtype);
4167                                 if (rvtype != Tnone || !cs->dopart)
4168                                         goto Xcond_done;
4169                                 free_value(rvtype, &rv);
4170                                 rvtype = Tnone;
4171                         }
4172                 } while (cs->dopart);
4173
4174                 for (cp = cs->casepart; cp; cp = cp->next) {
4175                         v = interp_exec(c, cp->value, &vtype);
4176                         if (value_cmp(cndtype, vtype, &v, &cnd) == 0) {
4177                                 free_value(vtype, &v);
4178                                 free_value(cndtype, &cnd);
4179                                 rv = interp_exec(c, cp->action, &rvtype);
4180                                 goto Xcond_done;
4181                         }
4182                         free_value(vtype, &v);
4183                 }
4184                 free_value(cndtype, &cnd);
4185                 if (cs->elsepart)
4186                         rv = interp_exec(c, cs->elsepart, &rvtype);
4187                 else
4188                         rvtype = Tnone;
4189         Xcond_done:
4190                 break;
4191         }
4192
4193 ### Top level structure
4194
4195 All the language elements so far can be used in various places.  Now
4196 it is time to clarify what those places are.
4197
4198 At the top level of a file there will be a number of declarations.
4199 Many of the things that can be declared haven't been described yet,
4200 such as functions, procedures, imports, and probably more.
4201 For now there are two sorts of things that can appear at the top
4202 level.  They are predefined constants, `struct` types, and the `main`
4203 function.  While the syntax will allow the `main` function to appear
4204 multiple times, that will trigger an error if it is actually attempted.
4205
4206 The various declarations do not return anything.  They store the
4207 various declarations in the parse context.
4208
4209 ###### Parser: grammar
4210
4211         $void
4212         Ocean -> OptNL DeclarationList
4213
4214         ## declare terminals
4215
4216         OptNL ->
4217                 | OptNL NEWLINE
4218         Newlines -> NEWLINE
4219                 | Newlines NEWLINE
4220
4221         DeclarationList -> Declaration
4222                 | DeclarationList Declaration
4223
4224         Declaration -> ERROR Newlines ${
4225                         tok_err(c,      // UNTESTED
4226                                 "error: unhandled parse error", &$1);
4227                 }$
4228                 | DeclareConstant
4229                 | DeclareFunction
4230                 | DeclareStruct
4231
4232         ## top level grammar
4233
4234         ## Grammar
4235
4236 ### The `const` section
4237
4238 As well as being defined in with the code that uses them, constants
4239 can be declared at the top level.  These have full-file scope, so they
4240 are always `InScope`.  The value of a top level constant can be given
4241 as an expression, and this is evaluated immediately rather than in the
4242 later interpretation stage.  Once we add functions to the language, we
4243 will need rules concern which, if any, can be used to define a top
4244 level constant.
4245
4246 Constants are defined in a section that starts with the reserved word
4247 `const` and then has a block with a list of assignment statements.
4248 For syntactic consistency, these must use the double-colon syntax to
4249 make it clear that they are constants.  Type can also be given: if
4250 not, the type will be determined during analysis, as with other
4251 constants.
4252
4253 As the types constants are inserted at the head of a list, printing
4254 them in the same order that they were read is not straight forward.
4255 We take a quadratic approach here and count the number of constants
4256 (variables of depth 0), then count down from there, each time
4257 searching through for the Nth constant for decreasing N.
4258
4259 ###### top level grammar
4260
4261         $TERM const
4262
4263         DeclareConstant -> const { IN OptNL ConstList OUT OptNL } Newlines
4264                 | const { SimpleConstList } Newlines
4265                 | const IN OptNL ConstList OUT Newlines
4266                 | const SimpleConstList Newlines
4267
4268         ConstList -> ConstList SimpleConstLine
4269                 | SimpleConstLine
4270         SimpleConstList -> SimpleConstList ; Const
4271                 | Const
4272                 | SimpleConstList ;
4273         SimpleConstLine -> SimpleConstList Newlines
4274                 | ERROR Newlines ${ tok_err(c, "Syntax error in constant", &$1); }$
4275
4276         $*type
4277         CType -> Type   ${ $0 = $<1; }$
4278                 |       ${ $0 = NULL; }$
4279         $void
4280         Const -> IDENTIFIER :: CType = Expression ${ {
4281                 int ok;
4282                 struct variable *v;
4283
4284                 v = var_decl(c, $1.txt);
4285                 if (v) {
4286                         struct var *var = new_pos(var, $1);
4287                         v->where_decl = var;
4288                         v->where_set = var;
4289                         var->var = v;
4290                         v->constant = 1;
4291                 } else {
4292                         v = var_ref(c, $1.txt);
4293                         tok_err(c, "error: name already declared", &$1);
4294                         type_err(c, "info: this is where '%v' was first declared",
4295                                  v->where_decl, NULL, 0, NULL);
4296                 }
4297                 do {
4298                         ok = 1;
4299                         propagate_types($5, c, &ok, $3, 0);
4300                 } while (ok == 2);
4301                 if (!ok)
4302                         c->parse_error = 1;
4303                 else if (v) {
4304                         struct value res = interp_exec(c, $5, &v->type);
4305                         global_alloc(c, v->type, v, &res);
4306                 }
4307         } }$
4308
4309 ###### print const decls
4310         {
4311                 struct variable *v;
4312                 int target = -1;
4313
4314                 while (target != 0) {
4315                         int i = 0;
4316                         for (v = context.in_scope; v; v=v->in_scope)
4317                                 if (v->depth == 0) {
4318                                         i += 1;
4319                                         if (i == target)
4320                                                 break;
4321                                 }
4322
4323                         if (target == -1) {
4324                                 if (i)
4325                                         printf("const\n");
4326                                 target = i;
4327                         } else {
4328                                 struct value *val = var_value(&context, v);
4329                                 printf("    %.*s :: ", v->name->name.len, v->name->name.txt);
4330                                 type_print(v->type, stdout);
4331                                 printf(" = ");
4332                                 if (v->type == Tstr)
4333                                         printf("\"");
4334                                 print_value(v->type, val);
4335                                 if (v->type == Tstr)
4336                                         printf("\"");
4337                                 printf("\n");
4338                                 target -= 1;
4339                         }
4340                 }
4341         }
4342
4343 ### Finally the whole `main` function.
4344
4345 An Ocean program can currently have only one function - `main` - and
4346 that must exist.  It expects an array of strings with a provided size.
4347 Following this is a `block` which is the code to execute.
4348
4349 As this is the top level, several things are handled a bit
4350 differently.
4351 The function is not interpreted by `interp_exec` as that isn't
4352 passed the argument list which the program requires.  Similarly type
4353 analysis is a bit more interesting at this level.
4354
4355 ###### top level grammar
4356
4357         DeclareFunction -> MainFunction ${ {
4358                 if (c->prog)
4359                         type_err(c, "\"main\" defined a second time",
4360                                  $1, NULL, 0, NULL);
4361                 else
4362                         c->prog = $<1;
4363         } }$
4364
4365 ###### print binode cases
4366         case Func:
4367         case List:
4368                 do_indent(indent, "func main(");
4369                 for (b2 = cast(binode, b->left); b2; b2 = cast(binode, b2->right)) {
4370                         struct variable *v = cast(var, b2->left)->var;
4371                         printf(" ");
4372                         print_exec(b2->left, 0, 0);
4373                         printf(":");
4374                         type_print(v->type, stdout);
4375                 }
4376                 if (bracket)
4377                         printf(") {\n");
4378                 else
4379                         printf(")\n");
4380                 print_exec(b->right, indent+1, bracket);
4381                 if (bracket)
4382                         do_indent(indent, "}\n");
4383                 break;
4384
4385 ###### propagate binode cases
4386         case List:
4387         case Func: abort();             // NOTEST
4388
4389 ###### core functions
4390
4391         static int analyse_prog(struct exec *prog, struct parse_context *c)
4392         {
4393                 struct binode *bp = cast(binode, prog);
4394                 struct binode *b;
4395                 int ok = 1;
4396                 int arg = 0;
4397                 struct type *argv_type;
4398                 struct text argv_type_name = { " argv", 5 };
4399
4400                 if (!bp)
4401                         return 0;       // NOTEST
4402
4403                 argv_type = add_type(c, argv_type_name, &array_prototype);
4404                 argv_type->array.member = Tstr;
4405                 argv_type->array.unspec = 1;
4406
4407                 for (b = cast(binode, bp->left); b; b = cast(binode, b->right)) {
4408                         ok = 1;
4409                         switch (arg++) {
4410                         case 0: /* argv */
4411                                 propagate_types(b->left, c, &ok, argv_type, 0);
4412                                 break;
4413                         default: /* invalid */  // NOTEST
4414                                 propagate_types(b->left, c, &ok, Tnone, 0);     // NOTEST
4415                         }
4416                 }
4417
4418                 do {
4419                         ok = 1;
4420                         propagate_types(bp->right, c, &ok, Tnone, 0);
4421                 } while (ok == 2);
4422                 if (!ok)
4423                         return 0;
4424
4425                 /* Make sure everything is still consistent */
4426                 propagate_types(bp->right, c, &ok, Tnone, 0);
4427                 if (!ok)
4428                         return 0;       // UNTESTED
4429                 scope_finalize(c);
4430                 return 1;
4431         }
4432
4433         static void interp_prog(struct parse_context *c, struct exec *prog,
4434                                 int argc, char **argv)
4435         {
4436                 struct binode *p = cast(binode, prog);
4437                 struct binode *al;
4438                 int anum = 0;
4439                 struct value v;
4440                 struct type *vtype;
4441
4442                 if (!prog)
4443                         return;         // NOTEST
4444                 al = cast(binode, p->left);
4445                 while (al) {
4446                         struct var *v = cast(var, al->left);
4447                         struct value *vl = var_value(c, v->var);
4448                         struct value arg;
4449                         struct type *t;
4450                         mpq_t argcq;
4451                         int i;
4452
4453                         switch (anum++) {
4454                         case 0: /* argv */
4455                                 t = v->var->type;
4456                                 mpq_init(argcq);
4457                                 mpq_set_ui(argcq, argc, 1);
4458                                 memcpy(var_value(c, t->array.vsize), &argcq, sizeof(argcq));
4459                                 t->prepare_type(c, t, 0);
4460                                 array_init(v->var->type, vl);
4461                                 for (i = 0; i < argc; i++) {
4462                                         struct value *vl2 = vl->array + i * v->var->type->array.member->size;
4463
4464
4465                                         arg.str.txt = argv[i];
4466                                         arg.str.len = strlen(argv[i]);
4467                                         free_value(Tstr, vl2);
4468                                         dup_value(Tstr, &arg, vl2);
4469                                 }
4470                                 break;
4471                         }
4472                         al = cast(binode, al->right);
4473                 }
4474                 v = interp_exec(c, p->right, &vtype);
4475                 free_value(vtype, &v);
4476         }
4477
4478 ###### interp binode cases
4479         case List:
4480         case Func: abort();     // NOTEST
4481
4482 ## And now to test it out.
4483
4484 Having a language requires having a "hello world" program.  I'll
4485 provide a little more than that: a program that prints "Hello world"
4486 finds the GCD of two numbers, prints the first few elements of
4487 Fibonacci, performs a binary search for a number, and a few other
4488 things which will likely grow as the languages grows.
4489
4490 ###### File: oceani.mk
4491         demos :: sayhello
4492         sayhello : oceani
4493                 @echo "===== DEMO ====="
4494                 ./oceani --section "demo: hello" oceani.mdc 55 33
4495
4496 ###### demo: hello
4497
4498         const
4499                 pi ::= 3.141_592_6
4500                 four ::= 2 + 2 ; five ::= 10/2
4501         const pie ::= "I like Pie";
4502                 cake ::= "The cake is"
4503                   ++ " a lie"
4504
4505         struct fred
4506                 size:[four]number
4507                 name:string
4508                 alive:Boolean
4509
4510         func main
4511                 argv:[argc::]string
4512         do
4513                 print "Hello World, what lovely oceans you have!"
4514                 print "Are there", five, "?"
4515                 print pi, pie, "but", cake
4516
4517                 A := $argv[1]; B := $argv[2]
4518
4519                 /* When a variable is defined in both branches of an 'if',
4520                  * and used afterwards, the variables are merged.
4521                  */
4522                 if A > B:
4523                         bigger := "yes"
4524                 else
4525                         bigger := "no"
4526                 print "Is", A, "bigger than", B,"? ", bigger
4527                 /* If a variable is not used after the 'if', no
4528                  * merge happens, so types can be different
4529                  */
4530                 if A > B * 2:
4531                         double:string = "yes"
4532                         print A, "is more than twice", B, "?", double
4533                 else
4534                         double := B*2
4535                         print "double", B, "is", double
4536
4537                 a : number
4538                 a = A;
4539                 b:number = B
4540                 if a > 0 and then b > 0:
4541                         while a != b:
4542                                 if a < b:
4543                                         b = b - a
4544                                 else
4545                                         a = a - b
4546                         print "GCD of", A, "and", B,"is", a
4547                 else if a <= 0:
4548                         print a, "is not positive, cannot calculate GCD"
4549                 else
4550                         print b, "is not positive, cannot calculate GCD"
4551
4552                 for
4553                         togo := 10
4554                         f1 := 1; f2 := 1
4555                         print "Fibonacci:", f1,f2,
4556                 then togo = togo - 1
4557                 while togo > 0:
4558                         f3 := f1 + f2
4559                         print "", f3,
4560                         f1 = f2
4561                         f2 = f3
4562                 print ""
4563
4564                 /* Binary search... */
4565                 for
4566                         lo:= 0; hi := 100
4567                         target := 77
4568                 while
4569                         mid := (lo + hi) / 2
4570                         if mid == target:
4571                                 use Found
4572                         if mid < target:
4573                                 lo = mid
4574                         else
4575                                 hi = mid
4576                         if hi - lo < 1:
4577                                 use GiveUp
4578                         use True
4579                 do pass
4580                 case Found:
4581                         print "Yay, I found", target
4582                 case GiveUp:
4583                         print "Closest I found was", mid
4584
4585                 size::= 10
4586                 list:[size]number
4587                 list[0] = 1234
4588                 // "middle square" PRNG.  Not particularly good, but one my
4589                 // Dad taught me - the first one I ever heard of.
4590                 for i:=1; then i = i + 1; while i < size:
4591                         n := list[i-1] * list[i-1]
4592                         list[i] = (n / 100) % 10 000
4593
4594                 print "Before sort:",
4595                 for i:=0; then i = i + 1; while i < size:
4596                         print "", list[i],
4597                 print
4598
4599                 for i := 1; then i=i+1; while i < size:
4600                         for j:=i-1; then j=j-1; while j >= 0:
4601                                 if list[j] > list[j+1]:
4602                                         t:= list[j]
4603                                         list[j] = list[j+1]
4604                                         list[j+1] = t
4605                 print " After sort:",
4606                 for i:=0; then i = i + 1; while i < size:
4607                         print "", list[i],
4608                 print
4609
4610                 if 1 == 2 then print "yes"; else print "no"
4611
4612                 bob:fred
4613                 bob.name = "Hello"
4614                 bob.alive = (bob.name == "Hello")
4615                 print "bob", "is" if  bob.alive else "isn't", "alive"