ocean-lang.org Git - ocean/blob - csrc/oceani.mdc

   1 # Ocean Interpreter - Stoney Creek version
   2
   3 Ocean is intended to be a compiled language, so this interpreter is
   4 not targeted at being the final product.  It is, rather, an intermediate
   5 stage and fills that role in two distinct ways.
   6
   7 Firstly, it exists as a platform to experiment with the early language
   8 design.  An interpreter is easy to write and easy to get working, so
   9 the barrier for entry is lower if I aim to start with an interpreter.
  10
  11 Secondly, the plan for the Ocean compiler is to write it in the
  12 [Ocean language](http://ocean-lang.org).  To achieve this we naturally
  13 need some sort of boot-strap process and this interpreter - written in
  14 portable C - will fill that role.  It will be used to bootstrap the
  15 Ocean compiler.
  16
  17 Two features that are not needed to fill either of these roles are
  18 performance and completeness.  The interpreter only needs to be fast
  19 enough to run small test programs and occasionally to run the compiler
  20 on itself.  It only needs to be complete enough to test aspects of the
  21 design which are developed before the compiler is working, and to run
  22 the compiler on itself.  Any features not used by the compiler when
  23 compiling itself are superfluous.  They may be included anyway, but
  24 they may not.
  25
  26 Nonetheless, the interpreter should end up being reasonably complete,
  27 and any performance bottlenecks which appear and are easily fixed, will
  28 be.
  29
  30 ## Current version
  31
  32 This second version of the interpreter exists to test out the
  33 structured statement providing conditions and iteration, and simple
  34 variable scoping.  Clearly we need some minimal other functionality so
  35 that values can be tested and instructions iterated over.  All that
  36 functionality is clearly not normative at this stage (not that
  37 anything is **really** normative yet) and will change, so early test
  38 code will certainly break in later versions.
  39
  40 The under-test parts of the language are:
  41
  42  - conditional/looping structured statements
  43  - the `use` statement which is needed for that
  44  - Variable binding using ":=" and "::=", and assignment using "=".
  45
  46 Elements which are present to make a usable language are:
  47
  48  - "blocks" of multiple statements.
  49  - `pass`: a statement which does nothing.
  50  - expressions: `+`, `-`, `*`, `/` can apply to numbers and `++` can
  51    catenate strings.  `and`, `or`, `not` manipulate Booleans, and
  52    normal comparison operators can work on all three types.
  53  - `print`: will print the values in a list of expressions.
  54  - `program`: is given a list of identifiers to initialize from
  55    arguments.
  56
  57 ## Naming
  58
  59 Versions of the interpreter which obviously do not support a complete
  60 language will be named after creeks and streams.  This one is Stoney
  61 Creek.
  62
  63 Once we have something reasonably resembling a complete language, the
  64 names of rivers will be used.
  65 Early versions of the compiler will be named after seas.  Major
  66 releases of the compiler will be named after oceans.  Hopefully I will
  67 be finished once I get to the Pacific Ocean release.
  68
  69 ## Outline
  70
  71 As well as parsing and executing a program, the interpreter can print
  72 out the program from the parsed internal structure.  This is useful
  73 for validating the parsing.
  74 So the main requirements of the interpreter are:
  75
  76 - Parse the program, possibly with tracing,
  77 - Analyse the parsed program to ensure consistency,
  78 - Print the program,
  79 - Execute the program.
  80
  81 This is all performed by a single C program extracted with
  82 `parsergen`.
  83
  84 There will be two formats for printing the program: a default and one
  85 that uses bracketing.  So a `--bracket` command line option is needed
  86 for that.  Normally the first code section found is used, however an
  87 alternate section can be requested so that a file (such as this one)
  88 can contain multiple programs This is effected with the `--section`
  89 option.
  90
  91 This code must be compiled with `-fplan9-extensions` so that anonymous
  92 structures can be used.
  93
  94 ###### File: oceani.mk
  95
  96         myCFLAGS := -Wall -g -fplan9-extensions
  97         CFLAGS := $(filter-out $(myCFLAGS),$(CFLAGS)) $(myCFLAGS)
  98         myLDLIBS:= libparser.o libscanner.o libmdcode.o -licuuc
  99         LDLIBS := $(filter-out $(myLDLIBS),$(LDLIBS)) $(myLDLIBS)
 100         ## libs
 101         all :: $(LDLIBS) oceani
 102         oceani.c oceani.h : oceani.mdc parsergen
 103                 ./parsergen -o oceani --LALR --tag Parser oceani.mdc
 104         oceani.mk: oceani.mdc md2c
 105                 ./md2c oceani.mdc
 106
 107         oceani: oceani.o $(LDLIBS)
 108                 $(CC) $(CFLAGS) -o oceani oceani.o $(LDLIBS)
 109
 110 ###### Parser: header
 111         ## macros
 112         ## ast
 113         struct parse_context {
 114                 struct token_config config;
 115                 char *file_name;
 116                 int parse_error;
 117                 ## parse context
 118         };
 119
 120 ###### macros
 121
 122         #define container_of(ptr, type, member) ({                      \
 123                 const typeof( ((type *)0)->member ) *__mptr = (ptr);    \
 124                 (type *)( (char *)__mptr - offsetof(type,member) );})
 125
 126         #define config2context(_conf) container_of(_conf, struct parse_context, \
 127                 config)
 128
 129 ###### Parser: code
 130
 131         #include <unistd.h>
 132         #include <stdlib.h>
 133         #include <fcntl.h>
 134         #include <errno.h>
 135         #include <sys/mman.h>
 136         #include <string.h>
 137         #include <stdio.h>
 138         #include <locale.h>
 139         #include <malloc.h>
 140         #include "mdcode.h"
 141         #include "scanner.h"
 142         #include "parser.h"
 143
 144         ## includes
 145
 146         #include "oceani.h"
 147
 148         ## forward decls
 149         ## value functions
 150         ## ast functions
 151         ## core functions
 152
 153         #include <getopt.h>
 154         static char Usage[] = "Usage: oceani --trace --print --noexec --brackets"
 155                               "--section=SectionName prog.ocn\n";
 156         static const struct option long_options[] = {
 157                 {"trace",     0, NULL, 't'},
 158                 {"print",     0, NULL, 'p'},
 159                 {"noexec",    0, NULL, 'n'},
 160                 {"brackets",  0, NULL, 'b'},
 161                 {"section",   1, NULL, 's'},
 162                 {NULL,        0, NULL, 0},
 163         };
 164         const char *options = "tpnbs";
 165         int main(int argc, char *argv[])
 166         {
 167                 int fd;
 168                 int len;
 169                 char *file;
 170                 struct section *s;
 171                 char *section = NULL;
 172                 struct parse_context context = {
 173                         .config = {
 174                                 .ignored = (1 << TK_line_comment)
 175                                          | (1 << TK_block_comment),
 176                                 .number_chars = ".,_+-",
 177                                 .word_start = "_",
 178                                 .word_cont = "_",
 179                         },
 180                 };
 181                 int doprint=0, dotrace=0, doexec=1, brackets=0;
 182                 struct exec **prog;
 183                 int opt;
 184                 while ((opt = getopt_long(argc, argv, options, long_options, NULL))
 185                        != -1) {
 186                         switch(opt) {
 187                         case 't': dotrace=1; break;
 188                         case 'p': doprint=1; break;
 189                         case 'n': doexec=0; break;
 190                         case 'b': brackets=1; break;
 191                         case 's': section = optarg; break;
 192                         default: fprintf(stderr, Usage);
 193                                 exit(1);
 194                         }
 195                 }
 196                 if (optind >= argc) {
 197                         fprintf(stderr, "oceani: no input file given\n");
 198                         exit(1);
 199                 }
 200                 fd = open(argv[optind], O_RDONLY);
 201                 if (fd < 0) {
 202                         fprintf(stderr, "oceani: cannot open %s\n", argv[optind]);
 203                         exit(1);
 204                 }
 205                 context.file_name = argv[optind];
 206                 len = lseek(fd, 0, 2);
 207                 file = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
 208                 s = code_extract(file, file+len, NULL);
 209                 if (!s) {
 210                         fprintf(stderr, "oceani: could not find any code in %s\n",
 211                                 argv[optind]);
 212                         exit(1);
 213                 }
 214
 215                 ## context initialization
 216
 217                 if (section) {
 218                         struct section *ss;
 219                         for (ss = s; ss; ss = ss->next) {
 220                                 struct text sec = ss->section;
 221                                 if (sec.len == strlen(section) &&
 222                                     strncmp(sec.txt, section, sec.len) == 0)
 223                                         break;
 224                         }
 225                         if (ss)
 226                                 prog = parse_oceani(ss->code, &context.config,
 227                                                     dotrace ? stderr : NULL);
 228                         else {
 229                                 fprintf(stderr, "oceani: cannot find section %s\n",
 230                                         section);
 231                                 exit(1);
 232                         }
 233                 } else
 234                         prog = parse_oceani(s->code, &context.config,
 235                                     dotrace ? stderr : NULL);
 236                 if (!prog) {
 237                         fprintf(stderr, "oceani: fatal parser error.\n");
 238                         context.parse_error = 1;
 239                 }
 240                 if (prog && doprint)
 241                         print_exec(*prog, 0, brackets);
 242                 if (prog && doexec && !context.parse_error) {
 243                         if (!analyse_prog(*prog, &context)) {
 244                                 fprintf(stderr, "oceani: type error in program - not running.\n");
 245                                 exit(1);
 246                         }
 247                         interp_prog(*prog, argv+optind+1);
 248                 }
 249                 if (prog) {
 250                         free_exec(*prog);
 251                         free(prog);
 252                 }
 253                 while (s) {
 254                         struct section *t = s->next;
 255                         code_free(s->code);
 256                         free(s);
 257                         s = t;
 258                 }
 259                 ## free context vars
 260                 ## free context types
 261                 exit(context.parse_error ? 1 : 0);
 262         }
 263
 264 ### Analysis
 265
 266 The four requirements of parse, analyse, print, interpret apply to
 267 each language element individually so that is how most of the code
 268 will be structured.
 269
 270 Three of the four are fairly self explanatory.  The one that requires
 271 a little explanation is the analysis step.
 272
 273 The current language design does not require the types of variables to
 274 be declared, but they must still have a single type.  Different
 275 operations impose different requirements on the variables, for example
 276 addition requires both arguments to be numeric, and assignment
 277 requires the variable on the left to have the same type as the
 278 expression on the right.
 279
 280 Analysis involves propagating these type requirements around and
 281 consequently setting the type of each variable.  If any requirements
 282 are violated (e.g. a string is compared with a number) or if a
 283 variable needs to have two different types, then an error is raised
 284 and the program will not run.
 285
 286 If the same variable is declared in both branchs of an 'if/else', or
 287 in all cases of a 'switch' then the multiple instances may be merged
 288 into just one variable if the variable is references after the
 289 conditional statement.  When this happens, the types must naturally be
 290 consistent across all the branches.  When the variable is not used
 291 outside the if, the variables in the different branches are distinct
 292 and can be of different types.
 293
 294 Determining the types of all variables early is important for
 295 processing command line arguments.  These can be assigned to any type
 296 of variable, but we must first know the correct type so any required
 297 conversion can happen.  If a variable is associated with a command
 298 line argument but no type can be interpreted (e.g. the variable is
 299 only ever used in a `print` statement), then the type is set to
 300 'string'.
 301
 302 Undeclared names may only appear in "use" statements and "case" expressions.
 303 These names are given a type of "label" and a unique value.
 304 This allows them to fill the role of a name in an enumerated type, which
 305 is useful for testing the `switch` statement.
 306
 307 As we will see, the condition part of a `while` statement can return
 308 either a Boolean or some other type.  This requires that the expect
 309 type that gets passed around comprises a type (`enum vtype`) and a
 310 flag to indicate that `Vbool` is also permitted.
 311
 312 As there are, as yet, no distinct types that are compatible, there
 313 isn't much subtlety in the analysis.  When we have distinct number
 314 types, this will become more interesting.
 315
 316 #### Error reporting
 317
 318 When analysis discovers an inconsistency it needs to report an error;
 319 just refusing to run the code ensures that the error doesn't cascade,
 320 but by itself it isn't very useful.  A clear understand of the sort of
 321 error message that are useful will help guide the process of analysis.
 322
 323 At a simplistic level, the only sort of error that type analysis can
 324 report is that the type of some construct doesn't match a contextual
 325 requirement.  For example, in `4 + "hello"` the addition provides a
 326 contextual requirement for numbers, but `"hello"` is not a number.  In
 327 this particular example no further information is needed as the types
 328 are obvious from local information.  When a variable is involved that
 329 isn't the case.  It may be helpful to explain why the variable has a
 330 particular type, by indicating the location where the type was set,
 331 whether by declaration or usage.
 332
 333 Using a recursive-descent analysis we can easily detect a problem at
 334 multiple locations. In "`hello:= "there"; 4 + hello`" the addition
 335 will detect that one argument is not a number and the usage of `hello`
 336 will detect that a number was wanted, but not provided.  In this
 337 (early) version of the language, we will generate error reports at
 338 multiple locations, so the use of `hello` will report an error and
 339 explain were the value was set, and the addition will report an error
 340 and say why numbers are needed.  To be able to report locations for
 341 errors, each language element will need to record a file location
 342 (line and column) and each variable will need to record the language
 343 element where its type was set.  For now we will assume that each line
 344 of an error message indicates one location in the file, and up to 2
 345 types.  So we provide a `printf`-like function which takes a format, a
 346 language (a `struct exec` which has not yet been introduced), and 2
 347 types. "`%1`" reports the first type, "`%2`" reports the second.  We
 348 will need a function to print the location, once we know how that is
 349 stored.  As will be explained later, there are sometimes extra rules for
 350 type matching and they might affect error messages, we need to pass those
 351 in too.
 352
 353 As well as type errors, we sometimes need to report problems with
 354 tokens, which might be unexpected or might name a type that has not
 355 been defined.  For these we have `tok_err()` which reports an error
 356 with a given token.  Each of the error functions sets the flag in the
 357 context so indicate that parsing failed.
 358
 359 ###### forward decls
 360
 361         static void fput_loc(struct exec *loc, FILE *f);
 362
 363 ###### core functions
 364
 365         static void type_err(struct parse_context *c,
 366                              char *fmt, struct exec *loc,
 367                              struct type *t1, int rules, struct type *t2)
 368         {
 369                 fprintf(stderr, "%s:", c->file_name);
 370                 fput_loc(loc, stderr);
 371                 for (; *fmt ; fmt++) {
 372                         if (*fmt != '%') {
 373                                 fputc(*fmt, stderr);
 374                                 continue;
 375                         }
 376                         fmt++;
 377                         switch (*fmt) {
 378                         case '%': fputc(*fmt, stderr); break;
 379                         default: fputc('?', stderr); break;
 380                         case '1':
 381                                 type_print(t1, stderr);
 382                                 break;
 383                         case '2':
 384                                 type_print(t2, stderr);
 385                                 break;
 386                         ## format cases
 387                         }
 388                 }
 389                 fputs("\n", stderr);
 390                 c->parse_error = 1;
 391         }
 392
 393         static void tok_err(struct parse_context *c, char *fmt, struct token *t)
 394         {
 395                 fprintf(stderr, "%s:%d:%d: %s: %.*s\n", c->file_name, t->line, t->col, fmt,
 396                         t->txt.len, t->txt.txt);
 397                 c->parse_error = 1;
 398         }
 399
 400 ## Data Structures
 401
 402 One last introductory step before detailing the language elements and
 403 providing their four requirements is to establish the data structures
 404 to store these elements.
 405
 406 There are two key objects that we need to work with: executable
 407 elements which comprise the program, and values which the program
 408 works with.  Between these are the variables in their various scopes
 409 which hold the values, and types which classify the values stored and
 410 manipulatd by executables.
 411
 412 ### Types
 413
 414 Values come in a wide range of types, with more likely to be added.
 415 Each type needs to be able to parse and print its own values (for
 416 convenience at least) as well as to compare two values, at least for
 417 equality and possibly for order.  For now, values might need to be
 418 duplicated and freed, though eventually such manipulations will be
 419 better integrated into the language.
 420
 421 Rather than requiring every numeric type to support all numeric
 422 operations (add, multiple, etc), we allow types to be able to present
 423 as one of a few standard types: integer, float, and fraction.  The
 424 existance of these conversion functions enable types to determine if
 425 they are compatible with other types.
 426
 427 Named type are stored in a simple linked list.  Objects of each type are "values"
 428 which are often passed around by value.
 429
 430 ###### ast
 431
 432         struct value {
 433                 struct type *type;
 434                 union {
 435                         ## value union fields
 436                 };
 437         };
 438
 439         struct type {
 440                 struct text name;
 441                 struct type *next;
 442                 struct value (*init)(struct type *type);
 443                 struct value (*prepare)(struct type *type);
 444                 struct value (*parse)(struct type *type, char *str);
 445                 void (*print)(struct value val);
 446                 void (*print_type)(struct type *type, FILE *f);
 447                 int (*cmp_order)(struct value v1, struct value v2);
 448                 int (*cmp_eq)(struct value v1, struct value v2);
 449                 struct value (*dup)(struct value val);
 450                 void (*free)(struct value val);
 451                 int (*compat)(struct type *this, struct type *other);
 452                 long long (*to_int)(struct value *v);
 453                 double (*to_float)(struct value *v);
 454                 int (*to_mpq)(mpq_t *q, struct value *v);
 455                 union {
 456                         ## type union fields
 457                 };
 458         };
 459
 460 ###### parse context
 461
 462         struct type *typelist;
 463
 464 ###### ast functions
 465
 466         static struct type *find_type(struct parse_context *c, struct text s)
 467         {
 468                 struct type *l = c->typelist;
 469
 470                 while (l &&
 471                        text_cmp(l->name, s) != 0)
 472                                 l = l->next;
 473                 return l;
 474         }
 475
 476         static struct type *add_type(struct parse_context *c, struct text s,
 477                                      struct type *proto)
 478         {
 479                 struct type *n;
 480
 481                 n = calloc(1, sizeof(*n));
 482                 *n = *proto;
 483                 n->name = s;
 484                 n->next = c->typelist;
 485                 c->typelist = n;
 486                 return n;
 487         }
 488
 489         static void free_type(struct type *t)
 490         {
 491                 /* The type is always a reference to something in the
 492                  * context, so we don't need to free anything.
 493                  */
 494         }
 495
 496         static void free_value(struct value v)
 497         {
 498                 if (v.type)
 499                         v.type->free(v);
 500         }
 501
 502         static int type_compat(struct type *require, struct type *have, int rules)
 503         {
 504                 if ((rules & Rboolok) && have == Tbool)
 505                         return 1;
 506                 if ((rules & Rnolabel) && have == Tlabel)
 507                         return 0;
 508                 if (!require || !have)
 509                         return 1;
 510
 511                 if (require->compat)
 512                         return require->compat(require, have);
 513
 514                 return require == have;
 515         }
 516
 517         static void type_print(struct type *type, FILE *f)
 518         {
 519                 if (!type)
 520                         fputs("*unknown*type*", f);
 521                 else if (type->name.len)
 522                         fprintf(f, "%.*s", type->name.len, type->name.txt);
 523                 else if (type->print_type)
 524                         type->print_type(type, f);
 525                 else
 526                         fputs("*invalid*type*", f);
 527         }
 528
 529         static struct value val_prepare(struct type *type)
 530         {
 531                 struct value rv;
 532
 533                 if (type)
 534                         return type->prepare(type);
 535                 rv.type = type;
 536                 return rv;
 537         }
 538
 539         static struct value val_init(struct type *type)
 540         {
 541                 struct value rv;
 542
 543                 if (type)
 544                         return type->init(type);
 545                 rv.type = type;
 546                 return rv;
 547         }
 548
 549         static struct value dup_value(struct value v)
 550         {
 551                 if (v.type)
 552                         return v.type->dup(v);
 553                 return v;
 554         }
 555
 556         static int value_cmp(struct value left, struct value right)
 557         {
 558                 if (left.type && left.type->cmp_order)
 559                         return left.type->cmp_order(left, right);
 560                 if (left.type && left.type->cmp_eq)
 561                         return left.type->cmp_eq(left, right);
 562                 return -1;
 563         }
 564
 565         static void print_value(struct value v)
 566         {
 567                 if (v.type && v.type->print)
 568                         v.type->print(v);
 569                 else
 570                         printf("*Unknown*");
 571         }
 572
 573         static struct value parse_value(struct type *type, char *arg)
 574         {
 575                 struct value rv;
 576
 577                 if (type && type->parse)
 578                         return type->parse(type, arg);
 579                 rv.type = NULL;
 580                 return rv;
 581         }
 582
 583 ###### free context types
 584
 585         while (context.typelist) {
 586                 struct type *t = context.typelist;
 587
 588                 context.typelist = t->next;
 589                 free(t);
 590         }
 591
 592 #### Base Types
 593
 594 Values of the base types can be numbers, which we represent as
 595 multi-precision fractions, strings, Booleans and labels.  When
 596 analysing the program we also need to allow for places where no value
 597 is meaningful (type `Tnone`) and where we don't know what type to
 598 expect yet (type is `NULL`).
 599
 600 Values are never shared, they are always copied when used, and freed
 601 when no longer needed.
 602
 603 When propagating type information around the program, we need to
 604 determine if two types are compatible, where type `NULL` is compatible
 605 with anything.  There are two special cases with type compatibility,
 606 both related to the Conditional Statement which will be described
 607 later.  In some cases a Boolean can be accepted as well as some other
 608 primary type, and in others any type is acceptable except a label (`Vlabel`).
 609 A separate function encode these cases will simplify some code later.
 610
 611 When assigning command line arguments to variables, we need to be able
 612 to parse each type from a string.
 613
 614 ###### includes
 615         #include <gmp.h>
 616         #include "string.h"
 617         #include "number.h"
 618
 619 ###### libs
 620         myLDLIBS := libnumber.o libstring.o -lgmp
 621         LDLIBS := $(filter-out $(myLDLIBS),$(LDLIBS)) $(myLDLIBS)
 622
 623 ###### type union fields
 624         enum vtype {Vnone, Vstr, Vnum, Vbool, Vlabel} vtype;
 625
 626 ###### value union fields
 627         struct text str;
 628         mpq_t num;
 629         int bool;
 630         void *label;
 631
 632 ###### ast functions
 633         static void _free_value(struct value v)
 634         {
 635                 switch (v.type->vtype) {
 636                 case Vnone: break;
 637                 case Vstr: free(v.str.txt); break;
 638                 case Vnum: mpq_clear(v.num); break;
 639                 case Vlabel:
 640                 case Vbool: break;
 641                 }
 642         }
 643
 644 ###### value functions
 645
 646         static struct value _val_prepare(struct type *type)
 647         {
 648                 struct value rv;
 649
 650                 rv.type = type;
 651                 switch(type->vtype) {
 652                 case Vnone:
 653                         break;
 654                 case Vnum:
 655                         memset(&rv.num, 0, sizeof(rv.num));
 656                         break;
 657                 case Vstr:
 658                         rv.str.txt = NULL;
 659                         rv.str.len = 0;
 660                         break;
 661                 case Vbool:
 662                         rv.bool = 0;
 663                         break;
 664                 case Vlabel:
 665                         rv.label = NULL;
 666                         break;
 667                 }
 668                 return rv;
 669         }
 670
 671         static struct value _val_init(struct type *type)
 672         {
 673                 struct value rv;
 674
 675                 rv.type = type;
 676                 switch(type->vtype) {
 677                 case Vnone:
 678                         break;
 679                 case Vnum:
 680                         mpq_init(rv.num); break;
 681                 case Vstr:
 682                         rv.str.txt = malloc(1);
 683                         rv.str.len = 0;
 684                         break;
 685                 case Vbool:
 686                         rv.bool = 0;
 687                         break;
 688                 case Vlabel:
 689                         rv.label = NULL;
 690                         break;
 691                 }
 692                 return rv;
 693         }
 694
 695         static struct value _dup_value(struct value v)
 696         {
 697                 struct value rv;
 698                 rv.type = v.type;
 699                 switch (rv.type->vtype) {
 700                 case Vnone:
 701                         break;
 702                 case Vlabel:
 703                         rv.label = v.label;
 704                         break;
 705                 case Vbool:
 706                         rv.bool = v.bool;
 707                         break;
 708                 case Vnum:
 709                         mpq_init(rv.num);
 710                         mpq_set(rv.num, v.num);
 711                         break;
 712                 case Vstr:
 713                         rv.str.len = v.str.len;
 714                         rv.str.txt = malloc(rv.str.len);
 715                         memcpy(rv.str.txt, v.str.txt, v.str.len);
 716                         break;
 717                 }
 718                 return rv;
 719         }
 720
 721         static int _value_cmp(struct value left, struct value right)
 722         {
 723                 int cmp;
 724                 if (left.type != right.type)
 725                         return left.type - right.type;
 726                 switch (left.type->vtype) {
 727                 case Vlabel: cmp = left.label == right.label ? 0 : 1; break;
 728                 case Vnum: cmp = mpq_cmp(left.num, right.num); break;
 729                 case Vstr: cmp = text_cmp(left.str, right.str); break;
 730                 case Vbool: cmp = left.bool - right.bool; break;
 731                 case Vnone: cmp = 0;
 732                 }
 733                 return cmp;
 734         }
 735
 736         static void _print_value(struct value v)
 737         {
 738                 switch (v.type->vtype) {
 739                 case Vnone:
 740                         printf("*no-value*"); break;
 741                 case Vlabel:
 742                         printf("*label-%p*", v.label); break;
 743                 case Vstr:
 744                         printf("%.*s", v.str.len, v.str.txt); break;
 745                 case Vbool:
 746                         printf("%s", v.bool ? "True":"False"); break;
 747                 case Vnum:
 748                         {
 749                         mpf_t fl;
 750                         mpf_init2(fl, 20);
 751                         mpf_set_q(fl, v.num);
 752                         gmp_printf("%Fg", fl);
 753                         mpf_clear(fl);
 754                         break;
 755                         }
 756                 }
 757         }
 758
 759         static struct value _parse_value(struct type *type, char *arg)
 760         {
 761                 struct value val;
 762                 struct text tx;
 763                 int neg = 0;
 764                 char tail[3] = "";
 765
 766                 val.type = type;
 767                 switch(type->vtype) {
 768                 case Vlabel:
 769                 case Vnone:
 770                         val.type = NULL;
 771                         break;
 772                 case Vstr:
 773                         val.str.len = strlen(arg);
 774                         val.str.txt = malloc(val.str.len);
 775                         memcpy(val.str.txt, arg, val.str.len);
 776                         break;
 777                 case Vnum:
 778                         if (*arg == '-') {
 779                                 neg = 1;
 780                                 arg++;
 781                         }
 782                         tx.txt = arg; tx.len = strlen(tx.txt);
 783                         if (number_parse(val.num, tail, tx) == 0)
 784                                 mpq_init(val.num);
 785                         else if (neg)
 786                                 mpq_neg(val.num, val.num);
 787                         if (tail[0]) {
 788                                 printf("Unsupported suffix: %s\n", arg);
 789                                 val.type = NULL;
 790                         }
 791                         break;
 792                 case Vbool:
 793                         if (strcasecmp(arg, "true") == 0 ||
 794                             strcmp(arg, "1") == 0)
 795                                 val.bool = 1;
 796                         else if (strcasecmp(arg, "false") == 0 ||
 797                                  strcmp(arg, "0") == 0)
 798                                 val.bool = 0;
 799                         else {
 800                                 printf("Bad bool: %s\n", arg);
 801                                 val.type = NULL;
 802                         }
 803                         break;
 804                 }
 805                 return val;
 806         }
 807
 808         static void _free_value(struct value v);
 809
 810         static struct type base_prototype = {
 811                 .init = _val_init,
 812                 .prepare = _val_prepare,
 813                 .parse = _parse_value,
 814                 .print = _print_value,
 815                 .cmp_order = _value_cmp,
 816                 .cmp_eq = _value_cmp,
 817                 .dup = _dup_value,
 818                 .free = _free_value,
 819         };
 820
 821         static struct type *Tbool, *Tstr, *Tnum, *Tnone, *Tlabel;
 822
 823 ###### ast functions
 824         static struct type *add_base_type(struct parse_context *c, char *n, enum vtype vt)
 825         {
 826                 struct text txt = { n, strlen(n) };
 827                 struct type *t;
 828
 829                 t = add_type(c, txt, &base_prototype);
 830                 t->vtype = vt;
 831                 return t;
 832         }
 833
 834 ###### context initialization
 835
 836         Tbool  = add_base_type(&context, "Boolean", Vbool);
 837         Tstr   = add_base_type(&context, "string", Vstr);
 838         Tnum   = add_base_type(&context, "number", Vnum);
 839         Tnone  = add_base_type(&context, "none", Vnone);
 840         Tlabel = add_base_type(&context, "label", Vlabel);
 841
 842 ### Variables
 843
 844 Variables are scoped named values.  We store the names in a linked
 845 list of "bindings" sorted lexically, and use sequential search and
 846 insertion sort.
 847
 848 ###### ast
 849
 850         struct binding {
 851                 struct text name;
 852                 struct binding *next;   // in lexical order
 853                 ## binding fields
 854         };
 855
 856 This linked list is stored in the parse context so that "reduce"
 857 functions can find or add variables, and so the analysis phase can
 858 ensure that every variable gets a type.
 859
 860 ###### parse context
 861
 862         struct binding *varlist;  // In lexical order
 863
 864 ###### ast functions
 865
 866         static struct binding *find_binding(struct parse_context *c, struct text s)
 867         {
 868                 struct binding **l = &c->varlist;
 869                 struct binding *n;
 870                 int cmp = 1;
 871
 872                 while (*l &&
 873                         (cmp = text_cmp((*l)->name, s)) < 0)
 874                                 l = & (*l)->next;
 875                 if (cmp == 0)
 876                         return *l;
 877                 n = calloc(1, sizeof(*n));
 878                 n->name = s;
 879                 n->next = *l;
 880                 *l = n;
 881                 return n;
 882         }
 883
 884 Each name can be linked to multiple variables defined in different
 885 scopes.  Each scope starts where the name is declared and continues
 886 until the end of the containing code block.  Scopes of a given name
 887 cannot nest, so a declaration while a name is in-scope is an error.
 888
 889 ###### binding fields
 890         struct variable *var;
 891
 892 ###### ast
 893         struct variable {
 894                 struct variable *previous;
 895                 struct value val;
 896                 struct binding *name;
 897                 struct exec *where_decl;// where name was declared
 898                 struct exec *where_set; // where type was set
 899                 ## variable fields
 900         };
 901
 902 While the naming seems strange, we include local constants in the
 903 definition of variables.  A name declared `var := value` can
 904 subsequently be changed, but a name declared `var ::= value` cannot -
 905 it is constant
 906
 907 ###### variable fields
 908         int constant;
 909
 910 Scopes in parallel branches can be partially merged.  More
 911 specifically, if a given name is declared in both branches of an
 912 if/else then its scope is a candidate for merging.  Similarly if
 913 every branch of an exhaustive switch (e.g. has an "else" clause)
 914 declares a given name, then the scopes from the branches are
 915 candidates for merging.
 916
 917 Note that names declared inside a loop (which is only parallel to
 918 itself) are never visible after the loop.  Similarly names defined in
 919 scopes which are not parallel, such as those started by `for` and
 920 `switch`, are never visible after the scope.  Only variables defined in
 921 both `then` and `else` (including the implicit then after an `if`, and
 922 excluding `then` used with `for`) and in all `case`s and `else` of a
 923 `switch` or `while` can be visible beyond the `if`/`switch`/`while`.
 924
 925 Labels, which are a bit like variables, follow different rules.
 926 Labels are not explicitly declared, but if an undeclared name appears
 927 in a context where a label is legal, that effectively declares the
 928 name as a label.  The declaration remains in force (or in scope) at
 929 least to the end of the immediately containing block and conditionally
 930 in any larger containing block which does not declare the name in some
 931 other way.  Importantly, the conditional scope extension happens even
 932 if the label is only used in one parallel branch of a conditional --
 933 when used in one branch it is treated as having been declared in all
 934 branches.
 935
 936 Merge candidates are tentatively visible beyond the end of the
 937 branching statement which creates them.  If the name is used, the
 938 merge is affirmed and they become a single variable visible at the
 939 outer layer.  If not - if it is redeclared first - the merge lapses.
 940
 941 To track scopes we have an extra stack, implemented as a linked list,
 942 which roughly parallels the parse stack and which is used exclusively
 943 for scoping.  When a new scope is opened, a new frame is pushed and
 944 the child-count of the parent frame is incremented.  This child-count
 945 is used to distinguish between the first of a set of parallel scopes,
 946 in which declared variables must not be in scope, and subsequent
 947 branches, whether they must already be conditionally scoped.
 948
 949 To push a new frame *before* any code in the frame is parsed, we need a
 950 grammar reduction.  This is most easily achieved with a grammar
 951 element which derives the empty string, and creates the new scope when
 952 it is recognized.  This can be placed, for example, between a keyword
 953 like "if" and the code following it.
 954
 955 ###### ast
 956         struct scope {
 957                 struct scope *parent;
 958                 int child_count;
 959         };
 960
 961 ###### parse context
 962         int scope_depth;
 963         struct scope *scope_stack;
 964
 965 ###### ast functions
 966         static void scope_pop(struct parse_context *c)
 967         {
 968                 struct scope *s = c->scope_stack;
 969
 970                 c->scope_stack = s->parent;
 971                 free(s);
 972                 c->scope_depth -= 1;
 973         }
 974
 975         static void scope_push(struct parse_context *c)
 976         {
 977                 struct scope *s = calloc(1, sizeof(*s));
 978                 if (c->scope_stack)
 979                         c->scope_stack->child_count += 1;
 980                 s->parent = c->scope_stack;
 981                 c->scope_stack = s;
 982                 c->scope_depth += 1;
 983         }
 984
 985 ###### Grammar
 986
 987         $void
 988         OpenScope -> ${ scope_push(config2context(config)); }$
 989
 990
 991 Each variable records a scope depth and is in one of four states:
 992
 993 - "in scope".  This is the case between the declaration of the
 994   variable and the end of the containing block, and also between
 995   the usage with affirms a merge and the end of that block.
 996
 997   The scope depth is not greater than the current parse context scope
 998   nest depth.  When the block of that depth closes, the state will
 999   change.  To achieve this, all "in scope" variables are linked
1000   together as a stack in nesting order.
1001
1002 - "pending".  The "in scope" block has closed, but other parallel
1003   scopes are still being processed.  So far, every parallel block at
1004   the same level that has closed has declared the name.
1005
1006   The scope depth is the depth of the last parallel block that
1007   enclosed the declaration, and that has closed.
1008
1009 - "conditionally in scope".  The "in scope" block and all parallel
1010   scopes have closed, and no further mention of the name has been
1011   seen.  This state includes a secondary nest depth which records the
1012   outermost scope seen since the variable became conditionally in
1013   scope.  If a use of the name is found, the variable becomes "in
1014   scope" and that secondary depth becomes the recorded scope depth.
1015   If the name is declared as a new variable, the old variable becomes
1016   "out of scope" and the recorded scope depth stays unchanged.
1017
1018 - "out of scope".  The variable is neither in scope nor conditionally
1019   in scope.  It is permanently out of scope now and can be removed from
1020   the "in scope" stack.
1021
1022
1023 ###### variable fields
1024         int depth, min_depth;
1025         enum { OutScope, PendingScope, CondScope, InScope } scope;
1026         struct variable *in_scope;
1027
1028 ###### parse context
1029
1030         struct variable *in_scope;
1031
1032 All variables with the same name are linked together using the
1033 'previous' link.  Those variable that have
1034 been affirmatively merged all have a 'merged' pointer that points to
1035 one primary variable - the most recently declared instance. When
1036 merging variables, we need to also adjust the 'merged' pointer on any
1037 other variables that had previously been merged with the one that will
1038 no longer be primary.
1039
1040 ###### variable fields
1041         struct variable *merged;
1042
1043 ###### ast functions
1044
1045         static void variable_merge(struct variable *primary, struct variable *secondary)
1046         {
1047                 struct variable *v;
1048
1049                 if (primary->merged)
1050                         // shouldn't happen
1051                         primary = primary->merged;
1052
1053                 for (v = primary->previous; v; v=v->previous)
1054                         if (v == secondary || v == secondary->merged ||
1055                             v->merged == secondary ||
1056                             (v->merged && v->merged == secondary->merged)) {
1057                                 v->scope = OutScope;
1058                                 v->merged = primary;
1059                         }
1060         }
1061
1062 ###### free context vars
1063
1064         while (context.varlist) {
1065                 struct binding *b = context.varlist;
1066                 struct variable *v = b->var;
1067                 context.varlist = b->next;
1068                 free(b);
1069                 while (v) {
1070                         struct variable *t = v;
1071
1072                         v = t->previous;
1073                         free_value(t->val);
1074                         free(t);
1075                 }
1076         }
1077
1078 #### Manipulating Bindings
1079
1080 When a name is conditionally visible, a new declaration discards the
1081 old binding - the condition lapses.  Conversely a usage of the name
1082 affirms the visibility and extends it to the end of the containing
1083 block - i.e. the block that contains both the original declaration and
1084 the latest usage.  This is determined from `min_depth`.  When a
1085 conditionally visible variable gets affirmed like this, it is also
1086 merged with other conditionally visible variables with the same name.
1087
1088 When we parse a variable declaration we either signal an error if the
1089 name is currently bound, or create a new variable at the current nest
1090 depth if the name is unbound or bound to a conditionally scoped or
1091 pending-scope variable.  If the previous variable was conditionally
1092 scoped, it and its homonyms becomes out-of-scope.
1093
1094 When we parse a variable reference (including non-declarative
1095 assignment) we signal an error if the name is not bound or is bound to
1096 a pending-scope variable; update the scope if the name is bound to a
1097 conditionally scoped variable; or just proceed normally if the named
1098 variable is in scope.
1099
1100 When we exit a scope, any variables bound at this level are either
1101 marked out of scope or pending-scoped, depending on whether the
1102 scope was sequential or parallel.
1103
1104 When exiting a parallel scope we check if there are any variables that
1105 were previously pending and are still visible. If there are, then
1106 there weren't redeclared in the most recent scope, so they cannot be
1107 merged and must become out-of-scope.  If it is not the first of
1108 parallel scopes (based on `child_count`), we check that there was a
1109 previous binding that is still pending-scope.  If there isn't, the new
1110 variable must now be out-of-scope.
1111
1112 When exiting a sequential scope that immediately enclosed parallel
1113 scopes, we need to resolve any pending-scope variables.  If there was
1114 no `else` clause, and we cannot determine that the `switch` was exhaustive,
1115 we need to mark all pending-scope variable as out-of-scope.  Otherwise
1116 all pending-scope variables become conditionally scoped.
1117
1118 ###### ast
1119         enum closetype { CloseSequential, CloseParallel, CloseElse };
1120
1121 ###### ast functions
1122
1123         static struct variable *var_decl(struct parse_context *c, struct text s)
1124         {
1125                 struct binding *b = find_binding(c, s);
1126                 struct variable *v = b->var;
1127
1128                 switch (v ? v->scope : OutScope) {
1129                 case InScope:
1130                         /* Caller will report the error */
1131                         return NULL;
1132                 case CondScope:
1133                         for (;
1134                              v && v->scope == CondScope;
1135                              v = v->previous)
1136                                 v->scope = OutScope;
1137                         break;
1138                 default: break;
1139                 }
1140                 v = calloc(1, sizeof(*v));
1141                 v->previous = b->var;
1142                 b->var = v;
1143                 v->name = b;
1144                 v->min_depth = v->depth = c->scope_depth;
1145                 v->scope = InScope;
1146                 v->in_scope = c->in_scope;
1147                 c->in_scope = v;
1148                 v->val = val_prepare(NULL);
1149                 return v;
1150         }
1151
1152         static struct variable *var_ref(struct parse_context *c, struct text s)
1153         {
1154                 struct binding *b = find_binding(c, s);
1155                 struct variable *v = b->var;
1156                 struct variable *v2;
1157
1158                 switch (v ? v->scope : OutScope) {
1159                 case OutScope:
1160                 case PendingScope:
1161                         /* Signal an error - once that is possible */
1162                         return NULL;
1163                 case CondScope:
1164                         /* All CondScope variables of this name need to be merged
1165                          * and become InScope
1166                          */
1167                         v->depth = v->min_depth;
1168                         v->scope = InScope;
1169                         for (v2 = v->previous;
1170                              v2 && v2->scope == CondScope;
1171                              v2 = v2->previous)
1172                                 variable_merge(v, v2);
1173                         break;
1174                 case InScope:
1175                         break;
1176                 }
1177                 return v;
1178         }
1179
1180         static void var_block_close(struct parse_context *c, enum closetype ct)
1181         {
1182                 /* close of all variables that are in_scope */
1183                 struct variable *v, **vp, *v2;
1184
1185                 scope_pop(c);
1186                 for (vp = &c->in_scope;
1187                      v = *vp, v && v->depth > c->scope_depth && v->min_depth > c->scope_depth;
1188                      ) {
1189                         switch (ct) {
1190                         case CloseElse:
1191                         case CloseParallel: /* handle PendingScope */
1192                                 switch(v->scope) {
1193                                 case InScope:
1194                                 case CondScope:
1195                                         if (c->scope_stack->child_count == 1)
1196                                                 v->scope = PendingScope;
1197                                         else if (v->previous &&
1198                                                  v->previous->scope == PendingScope)
1199                                                 v->scope = PendingScope;
1200                                         else if (v->val.type == Tlabel)
1201                                                 v->scope = PendingScope;
1202                                         else if (v->name->var == v)
1203                                                 v->scope = OutScope;
1204                                         if (ct == CloseElse) {
1205                                                 /* All Pending variables with this name
1206                                                  * are now Conditional */
1207                                                 for (v2 = v;
1208                                                      v2 && v2->scope == PendingScope;
1209                                                      v2 = v2->previous)
1210                                                         v2->scope = CondScope;
1211                                         }
1212                                         break;
1213                                 case PendingScope:
1214                                         for (v2 = v;
1215                                              v2 && v2->scope == PendingScope;
1216                                              v2 = v2->previous)
1217                                                 if (v2->val.type != Tlabel)
1218                                                         v2->scope = OutScope;
1219                                         break;
1220                                 case OutScope: break;
1221                                 }
1222                                 break;
1223                         case CloseSequential:
1224                                 if (v->val.type == Tlabel)
1225                                         v->scope = PendingScope;
1226                                 switch (v->scope) {
1227                                 case InScope:
1228                                         v->scope = OutScope;
1229                                         break;
1230                                 case PendingScope:
1231                                         /* There was no 'else', so we can only become
1232                                          * conditional if we know the cases were exhaustive,
1233                                          * and that doesn't mean anything yet.
1234                                          * So only labels become conditional..
1235                                          */
1236                                         for (v2 = v;
1237                                              v2 && v2->scope == PendingScope;
1238                                              v2 = v2->previous)
1239                                                 if (v2->val.type == Tlabel) {
1240                                                         v2->scope = CondScope;
1241                                                         v2->min_depth = c->scope_depth;
1242                                                 } else
1243                                                         v2->scope = OutScope;
1244                                         break;
1245                                 case CondScope:
1246                                 case OutScope: break;
1247                                 }
1248                                 break;
1249                         }
1250                         if (v->scope == OutScope)
1251                                 *vp = v->in_scope;
1252                         else
1253                                 vp = &v->in_scope;
1254                 }
1255         }
1256
1257 ### Executables
1258
1259 Executables can be lots of different things.  In many cases an
1260 executable is just an operation combined with one or two other
1261 executables.  This allows for expressions and lists etc.  Other times
1262 an executable is something quite specific like a constant or variable
1263 name.  So we define a `struct exec` to be a general executable with a
1264 type, and a `struct binode` which is a subclass of `exec`, forms a
1265 node in a binary tree, and holds an operation. There will be other
1266 subclasses, and to access these we need to be able to `cast` the
1267 `exec` into the various other types.
1268
1269 ###### macros
1270         #define cast(structname, pointer) ({            \
1271                 const typeof( ((struct structname *)0)->type) *__mptr = &(pointer)->type; \
1272                 if (__mptr && *__mptr != X##structname) abort();                \
1273                 (struct structname *)( (char *)__mptr);})
1274
1275         #define new(structname) ({                                              \
1276                 struct structname *__ptr = ((struct structname *)calloc(1,sizeof(struct structname))); \
1277                 __ptr->type = X##structname;                                            \
1278                 __ptr->line = -1; __ptr->column = -1;                                   \
1279                 __ptr;})
1280
1281         #define new_pos(structname, token) ({                                           \
1282                 struct structname *__ptr = ((struct structname *)calloc(1,sizeof(struct structname))); \
1283                 __ptr->type = X##structname;                                            \
1284                 __ptr->line = token.line; __ptr->column = token.col;                    \
1285                 __ptr;})
1286
1287 ###### ast
1288         enum exec_types {
1289                 Xbinode,
1290                 ## exec type
1291         };
1292         struct exec {
1293                 enum exec_types type;
1294                 int line, column;
1295         };
1296         struct binode {
1297                 struct exec;
1298                 enum Btype {
1299                         ## Binode types
1300                 } op;
1301                 struct exec *left, *right;
1302         };
1303
1304 ###### ast functions
1305
1306         static int __fput_loc(struct exec *loc, FILE *f)
1307         {
1308                 if (loc->line >= 0) {
1309                         fprintf(f, "%d:%d: ", loc->line, loc->column);
1310                         return 1;
1311                 }
1312                 if (loc->type == Xbinode)
1313                         return __fput_loc(cast(binode,loc)->left, f) ||
1314                                __fput_loc(cast(binode,loc)->right, f);
1315                 return 0;
1316         }
1317         static void fput_loc(struct exec *loc, FILE *f)
1318         {
1319                 if (!__fput_loc(loc, f))
1320                         fprintf(f, "??:??: ");
1321         }
1322
1323 Each different type of `exec` node needs a number of functions
1324 defined, a bit like methods.  We must be able to be able to free it,
1325 print it, analyse it and execute it.  Once we have specific `exec`
1326 types we will need to parse them too.  Let's take this a bit more
1327 slowly.
1328
1329 #### Freeing
1330
1331 The parser generator requires a `free_foo` function for each struct
1332 that stores attributes and they will be `exec`s and subtypes there-of.
1333 So we need `free_exec` which can handle all the subtypes, and we need
1334 `free_binode`.
1335
1336 ###### ast functions
1337
1338         static void free_binode(struct binode *b)
1339         {
1340                 if (!b)
1341                         return;
1342                 free_exec(b->left);
1343                 free_exec(b->right);
1344                 free(b);
1345         }
1346
1347 ###### core functions
1348         static void free_exec(struct exec *e)
1349         {
1350                 if (!e)
1351                         return;
1352                 switch(e->type) {
1353                         ## free exec cases
1354                 }
1355         }
1356
1357 ###### forward decls
1358
1359         static void free_exec(struct exec *e);
1360
1361 ###### free exec cases
1362         case Xbinode: free_binode(cast(binode, e)); break;
1363
1364 #### Printing
1365
1366 Printing an `exec` requires that we know the current indent level for
1367 printing line-oriented components.  As will become clear later, we
1368 also want to know what sort of bracketing to use.
1369
1370 ###### ast functions
1371
1372         static void do_indent(int i, char *str)
1373         {
1374                 while (i--)
1375                         printf("    ");
1376                 printf("%s", str);
1377         }
1378
1379 ###### core functions
1380         static void print_binode(struct binode *b, int indent, int bracket)
1381         {
1382                 struct binode *b2;
1383                 switch(b->op) {
1384                 ## print binode cases
1385                 }
1386         }
1387
1388         static void print_exec(struct exec *e, int indent, int bracket)
1389         {
1390                 if (!e)
1391                         return;
1392                 switch (e->type) {
1393                 case Xbinode:
1394                         print_binode(cast(binode, e), indent, bracket); break;
1395                 ## print exec cases
1396                 }
1397         }
1398
1399 ###### forward decls
1400
1401         static void print_exec(struct exec *e, int indent, int bracket);
1402
1403 #### Analysing
1404
1405 As discussed, analysis involves propagating type requirements around
1406 the program and looking for errors.
1407
1408 So `propagate_types` is passed an expected type (being a `struct type`
1409 pointer together with some `val_rules` flags) that the `exec` is
1410 expected to return, and returns the type that it does return, either
1411 of which can be `NULL` signifying "unknown".  An `ok` flag is passed
1412 by reference. It is set to `0` when an error is found, and `2` when
1413 any change is made.  If it remains unchanged at `1`, then no more
1414 propagation is needed.
1415
1416 ###### ast
1417
1418         enum val_rules {Rnolabel = 1<<0, Rboolok = 1<<1, Rnoconstant = 2<<1};
1419
1420 ###### format cases
1421         case 'r':
1422                 if (rules & Rnolabel)
1423                         fputs(" (labels not permitted)", stderr);
1424                 break;
1425
1426 ###### core functions
1427
1428         static struct type *propagate_types(struct exec *prog, struct parse_context *c, int *ok,
1429                                             struct type *type, int rules)
1430         {
1431                 struct type *t;
1432
1433                 if (!prog)
1434                         return Tnone;
1435
1436                 switch (prog->type) {
1437                 case Xbinode:
1438                 {
1439                         struct binode *b = cast(binode, prog);
1440                         switch (b->op) {
1441                         ## propagate binode cases
1442                         }
1443                         break;
1444                 }
1445                 ## propagate exec cases
1446                 }
1447                 return Tnone;
1448         }
1449
1450 #### Interpreting
1451
1452 Interpreting an `exec` doesn't require anything but the `exec`.  State
1453 is stored in variables and each variable will be directly linked from
1454 within the `exec` tree.  The exception to this is the whole `program`
1455 which needs to look at command line arguments.  The `program` will be
1456 interpreted separately.
1457
1458 Each `exec` can return a value, which may be `Tnone` but must be non-NULL;
1459
1460 ###### core functions
1461
1462         struct lrval {
1463                 struct value val, *lval;
1464         };
1465
1466         static struct lrval _interp_exec(struct exec *e);
1467
1468         static struct value interp_exec(struct exec *e)
1469         {
1470                 struct lrval ret = _interp_exec(e);
1471
1472                 if (ret.lval)
1473                         return dup_value(*ret.lval);
1474                 else
1475                         return ret.val;
1476         }
1477
1478         static struct value *linterp_exec(struct exec *e)
1479         {
1480                 struct lrval ret = _interp_exec(e);
1481
1482                 return ret.lval;
1483         }
1484
1485         static struct lrval _interp_exec(struct exec *e)
1486         {
1487                 struct lrval ret;
1488                 struct value rv, *lrv = NULL;
1489                 rv.type = Tnone;
1490                 if (!e) {
1491                         ret.lval = lrv;
1492                         ret.val = rv;
1493                         return ret;
1494                 }
1495
1496                 switch(e->type) {
1497                 case Xbinode:
1498                 {
1499                         struct binode *b = cast(binode, e);
1500                         struct value left, right, *lleft;
1501                         left.type = right.type = Tnone;
1502                         switch (b->op) {
1503                         ## interp binode cases
1504                         }
1505                         free_value(left); free_value(right);
1506                         break;
1507                 }
1508                 ## interp exec cases
1509                 }
1510                 ret.lval = lrv;
1511                 ret.val = rv;
1512                 return ret;
1513         }
1514
1515 ## Language elements
1516
1517 Each language element needs to be parsed, printed, analysed,
1518 interpreted, and freed.  There are several, so let's just start with
1519 the easy ones and work our way up.
1520
1521 ### Values
1522
1523 We have already met values as separate objects.  When manifest
1524 constants appear in the program text, that must result in an executable
1525 which has a constant value.  So the `val` structure embeds a value in
1526 an executable.
1527
1528 ###### exec type
1529         Xval,
1530
1531 ###### ast
1532         struct val {
1533                 struct exec;
1534                 struct value val;
1535         };
1536
1537 ###### Grammar
1538
1539         $*val
1540         Value ->  True ${
1541                         $0 = new_pos(val, $1);
1542                         $0->val.type = Tbool;
1543                         $0->val.bool = 1;
1544                         }$
1545                 | False ${
1546                         $0 = new_pos(val, $1);
1547                         $0->val.type = Tbool;
1548                         $0->val.bool = 0;
1549                         }$
1550                 | NUMBER ${
1551                         $0 = new_pos(val, $1);
1552                         $0->val.type = Tnum;
1553                         {
1554                         char tail[3];
1555                         if (number_parse($0->val.num, tail, $1.txt) == 0)
1556                                 mpq_init($0->val.num);
1557                                 if (tail[0])
1558                                         tok_err(config2context(config), "error: unsupported number suffix",
1559                                                 &$1);
1560                         }
1561                         }$
1562                 | STRING ${
1563                         $0 = new_pos(val, $1);
1564                         $0->val.type = Tstr;
1565                         {
1566                         char tail[3];
1567                         string_parse(&$1, '\\', &$0->val.str, tail);
1568                         if (tail[0])
1569                                 tok_err(config2context(config), "error: unsupported string suffix",
1570                                         &$1);
1571                         }
1572                         }$
1573                 | MULTI_STRING ${
1574                         $0 = new_pos(val, $1);
1575                         $0->val.type = Tstr;
1576                         {
1577                         char tail[3];
1578                         string_parse(&$1, '\\', &$0->val.str, tail);
1579                         if (tail[0])
1580                                 tok_err(config2context(config), "error: unsupported string suffix",
1581                                         &$1);
1582                         }
1583                         }$
1584
1585 ###### print exec cases
1586         case Xval:
1587         {
1588                 struct val *v = cast(val, e);
1589                 if (v->val.type == Tstr)
1590                         printf("\"");
1591                 print_value(v->val);
1592                 if (v->val.type == Tstr)
1593                         printf("\"");
1594                 break;
1595         }
1596
1597 ###### propagate exec cases
1598                 case Xval:
1599                 {
1600                         struct val *val = cast(val, prog);
1601                         if (!type_compat(type, val->val.type, rules)) {
1602                                 type_err(c, "error: expected %1%r found %2",
1603                                            prog, type, rules, val->val.type);
1604                                 *ok = 0;
1605                         }
1606                         return val->val.type;
1607                 }
1608
1609 ###### interp exec cases
1610         case Xval:
1611                 rv = dup_value(cast(val, e)->val);
1612                 break;
1613
1614 ###### ast functions
1615         static void free_val(struct val *v)
1616         {
1617                 if (!v)
1618                         return;
1619                 free_value(v->val);
1620                 free(v);
1621         }
1622
1623 ###### free exec cases
1624         case Xval: free_val(cast(val, e)); break;
1625
1626 ###### ast functions
1627         // Move all nodes from 'b' to 'rv', reversing the order.
1628         // In 'b' 'left' is a list, and 'right' is the last node.
1629         // In 'rv', left' is the first node and 'right' is a list.
1630         static struct binode *reorder_bilist(struct binode *b)
1631         {
1632                 struct binode *rv = NULL;
1633
1634                 while (b) {
1635                         struct exec *t = b->right;
1636                         b->right = rv;
1637                         rv = b;
1638                         if (b->left)
1639                                 b = cast(binode, b->left);
1640                         else
1641                                 b = NULL;
1642                         rv->left = t;
1643                 }
1644                 return rv;
1645         }
1646
1647 ### Variables
1648
1649 Just as we used a `val` to wrap a value into an `exec`, we similarly
1650 need a `var` to wrap a `variable` into an exec.  While each `val`
1651 contained a copy of the value, each `var` hold a link to the variable
1652 because it really is the same variable no matter where it appears.
1653 When a variable is used, we need to remember to follow the `->merged`
1654 link to find the primary instance.
1655
1656 ###### exec type
1657         Xvar,
1658
1659 ###### ast
1660         struct var {
1661                 struct exec;
1662                 struct variable *var;
1663         };
1664
1665 ###### Grammar
1666
1667         $*var
1668         VariableDecl -> IDENTIFIER : ${ {
1669                 struct variable *v = var_decl(config2context(config), $1.txt);
1670                 $0 = new_pos(var, $1);
1671                 $0->var = v;
1672                 if (v)
1673                         v->where_decl = $0;
1674                 else {
1675                         v = var_ref(config2context(config), $1.txt);
1676                         $0->var = v;
1677                         type_err(config2context(config), "error: variable '%v' redeclared",
1678                                  $0, Tnone, 0, Tnone);
1679                         type_err(config2context(config), "info: this is where '%v' was first declared",
1680                                  v->where_decl, Tnone, 0, Tnone);
1681                 }
1682         } }$
1683             | IDENTIFIER :: ${ {
1684                 struct variable *v = var_decl(config2context(config), $1.txt);
1685                 $0 = new_pos(var, $1);
1686                 $0->var = v;
1687                 if (v) {
1688                         v->where_decl = $0;
1689                         v->constant = 1;
1690                 } else {
1691                         v = var_ref(config2context(config), $1.txt);
1692                         $0->var = v;
1693                         type_err(config2context(config), "error: variable '%v' redeclared",
1694                                  $0, Tnone, 0, Tnone);
1695                         type_err(config2context(config), "info: this is where '%v' was first declared",
1696                                  v->where_decl, Tnone, 0, Tnone);
1697                 }
1698         } }$
1699             | IDENTIFIER : Type ${ {
1700                 struct variable *v = var_decl(config2context(config), $1.txt);
1701                 $0 = new_pos(var, $1);
1702                 $0->var = v;
1703                 if (v) {
1704                         v->where_decl = $0;
1705                         v->where_set = $0;
1706                         v->val = val_prepare($<3);
1707                 } else {
1708                         v = var_ref(config2context(config), $1.txt);
1709                         $0->var = v;
1710                         type_err(config2context(config), "error: variable '%v' redeclared",
1711                                  $0, Tnone, 0, Tnone);
1712                         type_err(config2context(config), "info: this is where '%v' was first declared",
1713                                  v->where_decl, Tnone, 0, Tnone);
1714                 }
1715         } }$
1716             | IDENTIFIER :: Type ${ {
1717                 struct variable *v = var_decl(config2context(config), $1.txt);
1718                 $0 = new_pos(var, $1);
1719                 $0->var = v;
1720                 if (v) {
1721                         v->where_decl = $0;
1722                         v->where_set = $0;
1723                         v->val = val_prepare($<3);
1724                         v->constant = 1;
1725                 } else {
1726                         v = var_ref(config2context(config), $1.txt);
1727                         $0->var = v;
1728                         type_err(config2context(config), "error: variable '%v' redeclared",
1729                                  $0, Tnone, 0, Tnone);
1730                         type_err(config2context(config), "info: this is where '%v' was first declared",
1731                                  v->where_decl, Tnone, 0, Tnone);
1732                 }
1733         } }$
1734
1735         Variable -> IDENTIFIER ${ {
1736                 struct variable *v = var_ref(config2context(config), $1.txt);
1737                 $0 = new_pos(var, $1);
1738                 if (v == NULL) {
1739                         /* This might be a label - allocate a var just in case */
1740                         v = var_decl(config2context(config), $1.txt);
1741                         if (v) {
1742                                 v->val = val_prepare(Tlabel);
1743                                 v->val.label = &v->val;
1744                                 v->where_set = $0;
1745                         }
1746                 }
1747                 $0->var = v;
1748         } }$
1749
1750         $*type
1751         Type -> IDENTIFIER ${
1752                 $0 = find_type(config2context(config), $1.txt);
1753                 if (!$0) {
1754                         tok_err(config2context(config),
1755                                 "error: undefined type", &$1);
1756
1757                         $0 = Tnone;
1758                 }
1759         }$
1760
1761 ###### print exec cases
1762         case Xvar:
1763         {
1764                 struct var *v = cast(var, e);
1765                 if (v->var) {
1766                         struct binding *b = v->var->name;
1767                         printf("%.*s", b->name.len, b->name.txt);
1768                 }
1769                 break;
1770         }
1771
1772 ###### format cases
1773         case 'v':
1774                 if (loc->type == Xvar) {
1775                         struct var *v = cast(var, loc);
1776                         if (v->var) {
1777                                 struct binding *b = v->var->name;
1778                                 fprintf(stderr, "%.*s", b->name.len, b->name.txt);
1779                         } else
1780                                 fputs("???", stderr);
1781                 } else
1782                         fputs("NOTVAR", stderr);
1783                 break;
1784
1785 ###### propagate exec cases
1786
1787         case Xvar:
1788         {
1789                 struct var *var = cast(var, prog);
1790                 struct variable *v = var->var;
1791                 if (!v) {
1792                         type_err(c, "%d:BUG: no variable!!", prog, Tnone, 0, Tnone);
1793                         *ok = 0;
1794                         return Tnone;
1795                 }
1796                 if (v->merged)
1797                         v = v->merged;
1798                 if (v->constant && (rules & Rnoconstant)) {
1799                         type_err(c, "error: Cannot assign to a constant: %v",
1800                                  prog, NULL, 0, NULL);
1801                         type_err(c, "info: name was defined as a constant here",
1802                                  v->where_decl, NULL, 0, NULL);
1803                         *ok = 0;
1804                         return v->val.type;
1805                 }
1806                 if (v->val.type == NULL) {
1807                         if (type && *ok != 0) {
1808                                 v->val = val_prepare(type);
1809                                 v->where_set = prog;
1810                                 *ok = 2;
1811                         }
1812                         return type;
1813                 }
1814                 if (!type_compat(type, v->val.type, rules)) {
1815                         type_err(c, "error: expected %1%r but variable '%v' is %2", prog,
1816                                  type, rules, v->val.type);
1817                         type_err(c, "info: this is where '%v' was set to %1", v->where_set,
1818                                  v->val.type, rules, Tnone);
1819                         *ok = 0;
1820                 }
1821                 if (!type)
1822                         return v->val.type;
1823                 return type;
1824         }
1825
1826 ###### interp exec cases
1827         case Xvar:
1828         {
1829                 struct var *var = cast(var, e);
1830                 struct variable *v = var->var;
1831
1832                 if (v->merged)
1833                         v = v->merged;
1834                 lrv = &v->val;
1835                 break;
1836         }
1837
1838 ###### ast functions
1839
1840         static void free_var(struct var *v)
1841         {
1842                 free(v);
1843         }
1844
1845 ###### free exec cases
1846         case Xvar: free_var(cast(var, e)); break;
1847
1848 ### Expressions: Boolean
1849
1850 Our first user of the `binode` will be expressions, and particularly
1851 Boolean expressions.  As I haven't implemented precedence in the
1852 parser generator yet, we need different names for each precedence
1853 level used by expressions.  The outer most or lowest level precedence
1854 are Boolean `or` `and`, and `not` which form an `Expression` out of `BTerm`s
1855 and `BFact`s.
1856
1857 ###### Binode types
1858         And,
1859         Or,
1860         Not,
1861
1862 ###### Grammar
1863
1864         $*exec
1865         Expression -> Expression or BTerm ${ {
1866                         struct binode *b = new(binode);
1867                         b->op = Or;
1868                         b->left = $<1;
1869                         b->right = $<3;
1870                         $0 = b;
1871                 } }$
1872                 | BTerm ${ $0 = $<1; }$
1873
1874         BTerm -> BTerm and BFact ${ {
1875                         struct binode *b = new(binode);
1876                         b->op = And;
1877                         b->left = $<1;
1878                         b->right = $<3;
1879                         $0 = b;
1880                 } }$
1881                 | BFact ${ $0 = $<1; }$
1882
1883         BFact -> not BFact ${ {
1884                         struct binode *b = new(binode);
1885                         b->op = Not;
1886                         b->right = $<2;
1887                         $0 = b;
1888                 } }$
1889                 ## other BFact
1890
1891 ###### print binode cases
1892         case And:
1893                 print_exec(b->left, -1, 0);
1894                 printf(" and ");
1895                 print_exec(b->right, -1, 0);
1896                 break;
1897         case Or:
1898                 print_exec(b->left, -1, 0);
1899                 printf(" or ");
1900                 print_exec(b->right, -1, 0);
1901                 break;
1902         case Not:
1903                 printf("not ");
1904                 print_exec(b->right, -1, 0);
1905                 break;
1906
1907 ###### propagate binode cases
1908         case And:
1909         case Or:
1910         case Not:
1911                 /* both must be Tbool, result is Tbool */
1912                 propagate_types(b->left, c, ok, Tbool, 0);
1913                 propagate_types(b->right, c, ok, Tbool, 0);
1914                 if (type && type != Tbool) {
1915                         type_err(c, "error: %1 operation found where %2 expected", prog,
1916                                    Tbool, 0, type);
1917                         *ok = 0;
1918                 }
1919                 return Tbool;
1920
1921 ###### interp binode cases
1922         case And:
1923                 rv = interp_exec(b->left);
1924                 right = interp_exec(b->right);
1925                 rv.bool = rv.bool && right.bool;
1926                 break;
1927         case Or:
1928                 rv = interp_exec(b->left);
1929                 right = interp_exec(b->right);
1930                 rv.bool = rv.bool || right.bool;
1931                 break;
1932         case Not:
1933                 rv = interp_exec(b->right);
1934                 rv.bool = !rv.bool;
1935                 break;
1936
1937 ### Expressions: Comparison
1938
1939 Of slightly higher precedence that Boolean expressions are
1940 Comparisons.
1941 A comparison takes arguments of any type, but the two types must be
1942 the same.
1943
1944 To simplify the parsing we introduce an `eop` which can record an
1945 expression operator.
1946
1947 ###### ast
1948         struct eop {
1949                 enum Btype op;
1950         };
1951
1952 ###### ast functions
1953         static void free_eop(struct eop *e)
1954         {
1955                 if (e)
1956                         free(e);
1957         }
1958
1959 ###### Binode types
1960         Less,
1961         Gtr,
1962         LessEq,
1963         GtrEq,
1964         Eql,
1965         NEql,
1966
1967 ###### other BFact
1968         | Expr CMPop Expr ${ {
1969                         struct binode *b = new(binode);
1970                         b->op = $2.op;
1971                         b->left = $<1;
1972                         b->right = $<3;
1973                         $0 = b;
1974         } }$
1975         | Expr ${ $0 = $<1; }$
1976
1977 ###### Grammar
1978
1979         $eop
1980         CMPop ->   < ${ $0.op = Less; }$
1981                 |  > ${ $0.op = Gtr; }$
1982                 |  <= ${ $0.op = LessEq; }$
1983                 |  >= ${ $0.op = GtrEq; }$
1984                 |  == ${ $0.op = Eql; }$
1985                 |  != ${ $0.op = NEql; }$
1986
1987 ###### print binode cases
1988
1989         case Less:
1990         case LessEq:
1991         case Gtr:
1992         case GtrEq:
1993         case Eql:
1994         case NEql:
1995                 print_exec(b->left, -1, 0);
1996                 switch(b->op) {
1997                 case Less:   printf(" < "); break;
1998                 case LessEq: printf(" <= "); break;
1999                 case Gtr:    printf(" > "); break;
2000                 case GtrEq:  printf(" >= "); break;
2001                 case Eql:    printf(" == "); break;
2002                 case NEql:   printf(" != "); break;
2003                 default: abort();
2004                 }
2005                 print_exec(b->right, -1, 0);
2006                 break;
2007
2008 ###### propagate binode cases
2009         case Less:
2010         case LessEq:
2011         case Gtr:
2012         case GtrEq:
2013         case Eql:
2014         case NEql:
2015                 /* Both must match but not be labels, result is Tbool */
2016                 t = propagate_types(b->left, c, ok, NULL, Rnolabel);
2017                 if (t)
2018                         propagate_types(b->right, c, ok, t, 0);
2019                 else {
2020                         t = propagate_types(b->right, c, ok, NULL, Rnolabel);
2021                         if (t)
2022                                 t = propagate_types(b->left, c, ok, t, 0);
2023                 }
2024                 if (!type_compat(type, Tbool, 0)) {
2025                         type_err(c, "error: Comparison returns %1 but %2 expected", prog,
2026                                     Tbool, rules, type);
2027                         *ok = 0;
2028                 }
2029                 return Tbool;
2030
2031 ###### interp binode cases
2032         case Less:
2033         case LessEq:
2034         case Gtr:
2035         case GtrEq:
2036         case Eql:
2037         case NEql:
2038         {
2039                 int cmp;
2040                 left = interp_exec(b->left);
2041                 right = interp_exec(b->right);
2042                 cmp = value_cmp(left, right);
2043                 rv.type = Tbool;
2044                 switch (b->op) {
2045                 case Less:      rv.bool = cmp <  0; break;
2046                 case LessEq:    rv.bool = cmp <= 0; break;
2047                 case Gtr:       rv.bool = cmp >  0; break;
2048                 case GtrEq:     rv.bool = cmp >= 0; break;
2049                 case Eql:       rv.bool = cmp == 0; break;
2050                 case NEql:      rv.bool = cmp != 0; break;
2051                 default: rv.bool = 0; break;
2052                 }
2053                 break;
2054         }
2055
2056 ### Expressions: The rest
2057
2058 The remaining expressions with the highest precedence are arithmetic
2059 and string concatenation.  They are `Expr`, `Term`, and `Factor`.
2060 The `Factor` is where the `Value` and `Variable` that we already have
2061 are included.
2062
2063 `+` and `-` are both infix and prefix operations (where they are
2064 absolute value and negation).  These have different operator names.
2065
2066 We also have a 'Bracket' operator which records where parentheses were
2067 found.  This makes it easy to reproduce these when printing.  Once
2068 precedence is handled better I might be able to discard this.
2069
2070 ###### Binode types
2071         Plus, Minus,
2072         Times, Divide,
2073         Concat,
2074         Absolute, Negate,
2075         Bracket,
2076
2077 ###### Grammar
2078
2079         $*exec
2080         Expr -> Expr Eop Term ${ {
2081                         struct binode *b = new(binode);
2082                         b->op = $2.op;
2083                         b->left = $<1;
2084                         b->right = $<3;
2085                         $0 = b;
2086                 } }$
2087                 | Term ${ $0 = $<1; }$
2088
2089         Term -> Term Top Factor ${ {
2090                         struct binode *b = new(binode);
2091                         b->op = $2.op;
2092                         b->left = $<1;
2093                         b->right = $<3;
2094                         $0 = b;
2095                 } }$
2096                 | Factor ${ $0 = $<1; }$
2097
2098         Factor -> ( Expression ) ${ {
2099                         struct binode *b = new_pos(binode, $1);
2100                         b->op = Bracket;
2101                         b->right = $<2;
2102                         $0 = b;
2103                 } }$
2104                 | Uop Factor ${ {
2105                         struct binode *b = new(binode);
2106                         b->op = $1.op;
2107                         b->right = $<2;
2108                         $0 = b;
2109                 } }$
2110                 | Value ${ $0 = $<1; }$
2111                 | Variable ${ $0 = $<1; }$
2112
2113         $eop
2114         Eop ->    + ${ $0.op = Plus; }$
2115                 | - ${ $0.op = Minus; }$
2116
2117         Uop ->    + ${ $0.op = Absolute; }$
2118                 | - ${ $0.op = Negate; }$
2119
2120         Top ->    * ${ $0.op = Times; }$
2121                 | / ${ $0.op = Divide; }$
2122                 | ++ ${ $0.op = Concat; }$
2123
2124 ###### print binode cases
2125         case Plus:
2126         case Minus:
2127         case Times:
2128         case Divide:
2129         case Concat:
2130                 print_exec(b->left, indent, 0);
2131                 switch(b->op) {
2132                 case Plus:   printf(" + "); break;
2133                 case Minus:  printf(" - "); break;
2134                 case Times:  printf(" * "); break;
2135                 case Divide: printf(" / "); break;
2136                 case Concat: printf(" ++ "); break;
2137                 default: abort();
2138                 }
2139                 print_exec(b->right, indent, 0);
2140                 break;
2141         case Absolute:
2142                 printf("+");
2143                 print_exec(b->right, indent, 0);
2144                 break;
2145         case Negate:
2146                 printf("-");
2147                 print_exec(b->right, indent, 0);
2148                 break;
2149         case Bracket:
2150                 printf("(");
2151                 print_exec(b->right, indent, 0);
2152                 printf(")");
2153                 break;
2154
2155 ###### propagate binode cases
2156         case Plus:
2157         case Minus:
2158         case Times:
2159         case Divide:
2160                 /* both must be numbers, result is Tnum */
2161         case Absolute:
2162         case Negate:
2163                 /* as propagate_types ignores a NULL,
2164                  * unary ops fit here too */
2165                 propagate_types(b->left, c, ok, Tnum, 0);
2166                 propagate_types(b->right, c, ok, Tnum, 0);
2167                 if (!type_compat(type, Tnum, 0)) {
2168                         type_err(c, "error: Arithmetic returns %1 but %2 expected", prog,
2169                                    Tnum, rules, type);
2170                         *ok = 0;
2171                 }
2172                 return Tnum;
2173
2174         case Concat:
2175                 /* both must be Tstr, result is Tstr */
2176                 propagate_types(b->left, c, ok, Tstr, 0);
2177                 propagate_types(b->right, c, ok, Tstr, 0);
2178                 if (!type_compat(type, Tstr, 0)) {
2179                         type_err(c, "error: Concat returns %1 but %2 expected", prog,
2180                                    Tstr, rules, type);
2181                         *ok = 0;
2182                 }
2183                 return Tstr;
2184
2185         case Bracket:
2186                 return propagate_types(b->right, c, ok, type, 0);
2187
2188 ###### interp binode cases
2189
2190         case Plus:
2191                 rv = interp_exec(b->left);
2192                 right = interp_exec(b->right);
2193                 mpq_add(rv.num, rv.num, right.num);
2194                 break;
2195         case Minus:
2196                 rv = interp_exec(b->left);
2197                 right = interp_exec(b->right);
2198                 mpq_sub(rv.num, rv.num, right.num);
2199                 break;
2200         case Times:
2201                 rv = interp_exec(b->left);
2202                 right = interp_exec(b->right);
2203                 mpq_mul(rv.num, rv.num, right.num);
2204                 break;
2205         case Divide:
2206                 rv = interp_exec(b->left);
2207                 right = interp_exec(b->right);
2208                 mpq_div(rv.num, rv.num, right.num);
2209                 break;
2210         case Negate:
2211                 rv = interp_exec(b->right);
2212                 mpq_neg(rv.num, rv.num);
2213                 break;
2214         case Absolute:
2215                 rv = interp_exec(b->right);
2216                 mpq_abs(rv.num, rv.num);
2217                 break;
2218         case Bracket:
2219                 rv = interp_exec(b->right);
2220                 break;
2221         case Concat:
2222                 left = interp_exec(b->left);
2223                 right = interp_exec(b->right);
2224                 rv.type = Tstr;
2225                 rv.str = text_join(left.str, right.str);
2226                 break;
2227
2228
2229 ###### value functions
2230
2231         static struct text text_join(struct text a, struct text b)
2232         {
2233                 struct text rv;
2234                 rv.len = a.len + b.len;
2235                 rv.txt = malloc(rv.len);
2236                 memcpy(rv.txt, a.txt, a.len);
2237                 memcpy(rv.txt+a.len, b.txt, b.len);
2238                 return rv;
2239         }
2240
2241
2242 ### Blocks, Statements, and Statement lists.
2243
2244 Now that we have expressions out of the way we need to turn to
2245 statements.  There are simple statements and more complex statements.
2246 Simple statements do not contain newlines, complex statements do.
2247
2248 Statements often come in sequences and we have corresponding simple
2249 statement lists and complex statement lists.
2250 The former comprise only simple statements separated by semicolons.
2251 The later comprise complex statements and simple statement lists.  They are
2252 separated by newlines.  Thus the semicolon is only used to separate
2253 simple statements on the one line.  This may be overly restrictive,
2254 but I'm not sure I ever want a complex statement to share a line with
2255 anything else.
2256
2257 Note that a simple statement list can still use multiple lines if
2258 subsequent lines are indented, so
2259
2260 ###### Example: wrapped simple statement list
2261
2262         a = b; c = d;
2263            e = f; print g
2264
2265 is a single simple statement list.  This might allow room for
2266 confusion, so I'm not set on it yet.
2267
2268 A simple statement list needs no extra syntax.  A complex statement
2269 list has two syntactic forms.  It can be enclosed in braces (much like
2270 C blocks), or it can be introduced by a colon and continue until an
2271 unindented newline (much like Python blocks).  With this extra syntax
2272 it is referred to as a block.
2273
2274 Note that a block does not have to include any newlines if it only
2275 contains simple statements.  So both of:
2276
2277         if condition: a=b; d=f
2278
2279         if condition { a=b; print f }
2280
2281 are valid.
2282
2283 In either case the list is constructed from a `binode` list with
2284 `Block` as the operator.  When parsing the list it is most convenient
2285 to append to the end, so a list is a list and a statement.  When using
2286 the list it is more convenient to consider a list to be a statement
2287 and a list.  So we need a function to re-order a list.
2288 `reorder_bilist` serves this purpose.
2289
2290 The only stand-alone statement we introduce at this stage is `pass`
2291 which does nothing and is represented as a `NULL` pointer in a `Block`
2292 list.  Other stand-alone statements will follow once the infrastructure
2293 is in-place.
2294
2295 ###### Binode types
2296         Block,
2297
2298 ###### Grammar
2299
2300         $void
2301         OptNL -> Newlines
2302                 |
2303
2304         Newlines -> NEWLINE
2305                 | Newlines NEWLINE
2306
2307         $*binode
2308         Open -> {
2309                 | NEWLINE {
2310         Close -> }
2311                 | NEWLINE }
2312         Block -> Open Statementlist Close ${ $0 = $<2; }$
2313                 | Open Newlines Statementlist Close ${ $0 = $<3; }$
2314                 | Open SimpleStatements } ${ $0 = reorder_bilist($<2); }$
2315                 | Open Newlines SimpleStatements } ${ $0 = reorder_bilist($<3); }$
2316                 | : Statementlist ${ $0 = $<2; }$
2317                 | : SimpleStatements ${ $0 = reorder_bilist($<2); }$
2318
2319         Statementlist -> ComplexStatements ${ $0 = reorder_bilist($<1); }$
2320
2321         ComplexStatements -> ComplexStatements ComplexStatement ${
2322                 $0 = new(binode);
2323                 $0->op = Block;
2324                 $0->left = $<1;
2325                 $0->right = $<2;
2326                 }$
2327                 | ComplexStatements NEWLINE ${ $0 = $<1; }$
2328                 | ComplexStatement ${
2329                 $0 = new(binode);
2330                 $0->op = Block;
2331                 $0->left = NULL;
2332                 $0->right = $<1;
2333                 }$
2334
2335         $*exec
2336         ComplexStatement -> SimpleStatements NEWLINE ${
2337                         $0 = reorder_bilist($<1);
2338                         }$
2339                 ## ComplexStatement Grammar
2340
2341         $*binode
2342         SimpleStatements -> SimpleStatements ; SimpleStatement ${
2343                         $0 = new(binode);
2344                         $0->op = Block;
2345                         $0->left = $<1;
2346                         $0->right = $<3;
2347                         }$
2348                 | SimpleStatement ${
2349                         $0 = new(binode);
2350                         $0->op = Block;
2351                         $0->left = NULL;
2352                         $0->right = $<1;
2353                         }$
2354                 | SimpleStatements ; ${ $0 = $<1; }$
2355
2356         SimpleStatement -> pass ${ $0 = NULL; }$
2357                 ## SimpleStatement Grammar
2358
2359 ###### print binode cases
2360         case Block:
2361                 if (indent < 0) {
2362                         // simple statement
2363                         if (b->left == NULL)
2364                                 printf("pass");
2365                         else
2366                                 print_exec(b->left, indent, 0);
2367                         if (b->right) {
2368                                 printf("; ");
2369                                 print_exec(b->right, indent, 0);
2370                         }
2371                 } else {
2372                         // block, one per line
2373                         if (b->left == NULL)
2374                                 do_indent(indent, "pass\n");
2375                         else
2376                                 print_exec(b->left, indent, bracket);
2377                         if (b->right)
2378                                 print_exec(b->right, indent, bracket);
2379                 }
2380                 break;
2381
2382 ###### propagate binode cases
2383         case Block:
2384         {
2385                 /* If any statement returns something other than Tnone
2386                  * or Tbool then all such must return same type.
2387                  * As each statement may be Tnone or something else,
2388                  * we must always pass NULL (unknown) down, otherwise an incorrect
2389                  * error might occur.  We never return Tnone unless it is
2390                  * passed in.
2391                  */
2392                 struct binode *e;
2393
2394                 for (e = b; e; e = cast(binode, e->right)) {
2395                         t = propagate_types(e->left, c, ok, NULL, rules);
2396                         if ((rules & Rboolok) && t == Tbool)
2397                                 t = NULL;
2398                         if (t && t != Tnone && t != Tbool) {
2399                                 if (!type)
2400                                         type = t;
2401                                 else if (t != type) {
2402                                         type_err(c, "error: expected %1%r, found %2",
2403                                                  e->left, type, rules, t);
2404                                         *ok = 0;
2405                                 }
2406                         }
2407                 }
2408                 return type;
2409         }
2410
2411 ###### interp binode cases
2412         case Block:
2413                 while (rv.type == Tnone &&
2414                        b) {
2415                         if (b->left)
2416                                 rv = interp_exec(b->left);
2417                         b = cast(binode, b->right);
2418                 }
2419                 break;
2420
2421 ### The Print statement
2422
2423 `print` is a simple statement that takes a comma-separated list of
2424 expressions and prints the values separated by spaces and terminated
2425 by a newline.  No control of formatting is possible.
2426
2427 `print` faces the same list-ordering issue as blocks, and uses the
2428 same solution.
2429
2430 ###### Binode types
2431         Print,
2432
2433 ###### SimpleStatement Grammar
2434
2435         | print ExpressionList ${
2436                 $0 = reorder_bilist($<2);
2437         }$
2438         | print ExpressionList , ${
2439                 $0 = new(binode);
2440                 $0->op = Print;
2441                 $0->right = NULL;
2442                 $0->left = $<2;
2443                 $0 = reorder_bilist($0);
2444         }$
2445         | print ${
2446                 $0 = new(binode);
2447                 $0->op = Print;
2448                 $0->right = NULL;
2449         }$
2450
2451 ###### Grammar
2452
2453         $*binode
2454         ExpressionList -> ExpressionList , Expression ${
2455                 $0 = new(binode);
2456                 $0->op = Print;
2457                 $0->left = $<1;
2458                 $0->right = $<3;
2459                 }$
2460                 | Expression ${
2461                         $0 = new(binode);
2462                         $0->op = Print;
2463                         $0->left = NULL;
2464                         $0->right = $<1;
2465                 }$
2466
2467 ###### print binode cases
2468
2469         case Print:
2470                 do_indent(indent, "print");
2471                 while (b) {
2472                         if (b->left) {
2473                                 printf(" ");
2474                                 print_exec(b->left, -1, 0);
2475                                 if (b->right)
2476                                         printf(",");
2477                         }
2478                         b = cast(binode, b->right);
2479                 }
2480                 if (indent >= 0)
2481                         printf("\n");
2482                 break;
2483
2484 ###### propagate binode cases
2485
2486         case Print:
2487                 /* don't care but all must be consistent */
2488                 propagate_types(b->left, c, ok, NULL, Rnolabel);
2489                 propagate_types(b->right, c, ok, NULL, Rnolabel);
2490                 break;
2491
2492 ###### interp binode cases
2493
2494         case Print:
2495         {
2496                 char sep = 0;
2497                 int eol = 1;
2498                 for ( ; b; b = cast(binode, b->right))
2499                         if (b->left) {
2500                                 if (sep)
2501                                         putchar(sep);
2502                                 left = interp_exec(b->left);
2503                                 print_value(left);
2504                                 free_value(left);
2505                                 if (b->right)
2506                                         sep = ' ';
2507                         } else if (sep)
2508                                 eol = 0;
2509                 left.type = Tnone;
2510                 if (eol)
2511                         printf("\n");
2512                 break;
2513         }
2514
2515 ###### Assignment statement
2516
2517 An assignment will assign a value to a variable, providing it hasn't
2518 be declared as a constant.  The analysis phase ensures that the type
2519 will be correct so the interpreter just needs to perform the
2520 calculation.  There is a form of assignment which declares a new
2521 variable as well as assigning a value.  If a name is assigned before
2522 it is declared, and error will be raised as the name is created as
2523 `Tlabel` and it is illegal to assign to such names.
2524
2525 ###### Binode types
2526         Assign,
2527         Declare,
2528
2529 ###### SimpleStatement Grammar
2530         | Variable = Expression ${
2531                         $0 = new(binode);
2532                         $0->op = Assign;
2533                         $0->left = $<1;
2534                         $0->right = $<3;
2535                 }$
2536         | VariableDecl = Expression ${
2537                         $0 = new(binode);
2538                         $0->op = Declare;
2539                         $0->left = $<1;
2540                         $0->right =$<3;
2541                 }$
2542
2543         | VariableDecl ${
2544                         if ($1->var->where_set == NULL) {
2545                                 type_err(config2context(config), "Variable declared with no type or value: %v",
2546                                          $1, NULL, 0, NULL);
2547                         } else {
2548                                 $0 = new(binode);
2549                                 $0->op = Declare;
2550                                 $0->left = $<1;
2551                                 $0->right = NULL;
2552                         }
2553                 }$
2554
2555 ###### print binode cases
2556
2557         case Assign:
2558                 do_indent(indent, "");
2559                 print_exec(b->left, indent, 0);
2560                 printf(" = ");
2561                 print_exec(b->right, indent, 0);
2562                 if (indent >= 0)
2563                         printf("\n");
2564                 break;
2565
2566         case Declare:
2567                 {
2568                 struct variable *v = cast(var, b->left)->var;
2569                 do_indent(indent, "");
2570                 print_exec(b->left, indent, 0);
2571                 if (cast(var, b->left)->var->constant) {
2572                         if (v->where_decl == v->where_set) {
2573                                 printf("::");
2574                                 type_print(v->val.type, stdout);
2575                                 printf(" ");
2576                         } else
2577                                 printf(" ::");
2578                 } else {
2579                         if (v->where_decl == v->where_set) {
2580                                 printf(":");
2581                                 type_print(v->val.type, stdout);
2582                                 printf(" ");
2583                         } else
2584                                 printf(" :");
2585                 }
2586                 if (b->right) {
2587                         printf("= ");
2588                         print_exec(b->right, indent, 0);
2589                 }
2590                 if (indent >= 0)
2591                         printf("\n");
2592                 }
2593                 break;
2594
2595 ###### propagate binode cases
2596
2597         case Assign:
2598         case Declare:
2599                 /* Both must match and not be labels,
2600                  * Type must support 'dup',
2601                  * For Assign, left must not be constant.
2602                  * result is Tnone
2603                  */
2604                 t = propagate_types(b->left, c, ok, NULL,
2605                                     Rnolabel | (b->op == Assign ? Rnoconstant : 0));
2606                 if (!b->right)
2607                         return Tnone;
2608
2609                 if (t) {
2610                         if (propagate_types(b->right, c, ok, t, 0) != t)
2611                                 if (b->left->type == Xvar)
2612                                         type_err(c, "info: variable '%v' was set as %1 here.",
2613                                                  cast(var, b->left)->var->where_set, t, rules, Tnone);
2614                 } else {
2615                         t = propagate_types(b->right, c, ok, NULL, Rnolabel);
2616                         if (t)
2617                                 propagate_types(b->left, c, ok, t,
2618                                                 (b->op == Assign ? Rnoconstant : 0));
2619                 }
2620                 if (t && t->dup == NULL) {
2621                         type_err(c, "error: cannot assign value of type %1", b, t, 0, NULL);
2622                         *ok = 0;
2623                 }
2624                 return Tnone;
2625
2626                 break;
2627
2628 ###### interp binode cases
2629
2630         case Assign:
2631                 lleft = linterp_exec(b->left);
2632                 right = interp_exec(b->right);
2633                 if (lleft) {
2634                         free_value(*lleft);
2635                         *lleft = right;
2636                 } else
2637                         free_value(right);
2638                 right.type = NULL;
2639                 break;
2640
2641         case Declare:
2642         {
2643                 struct variable *v = cast(var, b->left)->var;
2644                 if (v->merged)
2645                         v = v->merged;
2646                 if (b->right)
2647                         right = interp_exec(b->right);
2648                 else
2649                         right = val_init(v->val.type);
2650                 free_value(v->val);
2651                 v->val = right;
2652                 right.type = NULL;
2653                 break;
2654         }
2655
2656 ### The `use` statement
2657
2658 The `use` statement is the last "simple" statement.  It is needed when
2659 the condition in a conditional statement is a block.  `use` works much
2660 like `return` in C, but only completes the `condition`, not the whole
2661 function.
2662
2663 ###### Binode types
2664         Use,
2665
2666 ###### SimpleStatement Grammar
2667         | use Expression ${
2668                 $0 = new_pos(binode, $1);
2669                 $0->op = Use;
2670                 $0->right = $<2;
2671         }$
2672
2673 ###### print binode cases
2674
2675         case Use:
2676                 do_indent(indent, "use ");
2677                 print_exec(b->right, -1, 0);
2678                 if (indent >= 0)
2679                         printf("\n");
2680                 break;
2681
2682 ###### propagate binode cases
2683
2684         case Use:
2685                 /* result matches value */
2686                 return propagate_types(b->right, c, ok, type, 0);
2687
2688 ###### interp binode cases
2689
2690         case Use:
2691                 rv = interp_exec(b->right);
2692                 break;
2693
2694 ### The Conditional Statement
2695
2696 This is the biggy and currently the only complex statement.  This
2697 subsumes `if`, `while`, `do/while`, `switch`, and some parts of `for`.
2698 It is comprised of a number of parts, all of which are optional though
2699 set combinations apply.  Each part is (usually) a key word (`then` is
2700 sometimes optional) followed by either an expression or a code block,
2701 except the `casepart` which is a "key word and an expression" followed
2702 by a code block.  The code-block option is valid for all parts and,
2703 where an expression is also allowed, the code block can use the `use`
2704 statement to report a value.  If the code block does not report a value
2705 the effect is similar to reporting `True`.
2706
2707 The `else` and `case` parts, as well as `then` when combined with
2708 `if`, can contain a `use` statement which will apply to some
2709 containing conditional statement. `for` parts, `do` parts and `then`
2710 parts used with `for` can never contain a `use`, except in some
2711 subordinate conditional statement.
2712
2713 If there is a `forpart`, it is executed first, only once.
2714 If there is a `dopart`, then it is executed repeatedly providing
2715 always that the `condpart` or `cond`, if present, does not return a non-True
2716 value.  `condpart` can fail to return any value if it simply executes
2717 to completion.  This is treated the same as returning `True`.
2718
2719 If there is a `thenpart` it will be executed whenever the `condpart`
2720 or `cond` returns True (or does not return any value), but this will happen
2721 *after* `dopart` (when present).
2722
2723 If `elsepart` is present it will be executed at most once when the
2724 condition returns `False` or some value that isn't `True` and isn't
2725 matched by any `casepart`.  If there are any `casepart`s, they will be
2726 executed when the condition returns a matching value.
2727
2728 The particular sorts of values allowed in case parts has not yet been
2729 determined in the language design, so nothing is prohibited.
2730
2731 The various blocks in this complex statement potentially provide scope
2732 for variables as described earlier.  Each such block must include the
2733 "OpenScope" nonterminal before parsing the block, and must call
2734 `var_block_close()` when closing the block.
2735
2736 The code following "`if`", "`switch`" and "`for`" does not get its own
2737 scope, but is in a scope covering the whole statement, so names
2738 declared there cannot be redeclared elsewhere.  Similarly the
2739 condition following "`while`" is in a scope the covers the body
2740 ("`do`" part) of the loop, and which does not allow conditional scope
2741 extension.  Code following "`then`" (both looping and non-looping),
2742 "`else`" and "`case`" each get their own local scope.
2743
2744 The type requirements on the code block in a `whilepart` are quite
2745 unusal.  It is allowed to return a value of some identifiable type, in
2746 which case the loop aborts and an appropriate `casepart` is run, or it
2747 can return a Boolean, in which case the loop either continues to the
2748 `dopart` (on `True`) or aborts and runs the `elsepart` (on `False`).
2749 This is different both from the `ifpart` code block which is expected to
2750 return a Boolean, or the `switchpart` code block which is expected to
2751 return the same type as the casepart values.  The correct analysis of
2752 the type of the `whilepart` code block is the reason for the
2753 `Rboolok` flag which is passed to `propagate_types()`.
2754
2755 The `cond_statement` cannot fit into a `binode` so a new `exec` is
2756 defined.
2757
2758 ###### exec type
2759         Xcond_statement,
2760
2761 ###### ast
2762         struct casepart {
2763                 struct exec *value;
2764                 struct exec *action;
2765                 struct casepart *next;
2766         };
2767         struct cond_statement {
2768                 struct exec;
2769                 struct exec *forpart, *condpart, *dopart, *thenpart, *elsepart;
2770                 struct casepart *casepart;
2771         };
2772
2773 ###### ast functions
2774
2775         static void free_casepart(struct casepart *cp)
2776         {
2777                 while (cp) {
2778                         struct casepart *t;
2779                         free_exec(cp->value);
2780                         free_exec(cp->action);
2781                         t = cp->next;
2782                         free(cp);
2783                         cp = t;
2784                 }
2785         }
2786
2787         static void free_cond_statement(struct cond_statement *s)
2788         {
2789                 if (!s)
2790                         return;
2791                 free_exec(s->forpart);
2792                 free_exec(s->condpart);
2793                 free_exec(s->dopart);
2794                 free_exec(s->thenpart);
2795                 free_exec(s->elsepart);
2796                 free_casepart(s->casepart);
2797                 free(s);
2798         }
2799
2800 ###### free exec cases
2801         case Xcond_statement: free_cond_statement(cast(cond_statement, e)); break;
2802
2803 ###### ComplexStatement Grammar
2804         | CondStatement ${ $0 = $<1; }$
2805
2806 ###### Grammar
2807
2808         $*cond_statement
2809         // both ForThen and Whilepart open scopes, and CondSuffix only
2810         // closes one - so in the first branch here we have another to close.
2811         CondStatement -> ForThen WhilePart CondSuffix ${
2812                         $0 = $<3;
2813                         $0->forpart = $1.forpart; $1.forpart = NULL;
2814                         $0->thenpart = $1.thenpart; $1.thenpart = NULL;
2815                         $0->condpart = $2.condpart; $2.condpart = NULL;
2816                         $0->dopart = $2.dopart; $2.dopart = NULL;
2817                         var_block_close(config2context(config), CloseSequential);
2818                         }$
2819                 | WhilePart CondSuffix ${
2820                         $0 = $<2;
2821                         $0->condpart = $1.condpart; $1.condpart = NULL;
2822                         $0->dopart = $1.dopart; $1.dopart = NULL;
2823                         }$
2824                 | SwitchPart CondSuffix ${
2825                         $0 = $<2;
2826                         $0->condpart = $<1;
2827                         }$
2828                 | IfPart IfSuffix ${
2829                         $0 = $<2;
2830                         $0->condpart = $1.condpart; $1.condpart = NULL;
2831                         $0->thenpart = $1.thenpart; $1.thenpart = NULL;
2832                         // This is where we close an "if" statement
2833                         var_block_close(config2context(config), CloseSequential);
2834                         }$
2835
2836         CondSuffix -> IfSuffix ${
2837                         $0 = $<1;
2838                         // This is where we close scope of the whole
2839                         // "for" or "while" statement
2840                         var_block_close(config2context(config), CloseSequential);
2841                 }$
2842                 | CasePart CondSuffix ${
2843                         $0 = $<2;
2844                         $1->next = $0->casepart;
2845                         $0->casepart = $<1;
2846                 }$
2847
2848         $*casepart
2849         CasePart -> Newlines case Expression OpenScope Block ${
2850                         $0 = calloc(1,sizeof(struct casepart));
2851                         $0->value = $<3;
2852                         $0->action = $<5;
2853                         var_block_close(config2context(config), CloseParallel);
2854                 }$
2855                 | case Expression OpenScope Block ${
2856                         $0 = calloc(1,sizeof(struct casepart));
2857                         $0->value = $<2;
2858                         $0->action = $<4;
2859                         var_block_close(config2context(config), CloseParallel);
2860                 }$
2861
2862         $*cond_statement
2863         IfSuffix -> Newlines ${ $0 = new(cond_statement); }$
2864                 | Newlines else OpenScope Block ${
2865                         $0 = new(cond_statement);
2866                         $0->elsepart = $<4;
2867                         var_block_close(config2context(config), CloseElse);
2868                 }$
2869                 | else OpenScope Block ${
2870                         $0 = new(cond_statement);
2871                         $0->elsepart = $<3;
2872                         var_block_close(config2context(config), CloseElse);
2873                 }$
2874                 | Newlines else OpenScope CondStatement ${
2875                         $0 = new(cond_statement);
2876                         $0->elsepart = $<4;
2877                         var_block_close(config2context(config), CloseElse);
2878                 }$
2879                 | else OpenScope CondStatement ${
2880                         $0 = new(cond_statement);
2881                         $0->elsepart = $<3;
2882                         var_block_close(config2context(config), CloseElse);
2883                 }$
2884
2885
2886         $*exec
2887         // These scopes are closed in CondSuffix
2888         ForPart -> for OpenScope SimpleStatements ${
2889                         $0 = reorder_bilist($<3);
2890                 }$
2891                 |  for OpenScope Block ${
2892                         $0 = $<3;
2893                 }$
2894
2895         ThenPart -> then OpenScope SimpleStatements ${
2896                         $0 = reorder_bilist($<3);
2897                         var_block_close(config2context(config), CloseSequential);
2898                 }$
2899                 |  then OpenScope Block ${
2900                         $0 = $<3;
2901                         var_block_close(config2context(config), CloseSequential);
2902                 }$
2903
2904         ThenPartNL -> ThenPart OptNL ${
2905                         $0 = $<1;
2906                 }$
2907
2908         // This scope is closed in CondSuffix
2909         WhileHead -> while OpenScope Block ${
2910                 $0 = $<3;
2911                 }$
2912
2913         $cond_statement
2914         ForThen -> ForPart OptNL ThenPartNL ${
2915                         $0.forpart = $<1;
2916                         $0.thenpart = $<3;
2917                 }$
2918                 | ForPart OptNL ${
2919                         $0.forpart = $<1;
2920                 }$
2921
2922         // This scope is closed in CondSuffix
2923         WhilePart -> while OpenScope Expression Block ${
2924                         $0.type = Xcond_statement;
2925                         $0.condpart = $<3;
2926                         $0.dopart = $<4;
2927                 }$
2928                 | WhileHead OptNL do Block ${
2929                         $0.type = Xcond_statement;
2930                         $0.condpart = $<1;
2931                         $0.dopart = $<4;
2932                 }$
2933
2934         IfPart -> if OpenScope Expression OpenScope Block ${
2935                         $0.type = Xcond_statement;
2936                         $0.condpart = $<3;
2937                         $0.thenpart = $<5;
2938                         var_block_close(config2context(config), CloseParallel);
2939                 }$
2940                 | if OpenScope Block OptNL then OpenScope Block ${
2941                         $0.type = Xcond_statement;
2942                         $0.condpart = $<3;
2943                         $0.thenpart = $<7;
2944                         var_block_close(config2context(config), CloseParallel);
2945                 }$
2946
2947         $*exec
2948         // This scope is closed in CondSuffix
2949         SwitchPart -> switch OpenScope Expression ${
2950                         $0 = $<3;
2951                 }$
2952                 | switch OpenScope Block ${
2953                         $0 = $<3;
2954                 }$
2955
2956 ###### print exec cases
2957
2958         case Xcond_statement:
2959         {
2960                 struct cond_statement *cs = cast(cond_statement, e);
2961                 struct casepart *cp;
2962                 if (cs->forpart) {
2963                         do_indent(indent, "for");
2964                         if (bracket) printf(" {\n"); else printf(":\n");
2965                         print_exec(cs->forpart, indent+1, bracket);
2966                         if (cs->thenpart) {
2967                                 if (bracket)
2968                                         do_indent(indent, "} then {\n");
2969                                 else
2970                                         do_indent(indent, "then:\n");
2971                                 print_exec(cs->thenpart, indent+1, bracket);
2972                         }
2973                         if (bracket) do_indent(indent, "}\n");
2974                 }
2975                 if (cs->dopart) {
2976                         // a loop
2977                         if (cs->condpart && cs->condpart->type == Xbinode &&
2978                             cast(binode, cs->condpart)->op == Block) {
2979                                 if (bracket)
2980                                         do_indent(indent, "while {\n");
2981                                 else
2982                                         do_indent(indent, "while:\n");
2983                                 print_exec(cs->condpart, indent+1, bracket);
2984                                 if (bracket)
2985                                         do_indent(indent, "} do {\n");
2986                                 else
2987                                         do_indent(indent, "do:\n");
2988                                 print_exec(cs->dopart, indent+1, bracket);
2989                                 if (bracket)
2990                                         do_indent(indent, "}\n");
2991                         } else {
2992                                 do_indent(indent, "while ");
2993                                 print_exec(cs->condpart, 0, bracket);
2994                                 if (bracket)
2995                                         printf(" {\n");
2996                                 else
2997                                         printf(":\n");
2998                                 print_exec(cs->dopart, indent+1, bracket);
2999                                 if (bracket)
3000                                         do_indent(indent, "}\n");
3001                         }
3002                 } else {
3003                         // a condition
3004                         if (cs->casepart)
3005                                 do_indent(indent, "switch");
3006                         else
3007                                 do_indent(indent, "if");
3008                         if (cs->condpart && cs->condpart->type == Xbinode &&
3009                             cast(binode, cs->condpart)->op == Block) {
3010                                 if (bracket)
3011                                         printf(" {\n");
3012                                 else
3013                                         printf(":\n");
3014                                 print_exec(cs->condpart, indent+1, bracket);
3015                                 if (bracket)
3016                                         do_indent(indent, "}\n");
3017                                 if (cs->thenpart) {
3018                                         do_indent(indent, "then:\n");
3019                                         print_exec(cs->thenpart, indent+1, bracket);
3020                                 }
3021                         } else {
3022                                 printf(" ");
3023                                 print_exec(cs->condpart, 0, bracket);
3024                                 if (cs->thenpart) {
3025                                         if (bracket)
3026                                                 printf(" {\n");
3027                                         else
3028                                                 printf(":\n");
3029                                         print_exec(cs->thenpart, indent+1, bracket);
3030                                         if (bracket)
3031                                                 do_indent(indent, "}\n");
3032                                 } else
3033                                         printf("\n");
3034                         }
3035                 }
3036                 for (cp = cs->casepart; cp; cp = cp->next) {
3037                         do_indent(indent, "case ");
3038                         print_exec(cp->value, -1, 0);
3039                         if (bracket)
3040                                 printf(" {\n");
3041                         else
3042                                 printf(":\n");
3043                         print_exec(cp->action, indent+1, bracket);
3044                         if (bracket)
3045                                 do_indent(indent, "}\n");
3046                 }
3047                 if (cs->elsepart) {
3048                         do_indent(indent, "else");
3049                         if (bracket)
3050                                 printf(" {\n");
3051                         else
3052                                 printf(":\n");
3053                         print_exec(cs->elsepart, indent+1, bracket);
3054                         if (bracket)
3055                                 do_indent(indent, "}\n");
3056                 }
3057                 break;
3058         }
3059
3060 ###### propagate exec cases
3061         case Xcond_statement:
3062         {
3063                 // forpart and dopart must return Tnone
3064                 // thenpart must return Tnone if there is a dopart,
3065                 // otherwise it is like elsepart.
3066                 // condpart must:
3067                 //    be bool if there is no casepart
3068                 //    match casepart->values if there is a switchpart
3069                 //    either be bool or match casepart->value if there
3070                 //             is a whilepart
3071                 // elsepart and casepart->action must match the return type
3072                 //   expected of this statement.
3073                 struct cond_statement *cs = cast(cond_statement, prog);
3074                 struct casepart *cp;
3075
3076                 t = propagate_types(cs->forpart, c, ok, Tnone, 0);
3077                 if (!type_compat(Tnone, t, 0))
3078                         *ok = 0;
3079                 t = propagate_types(cs->dopart, c, ok, Tnone, 0);
3080                 if (!type_compat(Tnone, t, 0))
3081                         *ok = 0;
3082                 if (cs->dopart) {
3083                         t = propagate_types(cs->thenpart, c, ok, Tnone, 0);
3084                         if (!type_compat(Tnone, t, 0))
3085                                 *ok = 0;
3086                 }
3087                 if (cs->casepart == NULL)
3088                         propagate_types(cs->condpart, c, ok, Tbool, 0);
3089                 else {
3090                         /* Condpart must match case values, with bool permitted */
3091                         t = NULL;
3092                         for (cp = cs->casepart;
3093                              cp && !t; cp = cp->next)
3094                                 t = propagate_types(cp->value, c, ok, NULL, 0);
3095                         if (!t && cs->condpart)
3096                                 t = propagate_types(cs->condpart, c, ok, NULL, Rboolok);
3097                         // Now we have a type (I hope) push it down
3098                         if (t) {
3099                                 for (cp = cs->casepart; cp; cp = cp->next)
3100                                         propagate_types(cp->value, c, ok, t, 0);
3101                                 propagate_types(cs->condpart, c, ok, t, Rboolok);
3102                         }
3103                 }
3104                 // (if)then, else, and case parts must return expected type.
3105                 if (!cs->dopart && !type)
3106                         type = propagate_types(cs->thenpart, c, ok, NULL, rules);
3107                 if (!type)
3108                         type = propagate_types(cs->elsepart, c, ok, NULL, rules);
3109                 for (cp = cs->casepart;
3110                      cp && !type;
3111                      cp = cp->next)
3112                         type = propagate_types(cp->action, c, ok, NULL, rules);
3113                 if (type) {
3114                         if (!cs->dopart)
3115                                 propagate_types(cs->thenpart, c, ok, type, rules);
3116                         propagate_types(cs->elsepart, c, ok, type, rules);
3117                         for (cp = cs->casepart; cp ; cp = cp->next)
3118                                 propagate_types(cp->action, c, ok, type, rules);
3119                         return type;
3120                 } else
3121                         return NULL;
3122         }
3123
3124 ###### interp exec cases
3125         case Xcond_statement:
3126         {
3127                 struct value v, cnd;
3128                 struct casepart *cp;
3129                 struct cond_statement *c = cast(cond_statement, e);
3130
3131                 if (c->forpart)
3132                         interp_exec(c->forpart);
3133                 do {
3134                         if (c->condpart)
3135                                 cnd = interp_exec(c->condpart);
3136                         else
3137                                 cnd.type = Tnone;
3138                         if (!(cnd.type == Tnone ||
3139                               (cnd.type == Tbool && cnd.bool != 0)))
3140                                 break;
3141                         // cnd is Tnone or Tbool, doesn't need to be freed
3142                         if (c->dopart)
3143                                 interp_exec(c->dopart);
3144
3145                         if (c->thenpart) {
3146                                 rv = interp_exec(c->thenpart);
3147                                 if (rv.type != Tnone || !c->dopart)
3148                                         goto Xcond_done;
3149                                 free_value(rv);
3150                         }
3151                 } while (c->dopart);
3152
3153                 for (cp = c->casepart; cp; cp = cp->next) {
3154                         v = interp_exec(cp->value);
3155                         if (value_cmp(v, cnd) == 0) {
3156                                 free_value(v);
3157                                 free_value(cnd);
3158                                 rv = interp_exec(cp->action);
3159                                 goto Xcond_done;
3160                         }
3161                         free_value(v);
3162                 }
3163                 free_value(cnd);
3164                 if (c->elsepart)
3165                         rv = interp_exec(c->elsepart);
3166                 else
3167                         rv.type = Tnone;
3168         Xcond_done:
3169                 break;
3170         }
3171
3172 ### Finally the whole program.
3173
3174 Somewhat reminiscent of Pascal a (current) Ocean program starts with
3175 the keyword "program" and a list of variable names which are assigned
3176 values from command line arguments.  Following this is a `block` which
3177 is the code to execute.
3178
3179 As this is the top level, several things are handled a bit
3180 differently.
3181 The whole program is not interpreted by `interp_exec` as that isn't
3182 passed the argument list which the program requires.  Similarly type
3183 analysis is a bit more interesting at this level.
3184
3185 ###### Binode types
3186         Program,
3187
3188 ###### Parser: grammar
3189
3190         $*binode
3191         Program -> program OpenScope Varlist Block OptNL ${
3192                 $0 = new(binode);
3193                 $0->op = Program;
3194                 $0->left = reorder_bilist($<3);
3195                 $0->right = $<4;
3196                 var_block_close(config2context(config), CloseSequential);
3197                 if (config2context(config)->scope_stack) abort();
3198                 }$
3199                 | ERROR ${
3200                         tok_err(config2context(config),
3201                                 "error: unhandled parse error", &$1);
3202                 }$
3203
3204         Varlist -> Varlist ArgDecl ${
3205                         $0 = new(binode);
3206                         $0->op = Program;
3207                         $0->left = $<1;
3208                         $0->right = $<2;
3209                 }$
3210                 | ${ $0 = NULL; }$
3211
3212         $*var
3213         ArgDecl -> IDENTIFIER ${ {
3214                 struct variable *v = var_decl(config2context(config), $1.txt);
3215                 $0 = new(var);
3216                 $0->var = v;
3217         } }$
3218
3219         ## Grammar
3220
3221 ###### print binode cases
3222         case Program:
3223                 do_indent(indent, "program");
3224                 for (b2 = cast(binode, b->left); b2; b2 = cast(binode, b2->right)) {
3225                         printf(" ");
3226                         print_exec(b2->left, 0, 0);
3227                 }
3228                 if (bracket)
3229                         printf(" {\n");
3230                 else
3231                         printf(":\n");
3232                 print_exec(b->right, indent+1, bracket);
3233                 if (bracket)
3234                         do_indent(indent, "}\n");
3235                 break;
3236
3237 ###### propagate binode cases
3238         case Program: abort();
3239
3240 ###### core functions
3241
3242         static int analyse_prog(struct exec *prog, struct parse_context *c)
3243         {
3244                 struct binode *b = cast(binode, prog);
3245                 int ok = 1;
3246
3247                 if (!b)
3248                         return 0;
3249                 do {
3250                         ok = 1;
3251                         propagate_types(b->right, c, &ok, Tnone, 0);
3252                 } while (ok == 2);
3253                 if (!ok)
3254                         return 0;
3255
3256                 for (b = cast(binode, b->left); b; b = cast(binode, b->right)) {
3257                         struct var *v = cast(var, b->left);
3258                         if (!v->var->val.type) {
3259                                 v->var->where_set = b;
3260                                 v->var->val = val_prepare(Tstr);
3261                         }
3262                 }
3263                 b = cast(binode, prog);
3264                 do {
3265                         ok = 1;
3266                         propagate_types(b->right, c, &ok, Tnone, 0);
3267                 } while (ok == 2);
3268                 if (!ok)
3269                         return 0;
3270
3271                 /* Make sure everything is still consistent */
3272                 propagate_types(b->right, c, &ok, Tnone, 0);
3273                 return !!ok;
3274         }
3275
3276         static void interp_prog(struct exec *prog, char **argv)
3277         {
3278                 struct binode *p = cast(binode, prog);
3279                 struct binode *al;
3280                 struct value v;
3281
3282                 if (!prog)
3283                         return;
3284                 al = cast(binode, p->left);
3285                 while (al) {
3286                         struct var *v = cast(var, al->left);
3287                         struct value *vl = &v->var->val;
3288
3289                         if (argv[0] == NULL) {
3290                                 printf("Not enough args\n");
3291                                 exit(1);
3292                         }
3293                         al = cast(binode, al->right);
3294                         free_value(*vl);
3295                         *vl = parse_value(vl->type, argv[0]);
3296                         if (vl->type == NULL)
3297                                 exit(1);
3298                         argv++;
3299                 }
3300                 v = interp_exec(p->right);
3301                 free_value(v);
3302         }
3303
3304 ###### interp binode cases
3305         case Program: abort();
3306
3307 ## And now to test it out.
3308
3309 Having a language requires having a "hello world" program. I'll
3310 provide a little more than that: a program that prints "Hello world"
3311 finds the GCD of two numbers, prints the first few elements of
3312 Fibonacci, and performs a binary search for a number.
3313
3314 ###### File: oceani.mk
3315         tests :: sayhello
3316         sayhello : oceani
3317                 @echo "===== TEST ====="
3318                 ./oceani --section "test: hello" oceani.mdc 55 33
3319
3320 ###### test: hello
3321
3322         program A B:
3323                 print "Hello World, what lovely oceans you have!"
3324                 /* When a variable is defined in both branches of an 'if',
3325                  * and used afterwards, the variables are merged.
3326                  */
3327                 if A > B:
3328                         bigger := "yes"
3329                 else:
3330                         bigger := "no"
3331                 print "Is", A, "bigger than", B,"? ", bigger
3332                 /* If a variable is not used after the 'if', no
3333                  * merge happens, so types can be different
3334                  */
3335                 if A > B * 2:
3336                         double:string = "yes"
3337                         print A, "is more than twice", B, "?", double
3338                 else:
3339                         double := B*2
3340                         print "double", B, "is", double
3341
3342                 a : number
3343                 a = A;
3344                 b:number = B
3345                 if a > 0 and b > 0:
3346                         while a != b:
3347                                 if a < b:
3348                                         b = b - a
3349                                 else:
3350                                         a = a - b
3351                         print "GCD of", A, "and", B,"is", a
3352                 else if a <= 0:
3353                         print a, "is not positive, cannot calculate GCD"
3354                 else:
3355                         print b, "is not positive, cannot calculate GCD"
3356
3357                 for:
3358                         togo := 10
3359                         f1 := 1; f2 := 1
3360                         print "Fibonacci:", f1,f2,
3361                 then togo = togo - 1
3362                 while togo > 0:
3363                         f3 := f1 + f2
3364                         print "", f3,
3365                         f1 = f2
3366                         f2 = f3
3367                 print ""
3368
3369                 /* Binary search... */
3370                 for:
3371                         lo:= 0; hi := 100
3372                         target := 77
3373                 while:
3374                         mid := (lo + hi) / 2
3375                         if mid == target:
3376                                 use Found
3377                         if mid < target:
3378                                 lo = mid
3379                         else:
3380                                 hi = mid
3381                         if hi - lo < 1:
3382                                 use GiveUp
3383                         use True
3384                 do: pass
3385                 case Found:
3386                         print "Yay, I found", target
3387                 case GiveUp:
3388                         print "Closest I found was", mid