ocean-lang.org Git - ocean/blob - csrc/oceani.mdc

   1 # Ocean Interpreter - Jamison Creek version
   2
   3 Ocean is intended to be a compiled language, so this interpreter is
   4 not targeted at being the final product.  It is, rather, an intermediate
   5 stage and fills that role in two distinct ways.
   6
   7 Firstly, it exists as a platform to experiment with the early language
   8 design.  An interpreter is easy to write and easy to get working, so
   9 the barrier for entry is lower if I aim to start with an interpreter.
  10
  11 Secondly, the plan for the Ocean compiler is to write it in the
  12 [Ocean language](http://ocean-lang.org).  To achieve this we naturally
  13 need some sort of boot-strap process and this interpreter - written in
  14 portable C - will fill that role.  It will be used to bootstrap the
  15 Ocean compiler.
  16
  17 Two features that are not needed to fill either of these roles are
  18 performance and completeness.  The interpreter only needs to be fast
  19 enough to run small test programs and occasionally to run the compiler
  20 on itself.  It only needs to be complete enough to test aspects of the
  21 design which are developed before the compiler is working, and to run
  22 the compiler on itself.  Any features not used by the compiler when
  23 compiling itself are superfluous.  They may be included anyway, but
  24 they may not.
  25
  26 Nonetheless, the interpreter should end up being reasonably complete,
  27 and any performance bottlenecks which appear and are easily fixed, will
  28 be.
  29
  30 ## Current version
  31
  32 This third version of the interpreter exists to test out some initial
  33 ideas relating to types.  Particularly it adds arrays (indexed from
  34 zero) and simple structures.  Basic control flow and variable scoping
  35 are already fairly well established, as are basic numerical and
  36 boolean operators.
  37
  38 Some operators that have only recently been added, and so have not
  39 generated all that much experience yet are "and then" and "or else" as
  40 short-circuit Boolean operators, and the "if ... else" trinary
  41 operator which can select between two expressions based on a third
  42 (which appears syntactically in the middle).
  43
  44 The "func" clause currently only allows a "main" function to be
  45 declared.  That will be extended when proper function support is added.
  46
  47 An element that is present purely to make a usable language, and
  48 without any expectation that they will remain, is the "print" statement
  49 which performs simple output.
  50
  51 The current scalar types are "number", "Boolean", and "string".
  52 Boolean will likely stay in its current form, the other two might, but
  53 could just as easily be changed.
  54
  55 ## Naming
  56
  57 Versions of the interpreter which obviously do not support a complete
  58 language will be named after creeks and streams.  This one is Jamison
  59 Creek.
  60
  61 Once we have something reasonably resembling a complete language, the
  62 names of rivers will be used.
  63 Early versions of the compiler will be named after seas.  Major
  64 releases of the compiler will be named after oceans.  Hopefully I will
  65 be finished once I get to the Pacific Ocean release.
  66
  67 ## Outline
  68
  69 As well as parsing and executing a program, the interpreter can print
  70 out the program from the parsed internal structure.  This is useful
  71 for validating the parsing.
  72 So the main requirements of the interpreter are:
  73
  74 - Parse the program, possibly with tracing,
  75 - Analyse the parsed program to ensure consistency,
  76 - Print the program,
  77 - Execute the "main" function in the program, if no parsing or
  78   consistency errors were found.
  79
  80 This is all performed by a single C program extracted with
  81 `parsergen`.
  82
  83 There will be two formats for printing the program: a default and one
  84 that uses bracketing.  So a `--bracket` command line option is needed
  85 for that.  Normally the first code section found is used, however an
  86 alternate section can be requested so that a file (such as this one)
  87 can contain multiple programs.  This is effected with the `--section`
  88 option.
  89
  90 This code must be compiled with `-fplan9-extensions` so that anonymous
  91 structures can be used.
  92
  93 ###### File: oceani.mk
  94
  95         myCFLAGS := -Wall -g -fplan9-extensions
  96         CFLAGS := $(filter-out $(myCFLAGS),$(CFLAGS)) $(myCFLAGS)
  97         myLDLIBS:= libparser.o libscanner.o libmdcode.o -licuuc
  98         LDLIBS := $(filter-out $(myLDLIBS),$(LDLIBS)) $(myLDLIBS)
  99         ## libs
 100         all :: $(LDLIBS) oceani
 101         oceani.c oceani.h : oceani.mdc parsergen
 102                 ./parsergen -o oceani --LALR --tag Parser oceani.mdc
 103         oceani.mk: oceani.mdc md2c
 104                 ./md2c oceani.mdc
 105
 106         oceani: oceani.o $(LDLIBS)
 107                 $(CC) $(CFLAGS) -o oceani oceani.o $(LDLIBS)
 108
 109 ###### Parser: header
 110         ## macros
 111         struct parse_context;
 112         ## ast
 113         struct parse_context {
 114                 struct token_config config;
 115                 char *file_name;
 116                 int parse_error;
 117                 ## parse context
 118         };
 119
 120 ###### macros
 121
 122         #define container_of(ptr, type, member) ({                      \
 123                 const typeof( ((type *)0)->member ) *__mptr = (ptr);    \
 124                 (type *)( (char *)__mptr - offsetof(type,member) );})
 125
 126         #define config2context(_conf) container_of(_conf, struct parse_context, \
 127                 config)
 128
 129 ###### Parser: reduce
 130         struct parse_context *c = config2context(config);
 131
 132 ###### Parser: code
 133
 134         #include <unistd.h>
 135         #include <stdlib.h>
 136         #include <fcntl.h>
 137         #include <errno.h>
 138         #include <sys/mman.h>
 139         #include <string.h>
 140         #include <stdio.h>
 141         #include <locale.h>
 142         #include <malloc.h>
 143         #include "mdcode.h"
 144         #include "scanner.h"
 145         #include "parser.h"
 146
 147         ## includes
 148
 149         #include "oceani.h"
 150
 151         ## forward decls
 152         ## value functions
 153         ## ast functions
 154         ## core functions
 155
 156         #include <getopt.h>
 157         static char Usage[] =
 158                 "Usage: oceani --trace --print --noexec --brackets --section=SectionName prog.ocn\n";
 159         static const struct option long_options[] = {
 160                 {"trace",     0, NULL, 't'},
 161                 {"print",     0, NULL, 'p'},
 162                 {"noexec",    0, NULL, 'n'},
 163                 {"brackets",  0, NULL, 'b'},
 164                 {"section",   1, NULL, 's'},
 165                 {NULL,        0, NULL, 0},
 166         };
 167         const char *options = "tpnbs";
 168
 169         static void pr_err(char *msg)                   // NOTEST
 170         {
 171                 fprintf(stderr, "%s\n", msg);           // NOTEST
 172         }                                               // NOTEST
 173
 174         int main(int argc, char *argv[])
 175         {
 176                 int fd;
 177                 int len;
 178                 char *file;
 179                 struct section *s = NULL, *ss;
 180                 char *section = NULL;
 181                 struct parse_context context = {
 182                         .config = {
 183                                 .ignored = (1 << TK_mark),
 184                                 .number_chars = ".,_+- ",
 185                                 .word_start = "_",
 186                                 .word_cont = "_",
 187                         },
 188                 };
 189                 int doprint=0, dotrace=0, doexec=1, brackets=0;
 190                 int opt;
 191                 while ((opt = getopt_long(argc, argv, options, long_options, NULL))
 192                        != -1) {
 193                         switch(opt) {
 194                         case 't': dotrace=1; break;
 195                         case 'p': doprint=1; break;
 196                         case 'n': doexec=0; break;
 197                         case 'b': brackets=1; break;
 198                         case 's': section = optarg; break;
 199                         default: fprintf(stderr, Usage);
 200                                 exit(1);
 201                         }
 202                 }
 203                 if (optind >= argc) {
 204                         fprintf(stderr, "oceani: no input file given\n");
 205                         exit(1);
 206                 }
 207                 fd = open(argv[optind], O_RDONLY);
 208                 if (fd < 0) {
 209                         fprintf(stderr, "oceani: cannot open %s\n", argv[optind]);
 210                         exit(1);
 211                 }
 212                 context.file_name = argv[optind];
 213                 len = lseek(fd, 0, 2);
 214                 file = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
 215                 s = code_extract(file, file+len, pr_err);
 216                 if (!s) {
 217                         fprintf(stderr, "oceani: could not find any code in %s\n",
 218                                 argv[optind]);
 219                         exit(1);
 220                 }
 221
 222                 ## context initialization
 223
 224                 if (section) {
 225                         for (ss = s; ss; ss = ss->next) {
 226                                 struct text sec = ss->section;
 227                                 if (sec.len == strlen(section) &&
 228                                     strncmp(sec.txt, section, sec.len) == 0)
 229                                         break;
 230                         }
 231                         if (!ss) {
 232                                 fprintf(stderr, "oceani: cannot find section %s\n",
 233                                         section);
 234                                 goto cleanup;
 235                         }
 236                 } else
 237                         ss = s;                         // NOTEST
 238                 if (!ss->code) {
 239                         fprintf(stderr, "oceani: no code found in requested section\n");        // NOTEST
 240                         goto cleanup;                   // NOTEST
 241                 }
 242
 243                 parse_oceani(ss->code, &context.config, dotrace ? stderr : NULL);
 244
 245                 if (!context.parse_error && !analyse_funcs(&context)) {
 246                         fprintf(stderr, "oceani: type error in program - not running.\n");
 247                         context.parse_error = 1;
 248                 }
 249
 250                 if (doprint) {
 251                         ## print const decls
 252                         ## print type decls
 253                         ## print func decls
 254                 }
 255                 if (doexec && !context.parse_error)
 256                         interp_main(&context, argc - optind, argv + optind);
 257         cleanup:
 258                 while (s) {
 259                         struct section *t = s->next;
 260                         code_free(s->code);
 261                         free(s);
 262                         s = t;
 263                 }
 264                 // FIXME parser should pop scope even on error
 265                 while (context.scope_depth > 0)
 266                         scope_pop(&context);
 267                 ## free global vars
 268                 ## free context types
 269                 ## free context storage
 270                 exit(context.parse_error ? 1 : 0);
 271         }
 272
 273 ### Analysis
 274
 275 The four requirements of parse, analyse, print, interpret apply to
 276 each language element individually so that is how most of the code
 277 will be structured.
 278
 279 Three of the four are fairly self explanatory.  The one that requires
 280 a little explanation is the analysis step.
 281
 282 The current language design does not require the types of variables to
 283 be declared, but they must still have a single type.  Different
 284 operations impose different requirements on the variables, for example
 285 addition requires both arguments to be numeric, and assignment
 286 requires the variable on the left to have the same type as the
 287 expression on the right.
 288
 289 Analysis involves propagating these type requirements around and
 290 consequently setting the type of each variable.  If any requirements
 291 are violated (e.g. a string is compared with a number) or if a
 292 variable needs to have two different types, then an error is raised
 293 and the program will not run.
 294
 295 If the same variable is declared in both branchs of an 'if/else', or
 296 in all cases of a 'switch' then the multiple instances may be merged
 297 into just one variable if the variable is referenced after the
 298 conditional statement.  When this happens, the types must naturally be
 299 consistent across all the branches.  When the variable is not used
 300 outside the if, the variables in the different branches are distinct
 301 and can be of different types.
 302
 303 Undeclared names may only appear in "use" statements and "case" expressions.
 304 These names are given a type of "label" and a unique value.
 305 This allows them to fill the role of a name in an enumerated type, which
 306 is useful for testing the `switch` statement.
 307
 308 As we will see, the condition part of a `while` statement can return
 309 either a Boolean or some other type.  This requires that the expected
 310 type that gets passed around comprises a type and a flag to indicate
 311 that `Tbool` is also permitted.
 312
 313 As there are, as yet, no distinct types that are compatible, there
 314 isn't much subtlety in the analysis.  When we have distinct number
 315 types, this will become more interesting.
 316
 317 #### Error reporting
 318
 319 When analysis discovers an inconsistency it needs to report an error;
 320 just refusing to run the code ensures that the error doesn't cascade,
 321 but by itself it isn't very useful.  A clear understanding of the sort
 322 of error message that are useful will help guide the process of
 323 analysis.
 324
 325 At a simplistic level, the only sort of error that type analysis can
 326 report is that the type of some construct doesn't match a contextual
 327 requirement.  For example, in `4 + "hello"` the addition provides a
 328 contextual requirement for numbers, but `"hello"` is not a number.  In
 329 this particular example no further information is needed as the types
 330 are obvious from local information.  When a variable is involved that
 331 isn't the case.  It may be helpful to explain why the variable has a
 332 particular type, by indicating the location where the type was set,
 333 whether by declaration or usage.
 334
 335 Using a recursive-descent analysis we can easily detect a problem at
 336 multiple locations. In "`hello:= "there"; 4 + hello`" the addition
 337 will detect that one argument is not a number and the usage of `hello`
 338 will detect that a number was wanted, but not provided.  In this
 339 (early) version of the language, we will generate error reports at
 340 multiple locations, so the use of `hello` will report an error and
 341 explain were the value was set, and the addition will report an error
 342 and say why numbers are needed.  To be able to report locations for
 343 errors, each language element will need to record a file location
 344 (line and column) and each variable will need to record the language
 345 element where its type was set.  For now we will assume that each line
 346 of an error message indicates one location in the file, and up to 2
 347 types.  So we provide a `printf`-like function which takes a format, a
 348 location (a `struct exec` which has not yet been introduced), and 2
 349 types. "`%1`" reports the first type, "`%2`" reports the second.  We
 350 will need a function to print the location, once we know how that is
 351 stored. e As will be explained later, there are sometimes extra rules for
 352 type matching and they might affect error messages, we need to pass those
 353 in too.
 354
 355 As well as type errors, we sometimes need to report problems with
 356 tokens, which might be unexpected or might name a type that has not
 357 been defined.  For these we have `tok_err()` which reports an error
 358 with a given token.  Each of the error functions sets the flag in the
 359 context so indicate that parsing failed.
 360
 361 ###### forward decls
 362
 363         static void fput_loc(struct exec *loc, FILE *f);
 364         static void type_err(struct parse_context *c,
 365                              char *fmt, struct exec *loc,
 366                              struct type *t1, int rules, struct type *t2);
 367
 368 ###### core functions
 369
 370         static void type_err(struct parse_context *c,
 371                              char *fmt, struct exec *loc,
 372                              struct type *t1, int rules, struct type *t2)
 373         {
 374                 fprintf(stderr, "%s:", c->file_name);
 375                 fput_loc(loc, stderr);
 376                 for (; *fmt ; fmt++) {
 377                         if (*fmt != '%') {
 378                                 fputc(*fmt, stderr);
 379                                 continue;
 380                         }
 381                         fmt++;
 382                         switch (*fmt) {
 383                         case '%': fputc(*fmt, stderr); break;   // NOTEST
 384                         default: fputc('?', stderr); break;     // NOTEST
 385                         case '1':
 386                                 type_print(t1, stderr);
 387                                 break;
 388                         case '2':
 389                                 type_print(t2, stderr);
 390                                 break;
 391                         ## format cases
 392                         }
 393                 }
 394                 fputs("\n", stderr);
 395                 c->parse_error = 1;
 396         }
 397
 398         static void tok_err(struct parse_context *c, char *fmt, struct token *t)
 399         {
 400                 fprintf(stderr, "%s:%d:%d: %s: %.*s\n", c->file_name, t->line, t->col, fmt,
 401                         t->txt.len, t->txt.txt);
 402                 c->parse_error = 1;
 403         }
 404
 405 ## Entities: declared and predeclared.
 406
 407 There are various "things" that the language and/or the interpreter
 408 needs to know about to parse and execute a program.  These include
 409 types, variables, values, and executable code.  These are all lumped
 410 together under the term "entities" (calling them "objects" would be
 411 confusing) and introduced here.  The following section will present the
 412 different specific code elements which comprise or manipulate these
 413 various entities.
 414
 415 ### Executables
 416
 417 Executables can be lots of different things.  In many cases an
 418 executable is just an operation combined with one or two other
 419 executables.  This allows for expressions and lists etc.  Other times an
 420 executable is something quite specific like a constant or variable name.
 421 So we define a `struct exec` to be a general executable with a type, and
 422 a `struct binode` which is a subclass of `exec`, forms a node in a
 423 binary tree, and holds an operation.  There will be other subclasses,
 424 and to access these we need to be able to `cast` the `exec` into the
 425 various other types.  The first field in any `struct exec` is the type
 426 from the `exec_types` enum.
 427
 428 ###### macros
 429         #define cast(structname, pointer) ({            \
 430                 const typeof( ((struct structname *)0)->type) *__mptr = &(pointer)->type; \
 431                 if (__mptr && *__mptr != X##structname) abort();                \
 432                 (struct structname *)( (char *)__mptr);})
 433
 434         #define new(structname) ({                                              \
 435                 struct structname *__ptr = ((struct structname *)calloc(1,sizeof(struct structname))); \
 436                 __ptr->type = X##structname;                                            \
 437                 __ptr->line = -1; __ptr->column = -1;                                   \
 438                 __ptr;})
 439
 440         #define new_pos(structname, token) ({                                           \
 441                 struct structname *__ptr = ((struct structname *)calloc(1,sizeof(struct structname))); \
 442                 __ptr->type = X##structname;                                            \
 443                 __ptr->line = token.line; __ptr->column = token.col;                    \
 444                 __ptr;})
 445
 446 ###### ast
 447         enum exec_types {
 448                 Xbinode,
 449                 ## exec type
 450         };
 451         struct exec {
 452                 enum exec_types type;
 453                 int line, column;
 454                 ## exec fields
 455         };
 456         struct binode {
 457                 struct exec;
 458                 enum Btype {
 459                         ## Binode types
 460                 } op;
 461                 struct exec *left, *right;
 462         };
 463
 464 ###### ast functions
 465
 466         static int __fput_loc(struct exec *loc, FILE *f)
 467         {
 468                 if (!loc)
 469                         return 0;
 470                 if (loc->line >= 0) {
 471                         fprintf(f, "%d:%d: ", loc->line, loc->column);
 472                         return 1;
 473                 }
 474                 if (loc->type == Xbinode)
 475                         return __fput_loc(cast(binode,loc)->left, f) ||
 476                                __fput_loc(cast(binode,loc)->right, f);  // NOTEST
 477                 return 0;
 478         }
 479         static void fput_loc(struct exec *loc, FILE *f)
 480         {
 481                 if (!__fput_loc(loc, f))
 482                         fprintf(f, "??:??: ");
 483         }
 484
 485 Each different type of `exec` node needs a number of functions defined,
 486 a bit like methods.  We must be able to free it, print it, analyse it
 487 and execute it.  Once we have specific `exec` types we will need to
 488 parse them too.  Let's take this a bit more slowly.
 489
 490 #### Freeing
 491
 492 The parser generator requires a `free_foo` function for each struct
 493 that stores attributes and they will often be `exec`s and subtypes
 494 there-of.  So we need `free_exec` which can handle all the subtypes,
 495 and we need `free_binode`.
 496
 497 ###### ast functions
 498
 499         static void free_binode(struct binode *b)
 500         {
 501                 if (!b)
 502                         return;
 503                 free_exec(b->left);
 504                 free_exec(b->right);
 505                 free(b);
 506         }
 507
 508 ###### core functions
 509         static void free_exec(struct exec *e)
 510         {
 511                 if (!e)
 512                         return;
 513                 switch(e->type) {
 514                         ## free exec cases
 515                 }
 516         }
 517
 518 ###### forward decls
 519
 520         static void free_exec(struct exec *e);
 521
 522 ###### free exec cases
 523         case Xbinode: free_binode(cast(binode, e)); break;
 524
 525 #### Printing
 526
 527 Printing an `exec` requires that we know the current indent level for
 528 printing line-oriented components.  As will become clear later, we
 529 also want to know what sort of bracketing to use.
 530
 531 ###### ast functions
 532
 533         static void do_indent(int i, char *str)
 534         {
 535                 while (i-- > 0)
 536                         printf("    ");
 537                 printf("%s", str);
 538         }
 539
 540 ###### core functions
 541         static void print_binode(struct binode *b, int indent, int bracket)
 542         {
 543                 struct binode *b2;
 544                 switch(b->op) {
 545                 ## print binode cases
 546                 }
 547         }
 548
 549         static void print_exec(struct exec *e, int indent, int bracket)
 550         {
 551                 if (!e)
 552                         return;
 553                 switch (e->type) {
 554                 case Xbinode:
 555                         print_binode(cast(binode, e), indent, bracket); break;
 556                 ## print exec cases
 557                 }
 558                 if (e->to_free) {
 559                         struct variable *v;
 560                         do_indent(indent, "/* FREE");
 561                         for (v = e->to_free; v; v = v->next_free) {
 562                                 printf(" %.*s", v->name->name.len, v->name->name.txt);
 563                                 printf("[%d,%d]", v->scope_start, v->scope_end);
 564                                 if (v->frame_pos >= 0)
 565                                         printf("(%d+%d)", v->frame_pos,
 566                                                v->type ? v->type->size:0);
 567                         }
 568                         printf(" */\n");
 569                 }
 570         }
 571
 572 ###### forward decls
 573
 574         static void print_exec(struct exec *e, int indent, int bracket);
 575
 576 #### Analysing
 577
 578 As discussed, analysis involves propagating type requirements around the
 579 program and looking for errors.
 580
 581 So `propagate_types` is passed an expected type (being a `struct type`
 582 pointer together with some `val_rules` flags) that the `exec` is
 583 expected to return, and returns the type that it does return, either
 584 of which can be `NULL` signifying "unknown".  An `ok` flag is passed
 585 by reference. It is set to `0` when an error is found, and `2` when
 586 any change is made.  If it remains unchanged at `1`, then no more
 587 propagation is needed.
 588
 589 ###### ast
 590
 591         enum val_rules {Rnolabel = 1<<0, Rboolok = 1<<1, Rnoconstant = 1<<2};
 592
 593 ###### format cases
 594         case 'r':
 595                 if (rules & Rnolabel)
 596                         fputs(" (labels not permitted)", stderr);
 597                 break;
 598
 599 ###### forward decls
 600         static struct type *propagate_types(struct exec *prog, struct parse_context *c, int *ok,
 601                                             struct type *type, int rules);
 602 ###### core functions
 603
 604         static struct type *__propagate_types(struct exec *prog, struct parse_context *c, int *ok,
 605                                               struct type *type, int rules)
 606         {
 607                 struct type *t;
 608
 609                 if (!prog)
 610                         return Tnone;
 611
 612                 switch (prog->type) {
 613                 case Xbinode:
 614                 {
 615                         struct binode *b = cast(binode, prog);
 616                         switch (b->op) {
 617                         ## propagate binode cases
 618                         }
 619                         break;
 620                 }
 621                 ## propagate exec cases
 622                 }
 623                 return Tnone;
 624         }
 625
 626         static struct type *propagate_types(struct exec *prog, struct parse_context *c, int *ok,
 627                                             struct type *type, int rules)
 628         {
 629                 struct type *ret = __propagate_types(prog, c, ok, type, rules);
 630
 631                 if (c->parse_error)
 632                         *ok = 0;
 633                 return ret;
 634         }
 635
 636 #### Interpreting
 637
 638 Interpreting an `exec` doesn't require anything but the `exec`.  State
 639 is stored in variables and each variable will be directly linked from
 640 within the `exec` tree.  The exception to this is the `main` function
 641 which needs to look at command line arguments.  This function will be
 642 interpreted separately.
 643
 644 Each `exec` can return a value combined with a type in `struct lrval`.
 645 The type may be `Tnone` but must be non-NULL.  Some `exec`s will return
 646 the location of a value, which can be updated, in `lval`.  Others will
 647 set `lval` to NULL indicating that there is a value of appropriate type
 648 in `rval`.
 649
 650 ###### core functions
 651
 652         struct lrval {
 653                 struct type *type;
 654                 struct value rval, *lval;
 655         };
 656
 657         /* If dest is passed, dtype must give the expected type, and
 658          * result can go there, in which case type is returned as NULL.
 659          */
 660         static struct lrval _interp_exec(struct parse_context *c, struct exec *e,
 661                                          struct value *dest, struct type *dtype);
 662
 663         static struct value interp_exec(struct parse_context *c, struct exec *e,
 664                                         struct type **typeret)
 665         {
 666                 struct lrval ret = _interp_exec(c, e, NULL, NULL);
 667
 668                 if (!ret.type) abort();
 669                 if (typeret)
 670                         *typeret = ret.type;
 671                 if (ret.lval)
 672                         dup_value(ret.type, ret.lval, &ret.rval);
 673                 return ret.rval;
 674         }
 675
 676         static struct value *linterp_exec(struct parse_context *c, struct exec *e,
 677                                           struct type **typeret)
 678         {
 679                 struct lrval ret = _interp_exec(c, e, NULL, NULL);
 680
 681                 if (!ret.type) abort();
 682                 if (ret.lval)
 683                         *typeret = ret.type;
 684                 else
 685                         free_value(ret.type, &ret.rval);
 686                 return ret.lval;
 687         }
 688
 689         /* dinterp_exec is used when the destination type is certain and
 690          * the value has a place to go.
 691          */
 692         static void dinterp_exec(struct parse_context *c, struct exec *e,
 693                                  struct value *dest, struct type *dtype,
 694                                  int need_free)
 695         {
 696                 struct lrval ret = _interp_exec(c, e, dest, dtype);
 697                 if (!ret.type)
 698                         return;
 699                 if (need_free)
 700                         free_value(dtype, dest);
 701                 if (ret.lval)
 702                         dup_value(dtype, ret.lval, dest);
 703                 else
 704                         memcpy(dest, &ret.rval, dtype->size);
 705         }
 706
 707         static struct lrval _interp_exec(struct parse_context *c, struct exec *e,
 708                                          struct value *dest, struct type *dtype)
 709         {
 710                 /* If the result is copied to dest, ret.type is set to NULL */
 711                 struct lrval ret;
 712                 struct value rv = {}, *lrv = NULL;
 713                 struct type *rvtype;
 714
 715                 rvtype = ret.type = Tnone;
 716                 if (!e) {
 717                         ret.lval = lrv;
 718                         ret.rval = rv;
 719                         return ret;
 720                 }
 721
 722                 switch(e->type) {
 723                 case Xbinode:
 724                 {
 725                         struct binode *b = cast(binode, e);
 726                         struct value left, right, *lleft;
 727                         struct type *ltype, *rtype;
 728                         ltype = rtype = Tnone;
 729                         switch (b->op) {
 730                         ## interp binode cases
 731                         }
 732                         free_value(ltype, &left);
 733                         free_value(rtype, &right);
 734                         break;
 735                 }
 736                 ## interp exec cases
 737                 }
 738                 if (rvtype) {
 739                         ret.lval = lrv;
 740                         ret.rval = rv;
 741                         ret.type = rvtype;
 742                 }
 743                 ## interp exec cleanup
 744                 return ret;
 745         }
 746
 747 ### Types
 748
 749 Values come in a wide range of types, with more likely to be added.
 750 Each type needs to be able to print its own values (for convenience at
 751 least) as well as to compare two values, at least for equality and
 752 possibly for order.  For now, values might need to be duplicated and
 753 freed, though eventually such manipulations will be better integrated
 754 into the language.
 755
 756 Rather than requiring every numeric type to support all numeric
 757 operations (add, multiply, etc), we allow types to be able to present
 758 as one of a few standard types: integer, float, and fraction.  The
 759 existence of these conversion functions eventually enable types to
 760 determine if they are compatible with other types, though such types
 761 have not yet been implemented.
 762
 763 Named type are stored in a simple linked list.  Objects of each type are
 764 "values" which are often passed around by value.
 765
 766 ###### ast
 767
 768         struct value {
 769                 union {
 770                         char ptr[1];
 771                         ## value union fields
 772                 };
 773         };
 774
 775         struct type {
 776                 struct text name;
 777                 struct type *next;
 778                 int size, align;
 779                 void (*init)(struct type *type, struct value *val);
 780                 void (*prepare_type)(struct parse_context *c, struct type *type, int parse_time);
 781                 void (*print)(struct type *type, struct value *val, FILE *f);
 782                 void (*print_type)(struct type *type, FILE *f);
 783                 int (*cmp_order)(struct type *t1, struct type *t2,
 784                                  struct value *v1, struct value *v2);
 785                 int (*cmp_eq)(struct type *t1, struct type *t2,
 786                               struct value *v1, struct value *v2);
 787                 void (*dup)(struct type *type, struct value *vold, struct value *vnew);
 788                 void (*free)(struct type *type, struct value *val);
 789                 void (*free_type)(struct type *t);
 790                 long long (*to_int)(struct value *v);
 791                 double (*to_float)(struct value *v);
 792                 int (*to_mpq)(mpq_t *q, struct value *v);
 793                 ## type functions
 794                 union {
 795                         ## type union fields
 796                 };
 797         };
 798
 799 ###### parse context
 800
 801         struct type *typelist;
 802
 803 ###### ast functions
 804
 805         static struct type *find_type(struct parse_context *c, struct text s)
 806         {
 807                 struct type *l = c->typelist;
 808
 809                 while (l &&
 810                        text_cmp(l->name, s) != 0)
 811                                 l = l->next;
 812                 return l;
 813         }
 814
 815         static struct type *add_type(struct parse_context *c, struct text s,
 816                                      struct type *proto)
 817         {
 818                 struct type *n;
 819
 820                 n = calloc(1, sizeof(*n));
 821                 *n = *proto;
 822                 n->name = s;
 823                 n->next = c->typelist;
 824                 c->typelist = n;
 825                 return n;
 826         }
 827
 828         static void free_type(struct type *t)
 829         {
 830                 /* The type is always a reference to something in the
 831                  * context, so we don't need to free anything.
 832                  */
 833         }
 834
 835         static void free_value(struct type *type, struct value *v)
 836         {
 837                 if (type && v) {
 838                         type->free(type, v);
 839                         memset(v, 0x5a, type->size);
 840                 }
 841         }
 842
 843         static void type_print(struct type *type, FILE *f)
 844         {
 845                 if (!type)
 846                         fputs("*unknown*type*", f);     // NOTEST
 847                 else if (type->name.len)
 848                         fprintf(f, "%.*s", type->name.len, type->name.txt);
 849                 else if (type->print_type)
 850                         type->print_type(type, f);
 851                 else
 852                         fputs("*invalid*type*", f);     // NOTEST
 853         }
 854
 855         static void val_init(struct type *type, struct value *val)
 856         {
 857                 if (type && type->init)
 858                         type->init(type, val);
 859         }
 860
 861         static void dup_value(struct type *type,
 862                               struct value *vold, struct value *vnew)
 863         {
 864                 if (type && type->dup)
 865                         type->dup(type, vold, vnew);
 866         }
 867
 868         static int value_cmp(struct type *tl, struct type *tr,
 869                              struct value *left, struct value *right)
 870         {
 871                 if (tl && tl->cmp_order)
 872                         return tl->cmp_order(tl, tr, left, right);
 873                 if (tl && tl->cmp_eq)                   // NOTEST
 874                         return tl->cmp_eq(tl, tr, left, right); // NOTEST
 875                 return -1;                              // NOTEST
 876         }
 877
 878         static void print_value(struct type *type, struct value *v, FILE *f)
 879         {
 880                 if (type && type->print)
 881                         type->print(type, v, f);
 882                 else
 883                         fprintf(f, "*Unknown*");                // NOTEST
 884         }
 885
 886 ###### forward decls
 887
 888         static void free_value(struct type *type, struct value *v);
 889         static int type_compat(struct type *require, struct type *have, int rules);
 890         static void type_print(struct type *type, FILE *f);
 891         static void val_init(struct type *type, struct value *v);
 892         static void dup_value(struct type *type,
 893                               struct value *vold, struct value *vnew);
 894         static int value_cmp(struct type *tl, struct type *tr,
 895                              struct value *left, struct value *right);
 896         static void print_value(struct type *type, struct value *v, FILE *f);
 897
 898 ###### free context types
 899
 900         while (context.typelist) {
 901                 struct type *t = context.typelist;
 902
 903                 context.typelist = t->next;
 904                 if (t->free_type)
 905                         t->free_type(t);
 906                 free(t);
 907         }
 908
 909 Type can be specified for local variables, for fields in a structure,
 910 for formal parameters to functions, and possibly elsewhere.  Different
 911 rules may apply in different contexts.  As a minimum, a named type may
 912 always be used.  Currently the type of a formal parameter can be
 913 different from types in other contexts, so we have a separate grammar
 914 symbol for those.
 915
 916 ###### Grammar
 917
 918         $*type
 919         Type -> IDENTIFIER ${
 920                 $0 = find_type(c, $1.txt);
 921                 if (!$0) {
 922                         tok_err(c,
 923                                 "error: undefined type", &$1);
 924
 925                         $0 = Tnone;
 926                 }
 927         }$
 928         ## type grammar
 929
 930         FormalType -> Type ${ $0 = $<1; }$
 931         ## formal type grammar
 932
 933 #### Base Types
 934
 935 Values of the base types can be numbers, which we represent as
 936 multi-precision fractions, strings, Booleans and labels.  When
 937 analysing the program we also need to allow for places where no value
 938 is meaningful (type `Tnone`) and where we don't know what type to
 939 expect yet (type is `NULL`).
 940
 941 Values are never shared, they are always copied when used, and freed
 942 when no longer needed.
 943
 944 When propagating type information around the program, we need to
 945 determine if two types are compatible, where type `NULL` is compatible
 946 with anything.  There are two special cases with type compatibility,
 947 both related to the Conditional Statement which will be described
 948 later.  In some cases a Boolean can be accepted as well as some other
 949 primary type, and in others any type is acceptable except a label (`Vlabel`).
 950 A separate function encoding these cases will simplify some code later.
 951
 952 ###### type functions
 953
 954         int (*compat)(struct type *this, struct type *other);
 955
 956 ###### ast functions
 957
 958         static int type_compat(struct type *require, struct type *have, int rules)
 959         {
 960                 if ((rules & Rboolok) && have == Tbool)
 961                         return 1;       // NOTEST
 962                 if ((rules & Rnolabel) && have == Tlabel)
 963                         return 0;       // NOTEST
 964                 if (!require || !have)
 965                         return 1;
 966
 967                 if (require->compat)
 968                         return require->compat(require, have);
 969
 970                 return require == have;
 971         }
 972
 973 ###### includes
 974         #include <gmp.h>
 975         #include "parse_string.h"
 976         #include "parse_number.h"
 977
 978 ###### libs
 979         myLDLIBS := libnumber.o libstring.o -lgmp
 980         LDLIBS := $(filter-out $(myLDLIBS),$(LDLIBS)) $(myLDLIBS)
 981
 982 ###### type union fields
 983         enum vtype {Vnone, Vstr, Vnum, Vbool, Vlabel} vtype;
 984
 985 ###### value union fields
 986         struct text str;
 987         mpq_t num;
 988         unsigned char bool;
 989         void *label;
 990
 991 ###### ast functions
 992         static void _free_value(struct type *type, struct value *v)
 993         {
 994                 if (!v)
 995                         return;         // NOTEST
 996                 switch (type->vtype) {
 997                 case Vnone: break;
 998                 case Vstr: free(v->str.txt); break;
 999                 case Vnum: mpq_clear(v->num); break;
1000                 case Vlabel:
1001                 case Vbool: break;
1002                 }
1003         }
1004
1005 ###### value functions
1006
1007         static void _val_init(struct type *type, struct value *val)
1008         {
1009                 switch(type->vtype) {
1010                 case Vnone:             // NOTEST
1011                         break;          // NOTEST
1012                 case Vnum:
1013                         mpq_init(val->num); break;
1014                 case Vstr:
1015                         val->str.txt = malloc(1);
1016                         val->str.len = 0;
1017                         break;
1018                 case Vbool:
1019                         val->bool = 0;
1020                         break;
1021                 case Vlabel:
1022                         val->label = NULL;
1023                         break;
1024                 }
1025         }
1026
1027         static void _dup_value(struct type *type,
1028                                struct value *vold, struct value *vnew)
1029         {
1030                 switch (type->vtype) {
1031                 case Vnone:             // NOTEST
1032                         break;          // NOTEST
1033                 case Vlabel:
1034                         vnew->label = vold->label;
1035                         break;
1036                 case Vbool:
1037                         vnew->bool = vold->bool;
1038                         break;
1039                 case Vnum:
1040                         mpq_init(vnew->num);
1041                         mpq_set(vnew->num, vold->num);
1042                         break;
1043                 case Vstr:
1044                         vnew->str.len = vold->str.len;
1045                         vnew->str.txt = malloc(vnew->str.len);
1046                         memcpy(vnew->str.txt, vold->str.txt, vnew->str.len);
1047                         break;
1048                 }
1049         }
1050
1051         static int _value_cmp(struct type *tl, struct type *tr,
1052                               struct value *left, struct value *right)
1053         {
1054                 int cmp;
1055                 if (tl != tr)
1056                         return tl - tr; // NOTEST
1057                 switch (tl->vtype) {
1058                 case Vlabel: cmp = left->label == right->label ? 0 : 1; break;
1059                 case Vnum: cmp = mpq_cmp(left->num, right->num); break;
1060                 case Vstr: cmp = text_cmp(left->str, right->str); break;
1061                 case Vbool: cmp = left->bool - right->bool; break;
1062                 case Vnone: cmp = 0;                    // NOTEST
1063                 }
1064                 return cmp;
1065         }
1066
1067         static void _print_value(struct type *type, struct value *v, FILE *f)
1068         {
1069                 switch (type->vtype) {
1070                 case Vnone:                             // NOTEST
1071                         fprintf(f, "*no-value*"); break;        // NOTEST
1072                 case Vlabel:                            // NOTEST
1073                         fprintf(f, "*label-%p*", v->label); break; // NOTEST
1074                 case Vstr:
1075                         fprintf(f, "%.*s", v->str.len, v->str.txt); break;
1076                 case Vbool:
1077                         fprintf(f, "%s", v->bool ? "True":"False"); break;
1078                 case Vnum:
1079                         {
1080                         mpf_t fl;
1081                         mpf_init2(fl, 20);
1082                         mpf_set_q(fl, v->num);
1083                         gmp_fprintf(f, "%Fg", fl);
1084                         mpf_clear(fl);
1085                         break;
1086                         }
1087                 }
1088         }
1089
1090         static void _free_value(struct type *type, struct value *v);
1091
1092         static struct type base_prototype = {
1093                 .init = _val_init,
1094                 .print = _print_value,
1095                 .cmp_order = _value_cmp,
1096                 .cmp_eq = _value_cmp,
1097                 .dup = _dup_value,
1098                 .free = _free_value,
1099         };
1100
1101         static struct type *Tbool, *Tstr, *Tnum, *Tnone, *Tlabel;
1102
1103 ###### ast functions
1104         static struct type *add_base_type(struct parse_context *c, char *n,
1105                                           enum vtype vt, int size)
1106         {
1107                 struct text txt = { n, strlen(n) };
1108                 struct type *t;
1109
1110                 t = add_type(c, txt, &base_prototype);
1111                 t->vtype = vt;
1112                 t->size = size;
1113                 t->align = size > sizeof(void*) ? sizeof(void*) : size;
1114                 if (t->size & (t->align - 1))
1115                         t->size = (t->size | (t->align - 1)) + 1;       // NOTEST
1116                 return t;
1117         }
1118
1119 ###### context initialization
1120
1121         Tbool  = add_base_type(&context, "Boolean", Vbool, sizeof(char));
1122         Tstr   = add_base_type(&context, "string", Vstr, sizeof(struct text));
1123         Tnum   = add_base_type(&context, "number", Vnum, sizeof(mpq_t));
1124         Tnone  = add_base_type(&context, "none", Vnone, 0);
1125         Tlabel = add_base_type(&context, "label", Vlabel, sizeof(void*));
1126
1127 ##### Base Values
1128
1129 We have already met values as separate objects.  When manifest constants
1130 appear in the program text, that must result in an executable which has
1131 a constant value.  So the `val` structure embeds a value in an
1132 executable.
1133
1134 ###### exec type
1135         Xval,
1136
1137 ###### ast
1138         struct val {
1139                 struct exec;
1140                 struct type *vtype;
1141                 struct value val;
1142         };
1143
1144 ###### ast functions
1145         struct val *new_val(struct type *T, struct token tk)
1146         {
1147                 struct val *v = new_pos(val, tk);
1148                 v->vtype = T;
1149                 return v;
1150         }
1151
1152 ###### Grammar
1153
1154         $TERM True False
1155
1156         $*val
1157         Value ->  True ${
1158                         $0 = new_val(Tbool, $1);
1159                         $0->val.bool = 1;
1160                         }$
1161                 | False ${
1162                         $0 = new_val(Tbool, $1);
1163                         $0->val.bool = 0;
1164                         }$
1165                 | NUMBER ${
1166                         $0 = new_val(Tnum, $1);
1167                         {
1168                         char tail[3];
1169                         if (number_parse($0->val.num, tail, $1.txt) == 0)
1170                                 mpq_init($0->val.num);  // UNTESTED
1171                                 if (tail[0])
1172                                         tok_err(c, "error: unsupported number suffix",
1173                                                 &$1);
1174                         }
1175                         }$
1176                 | STRING ${
1177                         $0 = new_val(Tstr, $1);
1178                         {
1179                         char tail[3];
1180                         string_parse(&$1, '\\', &$0->val.str, tail);
1181                         if (tail[0])
1182                                 tok_err(c, "error: unsupported string suffix",
1183                                         &$1);
1184                         }
1185                         }$
1186                 | MULTI_STRING ${
1187                         $0 = new_val(Tstr, $1);
1188                         {
1189                         char tail[3];
1190                         string_parse(&$1, '\\', &$0->val.str, tail);
1191                         if (tail[0])
1192                                 tok_err(c, "error: unsupported string suffix",
1193                                         &$1);
1194                         }
1195                         }$
1196
1197 ###### print exec cases
1198         case Xval:
1199         {
1200                 struct val *v = cast(val, e);
1201                 if (v->vtype == Tstr)
1202                         printf("\"");
1203                 print_value(v->vtype, &v->val, stdout);
1204                 if (v->vtype == Tstr)
1205                         printf("\"");
1206                 break;
1207         }
1208
1209 ###### propagate exec cases
1210         case Xval:
1211         {
1212                 struct val *val = cast(val, prog);
1213                 if (!type_compat(type, val->vtype, rules))
1214                         type_err(c, "error: expected %1%r found %2",
1215                                    prog, type, rules, val->vtype);
1216                 return val->vtype;
1217         }
1218
1219 ###### interp exec cases
1220         case Xval:
1221                 rvtype = cast(val, e)->vtype;
1222                 dup_value(rvtype, &cast(val, e)->val, &rv);
1223                 break;
1224
1225 ###### ast functions
1226         static void free_val(struct val *v)
1227         {
1228                 if (v)
1229                         free_value(v->vtype, &v->val);
1230                 free(v);
1231         }
1232
1233 ###### free exec cases
1234         case Xval: free_val(cast(val, e)); break;
1235
1236 ###### ast functions
1237         // Move all nodes from 'b' to 'rv', reversing their order.
1238         // In 'b' 'left' is a list, and 'right' is the last node.
1239         // In 'rv', left' is the first node and 'right' is a list.
1240         static struct binode *reorder_bilist(struct binode *b)
1241         {
1242                 struct binode *rv = NULL;
1243
1244                 while (b) {
1245                         struct exec *t = b->right;
1246                         b->right = rv;
1247                         rv = b;
1248                         if (b->left)
1249                                 b = cast(binode, b->left);
1250                         else
1251                                 b = NULL;
1252                         rv->left = t;
1253                 }
1254                 return rv;
1255         }
1256
1257 ### Variables
1258
1259 Variables are scoped named values.  We store the names in a linked list
1260 of "bindings" sorted in lexical order, and use sequential search and
1261 insertion sort.
1262
1263 ###### ast
1264
1265         struct binding {
1266                 struct text name;
1267                 struct binding *next;   // in lexical order
1268                 ## binding fields
1269         };
1270
1271 This linked list is stored in the parse context so that "reduce"
1272 functions can find or add variables, and so the analysis phase can
1273 ensure that every variable gets a type.
1274
1275 ###### parse context
1276
1277         struct binding *varlist;  // In lexical order
1278
1279 ###### ast functions
1280
1281         static struct binding *find_binding(struct parse_context *c, struct text s)
1282         {
1283                 struct binding **l = &c->varlist;
1284                 struct binding *n;
1285                 int cmp = 1;
1286
1287                 while (*l &&
1288                         (cmp = text_cmp((*l)->name, s)) < 0)
1289                                 l = & (*l)->next;
1290                 if (cmp == 0)
1291                         return *l;
1292                 n = calloc(1, sizeof(*n));
1293                 n->name = s;
1294                 n->next = *l;
1295                 *l = n;
1296                 return n;
1297         }
1298
1299 Each name can be linked to multiple variables defined in different
1300 scopes.  Each scope starts where the name is declared and continues
1301 until the end of the containing code block.  Scopes of a given name
1302 cannot nest, so a declaration while a name is in-scope is an error.
1303
1304 ###### binding fields
1305         struct variable *var;
1306
1307 ###### ast
1308         struct variable {
1309                 struct variable *previous;
1310                 struct type *type;
1311                 struct binding *name;
1312                 struct exec *where_decl;// where name was declared
1313                 struct exec *where_set; // where type was set
1314                 ## variable fields
1315         };
1316
1317 When a scope closes, the values of the variables might need to be freed.
1318 This happens in the context of some `struct exec` and each `exec` will
1319 need to know which variables need to be freed when it completes.
1320
1321 ####### exec fields
1322         struct variable *to_free;
1323
1324 ####### variable fields
1325         struct exec *cleanup_exec;
1326         struct variable *next_free;
1327
1328 ####### interp exec cleanup
1329         {
1330                 struct variable *v;
1331                 for (v = e->to_free; v; v = v->next_free) {
1332                         struct value *val = var_value(c, v);
1333                         free_value(v->type, val);
1334                 }
1335         }
1336
1337 ###### ast functions
1338         static void variable_unlink_exec(struct variable *v)
1339         {
1340                 struct variable **vp;
1341                 if (!v->cleanup_exec)
1342                         return;
1343                 for (vp = &v->cleanup_exec->to_free;
1344                     *vp; vp = &(*vp)->next_free) {
1345                         if (*vp != v)
1346                                 continue;
1347                         *vp = v->next_free;
1348                         v->cleanup_exec = NULL;
1349                         break;
1350                 }
1351         }
1352
1353 While the naming seems strange, we include local constants in the
1354 definition of variables.  A name declared `var := value` can
1355 subsequently be changed, but a name declared `var ::= value` cannot -
1356 it is constant
1357
1358 ###### variable fields
1359         int constant;
1360
1361 Scopes in parallel branches can be partially merged.  More
1362 specifically, if a given name is declared in both branches of an
1363 if/else then its scope is a candidate for merging.  Similarly if
1364 every branch of an exhaustive switch (e.g. has an "else" clause)
1365 declares a given name, then the scopes from the branches are
1366 candidates for merging.
1367
1368 Note that names declared inside a loop (which is only parallel to
1369 itself) are never visible after the loop.  Similarly names defined in
1370 scopes which are not parallel, such as those started by `for` and
1371 `switch`, are never visible after the scope.  Only variables defined in
1372 both `then` and `else` (including the implicit then after an `if`, and
1373 excluding `then` used with `for`) and in all `case`s and `else` of a
1374 `switch` or `while` can be visible beyond the `if`/`switch`/`while`.
1375
1376 Labels, which are a bit like variables, follow different rules.
1377 Labels are not explicitly declared, but if an undeclared name appears
1378 in a context where a label is legal, that effectively declares the
1379 name as a label.  The declaration remains in force (or in scope) at
1380 least to the end of the immediately containing block and conditionally
1381 in any larger containing block which does not declare the name in some
1382 other way.  Importantly, the conditional scope extension happens even
1383 if the label is only used in one parallel branch of a conditional --
1384 when used in one branch it is treated as having been declared in all
1385 branches.
1386
1387 Merge candidates are tentatively visible beyond the end of the
1388 branching statement which creates them.  If the name is used, the
1389 merge is affirmed and they become a single variable visible at the
1390 outer layer.  If not - if it is redeclared first - the merge lapses.
1391
1392 To track scopes we have an extra stack, implemented as a linked list,
1393 which roughly parallels the parse stack and which is used exclusively
1394 for scoping.  When a new scope is opened, a new frame is pushed and
1395 the child-count of the parent frame is incremented.  This child-count
1396 is used to distinguish between the first of a set of parallel scopes,
1397 in which declared variables must not be in scope, and subsequent
1398 branches, whether they may already be conditionally scoped.
1399
1400 We need a total ordering of scopes so we can easily compare to variables
1401 to see if they are concurrently in scope.  To achieve this we record a
1402 `scope_count` which is actually a count of both beginnings and endings
1403 of scopes.  Then each variable has a record of the scope count where it
1404 enters scope, and where it leaves.
1405
1406 To push a new frame *before* any code in the frame is parsed, we need a
1407 grammar reduction.  This is most easily achieved with a grammar
1408 element which derives the empty string, and creates the new scope when
1409 it is recognised.  This can be placed, for example, between a keyword
1410 like "if" and the code following it.
1411
1412 ###### ast
1413         struct scope {
1414                 struct scope *parent;
1415                 int child_count;
1416         };
1417
1418 ###### parse context
1419         int scope_depth;
1420         int scope_count;
1421         struct scope *scope_stack;
1422
1423 ###### variable fields
1424         int scope_start, scope_end;
1425
1426 ###### ast functions
1427         static void scope_pop(struct parse_context *c)
1428         {
1429                 struct scope *s = c->scope_stack;
1430
1431                 c->scope_stack = s->parent;
1432                 free(s);
1433                 c->scope_depth -= 1;
1434                 c->scope_count += 1;
1435         }
1436
1437         static void scope_push(struct parse_context *c)
1438         {
1439                 struct scope *s = calloc(1, sizeof(*s));
1440                 if (c->scope_stack)
1441                         c->scope_stack->child_count += 1;
1442                 s->parent = c->scope_stack;
1443                 c->scope_stack = s;
1444                 c->scope_depth += 1;
1445                 c->scope_count += 1;
1446         }
1447
1448 ###### Grammar
1449
1450         $void
1451         OpenScope -> ${ scope_push(c); }$
1452
1453 Each variable records a scope depth and is in one of four states:
1454
1455 - "in scope".  This is the case between the declaration of the
1456   variable and the end of the containing block, and also between
1457   the usage with affirms a merge and the end of that block.
1458
1459   The scope depth is not greater than the current parse context scope
1460   nest depth.  When the block of that depth closes, the state will
1461   change.  To achieve this, all "in scope" variables are linked
1462   together as a stack in nesting order.
1463
1464 - "pending".  The "in scope" block has closed, but other parallel
1465   scopes are still being processed.  So far, every parallel block at
1466   the same level that has closed has declared the name.
1467
1468   The scope depth is the depth of the last parallel block that
1469   enclosed the declaration, and that has closed.
1470
1471 - "conditionally in scope".  The "in scope" block and all parallel
1472   scopes have closed, and no further mention of the name has been seen.
1473   This state includes a secondary nest depth (`min_depth`) which records
1474   the outermost scope seen since the variable became conditionally in
1475   scope.  If a use of the name is found, the variable becomes "in scope"
1476   and that secondary depth becomes the recorded scope depth.  If the
1477   name is declared as a new variable, the old variable becomes "out of
1478   scope" and the recorded scope depth stays unchanged.
1479
1480 - "out of scope".  The variable is neither in scope nor conditionally
1481   in scope.  It is permanently out of scope now and can be removed from
1482   the "in scope" stack.  When a variable becomes out-of-scope it is
1483   moved to a separate list (`out_scope`) of variables which have fully
1484   known scope.  This will be used at the end of each function to assign
1485   each variable a place in the stack frame.
1486
1487 ###### variable fields
1488         int depth, min_depth;
1489         enum { OutScope, PendingScope, CondScope, InScope } scope;
1490         struct variable *in_scope;
1491
1492 ###### parse context
1493
1494         struct variable *in_scope;
1495         struct variable *out_scope;
1496
1497 All variables with the same name are linked together using the
1498 'previous' link.  Those variable that have been affirmatively merged all
1499 have a 'merged' pointer that points to one primary variable - the most
1500 recently declared instance.  When merging variables, we need to also
1501 adjust the 'merged' pointer on any other variables that had previously
1502 been merged with the one that will no longer be primary.
1503
1504 A variable that is no longer the most recent instance of a name may
1505 still have "pending" scope, if it might still be merged with most
1506 recent instance.  These variables don't really belong in the
1507 "in_scope" list, but are not immediately removed when a new instance
1508 is found.  Instead, they are detected and ignored when considering the
1509 list of in_scope names.
1510
1511 The storage of the value of a variable will be described later.  For now
1512 we just need to know that when a variable goes out of scope, it might
1513 need to be freed.  For this we need to be able to find it, so assume that
1514 `var_value()` will provide that.
1515
1516 ###### variable fields
1517         struct variable *merged;
1518
1519 ###### ast functions
1520
1521         static void variable_merge(struct variable *primary, struct variable *secondary)
1522         {
1523                 struct variable *v;
1524
1525                 primary = primary->merged;
1526
1527                 for (v = primary->previous; v; v=v->previous)
1528                         if (v == secondary || v == secondary->merged ||
1529                             v->merged == secondary ||
1530                             v->merged == secondary->merged) {
1531                                 v->scope = OutScope;
1532                                 v->merged = primary;
1533                                 if (v->scope_start < primary->scope_start)
1534                                         primary->scope_start = v->scope_start;
1535                                 if (v->scope_end > primary->scope_end)
1536                                         primary->scope_end = v->scope_end;      // NOTEST
1537                                 variable_unlink_exec(v);
1538                         }
1539         }
1540
1541 ###### forward decls
1542         static struct value *var_value(struct parse_context *c, struct variable *v);
1543
1544 ###### free global vars
1545
1546         while (context.varlist) {
1547                 struct binding *b = context.varlist;
1548                 struct variable *v = b->var;
1549                 context.varlist = b->next;
1550                 free(b);
1551                 while (v) {
1552                         struct variable *next = v->previous;
1553
1554                         if (v->global) {
1555                                 free_value(v->type, var_value(&context, v));
1556                                 if (v->depth == 0)
1557                                         // This is a global constant
1558                                         free_exec(v->where_decl);
1559                         }
1560                         free(v);
1561                         v = next;
1562                 }
1563         }
1564
1565 #### Manipulating Bindings
1566
1567 When a name is conditionally visible, a new declaration discards the old
1568 binding - the condition lapses.  Similarly when we reach the end of a
1569 function (outermost non-global scope) any conditional scope must lapse.
1570 Conversely a usage of the name affirms the visibility and extends it to
1571 the end of the containing block - i.e.  the block that contains both the
1572 original declaration and the latest usage.  This is determined from
1573 `min_depth`.  When a conditionally visible variable gets affirmed like
1574 this, it is also merged with other conditionally visible variables with
1575 the same name.
1576
1577 When we parse a variable declaration we either report an error if the
1578 name is currently bound, or create a new variable at the current nest
1579 depth if the name is unbound or bound to a conditionally scoped or
1580 pending-scope variable.  If the previous variable was conditionally
1581 scoped, it and its homonyms becomes out-of-scope.
1582
1583 When we parse a variable reference (including non-declarative assignment
1584 "foo = bar") we report an error if the name is not bound or is bound to
1585 a pending-scope variable; update the scope if the name is bound to a
1586 conditionally scoped variable; or just proceed normally if the named
1587 variable is in scope.
1588
1589 When we exit a scope, any variables bound at this level are either
1590 marked out of scope or pending-scoped, depending on whether the scope
1591 was sequential or parallel.  Here a "parallel" scope means the "then"
1592 or "else" part of a conditional, or any "case" or "else" branch of a
1593 switch.  Other scopes are "sequential".
1594
1595 When exiting a parallel scope we check if there are any variables that
1596 were previously pending and are still visible. If there are, then
1597 they weren't redeclared in the most recent scope, so they cannot be
1598 merged and must become out-of-scope.  If it is not the first of
1599 parallel scopes (based on `child_count`), we check that there was a
1600 previous binding that is still pending-scope.  If there isn't, the new
1601 variable must now be out-of-scope.
1602
1603 When exiting a sequential scope that immediately enclosed parallel
1604 scopes, we need to resolve any pending-scope variables.  If there was
1605 no `else` clause, and we cannot determine that the `switch` was exhaustive,
1606 we need to mark all pending-scope variable as out-of-scope.  Otherwise
1607 all pending-scope variables become conditionally scoped.
1608
1609 ###### ast
1610         enum closetype { CloseSequential, CloseFunction, CloseParallel, CloseElse };
1611
1612 ###### ast functions
1613
1614         static struct variable *var_decl(struct parse_context *c, struct text s)
1615         {
1616                 struct binding *b = find_binding(c, s);
1617                 struct variable *v = b->var;
1618
1619                 switch (v ? v->scope : OutScope) {
1620                 case InScope:
1621                         /* Caller will report the error */
1622                         return NULL;
1623                 case CondScope:
1624                         for (;
1625                              v && v->scope == CondScope;
1626                              v = v->previous)
1627                                 v->scope = OutScope;
1628                         break;
1629                 default: break;
1630                 }
1631                 v = calloc(1, sizeof(*v));
1632                 v->previous = b->var;
1633                 b->var = v;
1634                 v->name = b;
1635                 v->merged = v;
1636                 v->min_depth = v->depth = c->scope_depth;
1637                 v->scope = InScope;
1638                 v->in_scope = c->in_scope;
1639                 v->scope_start = c->scope_count;
1640                 c->in_scope = v;
1641                 ## variable init
1642                 return v;
1643         }
1644
1645         static struct variable *var_ref(struct parse_context *c, struct text s)
1646         {
1647                 struct binding *b = find_binding(c, s);
1648                 struct variable *v = b->var;
1649                 struct variable *v2;
1650
1651                 switch (v ? v->scope : OutScope) {
1652                 case OutScope:
1653                 case PendingScope:
1654                         /* Caller will report the error */
1655                         return NULL;
1656                 case CondScope:
1657                         /* All CondScope variables of this name need to be merged
1658                          * and become InScope
1659                          */
1660                         v->depth = v->min_depth;
1661                         v->scope = InScope;
1662                         for (v2 = v->previous;
1663                              v2 && v2->scope == CondScope;
1664                              v2 = v2->previous)
1665                                 variable_merge(v, v2);
1666                         break;
1667                 case InScope:
1668                         break;
1669                 }
1670                 return v;
1671         }
1672
1673         static int var_refile(struct parse_context *c, struct variable *v)
1674         {
1675                 /* Variable just went out of scope.  Add it to the out_scope
1676                  * list, sorted by ->scope_start
1677                  */
1678                 struct variable **vp = &c->out_scope;
1679                 while ((*vp) && (*vp)->scope_start < v->scope_start)
1680                         vp = &(*vp)->in_scope;
1681                 v->in_scope = *vp;
1682                 *vp = v;
1683                 return 0;
1684         }
1685
1686         static void var_block_close(struct parse_context *c, enum closetype ct,
1687                                     struct exec *e)
1688         {
1689                 /* Close off all variables that are in_scope.
1690                  * Some variables in c->scope may already be not-in-scope,
1691                  * such as when a PendingScope variable is hidden by a new
1692                  * variable with the same name.
1693                  * So we check for v->name->var != v and drop them.
1694                  * If we choose to make a variable OutScope, we drop it
1695                  * immediately too.
1696                  */
1697                 struct variable *v, **vp, *v2;
1698
1699                 scope_pop(c);
1700                 for (vp = &c->in_scope;
1701                      (v = *vp) && v->min_depth > c->scope_depth;
1702                      (v->scope == OutScope || v->name->var != v)
1703                      ? (*vp =  v->in_scope, var_refile(c, v))
1704                      : ( vp = &v->in_scope, 0)) {
1705                         v->min_depth = c->scope_depth;
1706                         if (v->name->var != v)
1707                                 /* This is still in scope, but we haven't just
1708                                  * closed the scope.
1709                                  */
1710                                 continue;
1711                         v->min_depth = c->scope_depth;
1712                         if (v->scope == InScope)
1713                                 v->scope_end = c->scope_count;
1714                         if (v->scope == InScope && e && !v->global) {
1715                                 /* This variable gets cleaned up when 'e' finishes */
1716                                 variable_unlink_exec(v);
1717                                 v->cleanup_exec = e;
1718                                 v->next_free = e->to_free;
1719                                 e->to_free = v;
1720                         }
1721                         switch (ct) {
1722                         case CloseElse:
1723                         case CloseParallel: /* handle PendingScope */
1724                                 switch(v->scope) {
1725                                 case InScope:
1726                                 case CondScope:
1727                                         if (c->scope_stack->child_count == 1)
1728                                                 /* first among parallel branches */
1729                                                 v->scope = PendingScope;
1730                                         else if (v->previous &&
1731                                                  v->previous->scope == PendingScope)
1732                                                 /* all previous branches used name */
1733                                                 v->scope = PendingScope;
1734                                         else if (v->type == Tlabel)
1735                                                 /* Labels remain pending even when not used */
1736                                                 v->scope = PendingScope;        // UNTESTED
1737                                         else
1738                                                 v->scope = OutScope;
1739                                         if (ct == CloseElse) {
1740                                                 /* All Pending variables with this name
1741                                                  * are now Conditional */
1742                                                 for (v2 = v;
1743                                                      v2 && v2->scope == PendingScope;
1744                                                      v2 = v2->previous)
1745                                                         v2->scope = CondScope;
1746                                         }
1747                                         break;
1748                                 case PendingScope:
1749                                         /* Not possible as it would require
1750                                          * parallel scope to be nested immediately
1751                                          * in a parallel scope, and that never
1752                                          * happens.
1753                                          */                     // NOTEST
1754                                 case OutScope:
1755                                         /* Not possible as we already tested for
1756                                          * OutScope
1757                                          */
1758                                         abort();                // NOTEST
1759                                 }
1760                                 break;
1761                         case CloseFunction:
1762                                 if (v->scope == CondScope)
1763                                         /* Condition cannot continue past end of function */
1764                                         v->scope = InScope;
1765                                 /* fallthrough */
1766                         case CloseSequential:
1767                                 if (v->type == Tlabel)
1768                                         v->scope = PendingScope;
1769                                 switch (v->scope) {
1770                                 case InScope:
1771                                         v->scope = OutScope;
1772                                         break;
1773                                 case PendingScope:
1774                                         /* There was no 'else', so we can only become
1775                                          * conditional if we know the cases were exhaustive,
1776                                          * and that doesn't mean anything yet.
1777                                          * So only labels become conditional..
1778                                          */
1779                                         for (v2 = v;
1780                                              v2 && v2->scope == PendingScope;
1781                                              v2 = v2->previous)
1782                                                 if (v2->type == Tlabel)
1783                                                         v2->scope = CondScope;
1784                                                 else
1785                                                         v2->scope = OutScope;
1786                                         break;
1787                                 case CondScope:
1788                                 case OutScope: break;
1789                                 }
1790                                 break;
1791                         }
1792                 }
1793         }
1794
1795 #### Storing Values
1796
1797 The value of a variable is store separately from the variable, on an
1798 analogue of a stack frame.  There are (currently) two frames that can be
1799 active.  A global frame which currently only stores constants, and a
1800 stacked frame which stores local variables.  Each variable knows if it
1801 is global or not, and what its index into the frame is.
1802
1803 Values in the global frame are known immediately they are relevant, so
1804 the frame needs to be reallocated as it grows so it can store those
1805 values.  The local frame doesn't get values until the interpreted phase
1806 is started, so there is no need to allocate until the size is known.
1807
1808 We initialize the `frame_pos` to an impossible value, so that we can
1809 tell if it was set or not later.
1810
1811 ###### variable fields
1812         short frame_pos;
1813         short global;
1814
1815 ###### variable init
1816         v->frame_pos = -1;
1817
1818 ###### parse context
1819
1820         short global_size, global_alloc;
1821         short local_size;
1822         void *global, *local;
1823
1824 ###### ast functions
1825
1826         static struct value *var_value(struct parse_context *c, struct variable *v)
1827         {
1828                 if (!v->global) {
1829                         if (!c->local || !v->type)
1830                                 return NULL;                    // NOTEST
1831                         if (v->frame_pos + v->type->size > c->local_size) {
1832                                 printf("INVALID frame_pos\n");  // NOTEST
1833                                 exit(2);                        // NOTEST
1834                         }
1835                         return c->local + v->frame_pos;
1836                 }
1837                 if (c->global_size > c->global_alloc) {
1838                         int old = c->global_alloc;
1839                         c->global_alloc = (c->global_size | 1023) + 1024;
1840                         c->global = realloc(c->global, c->global_alloc);
1841                         memset(c->global + old, 0, c->global_alloc - old);
1842                 }
1843                 return c->global + v->frame_pos;
1844         }
1845
1846         static struct value *global_alloc(struct parse_context *c, struct type *t,
1847                                           struct variable *v, struct value *init)
1848         {
1849                 struct value *ret;
1850                 struct variable scratch;
1851
1852                 if (t->prepare_type)
1853                         t->prepare_type(c, t, 1);       // NOTEST
1854
1855                 if (c->global_size & (t->align - 1))
1856                         c->global_size = (c->global_size + t->align) & ~(t->align-1);
1857                 if (!v) {
1858                         v = &scratch;
1859                         v->type = t;
1860                 }
1861                 v->frame_pos = c->global_size;
1862                 v->global = 1;
1863                 c->global_size += v->type->size;
1864                 ret = var_value(c, v);
1865                 if (init)
1866                         memcpy(ret, init, t->size);
1867                 else
1868                         val_init(t, ret);
1869                 return ret;
1870         }
1871
1872 As global values are found -- struct field initializers, labels etc --
1873 `global_alloc()` is called to record the value in the global frame.
1874
1875 When the program is fully parsed, each function is analysed, we need to
1876 walk the list of variables local to that function and assign them an
1877 offset in the stack frame.  For this we have `scope_finalize()`.
1878
1879 We keep the stack from dense by re-using space for between variables
1880 that are not in scope at the same time.  The `out_scope` list is sorted
1881 by `scope_start` and as we process a varible, we move it to an FIFO
1882 stack.  For each variable we consider, we first discard any from the
1883 stack anything that went out of scope before the new variable came in.
1884 Then we place the new variable just after the one at the top of the
1885 stack.
1886
1887 ###### ast functions
1888
1889         static void scope_finalize(struct parse_context *c, struct type *ft)
1890         {
1891                 int size = ft->function.local_size;
1892                 struct variable *next = ft->function.scope;
1893                 struct variable *done = NULL;
1894
1895                 while (next) {
1896                         struct variable *v = next;
1897                         struct type *t = v->type;
1898                         int pos;
1899                         next = v->in_scope;
1900                         if (v->merged != v)
1901                                 continue;
1902                         if (!t)
1903                                 continue;
1904                         if (v->frame_pos >= 0)
1905                                 continue;
1906                         while (done && done->scope_end < v->scope_start)
1907                                 done = done->in_scope;
1908                         if (done)
1909                                 pos = done->frame_pos + done->type->size;
1910                         else
1911                                 pos = ft->function.local_size;
1912                         if (pos & (t->align - 1))
1913                                 pos = (pos + t->align) & ~(t->align-1);
1914                         v->frame_pos = pos;
1915                         if (size < pos + v->type->size)
1916                                 size = pos + v->type->size;
1917                         v->in_scope = done;
1918                         done = v;
1919                 }
1920                 c->out_scope = NULL;
1921                 ft->function.local_size = size;
1922         }
1923
1924 ###### free context storage
1925         free(context.global);
1926
1927 #### Variables as executables
1928
1929 Just as we used a `val` to wrap a value into an `exec`, we similarly
1930 need a `var` to wrap a `variable` into an exec.  While each `val`
1931 contained a copy of the value, each `var` holds a link to the variable
1932 because it really is the same variable no matter where it appears.
1933 When a variable is used, we need to remember to follow the `->merged`
1934 link to find the primary instance.
1935
1936 When a variable is declared, it may or may not be given an explicit
1937 type.  We need to record which so that we can report the parsed code
1938 correctly.
1939
1940 ###### exec type
1941         Xvar,
1942
1943 ###### ast
1944         struct var {
1945                 struct exec;
1946                 struct variable *var;
1947         };
1948
1949 ###### variable fields
1950         int explicit_type;
1951
1952 ###### Grammar
1953
1954         $TERM : ::
1955
1956         $*var
1957         VariableDecl -> IDENTIFIER : ${ {
1958                 struct variable *v = var_decl(c, $1.txt);
1959                 $0 = new_pos(var, $1);
1960                 $0->var = v;
1961                 if (v)
1962                         v->where_decl = $0;
1963                 else {
1964                         v = var_ref(c, $1.txt);
1965                         $0->var = v;
1966                         type_err(c, "error: variable '%v' redeclared",
1967                                  $0, NULL, 0, NULL);
1968                         type_err(c, "info: this is where '%v' was first declared",
1969                                  v->where_decl, NULL, 0, NULL);
1970                 }
1971         } }$
1972             | IDENTIFIER :: ${ {
1973                 struct variable *v = var_decl(c, $1.txt);
1974                 $0 = new_pos(var, $1);
1975                 $0->var = v;
1976                 if (v) {
1977                         v->where_decl = $0;
1978                         v->constant = 1;
1979                 } else {
1980                         v = var_ref(c, $1.txt);
1981                         $0->var = v;
1982                         type_err(c, "error: variable '%v' redeclared",
1983                                  $0, NULL, 0, NULL);
1984                         type_err(c, "info: this is where '%v' was first declared",
1985                                  v->where_decl, NULL, 0, NULL);
1986                 }
1987         } }$
1988             | IDENTIFIER : Type ${ {
1989                 struct variable *v = var_decl(c, $1.txt);
1990                 $0 = new_pos(var, $1);
1991                 $0->var = v;
1992                 if (v) {
1993                         v->where_decl = $0;
1994                         v->where_set = $0;
1995                         v->type = $<Type;
1996                         v->explicit_type = 1;
1997                 } else {
1998                         v = var_ref(c, $1.txt);
1999                         $0->var = v;
2000                         type_err(c, "error: variable '%v' redeclared",
2001                                  $0, NULL, 0, NULL);
2002                         type_err(c, "info: this is where '%v' was first declared",
2003                                  v->where_decl, NULL, 0, NULL);
2004                 }
2005         } }$
2006             | IDENTIFIER :: Type ${ {
2007                 struct variable *v = var_decl(c, $1.txt);
2008                 $0 = new_pos(var, $1);
2009                 $0->var = v;
2010                 if (v) {
2011                         v->where_decl = $0;
2012                         v->where_set = $0;
2013                         v->type = $<Type;
2014                         v->constant = 1;
2015                         v->explicit_type = 1;
2016                 } else {
2017                         v = var_ref(c, $1.txt);
2018                         $0->var = v;
2019                         type_err(c, "error: variable '%v' redeclared",
2020                                  $0, NULL, 0, NULL);
2021                         type_err(c, "info: this is where '%v' was first declared",
2022                                  v->where_decl, NULL, 0, NULL);
2023                 }
2024         } }$
2025
2026         $*exec
2027         Variable -> IDENTIFIER ${ {
2028                 struct variable *v = var_ref(c, $1.txt);
2029                 $0 = new_pos(var, $1);
2030                 if (v == NULL) {
2031                         /* This might be a label - allocate a var just in case */
2032                         v = var_decl(c, $1.txt);
2033                         if (v) {
2034                                 v->type = Tnone;
2035                                 v->where_decl = $0;
2036                                 v->where_set = $0;
2037                         }
2038                 }
2039                 cast(var, $0)->var = v;
2040                 } }$
2041
2042 ###### print exec cases
2043         case Xvar:
2044         {
2045                 struct var *v = cast(var, e);
2046                 if (v->var) {
2047                         struct binding *b = v->var->name;
2048                         printf("%.*s", b->name.len, b->name.txt);
2049                 }
2050                 break;
2051         }
2052
2053 ###### format cases
2054         case 'v':
2055                 if (loc && loc->type == Xvar) {
2056                         struct var *v = cast(var, loc);
2057                         if (v->var) {
2058                                 struct binding *b = v->var->name;
2059                                 fprintf(stderr, "%.*s", b->name.len, b->name.txt);
2060                         } else
2061                                 fputs("???", stderr);   // NOTEST
2062                 } else
2063                         fputs("NOTVAR", stderr);
2064                 break;
2065
2066 ###### propagate exec cases
2067
2068         case Xvar:
2069         {
2070                 struct var *var = cast(var, prog);
2071                 struct variable *v = var->var;
2072                 if (!v) {
2073                         type_err(c, "%d:BUG: no variable!!", prog, NULL, 0, NULL); // NOTEST
2074                         return Tnone;                                   // NOTEST
2075                 }
2076                 v = v->merged;
2077                 if (v->constant && (rules & Rnoconstant)) {
2078                         type_err(c, "error: Cannot assign to a constant: %v",
2079                                  prog, NULL, 0, NULL);
2080                         type_err(c, "info: name was defined as a constant here",
2081                                  v->where_decl, NULL, 0, NULL);
2082                         return v->type;
2083                 }
2084                 if (v->type == Tnone && v->where_decl == prog)
2085                         type_err(c, "error: variable used but not declared: %v",
2086                                  prog, NULL, 0, NULL);
2087                 if (v->type == NULL) {
2088                         if (type && *ok != 0) {
2089                                 v->type = type;
2090                                 v->where_set = prog;
2091                                 *ok = 2;
2092                         }
2093                         return type;
2094                 }
2095                 if (!type_compat(type, v->type, rules)) {
2096                         type_err(c, "error: expected %1%r but variable '%v' is %2", prog,
2097                                  type, rules, v->type);
2098                         type_err(c, "info: this is where '%v' was set to %1", v->where_set,
2099                                  v->type, rules, NULL);
2100                 }
2101                 if (!type)
2102                         return v->type;
2103                 return type;
2104         }
2105
2106 ###### interp exec cases
2107         case Xvar:
2108         {
2109                 struct var *var = cast(var, e);
2110                 struct variable *v = var->var;
2111
2112                 v = v->merged;
2113                 lrv = var_value(c, v);
2114                 rvtype = v->type;
2115                 break;
2116         }
2117
2118 ###### ast functions
2119
2120         static void free_var(struct var *v)
2121         {
2122                 free(v);
2123         }
2124
2125 ###### free exec cases
2126         case Xvar: free_var(cast(var, e)); break;
2127
2128
2129 ### Complex types
2130
2131 Now that we have the shape of the interpreter in place we can add some
2132 complex types and connected them in to the data structures and the
2133 different phases of parse, analyse, print, interpret.
2134
2135 Being "complex" the language will naturally have syntax to access
2136 specifics of objects of these types.  These will fit into the grammar as
2137 "Terms" which are the things that are combined with various operators to
2138 form "Expression".  Where a Term is formed by some operation on another
2139 Term, the subordinate Term will always come first, so for example a
2140 member of an array will be expressed as the Term for the array followed
2141 by an index in square brackets.  The strict rule of using postfix
2142 operations makes precedence irrelevant within terms.  To provide a place
2143 to put the grammar for each terms of each type, we will start out by
2144 introducing the "Term" grammar production, with contains at least a
2145 simple "Value" (to be explained later).
2146
2147 ###### Grammar
2148         $*exec
2149         Term ->  Value ${ $0 = $<1; }$
2150                 | Variable ${ $0 = $<1; }$
2151                 ## term grammar
2152
2153 Thus far the complex types we have are arrays and structs.
2154
2155 #### Arrays
2156
2157 Arrays can be declared by giving a size and a type, as `[size]type' so
2158 `freq:[26]number` declares `freq` to be an array of 26 numbers.  The
2159 size can be either a literal number, or a named constant.  Some day an
2160 arbitrary expression will be supported.
2161
2162 As a formal parameter to a function, the array can be declared with a
2163 new variable as the size: `name:[size::number]string`.  The `size`
2164 variable is set to the size of the array and must be a constant.  As
2165 `number` is the only supported type, it can be left out:
2166 `name:[size::]string`.
2167
2168 Arrays cannot be assigned.  When pointers are introduced we will also
2169 introduce array slices which can refer to part or all of an array -
2170 the assignment syntax will create a slice.  For now, an array can only
2171 ever be referenced by the name it is declared with.  It is likely that
2172 a "`copy`" primitive will eventually be define which can be used to
2173 make a copy of an array with controllable recursive depth.
2174
2175 For now we have two sorts of array, those with fixed size either because
2176 it is given as a literal number or because it is a struct member (which
2177 cannot have a runtime-changing size), and those with a size that is
2178 determined at runtime - local variables with a const size.  The former
2179 have their size calculated at parse time, the latter at run time.
2180
2181 For the latter type, the `size` field of the type is the size of a
2182 pointer, and the array is reallocated every time it comes into scope.
2183
2184 We differentiate struct fields with a const size from local variables
2185 with a const size by whether they are prepared at parse time or not.
2186
2187 ###### type union fields
2188
2189         struct {
2190                 int unspec;     // size is unspecified - vsize must be set.
2191                 short size;
2192                 short static_size;
2193                 struct variable *vsize;
2194                 struct type *member;
2195         } array;
2196
2197 ###### value union fields
2198         void *array;  // used if not static_size
2199
2200 ###### value functions
2201
2202         static void array_prepare_type(struct parse_context *c, struct type *type,
2203                                        int parse_time)
2204         {
2205                 struct value *vsize;
2206                 mpz_t q;
2207                 if (!type->array.vsize || type->array.static_size)
2208                         return;
2209
2210                 vsize = var_value(c, type->array.vsize);
2211                 mpz_init(q);
2212                 mpz_tdiv_q(q, mpq_numref(vsize->num), mpq_denref(vsize->num));
2213                 type->array.size = mpz_get_si(q);
2214                 mpz_clear(q);
2215
2216                 if (parse_time) {
2217                         type->array.static_size = 1;
2218                         type->size = type->array.size * type->array.member->size;
2219                         type->align = type->array.member->align;
2220                 }
2221         }
2222
2223         static void array_init(struct type *type, struct value *val)
2224         {
2225                 int i;
2226                 void *ptr = val->ptr;
2227
2228                 if (!val)
2229                         return;                         // NOTEST
2230                 if (!type->array.static_size) {
2231                         val->array = calloc(type->array.size,
2232                                             type->array.member->size);
2233                         ptr = val->array;
2234                 }
2235                 for (i = 0; i < type->array.size; i++) {
2236                         struct value *v;
2237                         v = (void*)ptr + i * type->array.member->size;
2238                         val_init(type->array.member, v);
2239                 }
2240         }
2241
2242         static void array_free(struct type *type, struct value *val)
2243         {
2244                 int i;
2245                 void *ptr = val->ptr;
2246
2247                 if (!type->array.static_size)
2248                         ptr = val->array;
2249                 for (i = 0; i < type->array.size; i++) {
2250                         struct value *v;
2251                         v = (void*)ptr + i * type->array.member->size;
2252                         free_value(type->array.member, v);
2253                 }
2254                 if (!type->array.static_size)
2255                         free(ptr);
2256         }
2257
2258         static int array_compat(struct type *require, struct type *have)
2259         {
2260                 if (have->compat != require->compat)
2261                         return 0;
2262                 /* Both are arrays, so we can look at details */
2263                 if (!type_compat(require->array.member, have->array.member, 0))
2264                         return 0;
2265                 if (have->array.unspec && require->array.unspec) {
2266                         if (have->array.vsize && require->array.vsize &&
2267                             have->array.vsize != require->array.vsize)  // UNTESTED
2268                                 /* sizes might not be the same */
2269                                 return 0;       // UNTESTED
2270                         return 1;
2271                 }
2272                 if (have->array.unspec || require->array.unspec)
2273                         return 1;       // UNTESTED
2274                 if (require->array.vsize == NULL && have->array.vsize == NULL)
2275                         return require->array.size == have->array.size;
2276
2277                 return require->array.vsize == have->array.vsize;       // UNTESTED
2278         }
2279
2280         static void array_print_type(struct type *type, FILE *f)
2281         {
2282                 fputs("[", f);
2283                 if (type->array.vsize) {
2284                         struct binding *b = type->array.vsize->name;
2285                         fprintf(f, "%.*s%s]", b->name.len, b->name.txt,
2286                                 type->array.unspec ? "::" : "");
2287                 } else
2288                         fprintf(f, "%d]", type->array.size);
2289                 type_print(type->array.member, f);
2290         }
2291
2292         static struct type array_prototype = {
2293                 .init = array_init,
2294                 .prepare_type = array_prepare_type,
2295                 .print_type = array_print_type,
2296                 .compat = array_compat,
2297                 .free = array_free,
2298                 .size = sizeof(void*),
2299                 .align = sizeof(void*),
2300         };
2301
2302 ###### declare terminals
2303         $TERM [ ]
2304
2305 ###### type grammar
2306
2307         | [ NUMBER ] Type ${ {
2308                 char tail[3];
2309                 mpq_t num;
2310                 struct text noname = { "", 0 };
2311                 struct type *t;
2312
2313                 $0 = t = add_type(c, noname, &array_prototype);
2314                 t->array.member = $<4;
2315                 t->array.vsize = NULL;
2316                 if (number_parse(num, tail, $2.txt) == 0)
2317                         tok_err(c, "error: unrecognised number", &$2);
2318                 else if (tail[0]) {
2319                         tok_err(c, "error: unsupported number suffix", &$2);
2320                         mpq_clear(num);
2321                 } else {
2322                         t->array.size = mpz_get_ui(mpq_numref(num));
2323                         if (mpz_cmp_ui(mpq_denref(num), 1) != 0) {
2324                                 tok_err(c, "error: array size must be an integer",
2325                                         &$2);
2326                         } else if (mpz_cmp_ui(mpq_numref(num), 1UL << 30) >= 0)
2327                                 tok_err(c, "error: array size is too large",
2328                                         &$2);
2329                         mpq_clear(num);
2330                 }
2331                 t->array.static_size = 1;
2332                 t->size = t->array.size * t->array.member->size;
2333                 t->align = t->array.member->align;
2334         } }$
2335
2336         | [ IDENTIFIER ] Type ${ {
2337                 struct variable *v = var_ref(c, $2.txt);
2338                 struct text noname = { "", 0 };
2339
2340                 if (!v)
2341                         tok_err(c, "error: name undeclared", &$2);
2342                 else if (!v->constant)
2343                         tok_err(c, "error: array size must be a constant", &$2);
2344
2345                 $0 = add_type(c, noname, &array_prototype);
2346                 $0->array.member = $<4;
2347                 $0->array.size = 0;
2348                 $0->array.vsize = v;
2349         } }$
2350
2351 ###### Grammar
2352         $*type
2353         OptType -> Type ${ $0 = $<1; }$
2354                 | ${ $0 = NULL; }$
2355
2356 ###### formal type grammar
2357
2358         | [ IDENTIFIER :: OptType ] Type ${ {
2359                 struct variable *v = var_decl(c, $ID.txt);
2360                 struct text noname = { "", 0 };
2361
2362                 v->type = $<OT;
2363                 v->constant = 1;
2364                 if (!v->type)
2365                         v->type = Tnum;
2366                 $0 = add_type(c, noname, &array_prototype);
2367                 $0->array.member = $<6;
2368                 $0->array.size = 0;
2369                 $0->array.unspec = 1;
2370                 $0->array.vsize = v;
2371         } }$
2372
2373 ###### Binode types
2374         Index,
2375
2376 ###### term grammar
2377
2378         | Term [ Expression ] ${ {
2379                 struct binode *b = new(binode);
2380                 b->op = Index;
2381                 b->left = $<1;
2382                 b->right = $<3;
2383                 $0 = b;
2384         } }$
2385
2386 ###### print binode cases
2387         case Index:
2388                 print_exec(b->left, -1, bracket);
2389                 printf("[");
2390                 print_exec(b->right, -1, bracket);
2391                 printf("]");
2392                 break;
2393
2394 ###### propagate binode cases
2395         case Index:
2396                 /* left must be an array, right must be a number,
2397                  * result is the member type of the array
2398                  */
2399                 propagate_types(b->right, c, ok, Tnum, 0);
2400                 t = propagate_types(b->left, c, ok, NULL, rules & Rnoconstant);
2401                 if (!t || t->compat != array_compat) {
2402                         type_err(c, "error: %1 cannot be indexed", prog, t, 0, NULL);
2403                         return NULL;
2404                 } else {
2405                         if (!type_compat(type, t->array.member, rules)) {
2406                                 type_err(c, "error: have %1 but need %2", prog,
2407                                          t->array.member, rules, type);
2408                         }
2409                         return t->array.member;
2410                 }
2411                 break;
2412
2413 ###### interp binode cases
2414         case Index: {
2415                 mpz_t q;
2416                 long i;
2417                 void *ptr;
2418
2419                 lleft = linterp_exec(c, b->left, &ltype);
2420                 right = interp_exec(c, b->right, &rtype);
2421                 mpz_init(q);
2422                 mpz_tdiv_q(q, mpq_numref(right.num), mpq_denref(right.num));
2423                 i = mpz_get_si(q);
2424                 mpz_clear(q);
2425
2426                 if (ltype->array.static_size)
2427                         ptr = lleft;
2428                 else
2429                         ptr = *(void**)lleft;
2430                 rvtype = ltype->array.member;
2431                 if (i >= 0 && i < ltype->array.size)
2432                         lrv = ptr + i * rvtype->size;
2433                 else
2434                         val_init(ltype->array.member, &rv); // UNSAFE
2435                 ltype = NULL;
2436                 break;
2437         }
2438
2439 #### Structs
2440
2441 A `struct` is a data-type that contains one or more other data-types.
2442 It differs from an array in that each member can be of a different
2443 type, and they are accessed by name rather than by number.  Thus you
2444 cannot choose an element by calculation, you need to know what you
2445 want up-front.
2446
2447 The language makes no promises about how a given structure will be
2448 stored in memory - it is free to rearrange fields to suit whatever
2449 criteria seems important.
2450
2451 Structs are declared separately from program code - they cannot be
2452 declared in-line in a variable declaration like arrays can.  A struct
2453 is given a name and this name is used to identify the type - the name
2454 is not prefixed by the word `struct` as it would be in C.
2455
2456 Structs are only treated as the same if they have the same name.
2457 Simply having the same fields in the same order is not enough.  This
2458 might change once we can create structure initializers from a list of
2459 values.
2460
2461 Each component datum is identified much like a variable is declared,
2462 with a name, one or two colons, and a type.  The type cannot be omitted
2463 as there is no opportunity to deduce the type from usage.  An initial
2464 value can be given following an equals sign, so
2465
2466 ##### Example: a struct type
2467
2468         struct complex:
2469                 x:number = 0
2470                 y:number = 0
2471
2472 would declare a type called "complex" which has two number fields,
2473 each initialised to zero.
2474
2475 Struct will need to be declared separately from the code that uses
2476 them, so we will need to be able to print out the declaration of a
2477 struct when reprinting the whole program.  So a `print_type_decl` type
2478 function will be needed.
2479
2480 ###### type union fields
2481
2482         struct {
2483                 int nfields;
2484                 struct field {
2485                         struct text name;
2486                         struct type *type;
2487                         struct value *init;
2488                         int offset;
2489                 } *fields;
2490         } structure;
2491
2492 ###### type functions
2493         void (*print_type_decl)(struct type *type, FILE *f);
2494
2495 ###### value functions
2496
2497         static void structure_init(struct type *type, struct value *val)
2498         {
2499                 int i;
2500
2501                 for (i = 0; i < type->structure.nfields; i++) {
2502                         struct value *v;
2503                         v = (void*) val->ptr + type->structure.fields[i].offset;
2504                         if (type->structure.fields[i].init)
2505                                 dup_value(type->structure.fields[i].type,
2506                                           type->structure.fields[i].init,
2507                                           v);
2508                         else
2509                                 val_init(type->structure.fields[i].type, v);
2510                 }
2511         }
2512
2513         static void structure_free(struct type *type, struct value *val)
2514         {
2515                 int i;
2516
2517                 for (i = 0; i < type->structure.nfields; i++) {
2518                         struct value *v;
2519                         v = (void*)val->ptr + type->structure.fields[i].offset;
2520                         free_value(type->structure.fields[i].type, v);
2521                 }
2522         }
2523
2524         static void structure_free_type(struct type *t)
2525         {
2526                 int i;
2527                 for (i = 0; i < t->structure.nfields; i++)
2528                         if (t->structure.fields[i].init) {
2529                                 free_value(t->structure.fields[i].type,
2530                                            t->structure.fields[i].init);
2531                         }
2532                 free(t->structure.fields);
2533         }
2534
2535         static struct type structure_prototype = {
2536                 .init = structure_init,
2537                 .free = structure_free,
2538                 .free_type = structure_free_type,
2539                 .print_type_decl = structure_print_type,
2540         };
2541
2542 ###### exec type
2543         Xfieldref,
2544
2545 ###### ast
2546         struct fieldref {
2547                 struct exec;
2548                 struct exec *left;
2549                 int index;
2550                 struct text name;
2551         };
2552
2553 ###### free exec cases
2554         case Xfieldref:
2555                 free_exec(cast(fieldref, e)->left);
2556                 free(e);
2557                 break;
2558
2559 ###### declare terminals
2560         $TERM struct .
2561
2562 ###### term grammar
2563
2564         | Term . IDENTIFIER ${ {
2565                 struct fieldref *fr = new_pos(fieldref, $2);
2566                 fr->left = $<1;
2567                 fr->name = $3.txt;
2568                 fr->index = -2;
2569                 $0 = fr;
2570         } }$
2571
2572 ###### print exec cases
2573
2574         case Xfieldref:
2575         {
2576                 struct fieldref *f = cast(fieldref, e);
2577                 print_exec(f->left, -1, bracket);
2578                 printf(".%.*s", f->name.len, f->name.txt);
2579                 break;
2580         }
2581
2582 ###### ast functions
2583         static int find_struct_index(struct type *type, struct text field)
2584         {
2585                 int i;
2586                 for (i = 0; i < type->structure.nfields; i++)
2587                         if (text_cmp(type->structure.fields[i].name, field) == 0)
2588                                 return i;
2589                 return -1;
2590         }
2591
2592 ###### propagate exec cases
2593
2594         case Xfieldref:
2595         {
2596                 struct fieldref *f = cast(fieldref, prog);
2597                 struct type *st = propagate_types(f->left, c, ok, NULL, 0);
2598
2599                 if (!st)
2600                         type_err(c, "error: unknown type for field access", f->left,    // UNTESTED
2601                                  NULL, 0, NULL);
2602                 else if (st->init != structure_init)
2603                         type_err(c, "error: field reference attempted on %1, not a struct",
2604                                  f->left, st, 0, NULL);
2605                 else if (f->index == -2) {
2606                         f->index = find_struct_index(st, f->name);
2607                         if (f->index < 0)
2608                                 type_err(c, "error: cannot find requested field in %1",
2609                                          f->left, st, 0, NULL);
2610                 }
2611                 if (f->index >= 0) {
2612                         struct type *ft = st->structure.fields[f->index].type;
2613                         if (!type_compat(type, ft, rules))
2614                                 type_err(c, "error: have %1 but need %2", prog,
2615                                          ft, rules, type);
2616                         return ft;
2617                 }
2618                 break;
2619         }
2620
2621 ###### interp exec cases
2622         case Xfieldref:
2623         {
2624                 struct fieldref *f = cast(fieldref, e);
2625                 struct type *ltype;
2626                 struct value *lleft = linterp_exec(c, f->left, &ltype);
2627                 lrv = (void*)lleft->ptr + ltype->structure.fields[f->index].offset;
2628                 rvtype = ltype->structure.fields[f->index].type;
2629                 break;
2630         }
2631
2632 ###### ast
2633         struct fieldlist {
2634                 struct fieldlist *prev;
2635                 struct field f;
2636         };
2637
2638 ###### ast functions
2639         static void free_fieldlist(struct fieldlist *f)
2640         {
2641                 if (!f)
2642                         return;
2643                 free_fieldlist(f->prev);
2644                 if (f->f.init) {
2645                         free_value(f->f.type, f->f.init);       // UNTESTED
2646                         free(f->f.init);        // UNTESTED
2647                 }
2648                 free(f);
2649         }
2650
2651 ###### top level grammar
2652         DeclareStruct -> struct IDENTIFIER FieldBlock Newlines ${ {
2653                         struct type *t =
2654                                 add_type(c, $2.txt, &structure_prototype);
2655                         int cnt = 0;
2656                         struct fieldlist *f;
2657
2658                         for (f = $3; f; f=f->prev)
2659                                 cnt += 1;
2660
2661                         t->structure.nfields = cnt;
2662                         t->structure.fields = calloc(cnt, sizeof(struct field));
2663                         f = $3;
2664                         while (cnt > 0) {
2665                                 int a = f->f.type->align;
2666                                 cnt -= 1;
2667                                 t->structure.fields[cnt] = f->f;
2668                                 if (t->size & (a-1))
2669                                         t->size = (t->size | (a-1)) + 1;
2670                                 t->structure.fields[cnt].offset = t->size;
2671                                 t->size += ((f->f.type->size - 1) | (a-1)) + 1;
2672                                 if (a > t->align)
2673                                         t->align = a;
2674                                 f->f.init = NULL;
2675                                 f = f->prev;
2676                         }
2677                 } }$
2678
2679         $*fieldlist
2680         FieldBlock -> { IN OptNL FieldLines OUT OptNL } ${ $0 = $<FL; }$
2681                 | { SimpleFieldList } ${ $0 = $<SFL; }$
2682                 | IN OptNL FieldLines OUT ${ $0 = $<FL; }$
2683                 | SimpleFieldList EOL ${ $0 = $<SFL; }$
2684
2685         FieldLines -> SimpleFieldList Newlines ${ $0 = $<SFL; }$
2686                 | FieldLines SimpleFieldList Newlines ${
2687                         $SFL->prev = $<FL;
2688                         $0 = $<SFL;
2689                 }$
2690
2691         SimpleFieldList -> Field ${ $0 = $<F; }$
2692                 | SimpleFieldList ; Field ${
2693                         $F->prev = $<SFL;
2694                         $0 = $<F;
2695                 }$
2696                 | SimpleFieldList ; ${
2697                         $0 = $<SFL;
2698                 }$
2699                 | ERROR ${ tok_err(c, "Syntax error in struct field", &$1); }$
2700
2701         Field -> IDENTIFIER : Type = Expression ${ {
2702                         int ok;
2703
2704                         $0 = calloc(1, sizeof(struct fieldlist));
2705                         $0->f.name = $1.txt;
2706                         $0->f.type = $<3;
2707                         $0->f.init = NULL;
2708                         do {
2709                                 ok = 1;
2710                                 propagate_types($<5, c, &ok, $3, 0);
2711                         } while (ok == 2);
2712                         if (!ok)
2713                                 c->parse_error = 1;     // UNTESTED
2714                         else {
2715                                 struct value vl = interp_exec(c, $5, NULL);
2716                                 $0->f.init = global_alloc(c, $0->f.type, NULL, &vl);
2717                         }
2718                 } }$
2719                 | IDENTIFIER : Type ${
2720                         $0 = calloc(1, sizeof(struct fieldlist));
2721                         $0->f.name = $1.txt;
2722                         $0->f.type = $<3;
2723                         if ($0->f.type->prepare_type)
2724                                 $0->f.type->prepare_type(c, $0->f.type, 1);
2725                 }$
2726
2727 ###### forward decls
2728         static void structure_print_type(struct type *t, FILE *f);
2729
2730 ###### value functions
2731         static void structure_print_type(struct type *t, FILE *f)
2732         {
2733                 int i;
2734
2735                 fprintf(f, "struct %.*s\n", t->name.len, t->name.txt);
2736
2737                 for (i = 0; i < t->structure.nfields; i++) {
2738                         struct field *fl = t->structure.fields + i;
2739                         fprintf(f, "    %.*s : ", fl->name.len, fl->name.txt);
2740                         type_print(fl->type, f);
2741                         if (fl->type->print && fl->init) {
2742                                 fprintf(f, " = ");
2743                                 if (fl->type == Tstr)
2744                                         fprintf(f, "\"");       // UNTESTED
2745                                 print_value(fl->type, fl->init, f);
2746                                 if (fl->type == Tstr)
2747                                         fprintf(f, "\"");       // UNTESTED
2748                         }
2749                         fprintf(f, "\n");
2750                 }
2751         }
2752
2753 ###### print type decls
2754         {
2755                 struct type *t;
2756                 int target = -1;
2757
2758                 while (target != 0) {
2759                         int i = 0;
2760                         for (t = context.typelist; t ; t=t->next)
2761                                 if (t->print_type_decl && !t->check_args && t->name.txt[0] != ' ') {
2762                                         i += 1;
2763                                         if (i == target)
2764                                                 break;
2765                                 }
2766
2767                         if (target == -1) {
2768                                 target = i;
2769                         } else {
2770                                 t->print_type_decl(t, stdout);
2771                                 target -= 1;
2772                         }
2773                 }
2774         }
2775
2776 #### Functions
2777
2778 A function is a chunk of code which can be passed parameters and can
2779 return results.  Each function has a type which includes the set of
2780 parameters and the return value.  As yet these types cannot be declared
2781 separately from the function itself.
2782
2783 The parameters can be specified either in parentheses as a ';' separated
2784 list, such as
2785
2786 ##### Example: function 1
2787
2788         func main(av:[ac::number]string; env:[envc::number]string)
2789                 code block
2790
2791 or as an indented list of one parameter per line (though each line can
2792 be a ';' separated list)
2793
2794 ##### Example: function 2
2795
2796         func main
2797                 argv:[argc::number]string
2798                 env:[envc::number]string
2799         do
2800                 code block
2801
2802 In the first case a return type can follow the parentheses after a colon,
2803 in the second it is given on a line starting with the word `return`.
2804
2805 ##### Example: functions that return
2806
2807         func add(a:number; b:number): number
2808                 code block
2809
2810         func catenate
2811                 a: string
2812                 b: string
2813         return string
2814         do
2815                 code block
2816
2817 Rather than returning a type, the function can specify a set of local
2818 variables to return as a struct.  The values of these variables when the
2819 function exits will be provided to the caller.  For this the return type
2820 is replaced with a block of result declarations, either in parentheses
2821 or bracketed by `return` and `do`.
2822
2823 ##### Example: functions returning multiple variables
2824
2825         func to_cartesian(rho:number; theta:number):(x:number; y:number)
2826                 x = .....
2827                 y = .....
2828
2829         func to_polar
2830                 x:number; y:number
2831         return
2832                 rho:number
2833                 theta:number
2834         do
2835                 rho = ....
2836                 theta = ....
2837
2838 For constructing the lists we use a `List` binode, which will be
2839 further detailed when Expression Lists are introduced.
2840
2841 ###### type union fields
2842
2843         struct {
2844                 struct binode *params;
2845                 struct type *return_type;
2846                 struct variable *scope;
2847                 int inline_result;      // return value is at start of 'local'
2848                 int local_size;
2849         } function;
2850
2851 ###### value union fields
2852         struct exec *function;
2853
2854 ###### type functions
2855         void (*check_args)(struct parse_context *c, int *ok,
2856                            struct type *require, struct exec *args);
2857
2858 ###### value functions
2859
2860         static void function_free(struct type *type, struct value *val)
2861         {
2862                 free_exec(val->function);
2863                 val->function = NULL;
2864         }
2865
2866         static int function_compat(struct type *require, struct type *have)
2867         {
2868                 // FIXME can I do anything here yet?
2869                 return 0;
2870         }
2871
2872         static void function_check_args(struct parse_context *c, int *ok,
2873                                         struct type *require, struct exec *args)
2874         {
2875                 /* This should be 'compat', but we don't have a 'tuple' type to
2876                  * hold the type of 'args'
2877                  */
2878                 struct binode *arg = cast(binode, args);
2879                 struct binode *param = require->function.params;
2880
2881                 while (param) {
2882                         struct var *pv = cast(var, param->left);
2883                         if (!arg) {
2884                                 type_err(c, "error: insufficient arguments to function.",
2885                                          args, NULL, 0, NULL);
2886                                 break;
2887                         }
2888                         *ok = 1;
2889                         propagate_types(arg->left, c, ok, pv->var->type, 0);
2890                         param = cast(binode, param->right);
2891                         arg = cast(binode, arg->right);
2892                 }
2893                 if (arg)
2894                         type_err(c, "error: too many arguments to function.",
2895                                  args, NULL, 0, NULL);
2896         }
2897
2898         static void function_print(struct type *type, struct value *val, FILE *f)
2899         {
2900                 print_exec(val->function, 1, 0);
2901         }
2902
2903         static void function_print_type_decl(struct type *type, FILE *f)
2904         {
2905                 struct binode *b;
2906                 fprintf(f, "(");
2907                 for (b = type->function.params; b; b = cast(binode, b->right)) {
2908                         struct variable *v = cast(var, b->left)->var;
2909                         fprintf(f, "%.*s%s", v->name->name.len, v->name->name.txt,
2910                                 v->constant ? "::" : ":");
2911                         type_print(v->type, f);
2912                         if (b->right)
2913                                 fprintf(f, "; ");
2914                 }
2915                 fprintf(f, ")");
2916                 if (type->function.return_type != Tnone) {
2917                         fprintf(f, ":");
2918                         if (type->function.inline_result) {
2919                                 int i;
2920                                 struct type *t = type->function.return_type;
2921                                 fprintf(f, " (");
2922                                 for (i = 0; i < t->structure.nfields; i++) {
2923                                         struct field *fl = t->structure.fields + i;
2924                                         if (i)
2925                                                 fprintf(f, "; ");
2926                                         fprintf(f, "%.*s:", fl->name.len, fl->name.txt);
2927                                         type_print(fl->type, f);
2928                                 }
2929                                 fprintf(f, ")");
2930                         } else
2931                                 type_print(type->function.return_type, f);
2932                 }
2933                 fprintf(f, "\n");
2934         }
2935
2936         static void function_free_type(struct type *t)
2937         {
2938                 free_exec(t->function.params);
2939         }
2940
2941         static struct type function_prototype = {
2942                 .size = sizeof(void*),
2943                 .align = sizeof(void*),
2944                 .free = function_free,
2945                 .compat = function_compat,
2946                 .check_args = function_check_args,
2947                 .print = function_print,
2948                 .print_type_decl = function_print_type_decl,
2949                 .free_type = function_free_type,
2950         };
2951
2952 ###### declare terminals
2953
2954         $TERM func
2955
2956 ###### Binode types
2957         List,
2958
2959 ###### Grammar
2960
2961         $*variable
2962         FuncName -> IDENTIFIER ${ {
2963                         struct variable *v = var_decl(c, $1.txt);
2964                         struct var *e = new_pos(var, $1);
2965                         e->var = v;
2966                         if (v) {
2967                                 v->where_decl = e;
2968                                 $0 = v;
2969                         } else {
2970                                 v = var_ref(c, $1.txt);
2971                                 e->var = v;
2972                                 type_err(c, "error: function '%v' redeclared",
2973                                         e, NULL, 0, NULL);
2974                                 type_err(c, "info: this is where '%v' was first declared",
2975                                         v->where_decl, NULL, 0, NULL);
2976                                 free_exec(e);
2977                         }
2978                 } }$
2979
2980         $*binode
2981         Args -> ArgsLine NEWLINE ${ $0 = $<AL; }$
2982                 | Args ArgsLine NEWLINE ${ {
2983                         struct binode *b = $<AL;
2984                         struct binode **bp = &b;
2985                         while (*bp)
2986                                 bp = (struct binode **)&(*bp)->left;
2987                         *bp = $<A;
2988                         $0 = b;
2989                 } }$
2990
2991         ArgsLine -> ${ $0 = NULL; }$
2992                 | Varlist ${ $0 = $<1; }$
2993                 | Varlist ; ${ $0 = $<1; }$
2994
2995         Varlist -> Varlist ; ArgDecl ${
2996                         $0 = new(binode);
2997                         $0->op = List;
2998                         $0->left = $<Vl;
2999                         $0->right = $<AD;
3000                 }$
3001                 | ArgDecl ${
3002                         $0 = new(binode);
3003                         $0->op = List;
3004                         $0->left = NULL;
3005                         $0->right = $<AD;
3006                 }$
3007
3008         $*var
3009         ArgDecl -> IDENTIFIER : FormalType ${ {
3010                 struct variable *v = var_decl(c, $1.txt);
3011                 $0 = new(var);
3012                 $0->var = v;
3013                 v->type = $<FT;
3014         } }$
3015
3016 ##### Function calls
3017
3018 A function call can appear either as an expression or as a statement.
3019 We use a new 'Funcall' binode type to link the function with a list of
3020 arguments, form with the 'List' nodes.
3021
3022 We have already seen the "Term" which is how a function call can appear
3023 in an expression.  To parse a function call into a statement we include
3024 it in the "SimpleStatement Grammar" which will be described later.
3025
3026 ###### Binode types
3027         Funcall,
3028
3029 ###### term grammar
3030         | Term ( ExpressionList ) ${ {
3031                 struct binode *b = new(binode);
3032                 b->op = Funcall;
3033                 b->left = $<T;
3034                 b->right = reorder_bilist($<EL);
3035                 $0 = b;
3036         } }$
3037         | Term ( ) ${ {
3038                 struct binode *b = new(binode);
3039                 b->op = Funcall;
3040                 b->left = $<T;
3041                 b->right = NULL;
3042                 $0 = b;
3043         } }$
3044
3045 ###### SimpleStatement Grammar
3046
3047         | Term ( ExpressionList ) ${ {
3048                 struct binode *b = new(binode);
3049                 b->op = Funcall;
3050                 b->left = $<T;
3051                 b->right = reorder_bilist($<EL);
3052                 $0 = b;
3053         } }$
3054
3055 ###### print binode cases
3056
3057         case Funcall:
3058                 do_indent(indent, "");
3059                 print_exec(b->left, -1, bracket);
3060                 printf("(");
3061                 for (b = cast(binode, b->right); b; b = cast(binode, b->right)) {
3062                         if (b->left) {
3063                                 printf(" ");
3064                                 print_exec(b->left, -1, bracket);
3065                                 if (b->right)
3066                                         printf(",");
3067                         }
3068                 }
3069                 printf(")");
3070                 if (indent >= 0)
3071                         printf("\n");
3072                 break;
3073
3074 ###### propagate binode cases
3075
3076         case Funcall: {
3077                 /* Every arg must match formal parameter, and result
3078                  * is return type of function
3079                  */
3080                 struct binode *args = cast(binode, b->right);
3081                 struct var *v = cast(var, b->left);
3082
3083                 if (!v->var->type || v->var->type->check_args == NULL) {
3084                         type_err(c, "error: attempt to call a non-function.",
3085                                  prog, NULL, 0, NULL);
3086                         return NULL;
3087                 }
3088                 v->var->type->check_args(c, ok, v->var->type, args);
3089                 return v->var->type->function.return_type;
3090         }
3091
3092 ###### interp binode cases
3093
3094         case Funcall: {
3095                 struct var *v = cast(var, b->left);
3096                 struct type *t = v->var->type;
3097                 void *oldlocal = c->local;
3098                 int old_size = c->local_size;
3099                 void *local = calloc(1, t->function.local_size);
3100                 struct value *fbody = var_value(c, v->var);
3101                 struct binode *arg = cast(binode, b->right);
3102                 struct binode *param = t->function.params;
3103
3104                 while (param) {
3105                         struct var *pv = cast(var, param->left);
3106                         struct type *vtype = NULL;
3107                         struct value val = interp_exec(c, arg->left, &vtype);
3108                         struct value *lval;
3109                         c->local = local; c->local_size = t->function.local_size;
3110                         lval = var_value(c, pv->var);
3111                         c->local = oldlocal; c->local_size = old_size;
3112                         memcpy(lval, &val, vtype->size);
3113                         param = cast(binode, param->right);
3114                         arg = cast(binode, arg->right);
3115                 }
3116                 c->local = local; c->local_size = t->function.local_size;
3117                 if (t->function.inline_result && dtype) {
3118                         _interp_exec(c, fbody->function, NULL, NULL);
3119                         memcpy(dest, local, dtype->size);
3120                         rvtype = ret.type = NULL;
3121                 } else
3122                         rv = interp_exec(c, fbody->function, &rvtype);
3123                 c->local = oldlocal; c->local_size = old_size;
3124                 free(local);
3125                 break;
3126         }
3127
3128 ## Complex executables: statements and expressions
3129
3130 Now that we have types and values and variables and most of the basic
3131 Terms which provide access to these, we can explore the more complex
3132 code that combine all of these to get useful work done.  Specifically
3133 statements and expressions.
3134
3135 Expressions are various combinations of Terms.  We will use operator
3136 precedence to ensure correct parsing.  The simplest Expression is just a
3137 Term - others will follow.
3138
3139 ###### Grammar
3140
3141         $*exec
3142         Expression -> Term ${ $0 = $<Term; }$
3143                 ## expression grammar
3144
3145 ### Expressions: Conditional
3146
3147 Our first user of the `binode` will be conditional expressions, which
3148 is a bit odd as they actually have three components.  That will be
3149 handled by having 2 binodes for each expression.  The conditional
3150 expression is the lowest precedence operator which is why we define it
3151 first - to start the precedence list.
3152
3153 Conditional expressions are of the form "value `if` condition `else`
3154 other_value".  They associate to the right, so everything to the right
3155 of `else` is part of an else value, while only a higher-precedence to
3156 the left of `if` is the if values.  Between `if` and `else` there is no
3157 room for ambiguity, so a full conditional expression is allowed in
3158 there.
3159
3160 ###### Binode types
3161         CondExpr,
3162
3163 ###### declare terminals
3164
3165         $LEFT if $$ifelse
3166
3167 ###### expression grammar
3168
3169         | Expression if Expression else Expression $$ifelse ${ {
3170                 struct binode *b1 = new(binode);
3171                 struct binode *b2 = new(binode);
3172                 b1->op = CondExpr;
3173                 b1->left = $<3;
3174                 b1->right = b2;
3175                 b2->op = CondExpr;
3176                 b2->left = $<1;
3177                 b2->right = $<5;
3178                 $0 = b1;
3179         } }$
3180
3181 ###### print binode cases
3182
3183         case CondExpr:
3184                 b2 = cast(binode, b->right);
3185                 if (bracket) printf("(");
3186                 print_exec(b2->left, -1, bracket);
3187                 printf(" if ");
3188                 print_exec(b->left, -1, bracket);
3189                 printf(" else ");
3190                 print_exec(b2->right, -1, bracket);
3191                 if (bracket) printf(")");
3192                 break;
3193
3194 ###### propagate binode cases
3195
3196         case CondExpr: {
3197                 /* cond must be Tbool, others must match */
3198                 struct binode *b2 = cast(binode, b->right);
3199                 struct type *t2;
3200
3201                 propagate_types(b->left, c, ok, Tbool, 0);
3202                 t = propagate_types(b2->left, c, ok, type, Rnolabel);
3203                 t2 = propagate_types(b2->right, c, ok, type ?: t, Rnolabel);
3204                 return t ?: t2;
3205         }
3206
3207 ###### interp binode cases
3208
3209         case CondExpr: {
3210                 struct binode *b2 = cast(binode, b->right);
3211                 left = interp_exec(c, b->left, &ltype);
3212                 if (left.bool)
3213                         rv = interp_exec(c, b2->left, &rvtype); // UNTESTED
3214                 else
3215                         rv = interp_exec(c, b2->right, &rvtype);
3216                 }
3217                 break;
3218
3219 ### Expression list
3220
3221 We take a brief detour, now that we have expressions, to describe lists
3222 of expressions.  These will be needed for function parameters and
3223 possibly other situations.  They seem generic enough to introduce here
3224 to be used elsewhere.
3225
3226 And ExpressionList will use the `List` type of `binode`, building up at
3227 the end.  And place where they are used will probably call
3228 `reorder_bilist()` to get a more normal first/next arrangement.
3229
3230 ###### declare terminals
3231         $TERM ,
3232
3233 `List` execs have no implicit semantics, so they are never propagated or
3234 interpreted.  The can be printed as a comma separate list, which is how
3235 they are parsed.  Note they are also used for function formal parameter
3236 lists.  In that case a separate function is used to print them.
3237
3238 ###### print binode cases
3239         case List:
3240                 while (b) {
3241                         printf(" ");
3242                         print_exec(b->left, -1, bracket);
3243                         if (b->right)
3244                                 printf(",");
3245                         b = cast(binode, b->right);
3246                 }
3247                 break;
3248
3249 ###### propagate binode cases
3250         case List: abort(); // NOTEST
3251 ###### interp binode cases
3252         case List: abort(); // NOTEST
3253
3254 ###### Grammar
3255
3256         $*binode
3257         ExpressionList -> ExpressionList , Expression ${
3258                         $0 = new(binode);
3259                         $0->op = List;
3260                         $0->left = $<1;
3261                         $0->right = $<3;
3262                 }$
3263                 | Expression ${
3264                         $0 = new(binode);
3265                         $0->op = List;
3266                         $0->left = NULL;
3267                         $0->right = $<1;
3268                 }$
3269
3270 ### Expressions: Boolean
3271
3272 The next class of expressions to use the `binode` will be Boolean
3273 expressions.  "`and then`" and "`or else`" are similar to `and` and `or`
3274 have same corresponding precendence.  The difference is that they don't
3275 evaluate the second expression if not necessary.
3276
3277 ###### Binode types
3278         And,
3279         AndThen,
3280         Or,
3281         OrElse,
3282         Not,
3283
3284 ###### declare terminals
3285         $LEFT or
3286         $LEFT and
3287         $LEFT not
3288
3289 ###### expression grammar
3290                 | Expression or Expression ${ {
3291                         struct binode *b = new(binode);
3292                         b->op = Or;
3293                         b->left = $<1;
3294                         b->right = $<3;
3295                         $0 = b;
3296                 } }$
3297                 | Expression or else Expression ${ {
3298                         struct binode *b = new(binode);
3299                         b->op = OrElse;
3300                         b->left = $<1;
3301                         b->right = $<4;
3302                         $0 = b;
3303                 } }$
3304
3305                 | Expression and Expression ${ {
3306                         struct binode *b = new(binode);
3307                         b->op = And;
3308                         b->left = $<1;
3309                         b->right = $<3;
3310                         $0 = b;
3311                 } }$
3312                 | Expression and then Expression ${ {
3313                         struct binode *b = new(binode);
3314                         b->op = AndThen;
3315                         b->left = $<1;
3316                         b->right = $<4;
3317                         $0 = b;
3318                 } }$
3319
3320                 | not Expression ${ {
3321                         struct binode *b = new(binode);
3322                         b->op = Not;
3323                         b->right = $<2;
3324                         $0 = b;
3325                 } }$
3326
3327 ###### print binode cases
3328         case And:
3329                 if (bracket) printf("(");
3330                 print_exec(b->left, -1, bracket);
3331                 printf(" and ");
3332                 print_exec(b->right, -1, bracket);
3333                 if (bracket) printf(")");
3334                 break;
3335         case AndThen:
3336                 if (bracket) printf("(");
3337                 print_exec(b->left, -1, bracket);
3338                 printf(" and then ");
3339                 print_exec(b->right, -1, bracket);
3340                 if (bracket) printf(")");
3341                 break;
3342         case Or:
3343                 if (bracket) printf("(");
3344                 print_exec(b->left, -1, bracket);
3345                 printf(" or ");
3346                 print_exec(b->right, -1, bracket);
3347                 if (bracket) printf(")");
3348                 break;
3349         case OrElse:
3350                 if (bracket) printf("(");
3351                 print_exec(b->left, -1, bracket);
3352                 printf(" or else ");
3353                 print_exec(b->right, -1, bracket);
3354                 if (bracket) printf(")");
3355                 break;
3356         case Not:
3357                 if (bracket) printf("(");
3358                 printf("not ");
3359                 print_exec(b->right, -1, bracket);
3360                 if (bracket) printf(")");
3361                 break;
3362
3363 ###### propagate binode cases
3364         case And:
3365         case AndThen:
3366         case Or:
3367         case OrElse:
3368         case Not:
3369                 /* both must be Tbool, result is Tbool */
3370                 propagate_types(b->left, c, ok, Tbool, 0);
3371                 propagate_types(b->right, c, ok, Tbool, 0);
3372                 if (type && type != Tbool)
3373                         type_err(c, "error: %1 operation found where %2 expected", prog,
3374                                    Tbool, 0, type);
3375                 return Tbool;
3376
3377 ###### interp binode cases
3378         case And:
3379                 rv = interp_exec(c, b->left, &rvtype);
3380                 right = interp_exec(c, b->right, &rtype);
3381                 rv.bool = rv.bool && right.bool;
3382                 break;
3383         case AndThen:
3384                 rv = interp_exec(c, b->left, &rvtype);
3385                 if (rv.bool)
3386                         rv = interp_exec(c, b->right, NULL);
3387                 break;
3388         case Or:
3389                 rv = interp_exec(c, b->left, &rvtype);
3390                 right = interp_exec(c, b->right, &rtype);
3391                 rv.bool = rv.bool || right.bool;
3392                 break;
3393         case OrElse:
3394                 rv = interp_exec(c, b->left, &rvtype);
3395                 if (!rv.bool)
3396                         rv = interp_exec(c, b->right, NULL);
3397                 break;
3398         case Not:
3399                 rv = interp_exec(c, b->right, &rvtype);
3400                 rv.bool = !rv.bool;
3401                 break;
3402
3403 ### Expressions: Comparison
3404
3405 Of slightly higher precedence that Boolean expressions are Comparisons.
3406 A comparison takes arguments of any comparable type, but the two types
3407 must be the same.
3408
3409 To simplify the parsing we introduce an `eop` which can record an
3410 expression operator, and the `CMPop` non-terminal will match one of them.
3411
3412 ###### ast
3413         struct eop {
3414                 enum Btype op;
3415         };
3416
3417 ###### ast functions
3418         static void free_eop(struct eop *e)
3419         {
3420                 if (e)
3421                         free(e);
3422         }
3423
3424 ###### Binode types
3425         Less,
3426         Gtr,
3427         LessEq,
3428         GtrEq,
3429         Eql,
3430         NEql,
3431
3432 ###### declare terminals
3433         $LEFT < > <= >= == != CMPop
3434
3435 ###### expression grammar
3436         | Expression CMPop Expression ${ {
3437                 struct binode *b = new(binode);
3438                 b->op = $2.op;
3439                 b->left = $<1;
3440                 b->right = $<3;
3441                 $0 = b;
3442         } }$
3443
3444 ###### Grammar
3445
3446         $eop
3447         CMPop ->   < ${ $0.op = Less; }$
3448                 |  > ${ $0.op = Gtr; }$
3449                 |  <= ${ $0.op = LessEq; }$
3450                 |  >= ${ $0.op = GtrEq; }$
3451                 |  == ${ $0.op = Eql; }$
3452                 |  != ${ $0.op = NEql; }$
3453
3454 ###### print binode cases
3455
3456         case Less:
3457         case LessEq:
3458         case Gtr:
3459         case GtrEq:
3460         case Eql:
3461         case NEql:
3462                 if (bracket) printf("(");
3463                 print_exec(b->left, -1, bracket);
3464                 switch(b->op) {
3465                 case Less:   printf(" < "); break;
3466                 case LessEq: printf(" <= "); break;
3467                 case Gtr:    printf(" > "); break;
3468                 case GtrEq:  printf(" >= "); break;
3469                 case Eql:    printf(" == "); break;
3470                 case NEql:   printf(" != "); break;
3471                 default: abort();               // NOTEST
3472                 }
3473                 print_exec(b->right, -1, bracket);
3474                 if (bracket) printf(")");
3475                 break;
3476
3477 ###### propagate binode cases
3478         case Less:
3479         case LessEq:
3480         case Gtr:
3481         case GtrEq:
3482         case Eql:
3483         case NEql:
3484                 /* Both must match but not be labels, result is Tbool */
3485                 t = propagate_types(b->left, c, ok, NULL, Rnolabel);
3486                 if (t)
3487                         propagate_types(b->right, c, ok, t, 0);
3488                 else {
3489                         t = propagate_types(b->right, c, ok, NULL, Rnolabel);   // UNTESTED
3490                         if (t)  // UNTESTED
3491                                 t = propagate_types(b->left, c, ok, t, 0);      // UNTESTED
3492                 }
3493                 if (!type_compat(type, Tbool, 0))
3494                         type_err(c, "error: Comparison returns %1 but %2 expected", prog,
3495                                     Tbool, rules, type);
3496                 return Tbool;
3497
3498 ###### interp binode cases
3499         case Less:
3500         case LessEq:
3501         case Gtr:
3502         case GtrEq:
3503         case Eql:
3504         case NEql:
3505         {
3506                 int cmp;
3507                 left = interp_exec(c, b->left, &ltype);
3508                 right = interp_exec(c, b->right, &rtype);
3509                 cmp = value_cmp(ltype, rtype, &left, &right);
3510                 rvtype = Tbool;
3511                 switch (b->op) {
3512                 case Less:      rv.bool = cmp <  0; break;
3513                 case LessEq:    rv.bool = cmp <= 0; break;
3514                 case Gtr:       rv.bool = cmp >  0; break;
3515                 case GtrEq:     rv.bool = cmp >= 0; break;
3516                 case Eql:       rv.bool = cmp == 0; break;
3517                 case NEql:      rv.bool = cmp != 0; break;
3518                 default:        rv.bool = 0; break;     // NOTEST
3519                 }
3520                 break;
3521         }
3522
3523 ### Expressions: Arithmetic etc.
3524
3525 The remaining expressions with the highest precedence are arithmetic,
3526 string concatenation, and string conversion.  String concatenation
3527 (`++`) has the same precedence as multiplication and division, but lower
3528 than the uniary.
3529
3530 String conversion is a temporary feature until I get a better type
3531 system.  `$` is a prefix operator which expects a string and returns
3532 a number.
3533
3534 `+` and `-` are both infix and prefix operations (where they are
3535 absolute value and negation).  These have different operator names.
3536
3537 We also have a 'Bracket' operator which records where parentheses were
3538 found.  This makes it easy to reproduce these when printing.  Possibly I
3539 should only insert brackets were needed for precedence.  Putting
3540 parentheses around an expression converts it into a Term,
3541
3542 ###### Binode types
3543         Plus, Minus,
3544         Times, Divide, Rem,
3545         Concat,
3546         Absolute, Negate,
3547         StringConv,
3548         Bracket,
3549
3550 ###### declare terminals
3551         $LEFT + - Eop
3552         $LEFT * / % ++ Top
3553         $LEFT Uop $
3554         $TERM ( )
3555
3556 ###### expression grammar
3557         | Expression Eop Expression ${ {
3558                 struct binode *b = new(binode);
3559                 b->op = $2.op;
3560                 b->left = $<1;
3561                 b->right = $<3;
3562                 $0 = b;
3563         } }$
3564
3565         | Expression Top Expression ${ {
3566                 struct binode *b = new(binode);
3567                 b->op = $2.op;
3568                 b->left = $<1;
3569                 b->right = $<3;
3570                 $0 = b;
3571         } }$
3572
3573         | Uop Expression ${ {
3574                 struct binode *b = new(binode);
3575                 b->op = $1.op;
3576                 b->right = $<2;
3577                 $0 = b;
3578         } }$
3579
3580 ###### term grammar
3581
3582         | ( Expression ) ${ {
3583                 struct binode *b = new_pos(binode, $1);
3584                 b->op = Bracket;
3585                 b->right = $<2;
3586                 $0 = b;
3587         } }$
3588
3589 ###### Grammar
3590
3591         $eop
3592         Eop ->    + ${ $0.op = Plus; }$
3593                 | - ${ $0.op = Minus; }$
3594
3595         Uop ->    + ${ $0.op = Absolute; }$
3596                 | - ${ $0.op = Negate; }$
3597                 | $ ${ $0.op = StringConv; }$
3598
3599         Top ->    * ${ $0.op = Times; }$
3600                 | / ${ $0.op = Divide; }$
3601                 | % ${ $0.op = Rem; }$
3602                 | ++ ${ $0.op = Concat; }$
3603
3604 ###### print binode cases
3605         case Plus:
3606         case Minus:
3607         case Times:
3608         case Divide:
3609         case Concat:
3610         case Rem:
3611                 if (bracket) printf("(");
3612                 print_exec(b->left, indent, bracket);
3613                 switch(b->op) {
3614                 case Plus:   fputs(" + ", stdout); break;
3615                 case Minus:  fputs(" - ", stdout); break;
3616                 case Times:  fputs(" * ", stdout); break;
3617                 case Divide: fputs(" / ", stdout); break;
3618                 case Rem:    fputs(" % ", stdout); break;
3619                 case Concat: fputs(" ++ ", stdout); break;
3620                 default: abort();       // NOTEST
3621                 }                       // NOTEST
3622                 print_exec(b->right, indent, bracket);
3623                 if (bracket) printf(")");
3624                 break;
3625         case Absolute:
3626         case Negate:
3627         case StringConv:
3628                 if (bracket) printf("(");
3629                 switch (b->op) {
3630                 case Absolute:   fputs("+", stdout); break;
3631                 case Negate:     fputs("-", stdout); break;
3632                 case StringConv: fputs("$", stdout); break;
3633                 default: abort();       // NOTEST
3634                 }                       // NOTEST
3635                 print_exec(b->right, indent, bracket);
3636                 if (bracket) printf(")");
3637                 break;
3638         case Bracket:
3639                 printf("(");
3640                 print_exec(b->right, indent, bracket);
3641                 printf(")");
3642                 break;
3643
3644 ###### propagate binode cases
3645         case Plus:
3646         case Minus:
3647         case Times:
3648         case Rem:
3649         case Divide:
3650                 /* both must be numbers, result is Tnum */
3651         case Absolute:
3652         case Negate:
3653                 /* as propagate_types ignores a NULL,
3654                  * unary ops fit here too */
3655                 propagate_types(b->left, c, ok, Tnum, 0);
3656                 propagate_types(b->right, c, ok, Tnum, 0);
3657                 if (!type_compat(type, Tnum, 0))
3658                         type_err(c, "error: Arithmetic returns %1 but %2 expected", prog,
3659                                    Tnum, rules, type);
3660                 return Tnum;
3661
3662         case Concat:
3663                 /* both must be Tstr, result is Tstr */
3664                 propagate_types(b->left, c, ok, Tstr, 0);
3665                 propagate_types(b->right, c, ok, Tstr, 0);
3666                 if (!type_compat(type, Tstr, 0))
3667                         type_err(c, "error: Concat returns %1 but %2 expected", prog,
3668                                    Tstr, rules, type);
3669                 return Tstr;
3670
3671         case StringConv:
3672                 /* op must be string, result is number */
3673                 propagate_types(b->left, c, ok, Tstr, 0);
3674                 if (!type_compat(type, Tnum, 0))
3675                         type_err(c,     // UNTESTED
3676                           "error: Can only convert string to number, not %1",
3677                                 prog, type, 0, NULL);
3678                 return Tnum;
3679
3680         case Bracket:
3681                 return propagate_types(b->right, c, ok, type, 0);
3682
3683 ###### interp binode cases
3684
3685         case Plus:
3686                 rv = interp_exec(c, b->left, &rvtype);
3687                 right = interp_exec(c, b->right, &rtype);
3688                 mpq_add(rv.num, rv.num, right.num);
3689                 break;
3690         case Minus:
3691                 rv = interp_exec(c, b->left, &rvtype);
3692                 right = interp_exec(c, b->right, &rtype);
3693                 mpq_sub(rv.num, rv.num, right.num);
3694                 break;
3695         case Times:
3696                 rv = interp_exec(c, b->left, &rvtype);
3697                 right = interp_exec(c, b->right, &rtype);
3698                 mpq_mul(rv.num, rv.num, right.num);
3699                 break;
3700         case Divide:
3701                 rv = interp_exec(c, b->left, &rvtype);
3702                 right = interp_exec(c, b->right, &rtype);
3703                 mpq_div(rv.num, rv.num, right.num);
3704                 break;
3705         case Rem: {
3706                 mpz_t l, r, rem;
3707
3708                 left = interp_exec(c, b->left, &ltype);
3709                 right = interp_exec(c, b->right, &rtype);
3710                 mpz_init(l); mpz_init(r); mpz_init(rem);
3711                 mpz_tdiv_q(l, mpq_numref(left.num), mpq_denref(left.num));
3712                 mpz_tdiv_q(r, mpq_numref(right.num), mpq_denref(right.num));
3713                 mpz_tdiv_r(rem, l, r);
3714                 val_init(Tnum, &rv);
3715                 mpq_set_z(rv.num, rem);
3716                 mpz_clear(r); mpz_clear(l); mpz_clear(rem);
3717                 rvtype = ltype;
3718                 break;
3719         }
3720         case Negate:
3721                 rv = interp_exec(c, b->right, &rvtype);
3722                 mpq_neg(rv.num, rv.num);
3723                 break;
3724         case Absolute:
3725                 rv = interp_exec(c, b->right, &rvtype);
3726                 mpq_abs(rv.num, rv.num);
3727                 break;
3728         case Bracket:
3729                 rv = interp_exec(c, b->right, &rvtype);
3730                 break;
3731         case Concat:
3732                 left = interp_exec(c, b->left, &ltype);
3733                 right = interp_exec(c, b->right, &rtype);
3734                 rvtype = Tstr;
3735                 rv.str = text_join(left.str, right.str);
3736                 break;
3737         case StringConv:
3738                 right = interp_exec(c, b->right, &rvtype);
3739                 rtype = Tstr;
3740                 rvtype = Tnum;
3741
3742                 struct text tx = right.str;
3743                 char tail[3];
3744                 int neg = 0;
3745                 if (tx.txt[0] == '-') {
3746                         neg = 1;        // UNTESTED
3747                         tx.txt++;       // UNTESTED
3748                         tx.len--;       // UNTESTED
3749                 }
3750                 if (number_parse(rv.num, tail, tx) == 0)
3751                         mpq_init(rv.num);       // UNTESTED
3752                 else if (neg)
3753                         mpq_neg(rv.num, rv.num);        // UNTESTED
3754                 if (tail[0])
3755                         printf("Unsupported suffix: %.*s\n", tx.len, tx.txt);   // UNTESTED
3756
3757                 break;
3758
3759 ###### value functions
3760
3761         static struct text text_join(struct text a, struct text b)
3762         {
3763                 struct text rv;
3764                 rv.len = a.len + b.len;
3765                 rv.txt = malloc(rv.len);
3766                 memcpy(rv.txt, a.txt, a.len);
3767                 memcpy(rv.txt+a.len, b.txt, b.len);
3768                 return rv;
3769         }
3770
3771 ### Blocks, Statements, and Statement lists.
3772
3773 Now that we have expressions out of the way we need to turn to
3774 statements.  There are simple statements and more complex statements.
3775 Simple statements do not contain (syntactic) newlines, complex statements do.
3776
3777 Statements often come in sequences and we have corresponding simple
3778 statement lists and complex statement lists.
3779 The former comprise only simple statements separated by semicolons.
3780 The later comprise complex statements and simple statement lists.  They are
3781 separated by newlines.  Thus the semicolon is only used to separate
3782 simple statements on the one line.  This may be overly restrictive,
3783 but I'm not sure I ever want a complex statement to share a line with
3784 anything else.
3785
3786 Note that a simple statement list can still use multiple lines if
3787 subsequent lines are indented, so
3788
3789 ###### Example: wrapped simple statement list
3790
3791         a = b; c = d;
3792            e = f; print g
3793
3794 is a single simple statement list.  This might allow room for
3795 confusion, so I'm not set on it yet.
3796
3797 A simple statement list needs no extra syntax.  A complex statement
3798 list has two syntactic forms.  It can be enclosed in braces (much like
3799 C blocks), or it can be introduced by an indent and continue until an
3800 unindented newline (much like Python blocks).  With this extra syntax
3801 it is referred to as a block.
3802
3803 Note that a block does not have to include any newlines if it only
3804 contains simple statements.  So both of:
3805
3806         if condition: a=b; d=f
3807
3808         if condition { a=b; print f }
3809
3810 are valid.
3811
3812 In either case the list is constructed from a `binode` list with
3813 `Block` as the operator.  When parsing the list it is most convenient
3814 to append to the end, so a list is a list and a statement.  When using
3815 the list it is more convenient to consider a list to be a statement
3816 and a list.  So we need a function to re-order a list.
3817 `reorder_bilist` serves this purpose.
3818
3819 The only stand-alone statement we introduce at this stage is `pass`
3820 which does nothing and is represented as a `NULL` pointer in a `Block`
3821 list.  Other stand-alone statements will follow once the infrastructure
3822 is in-place.
3823
3824 As many statements will use binodes, we declare a binode pointer 'b' in
3825 the common header for all reductions to use.
3826
3827 ###### Parser: reduce
3828         struct binode *b;
3829
3830 ###### Binode types
3831         Block,
3832
3833 ###### Grammar
3834
3835         $TERM { } ;
3836
3837         $*binode
3838         Block -> { IN OptNL Statementlist OUT OptNL } ${ $0 = $<Sl; }$
3839                 | { SimpleStatements } ${ $0 = reorder_bilist($<SS); }$
3840                 | SimpleStatements ; ${ $0 = reorder_bilist($<SS); }$
3841                 | SimpleStatements EOL ${ $0 = reorder_bilist($<SS); }$
3842                 | IN OptNL Statementlist OUT ${ $0 = $<Sl; }$
3843
3844         OpenBlock -> OpenScope { IN OptNL Statementlist OUT OptNL } ${ $0 = $<Sl; }$
3845                 | OpenScope { SimpleStatements } ${ $0 = reorder_bilist($<SS); }$
3846                 | OpenScope SimpleStatements ; ${ $0 = reorder_bilist($<SS); }$
3847                 | OpenScope SimpleStatements EOL ${ $0 = reorder_bilist($<SS); }$
3848                 | IN OpenScope OptNL Statementlist OUT ${ $0 = $<Sl; }$
3849
3850         UseBlock -> { OpenScope IN OptNL Statementlist OUT OptNL } ${ $0 = $<Sl; }$
3851                 | { OpenScope SimpleStatements } ${ $0 = reorder_bilist($<SS); }$
3852                 | IN OpenScope OptNL Statementlist OUT ${ $0 = $<Sl; }$
3853
3854         ColonBlock -> { IN OptNL Statementlist OUT OptNL } ${ $0 = $<Sl; }$
3855                 | { SimpleStatements } ${ $0 = reorder_bilist($<SS); }$
3856                 | : SimpleStatements ; ${ $0 = reorder_bilist($<SS); }$
3857                 | : SimpleStatements EOL ${ $0 = reorder_bilist($<SS); }$
3858                 | : IN OptNL Statementlist OUT ${ $0 = $<Sl; }$
3859
3860         Statementlist -> ComplexStatements ${ $0 = reorder_bilist($<CS); }$
3861
3862         ComplexStatements -> ComplexStatements ComplexStatement ${
3863                         if ($2 == NULL) {
3864                                 $0 = $<1;
3865                         } else {
3866                                 $0 = new(binode);
3867                                 $0->op = Block;
3868                                 $0->left = $<1;
3869                                 $0->right = $<2;
3870                         }
3871                 }$
3872                 | ComplexStatement ${
3873                         if ($1 == NULL) {
3874                                 $0 = NULL;
3875                         } else {
3876                                 $0 = new(binode);
3877                                 $0->op = Block;
3878                                 $0->left = NULL;
3879                                 $0->right = $<1;
3880                         }
3881                 }$
3882
3883         $*exec
3884         ComplexStatement -> SimpleStatements Newlines ${
3885                         $0 = reorder_bilist($<SS);
3886                         }$
3887                 |  SimpleStatements ; Newlines ${
3888                         $0 = reorder_bilist($<SS);
3889                         }$
3890                 ## ComplexStatement Grammar
3891
3892         $*binode
3893         SimpleStatements -> SimpleStatements ; SimpleStatement ${
3894                         $0 = new(binode);
3895                         $0->op = Block;
3896                         $0->left = $<1;
3897                         $0->right = $<3;
3898                         }$
3899                 | SimpleStatement ${
3900                         $0 = new(binode);
3901                         $0->op = Block;
3902                         $0->left = NULL;
3903                         $0->right = $<1;
3904                         }$
3905
3906         $TERM pass
3907         $*exec
3908         SimpleStatement -> pass ${ $0 = NULL; }$
3909                 | ERROR ${ tok_err(c, "Syntax error in statement", &$1); }$
3910                 ## SimpleStatement Grammar
3911
3912 ###### print binode cases
3913         case Block:
3914                 if (indent < 0) {
3915                         // simple statement
3916                         if (b->left == NULL)    // UNTESTED
3917                                 printf("pass"); // UNTESTED
3918                         else
3919                                 print_exec(b->left, indent, bracket);   // UNTESTED
3920                         if (b->right) { // UNTESTED
3921                                 printf("; ");   // UNTESTED
3922                                 print_exec(b->right, indent, bracket);  // UNTESTED
3923                         }
3924                 } else {
3925                         // block, one per line
3926                         if (b->left == NULL)
3927                                 do_indent(indent, "pass\n");
3928                         else
3929                                 print_exec(b->left, indent, bracket);
3930                         if (b->right)
3931                                 print_exec(b->right, indent, bracket);
3932                 }
3933                 break;
3934
3935 ###### propagate binode cases
3936         case Block:
3937         {
3938                 /* If any statement returns something other than Tnone
3939                  * or Tbool then all such must return same type.
3940                  * As each statement may be Tnone or something else,
3941                  * we must always pass NULL (unknown) down, otherwise an incorrect
3942                  * error might occur.  We never return Tnone unless it is
3943                  * passed in.
3944                  */
3945                 struct binode *e;
3946
3947                 for (e = b; e; e = cast(binode, e->right)) {
3948                         t = propagate_types(e->left, c, ok, NULL, rules);
3949                         if ((rules & Rboolok) && (t == Tbool || t == Tnone))
3950                                 t = NULL;
3951                         if (t == Tnone && e->right)
3952                                 /* Only the final statement *must* return a value
3953                                  * when not Rboolok
3954                                  */
3955                                 t = NULL;
3956                         if (t) {
3957                                 if (!type)
3958                                         type = t;
3959                                 else if (t != type)
3960                                         type_err(c, "error: expected %1%r, found %2",
3961                                                  e->left, type, rules, t);
3962                         }
3963                 }
3964                 return type;
3965         }
3966
3967 ###### interp binode cases
3968         case Block:
3969                 while (rvtype == Tnone &&
3970                        b) {
3971                         if (b->left)
3972                                 rv = interp_exec(c, b->left, &rvtype);
3973                         b = cast(binode, b->right);
3974                 }
3975                 break;
3976
3977 ### The Print statement
3978
3979 `print` is a simple statement that takes a comma-separated list of
3980 expressions and prints the values separated by spaces and terminated
3981 by a newline.  No control of formatting is possible.
3982
3983 `print` uses `ExpressionList` to collect the expressions and stores them
3984 on the left side of a `Print` binode unlessthere is a trailing comma
3985 when the list is stored on the `right` side and no trailing newline is
3986 printed.
3987
3988 ###### Binode types
3989         Print,
3990
3991 ##### declare terminals
3992         $TERM print
3993
3994 ###### SimpleStatement Grammar
3995
3996         | print ExpressionList ${
3997                 $0 = b = new(binode);
3998                 b->op = Print;
3999                 b->right = NULL;
4000                 b->left = reorder_bilist($<EL);
4001         }$
4002         | print ExpressionList , ${ {
4003                 $0 = b = new(binode);
4004                 b->op = Print;
4005                 b->right = reorder_bilist($<EL);
4006                 b->left = NULL;
4007         } }$
4008         | print ${
4009                 $0 = b = new(binode);
4010                 b->op = Print;
4011                 b->left = NULL;
4012                 b->right = NULL;
4013         }$
4014
4015 ###### print binode cases
4016
4017         case Print:
4018                 do_indent(indent, "print");
4019                 if (b->right) {
4020                         print_exec(b->right, -1, bracket);
4021                         printf(",");
4022                 } else
4023                         print_exec(b->left, -1, bracket);
4024                 if (indent >= 0)
4025                         printf("\n");
4026                 break;
4027
4028 ###### propagate binode cases
4029
4030         case Print:
4031                 /* don't care but all must be consistent */
4032                 if (b->left)
4033                         b = cast(binode, b->left);
4034                 else
4035                         b = cast(binode, b->right);
4036                 while (b) {
4037                         propagate_types(b->left, c, ok, NULL, Rnolabel);
4038                         b = cast(binode, b->right);
4039                 }
4040                 break;
4041
4042 ###### interp binode cases
4043
4044         case Print:
4045         {
4046                 struct binode *b2 = cast(binode, b->left);
4047                 if (!b2)
4048                         b2 = cast(binode, b->right);
4049                 for (; b2; b2 = cast(binode, b2->right)) {
4050                         left = interp_exec(c, b2->left, &ltype);
4051                         print_value(ltype, &left, stdout);
4052                         free_value(ltype, &left);
4053                         if (b2->right)
4054                                 putchar(' ');
4055                 }
4056                 if (b->right == NULL)
4057                         printf("\n");
4058                 ltype = Tnone;
4059                 break;
4060         }
4061
4062 ###### Assignment statement
4063
4064 An assignment will assign a value to a variable, providing it hasn't
4065 been declared as a constant.  The analysis phase ensures that the type
4066 will be correct so the interpreter just needs to perform the
4067 calculation.  There is a form of assignment which declares a new
4068 variable as well as assigning a value.  If a name is assigned before
4069 it is declared, and error will be raised as the name is created as
4070 `Tlabel` and it is illegal to assign to such names.
4071
4072 ###### Binode types
4073         Assign,
4074         Declare,
4075
4076 ###### declare terminals
4077         $TERM =
4078
4079 ###### SimpleStatement Grammar
4080         | Term = Expression ${
4081                         $0 = b= new(binode);
4082                         b->op = Assign;
4083                         b->left = $<1;
4084                         b->right = $<3;
4085                 }$
4086         | VariableDecl = Expression ${
4087                         $0 = b= new(binode);
4088                         b->op = Declare;
4089                         b->left = $<1;
4090                         b->right =$<3;
4091                 }$
4092
4093         | VariableDecl ${
4094                         if ($1->var->where_set == NULL) {
4095                                 type_err(c,
4096                                          "Variable declared with no type or value: %v",
4097                                          $1, NULL, 0, NULL);
4098                                 free_var($1);
4099                         } else {
4100                                 $0 = b = new(binode);
4101                                 b->op = Declare;
4102                                 b->left = $<1;
4103                                 b->right = NULL;
4104                         }
4105                 }$
4106
4107 ###### print binode cases
4108
4109         case Assign:
4110                 do_indent(indent, "");
4111                 print_exec(b->left, indent, bracket);
4112                 printf(" = ");
4113                 print_exec(b->right, indent, bracket);
4114                 if (indent >= 0)
4115                         printf("\n");
4116                 break;
4117
4118         case Declare:
4119                 {
4120                 struct variable *v = cast(var, b->left)->var;
4121                 do_indent(indent, "");
4122                 print_exec(b->left, indent, bracket);
4123                 if (cast(var, b->left)->var->constant) {
4124                         printf("::");
4125                         if (v->explicit_type) {
4126                                 type_print(v->type, stdout);
4127                                 printf(" ");
4128                         }
4129                 } else {
4130                         printf(":");
4131                         if (v->explicit_type) {
4132                                 type_print(v->type, stdout);
4133                                 printf(" ");
4134                         }
4135                 }
4136                 if (b->right) {
4137                         printf("= ");
4138                         print_exec(b->right, indent, bracket);
4139                 }
4140                 if (indent >= 0)
4141                         printf("\n");
4142                 }
4143                 break;
4144
4145 ###### propagate binode cases
4146
4147         case Assign:
4148         case Declare:
4149                 /* Both must match and not be labels,
4150                  * Type must support 'dup',
4151                  * For Assign, left must not be constant.
4152                  * result is Tnone
4153                  */
4154                 t = propagate_types(b->left, c, ok, NULL,
4155                                     Rnolabel | (b->op == Assign ? Rnoconstant : 0));
4156                 if (!b->right)
4157                         return Tnone;
4158
4159                 if (t) {
4160                         if (propagate_types(b->right, c, ok, t, 0) != t)
4161                                 if (b->left->type == Xvar)
4162                                         type_err(c, "info: variable '%v' was set as %1 here.",
4163                                                  cast(var, b->left)->var->where_set, t, rules, NULL);
4164                 } else {
4165                         t = propagate_types(b->right, c, ok, NULL, Rnolabel);
4166                         if (t)
4167                                 propagate_types(b->left, c, ok, t,
4168                                                 (b->op == Assign ? Rnoconstant : 0));
4169                 }
4170                 if (t && t->dup == NULL && t->name.txt[0] != ' ') // HACK
4171                         type_err(c, "error: cannot assign value of type %1", b, t, 0, NULL);
4172                 return Tnone;
4173
4174                 break;
4175
4176 ###### interp binode cases
4177
4178         case Assign:
4179                 lleft = linterp_exec(c, b->left, &ltype);
4180                 if (lleft)
4181                         dinterp_exec(c, b->right, lleft, ltype, 1);
4182                 ltype = Tnone;
4183                 break;
4184
4185         case Declare:
4186         {
4187                 struct variable *v = cast(var, b->left)->var;
4188                 struct value *val;
4189                 v = v->merged;
4190                 val = var_value(c, v);
4191                 if (v->type->prepare_type)
4192                         v->type->prepare_type(c, v->type, 0);
4193                 if (b->right)
4194                         dinterp_exec(c, b->right, val, v->type, 0);
4195                 else
4196                         val_init(v->type, val);
4197                 break;
4198         }
4199
4200 ### The `use` statement
4201
4202 The `use` statement is the last "simple" statement.  It is needed when a
4203 statement block can return a value.  This includes the body of a
4204 function which has a return type, and the "condition" code blocks in
4205 `if`, `while`, and `switch` statements.
4206
4207 ###### Binode types
4208         Use,
4209
4210 ###### declare terminals
4211         $TERM use
4212
4213 ###### SimpleStatement Grammar
4214         | use Expression ${
4215                 $0 = b = new_pos(binode, $1);
4216                 b->op = Use;
4217                 b->right = $<2;
4218                 if (b->right->type == Xvar) {
4219                         struct var *v = cast(var, b->right);
4220                         if (v->var->type == Tnone) {
4221                                 /* Convert this to a label */
4222                                 struct value *val;
4223
4224                                 v->var->type = Tlabel;
4225                                 val = global_alloc(c, Tlabel, v->var, NULL);
4226                                 val->label = val;
4227                         }
4228                 }
4229         }$
4230
4231 ###### print binode cases
4232
4233         case Use:
4234                 do_indent(indent, "use ");
4235                 print_exec(b->right, -1, bracket);
4236                 if (indent >= 0)
4237                         printf("\n");
4238                 break;
4239
4240 ###### propagate binode cases
4241
4242         case Use:
4243                 /* result matches value */
4244                 return propagate_types(b->right, c, ok, type, 0);
4245
4246 ###### interp binode cases
4247
4248         case Use:
4249                 rv = interp_exec(c, b->right, &rvtype);
4250                 break;
4251
4252 ### The Conditional Statement
4253
4254 This is the biggy and currently the only complex statement.  This
4255 subsumes `if`, `while`, `do/while`, `switch`, and some parts of `for`.
4256 It is comprised of a number of parts, all of which are optional though
4257 set combinations apply.  Each part is (usually) a key word (`then` is
4258 sometimes optional) followed by either an expression or a code block,
4259 except the `casepart` which is a "key word and an expression" followed
4260 by a code block.  The code-block option is valid for all parts and,
4261 where an expression is also allowed, the code block can use the `use`
4262 statement to report a value.  If the code block does not report a value
4263 the effect is similar to reporting `True`.
4264
4265 The `else` and `case` parts, as well as `then` when combined with
4266 `if`, can contain a `use` statement which will apply to some
4267 containing conditional statement. `for` parts, `do` parts and `then`
4268 parts used with `for` can never contain a `use`, except in some
4269 subordinate conditional statement.
4270
4271 If there is a `forpart`, it is executed first, only once.
4272 If there is a `dopart`, then it is executed repeatedly providing
4273 always that the `condpart` or `cond`, if present, does not return a non-True
4274 value.  `condpart` can fail to return any value if it simply executes
4275 to completion.  This is treated the same as returning `True`.
4276
4277 If there is a `thenpart` it will be executed whenever the `condpart`
4278 or `cond` returns True (or does not return any value), but this will happen
4279 *after* `dopart` (when present).
4280
4281 If `elsepart` is present it will be executed at most once when the
4282 condition returns `False` or some value that isn't `True` and isn't
4283 matched by any `casepart`.  If there are any `casepart`s, they will be
4284 executed when the condition returns a matching value.
4285
4286 The particular sorts of values allowed in case parts has not yet been
4287 determined in the language design, so nothing is prohibited.
4288
4289 The various blocks in this complex statement potentially provide scope
4290 for variables as described earlier.  Each such block must include the
4291 "OpenScope" nonterminal before parsing the block, and must call
4292 `var_block_close()` when closing the block.
4293
4294 The code following "`if`", "`switch`" and "`for`" does not get its own
4295 scope, but is in a scope covering the whole statement, so names
4296 declared there cannot be redeclared elsewhere.  Similarly the
4297 condition following "`while`" is in a scope the covers the body
4298 ("`do`" part) of the loop, and which does not allow conditional scope
4299 extension.  Code following "`then`" (both looping and non-looping),
4300 "`else`" and "`case`" each get their own local scope.
4301
4302 The type requirements on the code block in a `whilepart` are quite
4303 unusal.  It is allowed to return a value of some identifiable type, in
4304 which case the loop aborts and an appropriate `casepart` is run, or it
4305 can return a Boolean, in which case the loop either continues to the
4306 `dopart` (on `True`) or aborts and runs the `elsepart` (on `False`).
4307 This is different both from the `ifpart` code block which is expected to
4308 return a Boolean, or the `switchpart` code block which is expected to
4309 return the same type as the casepart values.  The correct analysis of
4310 the type of the `whilepart` code block is the reason for the
4311 `Rboolok` flag which is passed to `propagate_types()`.
4312
4313 The `cond_statement` cannot fit into a `binode` so a new `exec` is
4314 defined.  As there are two scopes which cover multiple parts - one for
4315 the whole statement and one for "while" and "do" - and as we will use
4316 the 'struct exec' to track scopes, we actually need two new types of
4317 exec.  One is a `binode` for the looping part, the rest is the
4318 `cond_statement`.  The `cond_statement` will use an auxilliary `struct
4319 casepart` to track a list of case parts.
4320
4321 ###### Binode types
4322         Loop
4323
4324 ###### exec type
4325         Xcond_statement,
4326
4327 ###### ast
4328         struct casepart {
4329                 struct exec *value;
4330                 struct exec *action;
4331                 struct casepart *next;
4332         };
4333         struct cond_statement {
4334                 struct exec;
4335                 struct exec *forpart, *condpart, *thenpart, *elsepart;
4336                 struct binode *looppart;
4337                 struct casepart *casepart;
4338         };
4339
4340 ###### ast functions
4341
4342         static void free_casepart(struct casepart *cp)
4343         {
4344                 while (cp) {
4345                         struct casepart *t;
4346                         free_exec(cp->value);
4347                         free_exec(cp->action);
4348                         t = cp->next;
4349                         free(cp);
4350                         cp = t;
4351                 }
4352         }
4353
4354         static void free_cond_statement(struct cond_statement *s)
4355         {
4356                 if (!s)
4357                         return;
4358                 free_exec(s->forpart);
4359                 free_exec(s->condpart);
4360                 free_exec(s->looppart);
4361                 free_exec(s->thenpart);
4362                 free_exec(s->elsepart);
4363                 free_casepart(s->casepart);
4364                 free(s);
4365         }
4366
4367 ###### free exec cases
4368         case Xcond_statement: free_cond_statement(cast(cond_statement, e)); break;
4369
4370 ###### ComplexStatement Grammar
4371         | CondStatement ${ $0 = $<1; }$
4372
4373 ###### declare terminals
4374         $TERM for then while do
4375         $TERM else
4376         $TERM switch case
4377
4378 ###### Grammar
4379
4380         $*cond_statement
4381         // A CondStatement must end with EOL, as does CondSuffix and
4382         // IfSuffix.
4383         // ForPart, ThenPart, SwitchPart, CasePart are non-empty and
4384         // may or may not end with EOL
4385         // WhilePart and IfPart include an appropriate Suffix
4386
4387         // ForPart, SwitchPart, and IfPart open scopes, o we have to close
4388         // them.  WhilePart opens and closes its own scope.
4389         CondStatement -> ForPart OptNL ThenPart OptNL WhilePart CondSuffix ${
4390                         $0 = $<CS;
4391                         $0->forpart = $<FP;
4392                         $0->thenpart = $<TP;
4393                         $0->looppart = $<WP;
4394                         var_block_close(c, CloseSequential, $0);
4395                         }$
4396                 | ForPart OptNL WhilePart CondSuffix ${
4397                         $0 = $<CS;
4398                         $0->forpart = $<FP;
4399                         $0->looppart = $<WP;
4400                         var_block_close(c, CloseSequential, $0);
4401                         }$
4402                 | WhilePart CondSuffix ${
4403                         $0 = $<CS;
4404                         $0->looppart = $<WP;
4405                         }$
4406                 | SwitchPart OptNL CasePart CondSuffix ${
4407                         $0 = $<CS;
4408                         $0->condpart = $<SP;
4409                         $CP->next = $0->casepart;
4410                         $0->casepart = $<CP;
4411                         var_block_close(c, CloseSequential, $0);
4412                         }$
4413                 | SwitchPart : IN OptNL CasePart CondSuffix OUT Newlines ${
4414                         $0 = $<CS;
4415                         $0->condpart = $<SP;
4416                         $CP->next = $0->casepart;
4417                         $0->casepart = $<CP;
4418                         var_block_close(c, CloseSequential, $0);
4419                         }$
4420                 | IfPart IfSuffix ${
4421                         $0 = $<IS;
4422                         $0->condpart = $IP.condpart; $IP.condpart = NULL;
4423                         $0->thenpart = $IP.thenpart; $IP.thenpart = NULL;
4424                         // This is where we close an "if" statement
4425                         var_block_close(c, CloseSequential, $0);
4426                         }$
4427
4428         CondSuffix -> IfSuffix ${
4429                         $0 = $<1;
4430                 }$
4431                 | Newlines CasePart CondSuffix ${
4432                         $0 = $<CS;
4433                         $CP->next = $0->casepart;
4434                         $0->casepart = $<CP;
4435                 }$
4436                 | CasePart CondSuffix ${
4437                         $0 = $<CS;
4438                         $CP->next = $0->casepart;
4439                         $0->casepart = $<CP;
4440                 }$
4441
4442         IfSuffix -> Newlines ${ $0 = new(cond_statement); }$
4443                 | Newlines ElsePart ${ $0 = $<EP; }$
4444                 | ElsePart ${$0 = $<EP; }$
4445
4446         ElsePart -> else OpenBlock Newlines ${
4447                         $0 = new(cond_statement);
4448                         $0->elsepart = $<OB;
4449                         var_block_close(c, CloseElse, $0->elsepart);
4450                 }$
4451                 | else OpenScope CondStatement ${
4452                         $0 = new(cond_statement);
4453                         $0->elsepart = $<CS;
4454                         var_block_close(c, CloseElse, $0->elsepart);
4455                 }$
4456
4457         $*casepart
4458         CasePart -> case Expression OpenScope ColonBlock ${
4459                         $0 = calloc(1,sizeof(struct casepart));
4460                         $0->value = $<Ex;
4461                         $0->action = $<Bl;
4462                         var_block_close(c, CloseParallel, $0->action);
4463                 }$
4464
4465         $*exec
4466         // These scopes are closed in CondStatement
4467         ForPart -> for OpenBlock ${
4468                         $0 = $<Bl;
4469                 }$
4470
4471         ThenPart -> then OpenBlock ${
4472                         $0 = $<OB;
4473                         var_block_close(c, CloseSequential, $0);
4474                 }$
4475
4476         $*binode
4477         // This scope is closed in CondStatement
4478         WhilePart -> while UseBlock OptNL do OpenBlock ${
4479                         $0 = new(binode);
4480                         $0->op = Loop;
4481                         $0->left = $<UB;
4482                         $0->right = $<OB;
4483                         var_block_close(c, CloseSequential, $0->right);
4484                         var_block_close(c, CloseSequential, $0);
4485                 }$
4486                 | while OpenScope Expression OpenScope ColonBlock ${
4487                         $0 = new(binode);
4488                         $0->op = Loop;
4489                         $0->left = $<Exp;
4490                         $0->right = $<CB;
4491                         var_block_close(c, CloseSequential, $0->right);
4492                         var_block_close(c, CloseSequential, $0);
4493                 }$
4494
4495         $cond_statement
4496         IfPart -> if UseBlock OptNL then OpenBlock ${
4497                         $0.condpart = $<UB;
4498                         $0.thenpart = $<OB;
4499                         var_block_close(c, CloseParallel, $0.thenpart);
4500                 }$
4501                 | if OpenScope Expression OpenScope ColonBlock ${
4502                         $0.condpart = $<Ex;
4503                         $0.thenpart = $<CB;
4504                         var_block_close(c, CloseParallel, $0.thenpart);
4505                 }$
4506                 | if OpenScope Expression OpenScope OptNL then Block ${
4507                         $0.condpart = $<Ex;
4508                         $0.thenpart = $<Bl;
4509                         var_block_close(c, CloseParallel, $0.thenpart);
4510                 }$
4511
4512         $*exec
4513         // This scope is closed in CondStatement
4514         SwitchPart -> switch OpenScope Expression ${
4515                         $0 = $<Ex;
4516                 }$
4517                 | switch UseBlock ${
4518                         $0 = $<Bl;
4519                 }$
4520
4521 ###### print binode cases
4522         case Loop:
4523                 if (b->left && b->left->type == Xbinode &&
4524                     cast(binode, b->left)->op == Block) {
4525                         if (bracket)
4526                                 do_indent(indent, "while {\n");
4527                         else
4528                                 do_indent(indent, "while\n");
4529                         print_exec(b->left, indent+1, bracket);
4530                         if (bracket)
4531                                 do_indent(indent, "} do {\n");
4532                         else
4533                                 do_indent(indent, "do\n");
4534                         print_exec(b->right, indent+1, bracket);
4535                         if (bracket)
4536                                 do_indent(indent, "}\n");
4537                 } else {
4538                         do_indent(indent, "while ");
4539                         print_exec(b->left, 0, bracket);
4540                         if (bracket)
4541                                 printf(" {\n");
4542                         else
4543                                 printf(":\n");
4544                         print_exec(b->right, indent+1, bracket);
4545                         if (bracket)
4546                                 do_indent(indent, "}\n");
4547                 }
4548                 break;
4549
4550 ###### print exec cases
4551
4552         case Xcond_statement:
4553         {
4554                 struct cond_statement *cs = cast(cond_statement, e);
4555                 struct casepart *cp;
4556                 if (cs->forpart) {
4557                         do_indent(indent, "for");
4558                         if (bracket) printf(" {\n"); else printf("\n");
4559                         print_exec(cs->forpart, indent+1, bracket);
4560                         if (cs->thenpart) {
4561                                 if (bracket)
4562                                         do_indent(indent, "} then {\n");
4563                                 else
4564                                         do_indent(indent, "then\n");
4565                                 print_exec(cs->thenpart, indent+1, bracket);
4566                         }
4567                         if (bracket) do_indent(indent, "}\n");
4568                 }
4569                 if (cs->looppart) {
4570                         print_exec(cs->looppart, indent, bracket);
4571                 } else {
4572                         // a condition
4573                         if (cs->casepart)
4574                                 do_indent(indent, "switch");
4575                         else
4576                                 do_indent(indent, "if");
4577                         if (cs->condpart && cs->condpart->type == Xbinode &&
4578                             cast(binode, cs->condpart)->op == Block) {
4579                                 if (bracket)
4580                                         printf(" {\n");
4581                                 else
4582                                         printf("\n");
4583                                 print_exec(cs->condpart, indent+1, bracket);
4584                                 if (bracket)
4585                                         do_indent(indent, "}\n");
4586                                 if (cs->thenpart) {
4587                                         do_indent(indent, "then\n");
4588                                         print_exec(cs->thenpart, indent+1, bracket);
4589                                 }
4590                         } else {
4591                                 printf(" ");
4592                                 print_exec(cs->condpart, 0, bracket);
4593                                 if (cs->thenpart) {
4594                                         if (bracket)
4595                                                 printf(" {\n");
4596                                         else
4597                                                 printf(":\n");
4598                                         print_exec(cs->thenpart, indent+1, bracket);
4599                                         if (bracket)
4600                                                 do_indent(indent, "}\n");
4601                                 } else
4602                                         printf("\n");
4603                         }
4604                 }
4605                 for (cp = cs->casepart; cp; cp = cp->next) {
4606                         do_indent(indent, "case ");
4607                         print_exec(cp->value, -1, 0);
4608                         if (bracket)
4609                                 printf(" {\n");
4610                         else
4611                                 printf(":\n");
4612                         print_exec(cp->action, indent+1, bracket);
4613                         if (bracket)
4614                                 do_indent(indent, "}\n");
4615                 }
4616                 if (cs->elsepart) {
4617                         do_indent(indent, "else");
4618                         if (bracket)
4619                                 printf(" {\n");
4620                         else
4621                                 printf("\n");
4622                         print_exec(cs->elsepart, indent+1, bracket);
4623                         if (bracket)
4624                                 do_indent(indent, "}\n");
4625                 }
4626                 break;
4627         }
4628
4629 ###### propagate binode cases
4630         case Loop:
4631                 t = propagate_types(b->right, c, ok, Tnone, 0);
4632                 if (!type_compat(Tnone, t, 0))
4633                         *ok = 0;        // UNTESTED
4634                 return propagate_types(b->left, c, ok, type, rules);
4635
4636 ###### propagate exec cases
4637         case Xcond_statement:
4638         {
4639                 // forpart and looppart->right must return Tnone
4640                 // thenpart must return Tnone if there is a loopart,
4641                 // otherwise it is like elsepart.
4642                 // condpart must:
4643                 //    be bool if there is no casepart
4644                 //    match casepart->values if there is a switchpart
4645                 //    either be bool or match casepart->value if there
4646                 //             is a whilepart
4647                 // elsepart and casepart->action must match the return type
4648                 //   expected of this statement.
4649                 struct cond_statement *cs = cast(cond_statement, prog);
4650                 struct casepart *cp;
4651
4652                 t = propagate_types(cs->forpart, c, ok, Tnone, 0);
4653                 if (!type_compat(Tnone, t, 0))
4654                         *ok = 0;        // UNTESTED
4655
4656                 if (cs->looppart) {
4657                         t = propagate_types(cs->thenpart, c, ok, Tnone, 0);
4658                         if (!type_compat(Tnone, t, 0))
4659                                 *ok = 0;        // UNTESTED
4660                 }
4661                 if (cs->casepart == NULL) {
4662                         propagate_types(cs->condpart, c, ok, Tbool, 0);
4663                         propagate_types(cs->looppart, c, ok, Tbool, 0);
4664                 } else {
4665                         /* Condpart must match case values, with bool permitted */
4666                         t = NULL;
4667                         for (cp = cs->casepart;
4668                              cp && !t; cp = cp->next)
4669                                 t = propagate_types(cp->value, c, ok, NULL, 0);
4670                         if (!t && cs->condpart)
4671                                 t = propagate_types(cs->condpart, c, ok, NULL, Rboolok);        // UNTESTED
4672                         if (!t && cs->looppart)
4673                                 t = propagate_types(cs->looppart, c, ok, NULL, Rboolok);        // UNTESTED
4674                         // Now we have a type (I hope) push it down
4675                         if (t) {
4676                                 for (cp = cs->casepart; cp; cp = cp->next)
4677                                         propagate_types(cp->value, c, ok, t, 0);
4678                                 propagate_types(cs->condpart, c, ok, t, Rboolok);
4679                                 propagate_types(cs->looppart, c, ok, t, Rboolok);
4680                         }
4681                 }
4682                 // (if)then, else, and case parts must return expected type.
4683                 if (!cs->looppart && !type)
4684                         type = propagate_types(cs->thenpart, c, ok, NULL, rules);
4685                 if (!type)
4686                         type = propagate_types(cs->elsepart, c, ok, NULL, rules);
4687                 for (cp = cs->casepart;
4688                      cp && !type;
4689                      cp = cp->next)     // UNTESTED
4690                         type = propagate_types(cp->action, c, ok, NULL, rules); // UNTESTED
4691                 if (type) {
4692                         if (!cs->looppart)
4693                                 propagate_types(cs->thenpart, c, ok, type, rules);
4694                         propagate_types(cs->elsepart, c, ok, type, rules);
4695                         for (cp = cs->casepart; cp ; cp = cp->next)
4696                                 propagate_types(cp->action, c, ok, type, rules);
4697                         return type;
4698                 } else
4699                         return NULL;
4700         }
4701
4702 ###### interp binode cases
4703         case Loop:
4704                 // This just performs one iterration of the loop
4705                 rv = interp_exec(c, b->left, &rvtype);
4706                 if (rvtype == Tnone ||
4707                     (rvtype == Tbool && rv.bool != 0))
4708                         // rvtype is Tnone or Tbool, doesn't need to be freed
4709                         interp_exec(c, b->right, NULL);
4710                 break;
4711
4712 ###### interp exec cases
4713         case Xcond_statement:
4714         {
4715                 struct value v, cnd;
4716                 struct type *vtype, *cndtype;
4717                 struct casepart *cp;
4718                 struct cond_statement *cs = cast(cond_statement, e);
4719
4720                 if (cs->forpart)
4721                         interp_exec(c, cs->forpart, NULL);
4722                 if (cs->looppart) {
4723                         while ((cnd = interp_exec(c, cs->looppart, &cndtype)),
4724                                cndtype == Tnone || (cndtype == Tbool && cnd.bool != 0))
4725                                 interp_exec(c, cs->thenpart, NULL);
4726                 } else {
4727                         cnd = interp_exec(c, cs->condpart, &cndtype);
4728                         if ((cndtype == Tnone ||
4729                             (cndtype == Tbool && cnd.bool != 0))) {
4730                                 // cnd is Tnone or Tbool, doesn't need to be freed
4731                                 rv = interp_exec(c, cs->thenpart, &rvtype);
4732                                 // skip else (and cases)
4733                                 goto Xcond_done;
4734                         }
4735                 }
4736                 for (cp = cs->casepart; cp; cp = cp->next) {
4737                         v = interp_exec(c, cp->value, &vtype);
4738                         if (value_cmp(cndtype, vtype, &v, &cnd) == 0) {
4739                                 free_value(vtype, &v);
4740                                 free_value(cndtype, &cnd);
4741                                 rv = interp_exec(c, cp->action, &rvtype);
4742                                 goto Xcond_done;
4743                         }
4744                         free_value(vtype, &v);
4745                 }
4746                 free_value(cndtype, &cnd);
4747                 if (cs->elsepart)
4748                         rv = interp_exec(c, cs->elsepart, &rvtype);
4749                 else
4750                         rvtype = Tnone;
4751         Xcond_done:
4752                 break;
4753         }
4754
4755 ### Top level structure
4756
4757 All the language elements so far can be used in various places.  Now
4758 it is time to clarify what those places are.
4759
4760 At the top level of a file there will be a number of declarations.
4761 Many of the things that can be declared haven't been described yet,
4762 such as functions, procedures, imports, and probably more.
4763 For now there are two sorts of things that can appear at the top
4764 level.  They are predefined constants, `struct` types, and the `main`
4765 function.  While the syntax will allow the `main` function to appear
4766 multiple times, that will trigger an error if it is actually attempted.
4767
4768 The various declarations do not return anything.  They store the
4769 various declarations in the parse context.
4770
4771 ###### Parser: grammar
4772
4773         $void
4774         Ocean -> OptNL DeclarationList
4775
4776         ## declare terminals
4777
4778         OptNL ->
4779                 | OptNL NEWLINE
4780         Newlines -> NEWLINE
4781                 | Newlines NEWLINE
4782
4783         DeclarationList -> Declaration
4784                 | DeclarationList Declaration
4785
4786         Declaration -> ERROR Newlines ${
4787                         tok_err(c,      // UNTESTED
4788                                 "error: unhandled parse error", &$1);
4789                 }$
4790                 | DeclareConstant
4791                 | DeclareFunction
4792                 | DeclareStruct
4793
4794         ## top level grammar
4795
4796         ## Grammar
4797
4798 ### The `const` section
4799
4800 As well as being defined in with the code that uses them, constants
4801 can be declared at the top level.  These have full-file scope, so they
4802 are always `InScope`.  The value of a top level constant can be given
4803 as an expression, and this is evaluated immediately rather than in the
4804 later interpretation stage.  Once we add functions to the language, we
4805 will need rules concern which, if any, can be used to define a top
4806 level constant.
4807
4808 Constants are defined in a section that starts with the reserved word
4809 `const` and then has a block with a list of assignment statements.
4810 For syntactic consistency, these must use the double-colon syntax to
4811 make it clear that they are constants.  Type can also be given: if
4812 not, the type will be determined during analysis, as with other
4813 constants.
4814
4815 As the types constants are inserted at the head of a list, printing
4816 them in the same order that they were read is not straight forward.
4817 We take a quadratic approach here and count the number of constants
4818 (variables of depth 0), then count down from there, each time
4819 searching through for the Nth constant for decreasing N.
4820
4821 ###### top level grammar
4822
4823         $TERM const
4824
4825         DeclareConstant -> const { IN OptNL ConstList OUT OptNL } Newlines
4826                 | const { SimpleConstList } Newlines
4827                 | const IN OptNL ConstList OUT Newlines
4828                 | const SimpleConstList Newlines
4829
4830         ConstList -> ConstList SimpleConstLine
4831                 | SimpleConstLine
4832         SimpleConstList -> SimpleConstList ; Const
4833                 | Const
4834                 | SimpleConstList ;
4835         SimpleConstLine -> SimpleConstList Newlines
4836                 | ERROR Newlines ${ tok_err(c, "Syntax error in constant", &$1); }$
4837
4838         $*type
4839         CType -> Type   ${ $0 = $<1; }$
4840                 |       ${ $0 = NULL; }$
4841         $void
4842         Const -> IDENTIFIER :: CType = Expression ${ {
4843                 int ok;
4844                 struct variable *v;
4845
4846                 v = var_decl(c, $1.txt);
4847                 if (v) {
4848                         struct var *var = new_pos(var, $1);
4849                         v->where_decl = var;
4850                         v->where_set = var;
4851                         var->var = v;
4852                         v->constant = 1;
4853                         v->global = 1;
4854                 } else {
4855                         struct variable *vorig = var_ref(c, $1.txt);
4856                         tok_err(c, "error: name already declared", &$1);
4857                         type_err(c, "info: this is where '%v' was first declared",
4858                                  vorig->where_decl, NULL, 0, NULL);
4859                 }
4860                 do {
4861                         ok = 1;
4862                         propagate_types($5, c, &ok, $3, 0);
4863                 } while (ok == 2);
4864                 if (!ok)
4865                         c->parse_error = 1;
4866                 else if (v) {
4867                         struct value res = interp_exec(c, $5, &v->type);
4868                         global_alloc(c, v->type, v, &res);
4869                 }
4870         } }$
4871
4872 ###### print const decls
4873         {
4874                 struct variable *v;
4875                 int target = -1;
4876
4877                 while (target != 0) {
4878                         int i = 0;
4879                         for (v = context.in_scope; v; v=v->in_scope)
4880                                 if (v->depth == 0 && v->constant) {
4881                                         i += 1;
4882                                         if (i == target)
4883                                                 break;
4884                                 }
4885
4886                         if (target == -1) {
4887                                 if (i)
4888                                         printf("const\n");
4889                                 target = i;
4890                         } else {
4891                                 struct value *val = var_value(&context, v);
4892                                 printf("    %.*s :: ", v->name->name.len, v->name->name.txt);
4893                                 type_print(v->type, stdout);
4894                                 printf(" = ");
4895                                 if (v->type == Tstr)
4896                                         printf("\"");
4897                                 print_value(v->type, val, stdout);
4898                                 if (v->type == Tstr)
4899                                         printf("\"");
4900                                 printf("\n");
4901                                 target -= 1;
4902                         }
4903                 }
4904         }
4905
4906 ### Function declarations
4907
4908 The code in an Ocean program is all stored in function declarations.
4909 One of the functions must be named `main` and it must accept an array of
4910 strings as a parameter - the command line arguments.
4911
4912 As this is the top level, several things are handled a bit differently.
4913 The function is not interpreted by `interp_exec` as that isn't passed
4914 the argument list which the program requires.  Similarly type analysis
4915 is a bit more interesting at this level.
4916
4917 ###### ast functions
4918
4919         static struct type *handle_results(struct parse_context *c,
4920                                            struct binode *results)
4921         {
4922                 /* Create a 'struct' type from the results list, which
4923                  * is a list for 'struct var'
4924                  */
4925                 struct text result_type_name = { " function_result", 5 };
4926                 struct type *t = add_type(c, result_type_name, &structure_prototype);
4927                 int cnt = 0;
4928                 struct binode *b;
4929
4930                 for (b = results; b; b = cast(binode, b->right))
4931                         cnt += 1;
4932                 t->structure.nfields = cnt;
4933                 t->structure.fields = calloc(cnt, sizeof(struct field));
4934                 cnt = 0;
4935                 for (b = results; b; b = cast(binode, b->right)) {
4936                         struct var *v = cast(var, b->left);
4937                         struct field *f = &t->structure.fields[cnt++];
4938                         int a = v->var->type->align;
4939                         f->name = v->var->name->name;
4940                         f->type = v->var->type;
4941                         f->init = NULL;
4942                         f->offset = t->size;
4943                         v->var->frame_pos = f->offset;
4944                         t->size += ((f->type->size - 1) | (a-1)) + 1;
4945                         if (a > t->align)
4946                                 t->align = a;
4947                         variable_unlink_exec(v->var);
4948                 }
4949                 free_binode(results);
4950                 return t;
4951         }
4952
4953         static struct variable *declare_function(struct parse_context *c,
4954                                                 struct variable *name,
4955                                                 struct binode *args,
4956                                                 struct type *ret,
4957                                                 struct binode *results,
4958                                                 struct exec *code)
4959         {
4960                 struct text funcname = {" func", 5};
4961                 if (name) {
4962                         struct value fn = {.function = code};
4963                         struct type *t;
4964                         var_block_close(c, CloseFunction, code);
4965                         t = add_type(c, funcname, &function_prototype);
4966                         name->type = t;
4967                         t->function.params = reorder_bilist(args);
4968                         if (!ret) {
4969                                 ret = handle_results(c, reorder_bilist(results));
4970                                 t->function.inline_result = 1;
4971                                 t->function.local_size = ret->size;
4972                         }
4973                         t->function.return_type = ret;
4974                         global_alloc(c, t, name, &fn);
4975                         name->type->function.scope = c->out_scope;
4976                 } else {
4977                         free_binode(args);
4978                         free_type(ret);
4979                         free_exec(code);
4980                         var_block_close(c, CloseFunction, NULL);
4981                 }
4982                 c->out_scope = NULL;
4983                 return name;
4984         }
4985
4986 ###### declare terminals
4987         $TERM return
4988
4989 ###### top level grammar
4990
4991         $*variable
4992         DeclareFunction -> func FuncName ( OpenScope ArgsLine ) Block Newlines ${
4993                         $0 = declare_function(c, $<FN, $<Ar, Tnone, NULL, $<Bl);
4994                 }$
4995                 | func FuncName IN OpenScope Args OUT OptNL do Block Newlines ${
4996                         $0 = declare_function(c, $<FN, $<Ar, Tnone, NULL, $<Bl);
4997                 }$
4998                 | func FuncName NEWLINE OpenScope OptNL do Block Newlines ${
4999                         $0 = declare_function(c, $<FN, NULL, Tnone, NULL, $<Bl);
5000                 }$
5001                 | func FuncName ( OpenScope ArgsLine ) : Type Block Newlines ${
5002                         $0 = declare_function(c, $<FN, $<Ar, $<Ty, NULL, $<Bl);
5003                 }$
5004                 | func FuncName ( OpenScope ArgsLine ) : ( ArgsLine ) Block Newlines ${
5005                         $0 = declare_function(c, $<FN, $<AL, NULL, $<AL2, $<Bl);
5006                 }$
5007                 | func FuncName IN OpenScope Args OUT OptNL return Type Newlines do Block Newlines ${
5008                         $0 = declare_function(c, $<FN, $<Ar, $<Ty, NULL, $<Bl);
5009                 }$
5010                 | func FuncName NEWLINE OpenScope return Type Newlines do Block Newlines ${
5011                         $0 = declare_function(c, $<FN, NULL, $<Ty, NULL, $<Bl);
5012                 }$
5013                 | func FuncName IN OpenScope Args OUT OptNL return IN Args OUT OptNL do Block Newlines ${
5014                         $0 = declare_function(c, $<FN, $<Ar, NULL, $<Ar2, $<Bl);
5015                 }$
5016                 | func FuncName NEWLINE OpenScope return IN Args OUT OptNL do Block Newlines ${
5017                         $0 = declare_function(c, $<FN, NULL, NULL, $<Ar, $<Bl);
5018                 }$
5019
5020 ###### print func decls
5021         {
5022                 struct variable *v;
5023                 int target = -1;
5024
5025                 while (target != 0) {
5026                         int i = 0;
5027                         for (v = context.in_scope; v; v=v->in_scope)
5028                                 if (v->depth == 0 && v->type && v->type->check_args) {
5029                                         i += 1;
5030                                         if (i == target)
5031                                                 break;
5032                                 }
5033
5034                         if (target == -1) {
5035                                 target = i;
5036                         } else {
5037                                 struct value *val = var_value(&context, v);
5038                                 printf("func %.*s", v->name->name.len, v->name->name.txt);
5039                                 v->type->print_type_decl(v->type, stdout);
5040                                 if (brackets)
5041                                         print_exec(val->function, 0, brackets);
5042                                 else
5043                                         print_value(v->type, val, stdout);
5044                                 printf("/* frame size %d */\n", v->type->function.local_size);
5045                                 target -= 1;
5046                         }
5047                 }
5048         }
5049
5050 ###### core functions
5051
5052         static int analyse_funcs(struct parse_context *c)
5053         {
5054                 struct variable *v;
5055                 int all_ok = 1;
5056                 for (v = c->in_scope; v; v = v->in_scope) {
5057                         struct value *val;
5058                         struct type *ret;
5059                         int ok = 1;
5060                         if (v->depth != 0 || !v->type || !v->type->check_args)
5061                                 continue;
5062                         ret = v->type->function.inline_result ?
5063                                 Tnone : v->type->function.return_type;
5064                         val = var_value(c, v);
5065                         do {
5066                                 ok = 1;
5067                                 propagate_types(val->function, c, &ok, ret, 0);
5068                         } while (ok == 2);
5069                         if (ok)
5070                                 /* Make sure everything is still consistent */
5071                                 propagate_types(val->function, c, &ok, ret, 0);
5072                         if (!ok)
5073                                 all_ok = 0;
5074                         if (!v->type->function.inline_result &&
5075                             !v->type->function.return_type->dup) {
5076                                 type_err(c, "error: function cannot return value of type %1",
5077                                          v->where_decl, v->type->function.return_type, 0, NULL);
5078                         }
5079
5080                         scope_finalize(c, v->type);
5081                 }
5082                 return all_ok;
5083         }
5084
5085         static int analyse_main(struct type *type, struct parse_context *c)
5086         {
5087                 struct binode *bp = type->function.params;
5088                 struct binode *b;
5089                 int ok = 1;
5090                 int arg = 0;
5091                 struct type *argv_type;
5092                 struct text argv_type_name = { " argv", 5 };
5093
5094                 argv_type = add_type(c, argv_type_name, &array_prototype);
5095                 argv_type->array.member = Tstr;
5096                 argv_type->array.unspec = 1;
5097
5098                 for (b = bp; b; b = cast(binode, b->right)) {
5099                         ok = 1;
5100                         switch (arg++) {
5101                         case 0: /* argv */
5102                                 propagate_types(b->left, c, &ok, argv_type, 0);
5103                                 break;
5104                         default: /* invalid */  // NOTEST
5105                                 propagate_types(b->left, c, &ok, Tnone, 0);     // NOTEST
5106                         }
5107                         if (!ok)
5108                                 c->parse_error = 1;
5109                 }
5110
5111                 return !c->parse_error;
5112         }
5113
5114         static void interp_main(struct parse_context *c, int argc, char **argv)
5115         {
5116                 struct value *progp = NULL;
5117                 struct text main_name = { "main", 4 };
5118                 struct variable *mainv;
5119                 struct binode *al;
5120                 int anum = 0;
5121                 struct value v;
5122                 struct type *vtype;
5123
5124                 mainv = var_ref(c, main_name);
5125                 if (mainv)
5126                         progp = var_value(c, mainv);
5127                 if (!progp || !progp->function) {
5128                         fprintf(stderr, "oceani: no main function found.\n");
5129                         c->parse_error = 1;
5130                         return;
5131                 }
5132                 if (!analyse_main(mainv->type, c)) {
5133                         fprintf(stderr, "oceani: main has wrong type.\n");
5134                         c->parse_error = 1;
5135                         return;
5136                 }
5137                 al = mainv->type->function.params;
5138
5139                 c->local_size = mainv->type->function.local_size;
5140                 c->local = calloc(1, c->local_size);
5141                 while (al) {
5142                         struct var *v = cast(var, al->left);
5143                         struct value *vl = var_value(c, v->var);
5144                         struct value arg;
5145                         struct type *t;
5146                         mpq_t argcq;
5147                         int i;
5148
5149                         switch (anum++) {
5150                         case 0: /* argv */
5151                                 t = v->var->type;
5152                                 mpq_init(argcq);
5153                                 mpq_set_ui(argcq, argc, 1);
5154                                 memcpy(var_value(c, t->array.vsize), &argcq, sizeof(argcq));
5155                                 t->prepare_type(c, t, 0);
5156                                 array_init(v->var->type, vl);
5157                                 for (i = 0; i < argc; i++) {
5158                                         struct value *vl2 = vl->array + i * v->var->type->array.member->size;
5159
5160                                         arg.str.txt = argv[i];
5161                                         arg.str.len = strlen(argv[i]);
5162                                         free_value(Tstr, vl2);
5163                                         dup_value(Tstr, &arg, vl2);
5164                                 }
5165                                 break;
5166                         }
5167                         al = cast(binode, al->right);
5168                 }
5169                 v = interp_exec(c, progp->function, &vtype);
5170                 free_value(vtype, &v);
5171                 free(c->local);
5172                 c->local = NULL;
5173         }
5174
5175 ###### ast functions
5176         void free_variable(struct variable *v)
5177         {
5178         }
5179
5180 ## And now to test it out.
5181
5182 Having a language requires having a "hello world" program.  I'll
5183 provide a little more than that: a program that prints "Hello world"
5184 finds the GCD of two numbers, prints the first few elements of
5185 Fibonacci, performs a binary search for a number, and a few other
5186 things which will likely grow as the languages grows.
5187
5188 ###### File: oceani.mk
5189         demos :: sayhello
5190         sayhello : oceani
5191                 @echo "===== DEMO ====="
5192                 ./oceani --section "demo: hello" oceani.mdc 55 33
5193
5194 ###### demo: hello
5195
5196         const
5197                 pi ::= 3.141_592_6
5198                 four ::= 2 + 2 ; five ::= 10/2
5199         const pie ::= "I like Pie";
5200                 cake ::= "The cake is"
5201                   ++ " a lie"
5202
5203         struct fred
5204                 size:[four]number
5205                 name:string
5206                 alive:Boolean
5207
5208         func main(argv:[argc::]string)
5209                 print "Hello World, what lovely oceans you have!"
5210                 print "Are there", five, "?"
5211                 print pi, pie, "but", cake
5212
5213                 A := $argv[1]; B := $argv[2]
5214
5215                 /* When a variable is defined in both branches of an 'if',
5216                  * and used afterwards, the variables are merged.
5217                  */
5218                 if A > B:
5219                         bigger := "yes"
5220                 else
5221                         bigger := "no"
5222                 print "Is", A, "bigger than", B,"? ", bigger
5223                 /* If a variable is not used after the 'if', no
5224                  * merge happens, so types can be different
5225                  */
5226                 if A > B * 2:
5227                         double:string = "yes"
5228                         print A, "is more than twice", B, "?", double
5229                 else
5230                         double := B*2
5231                         print "double", B, "is", double
5232
5233                 a : number
5234                 a = A;
5235                 b:number = B
5236                 if a > 0 and then b > 0:
5237                         while a != b:
5238                                 if a < b:
5239                                         b = b - a
5240                                 else
5241                                         a = a - b
5242                         print "GCD of", A, "and", B,"is", a
5243                 else if a <= 0:
5244                         print a, "is not positive, cannot calculate GCD"
5245                 else
5246                         print b, "is not positive, cannot calculate GCD"
5247
5248                 for
5249                         togo := 10
5250                         f1 := 1; f2 := 1
5251                         print "Fibonacci:", f1,f2,
5252                 then togo = togo - 1
5253                 while togo > 0:
5254                         f3 := f1 + f2
5255                         print "", f3,
5256                         f1 = f2
5257                         f2 = f3
5258                 print ""
5259
5260                 /* Binary search... */
5261                 for
5262                         lo:= 0; hi := 100
5263                         target := 77
5264                 while
5265                         mid := (lo + hi) / 2
5266                         if mid == target:
5267                                 use Found
5268                         if mid < target:
5269                                 lo = mid
5270                         else
5271                                 hi = mid
5272                         if hi - lo < 1:
5273                                 lo = mid
5274                                 use GiveUp
5275                         use True
5276                 do pass
5277                 case Found:
5278                         print "Yay, I found", target
5279                 case GiveUp:
5280                         print "Closest I found was", lo
5281
5282                 size::= 10
5283                 list:[size]number
5284                 list[0] = 1234
5285                 // "middle square" PRNG.  Not particularly good, but one my
5286                 // Dad taught me - the first one I ever heard of.
5287                 for i:=1; then i = i + 1; while i < size:
5288                         n := list[i-1] * list[i-1]
5289                         list[i] = (n / 100) % 10 000
5290
5291                 print "Before sort:",
5292                 for i:=0; then i = i + 1; while i < size:
5293                         print "", list[i],
5294                 print
5295
5296                 for i := 1; then i=i+1; while i < size:
5297                         for j:=i-1; then j=j-1; while j >= 0:
5298                                 if list[j] > list[j+1]:
5299                                         t:= list[j]
5300                                         list[j] = list[j+1]
5301                                         list[j+1] = t
5302                 print " After sort:",
5303                 for i:=0; then i = i + 1; while i < size:
5304                         print "", list[i],
5305                 print
5306
5307                 if 1 == 2 then print "yes"; else print "no"
5308
5309                 bob:fred
5310                 bob.name = "Hello"
5311                 bob.alive = (bob.name == "Hello")
5312                 print "bob", "is" if  bob.alive else "isn't", "alive"