ocean-lang.org Git - ocean/blob - csrc/oceani.mdc

   1 # Ocean Interpreter - Stoney Creek version
   2
   3 Ocean is intended to be an compiled language, so this interpreter is
   4 not targeted at being the final product.  It is, rather, an intermediate
   5 stage, and fills that role in two distinct ways.
   6
   7 Firstly, it exists as a platform to experiment with the early language
   8 design.  An interpreter is easy to write and easy to get working, so
   9 the barrier for entry is lower if I aim to start with an interpreter.
  10
  11 Secondly, the plan for the Ocean compiler is to write it in the
  12 [Ocean language](http://ocean-lang.org).  To achieve this we naturally
  13 need some sort of boot-strap process and this interpreter - written in
  14 portable C - will fill that role.  It will be used to bootstrap the
  15 Ocean compiler.
  16
  17 Two features that are not needed to fill either of these roles are
  18 performance and completeness.  The interpreter only needs to be fast
  19 enough to run small test programs and occasionally to run the compiler
  20 on itself.  It only needs to be complete enough to test aspects of the
  21 design which are developed before the compiler is working, and to run
  22 the compiler on itself.  Any features not used by the compiler when
  23 compiling itself are superfluous.  They may be included anyway, but
  24 they may not.
  25
  26 Nonetheless, the interpreter should end up being reasonably complete,
  27 and any performance bottlenecks which appear and are easily fixed, will
  28 be.
  29
  30 ## Current version
  31
  32 This second version of the interpreter exists to test out the
  33 structured statement providing conditions and iteration, and simple
  34 variable scoping.  Clearly we need some minimal other functionality so
  35 that values can be tested and instructions iterated over.  All that
  36 functionality is clearly not normative at this stage (not that
  37 anything is **really** normative yet) and will change, so early test
  38 code will certainly break in later versions.
  39
  40 The under-test parts of the language are:
  41
  42  - conditional/looping structured statements
  43  - the `use` statement which is needed for that
  44  - Variable binding using ":=" and "::=", and assignment using "=".
  45
  46 Elements which are present to make a usable language are:
  47
  48  - "blocks" of multiple statements.
  49  - `pass`: a statement which does nothing.
  50  - expressions: `+`, `-`, `*`, `/` can apply to numbers and `++` can
  51    catenate strings.  `and`, `or`, `not` manipulate Booleans, and
  52    normal comparison operators can work on all three types.
  53  - `print`: will print the values in a list of expressions.
  54  - `program`: is given a list of identifiers to initialize from
  55    arguments.
  56
  57 ## Naming
  58
  59 Versions of the interpreter which obviously do not support a complete
  60 language will be named after creeks and streams.  This one is Stoney
  61 Creek.
  62
  63 Once we have something reasonably resembling a complete language, the
  64 names of rivers will be used.
  65 Early versions of the compiler will be named after seas.  Major
  66 releases of the compiler will be named after oceans.  Hopefully I will
  67 be finished once I get to the Pacific Ocean release.
  68
  69 ## Outline
  70
  71 As well as parsing and executing a program, the interpreter can print
  72 out the program from the parsed internal structure.  This is useful
  73 for validating the parsing.
  74 So the main requirements of the interpreter are:
  75
  76 - Parse the program, possible with tracing
  77 - Analyse the parsed program to ensure consistency
  78 - print the program
  79 - execute the program
  80
  81 This is all performed by a single C program extracted with
  82 `parsergen`.
  83
  84 There will be two formats for printing the program: a default and one
  85 that uses bracketing.  So a `--bracket` command line option is needed
  86 for that.  Normally the first code section found is used, however an
  87 alternate section can be requested so that a file (such as this one)
  88 can contain multiple programs This is effected with the `--section`
  89 option.
  90
  91 ###### File: oceani.mk
  92
  93         myCFLAGS := -Wall -g -fplan9-extensions
  94         CFLAGS := $(filter-out $(myCFLAGS),$(CFLAGS)) $(myCFLAGS)
  95         myLDLIBS:= libparser.o libscanner.o libmdcode.o -licuuc
  96         LDLIBS := $(filter-out $(myLDLIBS),$(LDLIBS)) $(myLDLIBS)
  97         ## libs
  98         all :: $(LDLIBS) oceani
  99         oceani.c oceani.h : oceani.mdc parsergen
 100                 ./parsergen -o oceani --LALR --tag Parser oceani.mdc
 101         oceani.mk: oceani.mdc md2c
 102                 ./md2c oceani.mdc
 103
 104         oceani: oceani.o $(LDLIBS)
 105                 $(CC) $(CFLAGS) -o oceani oceani.o $(LDLIBS)
 106
 107 ###### Parser: header
 108         ## macros
 109         ## ast
 110         struct parse_context {
 111                 struct token_config config;
 112                 ## parse context
 113         };
 114
 115 ###### macros
 116
 117         #define container_of(ptr, type, member) ({                      \
 118                 const typeof( ((type *)0)->member ) *__mptr = (ptr);    \
 119                 (type *)( (char *)__mptr - offsetof(type,member) );})
 120
 121         #define config2context(_conf) container_of(_conf, struct parse_context, \
 122                 config)
 123
 124 ###### Parser: code
 125
 126         #include <unistd.h>
 127         #include <stdlib.h>
 128         #include <fcntl.h>
 129         #include <errno.h>
 130         #include <sys/mman.h>
 131         #include <string.h>
 132         #include <stdio.h>
 133         #include <locale.h>
 134         #include <malloc.h>
 135         #include "mdcode.h"
 136         #include "scanner.h"
 137         #include "parser.h"
 138
 139         ## includes
 140
 141         #include "oceani.h"
 142
 143         ## forward decls
 144         ## value functions
 145         ## ast functions
 146         ## core functions
 147
 148         #include <getopt.h>
 149         static char Usage[] = "Usage: oceani --trace --print --noexec --brackets"
 150                               "--section=SectionName prog.ocn\n";
 151         static const struct option long_options[] = {
 152                 {"trace",     0, NULL, 't'},
 153                 {"print",     0, NULL, 'p'},
 154                 {"noexec",    0, NULL, 'n'},
 155                 {"brackets",  0, NULL, 'b'},
 156                 {"section",   1, NULL, 's'},
 157                 {NULL,        0, NULL, 0},
 158         };
 159         const char *options = "tpnbs";
 160         int main(int argc, char *argv[])
 161         {
 162                 int fd;
 163                 int len;
 164                 char *file;
 165                 struct section *s;
 166                 char *section = NULL;
 167                 struct parse_context context = {
 168                         .config = {
 169                                 .ignored = (1 << TK_line_comment)
 170                                          | (1 << TK_block_comment),
 171                                 .number_chars = ".,_+-",
 172                                 .word_start = "_",
 173                                 .word_cont = "_",
 174                         },
 175                 };
 176                 int doprint=0, dotrace=0, doexec=1, brackets=0;
 177                 struct exec **prog;
 178                 int opt;
 179                 while ((opt = getopt_long(argc, argv, options, long_options, NULL))
 180                        != -1) {
 181                         switch(opt) {
 182                         case 't': dotrace=1; break;
 183                         case 'p': doprint=1; break;
 184                         case 'n': doexec=0; break;
 185                         case 'b': brackets=1; break;
 186                         case 's': section = optarg; break;
 187                         default: fprintf(stderr, Usage);
 188                                 exit(1);
 189                         }
 190                 }
 191                 if (optind >= argc) {
 192                         fprintf(stderr, "oceani: no input file given\n");
 193                         exit(1);
 194                 }
 195                 fd = open(argv[optind], O_RDONLY);
 196                 if (fd < 0) {
 197                         fprintf(stderr, "oceani: cannot open %s\n", argv[optind]);
 198                         exit(1);
 199                 }
 200                 len = lseek(fd, 0, 2);
 201                 file = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
 202                 s = code_extract(file, file+len, NULL);
 203                 if (!s) {
 204                         fprintf(stderr, "oceani: could not find any code in %s\n",
 205                                 argv[optind]);
 206                         exit(1);
 207                 }
 208                 if (section) {
 209                         struct section *ss;
 210                         for (ss = s; ss; ss = ss->next) {
 211                                 struct text sec = ss->section;
 212                                 if (sec.len == strlen(section) &&
 213                                     strncmp(sec.txt, section, sec.len) == 0)
 214                                         break;
 215                         }
 216                         if (ss)
 217                                 prog = parse_oceani(ss->code, &context.config,
 218                                                     dotrace ? stderr : NULL);
 219                         else {
 220                                 fprintf(stderr, "oceani: cannot find section %s\n",
 221                                         section);
 222                                 exit(1);
 223                         }
 224                 } else
 225                         prog = parse_oceani(s->code, &context.config,
 226                                     dotrace ? stderr : NULL);
 227                 if (prog && doprint)
 228                         print_exec(*prog, 0, brackets);
 229                 if (prog && doexec) {
 230                         if (!analyse_prog(*prog, &context)) {
 231                                 fprintf(stderr, "oceani: type error in program\n");
 232                                 exit(1);
 233                         }
 234                         interp_prog(*prog, argv+optind+1);
 235                 }
 236                 if (prog) {
 237                         free_exec(*prog);
 238                         free(prog);
 239                 }
 240                 while (s) {
 241                         struct section *t = s->next;
 242                         code_free(s->code);
 243                         free(s);
 244                         s = t;
 245                 }
 246                 ## free context
 247                 exit(0);
 248         }
 249
 250 ### Analysis
 251
 252 These four requirements of parse, analyse, print, interpret apply to
 253 each language element individually so that is how most of the code
 254 will be structured.
 255
 256 Three of the four are fairly self explanatory.  The one that requires
 257 a little explanation is the analysis step.
 258
 259 The current language design does not require (or even allow) the types
 260 of variables to be declared, but they must still have a single type.
 261 Different operations impose different requirements on the variables,
 262 for example addition requires both arguments to be numeric, and
 263 assignment requires the variable on the left to have the same type as
 264 the expression on the right.
 265
 266 Analysis involves propagating these type requirements around and
 267 consequently setting the type of each variable.  If any requirements
 268 are violated (e.g. a string is compared with a number) or if a
 269 variable needs to have two different types, then an error is raised
 270 and the program will not run.
 271
 272 If the same variable is declared in both branchs of an 'if/else', or
 273 in all cases of a 'switch' then the multiple instances may be merged
 274 into just one variable if the variable is references after the
 275 conditional statement.  When this happens, the types must naturally be
 276 consistent across all the branches.  When the variable is not used
 277 outside the if, the variables in the different branches are distinct
 278 and can be of different types.
 279
 280 Determining the types of all variables early is important for
 281 processing command line arguments.  These can be assigned to any type
 282 of variable, but we must first know the correct type so any required
 283 conversion can happen.  If a variable is associated with a command
 284 line argument but no type can be interpreted (e.g. the variable is
 285 only ever used in a `print` statement), then the type is set to
 286 'string'.
 287
 288 Undeclared names may only appear in "use" statements and "case" expressions.
 289 These names are given a type of "label" and a unique value.
 290 This allows them to fill the role of a name in an enumerated type, which
 291 is useful for testing the `switch` statement.
 292
 293 As we will see, the condition part of a `while` statement can return
 294 either a Boolean or some other type.  This requires that the expect
 295 type that gets passed around comprises a type (`enum vtype`) and a
 296 flag to indicate that `Vbool` is also permitted.
 297
 298 As there are, as yet, no distinct types that are compatible, there
 299 isn't much subtlety in the analysis.  When we hav distinct number
 300 types, this will become more interesting.
 301
 302 ## Data Structures
 303
 304 One last introductory step before detailing the language elements and
 305 providing their four requirements is to establish the data structures
 306 to store these elements.
 307
 308 There are two key objects that we need to work with: executable
 309 elements which comprise the program, and values which the program
 310 works with.  Between these are the variables in their various scopes
 311 which hold the values.
 312
 313 ### Values
 314
 315 Values can be numbers, which we represent as multi-precision
 316 fractions, strings, Booleans and labels.  When analysing the program
 317 we also need to allow for places where no value is meaningful
 318 (`Vnone`) and where we don't know what type to expect yet (`Vunknown`
 319 which can be anything and `Vnolabel` which can be anything except a
 320 label).  A 2 character 'tail' is included in each value as the scanner
 321 wants to parse that from the end of numbers and we need somewhere to
 322 put it.  It is currently ignored but one day might allow for
 323 e.g. "imaginary" numbers.
 324
 325 Values are never shared, they are always copied when used, and freed
 326 when no longer needed.
 327
 328 When propagating type information around the program, we need to
 329 determine if two types are compatible, where `Vunknown` is compatible
 330 which anything, and `Vnolabel` is compatible with anything except a
 331 label.  A separate funtion to encode this rule will simplify some code
 332 later.
 333
 334 When assigning command line arguments to variable, we need to be able
 335 to parse each type from a string.
 336
 337 ###### includes
 338         #include <gmp.h>
 339         #include "string.h"
 340         #include "number.h"
 341
 342 ###### libs
 343         myLDLIBS := libnumber.o libstring.o -lgmp
 344         LDLIBS := $(filter-out $(myLDLIBS),$(LDLIBS)) $(myLDLIBS)
 345
 346 ###### ast
 347         struct value {
 348                 enum vtype {Vnolabel, Vunknown, Vnone, Vstr, Vnum, Vbool, Vlabel} vtype;
 349                 union {
 350                         struct text str;
 351                         mpq_t num;
 352                         int bool;
 353                         void *label;
 354                 };
 355                 char tail[2];
 356         };
 357
 358 ###### ast functions
 359         static void free_value(struct value v)
 360         {
 361                 switch (v.vtype) {
 362                 case Vnone:
 363                 case Vnolabel:
 364                 case Vunknown: break;
 365                 case Vstr: free(v.str.txt); break;
 366                 case Vnum: mpq_clear(v.num); break;
 367                 case Vlabel:
 368                 case Vbool: break;
 369                 }
 370         }
 371
 372         static int vtype_compat(enum vtype require, enum vtype have, int bool_permitted)
 373         {
 374                 if (bool_permitted && have == Vbool)
 375                         return 1;
 376                 switch (require) {
 377                 case Vnolabel:
 378                         return have != Vlabel;
 379                 case Vunknown:
 380                         return 1;
 381                 default:
 382                         return have == Vunknown || require == have;
 383                 }
 384         }
 385
 386 ###### value functions
 387
 388         static void val_init(struct value *val, enum vtype type)
 389         {
 390                 val->vtype = type;
 391                 switch(type) {
 392                 case Vnone:abort();
 393                 case Vnolabel:
 394                 case Vunknown: break;
 395                 case Vnum:
 396                         mpq_init(val->num); break;
 397                 case Vstr:
 398                         val->str.txt = malloc(1);
 399                         val->str.len = 0;
 400                         break;
 401                 case Vbool:
 402                         val->bool = 0;
 403                         break;
 404                 case Vlabel:
 405                         val->label = val;
 406                         break;
 407                 }
 408         }
 409
 410         static struct value dup_value(struct value v)
 411         {
 412                 struct value rv;
 413                 rv.vtype = v.vtype;
 414                 switch (rv.vtype) {
 415                 case Vnone:
 416                 case Vnolabel:
 417                 case Vunknown: break;
 418                 case Vlabel:
 419                         rv.label = v.label;
 420                         break;
 421                 case Vbool:
 422                         rv.bool = v.bool;
 423                         break;
 424                 case Vnum:
 425                         mpq_init(rv.num);
 426                         mpq_set(rv.num, v.num);
 427                         break;
 428                 case Vstr:
 429                         rv.str.len = v.str.len;
 430                         rv.str.txt = malloc(rv.str.len);
 431                         memcpy(rv.str.txt, v.str.txt, v.str.len);
 432                         break;
 433                 }
 434                 return rv;
 435         }
 436
 437         static int value_cmp(struct value left, struct value right)
 438         {
 439                 int cmp;
 440                 if (left.vtype != right.vtype)
 441                         return left.vtype - right.vtype;
 442                 switch (left.vtype) {
 443                 case Vlabel: cmp = left.label == right.label ? 0 : 1; break;
 444                 case Vnum: cmp = mpq_cmp(left.num, right.num); break;
 445                 case Vstr: cmp = text_cmp(left.str, right.str); break;
 446                 case Vbool: cmp = left.bool - right.bool; break;
 447                 case Vnone:
 448                 case Vnolabel:
 449                 case Vunknown: cmp = 0;
 450                 }
 451                 return cmp;
 452         }
 453
 454         static struct text text_join(struct text a, struct text b)
 455         {
 456                 struct text rv;
 457                 rv.len = a.len + b.len;
 458                 rv.txt = malloc(rv.len);
 459                 memcpy(rv.txt, a.txt, a.len);
 460                 memcpy(rv.txt+a.len, b.txt, b.len);
 461                 return rv;
 462         }
 463
 464         static void print_value(struct value v)
 465         {
 466                 switch (v.vtype) {
 467                 case Vunknown:
 468                         printf("*Unknown*"); break;
 469                 case Vnone:
 470                 case Vnolabel:
 471                         printf("*no-value*"); break;
 472                 case Vlabel:
 473                         printf("*label-%p*", v.label); break;
 474                 case Vstr:
 475                         printf("%.*s", v.str.len, v.str.txt); break;
 476                 case Vbool:
 477                         printf("%s", v.bool ? "True":"False"); break;
 478                 case Vnum:
 479                         {
 480                         mpf_t fl;
 481                         mpf_init2(fl, 20);
 482                         mpf_set_q(fl, v.num);
 483                         gmp_printf("%Fg", fl);
 484                         mpf_clear(fl);
 485                         break;
 486                         }
 487                 }
 488         }
 489
 490         static int parse_value(struct value *vl, char *arg)
 491         {
 492                 struct text tx;
 493                 int neg = 0;
 494                 switch(vl->vtype) {
 495                 case Vnolabel:
 496                 case Vlabel:
 497                 case Vunknown:
 498                 case Vnone:
 499                         return 0;
 500                 case Vstr:
 501                         vl->str.len = strlen(arg);
 502                         vl->str.txt = malloc(vl->str.len);
 503                         memcpy(vl->str.txt, arg, vl->str.len);
 504                         break;
 505                 case Vnum:
 506                         if (*arg == '-') {
 507                                 neg = 1;
 508                                 arg++;
 509                         }
 510                         tx.txt = arg; tx.len = strlen(tx.txt);
 511                         if (number_parse(vl->num, vl->tail, tx) == 0)
 512                                 mpq_init(vl->num);
 513                         else if (neg)
 514                                 mpq_neg(vl->num, vl->num);
 515                         break;
 516                 case Vbool:
 517                         if (strcasecmp(arg, "true") == 0 ||
 518                             strcmp(arg, "1") == 0)
 519                                 vl->bool = 1;
 520                         else if (strcasecmp(arg, "false") == 0 ||
 521                                  strcmp(arg, "0") == 0)
 522                                 vl->bool = 0;
 523                         else {
 524                                 printf("Bad bool: %s\n", arg);
 525                                 return 0;
 526                         }
 527                         break;
 528                 }
 529                 return 1;
 530         }
 531
 532 ### Variables
 533
 534 Variables are scoped named values.  We store the names in a linked
 535 list of "bindings" sorted lexically, and use sequential search and
 536 insertion sort.
 537
 538 ###### ast
 539
 540         struct binding {
 541                 struct text name;
 542                 struct binding *next;   // in lexical order
 543                 ## binding fields
 544         };
 545
 546 This linked list is stored in the parse context so that "reduce"
 547 functions can find or add variables, and so the analysis phase can
 548 ensure that every variable gets a type.
 549
 550 ###### parse context
 551
 552         struct binding *varlist;  // In lexical order
 553
 554 ###### ast functions
 555
 556         static struct binding *find_binding(struct parse_context *c, struct text s)
 557         {
 558                 struct binding **l = &c->varlist;
 559                 struct binding *n;
 560                 int cmp = 1;
 561
 562                 while (*l &&
 563                         (cmp = text_cmp((*l)->name, s)) < 0)
 564                                 l = & (*l)->next;
 565                 if (cmp == 0)
 566                         return *l;
 567                 n = calloc(1, sizeof(*n));
 568                 n->name = s;
 569                 n->next = *l;
 570                 *l = n;
 571                 return n;
 572         }
 573
 574 Each name can be linked to multiple variables defined in different
 575 scopes.  Each scope starts where the name is declared and continues
 576 until the end of the containing code block.  Scopes of a given name
 577 cannot nest, so a declaration while a name is in-scope is an error.
 578
 579 ###### binding fields
 580         struct variable *var;
 581
 582 ###### ast
 583         struct variable {
 584                 struct variable *previous;
 585                 struct value val;
 586                 struct binding *name;
 587                 ## variable fields
 588         };
 589
 590 While the naming seems strange, we include local constants in the
 591 definition of variables.  A name declared `var := value` can
 592 subsequently be changed, but a name declared `var ::= value` cannot -
 593 it is constant
 594
 595 ###### variable fields
 596         int constant;
 597
 598 Scopes in parallel branches can be partially merged.  More
 599 specifically, if a given name is declared in both branches of an
 600 if/else then it's scope is a candidate for merging.  Similarly if
 601 every branch of an exhaustive switch (e.g. has an "else" clause)
 602 declares a given name, then the scopes from the branches are
 603 candidates for merging.
 604
 605 Note that names declared inside a loop (which is only parallel to
 606 itself) are never visible after the loop.  Similarly names defined in
 607 scopes which are not parallel, such as those started by `for` and
 608 `switch`, are never visible after the scope.  Only variable defined in
 609 both `then` and `else` (including the implicit then after an `if`, and
 610 excluding `then` used with `for`) and in all `case`s and `else` of a
 611 `switch` or `while` can be visible beyond the `if`/`switch`/`while`.
 612
 613 Labels, which are a bit like variables, follow different rules.
 614 Labels are not explicitly declared, but if an undeclared name appears
 615 in a context where a label is legal, that effectively declares the
 616 name as a label.  The declaration remains in force (or in scope) at
 617 least to the end of the immediately containing block and conditionally
 618 in any larger containing block which does not declare the name in some
 619 other way.  Importantly, the conditional scope extension happens even
 620 if the label is only used in parallel branch of a conditional -- when
 621 used in one branch it is treated as having been declared in all
 622 branches.
 623
 624 Merge candidates are tentatively visible beyond the end of the
 625 branching statement which creates them.  If the name is used, the
 626 merge is affirmed and they become a single variable visible at the
 627 outer layer.  If not - if it is redeclared first - the merge lapses.
 628
 629 To track scopes we have an extra stack, implemented as a linked list,
 630 which roughly parallels the parse stack and which is used exclusively
 631 for scoping.  When a new scope is opened, a new frame is pushed and
 632 the child-count of the parent frame is incremented.  This child-count
 633 is used to distinguish between the first of a set of parallel scopes,
 634 in which declared variables must not be in scope, and subsequent
 635 branches, whether they must already be conditionally scoped.
 636
 637 To push a new frame *before* any code in the frame is parsed, we need a
 638 grammar reduction.  This is most easily achieved with a grammar
 639 element which derives the empty string, and created the new scope when
 640 it is recognized.  This can be placed, for example, between a keyword
 641 like "if" and the code following it.
 642
 643 ###### ast
 644         struct scope {
 645                 struct scope *parent;
 646                 int child_count;
 647         };
 648
 649 ###### parse context
 650         int scope_depth;
 651         struct scope *scope_stack;
 652
 653 ###### ast functions
 654         static void scope_pop(struct parse_context *c)
 655         {
 656                 struct scope *s = c->scope_stack;
 657
 658                 c->scope_stack = s->parent;
 659                 free(s);
 660                 c->scope_depth -= 1;
 661         }
 662
 663         static void scope_push(struct parse_context *c)
 664         {
 665                 struct scope *s = calloc(1, sizeof(*s));
 666                 if (c->scope_stack)
 667                         c->scope_stack->child_count += 1;
 668                 s->parent = c->scope_stack;
 669                 c->scope_stack = s;
 670                 c->scope_depth += 1;
 671         }
 672
 673 ###### Grammar
 674
 675         $void
 676         OpenScope -> ${ scope_push(config2context(config)); }$
 677
 678
 679 Each variable records a scope depth and is in one of four states:
 680
 681 - "in scope".  This is the case between the declaration of the
 682   variable and the end of the containing block, and also between
 683   the usage with affirms a merge and the end of the block.
 684
 685   The scope depth is not greater than the current parse context scope
 686   nest depth.  When the block of that depth closes, the state will
 687   change.  To achieve this, all "in scope" variables are linked
 688   together as a stack in nesting order.
 689
 690 - "pending".  The "in scope" block has closed, but other parallel
 691   scopes are still being processed.  So far, every parallel block at
 692   the same level that has closed has declared the name.
 693
 694   The scope depth is the depth of the last parallel block that
 695   enclosed the declaration, and that has closed.
 696
 697 - "conditionally in scope".  The "in scope" block and all parallel
 698   scopes have closed, and no further mention of the name has been
 699   seen.  This state includes a secondary nest depth which records the
 700   outermost scope seen since the variable became conditionally in
 701   scope.  If a use of the name is found, the variable becomes "in
 702   scope" and that secondary depth becomes the recorded scope depth.
 703   If the name is declared as a new variable, the old variable becomes
 704   "out of scope" and the recorded scope depth stays unchanged.
 705
 706 - "out of scope".  The variable is neither in scope nor conditionally
 707   in scope.  It is permanently out of scope now and can be removed from
 708   the "in scope" stack.
 709
 710
 711 ###### variable fields
 712         int depth, min_depth;
 713         enum { OutScope, PendingScope, CondScope, InScope } scope;
 714         struct variable *in_scope;
 715
 716 ###### parse context
 717
 718         struct variable *in_scope;
 719
 720 All variables with the same name are linked together using the
 721 'previous' link.  Those variable that have
 722 been affirmatively merged all have a 'merged' pointer that points to
 723 one primary variable - the most recently declared instance. When
 724 merging variables, we need to also adjust the 'merged' pointer on any
 725 other variables that had previously been merged with the one that will
 726 no longer be primary.
 727
 728 ###### variable fields
 729         struct variable *merged;
 730
 731 ###### ast functions
 732
 733         static void variable_merge(struct variable *primary, struct variable *secondary)
 734         {
 735                 struct variable *v;
 736
 737                 if (primary->merged)
 738                         // shouldn't happen
 739                         primary = primary->merged;
 740
 741                 for (v = primary->previous; v; v=v->previous)
 742                         if (v == secondary || v == secondary->merged ||
 743                             v->merged == secondary ||
 744                             (v->merged && v->merged == secondary->merged)) {
 745                                 v->scope = OutScope;
 746                                 v->merged = primary;
 747                         }
 748         }
 749
 750 ###### free context
 751
 752         while (context.varlist) {
 753                 struct binding *b = context.varlist;
 754                 struct variable *v = b->var;
 755                 context.varlist = b->next;
 756                 free(b);
 757                 while (v) {
 758                         struct variable *t = v;
 759
 760                         v = t->previous;
 761                         free_value(t->val);
 762                         free(t);
 763                 }
 764         }
 765
 766 #### Manipulating Bindings
 767
 768 When a name is conditionally visible, a new declaration discards the
 769 old binding - the condition lapses.  Conversely a usage of the name
 770 affirms the visibility and extends it to the end of the containing
 771 block - i.e. the block that contains both the original declaration and
 772 the latest usage.  This is determined from `min_depth`.  When a
 773 conditionally visible variable gets affirmed like this, it is also
 774 merged with other conditionally visible variables with the same name.
 775
 776 When we parse a variable declaration we either signal an error if the
 777 name is currently bound, or create a new variable at the current nest
 778 depth if the name is unbound or bound to a conditionally scoped or
 779 pending-scope variable.  If the previous variable was conditionally
 780 scoped, it and its homonyms becomes out-of-scope.
 781
 782 When we parse a variable reference (including non-declarative
 783 assignment) we signal an error if the name is not bound or is bound to
 784 a pending-scope variable; update the scope if the name is bound to a
 785 conditionally scoped variable; or just proceed normally if the named
 786 variable is in scope.
 787
 788 When we exit a scope, any variables bound at this level are either
 789 marked out of scope or pending-scoped, depending on whether the
 790 scope was sequential or parallel.
 791
 792 When exiting a parallel scope we check if there are any variables that
 793 were previously pending and are still visible. If there are, then
 794 there weren't redeclared in the most recent scope, so they cannot be
 795 merged and must become out-of-scope.  If it is not the first of
 796 parallel scopes (based on `child_count`), we check that there was a
 797 previous binding that is still pending-scope.  If there isn't, the new
 798 variable must now be out-of-scope.
 799
 800 When exiting a sequential scope that immediately enclosed parallel
 801 scopes, we need to resolve any pending-scope variables.  If there was
 802 no `else` clause, and we cannot determine that the `switch` was exhaustive,
 803 we need to mark all pending-scope variable as out-of-scope.  Otherwise
 804 all pending-scope variables become conditionally scoped.
 805
 806 ###### ast
 807         enum closetype { CloseSequential, CloseParallel, CloseElse };
 808
 809 ###### ast functions
 810
 811         static struct variable *var_decl(struct parse_context *c, struct text s)
 812         {
 813                 struct binding *b = find_binding(c, s);
 814                 struct variable *v = b->var;
 815
 816                 switch (v ? v->scope : OutScope) {
 817                 case InScope:
 818                         /* Signal error ... once I build error signalling support */
 819                         return NULL;
 820                 case CondScope:
 821                         for (;
 822                              v && v->scope == CondScope;
 823                              v = v->previous)
 824                                 v->scope = OutScope;
 825                         break;
 826                 default: break;
 827                 }
 828                 v = calloc(1, sizeof(*v));
 829                 v->previous = b->var;
 830                 b->var = v;
 831                 v->name = b;
 832                 v->min_depth = v->depth = c->scope_depth;
 833                 v->scope = InScope;
 834                 v->in_scope = c->in_scope;
 835                 c->in_scope = v;
 836                 val_init(&v->val, Vunknown);
 837                 return v;
 838         }
 839
 840         static struct variable *var_ref(struct parse_context *c, struct text s)
 841         {
 842                 struct binding *b = find_binding(c, s);
 843                 struct variable *v = b->var;
 844                 struct variable *v2;
 845
 846                 switch (v ? v->scope : OutScope) {
 847                 case OutScope:
 848                 case PendingScope:
 849                         /* Signal an error - once that is possible */
 850                         return NULL;
 851                 case CondScope:
 852                         /* All CondScope variables of this name need to be merged
 853                          * and become InScope
 854                          */
 855                         v->depth = v->min_depth;
 856                         v->scope = InScope;
 857                         for (v2 = v->previous;
 858                              v2 && v2->scope == CondScope;
 859                              v2 = v2->previous)
 860                                 variable_merge(v, v2);
 861                         break;
 862                 case InScope:
 863                         break;
 864                 }
 865                 return v;
 866         }
 867
 868         static void var_block_close(struct parse_context *c, enum closetype ct)
 869         {
 870                 /* close of all variables that are in_scope */
 871                 struct variable *v, **vp, *v2;
 872
 873                 scope_pop(c);
 874                 for (vp = &c->in_scope;
 875                      v = *vp, v && v->depth > c->scope_depth && v->min_depth > c->scope_depth;
 876                      ) {
 877                         switch (ct) {
 878                         case CloseElse:
 879                         case CloseParallel: /* handle PendingScope */
 880                                 switch(v->scope) {
 881                                 case InScope:
 882                                 case CondScope:
 883                                         if (c->scope_stack->child_count == 1)
 884                                                 v->scope = PendingScope;
 885                                         else if (v->previous &&
 886                                                  v->previous->scope == PendingScope)
 887                                                 v->scope = PendingScope;
 888                                         else if (v->val.vtype == Vlabel)
 889                                                 v->scope = PendingScope;
 890                                         else if (v->name->var == v)
 891                                                 v->scope = OutScope;
 892                                         if (ct == CloseElse) {
 893                                                 /* All Pending variables with this name
 894                                                  * are now Conditional */
 895                                                 for (v2 = v;
 896                                                      v2 && v2->scope == PendingScope;
 897                                                      v2 = v2->previous)
 898                                                         v2->scope = CondScope;
 899                                         }
 900                                         break;
 901                                 case PendingScope:
 902                                         for (v2 = v;
 903                                              v2 && v2->scope == PendingScope;
 904                                              v2 = v2->previous)
 905                                                 if (v2->val.vtype != Vlabel)
 906                                                         v2->scope = OutScope;
 907                                         break;
 908                                 case OutScope: break;
 909                                 }
 910                                 break;
 911                         case CloseSequential:
 912                                 if (v->val.vtype == Vlabel)
 913                                         v->scope = PendingScope;
 914                                 switch (v->scope) {
 915                                 case InScope:
 916                                         v->scope = OutScope;
 917                                         break;
 918                                 case PendingScope:
 919                                         /* There was no 'else', so we can only become
 920                                          * conditional if we know the cases were exhaustive,
 921                                          * and that doesn't mean anything yet.
 922                                          * So only labels become conditional..
 923                                          */
 924                                         for (v2 = v;
 925                                              v2 && v2->scope == PendingScope;
 926                                              v2 = v2->previous)
 927                                                 if (v2->val.vtype == Vlabel) {
 928                                                         v2->scope = CondScope;
 929                                                         v2->min_depth = c->scope_depth;
 930                                                 } else
 931                                                         v2->scope = OutScope;
 932                                         break;
 933                                 case CondScope:
 934                                 case OutScope: break;
 935                                 }
 936                                 break;
 937                         }
 938                         if (v->scope == OutScope)
 939                                 *vp = v->in_scope;
 940                         else
 941                                 vp = &v->in_scope;
 942                 }
 943         }
 944
 945 ### Executables
 946
 947 Executables can be lots of different things.  In many cases an
 948 executable is just an operation combined with one or two other
 949 executables.  This allows for expressions and lists etc.  Other times
 950 an executable is something quite specific like a constant or variable
 951 name.  So we define a `struct exec` to be a general executable with a
 952 type, and a `struct binode` which is a subclass of `exec` and forms a
 953 node in a binary tree and holding an operation. There will be other
 954 subclasses, and to access these we need to be able to `cast` the
 955 `exec` into the various other types.
 956
 957 ###### macros
 958         #define cast(structname, pointer) ({            \
 959                 const typeof( ((struct structname *)0)->type) *__mptr = &(pointer)->type; \
 960                 if (__mptr && *__mptr != X##structname) abort();                \
 961                 (struct structname *)( (char *)__mptr);})
 962
 963         #define new(structname) ({                      \
 964                 struct structname *__ptr = ((struct structname *)calloc(1,sizeof(struct structname))); \
 965                 __ptr->type = X##structname;            \
 966                 __ptr;})
 967
 968 ###### ast
 969         enum exec_types {
 970                 Xbinode,
 971                 ## exec type
 972         };
 973         struct exec {
 974                 enum exec_types type;
 975         };
 976         struct binode {
 977                 struct exec;
 978                 enum Btype {
 979                         ## Binode types
 980                 } op;
 981                 struct exec *left, *right;
 982         };
 983
 984 Each different type of `exec` node needs a number of functions
 985 defined, a bit like methods.  We must be able to be able to free it,
 986 print it, analyse it and execute it.  Once we have specific `exec`
 987 types we will need to parse them too.  Let's take this a bit more
 988 slowly.
 989
 990 #### Freeing
 991
 992 The parser generator requires a `free_foo` function for each struct
 993 that stores attributes and they will be `exec`s of subtypes there-of.
 994 So we need `free_exec` which can handle all the subtypes, and we need
 995 `free_binode`.
 996
 997 ###### ast functions
 998
 999         static void free_binode(struct binode *b)
1000         {
1001                 if (!b)
1002                         return;
1003                 free_exec(b->left);
1004                 free_exec(b->right);
1005                 free(b);
1006         }
1007
1008 ###### core functions
1009         static void free_exec(struct exec *e)
1010         {
1011                 if (!e)
1012                         return;
1013                 switch(e->type) {
1014                         ## free exec cases
1015                 }
1016         }
1017
1018 ###### forward decls
1019
1020         static void free_exec(struct exec *e);
1021
1022 ###### free exec cases
1023         case Xbinode: free_binode(cast(binode, e)); break;
1024
1025 #### Printing
1026
1027 Printing an `exec` requires that we know the current indent level for
1028 printing line-oriented components.  As will become clear later, we
1029 also want to know what sort of bracketing to use.
1030
1031 ###### ast functions
1032
1033         static void do_indent(int i, char *str)
1034         {
1035                 while (i--)
1036                         printf("    ");
1037                 printf("%s", str);
1038         }
1039
1040 ###### core functions
1041         static void print_binode(struct binode *b, int indent, int bracket)
1042         {
1043                 struct binode *b2;
1044                 switch(b->op) {
1045                 ## print binode cases
1046                 }
1047         }
1048
1049         static void print_exec(struct exec *e, int indent, int bracket)
1050         {
1051                 switch (e->type) {
1052                 case Xbinode:
1053                         print_binode(cast(binode, e), indent, bracket); break;
1054                 ## print exec cases
1055                 }
1056         }
1057
1058 ###### forward decls
1059
1060         static void print_exec(struct exec *e, int indent, int bracket);
1061
1062 #### Analysing
1063
1064 As discussed, analysis involves propagating type requirements around
1065 the program and looking for errors.
1066
1067 So `propagate_types` is passed an expected type (being a `vtype`
1068 together with a `bool_permitted` flag) that the `exec` is expected to
1069 return, and returns the type that it does return, either of which can
1070 be `Vunknown`.  An `ok` flag is passed by reference. It is set to `0`
1071 when an error is found, and `2` when any change is made.  If it
1072 remains unchanged at `1`, then no more propagation is needed.
1073
1074 ###### core functions
1075
1076         static enum vtype propagate_types(struct exec *prog, struct parse_context *c, int *ok,
1077                                           enum vtype type, int bool_permitted)
1078         {
1079                 enum vtype t;
1080
1081                 if (!prog)
1082                         return Vnone;
1083
1084                 switch (prog->type) {
1085                 case Xbinode:
1086                 {
1087                         struct binode *b = cast(binode, prog);
1088                         switch (b->op) {
1089                         ## propagate binode cases
1090                         }
1091                         break;
1092                 }
1093                 ## propagate exec cases
1094                 }
1095                 return Vnone;
1096         }
1097
1098 #### Interpreting
1099
1100 Interpreting an `exec` doesn't require anything but the `exec`.  State
1101 is stored in variables and each variable will be directly linked from
1102 within the `exec` tree.  The exception to this is the whole `program`
1103 which needs to look at command line arguments.  The `program` will be
1104 interpreted separately.
1105
1106 Each `exec` can return a value, which may be `Vnone` but shouldn't be `Vunknown`.
1107
1108 ###### core functions
1109
1110         static struct value interp_exec(struct exec *e)
1111         {
1112                 struct value rv;
1113                 rv.vtype = Vnone;
1114                 if (!e)
1115                         return rv;
1116
1117                 switch(e->type) {
1118                 case Xbinode:
1119                 {
1120                         struct binode *b = cast(binode, e);
1121                         struct value left, right;
1122                         left.vtype = right.vtype = Vnone;
1123                         switch (b->op) {
1124                         ## interp binode cases
1125                         }
1126                         free_value(left); free_value(right);
1127                         break;
1128                 }
1129                 ## interp exec cases
1130                 }
1131                 return rv;
1132         }
1133
1134 ## Language elements
1135
1136 Each language element needs to be parsed, printed, analysed,
1137 interpreted, and freed.  There are several, so let's just start with
1138 the easy ones and work our way up.
1139
1140 ### Values
1141
1142 We have already met values as separate objects.  When manifest
1143 constants appear in the program text that must result in an executable
1144 which has a constant value.  So the `val` structure embeds a value in
1145 an executable.
1146
1147 ###### exec type
1148         Xval,
1149
1150 ###### ast
1151         struct val {
1152                 struct exec;
1153                 struct value val;
1154         };
1155
1156 ###### Grammar
1157
1158         $*val
1159         Value ->  True ${
1160                         $0 = new(val);
1161                         $0->val.vtype = Vbool;
1162                         $0->val.bool = 1;
1163                         }$
1164                 | False ${
1165                         $0 = new(val);
1166                         $0->val.vtype = Vbool;
1167                         $0->val.bool = 0;
1168                         }$
1169                 | NUMBER ${
1170                         $0 = new(val);
1171                         $0->val.vtype = Vnum;
1172                         if (number_parse($0->val.num, $0->val.tail, $1.txt) == 0)
1173                                 mpq_init($0->val.num);
1174                         }$
1175                 | STRING ${
1176                         $0 = new(val);
1177                         $0->val.vtype = Vstr;
1178                         string_parse(&$1, '\\', &$0->val.str, $0->val.tail);
1179                         }$
1180                 | MULTI_STRING ${
1181                         $0 = new(val);
1182                         $0->val.vtype = Vstr;
1183                         string_parse(&$1, '\\', &$0->val.str, $0->val.tail);
1184                         }$
1185
1186 ###### print exec cases
1187         case Xval:
1188         {
1189                 struct val *v = cast(val, e);
1190                 if (v->val.vtype == Vstr)
1191                         printf("\"");
1192                 print_value(v->val);
1193                 if (v->val.vtype == Vstr)
1194                         printf("\"");
1195                 break;
1196         }
1197
1198 ###### propagate exec cases
1199                 case Xval:
1200                 {
1201                         struct val *val = cast(val, prog);
1202                         if (!vtype_compat(type, val->val.vtype, bool_permitted))
1203                                 *ok = 0;
1204                         return val->val.vtype;
1205                 }
1206
1207 ###### interp exec cases
1208         case Xval:
1209                 return dup_value(cast(val, e)->val);
1210
1211 ###### ast functions
1212         static void free_val(struct val *v)
1213         {
1214                 if (!v)
1215                         return;
1216                 free_value(v->val);
1217                 free(v);
1218         }
1219
1220 ###### free exec cases
1221         case Xval: free_val(cast(val, e)); break;
1222
1223 ###### ast functions
1224         // Move all nodes from 'b' to 'rv', reversing the order.
1225         // In 'b' 'left' is a list, and 'right' is the last node.
1226         // In 'rv', left' is the first node and 'right' is a list.
1227         static struct binode *reorder_bilist(struct binode *b)
1228         {
1229                 struct binode *rv = NULL;
1230
1231                 while (b) {
1232                         struct exec *t = b->right;
1233                         b->right = rv;
1234                         rv = b;
1235                         if (b->left)
1236                                 b = cast(binode, b->left);
1237                         else
1238                                 b = NULL;
1239                         rv->left = t;
1240                 }
1241                 return rv;
1242         }
1243
1244 ### Variables
1245
1246 Just as we used as `val` to wrap a value into an `exec`, we similarly
1247 need a `var` to wrap a `variable` into an exec.  While each `val`
1248 contained a copy of the value, each `var` hold a link to the variable
1249 because it really is the same variable no matter where it appears.
1250 When a variable is used, we need to remember to follow the `->merged`
1251 link to find the primary instance.
1252
1253 ###### exec type
1254         Xvar,
1255
1256 ###### ast
1257         struct var {
1258                 struct exec;
1259                 struct variable *var;
1260         };
1261
1262 ###### Grammar
1263
1264         $*var
1265         VariableDecl -> IDENTIFIER := ${ {
1266                 struct variable *v = var_decl(config2context(config), $1.txt);
1267                 $0 = new(var);
1268                 $0->var = v;
1269         } }$
1270             | IDENTIFIER ::= ${ {
1271                 struct variable *v = var_decl(config2context(config), $1.txt);
1272                 v->constant = 1;
1273                 $0 = new(var);
1274                 $0->var = v;
1275         } }$
1276
1277         Variable -> IDENTIFIER ${ {
1278                 struct variable *v = var_ref(config2context(config), $1.txt);
1279                 if (v == NULL) {
1280                         /* This might be a label - allocate a var just in case */
1281                         v = var_decl(config2context(config), $1.txt);
1282                         if (v)
1283                                 val_init(&v->val, Vlabel);
1284                 }
1285                 $0 = new(var);
1286                 $0->var = v;
1287         } }$
1288
1289 ###### print exec cases
1290         case Xvar:
1291         {
1292                 struct var *v = cast(var, e);
1293                 if (v->var) {
1294                         struct binding *b = v->var->name;
1295                         printf("%.*s", b->name.len, b->name.txt);
1296                 }
1297                 break;
1298         }
1299
1300 ###### propagate exec cases
1301
1302         case Xvar:
1303         {
1304                 struct var *var = cast(var, prog);
1305                 struct variable *v = var->var;
1306                 if (!v) {
1307                         *ok = 0;
1308                         return Vnone;
1309                 }
1310                 if (v->merged)
1311                         v = v->merged;
1312                 if (v->val.vtype == Vunknown) {
1313                         if (type > Vunknown && *ok != 0) {
1314                                 val_init(&v->val, type);
1315                                 *ok = 2;
1316                         }
1317                         return type;
1318                 }
1319                 if (!vtype_compat(type, v->val.vtype, bool_permitted))
1320                         *ok = 0;
1321                 if (type <= Vunknown)
1322                         return v->val.vtype;
1323                 return type;
1324         }
1325
1326 ###### interp exec cases
1327         case Xvar:
1328         {
1329                 struct var *var = cast(var, e);
1330                 struct variable *v = var->var;
1331
1332                 if (v->merged)
1333                         v = v->merged;
1334                 return dup_value(v->val);
1335         }
1336
1337 ###### ast functions
1338
1339         static void free_var(struct var *v)
1340         {
1341                 free(v);
1342         }
1343
1344 ###### free exec cases
1345         case Xvar: free_var(cast(var, e)); break;
1346
1347 ### Expressions: Boolean
1348
1349 Our first user of the `binode` will be expressions, and particularly
1350 Boolean expressions.  As I haven't implemented precedence in the
1351 parser generator yet, we need different names from each precedence
1352 level used by expressions.  The outer most or lowest level precedence
1353 are Boolean `or` `and`, and `not` which form an `Expression` out of `BTerm`s
1354 and `BFact`s.
1355
1356 ###### Binode types
1357         And,
1358         Or,
1359         Not,
1360
1361 ####### Grammar
1362
1363         $*binode
1364         Expression -> Expression or BTerm ${
1365                         $0 = new(binode);
1366                         $0->op = Or;
1367                         $0->left = $<1;
1368                         $0->right = $<3;
1369                 }$
1370                 | BTerm ${ $0 = $<1; }$
1371
1372         BTerm -> BTerm and BFact ${
1373                         $0 = new(binode);
1374                         $0->op = And;
1375                         $0->left = $<1;
1376                         $0->right = $<3;
1377                 }$
1378                 | BFact ${ $0 = $<1; }$
1379
1380         BFact -> not BFact ${
1381                         $0 = new(binode);
1382                         $0->op = Not;
1383                         $0->right = $<2;
1384                         }$
1385                 ## other BFact
1386
1387 ###### print binode cases
1388         case And:
1389                 print_exec(b->left, -1, 0);
1390                 printf(" and ");
1391                 print_exec(b->right, -1, 0);
1392                 break;
1393         case Or:
1394                 print_exec(b->left, -1, 0);
1395                 printf(" or ");
1396                 print_exec(b->right, -1, 0);
1397                 break;
1398         case Not:
1399                 printf("not ");
1400                 print_exec(b->right, -1, 0);
1401                 break;
1402
1403 ###### propagate binode cases
1404         case And:
1405         case Or:
1406         case Not:
1407                 /* both must be Vbool, result is Vbool */
1408                 propagate_types(b->left, c, ok, Vbool, 0);
1409                 propagate_types(b->right, c, ok, Vbool, 0);
1410                 if (type != Vbool && type > Vunknown)
1411                         *ok = 0;
1412                 return Vbool;
1413
1414 ###### interp binode cases
1415         case And:
1416                 rv = interp_exec(b->left);
1417                 right = interp_exec(b->right);
1418                 rv.bool = rv.bool && right.bool;
1419                 break;
1420         case Or:
1421                 rv = interp_exec(b->left);
1422                 right = interp_exec(b->right);
1423                 rv.bool = rv.bool || right.bool;
1424                 break;
1425         case Not:
1426                 rv = interp_exec(b->right);
1427                 rv.bool = !rv.bool;
1428                 break;
1429
1430 ### Expressions: Comparison
1431
1432 Of slightly higher precedence that Boolean expressions are
1433 Comparisons.
1434 A comparison takes arguments of any type, but the two types must be
1435 the same.
1436
1437 To simplify the parsing we introduce an `eop` which can return an
1438 expression operator.
1439
1440 ###### ast
1441         struct eop {
1442                 enum Btype op;
1443         };
1444
1445 ###### ast functions
1446         static void free_eop(struct eop *e)
1447         {
1448                 if (e)
1449                         free(e);
1450         }
1451
1452 ###### Binode types
1453         Less,
1454         Gtr,
1455         LessEq,
1456         GtrEq,
1457         Eql,
1458         NEql,
1459
1460 ###### other BFact
1461         | Expr CMPop Expr ${
1462                         $0 = new(binode);
1463                         $0->op = $2.op;
1464                         $0->left = $<1;
1465                         $0->right = $<3;
1466                 }$
1467         | Expr ${ $0 = $<1; }$
1468
1469 ###### Grammar
1470
1471         $eop
1472         CMPop ->   < ${ $0.op = Less; }$
1473                 |  > ${ $0.op = Gtr; }$
1474                 |  <= ${ $0.op = LessEq; }$
1475                 |  >= ${ $0.op = GtrEq; }$
1476                 |  == ${ $0.op = Eql; }$
1477                 |  != ${ $0.op = NEql; }$
1478
1479 ###### print binode cases
1480
1481         case Less:
1482         case LessEq:
1483         case Gtr:
1484         case GtrEq:
1485         case Eql:
1486         case NEql:
1487                 print_exec(b->left, -1, 0);
1488                 switch(b->op) {
1489                 case Less:   printf(" < "); break;
1490                 case LessEq: printf(" <= "); break;
1491                 case Gtr:    printf(" > "); break;
1492                 case GtrEq:  printf(" >= "); break;
1493                 case Eql:    printf(" == "); break;
1494                 case NEql:   printf(" != "); break;
1495                 default: abort();
1496                 }
1497                 print_exec(b->right, -1, 0);
1498                 break;
1499
1500 ###### propagate binode cases
1501         case Less:
1502         case LessEq:
1503         case Gtr:
1504         case GtrEq:
1505         case Eql:
1506         case NEql:
1507                 /* Both must match but not labels, result is Vbool */
1508                 t = propagate_types(b->left, c, ok, Vnolabel, 0);
1509                 if (t > Vunknown)
1510                         propagate_types(b->right, c, ok, t, 0);
1511                 else {
1512                         t = propagate_types(b->right, c, ok, Vnolabel, 0);
1513                         if (t > Vunknown)
1514                                 t = propagate_types(b->left, c, ok, t, 0);
1515                 }
1516                 if (!vtype_compat(type, Vbool, 0))
1517                         *ok = 0;
1518                 return Vbool;
1519
1520 ###### interp binode cases
1521         case Less:
1522         case LessEq:
1523         case Gtr:
1524         case GtrEq:
1525         case Eql:
1526         case NEql:
1527         {
1528                 int cmp;
1529                 left = interp_exec(b->left);
1530                 right = interp_exec(b->right);
1531                 cmp = value_cmp(left, right);
1532                 rv.vtype = Vbool;
1533                 switch (b->op) {
1534                 case Less:      rv.bool = cmp <  0; break;
1535                 case LessEq:    rv.bool = cmp <= 0; break;
1536                 case Gtr:       rv.bool = cmp >  0; break;
1537                 case GtrEq:     rv.bool = cmp >= 0; break;
1538                 case Eql:       rv.bool = cmp == 0; break;
1539                 case NEql:      rv.bool = cmp != 0; break;
1540                 default: rv.bool = 0; break;
1541                 }
1542                 break;
1543         }
1544
1545 ### Expressions: The rest
1546
1547 The remaining expressions with the highest precedence are arithmetic
1548 and string concatenation.  There are `Expr`, `Term`, and `Factor`.
1549 The `Factor` is where the `Value` and `Variable` that we already have
1550 are included.
1551
1552 `+` and `-` are both infix and prefix operations (where they are
1553 absolute value and negation).  These have different operator names.
1554
1555 We also have a 'Bracket' operator which records where parentheses were
1556 found.  This make it easy to reproduce these when printing.  Once
1557 precedence is handled better I might be able to discard this.
1558
1559 ###### Binode types
1560         Plus, Minus,
1561         Times, Divide,
1562         Concat,
1563         Absolute, Negate,
1564         Bracket,
1565
1566 ###### Grammar
1567
1568         $*binode
1569         Expr -> Expr Eop Term ${
1570                         $0 = new(binode);
1571                         $0->op = $2.op;
1572                         $0->left = $<1;
1573                         $0->right = $<3;
1574                 }$
1575                 | Term ${ $0 = $<1; }$
1576
1577         Term -> Term Top Factor ${
1578                         $0 = new(binode);
1579                         $0->op = $2.op;
1580                         $0->left = $<1;
1581                         $0->right = $<3;
1582                 }$
1583                 | Factor ${ $0 = $<1; }$
1584
1585         Factor -> ( Expression ) ${
1586                         $0 = new(binode);
1587                         $0->op = Bracket;
1588                         $0->right = $<2;
1589                 }$
1590                 | Uop Factor ${
1591                         $0 = new(binode);
1592                         $0->op = $1.op;
1593                         $0->right = $<2;
1594                 }$
1595                 | Value ${ $0 = (struct binode *)$<1; }$
1596                 | Variable ${ $0 = (struct binode *)$<1; }$
1597
1598         $eop
1599         Eop ->    + ${ $0.op = Plus; }$
1600                 | - ${ $0.op = Minus; }$
1601
1602         Uop ->    + ${ $0.op = Absolute; }$
1603                 | - ${ $0.op = Negate; }$
1604
1605         Top ->    * ${ $0.op = Times; }$
1606                 | / ${ $0.op = Divide; }$
1607                 | ++ ${ $0.op = Concat; }$
1608
1609 ###### print binode cases
1610         case Plus:
1611         case Minus:
1612         case Times:
1613         case Divide:
1614         case Concat:
1615                 print_exec(b->left, indent, 0);
1616                 switch(b->op) {
1617                 case Plus:   printf(" + "); break;
1618                 case Minus:  printf(" - "); break;
1619                 case Times:  printf(" * "); break;
1620                 case Divide: printf(" / "); break;
1621                 case Concat: printf(" ++ "); break;
1622                 default: abort();
1623                 }
1624                 print_exec(b->right, indent, 0);
1625                 break;
1626         case Absolute:
1627                 printf("+");
1628                 print_exec(b->right, indent, 0);
1629                 break;
1630         case Negate:
1631                 printf("-");
1632                 print_exec(b->right, indent, 0);
1633                 break;
1634         case Bracket:
1635                 printf("(");
1636                 print_exec(b->right, indent, 0);
1637                 printf(")");
1638                 break;
1639
1640 ###### propagate binode cases
1641         case Plus:
1642         case Minus:
1643         case Times:
1644         case Divide:
1645                 /* both must be numbers, result is Vnum */
1646         case Absolute:
1647         case Negate:
1648                 /* as propagate_types ignores a NULL,
1649                  * unary ops fit here too */
1650                 propagate_types(b->left, c, ok, Vnum, 0);
1651                 propagate_types(b->right, c, ok, Vnum, 0);
1652                 if (!vtype_compat(type, Vnum, 0))
1653                         *ok = 0;
1654                 return Vnum;
1655
1656         case Concat:
1657                 /* both must be Vstr, result is Vstr */
1658                 propagate_types(b->left, c, ok, Vstr, 0);
1659                 propagate_types(b->right, c, ok, Vstr, 0);
1660                 if (!vtype_compat(type, Vstr, 0))
1661                         *ok = 0;
1662                 return Vstr;
1663
1664         case Bracket:
1665                 return propagate_types(b->right, c, ok, type, 0);
1666
1667 ###### interp binode cases
1668
1669         case Plus:
1670                 rv = interp_exec(b->left);
1671                 right = interp_exec(b->right);
1672                 mpq_add(rv.num, rv.num, right.num);
1673                 break;
1674         case Minus:
1675                 rv = interp_exec(b->left);
1676                 right = interp_exec(b->right);
1677                 mpq_sub(rv.num, rv.num, right.num);
1678                 break;
1679         case Times:
1680                 rv = interp_exec(b->left);
1681                 right = interp_exec(b->right);
1682                 mpq_mul(rv.num, rv.num, right.num);
1683                 break;
1684         case Divide:
1685                 rv = interp_exec(b->left);
1686                 right = interp_exec(b->right);
1687                 mpq_div(rv.num, rv.num, right.num);
1688                 break;
1689         case Negate:
1690                 rv = interp_exec(b->right);
1691                 mpq_neg(rv.num, rv.num);
1692                 break;
1693         case Absolute:
1694                 rv = interp_exec(b->right);
1695                 mpq_abs(rv.num, rv.num);
1696                 break;
1697         case Bracket:
1698                 rv = interp_exec(b->right);
1699                 break;
1700         case Concat:
1701                 left = interp_exec(b->left);
1702                 right = interp_exec(b->right);
1703                 rv.vtype = Vstr;
1704                 rv.str = text_join(left.str, right.str);
1705                 break;
1706
1707 ### Blocks, Statements, and Statement lists.
1708
1709 Now that we have expressions out of the way we need to turn to
1710 statements.  There are simple statements and more complex statements.
1711 Simple statements do not contain newlines, complex statements do.
1712
1713 Statements often come in sequences and we have corresponding simple
1714 statement lists and complex statement lists.
1715 The former comprise only simple statements separated by semicolons.
1716 The later comprise complex statements and simple statement lists.  They are
1717 separated by newlines.  Thus the semicolon is only used to separate
1718 simple statements on the one line.  This may be overly restrictive,
1719 but I'm not sure I every want a complex statement to share a line with
1720 anything else.
1721
1722 Note that a simple statement list can still use multiple lines if
1723 subsequent lines are indented, so
1724
1725 ###### Example: wrapped simple statement list
1726
1727         a = b; c = d;
1728            e = f; print g
1729
1730 is a single simple statement list.  This might allow room for
1731 confusion, so I'm not set on it yet.
1732
1733 A simple statement list needs no extra syntax.  A complex statement
1734 list has two syntactic forms.  It can be enclosed in braces (much like
1735 C blocks), or it can be introduced by a colon and continue until an
1736 unindented newline (much like Python blocks).  With this extra syntax
1737 it is referred to as a block.
1738
1739 Note that a block does not have to include any newlines if it only
1740 contains simple statements.  So both of:
1741
1742         if condition: a=b; d=f
1743
1744         if condition { a=b; print f }
1745
1746 are valid.
1747
1748 In either case the list is constructed from a `binode` list with
1749 `Block` as the operator.  When parsing the list it is most convenient
1750 to append to the end, so a list is a list and a statement.  When using
1751 the list it is more convenient to consider a list to be a statement
1752 and a list.  So we need a function to re-order a list.
1753 `reorder_bilist` serves this purpose.
1754
1755 The only stand-alone statement we introduce at this stage is `pass`
1756 which does nothing and is represented as a `NULL` pointer in a `Block`
1757 list.
1758
1759 ###### Binode types
1760         Block,
1761
1762 ###### Grammar
1763
1764         $void
1765         OptNL -> Newlines
1766                 |
1767
1768         Newlines -> NEWLINE
1769                 | Newlines NEWLINE
1770
1771         $*binode
1772         Open -> {
1773                 | NEWLINE {
1774         Close -> }
1775                 | NEWLINE }
1776         Block -> Open Statementlist Close ${ $0 = $<2; }$
1777                 | Open Newlines Statementlist Close ${ $0 = $<3; }$
1778                 | Open SimpleStatements } ${ $0 = reorder_bilist($<2); }$
1779                 | Open Newlines SimpleStatements } ${ $0 = reorder_bilist($<3); }$
1780                 | : Statementlist ${ $0 = $<2; }$
1781                 | : SimpleStatements ${ $0 = reorder_bilist($<2); }$
1782
1783         Statementlist -> ComplexStatements ${ $0 = reorder_bilist($<1); }$
1784
1785         ComplexStatements -> ComplexStatements ComplexStatement ${
1786                 $0 = new(binode);
1787                 $0->op = Block;
1788                 $0->left = $<1;
1789                 $0->right = $<2;
1790                 }$
1791                 | ComplexStatements NEWLINE ${ $0 = $<1; }$
1792                 | ComplexStatement ${
1793                 $0 = new(binode);
1794                 $0->op = Block;
1795                 $0->left = NULL;
1796                 $0->right = $<1;
1797                 }$
1798
1799         $*exec
1800         ComplexStatement -> SimpleStatements NEWLINE ${
1801                         $0 = reorder_bilist($<1);
1802                         }$
1803                 ## ComplexStatement Grammar
1804
1805         $*binode
1806         SimpleStatements -> SimpleStatements ; SimpleStatement ${
1807                         $0 = new(binode);
1808                         $0->op = Block;
1809                         $0->left = $<1;
1810                         $0->right = $<3;
1811                         }$
1812                 | SimpleStatement ${
1813                         $0 = new(binode);
1814                         $0->op = Block;
1815                         $0->left = NULL;
1816                         $0->right = $<1;
1817                         }$
1818                 | SimpleStatements ; ${ $0 = $<1; }$
1819
1820         SimpleStatement -> pass ${ $0 = NULL; }$
1821                 ## SimpleStatement Grammar
1822
1823 ###### print binode cases
1824         case Block:
1825                 if (indent < 0) {
1826                         // simple statement
1827                         if (b->left == NULL)
1828                                 printf("pass");
1829                         else
1830                                 print_exec(b->left, indent, 0);
1831                         if (b->right) {
1832                                 printf("; ");
1833                                 print_exec(b->right, indent, 0);
1834                         }
1835                 } else {
1836                         // block, one per line
1837                         if (b->left == NULL)
1838                                 do_indent(indent, "pass\n");
1839                         else
1840                                 print_exec(b->left, indent, bracket);
1841                         if (b->right)
1842                                 print_exec(b->right, indent, bracket);
1843                 }
1844                 break;
1845
1846 ###### propagate binode cases
1847         case Block:
1848         {
1849                 /* If any statement returns something other then Vnone
1850                  * then all such must return same type.
1851                  * As each statement may be Vnone or something else,
1852                  * we must always pass Vunknown down, otherwise an incorrect
1853                  * error might occur.  We never return Vnone unless it is
1854                  * passed in.
1855                  */
1856                 struct binode *e;
1857
1858                 for (e = b; e; e = cast(binode, e->right)) {
1859                         t = propagate_types(e->left, c, ok, Vunknown, bool_permitted);
1860                         if (bool_permitted && t == Vbool)
1861                                 t = Vunknown;
1862                         if (t != Vunknown && t != Vnone) {
1863                                 if (type == Vunknown)
1864                                         type = t;
1865                                 else if (t != type)
1866                                         *ok = 0;
1867                         }
1868                 }
1869                 return type;
1870         }
1871
1872 ###### interp binode cases
1873         case Block:
1874                 while (rv.vtype == Vnone &&
1875                        b) {
1876                         if (b->left)
1877                                 rv = interp_exec(b->left);
1878                         b = cast(binode, b->right);
1879                 }
1880                 break;
1881
1882 ### The Print statement
1883
1884 `print` is a simple statement that takes a comma-separated list of
1885 expressions and prints the values separated by spaces and terminated
1886 by a newline.  No control of formatting is possible.
1887
1888 `print` faces the same list-ordering issue as blocks, and uses the
1889 same solution.
1890
1891 ###### Binode types
1892         Print,
1893
1894 ###### SimpleStatement Grammar
1895
1896         | print ExpressionList ${
1897                 $0 = reorder_bilist($<2);
1898         }$
1899         | print ExpressionList , ${
1900                 $0 = new(binode);
1901                 $0->op = Print;
1902                 $0->right = NULL;
1903                 $0->left = $<2;
1904                 $0 = reorder_bilist($0);
1905         }$
1906         | print ${
1907                 $0 = new(binode);
1908                 $0->op = Print;
1909                 $0->right = NULL;
1910         }$
1911
1912 ###### Grammar
1913
1914         $*binode
1915         ExpressionList -> ExpressionList , Expression ${
1916                 $0 = new(binode);
1917                 $0->op = Print;
1918                 $0->left = $<1;
1919                 $0->right = $<3;
1920                 }$
1921                 | Expression ${
1922                         $0 = new(binode);
1923                         $0->op = Print;
1924                         $0->left = NULL;
1925                         $0->right = $<1;
1926                 }$
1927
1928 ###### print binode cases
1929
1930         case Print:
1931                 do_indent(indent, "print");
1932                 while (b) {
1933                         if (b->left) {
1934                                 printf(" ");
1935                                 print_exec(b->left, -1, 0);
1936                                 if (b->right)
1937                                         printf(",");
1938                         }
1939                         b = cast(binode, b->right);
1940                 }
1941                 if (indent >= 0)
1942                         printf("\n");
1943                 break;
1944
1945 ###### propagate binode cases
1946
1947         case Print:
1948                 /* don't care but all must be consistent */
1949                 propagate_types(b->left, c, ok, Vnolabel, 0);
1950                 propagate_types(b->right, c, ok, Vnolabel, 0);
1951                 break;
1952
1953 ###### interp binode cases
1954
1955         case Print:
1956         {
1957                 char sep = 0;
1958                 int eol = 1;
1959                 for ( ; b; b = cast(binode, b->right))
1960                         if (b->left) {
1961                                 if (sep)
1962                                         putchar(sep);
1963                                 left = interp_exec(b->left);
1964                                 print_value(left);
1965                                 free_value(left);
1966                                 if (b->right)
1967                                         sep = ' ';
1968                         } else if (sep)
1969                                 eol = 0;
1970                 left.vtype = Vnone;
1971                 if (eol)
1972                         printf("\n");
1973                 break;
1974         }
1975
1976 ###### Assignment statement
1977
1978 An assignment will assign a value to a variable, providing it hasn't
1979 be declared as a constant.  The analysis phase ensures that the type
1980 will be correct so the interpreter just needs to perform the
1981 calculation.  There is a form of assignment which declares a new
1982 variable as well as assigning a value.  If a name is assigned before
1983 it is declared, and error will be raised as the name is created as
1984 `Vlabel` and it is illegal to assign to such names.
1985
1986 ###### Binode types
1987         Assign,
1988         Declare,
1989
1990 ###### SimpleStatement Grammar
1991         | Variable = Expression ${ {
1992                         struct var *v = cast(var, $1);
1993
1994                         $0 = new(binode);
1995                         $0->op = Assign;
1996                         $0->left = $<1;
1997                         $0->right = $<3;
1998                         if (v->var && !v->var->constant) {
1999                                 /* FIXME error? */
2000                         }
2001                 } }$
2002         | VariableDecl Expression ${
2003                         $0 = new(binode);
2004                         $0->op = Declare;
2005                         $0->left = $<1;
2006                         $0->right =$<2;
2007                 }$
2008
2009 ###### print binode cases
2010
2011         case Assign:
2012                 do_indent(indent, "");
2013                 print_exec(b->left, indent, 0);
2014                 printf(" = ");
2015                 print_exec(b->right, indent, 0);
2016                 if (indent >= 0)
2017                         printf("\n");
2018                 break;
2019
2020         case Declare:
2021                 do_indent(indent, "");
2022                 print_exec(b->left, indent, 0);
2023                 if (cast(var, b->left)->var->constant)
2024                         printf(" ::= ");
2025                 else
2026                         printf(" := ");
2027                 print_exec(b->right, indent, 0);
2028                 if (indent >= 0)
2029                         printf("\n");
2030                 break;
2031
2032 ###### propagate binode cases
2033
2034         case Assign:
2035         case Declare:
2036                 /* Both must match and not be labels, result is Vnone */
2037                 t = propagate_types(b->left, c, ok, Vnolabel, 0);
2038                 if (t > Vunknown)
2039                         propagate_types(b->right, c, ok, t, 0);
2040                 else {
2041                         t = propagate_types(b->right, c, ok, Vnolabel, 0);
2042                         if (t > Vunknown)
2043                                 t = propagate_types(b->left, c, ok, t, 0);
2044                 }
2045                 return Vnone;
2046
2047                 break;
2048
2049 ###### interp binode cases
2050
2051         case Assign:
2052         case Declare:
2053         {
2054                 struct variable *v = cast(var, b->left)->var;
2055                 if (v->merged)
2056                         v = v->merged;
2057                 right = interp_exec(b->right);
2058                 free_value(v->val);
2059                 v->val = right;
2060                 right.vtype = Vunknown;
2061                 break;
2062         }
2063
2064 ### The `use` statement
2065
2066 The `use` statement is the last "simple" statement.  It is needed when
2067 the condition in a conditional statement is a block.  `use` works much
2068 like `return` in C, but only completes the `condition`, not the whole
2069 function.
2070
2071 ###### Binode types
2072         Use,
2073
2074 ###### SimpleStatement Grammar
2075         | use Expression ${
2076                 $0 = new(binode);
2077                 $0->op = Use;
2078                 $0->right = $<2;
2079         }$
2080
2081 ###### print binode cases
2082
2083         case Use:
2084                 do_indent(indent, "use ");
2085                 print_exec(b->right, -1, 0);
2086                 if (indent >= 0)
2087                         printf("\n");
2088                 break;
2089
2090 ###### propagate binode cases
2091
2092         case Use:
2093                 /* result matches value */
2094                 return propagate_types(b->right, c, ok, type, 0);
2095
2096 ###### interp binode cases
2097
2098         case Use:
2099                 rv = interp_exec(b->right);
2100                 break;
2101
2102 ### The Conditional Statement
2103
2104 This is the biggy and currently the only complex statement.  This
2105 subsumes `if`, `while`, `do/while`, `switch`, and some parts of `for`.
2106 It is comprised of a number of parts, all of which are optional though
2107 set combinations apply.  Each part is (usually) a key word (`then` is
2108 sometimes optional) followed by either an expression of a code block,
2109 except the `casepart` which is a "key word and an expression" followed
2110 by a code block.  The code-block option is valid for all parts and,
2111 where an expression is also allowed, the code block can use the `use`
2112 statement to report a value.  If the code block does no report a value
2113 the effect is similar to reporting `False`.
2114
2115 The `else` and `case` parts, as well as `then` when combined with
2116 `if`, can contain a `use` statement which will apply to some
2117 containing conditional statement. `for` parts, `do` parts and `then`
2118 parts used with `for` can never contain a `use`, except in some
2119 subordinate conditional statement.
2120
2121 If there is a `forpart`, it is executed first, only once.
2122 If there is a `dopart`, then it is executed repeatedly providing
2123 always that the `condpart` or `cond`, if present, does not return a non-True
2124 value.  `condpart` can fail to return any value if it simply executes
2125 to completion.  This is treated the same as returning `True`.
2126
2127 If there is a `thenpart` it will be executed whenever the `condpart`
2128 or `cond` returns True (or does not return any value), but this will happen
2129 *after* `dopart` (when present).
2130
2131 If `elsepart` is present it will be executed at most once when the
2132 condition returns `False` or some value that isn't `True` and isn't
2133 matched by any `casepart`.  If there are any `casepart`s, they will be
2134 executed when the condition returns a matching value.
2135
2136 The particular sorts of values allowed in case parts has not yet been
2137 determined in the language design, so nothing is prohibited.
2138
2139 The various blocks in this complex statement potentially provide scope
2140 for variables as described earlier.  Each such block must include the
2141 "OpenScope" nonterminal before parsing the block, and must call
2142 `var_block_close()` when closing the block.
2143
2144 The code following "`if`", "`switch`" and "`for`" does not get its own
2145 scope, but is in a scope covering the whole statement, so names
2146 declared there cannot be redeclared elsewhere.  Similarly the
2147 condition following "`while`" is in a scope the covers the body
2148 ("`do`" part) of the loop, and which does not allow conditional scope
2149 extension.  Code following "`then`" (both looping and non-looping),
2150 "`else`" and "`case`" each get their own local scope.
2151
2152 The type requirements on the code block in a `whilepart` are quite
2153 unusal.  It is allowed to return a value of some identifiable type, in
2154 which case the loop abort and an appropriate `casepart` is run, or it
2155 can return a Boolean, in which case the loop either continues to the
2156 `dopart` (on `True`) or aborts and runs the `elsepart` (on `False`).
2157 This is different both from the `ifpart` code block which is expected to
2158 return a Boolean, or the `switchpart` code block which is expected to
2159 return the same type as the casepart values.  The correct analysis of
2160 the type of the `whilepart` code block is the reason for the
2161 `bool_permitted` flag which is passed to `propagate_types()`.
2162
2163 The `cond_statement` cannot fit into a `binode` so a new `exec` is
2164 defined.
2165
2166 ###### exec type
2167         Xcond_statement,
2168
2169 ###### ast
2170         struct casepart {
2171                 struct exec *value;
2172                 struct exec *action;
2173                 struct casepart *next;
2174         };
2175         struct cond_statement {
2176                 struct exec;
2177                 struct exec *forpart, *condpart, *dopart, *thenpart, *elsepart;
2178                 struct casepart *casepart;
2179         };
2180
2181 ###### ast functions
2182
2183         static void free_casepart(struct casepart *cp)
2184         {
2185                 while (cp) {
2186                         struct casepart *t;
2187                         free_exec(cp->value);
2188                         free_exec(cp->action);
2189                         t = cp->next;
2190                         free(cp);
2191                         cp = t;
2192                 }
2193         }
2194
2195         static void free_cond_statement(struct cond_statement *s)
2196         {
2197                 if (!s)
2198                         return;
2199                 free_exec(s->forpart);
2200                 free_exec(s->condpart);
2201                 free_exec(s->dopart);
2202                 free_exec(s->thenpart);
2203                 free_exec(s->elsepart);
2204                 free_casepart(s->casepart);
2205                 free(s);
2206         }
2207
2208 ###### free exec cases
2209         case Xcond_statement: free_cond_statement(cast(cond_statement, e)); break;
2210
2211 ###### ComplexStatement Grammar
2212         | CondStatement ${ $0 = $<1; }$
2213
2214 ###### Grammar
2215
2216         $*cond_statement
2217         // both ForThen and Whilepart open scopes, and CondSuffix only
2218         // closes one - so in the first branch here we have another to close.
2219         CondStatement -> ForThen WhilePart CondSuffix ${
2220                         $0 = $<3;
2221                         $0->forpart = $1.forpart; $1.forpart = NULL;
2222                         $0->thenpart = $1.thenpart; $1.thenpart = NULL;
2223                         $0->condpart = $2.condpart; $2.condpart = NULL;
2224                         $0->dopart = $2.dopart; $2.dopart = NULL;
2225                         var_block_close(config2context(config), CloseSequential);
2226                         }$
2227                 | WhilePart CondSuffix ${
2228                         $0 = $<2;
2229                         $0->condpart = $1.condpart; $1.condpart = NULL;
2230                         $0->dopart = $1.dopart; $1.dopart = NULL;
2231                         }$
2232                 | SwitchPart CondSuffix ${
2233                         $0 = $<2;
2234                         $0->condpart = $<1;
2235                         }$
2236                 | IfPart IfSuffix ${
2237                         $0 = $<2;
2238                         $0->condpart = $1.condpart; $1.condpart = NULL;
2239                         $0->thenpart = $1.thenpart; $1.thenpart = NULL;
2240                         // This is where we close an "if" statement
2241                         var_block_close(config2context(config), CloseSequential);
2242                         }$
2243
2244         CondSuffix -> IfSuffix ${
2245                         $0 = $<1;
2246                         // This is where we close scope of the whole
2247                         // "for" or "while" statement
2248                         var_block_close(config2context(config), CloseSequential);
2249                 }$
2250                 | CasePart CondSuffix ${
2251                         $0 = $<2;
2252                         $1->next = $0->casepart;
2253                         $0->casepart = $<1;
2254                 }$
2255
2256         $*casepart
2257         CasePart -> Newlines case Expression OpenScope Block ${
2258                         $0 = calloc(1,sizeof(struct casepart));
2259                         $0->value = $<3;
2260                         $0->action = $<5;
2261                         var_block_close(config2context(config), CloseParallel);
2262                 }$
2263                 | case Expression OpenScope Block ${
2264                         $0 = calloc(1,sizeof(struct casepart));
2265                         $0->value = $<2;
2266                         $0->action = $<4;
2267                         var_block_close(config2context(config), CloseParallel);
2268                 }$
2269
2270         $*cond_statement
2271         IfSuffix -> Newlines ${ $0 = new(cond_statement); }$
2272                 | Newlines else OpenScope Block ${
2273                         $0 = new(cond_statement);
2274                         $0->elsepart = $<4;
2275                         var_block_close(config2context(config), CloseElse);
2276                 }$
2277                 | else OpenScope Block ${
2278                         $0 = new(cond_statement);
2279                         $0->elsepart = $<3;
2280                         var_block_close(config2context(config), CloseElse);
2281                 }$
2282                 | Newlines else OpenScope CondStatement ${
2283                         $0 = new(cond_statement);
2284                         $0->elsepart = $<4;
2285                         var_block_close(config2context(config), CloseElse);
2286                 }$
2287                 | else OpenScope CondStatement ${
2288                         $0 = new(cond_statement);
2289                         $0->elsepart = $<3;
2290                         var_block_close(config2context(config), CloseElse);
2291                 }$
2292
2293
2294         $*exec
2295         // These scopes are closed in CondSuffix
2296         ForPart -> for OpenScope SimpleStatements ${
2297                         $0 = reorder_bilist($<3);
2298                 }$
2299                 |  for OpenScope Block ${
2300                         $0 = $<3;
2301                 }$
2302
2303         ThenPart -> then OpenScope SimpleStatements ${
2304                         $0 = reorder_bilist($<3);
2305                         var_block_close(config2context(config), CloseSequential);
2306                 }$
2307                 |  then OpenScope Block ${
2308                         $0 = $<3;
2309                         var_block_close(config2context(config), CloseSequential);
2310                 }$
2311
2312         ThenPartNL -> ThenPart OptNL ${
2313                         $0 = $<1;
2314                 }$
2315
2316         // This scope is closed in CondSuffix
2317         WhileHead -> while OpenScope Block ${
2318                 $0 = $<3;
2319                 }$
2320
2321         $cond_statement
2322         ForThen -> ForPart OptNL ThenPartNL ${
2323                         $0.forpart = $<1;
2324                         $0.thenpart = $<3;
2325                 }$
2326                 | ForPart OptNL ${
2327                         $0.forpart = $<1;
2328                 }$
2329
2330         // This scope is closed in CondSuffix
2331         WhilePart -> while OpenScope Expression Block ${
2332                         $0.type = Xcond_statement;
2333                         $0.condpart = $<3;
2334                         $0.dopart = $<4;
2335                 }$
2336                 | WhileHead OptNL do Block ${
2337                         $0.type = Xcond_statement;
2338                         $0.condpart = $<1;
2339                         $0.dopart = $<4;
2340                 }$
2341
2342         IfPart -> if OpenScope Expression OpenScope Block ${
2343                         $0.type = Xcond_statement;
2344                         $0.condpart = $<3;
2345                         $0.thenpart = $<5;
2346                         var_block_close(config2context(config), CloseParallel);
2347                 }$
2348                 | if OpenScope Block OptNL then OpenScope Block ${
2349                         $0.type = Xcond_statement;
2350                         $0.condpart = $<3;
2351                         $0.thenpart = $<7;
2352                         var_block_close(config2context(config), CloseParallel);
2353                 }$
2354
2355         $*exec
2356         // This scope is closed in CondSuffix
2357         SwitchPart -> switch OpenScope Expression ${
2358                         $0 = $<3;
2359                 }$
2360                 | switch OpenScope Block ${
2361                         $0 = $<3;
2362                 }$
2363
2364 ###### print exec cases
2365
2366         case Xcond_statement:
2367         {
2368                 struct cond_statement *cs = cast(cond_statement, e);
2369                 struct casepart *cp;
2370                 if (cs->forpart) {
2371                         do_indent(indent, "for");
2372                         if (bracket) printf(" {\n"); else printf(":\n");
2373                         print_exec(cs->forpart, indent+1, bracket);
2374                         if (cs->thenpart) {
2375                                 if (bracket)
2376                                         do_indent(indent, "} then {\n");
2377                                 else
2378                                         do_indent(indent, "then:\n");
2379                                 print_exec(cs->thenpart, indent+1, bracket);
2380                         }
2381                         if (bracket) do_indent(indent, "}\n");
2382                 }
2383                 if (cs->dopart) {
2384                         // a loop
2385                         if (cs->condpart && cs->condpart->type == Xbinode &&
2386                             cast(binode, cs->condpart)->op == Block) {
2387                                 if (bracket)
2388                                         do_indent(indent, "while {\n");
2389                                 else
2390                                         do_indent(indent, "while:\n");
2391                                 print_exec(cs->condpart, indent+1, bracket);
2392                                 if (bracket)
2393                                         do_indent(indent, "} do {\n");
2394                                 else
2395                                         do_indent(indent, "do:\n");
2396                                 print_exec(cs->dopart, indent+1, bracket);
2397                                 if (bracket)
2398                                         do_indent(indent, "}\n");
2399                         } else {
2400                                 do_indent(indent, "while ");
2401                                 print_exec(cs->condpart, 0, bracket);
2402                                 if (bracket)
2403                                         printf(" {\n");
2404                                 else
2405                                         printf(":\n");
2406                                 print_exec(cs->dopart, indent+1, bracket);
2407                                 if (bracket)
2408                                         do_indent(indent, "}\n");
2409                         }
2410                 } else {
2411                         // a condition
2412                         if (cs->casepart)
2413                                 do_indent(indent, "switch");
2414                         else
2415                                 do_indent(indent, "if");
2416                         if (cs->condpart && cs->condpart->type == Xbinode &&
2417                             cast(binode, cs->condpart)->op == Block) {
2418                                 if (bracket)
2419                                         printf(" {\n");
2420                                 else
2421                                         printf(":\n");
2422                                 print_exec(cs->condpart, indent+1, bracket);
2423                                 if (bracket)
2424                                         do_indent(indent, "}\n");
2425                                 if (cs->thenpart) {
2426                                         do_indent(indent, "then:\n");
2427                                         print_exec(cs->thenpart, indent+1, bracket);
2428                                 }
2429                         } else {
2430                                 printf(" ");
2431                                 print_exec(cs->condpart, 0, bracket);
2432                                 if (cs->thenpart) {
2433                                         if (bracket)
2434                                                 printf(" {\n");
2435                                         else
2436                                                 printf(":\n");
2437                                         print_exec(cs->thenpart, indent+1, bracket);
2438                                         if (bracket)
2439                                                 do_indent(indent, "}\n");
2440                                 } else
2441                                         printf("\n");
2442                         }
2443                 }
2444                 for (cp = cs->casepart; cp; cp = cp->next) {
2445                         do_indent(indent, "case ");
2446                         print_exec(cp->value, -1, 0);
2447                         if (bracket)
2448                                 printf(" {\n");
2449                         else
2450                                 printf(":\n");
2451                         print_exec(cp->action, indent+1, bracket);
2452                         if (bracket)
2453                                 do_indent(indent, "}\n");
2454                 }
2455                 if (cs->elsepart) {
2456                         do_indent(indent, "else");
2457                         if (bracket)
2458                                 printf(" {\n");
2459                         else
2460                                 printf(":\n");
2461                         print_exec(cs->elsepart, indent+1, bracket);
2462                         if (bracket)
2463                                 do_indent(indent, "}\n");
2464                 }
2465                 break;
2466         }
2467
2468 ###### propagate exec cases
2469         case Xcond_statement:
2470         {
2471                 // forpart and dopart must return Vnone
2472                 // thenpart must return Vnone if there is a dopart,
2473                 // otherwise it is like elsepart.
2474                 // condpart must:
2475                 //    be bool if there is not casepart
2476                 //    match casepart->values if there is a switchpart
2477                 //    either be bool or match casepart->value if there
2478                 //             is a whilepart
2479                 // elsepart, casepart->action must match there return type
2480                 // expected of this statement.
2481                 struct cond_statement *cs = cast(cond_statement, prog);
2482                 struct casepart *cp;
2483
2484                 t = propagate_types(cs->forpart, c, ok, Vnone, 0);
2485                 if (!vtype_compat(Vnone, t, 0))
2486                         *ok = 0;
2487                 t = propagate_types(cs->dopart, c, ok, Vnone, 0);
2488                 if (!vtype_compat(Vnone, t, 0))
2489                         *ok = 0;
2490                 if (cs->dopart) {
2491                         t = propagate_types(cs->thenpart, c, ok, Vnone, 0);
2492                         if (!vtype_compat(Vnone, t, 0))
2493                                 *ok = 0;
2494                 }
2495                 if (cs->casepart == NULL)
2496                         propagate_types(cs->condpart, c, ok, Vbool, 0);
2497                 else {
2498                         /* Condpart must match case values, with bool permitted */
2499                         t = Vunknown;
2500                         for (cp = cs->casepart;
2501                              cp && (t == Vunknown); cp = cp->next)
2502                                 t = propagate_types(cp->value, c, ok, Vunknown, 0);
2503                         if (t == Vunknown && cs->condpart)
2504                                 t = propagate_types(cs->condpart, c, ok, Vunknown, 1);
2505                         // Now we have a type (I hope) push it down
2506                         if (t != Vunknown) {
2507                                 for (cp = cs->casepart; cp; cp = cp->next)
2508                                         propagate_types(cp->value, c, ok, t, 0);
2509                                 propagate_types(cs->condpart, c, ok, t, 1);
2510                         }
2511                 }
2512                 // (if)then, else, and case parts must return expected type.
2513                 if (!cs->dopart && type == Vunknown)
2514                         type = propagate_types(cs->thenpart, c, ok, Vunknown, bool_permitted);
2515                 if (type == Vunknown)
2516                         type = propagate_types(cs->elsepart, c, ok, Vunknown, bool_permitted);
2517                 for (cp = cs->casepart;
2518                      cp && type == Vunknown;
2519                      cp = cp->next)
2520                         type = propagate_types(cp->action, c, ok, Vunknown, bool_permitted);
2521                 if (type > Vunknown) {
2522                         if (!cs->dopart)
2523                                 propagate_types(cs->thenpart, c, ok, type, bool_permitted);
2524                         propagate_types(cs->elsepart, c, ok, type, bool_permitted);
2525                         for (cp = cs->casepart; cp ; cp = cp->next)
2526                                 propagate_types(cp->action, c, ok, type, bool_permitted);
2527                         return type;
2528                 } else
2529                         return Vunknown;
2530         }
2531
2532 ###### interp exec cases
2533         case Xcond_statement:
2534         {
2535                 struct value v, cnd;
2536                 struct casepart *cp;
2537                 struct cond_statement *c = cast(cond_statement, e);
2538                 if (c->forpart)
2539                         interp_exec(c->forpart);
2540                 do {
2541                         if (c->condpart)
2542                                 cnd = interp_exec(c->condpart);
2543                         else
2544                                 cnd.vtype = Vnone;
2545                         if (!(cnd.vtype == Vnone ||
2546                               (cnd.vtype == Vbool && cnd.bool != 0)))
2547                                 break;
2548                         if (c->dopart) {
2549                                 free_value(cnd);
2550                                 interp_exec(c->dopart);
2551                         }
2552                         if (c->thenpart) {
2553                                 v = interp_exec(c->thenpart);
2554                                 if (v.vtype != Vnone || !c->dopart)
2555                                         return v;
2556                                 free_value(v);
2557                         }
2558                 } while (c->dopart);
2559
2560                 for (cp = c->casepart; cp; cp = cp->next) {
2561                         v = interp_exec(cp->value);
2562                         if (value_cmp(v, cnd) == 0) {
2563                                 free_value(v);
2564                                 free_value(cnd);
2565                                 return interp_exec(cp->action);
2566                         }
2567                         free_value(v);
2568                 }
2569                 free_value(cnd);
2570                 if (c->elsepart)
2571                         return interp_exec(c->elsepart);
2572                 v.vtype = Vnone;
2573                 return v;
2574         }
2575
2576 ### Finally the whole program.
2577
2578 Somewhat reminiscent of Pascal a (current) Ocean program starts with
2579 the keyword "program" and a list of variable names which are assigned
2580 values from command line arguments.  Following this is a `block` which
2581 is the code to execute.
2582
2583 As this is the top level, several things are handled a bit
2584 differently.
2585 The whole program is not interpreted by `interp_exec` as that isn't
2586 passed the argument list which the program requires.  Similarly type
2587 analysis is a bit more interesting at this level.
2588
2589 ###### Binode types
2590         Program,
2591
2592 ###### Parser: grammar
2593
2594         $*binode
2595         Program -> program OpenScope Varlist Block OptNL ${
2596                 $0 = new(binode);
2597                 $0->op = Program;
2598                 $0->left = reorder_bilist($<3);
2599                 $0->right = $<4;
2600                 var_block_close(config2context(config), CloseSequential);
2601                 if (config2context(config)->scope_stack) abort();
2602         }$
2603
2604         Varlist -> Varlist ArgDecl ${
2605                         $0 = new(binode);
2606                         $0->op = Program;
2607                         $0->left = $<1;
2608                         $0->right = $<2;
2609                 }$
2610                 | ${ $0 = NULL; }$
2611
2612         $*var
2613         ArgDecl -> IDENTIFIER ${ {
2614                 struct variable *v = var_decl(config2context(config), $1.txt);
2615                 $0 = new(var);
2616                 $0->var = v;
2617         } }$
2618
2619         ## Grammar
2620
2621 ###### print binode cases
2622         case Program:
2623                 do_indent(indent, "program");
2624                 for (b2 = cast(binode, b->left); b2; b2 = cast(binode, b2->right)) {
2625                         printf(" ");
2626                         print_exec(b2->left, 0, 0);
2627                 }
2628                 if (bracket)
2629                         printf(" {\n");
2630                 else
2631                         printf(":\n");
2632                 print_exec(b->right, indent+1, bracket);
2633                 if (bracket)
2634                         do_indent(indent, "}\n");
2635                 break;
2636
2637 ###### propagate binode cases
2638         case Program: abort();
2639
2640 ###### core functions
2641
2642         static int analyse_prog(struct exec *prog, struct parse_context *c)
2643         {
2644                 struct binode *b = cast(binode, prog);
2645                 int ok = 1;
2646
2647                 do {
2648                         ok = 1;
2649                         propagate_types(b->right, c, &ok, Vnone, 0);
2650                 } while (ok == 2);
2651                 if (!ok)
2652                         return 0;
2653
2654                 for (b = cast(binode, b->left); b; b = cast(binode, b->right)) {
2655                         struct var *v = cast(var, b->left);
2656                         if (v->var->val.vtype == Vunknown)
2657                                 val_init(&v->var->val, Vstr);
2658                 }
2659                 b = cast(binode, prog);
2660                 do {
2661                         ok = 1;
2662                         propagate_types(b->right, c, &ok, Vnone, 0);
2663                 } while (ok == 2);
2664                 if (!ok)
2665                         return 0;
2666
2667                 /* Make sure everything is still consistent */
2668                 propagate_types(b->right, c, &ok, Vnone, 0);
2669                 return !!ok;
2670         }
2671
2672         static void interp_prog(struct exec *prog, char **argv)
2673         {
2674                 struct binode *p = cast(binode, prog);
2675                 struct binode *al = cast(binode, p->left);
2676                 struct value v;
2677
2678                 while (al) {
2679                         struct var *v = cast(var, al->left);
2680                         struct value *vl = &v->var->val;
2681
2682                         if (argv[0] == NULL) {
2683                                 printf("Not enough args\n");
2684                                 exit(1);
2685                         }
2686                         al = cast(binode, al->right);
2687                         free_value(*vl);
2688                         if (!parse_value(vl, argv[0]))
2689                                 exit(1);
2690                         argv++;
2691                 }
2692                 v = interp_exec(p->right);
2693                 free_value(v);
2694         }
2695
2696 ###### interp binode cases
2697         case Program: abort();
2698
2699 ## And now to test it out.
2700
2701 Having a language requires having a "hello world" program. I'll
2702 provide a little more than that: a program that prints "Hello world"
2703 finds the GCD of two numbers, prints the first few elements of
2704 Fibonacci, and performs a binary search for a number.
2705
2706 ###### File: oceani.mk
2707         tests :: sayhello
2708         sayhello : oceani
2709                 @echo "===== TEST ====="
2710                 ./oceani --section "test: hello" oceani.mdc 55 33
2711
2712 ###### test: hello
2713
2714         program A B:
2715                 print "Hello World, what lovely oceans you have!"
2716                 /* When a variable is defined in both branches of an 'if',
2717                  * and used afterwards, the variables are merged.
2718                  */
2719                 if A > B:
2720                         bigger := "yes"
2721                 else:
2722                         bigger := "no"
2723                 print "Is", A, "bigger than", B,"? ", bigger
2724                 /* If a variable is not used after the 'if', no
2725                  * merge happens, so types can be different
2726                  */
2727                 if A * 2 > B:
2728                         double := "yes"
2729                         print A, "is more than twice", B, "?", double
2730                 else:
2731                         double := A*2
2732                         print "double", A, "is only", double
2733
2734                 a := A; b := B
2735                 if a > 0 and b > 0:
2736                         while a != b:
2737                                 if a < b:
2738                                         b = b - a
2739                                 else:
2740                                         a = a - b
2741                         print "GCD of", A, "and", B,"is", a
2742                 else if a <= 0:
2743                         print a, "is not positive, cannot calculate GCD"
2744                 else:
2745                         print b, "is not positive, cannot calculate GCD"
2746
2747                 for:
2748                         togo := 10
2749                         f1 := 1; f2 := 1;
2750                         print "Fibonacci:", f1,f2,
2751                 then togo = togo - 1
2752                 while togo > 0:
2753                         f3 := f1 + f2
2754                         print "", f3,
2755                         f1 = f2
2756                         f2 = f3
2757                 print ""
2758
2759                 /* Binary search... */
2760                 for:
2761                         lo:= 0; hi := 100
2762                         target := 77
2763                 while:
2764                         mid := (lo + hi) / 2
2765                         if mid == target:
2766                                 use Found
2767                         if mid < target:
2768                                 lo = mid
2769                         else:
2770                                 hi = mid
2771                         if hi - lo < 1:
2772                                 use GiveUp
2773                         use True
2774                 do: pass
2775                 case Found:
2776                         print "Yay, I found", target
2777                 case GiveUp:
2778                         print "Closest I found was", mid