[find the source at oceani.mdc]

Ocean Interpreter - Stoney Creek version

Ocean is intended to be an compiled language, so this interpreter is not targeted at being the final product. It is, rather, an intermediate stage, and fills that role in two distinct ways.

Firstly, it exists as a platform to experiment with the early language design. An interpreter is easy to write and easy to get working, so the barrier for entry is lower if I aim to start with an interpreter.

Secondly, the plan for the Ocean compiler is to write it in the Ocean language. To achieve this we naturally need some sort of boot-strap process and this interpreter - written in portable C - will fill that role. It will be used to bootstrap the Ocean compiler.

Two features that are not needed to fill either of these roles are performance and completeness. The interpreter only needs to be fast enough to run small test programs and occasionally to run the compiler on itself. It only needs to be complete enough to test aspects of the design which are developed before the compiler is working, and to run the compiler on itself. Any features not used by the compiler when compiling itself are superfluous. They may be included anyway, but they may not.

Nonetheless, the interpreter should end up being reasonably complete, and any performance bottlenecks which appear and are easily fixed, will be.

Current version

This second version of the interpreter exists to test out the structured statement providing conditions and iteration, and simple variable scoping. Clearly we need some minimal other functionality so that values can be tested and instructions iterated over. All that functionality is clearly not normative at this stage (not that anything is really normative yet) and will change, so early test code will certainly break in later versions.

The under-test parts of the language are:

Elements which are present to make a usable language are:

Naming

Versions of the interpreter which obviously do not support a complete language will be named after creeks and streams. This one is Stoney Creek.

Once we have something reasonably resembling a complete language, the names of rivers will be used. Early versions of the compiler will be named after seas. Major releases of the compiler will be named after oceans. Hopefully I will be finished once I get to the Pacific Ocean release.

Outline

As well as parsing and executing a program, the interpreter can print out the program from the parsed internal structure. This is useful for validating the parsing. So the main requirements of the interpreter are:

This is all performed by a single C program extracted with parsergen.

There will be two formats for printing the program: a default and one that uses bracketing. So a --bracket command line option is needed for that. Normally the first code section found is used, however an alternate section can be requested so that a file (such as this one) can contain multiple programs This is effected with the --section option.

File: oceani.mk
myCFLAGS := -Wall -g -fplan9-extensions
CFLAGS := $(filter-out $(myCFLAGS),$(CFLAGS)) $(myCFLAGS)
myLDLIBS:= libparser.o libscanner.o libmdcode.o -licuuc
LDLIBS := $(filter-out $(myLDLIBS),$(LDLIBS)) $(myLDLIBS)
## libs
all :: $(LDLIBS) oceani
oceani.c oceani.h : oceani.mdc parsergen
    ./parsergen -o oceani --LALR --tag Parser oceani.mdc
oceani.mk: oceani.mdc md2c
    ./md2c oceani.mdc

oceani: oceani.o $(LDLIBS)
    $(CC) $(CFLAGS) -o oceani oceani.o $(LDLIBS)
Parser: header
## macros
## ast
struct parse_context {
    struct token_config config;
    char *file_name;
    int parse_error;
    ## parse context
};
macros
#define container_of(ptr, type, member) ({                      \
    const typeof( ((type *)0)->member ) *__mptr = (ptr);    \
    (type *)( (char *)__mptr - offsetof(type,member) );})

#define config2context(_conf) container_of(_conf, struct parse_context, \
    config)
Parser: code
#include <unistd.h>
#include <stdlib.h>
#include <fcntl.h>
#include <errno.h>
#include <sys/mman.h>
#include <string.h>
#include <stdio.h>
#include <locale.h>
#include <malloc.h>
#include "mdcode.h"
#include "scanner.h"
#include "parser.h"

## includes

#include "oceani.h"

## forward decls
## value functions
## ast functions
## core functions

#include <getopt.h>
static char Usage[] = "Usage: oceani --trace --print --noexec --brackets"
                      "--section=SectionName prog.ocn\n";
static const struct option long_options[] = {
    {"trace",     0, NULL, 't'},
    {"print",     0, NULL, 'p'},
    {"noexec",    0, NULL, 'n'},
    {"brackets",  0, NULL, 'b'},
    {"section",   1, NULL, 's'},
    {NULL,        0, NULL, 0},
};
const char *options = "tpnbs";
int main(int argc, char *argv[])
{
    int fd;
    int len;
    char *file;
    struct section *s;
    char *section = NULL;
    struct parse_context context = {
        .config = {
            .ignored = (1 << TK_line_comment)
                     | (1 << TK_block_comment),
            .number_chars = ".,_+-",
            .word_start = "_",
            .word_cont = "_",
        },
    };
    int doprint=0, dotrace=0, doexec=1, brackets=0;
    struct exec **prog;
    int opt;
    while ((opt = getopt_long(argc, argv, options, long_options, NULL))
           != -1) {
        switch(opt) {
        case 't': dotrace=1; break;
        case 'p': doprint=1; break;
        case 'n': doexec=0; break;
        case 'b': brackets=1; break;
        case 's': section = optarg; break;
        default: fprintf(stderr, Usage);
            exit(1);
        }
    }
    if (optind >= argc) {
        fprintf(stderr, "oceani: no input file given\n");
        exit(1);
    }
    fd = open(argv[optind], O_RDONLY);
    if (fd < 0) {
        fprintf(stderr, "oceani: cannot open %s\n", argv[optind]);
        exit(1);
    }
    context.file_name = argv[optind];
    len = lseek(fd, 0, 2);
    file = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    s = code_extract(file, file+len, NULL);
    if (!s) {
        fprintf(stderr, "oceani: could not find any code in %s\n",
                argv[optind]);
        exit(1);
    }
    if (section) {
        struct section *ss;
        for (ss = s; ss; ss = ss->next) {
            struct text sec = ss->section;
            if (sec.len == strlen(section) &&
                strncmp(sec.txt, section, sec.len) == 0)
                break;
        }
        if (ss)
            prog = parse_oceani(ss->code, &context.config,
                                dotrace ? stderr : NULL);
        else {
            fprintf(stderr, "oceani: cannot find section %s\n",
                    section);
            exit(1);
        }
    } else
        prog = parse_oceani(s->code, &context.config,
                        dotrace ? stderr : NULL);
    if (!prog) {
        fprintf(stderr, "oceani: fatal parser error.\n");
        context.parse_error = 1;
    }
    if (prog && doprint)
        print_exec(*prog, 0, brackets);
    if (prog && doexec && !context.parse_error) {
        if (!analyse_prog(*prog, &context)) {
            fprintf(stderr, "oceani: type error in program - not running.\n");
            exit(1);
        }
        interp_prog(*prog, argv+optind+1);
    }
    if (prog) {
        free_exec(*prog);
        free(prog);
    }
    while (s) {
        struct section *t = s->next;
        code_free(s->code);
        free(s);
        s = t;
    }
    ## free context
    exit(context.parse_error ? 1 : 0);
}

Analysis

These four requirements of parse, analyse, print, interpret apply to each language element individually so that is how most of the code will be structured.

Three of the four are fairly self explanatory. The one that requires a little explanation is the analysis step.

The current language design does not require (or even allow) the types of variables to be declared, but they must still have a single type. Different operations impose different requirements on the variables, for example addition requires both arguments to be numeric, and assignment requires the variable on the left to have the same type as the expression on the right.

Analysis involves propagating these type requirements around and consequently setting the type of each variable. If any requirements are violated (e.g. a string is compared with a number) or if a variable needs to have two different types, then an error is raised and the program will not run.

If the same variable is declared in both branchs of an 'if/else', or in all cases of a 'switch' then the multiple instances may be merged into just one variable if the variable is references after the conditional statement. When this happens, the types must naturally be consistent across all the branches. When the variable is not used outside the if, the variables in the different branches are distinct and can be of different types.

Determining the types of all variables early is important for processing command line arguments. These can be assigned to any type of variable, but we must first know the correct type so any required conversion can happen. If a variable is associated with a command line argument but no type can be interpreted (e.g. the variable is only ever used in a print statement), then the type is set to 'string'.

Undeclared names may only appear in "use" statements and "case" expressions. These names are given a type of "label" and a unique value. This allows them to fill the role of a name in an enumerated type, which is useful for testing the switch statement.

As we will see, the condition part of a while statement can return either a Boolean or some other type. This requires that the expect type that gets passed around comprises a type (enum vtype) and a flag to indicate that Vbool is also permitted.

As there are, as yet, no distinct types that are compatible, there isn't much subtlety in the analysis. When we have distinct number types, this will become more interesting.

Error reporting

When analysis discovers an inconsistency it needs to report an error; just refusing to run the code esure that the error doesn't cascade, but by itself it isn't very useful. A clear understand of the sort of error message that are useful will help guide the process of analysis.

At a simplistic level, the only sort of error that type analysis can report is that the type of some construct doesn't match a contextual requirement. For example, in 4 + "hello" the addition provides a contextual requirement for numbers, but "hello" is not a number. In this particular example no further information is needed as the types are obvious from local information. When a variable is involved that isn't the case. It may be helpful to explain why the variable has a particular type, by indicating the location where the type was set, whether by declaration or usage.

Using a recursive-descent analysis we can easily detect a problem at multiple locations. In "hello:= "there"; 4 + hello" the addition will detect that one argument is not a number and the usage of hello will detect that a number was wanted, but not provided. In this (early) version of the language, we will generate error reports at multiple locations, to the use of hello will report an error and explain were the value was set, and the addition will report an error and say why numbers are needed. To be able to report locations for errors, each language element will need to record a file location (line and column) and each variable will need to record the language element where its type was set. For now we will assume that each line of an error message indicates one location in the file, and up to 2 types. So we provide a printf-like function which takes a format, a language (a struct exec which has not yet been introduced), and 2 types. "$1" reports the first type, "$2" reports the second. We will need a function to print the location, once we know how that is stored.

forward decls
static void fput_loc(struct exec *loc, FILE *f);
core functions
static void type_err(struct parse_context *c,
                     char *fmt, struct exec *loc,
                     enum vtype t1, enum vtype t2)
{
    fprintf(stderr, "%s:", c->file_name);
    fput_loc(loc, stderr);
    for (; *fmt ; fmt++) {
        if (*fmt != '%') {
            fputc(*fmt, stderr);
            continue;
        }
        fmt++;
        switch (*fmt) {
        case '%': fputc(*fmt, stderr); break;
        default: fputc('?', stderr); break;
        case '1':
            fputs(vtype_names[t1], stderr);
            break;
        case '2':
            fputs(vtype_names[t2], stderr);
            break;
        ## format cases
        }
    }
    fputs("\n", stderr);
    c->parse_error = 1;
}

static void tok_err(struct parse_context *c, char *fmt, struct token *t)
{
    fprintf(stderr, "%s:%d:%d: %s\n", c->file_name, t->line, t->col, fmt);
    c->parse_error = 1;
}

Data Structures

One last introductory step before detailing the language elements and providing their four requirements is to establish the data structures to store these elements.

There are two key objects that we need to work with: executable elements which comprise the program, and values which the program works with. Between these are the variables in their various scopes which hold the values.

Values

Values can be numbers, which we represent as multi-precision fractions, strings, Booleans and labels. When analysing the program we also need to allow for places where no value is meaningful (Vnone) and where we don't know what type to expect yet (Vunknown which can be anything and Vnolabel which can be anything except a label). A 2 character 'tail' is included in each value as the scanner wants to parse that from the end of numbers and we need somewhere to put it. It is currently ignored but one day might allow for e.g. "imaginary" numbers.

Values are never shared, they are always copied when used, and freed when no longer needed.

When propagating type information around the program, we need to determine if two types are compatible, where Vunknown is compatible which anything, and Vnolabel is compatible with anything except a label. A separate funtion to encode this rule will simplify some code later.

When assigning command line arguments to variable, we need to be able to parse each type from a string.

includes
#include <gmp.h>
#include "string.h"
#include "number.h"
libs
myLDLIBS := libnumber.o libstring.o -lgmp
LDLIBS := $(filter-out $(myLDLIBS),$(LDLIBS)) $(myLDLIBS)
ast
struct value {
    enum vtype {Vnolabel, Vunknown, Vnone, Vstr, Vnum, Vbool, Vlabel} vtype;
    union {
        struct text str;
        mpq_t num;
        int bool;
        void *label;
    };
    char tail[2];
};

char *vtype_names[] = {"nolabel", "unknown", "none", "string",
                       "number", "Boolean", "label"};
ast functions
static void free_value(struct value v)
{
    switch (v.vtype) {
    case Vnone:
    case Vnolabel:
    case Vunknown: break;
    case Vstr: free(v.str.txt); break;
    case Vnum: mpq_clear(v.num); break;
    case Vlabel:
    case Vbool: break;
    }
}

static int vtype_compat(enum vtype require, enum vtype have, int bool_permitted)
{
    if (bool_permitted && have == Vbool)
        return 1;
    switch (require) {
    case Vnolabel:
        return have != Vlabel;
    case Vunknown:
        return 1;
    default:
        return have == Vunknown || require == have;
    }
}
value functions
static void val_init(struct value *val, enum vtype type)
{
    val->vtype = type;
    switch(type) {
    case Vnone:abort();
    case Vnolabel:
    case Vunknown: break;
    case Vnum:
        mpq_init(val->num); break;
    case Vstr:
        val->str.txt = malloc(1);
        val->str.len = 0;
        break;
    case Vbool:
        val->bool = 0;
        break;
    case Vlabel:
        val->label = val;
        break;
    }
}

static struct value dup_value(struct value v)
{
    struct value rv;
    rv.vtype = v.vtype;
    switch (rv.vtype) {
    case Vnone:
    case Vnolabel:
    case Vunknown: break;
    case Vlabel:
        rv.label = v.label;
        break;
    case Vbool:
        rv.bool = v.bool;
        break;
    case Vnum:
        mpq_init(rv.num);
        mpq_set(rv.num, v.num);
        break;
    case Vstr:
        rv.str.len = v.str.len;
        rv.str.txt = malloc(rv.str.len);
        memcpy(rv.str.txt, v.str.txt, v.str.len);
        break;
    }
    return rv;
}

static int value_cmp(struct value left, struct value right)
{
    int cmp;
    if (left.vtype != right.vtype)
        return left.vtype - right.vtype;
    switch (left.vtype) {
    case Vlabel: cmp = left.label == right.label ? 0 : 1; break;
    case Vnum: cmp = mpq_cmp(left.num, right.num); break;
    case Vstr: cmp = text_cmp(left.str, right.str); break;
    case Vbool: cmp = left.bool - right.bool; break;
    case Vnone:
    case Vnolabel:
    case Vunknown: cmp = 0;
    }
    return cmp;
}

static struct text text_join(struct text a, struct text b)
{
    struct text rv;
    rv.len = a.len + b.len;
    rv.txt = malloc(rv.len);
    memcpy(rv.txt, a.txt, a.len);
    memcpy(rv.txt+a.len, b.txt, b.len);
    return rv;
}

static void print_value(struct value v)
{
    switch (v.vtype) {
    case Vunknown:
        printf("*Unknown*"); break;
    case Vnone:
    case Vnolabel:
        printf("*no-value*"); break;
    case Vlabel:
        printf("*label-%p*", v.label); break;
    case Vstr:
        printf("%.*s", v.str.len, v.str.txt); break;
    case Vbool:
        printf("%s", v.bool ? "True":"False"); break;
    case Vnum:
        {
        mpf_t fl;
        mpf_init2(fl, 20);
        mpf_set_q(fl, v.num);
        gmp_printf("%Fg", fl);
        mpf_clear(fl);
        break;
        }
    }
}

static int parse_value(struct value *vl, char *arg)
{
    struct text tx;
    int neg = 0;
    switch(vl->vtype) {
    case Vnolabel:
    case Vlabel:
    case Vunknown:
    case Vnone:
        return 0;
    case Vstr:
        vl->str.len = strlen(arg);
        vl->str.txt = malloc(vl->str.len);
        memcpy(vl->str.txt, arg, vl->str.len);
        break;
    case Vnum:
        if (*arg == '-') {
            neg = 1;
            arg++;
        }
        tx.txt = arg; tx.len = strlen(tx.txt);
        if (number_parse(vl->num, vl->tail, tx) == 0)
            mpq_init(vl->num);
        else if (neg)
            mpq_neg(vl->num, vl->num);
        break;
    case Vbool:
        if (strcasecmp(arg, "true") == 0 ||
            strcmp(arg, "1") == 0)
            vl->bool = 1;
        else if (strcasecmp(arg, "false") == 0 ||
                 strcmp(arg, "0") == 0)
            vl->bool = 0;
        else {
            printf("Bad bool: %s\n", arg);
            return 0;
        }
        break;
    }
    return 1;
}

Variables

Variables are scoped named values. We store the names in a linked list of "bindings" sorted lexically, and use sequential search and insertion sort.

ast
struct binding {
    struct text name;
    struct binding *next;   // in lexical order
    ## binding fields
};

This linked list is stored in the parse context so that "reduce" functions can find or add variables, and so the analysis phase can ensure that every variable gets a type.

parse context
struct binding *varlist;  // In lexical order
ast functions
static struct binding *find_binding(struct parse_context *c, struct text s)
{
    struct binding **l = &c->varlist;
    struct binding *n;
    int cmp = 1;

    while (*l &&
        (cmp = text_cmp((*l)->name, s)) < 0)
            l = & (*l)->next;
    if (cmp == 0)
        return *l;
    n = calloc(1, sizeof(*n));
    n->name = s;
    n->next = *l;
    *l = n;
    return n;
}

Each name can be linked to multiple variables defined in different scopes. Each scope starts where the name is declared and continues until the end of the containing code block. Scopes of a given name cannot nest, so a declaration while a name is in-scope is an error.

binding fields
struct variable *var;
ast
struct variable {
    struct variable *previous;
    struct value val;
    struct binding *name;
    struct exec *where_decl;// where name was declared
    struct exec *where_set; // where type was set
    ## variable fields
};

While the naming seems strange, we include local constants in the definition of variables. A name declared var := value can subsequently be changed, but a name declared var ::= value cannot - it is constant

variable fields
int constant;

Scopes in parallel branches can be partially merged. More specifically, if a given name is declared in both branches of an if/else then it's scope is a candidate for merging. Similarly if every branch of an exhaustive switch (e.g. has an "else" clause) declares a given name, then the scopes from the branches are candidates for merging.

Note that names declared inside a loop (which is only parallel to itself) are never visible after the loop. Similarly names defined in scopes which are not parallel, such as those started by for and switch, are never visible after the scope. Only variable defined in both then and else (including the implicit then after an if, and excluding then used with for) and in all cases and else of a switch or while can be visible beyond the if/switch/while.

Labels, which are a bit like variables, follow different rules. Labels are not explicitly declared, but if an undeclared name appears in a context where a label is legal, that effectively declares the name as a label. The declaration remains in force (or in scope) at least to the end of the immediately containing block and conditionally in any larger containing block which does not declare the name in some other way. Importantly, the conditional scope extension happens even if the label is only used in parallel branch of a conditional -- when used in one branch it is treated as having been declared in all branches.

Merge candidates are tentatively visible beyond the end of the branching statement which creates them. If the name is used, the merge is affirmed and they become a single variable visible at the outer layer. If not - if it is redeclared first - the merge lapses.

To track scopes we have an extra stack, implemented as a linked list, which roughly parallels the parse stack and which is used exclusively for scoping. When a new scope is opened, a new frame is pushed and the child-count of the parent frame is incremented. This child-count is used to distinguish between the first of a set of parallel scopes, in which declared variables must not be in scope, and subsequent branches, whether they must already be conditionally scoped.

To push a new frame before any code in the frame is parsed, we need a grammar reduction. This is most easily achieved with a grammar element which derives the empty string, and created the new scope when it is recognized. This can be placed, for example, between a keyword like "if" and the code following it.

ast
struct scope {
    struct scope *parent;
    int child_count;
};
parse context
int scope_depth;
struct scope *scope_stack;
ast functions
static void scope_pop(struct parse_context *c)
{
    struct scope *s = c->scope_stack;

    c->scope_stack = s->parent;
    free(s);
    c->scope_depth -= 1;
}

static void scope_push(struct parse_context *c)
{
    struct scope *s = calloc(1, sizeof(*s));
    if (c->scope_stack)
        c->scope_stack->child_count += 1;
    s->parent = c->scope_stack;
    c->scope_stack = s;
    c->scope_depth += 1;
}
Grammar
$void
OpenScope -> ${ scope_push(config2context(config)); }$

Each variable records a scope depth and is in one of four states:

The scope depth is not greater than the current parse context scope nest depth. When the block of that depth closes, the state will change. To achieve this, all "in scope" variables are linked together as a stack in nesting order.

The scope depth is the depth of the last parallel block that enclosed the declaration, and that has closed.

variable fields
int depth, min_depth;
enum { OutScope, PendingScope, CondScope, InScope } scope;
struct variable *in_scope;
parse context
struct variable *in_scope;

All variables with the same name are linked together using the 'previous' link. Those variable that have been affirmatively merged all have a 'merged' pointer that points to one primary variable - the most recently declared instance. When merging variables, we need to also adjust the 'merged' pointer on any other variables that had previously been merged with the one that will no longer be primary.

variable fields
struct variable *merged;
ast functions
static void variable_merge(struct variable *primary, struct variable *secondary)
{
    struct variable *v;

    if (primary->merged)
        // shouldn't happen
        primary = primary->merged;

    for (v = primary->previous; v; v=v->previous)
        if (v == secondary || v == secondary->merged ||
            v->merged == secondary ||
            (v->merged && v->merged == secondary->merged)) {
            v->scope = OutScope;
            v->merged = primary;
        }
}
free context
while (context.varlist) {
    struct binding *b = context.varlist;
    struct variable *v = b->var;
    context.varlist = b->next;
    free(b);
    while (v) {
        struct variable *t = v;

        v = t->previous;
        free_value(t->val);
        free(t);
    }
}

Manipulating Bindings

When a name is conditionally visible, a new declaration discards the old binding - the condition lapses. Conversely a usage of the name affirms the visibility and extends it to the end of the containing block - i.e. the block that contains both the original declaration and the latest usage. This is determined from min_depth. When a conditionally visible variable gets affirmed like this, it is also merged with other conditionally visible variables with the same name.

When we parse a variable declaration we either signal an error if the name is currently bound, or create a new variable at the current nest depth if the name is unbound or bound to a conditionally scoped or pending-scope variable. If the previous variable was conditionally scoped, it and its homonyms becomes out-of-scope.

When we parse a variable reference (including non-declarative assignment) we signal an error if the name is not bound or is bound to a pending-scope variable; update the scope if the name is bound to a conditionally scoped variable; or just proceed normally if the named variable is in scope.

When we exit a scope, any variables bound at this level are either marked out of scope or pending-scoped, depending on whether the scope was sequential or parallel.

When exiting a parallel scope we check if there are any variables that were previously pending and are still visible. If there are, then there weren't redeclared in the most recent scope, so they cannot be merged and must become out-of-scope. If it is not the first of parallel scopes (based on child_count), we check that there was a previous binding that is still pending-scope. If there isn't, the new variable must now be out-of-scope.

When exiting a sequential scope that immediately enclosed parallel scopes, we need to resolve any pending-scope variables. If there was no else clause, and we cannot determine that the switch was exhaustive, we need to mark all pending-scope variable as out-of-scope. Otherwise all pending-scope variables become conditionally scoped.

ast
enum closetype { CloseSequential, CloseParallel, CloseElse };
ast functions
static struct variable *var_decl(struct parse_context *c, struct text s)
{
    struct binding *b = find_binding(c, s);
    struct variable *v = b->var;

    switch (v ? v->scope : OutScope) {
    case InScope:
        /* Caller will report the error */
        return NULL;
    case CondScope:
        for (;
             v && v->scope == CondScope;
             v = v->previous)
            v->scope = OutScope;
        break;
    default: break;
    }
    v = calloc(1, sizeof(*v));
    v->previous = b->var;
    b->var = v;
    v->name = b;
    v->min_depth = v->depth = c->scope_depth;
    v->scope = InScope;
    v->in_scope = c->in_scope;
    c->in_scope = v;
    val_init(&v->val, Vunknown);
    return v;
}

static struct variable *var_ref(struct parse_context *c, struct text s)
{
    struct binding *b = find_binding(c, s);
    struct variable *v = b->var;
    struct variable *v2;

    switch (v ? v->scope : OutScope) {
    case OutScope:
    case PendingScope:
        /* Signal an error - once that is possible */
        return NULL;
    case CondScope:
        /* All CondScope variables of this name need to be merged
         * and become InScope
         */
        v->depth = v->min_depth;
        v->scope = InScope;
        for (v2 = v->previous;
             v2 && v2->scope == CondScope;
             v2 = v2->previous)
            variable_merge(v, v2);
        break;
    case InScope:
        break;
    }
    return v;
}

static void var_block_close(struct parse_context *c, enum closetype ct)
{
    /* close of all variables that are in_scope */
    struct variable *v, **vp, *v2;

    scope_pop(c);
    for (vp = &c->in_scope;
         v = *vp, v && v->depth > c->scope_depth && v->min_depth > c->scope_depth;
         ) {
        switch (ct) {
        case CloseElse:
        case CloseParallel: /* handle PendingScope */
            switch(v->scope) {
            case InScope:
            case CondScope:
                if (c->scope_stack->child_count == 1)
                    v->scope = PendingScope;
                else if (v->previous &&
                         v->previous->scope == PendingScope)
                    v->scope = PendingScope;
                else if (v->val.vtype == Vlabel)
                    v->scope = PendingScope;
                else if (v->name->var == v)
                    v->scope = OutScope;
                if (ct == CloseElse) {
                    /* All Pending variables with this name
                     * are now Conditional */
                    for (v2 = v;
                         v2 && v2->scope == PendingScope;
                         v2 = v2->previous)
                        v2->scope = CondScope;
                }
                break;
            case PendingScope:
                for (v2 = v;
                     v2 && v2->scope == PendingScope;
                     v2 = v2->previous)
                    if (v2->val.vtype != Vlabel)
                        v2->scope = OutScope;
                break;
            case OutScope: break;
            }
            break;
        case CloseSequential:
            if (v->val.vtype == Vlabel)
                v->scope = PendingScope;
            switch (v->scope) {
            case InScope:
                v->scope = OutScope;
                break;
            case PendingScope:
                /* There was no 'else', so we can only become
                 * conditional if we know the cases were exhaustive,
                 * and that doesn't mean anything yet.
                 * So only labels become conditional..
                 */
                for (v2 = v;
                     v2 && v2->scope == PendingScope;
                     v2 = v2->previous)
                    if (v2->val.vtype == Vlabel) {
                        v2->scope = CondScope;
                        v2->min_depth = c->scope_depth;
                    } else
                        v2->scope = OutScope;
                break;
            case CondScope:
            case OutScope: break;
            }
            break;
        }
        if (v->scope == OutScope)
            *vp = v->in_scope;
        else
            vp = &v->in_scope;
    }
}

Executables

Executables can be lots of different things. In many cases an executable is just an operation combined with one or two other executables. This allows for expressions and lists etc. Other times an executable is something quite specific like a constant or variable name. So we define a struct exec to be a general executable with a type, and a struct binode which is a subclass of exec and forms a node in a binary tree and holding an operation. There will be other subclasses, and to access these we need to be able to cast the exec into the various other types.

macros
#define cast(structname, pointer) ({        \
    const typeof( ((struct structname *)0)->type) *__mptr = &(pointer)->type; \
    if (__mptr && *__mptr != X##structname) abort();        \
    (struct structname *)( (char *)__mptr);})

#define new(structname) ({                      \
    struct structname *__ptr = ((struct structname *)calloc(1,sizeof(struct structname))); \
    __ptr->type = X##structname;                        \
    __ptr->line = -1; __ptr->column = -1;                   \
    __ptr;})

#define new_pos(structname, token) ({                       \
    struct structname *__ptr = ((struct structname *)calloc(1,sizeof(struct structname))); \
    __ptr->type = X##structname;                        \
    __ptr->line = token.line; __ptr->column = token.col;            \
    __ptr;})
ast
enum exec_types {
    Xbinode,
    ## exec type
};
struct exec {
    enum exec_types type;
    int line, column;
};
struct binode {
    struct exec;
    enum Btype {
        ## Binode types
    } op;
    struct exec *left, *right;
};
ast functions
static int __fput_loc(struct exec *loc, FILE *f)
{
    if (loc->line >= 0) {
        fprintf(f, "%d:%d: ", loc->line, loc->column);
        return 1;
    }
    if (loc->type == Xbinode)
        return __fput_loc(cast(binode,loc)->left, f) ||
               __fput_loc(cast(binode,loc)->right, f);
    return 0;
}
static void fput_loc(struct exec *loc, FILE *f)
{
    if (!__fput_loc(loc, f))
        fprintf(f, "??:??: ");
}

Each different type of exec node needs a number of functions defined, a bit like methods. We must be able to be able to free it, print it, analyse it and execute it. Once we have specific exec types we will need to parse them too. Let's take this a bit more slowly.

Freeing

The parser generator requires a free_foo function for each struct that stores attributes and they will be execs of subtypes there-of. So we need free_exec which can handle all the subtypes, and we need free_binode.

ast functions
static void free_binode(struct binode *b)
{
    if (!b)
        return;
    free_exec(b->left);
    free_exec(b->right);
    free(b);
}
core functions
static void free_exec(struct exec *e)
{
    if (!e)
        return;
    switch(e->type) {
        ## free exec cases
    }
}
forward decls
static void free_exec(struct exec *e);
free exec cases
case Xbinode: free_binode(cast(binode, e)); break;

Printing

Printing an exec requires that we know the current indent level for printing line-oriented components. As will become clear later, we also want to know what sort of bracketing to use.

ast functions
static void do_indent(int i, char *str)
{
    while (i--)
        printf("    ");
    printf("%s", str);
}
core functions
static void print_binode(struct binode *b, int indent, int bracket)
{
    struct binode *b2;
    switch(b->op) {
    ## print binode cases
    }
}

static void print_exec(struct exec *e, int indent, int bracket)
{
    if (!e)
        return;
    switch (e->type) {
    case Xbinode:
        print_binode(cast(binode, e), indent, bracket); break;
    ## print exec cases
    }
}
forward decls
static void print_exec(struct exec *e, int indent, int bracket);

Analysing

As discussed, analysis involves propagating type requirements around the program and looking for errors.

So propagate_types is passed an expected type (being a vtype together with a bool_permitted flag) that the exec is expected to return, and returns the type that it does return, either of which can be Vunknown. An ok flag is passed by reference. It is set to 0 when an error is found, and 2 when any change is made. If it remains unchanged at 1, then no more propagation is needed.

core functions
static enum vtype propagate_types(struct exec *prog, struct parse_context *c, int *ok,
                                  enum vtype type, int bool_permitted)
{
    enum vtype t;

    if (!prog)
        return Vnone;

    switch (prog->type) {
    case Xbinode:
    {
        struct binode *b = cast(binode, prog);
        switch (b->op) {
        ## propagate binode cases
        }
        break;
    }
    ## propagate exec cases
    }
    return Vnone;
}

Interpreting

Interpreting an exec doesn't require anything but the exec. State is stored in variables and each variable will be directly linked from within the exec tree. The exception to this is the whole program which needs to look at command line arguments. The program will be interpreted separately.

Each exec can return a value, which may be Vnone but shouldn't be Vunknown.

core functions
static struct value interp_exec(struct exec *e)
{
    struct value rv;
    rv.vtype = Vnone;
    if (!e)
        return rv;

    switch(e->type) {
    case Xbinode:
    {
        struct binode *b = cast(binode, e);
        struct value left, right;
        left.vtype = right.vtype = Vnone;
        switch (b->op) {
        ## interp binode cases
        }
        free_value(left); free_value(right);
        break;
    }
    ## interp exec cases
    }
    return rv;
}

Language elements

Each language element needs to be parsed, printed, analysed, interpreted, and freed. There are several, so let's just start with the easy ones and work our way up.

Values

We have already met values as separate objects. When manifest constants appear in the program text that must result in an executable which has a constant value. So the val structure embeds a value in an executable.

exec type
Xval,
ast
struct val {
    struct exec;
    struct value val;
};
Grammar
$*val
Value ->  True ${
        $0 = new_pos(val, $1);
        $0->val.vtype = Vbool;
        $0->val.bool = 1;
        }$
    | False ${
        $0 = new_pos(val, $1);
        $0->val.vtype = Vbool;
        $0->val.bool = 0;
        }$
    | NUMBER ${
        $0 = new_pos(val, $1);
        $0->val.vtype = Vnum;
        if (number_parse($0->val.num, $0->val.tail, $1.txt) == 0)
            mpq_init($0->val.num);
            if ($0->val.tail[0])
                tok_err(config2context(config), "error: unsupported number suffix.",
                        &$1);
        }$
    | STRING ${
        $0 = new_pos(val, $1);
        $0->val.vtype = Vstr;
        string_parse(&$1, '\\', &$0->val.str, $0->val.tail);
        if ($0->val.tail[0])
            tok_err(config2context(config), "error: unsupported string suffix.",
                    &$1);
        }$
    | MULTI_STRING ${
        $0 = new_pos(val, $1);
        $0->val.vtype = Vstr;
        string_parse(&$1, '\\', &$0->val.str, $0->val.tail);
        if ($0->val.tail[0])
            tok_err(config2context(config), "error: unsupported string suffix.",
                    &$1);
        }$
print exec cases
case Xval:
{
    struct val *v = cast(val, e);
    if (v->val.vtype == Vstr)
        printf("\"");
    print_value(v->val);
    if (v->val.vtype == Vstr)
        printf("\"");
    break;
}
propagate exec cases
    case Xval:
    {
        struct val *val = cast(val, prog);
        if (!vtype_compat(type, val->val.vtype, bool_permitted)) {
            type_err(c, "error: expected %1 found %2",
                       prog, type, val->val.vtype);
            *ok = 0;
        }
        return val->val.vtype;
    }
interp exec cases
case Xval:
    return dup_value(cast(val, e)->val);
ast functions
static void free_val(struct val *v)
{
    if (!v)
        return;
    free_value(v->val);
    free(v);
}
free exec cases
case Xval: free_val(cast(val, e)); break;
ast functions
// Move all nodes from 'b' to 'rv', reversing the order.
// In 'b' 'left' is a list, and 'right' is the last node.
// In 'rv', left' is the first node and 'right' is a list.
static struct binode *reorder_bilist(struct binode *b)
{
    struct binode *rv = NULL;

    while (b) {
        struct exec *t = b->right;
        b->right = rv;
        rv = b;
        if (b->left)
            b = cast(binode, b->left);
        else
            b = NULL;
        rv->left = t;
    }
    return rv;
}

Variables

Just as we used as val to wrap a value into an exec, we similarly need a var to wrap a variable into an exec. While each val contained a copy of the value, each var hold a link to the variable because it really is the same variable no matter where it appears. When a variable is used, we need to remember to follow the ->merged link to find the primary instance.

exec type
Xvar,
ast
struct var {
    struct exec;
    struct variable *var;
};
Grammar
$*var
VariableDecl -> IDENTIFIER := ${ {
    struct variable *v = var_decl(config2context(config), $1.txt);
    $0 = new_pos(var, $1);
    $0->var = v;
    if (v)
        v->where_decl = $0;
    else {
        v = var_ref(config2context(config), $1.txt);
        $0->var = v;
        type_err(config2context(config), "error: variable '%v' redeclared",
                 $0, Vnone, Vnone);
        type_err(config2context(config), "info: this is where '%v' was first declared",
                 v->where_decl, Vnone, Vnone);
    }
} }$
    | IDENTIFIER ::= ${ {
    struct variable *v = var_decl(config2context(config), $1.txt);
    $0 = new_pos(var, $1);
    $0->var = v;
    if (v) {
        v->where_decl = $0;
        v->constant = 1;
    } else {
        v = var_ref(config2context(config), $1.txt);
        $0->var = v;
        type_err(config2context(config), "error: variable '%v' redeclared",
                 $0, Vnone, Vnone);
        type_err(config2context(config), "info: this is where '%v' was first declared",
                 v->where_decl, Vnone, Vnone);
    }
} }$

Variable -> IDENTIFIER ${ {
    struct variable *v = var_ref(config2context(config), $1.txt);
    $0 = new_pos(var, $1);
    if (v == NULL) {
        /* This might be a label - allocate a var just in case */
        v = var_decl(config2context(config), $1.txt);
        if (v) {
            val_init(&v->val, Vlabel);
            v->where_set = $0;
        }
    }
    $0->var = v;
} }$
print exec cases
case Xvar:
{
    struct var *v = cast(var, e);
    if (v->var) {
        struct binding *b = v->var->name;
        printf("%.*s", b->name.len, b->name.txt);
    }
    break;
}
format cases
case 'v':
    if (loc->type == Xvar) {
        struct var *v = cast(var, loc);
        if (v->var) {
            struct binding *b = v->var->name;
            fprintf(stderr, "%.*s", b->name.len, b->name.txt);
        } else
            fputs("???", stderr);
    } else
        fputs("NOTVAR", stderr);
    break;
propagate exec cases
case Xvar:
{
    struct var *var = cast(var, prog);
    struct variable *v = var->var;
    if (!v) {
        type_err(c, "%d:BUG: no variable!!", prog, Vnone, Vnone);
        *ok = 0;
        return Vnone;
    }
    if (v->merged)
        v = v->merged;
    if (v->val.vtype == Vunknown) {
        if (type > Vunknown && *ok != 0) {
            val_init(&v->val, type);
            v->where_set = prog;
            *ok = 2;
        }
        return type;
    }
    if (!vtype_compat(type, v->val.vtype, bool_permitted)) {
        type_err(c, "error: expected %1 but variable '%v' is %2", prog,
                 type, v->val.vtype);
        type_err(c, "info: this is where '%v' was set to %1", v->where_set,
                 v->val.vtype, Vnone);
        *ok = 0;
    }
    if (type <= Vunknown)
        return v->val.vtype;
    return type;
}
interp exec cases
case Xvar:
{
    struct var *var = cast(var, e);
    struct variable *v = var->var;

    if (v->merged)
        v = v->merged;
    return dup_value(v->val);
}
ast functions
static void free_var(struct var *v)
{
    free(v);
}
free exec cases
case Xvar: free_var(cast(var, e)); break;

Expressions: Boolean

Our first user of the binode will be expressions, and particularly Boolean expressions. As I haven't implemented precedence in the parser generator yet, we need different names from each precedence level used by expressions. The outer most or lowest level precedence are Boolean or and, and not which form an Expression out of BTerms and BFacts.

Binode types
And,
Or,
Not,
# Grammar
$*exec
Expression -> Expression or BTerm ${ {
        struct binode *b = new(binode);
        b->op = Or;
        b->left = $<1;
        b->right = $<3;
        $0 = b;
    } }$
    | BTerm ${ $0 = $<1; }$

BTerm -> BTerm and BFact ${ {
        struct binode *b = new(binode);
        b->op = And;
        b->left = $<1;
        b->right = $<3;
        $0 = b;
    } }$
    | BFact ${ $0 = $<1; }$

BFact -> not BFact ${ {
        struct binode *b = new(binode);
        b->op = Not;
        b->right = $<2;
        $0 = b;
    } }$
    ## other BFact
print binode cases
case And:
    print_exec(b->left, -1, 0);
    printf(" and ");
    print_exec(b->right, -1, 0);
    break;
case Or:
    print_exec(b->left, -1, 0);
    printf(" or ");
    print_exec(b->right, -1, 0);
    break;
case Not:
    printf("not ");
    print_exec(b->right, -1, 0);
    break;
propagate binode cases
case And:
case Or:
case Not:
    /* both must be Vbool, result is Vbool */
    propagate_types(b->left, c, ok, Vbool, 0);
    propagate_types(b->right, c, ok, Vbool, 0);
    if (type != Vbool && type > Vunknown) {
        type_err(c, "error: %1 operation found where %2 expected", prog,
               Vbool, type);
        *ok = 0;
    }
    return Vbool;
interp binode cases
case And:
    rv = interp_exec(b->left);
    right = interp_exec(b->right);
    rv.bool = rv.bool && right.bool;
    break;
case Or:
    rv = interp_exec(b->left);
    right = interp_exec(b->right);
    rv.bool = rv.bool || right.bool;
    break;
case Not:
    rv = interp_exec(b->right);
    rv.bool = !rv.bool;
    break;

Expressions: Comparison

Of slightly higher precedence that Boolean expressions are Comparisons. A comparison takes arguments of any type, but the two types must be the same.

To simplify the parsing we introduce an eop which can return an expression operator.

ast
struct eop {
    enum Btype op;
};
ast functions
static void free_eop(struct eop *e)
{
    if (e)
        free(e);
}
Binode types
Less,
Gtr,
LessEq,
GtrEq,
Eql,
NEql,
other BFact
| Expr CMPop Expr ${ {
        struct binode *b = new(binode);
        b->op = $2.op;
        b->left = $<1;
        b->right = $<3;
        $0 = b;
} }$
| Expr ${ $0 = $<1; }$
Grammar
$eop
CMPop ->   < ${ $0.op = Less; }$
    |  > ${ $0.op = Gtr; }$
    |  <= ${ $0.op = LessEq; }$
    |  >= ${ $0.op = GtrEq; }$
    |  == ${ $0.op = Eql; }$
    |  != ${ $0.op = NEql; }$
print binode cases
case Less:
case LessEq:
case Gtr:
case GtrEq:
case Eql:
case NEql:
    print_exec(b->left, -1, 0);
    switch(b->op) {
    case Less:   printf(" < "); break;
    case LessEq: printf(" <= "); break;
    case Gtr:    printf(" > "); break;
    case GtrEq:  printf(" >= "); break;
    case Eql:    printf(" == "); break;
    case NEql:   printf(" != "); break;
    default: abort();
    }
    print_exec(b->right, -1, 0);
    break;
propagate binode cases
case Less:
case LessEq:
case Gtr:
case GtrEq:
case Eql:
case NEql:
    /* Both must match but not labels, result is Vbool */
    t = propagate_types(b->left, c, ok, Vnolabel, 0);
    if (t > Vunknown)
        propagate_types(b->right, c, ok, t, 0);
    else {
        t = propagate_types(b->right, c, ok, Vnolabel, 0);
        if (t > Vunknown)
            t = propagate_types(b->left, c, ok, t, 0);
    }
    if (!vtype_compat(type, Vbool, 0)) {
        type_err(c, "error: Comparison returns %1 but %2 expected", prog,
                    Vbool, type);
        *ok = 0;
    }
    return Vbool;
interp binode cases
case Less:
case LessEq:
case Gtr:
case GtrEq:
case Eql:
case NEql:
{
    int cmp;
    left = interp_exec(b->left);
    right = interp_exec(b->right);
    cmp = value_cmp(left, right);
    rv.vtype = Vbool;
    switch (b->op) {
    case Less:  rv.bool = cmp <  0; break;
    case LessEq:    rv.bool = cmp <= 0; break;
    case Gtr:   rv.bool = cmp >  0; break;
    case GtrEq: rv.bool = cmp >= 0; break;
    case Eql:   rv.bool = cmp == 0; break;
    case NEql:  rv.bool = cmp != 0; break;
    default: rv.bool = 0; break;
    }
    break;
}

Expressions: The rest

The remaining expressions with the highest precedence are arithmetic and string concatenation. There are Expr, Term, and Factor. The Factor is where the Value and Variable that we already have are included.

+ and - are both infix and prefix operations (where they are absolute value and negation). These have different operator names.

We also have a 'Bracket' operator which records where parentheses were found. This make it easy to reproduce these when printing. Once precedence is handled better I might be able to discard this.

Binode types
Plus, Minus,
Times, Divide,
Concat,
Absolute, Negate,
Bracket,
Grammar
$*exec
Expr -> Expr Eop Term ${ {
        struct binode *b = new(binode);
        b->op = $2.op;
        b->left = $<1;
        b->right = $<3;
        $0 = b;
    } }$
    | Term ${ $0 = $<1; }$

Term -> Term Top Factor ${ {
        struct binode *b = new(binode);
        b->op = $2.op;
        b->left = $<1;
        b->right = $<3;
        $0 = b;
    } }$
    | Factor ${ $0 = $<1; }$

Factor -> ( Expression ) ${ {
        struct binode *b = new_pos(binode, $1);
        b->op = Bracket;
        b->right = $<2;
        $0 = b;
    } }$
    | Uop Factor ${ {
        struct binode *b = new(binode);
        b->op = $1.op;
        b->right = $<2;
        $0 = b;
    } }$
    | Value ${ $0 = $<1; }$
    | Variable ${ $0 = $<1; }$

$eop
Eop ->    + ${ $0.op = Plus; }$
    | - ${ $0.op = Minus; }$

Uop ->    + ${ $0.op = Absolute; }$
    | - ${ $0.op = Negate; }$

Top ->    * ${ $0.op = Times; }$
    | / ${ $0.op = Divide; }$
    | ++ ${ $0.op = Concat; }$
print binode cases
case Plus:
case Minus:
case Times:
case Divide:
case Concat:
    print_exec(b->left, indent, 0);
    switch(b->op) {
    case Plus:   printf(" + "); break;
    case Minus:  printf(" - "); break;
    case Times:  printf(" * "); break;
    case Divide: printf(" / "); break;
    case Concat: printf(" ++ "); break;
    default: abort();
    }
    print_exec(b->right, indent, 0);
    break;
case Absolute:
    printf("+");
    print_exec(b->right, indent, 0);
    break;
case Negate:
    printf("-");
    print_exec(b->right, indent, 0);
    break;
case Bracket:
    printf("(");
    print_exec(b->right, indent, 0);
    printf(")");
    break;
propagate binode cases
case Plus:
case Minus:
case Times:
case Divide:
    /* both must be numbers, result is Vnum */
case Absolute:
case Negate:
    /* as propagate_types ignores a NULL,
     * unary ops fit here too */
    propagate_types(b->left, c, ok, Vnum, 0);
    propagate_types(b->right, c, ok, Vnum, 0);
    if (!vtype_compat(type, Vnum, 0)) {
        type_err(c, "error: Arithmetic returns %1 but %2 expected", prog,
                   Vnum, type);
        *ok = 0;
    }
    return Vnum;

case Concat:
    /* both must be Vstr, result is Vstr */
    propagate_types(b->left, c, ok, Vstr, 0);
    propagate_types(b->right, c, ok, Vstr, 0);
    if (!vtype_compat(type, Vstr, 0)) {
        type_err(c, "error: Concat returns %1 but %2 expected", prog,
                   Vstr, type);
        *ok = 0;
    }
    return Vstr;

case Bracket:
    return propagate_types(b->right, c, ok, type, 0);
interp binode cases
case Plus:
    rv = interp_exec(b->left);
    right = interp_exec(b->right);
    mpq_add(rv.num, rv.num, right.num);
    break;
case Minus:
    rv = interp_exec(b->left);
    right = interp_exec(b->right);
    mpq_sub(rv.num, rv.num, right.num);
    break;
case Times:
    rv = interp_exec(b->left);
    right = interp_exec(b->right);
    mpq_mul(rv.num, rv.num, right.num);
    break;
case Divide:
    rv = interp_exec(b->left);
    right = interp_exec(b->right);
    mpq_div(rv.num, rv.num, right.num);
    break;
case Negate:
    rv = interp_exec(b->right);
    mpq_neg(rv.num, rv.num);
    break;
case Absolute:
    rv = interp_exec(b->right);
    mpq_abs(rv.num, rv.num);
    break;
case Bracket:
    rv = interp_exec(b->right);
    break;
case Concat:
    left = interp_exec(b->left);
    right = interp_exec(b->right);
    rv.vtype = Vstr;
    rv.str = text_join(left.str, right.str);
    break;

Blocks, Statements, and Statement lists.

Now that we have expressions out of the way we need to turn to statements. There are simple statements and more complex statements. Simple statements do not contain newlines, complex statements do.

Statements often come in sequences and we have corresponding simple statement lists and complex statement lists. The former comprise only simple statements separated by semicolons. The later comprise complex statements and simple statement lists. They are separated by newlines. Thus the semicolon is only used to separate simple statements on the one line. This may be overly restrictive, but I'm not sure I every want a complex statement to share a line with anything else.

Note that a simple statement list can still use multiple lines if subsequent lines are indented, so

Example: wrapped simple statement list
a = b; c = d;
   e = f; print g

is a single simple statement list. This might allow room for confusion, so I'm not set on it yet.

A simple statement list needs no extra syntax. A complex statement list has two syntactic forms. It can be enclosed in braces (much like C blocks), or it can be introduced by a colon and continue until an unindented newline (much like Python blocks). With this extra syntax it is referred to as a block.

Note that a block does not have to include any newlines if it only contains simple statements. So both of:

if condition: a=b; d=f

if condition { a=b; print f }

are valid.

In either case the list is constructed from a binode list with Block as the operator. When parsing the list it is most convenient to append to the end, so a list is a list and a statement. When using the list it is more convenient to consider a list to be a statement and a list. So we need a function to re-order a list. reorder_bilist serves this purpose.

The only stand-alone statement we introduce at this stage is pass which does nothing and is represented as a NULL pointer in a Block list.

Binode types
Block,
Grammar
$void
OptNL -> Newlines
    |

Newlines -> NEWLINE
    | Newlines NEWLINE

$*binode
Open -> {
    | NEWLINE {
Close -> }
    | NEWLINE }
Block -> Open Statementlist Close ${ $0 = $<2; }$
    | Open Newlines Statementlist Close ${ $0 = $<3; }$
    | Open SimpleStatements } ${ $0 = reorder_bilist($<2); }$
    | Open Newlines SimpleStatements } ${ $0 = reorder_bilist($<3); }$
    | : Statementlist ${ $0 = $<2; }$
    | : SimpleStatements ${ $0 = reorder_bilist($<2); }$

Statementlist -> ComplexStatements ${ $0 = reorder_bilist($<1); }$

ComplexStatements -> ComplexStatements ComplexStatement ${
    $0 = new(binode);
    $0->op = Block;
    $0->left = $<1;
    $0->right = $<2;
    }$
    | ComplexStatements NEWLINE ${ $0 = $<1; }$
    | ComplexStatement ${
    $0 = new(binode);
    $0->op = Block;
    $0->left = NULL;
    $0->right = $<1;
    }$

$*exec
ComplexStatement -> SimpleStatements NEWLINE ${
        $0 = reorder_bilist($<1);
        }$
    ## ComplexStatement Grammar

$*binode
SimpleStatements -> SimpleStatements ; SimpleStatement ${
        $0 = new(binode);
        $0->op = Block;
        $0->left = $<1;
        $0->right = $<3;
        }$
    | SimpleStatement ${
        $0 = new(binode);
        $0->op = Block;
        $0->left = NULL;
        $0->right = $<1;
        }$
    | SimpleStatements ; ${ $0 = $<1; }$

SimpleStatement -> pass ${ $0 = NULL; }$
    ## SimpleStatement Grammar
print binode cases
case Block:
    if (indent < 0) {
        // simple statement
        if (b->left == NULL)
            printf("pass");
        else
            print_exec(b->left, indent, 0);
        if (b->right) {
            printf("; ");
            print_exec(b->right, indent, 0);
        }
    } else {
        // block, one per line
        if (b->left == NULL)
            do_indent(indent, "pass\n");
        else
            print_exec(b->left, indent, bracket);
        if (b->right)
            print_exec(b->right, indent, bracket);
    }
    break;
propagate binode cases
case Block:
{
    /* If any statement returns something other then Vnone
     * or Vbool then all such must return same type.
     * As each statement may be Vnone or something else,
     * we must always pass Vunknown down, otherwise an incorrect
     * error might occur.  We never return Vnone unless it is
     * passed in.
     */
    struct binode *e;

    for (e = b; e; e = cast(binode, e->right)) {
        t = propagate_types(e->left, c, ok, Vunknown, bool_permitted);
        if (bool_permitted && t == Vbool)
            t = Vunknown;
        if (t != Vunknown && t != Vnone && t != Vbool) {
            if (type == Vunknown)
                type = t;
            else if (t != type) {
                type_err(c, "error: expected %1, found %2",
                         e->left, type, t);
                *ok = 0;
            }
        }
    }
    return type;
}
interp binode cases
case Block:
    while (rv.vtype == Vnone &&
           b) {
        if (b->left)
            rv = interp_exec(b->left);
        b = cast(binode, b->right);
    }
    break;

The Print statement

print is a simple statement that takes a comma-separated list of expressions and prints the values separated by spaces and terminated by a newline. No control of formatting is possible.

print faces the same list-ordering issue as blocks, and uses the same solution.

Binode types
Print,
SimpleStatement Grammar
| print ExpressionList ${
    $0 = reorder_bilist($<2);
}$
| print ExpressionList , ${
    $0 = new(binode);
    $0->op = Print;
    $0->right = NULL;
    $0->left = $<2;
    $0 = reorder_bilist($0);
}$
| print ${
    $0 = new(binode);
    $0->op = Print;
    $0->right = NULL;
}$
Grammar
$*binode
ExpressionList -> ExpressionList , Expression ${
    $0 = new(binode);
    $0->op = Print;
    $0->left = $<1;
    $0->right = $<3;
    }$
    | Expression ${
        $0 = new(binode);
        $0->op = Print;
        $0->left = NULL;
        $0->right = $<1;
    }$
print binode cases
case Print:
    do_indent(indent, "print");
    while (b) {
        if (b->left) {
            printf(" ");
            print_exec(b->left, -1, 0);
            if (b->right)
                printf(",");
        }
        b = cast(binode, b->right);
    }
    if (indent >= 0)
        printf("\n");
    break;
propagate binode cases
case Print:
    /* don't care but all must be consistent */
    propagate_types(b->left, c, ok, Vnolabel, 0);
    propagate_types(b->right, c, ok, Vnolabel, 0);
    break;
interp binode cases
case Print:
{
    char sep = 0;
    int eol = 1;
    for ( ; b; b = cast(binode, b->right))
        if (b->left) {
            if (sep)
                putchar(sep);
            left = interp_exec(b->left);
            print_value(left);
            free_value(left);
            if (b->right)
                sep = ' ';
        } else if (sep)
            eol = 0;
    left.vtype = Vnone;
    if (eol)
        printf("\n");
    break;
}
Assignment statement

An assignment will assign a value to a variable, providing it hasn't be declared as a constant. The analysis phase ensures that the type will be correct so the interpreter just needs to perform the calculation. There is a form of assignment which declares a new variable as well as assigning a value. If a name is assigned before it is declared, and error will be raised as the name is created as Vlabel and it is illegal to assign to such names.

Binode types
Assign,
Declare,
SimpleStatement Grammar
| Variable = Expression ${ {
        struct var *v = cast(var, $1);

        $0 = new(binode);
        $0->op = Assign;
        $0->left = $<1;
        $0->right = $<3;
        if (v->var && !v->var->constant) {
            /* FIXME error? */
        }
    } }$
| VariableDecl Expression ${
        $0 = new(binode);
        $0->op = Declare;
        $0->left = $<1;
        $0->right =$<2;
    }$
print binode cases
case Assign:
    do_indent(indent, "");
    print_exec(b->left, indent, 0);
    printf(" = ");
    print_exec(b->right, indent, 0);
    if (indent >= 0)
        printf("\n");
    break;

case Declare:
    do_indent(indent, "");
    print_exec(b->left, indent, 0);
    if (cast(var, b->left)->var->constant)
        printf(" ::= ");
    else
        printf(" := ");
    print_exec(b->right, indent, 0);
    if (indent >= 0)
        printf("\n");
    break;
propagate binode cases
case Assign:
case Declare:
    /* Both must match and not be labels, result is Vnone */
    t = propagate_types(b->left, c, ok, Vnolabel, 0);
    if (t > Vunknown) {
        if (propagate_types(b->right, c, ok, t, 0) != t)
            if (b->left->type == Xvar)
                type_err(c, "info: variable '%v' was set as %1 here.",
                         cast(var, b->left)->var->where_set, t, Vnone);
    } else {
        t = propagate_types(b->right, c, ok, Vnolabel, 0);
        if (t > Vunknown)
            propagate_types(b->left, c, ok, t, 0);
    }
    return Vnone;

    break;
interp binode cases
case Assign:
case Declare:
{
    struct variable *v = cast(var, b->left)->var;
    if (v->merged)
        v = v->merged;
    right = interp_exec(b->right);
    free_value(v->val);
    v->val = right;
    right.vtype = Vunknown;
    break;
}

The use statement

The use statement is the last "simple" statement. It is needed when the condition in a conditional statement is a block. use works much like return in C, but only completes the condition, not the whole function.

Binode types
Use,
SimpleStatement Grammar
| use Expression ${
    $0 = new_pos(binode, $1);
    $0->op = Use;
    $0->right = $<2;
}$
print binode cases
case Use:
    do_indent(indent, "use ");
    print_exec(b->right, -1, 0);
    if (indent >= 0)
        printf("\n");
    break;
propagate binode cases
case Use:
    /* result matches value */
    return propagate_types(b->right, c, ok, type, 0);
interp binode cases
case Use:
    rv = interp_exec(b->right);
    break;

The Conditional Statement

This is the biggy and currently the only complex statement. This subsumes if, while, do/while, switch, and some parts of for. It is comprised of a number of parts, all of which are optional though set combinations apply. Each part is (usually) a key word (then is sometimes optional) followed by either an expression of a code block, except the casepart which is a "key word and an expression" followed by a code block. The code-block option is valid for all parts and, where an expression is also allowed, the code block can use the use statement to report a value. If the code block does no report a value the effect is similar to reporting False.

The else and case parts, as well as then when combined with if, can contain a use statement which will apply to some containing conditional statement. for parts, do parts and then parts used with for can never contain a use, except in some subordinate conditional statement.

If there is a forpart, it is executed first, only once. If there is a dopart, then it is executed repeatedly providing always that the condpart or cond, if present, does not return a non-True value. condpart can fail to return any value if it simply executes to completion. This is treated the same as returning True.

If there is a thenpart it will be executed whenever the condpart or cond returns True (or does not return any value), but this will happen after dopart (when present).

If elsepart is present it will be executed at most once when the condition returns False or some value that isn't True and isn't matched by any casepart. If there are any caseparts, they will be executed when the condition returns a matching value.

The particular sorts of values allowed in case parts has not yet been determined in the language design, so nothing is prohibited.

The various blocks in this complex statement potentially provide scope for variables as described earlier. Each such block must include the "OpenScope" nonterminal before parsing the block, and must call var_block_close() when closing the block.

The code following "if", "switch" and "for" does not get its own scope, but is in a scope covering the whole statement, so names declared there cannot be redeclared elsewhere. Similarly the condition following "while" is in a scope the covers the body ("do" part) of the loop, and which does not allow conditional scope extension. Code following "then" (both looping and non-looping), "else" and "case" each get their own local scope.

The type requirements on the code block in a whilepart are quite unusal. It is allowed to return a value of some identifiable type, in which case the loop abort and an appropriate casepart is run, or it can return a Boolean, in which case the loop either continues to the dopart (on True) or aborts and runs the elsepart (on False). This is different both from the ifpart code block which is expected to return a Boolean, or the switchpart code block which is expected to return the same type as the casepart values. The correct analysis of the type of the whilepart code block is the reason for the bool_permitted flag which is passed to propagate_types().

The cond_statement cannot fit into a binode so a new exec is defined.

exec type
Xcond_statement,
ast
struct casepart {
    struct exec *value;
    struct exec *action;
    struct casepart *next;
};
struct cond_statement {
    struct exec;
    struct exec *forpart, *condpart, *dopart, *thenpart, *elsepart;
    struct casepart *casepart;
};
ast functions
static void free_casepart(struct casepart *cp)
{
    while (cp) {
        struct casepart *t;
        free_exec(cp->value);
        free_exec(cp->action);
        t = cp->next;
        free(cp);
        cp = t;
    }
}

static void free_cond_statement(struct cond_statement *s)
{
    if (!s)
        return;
    free_exec(s->forpart);
    free_exec(s->condpart);
    free_exec(s->dopart);
    free_exec(s->thenpart);
    free_exec(s->elsepart);
    free_casepart(s->casepart);
    free(s);
}
free exec cases
case Xcond_statement: free_cond_statement(cast(cond_statement, e)); break;
ComplexStatement Grammar
| CondStatement ${ $0 = $<1; }$
Grammar
$*cond_statement
// both ForThen and Whilepart open scopes, and CondSuffix only
// closes one - so in the first branch here we have another to close.
CondStatement -> ForThen WhilePart CondSuffix ${
        $0 = $<3;
        $0->forpart = $1.forpart; $1.forpart = NULL;
        $0->thenpart = $1.thenpart; $1.thenpart = NULL;
        $0->condpart = $2.condpart; $2.condpart = NULL;
        $0->dopart = $2.dopart; $2.dopart = NULL;
        var_block_close(config2context(config), CloseSequential);
        }$
    | WhilePart CondSuffix ${
        $0 = $<2;
        $0->condpart = $1.condpart; $1.condpart = NULL;
        $0->dopart = $1.dopart; $1.dopart = NULL;
        }$
    | SwitchPart CondSuffix ${
        $0 = $<2;
        $0->condpart = $<1;
        }$
    | IfPart IfSuffix ${
        $0 = $<2;
        $0->condpart = $1.condpart; $1.condpart = NULL;
        $0->thenpart = $1.thenpart; $1.thenpart = NULL;
        // This is where we close an "if" statement
        var_block_close(config2context(config), CloseSequential);
        }$

CondSuffix -> IfSuffix ${
        $0 = $<1;
        // This is where we close scope of the whole
        // "for" or "while" statement
        var_block_close(config2context(config), CloseSequential);
    }$
    | CasePart CondSuffix ${
        $0 = $<2;
        $1->next = $0->casepart;
        $0->casepart = $<1;
    }$

$*casepart
CasePart -> Newlines case Expression OpenScope Block ${
        $0 = calloc(1,sizeof(struct casepart));
        $0->value = $<3;
        $0->action = $<5;
        var_block_close(config2context(config), CloseParallel);
    }$
    | case Expression OpenScope Block ${
        $0 = calloc(1,sizeof(struct casepart));
        $0->value = $<2;
        $0->action = $<4;
        var_block_close(config2context(config), CloseParallel);
    }$

$*cond_statement
IfSuffix -> Newlines ${ $0 = new(cond_statement); }$
    | Newlines else OpenScope Block ${
        $0 = new(cond_statement);
        $0->elsepart = $<4;
        var_block_close(config2context(config), CloseElse);
    }$
    | else OpenScope Block ${
        $0 = new(cond_statement);
        $0->elsepart = $<3;
        var_block_close(config2context(config), CloseElse);
    }$
    | Newlines else OpenScope CondStatement ${
        $0 = new(cond_statement);
        $0->elsepart = $<4;
        var_block_close(config2context(config), CloseElse);
    }$
    | else OpenScope CondStatement ${
        $0 = new(cond_statement);
        $0->elsepart = $<3;
        var_block_close(config2context(config), CloseElse);
    }$


$*exec
// These scopes are closed in CondSuffix
ForPart -> for OpenScope SimpleStatements ${
        $0 = reorder_bilist($<3);
    }$
    |  for OpenScope Block ${
        $0 = $<3;
    }$

ThenPart -> then OpenScope SimpleStatements ${
        $0 = reorder_bilist($<3);
        var_block_close(config2context(config), CloseSequential);
    }$
    |  then OpenScope Block ${
        $0 = $<3;
        var_block_close(config2context(config), CloseSequential);
    }$

ThenPartNL -> ThenPart OptNL ${
        $0 = $<1;
    }$

// This scope is closed in CondSuffix
WhileHead -> while OpenScope Block ${
    $0 = $<3;
    }$

$cond_statement
ForThen -> ForPart OptNL ThenPartNL ${
        $0.forpart = $<1;
        $0.thenpart = $<3;
    }$
    | ForPart OptNL ${
        $0.forpart = $<1;
    }$

// This scope is closed in CondSuffix
WhilePart -> while OpenScope Expression Block ${
        $0.type = Xcond_statement;
        $0.condpart = $<3;
        $0.dopart = $<4;
    }$
    | WhileHead OptNL do Block ${
        $0.type = Xcond_statement;
        $0.condpart = $<1;
        $0.dopart = $<4;
    }$

IfPart -> if OpenScope Expression OpenScope Block ${
        $0.type = Xcond_statement;
        $0.condpart = $<3;
        $0.thenpart = $<5;
        var_block_close(config2context(config), CloseParallel);
    }$
    | if OpenScope Block OptNL then OpenScope Block ${
        $0.type = Xcond_statement;
        $0.condpart = $<3;
        $0.thenpart = $<7;
        var_block_close(config2context(config), CloseParallel);
    }$

$*exec
// This scope is closed in CondSuffix
SwitchPart -> switch OpenScope Expression ${
        $0 = $<3;
    }$
    | switch OpenScope Block ${
        $0 = $<3;
    }$
print exec cases
case Xcond_statement:
{
    struct cond_statement *cs = cast(cond_statement, e);
    struct casepart *cp;
    if (cs->forpart) {
        do_indent(indent, "for");
        if (bracket) printf(" {\n"); else printf(":\n");
        print_exec(cs->forpart, indent+1, bracket);
        if (cs->thenpart) {
            if (bracket)
                do_indent(indent, "} then {\n");
            else
                do_indent(indent, "then:\n");
            print_exec(cs->thenpart, indent+1, bracket);
        }
        if (bracket) do_indent(indent, "}\n");
    }
    if (cs->dopart) {
        // a loop
        if (cs->condpart && cs->condpart->type == Xbinode &&
            cast(binode, cs->condpart)->op == Block) {
            if (bracket)
                do_indent(indent, "while {\n");
            else
                do_indent(indent, "while:\n");
            print_exec(cs->condpart, indent+1, bracket);
            if (bracket)
                do_indent(indent, "} do {\n");
            else
                do_indent(indent, "do:\n");
            print_exec(cs->dopart, indent+1, bracket);
            if (bracket)
                do_indent(indent, "}\n");
        } else {
            do_indent(indent, "while ");
            print_exec(cs->condpart, 0, bracket);
            if (bracket)
                printf(" {\n");
            else
                printf(":\n");
            print_exec(cs->dopart, indent+1, bracket);
            if (bracket)
                do_indent(indent, "}\n");
        }
    } else {
        // a condition
        if (cs->casepart)
            do_indent(indent, "switch");
        else
            do_indent(indent, "if");
        if (cs->condpart && cs->condpart->type == Xbinode &&
            cast(binode, cs->condpart)->op == Block) {
            if (bracket)
                printf(" {\n");
            else
                printf(":\n");
            print_exec(cs->condpart, indent+1, bracket);
            if (bracket)
                do_indent(indent, "}\n");
            if (cs->thenpart) {
                do_indent(indent, "then:\n");
                print_exec(cs->thenpart, indent+1, bracket);
            }
        } else {
            printf(" ");
            print_exec(cs->condpart, 0, bracket);
            if (cs->thenpart) {
                if (bracket)
                    printf(" {\n");
                else
                    printf(":\n");
                print_exec(cs->thenpart, indent+1, bracket);
                if (bracket)
                    do_indent(indent, "}\n");
            } else
                printf("\n");
        }
    }
    for (cp = cs->casepart; cp; cp = cp->next) {
        do_indent(indent, "case ");
        print_exec(cp->value, -1, 0);
        if (bracket)
            printf(" {\n");
        else
            printf(":\n");
        print_exec(cp->action, indent+1, bracket);
        if (bracket)
            do_indent(indent, "}\n");
    }
    if (cs->elsepart) {
        do_indent(indent, "else");
        if (bracket)
            printf(" {\n");
        else
            printf(":\n");
        print_exec(cs->elsepart, indent+1, bracket);
        if (bracket)
            do_indent(indent, "}\n");
    }
    break;
}
propagate exec cases
case Xcond_statement:
{
    // forpart and dopart must return Vnone
    // thenpart must return Vnone if there is a dopart,
    // otherwise it is like elsepart.
    // condpart must:
    //    be bool if there is not casepart
    //    match casepart->values if there is a switchpart
    //    either be bool or match casepart->value if there
    //             is a whilepart
    // elsepart, casepart->action must match there return type
    // expected of this statement.
    struct cond_statement *cs = cast(cond_statement, prog);
    struct casepart *cp;

    t = propagate_types(cs->forpart, c, ok, Vnone, 0);
    if (!vtype_compat(Vnone, t, 0))
        *ok = 0;
    t = propagate_types(cs->dopart, c, ok, Vnone, 0);
    if (!vtype_compat(Vnone, t, 0))
        *ok = 0;
    if (cs->dopart) {
        t = propagate_types(cs->thenpart, c, ok, Vnone, 0);
        if (!vtype_compat(Vnone, t, 0))
            *ok = 0;
    }
    if (cs->casepart == NULL)
        propagate_types(cs->condpart, c, ok, Vbool, 0);
    else {
        /* Condpart must match case values, with bool permitted */
        t = Vunknown;
        for (cp = cs->casepart;
             cp && (t == Vunknown); cp = cp->next)
            t = propagate_types(cp->value, c, ok, Vunknown, 0);
        if (t == Vunknown && cs->condpart)
            t = propagate_types(cs->condpart, c, ok, Vunknown, 1);
        // Now we have a type (I hope) push it down
        if (t != Vunknown) {
            for (cp = cs->casepart; cp; cp = cp->next)
                propagate_types(cp->value, c, ok, t, 0);
            propagate_types(cs->condpart, c, ok, t, 1);
        }
    }
    // (if)then, else, and case parts must return expected type.
    if (!cs->dopart && type == Vunknown)
        type = propagate_types(cs->thenpart, c, ok, Vunknown, bool_permitted);
    if (type == Vunknown)
        type = propagate_types(cs->elsepart, c, ok, Vunknown, bool_permitted);
    for (cp = cs->casepart;
         cp && type == Vunknown;
         cp = cp->next)
        type = propagate_types(cp->action, c, ok, Vunknown, bool_permitted);
    if (type > Vunknown) {
        if (!cs->dopart)
            propagate_types(cs->thenpart, c, ok, type, bool_permitted);
        propagate_types(cs->elsepart, c, ok, type, bool_permitted);
        for (cp = cs->casepart; cp ; cp = cp->next)
            propagate_types(cp->action, c, ok, type, bool_permitted);
        return type;
    } else
        return Vunknown;
}
interp exec cases
case Xcond_statement:
{
    struct value v, cnd;
    struct casepart *cp;
    struct cond_statement *c = cast(cond_statement, e);
    if (c->forpart)
        interp_exec(c->forpart);
    do {
        if (c->condpart)
            cnd = interp_exec(c->condpart);
        else
            cnd.vtype = Vnone;
        if (!(cnd.vtype == Vnone ||
              (cnd.vtype == Vbool && cnd.bool != 0)))
            break;
        if (c->dopart) {
            free_value(cnd);
            interp_exec(c->dopart);
        }
        if (c->thenpart) {
            v = interp_exec(c->thenpart);
            if (v.vtype != Vnone || !c->dopart)
                return v;
            free_value(v);
        }
    } while (c->dopart);

    for (cp = c->casepart; cp; cp = cp->next) {
        v = interp_exec(cp->value);
        if (value_cmp(v, cnd) == 0) {
            free_value(v);
            free_value(cnd);
            return interp_exec(cp->action);
        }
        free_value(v);
    }
    free_value(cnd);
    if (c->elsepart)
        return interp_exec(c->elsepart);
    v.vtype = Vnone;
    return v;
}

Finally the whole program.

Somewhat reminiscent of Pascal a (current) Ocean program starts with the keyword "program" and a list of variable names which are assigned values from command line arguments. Following this is a block which is the code to execute.

As this is the top level, several things are handled a bit differently. The whole program is not interpreted by interp_exec as that isn't passed the argument list which the program requires. Similarly type analysis is a bit more interesting at this level.

Binode types
Program,
Parser: grammar
$*binode
Program -> program OpenScope Varlist Block OptNL ${
    $0 = new(binode);
    $0->op = Program;
    $0->left = reorder_bilist($<3);
    $0->right = $<4;
    var_block_close(config2context(config), CloseSequential);
    if (config2context(config)->scope_stack) abort();
    }$
    | ERROR ${
        tok_err(config2context(config),
                "error: unhandled parse error.", &$1);
    }$

Varlist -> Varlist ArgDecl ${
        $0 = new(binode);
        $0->op = Program;
        $0->left = $<1;
        $0->right = $<2;
    }$
    | ${ $0 = NULL; }$

$*var
ArgDecl -> IDENTIFIER ${ {
    struct variable *v = var_decl(config2context(config), $1.txt);
    $0 = new(var);
    $0->var = v;
} }$

## Grammar
print binode cases
case Program:
    do_indent(indent, "program");
    for (b2 = cast(binode, b->left); b2; b2 = cast(binode, b2->right)) {
        printf(" ");
        print_exec(b2->left, 0, 0);
    }
    if (bracket)
        printf(" {\n");
    else
        printf(":\n");
    print_exec(b->right, indent+1, bracket);
    if (bracket)
        do_indent(indent, "}\n");
    break;
propagate binode cases
case Program: abort();
core functions
static int analyse_prog(struct exec *prog, struct parse_context *c)
{
    struct binode *b = cast(binode, prog);
    int ok = 1;

    if (!b)
        return 0;
    do {
        ok = 1;
        propagate_types(b->right, c, &ok, Vnone, 0);
    } while (ok == 2);
    if (!ok)
        return 0;

    for (b = cast(binode, b->left); b; b = cast(binode, b->right)) {
        struct var *v = cast(var, b->left);
        if (v->var->val.vtype == Vunknown) {
            v->var->where_set = b;
            val_init(&v->var->val, Vstr);
        }
    }
    b = cast(binode, prog);
    do {
        ok = 1;
        propagate_types(b->right, c, &ok, Vnone, 0);
    } while (ok == 2);
    if (!ok)
        return 0;

    /* Make sure everything is still consistent */
    propagate_types(b->right, c, &ok, Vnone, 0);
    return !!ok;
}

static void interp_prog(struct exec *prog, char **argv)
{
    struct binode *p = cast(binode, prog);
    struct binode *al;
    struct value v;

    if (!prog)
        return;
    al = cast(binode, p->left);
    while (al) {
        struct var *v = cast(var, al->left);
        struct value *vl = &v->var->val;

        if (argv[0] == NULL) {
            printf("Not enough args\n");
            exit(1);
        }
        al = cast(binode, al->right);
        free_value(*vl);
        if (!parse_value(vl, argv[0]))
            exit(1);
        argv++;
    }
    v = interp_exec(p->right);
    free_value(v);
}
interp binode cases
case Program: abort();

And now to test it out.

Having a language requires having a "hello world" program. I'll provide a little more than that: a program that prints "Hello world" finds the GCD of two numbers, prints the first few elements of Fibonacci, and performs a binary search for a number.

File: oceani.mk
tests :: sayhello
sayhello : oceani
    @echo "===== TEST ====="
    ./oceani --section "test: hello" oceani.mdc 55 33
test: hello
program A B:
    print "Hello World, what lovely oceans you have!"
    /* When a variable is defined in both branches of an 'if',
     * and used afterwards, the variables are merged.
     */
    if A > B:
        bigger := "yes"
    else:
        bigger := "no"
    print "Is", A, "bigger than", B,"? ", bigger
    /* If a variable is not used after the 'if', no
     * merge happens, so types can be different
     */
    if A * 2 > B:
        double := "yes"
        print A, "is more than twice", B, "?", double
    else:
        double := A*2
        print "double", A, "is only", double

    a := A; b := B
    if a > 0 and b > 0:
        while a != b:
            if a < b:
                b = b - a
            else:
                a = a - b
        print "GCD of", A, "and", B,"is", a
    else if a <= 0:
        print a, "is not positive, cannot calculate GCD"
    else:
        print b, "is not positive, cannot calculate GCD"

    for:
        togo := 10
        f1 := 1; f2 := 1
        print "Fibonacci:", f1,f2,
    then togo = togo - 1
    while togo > 0:
        f3 := f1 + f2
        print "", f3,
        f1 = f2
        f2 = f3
    print ""

    /* Binary search... */
    for:
        lo:= 0; hi := 100
        target := 77
    while:
        mid := (lo + hi) / 2
        if mid == target:
            use Found
        if mid < target:
            lo = mid
        else:
            hi = mid
        if hi - lo < 1:
            use GiveUp
        use True
    do: pass
    case Found:
        print "Yay, I found", target
    case GiveUp:
        print "Closest I found was", mid