+###### ast functions
+ static void variable_unlink_exec(struct variable *v)
+ {
+ struct variable **vp;
+ if (!v->cleanup_exec)
+ return;
+ for (vp = &v->cleanup_exec->to_free;
+ *vp; vp = &(*vp)->next_free) {
+ if (*vp != v)
+ continue;
+ *vp = v->next_free;
+ v->cleanup_exec = NULL;
+ break;
+ }
+ }
+
+While the naming seems strange, we include local constants in the
+definition of variables. A name declared `var := value` can
+subsequently be changed, but a name declared `var ::= value` cannot -
+it is constant
+
+###### variable fields
+ int constant;
+
+Scopes in parallel branches can be partially merged. More
+specifically, if a given name is declared in both branches of an
+if/else then its scope is a candidate for merging. Similarly if
+every branch of an exhaustive switch (e.g. has an "else" clause)
+declares a given name, then the scopes from the branches are
+candidates for merging.
+
+Note that names declared inside a loop (which is only parallel to
+itself) are never visible after the loop. Similarly names defined in
+scopes which are not parallel, such as those started by `for` and
+`switch`, are never visible after the scope. Only variables defined in
+both `then` and `else` (including the implicit then after an `if`, and
+excluding `then` used with `for`) and in all `case`s and `else` of a
+`switch` or `while` can be visible beyond the `if`/`switch`/`while`.
+
+Labels, which are a bit like variables, follow different rules.
+Labels are not explicitly declared, but if an undeclared name appears
+in a context where a label is legal, that effectively declares the
+name as a label. The declaration remains in force (or in scope) at
+least to the end of the immediately containing block and conditionally
+in any larger containing block which does not declare the name in some
+other way. Importantly, the conditional scope extension happens even
+if the label is only used in one parallel branch of a conditional --
+when used in one branch it is treated as having been declared in all
+branches.
+
+Merge candidates are tentatively visible beyond the end of the
+branching statement which creates them. If the name is used, the
+merge is affirmed and they become a single variable visible at the
+outer layer. If not - if it is redeclared first - the merge lapses.
+
+To track scopes we have an extra stack, implemented as a linked list,
+which roughly parallels the parse stack and which is used exclusively
+for scoping. When a new scope is opened, a new frame is pushed and
+the child-count of the parent frame is incremented. This child-count
+is used to distinguish between the first of a set of parallel scopes,
+in which declared variables must not be in scope, and subsequent
+branches, whether they may already be conditionally scoped.
+
+We need a total ordering of scopes so we can easily compare to variables
+to see if they are concurrently in scope. To achieve this we record a
+`scope_count` which is actually a count of both beginnings and endings
+of scopes. Then each variable has a record of the scope count where it
+enters scope, and where it leaves.
+
+To push a new frame *before* any code in the frame is parsed, we need a
+grammar reduction. This is most easily achieved with a grammar
+element which derives the empty string, and creates the new scope when
+it is recognised. This can be placed, for example, between a keyword
+like "if" and the code following it.
+
+###### ast
+ struct scope {
+ struct scope *parent;
+ int child_count;
+ };
+
+###### parse context
+ int scope_depth;
+ int scope_count;
+ struct scope *scope_stack;
+
+###### variable fields
+ int scope_start, scope_end;
+
+###### ast functions
+ static void scope_pop(struct parse_context *c)
+ {
+ struct scope *s = c->scope_stack;
+
+ c->scope_stack = s->parent;
+ free(s);
+ c->scope_depth -= 1;
+ c->scope_count += 1;
+ }
+
+ static void scope_push(struct parse_context *c)
+ {
+ struct scope *s = calloc(1, sizeof(*s));
+ if (c->scope_stack)
+ c->scope_stack->child_count += 1;
+ s->parent = c->scope_stack;
+ c->scope_stack = s;
+ c->scope_depth += 1;
+ c->scope_count += 1;
+ }
+
+###### Grammar
+
+ $void
+ OpenScope -> ${ scope_push(c); }$
+
+Each variable records a scope depth and is in one of four states:
+
+- "in scope". This is the case between the declaration of the
+ variable and the end of the containing block, and also between
+ the usage with affirms a merge and the end of that block.
+
+ The scope depth is not greater than the current parse context scope
+ nest depth. When the block of that depth closes, the state will
+ change. To achieve this, all "in scope" variables are linked
+ together as a stack in nesting order.
+
+- "pending". The "in scope" block has closed, but other parallel
+ scopes are still being processed. So far, every parallel block at
+ the same level that has closed has declared the name.
+
+ The scope depth is the depth of the last parallel block that
+ enclosed the declaration, and that has closed.
+
+- "conditionally in scope". The "in scope" block and all parallel
+ scopes have closed, and no further mention of the name has been seen.
+ This state includes a secondary nest depth (`min_depth`) which records
+ the outermost scope seen since the variable became conditionally in
+ scope. If a use of the name is found, the variable becomes "in scope"
+ and that secondary depth becomes the recorded scope depth. If the
+ name is declared as a new variable, the old variable becomes "out of
+ scope" and the recorded scope depth stays unchanged.
+
+- "out of scope". The variable is neither in scope nor conditionally
+ in scope. It is permanently out of scope now and can be removed from
+ the "in scope" stack. When a variable becomes out-of-scope it is
+ moved to a separate list (`out_scope`) of variables which have fully
+ known scope. This will be used at the end of each function to assign
+ each variable a place in the stack frame.
+
+###### variable fields
+ int depth, min_depth;
+ enum { OutScope, PendingScope, CondScope, InScope } scope;
+ struct variable *in_scope;
+
+###### parse context
+
+ struct variable *in_scope;
+ struct variable *out_scope;
+
+All variables with the same name are linked together using the
+'previous' link. Those variable that have been affirmatively merged all
+have a 'merged' pointer that points to one primary variable - the most
+recently declared instance. When merging variables, we need to also
+adjust the 'merged' pointer on any other variables that had previously
+been merged with the one that will no longer be primary.
+
+A variable that is no longer the most recent instance of a name may
+still have "pending" scope, if it might still be merged with most
+recent instance. These variables don't really belong in the
+"in_scope" list, but are not immediately removed when a new instance
+is found. Instead, they are detected and ignored when considering the
+list of in_scope names.
+
+The storage of the value of a variable will be described later. For now
+we just need to know that when a variable goes out of scope, it might
+need to be freed. For this we need to be able to find it, so assume that
+`var_value()` will provide that.
+
+###### variable fields
+ struct variable *merged;
+
+###### ast functions
+
+ static void variable_merge(struct variable *primary, struct variable *secondary)
+ {
+ struct variable *v;
+
+ primary = primary->merged;
+
+ for (v = primary->previous; v; v=v->previous)
+ if (v == secondary || v == secondary->merged ||
+ v->merged == secondary ||
+ v->merged == secondary->merged) {
+ v->scope = OutScope;
+ v->merged = primary;
+ if (v->scope_start < primary->scope_start)
+ primary->scope_start = v->scope_start;
+ if (v->scope_end > primary->scope_end)
+ primary->scope_end = v->scope_end; // NOTEST
+ variable_unlink_exec(v);
+ }
+ }
+
+###### forward decls
+ static struct value *var_value(struct parse_context *c, struct variable *v);
+
+###### free global vars
+
+ while (context.varlist) {
+ struct binding *b = context.varlist;
+ struct variable *v = b->var;
+ context.varlist = b->next;
+ free(b);
+ while (v) {
+ struct variable *next = v->previous;
+
+ if (v->global && v->frame_pos >= 0) {
+ free_value(v->type, var_value(&context, v));
+ if (v->depth == 0 && v->type->free == function_free)
+ // This is a function constant
+ free_exec(v->where_decl);
+ }
+ free(v);
+ v = next;
+ }
+ }
+
+#### Manipulating Bindings
+
+When a name is conditionally visible, a new declaration discards the old
+binding - the condition lapses. Similarly when we reach the end of a
+function (outermost non-global scope) any conditional scope must lapse.
+Conversely a usage of the name affirms the visibility and extends it to
+the end of the containing block - i.e. the block that contains both the
+original declaration and the latest usage. This is determined from
+`min_depth`. When a conditionally visible variable gets affirmed like
+this, it is also merged with other conditionally visible variables with
+the same name.
+
+When we parse a variable declaration we either report an error if the
+name is currently bound, or create a new variable at the current nest
+depth if the name is unbound or bound to a conditionally scoped or
+pending-scope variable. If the previous variable was conditionally
+scoped, it and its homonyms becomes out-of-scope.
+
+When we parse a variable reference (including non-declarative assignment
+"foo = bar") we report an error if the name is not bound or is bound to
+a pending-scope variable; update the scope if the name is bound to a
+conditionally scoped variable; or just proceed normally if the named
+variable is in scope.
+
+When we exit a scope, any variables bound at this level are either
+marked out of scope or pending-scoped, depending on whether the scope
+was sequential or parallel. Here a "parallel" scope means the "then"
+or "else" part of a conditional, or any "case" or "else" branch of a
+switch. Other scopes are "sequential".
+
+When exiting a parallel scope we check if there are any variables that
+were previously pending and are still visible. If there are, then
+they weren't redeclared in the most recent scope, so they cannot be
+merged and must become out-of-scope. If it is not the first of
+parallel scopes (based on `child_count`), we check that there was a
+previous binding that is still pending-scope. If there isn't, the new
+variable must now be out-of-scope.
+
+When exiting a sequential scope that immediately enclosed parallel
+scopes, we need to resolve any pending-scope variables. If there was
+no `else` clause, and we cannot determine that the `switch` was exhaustive,
+we need to mark all pending-scope variable as out-of-scope. Otherwise
+all pending-scope variables become conditionally scoped.
+
+###### ast
+ enum closetype { CloseSequential, CloseFunction, CloseParallel, CloseElse };
+
+###### ast functions
+
+ static struct variable *var_decl(struct parse_context *c, struct text s)
+ {
+ struct binding *b = find_binding(c, s);
+ struct variable *v = b->var;
+
+ switch (v ? v->scope : OutScope) {
+ case InScope:
+ /* Caller will report the error */
+ return NULL;
+ case CondScope:
+ for (;
+ v && v->scope == CondScope;
+ v = v->previous)
+ v->scope = OutScope;
+ break;
+ default: break;
+ }
+ v = calloc(1, sizeof(*v));
+ v->previous = b->var;
+ b->var = v;
+ v->name = b;
+ v->merged = v;
+ v->min_depth = v->depth = c->scope_depth;
+ v->scope = InScope;
+ v->in_scope = c->in_scope;
+ v->scope_start = c->scope_count;
+ c->in_scope = v;
+ ## variable init
+ return v;
+ }
+
+ static struct variable *var_ref(struct parse_context *c, struct text s)
+ {
+ struct binding *b = find_binding(c, s);
+ struct variable *v = b->var;
+ struct variable *v2;
+
+ switch (v ? v->scope : OutScope) {
+ case OutScope:
+ case PendingScope:
+ /* Caller will report the error */
+ return NULL;
+ case CondScope:
+ /* All CondScope variables of this name need to be merged
+ * and become InScope
+ */
+ v->depth = v->min_depth;
+ v->scope = InScope;
+ for (v2 = v->previous;
+ v2 && v2->scope == CondScope;
+ v2 = v2->previous)
+ variable_merge(v, v2);
+ break;
+ case InScope:
+ break;
+ }
+ return v;
+ }
+
+ static int var_refile(struct parse_context *c, struct variable *v)
+ {
+ /* Variable just went out of scope. Add it to the out_scope
+ * list, sorted by ->scope_start
+ */
+ struct variable **vp = &c->out_scope;
+ while ((*vp) && (*vp)->scope_start < v->scope_start)
+ vp = &(*vp)->in_scope;
+ v->in_scope = *vp;
+ *vp = v;
+ return 0;
+ }
+
+ static void var_block_close(struct parse_context *c, enum closetype ct,
+ struct exec *e)
+ {
+ /* Close off all variables that are in_scope.
+ * Some variables in c->scope may already be not-in-scope,
+ * such as when a PendingScope variable is hidden by a new
+ * variable with the same name.
+ * So we check for v->name->var != v and drop them.
+ * If we choose to make a variable OutScope, we drop it
+ * immediately too.
+ */
+ struct variable *v, **vp, *v2;
+
+ scope_pop(c);
+ for (vp = &c->in_scope;
+ (v = *vp) && v->min_depth > c->scope_depth;
+ (v->scope == OutScope || v->name->var != v)
+ ? (*vp = v->in_scope, var_refile(c, v))
+ : ( vp = &v->in_scope, 0)) {
+ v->min_depth = c->scope_depth;
+ if (v->name->var != v)
+ /* This is still in scope, but we haven't just
+ * closed the scope.
+ */
+ continue;
+ v->min_depth = c->scope_depth;
+ if (v->scope == InScope)
+ v->scope_end = c->scope_count;
+ if (v->scope == InScope && e && !v->global) {
+ /* This variable gets cleaned up when 'e' finishes */
+ variable_unlink_exec(v);
+ v->cleanup_exec = e;
+ v->next_free = e->to_free;
+ e->to_free = v;
+ }
+ switch (ct) {
+ case CloseElse:
+ case CloseParallel: /* handle PendingScope */
+ switch(v->scope) {
+ case InScope:
+ case CondScope:
+ if (c->scope_stack->child_count == 1)
+ /* first among parallel branches */
+ v->scope = PendingScope;
+ else if (v->previous &&
+ v->previous->scope == PendingScope)
+ /* all previous branches used name */
+ v->scope = PendingScope;
+ else
+ v->scope = OutScope;
+ if (ct == CloseElse) {
+ /* All Pending variables with this name
+ * are now Conditional */
+ for (v2 = v;
+ v2 && v2->scope == PendingScope;
+ v2 = v2->previous)
+ v2->scope = CondScope;
+ }
+ break;
+ case PendingScope:
+ /* Not possible as it would require
+ * parallel scope to be nested immediately
+ * in a parallel scope, and that never
+ * happens.
+ */ // NOTEST
+ case OutScope:
+ /* Not possible as we already tested for
+ * OutScope
+ */
+ abort(); // NOTEST
+ }
+ break;
+ case CloseFunction:
+ if (v->scope == CondScope)
+ /* Condition cannot continue past end of function */
+ v->scope = InScope;
+ /* fallthrough */
+ case CloseSequential:
+ switch (v->scope) {
+ case InScope:
+ v->scope = OutScope;
+ break;
+ case PendingScope:
+ /* There was no 'else', so we can only become
+ * conditional if we know the cases were exhaustive,
+ * and that doesn't mean anything yet.
+ * So only labels become conditional..
+ */
+ for (v2 = v;
+ v2 && v2->scope == PendingScope;
+ v2 = v2->previous)
+ v2->scope = OutScope;
+ break;
+ case CondScope:
+ case OutScope: break;
+ }
+ break;
+ }
+ }
+ }
+
+#### Storing Values
+
+The value of a variable is store separately from the variable, on an
+analogue of a stack frame. There are (currently) two frames that can be
+active. A global frame which currently only stores constants, and a
+stacked frame which stores local variables. Each variable knows if it
+is global or not, and what its index into the frame is.
+
+Values in the global frame are known immediately they are relevant, so
+the frame needs to be reallocated as it grows so it can store those
+values. The local frame doesn't get values until the interpreted phase
+is started, so there is no need to allocate until the size is known.
+
+We initialize the `frame_pos` to an impossible value, so that we can
+tell if it was set or not later.
+
+###### variable fields
+ short frame_pos;
+ short global;
+
+###### variable init
+ v->frame_pos = -1;
+
+###### parse context
+
+ short global_size, global_alloc;
+ short local_size;
+ void *global, *local;
+
+###### forward decls
+ static struct value *global_alloc(struct parse_context *c, struct type *t,
+ struct variable *v, struct value *init);
+
+###### ast functions
+
+ static struct value *var_value(struct parse_context *c, struct variable *v)
+ {
+ if (!v->global) {
+ if (!c->local || !v->type)
+ return NULL; // UNTESTED
+ if (v->frame_pos + v->type->size > c->local_size) {
+ printf("INVALID frame_pos\n"); // NOTEST
+ exit(2); // NOTEST
+ }
+ return c->local + v->frame_pos;
+ }
+ if (c->global_size > c->global_alloc) {
+ int old = c->global_alloc;
+ c->global_alloc = (c->global_size | 1023) + 1024;
+ c->global = realloc(c->global, c->global_alloc);
+ memset(c->global + old, 0, c->global_alloc - old);
+ }
+ return c->global + v->frame_pos;
+ }
+
+ static struct value *global_alloc(struct parse_context *c, struct type *t,
+ struct variable *v, struct value *init)
+ {
+ struct value *ret;
+ struct variable scratch;
+
+ if (t->prepare_type)
+ t->prepare_type(c, t, 1); // NOTEST
+
+ if (c->global_size & (t->align - 1))
+ c->global_size = (c->global_size + t->align) & ~(t->align-1); // NOTEST
+ if (!v) {
+ v = &scratch;
+ v->type = t;
+ }
+ v->frame_pos = c->global_size;
+ v->global = 1;
+ c->global_size += v->type->size;
+ ret = var_value(c, v);
+ if (init)
+ memcpy(ret, init, t->size);
+ else
+ val_init(t, ret); // NOTEST
+ return ret;
+ }
+
+As global values are found -- struct field initializers, labels etc --
+`global_alloc()` is called to record the value in the global frame.
+
+When the program is fully parsed, each function is analysed, we need to
+walk the list of variables local to that function and assign them an
+offset in the stack frame. For this we have `scope_finalize()`.
+
+We keep the stack from dense by re-using space for between variables
+that are not in scope at the same time. The `out_scope` list is sorted
+by `scope_start` and as we process a varible, we move it to an FIFO
+stack. For each variable we consider, we first discard any from the
+stack anything that went out of scope before the new variable came in.
+Then we place the new variable just after the one at the top of the
+stack.
+
+###### ast functions
+
+ static void scope_finalize(struct parse_context *c, struct type *ft)
+ {
+ int size = ft->function.local_size;
+ struct variable *next = ft->function.scope;
+ struct variable *done = NULL;
+
+ while (next) {
+ struct variable *v = next;
+ struct type *t = v->type;
+ int pos;
+ next = v->in_scope;
+ if (v->merged != v)
+ continue;
+ if (!t)
+ continue;
+ if (v->frame_pos >= 0)
+ continue;
+ while (done && done->scope_end < v->scope_start)
+ done = done->in_scope;
+ if (done)
+ pos = done->frame_pos + done->type->size;
+ else
+ pos = ft->function.local_size;
+ if (pos & (t->align - 1))
+ pos = (pos + t->align) & ~(t->align-1);
+ v->frame_pos = pos;
+ if (size < pos + v->type->size)
+ size = pos + v->type->size;
+ v->in_scope = done;
+ done = v;
+ }
+ c->out_scope = NULL;
+ ft->function.local_size = size;
+ }
+
+###### free context storage
+ free(context.global);
+
+#### Variables as executables
+
+Just as we used a `val` to wrap a value into an `exec`, we similarly
+need a `var` to wrap a `variable` into an exec. While each `val`
+contained a copy of the value, each `var` holds a link to the variable
+because it really is the same variable no matter where it appears.
+When a variable is used, we need to remember to follow the `->merged`
+link to find the primary instance.
+
+When a variable is declared, it may or may not be given an explicit
+type. We need to record which so that we can report the parsed code
+correctly.
+
+###### exec type
+ Xvar,
+
+###### ast
+ struct var {
+ struct exec;
+ struct variable *var;
+ };
+
+###### variable fields
+ int explicit_type;
+
+###### Grammar
+
+ $TERM : ::
+
+ $*var
+ VariableDecl -> IDENTIFIER : ${ {
+ struct variable *v = var_decl(c, $1.txt);
+ $0 = new_pos(var, $1);
+ $0->var = v;
+ if (v)
+ v->where_decl = $0;
+ else {
+ v = var_ref(c, $1.txt);
+ $0->var = v;
+ type_err(c, "error: variable '%v' redeclared",
+ $0, NULL, 0, NULL);
+ type_err(c, "info: this is where '%v' was first declared",
+ v->where_decl, NULL, 0, NULL);
+ }
+ } }$
+ | IDENTIFIER :: ${ {
+ struct variable *v = var_decl(c, $1.txt);
+ $0 = new_pos(var, $1);
+ $0->var = v;
+ if (v) {
+ v->where_decl = $0;
+ v->constant = 1;
+ } else {
+ v = var_ref(c, $1.txt);
+ $0->var = v;
+ type_err(c, "error: variable '%v' redeclared",
+ $0, NULL, 0, NULL);
+ type_err(c, "info: this is where '%v' was first declared",
+ v->where_decl, NULL, 0, NULL);
+ }
+ } }$
+ | IDENTIFIER : Type ${ {
+ struct variable *v = var_decl(c, $1.txt);
+ $0 = new_pos(var, $1);
+ $0->var = v;
+ if (v) {
+ v->where_decl = $0;
+ v->where_set = $0;
+ v->type = $<Type;
+ v->explicit_type = 1;
+ } else {
+ v = var_ref(c, $1.txt);
+ $0->var = v;
+ type_err(c, "error: variable '%v' redeclared",
+ $0, NULL, 0, NULL);
+ type_err(c, "info: this is where '%v' was first declared",
+ v->where_decl, NULL, 0, NULL);
+ }
+ } }$
+ | IDENTIFIER :: Type ${ {
+ struct variable *v = var_decl(c, $1.txt);
+ $0 = new_pos(var, $1);
+ $0->var = v;
+ if (v) {
+ v->where_decl = $0;
+ v->where_set = $0;
+ v->type = $<Type;
+ v->constant = 1;
+ v->explicit_type = 1;
+ } else {
+ v = var_ref(c, $1.txt);
+ $0->var = v;
+ type_err(c, "error: variable '%v' redeclared",
+ $0, NULL, 0, NULL);
+ type_err(c, "info: this is where '%v' was first declared",
+ v->where_decl, NULL, 0, NULL);
+ }
+ } }$
+
+ $*exec
+ Variable -> IDENTIFIER ${ {
+ struct variable *v = var_ref(c, $1.txt);
+ $0 = new_pos(var, $1);
+ if (v == NULL) {
+ /* This might be a global const or a label
+ * Allocate a var with impossible type Tnone,
+ * which will be adjusted when we find out what it is,
+ * or will trigger an error.
+ */
+ v = var_decl(c, $1.txt);
+ if (v) {
+ v->type = Tnone;
+ v->where_decl = $0;
+ v->where_set = $0;
+ }
+ }
+ cast(var, $0)->var = v;
+ } }$
+
+###### print exec cases
+ case Xvar:
+ {
+ struct var *v = cast(var, e);
+ if (v->var) {
+ struct binding *b = v->var->name;
+ printf("%.*s", b->name.len, b->name.txt);
+ }
+ break;
+ }
+
+###### format cases
+ case 'v':
+ if (loc && loc->type == Xvar) {
+ struct var *v = cast(var, loc);
+ if (v->var) {
+ struct binding *b = v->var->name;
+ fprintf(stderr, "%.*s", b->name.len, b->name.txt);
+ } else
+ fputs("???", stderr); // NOTEST
+ } else
+ fputs("NOTVAR", stderr); // NOTEST
+ break;
+
+###### propagate exec cases
+
+ case Xvar:
+ {
+ struct var *var = cast(var, prog);
+ struct variable *v = var->var;
+ if (!v) {
+ type_err(c, "%d:BUG: no variable!!", prog, NULL, 0, NULL); // NOTEST
+ return Tnone; // NOTEST
+ }
+ v = v->merged;
+ if (v->type == Tnone && v->where_decl == prog)
+ type_err(c, "error: variable used but not declared: %v",
+ prog, NULL, 0, NULL);
+ if (v->type == NULL) {
+ if (type && !(*perr & Efail)) {
+ v->type = type;
+ v->where_set = prog;
+ *perr |= Eretry;
+ }
+ } else if (!type_compat(type, v->type, rules)) {
+ type_err(c, "error: expected %1 but variable '%v' is %2", prog,
+ type, rules, v->type);
+ type_err(c, "info: this is where '%v' was set to %1", v->where_set,
+ v->type, rules, NULL);
+ }
+ if (!v->global || v->frame_pos < 0)
+ *perr |= Eruntime;
+ if (v->constant)
+ *perr |= Econst;
+ return v->type;
+ }
+
+###### interp exec cases
+ case Xvar:
+ {
+ struct var *var = cast(var, e);
+ struct variable *v = var->var;
+
+ v = v->merged;
+ lrv = var_value(c, v);
+ rvtype = v->type;
+ break;
+ }
+
+###### ast functions
+
+ static void free_var(struct var *v)
+ {
+ free(v);
+ }
+
+###### free exec cases
+ case Xvar: free_var(cast(var, e)); break;
+
+
+### Complex types
+
+Now that we have the shape of the interpreter in place we can add some
+complex types and connected them in to the data structures and the
+different phases of parse, analyse, print, interpret.
+
+Being "complex" the language will naturally have syntax to access
+specifics of objects of these types. These will fit into the grammar as
+"Terms" which are the things that are combined with various operators to
+form "Expression". Where a Term is formed by some operation on another
+Term, the subordinate Term will always come first, so for example a
+member of an array will be expressed as the Term for the array followed
+by an index in square brackets. The strict rule of using postfix
+operations makes precedence irrelevant within terms. To provide a place
+to put the grammar for each terms of each type, we will start out by
+introducing the "Term" grammar production, with contains at least a
+simple "Value" (to be explained later).
+
+###### Grammar
+ $*exec
+ Term -> Value ${ $0 = $<1; }$
+ | Variable ${ $0 = $<1; }$
+ ## term grammar
+
+Thus far the complex types we have are arrays and structs.
+
+#### Arrays
+
+Arrays can be declared by giving a size and a type, as `[size]type' so
+`freq:[26]number` declares `freq` to be an array of 26 numbers. The
+size can be either a literal number, or a named constant. Some day an
+arbitrary expression will be supported.
+
+As a formal parameter to a function, the array can be declared with a
+new variable as the size: `name:[size::number]string`. The `size`
+variable is set to the size of the array and must be a constant. As
+`number` is the only supported type, it can be left out:
+`name:[size::]string`.
+
+Arrays cannot be assigned. When pointers are introduced we will also
+introduce array slices which can refer to part or all of an array -
+the assignment syntax will create a slice. For now, an array can only
+ever be referenced by the name it is declared with. It is likely that
+a "`copy`" primitive will eventually be define which can be used to
+make a copy of an array with controllable recursive depth.
+
+For now we have two sorts of array, those with fixed size either because
+it is given as a literal number or because it is a struct member (which
+cannot have a runtime-changing size), and those with a size that is
+determined at runtime - local variables with a const size. The former
+have their size calculated at parse time, the latter at run time.
+
+For the latter type, the `size` field of the type is the size of a
+pointer, and the array is reallocated every time it comes into scope.
+
+We differentiate struct fields with a const size from local variables
+with a const size by whether they are prepared at parse time or not.
+
+###### type union fields
+
+ struct {
+ int unspec; // size is unspecified - vsize must be set.
+ short size;
+ short static_size;
+ struct variable *vsize;
+ struct type *member;
+ } array;
+
+###### value union fields
+ void *array; // used if not static_size
+
+###### value functions
+
+ static int array_prepare_type(struct parse_context *c, struct type *type,
+ int parse_time)
+ {
+ struct value *vsize;
+ mpz_t q;
+ if (type->array.static_size)
+ return 1; // UNTESTED
+ if (type->array.unspec && parse_time)
+ return 1; // UNTESTED
+ if (parse_time && type->array.vsize && !type->array.vsize->global)
+ return 1; // UNTESTED
+
+ if (type->array.vsize) {
+ vsize = var_value(c, type->array.vsize);
+ if (!vsize)
+ return 1; // UNTESTED
+ mpz_init(q);
+ mpz_tdiv_q(q, mpq_numref(vsize->num), mpq_denref(vsize->num));
+ type->array.size = mpz_get_si(q);
+ mpz_clear(q);
+ }
+ if (!parse_time)
+ return 1;
+ if (type->array.member->size <= 0)
+ return 0; // UNTESTED
+
+ type->array.static_size = 1;
+ type->size = type->array.size * type->array.member->size;
+ type->align = type->array.member->align;
+
+ return 1;
+ }
+
+ static void array_init(struct type *type, struct value *val)
+ {
+ int i;
+ void *ptr = val->ptr;
+
+ if (!val)
+ return; // NOTEST
+ if (!type->array.static_size) {
+ val->array = calloc(type->array.size,
+ type->array.member->size);
+ ptr = val->array;
+ }
+ for (i = 0; i < type->array.size; i++) {
+ struct value *v;
+ v = (void*)ptr + i * type->array.member->size;
+ val_init(type->array.member, v);
+ }
+ }
+
+ static void array_free(struct type *type, struct value *val)
+ {
+ int i;
+ void *ptr = val->ptr;
+
+ if (!type->array.static_size)
+ ptr = val->array;
+ for (i = 0; i < type->array.size; i++) {
+ struct value *v;
+ v = (void*)ptr + i * type->array.member->size;
+ free_value(type->array.member, v);
+ }
+ if (!type->array.static_size)
+ free(ptr);
+ }
+
+ static int array_compat(struct type *require, struct type *have,
+ enum val_rules rules)
+ {
+ if (have->compat != require->compat)
+ return 0;
+ /* Both are arrays, so we can look at details */
+ if (!type_compat(require->array.member, have->array.member, 0))
+ return 0;
+ if (have->array.unspec && require->array.unspec) {
+ if (have->array.vsize && require->array.vsize &&
+ have->array.vsize != require->array.vsize) // UNTESTED
+ /* sizes might not be the same */
+ return 0; // UNTESTED
+ return 1;
+ }
+ if (have->array.unspec || require->array.unspec)
+ return 1; // UNTESTED
+ if (require->array.vsize == NULL && have->array.vsize == NULL)
+ return require->array.size == have->array.size;
+
+ return require->array.vsize == have->array.vsize; // UNTESTED
+ }
+
+ static void array_print_type(struct type *type, FILE *f)
+ {
+ fputs("[", f);
+ if (type->array.vsize) {
+ struct binding *b = type->array.vsize->name;
+ fprintf(f, "%.*s%s]", b->name.len, b->name.txt,
+ type->array.unspec ? "::" : "");
+ } else if (type->array.size)
+ fprintf(f, "%d]", type->array.size);
+ else
+ fprintf(f, "]");
+ type_print(type->array.member, f);
+ }
+
+ static struct type array_prototype = {
+ .init = array_init,
+ .prepare_type = array_prepare_type,
+ .print_type = array_print_type,
+ .compat = array_compat,
+ .free = array_free,
+ .size = sizeof(void*),
+ .align = sizeof(void*),
+ };
+
+###### declare terminals
+ $TERM [ ]
+
+###### type grammar
+
+ | [ NUMBER ] Type ${ {
+ char tail[3];
+ mpq_t num;
+ struct type *t;
+ int elements = 0;
+
+ if (number_parse(num, tail, $2.txt) == 0)
+ tok_err(c, "error: unrecognised number", &$2);
+ else if (tail[0]) {
+ tok_err(c, "error: unsupported number suffix", &$2);
+ mpq_clear(num);
+ } else {
+ elements = mpz_get_ui(mpq_numref(num));
+ if (mpz_cmp_ui(mpq_denref(num), 1) != 0) {
+ tok_err(c, "error: array size must be an integer",
+ &$2);
+ } else if (mpz_cmp_ui(mpq_numref(num), 1UL << 30) >= 0)
+ tok_err(c, "error: array size is too large",
+ &$2);
+ mpq_clear(num);
+ }
+
+ $0 = t = add_anon_type(c, &array_prototype, "array[%d]", elements );
+ t->array.size = elements;
+ t->array.member = $<4;
+ t->array.vsize = NULL;
+ } }$
+
+ | [ IDENTIFIER ] Type ${ {
+ struct variable *v = var_ref(c, $2.txt);
+
+ if (!v)
+ tok_err(c, "error: name undeclared", &$2);
+ else if (!v->constant)
+ tok_err(c, "error: array size must be a constant", &$2);
+
+ $0 = add_anon_type(c, &array_prototype, "array[%.*s]", $2.txt.len, $2.txt.txt);
+ $0->array.member = $<4;
+ $0->array.size = 0;
+ $0->array.vsize = v;
+ } }$
+
+###### Grammar
+ $*type
+ OptType -> Type ${ $0 = $<1; }$
+ | ${ $0 = NULL; }$
+
+###### formal type grammar
+
+ | [ IDENTIFIER :: OptType ] Type ${ {
+ struct variable *v = var_decl(c, $ID.txt);
+
+ v->type = $<OT;
+ v->constant = 1;
+ if (!v->type)
+ v->type = Tnum;
+ $0 = add_anon_type(c, &array_prototype, "array[var]");
+ $0->array.member = $<6;
+ $0->array.size = 0;
+ $0->array.unspec = 1;
+ $0->array.vsize = v;
+ } }$
+
+###### Binode types
+ Index,
+
+###### term grammar
+
+ | Term [ Expression ] ${ {
+ struct binode *b = new(binode);
+ b->op = Index;
+ b->left = $<1;
+ b->right = $<3;
+ $0 = b;
+ } }$
+
+###### print binode cases
+ case Index:
+ print_exec(b->left, -1, bracket);
+ printf("[");
+ print_exec(b->right, -1, bracket);
+ printf("]");
+ break;
+
+###### propagate binode cases
+ case Index:
+ /* left must be an array, right must be a number,
+ * result is the member type of the array
+ */
+ propagate_types(b->right, c, perr_local, Tnum, 0);
+ t = propagate_types(b->left, c, perr, NULL, 0);
+ if (!t || t->compat != array_compat) {
+ type_err(c, "error: %1 cannot be indexed", prog, t, 0, NULL);
+ return NULL;
+ } else {
+ if (!type_compat(type, t->array.member, rules)) {
+ type_err(c, "error: have %1 but need %2", prog,
+ t->array.member, rules, type);
+ }
+ return t->array.member;
+ }
+ break;
+
+###### interp binode cases
+ case Index: {
+ mpz_t q;
+ long i;
+ void *ptr;
+
+ lleft = linterp_exec(c, b->left, <ype);
+ right = interp_exec(c, b->right, &rtype);
+ mpz_init(q);
+ mpz_tdiv_q(q, mpq_numref(right.num), mpq_denref(right.num));
+ i = mpz_get_si(q);
+ mpz_clear(q);
+
+ if (ltype->array.static_size)
+ ptr = lleft;
+ else
+ ptr = *(void**)lleft;
+ rvtype = ltype->array.member;
+ if (i >= 0 && i < ltype->array.size)
+ lrv = ptr + i * rvtype->size;
+ else
+ val_init(ltype->array.member, &rv); // UNSAFE
+ ltype = NULL;
+ break;
+ }
+
+#### Structs
+
+A `struct` is a data-type that contains one or more other data-types.
+It differs from an array in that each member can be of a different
+type, and they are accessed by name rather than by number. Thus you
+cannot choose an element by calculation, you need to know what you
+want up-front.
+
+The language makes no promises about how a given structure will be
+stored in memory - it is free to rearrange fields to suit whatever
+criteria seems important.
+
+Structs are declared separately from program code - they cannot be
+declared in-line in a variable declaration like arrays can. A struct
+is given a name and this name is used to identify the type - the name
+is not prefixed by the word `struct` as it would be in C.
+
+Structs are only treated as the same if they have the same name.
+Simply having the same fields in the same order is not enough. This
+might change once we can create structure initializers from a list of
+values.
+
+Each component datum is identified much like a variable is declared,
+with a name, one or two colons, and a type. The type cannot be omitted
+as there is no opportunity to deduce the type from usage. An initial
+value can be given following an equals sign, so
+
+##### Example: a struct type
+
+ struct complex:
+ x:number = 0
+ y:number = 0
+
+would declare a type called "complex" which has two number fields,
+each initialised to zero.
+
+Struct will need to be declared separately from the code that uses
+them, so we will need to be able to print out the declaration of a
+struct when reprinting the whole program. So a `print_type_decl` type
+function will be needed.
+
+###### type union fields
+
+ struct {
+ int nfields;
+ struct field {
+ struct text name;
+ struct type *type;
+ struct value *init;
+ int offset;
+ } *fields; // This is created when field_list is analysed.
+ struct fieldlist {
+ struct fieldlist *prev;
+ struct field f;
+ struct exec *init;
+ } *field_list; // This is created during parsing
+ } structure;
+
+###### type functions
+ void (*print_type_decl)(struct type *type, FILE *f);
+ struct type *(*fieldref)(struct type *t, struct parse_context *c,
+ struct fieldref *f, struct value **vp);