ocean-lang.org Git - ocean-D/blob - Ocean-types

   1
   2 Types have a per-module namespace.
   3 This is pre-populated with
   4   int i8 i16 uint u8 u64 etc
   5   num float f64 f128
   6   Bool
   7   byte
   8   char string
   9
  10 Types can be added with:
  11
  12   struct name: content
  13   record name: content
  14   enum name: content
  15   class name: content
  16
  17  name is optional, and can list (parameters) and /attributes
  18
  19 Types can be constructed with
  20
  21    name(args)   parameterized type
  22    name^        reference type
  23    name[size]   array type
  24    (args:: args)  procedure type
  25    (args:: type)    function type
  26
  27   I think that for pointer/array constructor, the decoration comes first.
  28     foo: [5]int
  29   is an array of 5 integers, so foo[5] is an int
  30     foo: @bar
  31   is an owned pointer to bar, so "foo@" is a bar.
  32     foo: @[5]int
  33   is an owned pointer to 5 integers while
  34     foo:[5]^int
  35   is an array of 5 borrowed pointers to integers.
  36   Having the result type at the end fits better with the function type.
  37    foo:(int, int)string
  38   then foo(1,2) returns a string.
  39
  40
  41 The content of struct and record are a list of:
  42    fieldname: type
  43  or
  44    fieldname/attribute: type
  45
  46  'attribute' can:
  47       indicate endianness - bigendian littleendian hostendian
  48       'const' ??
  49       identify a refcount,
  50       protected by a given lock??
  51
  52  For enum, content is list of
  53    name = value
  54  where "= value" is optional
  55
  56 Pointers:
  57   to assign a pointer, use foo = stuff
  58   to update what the pointer points to, use foo^ = stuff
  59   to get a reference to store in a pointer....
  60      references are either borrowed or owned.
  61      a name defined "type^" is a borrowed reference, a name defined
  62      "type@" is owned.
  63      borrowed references may be taken of anything, but only remain defined
  64      as long as the owner remains defined.
  65      owned references can only be taken of ownable objects, and
  66      remain indefinitely.
  67
  68    There are various ways to own an object:
  69     - refcount or lock
  70     - ownership of a containing object.
  71     - ownership provided by class method
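The refcount route to ownership can be sketched in C. This is only an illustration of the idea, not oceani code: `struct obj`, `obj_new`, `obj_get` and `obj_put` are hypothetical names. An owned reference ("type@") holds a count on the object, while a borrowed reference ("type^") would be a plain pointer that is only valid while some owner still holds a count.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical ownable object: the refcount is what makes it ownable. */
struct obj {
    int refcount;
    int payload;
};

/* Allocate an object; the caller holds the first owned reference. */
struct obj *obj_new(int payload)
{
    struct obj *o = malloc(sizeof(*o));

    if (!o)
        return NULL;
    o->refcount = 1;
    o->payload = payload;
    return o;
}

/* Take one more owned reference to an object that is already owned. */
struct obj *obj_get(struct obj *o)
{
    assert(o && o->refcount > 0);
    o->refcount += 1;
    return o;
}

/* Drop an owned reference; returns 1 if this freed the object. */
int obj_put(struct obj *o)
{
    assert(o && o->refcount > 0);
    o->refcount -= 1;
    if (o->refcount > 0)
        return 0;
    free(o);
    return 1;
}
```

The object stays defined until the last owner calls `obj_put`, which matches "remain indefinitely" from the owner's point of view.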
  72
  73 non-type names (vars, constants) can be introduced with
  74
  75 const prefix: name = value; ...
  76 func: name(args::type): statements
  77 proc: name(args::result): statements
  78
  79 main: statements
  80 init:
  81 exit:
  82
  83
  84 enum name: values
  85
  86
  87 Plan:
  88   decide on data structure
  89   different types can't really be handled by a big switch now,
  90     I probably need an object with function pointers.
  91         free_value(), vtype_compat(), val_init(),
  92         dup_value(), value_cmp(), print_value(), parse_value()
  93
  94         parse_value only needed for args - str and num
  95         val_init...
  96         print_value - only needed for print and code-dump
  97         dup_value - needed until pointers can make sense
  98         vtype_compat - needed for various things
  99         value_cmp - needed until we have object behaviours
 100
 101   pre-defined
 102      Bool
 103      int - sizes
 104      char
 105      string
 106   array
 107      I think I need to disassociate the type from the storage.
 108      So a 'value' is a parsed constant, but something new is needed for
 109      the content of a variable.
 110      Also, something new is needed for an intermediate lvalue such as an
 111      index to an array or a field in a record.  This is a type plus a pointer.
 112      What about rvalues and the result of a calculation.  I guess I store
 113      that in a temp location, with a pointer...
 114
 115      I need to be clear what "interp_exec()" returns.  Conceptually it
 116      can be an object of any type, and for procedures it can be a tuple of
 117      objects.  It needs to identify a type and a value of that type.
 118      The "value" might be a reference into a variable, or it might be
 119      a copy from a calculation.  Eventually the reference will have
 120      ownership information
 121   const
 122   struct
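The "object with function pointers" from the plan above might look roughly like this. It is a sketch under assumptions, not the oceani data structure: the field names, the `void *` storage convention, and the two example types `Tint` and `Tstr` are all invented here, and `parse_value` is left out to keep it short. Each operation works on raw storage, so the type is no longer stored inside the value.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* One record per type, replacing a big switch on a type enum. */
struct type {
    const char *name;
    size_t size;                                   /* bytes of storage */
    void (*init)(void *v);
    void (*free_value)(void *v);
    void (*dup_value)(void *dst, const void *src);
    int (*cmp)(const void *a, const void *b);
    void (*print)(FILE *f, const void *v);
    int (*compat)(const struct type *self, const struct type *other);
};

/* Simplest possible compatibility rule: only identical types match. */
int same_type(const struct type *a, const struct type *b) { return a == b; }

/* Integer storage is a plain long. */
void int_init(void *v) { *(long *)v = 0; }
void int_free(void *v) { (void)v; }
void int_dup(void *dst, const void *src)
{
    assert(dst && src);
    *(long *)dst = *(const long *)src;
}
int int_cmp(const void *a, const void *b)
{
    long x = *(const long *)a, y = *(const long *)b;
    return (x > y) - (x < y);
}
void int_print(FILE *f, const void *v) { fprintf(f, "%ld", *(const long *)v); }

/* String storage is a heap-allocated char pointer, NULL when empty. */
void str_init(void *v) { *(char **)v = NULL; }
void str_free(void *v) { free(*(char **)v); *(char **)v = NULL; }
void str_dup(void *dst, const void *src)
{
    const char *s;
    char *copy = NULL;

    assert(dst && src);
    s = *(char *const *)src;
    if (s) {
        copy = malloc(strlen(s) + 1);
        if (copy)
            strcpy(copy, s);
    }
    *(char **)dst = copy;
}
int str_cmp(const void *a, const void *b)
{
    const char *x = *(char *const *)a, *y = *(char *const *)b;
    return strcmp(x ? x : "", y ? y : "");
}
void str_print(FILE *f, const void *v)
{
    const char *s = *(char *const *)v;
    fputs(s ? s : "", f);
}

struct type Tint = { "int", sizeof(long), int_init, int_free,
                     int_dup, int_cmp, int_print, same_type };
struct type Tstr = { "string", sizeof(char *), str_init, str_free,
                     str_dup, str_cmp, str_print, same_type };
```

Callers then go through the type, for example `t->dup_value(&dst, &src)`, without knowing which type `t` is.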
 123
 124
 125 Steps:
 126  0/ when does 1/2 produce an integer?  Only when explicitly expected.
 127  1/ remove 'tail' from value
 128  2/ define a 'type'.
 129  3/ Add a 'Vtyped' vtype and a 'type' pointer
 130  4/ add an ownership enum: borrowed, single
 131  3/ add a 'void *valref' ??
 132  4/ Convert num bool str label to Vtyped
 133
 134 No, this is awkward because propagate_types wants a 'type' and this intermediate
 135 format has an enum+pointer.  So make it just a pointer...
 136 What do I do with
 137   Vnolabel - rule flag
 138   Vunknown   - NULL pointer
 139   Vnone  - special value
 140
 141
 142  A 'type' must be able to:
 143   check if it can convert to some other type, reporting if it wants to
 144   convert to another type.  This requires visibility into other types.
 145   print, compare, parse
 146   add subtract multiply divide index
 147 ----------
 148
 149  I currently have an enum of types that is used to test comparability
 150  and for propagation.
 151  This needs to change .... I guess I need a struct type*.  What goes in it?
 152   - name
 153   - scalar/record/struct/array/pointer/func/enum
 154   - other details.
 155
 156  Do I need forward declarations?  Maybe I can just be lazy and
 157  require everything to be declared eventually.
 158  That isn't sufficient for:
 159   - mutually recursive functions
 160   - mutually recursive structures
 161
 162  What about unions ???
 163    blend with enum: a tag with fields?
 164    struct name:
 165      x:int; y:int
 166
 167      .Bool.true = a:char; b:char
 168      .Bool.false = ......
 169
 170   I like "inheritance" but that doesn't allow the size of the whole to
 171   be known in advance.
 172   I need a way to talk about which instance is active, even if the value
 173   isn't stored.  So Pascal-like variant records are good.
 174   Maybe a struct could be declared as 'extensible' and other structs
 175   could 'extend', and it literally makes it bigger
 176   struct a extensible {x1:int}
 177   struct b extends a {y1:int}
 178   struct c extends a extensible {y2:int}
 179   struct d extends c  {z1:int}
 180
 181   now 'a' has room for x1, (y2 and z1) or y1
 182   Which is used depends on context .. or by assertion on content.
 183   So extensions can set values for existing fields
 184     struct z extends a {x1=4; string c}
 185
 186   Syntax is a bit clumsy.  Do I need "extensible"??
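One way to read "'a' has room for x1, (y2 and z1) or y1" is as a C struct with a union tail. This is only a sketch of the layout idea, assuming the extensions overlay each other: the member names `ext`, `b` and `cd` and the helper functions are made up, and this is not Ocean syntax.

```c
#include <assert.h>
#include <stddef.h>

/* 'a' after all extensions are known: the tail is as big as the
 * largest chain of extensions, so the size of 'a' is fixed. */
struct a {
    int x1;                      /* declared by a itself */
    union {
        struct {
            int y1;              /* added by b */
        } b;
        struct {
            int y2;              /* added by c */
            int z1;              /* added by d, which extends c */
        } cd;
    } ext;
};

/* Where the extension area starts inside 'a'. */
size_t a_ext_offset(void)
{
    return offsetof(struct a, ext);
}

/* Fill in an 'a' that is being used as a 'd'. */
void a_set_as_d(struct a *p, int x1, int y2, int z1)
{
    assert(p);
    p->x1 = x1;
    p->ext.cd.y2 = y2;
    p->ext.cd.z1 = z1;
}
```

Nothing in the storage says which extension is active, which is the point made above: that has to come from context or from an assertion on the content.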
 187
 188
 189
 190  Are enums just a fancy way of doing 'const'?
 191  I could have
 192    const: a=1; b=2; c; d
 193  which defines module-wide consts.
 194  And separately have
 195    enum foo: a, b, c
 196  which defines consts foo.a foo.b foo.c
 197  But I don't really like foo entering the val namespace.
 198  If bar:foo, then bar.a could be true/false depending on value of bar.
 199  bar.a! could set it??  bar.a=true?
 200    var = .foo.a
 201
 202  How do we allocate new objects?
 203    new(type) ???
 204    ptrname = new ??
 205    new(ptrname)
 206    ptrname := new(type)
 207
 208  Maybe ^= assigns a borrowed reference, and @= assigns an owned reference.
 209
 210  allocating isn't really a top priority, so I should just focus on
 211  non-allocated types.
 212
 213
 214 -----
 215 I need a list of steps again:
 216
 217  - look up type by name and add syntax for
 218       name : type = value
 219    and
 220       name : type
 221
 222  - Add arrays:
 223       name : type[size]
 224         e.g. int[4] or foo@[3]
 225           What is int[5][20] ?? it is an array of int[5], which is backwards.
 226           So maybe I want [5][20]int ??
 227     declares an array of that type/size
 228       name : [] = [ 1,2,3 ]
 229     declares an array of 3 numbers.
 230       name[1]
 231     extracts an element from the array,
 232       name[2] = 4
 233     updates an element
 234       name[4:5]
 235     creates a slice, which can be stored in a var (borrowed ref) or
 236     assigned to
 237
 238     This requires:
 239      - new type class which has a size and member type
 240      - new type access methods: index and size(?)
 241      - new syntax
 242      - new manifest values: [a,b,c] creates an array of whatever member type.
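The "new type class which has a size and member type" and the index access method could be sketched as follows. This is an assumption-laden illustration: `struct type` here is a cut-down stand-in, and `array_of` and `type_index` are hypothetical names. Indexing returns a pointer into the storage, so the result is the "type plus a pointer" kind of lvalue, with the member type available as `t->member`.

```c
#include <assert.h>
#include <stddef.h>

struct type {
    const char *name;
    size_t size;                 /* bytes of storage for one value */
    const struct type *member;   /* element type, NULL if not an array */
    size_t count;                /* number of elements, for arrays */
};

const struct type Tint = { "int", sizeof(int), NULL, 0 };

/* Build the type "[count]member". */
struct type array_of(const struct type *member, size_t count)
{
    struct type t;

    assert(member);
    t.name = "array";
    t.size = member->size * count;
    t.member = member;
    t.count = count;
    return t;
}

/* Return a pointer to element i, or NULL if t is not an array,
 * storage is missing, or i is out of range. */
void *type_index(const struct type *t, void *storage, size_t i)
{
    assert(t);
    if (!storage || !t->member || i >= t->count)
        return NULL;
    return (char *)storage + i * t->member->size;
}
```

With the decoration-first spelling, `[5][20]int` is `array_of(&inner, 5)` where `inner` is `array_of(&Tint, 20)`, so the first index applied is the one written first.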
 243
 244  - add syntax for
 245       struct name : [[ name : type ]]
 246    to define a new struct
 247
 248  - add syntax to extract a field from a struct
 249         how is this type-checked if I don't know the type yet?
 250
 251  - add syntax for pointers
 252      a:= new(struct foo)
 253      a:struct foo^ = new()
 254
 255 -------
 256 Questions:
 257   If I declare "struct foo ..." do I use "foo" or "struct foo" to ref the type?
 258   I think just "foo".
 259   So structs, records, enums, and classes must have distinct names.
 260
 261  When do I differentiate between compile-time constants and run-time values?
 262  When declaring an array, do I require the size to be constant?
 263  In a struct I do ... at time of declaration I calculate the size.
 264  For foo:[sqr(a)]int
 265  I do that too - and it is at run-time.
 266  So during parsing, I need to describe the array with a member-type and executable size.
 267  When that is evaluated, a type is created.
 268  So we really need an executable which returns a type.
 269  But ... we need to know the type when doing type analysis.  So while variable size
 270  is OK, the compiler needs to know what it is.  Maybe the size needs to be a constant, as in
 271  a name assigned with "size ::= 4*5".  This gives the compiler some chance of comparing
 272  types of array - and doing range checking on indexes.
 273
 274  We currently call var_init to set the type of a variable during type
 275  analysis - which makes sense.
 276  But for an array we don't have the final type until run-time.  So we need an
 277  intermediate type.
 278  So (for now) the size of an array is either a NUMBER or an IDENTIFIER which must be a
 279   constant var.
 280  I need a point where the type is instantiated - where the variable is evaluated
 281   and the size is set.  I guess this happens when the 'struct var' is evaluated...
 282   no, when a Declare binode is evaluated.
 283
 284  What happens if I have
 285     a:[foo]number = thing
 286  I guess the type analysis needs to affirm that thing has the correct type,
 287  then a doesn't need to be initialized.
 288
 289  If I find
 290     a[4] = "hello"
 291  and 'a' hasn't been declared .... obviously an error.
 292
 293  When/what/how.
 294   For field access, I need to know the type of the variable.
 295   But I can delay the look up until type analysis.
 296   So a[4] is Index(a, 4) - a binode
 297      a.foo is Field(a, "foo") -  need a new exec type - Fieldname
 298      a(args) is Call(a, args) - need a new Binode type - Tuple.
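The three parse results above can be pictured as small tree nodes whose names are only looked up later, during type analysis. The sketch below is hypothetical: `struct exec` here is a simplified stand-in with invented fields, not the real oceani structure, and only Index and Field are exercised.

```c
#include <assert.h>
#include <stdlib.h>

enum exec_kind { Xvar, Xnum, Xfieldname, Xbinode };
enum binode_op { Index, Field, Call };

/* One node of the parse tree; which fields matter depends on kind. */
struct exec {
    enum exec_kind kind;
    const char *name;            /* Xvar and Xfieldname */
    long num;                    /* Xnum */
    enum binode_op op;           /* Xbinode */
    struct exec *left, *right;   /* Xbinode */
};

/* Make a leaf: a variable, a number, or a bare field name. */
struct exec *new_leaf(enum exec_kind kind, const char *name, long num)
{
    struct exec *e = calloc(1, sizeof(*e));

    assert(kind != Xbinode);
    if (!e)
        return NULL;
    e->kind = kind;
    e->name = name;
    e->num = num;
    return e;
}

/* Make a binode such as Index(a, 4) or Field(a, "foo"). */
struct exec *new_binode(enum binode_op op, struct exec *left,
                        struct exec *right)
{
    struct exec *e = calloc(1, sizeof(*e));

    if (!e)
        return NULL;
    e->kind = Xbinode;
    e->op = op;
    e->left = left;
    e->right = right;
    return e;
}

/* Free a whole tree; leaves have NULL children. */
void free_exec(struct exec *e)
{
    if (!e)
        return;
    free_exec(e->left);
    free_exec(e->right);
    free(e);
}
```

The field name stays an uninterpreted string in the tree, which is what allows the lookup to wait until the type of `a` is known.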
 299
 300 So I have to delay 'const' assessment to later too.
 301
 302 ----------
 303 Where do type definitions go?
 304 I don't think they go with statements, they belong separately.
 305 I don't want the full separation of a "type" section like Pascal
 306 So they probably go at the top level, equivalent to "program" - and before.
 307 They start with "struct" or "enum" or "record" etc.
 308
 309 So: what about constants?  These are currently statements and so affect a scope in time.
 310 But for declaring arrays in structs, or initial values of fields, we might want constants.
 311 A constant could be within a struct, but only that is too limiting.  I need module-wide
 312 constants.
 313 So I guess:
 314
 315    const:
 316       name ::= value
 317       name ::= value
 318
 319 or
 320    const { name ::= value ; name ::= value }
 321
 322 --------------
 323 I'm in the middle of stage-1 on structures.
 324
 325 I need a type to parse the declaration into.  It needs to be a linked list
 326 of fields, each of which is a type, a name, and an initial value.  i.e. a 'struct field'.
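A minimal sketch of such a 'struct field' list, under assumptions: the initial value is kept here as unparsed text (`init`), which is a placeholder for whatever value representation is finally chosen, and `field_append`, `field_find` and `field_free` are hypothetical helpers. Appending at the tail keeps the fields in declaration order.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct type { const char *name; };   /* placeholder for the real type */

struct field {
    const char *name;
    const struct type *type;
    const char *init;                /* initial value text, or NULL */
    struct field *next;
};

/* Append a field to the end of the list; returns it, or NULL. */
struct field *field_append(struct field **list, const char *name,
                           const struct type *type, const char *init)
{
    struct field *f;

    assert(list && name && type);
    f = calloc(1, sizeof(*f));
    if (!f)
        return NULL;
    f->name = name;
    f->type = type;
    f->init = init;
    while (*list)
        list = &(*list)->next;
    *list = f;
    return f;
}

/* Look a field up by name, as field access will need to. */
struct field *field_find(struct field *list, const char *name)
{
    assert(name);
    for (; list; list = list->next)
        if (strcmp(list->name, name) == 0)
            return list;
    return NULL;
}

/* Release every node in the list. */
void field_free(struct field *list)
{
    while (list) {
        struct field *next = list->next;
        free(list);
        list = next;
    }
}
```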
 327
 328 -----------------
 329
 330 Numbers...
 331 I want signed/unsigned/bitset integers (and probably floats).
 332 These are different sizes, and I want to move 'type' out of 'value'
 333 so I can have arrays of numbers that are *just* the densely packed numbers.
 334
 335 So there are two questions here: how will I handle values in oceani, and
 336 what are the semantics of numbers in ocean.
 337
 338 I think I want bitops to require bitsets and arith ops to require signed/unsigned.
 339 But there is some overlap.
 340 e.g. we use bitops to test if a number is a power of two
 341 We sometimes use bitops to multiply, but that is probably best avoided.
 342 use * to multiply.
 343
 344 Converting between the two can be done with simple assignment.
 345
 346 So + - * / %     require/assume signed or unsigned
 347    | & ~ << >>   require/assume bitset
 348
 349   #  accepts either and produces a bitset
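The power-of-two test is a good example of the overlap, because the usual C idiom mixes one arithmetic op with one bitop on the same value. The function name below is just for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* A power of two has exactly one bit set.  Subtracting 1 (arithmetic)
 * clears that bit and sets all lower ones, so the AND (bitop) of the
 * two is zero.  Zero itself is excluded explicitly. */
int is_power_of_two(uint32_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}
```

Under the split proposed above, `n - 1` would want an unsigned value and `&` would want a bitset, so this expression would rely on the conversion-by-assignment rule or on a mixed form being allowed.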
 350
 351 Other issue is overflow/underflow checking.
 352 Do we need another unsigned type - cyclic
 353
 354     i32 - signed integer in 32 bits
 355     u32 - unsigned integer
 356     c32 - unsigned with overflow permitted and ignored
 357     b32 - bitset
 358
 359     int uint cint bset - whatever size.
 360
 361 i32 and u32 detect overflow/underflow and set to NaN - all 1's
 362 If I want to allow overloading (such as NaN), I need a type that
 363 declares no overloading. s32 and c32?  Or annotations.  !s32 !u32
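The difference between u32 and c32 can be sketched in C for addition. This assumes the "NaN is all 1's" encoding from the note, so the largest ordinary u32 value is one less than that, and it assumes a NaN operand propagates to the result. `U32_NAN`, `u32_add` and `c32_add` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define U32_NAN UINT32_MAX      /* all 1's marks "not a number" */

/* u32: detect overflow and produce NaN instead of wrapping. */
uint32_t u32_add(uint32_t a, uint32_t b)
{
    if (a == U32_NAN || b == U32_NAN)
        return U32_NAN;         /* NaN in, NaN out */
    /* b <= U32_NAN - 1 here, so the subtraction cannot wrap. */
    if (a > (U32_NAN - 1) - b)
        return U32_NAN;         /* result would not fit */
    return a + b;
}

/* c32: overflow is permitted and ignored, so the sum simply wraps. */
uint32_t c32_add(uint32_t a, uint32_t b)
{
    return (uint32_t)(a + b);
}
```

The cost of the checked form is two compares per add and the loss of one representable value, which is part of why a separate cyclic type is attractive.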
 364
 365 So what about values in oceani?  I want to separate out the type and not
 366 use a union.
 367 Where are they used?
 368  - return of init, prepare, parse, dup
 369  - passed to  print, cmp, dup, free, to_int, to_float, to_mpq
 370  - field in 'struct variable'
 371  - field in 'struct lrval'
 372  - result of 'interp'
 373  - intermediate left/right in interp
 374  - field in array and struct field
 375  - field in 'struct val' for manifest constants
 376
 377 So:
 378   variable gets a 'type' pointer and a union which can be a pointer
 379   to the value, or the value itself (depending on size)
 380   lrval gets a type pointer as well, plus the union
 381   interp returns ...
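The "type pointer plus a union" idea for a variable might be sketched like this. It is an assumption that "depending on size" means "fits in the space of a pointer"; the names `struct variable`, `var_init`, `var_storage` and `var_free` are hypothetical here and need not match oceani.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct type { const char *name; size_t size; };

struct variable {
    const struct type *type;
    union {
        void *ptr;                          /* large values: heap storage */
        unsigned char inl[sizeof(void *)];  /* small values: stored here */
    } u;
};

/* Small enough to live inside the variable itself? */
int var_is_inline(const struct variable *v)
{
    assert(v && v->type);
    return v->type->size <= sizeof(v->u.inl);
}

/* Give the variable a type and zeroed storage; 0 on success. */
int var_init(struct variable *v, const struct type *t)
{
    assert(v && t);
    v->type = t;
    memset(&v->u, 0, sizeof(v->u));
    if (var_is_inline(v))
        return 0;
    v->u.ptr = calloc(1, t->size);
    return v->u.ptr ? 0 : -1;
}

/* Where the value's bytes are, whichever representation is used. */
void *var_storage(struct variable *v)
{
    return var_is_inline(v) ? (void *)v->u.inl : v->u.ptr;
}

/* Release heap storage if there is any. */
void var_free(struct variable *v)
{
    if (var_is_inline(v)) {
        memset(v->u.inl, 0, sizeof(v->u.inl));
        return;
    }
    free(v->u.ptr);
    v->u.ptr = NULL;
}
```

Code that reads or writes the value only ever calls `var_storage`, so the inline-or-pointer choice stays hidden from the rest of the interpreter.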
 382
 383
 384 -----------------
 385 Struct/array initialisers.
 386 I like [a,b,c] rather than {a,b,c} because the latter can look like code.
 387 But [] is also array indexing.
 388 So an array initializer could look:
 389   [ [1] = "hello", [5] = "there" ]
 390 and that is confusingly similar to nested initialization
 391   [ [1,2] , [3,4] ]
 392 Options:
 393  1/ use different outer.  {}  () <> << >>
 394    < is possible as it is not a prefix operator.
 395      But nesting results in <<1,2>,<3,4>> which looks like << instead of < <
 396    {} I already don't like
 397    () is bad enough with function calls - it is best if it is grouping only.
 398      though with function calls it is a list ...
 399    << [1]="hello", [2]="there" >>...  I don't really like that
 400
 401    array[ ]
 402    struct[ ]
 403      No, too noisy.
 404
 405  2/ use different inner syntax.
 406      [ .[1] = "hello", .[5] = "hello" ]
 407
 408  What about a newline-based syntax:
 409   a: [4]int :
 410         [0] = 2
 411         [1] = 3
 412         [3] = 1
 413
 414  Nice, but doesn't actually help.  Still need .[] because I want to allow
 415  a one-line syntax too.
 416  Maybe I just use {} after all.
 417
 418   a:[4]int = { [0]=2, [1]=3, [3]=1 }
 419  Yes, I guess that is best.
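For comparison, C99 designated initializers have almost exactly this shape, so the chosen form will look familiar. The names `demo` and `sum4` below are only for illustration.

```c
#include <assert.h>

/* Elements with no designator are zero-initialised, so demo[2] is 0. */
const int demo[4] = { [0] = 2, [1] = 3, [3] = 1 };

/* Add up the four elements of such an array. */
int sum4(const int *v)
{
    int i, total = 0;

    assert(v);
    for (i = 0; i < 4; i++)
        total += v[i];
    return total;
}
```

C resolves the clash with indexing the same way: inside braces, a leading `[n] =` is always a designator and never an index expression.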