ocean-lang.org Git - ocean-D/blob - Ocean-types

   1
   2 Types have a per-module namespace.
   3 This is pre-populated with
   4   int i8 i16 uint u8 u64 etc
   5   num float f64 f128
   6   Bool
   7   byte
   8   char string
   9
  10 Types can be added with:
  11
  12   struct name: content
  13   record name: content
  14   enum name: content
  15   class name: content
  16
  17  name is optional, and can list (parameters) and /attributes
  18
  19 Types can be constructed with
  20
  21    name(args)   parameterized type
  22    name^        reference type
  23    name[size]   array type
  24    (args:: args)  procedure type
  25    (args:: type)    function type
  26
  27   I think that for pointer/array constructor, the decoration comes first.
  28     foo: [5]int
  29   is an array of 5 integers, so foo[5] is an int
  30     foo: @bar
  31   is an owned pointer to bar, so "foo@" is a bar.
  32     foo: @[5]int
  33   is an owned pointer to an array of 5 integers, while
  34     foo:[5]^int
  35   is an array of 5 borrowed pointers to integers.
  36   Having the result type at the end fits better with the function type.
  37    foo:(int, int)string
  38   then foo(1,2) returns a string.
  39
  40
  41 The content of struct and record are a list of:
  42    fieldname: type
  43  or
  44    fieldname/attribute: type
  45
  46  'attribute' can:
  47       indicate endianness - bigendian littleendian hostendian
  48       'const' ??
  49       identify a refcount,
  50       protected by a given lock??
  51
  52  For enum, content is list of
  53    name = value
  54  where "= value" is optional
  55
  56 Pointers:
  57   to assign a pointer, use foo = stuff
  58   to update what the pointer points to, use foo^ = stuff
  59   to get a reference to store in a pointer....
  60      references are either borrowed or owned.
  61      a name defined "type^" is a borrowed reference, a name defined
  62      "type@" is owned.
  63      borrowed references may be taken of anything, but only remain defined
  64      as long as the owner remains defined.
  65      owned references can only be taken of ownable objects, and
  66      remain indefinitely.
  67
  68    There are various ways to own an object:
  69     - refcount or lock
  70     - ownership of a containing object.
  71     - ownership provided by class method
  72
  73 non-type names (vars, constants) can be introduced with
  74
  75 const prefix: name = value; ...
  76 func: name(args::type): statements
  77 proc: name(args::result): statements
  78
  79 main: statements
  80 init:
  81 exit:
  82
  83
  84 enum name: values
  85
  86
  87 Plan:
  88   decide on data structure
  89   different types can't really be handled by a big switch now,
  90     I probably need an object with function pointers.
  91         free_value(), vtype_compat(), val_init(),
  92         dup_value(), value_cmp(), print_value(), parse_value()
  93
  94         parse_value only needed for args - str and num
  95         val_init...
  96         print_value - only needed for print and code-dump
  97         dup_value - needed until pointers can make sense
  98         vtype_compat - needed for various things
  99         value_cmp - needed until we have object behaviours
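
A minimal sketch in C of the "object with function pointers" idea from the plan above. The seven operation names come from the list; the struct layout, the `Tnum` and `Tstr` tables, the helper names and the "compatible only if identical" rule are all assumptions made for illustration, not oceani's actual code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Storage for one value; the type says which member is live. */
struct value {
    union { long num; char *str; } u;
};

/* One object per type, replacing the big switch (layout is an assumption). */
struct type {
    const char *name;
    void (*free_value)(struct value *v);
    int  (*vtype_compat)(const struct type *want, const struct type *have);
    void (*val_init)(struct value *v);
    void (*dup_value)(struct value *dst, const struct value *src);
    int  (*value_cmp)(const struct value *a, const struct value *b);
    void (*print_value)(const struct value *v, FILE *out);
    int  (*parse_value)(struct value *v, const char *text);
};

/* Rule used by this sketch only: types are compatible if identical. */
static int same_type(const struct type *want, const struct type *have)
{
    return want == have;
}

static void num_free(struct value *v) { (void)v; }
static void num_init(struct value *v) { v->u.num = 0; }
static void num_dup(struct value *d, const struct value *s) { d->u.num = s->u.num; }
static int num_cmp(const struct value *a, const struct value *b)
{
    return (a->u.num > b->u.num) - (a->u.num < b->u.num);
}
static void num_print(const struct value *v, FILE *out) { fprintf(out, "%ld", v->u.num); }
static int num_parse(struct value *v, const char *text)
{
    char *end;
    v->u.num = strtol(text, &end, 10);
    return end != text && *end == '\0';   /* 1 on success, 0 on junk */
}

static char *copy_str(const char *s)
{
    char *c = malloc(strlen(s) + 1);
    assert(c != NULL);
    return strcpy(c, s);
}
static void str_free(struct value *v) { free(v->u.str); v->u.str = NULL; }
static void str_init(struct value *v) { v->u.str = NULL; }
static void str_dup(struct value *d, const struct value *s)
{
    d->u.str = s->u.str ? copy_str(s->u.str) : NULL;
}
static int str_cmp(const struct value *a, const struct value *b)
{
    return strcmp(a->u.str ? a->u.str : "", b->u.str ? b->u.str : "");
}
static void str_print(const struct value *v, FILE *out)
{
    fputs(v->u.str ? v->u.str : "", out);
}
static int str_parse(struct value *v, const char *text)
{
    v->u.str = copy_str(text);
    return 1;
}

const struct type Tnum = { "num", num_free, same_type, num_init, num_dup,
                           num_cmp, num_print, num_parse };
const struct type Tstr = { "str", str_free, same_type, str_init, str_dup,
                           str_cmp, str_print, str_parse };
```

Callers then go through the table, for example `t->print_value(&v, stdout)`, and never switch on a type enum.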
 100
 101   pre-defined
 102      Bool
 103      int - sizes
 104      char
 105      string
 106   array
 107      I think I need to disassociate the type from the storage.
 108      So a 'value' is a parsed constant, but something new is needed for
 109      the content of a variable.
 110      Also, something new is needed for an intermediate lvalue such as an
 111      index to an array or a field in a record.  This is a type plus a pointer.
 112      What about rvalues and the result of a calculation?  I guess I store
 113      that in a temp location, with a pointer...
 114
 115      I need to be clear what "interp_exec()" returns.  Conceptually it
 116      can be an object of any type, and for procedures it can be a tuple of
 117      objects.  It needs to identify a type and a value of that type.
 118      The "value" might be a reference into a variable, or it might be
 119      a copy from a calculation.  Eventually the reference will have
 120      ownership information.
 121   const
 122   struct
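
One possible shape for what "interp_exec()" returns, sketched in C: a type, a pointer to the value, and a flag saying whether the pointer refers into a variable or to a temporary copy from a calculation. The names `struct result`, `result_ref`, `result_temp` and `result_release`, and the two-value ownership enum, are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum ownership { Borrowed, Single };      /* hypothetical names */

struct type { const char *name; size_t size; };

struct result {
    const struct type *type;
    void *valref;          /* where the value lives */
    enum ownership own;    /* Borrowed: points into a variable;
                              Single: temp copy that must be freed */
};

/* Result that refers directly to a variable's storage. */
struct result result_ref(const struct type *t, void *storage)
{
    struct result r = { t, storage, Borrowed };
    return r;
}

/* Result of a calculation: copy the bytes into a temporary location. */
struct result result_temp(const struct type *t, const void *src)
{
    struct result r = { t, NULL, Single };
    r.valref = malloc(t->size);
    assert(r.valref != NULL);
    memcpy(r.valref, src, t->size);
    return r;
}

/* Free the temporary if this result owns one; never touch variables. */
void result_release(struct result *r)
{
    if (r->own == Single)
        free(r->valref);
    r->valref = NULL;
}
```

A tuple result for procedures could be an array of these, though the notes leave that open.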
 123
 124
 125 Steps:
 126  0/ when does 1/2 produce an integer?  Only when explicitly expected.
 127  1/ remove 'tail' from value
 128  2/ define a 'type'.
 129  3/ Add a 'Vtyped' vtype and a 'type' pointer
 130  4/ add an ownership enum: borrowed, single
 131  5/ add a 'void *valref' ??
 132  6/ Convert num bool str label to Vtyped
 133
 134 No, this is awkward because propagate_types wants a 'type' and this intermediate
 135 format has an enum+pointer.  So make it just a pointer...
 136 What do I do with
 137   Vnolabel - rule flag
 138   Vunknown   - NULL pointer
 139   Vnone  - special value
 140
 141
 142  A 'type' must be able to:
 143   check if it can convert to some other type, reporting if it wants to
 144   convert to another type.  This requires visibility into other types.
 145   print, compare, parse
 146   add subtract multiply divide index
 147 ----------
 148
 149  I currently have an enum of types that is used to test compatibility
 150  and for propagation.
 151  This needs to change .... I guess I need a 'struct type *'.  What goes in it?
 152   - name
 153   - scalar/record/struct/array/pointer/func/enum
 154   - other details.
 155
 156  Do I need forward declarations?  Maybe I can just be lazy and
 157  require everything to be declared eventually.
 158  That isn't sufficient for:
 159   - mutually recursive functions
 160   - mutually recursive structures
 161
 162  What about unions ???
 163    blend with enum: a tag with fields?
 164    struct name:
 165      x:int; y:int
 166
 167      .Bool.true = a:char; b:char
 168      .Bool.false = ......
 169
 170  Are enums just a fancy way of doing 'const'?
 171  I could have
 172    const: a=1; b=2; c; d
 173  which defines module-wide consts.
 174  And separately have
 175    enum foo: a, b, c
 176  which defines consts foo.a foo.b foo.c
 177  But I don't really like foo entering the val namespace.
 178  If bar:foo, then bar.a could be true/false depending on value of bar.
 179  bar.a! could set it??  bar.a=true?
 180    var = .foo.a
 181
 182  How do we allocate new objects?
 183    new(type) ???
 184    ptrname = new ??
 185    new(ptrname)
 186    ptrname := new(type)
 187
 188  Maybe ^= assigns a borrowed reference, and @= assigns an owned reference.
 189
 190  allocating isn't really a top priority, so I should just focus on
 191  non-allocated types.
 192
 193
 194 -----
 195 I need a list of steps again:
 196
 197  - look up type by name and add syntax for
 198       name : type = value
 199    and
 200       name : type
 201
 202  - Add arrays:
 203       name : type[size]
 204         e.g. int[4] or foo@[3]
 205           What is int[5][20] ?? it is an array of int[5], which is backwards.
 206           So maybe I want [5][20]int ??
 207     declares an array of that type/size
 208       name : [] = [ 1,2,3 ]
 209     declares an array of 3 numbers.
 210       name[1]
 211     extracts an element from the array,
 212       name[2] = 4
 213     updates an element
 214       name[4:5]
 215     creates a slice, which can be stored in a var (borrowed ref) or
 216     assigned to
 217
 218     This requires:
 219      - new type class which has a size and member type
 220      - new type access methods: index and size(?)
 221      - new syntax
 222      - new manifest values: [a,b,c] creates an array of whatever member type.
 223
 224  - add syntax for
 225       struct name : [[ name : type ]]
 226    to define a new struct
 227
 228  - add syntax to extract a field from a struct
 229         how is this type-checked if I don't know the type yet?
 230
 231  - add syntax for pointers
 232      a:= new(struct foo)
 233      a:struct foo^ = new()
 234
 235 -------
 236 Questions:
 237   If I declare "struct foo ..." do I use "foo" or "struct foo" to ref the type?
 238   I think just "foo".
 239   So structs, records, enums, and classes must have distinct names.
 240
 241  When do I differentiate between compile-time constants and run-time values?
 242  When declaring an array, do I require the size to be constant?
 243  In a struct I do ... at time of declaration I calculate the size.
 244  For foo:[sqr(a)]int
 245  I do that too - and it is at run-time.
 246  So during parsing, I need to describe the array with a member-type and executable size.
 247  When that is evaluated, a type is created.
 248  So we really need an executable which returns a type.
 249  But ... we need to know the type when doing type analysis.  So while variable size
 250  is OK, the compiler needs to know what it is.  Maybe the size needs to be a constant, as in
 251  a name assigned with "size ::= 4*5".  This gives the compiler some chance of comparing
 252  types of array - and doing range checking on indexes.
 253
 254  We currently call var_init to set the type of a variable during type
 255  analysis - which makes sense.
 256  But for an array we don't have the final type until run-time.  So we need an
 257  intermediate type.
 258  So (for now) the size of an array is either a NUMBER or an IDENTIFIER which must be a
 259   constant var.
 260  I need a point where the type is instantiated - where the variable is evaluated
 261   and the size is set.  I guess this happens when the 'struct var' is evaluated...
 262   no, when a Declare binode is evaluated.
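
The "intermediate type" idea can be sketched in C as follows: the parser records the member type and either a literal size or the name of a constant, and the real size is filled in when the Declare node is evaluated. Everything here (`struct array_type`, `struct constant`, `array_instantiate`, the -1 convention for "not yet known") is an assumption for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A module-wide constant, e.g. from "size ::= 4*5". */
struct constant { const char *name; long value; };

struct array_type {
    const char *member;     /* member type name, e.g. "int" */
    size_t member_size;     /* bytes per element */
    long literal_size;      /* used when size_name is NULL */
    const char *size_name;  /* name of a constant, or NULL for a NUMBER */
    long size;              /* -1 until the type is instantiated */
};

/* Return the constant's value, or -1 if no such constant exists.
 * Sizes are assumed to be non-negative. */
long const_lookup(const struct constant *tab, size_t n, const char *name)
{
    size_t i;
    for (i = 0; i < n; i++)
        if (strcmp(tab[i].name, name) == 0)
            return tab[i].value;
    return -1;
}

/* Called when the Declare node is evaluated.  Returns 0 on success,
 * -1 if the size cannot be resolved (the type stays uninstantiated). */
int array_instantiate(struct array_type *t,
                      const struct constant *tab, size_t n)
{
    long sz = t->size_name ? const_lookup(tab, n, t->size_name)
                           : t->literal_size;
    if (sz < 0)
        return -1;
    t->size = sz;
    return 0;
}

/* Total storage needed once the size is known. */
size_t array_bytes(const struct array_type *t)
{
    assert(t->size >= 0);
    return (size_t)t->size * t->member_size;
}
```

Because the size is either a number or a named constant, type analysis can still compare two array types by looking at the same two fields before anything is executed.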
 263
 264  What happens if I have
 265     a:[foo]number = thing
 266  I guess the type analysis needs to affirm that thing has the correct type,
 267  then a doesn't need to be initialized.
 268
 269  If I find
 270     a[4] = "hello"
 271  and 'a' hasn't been declared .... obviously an error.
 272
 273  When/what/how.
 274   For field access, I need to know the type of the variable.
 275   But I can delay the look up until type analysis.
 276   So a[4] is Index(a, 4) - a binode
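
A small C sketch of the three parse forms listed above, with the type lookup left for a later pass. The node layout and the `mk_*`, `show` and `free_exec` helpers are hypothetical; only the names Index, Field, Call, Tuple and Fieldname come from the notes.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

enum exec_kind { Xvar, Xnum, Xfieldname, Xbinode };
enum binode_op { Index, Field, Call, Tuple };

struct exec {
    enum exec_kind kind;
    const char *name;           /* Xvar and Xfieldname */
    long num;                   /* Xnum */
    enum binode_op op;          /* Xbinode */
    struct exec *left, *right;  /* Xbinode */
};

static struct exec *new_exec(enum exec_kind kind)
{
    struct exec *e = calloc(1, sizeof(*e));
    assert(e != NULL);
    e->kind = kind;
    return e;
}

struct exec *mk_var(const char *name)
{
    struct exec *e = new_exec(Xvar);
    e->name = name;
    return e;
}

struct exec *mk_num(long n)
{
    struct exec *e = new_exec(Xnum);
    e->num = n;
    return e;
}

struct exec *mk_fieldname(const char *name)
{
    struct exec *e = new_exec(Xfieldname);
    e->name = name;
    return e;
}

struct exec *mk_binode(enum binode_op op, struct exec *l, struct exec *r)
{
    struct exec *e = new_exec(Xbinode);
    e->op = op;
    e->left = l;
    e->right = r;
    return e;
}

void free_exec(struct exec *e)
{
    if (!e)
        return;
    free_exec(e->left);
    free_exec(e->right);
    free(e);
}

/* Append s to the NUL-terminated text already in out[len], truncating. */
static void append(char *out, size_t len, const char *s)
{
    size_t used = strlen(out);
    snprintf(out + used, len - used, "%s", s);
}

/* Append a textual form of e, e.g. Index(a, 4), to out[len]. */
void show(const struct exec *e, char *out, size_t len)
{
    static const char *const ops[] = { "Index", "Field", "Call", "Tuple" };
    char tmp[32];

    switch (e->kind) {
    case Xvar:
        append(out, len, e->name);
        break;
    case Xfieldname:
        append(out, len, "\"");
        append(out, len, e->name);
        append(out, len, "\"");
        break;
    case Xnum:
        snprintf(tmp, sizeof(tmp), "%ld", e->num);
        append(out, len, tmp);
        break;
    case Xbinode:
        append(out, len, ops[e->op]);
        append(out, len, "(");
        show(e->left, out, len);
        append(out, len, ", ");
        show(e->right, out, len);
        append(out, len, ")");
        break;
    }
}
```

Nothing in these nodes records a type, so the parser can build them before the type of 'a' is known.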
 277      a.foo is Field(a, "foo") -  need a new exec type - Fieldname
 278      a(args) is Call(a, args) - need a new Binode type - Tuple.
 279
 280 So I have to delay 'const' assessment to later too.
 281
 282 ----------
 283 Where do type definitions go?
 284 I don't think they go with statements, they belong separately.
 285 I don't want the full separation of a "type" section like Pascal
 286 So they probably go at the top level, equivalent to "program" - and before.
 287 They start with "struct" or "enum" or "record" etc.
 288
 289 So: what about constants?  These are currently statements and so affect a scope in time.
 290 But for declaring arrays in structs, or initial values of fields, we might want constants.
 291 A constant could be within a struct, but only that is too limiting.  I need module-wide
 292 constants.
 293 So I guess:
 294
 295    const:
 296       name ::= value
 297       name ::= value
 298
 299 or
 300    const { name ::= value ; name ::= value }
 301
 302 --------------
 303 I'm in the middle of stage-1 on structures.
 304
 305 I need a type to parse the declaration into.  It needs to be a linked list
 306 of fields, each of which is a type, a name, and an initial value.  i.e. a 'struct field'.
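
A sketch in C of a 'struct field' list as described: each field has a type, a name and an initial value, and fields are linked in declaration order. The offset and size bookkeeping, the integer-only initial value, and the packed layout with no alignment padding are simplifying assumptions, as are the function names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct type { const char *name; size_t size; };

struct field {
    const char *name;
    const struct type *type;
    long init;              /* initial value, simplified to an integer */
    size_t offset;          /* byte offset within the struct */
    struct field *next;
};

struct struct_type {
    const char *name;
    struct field *fields;   /* in declaration order */
    size_t size;            /* running total of field sizes */
};

/* Append a field to the end of the list and extend the struct size.
 * Fields are packed with no alignment padding (an assumption). */
void struct_add_field(struct struct_type *st, const char *name,
                      const struct type *type, long init)
{
    struct field *f = malloc(sizeof(*f));
    struct field **p = &st->fields;

    assert(f != NULL);
    f->name = name;
    f->type = type;
    f->init = init;
    f->offset = st->size;
    f->next = NULL;
    while (*p)
        p = &(*p)->next;
    *p = f;
    st->size += type->size;
}

/* Find a field by name, or return NULL if the struct has no such field. */
struct field *struct_find_field(const struct struct_type *st, const char *name)
{
    struct field *f;
    for (f = st->fields; f; f = f->next)
        if (strcmp(f->name, name) == 0)
            return f;
    return NULL;
}

void struct_free_fields(struct struct_type *st)
{
    while (st->fields) {
        struct field *f = st->fields;
        st->fields = f->next;
        free(f);
    }
    st->size = 0;
}
```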
 307
 308 -----------------
 309
 310 Numbers...
 311 I want signed/unsigned/bitset integers (and probably floats).
 312 These are different sizes, and I want to move 'type' out of 'value'
 313 so I can have arrays of numbers that are *just* the densely packed numbers.
 314
 315 So there are two questions here: how will I handle values in oceani, and
 316 what are the semantics of numbers in ocean.
 317
 318 I think I want bitops to require bitsets and arith ops to require signed/unsigned.
 319 But there is some overlap.
 320 e.g. we use bitops to test if a number is a power of two
 321 We sometimes use bitops to multiply, but that is probably best avoided.
 322 use * to multiply.
 323
 324 Converting between the two can be done with simple assignment.
 325
 326 So + - * / %     require/assume signed or unsigned
 327    | & ~ << >>   require/assume bitset
 328
 329   #  accepts either and produces a bitset
 330
 331 Other issue is overflow/underflow checking.
 332 Do we need another unsigned type - cyclic?
 333
 334     i32 - signed integer in 32 bits
 335     u32 - unsigned integer
 336     c32 - unsigned with overflow permitted and ignored
 337     b32 - bitset
 338
 339     int uint cint bset - whatever size.
 340
 341 i32 and u32 detect overflow/underflow and set to NaN - all 1's
 342 If I want to allow overloading (such as NaN), I need a type that
 343 declares no overloading. s32 and c32?  Or annotations.  !s32 !u32
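
To make the difference between u32 and c32 concrete, here is a sketch in C. It treats the all-ones pattern as NaN for u32, so the largest ordinary u32 value is one less than that. Letting NaN propagate through later operations is an assumption, and the function names are hypothetical. i32 is left out because an all-ones signed pattern is -1 in two's complement and the notes do not say how that clash is resolved.

```c
#include <assert.h>
#include <stdint.h>

#define U32_NAN UINT32_MAX      /* "all 1's" */

/* u32 add: overflow yields NaN, and NaN in gives NaN out (assumption). */
uint32_t u32_add(uint32_t a, uint32_t b)
{
    if (a == U32_NAN || b == U32_NAN)
        return U32_NAN;
    /* Largest valid result is U32_NAN - 1; b is not NaN so this
     * subtraction cannot wrap. */
    if (a > (U32_NAN - 1) - b)
        return U32_NAN;
    return a + b;
}

/* u32 subtract: going below zero yields NaN. */
uint32_t u32_sub(uint32_t a, uint32_t b)
{
    if (a == U32_NAN || b == U32_NAN)
        return U32_NAN;
    if (a < b)
        return U32_NAN;
    return a - b;
}

/* c32 add: overflow is permitted and ignored, so the result wraps
 * modulo 2^32 and every bit pattern is an ordinary value. */
uint32_t c32_add(uint32_t a, uint32_t b)
{
    return (uint32_t)(a + b);
}
```

For example, `u32_add(UINT32_MAX - 1, 1)` gives `U32_NAN`, while `c32_add(UINT32_MAX, 1)` gives 0.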
 344
 345 So what about values in oceani?  I want to separate out the type and not
 346 use a union.
 347 Where are they used?
 348  - return of init, prepare, parse, dup
 349  - passed to  print, cmp, dup, free, to_int, to_float, to_mpq
 350  - field in 'struct variable'
 351  - field in 'struct lrval'
 352  - result of 'interp'
 353  - intermediate left/right in interp
 354  - field in array and struct field
 355  - field in 'struct val' for manifest constants
 356
 357 So:
 358   variable gets a 'type' pointer and a union which can be a pointer
 359   to the value, or the value itself (depending on size)
 360   lrval gets a type pointer as well, plus the union
 361   interp returns ...
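
The "pointer to the value, or the value itself (depending on size)" arrangement for a variable might look like this in C. The threshold of one pointer's worth of bytes, and all function names, are assumptions for the sake of the sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct type { const char *name; size_t size; };

struct variable {
    const struct type *type;
    union {
        void *ptr;                           /* large values: heap storage */
        unsigned char inl[sizeof(void *)];   /* small values: stored here */
    } u;
};

/* Values no bigger than a pointer are kept inside the variable. */
int fits_inline(const struct type *t)
{
    return t->size <= sizeof(void *);
}

/* Bind the variable to a type and give it zeroed storage. */
void variable_init(struct variable *v, const struct type *t)
{
    v->type = t;
    if (fits_inline(t)) {
        memset(v->u.inl, 0, sizeof(v->u.inl));
    } else {
        v->u.ptr = calloc(1, t->size);
        assert(v->u.ptr != NULL);
    }
}

/* Address of the value's bytes, wherever they live. */
void *variable_storage(struct variable *v)
{
    return fits_inline(v->type) ? (void *)v->u.inl : v->u.ptr;
}

void variable_free(struct variable *v)
{
    if (!fits_inline(v->type))
        free(v->u.ptr);
    v->type = NULL;
}
```

Code that reads or writes the value only ever calls `variable_storage()`, so it does not need to know which arm of the union is in use.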
 362
 363
 364 -----------------
 365 Struct/array initialisers.
 366 I like [a,b,c] rather than {a,b,c} because the latter can look like code.
 367 But [] is also array indexing.
 368 So an array initializer could look:
 369   [ [1] = "hello", [5] = "there" ]
 370 and that is confusingly similar to nested initialization
 371   [ [1,2] , [3,4] ]
 372 Options:
 373  1/ use different outer.  {}  () <> << >>
 374    < is possible as it is not a prefix operator.
 375      But nesting results in <<1,2>,<3,4>> which looks like << instead of < <
 376    {} I already don't like
 377    () is bad enough with function calls - it is best if it is grouping only.
 378      though with function calls it is a list ...
 379    << [1]="hello", [2]="there" >>...  I don't really like that
 380
 381    array[ ]
 382    struct[ ]
 383      No, too noisy.
 384
 385  2/ use different inner syntax.
 386      [ .[1] = "hello", .[5] = "hello" ]
 387
 388  What about a newline-based syntax:
 389   a: [4]int :
 390         [0] = 2
 391         [1] = 3
 392         [3] = 1
 393
 394  Nice, but doesn't actually help.  Still need .[] because I want to allow
 395  a one-line syntax too.
 396  Maybe I just use {} after all.
 397
 398   a:[4]int = { [0]=2, [1]=3, [3]=1 }
 399  Yes, I guess that is best.
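
For comparison, C99 designated initializers use the same shape as the form settled on above. The snippet below is C, not Ocean. In C the element that is not mentioned is zero-initialized; whether Ocean would do the same for a[2] is not stated in these notes.

```c
#include <assert.h>

/* Indexed form: a[2] is not listed, so C sets it to 0. */
const int a[4] = { [0] = 2, [1] = 3, [3] = 1 };

/* Nested form: braces group rows, so the two uses do not collide. */
const int grid[2][2] = { { 1, 2 }, { 3, 4 } };

/* Sum the elements of a, to show that every slot has a defined value. */
int sum_a(void)
{
    int i, total = 0;
    for (i = 0; i < 4; i++)
        total += a[i];
    return total;
}
```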