caneka

A programming langauge for streams and lifecycles

=

Span Data Structures

This family of data structures is a auto-scaling family with a focus on direct index access while providing name to value resolution when needed. It proritizes simplicity and scalability over re-balancing.

The basic structure starts with a number of slots, in this case the dimension length is 8.

When an item is added beyond the current bounds, it scales to place a span of pointers to route to higher values.

They system scales in powers of the span size. in this case dimension one has 8 slots, dimension 2 has 64 slots and dimension 3 has 512 slots. The standard Caneka span has a dimsize of 16 with 16/256/4096/65,536/1,048,576 per dimension. This makes a 5 dimension table capable of holding around a million disperate values.

In the below example, the disparate set of values from 0,1,2,3,6, and,225 can all be consistently accessed and stored with only 5 allocated span slabs.

This is how our Debug_Print would represent this Span on the console.

Span<5 items in 3 dims of 8 slots each
  Slab<incr512[0] 0=0, 3=192>
    0=Slab<incr64[0] 0=0>
      0=Slab<icr8[0] 0=Wi64<0>, 1=Wi64<1>, 2=Wi64<2>, 6=Wi64<6>>
    3=Slab<incr64[192] 4=224>
      4=Slab<icr8[224] 1=Wi64<225>>
>

One factor of consideration is that each dimension adds an additional hop to element accesss. This is not fundementally different than an expanding binary tree where comparisons are common no matter how consisten they are. However, because of this, the Span shines in use-cases where items are accessed either both directly and iteratively becuase iteration can traverse each span with no lookup cost between consecutive items.

Summary

The span architecture is the central structure underneith our string buffer handling and our key/value resolution. See our page on key/value resolution with Tables for an idea of what is built on top of this.