Table KeyValue Store

Table is a convention on top of the Span data structure which provides key/value lookup resolution. In short, this is a hash table.

The underlying mechanism is to resolve a portion of bits from the hash value of the key (such as 0001 from 11100111101001011001101010110001). In this case we have the hash of the string "Delta" (generated by our hashing algorithm which is a modification of dbj2), into a value within the range of the slots in the Span, 16 to start, then later 256.

For example, a Table would resolve the above hash into slot 1 (0001 in binary) for a 16 slot span, or 177 (10110001 in binary) for a 256 slot Span.

This is done by masking the hash positionally, against the max value of the slots available.

Dimensions
Range
Mask
Places

1
0-16
00000000000000000000000000001111
8

2
0-255
00000000000000000000000011111111
7

3
0-4096
00000000000000000000111111111111
6

4
0-65535
00000000000000001111111111111111
5

The placement and retrieval lookup resolution is re-query based. This means that if there is already a non-matching item in the slot, it will try the neighboring bits to find a home for the item. In the above example, if slot 1 is taken an attempt will be made to place the key/value in slot 11 (1011 in binary) because that is the second 4 bits of the hash value.

Each req-query examines the bits 4 positions higher and looks in that location for an opening to place the item. When it is fetched it will pursue the same lookup pattern, and is thereby gauranteed to find the item by comparing the full hash.

Attempt
Key
Index
SubHash
Position
Hash

1
"Delta"
1
0001
1st
11100111101001011001101010110001

2
"Delta"
11
1011
2nd
11100111101001011001101010110001

Below is a table with four items in it. Two of the items find an open slot on the first try, with one needing two tries, and one needing three tries, but in practice the second try for "Carrots" would be skipped because it contains identicle bits.

Attempt
Key
Index
SubHash
Position
Hash

Attempt: 1
Key: "Apples"
Index: 7
SubHash: 0111
Position: 1st
Hash:
00000100001010001101101101100111

1
"Bananas"
7
0111
1st
00011010001010101111110100110111

2
"Bananas"
3
0011
2nd
00011010001010101111110100110111

1
"Carrots"
3
0011
1st
10111011100011011100100100110011

2
"Carrots"
3
0011
2nd
10111011100011011100100100110011

3
"Carrots"
9
1001
3rd
10111011100011011100100100110011

1
"Dandilions"
11
1011
1st
00110100001111111001111000011011

This is how our debugger presents the table on the console using our Debug_Print functionality

Span<4 items in 1 dims of 16 slots each
  Slab<icr16[0] 
    3=H<itm="Bananas" v="Don't slip">, 
    7=H<itm="Apples" v="Mmmm pie in the making">, 
    9=H<itm="Carrots" v="And peas and potatoes and corn.">, 
    11=H<itm="Dandilions" v="Summer-summer-summer-Time!">>
>

Table Resize

The table will resize if it has to query too many times to find a place for items, or if it begins to fill the slots. The below table is how the first 10 items would be inserted into a Table.

Attempt
Key
Index
SubHash
Position
Hash

1
"Alpha"
1
0001
1
00101110011111101000111000000001

1
"Bravo"
7
0111
1
01101000011000000001011010010111

1
"Charlie"
9
1001
1
10111000101110101011110101101001

1
"Delta"
1
0001
1
11100111101001011001101010110001

2
"Delta"
11
1011
2
11100111101001011001101010110001

1
"Echo"
15
1111
1
11010111010001101010100001111111

1
"Foxtrot"
0
0000
1
10100101110011011001100000110000

1
"Golf"
6
0110
1
11101011000110111101101001000110

1
"Hotel"
0
0000
1
11000111000011111000010110100000

2
"Hotel"
10
1010
2
11000111000011111000010110100000

1
"India"
5
0101
1
10111100010000101100011111100101

1
"Juliet"
8
1000
1
01000110011000101010110000111000

Debug_Print output:

Span<10 items in 1 dims of 16 slots each
  Slab<icr16[0]
    0=H<itm="Foxtrot" v="Five Minutes">,
    1=H<itm="Alpha" v="Apples">
    5=H<itm="India" v="Idio-syncratic">
    6=H<itm="Golf" v="Gophers Are Cool">
    7=H<itm="Bravo" v="Bandits">
    8=H<itm="Juliet" v="Jockey Rider">
    9=H<itm="Charlie" v="Carrots">
    10=H<itm="Hotel" v="Happy Go Lucky">
    11=H<itm="Delta" v="DeadMan">
    15=H<itm="Echo" v="Elevator">>
>

Inserting one more value puts it over the slots-filled limit, because 11 values in 16 spaces will increase the collisions significantly, so it resizes after the below insertion.

Attempt
Key
Index
SubHash
Position
Hash

1
"Kilo"
15
1111
1
00010010001111001011010011101111

2
"Kilo"
14
1110
2
00010010001111001011010011101111

After "Kilo" is inserted into the table, it resizes to a dimension 2 span with 256 slots instead of the basic 16. Below is the re-insertion pattern into the new table using 8 bits instead of 4. As you can see, the index values are now between 0 and 255.

Attempt
Key
Index
SubHash
Position
Hash

1
"Foxtrot"
48
00110000
1
10100101110011011001100000110000

1
"Alpha"
1
00000001
1
00101110011111101000111000000001

1
"India"
229
11100101
1
10111100010000101100011111100101

1
"Golf"
70
01000110
1
11101011000110111101101001000110

1
"Bravo"
151
10010111
1
01101000011000000001011010010111

1
"Juliet"
56
00111000
1
01000110011000101010110000111000

1
"Charlie"
105
01101001
1
10111000101110101011110101101001

1
"Hotel"
160
10100000
1
11000111000011111000010110100000

1
"Delta"
177
10110001
1
11100111101001011001101010110001

1
"Kilo"
239
11101111
1
00010010001111001011010011101111

1
"Echo"
127
01111111
1
11010111010001101010100001111111

1
"Lima"
69
01000101
1
10011100000010111100110101000101

Here is the Debug_Print output for the final 256 slot Table. It's complex to read becuase it splits over several slabs now that the Span has two levels of slab dimensions.

Span<12 items in 2 dims of 16 slots each
  Slab<incr256[0] 0=0, 1=16, 3=48, 4=64, 6=96, 7=112, 9=144, 10=160, 11=176, 14=224>
    0=Slab<icr16[0]
        1=H<1 itm="Alpha" v="Apples">>
    3=Slab<icr16[48]
        0=H<48 itm="Foxtrot" v="Five Minutes">
        8=H<56 itm="Juliet" v="Jockey Rider">>
    4=Slab<icr16[64]
        6=H<70 itm="Golf" v="Gophers Are Cool">>
    6=Slab<icr16[96]
        9=H<105 itm="Charlie" v="Carrots">>
    7=Slab<icr16[112]
        15=H<127 itm="Echo" v="Elevator">>
    9=Slab<icr16[144]
        7=H<151 itm="Bravo" v="Bandits">>
    10=Slab<icr16[160]
        0=H<160 itm="Hotel" v="Happy Go Lucky">>
    11=Slab<icr16[176]
        1=H<177 itm="Delta" v="DeadMan">>
    14=Slab<icr16[224]
        5=H<229 itm="India" v="Idio-syncratic">
        15=H<239 itm="Kilo" v="Kangaroo">>
>

If the query placement encounters more than a threshold of insert conflicts it will resort to a linked list of items as is standard in a normal hash table. This is possible but has arisen very rarely in our testing so far.

Summary

The system is in it's infancy and is ripe for optimization. One challenge is that just after the table expands, it is likely not very memory efficient due to several 16 slot spans allocated with only a single item in them. The system is functional now but may be optimized more if we were to move into IoT or embedded spaces where memory usage is more sensitive.

For the sake of clarity we can emphasis that the simplicity of using several sections of bits from the hash makes for a clean memory footprint while providing consistent lookups. See the page dedicated to the underlying architecture for an overview of how the Span allocation works.

caneka

A programming langauge for streams and lifecycles

caneka

A programming langauge for streams and lifecycles

Table KeyValue Store

Table Resize

Summary