Partial TOAST decompression for jsonb

Apr 05, 2019 13:12

Inspired by the commit that added support for partial TOAST decompression:
"When asked for a slice of a TOAST entry, decompress enough to return the slice instead of decompressing the entire object."

Nikita Glukhov and I ran a quick experiment to see how jsonb could benefit from this commit. The idea is simple: store short (more valuable) values before long ones. Currently, access time does not depend on the key, but with partial decompression, keys stored near the front of the object become cheaper to access.

Since jsonb stores values in the sorted order of their keys, we generate values whose length depends on the key name.

{
"key1": "aaaa", /* 4 ^ 1 */
"key2": "aaaaaaaaaaaaaaaa", /* 4 ^ 2 = 16 */
...
"key10": "aaa ... aaa" /* 4 ^ 10 = 1M */
}
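A quick way to see the ordering this layout relies on (a sketch, not part of the original experiment): jsonb normalizes key order on input, comparing key length first and then the bytes, so 'key10' sorts after 'key9' and the longest value lands at the end of the object.

```sql
-- Keys are given in arbitrary order; jsonb reorders them on input.
-- Shorter keys sort first, so key10 ends up last.
select '{"key10": 1, "key2": 2, "key1": 3}'::jsonb;
```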

create table t(jb jsonb);
insert into t select (
select jsonb_object_agg('key' || i, repeat('a', pow(4, i)::int)) from generate_series(1,10) i
) from generate_series(1, 1000);
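To check that the generated documents really are compressed by TOAST, the stored size can be compared with the size of the text form. This is only a sanity check, not something taken from the original run; the column aliases are arbitrary, and the exact numbers depend on the server's compression settings.

```sql
-- pg_column_size reports the bytes used to store the value (possibly compressed);
-- octet_length of the text form approximates the uncompressed payload.
select pg_column_size(jb)      as stored_bytes,
       octet_length(jb::text)  as text_bytes
from t
limit 1;
```

A large gap between the two numbers suggests the value is stored compressed, which is the case partial decompression targets.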

We applied partial decompression to the '->' operator and tested performance with this simple query:

select jb->'key1' from t;

The result is as expected: access time depends on the key.

key1-key5   key7    key8     key9     key10
10 ms       48 ms   152 ms   548 ms   2037 ms

Access time for the non-optimized '->>' operator is the same for all keys, roughly 2000 ms.
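One way to reproduce such a comparison is sketched below. It assumes a server build where '->' uses partial decompression (on an unpatched server all three queries should take about the same time). explain analyze is used here only as a convenient way to run each query without sending rows to the client; the numbers above may have been measured differently.

```sql
-- First key: with the patched '->', only a short prefix of the datum is needed.
explain analyze select jb->'key1' from t;

-- Last key: its value sits at the end, so nearly everything is decompressed.
explain analyze select jb->'key10' from t;

-- '->>' was not optimized in this experiment, so it serves as the baseline.
explain analyze select jb->>'key1' from t;
```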

So, this is what we can get for now. Ideally, access time for every key would equal the time needed to access the first (fastest) key; currently we have the opposite.

I hope TOAST will be improved so that we can decompress any slice using a data-type-specific algorithm.

toast, jsonb, pg, pgen
