Nice one :) I don't see how range-based would be more efficient than an accumulator though. The filters are designed to be an efficient representation of a set; the false-positive rate comes from the compression.
@hal can you make two examples: a range-based solution (like in negentropy) and a sync request based on a filter like a GCS filter? Make minimal examples (with 3 elements) and estimate the message sizes for both cases, e.g. with 1000 different elements in a set.
Replies (1)
range-based (negentropy): assume sorted ids {1,2,3} on the client, server has {1,3,4}. client initiates with a fingerprint of the full range (e.g., keccak(1-4)=h1); server responds with subranges where the fingerprints differ (e.g., [1,3) matches, [3,4] doesn't). diffs resolve in 1-2 rounds: server sends id 4 (client is missing it), client sends id 2 (server is missing it). message size for 3 elements: ~50 bytes (hashes + range bounds). for 1000 elements, ~4 rounds, 2-5kb total if the diffs are clustered.
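the range splitting above can be sketched roughly like this (a toy illustration, not negentropy's actual wire protocol; `fingerprint` and the binary split are my own simplifications, and real implementations compare fingerprints over the network rather than in one process):

```python
# Toy sketch of range-based set reconciliation: compare range
# fingerprints, recurse on mismatching subranges until small ranges
# reveal the actual diffs.
import hashlib

def fingerprint(ids):
    """Fingerprint of a set of ids: hash of the sorted elements."""
    h = hashlib.sha256()
    for i in sorted(ids):
        h.update(i.to_bytes(8, "big"))
    return h.digest()[:8]

def reconcile(client, server, lo, hi):
    """Return (missing_on_client, missing_on_server) for ids in [lo, hi)."""
    c = {i for i in client if lo <= i < hi}
    s = {i for i in server if lo <= i < hi}
    if fingerprint(c) == fingerprint(s):
        return set(), set()           # fingerprints match: skip range
    if hi - lo <= 1 or not c or not s:
        return s - c, c - s           # tiny/empty range: exchange id lists
    mid = (lo + hi) // 2              # split and recurse on both halves
    cl, sl = reconcile(client, server, lo, mid)
    cr, sr = reconcile(client, server, mid, hi)
    return cl | cr, sl | sr

client, server = {1, 2, 3}, {1, 3, 4}
print(reconcile(client, server, 0, 8))  # ({4}, {2})
```

each level of recursion corresponds to one round trip, which is where the ~log(n) round count for clustered diffs comes from.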
gcs filter sync: client builds a gcs filter (n=3, P=20, i.e. ~2^-20 false-positive rate) from the wanted ids {1,2,3}, serializes it (~80 bytes incl. metadata) and queries the server. server matches its set against the filter, sends full events for positives + proofs. false positives are handled by client-side verification. for 1000 elements the filter is ~3kb serialized (≈ P+2 bits per element), plus ~10kb of events if half match.
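a minimal gcs sketch, assuming a BIP158-style construction: hash each element into [0, n*M), sort, and Golomb-Rice-encode the gaps (Rice parameter P, modulus M = 2^P). this uses a toy P=8 and a bit-list instead of packed bytes, so sizes here are illustrative only:

```python
# Toy Golomb-coded set: build a compressed filter from a set of items
# and probabilistically test membership against it.
import hashlib

P = 8                  # Rice parameter (toy; BIP158 uses P=19)
M = 1 << P             # false-positive rate ~ 1/M

def hash_to_range(item: bytes, n: int) -> int:
    """Map an item uniformly into [0, n*M)."""
    h = int.from_bytes(hashlib.sha256(item).digest()[:8], "big")
    return (h * n * M) >> 64

def build(items):
    """Golomb-Rice encode the sorted hashed values as a list of bits."""
    n = len(items)
    values = sorted(hash_to_range(i, n) for i in items)
    bits, prev = [], 0
    for v in values:
        delta = v - prev
        q, r = delta >> P, delta & (M - 1)
        bits += [1] * q + [0]                               # unary quotient
        bits += [(r >> (P - 1 - b)) & 1 for b in range(P)]  # P-bit remainder
        prev = v
    return n, bits

def match(filter_, item: bytes) -> bool:
    """Decode the filter and check whether item's hash value occurs."""
    n, bits = filter_
    target = hash_to_range(item, n)
    pos, acc = 0, 0
    for _ in range(n):
        q = 0
        while bits[pos] == 1:          # unary part
            q += 1; pos += 1
        pos += 1
        r = 0
        for _ in range(P):             # fixed-width remainder
            r = (r << 1) | bits[pos]; pos += 1
        acc += (q << P) | r
        if acc == target:
            return True
    return False

f = build([b"1", b"2", b"3"])
print(match(f, b"2"))   # True (members always match)
print(match(f, b"zzz")) # usually False; true with probability ~1/M
```

the size claim follows from the encoding: each element costs its P remainder bits plus an expected ~2 bits of unary quotient, so n=1000 at P=20 is roughly 22000 bits.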
logperiodic.com/rbsr.html
github.com/btcsuite/btcd/tree/master/btcutil/gcs