CPC Sketch Functions
Compressed Probabilistic Counting (CPC) sketches estimate cardinality (the number of distinct values) with better accuracy per byte than HyperLogLog (HLL) sketches. They are a good choice when storage efficiency is critical.
Aggregation Functions
ds_cpc_sketch
Builds a CPC sketch from input values.
ds_cpc_sketch(value) -> varbinary
ds_cpc_sketch(value, lgK) -> varbinary
| Parameter | Type | Description |
|---|---|---|
| value | VARCHAR, BIGINT, or DOUBLE | Values to count |
| lgK | INTEGER | Log2 of sketch size (4-26, default 11) |
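As a minimal sketch of how the aggregate might be used, the query below builds one CPC sketch per day. The table `events` and its columns `event_date` and `user_id` are hypothetical names chosen for illustration; only `ds_cpc_sketch` and its `lgK` argument come from the signatures above.

```sql
-- Hypothetical table: events(event_date DATE, user_id BIGINT)
-- Build one sketch of distinct users per day, using lgK = 12
-- instead of the default 11.
SELECT
  event_date,
  ds_cpc_sketch(user_id, 12) AS users_sketch
FROM events
GROUP BY event_date;
```

Storing the resulting varbinary column lets later queries combine days without rescanning the raw rows.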
ds_cpc_union
Computes the union of multiple CPC sketches.
ds_cpc_union(sketch) -> varbinary
ds_cpc_union(sketch, lgK) -> varbinary
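The following example illustrates one way to combine stored sketches over a date range. It assumes a hypothetical table `daily_user_sketches` with one row per day and a varbinary column `users_sketch` produced by `ds_cpc_sketch`; the dates are illustrative only.

```sql
-- Hypothetical table: daily_user_sketches(event_date DATE, users_sketch VARBINARY)
-- Union seven daily sketches into a single weekly sketch.
SELECT
  ds_cpc_union(users_sketch) AS weekly_users_sketch
FROM daily_user_sketches
WHERE event_date BETWEEN DATE '2024-01-01' AND DATE '2024-01-07';
```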
Scalar Functions
ds_cpc_estimate
Returns the cardinality estimate.
ds_cpc_estimate(sketch) -> double
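A small illustration, assuming a hypothetical table `events` with a `user_id` column: the sketch is built and estimated in one query, next to an exact count for comparison.

```sql
-- Hypothetical table: events(user_id BIGINT)
-- Compare the approximate distinct count with the exact one.
SELECT
  ds_cpc_estimate(ds_cpc_sketch(user_id)) AS approx_users,
  COUNT(DISTINCT user_id)                 AS exact_users
FROM events;
```

The exact count is included only as a sanity check; on large tables it is usually the expensive part that the sketch is meant to avoid.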
ds_cpc_estimate_bounds
Returns the cardinality estimate together with lower and upper error bounds.
ds_cpc_estimate_bounds(sketch, kappa) -> array(double)
| Parameter | Type | Description |
|---|---|---|
| sketch | VARBINARY | CPC sketch |
| kappa | INTEGER | Number of standard deviations (1, 2, or 3) |
Returns [estimate, lower_bound, upper_bound].
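The returned array can be unpacked into separate columns. The example below is a sketch under two assumptions: the table `events` and column `user_id` are hypothetical, and array subscripts are 1-based, as in Presto and Trino style SQL. Adjust the indexes if your engine uses a different convention.

```sql
-- Hypothetical table: events(user_id BIGINT)
-- Assumes 1-based array subscripts.
WITH b AS (
  SELECT ds_cpc_estimate_bounds(ds_cpc_sketch(user_id), 2) AS bounds
  FROM events
)
SELECT
  bounds[1] AS estimate,
  bounds[2] AS lower_bound,  -- 2 standard deviations below
  bounds[3] AS upper_bound   -- 2 standard deviations above
FROM b;
```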
ds_cpc_merge
Merges two CPC sketches into a single sketch.
ds_cpc_merge(sketch1, sketch2) -> varbinary
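Because `ds_cpc_merge` is a scalar function, it takes two sketch values from the same row rather than aggregating over rows. The example below is a hedged illustration that assumes a hypothetical table `daily_user_sketches` holding exactly one row per `event_date`; the dates are illustrative.

```sql
-- Hypothetical table: daily_user_sketches(event_date DATE, users_sketch VARBINARY)
-- Pair two specific days and estimate their combined distinct users.
SELECT
  ds_cpc_estimate(ds_cpc_merge(a.users_sketch, b.users_sketch)) AS approx_users_two_days
FROM daily_user_sketches a
CROSS JOIN daily_user_sketches b
WHERE a.event_date = DATE '2024-01-01'
  AND b.event_date = DATE '2024-01-02';
```

For more than two sketches, the `ds_cpc_union` aggregate is usually the simpler choice.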
ds_cpc_stringify
Returns a human-readable string representation.
ds_cpc_stringify(sketch) -> varchar
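For debugging, a sketch can be inspected directly. This example assumes a hypothetical table `events` with a `user_id` column; the exact text returned is not specified here.

```sql
-- Hypothetical table: events(user_id BIGINT)
-- Print a human-readable summary of a freshly built sketch.
SELECT ds_cpc_stringify(ds_cpc_sketch(user_id)) AS sketch_summary
FROM events;
```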
CPC vs HLL
| Property | CPC | HLL |
|---|---|---|
| Accuracy per byte | Better | Good |
| Merge speed | Slower | Faster |
| Set operations | Union only | Union only |
| Maturity | Newer | Well-established |
Use CPC when you need the most compact sketches. Use HLL when merge performance matters or for compatibility with existing systems.