Tuple ArrayOfDoubles Sketch Functions

ArrayOfDoubles sketches combine cardinality estimation with associated numeric values. Each key has one or more double values that are summed on duplicate keys. This enables computing means, variances, and t-tests over sketch data.

Table of contents

  1. Aggregation Functions
    1. ds_tuple_doubles_sketch
    2. ds_tuple_doubles_union
  2. Scalar Functions
    1. ds_tuple_doubles_estimate
    2. ds_tuple_doubles_estimate_bounds
    3. ds_tuple_doubles_retained
    4. ds_tuple_doubles_means
    5. ds_tuple_doubles_variances
    6. ds_tuple_doubles_ttest
    7. ds_tuple_doubles_merge
    8. ds_tuple_doubles_to_quantiles_sketch
    9. ds_tuple_doubles_stringify
  3. Use Cases

Aggregation Functions

ds_tuple_doubles_sketch

Builds a tuple sketch from key-value pairs. Duplicate keys have their values summed.

ds_tuple_doubles_sketch(key, value) -> varbinary
Parameter Type Description
key VARCHAR or BIGINT Key to sketch
value DOUBLE Associated numeric value

ds_tuple_doubles_union

Computes the union of multiple tuple sketches.

ds_tuple_doubles_union(sketch) -> varbinary

Scalar Functions

ds_tuple_doubles_estimate

Returns the cardinality estimate (number of distinct keys).

ds_tuple_doubles_estimate(sketch) -> double

ds_tuple_doubles_estimate_bounds

Returns estimate with error bounds.

ds_tuple_doubles_estimate_bounds(sketch, numStdDevs) -> array(double)

ds_tuple_doubles_retained

Returns the number of retained entries in the sketch.

ds_tuple_doubles_retained(sketch) -> integer

ds_tuple_doubles_means

Computes the mean of each value column across all retained entries.

ds_tuple_doubles_means(sketch) -> array(double)

Example:

-- Average revenue per user
SELECT ds_tuple_doubles_means(ds_tuple_doubles_sketch(user_id, revenue))
FROM orders;
-- Returns: [42.50]  (average revenue across all users)

ds_tuple_doubles_variances

Computes the variance of each value column.

ds_tuple_doubles_variances(sketch) -> array(double)

ds_tuple_doubles_ttest

Performs a t-test comparing two tuple sketches, returning p-values for each value column.

ds_tuple_doubles_ttest(sketch1, sketch2) -> array(double)

Example:

-- A/B test: is the difference in revenue per user significant?
SELECT ds_tuple_doubles_ttest(
    (SELECT ds_tuple_doubles_sketch(user_id, revenue) FROM orders WHERE variant = 'control'),
    (SELECT ds_tuple_doubles_sketch(user_id, revenue) FROM orders WHERE variant = 'treatment')
);
-- Returns: [0.003]  (p-value < 0.05, significant difference)

ds_tuple_doubles_merge

Pairwise union of two tuple sketches.

ds_tuple_doubles_merge(sketch1, sketch2) -> varbinary

ds_tuple_doubles_to_quantiles_sketch

Extracts a value column from the tuple sketch and converts it to a KLL quantiles sketch.

ds_tuple_doubles_to_quantiles_sketch(sketch, columnIndex) -> varbinary
Parameter Type Description
sketch VARBINARY Tuple sketch
columnIndex INTEGER 0-based column index

Example:

-- Get the median revenue per user from a tuple sketch
SELECT ds_kll_quantile(
    ds_tuple_doubles_to_quantiles_sketch(
        ds_tuple_doubles_sketch(user_id, revenue), 0),
    0.5
) AS median_revenue
FROM orders;

ds_tuple_doubles_stringify

ds_tuple_doubles_stringify(sketch) -> varchar

Use Cases

  • A/B testing: compare metric distributions between experiment groups using ds_tuple_doubles_ttest
  • Per-user metrics: track revenue, session count, or engagement per user
  • Anomaly detection: compare means/variances across time periods

Back to top

Trino DataSketches Plugin — Apache License 2.0