Function Reference

The plugin provides 65 SQL functions organized into 8 sketch families. Each family has:

  • Aggregation functions that build sketches from raw data (ds_*_sketch) or merge existing sketches (ds_*_union)
  • Scalar functions that query sketches (estimate, quantile, merge, stringify, etc.)

All sketches are represented as VARBINARY in SQL.

Quick Lookup

I want to…                              Function
Count distinct values                   ds_hll_sketch + ds_hll_estimate
Count distinct with set operations      ds_theta_sketch + ds_theta_estimate / ds_theta_intersection / ds_theta_exclude
Compute percentiles                     ds_kll_sketch + ds_kll_quantile
Find the rank of a value                ds_kll_sketch + ds_kll_rank
Find frequent items                     ds_freq_sketch + ds_freq_frequent_items
Compare two distributions (A/B test)    ds_tuple_doubles_sketch + ds_tuple_doubles_ttest
Compute Jaccard similarity              ds_theta_sketch + ds_theta_similarity
Merge pre-aggregated sketches           ds_*_union (aggregation) or ds_*_merge (scalar, pairwise)
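
As a rough illustration of the first and last rows of the lookup table, the following sketch builds one HLL sketch per day, stores it as VARBINARY, and later merges the daily sketches into a single estimate. The table and column names (events, daily_users, user_id, event_date) and the date range are hypothetical, and the single-argument call shape is an assumption; check each function's own entry for its actual signature and optional accuracy parameters.

```sql
-- Hypothetical source table: events(user_id VARCHAR, event_date DATE).
-- Assumed call shape: ds_hll_sketch(value), using default accuracy settings.
CREATE TABLE daily_users AS
SELECT event_date,
       ds_hll_sketch(user_id) AS users_sketch  -- one VARBINARY sketch per day
FROM events
GROUP BY event_date;

-- Merge the stored daily sketches, then estimate distinct users for the period.
SELECT ds_hll_estimate(ds_hll_union(users_sketch)) AS approx_distinct_users
FROM daily_users
WHERE event_date BETWEEN DATE '2024-01-01' AND DATE '2024-01-31';
```

Storing sketches rather than raw rows is what makes the ds_*_union step useful: the monthly estimate is computed from the small per-day sketches without rescanning the events table.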

Conventions

  • NULL sketch arguments to scalar functions return NULL
  • NULL input values in aggregation functions are silently skipped
  • All sketches support a ds_*_stringify function that returns a human-readable summary
  • Accuracy parameters (lgK, k, nominalEntries) trade memory for precision: higher values give more accurate estimates but produce larger sketches
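
The NULL conventions above can be checked with a couple of small queries. This is a minimal sketch that uses the HLL family as an example and assumes the single-argument forms of ds_hll_sketch and ds_hll_estimate; the inline VALUES data is made up for illustration.

```sql
-- A NULL sketch argument to a scalar function is expected to return NULL.
SELECT ds_hll_estimate(CAST(NULL AS VARBINARY)) AS estimate_of_null;

-- NULL input values are skipped by aggregation functions,
-- so only 'a' and 'b' should contribute to the sketch here.
SELECT ds_hll_estimate(ds_hll_sketch(v)) AS approx_distinct
FROM (VALUES 'a', 'b', NULL, 'a') AS t(v);
```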


Trino DataSketches Plugin — Apache License 2.0