# COUNT_DISTINCT

The `COUNT_DISTINCT` function returns the approximate number of distinct values in a column or expression. It is based on the HyperLogLog++ algorithm, which offers configurable precision and fixed memory usage, so it remains practical for large, high-cardinality datasets. Accuracy depends on the configured precision threshold and on the data itself.

## Syntax

`COUNT_DISTINCT(field, precision)`

### Parameters

#### field

Column, literal, or expression whose distinct values are counted.

#### precision

Optional. Precision threshold that controls the trade-off between memory usage and accuracy. The default value is 3000. The maximum supported value is 40000; higher values are treated as 40000.

## Examples

Counts the number of unique values in the `ip0` and `ip1` columns from the `hosts` dataset.

```esql
FROM hosts
| STATS COUNT_DISTINCT(ip0), COUNT_DISTINCT(ip1)
```

Calculates the distinct count for `ip0` with a high precision threshold and for `ip1` with a low precision threshold. The value 80000 exceeds the maximum supported threshold, so it is treated as 40000.

```esql
FROM hosts
| STATS COUNT_DISTINCT(ip0, 80000), COUNT_DISTINCT(ip1, 5)
```
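
Because thresholds above the maximum are treated as 40000, requesting 80000 should behave the same as requesting 40000 explicitly. The following is a minimal sketch for checking this on the same `hosts` dataset; the output column names `requested_80000` and `requested_40000` are illustrative and not part of the original example.

```esql
// Thresholds above 40000 are treated as 40000,
// so both columns are expected to hold the same count.
FROM hosts
| STATS requested_80000 = COUNT_DISTINCT(ip0, 80000),
        requested_40000 = COUNT_DISTINCT(ip0, 40000)
```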

Counts the number of unique words in a semicolon-separated string after splitting it into individual words.

```esql
ROW words="foo;bar;baz;qux;quux;foo"
| STATS distinct_word_count = COUNT_DISTINCT(SPLIT(words, ";"))
```

## Limitations

- Counts are approximate, not exact.
- The maximum supported precision threshold is 40000; higher values have no additional effect.
- Accuracy depends on the dataset and the configured precision threshold. Even with low thresholds, error rates typically remain low (1-6%) at large cardinalities.
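
One way to gauge how much the threshold matters for a particular dataset is to run the same count at several thresholds and compare the results. The sketch below reuses the `hosts` dataset from the examples; the threshold of 100 and the output column names are arbitrary choices for illustration.

```esql
// low_threshold:     small threshold, less memory, potentially less accurate
// default_threshold: no precision argument, so the default of 3000 applies
// max_threshold:     the maximum supported threshold of 40000
FROM hosts
| STATS low_threshold = COUNT_DISTINCT(ip0, 100),
        default_threshold = COUNT_DISTINCT(ip0),
        max_threshold = COUNT_DISTINCT(ip0, 40000)
```

If the three results agree or differ only slightly, the default threshold is likely adequate for that column.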
