percentile
This page explains how to use the percentile aggregation function in APL.
The `percentile` aggregation function in Axiom Processing Language (APL) allows you to calculate the value below which a given percentage of data points fall. It is particularly useful when you need to analyze distributions and want to summarize the data using specific thresholds, such as the 90th or 95th percentile. This function can be valuable in performance analysis, trend detection, or identifying outliers across large datasets.
You can apply the `percentile` function to various use cases, such as analyzing log data for request durations, OpenTelemetry traces for service latencies, or security logs to assess risk patterns.
For users of other query languages
If you come from other query languages, this section explains how to adjust your existing queries to achieve the same results in APL.
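As a rough comparison, the sketch below shows how a 95th-percentile calculation is commonly written in Splunk SPL and ANSI SQL, followed by an APL equivalent. The index, table, and dataset names (`web_logs`, `sample_http_logs`, `sample-http-logs`) are hypothetical and used only for illustration; the `req_duration_ms` field matches the one used in the example on this page.

```kusto
// Splunk SPL (for comparison; index name is hypothetical):
//   index=web_logs | stats perc95(req_duration_ms)
//
// ANSI SQL (for comparison; table name is hypothetical):
//   SELECT PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY req_duration_ms)
//   FROM sample_http_logs;
//
// APL equivalent (dataset name is an assumption):
['sample-http-logs']
| summarize percentile(req_duration_ms, 95)
```

Results may differ slightly between engines, because each one uses its own method to compute or estimate percentiles.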
Usage
Syntax
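Assuming the two parameters described in this section, a call takes roughly the following form. This is a sketch in which `column` and `percentile` are placeholders that mirror the parameter names:

```kusto
percentile(column, percentile)
```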
Parameters
- column: The name of the column to calculate the percentile on. This must be a numeric field.
- percentile: The target percentile value (between 0 and 100).
Returns
The function returns the value from the specified column that corresponds to the given percentile.
Use case examples
In log analysis, you can use the `percentile` function to identify the 95th percentile of request durations, which gives you an idea of the tail-end latencies of requests in your system.
Query
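A minimal sketch of such a query, assuming a dataset named `sample-http-logs` that contains a numeric `req_duration_ms` field. The dataset name is an assumption for illustration; substitute your own.

```kusto
// Assumed dataset name; req_duration_ms is inferred from the output column in this example.
['sample-http-logs']
| summarize percentile_req_duration_ms = percentile(req_duration_ms, 95)
```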
Output
| percentile_req_duration_ms |
| --- |
| 1200 |
This query calculates the 95th percentile of request durations, showing that 95% of requests complete in 1200 ms or less.
List of related aggregations
- avg: Use `avg` to calculate the average of a column, which gives you the central tendency of your data. In contrast, `percentile` provides more insight into the distribution and tail values.
- min: The `min` function returns the smallest value in a column. Use this when you need the absolute lowest value instead of a specific percentile.
- max: The `max` function returns the highest value in a column. It’s useful for finding the upper bound, while `percentile` allows you to focus on a specific point in the data distribution.
- stdev: `stdev` calculates the standard deviation of a column, which helps measure data variability. While `stdev` provides insight into overall data spread, `percentile` focuses on specific distribution points.