HelixDB Docs

Aggregate Functions

HelixQL provides a comprehensive set of aggregate functions for performing statistical operations and summarizing collections of values. These functions operate on arrays or collections to produce single summary values.

Available Functions

MIN - Minimum Value

MIN(collection)  // Returns the smallest value

Returns the minimum value from a collection of numbers.

MAX - Maximum Value

MAX(collection)  // Returns the largest value

Returns the maximum value from a collection of numbers.

SUM - Sum of Values

SUM(collection)  // Returns the total sum

Returns the sum of all values in a collection.

AVG - Average Value

AVG(collection)  // Returns the mean

Returns the arithmetic mean (average) of all values in a collection.

COUNT - Count Elements

COUNT(collection)  // Returns the number of elements

Returns the number of elements in a collection.

When using the SDKs or curling the endpoint, the query name must match what is defined in the queries.hx file exactly.

Example 1: Statistical analysis of sales data

Analyze sales performance using aggregate functions:

QUERY AnalyzeSalesStatistics() =>
    stats <- {
        total_sales: SUM(N::Sale::{amount}),
        average_sale: AVG(N::Sale::{amount}),
        max_sale: MAX(N::Sale::{amount}),
        min_sale: MIN(N::Sale::{amount}),
        sale_count: COUNT(N::Sale::{amount})
    }
    RETURN stats

QUERY CreateSale(amount: F64, product: String) =>
    sale <- AddN<Sale>({ amount: amount, product: product })
    RETURN sale

Here’s how to run the query using the SDKs or curl

from helix.client import Client

client = Client(local=True, port=6969)

# Create sales records
sales = [
    {"amount": 150.0, "product": "Laptop"},
    {"amount": 75.0, "product": "Mouse"},
    {"amount": 300.0, "product": "Monitor"},
    {"amount": 50.0, "product": "Keyboard"},
    {"amount": 200.0, "product": "Headphones"},
    {"amount": 125.0, "product": "Webcam"},
]

for sale in sales:
    client.query("CreateSale", sale)

result = client.query("AnalyzeSalesStatistics", {})
print("Sales statistics:", result)

Example 2: Student grade analysis

Calculate grade statistics for students:

QUERY AnalyzeStudentGrades() =>
    students <- N::Student
        ::{
            name,
            grades,
            highest_grade: MAX(_::{grades}),
            lowest_grade: MIN(_::{grades}),
            average_grade: AVG(_::{grades}),
            total_assessments: COUNT(_::{grades})
        }
    RETURN students

QUERY CreateStudent(name: String, grades: [F64]) =>
    student <- AddN<Student>({ name: name, grades: grades })
    RETURN student

Here’s how to run the query using the SDKs or curl

from helix.client import Client

client = Client(local=True, port=6969)

# Create student records with their grades
students = [
    {"name": "Alice", "grades": [85.0, 92.0, 88.0, 95.0, 90.0]},
    {"name": "Bob", "grades": [78.0, 82.0, 75.0, 88.0, 80.0]},
    {"name": "Charlie", "grades": [95.0, 98.0, 92.0, 96.0, 94.0]},
    {"name": "Diana", "grades": [70.0, 75.0, 72.0, 78.0, 74.0]},
]

for student in students:
    client.query("CreateStudent", student)

result = client.query("AnalyzeStudentGrades", {})
print("Student grade analysis:", result)

Example 3: Sensor data monitoring

Monitor and summarize IoT sensor readings:

QUERY MonitorSensorReadings() =>
    sensors <- N::Sensor
        ::{
            sensor_id,
            readings,
            max_reading: MAX(_::{readings}),
            min_reading: MIN(_::{readings}),
            avg_reading: AVG(_::{readings}),
            total_readings: COUNT(_::{readings}),
            reading_sum: SUM(_::{readings})
        }
    RETURN sensors

QUERY CreateSensor(sensor_id: String, readings: [F64]) =>
    sensor <- AddN<Sensor>({ sensor_id: sensor_id, readings: readings })
    RETURN sensor

Here’s how to run the query using the SDKs or curl

from helix.client import Client

client = Client(local=True, port=6969)

# Create sensor records with readings
sensors = [
    {"sensor_id": "TEMP-001", "readings": [22.5, 23.1, 22.8, 23.5, 22.9]},
    {"sensor_id": "TEMP-002", "readings": [25.0, 25.5, 24.8, 26.0, 25.2]},
    {"sensor_id": "HUMID-001", "readings": [45.0, 47.0, 46.5, 48.0, 46.0]},
    {"sensor_id": "PRESS-001", "readings": [1013.0, 1012.5, 1014.0, 1013.5, 1013.0]},
]

for sensor in sensors:
    client.query("CreateSensor", sensor)

result = client.query("MonitorSensorReadings", {})
print("Sensor monitoring:", result)

Common Aggregate Patterns

Combining Aggregates

Aggregate functions are often used together for comprehensive analysis:

QUERY ComprehensiveStats() =>
    stats <- {
        count: COUNT(N::DataPoint::{value}),
        sum: SUM(N::DataPoint::{value}),
        avg: AVG(N::DataPoint::{value}),
        min: MIN(N::DataPoint::{value}),
        max: MAX(N::DataPoint::{value}),
        range: SUB(MAX(N::DataPoint::{value}), MIN(N::DataPoint::{value}))
    }
    RETURN stats

Filtered Aggregates

Combine aggregates with filtering for conditional statistics:

QUERY FilteredStats(threshold: F64) =>
    filtered <- N::DataPoint WHERE _::{value} > threshold
    stats <- {
        count: COUNT(filtered::{value}),
        average: AVG(filtered::{value}),
        maximum: MAX(filtered::{value})
    }
    RETURN stats

Nested Aggregates

Calculate aggregates on aggregate results:

QUERY NestedAggregates() =>
    groups <- N::Group
        ::{
            name,
            member_count: COUNT(_::{members}),
            avg_age: AVG(_::{members}::age),
            max_score: MAX(_::{members}::score)
        }
    overall_stats <- {
        total_groups: COUNT(groups),
        avg_group_size: AVG(groups::{member_count})
    }
    RETURN overall_stats

Aggregate functions are particularly useful for data analysis, reporting, and creating dashboards.

Empty Collections

When aggregate functions are applied to empty collections:

COUNT returns 0
SUM returns 0
AVG, MIN, MAX return null or error depending on implementation

Statistical Variance

To calculate variance and standard deviation, combine aggregate functions:

// Calculate variance
QUERY CalculateVariance() =>
    values <- N::Value::{amount}
    mean <- AVG(values)
    squared_diffs <- values
        ::{
            diff_squared: POW(SUB(_::{amount}, mean), 2.0)
        }
    variance <- AVG(squared_diffs::{diff_squared})
    std_dev <- SQRT(variance)
    RETURN { variance: variance, std_dev: std_dev }

Performance Considerations

Aggregate functions scan entire collections, so consider:

Adding appropriate indexes for frequently aggregated fields
Using filters before aggregation to reduce data volume
Caching aggregate results for frequently accessed statistics

Unary Math Functions

SQRT, ABS, LN, LOG10, EXP, CEIL, FLOOR, ROUND

Arithmetic Functions

ADD, SUB, MUL, DIV, POW, MOD

Conditionals

Filter data before aggregation

Math Overview

Overview of all math functions

HelixQL

​Aggregate Functions

​Available Functions

​MIN - Minimum Value

​MAX - Maximum Value

​SUM - Sum of Values

​AVG - Average Value

​COUNT - Count Elements

​Example 1: Statistical analysis of sales data

​Example 2: Student grade analysis

​Example 3: Sensor data monitoring

​Common Aggregate Patterns

​Combining Aggregates

​Filtered Aggregates

​Nested Aggregates

​Empty Collections

​Statistical Variance

​Performance Considerations

​Related Topics

Unary Math Functions

Arithmetic Functions

Conditionals

Math Overview

Aggregate Functions

Available Functions

MIN - Minimum Value

MAX - Maximum Value

SUM - Sum of Values

AVG - Average Value

COUNT - Count Elements

Example 1: Statistical analysis of sales data

Example 2: Student grade analysis

Example 3: Sensor data monitoring

Common Aggregate Patterns

Combining Aggregates

Filtered Aggregates

Nested Aggregates

Empty Collections

Statistical Variance

Performance Considerations

Related Topics