Skip to main content

Aggregate Functions

HelixQL provides a comprehensive set of aggregate functions for performing statistical operations and summarizing collections of values. These functions operate on arrays or collections to produce single summary values.

Available Functions

MIN - Minimum Value

MIN(collection)  // Returns the smallest value
Returns the minimum value from a collection of numbers.

MAX - Maximum Value

MAX(collection)  // Returns the largest value
Returns the maximum value from a collection of numbers.

SUM - Sum of Values

SUM(collection)  // Returns the total sum
Returns the sum of all values in a collection.

AVG - Average Value

AVG(collection)  // Returns the mean
Returns the arithmetic mean (average) of all values in a collection.

COUNT - Count Elements

COUNT(collection)  // Returns the number of elements
Returns the number of elements in a collection.
When using the SDKs or curling the endpoint, the query name must match what is defined in the queries.hx file exactly.

Example 1: Statistical analysis of sales data

Analyze sales performance using aggregate functions:
QUERY AnalyzeSalesStatistics() =>
    stats <- {
        total_sales: SUM(N::Sale::{amount}),
        average_sale: AVG(N::Sale::{amount}),
        max_sale: MAX(N::Sale::{amount}),
        min_sale: MIN(N::Sale::{amount}),
        sale_count: COUNT(N::Sale::{amount})
    }
    RETURN stats

QUERY CreateSale(amount: F64, product: String) =>
    sale <- AddN<Sale>({ amount: amount, product: product })
    RETURN sale
Here’s how to run the query using the SDKs or curl
from helix.client import Client

client = Client(local=True, port=6969)

# Create sales records
sales = [
    {"amount": 150.0, "product": "Laptop"},
    {"amount": 75.0, "product": "Mouse"},
    {"amount": 300.0, "product": "Monitor"},
    {"amount": 50.0, "product": "Keyboard"},
    {"amount": 200.0, "product": "Headphones"},
    {"amount": 125.0, "product": "Webcam"},
]

for sale in sales:
    client.query("CreateSale", sale)

result = client.query("AnalyzeSalesStatistics", {})
print("Sales statistics:", result)

Example 2: Student grade analysis

Calculate grade statistics for students:
QUERY AnalyzeStudentGrades() =>
    students <- N::Student
        ::{
            name,
            grades,
            highest_grade: MAX(_::{grades}),
            lowest_grade: MIN(_::{grades}),
            average_grade: AVG(_::{grades}),
            total_assessments: COUNT(_::{grades})
        }
    RETURN students

QUERY CreateStudent(name: String, grades: [F64]) =>
    student <- AddN<Student>({ name: name, grades: grades })
    RETURN student
Here’s how to run the query using the SDKs or curl
from helix.client import Client

client = Client(local=True, port=6969)

# Create student records with their grades
students = [
    {"name": "Alice", "grades": [85.0, 92.0, 88.0, 95.0, 90.0]},
    {"name": "Bob", "grades": [78.0, 82.0, 75.0, 88.0, 80.0]},
    {"name": "Charlie", "grades": [95.0, 98.0, 92.0, 96.0, 94.0]},
    {"name": "Diana", "grades": [70.0, 75.0, 72.0, 78.0, 74.0]},
]

for student in students:
    client.query("CreateStudent", student)

result = client.query("AnalyzeStudentGrades", {})
print("Student grade analysis:", result)

Example 3: Sensor data monitoring

Monitor and summarize IoT sensor readings:
QUERY MonitorSensorReadings() =>
    sensors <- N::Sensor
        ::{
            sensor_id,
            readings,
            max_reading: MAX(_::{readings}),
            min_reading: MIN(_::{readings}),
            avg_reading: AVG(_::{readings}),
            total_readings: COUNT(_::{readings}),
            reading_sum: SUM(_::{readings})
        }
    RETURN sensors

QUERY CreateSensor(sensor_id: String, readings: [F64]) =>
    sensor <- AddN<Sensor>({ sensor_id: sensor_id, readings: readings })
    RETURN sensor
Here’s how to run the query using the SDKs or curl
from helix.client import Client

client = Client(local=True, port=6969)

# Create sensor records with readings
sensors = [
    {"sensor_id": "TEMP-001", "readings": [22.5, 23.1, 22.8, 23.5, 22.9]},
    {"sensor_id": "TEMP-002", "readings": [25.0, 25.5, 24.8, 26.0, 25.2]},
    {"sensor_id": "HUMID-001", "readings": [45.0, 47.0, 46.5, 48.0, 46.0]},
    {"sensor_id": "PRESS-001", "readings": [1013.0, 1012.5, 1014.0, 1013.5, 1013.0]},
]

for sensor in sensors:
    client.query("CreateSensor", sensor)

result = client.query("MonitorSensorReadings", {})
print("Sensor monitoring:", result)

Common Aggregate Patterns

Combining Aggregates

Aggregate functions are often used together for comprehensive analysis:
QUERY ComprehensiveStats() =>
    stats <- {
        count: COUNT(N::DataPoint::{value}),
        sum: SUM(N::DataPoint::{value}),
        avg: AVG(N::DataPoint::{value}),
        min: MIN(N::DataPoint::{value}),
        max: MAX(N::DataPoint::{value}),
        range: SUB(MAX(N::DataPoint::{value}), MIN(N::DataPoint::{value}))
    }
    RETURN stats

Filtered Aggregates

Combine aggregates with filtering for conditional statistics:
QUERY FilteredStats(threshold: F64) =>
    filtered <- N::DataPoint WHERE _::{value} > threshold
    stats <- {
        count: COUNT(filtered::{value}),
        average: AVG(filtered::{value}),
        maximum: MAX(filtered::{value})
    }
    RETURN stats

Nested Aggregates

Calculate aggregates on aggregate results:
QUERY NestedAggregates() =>
    groups <- N::Group
        ::{
            name,
            member_count: COUNT(_::{members}),
            avg_age: AVG(_::{members}::age),
            max_score: MAX(_::{members}::score)
        }
    overall_stats <- {
        total_groups: COUNT(groups),
        avg_group_size: AVG(groups::{member_count})
    }
    RETURN overall_stats
Aggregate functions are particularly useful for data analysis, reporting, and creating dashboards.

Empty Collections

When aggregate functions are applied to empty collections:
  • COUNT returns 0
  • SUM returns 0
  • AVG, MIN, MAX return null or error depending on implementation

Statistical Variance

To calculate variance and standard deviation, combine aggregate functions:
// Calculate variance
QUERY CalculateVariance() =>
    values <- N::Value::{amount}
    mean <- AVG(values)
    squared_diffs <- values
        ::{
            diff_squared: POW(SUB(_::{amount}, mean), 2.0)
        }
    variance <- AVG(squared_diffs::{diff_squared})
    std_dev <- SQRT(variance)
    RETURN { variance: variance, std_dev: std_dev }

Performance Considerations

Aggregate functions scan entire collections, so consider:
  • Adding appropriate indexes for frequently aggregated fields
  • Using filters before aggregation to reduce data volume
  • Caching aggregate results for frequently accessed statistics

Unary Math Functions

SQRT, ABS, LN, LOG10, EXP, CEIL, FLOOR, ROUND

Arithmetic Functions

ADD, SUB, MUL, DIV, POW, MOD

Conditionals

Filter data before aggregation

Math Overview

Overview of all math functions