Skip to main content

Aggregate Functions

HelixQL provides a comprehensive set of aggregate functions for performing statistical operations and summarizing collections of values. These functions operate on arrays or collections to produce single summary values.

Available Functions

MIN - Minimum Value

MIN(collection)  // Returns the smallest value
Returns the minimum value from a collection of numbers.

MAX - Maximum Value

MAX(collection)  // Returns the largest value
Returns the maximum value from a collection of numbers.

SUM - Sum of Values

SUM(collection)  // Returns the total sum
Returns the sum of all values in a collection.

AVG - Average Value

AVG(collection)  // Returns the mean
Returns the arithmetic mean (average) of all values in a collection.

COUNT - Count Elements

COUNT(collection)  // Returns the number of elements
Returns the number of elements in a collection.
When using the SDKs or curling the endpoint, the query name must match what is defined in the queries.hx file exactly.

Example 1: Statistical analysis of sales data

Analyze sales performance using aggregate functions:
QUERY AnalyzeSalesStatistics() =>
    stats <- {
        total_sales: SUM(N::Sale::{amount}),
        average_sale: AVG(N::Sale::{amount}),
        max_sale: MAX(N::Sale::{amount}),
        min_sale: MIN(N::Sale::{amount}),
        sale_count: COUNT(N::Sale::{amount})
    }
    RETURN stats

QUERY CreateSale(amount: F64, product: String) =>
    sale <- AddN<Sale>({ amount: amount, product: product })
    RETURN sale
Here’s how to run the query using the SDKs or curl
from helix.client import Client

client = Client(local=True, port=6969)

# Create sales records
sales = [
    {"amount": 150.0, "product": "Laptop"},
    {"amount": 75.0, "product": "Mouse"},
    {"amount": 300.0, "product": "Monitor"},
    {"amount": 50.0, "product": "Keyboard"},
    {"amount": 200.0, "product": "Headphones"},
    {"amount": 125.0, "product": "Webcam"},
]

for sale in sales:
    client.query("CreateSale", sale)

result = client.query("AnalyzeSalesStatistics", {})
print("Sales statistics:", result)

Example 2: Student grade analysis

Calculate grade statistics for students:
QUERY AnalyzeStudentGrades() =>
    students <- N::Student
        ::{
            name,
            grades,
            highest_grade: MAX(_::{grades}),
            lowest_grade: MIN(_::{grades}),
            average_grade: AVG(_::{grades}),
            total_assessments: COUNT(_::{grades})
        }
    RETURN students

QUERY CreateStudent(name: String, grades: [F64]) =>
    student <- AddN<Student>({ name: name, grades: grades })
    RETURN student
Here’s how to run the query using the SDKs or curl
from helix.client import Client

client = Client(local=True, port=6969)

# Create student records with their grades
students = [
    {"name": "Alice", "grades": [85.0, 92.0, 88.0, 95.0, 90.0]},
    {"name": "Bob", "grades": [78.0, 82.0, 75.0, 88.0, 80.0]},
    {"name": "Charlie", "grades": [95.0, 98.0, 92.0, 96.0, 94.0]},
    {"name": "Diana", "grades": [70.0, 75.0, 72.0, 78.0, 74.0]},
]

for student in students:
    client.query("CreateStudent", student)

result = client.query("AnalyzeStudentGrades", {})
print("Student grade analysis:", result)

Example 3: Sensor data monitoring

Monitor and summarize IoT sensor readings:
QUERY MonitorSensorReadings() =>
    sensors <- N::Sensor
        ::{
            sensor_id,
            readings,
            max_reading: MAX(_::{readings}),
            min_reading: MIN(_::{readings}),
            avg_reading: AVG(_::{readings}),
            total_readings: COUNT(_::{readings}),
            reading_sum: SUM(_::{readings})
        }
    RETURN sensors

QUERY CreateSensor(sensor_id: String, readings: [F64]) =>
    sensor <- AddN<Sensor>({ sensor_id: sensor_id, readings: readings })
    RETURN sensor
Here’s how to run the query using the SDKs or curl
from helix.client import Client

client = Client(local=True, port=6969)

# Create sensor records with readings
sensors = [
    {"sensor_id": "TEMP-001", "readings": [22.5, 23.1, 22.8, 23.5, 22.9]},
    {"sensor_id": "TEMP-002", "readings": [25.0, 25.5, 24.8, 26.0, 25.2]},
    {"sensor_id": "HUMID-001", "readings": [45.0, 47.0, 46.5, 48.0, 46.0]},
    {"sensor_id": "PRESS-001", "readings": [1013.0, 1012.5, 1014.0, 1013.5, 1013.0]},
]

for sensor in sensors:
    client.query("CreateSensor", sensor)

result = client.query("MonitorSensorReadings", {})
print("Sensor monitoring:", result)

Common Aggregate Patterns

Combining Aggregates

Aggregate functions are often used together for comprehensive analysis:
QUERY ComprehensiveStats() =>
    stats <- {
        count: COUNT(N::DataPoint::{value}),
        sum: SUM(N::DataPoint::{value}),
        avg: AVG(N::DataPoint::{value}),
        min: MIN(N::DataPoint::{value}),
        max: MAX(N::DataPoint::{value}),
        range: SUB(MAX(N::DataPoint::{value}), MIN(N::DataPoint::{value}))
    }
    RETURN stats

Filtered Aggregates

Combine aggregates with filtering for conditional statistics:
QUERY FilteredStats(threshold: F64) =>
    filtered <- N::DataPoint WHERE _::{value} > threshold
    stats <- {
        count: COUNT(filtered::{value}),
        average: AVG(filtered::{value}),
        maximum: MAX(filtered::{value})
    }
    RETURN stats

Nested Aggregates

Calculate aggregates on aggregate results:
QUERY NestedAggregates() =>
    groups <- N::Group
        ::{
            name,
            member_count: COUNT(_::{members}),
            avg_age: AVG(_::{members}::age),
            max_score: MAX(_::{members}::score)
        }
    overall_stats <- {
        total_groups: COUNT(groups),
        avg_group_size: AVG(groups::{member_count})
    }
    RETURN overall_stats
Aggregate functions are particularly useful for data analysis, reporting, and creating dashboards.

Empty Collections

When aggregate functions are applied to empty collections:
  • COUNT returns 0
  • SUM returns 0
  • AVG, MIN, MAX return null or error depending on implementation

Statistical Variance

To calculate variance and standard deviation, combine aggregate functions:
// Calculate variance
QUERY CalculateVariance() =>
    values <- N::Value::{amount}
    mean <- AVG(values)
    squared_diffs <- values
        ::{
            diff_squared: POW(SUB(_::{amount}, mean), 2.0)
        }
    variance <- AVG(squared_diffs::{diff_squared})
    std_dev <- SQRT(variance)
    RETURN { variance: variance, std_dev: std_dev }

Performance Considerations

Aggregate functions scan entire collections, so consider:
  • Adding appropriate indexes for frequently aggregated fields
  • Using filters before aggregation to reduce data volume
  • Caching aggregate results for frequently accessed statistics