When you scaffold a workspace with `crunch-node init`, you get a fully working Crunch Node powered by the crunch-node engine. This page walks through how the default implementation works.

Architecture

The Crunch Node runs as a set of independent Docker workers that communicate through a shared PostgreSQL database. This separation ensures that real-time prediction gathering, scoring, and reporting can scale independently.
Feed → Input → Prediction → Score → Snapshot → Checkpoint → On-chain

(Diagram: the worker pipeline — feed-data-worker, predict-worker, score-worker, and report-worker.)

Workers

| Worker | Purpose |
| --- | --- |
| `feed-data-worker` | Ingests feed data from external sources (Pyth, Binance, etc.) via polling and backfill |
| `predict-worker` | Gets the latest data, ticks all connected models, and collects predictions |
| `score-worker` | Resolves ground truth, scores predictions, writes snapshots, and rebuilds the leaderboard |
| `report-worker` | FastAPI server exposing the leaderboard, predictions, feeds, snapshots, and checkpoints |

Feed data worker

The feed data worker ingests market data from external sources and stores it in the feed_records table. It supports multiple data dimensions:
| Dimension | Example | Env var |
| --- | --- | --- |
| source | `pyth`, `binance` | `FEED_SOURCE` |
| subject | `BTC`, `ETH` | `FEED_SUBJECTS` |
| kind | `tick`, `candle` | `FEED_KIND` |
| granularity | `1s`, `1m` | `FEED_GRANULARITY` |
Configure your data source by setting these environment variables in node/.local.env.
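For example, a feed configuration in `node/.local.env` might look like this (illustrative values; the comma-separated format for `FEED_SUBJECTS` is an assumption — check the engine reference for the exact syntax):

```shell
# node/.local.env — example feed configuration
FEED_SOURCE=pyth
FEED_SUBJECTS=BTC,ETH
FEED_KIND=tick
FEED_GRANULARITY=1s
```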

Predict worker

The predict worker coordinates all participant models in real-time:
  1. Reads latest feed data from the database
  2. Ticks all connected models — sends market data via the Model Runner Client so models can update their internal state
  3. Calls predict() on all models for each configured scope (subject, horizon, step)
  4. Stores raw predictions in the predictions table for asynchronous scoring
The worker uses the `DynamicSubclassModelConcurrentRunner` to fan out requests to all models concurrently. Key configuration:
| Env var | Description | Default |
| --- | --- | --- |
| `CRUNCH_ID` | Competition identifier | `starter-challenge` |
| `MODEL_BASE_CLASSNAME` | Base class that models must implement | `tracker.TrackerBase` |
| `MODEL_RUNNER_NODE_HOST` | Model Orchestrator host | `model-orchestrator` |
| `MODEL_RUNNER_NODE_PORT` | Model Orchestrator port | `9091` |
| `MODEL_RUNNER_TIMEOUT_SECONDS` | Max time to wait for model responses | `60` |
For details on model connection and request handling, see the Model Runner documentation.
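The predict cycle described above can be sketched in plain Python. All names here (`fetch_latest_feed`, `tick_all`, `predict_all`, `insert_predictions`, and the `db`/`runner` objects) are hypothetical stand-ins for the engine's internals, not the actual crunch-node API:

```python
# Illustrative sketch of one predict-worker iteration.
# Every helper name below is hypothetical, not the crunch-node API.
def predict_cycle(db, runner, scopes):
    # 1. Read the latest feed data from the database
    feed = db.fetch_latest_feed()
    # 2. Tick all connected models so they can update internal state
    runner.tick_all(feed)
    # 3. Call predict() for each configured scope (subject, horizon, step)
    predictions = []
    for scope in scopes:
        for model_id, output in runner.predict_all(scope):
            predictions.append((model_id, scope, output))
    # 4. Store raw predictions for asynchronous scoring
    db.insert_predictions(predictions)
    return predictions
```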

Score worker

The score worker transforms raw predictions into scores and leaderboard rankings. It runs independently from the predict worker, so CPU-intensive scoring never blocks real-time prediction collection. Pipeline:
  1. Resolve ground truth — fetch realized values from feed records
  2. Score predictions — evaluate each prediction against ground truth using the configured scoring function
  3. Aggregate snapshots — combine per-prediction scores into model-level metrics over time windows
  4. Rebuild leaderboard — rank all models based on overall performance
  5. Prune old data — remove expired predictions and snapshots
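Under assumed helper names (none of these are the real engine API; the `db` methods are hypothetical), one score-worker pass might look like:

```python
# Illustrative sketch of one score-worker pass.
def score_cycle(db, scoring_fn):
    results = []
    for pred in db.unscored_predictions():
        # 1-2. Resolve ground truth and score the prediction against it
        truth = db.resolve_ground_truth(pred)
        results.append((pred, scoring_fn(pred, truth)))
    db.insert_scores(results)
    # 3-5. Aggregate snapshots, rebuild the leaderboard, prune expired rows
    db.aggregate_snapshots()
    db.rebuild_leaderboard()
    db.prune_expired()
    return results
```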

Scoring function

The scoring function is a Python callable that you define in your challenge package. The default scaffold provides a placeholder in `challenge/starter_challenge/scoring.py`:

```python
def score_prediction(prediction, ground_truth):
    """Score a single prediction against ground truth."""
    return {"value": 0.0, "success": True, "failed_reason": None}
```

You configure the path to your scoring function via the `SCORING_FUNCTION` environment variable.
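A slightly more realistic scorer might use negative squared error. This sketch assumes predictions and ground truth are dicts with a numeric `value` field — an assumption, so adapt the field access to your own types:

```python
def score_prediction(prediction, ground_truth):
    """Example scorer: negative squared error (higher is better).

    Assumes both arguments are dicts with a numeric "value" field;
    adjust the field access for your own prediction/ground-truth types.
    """
    try:
        err = (float(prediction["value"]) - float(ground_truth["value"])) ** 2
        return {"value": -err, "success": True, "failed_reason": None}
    except (KeyError, TypeError, ValueError) as exc:
        # Malformed predictions are reported as failures, not crashes
        return {"value": 0.0, "success": False, "failed_reason": str(exc)}
```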

Multi-metric scoring

Every score cycle also computes portfolio-level metrics alongside the per-prediction scoring function. Active metrics are defined in your `CrunchConfig`:

```python
contract = CrunchConfig(
    metrics=["ic", "ic_sharpe", "hit_rate", "max_drawdown", "model_correlation"],
)
```
Built-in metrics include Information Coefficient (IC), IC Sharpe, hit rate, max drawdown, Sortino ratio, turnover, and more. You can also register custom metrics.
Individual prediction scoring: For each submitted prediction, the score worker:
  1. Validates the prediction returned the expected output shape
  2. Retrieves the realized ground truth from feed records
  3. Evaluates the prediction using the configured scoring function
  4. Stores the result in the scores table
Model-level aggregation: After individual predictions are scored, the worker computes model-level performance:
  • Recent score — performance over the most recent time window (detects current form)
  • Steady score — medium-term performance (balances recency and stability)
  • Anchor scores — per-parameter breakdown (by subject, horizon, step combinations)
  • Overall score — weighted combination used for final ranking
Snapshots enable time-series reporting, allowing participants to track performance trends.
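As an illustration of how such a weighted combination could work (the weights and helper below are hypothetical; the actual aggregation is controlled by the engine and the `MODEL_SCORE_AGGREGATOR` extension point):

```python
# Hypothetical sketch of combining recent, steady, and anchor scores
# into an overall ranking score. Weights are illustrative only.
def overall_score(recent, steady, anchors,
                  w_recent=0.5, w_steady=0.3, w_anchor=0.2):
    # anchors: per-parameter scores, e.g. {("BTC", "1m", 1): 0.8, ...}
    anchor_avg = sum(anchors.values()) / len(anchors) if anchors else 0.0
    return w_recent * recent + w_steady * steady + w_anchor * anchor_avg
```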

Report worker

The report worker provides the HTTP API for accessing competition data. Key endpoints:
| Endpoint | Description |
| --- | --- |
| `GET /reports/leaderboard` | Current standings |
| `GET /reports/models/global` | Model performance over time (time-series) |
| `GET /reports/models/params` | Per-scope performance breakdown |
| `GET /reports/predictions` | Prediction-level details for debugging |
| `GET /reports/feeds` | Active feed subscriptions |
| `GET /reports/snapshots` | Per-model period summaries |
| `GET /reports/checkpoints` | Checkpoint history and emission data |

API security

Endpoints are protected by API key authentication when `API_KEY` is set in your environment. Public endpoints (leaderboard, schema, docs) are always accessible.

```shell
# In node/.local.env
API_KEY=my-strong-secret
```
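A minimal client sketch using only the Python standard library. The `X-API-Key` header name is an assumption — check the report worker's generated API docs for the exact authentication scheme:

```python
import json
import urllib.request

def build_report_request(base_url, path, api_key=None):
    """Build a request for a report-worker endpoint.

    The "X-API-Key" header name is an assumption; verify the actual
    scheme in the report worker's API documentation.
    """
    req = urllib.request.Request(f"{base_url}{path}")
    if api_key:
        req.add_header("X-API-Key", api_key)
    return req

def fetch_leaderboard(base_url, api_key=None):
    # Perform the HTTP call and decode the JSON response
    req = build_report_request(base_url, "/reports/leaderboard", api_key)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```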

Custom endpoints

Add endpoints by dropping Python files in `node/api/`:

```python
# node/api/my_endpoints.py
from fastapi import APIRouter

router = APIRouter(prefix="/custom", tags=["custom"])

@router.get("/hello")
def hello():
    return {"message": "Hello from custom endpoint"}
```

Any `.py` file in `api/` with a `router` attribute is auto-mounted at startup.

Configuration

All configuration is via environment variables in `node/.local.env`. The most important settings:

| Variable | Description | Default |
| --- | --- | --- |
| `CRUNCH_ID` | Competition identifier | `starter-challenge` |
| `FEED_SOURCE` | Data source | `pyth` |
| `FEED_SUBJECTS` | Assets to track | `BTC` |
| `SCORING_FUNCTION` | Dotted path to scoring callable | (engine default) |
| `CHECKPOINT_INTERVAL_SECONDS` | Seconds between checkpoints | `604800` (one week) |
| `MODEL_BASE_CLASSNAME` | Participant model base class | `tracker.TrackerBase` |

CrunchConfig

All type shapes and competition behavior are defined in a `CrunchConfig` object. The engine auto-discovers your config from `node/config/crunch_config.py`:

```python
from coordinator_node.crunch_config import CrunchConfig

contract = CrunchConfig(
    raw_input_type=RawInput,
    output_type=InferenceOutput,
    score_type=ScoreResult,
    metrics=["ic", "ic_sharpe", "hit_rate"],
)
```
See the crunch-node documentation for the full configuration reference.

Extension points

Customize competition behavior by setting callable paths in your environment:
| Env var | Purpose |
| --- | --- |
| `SCORING_FUNCTION` | Score a prediction against ground truth |
| `INFERENCE_INPUT_BUILDER` | Transform raw feed data into model input |
| `INFERENCE_OUTPUT_VALIDATOR` | Validate model output shape and values |
| `MODEL_SCORE_AGGREGATOR` | Aggregate per-model scores across predictions |
| `LEADERBOARD_RANKER` | Custom leaderboard ranking strategy |
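For example, a hypothetical `INFERENCE_OUTPUT_VALIDATOR` implementation might look like the sketch below. The `(ok, reason)` return shape and the expected output fields are assumptions, not the engine's documented contract — check the extension-point reference for the real signature:

```python
import math

def validate_output(output):
    """Hypothetical output validator returning (ok, reason).

    Assumes model output is a dict with a numeric "value" field;
    the real contract may differ — consult the engine reference.
    """
    if not isinstance(output, dict) or "value" not in output:
        return False, "output must be a dict with a 'value' field"
    value = output["value"]
    # Reject non-numeric values as well as NaN/inf
    if not isinstance(value, (int, float)) or not math.isfinite(value):
        return False, "'value' must be a finite number"
    return True, None
```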

Next: Challenge package example

Learn how to structure the participant-facing package that ML engineers use to join your competition.