The challenge package is the public-facing repository that ML engineers (Crunchers) use to participate in your competition. When you scaffold a workspace with crunch-node init, the challenge/ directory contains a complete working example. Your challenge package should include:
  • Model interface — The base class that all participant models must implement
  • Scoring function — Lets participants evaluate their models locally
  • Quickstarter examples — Working models that participants can copy and adapt
  • Backtest harness — Lets participants run their models against historical data
The end goal is to publish this as a PyPI package that any Cruncher can pip install to get started.

Default scaffold

The challenge/ directory created by crunch-node init has this structure:
challenge/
├── starter_challenge/
│   ├── __init__.py
│   ├── tracker.py            # Model base class (TrackerBase)
│   ├── scoring.py            # Scoring function for local testing
│   ├── config.py             # Challenge configuration
│   ├── backtest.py           # Backtest harness
│   └── examples/
│       ├── mean_reversion_tracker.py
│       ├── trend_following_tracker.py
│       └── volatility_regime_tracker.py
├── tests/
│   ├── test_tracker.py
│   ├── test_scoring.py
│   └── test_examples.py
└── pyproject.toml

Model interface

The TrackerBase class in tracker.py defines the contract between your Crunch Node and every Cruncher submission:
from typing import Any


class TrackerBase:
    """Base class for participant models.

    Subclass this and implement predict() to compete.
    tick() receives market data on every feed update —
    use it to maintain internal state (indicators, history, etc.).
    """

    def __init__(self) -> None:
        # Latest feed payload per subject, populated by tick().
        self._latest_data_by_subject: dict[str, Any] = {}

    def tick(self, data: dict[str, Any]) -> None:
        """Receive latest market data. Override to maintain state."""
        subject_key = data.get("symbol", "_default") if isinstance(data, dict) else "_default"
        self._latest_data_by_subject[subject_key] = data

    def predict(self, subject: str, resolve_horizon_seconds: int, step_seconds: int) -> dict[str, Any]:
        """Return a prediction for the given scope.

        Args:
            subject: Asset being predicted (e.g. "BTC").
            resolve_horizon_seconds: How far ahead ground truth is resolved.
            step_seconds: Time step between predictions.

        Returns:
            Dict matching InferenceOutput fields.
            Default expects {"value": float} — positive means bullish,
            negative means bearish.
        """
        raise NotImplementedError("Implement predict() in your model")
Key design decisions:
  • tick() receives market data on every feed update — models use it to maintain internal state (e.g., price history, indicators)
  • predict() returns a prediction for a specific scope — the return format must match your InferenceOutput type
The MODEL_BASE_CLASSNAME in your Crunch Node environment must match this class. For the default scaffold, it’s tracker.TrackerBase.
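To make the contract concrete, here is a sketch of a minimal participant model. `MomentumTracker` is a hypothetical name, and the inline `TrackerBase` is a stand-in for the scaffold's `tracker.TrackerBase` so the snippet runs on its own; in a real submission you would import the base class from the challenge package instead.

```python
from collections import deque
from typing import Any


class TrackerBase:
    """Stand-in for the scaffold's tracker.TrackerBase (illustration only)."""

    def __init__(self) -> None:
        self._latest_data_by_subject: dict[str, Any] = {}

    def tick(self, data: dict[str, Any]) -> None:
        subject_key = data.get("symbol", "_default") if isinstance(data, dict) else "_default"
        self._latest_data_by_subject[subject_key] = data

    def predict(self, subject: str, resolve_horizon_seconds: int, step_seconds: int) -> dict[str, Any]:
        raise NotImplementedError("Implement predict() in your model")


class MomentumTracker(TrackerBase):
    """Hypothetical example: predict the sign of recent price momentum."""

    def __init__(self, window: int = 20) -> None:
        super().__init__()
        self._window = window
        self._history: dict[str, deque[float]] = {}

    def tick(self, data: dict[str, Any]) -> None:
        super().tick(data)  # keep the base class's latest-data cache current
        prices = self._history.setdefault(data["symbol"], deque(maxlen=self._window))
        prices.append(float(data["price"]))

    def predict(self, subject: str, resolve_horizon_seconds: int, step_seconds: int) -> dict[str, Any]:
        prices = self._history.get(subject)
        if not prices or len(prices) < 2:
            return {"value": 0.0}  # no signal without enough history
        momentum = prices[-1] - prices[0]
        # Clamp to [-1, 1]: positive means bullish, negative means bearish.
        return {"value": max(-1.0, min(1.0, momentum))}
```

The pattern to copy is the split of responsibilities: all state updates happen in `tick()`, and `predict()` only reads that state and returns a dict matching the `InferenceOutput` shape.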

Scoring function

The scaffold includes a scoring function in scoring.py that participants can use for local testing:
def score_prediction(prediction, ground_truth):
    """Score a single prediction against ground truth.

    Return a dict matching your contract's ScoreResult shape.
    """
    return {"value": 0.0, "success": True, "failed_reason": None}
Replace this with your actual evaluation logic. This same function is used by the score worker in production (via the SCORING_FUNCTION environment variable).
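As an illustration of what "actual evaluation logic" might look like, here is a hedged sketch of a directional-accuracy scorer. It assumes the default `{"value": float}` prediction contract and a ground truth of the same shape; your competition's fields may differ.

```python
from typing import Any


def score_prediction(prediction: dict[str, Any], ground_truth: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical example: score 1.0 if the predicted sign matches the
    realized movement, 0.0 otherwise. Returns the ScoreResult shape."""
    try:
        predicted = float(prediction["value"])
        realized = float(ground_truth["value"])
    except (KeyError, TypeError, ValueError) as exc:
        # Malformed input: report a failed score instead of raising,
        # so the score worker can record the failure reason.
        return {"value": 0.0, "success": False, "failed_reason": str(exc)}

    correct = (predicted > 0) == (realized > 0)
    return {"value": 1.0 if correct else 0.0, "success": True, "failed_reason": None}
```

Returning `success: False` with a `failed_reason` rather than raising keeps a single bad prediction from aborting a whole scoring run.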

Quickstarter examples

The examples/ directory contains working models that participants can copy and adapt. The default scaffold includes three strategies:
Example                        Strategy
mean_reversion_tracker.py      Bets on price reverting to a moving average
trend_following_tracker.py     Follows momentum using EMA crossovers
volatility_regime_tracker.py   Adapts predictions based on volatility regime
These give participants a working starting point and demonstrate the patterns your interface expects.
Quickstarters are the single most important factor for participation rates. Make sure they work out of the box and are easy to modify.
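To show the kind of logic the quickstarters contain, here is a self-contained sketch of the mean-reversion idea. `MeanReversionSketch` is an illustrative class, not the scaffold's actual `mean_reversion_tracker.py`; the real quickstarter subclasses `TrackerBase`.

```python
from collections import deque
from typing import Any


class MeanReversionSketch:
    """Illustrative mean-reversion strategy: bet that price returns to its
    moving average. Names and scaling are assumptions, not the scaffold's."""

    def __init__(self, window: int = 20) -> None:
        self._prices: deque[float] = deque(maxlen=window)

    def tick(self, data: dict[str, Any]) -> None:
        self._prices.append(float(data["price"]))

    def predict(self, subject: str, resolve_horizon_seconds: int, step_seconds: int) -> dict[str, Any]:
        if not self._prices:
            return {"value": 0.0}
        mean = sum(self._prices) / len(self._prices)
        if mean == 0:
            return {"value": 0.0}
        # Price above the mean -> expect a fall (bearish, negative value);
        # price below the mean -> expect a rise (bullish, positive value).
        gap = (mean - self._prices[-1]) / mean
        return {"value": max(-1.0, min(1.0, gap * 10))}
```

A quickstarter like this is deliberately small: participants should be able to read the whole strategy in one screen and swap in their own signal.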

Backtest harness

The challenge package includes a backtest harness so participants can evaluate their models against historical data before submitting:
from starter_challenge.backtest import BacktestRunner
from my_model import MyTracker

result = BacktestRunner(model=MyTracker()).run(
    start="2026-01-01", end="2026-02-01"
)

result.predictions_df   # DataFrame of all predictions
result.metrics           # Rolling windows + multi-metric evaluation
result.summary()         # Formatted output
The backtest harness:
  • Auto-fetches data from the Coordinator and caches locally on first run
  • Uses the same tick() → predict() loop as production
  • Applies the same scoring function and multi-metric evaluation as the live leaderboard
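The loop the harness runs can be sketched as follows. This is a hypothetical simplification, not `BacktestRunner`'s implementation: the real runner also fetches and caches data, builds the predictions DataFrame, and computes rolling multi-metric evaluation. The function and parameter names here are illustrative.

```python
from typing import Any, Callable, Iterable


def run_backtest_loop(
    model: Any,
    feed: Iterable[dict[str, Any]],
    subjects: list[str],
    score: Callable[[dict, dict], dict],
    ground_truth: Callable[[str, dict], dict],
    predict_every: int = 1,
) -> list[dict[str, Any]]:
    """Sketch of the tick() -> predict() loop a backtest runner performs."""
    results: list[dict[str, Any]] = []
    for i, event in enumerate(feed):
        model.tick(event)  # same state-update path as production
        if (i + 1) % predict_every == 0:
            for subject in subjects:
                # Horizon and step are fixed here for brevity; the real
                # harness takes them from the challenge configuration.
                prediction = model.predict(subject, 3600, 60)
                results.append(score(prediction, ground_truth(subject, event)))
    return results
```

Because the same `tick()`/`predict()` path runs in production, a model that behaves well in this loop should behave identically on the live feed.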

Publishing to PyPI

For participants to install your package, publish it to PyPI:
cd challenge
pip install build twine
python -m build
twine upload dist/*
Participants then install and use it:
pip install starter-challenge

from starter_challenge.tracker import TrackerBase

class MyModel(TrackerBase):
    def predict(self, subject, resolve_horizon_seconds, step_seconds):
        # Your prediction logic here
        return {"value": 0.5}
Replace starter-challenge with your competition’s package name in pyproject.toml before publishing.
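For reference, a minimal pyproject.toml for the package might look like the following. The exact metadata (version, description, dependencies) is an assumption; only the name fields must match what participants will `pip install` and `import`.

```toml
[build-system]
requires = ["setuptools>=68"]
build-backend = "setuptools.build_meta"

[project]
name = "starter-challenge"   # replace with your competition's package name
version = "0.1.0"
description = "Challenge package: model interface, scoring, and backtest harness"
requires-python = ">=3.10"
# Assumed dependency: the backtest harness exposes a predictions DataFrame.
dependencies = ["pandas"]
```

Note that the distribution name (`starter-challenge`, with a hyphen) and the import name (`starter_challenge`, with an underscore) differ by convention; keep both consistent when you rename.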

Next: Define your own Crunch

Customize the prediction task, scoring function, and challenge package to build your own competition.