The Model Runner and Model Runner Client are the two halves of remote model execution. The Runner sits next to each model and serves it over gRPC. The Client sits in your Crunch Node and calls many models concurrently.

Model Runner

The Model Runner is a gRPC server that runs alongside a submitted model and makes it callable over the network. It handles three things:
  • Dynamic loading — discovers and instantiates the participant’s model class at startup
  • Remote execution — exposes model methods over gRPC so the Coordinator can call them
  • Access control — enforces the Secure Model Protocol to verify that only authorized Coordinators can reach the model

Dynamic subclass pattern

When the Coordinator initializes a model through the Client, it specifies a base class (e.g., starter_challenge.tracker.TrackerBase) that the model must implement. The Model Runner:
  1. Searches the submitted code for a class that inherits from this base class
  2. Instantiates it
  3. Exposes its methods over gRPC
The Runner handles all input/output serialization, so both sides exchange structured data reliably.
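The discovery step above can be sketched in a few lines. This is a hypothetical helper for illustration, not the Runner's actual loader; it scans a loaded module for the first strict subclass of the required base class:

```python
import inspect


def find_model_class(module, base):
    """Return the first class in `module` that strictly subclasses `base`.

    Illustrative sketch of the dynamic subclass pattern: the real Runner
    resolves the base class from a dotted name (e.g.
    "starter_challenge.tracker.TrackerBase") and searches the submitted code.
    """
    for _, cls in inspect.getmembers(module, inspect.isclass):
        if issubclass(cls, base) and cls is not base:
            return cls
    raise LookupError(f"no subclass of {base.__name__} found")
```

Once the class is found, the Runner instantiates it and wires its public methods into the gRPC service.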

Health checks

The Runner exposes a health check service so the Coordinator can verify availability remotely. If a model becomes unresponsive, the Client detects it and disconnects.

Model Runner Client

The Model Runner Client is the Coordinator-side Python library. It connects to the Model Orchestrator, maintains a live list of available models, and fans out calls to all of them concurrently over gRPC. It is designed for reliability at scale — even when some models are slow, buggy, or offline.

Initialization

runner = DynamicSubclassModelConcurrentRunner(
    timeout=50,
    crunch_id="your-crunch-id",
    host="localhost",
    port=9091,
    base_classname="starter_challenge.tracker.TrackerBase",
    max_consecutive_failures=10,
    max_consecutive_timeouts=10,
)
| Parameter | Description |
| --- | --- |
| timeout | Maximum seconds to wait for all models during a single call |
| crunch_id | On-chain identity of the Crunch |
| host / port | Location of the Model Orchestrator |
| base_classname | Base class that all participant models must implement |
| max_consecutive_failures | Disconnect a model after this many consecutive errors |
| max_consecutive_timeouts | Disconnect a model after this many consecutive timeouts |
Then start the connection:
await runner.init()   # connect to the orchestrator and all models
await runner.sync()   # keep the model list updated in the background

Calling models

Every call fans out to all connected models concurrently. You specify the method name and arguments:
await runner.call(
    method="tick",
    arguments=[
        Argument(
            position=1,
            data=Variant(type=VariantType.JSON, value=encode_data(VariantType.JSON, prices)),
        )
    ],
)
To collect predictions:
results = await runner.call(
    method="predict",
    arguments=[
        Argument(position=1, data=Variant(type=VariantType.STRING, value=encode_data(VariantType.STRING, asset_code))),
        Argument(position=2, data=Variant(type=VariantType.INT, value=encode_data(VariantType.INT, horizon))),
        Argument(position=3, data=Variant(type=VariantType.INT, value=encode_data(VariantType.INT, step))),
    ],
)
Each result contains:
| Field | Description |
| --- | --- |
| model identifier | Which model produced this result |
| status | SUCCESS, FAILURE, or TIMEOUT |
| output | The prediction payload |
| latency | Time spent predicting (microseconds) |
| model_name | User-defined model name |
| cruncher_name | Cruncher display name |
| cruncher_id | Unique Cruncher identifier on-chain |
| deployment_id | Deployed version identifier |
Use deployment_id to detect when a Cruncher deploys a new version of their model. You can reset metrics or apply version-specific behavior based on this value.
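Redeploy detection amounts to comparing each result's deployment_id against the last one seen. A minimal sketch, assuming results are available as (model identifier, deployment_id) pairs:

```python
def detect_redeploys(previous: dict, results) -> set:
    """Return identifiers of models whose deployment_id changed.

    Hypothetical helper: `previous` maps model identifier -> last seen
    deployment_id and is updated in place, so the same dict can be reused
    across successive calls.
    """
    changed = set()
    for model_id, deployment_id in results:
        if previous.get(model_id) not in (None, deployment_id):
            changed.add(model_id)  # a new version was deployed
        previous[model_id] = deployment_id
    return changed
```

A typical use is to reset per-model metrics for every identifier the function returns before scoring the new version.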
You can also target a subset of models — for example, to allocate more inference budget to top performers.

Timeout and failure handling

All calls execute with the configured timeout. The Client applies safeguards so a single slow or broken model cannot block the entire system:
  • A call is marked TIMEOUT if the model takes too long to respond
  • A call is marked FAILURE if the model raises an exception or returns invalid data
The library tracks consecutive failures and consecutive timeouts per model. When either limit is reached, the model is automatically disconnected. This protects your pipeline from buggy models, very slow models, and models that don’t respect the interface.
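The per-model bookkeeping can be sketched as a small gate that mirrors the max_consecutive_failures and max_consecutive_timeouts parameters. This is a simplified illustration, not the library's internal implementation:

```python
class FailureGate:
    """Track consecutive failures and timeouts per model.

    Sketch of the safeguard described above: a SUCCESS resets both
    counters; hitting either limit signals that the model should be
    disconnected.
    """

    def __init__(self, max_failures=10, max_timeouts=10):
        self.max_failures = max_failures
        self.max_timeouts = max_timeouts
        self.failures = {}
        self.timeouts = {}

    def record(self, model_id: str, status: str) -> bool:
        """Record one call outcome; return True if the model should be cut off."""
        if status == "SUCCESS":
            self.failures[model_id] = 0
            self.timeouts[model_id] = 0
            return False
        counter = self.failures if status == "FAILURE" else self.timeouts
        limit = self.max_failures if status == "FAILURE" else self.max_timeouts
        counter[model_id] = counter.get(model_id, 0) + 1
        return counter[model_id] >= limit
```

Keeping the two counters separate means a model that alternates between slow and broken responses is still disconnected once either pattern becomes persistent.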

Async architecture

The Client is fully asynchronous because it needs to:
  1. Maintain a persistent connection to the Orchestrator
  2. Call many models concurrently on every tick
It uses an event loop to keep the model list up to date, detect when models join or leave, and reconnect automatically when needed.
Avoid blocking the event loop with heavy computations in the predict worker. Delegate CPU-intensive work (scoring, aggregation) to the score worker instead.
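One standard way to keep the loop responsive is to hand CPU-bound work to an executor. A minimal sketch of the pattern (heavy_score is a hypothetical stand-in for real scoring logic, not part of the library):

```python
import asyncio


def heavy_score(predictions):
    """CPU-bound aggregation; placeholder for real scoring work."""
    return sum(predictions) / len(predictions)


async def score_without_blocking(predictions):
    """Run scoring in the default executor so the event loop stays free
    to maintain the Orchestrator connection and fan out model calls."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, heavy_score, predictions)
```

For heavier workloads, the same call accepts a ProcessPoolExecutor to sidestep the GIL entirely.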

Next: Model Orchestrator

How model containers are deployed, monitored, and kept reachable.