Model Runner
The Model Runner is a gRPC server that runs alongside an AI model submission and makes it callable over the network. It is designed to:

- Dynamically load the model code
- Give specific Coordinators access to call the model (via the Model Runner Client)
- Execute inference and return results remotely
How It Works
The Model Runner implements the Dynamic Subclass pattern:

- When the Coordinator initializes a model through the Model Runner Client, it provides an interface that the model must implement.
- The Model Runner then searches the model code for a class implementing this interface and instantiates it.
- The Coordinator can then call the interface methods remotely.
- The Model Runner handles input/output serialization and deserialization, so both sides can exchange structured data reliably.
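A minimal sketch of the Dynamic Subclass pattern, assuming the interface is a plain Python base class. The names ModelBase and submission.main are illustrative, not the platform's actual identifiers:

```python
import importlib
import inspect


class ModelBase:
    """Hypothetical base interface provided by the Coordinator."""

    def predict(self, features: list[float]) -> float:
        raise NotImplementedError


def load_model(module_name: str, base_class: type) -> object:
    """Import the submitted model code and instantiate the first
    class that implements the required interface."""
    module = importlib.import_module(module_name)
    for _, obj in inspect.getmembers(module, inspect.isclass):
        if issubclass(obj, base_class) and obj is not base_class:
            return obj()  # found the participant's implementation
    raise TypeError(f"{module_name} defines no subclass of {base_class.__name__}")


# Example usage (assuming the submission lives in submission/main.py):
# model = load_model("submission.main", ModelBase)
# result = model.predict([0.1, 0.2, 0.3])
```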
Model Runner Client Library
The Model Runner Client is the Coordinator-side Python library used to connect to the Model Orchestrator, keep the list of available models up to date, and call many models concurrently (fanout) over gRPC. It is designed to make remote inference reliable at scale, even when some models are slow, buggy, or offline.

Initialization

A typical setup looks like the sketch below, configured with:

- timeout — maximum time (in seconds) to wait for all models during a call
- crunch_id — on-chain identity of the Crunch
- host/port — location of the Model Orchestrator (local or remote)
- base_classname — base class that all participant models must implement, provided by your PyPI package (see Public GitHub project)
- max_consecutive_failures — after this many failures, a model is disconnected
- max_consecutive_timeouts — after this many timeouts, a model is disconnected
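A sketch of what initialization might look like. The import path and class name ModelRunnerClient are assumptions for illustration; only the parameter names come from the list above:

```python
# Illustrative only: the import path and class name are assumptions,
# not the library's confirmed API. The parameters match the list above.
from model_runner_client import ModelRunnerClient  # hypothetical import

from my_crunch_package import ModelBase  # base class from your PyPI package

client = ModelRunnerClient(
    host="localhost",            # location of the Model Orchestrator
    port=50051,
    crunch_id="my-crunch",       # on-chain identity of the Crunch
    base_classname=ModelBase,    # interface all participant models implement
    timeout=5.0,                 # seconds to wait for all models per call
    max_consecutive_failures=3,  # disconnect a model after this many failures
    max_consecutive_timeouts=3,  # disconnect a model after this many timeouts
)
```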
Sending a Tick
- method is the name of the method defined in your base interface.
- arguments is a list of typed arguments.
- The client library handles encoding/decoding via gRPC.
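A hedged sketch of what a tick call might look like; send_tick and the argument shape are assumptions for illustration, not the library's confirmed API:

```python
async def broadcast_tick(client) -> None:
    """Send one tick to every connected model."""
    # send_tick is a hypothetical method name; check the client
    # library for the real one.
    await client.send_tick(
        method="on_tick",           # method defined in your base interface
        arguments=[101.25, 3_400],  # typed arguments, encoded via gRPC
    )
```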
Requesting Predictions
Each result returned by the call contains (see the example sketch after this list):

- model identifier
- status (SUCCESS, FAILURE, or TIMEOUT)
- output (the prediction)
- latency (time spent predicting, in microseconds)
- runner metadata:
  - model_name — user-defined model name
  - cruncher_name — Cruncher display name
  - cruncher_id — unique Cruncher identifier on chain
  - deployment_id — deployed version identifier
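A sketch of consuming these results. The method name predict and the exact attribute names are assumptions that mirror the field list above:

```python
async def collect_predictions(client, features) -> dict[str, float]:
    """Fan out one prediction request and keep only successful outputs."""
    # predict() and the attribute names below are illustrative; they
    # mirror the fields listed above but may differ in the real API.
    results = await client.predict(method="predict", arguments=[features])
    outputs = {}
    for r in results:
        if r.status == "SUCCESS":
            outputs[r.model_id] = r.output  # the prediction itself
            print(f"{r.metadata.model_name} answered in {r.latency} µs")
        # FAILURE and TIMEOUT results are simply skipped here
    return outputs
```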
Note: deployment_id helps detect when a Cruncher deploys a new version of their model. You can use it to reset metrics or apply version-specific behavior, as in the sketch below.
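A minimal sketch of that idea, assuming results are keyed by a model identifier; the dictionaries and function name are illustrative:

```python
last_deployment: dict[str, str] = {}   # model_id -> last seen deployment_id
metrics: dict[str, list[float]] = {}   # model_id -> accumulated scores

def track_version(model_id: str, deployment_id: str) -> None:
    """Reset a model's metrics when its Cruncher deploys a new version."""
    if last_deployment.get(model_id) != deployment_id:
        metrics[model_id] = []          # new version: start metrics fresh
        last_deployment[model_id] = deployment_id
```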
Timeouts and Failures
All calls are executed with a timeout. The client performs concurrent calls to all models and applies safeguards so a single slow or unresponsive model cannot block the system (see the sketch after these lists). A call is marked as:

- TIMEOUT if the model takes too long to respond
- FAILURE if the model raises an exception or returns invalid data

For each model, the client tracks:

- consecutive failures
- consecutive timeouts

Once either count reaches its configured maximum, the model is disconnected. This protects the system against:

- buggy models
- very slow models
- models that do not respect the interface
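A minimal sketch of this fan-out with per-call safeguards, assuming each connected model exposes an async predict coroutine; the real client's internals may differ:

```python
import asyncio
from enum import Enum


class Status(Enum):
    SUCCESS = "SUCCESS"
    FAILURE = "FAILURE"
    TIMEOUT = "TIMEOUT"


async def call_one(model, features, timeout: float):
    """Call a single model, converting slowness and crashes into statuses."""
    try:
        output = await asyncio.wait_for(model.predict(features), timeout=timeout)
        return Status.SUCCESS, output
    except asyncio.TimeoutError:
        return Status.TIMEOUT, None   # model took too long to respond
    except Exception:
        return Status.FAILURE, None   # model raised or returned invalid data


async def fan_out(models, features, timeout: float):
    """Call every model concurrently; one slow model cannot block the rest."""
    return await asyncio.gather(
        *(call_one(m, features, timeout) for m in models)
    )
```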
Async and the Event Loop
The Model Runner Client is asynchronous for two main reasons:

- it keeps a persistent connection to the orchestrator
- it must call many models concurrently

In the background, the event loop is also used to:

- maintain a live model list
- detect when models join/leave
- reconnect automatically when needed

To keep the event loop responsive (see the sketch after these lists):

- don't run heavy computations directly in the Predict worker
- delegate heavier work to the Score worker
- keep network calls and database writes efficient
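A sketch of keeping heavy computation off the event loop, echoing the Predict/Score worker split above. The function names and the scoring logic are illustrative; only run_in_executor is standard asyncio:

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor

pool = ProcessPoolExecutor()  # heavy work runs in separate processes


def heavy_score(predictions: list[float]) -> float:
    """CPU-bound scoring; would block the event loop if run inline."""
    return sum(predictions) / len(predictions)


async def score_worker(predictions: list[float]) -> float:
    loop = asyncio.get_running_loop()
    # Delegate CPU-bound work to the pool so the Predict worker's
    # network calls stay responsive on the event loop.
    return await loop.run_in_executor(pool, heavy_score, predictions)
```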