The gRPC server that makes each model callable over the network, and the Python client library Coordinators use to call them.
The Model Runner and Model Runner Client are the two halves of remote model execution. The
Runner sits next to each model and serves it over gRPC. The Client sits in your
Crunch Node and calls many models concurrently.
When the Coordinator initializes a model through the Client, it specifies a base class (e.g.,
starter_challenge.tracker.TrackerBase) that the model must implement. The Model Runner:
- Searches the submitted code for a class that inherits from this base class
- Instantiates it
- Exposes its methods over gRPC
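The discovery step above can be sketched in plain Python. This is a minimal illustration, not the Runner's actual implementation: `TrackerBase`, `MyTracker`, and `find_model_class` are hypothetical stand-ins for the real base class (e.g., `starter_challenge.tracker.TrackerBase`) and the Runner's internal scan.

```python
import inspect

# Hypothetical stand-in for a required base class such as
# starter_challenge.tracker.TrackerBase.
class TrackerBase:
    def predict(self, observation):
        raise NotImplementedError

# Simulated "submitted code": the Cruncher's class plus unrelated names.
class MyTracker(TrackerBase):
    def predict(self, observation):
        return {"value": observation * 2}

def find_model_class(namespace, base):
    """Return the first concrete subclass of `base` found in `namespace`."""
    for obj in namespace.values():
        if inspect.isclass(obj) and issubclass(obj, base) and obj is not base:
            return obj
    raise LookupError(f"no subclass of {base.__name__} found")

submitted = {"MyTracker": MyTracker, "helper_constant": 3}
model_cls = find_model_class(submitted, TrackerBase)
model = model_cls()        # the Runner instantiates the discovered class
print(model.predict(21))   # → {'value': 42}
```

The real Runner additionally wraps the instantiated methods in gRPC service handlers; only the scan-and-instantiate step is shown here.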
The Runner handles all input/output serialization, so both sides exchange structured data reliably.
The Runner exposes a health check service so the Coordinator can verify availability remotely. If a
model becomes unresponsive, the Client detects it and disconnects.
The Model Runner Client is the Coordinator-side Python library. It connects to the
Model Orchestrator, maintains a live list of available models,
and fans out calls to all of them concurrently over gRPC. It is designed for reliability at scale, continuing to operate even when some models are slow, buggy, or offline.
Use deployment_id to detect when a Cruncher deploys a new version of their model. You can reset
metrics or apply version-specific behavior based on this value.
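One way to act on `deployment_id` is to remember the last value seen per model and treat any change as a redeploy. The `VersionTracker` class below is an illustrative sketch, not part of the Client's API:

```python
class VersionTracker:
    """Track the last-seen deployment_id per model and detect redeploys."""

    def __init__(self):
        self._seen = {}  # model name -> last deployment_id

    def is_new_version(self, model_name, deployment_id):
        """Return True on first sighting or when deployment_id changed."""
        changed = self._seen.get(model_name) != deployment_id
        self._seen[model_name] = deployment_id
        return changed

tracker = VersionTracker()
tracker.is_new_version("alice", "v1")  # True: first sighting
tracker.is_new_version("alice", "v1")  # False: same deployment
tracker.is_new_version("alice", "v2")  # True: redeploy -> reset metrics here
```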
You can also target a subset of models — for example, to allocate more inference budget to top
performers.
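Targeting a subset can be as simple as ranking models by a score you maintain and calling only the top few. The score table and names below are illustrative, not produced by the Client:

```python
# Hypothetical per-model scores maintained by the Coordinator.
scores = {"alice": 0.92, "bob": 0.55, "carol": 0.81}

top_k = 2
targets = sorted(scores, key=scores.get, reverse=True)[:top_k]
print(targets)  # → ['alice', 'carol']
# The Coordinator would then route the extra inference budget only to `targets`.
```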
All calls execute with the configured timeout. The Client applies safeguards so a single slow or
broken model cannot block the entire system:
- A call is marked TIMEOUT if the model takes too long to respond
- A call is marked FAILURE if the model raises an exception or returns invalid data
The library tracks consecutive failures and consecutive timeouts per model. When either limit is
reached, the model is automatically disconnected. This protects your pipeline from buggy models,
very slow models, and models that don’t respect the interface.
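The per-model protection works like a simple circuit breaker. The sketch below illustrates the idea; the thresholds, the `ModelGuard` name, and the exact reset rules (here, only a SUCCESS clears both counters, and each outcome resets the other counter) are assumptions, not the library's actual behavior:

```python
# Illustrative thresholds; the real limits are configured on the Client.
MAX_CONSECUTIVE_FAILURES = 3
MAX_CONSECUTIVE_TIMEOUTS = 2

class ModelGuard:
    """Count consecutive FAILURE/TIMEOUT outcomes; disconnect on a threshold."""

    def __init__(self):
        self.failures = 0
        self.timeouts = 0
        self.connected = True

    def record(self, outcome):
        if outcome == "SUCCESS":
            self.failures = self.timeouts = 0
        elif outcome == "FAILURE":
            self.failures += 1
            self.timeouts = 0
        elif outcome == "TIMEOUT":
            self.timeouts += 1
            self.failures = 0
        if (self.failures >= MAX_CONSECUTIVE_FAILURES
                or self.timeouts >= MAX_CONSECUTIVE_TIMEOUTS):
            self.connected = False  # model is dropped from the call set

guard = ModelGuard()
for _ in range(3):
    guard.record("FAILURE")
print(guard.connected)  # → False
```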
The Client is fully asynchronous because it needs to:
- Maintain a persistent connection to the Orchestrator
- Call many models concurrently on every tick
It uses an event loop to keep the model list up to date, detect when models join or leave, and
reconnect automatically when needed.
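The concurrent fan-out with a per-call timeout can be sketched with standard `asyncio`. Here `call_model` stands in for the real gRPC call, and the timeout value is illustrative:

```python
import asyncio

CALL_TIMEOUT = 0.05  # seconds; illustrative, the real value is configured

async def call_model(name, delay):
    """Stand-in for a gRPC call: sleep `delay`, then answer."""
    await asyncio.sleep(delay)
    return {"model": name, "answer": 42}

async def fan_out(models):
    """Call every model concurrently; mark slow ones TIMEOUT instead of waiting."""
    async def guarded(name, delay):
        try:
            return await asyncio.wait_for(call_model(name, delay), CALL_TIMEOUT)
        except asyncio.TimeoutError:
            return {"model": name, "status": "TIMEOUT"}

    return await asyncio.gather(*(guarded(n, d) for n, d in models.items()))

results = asyncio.run(fan_out({"fast": 0.0, "slow": 0.2}))
```

Because each call is wrapped individually, one slow model delays nothing: the fast model's result arrives on time and the slow one is reported as TIMEOUT.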
Avoid blocking the event loop with heavy computations in the predict worker. Delegate
CPU-intensive work (scoring, aggregation) to the score worker instead.
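In the real system the score worker is a separate component; as a minimal sketch of the underlying principle (keep the event loop free of CPU-bound work), here is the same pattern using `asyncio.to_thread`. The function names are illustrative:

```python
import asyncio

def heavy_scoring(predictions):
    # CPU-bound aggregation that would block the event loop if run inline.
    return sum(p["value"] for p in predictions) / len(predictions)

async def handle_predictions(predictions):
    # Offload the computation so the loop stays responsive for model I/O.
    return await asyncio.to_thread(heavy_scoring, predictions)

print(asyncio.run(handle_predictions([{"value": 2}, {"value": 4}])))  # → 3.0
```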