Model Runner

The Model Runner is a gRPC server that runs alongside an AI model submission and makes it callable over the network. It is designed to:
  • Dynamically load the model code
  • Grant specific Coordinators access to call the model (via the Model Runner Client)
  • Execute inference and return results remotely

How It Works

The Model Runner implements the Dynamic Subclass pattern:
  • When the Coordinator initializes a model through the Model Runner Client, it provides an interface that the model must implement.
  • The Model Runner then searches inside the model code for a class implementing this interface and instantiates it.
  • After that, the Coordinator can call the interface methods remotely.
  • The Model Runner handles input/output serialization and deserialization, so both sides can exchange structured data reliably.
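For intuition, here is a minimal sketch of the pattern; the class names and module layout are illustrative, not the Model Runner's actual internals:
import importlib
import inspect

class TrackerBase:
    """Interface provided by the Coordinator; every submission must subclass it."""
    def infer(self, prices: dict) -> dict:
        raise NotImplementedError

def load_model(module_name: str, base_class: type):
    """Search a submission module for a concrete subclass of base_class and instantiate it."""
    module = importlib.import_module(module_name)
    for _, candidate in inspect.getmembers(module, inspect.isclass):
        if issubclass(candidate, base_class) and candidate is not base_class:
            return candidate()  # the participant's implementation
    raise LookupError(f"no subclass of {base_class.__name__} in {module_name}")

# model = load_model("submission.main", TrackerBase)  # hypothetical module name
# model.infer({"BTC": 97000.0})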
Access Control & Permissions

The Model Runner is also responsible for enforcing the Crunch Protocol's Secure Model Protocol checks, ensuring that only authorized callers can access a model and that their identities are validated.

Health Checks

Finally, the Model Runner exposes a health check service so the Coordinator can automatically verify the runner's availability and health remotely (through the Model Runner Client).
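This page does not specify the health check wire format; assuming the standard gRPC health checking protocol, a standalone probe (handy for debugging outside the client library) could look like:
import grpc
from grpc_health.v1 import health_pb2, health_pb2_grpc

def probe(host: str = "localhost", port: int = 9091) -> bool:
    """Return True if the runner's health service reports SERVING."""
    with grpc.insecure_channel(f"{host}:{port}") as channel:
        stub = health_pb2_grpc.HealthStub(channel)
        # An empty service name queries the server's overall status.
        response = stub.Check(health_pb2.HealthCheckRequest(service=""))
        return response.status == health_pb2.HealthCheckResponse.SERVING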

Model Runner Client Library

The Model Runner Client is the Coordinator-side Python library used to connect to the Model Orchestrator, keep the list of available models up to date, and call many models concurrently (fan-out) over gRPC. It is designed to make remote inference reliable at scale, even when some models are slow, buggy, or offline.

Initialization

A typical setup looks like:
runner = DynamicSubclassModelConcurrentRunner(
    timeout=50,
    crunch_id="your-crunch-id",
    host="localhost",
    port=9091,
    base_classname="condorgame.tracker.TrackerBase",
    max_consecutive_failures=10,
    max_consecutive_timeouts=10,
)
  • timeout — maximum time (in seconds) to wait for all models during a call
  • crunch_id — on-chain identity of the Crunch
  • host / port — location of the Model Orchestrator (local or remote)
  • base_classname — base class that all participant models must implement, provided by your PyPI package (see Public GitHub project)
  • max_consecutive_failures — after this many failures, a model is disconnected
  • max_consecutive_timeouts — after this many timeouts, a model is disconnected
Then, in your async service:
await runner.init()   # connect to orchestrator and to all models
await runner.sync()   # keep the model list updated in the background
Once this is done, you are ready to call models.
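For completeness, the pieces fit into an async entry point roughly like this (the import path is illustrative, not confirmed by this page):
import asyncio

from model_runner_client import DynamicSubclassModelConcurrentRunner  # illustrative path

async def main() -> None:
    runner = DynamicSubclassModelConcurrentRunner(
        timeout=50,
        crunch_id="your-crunch-id",
        host="localhost",
        port=9091,
        base_classname="condorgame.tracker.TrackerBase",
        max_consecutive_failures=10,
        max_consecutive_timeouts=10,
    )
    await runner.init()   # connect to orchestrator and to all models
    await runner.sync()   # keep the model list updated in the background
    # ... issue runner.call(...) from here ...

asyncio.run(main())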

Sending a Tick

await runner.call(
    method="infer",   # the method of the base model class
    arguments=[      # a list of typed arguments
        Argument(
            position=1,
            data=Variant(
                type=VariantType.JSON,
                value=encode_data(VariantType.JSON, prices),
            ),
        )
    ],
)
Notes:
  • method is the name of the method defined in your base interface.
  • arguments is a list of typed arguments.
  • The client library handles encoding/decoding via gRPC.
You can also target only a subset of models (for example, to allocate more work to top performers).
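This page does not show the parameter used for subset targeting, so treat the model_ids keyword below as purely hypothetical:
# Hypothetical sketch: the real keyword for targeting a subset may differ.
top_performers = ["model-id-1", "model-id-2"]  # e.g. your current best models
await runner.call(
    method="infer",
    arguments=[
        Argument(
            position=1,
            data=Variant(
                type=VariantType.JSON,
                value=encode_data(VariantType.JSON, prices),
            ),
        )
    ],
    model_ids=top_performers,  # assumed parameter name, not confirmed
)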

Requesting Predictions

Example:
results = await runner.call(
    method="infer",
    arguments=[
        Argument(position=1, data=Variant(type=VariantType.STRING, value=encode_data(VariantType.STRING, asset_code))),
        Argument(position=2, data=Variant(type=VariantType.INT, value=encode_data(VariantType.INT, horizon))),
        Argument(position=3, data=Variant(type=VariantType.INT, value=encode_data(VariantType.INT, step))),
    ],
)
Each result typically contains:
  • model identifier
  • status (SUCCESS, FAILURE, TIMEOUT)
  • output (prediction)
  • latency (time spent predicting, in microseconds)
  • runner metadata:
    • model_name — user-defined model name
    • cruncher_name — Cruncher display name
    • cruncher_id — unique Cruncher identifier on chain
    • deployment_id — deployed version identifier
Note: deployment_id helps detect when a Cruncher deploys a new version of their model. You can use it to reset metrics or apply version-specific behavior.
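As a sketch, a result-processing loop that uses deployment_id this way could look like the following; the attribute access mirrors the field list above but the exact result shape is an assumption, and reset_metrics / record_prediction are hypothetical helpers:
latest_deployment: dict[str, str] = {}

for result in results:
    meta = result.metadata  # assumed attribute names, mirroring the fields above
    if latest_deployment.get(meta.cruncher_id) != meta.deployment_id:
        # New version deployed: start this model's metrics from scratch.
        latest_deployment[meta.cruncher_id] = meta.deployment_id
        reset_metrics(meta.cruncher_id)  # hypothetical helper
    if result.status == "SUCCESS":  # status may be an enum in practice
        record_prediction(meta.model_name, result.output, result.latency)  # hypothetical helper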

Timeouts and Failures

All calls are executed with a timeout. The client performs concurrent calls to all models and applies safeguards so a single slow or unresponsive model cannot block the system. A call is marked as:
  • TIMEOUT if the model takes too long to respond
  • FAILURE if the model raises an exception or returns invalid data
The library tracks:
  • consecutive failures
  • consecutive timeouts
When a limit is reached, the model is stopped and disconnected. This protects your system from:
  • buggy models
  • very slow models
  • models that do not respect the interface
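Conceptually, the disconnect policy behaves like the counter below; this is an illustration of the behavior described above, not the library's actual implementation:
class ModelGuard:
    """Tracks consecutive failures/timeouts for one model (illustration only)."""
    def __init__(self, max_failures: int = 10, max_timeouts: int = 10):
        self.max_failures = max_failures
        self.max_timeouts = max_timeouts
        self.failures = 0
        self.timeouts = 0

    def record(self, status: str) -> bool:
        """Update streaks; return True when the model should be disconnected."""
        if status == "SUCCESS":
            self.failures = self.timeouts = 0  # a success resets both streaks
        elif status == "FAILURE":
            self.failures += 1
        elif status == "TIMEOUT":
            self.timeouts += 1
        return self.failures >= self.max_failures or self.timeouts >= self.max_timeouts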

Async and the Event Loop

The Model Runner Client is asynchronous for two main reasons:
  1. it keeps a persistent connection to the orchestrator
  2. it must call many models concurrently
It uses an event loop to:
  • maintain a live model list
  • detect when models join/leave
  • reconnect automatically when needed
To keep your system healthy, avoid blocking the event loop:
  • don’t run heavy computations directly in the Predict worker (see the sketch after this list)
  • delegate heavier work to the Score worker
  • keep network calls and database writes efficient
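For example, CPU-heavy scoring can be pushed off the event loop with asyncio.to_thread (a process pool is the better fit if the work is purely CPU-bound and holds the GIL):
import asyncio

def score_predictions(outputs: list) -> dict:
    """CPU-heavy scoring; calling this directly would block the event loop."""
    ...  # expensive computation here

async def handle_results(outputs: list) -> dict:
    # Run the blocking function in a worker thread so the loop stays free
    # to maintain connections and detect models joining or leaving.
    return await asyncio.to_thread(score_predictions, outputs)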