Platform¶
Platform module for training and evaluation infrastructure.
- myriad.platform.train_and_evaluate(config, agent=None)[source]¶
Main entry point for a training run. Initializes everything and runs the outer training loop.
Output directory is automatically managed:
- Under Hydra: uses current directory (Hydra-managed)
- Otherwise: creates timestamped directory in outputs/
- Parameters:
config (Config) – Training configuration specifying environment, agent, and run parameters.
agent (Agent | None) – Optional pre-built Agent instance. If provided, config.agent is used only for logging/metadata and the supplied agent runs instead.
- Returns:
TrainingResults containing:
agent_state: Trained agent (ready for inference)
training_metrics: Training history (loss, reward, etc.)
eval_metrics: Evaluation history (episode returns, lengths)
config: Configuration used (for reproducibility)
final_env_state: Final environment states (can be used to resume training)
- Return type:
TrainingResults
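The non-Hydra fallback described above can be pictured with a minimal sketch. The `outputs/<date>/<time>` layout is an assumption inferred from the example paths on this page (such as `outputs/2026-02-12/14-30-52`), and `make_run_dir` is a hypothetical helper, not a function of `myriad.platform`.

```python
import tempfile
from datetime import datetime
from pathlib import Path


def make_run_dir(base: Path, now: datetime) -> Path:
    """Hypothetical helper: create base/YYYY-MM-DD/HH-MM-SS and return it."""
    # Assumed layout: one folder per day, one sub-folder per start time.
    run_dir = base / now.strftime("%Y-%m-%d") / now.strftime("%H-%M-%S")
    run_dir.mkdir(parents=True, exist_ok=True)
    return run_dir


with tempfile.TemporaryDirectory() as tmp:
    run_dir = make_run_dir(Path(tmp) / "outputs", datetime(2026, 2, 12, 14, 30, 52))
    relative = run_dir.relative_to(tmp).as_posix()
    created = run_dir.is_dir()
```

Passing the timestamp in explicitly keeps the helper deterministic, which makes the directory name easy to check in tests.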
- myriad.platform.evaluate(config, agent_state=None, agent=None, return_episodes=False, save_episodes_to_disk_flag=None)[source]¶
Evaluation-only entry point (no training).
Useful for:
- Non-learning controllers (random, bang-bang, PID)
- Pre-trained models
- Baseline comparisons
- Benchmarking and validation
Output directory is automatically managed:
- Under Hydra: uses current directory (Hydra-managed)
- Otherwise: creates timestamped directory in outputs/
- Parameters:
config (EvalConfig) – EvalConfig specifying environment, agent, and evaluation parameters. Use config_to_eval_config() to convert a training Config if needed.
agent_state (AgentState | None) – Optional pre-initialized agent state. If None, agent will be initialized with random weights using config.run.seed.
agent (Agent | None) – Optional pre-built Agent instance. If provided, config.agent is used only for logging/metadata and the supplied agent runs instead. Use this for agents whose constructor requires non-serializable arguments (e.g. a JAX array schedule for myriad.agents.classical.open_loop).
return_episodes (bool) – If True, return full episode trajectories in EvaluationResults.episodes. This includes observations, actions, rewards, and dones for each step.
save_episodes_to_disk_flag (bool | None) – If True, save episodes to disk (respects config settings). If None, infers from config.run.eval_episode_save_frequency. Episodes can be saved to disk without keeping them in memory (return_episodes=False).
- Returns:
EvaluationResults containing:
Summary statistics (mean_return, std_return, min, max)
Raw episode data (episode_returns, episode_lengths)
Optional trajectory data (if return_episodes=True)
Metadata (num_episodes, seed)
- Return type:
EvaluationResults
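The tri-state `save_episodes_to_disk_flag` can be read as "explicit value wins, `None` defers to the config". The sketch below illustrates that pattern only. `resolve_save_flag` is a hypothetical helper, and the rule that a positive `eval_episode_save_frequency` means "save" is an assumption, since this page does not state how the value is inferred.

```python
from typing import Optional


def resolve_save_flag(flag: Optional[bool], save_frequency: int) -> bool:
    """Hypothetical: an explicit True/False wins; None falls back to the config."""
    if flag is not None:
        return flag
    # Assumption: a positive save frequency in the config means "save episodes".
    return save_frequency > 0


explicit_off = resolve_save_flag(False, save_frequency=5)
inferred_on = resolve_save_flag(None, save_frequency=5)
inferred_off = resolve_save_flag(None, save_frequency=0)
```

Using `None` as the "not specified" value keeps `False` available as a deliberate opt-out.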
- class myriad.platform.TrainingResults(agent_state, training_metrics, eval_metrics, config, run_dir, final_env_state=None)[source]¶
Bases: object
Complete results from a training run.
Returned by train_and_evaluate() and contains everything needed to:
Use the trained agent for inference
Analyze training progress
Reproduce the run
Resume training (optional)
- training_metrics: TrainingMetrics¶
Training metrics history (loss, reward, etc.).
- eval_metrics: EvaluationMetrics¶
Evaluation metrics history (episode returns, lengths).
- config: Config¶
Configuration used for this training run (for reproducibility).
- final_env_state: Any | None = None¶
Final state of training environments (can be used to resume training).
- summary()[source]¶
Get summary statistics for quick inspection.
- Returns:
Dictionary with key metrics:
final_eval_return_mean: Mean return from last evaluation checkpoint
final_eval_return_std: Std deviation from last evaluation checkpoint
training_steps_per_env: Environment steps per individual environment
training_global_steps: Total global environment steps across all envs
num_eval_checkpoints: Number of evaluations performed
- Return type:
dict
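The two step counters in the summary differ only by the number of parallel environments. A small worked example, assuming every environment advances in lockstep (the page does not state this relation explicitly, and `training_global_steps` here is a hypothetical function named after the summary key):

```python
def training_global_steps(steps_per_env: int, num_envs: int) -> int:
    """Assumed relation: each of num_envs environments takes steps_per_env steps."""
    return steps_per_env * num_envs


# 1000 steps in each of 8 parallel environments.
total = training_global_steps(steps_per_env=1000, num_envs=8)
```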
- save(directory, save_checkpoint=False)[source]¶
Save results and optionally agent checkpoint to directory.
Saves:
- .hydra/config.yaml: Configuration used for the run
- results.pkl: TrainingResults without agent_state
- checkpoints/final.msgpack: Agent state (if save_checkpoint=True)
Note: The agent_state is excluded from results.pkl and saved separately using Flax msgpack serialization for reliability with JAX/Flax objects.
- Parameters:
- Raises:
RuntimeError – If agent checkpoint serialization fails
Example
>>> results = train_and_evaluate(config)
>>> results.save(Path.cwd(), save_checkpoint=True)
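The split between `results.pkl` and `checkpoints/final.msgpack` can be sketched with the standard library alone. This is an illustration of the layout, not the library's implementation: a plain dict stands in for the results object, `save_split` is a hypothetical helper, and the agent state is passed in as placeholder bytes instead of being produced by Flax msgpack serialization.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Any, Dict


def save_split(results: Dict[str, Any], directory: Path, state_bytes: bytes) -> None:
    """Hypothetical: pickle everything except agent_state, store the state separately."""
    stripped = {key: value for key, value in results.items() if key != "agent_state"}
    (directory / "results.pkl").write_bytes(pickle.dumps(stripped))
    # The agent state goes to its own file (placeholder bytes in this sketch).
    checkpoint_dir = directory / "checkpoints"
    checkpoint_dir.mkdir(parents=True, exist_ok=True)
    (checkpoint_dir / "final.msgpack").write_bytes(state_bytes)


with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp)
    toy_results = {"agent_state": object(), "final_eval_return_mean": 0.5}
    save_split(toy_results, out, state_bytes=b"placeholder-bytes")
    loaded = pickle.loads((out / "results.pkl").read_bytes())
    checkpoint_exists = (out / "checkpoints" / "final.msgpack").is_file()
```

Keeping the agent state out of the pickle means the metrics file can be read without the agent's classes being importable.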
- static load(directory)[source]¶
Load results from directory.
- Parameters:
- Returns:
Loaded TrainingResults object
- Return type:
Example
>>> results = TrainingResults.load("outputs/2026-02-12/14-30-52")
>>> print(results.summary())
- save_agent(path)[source]¶
Save trained agent state to file using Flax msgpack serialization.
- Parameters:
path (str | Path) – Path to save the agent state (typically with .msgpack extension)
- Raises:
RuntimeError – If serialization fails
Example
>>> results = train_and_evaluate(config)
>>> results.save_agent("trained_agent.msgpack")
- static load_agent(path)[source]¶
Load agent state from file.
- Parameters:
- Returns:
The loaded agent state (can be passed to evaluate())
- Raises:
FileNotFoundError – If file doesn’t exist
RuntimeError – If deserialization fails
- Return type:
Example
>>> agent_state = TrainingResults.load_agent("trained_agent.msgpack")
>>> results = evaluate(config, agent_state=agent_state)
- __init__(agent_state, training_metrics, eval_metrics, config, run_dir, final_env_state=None)¶
- class myriad.platform.TrainingMetrics(global_steps, steps_per_env, loss=None, reward=None, agent_metrics=None)[source]¶
Bases: object
Training metrics collected at each logging checkpoint.
Metrics are captured at intervals defined by eval_frequency in the run config. Each list contains one entry per logging checkpoint.
- agent_metrics: dict[str, list[float]] | None = None¶
Agent-specific metrics (e.g., q_value, td_error for DQN).
- __init__(global_steps, steps_per_env, loss=None, reward=None, agent_metrics=None)¶
- class myriad.platform.EvaluationMetrics(global_steps, steps_per_env, episode_returns, episode_lengths, mean_return, std_return, mean_length)[source]¶
Bases: object
Evaluation metrics collected at each evaluation checkpoint.
Metrics are captured at intervals defined by eval_frequency in the run config. Each list contains one entry per evaluation checkpoint.
- episode_returns: list[ndarray]¶
Raw episode returns from each evaluation. Each array contains returns from eval_rollouts episodes.
- episode_lengths: list[ndarray]¶
Raw episode lengths from each evaluation. Each array contains lengths from eval_rollouts episodes.
- __init__(global_steps, steps_per_env, episode_returns, episode_lengths, mean_return, std_return, mean_length)¶
- class myriad.platform.EvaluationResults(mean_return, std_return, min_return, max_return, mean_length, std_length, min_length, max_length, episode_returns, episode_lengths, num_episodes, seed, config, run_dir, episodes=None, agent_state=None)[source]¶
Bases: object
Results from an evaluation-only run.
Returned by evaluate() and contains:
Summary statistics (mean, std, min, max)
Raw episode data (for custom analysis)
Optional trajectory data (if return_episodes=True)
Metadata (seed, num_episodes, config)
- __init__(mean_return, std_return, min_return, max_return, mean_length, std_length, min_length, max_length, episode_returns, episode_lengths, num_episodes, seed, config, run_dir, episodes=None, agent_state=None)¶
- config: EvalConfig¶
Evaluation configuration used (for reproducibility).
- episodes: dict[str, ndarray] | None = None¶
Full episode trajectories (if return_episodes=True). Contains:
- observations: Shape (num_episodes, max_steps, obs_dim)
- actions: Shape (num_episodes, max_steps, ...)
- rewards: Shape (num_episodes, max_steps)
- dones: Shape (num_episodes, max_steps)
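Because every trajectory array is padded to `max_steps`, per-episode quantities have to be recovered with the `dones` mask. The sketch below shows one way to do that. It assumes that `dones` is set on the terminal step and that entries after it are padding, neither of which is stated on this page. Nested lists stand in for the ndarrays, and both helpers are hypothetical.

```python
from typing import List


def lengths_from_dones(dones: List[List[bool]]) -> List[int]:
    """Length is the index of the first done plus one, or max_steps if never done."""
    return [row.index(True) + 1 if True in row else len(row) for row in dones]


def returns_from_rewards(rewards: List[List[float]], dones: List[List[bool]]) -> List[float]:
    """Sum rewards up to and including the terminal step, ignoring padding."""
    return [sum(row[:n]) for row, n in zip(rewards, lengths_from_dones(dones))]


# Two episodes, max_steps = 4: the first ends at step 3, the second is truncated.
dones = [[False, False, True, False], [False, False, False, False]]
rewards = [[1.0, 1.0, 1.0, 0.0], [0.5, 0.5, 0.5, 0.5]]
lengths = lengths_from_dones(dones)
returns = returns_from_rewards(rewards, dones)
```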
- save(directory, save_checkpoint=False)[source]¶
Save results and optionally agent checkpoint to directory.
Saves:
- .hydra/config.yaml: Configuration used for the run (if config is present)
- results.pkl: EvaluationResults without agent_state
- checkpoints/final.msgpack: Agent state (if save_checkpoint=True and agent_state exists)
Note: The agent_state is excluded from results.pkl and saved separately using Flax msgpack serialization for reliability with JAX/Flax objects.
- Parameters:
- Raises:
RuntimeError – If agent checkpoint serialization fails
Example
>>> results = evaluate(config, agent_state=agent_state)
>>> results.save(Path.cwd(), save_checkpoint=True)
- static load(directory)[source]¶
Load results from directory.
- Parameters:
- Returns:
Loaded EvaluationResults object
- Return type:
Example
>>> results = EvaluationResults.load("outputs/2026-02-12/14-30-52")
>>> print(results.summary())
- summary()[source]¶
Get summary statistics for quick inspection.
- Returns:
Dictionary with key metrics:
mean_return: Mean episode return
std_return: Standard deviation of returns
min_return: Minimum return
max_return: Maximum return
mean_length: Mean episode length
num_episodes: Number of episodes evaluated
- Return type:
dict
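The summary keys listed above can be reproduced from raw episode data with a few lines of standard-library code. This is a sketch of the arithmetic only. `summarize` is a hypothetical function, and the use of the population standard deviation is an assumption, since the page does not say which estimator `std_return` uses.

```python
import statistics
from typing import Dict, List


def summarize(episode_returns: List[float], episode_lengths: List[int]) -> Dict[str, float]:
    """Hypothetical re-computation of the summary keys from raw episode data."""
    return {
        "mean_return": statistics.fmean(episode_returns),
        # Assumption: population standard deviation (divide by N, not N - 1).
        "std_return": statistics.pstdev(episode_returns),
        "min_return": min(episode_returns),
        "max_return": max(episode_returns),
        "mean_length": statistics.fmean(episode_lengths),
        "num_episodes": len(episode_returns),
    }


summary = summarize([10.0, 20.0, 30.0], [100, 200, 300])
```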
- class myriad.platform.SessionLogger(wandb_run, run_dir, seed=0)[source]¶
Bases: object
Unified logger for training and evaluation sessions.
Focuses on logging metrics and episodes during runs. Artifact persistence (saving results, checkpoints) is handled by the result objects themselves.
Handles three destinations automatically:
1. Memory - Captures metrics for return values
2. Disk - Saves episode trajectories
3. Remote - Logs to W&B (metrics + artifacts)
Example
>>> logger = SessionLogger.for_training(config)
>>> logger.log_training_step(...)
>>> logger.log_evaluation(..., save_episodes=True)
>>> training_metrics, eval_metrics = logger.get_results()
>>> logger.finalize()
- classmethod for_training(config, run_dir=None)[source]¶
Create a logger for training sessions.
- Parameters:
config (Config) – Training configuration
run_dir (Path | None) – Output directory for artifacts (default: current directory)
- Returns:
Configured SessionLogger instance
- Return type:
- classmethod for_evaluation(config, run_dir=None)[source]¶
Create a logger for evaluation-only sessions.
- Parameters:
config (EvalConfig) – Evaluation configuration
run_dir (Path | None) – Output directory for artifacts (default: current directory)
- Returns:
Configured SessionLogger instance
- Return type:
- log_training_step(global_step, steps_per_env, metrics_history, steps_this_chunk)[source]¶
Log training metrics.
Handles memory capture + W&B logging.
- log_evaluation(global_step, steps_per_env, eval_results, save_episodes=False, episode_save_count=None)[source]¶
Log evaluation results.
One call handles:
- Captures metrics to memory
- Saves episodes to disk (if save_episodes=True)
- Logs metrics to W&B
- Uploads episode artifacts to W&B
- Parameters:
global_step (int) – Global environment steps
steps_per_env (int) – Steps per individual environment
eval_results (dict[str, Any]) – Dictionary with ‘episode_return’, ‘episode_length’, ‘dones’, and optionally ‘episodes’ (trajectory data)
save_episodes (bool) – If True, save episodes to disk and log to W&B
episode_save_count (int | None) – Number of episodes to save (None = all available)
- Returns:
Path to saved episodes directory (if saved), else None
- Return type:
Path | None
- finalize(exit_code=0)[source]¶
Close the W&B run.
- Parameters:
exit_code (int) – 0 for clean/intentional exit (finished, killed by sweep agent, user-stopped), 1 for unexpected failure (OOM, crash).
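One way to make sure `finalize` always receives the right exit code is a `try`/`except`/`finally` wrapper. The sketch below uses a stub with the same `finalize(exit_code)` shape, not the real SessionLogger. `StubLogger`, `run_with_logger`, and `boom` are hypothetical names introduced for illustration.

```python
from typing import Callable, Optional


class StubLogger:
    """Stand-in that only records the exit code it was finalized with."""

    def __init__(self) -> None:
        self.exit_code: Optional[int] = None

    def finalize(self, exit_code: int = 0) -> None:
        self.exit_code = exit_code


def run_with_logger(logger: StubLogger, work: Callable[[], None]) -> None:
    exit_code = 0
    try:
        work()
    except Exception:
        # Unexpected failure: report 1, then let the error propagate.
        exit_code = 1
        raise
    finally:
        # Runs on success, on failure, and on KeyboardInterrupt (which is not
        # an Exception subclass, so a user stop keeps exit_code at 0).
        logger.finalize(exit_code=exit_code)


def boom() -> None:
    raise RuntimeError("simulated crash")


ok_logger = StubLogger()
run_with_logger(ok_logger, lambda: None)

failed_logger = StubLogger()
try:
    run_with_logger(failed_logger, boom)
except RuntimeError:
    pass
```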
- log_videos(episode_dir, render_frame_fn, global_step, fps=50, max_episodes=None, video_dir=None)[source]¶
Render saved episodes to videos and log to W&B.
- Parameters:
episode_dir (Path) – Path to directory containing .npz episode files
render_frame_fn (Callable[[ndarray], ndarray]) – Function that takes observation array and returns RGB frame
global_step (int) – Global environment steps (for W&B logging step)
fps (int) – Frames per second for rendered videos
max_episodes (int | None) – Maximum number of episodes to render (None = all)
video_dir (Path | None) – Optional output directory for videos (if None, creates temporary videos)
- myriad.platform.load_run(run_path)[source]¶
Load all artifacts from a run directory.
This is the main entry point for loading runs. It loads config, results, and metadata in one call. Agent checkpoints can be loaded on demand.
- Parameters:
- Returns:
RunArtifacts container with all run data
- Return type:
Example
>>> run = load_run("outputs/2026-02-12/14-30-52")
>>> print(f"Final return: {run.results.summary()['mean_return']}")
>>> agent = run.load_checkpoint()  # Lazy load if needed
- myriad.platform.load_run_config(run_path)[source]¶
Load config from run directory.
Loads from .hydra/config.yaml and validates with Pydantic. Requires run_metadata.yaml to determine config type.
- Parameters:
- Returns:
Config or EvalConfig depending on run type
- Raises:
FileNotFoundError – If config.yaml or run_metadata.yaml not found
RuntimeError – If run_type field missing from metadata
- Return type:
Config | EvalConfig
Example
>>> config = load_run_config("outputs/2026-02-12/14-30-52")
>>> print(config.run.seed)
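The dependency on run_metadata.yaml amounts to a dispatch on the `run_type` field. A minimal sketch of that dispatch follows, using a plain dict in place of the parsed YAML. The values `"training"` and `"evaluation"` are assumptions, as this page does not list the valid `run_type` values, and `select_config_type` is a hypothetical helper.

```python
from typing import Any, Dict


def select_config_type(metadata: Dict[str, Any]) -> str:
    """Hypothetical: map the run_type field to the name of the config class."""
    run_type = metadata.get("run_type")
    if run_type is None:
        raise RuntimeError("run_type field missing from metadata")
    # Assumed run_type values; the real metadata may use different strings.
    mapping = {"training": "Config", "evaluation": "EvalConfig"}
    if run_type not in mapping:
        raise RuntimeError(f"unknown run_type: {run_type!r}")
    return mapping[run_type]


training_type = select_config_type({"run_type": "training"})
evaluation_type = select_config_type({"run_type": "evaluation"})
try:
    select_config_type({"timestamp": "2026-02-12"})
    missing_raised = False
except RuntimeError:
    missing_raised = True
```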
- myriad.platform.load_run_results(run_path)[source]¶
Load results from run directory.
- Parameters:
- Returns:
TrainingResults or EvaluationResults
- Return type:
Example
>>> results = load_run_results("outputs/2026-02-12/14-30-52")
>>> print(results.summary())
- myriad.platform.load_run_checkpoint(run_path, checkpoint='final')[source]¶
Load agent checkpoint from run directory.
- Parameters:
- Returns:
Agent state from checkpoint
- Raises:
FileNotFoundError – If checkpoint file not found
RuntimeError – If deserialization fails
- Return type:
Example
>>> agent_state = load_run_checkpoint("outputs/2026-02-12/14-30-52")
>>> # Use with evaluate()
>>> results = evaluate(config, agent_state=agent_state)
- myriad.platform.load_run_metadata(run_path)[source]¶
Load run metadata from run directory.
- Parameters:
- Returns:
Dictionary with metadata (run_type, timestamp, git_hash, versions)
- Raises:
FileNotFoundError – If metadata file not found
- Return type:
Example
>>> metadata = load_run_metadata("outputs/2026-02-12/14-30-52")
>>> print(metadata["git_hash"])
- class myriad.platform.RunArtifacts(config, results, metadata, run_path)[source]¶
Bases: Generic[ConfigT, ResultsT]
Container for all artifacts from a run.
Provides a unified interface to access configs, results, metadata, and optionally load checkpoints.
- Type parameters:
ConfigT: Config or EvalConfig
ResultsT: TrainingResults or EvaluationResults
- config: ConfigT¶
Configuration used for this run.
- results: ResultsT¶
Results from the run.
- __init__(config, results, metadata, run_path)¶
- load_checkpoint(checkpoint='final')[source]¶
Load agent checkpoint from disk.
Always loads fresh from disk (no caching).
- Parameters:
checkpoint (str) – Checkpoint name (default: “final”)
- Returns:
Agent state from checkpoint
- Raises:
FileNotFoundError – If checkpoint file not found
RuntimeError – If deserialization fails
- Return type:
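The shape of such a container, generic over config and results types with an uncached checkpoint loader, can be sketched as follows. `ToyArtifacts` is a hypothetical stand-in, not the real RunArtifacts. It returns raw bytes instead of a deserialized agent state, and the `checkpoints/<name>.msgpack` path is an assumption taken from the file layout described under `save()`.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Dict, Generic, TypeVar

ConfigT = TypeVar("ConfigT")
ResultsT = TypeVar("ResultsT")


@dataclass
class ToyArtifacts(Generic[ConfigT, ResultsT]):
    config: ConfigT
    results: ResultsT
    metadata: Dict[str, Any]
    run_path: Path

    def load_checkpoint(self, checkpoint: str = "final") -> bytes:
        # Read from disk on every call; nothing is cached on the instance.
        path = self.run_path / "checkpoints" / f"{checkpoint}.msgpack"
        if not path.is_file():
            raise FileNotFoundError(path)
        return path.read_bytes()


with tempfile.TemporaryDirectory() as tmp:
    run_path = Path(tmp)
    ckpt = run_path / "checkpoints" / "final.msgpack"
    ckpt.parent.mkdir(parents=True)
    ckpt.write_bytes(b"v1")
    run: ToyArtifacts[dict, dict] = ToyArtifacts(
        config={"seed": 42}, results={"mean_return": 1.0}, metadata={}, run_path=run_path
    )
    first = run.load_checkpoint()
    ckpt.write_bytes(b"v2")  # change the file between the two calls
    second = run.load_checkpoint()
    try:
        run.load_checkpoint("missing")
        missing_raised = False
    except FileNotFoundError:
        missing_raised = True
```

Because nothing is cached, the second call sees the updated file, which matches the "always loads fresh from disk" behavior described above.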
- myriad.platform.fetch_sweep_runs(sweep_id, *, state=None)[source]¶
Fetch runs from a W&B sweep, optionally filtered by state.
- myriad.platform.fetch_top_k_runs(sweep_id, metric, top_k, *, maximize)[source]¶
Return the top-K finished runs from a W&B sweep, sorted by metric.
- Parameters:
- Returns:
List of wandb.Run objects, length ≤ top_k.
- Return type:
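The selection logic (keep finished runs, sort by a metric, take the first K) can be sketched without a W&B connection. Plain dicts with `state` and `summary` keys stand in for run objects here. `top_k_runs` is a hypothetical helper, and skipping runs that lack the metric is an assumption, not documented behavior.

```python
from typing import Any, Dict, List


def top_k_runs(
    runs: List[Dict[str, Any]], metric: str, top_k: int, *, maximize: bool
) -> List[Dict[str, Any]]:
    """Hypothetical: best-first list of finished runs, at most top_k long."""
    # Assumption: runs that never logged the metric are skipped.
    finished = [r for r in runs if r["state"] == "finished" and metric in r["summary"]]
    finished.sort(key=lambda r: r["summary"][metric], reverse=maximize)
    return finished[:top_k]


runs = [
    {"name": "a", "state": "finished", "summary": {"eval/return": 120.0}},
    {"name": "b", "state": "crashed", "summary": {"eval/return": 500.0}},
    {"name": "c", "state": "finished", "summary": {"eval/return": 300.0}},
    {"name": "d", "state": "finished", "summary": {}},
]
best = [r["name"] for r in top_k_runs(runs, "eval/return", 2, maximize=True)]
worst = [r["name"] for r in top_k_runs(runs, "eval/return", 1, maximize=False)]
```

The keyword-only `maximize` flag forces callers to state the sort direction, which avoids silently ranking a loss-like metric the wrong way round.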
- myriad.platform.config_from_wandb_run(run)[source]¶
Reconstruct a Config from a W&B run object.
W&B stores the full model_dump() nested dict in run.config. Filters W&B-internal metadata and unwraps sweep param wrappers before passing to Config.model_validate.
- Parameters:
run (Any) – A wandb.Run object (from e.g. wandb.Api().run(...)).
- Returns:
A validated Config instance.
- Return type:
Config
- myriad.platform.runs_to_dataframe(runs, metrics=None)[source]¶
Convert a list of W&B runs to a Polars DataFrame.
Each row corresponds to one run. Config fields are flattened with dot-separated keys (e.g. agent.lr). Summary metrics are included as-is.
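The dot-separated flattening can be illustrated with a short recursive function. This is a sketch of the idea, not the library's code. `flatten` is a hypothetical helper, and the nested config below is made-up example data.

```python
from typing import Any, Dict


def flatten(nested: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Turn {"agent": {"lr": 1e-3}} into {"agent.lr": 1e-3}."""
    flat: Dict[str, Any] = {}
    for key, value in nested.items():
        dotted = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into sub-configs, carrying the dotted path along.
            flat.update(flatten(value, dotted))
        else:
            flat[dotted] = value
    return flat


row = flatten({"agent": {"lr": 1e-3, "name": "dqn"}, "run": {"seed": 42}})
```

Each flattened dict then becomes one row, with one column per dotted key.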
- myriad.platform.save_agent_state(agent_state, path)[source]¶
Serialize and save agent state to file.
- Parameters:
- Raises:
RuntimeError – If serialization or file writing fails
- myriad.platform.load_agent_state(path)[source]¶
Load and deserialize agent state from file.
- Parameters:
- Returns:
Deserialized agent state
- Raises:
FileNotFoundError – If file doesn’t exist
RuntimeError – If deserialization fails
- Return type:
- myriad.platform.serialize_agent_state(agent_state)[source]¶
Serialize agent state to msgpack bytes.
- Parameters:
agent_state (Any) – Agent state to serialize (typically Flax TrainState or similar)
- Returns:
Serialized bytes
- Raises:
RuntimeError – If serialization fails
- Return type:
- myriad.platform.deserialize_agent_state(data)[source]¶
Deserialize agent state from msgpack bytes.
- Parameters:
data (bytes) – Msgpack-serialized bytes
- Returns:
Deserialized agent state
- Raises:
RuntimeError – If deserialization fails
- Return type:
Config builder utilities for programmatic use.
This module provides high-level functions to create training and evaluation configs without requiring detailed knowledge of Pydantic models.
- myriad.configs.builder.create_config(env, agent, num_envs=1, steps_per_env=1000, rollout_steps=None, eval_max_steps=None, eval_frequency=100, eval_rollouts=10, seed=42, wandb_enabled=False, **kwargs)[source]¶
Create a training config with sensible defaults.
This is the recommended way to create configs programmatically. It provides a simpler interface than constructing nested Pydantic models.
- Parameters:
env (str) – Environment name (e.g., “cartpole-control”, “ccas-ccar-control”)
agent (str) – Agent name (e.g., “dqn”, “pqn”, “random”)
num_envs (int) – Number of parallel environments to run
steps_per_env (int) – Number of steps to run per environment
rollout_steps (int | None) – Number of steps to collect per environment before updating (for on-policy agents only). If None, defaults to 2 for on-policy agents.
eval_max_steps (int | None) – Maximum steps per evaluation episode. If None, uses environment-specific default from registry or Config models.
eval_frequency (int) – Log and evaluate every N steps-per-env (0 to disable)
eval_rollouts (int) – Number of episodes to run during evaluation
seed (int) – Random seed for reproducibility
wandb_enabled (bool) – Enable Weights & Biases logging
**kwargs (Any) – Additional config overrides. Can specify nested parameters using dot notation (e.g., agent.learning_rate=1e-3) or pass dicts for nested configs (e.g., wandb={"project": "my-project"}).
- Returns:
Fully configured Config object ready for train_and_evaluate()
- Return type:
Config
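A dotted name such as `agent.learning_rate` is not a valid Python keyword, so dot-notation overrides have to be passed through dictionary unpacking. The sketch below shows that calling pattern together with one plausible way to merge such overrides into a nested dict. `apply_overrides` is a hypothetical helper operating on plain dicts, not the merge logic used by `create_config`.

```python
import copy
from typing import Any, Dict


def apply_overrides(base: Dict[str, Any], **overrides: Any) -> Dict[str, Any]:
    """Hypothetical: merge dotted keys and nested dicts into a copy of base."""
    merged = copy.deepcopy(base)
    for key, value in overrides.items():
        if "." in key:
            # "agent.learning_rate" -> walk to merged["agent"], set "learning_rate".
            *parents, leaf = key.split(".")
            node = merged
            for part in parents:
                node = node.setdefault(part, {})
            node[leaf] = value
        elif isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key].update(value)
        else:
            merged[key] = value
    return merged


base = {"agent": {"name": "dqn", "learning_rate": 1e-4}, "wandb": {"enabled": False}}
config = apply_overrides(
    base,
    wandb={"project": "my-project"},
    **{"agent.learning_rate": 1e-3},  # dotted keys need ** unpacking
)
```

The same `**{"agent.learning_rate": 1e-3}` form is how a dotted override would be written in a call to `create_config`.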
- myriad.configs.builder.create_eval_config(env, agent, eval_rollouts=10, eval_max_steps=None, seed=42, wandb_enabled=False, **kwargs)[source]¶
Create an evaluation-only config with sensible defaults.
Use this for evaluating non-learning controllers (random, PID, bang-bang) or pre-trained models without any training.
- Parameters:
env (str) – Environment name (e.g., “cartpole-control”)
agent (str) – Agent name (e.g., “random”, “dqn”)
eval_rollouts (int) – Number of episodes to evaluate
eval_max_steps (int | None) – Maximum steps per episode. If None, uses environment-specific default from registry or Config models.
seed (int) – Random seed for reproducibility
wandb_enabled (bool) – Enable Weights & Biases logging
**kwargs (Any) – Additional config overrides (same as create_config)
- Returns:
Fully configured EvalConfig object ready for evaluate()
- Return type:
EvalConfig