CartPole

Classic inverted pendulum balancing task. Keep the pole upright by applying left/right forces to the cart.

Physics

Pure, stateless dynamics for the cart-pole system, decoupled from task logic.

This module contains the ground-truth dynamics for the cart-pole system, completely decoupled from any task-specific logic (rewards, terminations, observations).

The physics can be reused by different tasks (control, system identification, etc.) and accessed directly by model-based methods such as MPC planners or Neural ODEs.

class myriad.envs.classic.cartpole.physics.PhysicsState(x, x_dot, theta, theta_dot)[source]

Bases: NamedTuple

Pure physical state of the cart-pole system.

CartPole is fully observable, so PhysicsState serves as both the internal state and the observation. This avoids a separate, duplicate observation type and makes observability explicit.

Variables:
  • x (jax.Array) – Cart position (m)

  • x_dot (jax.Array) – Cart velocity (m/s)

  • theta (jax.Array) – Pole angle from vertical (rad, 0 = upright)

  • theta_dot (jax.Array) – Pole angular velocity (rad/s)

x: Array

Alias for field number 0

x_dot: Array

Alias for field number 1

theta: Array

Alias for field number 2

theta_dot: Array

Alias for field number 3

to_array()[source]

Convert to flat array for NN-based agents.

Returns:

Array of shape (4,) with [x, x_dot, theta, theta_dot]

Return type:

Array

classmethod from_array(arr)[source]

Create state from flat array.

Parameters:

arr (Array) – Array of shape (4,) with [x, x_dot, theta, theta_dot]

Returns:

PhysicsState instance

Return type:

PhysicsState
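As a rough illustration of the round trip between named fields and a flat vector, the sketch below defines a stand-in PhysicsState in plain Python. It is not the library class: plain floats and a list replace jax.Array (an assumption made so the example runs without JAX), and only the field order documented above is taken from the source.

```python
from typing import NamedTuple


class PhysicsState(NamedTuple):
    """Stand-in for the documented class; floats replace jax.Array (assumption)."""

    x: float
    x_dot: float
    theta: float
    theta_dot: float

    def to_array(self) -> list:
        # Flatten in the documented order [x, x_dot, theta, theta_dot].
        return [self.x, self.x_dot, self.theta, self.theta_dot]

    @classmethod
    def from_array(cls, arr) -> "PhysicsState":
        # Unpack a length-4 sequence back into named fields.
        x, x_dot, theta, theta_dot = arr
        return cls(x, x_dot, theta, theta_dot)


state = PhysicsState(x=0.1, x_dot=0.0, theta=-0.02, theta_dot=0.3)
flat = state.to_array()                   # what an NN-based agent would consume
restored = PhysicsState.from_array(flat)  # back to named fields
```

Named access (restored.theta) and the flat form carry the same four values, so an agent can pick whichever representation suits it.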

class myriad.envs.classic.cartpole.physics.PhysicsConfig(gravity=9.8, cart_mass=1.0, pole_mass=0.1, pole_length=0.5, force_magnitude=10.0, dt=0.02)[source]

Bases: object

Static physics constants for the cart-pole system.

These are compile-time constants, passed to jit as static arguments (static_argnames). Changing any value triggers recompilation, but lets the compiler specialize the code to the constants.

gravity: float = 9.8
cart_mass: float = 1.0
pole_mass: float = 0.1
pole_length: float = 0.5
force_magnitude: float = 10.0
dt: float = 0.02
__init__(gravity=9.8, cart_mass=1.0, pole_mass=0.1, pole_length=0.5, force_magnitude=10.0, dt=0.02)
replace(**updates)

Returns a new object replacing the specified fields with new values.
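The replace method can be pictured with a frozen dataclass from the standard library. This is a minimal sketch under that assumption: the source does not show which dataclass implementation the library uses, and only the field names and defaults below come from the documentation.

```python
import dataclasses


@dataclasses.dataclass(frozen=True)
class PhysicsConfig:
    """Stand-in mirroring the documented fields and defaults."""

    gravity: float = 9.8
    cart_mass: float = 1.0
    pole_mass: float = 0.1
    pole_length: float = 0.5
    force_magnitude: float = 10.0
    dt: float = 0.02

    def replace(self, **updates) -> "PhysicsConfig":
        # Return a modified copy; the original instance is left untouched.
        return dataclasses.replace(self, **updates)


default = PhysicsConfig()
heavy_pole = default.replace(pole_mass=0.5, pole_length=1.0)
```

Because the instance is frozen it is also hashable, which is what jit requires of static arguments; every distinct config value therefore corresponds to its own compiled variant.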

class myriad.envs.classic.cartpole.physics.PhysicsParams[source]

Bases: object

Dynamic physics parameters for domain randomization.

Currently empty for CartPole, but maintained for protocol consistency and future domain randomization support (e.g., varying masses/lengths per episode).

__init__()
replace(**updates)

Returns a new object replacing the specified fields with new values.

myriad.envs.classic.cartpole.physics.step_physics(state, action, params, config)[source]

Pure physics step using Euler integration.

The cart-pole dynamics are based on the equations from Barto, Sutton, Anderson (1983).

Parameters:
  • state (PhysicsState) – Current physical state (x, x_dot, theta, theta_dot)

  • action (Array) – Discrete action in {0, 1} selecting the force direction (0 = push left, 1 = push right)

  • params (PhysicsParams) – Dynamic parameters (unused, reserved for future randomization)

  • config (PhysicsConfig) – Static physics constants

Returns:

Next physical state after one dt timestep

Return type:

PhysicsState
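To make the Euler update concrete, here is a sketch of one step in plain Python. It assumes the widely used Gym-style form of the Barto, Sutton and Anderson equations, in which pole_length is treated as the half-length of the pole; the actual step_physics may differ in detail and operates on JAX arrays rather than floats. The State class and euler_step name are illustrative stand-ins.

```python
import math
from typing import NamedTuple


class State(NamedTuple):
    x: float
    x_dot: float
    theta: float
    theta_dot: float


def euler_step(state, action, gravity=9.8, cart_mass=1.0, pole_mass=0.1,
               pole_length=0.5, force_magnitude=10.0, dt=0.02):
    """One explicit Euler step of the cart-pole equations (Gym-style form)."""
    # Action 1 pushes right (+force), action 0 pushes left (-force).
    force = force_magnitude if action == 1 else -force_magnitude
    sin_t, cos_t = math.sin(state.theta), math.cos(state.theta)
    total_mass = cart_mass + pole_mass
    pole_mass_length = pole_mass * pole_length

    temp = (force + pole_mass_length * state.theta_dot ** 2 * sin_t) / total_mass
    theta_acc = (gravity * sin_t - cos_t * temp) / (
        pole_length * (4.0 / 3.0 - pole_mass * cos_t ** 2 / total_mass)
    )
    x_acc = temp - pole_mass_length * theta_acc * cos_t / total_mass

    # Explicit Euler: positions advance using the velocities from before the step.
    return State(
        x=state.x + dt * state.x_dot,
        x_dot=state.x_dot + dt * x_acc,
        theta=state.theta + dt * state.theta_dot,
        theta_dot=state.theta_dot + dt * theta_acc,
    )


# Start upright and at rest, then push right once.
after_push = euler_step(State(0.0, 0.0, 0.0, 0.0), action=1)
```

After a single push to the right from rest, the cart gains positive velocity and the pole starts rotating in the negative direction, while both positions are still unchanged because they are integrated from the old velocities.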

myriad.envs.classic.cartpole.physics.create_physics_params(**kwargs)[source]

Factory function to create PhysicsParams.

Parameters:

**kwargs – Reserved for future domain randomization parameters

Returns:

PhysicsParams instance

Return type:

PhysicsParams

Control Task

Standard balancing task with termination on falling or leaving bounds.

Control task wrapper for CartPole.

Standard balancing task: Keep the pole upright for as long as possible. Reward: +1 per timestep the pole remains balanced.

class myriad.envs.classic.cartpole.tasks.control.ControlTaskState(physics, t)[source]

Bases: NamedTuple

State for the control task.

physics: PhysicsState

Alias for field number 0

t: Array

Alias for field number 1

class myriad.envs.classic.cartpole.tasks.control.ControlTaskConfig(physics=<factory>, task=<factory>)[source]

Bases: object

Configuration for the CartPole control task.

Composed of physics config and task config for clean separation.

physics: PhysicsConfig
task: TaskConfig
property dt: float

Timestep duration in seconds.

property max_steps: int

Maximum number of steps per episode. Required by the EnvironmentConfig protocol.

__init__(physics=<factory>, task=<factory>)
replace(**updates)

Returns a new object replacing the specified fields with new values.

class myriad.envs.classic.cartpole.tasks.control.ControlTaskParams(physics=<factory>)[source]

Bases: object

Parameters for the control task.

physics: PhysicsParams
__init__(physics=<factory>)
replace(**updates)

Returns a new object replacing the specified fields with new values.

myriad.envs.classic.cartpole.tasks.control.step(key, state, action, params, config)

Step the control task forward one timestep.

Returns:

Tuple (obs_next, next_state, reward, done, info):
  • obs_next – Next observation (a PhysicsState, since the system is fully observable)

  • next_state – Next environment state

  • reward – Reward (+1.0 per step)

  • done – Termination flag (1.0 if done, 0.0 otherwise)

  • info – Empty dict (no auxiliary information)

Return type:

tuple
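The way a task step composes physics, termination and reward can be sketched as follows. Everything here is a stand-in: the reduced Physics and TaskState types, the injected toy_physics and toy_done functions, and the choice to increment t before checking termination are assumptions for illustration, and the real step additionally takes key, params and config.

```python
from typing import Callable, NamedTuple


class Physics(NamedTuple):
    """Reduced stand-in state; the real PhysicsState has four fields."""
    x: float
    theta: float


class TaskState(NamedTuple):
    physics: Physics
    t: int


def step(state: TaskState, action: int, physics_fn: Callable, done_fn: Callable):
    """Compose one task step from injected physics and termination functions."""
    next_physics = physics_fn(state.physics, action)
    next_state = TaskState(physics=next_physics, t=state.t + 1)
    done = done_fn(next_physics, next_state.t)
    reward = 1.0             # documented: +1.0 per step
    obs_next = next_physics  # fully observable: observation is the physics state
    return obs_next, next_state, reward, done, {}


def toy_physics(p: Physics, action: int) -> Physics:
    # Hypothetical placeholder dynamics: the cart drifts 1 m per step.
    return Physics(x=p.x + (1.0 if action == 1 else -1.0), theta=p.theta)


def toy_done(p: Physics, t: int) -> float:
    # Hypothetical rule: stop once the cart is more than 2.4 m from the origin.
    return 1.0 if abs(p.x) > 2.4 else 0.0


# Roll out with a constant "push right" action until the episode ends.
state = TaskState(Physics(x=0.0, theta=0.0), t=0)
total_reward, done = 0.0, 0.0
while done == 0.0:
    obs, state, reward, done, info = step(state, 1, toy_physics, toy_done)
    total_reward += reward
```

Injecting the physics and termination functions keeps the step itself free of dynamics, which mirrors the separation between the physics module and the task wrapper described in this section.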

myriad.envs.classic.cartpole.tasks.control.reset(key, params, config)

Reset the control task to initial state.

Initializes the pole with small random perturbations around the upright position.

Returns:

Tuple (obs, state):
  • obs – Initial observation (PhysicsState with named fields)

  • state – Initial task state

Return type:

tuple

myriad.envs.classic.cartpole.tasks.control.get_obs(state, params, config)[source]

Extract observation from state.

For control task, observation is the physical state as a NamedTuple with named fields. Neural network agents can call .to_array() for flat array representation.

Returns:

PhysicsState with named fields (x, x_dot, theta, theta_dot)

Return type:

PhysicsState

myriad.envs.classic.cartpole.tasks.control.get_obs_shape(config)[source]

Get the shape of the observation space.

Parameters:

config (ControlTaskConfig) – Task configuration (unused)

Returns:

Observation shape tuple

Return type:

tuple[int, …]

myriad.envs.classic.cartpole.tasks.control.get_action_space(config)[source]

Get the discrete action space for the environment.

Parameters:

config (ControlTaskConfig) – Task configuration (unused)

Returns:

Discrete space with 2 actions: 0 (push left) and 1 (push right)

myriad.envs.classic.cartpole.tasks.control.make_env(config=None, params=None, **kwargs)[source]

Create a CartPole control task environment.

Parameters:
  • config (ControlTaskConfig | None) – Custom ControlTaskConfig. If None, uses defaults.

  • params (ControlTaskParams | None) – Custom ControlTaskParams. If None, creates from kwargs.

  • **kwargs – Keyword arguments for creating config/params if not provided.

Returns:

Environment instance for the control task

Return type:

Environment[ControlTaskState, ControlTaskConfig, ControlTaskParams, PhysicsState]

Shared utilities for CartPole task wrappers.

class myriad.envs.classic.cartpole.tasks.base.TaskConfig(max_steps=500, theta_threshold=0.2094395102393195, x_threshold=2.4)[source]

Bases: object

Base configuration shared by all CartPole tasks.

These define the task-specific termination conditions and episode limits. theta_threshold is in radians (the default corresponds to 12°) and x_threshold is in meters.

max_steps: int = 500
theta_threshold: float = 0.2094395102393195
x_threshold: float = 2.4
__init__(max_steps=500, theta_threshold=0.2094395102393195, x_threshold=2.4)
replace(**updates)

Returns a new object replacing the specified fields with new values.

myriad.envs.classic.cartpole.tasks.base.check_termination(physics_state, t, task_config)[source]

Common termination check for CartPole tasks.

The episode terminates if any of the following holds:
  • Pole angle exceeds threshold (falls over)

  • Cart position exceeds threshold (goes off track)

  • Maximum timestep reached

Parameters:
  • physics_state (PhysicsState) – PhysicsState with x and theta fields

  • t (Array) – Current timestep counter

  • task_config (TaskConfig) – TaskConfig with thresholds and max_steps

Returns:

done – 1.0 if terminated, 0.0 otherwise (as float for JAX compatibility)
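A plain-Python sketch of the three termination conditions follows. The comparison operators (strict > for the thresholds, >= for the step limit) and the use of absolute values are assumptions, since the source states the conditions only in words; the real function works on JAX arrays and takes a PhysicsState rather than separate x and theta arguments.

```python
import math
from typing import NamedTuple


class TaskConfig(NamedTuple):
    """Stand-in carrying the documented defaults."""
    max_steps: int = 500
    theta_threshold: float = 0.2094395102393195  # about 12 degrees in radians
    x_threshold: float = 2.4                      # meters


def check_termination(x: float, theta: float, t: int, cfg: TaskConfig) -> float:
    pole_fell = abs(theta) > cfg.theta_threshold  # assumed strict comparison
    off_track = abs(x) > cfg.x_threshold          # assumed strict comparison
    timed_out = t >= cfg.max_steps                # assumed inclusive limit
    # Return a float flag, matching the documented 1.0 / 0.0 convention.
    return 1.0 if (pole_fell or off_track or timed_out) else 0.0


cfg = TaskConfig()
balanced = check_termination(x=0.0, theta=0.0, t=0, cfg=cfg)
fallen = check_termination(x=0.0, theta=0.3, t=0, cfg=cfg)
off_track = check_termination(x=-2.5, theta=0.0, t=10, cfg=cfg)
timed_out = check_termination(x=0.0, theta=0.0, t=500, cfg=cfg)
threshold_degrees = math.degrees(cfg.theta_threshold)
```

Converting the default theta_threshold back to degrees gives approximately 12, so the pole counts as fallen once it tilts more than about 12° to either side.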

myriad.envs.classic.cartpole.tasks.base.get_cartpole_obs(physics_state)[source]

Extract standard CartPole observation from physics state.

For CartPole, the system is fully observable, so observation = state. This eliminates duplication and makes observability explicit.

Parameters:

physics_state (PhysicsState) – PhysicsState with x, x_dot, theta, theta_dot fields

Returns:

PhysicsState (observation = state for fully observable system)

Return type:

PhysicsState

myriad.envs.classic.cartpole.tasks.base.get_cartpole_obs_shape()[source]

Get the shape of the CartPole observation space.

Returns:

Observation shape tuple (4,) for [x, x_dot, theta, theta_dot]

Return type:

tuple[int, …]

myriad.envs.classic.cartpole.tasks.base.get_cartpole_action_space()[source]

Get the discrete action space for CartPole.

Returns:

Discrete space with 2 actions: 0 (push left) and 1 (push right)

myriad.envs.classic.cartpole.tasks.base.sample_initial_physics(key)[source]

Sample initial physics state with small random perturbations.

The state starts near the upright equilibrium, with an independent small perturbation applied to each state variable.

Parameters:

key (Array) – RNG key for random initialization

Returns:

PhysicsState with small perturbations in range [-0.05, 0.05] for all state variables
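The sampling can be illustrated with the standard library. This sketch assumes a uniform distribution over the documented range and uses an integer seed as a stand-in for the JAX RNG key; the real function takes a key array and returns JAX arrays.

```python
import random
from typing import NamedTuple


class PhysicsState(NamedTuple):
    x: float
    x_dot: float
    theta: float
    theta_dot: float


def sample_initial_physics(seed: int) -> PhysicsState:
    # An integer seed stands in for the JAX RNG key used by the real function.
    rng = random.Random(seed)
    # One draw per state variable, uniform in [-0.05, 0.05] (assumed distribution).
    values = [rng.uniform(-0.05, 0.05) for _ in range(4)]
    return PhysicsState(*values)


first = sample_initial_physics(0)
repeat = sample_initial_physics(0)
other = sample_initial_physics(1)
```

As with a JAX key, the same seed reproduces the same initial state, which keeps resets deterministic for a given source of randomness.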