CartPole

Classic inverted pendulum balancing task. Keep the pole upright by applying left/right forces to the cart.

Physics

Pure, stateless physics for the cart-pole system.

This module contains the ground-truth dynamics for the cart-pole system, completely decoupled from any task-specific logic (rewards, terminations, observations).

The physics can be reused by different tasks (control, system ID, etc.) and can be directly accessed by model-based methods like MPC planners or Neural ODEs.

class myriad.envs.classic.cartpole.physics.PhysicsState(x, x_dot, theta, theta_dot)

Bases: NamedTuple

Pure physical state of the cart-pole system.

CartPole is a fully observable system, so PhysicsState serves as both the internal state and the observation. This eliminates duplication and makes observability explicit.

Variables:

x (Array) – Cart position (m)
x_dot (Array) – Cart velocity (m/s)
theta (Array) – Pole angle from vertical (rad, 0 = upright)
theta_dot (Array) – Pole angular velocity (rad/s)

x: Array – Alias for field number 0

x_dot: Array – Alias for field number 1

theta: Array – Alias for field number 2

theta_dot: Array – Alias for field number 3

to_array()

Convert to flat array for NN-based agents.

Returns: Array of shape (4,) with [x, x_dot, theta, theta_dot]
Return type: Array

classmethod from_array(arr)

Create state from flat array.

Parameters: arr (Array) – Array of shape (4,) with [x, x_dot, theta, theta_dot]
Returns: PhysicsState instance
Return type: PhysicsState
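
The round trip between the named state and its flat form can be sketched as follows. This is a minimal stand-in, not the library's implementation: the local `PhysicsState` below uses plain Python floats and a list where the documented class uses JAX arrays of shape (4,). Only the field order is taken from the documentation above.

```python
from typing import List, NamedTuple, Sequence


class PhysicsState(NamedTuple):
    """Local stand-in for the documented PhysicsState (floats, not JAX arrays)."""
    x: float
    x_dot: float
    theta: float
    theta_dot: float

    def to_array(self) -> List[float]:
        # Flatten in the documented field order.
        return [self.x, self.x_dot, self.theta, self.theta_dot]

    @classmethod
    def from_array(cls, arr: Sequence[float]) -> "PhysicsState":
        # Unpack exactly four entries in the same order.
        x, x_dot, theta, theta_dot = arr
        return cls(x, x_dot, theta, theta_dot)


state = PhysicsState(x=0.1, x_dot=0.0, theta=-0.02, theta_dot=0.3)
flat = state.to_array()                   # what a neural-network agent would consume
restored = PhysicsState.from_array(flat)  # back to named fields
```

Because both directions use the same field order, `from_array(to_array(s))` reproduces `s`.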

class myriad.envs.classic.cartpole.physics.PhysicsConfig(gravity=9.8, cart_mass=1.0, pole_mass=0.1, pole_length=0.5, force_magnitude=10.0, dt=0.02)

Bases: object

Static physics constants for the cart-pole system.

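These are compile-time constants: the config is marked static (via static_argnames) when the physics functions are jit-compiled. Changing a value therefore triggers recompilation, but it lets the compiler treat the constants as fixed and optimize accordingly.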

gravity: float = 9.8

cart_mass: float = 1.0

pole_mass: float = 0.1

pole_length: float = 0.5

force_magnitude: float = 10.0

dt: float = 0.02

__init__(gravity=9.8, cart_mass=1.0, pole_mass=0.1, pole_length=0.5, force_magnitude=10.0, dt=0.02)

replace(**updates): Returns a new object replacing the specified fields with new values.
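
As an illustration of how `replace` is typically used, the sketch below defines a local frozen dataclass with the documented fields and defaults and applies the standard-library `dataclasses.replace` function. That the real `PhysicsConfig` behaves like a frozen dataclass is an assumption; the documented API exposes `replace` as a method instead.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PhysicsConfig:
    """Local stand-in mirroring the documented fields and defaults."""
    gravity: float = 9.8
    cart_mass: float = 1.0
    pole_mass: float = 0.1
    pole_length: float = 0.5
    force_magnitude: float = 10.0
    dt: float = 0.02


default = PhysicsConfig()
# replace() builds a new instance; the original is left untouched.
heavy_pole = replace(default, pole_mass=0.5)
```

A frozen dataclass is also hashable, which `jax.jit` requires of any argument marked static. Each distinct config value would then correspond to its own compiled version of a jitted function.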

class myriad.envs.classic.cartpole.physics.PhysicsParams

Bases: object

Dynamic physics parameters for domain randomization.

Currently empty for CartPole, but maintained for protocol consistency and future domain randomization support (e.g., varying masses/lengths per episode).

__init__()

replace(**updates): Returns a new object replacing the specified fields with new values.

myriad.envs.classic.cartpole.physics.step_physics(state, action, params, config)

Pure physics step using Euler integration.

The cart-pole dynamics are based on the equations from Barto, Sutton, and Anderson (1983).

Parameters:

state (PhysicsState) – Current physical state (x, x_dot, theta, theta_dot)
action (Array) – Discrete action in {0, 1} selecting the force direction (0 = push left, 1 = push right)
params (PhysicsParams) – Dynamic parameters (unused, reserved for future randomization)
config (PhysicsConfig) – Static physics constants

Returns:

Next physical state after one dt timestep

Return type:

PhysicsState
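
For readers who want to see what one such step looks like, here is a minimal sketch using the widely used form of the Barto, Sutton, and Anderson cart-pole equations with the documented default constants. It is not the library's code. The helper name `euler_step`, the use of plain floats instead of JAX arrays, the reading of `pole_length` as the pole's half-length, and the explicit Euler update order are all assumptions.

```python
import math
from typing import NamedTuple


class State(NamedTuple):
    """Local stand-in for PhysicsState, using plain floats."""
    x: float
    x_dot: float
    theta: float
    theta_dot: float


def euler_step(s: State, action: int, gravity: float = 9.8,
               cart_mass: float = 1.0, pole_mass: float = 0.1,
               pole_length: float = 0.5, force_magnitude: float = 10.0,
               dt: float = 0.02) -> State:
    """One explicit Euler step (hypothetical helper, not the library's code)."""
    # Action 1 pushes right (+force), action 0 pushes left (-force).
    force = force_magnitude if action == 1 else -force_magnitude
    total_mass = cart_mass + pole_mass
    # ASSUMPTION: pole_length is the half-length of the pole.
    pole_mass_length = pole_mass * pole_length
    sin_t, cos_t = math.sin(s.theta), math.cos(s.theta)

    temp = (force + pole_mass_length * s.theta_dot ** 2 * sin_t) / total_mass
    theta_acc = (gravity * sin_t - cos_t * temp) / (
        pole_length * (4.0 / 3.0 - pole_mass * cos_t ** 2 / total_mass)
    )
    x_acc = temp - pole_mass_length * theta_acc * cos_t / total_mass

    # Explicit Euler: advance every variable using the current derivatives.
    return State(
        x=s.x + dt * s.x_dot,
        x_dot=s.x_dot + dt * x_acc,
        theta=s.theta + dt * s.theta_dot,
        theta_dot=s.theta_dot + dt * theta_acc,
    )


upright = State(0.0, 0.0, 0.0, 0.0)
next_state = euler_step(upright, action=1)
```

Pushing right from rest accelerates the cart to the right and tips the pole to the left, so after one step `x_dot` is positive and `theta_dot` is negative while `x` and `theta` are still zero.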

myriad.envs.classic.cartpole.physics.create_physics_params(**kwargs)

Factory function to create PhysicsParams.

Parameters: **kwargs – Reserved for future domain randomization parameters
Returns: PhysicsParams instance
Return type: PhysicsParams

Control Task

Control task wrapper for CartPole.

Standard balancing task: keep the pole upright for as long as possible. The episode terminates when the pole falls or the cart leaves the track bounds. Reward: +1 per timestep the pole remains balanced.

class myriad.envs.classic.cartpole.tasks.control.ControlTaskState(physics, t)

Bases: NamedTuple

State for the control task.

Variables:

physics (PhysicsState) – The underlying physics state (x, x_dot, theta, theta_dot)
t (Array) – Current timestep counter

physics: PhysicsState – Alias for field number 0

t: Array – Alias for field number 1

class myriad.envs.classic.cartpole.tasks.control.ControlTaskConfig(physics=<factory>, task=<factory>)

Bases: object

Configuration for the CartPole control task.

Composed of physics config and task config for clean separation.

physics: PhysicsConfig

task: TaskConfig

property dt: float – Timestep duration in seconds.

property max_steps: int – Required by the EnvironmentConfig protocol.

__init__(physics=<factory>, task=<factory>)

replace(**updates): Returns a new object replacing the specified fields with new values.

class myriad.envs.classic.cartpole.tasks.control.ControlTaskParams(physics=<factory>)

Bases: object

Parameters for the control task.

physics: PhysicsParams

__init__(physics=<factory>)

replace(**updates): Returns a new object replacing the specified fields with new values.

myriad.envs.classic.cartpole.tasks.control.step(key, state, action, params, config)

Step the control task forward one timestep.

Parameters:

key (Array) – RNG key (unused for deterministic control task, but part of protocol)
state (ControlTaskState) – Current task state
action (Array) – Discrete action {0, 1}
params (ControlTaskParams) – Task parameters
config (ControlTaskConfig) – Task configuration (static)

Returns:

obs_next – Next observation (PhysicsState, since the task is fully observable)
next_state – Next environment state
reward – Reward (+1.0 per step)
done – Termination flag (1.0 if done, 0.0 otherwise)
info – Empty dict (no auxiliary information)

Return type:

tuple (obs_next, next_state, reward, done, info)

myriad.envs.classic.cartpole.tasks.control.reset(key, params, config)

Reset the control task to initial state.

Initializes the pole with small random perturbations around the upright position.

Parameters:

key (Array) – RNG key for random initialization
params (ControlTaskParams) – Task parameters
config (ControlTaskConfig) – Task configuration (static)

Returns:

obs – Initial observation (PhysicsState with named fields)
state – Initial task state

Return type:

tuple (obs, state)
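
The calling convention of `reset` and `step` can be illustrated with a short rollout loop. The sketch below does not import the library: `reset` and `step` are local stubs that only mimic the documented signatures and return tuples, the stub dynamics leave the physics unchanged, and `Config`, `TaskState`, and `rollout` are hypothetical names introduced for illustration. An integer seed stands in for the JAX RNG key.

```python
import random
from typing import NamedTuple


class Config(NamedTuple):
    max_steps: int = 500  # stand-in for ControlTaskConfig.max_steps


class TaskState(NamedTuple):
    physics: tuple  # stand-in for PhysicsState (x, x_dot, theta, theta_dot)
    t: int


def reset(key, params, config):
    # Stub: seed a stdlib RNG with the key and draw small perturbations.
    rng = random.Random(key)
    physics = tuple(rng.uniform(-0.05, 0.05) for _ in range(4))
    return physics, TaskState(physics=physics, t=0)  # (obs, state)


def step(key, state, action, params, config):
    # Stub dynamics: physics stays fixed; only the step counter advances.
    next_state = TaskState(physics=state.physics, t=state.t + 1)
    done = 1.0 if next_state.t >= config.max_steps else 0.0
    return next_state.physics, next_state, 1.0, done, {}


def rollout(key, params, config, policy):
    obs, state = reset(key, params, config)
    total_reward, done = 0.0, 0.0
    while done == 0.0:
        action = policy(obs)
        # With real JAX keys one would split the key before each call.
        obs, state, reward, done, info = step(key, state, action, params, config)
        total_reward += reward
    return total_reward, state.t


# Toy policy: push toward the side the pole leans to (obs[2] is theta).
total, steps = rollout(key=0, params=None, config=Config(),
                       policy=lambda obs: 1 if obs[2] > 0 else 0)
```

Because the stub never lets the pole fall, the episode runs until `max_steps`, collecting +1 per step.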

myriad.envs.classic.cartpole.tasks.control.get_obs(state, params, config)

Extract observation from state.

For control task, observation is the physical state as a NamedTuple with named fields. Neural network agents can call .to_array() for flat array representation.

Parameters:

state (ControlTaskState) – Current task state
params (ControlTaskParams) – Task parameters (unused)
config (ControlTaskConfig) – Task configuration (unused)

Returns:

PhysicsState with named fields (x, x_dot, theta, theta_dot)

Return type:

PhysicsState

myriad.envs.classic.cartpole.tasks.control.get_obs_shape(config)

Get the shape of the observation space.

Parameters: config (ControlTaskConfig) – Task configuration (unused)
Returns: Observation shape tuple
Return type: tuple[int, …]

myriad.envs.classic.cartpole.tasks.control.get_action_space(config)

Get the discrete action space for the environment.

Parameters: config (ControlTaskConfig) – Task configuration (unused)
Returns: Discrete space with 2 actions: 0 (push left) and 1 (push right)

myriad.envs.classic.cartpole.tasks.control.make_env(config=None, params=None, **kwargs)

Create a CartPole control task environment.

Parameters:

config (ControlTaskConfig | None) – Custom ControlTaskConfig. If None, uses defaults.
params (ControlTaskParams | None) – Custom ControlTaskParams. If None, creates from kwargs.
**kwargs – Keyword arguments for creating config/params if not provided.

Returns:

Environment instance for the control task

Return type:

Environment[ControlTaskState, ControlTaskConfig, ControlTaskParams, PhysicsState]

Shared utilities for CartPole task wrappers.

class myriad.envs.classic.cartpole.tasks.base.TaskConfig(max_steps=500, theta_threshold=0.2094395102393195, x_threshold=2.4)

Bases: object

Base configuration shared by all CartPole tasks.

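These define the task-specific termination conditions and episode limits. theta_threshold is in radians (the default, 0.2094395102393195 rad, is 12°) and x_threshold is in meters.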

max_steps: int = 500

theta_threshold: float = 0.2094395102393195

x_threshold: float = 2.4

__init__(max_steps=500, theta_threshold=0.2094395102393195, x_threshold=2.4)

replace(**updates): Returns a new object replacing the specified fields with new values.

myriad.envs.classic.cartpole.tasks.base.check_termination(physics_state, t, task_config)

Common termination check for CartPole tasks.

The episode terminates if any of the following holds:

- Pole angle exceeds threshold (falls over)
- Cart position exceeds threshold (goes off track)
- Maximum timestep reached

Parameters:

physics_state (PhysicsState) – PhysicsState with x and theta fields
t (Array) – Current timestep counter
task_config (TaskConfig) – TaskConfig with thresholds and max_steps

Returns:

done – 1.0 if terminated, 0.0 otherwise (as a float for JAX compatibility)
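
The three termination conditions can be sketched as a single boolean expression. This stand-in uses plain Python floats and takes `x` and `theta` directly rather than a PhysicsState. Whether the library compares with strict or inclusive inequalities is not stated above, so the `>` and `>=` choices below are assumptions.

```python
import math
from typing import NamedTuple


class TaskConfig(NamedTuple):
    """Local stand-in with the documented defaults."""
    max_steps: int = 500
    theta_threshold: float = math.radians(12)  # about 0.2094 rad
    x_threshold: float = 2.4


def check_termination(x: float, theta: float, t: int, cfg: TaskConfig) -> float:
    fell_over = abs(theta) > cfg.theta_threshold   # ASSUMPTION: strict comparison
    off_track = abs(x) > cfg.x_threshold           # ASSUMPTION: strict comparison
    timed_out = t >= cfg.max_steps                 # ASSUMPTION: inclusive comparison
    # Return a float flag, matching the documented 1.0 / 0.0 convention.
    return 1.0 if (fell_over or off_track or timed_out) else 0.0


cfg = TaskConfig()
still_running = check_termination(x=0.0, theta=0.05, t=10, cfg=cfg)
pole_fell = check_termination(x=0.0, theta=0.3, t=10, cfg=cfg)
```

In a jitted JAX function the same logic would be written with array operations rather than Python `or`, since Python control flow on traced values is not allowed.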

myriad.envs.classic.cartpole.tasks.base.get_cartpole_obs(physics_state)

Extract standard CartPole observation from physics state.

For CartPole, the system is fully observable, so observation = state. This eliminates duplication and makes observability explicit.

Parameters: physics_state (PhysicsState) – PhysicsState with x, x_dot, theta, theta_dot fields
Returns: PhysicsState (observation = state for fully observable system)
Return type: PhysicsState

myriad.envs.classic.cartpole.tasks.base.get_cartpole_obs_shape()

Get the shape of the CartPole observation space.

Returns: Observation shape tuple (4,) for [x, x_dot, theta, theta_dot]
Return type: tuple[int, …]

myriad.envs.classic.cartpole.tasks.base.get_cartpole_action_space()

Get the discrete action space for CartPole.

Returns: Discrete space with 2 actions: 0 (push left) and 1 (push right)

myriad.envs.classic.cartpole.tasks.base.sample_initial_physics(key)

Sample initial physics state with small random perturbations.

Initializes the pole with small random perturbations around the upright equilibrium position.

Parameters: key (Array) – RNG key for random initialization
Returns: PhysicsState with small perturbations in range [-0.05, 0.05] for all state variables
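
A minimal sketch of this kind of initialization, assuming the perturbations are drawn independently and uniformly from the stated range (the distribution is not specified above). A seeded standard-library RNG stands in for the JAX key, and the local `PhysicsState` uses plain floats.

```python
import random
from typing import NamedTuple


class PhysicsState(NamedTuple):
    """Local stand-in for the documented PhysicsState."""
    x: float
    x_dot: float
    theta: float
    theta_dot: float


def sample_initial_physics(seed: int) -> PhysicsState:
    # ASSUMPTION: independent uniform draws in [-0.05, 0.05] for each field.
    rng = random.Random(seed)
    return PhysicsState(*(rng.uniform(-0.05, 0.05) for _ in range(4)))


s0 = sample_initial_physics(seed=0)
s0_again = sample_initial_physics(seed=0)
```

As with an explicit JAX key, passing the same seed reproduces the same initial state, which keeps episodes reproducible.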