CartPole
Classic inverted pendulum balancing task. Keep the pole upright by applying left/right forces to the cart.
Physics
Pure, stateless dynamics for the cart-pole system, decoupled from task logic.
This module contains the ground truth dynamics for the cart-pole system, completely decoupled from any task-specific logic (rewards, terminations, observations).
The physics can be reused by different tasks (control, system ID, etc.) and can be directly accessed by model-based methods like MPC planners or Neural ODEs.
- class myriad.envs.classic.cartpole.physics.PhysicsState(x, x_dot, theta, theta_dot)
Bases: NamedTuple
Pure physical state of the cart-pole system.
For CartPole, this is a fully observable system, so PhysicsState serves as both the internal state and the observation. This eliminates duplication and makes observability explicit.
- Variables:
x (Array) – Cart position (m)
x_dot (Array) – Cart velocity (m/s)
theta (Array) – Pole angle from vertical (rad, 0 = upright)
theta_dot (Array) – Pole angular velocity (rad/s)
- x: Array
Alias for field number 0
- x_dot: Array
Alias for field number 1
- theta: Array
Alias for field number 2
- theta_dot: Array
Alias for field number 3
- to_array()
Convert to flat array for NN-based agents.
- Returns:
Array of shape (4,) with [x, x_dot, theta, theta_dot]
- Return type:
Array
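The NamedTuple-plus-to_array() pattern described above can be sketched in a few lines. This is a minimal stand-in, not the library's source: it uses plain Python floats and a list where the real class holds JAX arrays, and the body of to_array is an assumption based only on the documented output order.

```python
from typing import NamedTuple


class PhysicsState(NamedTuple):
    """Stand-in for the documented class, using floats instead of JAX arrays."""

    x: float          # cart position (m)
    x_dot: float      # cart velocity (m/s)
    theta: float      # pole angle from vertical (rad, 0 = upright)
    theta_dot: float  # pole angular velocity (rad/s)

    def to_array(self) -> list:
        # Assumed behaviour: flatten the fields in declaration order.
        return [self.x, self.x_dot, self.theta, self.theta_dot]


state = PhysicsState(x=0.0, x_dot=0.1, theta=0.02, theta_dot=-0.3)
flat = state.to_array()
```

Named access (state.theta) and positional access (state[2]) refer to the same field, which is what the "Alias for field number" entries describe.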
- class myriad.envs.classic.cartpole.physics.PhysicsConfig(gravity=9.8, cart_mass=1.0, pole_mass=0.1, pole_length=0.5, force_magnitude=10.0, dt=0.02)
Bases: object
Static physics constants for the cart-pole system.
These are compile-time constants: the config is passed to jit as a static argument (via static_argnames). Changing a value triggers recompilation, but it lets the compiler specialize on the constants.
- __init__(gravity=9.8, cart_mass=1.0, pole_mass=0.1, pole_length=0.5, force_magnitude=10.0, dt=0.02)
- replace(**updates)
Returns a new object replacing the specified fields with new values.
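An immutable config with a replace method can be approximated with the standard library. The sketch below is a stand-in under stated assumptions (a frozen dataclasses.dataclass rather than whatever decorator the library actually uses); it only illustrates that replace returns a new object and leaves the original untouched.

```python
import dataclasses


@dataclasses.dataclass(frozen=True)
class PhysicsConfig:
    """Stand-in mirroring the documented defaults."""

    gravity: float = 9.8
    cart_mass: float = 1.0
    pole_mass: float = 0.1
    pole_length: float = 0.5
    force_magnitude: float = 10.0
    dt: float = 0.02

    def replace(self, **updates):
        # Build a new instance; the frozen original is never mutated.
        return dataclasses.replace(self, **updates)


earth = PhysicsConfig()
moon = earth.replace(gravity=1.62)  # hypothetical low-gravity variant
```

A frozen dataclass is also hashable, which matters here because jit requires static arguments to be hashable.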
- class myriad.envs.classic.cartpole.physics.PhysicsParams
Bases: object
Dynamic physics parameters for domain randomization.
Currently empty for CartPole, but maintained for protocol consistency and future domain randomization support (e.g., varying masses/lengths per episode).
- __init__()
- replace(**updates)
Returns a new object replacing the specified fields with new values.
- myriad.envs.classic.cartpole.physics.step_physics(state, action, params, config)
Pure physics step using Euler integration.
The cart-pole dynamics are based on the equations from Barto, Sutton, and Anderson (1983).
- Parameters:
state (PhysicsState) – Current physical state (x, x_dot, theta, theta_dot)
action (Array) – Discrete action {0, 1} giving the force direction (0 = push left, 1 = push right)
params (PhysicsParams) – Dynamic parameters (unused, reserved for future randomization)
config (PhysicsConfig) – Static physics constants
- Returns:
Next physical state after one dt timestep
- Return type:
PhysicsState
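One Euler step of these dynamics can be sketched as follows. The equations are the formulation used by the classic Gym CartPole environment; whether step_physics uses exactly these expressions is an assumption. The sketch also assumes explicit (forward) Euler, that pole_length is the half-length of the pole, and that action 1 pushes right. It uses plain floats and math in place of JAX arrays, and the names State and euler_step are hypothetical.

```python
import math
from typing import NamedTuple


class State(NamedTuple):
    x: float
    x_dot: float
    theta: float
    theta_dot: float


def euler_step(s, action, gravity=9.8, cart_mass=1.0, pole_mass=0.1,
               pole_length=0.5, force_magnitude=10.0, dt=0.02):
    # Assumption: action 1 pushes right (+force), action 0 pushes left.
    force = force_magnitude if action == 1 else -force_magnitude
    total_mass = cart_mass + pole_mass
    pole_mass_length = pole_mass * pole_length  # pole_length assumed half-length

    sin_t, cos_t = math.sin(s.theta), math.cos(s.theta)
    temp = (force + pole_mass_length * s.theta_dot ** 2 * sin_t) / total_mass
    theta_acc = (gravity * sin_t - cos_t * temp) / (
        pole_length * (4.0 / 3.0 - pole_mass * cos_t ** 2 / total_mass)
    )
    x_acc = temp - pole_mass_length * theta_acc * cos_t / total_mass

    # Explicit Euler: positions advance with the old velocities.
    return State(
        x=s.x + dt * s.x_dot,
        x_dot=s.x_dot + dt * x_acc,
        theta=s.theta + dt * s.theta_dot,
        theta_dot=s.theta_dot + dt * theta_acc,
    )


upright = State(0.0, 0.0, 0.0, 0.0)
pushed_right = euler_step(upright, 1)
pushed_left = euler_step(upright, 0)
```

Starting from rest, a push to the right gives the cart a positive velocity and the pole a negative angular velocity; a push to the left mirrors both. Positions are unchanged after the first step because explicit Euler advances them with the old (zero) velocities.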
Control Task
Control task wrapper for CartPole.
This is the standard balancing task: keep the pole upright for as long as possible. The episode terminates when the pole falls or the cart leaves the track bounds. Reward: +1 per timestep the pole remains balanced.
- class myriad.envs.classic.cartpole.tasks.control.ControlTaskState(physics, t)
Bases: NamedTuple
State for the control task.
- Variables:
physics (PhysicsState) – The underlying physics state (x, x_dot, theta, theta_dot)
t (Array) – Current timestep counter
- physics: PhysicsState
Alias for field number 0
- t: Array
Alias for field number 1
- class myriad.envs.classic.cartpole.tasks.control.ControlTaskConfig(physics=<factory>, task=<factory>)
Bases: object
Configuration for the CartPole control task.
Composed of physics config and task config for clean separation.
- physics: PhysicsConfig
- task: TaskConfig
- __init__(physics=<factory>, task=<factory>)
- replace(**updates)
Returns a new object replacing the specified fields with new values.
- class myriad.envs.classic.cartpole.tasks.control.ControlTaskParams(physics=<factory>)
Bases: object
Parameters for the control task.
- physics: PhysicsParams
- __init__(physics=<factory>)
- replace(**updates)
Returns a new object replacing the specified fields with new values.
- myriad.envs.classic.cartpole.tasks.control.step(key, state, action, params, config)
Step the control task forward one timestep.
- Parameters:
key (Array) – RNG key (unused for deterministic control task, but part of protocol)
state (ControlTaskState) – Current task state
action (Array) – Discrete action {0, 1}
params (ControlTaskParams) – Task parameters
config (ControlTaskConfig) – Task configuration (static)
- Returns:
obs_next: Next observation (PhysicsState, since the system is fully observable)
next_state: Next environment state
reward: Reward (+1.0 per step)
done: Termination flag (1.0 if done, 0.0 otherwise)
info: Empty dict (no auxiliary information)
- myriad.envs.classic.cartpole.tasks.control.reset(key, params, config)
Reset the control task to initial state.
Initializes the pole with small random perturbations around the upright position.
- Parameters:
key (Array) – RNG key for random initialization
params (ControlTaskParams) – Task parameters
config (ControlTaskConfig) – Task configuration (static)
- Returns:
obs: Initial observation (PhysicsState with named fields)
state: Initial task state
- myriad.envs.classic.cartpole.tasks.control.get_obs(state, params, config)
Extract observation from state.
For control task, observation is the physical state as a NamedTuple with named fields. Neural network agents can call .to_array() for flat array representation.
- Parameters:
state (ControlTaskState) – Current task state
params (ControlTaskParams) – Task parameters (unused)
config (ControlTaskConfig) – Task configuration (unused)
- Returns:
PhysicsState with named fields (x, x_dot, theta, theta_dot)
- Return type:
PhysicsState
- myriad.envs.classic.cartpole.tasks.control.get_obs_shape(config)
Get the shape of the observation space.
- Parameters:
config (ControlTaskConfig) – Task configuration (unused)
- Returns:
Observation shape tuple
- Return type:
- myriad.envs.classic.cartpole.tasks.control.get_action_space(config)
Get the discrete action space for the environment.
- Parameters:
config (ControlTaskConfig) – Task configuration (unused)
- Returns:
Discrete space with 2 actions: 0 (push left) and 1 (push right)
- myriad.envs.classic.cartpole.tasks.control.make_env(config=None, params=None, **kwargs)
Create a CartPole control task environment.
- Parameters:
config (ControlTaskConfig | None) – Custom ControlTaskConfig. If None, uses defaults.
params (ControlTaskParams | None) – Custom ControlTaskParams. If None, creates from kwargs.
**kwargs – Keyword arguments for creating config/params if not provided.
- Returns:
Environment instance for the control task
- Return type:
Environment[ControlTaskState, ControlTaskConfig, ControlTaskParams, PhysicsState]
Shared utilities for CartPole task wrappers.
- class myriad.envs.classic.cartpole.tasks.base.TaskConfig(max_steps=500, theta_threshold=0.2094395102393195, x_threshold=2.4)
Bases: object
Base configuration shared by all CartPole tasks.
These fields define the task-specific termination conditions and episode limits. The default theta_threshold of 0.2094395102393195 rad is 12 degrees.
- __init__(max_steps=500, theta_threshold=0.2094395102393195, x_threshold=2.4)
- replace(**updates)
Returns a new object replacing the specified fields with new values.
- myriad.envs.classic.cartpole.tasks.base.check_termination(physics_state, t, task_config)
Common termination check for CartPole tasks.
The episode terminates if:
  - Pole angle exceeds threshold (falls over)
  - Cart position exceeds threshold (goes off track)
  - Maximum timestep reached
- Parameters:
physics_state (PhysicsState) – PhysicsState with x and theta fields
t (Array) – Current timestep counter
task_config (TaskConfig) – TaskConfig with thresholds and max_steps
- Returns:
done: 1.0 if terminated, 0.0 otherwise (as float for JAX compatibility)
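The three termination conditions can be sketched in plain Python. The comparison operators are assumptions (strict > for the two thresholds and >= for the step limit, following common CartPole implementations); the documentation does not state them. The function name is hypothetical, and the real function operates on JAX arrays rather than floats.

```python
def check_termination_sketch(x, theta, t, max_steps=500,
                             theta_threshold=0.2094395102393195,
                             x_threshold=2.4):
    pole_fell = abs(theta) > theta_threshold   # assumed strict inequality
    off_track = abs(x) > x_threshold           # assumed strict inequality
    timed_out = t >= max_steps                 # assumed inclusive limit
    # Return a float flag, matching the documented 1.0 / 0.0 convention.
    return 1.0 if (pole_fell or off_track or timed_out) else 0.0


balanced = check_termination_sketch(x=0.0, theta=0.0, t=10)
fallen = check_termination_sketch(x=0.0, theta=0.3, t=10)
out_of_time = check_termination_sketch(x=0.0, theta=0.0, t=500)
```

In JAX the same logic would typically combine the three boolean arrays with a logical OR and cast the result to float, so that it stays traceable under jit.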
- myriad.envs.classic.cartpole.tasks.base.get_cartpole_obs(physics_state)
Extract standard CartPole observation from physics state.
For CartPole, the system is fully observable, so observation = state. This eliminates duplication and makes observability explicit.
- Parameters:
physics_state (PhysicsState) – PhysicsState with x, x_dot, theta, theta_dot fields
- Returns:
PhysicsState (observation = state for fully observable system)
- Return type:
PhysicsState
- myriad.envs.classic.cartpole.tasks.base.get_cartpole_obs_shape()
Get the shape of the CartPole observation space.
- myriad.envs.classic.cartpole.tasks.base.get_cartpole_action_space()
Get the discrete action space for CartPole.
- Returns:
Discrete space with 2 actions: 0 (push left) and 1 (push right)
- myriad.envs.classic.cartpole.tasks.base.sample_initial_physics(key)
Sample initial physics state with small random perturbations.
Initializes the pole with small random perturbations around the upright equilibrium position.
- Parameters:
key (Array) – RNG key for random initialization
- Returns:
PhysicsState with small perturbations in range [-0.05, 0.05] for all state variables
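A sketch of the sampling step, assuming each of the four state variables is drawn independently and uniformly from [-0.05, 0.05]. The distribution is not stated in the documentation; uniform matches the classic Gym CartPole reset. Python's random.Random stands in for a JAX RNG key, and the function name is hypothetical.

```python
import random


def sample_initial_physics_sketch(seed):
    rng = random.Random(seed)  # stand-in for a JAX PRNG key
    # Order follows PhysicsState: x, x_dot, theta, theta_dot.
    return tuple(rng.uniform(-0.05, 0.05) for _ in range(4))


initial = sample_initial_physics_sketch(seed=0)
same_again = sample_initial_physics_sketch(seed=0)
```

As with a JAX key, the same seed reproduces the same initial state, which keeps resets deterministic and repeatable.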