Isaac Lab: Understanding the “Observation” and “Step” Concepts

YT:

Hi, thank you for this video! I have a question: does the observations tensor include only the environments' joint positions, or does it include more than that? And do the positions represent a single step of the simulation, or everything from the start until the reset to the initial position? I don't really understand the "step" concept.

Answer:

Observations Tensor & Step Process in Isaac Lab

Observations Tensor

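  • Contents: more than just joint positions. Depending on the task, the tensor (one row per environment) can include: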
    • Joint positions and velocities
    • Root position and orientation
    • Linear and angular velocities
    • Distance to targets
    • Task-specific information
  • For example, the Leatherback (Community Project) robot's observations include:
    • Distance error to target
    • Heading information (using sin/cos)
    • Robot velocity
    • Current throttle and steering states
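
As a rough illustration of how such an observation could be packed, here is a minimal sketch in plain Python. The function name `build_observation`, the argument names, and the ordering of the eight values are assumptions made for this example; they are not taken from the Leatherback project or the Isaac Lab API, where observations would be batched tensors.

```python
import math

def build_observation(pos_error, heading_error, lin_vel_xy, yaw_rate, throttle, steering):
    """Pack one environment's state into a flat vector (hypothetical layout)."""
    return [
        pos_error,                # distance error to target
        math.cos(heading_error),  # heading encoded as cos/sin to avoid the jump at +/- pi
        math.sin(heading_error),
        lin_vel_xy[0],            # forward velocity
        lin_vel_xy[1],            # lateral velocity
        yaw_rate,                 # angular velocity about the vertical axis
        throttle,                 # current throttle state
        steering,                 # current steering state
    ]

# One row per parallel environment (two environments here).
observations = [
    build_observation(2.0, 0.0, (1.0, 0.0), 0.0, 0.5, 0.0),
    build_observation(0.5, math.pi / 2, (0.2, 0.1), 0.3, 0.1, -0.4),
]
```

Each row describes a single moment in one environment, so the full tensor has shape (number of environments, observation size).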

How Steps Work in Isaac Lab

One Step Cycle includes

  1. Pre-Physics:
    • The environment processes actions received from your agent.
  2. Physics Simulation:
    • Isaac Sim performs one or more physics updates.
    • Decimation:
      • Controls how many physics steps occur for each RL step.
      • 💡

        An RL step is one cycle of decision making: the agent receives the current observations, selects and outputs an action, and after several physics simulation updates (as determined by decimation), it receives new observations and rewards to make the next decision.

      • Example: With a simulation timestep of 0.01s and a policy timestep of 0.1s, decimation is 10 (i.e., control actions are updated every 10 simulation steps).
  3. Post-Physics:
    • The environment:
      • Checks if episodes are done.
      • Calculates rewards.
      • Resets environments if needed.
      • Gathers the current observations (representing the state after the most recent physics step).

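Timeline Example with the Leatherback Robot

The timestamps below correspond to a physics timestep of 1/60 s (about 16.7 ms) and a decimation of 4, so one RL step spans four physics steps.
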
  • t = 0ms:
    • Neural network outputs steering and throttle actions.
    • Physics Step 1: Actions applied; wheels start turning.
  • t = 16.7ms:
    • Physics Step 2: Same actions continue; robot moves.
  • t = 33.3ms:
    • Physics Step 3: Same actions continue; robot moves further.
  • t = 50.0ms:
    • Physics Step 4: Same actions continue; this is the last physics step of the RL step.
  • t = 66.7ms:
    • RL step completes; new observations are gathered and new actions are computed.
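
The timestamps in this timeline can be reproduced with a few lines of arithmetic. The values `sim_dt = 1/60` and `decimation = 4` are inferred from the timeline itself, and the helper `rl_step_timeline` is a hypothetical name used only for this sketch.

```python
sim_dt = 1 / 60   # physics timestep in seconds (inferred from the 16.7 ms spacing)
decimation = 4    # physics steps per RL step (inferred from the four physics steps)

def rl_step_timeline(sim_dt, decimation):
    """Return (physics_step_number, start_time_ms) pairs and the RL step end time in ms."""
    starts = [(k + 1, round(k * sim_dt * 1000, 1)) for k in range(decimation)]
    end_ms = round(decimation * sim_dt * 1000, 1)
    return starts, end_ms

starts, end_ms = rl_step_timeline(sim_dt, decimation)
for number, start_ms in starts:
    print(f"t = {start_ms} ms: physics step {number} starts")
print(f"t = {end_ms} ms: RL step completes")
```

With these values, the policy acts about 15 times per second while the physics runs at 60 Hz.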
💡

The observations tensor is a snapshot, not a history of the episode: it always represents the current state after the most recent physics step, giving the agent the data it needs to decide its next action.