feat(envs): add LIBERO-plus robustness benchmark #3313

Open
pkooij wants to merge 1 commit into feat/benchmark-ci from feat/libero-plus-benchmark

Conversation

Member

@pkooij pkooij commented Apr 8, 2026

Title

feat(envs): add LIBERO-plus robustness benchmark

Type / Scope

  • Type: Feature
  • Scope: src/lerobot/envs/, pyproject.toml, docker/, docs/

Summary / Motivation

LIBERO-plus is a robustness benchmark for VLA models that extends LIBERO with 7 perturbation dimensions (camera viewpoints, object layouts, robot initial states, language instructions, lighting, background textures, sensor noise), producing ~10 000 task variants across the standard LIBERO suites.

Because LIBERO-plus keeps the same Python gym interface as the original LIBERO, the integration is minimal: a one-line config subclass, an import fallback for the different package nesting, and a new pip extras group.

Related issues

  • Related: LIBERO integration (already in feat/async-vector-env)

What changed

  • src/lerobot/envs/libero.py — wraps the top-level LIBERO imports in try/except to handle the extra module nesting level that LIBERO-plus ships with
  • src/lerobot/envs/configs.py — adds LiberoPlusEnv config (@EnvConfig.register_subclass("libero_plus")), a thin subclass of LiberoEnv with task="libero_spatial" as default; fully inherits create_envs and get_env_processors
  • pyproject.toml — adds the libero_plus optional dependency group and includes it in the all extras group
  • docs/source/libero_plus.mdx — new benchmark doc: perturbation dimensions, task suites, install instructions, eval commands, camera name mapping, dataset reference
  • docs/source/_toctree.yml — registers new doc page
  • docker/Dockerfile.benchmark.libero_plus — isolated CI image (adds libexpat1 libfontconfig1-dev libmagickwand-dev system deps required by LIBERO-plus)
  • .github/workflows/benchmark_tests.yml — adds libero-plus-integration-test job (build image + 1-episode smoke eval on aws-g6-4xlarge-plus)

No breaking changes. env.type=libero continues to work unchanged.
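
As an illustration of the config registration described above, here is a minimal, self-contained sketch. It does not import lerobot: the EnvConfig class below is a hypothetical stand-in that models only a name-to-class registry, and the LiberoEnv default task is a placeholder rather than a value taken from this PR.

```python
from dataclasses import dataclass
from typing import Callable, ClassVar


@dataclass
class EnvConfig:
    """Hypothetical stand-in for the real EnvConfig; only the registry is modelled."""

    # Shared registry mapping an env type name to its config class.
    _registry: ClassVar[dict] = {}

    @classmethod
    def register_subclass(cls, name: str) -> Callable[[type], type]:
        def decorator(subclass: type) -> type:
            cls._registry[name] = subclass
            return subclass

        return decorator

    @classmethod
    def get_choice_class(cls, name: str) -> type:
        return cls._registry[name]

    @property
    def type(self) -> str:
        # Reverse lookup: find the name this exact class was registered under.
        for name, klass in self._registry.items():
            if klass is self.__class__:
                return name
        raise KeyError(f"{self.__class__.__name__} is not registered")


@EnvConfig.register_subclass("libero")
@dataclass
class LiberoEnv(EnvConfig):
    task: str = "libero_10"  # placeholder default, not taken from the PR


@EnvConfig.register_subclass("libero_plus")
@dataclass
class LiberoPlusEnv(LiberoEnv):
    # Only the registered name and the default suite differ from LiberoEnv.
    task: str = "libero_spatial"


cfg = EnvConfig.get_choice_class("libero_plus")()
print(cfg.type, cfg.task)  # libero_plus libero_spatial
```

Because the subclass overrides a single field, any method defined on the parent config is inherited unchanged, which is the property the PR relies on for create_envs and get_env_processors.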

Dataset note

pepijn223/libero_plus_lerobot is already in LeRobot v3.0 format, so no conversion is needed.
The dataset card (README) is missing on the Hub and should be added in a follow-up.
Camera keys: observation.images.front / observation.images.wrist.

How was this tested (or how to run locally)

  • pre-commit run -a passes on all changed files
  • Registration verified locally via PYTHONPATH override:
    from lerobot.envs.configs import EnvConfig
    cfg = EnvConfig.get_choice_class('libero_plus')()
    # → type=libero_plus, task=libero_spatial
    
  • Full eval smoke-test (requires Linux + GPU + LIBERO-plus installed):
    lerobot-eval \
      --policy.path=pepijn223/smolvla_libero \
      --env.type=libero_plus \
      --env.task=libero_spatial \
      --eval.batch_size=1 --eval.n_episodes=1 \
      --eval.use_async_envs=false --policy.device=cuda \
      '--env.camera_name_mapping={"agentview_image": "camera1", "robot0_eye_in_hand_image": "camera2"}' \
      --policy.empty_cameras=1
    Runs automatically via the new CI job on aws-g6-4xlarge-plus.
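
The --env.camera_name_mapping argument in the command above is a JSON object that maps raw simulator camera names to the names the policy expects. The sketch below shows only that renaming idea on a plain dictionary. The rename_cameras helper is hypothetical and is not lerobot's implementation; the string values stand in for image arrays.

```python
import json


def rename_cameras(raw_obs: dict, mapping: dict) -> dict:
    """Return a copy of raw_obs with camera keys renamed according to mapping.

    Keys that do not appear in mapping are kept unchanged.
    """
    return {mapping.get(key, key): value for key, value in raw_obs.items()}


# The same JSON string that is passed on the command line.
mapping = json.loads(
    '{"agentview_image": "camera1", "robot0_eye_in_hand_image": "camera2"}'
)

# Placeholder observation; real values would be image arrays.
raw_obs = {
    "agentview_image": "front-frame",
    "robot0_eye_in_hand_image": "wrist-frame",
    "robot0_joint_pos": "joint-state",
}

renamed = rename_cameras(raw_obs, mapping)
print(sorted(renamed))  # ['camera1', 'camera2', 'robot0_joint_pos']
```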

Checklist (required before merge)

  • Linting/formatting run (pre-commit run -a)
  • All tests pass locally (pytest) — LIBERO-plus requires Linux, validated via CI
  • Documentation updated (docs/source/libero_plus.mdx)
  • CI is green

Reviewer notes

  • The import fallback in libero.py is the only change touching existing code paths. The try branch runs for hf-libero; the except branch for LIBERO-plus. Transparent to callers.
  • LiberoPlusEnv is intentionally a minimal subclass — no duplicated logic.
  • The Docker image uses uv sync --extra libero_plus --no-cache (no --locked) because the GitHub-sourced package is not in uv.lock. Pin a commit SHA in the dep once LIBERO-plus stabilizes.
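
The import fallback mentioned in the first note can be sketched generically as follows. This is an assumption-laden illustration, not the code in libero.py: the PR does not show the actual module paths, so the helper import_first_available is hypothetical and the example uses a deliberately missing module name plus the standard-library json module as placeholders.

```python
import importlib
from types import ModuleType
from typing import Sequence


def import_first_available(candidates: Sequence[str]) -> ModuleType:
    """Import and return the first module in candidates that can be imported.

    Mirrors a try/except ImportError fallback: later names are tried only
    when earlier ones fail to import.
    """
    last_error = None
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError as err:
            last_error = err
    raise ImportError(f"none of {list(candidates)} could be imported") from last_error


# Placeholder names: the first does not exist, so the helper falls back to json.
module = import_first_available(["no_such_module_abc123", "json"])
print(module.__name__)  # json
```

Catching only ImportError keeps unrelated failures inside a successfully located package visible instead of silently switching to the fallback path.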

@pkooij pkooij force-pushed the feat/async-vector-env branch 2 times, most recently from 35f18d4 to 566a77b Compare April 8, 2026 17:05
@pkooij pkooij changed the base branch from feat/async-vector-env to feat/benchmark-ci April 9, 2026 08:04
@pkooij pkooij force-pushed the feat/libero-plus-benchmark branch from 3f678a5 to cdd540d Compare April 9, 2026 08:07
- Add import fallback in libero.py for LIBERO-plus nested package structure
  (github.com/sylvestf/LIBERO-plus installs under a deeper module path than
  the original hf-libero wheel)
- Register LiberoPlusEnv config subclass (inherits LiberoEnv fully; only
  the env type name and default suite differ)
- Add libero_plus optional dep group in pyproject.toml pointing to the
  LIBERO-plus GitHub repo
- Add docs/source/libero_plus.mdx with install guide, task suite table,
  perturbation dimensions, eval commands, and dataset reference
- Add docker/Dockerfile.benchmark.libero_plus for isolated CI image
- Add libero-plus-integration-test CI job to benchmark_tests.yml

Dataset: pepijn223/libero_plus_lerobot is already v3.0 (no conversion needed).
Dataset card is missing and should be added separately on the Hub.

Eval smoke-test (requires Linux + GPU):
  lerobot-eval \
    --policy.path=pepijn223/smolvla_libero \
    --env.type=libero_plus \
    --env.task=libero_spatial \
    --eval.batch_size=1 --eval.n_episodes=1 \
    --eval.use_async_envs=false --policy.device=cuda \
    '--env.camera_name_mapping={"agentview_image":"camera1","robot0_eye_in_hand_image":"camera2"}' \
    --policy.empty_cameras=1

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@pkooij pkooij force-pushed the feat/libero-plus-benchmark branch from cdd540d to 39bb3a3 Compare April 9, 2026 10:51