
feat(envs): lazy env init + AsyncVectorEnv as default for n_envs > 1 (#3274)

Merged
pkooij merged 44 commits into main from feat/async-vector-env on Apr 9, 2026

Conversation

pkooij (Member) commented Apr 3, 2026

Summary

LiberoEnv and MetaWorldEnv eagerly allocated GPU EGL contexts in __init__, making AsyncVectorEnv unusable (child processes inherit stale GPU handles → EGL_BAD_CONTEXT). All environments were also created upfront, causing OOM on multi-suite evaluations.

This PR:

  1. Defers GPU allocation to _ensure_env(), called on first reset()/step() inside worker subprocesses
  2. Adds _LazyAsyncVectorEnv — only one task's workers are alive at a time, preventing OOM
  3. Switches default to AsyncVectorEnv for parallel env stepping
  4. Fixes task descriptions for VLM policies (env.call("task_description") instead of broken add_envs_task)
  5. Auto-tunes batch_size based on available CPU cores (batch_size=0)
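The lazy-init pattern in (1) can be sketched as follows. This is a minimal illustration, not the actual LiberoEnv code; the class and field names here are hypothetical:

```python
class LazyRenderEnv:
    """Minimal sketch of deferred GPU allocation: __init__ stores only
    configuration, and the heavyweight renderer is built on first use,
    inside the worker process that will actually render."""

    def __init__(self, task_description: str):
        self.task_description = task_description
        self._env = None  # no EGL context / renderer allocated yet

    def _ensure_env(self):
        # In the real code this is where the rendering env would be
        # constructed, after AsyncVectorEnv has forked the worker.
        if self._env is None:
            self._env = {"task": self.task_description}
        return self._env

    def reset(self):
        return self._ensure_env()["task"]
```

Because nothing GPU-related happens in `__init__`, forked workers never inherit a stale context from the parent process.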

Benchmarks

All runs with pepijn223/smolvla_libero, single GPU.

libero_spatial (10 tasks, batch_size=10, n_episodes=10 → 100 rollouts)

| Branch | Wall time | GPU util | GPU mem |
|---|---|---|---|
| refactor/benchmark-dispatch | 396 s | 0–8% | ~2 GB |
| feat/async-vector-env | 189 s | 0–99% | ~10 GB |

→ 2.1× speedup

Full LIBERO (4 suites, 40 tasks, n_episodes=10 → 400 rollouts)

| Branch | batch_size | Wall time | GPU mem |
|---|---|---|---|
| refactor/benchmark-dispatch | 1 (10 OOMs) | 1475 s | ~22 GB |
| feat/async-vector-env | 10 | 996 s | ~10 GB |

→ 1.5× faster, half the GPU memory

Related

What changed

  • libero.py: lazy _ensure_env() + _LazyAsyncVectorEnv wrapper
  • metaworld.py: same lazy init pattern
  • configs.py: default use_async_envs=True, auto-downgrade to sync when n_envs=1
  • default.py: batch_size=0 (auto-tune), use_async_envs=True
  • lerobot_eval.py: env.call("task_description") fix, env.close() between tasks
  • utils.py: _get_sub_env_attr / _sub_env_has_attr for async-compatible attribute access

Tests

  • test_libero_lazy_init / test_metaworld_lazy_init
  • test_async_vector_env_libero / test_async_vector_env_metaworld
  • test_add_envs_task_async
  • test_single_env_uses_sync

s1lent4gnt (Member) previously approved these changes Apr 8, 2026

LGTM!

Base automatically changed from refactor/benchmark-dispatch to main April 8, 2026 15:49
@pkooij pkooij dismissed s1lent4gnt’s stale review April 8, 2026 15:49

The base branch was changed.

pkooij and others added 15 commits April 8, 2026 18:28
…chmark docs

Add a comprehensive guide for adding new benchmarks to LeRobot, and
refactor the existing LIBERO and Meta-World docs to follow the new
standardized template.

Made-with: Cursor
…asses

Replace hardcoded if/elif chains in factory.py with create_envs() and
get_env_processors() methods on EnvConfig. New benchmarks now only need
to register a config subclass — no factory.py edits required.

Net -23 lines: factory.py shrinks from ~200 to ~70 lines of logic.

Made-with: Cursor
Rewrite for simpler language, better structure, and easier navigation.
Move quick-reference table to the top, fold eval explanation into
architecture section, condense the doc template to a bulleted outline.

Made-with: Cursor
- Thread camera_name_mapping from LiberoEnv config through to gym envs
- Sync features_map with camera_name_mapping in LiberoEnv.__post_init__
- Fix render() to use first available camera instead of hardcoded "image"
- Handle non-dict final_info in rollout by falling back to info["is_success"]
- Add use_peft legacy field to SmolVLAConfig for checkpoint compat
- Add defaults to GR00TN15Config init=False fields for transformers 5.3

Made-with: Cursor
- Revert GR00T N1.5 default_factory/default changes (transformers compat)
- Revert SmolVLA use_peft legacy field
- Apply ruff formatting fixes
- camera_name_mapping stays entirely in env/eval layer (no policy changes)

Made-with: Cursor
Co-authored-by: Khalil Meftah <khalil.meftah@huggingface.co>
Signed-off-by: Pepijn <138571049+pkooij@users.noreply.github.com>
LiberoEnv and MetaworldEnv previously allocated GPU resources (EGL context,
OpenGL framebuffer) in __init__, before AsyncVectorEnv's fork(). Worker
processes inherited stale GPU handles, causing EGL_BAD_CONTEXT crashes on
first render.

Fix: defer OffScreenRenderEnv / MT1 construction to _ensure_env(), called on
first reset() or step() inside the worker subprocess. Each worker creates its
own clean context after fork().

Also fixes lerobot_eval.py:170 (add_envs_task TODO): replace with
env.call("task") which works with both SyncVectorEnv and AsyncVectorEnv.

AsyncVectorEnv is now the default for n_envs > 1; auto-downgraded to
SyncVectorEnv when n_envs=1 (no benefit, less overhead).

Expected speedup: ~15-20x for LIBERO Spatial with batch_size=50.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
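The default-selection rule described in this commit (AsyncVectorEnv for n_envs > 1, auto-downgrade to SyncVectorEnv when n_envs=1) amounts to something like the following; the function name is hypothetical:

```python
def choose_vectorizer(n_envs: int, use_async: bool = True) -> str:
    """Pick the vector env class: async stepping only pays off with
    multiple envs, so a single env always runs sync (less overhead,
    no worker-process machinery)."""
    if n_envs == 1 or not use_async:
        return "sync"
    return "async"
```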
eval_policy_all never closed environments after each task completed,
causing AsyncVectorEnv worker processes to accumulate (N_tasks × n_envs).
This led to OOM, BrokenPipeError and EOFError on multi-task benchmarks.

Also fixes:
- AsyncVectorEnv compat in envs/utils.py (use get_attr/call instead of .envs)
- Tuple task handling in tokenizer_processor and lerobot_eval
- _LazyAsyncVectorEnv for deferred worker spawning in LIBERO

Made-with: Cursor
…ning

env.call("task") returns the LIBERO task name with underscores
(e.g. "pick_up_the_black_bowl_...") instead of the natural language
description ("pick up the black bowl ..."). The VLM tokenizes these
completely differently, causing 0.0 reward across all episodes.

Made-with: Cursor
- Replace add_envs_task reference with env.call("task_description")
- Update use_async_envs default to True
- Add note about lazy GPU init for AsyncVectorEnv compatibility

Made-with: Cursor
- batch_size=0 (default) auto-tunes based on CPU cores, capped by
  n_episodes and 64. Removes the need for users to guess the right
  value. The old batch_size > n_episodes error is replaced by silently
  clamping to n_episodes.
- _LazyAsyncVectorEnv accepts pre-computed spaces so only one temp env
  is created per suite (not per task). For libero_spatial (10 tasks)
  this avoids 9 redundant LiberoEnv instantiations during env setup.

Made-with: Cursor
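The auto-tune rule described above can be sketched as follows (function name hypothetical; the caps come from the commit message):

```python
import os

def resolve_batch_size(batch_size: int, n_episodes: int, cap: int = 64) -> int:
    """batch_size=0 means auto-tune: use the CPU core count, capped by
    n_episodes and a hard ceiling of 64. Explicit values larger than
    n_episodes are silently clamped rather than raising an error."""
    if batch_size == 0:
        batch_size = min(os.cpu_count() or 1, n_episodes, cap)
    return min(batch_size, n_episodes)
```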
pkooij and others added 7 commits April 8, 2026 18:29
__del__ is unreliable as a cleanup mechanism. close() is already called
explicitly in the eval loop's finally block, so the finalizer is redundant.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ry overlap

Previously, next task's AsyncVectorEnv workers were spawned while the
current task was still running, causing both tasks' GPU contexts to coexist.
Moving the prefetch start into the finally block (after env.close()) ensures
workers for task N+1 only spin up once task N has released GPU memory.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
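The close-before-prefetch ordering can be illustrated with a skeleton of the eval loop. Names are hypothetical and the real code prefetches workers asynchronously; this sketch only shows the ordering guarantee:

```python
def eval_tasks(tasks, make_env, run_task):
    """Run each task with its own vector env, ensuring task N's env is
    closed (GPU memory released) before task N+1's workers are spawned."""
    results = []
    if not tasks:
        return results
    env = make_env(tasks[0])
    for i, task in enumerate(tasks):
        try:
            results.append(run_task(env, task))
        finally:
            env.close()  # release task N's GPU contexts first...
            if i + 1 < len(tasks):
                env = make_env(tasks[i + 1])  # ...then start task N+1
    return results
```

At no point do two tasks' worker sets hold GPU contexts simultaneously, which is what caused the memory overlap this commit fixes.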
_LazyAsyncVectorEnv lived in libero.py but metaworld had the same OOM
problem: all tasks' AsyncVectorEnv workers were spawned eagerly, wasting
GPU memory for tasks not yet running.

Move the class to envs/utils.py so both environments share it, then apply
the same is_async + lazy wrapping pattern in create_metaworld_envs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Benchmark CI workflow, Dockerfiles, benchmark docs, evaluation smoke-test
doc, and dispatch tests belong in a separate PR. Scope this PR to the
async env init changes only.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…changes

- Restore docs/source/adding_benchmarks.mdx (belongs in this PR)
- Restore tests/envs/test_dispatch.py (belongs in this PR)
- Revert docs/source/env_processor.mdx to main (out of scope for this PR)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…e PR)

Step 7 (Dockerfile + benchmark_tests.yml CI job) and its table rows are
out of scope for this PR. The CI infrastructure will be added on top in a
follow-up PR.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replaced by env.call("task_description") in lerobot_eval.py. No callers
remain in the codebase.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@pkooij pkooij force-pushed the feat/async-vector-env branch from 2433f10 to 35f18d4 Compare April 8, 2026 17:04
github-actions bot added labels: documentation, tests, configuration, evaluation (Apr 8, 2026)
@pkooij pkooij force-pushed the feat/async-vector-env branch from 35f18d4 to 566a77b Compare April 8, 2026 17:05
@github-actions github-actions bot added the processor Issue related to processor label Apr 8, 2026
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…r task description

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
s1lent4gnt (Member) previously approved these changes Apr 8, 2026

CHAD, LGTM!

…eadlock

AsyncVectorEnv with default fork context leaks worker processes between
test_policy parametrized cases; subsequent env creation deadlocks because
new forked workers inherit stale pipe FDs from previous test's leaked workers.

- configs.py: pass context="forkserver" to AsyncVectorEnv (matches _LazyAsyncVectorEnv)
- test_policies.py: call close_envs(envs) at end of test_policy to clean up workers

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
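A hedged sketch of the context choice: gymnasium's AsyncVectorEnv accepts a context argument, and "forkserver" forks each worker from a long-lived, freshly started server process rather than from the (possibly FD-polluted) test process:

```python
import multiprocessing as mp

def make_worker_context():
    # "forkserver" workers cannot inherit stale pipe FDs or GPU handles
    # left over from a previous test's leaked workers, avoiding the
    # deadlock described in this commit. (Not available on Windows.)
    return mp.get_context("forkserver")

# Passed to the vector env roughly as:
#   gym.vector.AsyncVectorEnv(env_fns, context="forkserver")
```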
s1lent4gnt (Member) previously approved these changes Apr 8, 2026

LGTM!

Tests that call make_env(n_envs=2) without passing use_async_envs were
getting AsyncVectorEnv, whose forked workers can't resolve gym namespaces
registered at runtime. Default to False (sync) so existing tests pass.

lerobot_eval.py explicitly passes cfg.eval.use_async_envs, so the CLI
async behaviour (controlled by EvalConfig.use_async_envs) is unchanged.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@pkooij pkooij merged commit 919184d into main Apr 9, 2026
13 checks passed
@pkooij pkooij deleted the feat/async-vector-env branch April 9, 2026 08:29
pkooij added a commit that referenced this pull request Apr 9, 2026
Resolves conflict in lerobot_eval.py by taking explicit
(AttributeError, NotImplementedError) catches from main (#3274).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>