VLA Policy Rollouts
We showcase VLA inference for long-horizon bimanual furniture assembly in simulation.
Assemble the LACK side table (Speed 10×)
✅
Assemble the KALLAX shelf (Speed 10×)
✅
Assemble the IVAR chair (Speed 10×)
✅
We showcase real-world VLA policy inference.
Assemble the IVAR chair (Speed 20×)
✅
VLA Emergent Corrective Behaviors
We also observe emergent corrective behaviors. In several rollouts, the robot self-corrects when parts are initially misaligned. For example, when grasping the seat panel with insufficient contact, the robot reopens the gripper, adjusts its pose, and regrasps for a more stable hold. During the attachment of the left chair frame, the robot performs small corrective motions to align the parts before insertion.
Seat panel regrasp (Speed 1.5×)
⚠️
Left chair frame alignment (Speed 8×)
⚠️
BibTeX
@article{ma2026furniturevla,
title={FurnitureVLA: Learning Long-Horizon Bimanual Furniture Assembly with Vision-Language-Action Model},
author={Ma, Chenyang and Yang, Yue and Corcodel, Radu and Jain, Siddarth and Wu, Andrew and Hori, Chiori and Romeres, Diego},
journal={arXiv preprint arXiv:2607.01212},
year={2026}
}