Capability
Natural Language To Robotic Action Translation
14 artifacts provide this capability.
Top Matches
RT-2 — Google's vision-language-action model for robotics.
Unique: Represents robot actions as discrete text tokens inside a standard language model, so a single transformer handles both semantic understanding and action generation. This allows co-fine-tuning with internet-scale vision-language data and avoids separate policy networks or specialized control heads (see the sketch below).
vs others: By unifying action representation with language tokens, it transfers web-scale language understanding to robotics more directly than prior work (RT-1), generalizing better to novel objects and unseen command types through language semantics.
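
A minimal sketch of the actions-as-tokens idea, assuming (as described in the RT-2 paper) that each continuous action dimension is discretized into 256 uniform bins whose indices are emitted as integer tokens. The dimension layout, value ranges, and helper names here are illustrative assumptions, not the model's actual interface.

    import numpy as np

    N_BINS = 256  # assumed per-dimension resolution, as in the RT-2 paper

    def encode_action(action: np.ndarray, low: np.ndarray, high: np.ndarray) -> str:
        """Map a continuous action vector to a space-separated string of bin indices."""
        clipped = np.clip(action, low, high)
        bins = np.round((clipped - low) / (high - low) * (N_BINS - 1)).astype(int)
        return " ".join(str(b) for b in bins)

    def decode_action(tokens: str, low: np.ndarray, high: np.ndarray) -> np.ndarray:
        """Invert encode_action: bin indices back to continuous values."""
        bins = np.array([int(t) for t in tokens.split()], dtype=float)
        return low + bins / (N_BINS - 1) * (high - low)

    # Illustrative 8-D action: [terminate_flag, dx, dy, dz, droll, dpitch, dyaw, gripper]
    low = np.array([0.0, -0.1, -0.1, -0.1, -0.5, -0.5, -0.5, 0.0])
    high = np.array([1.0, 0.1, 0.1, 0.1, 0.5, 0.5, 0.5, 1.0])

    action = np.array([0.0, 0.02, -0.03, 0.05, 0.0, 0.1, -0.2, 1.0])
    tokens = encode_action(action, low, high)
    print(tokens)                            # discretized token string, e.g. "0 153 89 ..."
    print(decode_action(tokens, low, high))  # approximate round-trip of the action

Because the action now looks like ordinary text, the same next-token training objective and vocabulary serve both web-scale vision-language data and robot trajectories, which is what enables the co-fine-tuning described above.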