What we plan to showcase:

  1. LLM-based planning
    1. Input a very long command and generate an online plan.
  2. VLMaps
    1. Natural language navigation to multiple waypoints in the scene.
  3. 3D Manipulation
    1. Show that the robot can avoid obstacles when moving to a grasp pose.
  4. Online scene tracking
    1. Show that the robot remembers objects it has seen in the world.

The following command summarizes the sequence of actions in FVD encore.

Hey, Alfred! Go to the sofa, then move between the potted plant and the keyboard, then go to the potted plant and open the drawer. Go to the table closest to the sofa, then pick up the teddy bear, go to the potted plant, and place it in the drawer. Pick up the cup and place it in the drawer. Finally, close the drawer.
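
For reference, the command above can be broken into the ordered steps we would expect the planner to produce. The sketch below is only an illustration: the `Step` type and the action names (`go_to`, `move_between`, `open`, `pick`, `place`, `close`) are hypothetical and are not the planner's actual primitives. The helper checks that every place is preceded by a matching pick.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Step:
    action: str                      # hypothetical primitive name, not the real API
    target: str                      # object or location named in the command
    reference: Optional[str] = None  # second object or qualifier, if any


# Steps read directly off the spoken command, in order.
EXPECTED_PLAN: List[Step] = [
    Step("go_to", "sofa"),
    Step("move_between", "potted plant", "keyboard"),
    Step("go_to", "potted plant"),
    Step("open", "drawer"),
    Step("go_to", "table", "closest to sofa"),
    Step("pick", "teddy bear"),
    Step("go_to", "potted plant"),
    Step("place", "teddy bear", "drawer"),
    Step("pick", "cup"),
    Step("place", "cup", "drawer"),
    Step("close", "drawer"),
]


def held_object_trace(plan: List[Step]) -> List[Optional[str]]:
    """Return the object held after each step, validating pick/place pairing."""
    held: Optional[str] = None
    trace: List[Optional[str]] = []
    for step in plan:
        if step.action == "pick":
            if held is not None:
                raise ValueError(f"cannot pick {step.target} while holding {held}")
            held = step.target
        elif step.action == "place":
            if held != step.target:
                raise ValueError(f"cannot place {step.target}; holding {held}")
            held = None
        trace.append(held)
    return trace


if __name__ == "__main__":
    for index, step in enumerate(EXPECTED_PLAN, start=1):
        print(index, step.action, step.target, step.reference or "")
```

A list like this can double as a checklist during rehearsal: if the generated plan skips or reorders a step, the mismatch is easy to spot.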

Follow-up questions we ask Alfred once the tasks are complete:

Hey, Alfred! Thanks for bringing me my objects. I seem to have lost my coffee mug (replace with a unique object that’s not easily visible). Did you see it anywhere?
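
One way to picture the object memory behind this question is a table of last-seen sightings keyed by object label. This is a minimal sketch under our own assumptions, not the actual scene-tracking implementation: the `ObjectMemory` class, its method names, and the coordinates in the usage lines are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Sighting:
    label: str
    position: Tuple[float, float, float]  # map-frame x, y, z (assumed convention)
    timestamp: float                      # seconds since the start of the run


class ObjectMemory:
    """Hypothetical store that keeps the most recent sighting per object label."""

    def __init__(self) -> None:
        self._last_seen: Dict[str, Sighting] = {}

    def observe(self, label: str, position: Tuple[float, float, float],
                timestamp: float) -> None:
        # Keep only the newest sighting for each label.
        current = self._last_seen.get(label)
        if current is None or timestamp >= current.timestamp:
            self._last_seen[label] = Sighting(label, position, timestamp)

    def last_seen(self, label: str) -> Optional[Sighting]:
        return self._last_seen.get(label)

    def answer(self, label: str) -> str:
        # Turn a lookup into a short spoken-style reply.
        sighting = self.last_seen(label)
        if sighting is None:
            return f"I have not seen the {label}."
        x, y, z = sighting.position
        return f"I last saw the {label} near ({x:.1f}, {y:.1f}, {z:.1f})."


if __name__ == "__main__":
    memory = ObjectMemory()
    # Placeholder observations; real values would come from the perception stack.
    memory.observe("coffee mug", (1.0, 2.0, 0.5), timestamp=10.0)
    memory.observe("coffee mug", (3.0, 0.5, 0.8), timestamp=42.0)
    print(memory.answer("coffee mug"))
    print(memory.answer("umbrella"))
```

Keeping only the latest sighting keeps the reply simple; a fuller version might keep the whole history so the robot can also say where the object was first seen.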

Rough flow: