Humanoid Robots Wiki β

Prismatic VLM REPL

954 bytes added, 00:51, 21 June 2024
[https://github.com/TRI-ML/prismatic-vlms Prismatic VLM] is the project upon which OpenVLA is based. The generate.py REPL script is available in the OpenVLA repo as well, but it essentially uses Prismatic models. Note that Prismatic models generate natural language, whereas OpenVLA models were trained to generate robot actions (see https://github.com/kscalelabs/openvla/issues/5).
Of note, the K-Scale OpenVLA adaptation by [[User:Paweł]] is at https://github.com/kscalelabs/openvla
== Prismatic REPL Script Guide ==
Here are some suggestions for running the generate.py REPL script from the repo (you can find it in the '''scripts''' folder) if you would like to get started with OpenVLA.
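A typical launch looks something like the following. The flag name and model id here are assumptions, so check <code>python scripts/generate.py --help</code> in your checkout for the exact interface.

```shell
# Run from the repo root. The --model_path flag name is an assumption;
# consult `python scripts/generate.py --help` for the actual options.
python scripts/generate.py --model_path "prism-dinosiglip+7b"
```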
== Prerequisites ==
Make sure the images have an end effector in them.
 
[[File:Coke can2.png|400px|Can pickup task]]
== Starting REPL mode ==
Checkpoint Path: The model checkpoint is loaded from a specific path in the cache.
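If the checkpoint is fetched through the Hugging Face Hub, it normally lands under <code>~/.cache/huggingface</code>. The sketch below shows how to relocate that cache before launching; the directory shown is only an example.

```shell
# HF_HOME relocates the Hugging Face cache (default: ~/.cache/huggingface).
# The directory below is an example, not a required path.
export HF_HOME=/data/hf-cache
python scripts/generate.py --model_path "prism-dinosiglip+7b"
```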
 
You should see this in your terminal:
 
[[File:Openvla1.png|800px|prismatic models]]
 
 
''After loading the model, the script enters a REPL mode, allowing the user to interact with the model. The REPL mode provides a default generation setup and waits for user inputs.''
Basically, the generate.py script runs a REPL that allows users to interactively test generating outputs from the Prismatic model prism-dinosiglip+7b. Upon running the script, users can enter commands in the REPL prompt:
Type (i) to load a new local image by specifying its path, (p) to update the prompt template for generating outputs, (q) to quit the REPL, or directly input a prompt to generate a response based on the loaded image and the specified prompt.

''Work in progress: need to add screenshots and next steps.''

[[File:Prismatic chat1.png|800px|prismatic chat]]
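The command handling described above can be sketched as a small dispatch function; the function and field names here are illustrative and do not come from generate.py itself.

```python
def handle_command(user_input, state):
    """Dispatch one REPL input; return the updated state and a status string.

    Illustrative sketch only -- names are not taken from generate.py.
    """
    if user_input.startswith("(i)"):
        # (i) /path/to/image.png -> load a new local image
        state["image_path"] = user_input[3:].strip()
        return state, "image loaded"
    if user_input.startswith("(p)"):
        # (p) <template> -> update the prompt template
        state["prompt_template"] = user_input[3:].strip()
        return state, "prompt template updated"
    if user_input.strip() == "(q)":
        # (q) -> quit the REPL
        state["done"] = True
        return state, "quit"
    # Any other input is treated as a prompt for the loaded image
    return state, "generated response for prompt: " + user_input
```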