== Prismatic REPL Script Guide ==
Here are some suggestions for running the generate.py REPL script from the repo (you can find it in the '''scripts''' folder) if you would like to get started with OpenVLA.
== Prerequisites ==
Make sure the input images show the robot's end effector, as in the example below:
[[File:Coke can2.png|400px|Can pickup task]]
== Starting REPL mode ==
Then, run generate.py. The script starts by initializing the generation playground with the Prismatic model prism-dinosiglip+7b.
The model prism-dinosiglip+7b is downloaded from the Hugging Face Hub.
The model configuration is located, and the model is then loaded with the following components:
* Vision Backbone: dinosiglip-vit-so-384px
* Language Model (LLM) Backbone: llama2-7b-pure (this is where the Hugging Face token comes into play, because the Llama 2 weights on the Hugging Face Hub are gated and require an access token)
* Architecture Specifier: no-align+fused-gelu-mlp
* Checkpoint Path: the model checkpoint is loaded from a specific path in the cache.
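Since loading the LLM backbone depends on a Hugging Face token, it can help to check that the token is readable before starting the script. The following is a minimal sketch and not the code used by generate.py: the helper name <code>read_hf_token</code>, and the idea of accepting either a token file or the name of an environment variable, are assumptions made for illustration.

```python
import os
from pathlib import Path
from typing import Union


def read_hf_token(source: Union[str, Path]) -> str:
    """Return a Hugging Face token from a file (Path) or an environment variable (str).

    Hypothetical helper for illustration; generate.py may read its token differently.
    """
    if isinstance(source, Path):
        # Token stored in a plain-text file; surrounding whitespace is stripped.
        token = source.read_text().strip()
    else:
        # Otherwise treat `source` as the name of an environment variable.
        token = os.environ.get(source, "").strip()
    if not token:
        raise ValueError(f"No Hugging Face token found via {source!r}")
    return token
```

Failing early with a clear error is usually easier to debug than a rejected download of the gated Llama 2 weights partway through model loading.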
You should see this in your terminal:
[[File:Openvla1.png|800px|prismatic models]]
''After loading the model, the script enters a REPL mode, allowing the user to interact with the model. The REPL mode provides a default generation setup and waits for user inputs.''
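To make the idea of a REPL concrete, here is a small, generic read-eval-print loop in Python. It is only a sketch under stated assumptions: <code>run_repl</code>, the stand-in <code>generate</code> callable, and the <code>q</code>/<code>quit</code> commands are hypothetical and are not taken from generate.py, where the loaded model would produce the outputs.

```python
from typing import Callable, Iterable, List


def run_repl(lines: Iterable[str], generate: Callable[[str], str]) -> List[str]:
    """Feed each input line to `generate` until a quit command or end of input."""
    outputs = []
    for line in lines:
        text = line.strip()
        if text.lower() in {"q", "quit"}:  # hypothetical quit commands
            break
        if not text:
            continue  # ignore empty input and wait for the next line
        outputs.append(generate(text))
    return outputs


# Non-interactive demo: a lambda stands in for the model's generation call.
demo = run_repl(["What is in the image?", "", "q", "ignored"], lambda p: f"echo: {p}")
print(demo)  # ['echo: What is in the image?']
```

Taking the input lines as an iterable keeps the loop easy to test; an interactive session would pass a generator that wraps <code>input()</code> instead of a fixed list.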
In short, generate.py runs a REPL for interactively testing generation with the Prismatic model prism-dinosiglip+7b. Once the script is running, you can enter commands at the REPL prompt: