Deployment and Principles of LLaVA

Posted Apr 18, 2025 Updated Apr 18, 2025

By Wei Xiong


Reference

Principle and Deployment
A video on the principles of LLaVA

Good to know

Project structure and config.json settings

Download the LLaVA framework first, then download the model weights and the vision encoder separately.
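When the weights and the vision encoder are downloaded separately, the checkpoint's config.json has to point at the local copy of the encoder. The sketch below assumes the relevant setting is the `mm_vision_tower` key, which in the llava-v1.5-7b checkpoint names the CLIP vision encoder. The function name and the example paths are hypothetical, chosen only for illustration.

```python
import json
from pathlib import Path


def set_local_vision_tower(config_path, vision_tower_dir):
    """Point `mm_vision_tower` in a LLaVA config.json at a local directory.

    Assumption: `mm_vision_tower` is the key that selects the vision encoder.
    Returns the previous value so the change can be undone.
    """
    config_path = Path(config_path)
    config = json.loads(config_path.read_text(encoding="utf-8"))

    previous = config.get("mm_vision_tower")
    # Store an absolute path so the setting does not depend on the working directory.
    config["mm_vision_tower"] = str(Path(vision_tower_dir).resolve())

    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return previous


# Example call (hypothetical paths, adjust to your own layout):
# set_local_vision_tower(
#     "./checkpoints/llava-v1.5-7b/config.json",
#     "./checkpoints/clip-vit-large-patch14-336",
# )
```

All other keys in config.json are left untouched, and the returned value can be written back if the model should load the encoder from the Hugging Face Hub again.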

  
## Test command
python -m llava.serve.cli \
  --model-path liuhaotian/llava-v1.5-7b \
  --image-file "https://llava-vl.github.io/static/images/view.jpg" \
  --load-4bit

# safe on < 8 GB VRAM
# Tested on an RTX 4060 (8 GB), so --load-4bit is needed to save VRAM
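A rough back-of-the-envelope calculation shows why 4-bit loading matters on an 8 GB card. The sketch below counts only the language model's weights, using the nominal 7 billion parameters as an assumption. It ignores the vision encoder, activations, the KV cache, and quantization overhead, so real usage will be higher.

```python
def weight_memory_gib(num_params, bits_per_param):
    """Memory needed for the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 2**30


# Assumption: the nominal "7B" parameter count of the language model.
LLM_PARAMS = 7e9

for label, bits in [("fp16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: {weight_memory_gib(LLM_PARAMS, bits):.1f} GiB")
```

Under these assumptions the fp16 weights alone need about 13 GiB, which cannot fit in 8 GB, while 4-bit weights need roughly 3.3 GiB and leave headroom for everything else.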

Graphical Interface

  
## Controller
python -m llava.serve.controller --host 0.0.0.0 --port 10000

## Worker
python -m llava.serve.model_worker \
  --host 0.0.0.0 \
  --controller http://localhost:10000 \
  --port 40000 \
  --worker http://localhost:40000 \
  --model-path liuhaotian/llava-v1.5-7b \
  --load-4bit

## Gradio web server
python -m llava.serve.gradio_web_server \
  --controller http://localhost:10000 \
  --model-list-mode reload
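Before opening the web UI, it can help to confirm that the worker has registered with the controller. The following is a minimal sketch, assuming the controller exposes POST endpoints named `/refresh_all_workers` and `/list_models` and that the latter returns a JSON object with a `models` list. The endpoint names are an assumption to verify against your LLaVA version, and the helper names are made up for this example.

```python
import json
import urllib.request


def post_json(url, timeout=5.0):
    """Send an empty POST and return the decoded JSON body (None if the body is empty)."""
    request = urllib.request.Request(url, data=b"", method="POST")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = response.read()
    return json.loads(body) if body else None


def list_registered_models(controller_url, post=post_json):
    """Ask the controller which models its workers currently serve.

    Assumption: the endpoint paths below match the controller's API.
    `post` is injectable so the logic can be exercised without a live server.
    """
    base = controller_url.rstrip("/")
    post(base + "/refresh_all_workers")  # ask the controller to re-check its workers
    reply = post(base + "/list_models")
    return sorted(reply["models"])


# Example call, with the controller from the command above running:
# print(list_registered_models("http://localhost:10000"))
```

If the returned list is empty, the worker most likely has not finished loading the model or could not reach the controller address.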

Outcome

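If this error appears

Could not parse server response: SyntaxError: Unexpected token 'I', "Internal S"... is not valid JSON

The response body starts with "Internal S", so the server returned a plain-text error page where the front end expected JSON. Upgrading Gradio fixes it: pip install gradio -U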

LLM, Multi-modal

This post is licensed under CC BY 4.0 by the author.