bokesyo committed
Commit 587e267
1 Parent(s): 703e627

Update README.md

Files changed (1)
  1. README.md +17 -6
README.md CHANGED
@@ -37,7 +37,7 @@ The model only takes images as document-side inputs and produce vectors represen
 - x86 CPU with 32GB memory.
 - x86 CPU with 32GB memory + Nvidia GPU with 16GB memory.
 
-1. Pip install all dependencies:
+1. Pip install all dependencies (for all platforms):
 
 ```
 Pillow==10.1.0
@@ -65,13 +65,22 @@ pip install huggingface-hub
 huggingface-cli download --resume-download RhapsodyAI/minicpm-visual-embedding-v0 --local-dir minicpm-visual-embedding-v0 --local-dir-use-symlinks False
 ```
 
-3. To deploy a local demo, first check `pipeline_gradio.py`, change `model_path` to your local path and change `device` to your device (for users with Nvidia card, use `cuda`, for users with apple silicon, use `mps`, for users with only x86 cpu, please use `cpu`). then launch the demo:
+3. To deploy a local demo, first check `pipeline_gradio.py`, change `model_path` to your local path, set `device` for your hardware, and launch the demo:
+
+Install `gradio` first.
 
 ```bash
 pip install gradio
-python pipeline_gradio.py
 ```
 
+Adapt the code in `pipeline_gradio.py` according to your device:
+
+- For M1/M2/M3 users, make sure `model = model.to(device='mps', dtype=torch.float16)`, then run `PYTORCH_ENABLE_MPS_FALLBACK=1 python pipeline_gradio.py`.
+- For x86 CPU users, remove `model = model.to(device)`, then run `python pipeline_gradio.py`.
+- For x86 CPU + Nvidia GPU users, make sure `model = model.to('cuda')`, then run `python pipeline_gradio.py`.
+- If you encounter an error, please open an issue [here](https://huggingface.co/RhapsodyAI/minicpm-visual-embedding-v0/discussions); we will respond soon.
+
+
 # For research purpose
 
 To run the model for research purpose, please refer the following code:
@@ -116,11 +125,11 @@ print(scores)
 
 # Todos
 
-[x] Release huggingface space demo.
+- [x] Release huggingface space demo.
 
-[] Release the evaluation results.
+- [ ] Release the evaluation results.
 
-[] Release technical report.
+- [ ] Release technical report.
 
 # Limitations
 
@@ -130,6 +139,8 @@ print(scores)
 
 - The inference speed is low, because vision encoder uses `timm`, which does not yet support `flash-attn`.
 
+- The model does not perform well on Chinese and other non-English information retrieval tasks.
+
 # Citation
 
 If you find our work useful, please consider cite us:
 
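For readers following the download step in the diff above, here is a minimal sketch of the same checkpoint download done from Python with `huggingface_hub.snapshot_download` instead of `huggingface-cli`; it assumes the `huggingface-hub` package shown in the hunk context (`pip install huggingface-hub`) is installed and simply mirrors the CLI flags used there.

```python
# Minimal sketch: download RhapsodyAI/minicpm-visual-embedding-v0 from Python,
# mirroring the `huggingface-cli download` command in the README.
from huggingface_hub import snapshot_download

local_path = snapshot_download(
    repo_id="RhapsodyAI/minicpm-visual-embedding-v0",
    local_dir="minicpm-visual-embedding-v0",
    local_dir_use_symlinks=False,
)
print(local_path)  # use this directory as `model_path` in pipeline_gradio.py
```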
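To illustrate the per-device bullets in the deployment step above, below is a hedged sketch of what the device setup in `pipeline_gradio.py` could look like; the loading call (`AutoModel.from_pretrained` with `trust_remote_code=True`) and the variable names are assumptions about that script, not an exact excerpt from it.

```python
# Sketch of the device selection described above; names are illustrative,
# not necessarily identical to those used in pipeline_gradio.py.
import torch
from transformers import AutoModel

model_path = "./minicpm-visual-embedding-v0"  # your local checkpoint directory
model = AutoModel.from_pretrained(model_path, trust_remote_code=True)

if torch.cuda.is_available():
    # x86 CPU + Nvidia GPU
    model = model.to('cuda')
elif torch.backends.mps.is_available():
    # Apple M1/M2/M3: also launch with PYTORCH_ENABLE_MPS_FALLBACK=1
    model = model.to(device='mps', dtype=torch.float16)
# else: x86 CPU only, leave the model on CPU and do not call model.to(device)

model.eval()
```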
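The research-purpose code block itself is elided from the hunks above (only `print(scores)` appears as context). Purely as a hypothetical illustration of the scoring step, assuming the model has already produced one query vector and several page vectors as PyTorch tensors, and assuming cosine similarity as the score (the README's actual snippet may differ):

```python
# Hypothetical scoring sketch; the embedding dimension and the use of cosine
# similarity are illustrative assumptions, not taken from the README's snippet.
import torch
import torch.nn.functional as F

query_emb = torch.randn(1, 2304)   # one query embedding (placeholder values)
doc_embs = torch.randn(5, 2304)    # embeddings of five document page images

scores = F.cosine_similarity(query_emb, doc_embs, dim=-1)  # shape: (5,)
print(scores)  # higher score = page more relevant to the query
```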