RhapsodyAI/minicpm-visual-embedding-v0

bokesyo committed on Jun 27

Commit 3081d81 • 1 Parent(s): 0297db8

Update README.md

Files changed (1)

README.md +16 -5

@@ -18,23 +18,34 @@ With MiniCPM-Visual-Embedding, it is possible to directly build knowledge base w
 # News
-- 2024-06-27: We released our first visual embedding model minicpm-visual-embedding-v0.1 on [huggingface](https://huggingface.co/RhapsodyAI/minicpm-visual-embedding-v0.1).
 - 2024-05-08: We [committed](https://github.com/bokesyo/minicpm-visual-embedding) our training code (full-parameter tuning with GradCache and DeepSpeed, supports large batch size across multiple GPUs with zero-stage1) and eval code.
 # Get started
 First you are suggested to git clone this huggingface repo or download repo with `huggingface_cli`.
 ```bash
 git lfs install
-git clone https://huggingface.co/RhapsodyAI/minicpm-visual-embedding-v0.1
 ```
 or
 ```bash
-huggingface-cli download RhapsodyAI/minicpm-visual-embedding-v0.1
 ```
 ```python
@@ -56,8 +67,8 @@ def last_token_pool(last_hidden_states: Tensor,
         return last_hidden_states[torch.arange(batch_size, device=last_hidden_states.device), sequence_lengths]
-tokenizer = AutoTokenizer.from_pretrained('/local/path/to/minicpm-visual-embedding-v0.1')
-model = AutoModel.from_pretrained('/local/path/to/minicpm-visual-embedding-v0.1')
 image_1 = Image.open('/local/path/to/document1.png').convert('RGB')
 image_2 = Image.open('/local/path/to/document2.png').convert('RGB')

 # News
+- 2024-06-27: We released our first visual embedding model minicpm-visual-embedding-v0.1 on [huggingface](https://huggingface.co/RhapsodyAI/minicpm-visual-embedding-v0).
 - 2024-05-08: We [committed](https://github.com/bokesyo/minicpm-visual-embedding) our training code (full-parameter tuning with GradCache and DeepSpeed, supports large batch size across multiple GPUs with zero-stage1) and eval code.
 # Get started
+Pip install all dependencies:
+```
+Pillow==10.1.0
+timm==0.9.10
+torch==2.1.2
+torchvision==0.16.2
+transformers==4.36.0
+sentencepiece==0.1.99
+```
 First you are suggested to git clone this huggingface repo or download repo with `huggingface_cli`.
 ```bash
 git lfs install
+git clone https://huggingface.co/RhapsodyAI/minicpm-visual-embedding-v0
 ```
 or
 ```bash
+huggingface-cli download RhapsodyAI/minicpm-visual-embedding-v0
 ```
 ```python
         return last_hidden_states[torch.arange(batch_size, device=last_hidden_states.device), sequence_lengths]
+tokenizer = AutoTokenizer.from_pretrained('/local/path/to/minicpm-visual-embedding-v0', trust_remote_code=True)
+model = AutoModel.from_pretrained('/local/path/to/minicpm-visual-embedding-v0', trust_remote_code=True)
 image_1 = Image.open('/local/path/to/document1.png').convert('RGB')
 image_2 = Image.open('/local/path/to/document2.png').convert('RGB')
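
Note on the pinned dependencies added in this commit: before loading the model with `trust_remote_code=True`, it can help to confirm that the local environment matches the versions listed under "Get started". The following is a minimal sketch using only the Python standard library. The helper names `parse_pins` and `check_pins` are hypothetical and not part of this repository, and the comparison ignores local build suffixes such as `+cu121`, which is an assumption about how strict the check should be.

```python
from importlib import metadata

# Pins copied from the dependency list added to README.md in this commit.
PINNED = """\
Pillow==10.1.0
timm==0.9.10
torch==2.1.2
torchvision==0.16.2
transformers==4.36.0
sentencepiece==0.1.99
"""


def parse_pins(text):
    """Turn 'name==version' lines into a {name: version} dict."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, _, version = line.partition("==")
        pins[name.strip()] = version.strip()
    return pins


def check_pins(pins):
    """Return {name: (wanted, installed_or_None)} for each pin that does not match."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            # Drop a local build suffix (e.g. '2.1.2+cu121') before comparing.
            installed = metadata.version(name).split("+")[0]
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, installed) in check_pins(parse_pins(PINNED)).items():
        print(f"{name}: want {wanted}, found {installed or 'not installed'}")
```

If the script prints nothing, every pinned package is installed at the listed version. Any line it does print names a package to install or reinstall at the pinned version.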