license: apache-2.0
language:
- ko
- en
pipeline_tag: visual-question-answering
tags:
- text2text-generation
base_model: google/deplot
ko-deplot
ko-deplot is a Korean Visual-QA model based on Google's Pix2Struct architecture. It was fine-tuned from DePlot on Korean chart image-text pairs.
- Developed by: NUUA
- Model type: Visual Question Answering
- License: apache-2.0
- Finetuned from model: google/deplot
Model Usage
You can run inference by passing an input image together with a text prompt, as follows:
```python
from transformers import Pix2StructProcessor, Pix2StructForConditionalGeneration
from PIL import Image

# Load the processor and the fine-tuned model from the Hugging Face Hub.
processor = Pix2StructProcessor.from_pretrained('nuua/ko-deplot')
model = Pix2StructForConditionalGeneration.from_pretrained('nuua/ko-deplot')

# Open the chart image to convert into a data table.
IMAGE_PATH = "LOCAL_PATH_TO_IMAGE"
image = Image.open(IMAGE_PATH)

# Encode the image together with the DePlot prompt.
inputs = processor(images=image, text="Generate underlying data table of the figure below:", return_tensors="pt")

# Generate the table tokens and decode them to text.
predictions = model.generate(**inputs, max_new_tokens=512)
print(processor.decode(predictions[0], skip_special_tokens=True))
```
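The decoded prediction is a single string. As a minimal sketch, assuming ko-deplot keeps the upstream DePlot linearization (rows separated by the literal token `<0x0A>` and cells separated by `|`), the string can be split into rows for further processing. The helper name `parse_linearized_table` and the sample string are hypothetical and are not taken from actual model output.

```python
# Assumption: same separators as upstream DePlot output; adjust if ko-deplot differs.
ROW_SEP = "<0x0A>"
CELL_SEP = "|"


def parse_linearized_table(text):
    """Split a linearized table string into a list of rows of stripped cells."""
    rows = []
    for raw_row in text.split(ROW_SEP):
        cells = [cell.strip() for cell in raw_row.split(CELL_SEP)]
        # Skip rows that contain only empty cells (e.g. a trailing separator).
        if any(cells):
            rows.append(cells)
    return rows


# Hypothetical example string, written by hand for illustration only.
sample = "TITLE | 월별 매출 <0x0A> 월 | 매출 <0x0A> 1월 | 120 <0x0A> 2월 | 135"
table = parse_linearized_table(sample)
for row in table:
    print(row)
```

In practice, the `sample` string would be replaced by the output of `processor.decode(...)` from the snippet above.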
Tokenizer Details
Before training, the model's tokenizer vocabulary was extended from 50,344 to 65,536 tokens using the following:
- Complete Korean Jamo
- Additional Korean Jamo
- Ko-Electra Korean tokens
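The extension adds 65,536 - 50,344 = 15,192 entries. The sketch below illustrates one way such a list of new tokens could be assembled: deduplicate candidates against the existing vocabulary and stop once the target size is reached. It is not the authors' procedure; `select_new_tokens` and the toy inputs are hypothetical. With the `transformers` library, a list like this would typically be passed to `tokenizer.add_tokens(...)`, followed by `model.resize_token_embeddings(len(tokenizer))`.

```python
ORIGINAL_VOCAB_SIZE = 50_344
EXTENDED_VOCAB_SIZE = 65_536
# Number of new entries implied by the sizes stated in this card.
ADDED_TOKENS = EXTENDED_VOCAB_SIZE - ORIGINAL_VOCAB_SIZE


def select_new_tokens(existing_vocab, candidates, target_size):
    """Pick unseen candidate tokens, in order, until the vocab would reach target_size."""
    budget = target_size - len(existing_vocab)
    if budget < 0:
        raise ValueError("existing vocab already exceeds target size")
    seen = set(existing_vocab)
    new_tokens = []
    for token in candidates:
        if len(new_tokens) == budget:
            break
        if token not in seen:
            seen.add(token)
            new_tokens.append(token)
    return new_tokens


# Toy illustration: a 2-token vocab extended to at most 4 tokens.
picked = select_new_tokens({"a", "b"}, ["a", "가", "나", "가", "다"], target_size=4)
print(ADDED_TOKENS, picked)
```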
Training Details
Training Data
Synthetic chart data generated with three libraries was used for training.
Training Procedure
Following the original paper, the model first went through a short warmup stage in which it learned Korean text (Hangul). It was then trained on the chart data for 50,000 steps.
Technical Specifications
Hardware
ko-deplot was trained on A100 80GB GPU hardware.
Contact
For questions and suggestions, please use the discussion tab. To contact us directly, email [email protected].