Llama-3.2-1B-Instruct-korQuAD-v1

This model is a fine-tuned version of Llama-3.2-1B-Instruct for Korean question answering.

Model description

  • Base model: Llama-3.2-1B-Instruct
  • Training dataset: KorQuAD v1.0
  • Training method: LoRA (Low-Rank Adaptation)
  • Main task: Korean question answering

Version history

v1.0.0 (2024-10-02)

  • Initial version uploaded
  • Fine-tuned on the KorQuAD v1.0 dataset

v1.1.0 (2024-10-30)

  • Improved the model prompt and training method
  • Applied the KorQuAD evaluation code

Performance

Model                       Exact Match   F1 Score
Llama-3.2-1B-Instruct-v1    18.86         37.2
Llama-3.2-1B-Instruct-v2    36.07         59.03

※ Evaluated with the evaluation script from https://korquad.github.io/category/1.0_KOR.html
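
As a rough illustration of the two metrics in the table, the sketch below scores a single prediction with exact match and a character-level F1. It is a simplified stand-in and not the official KorQuAD script: the function names are hypothetical, the normalization only lowercases and collapses whitespace, and scoring overlap per character is an assumption about how the official script treats Korean text.

```python
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace (simplified; the official script does more)."""
    return " ".join(text.lower().split())


def exact_match(prediction: str, ground_truth: str) -> bool:
    """True when the normalized strings are identical."""
    return normalize(prediction) == normalize(ground_truth)


def char_f1(prediction: str, ground_truth: str) -> float:
    """F1 over the multiset of non-space characters in both strings."""
    pred_chars = list(normalize(prediction).replace(" ", ""))
    gold_chars = list(normalize(ground_truth).replace(" ", ""))
    overlap = sum((Counter(pred_chars) & Counter(gold_chars)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_chars)
    recall = overlap / len(gold_chars)
    return 2 * precision * recall / (precision + recall)
```

A dataset-level score would average these per-example values, taking the maximum over the reference answers where a question has more than one, and multiply by 100.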

Usage

You can load and use the model as follows:

# Load the model and tokenizer
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Repository id matching this model card's title
model_path = "NakJun/Llama-3.2-1B-Instruct-korQuAD-v1"
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Placeholders: replace with your own Korean question and context
question = "<question>"
context = "<context>"

# Build the prompt in the expected input format
prompt = f"""
### Question:
{question}
### Context:
{context}
### Answer:
"""

# Tokenize and run inference
input_ids = tokenizer.encode(prompt, return_tensors="pt").to(model.device)
output = model.generate(
    input_ids,
    max_new_tokens=100,
    temperature=0.1,
    repetition_penalty=1.3,
    do_sample=True,
    eos_token_id=tokenizer.eos_token_id,
)
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
# Keep only the first line after the last "Answer:" marker
answer = generated_text.split("Answer:")[-1].strip().split('\n')[0].strip()
print("Generated answer:", answer)
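
The prompt construction and answer extraction in the snippet above can be factored into two small helpers so they can be reused and checked without loading the model. This is a minimal sketch: `build_prompt` and `extract_answer` are hypothetical names, not part of the released model or of transformers.

```python
# Same layout as the prompt in the usage snippet, as a str.format template.
PROMPT_TEMPLATE = """
### Question:
{question}
### Context:
{context}
### Answer:
"""


def build_prompt(question: str, context: str) -> str:
    """Fill the question and context into the expected input format."""
    return PROMPT_TEMPLATE.format(question=question, context=context)


def extract_answer(generated_text: str) -> str:
    """Return the first line that follows the last 'Answer:' marker."""
    after_marker = generated_text.split("Answer:")[-1].strip()
    return after_marker.split("\n")[0].strip()
```

Because the split takes the text after the last `Answer:` marker, a context that itself contains `Answer:` does not break extraction, but a generation that repeats the marker would be truncated to whatever follows its final occurrence.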

Training details

  • Epochs: 5
  • Batch size: 1
  • Learning rate: 2e-4
  • Optimizer: AdamW (32-bit)
  • LoRA settings:
    • r: 16
    • lora_alpha: 16
    • Target modules: ["q_proj", "v_proj", "k_proj", "o_proj", "gate_proj", "down_proj", "up_proj"]
    • lora_dropout: 0.01
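
To give a sense of what these LoRA settings imply, the sketch below estimates the number of trainable adapter parameters and the adapter scaling factor. The values of r, lora_alpha, and the target modules come from the list above. The layer shapes are an assumption taken from the publicly documented Llama-3.2-1B configuration, not from this card, and the scaling formula lora_alpha / r assumes standard (non rank-stabilized) LoRA.

```python
# LoRA settings from the training details above
R = 16
LORA_ALPHA = 16

# ASSUMPTION: shapes from the public Llama-3.2-1B config, not stated in this card
HIDDEN = 2048        # hidden size
INTERMEDIATE = 8192  # MLP intermediate size
KV_DIM = 512         # 8 key/value heads x head dimension 64
NUM_LAYERS = 16

# (in_features, out_features) of each targeted linear layer
TARGET_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features: int, out_features: int, r: int = R) -> int:
    """Parameters in one adapter: A is (r x in), B is (out x r)."""
    return r * (in_features + out_features)


per_layer = sum(lora_params(i, o) for i, o in TARGET_SHAPES.values())
total = per_layer * NUM_LAYERS
scaling = LORA_ALPHA / R  # multiplier applied to the low-rank update

print(f"trainable LoRA parameters: {total:,}")
print(f"adapter scaling (lora_alpha / r): {scaling}")
```

Under these assumptions the adapters add roughly 11 million trainable parameters, a small fraction of the base model, and the low-rank update is applied at its computed magnitude because the scaling factor is 1.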

Example questions and answers (translated from Korean)

[Example 1: Soonchunhyang University]

Context:
Soonchunhyang University is a private university located on Soonchunhyang-ro, Sinchang-myeon, Asan-si, Chungcheongnam-do.
The College of Engineering was established at Soonchunhyang University in 1983.

Question: Where is Soonchunhyang University located?
Answer: Soonchunhyang-ro, Sinchang-myeon, Asan-si, Chungcheongnam-do

[Example 2: IVE]

Context:
IVE is a six-member South Korean girl group under Starship Entertainment that debuted on December 1, 2021.
The group name 'IVE' comes from "I HAVE" and carries the meaning "I will confidently show what I have."
The group became very popular as soon as it debuted and quickly established itself as one of the most watched groups.
Members:
An Yujin (leader), Gaeul, Rei, Jang Wonyoung, Liz, Leeseo
Major activities and hit songs:
ELEVEN (2021): The debut song, well loved for its polished performance and melody.
LOVE DIVE (2022): Gained great popularity with its addictive melody and alluring concept, taking first place on music shows many times.
After LIKE (2022): A hit that followed 'LOVE DIVE' and further solidified IVE's identity.
IVE is loved by fans in Korea and abroad for its distinctive concepts and outstanding stage performances, and each member is also active while showing her own individual charm.
Jang Wonyoung and An Yujin drew attention before debut through their activities in IZ*ONE and have continued their success as members of IVE.

Question1: Who is the leader of IVE?
Answer1: An Yujin

Question2: Tell me IVE's debut song.
Answer2: ELEVEN

Contact
