
LLaMA 3.2 3B Instruct

LLaMA 3.2 3B Instruct is a multilingual instruction-tuned language model with 3.21 billion parameters. Designed for diverse multilingual dialogue and summarization tasks, it offers effective performance on a range of NLP benchmarks.

Model Information

  • Name: LLaMA 3.2 3B Instruct
  • Parameter Size: 3B (3.21B)
  • Model Family: LLaMA 3.2
  • Architecture: Auto-regressive Transformer with Grouped-Query Attention (GQA)
  • Purpose: Multilingual dialogue generation, text generation, and summarization.
  • Training Data: A mix of publicly available multilingual data, covering up to 9T tokens.
  • Supported Languages: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.
  • Release Date: September 25, 2024
  • Context Length: 128k tokens
  • Knowledge Cutoff: December 2023

Quantized Model Files

  • Available Formats:
    • ggml-model-q8_0.gguf: 8-bit quantization for resource efficiency and good performance.
    • ggml-model-f16.gguf: Half-precision (16-bit) floating-point format for enhanced precision.
  • Quantization Library: llama.cpp
  • Use Cases: Multilingual dialogue, summarization, and text generation.
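
The GGUF files can be run directly with llama.cpp or its Python bindings. The snippet below is a minimal sketch using llama-cpp-python; the local file path, context size, and prompt are assumptions for illustration, not part of this repository.

from llama_cpp import Llama

# Load the 8-bit quantized file (path assumes it has been downloaded locally).
llm = Llama(
    model_path="./ggml-model-q8_0.gguf",
    n_ctx=8192,  # working context; the model supports up to 128k tokens
)

# Chat-style completion using the chat template embedded in the GGUF metadata.
response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful multilingual assistant."},
        {"role": "user", "content": "Summarize this model card in two sentences."},
    ],
    max_tokens=256,
)
print(response["choices"][0]["message"]["content"])

The 16-bit file can be loaded the same way by swapping the model_path; it trades higher memory use for slightly better output fidelity.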

Core Library

LLaMA 3.2 3B Instruct can be deployed using llama.cpp or transformers, with a focus on streamlined integration into the Hugging Face ecosystem.

  • Primary Framework: llama.cpp
  • Alternate Frameworks:
    • transformers for Hugging Face model support.
    • vLLM for optimized inference and low-latency deployments.
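
For workflows outside llama.cpp, the upstream (non-quantized) weights can be loaded with transformers. The sketch below assumes the meta-llama/Llama-3.2-3B-Instruct model ID and standard generation settings; the GGUF files in this repository are intended for llama.cpp rather than transformers.

import torch
from transformers import pipeline

# Chat-style text generation with the upstream instruct model (assumed model ID).
pipe = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.2-3B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Translate 'good morning' into German, French, and Thai."},
]
outputs = pipe(messages, max_new_tokens=128)
print(outputs[0]["generated_text"][-1]["content"])

vLLM exposes a similar chat interface (and an OpenAI-compatible server) when low-latency, high-throughput serving is needed.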

Safety and Responsible Use

LLaMA 3.2 3B has been designed with safety in mind but may still produce biased, harmful, or unpredictable outputs, especially for languages with limited coverage or for certain prompts.

  • Testing and Risk Assessment: Initial testing has primarily focused on English; coverage for other languages is ongoing.
  • Limitations: LLaMA 3.2 may not fully adhere to user instructions or safety guidelines, and may exhibit unexpected behaviors.
  • Responsible Use Guidelines: Refer to the Responsible Use Guide for more details.