LLaMA-3-8B-Instruct LoRA Finetuned Model for ABB
General Breakdown of the ABB-LLM Model
Motivation
As a Tool for Translation, Summarization, QA Tasks: The ABB-LLM model is designed to handle tasks that require the generation of new text, such as translation, summarization, and question-answering (QA).
As a Baseline for Classification, Named Entity Recognition (NER), and Other Tasks: For tasks that involve understanding and processing text, such as classification and NER, this model provides a solid baseline.
Long-Term Vision
Custom Model Training: When enough data is available, custom models should be trained for specific tasks. This approach is typically more efficient and yields better performance than using a general-purpose LLM (like this one).
Fine-Tuning Specialized Models: Models like BERT, RoBERTa, etc., should be fine-tuned for specific tasks like classification and NER, where they typically outperform small LLMs.
What to Expect?
Limitations: Current 8B models are inadequate for open-ended QA tasks (questions answered without supporting context) because they hallucinate more often and are less accurate than larger models. Small models are therefore better reserved for summarization, translation, and classification tasks.
Context-Based Tasks: For tasks that rely on provided context (such as documents or search results), small models can be effective. These tasks include summarization, translation, classification, and NER.
Output Format: This model is trained to return JSON output, which is more structured and easier to work with than the verbose default output of the base 8B model.
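Because the model is trained to return JSON, downstream code will usually want to parse the reply defensively. The following is a minimal sketch using only the Python standard library; the helper name `parse_model_json` and the sample reply (including its keys) are hypothetical and not taken from the model or its dataset. It assumes a reply may occasionally carry extra text around the JSON and returns the first valid JSON value it finds.

```python
import json
from typing import Any, Optional


def parse_model_json(raw: str) -> Optional[Any]:
    """Return the first JSON object or array found in `raw`, or None."""
    decoder = json.JSONDecoder()
    # Try every position where a JSON object or array could start.
    for start, char in enumerate(raw):
        if char not in "{[":
            continue
        try:
            value, _end = decoder.raw_decode(raw, start)
            return value
        except json.JSONDecodeError:
            # Not valid JSON from this position; keep scanning.
            continue
    return None


# Hypothetical reply; the real keys depend on the task given in the prompt.
reply = 'Result: {"summary": "Budget approved.", "keywords": ["budget"]}'
result = parse_model_json(reply)
print(result["summary"] if result else "no JSON found")
```

Returning `None` instead of raising lets the caller decide whether to retry the request or fall back to the raw text.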
Use Cases
The ABB-LLM model is suitable for tasks where the relevant facts are supplied as context. These include:
Summarization: Generate concise summaries of any text, such as agenda items or BPMN files.
Translation: Perform simple translations of text, including agenda items and BPMN files.
Classification: Classify text into predefined hierarchies, such as categorizing agenda items or BPMN files.
Named Entity Recognition (NER): Extract entities from text, useful for identifying key information in agenda items or BPMN files.
Keyword Extraction: Extract relevant keywords from text, aiding in the identification of important terms in agenda items or BPMN files.
Datasets
The ABB-LLM model is trained on the svercoutere/llama3_abb_instruct_dataset, which uses the following format:
#### Context: {Dutch text documents, JSON objects, ...} #### {task to be performed with the context}
Examples of these tasks can be found within the dataset.
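The prompt format above can be assembled with a small helper. This is a sketch under stated assumptions: the function name `build_prompt` is hypothetical, and the separator between the context and the task (a newline by default) is a guess, since the exact whitespace used in the dataset is not shown here. Checking a few dataset rows before relying on it would be prudent.

```python
import json
from typing import Union


def build_prompt(context: Union[str, dict, list], task: str, sep: str = "\n") -> str:
    """Combine a context and a task into the '#### Context: ... #### task' layout."""
    # JSON-like contexts are serialized; ensure_ascii=False keeps Dutch accents readable.
    if not isinstance(context, str):
        context = json.dumps(context, ensure_ascii=False)
    # Assumption: `sep` separates the header, the context, and the task line.
    return f"#### Context:{sep}{context.strip()}{sep}#### {task.strip()}"


# Illustrative inputs only; neither string comes from the dataset.
prompt = build_prompt(
    "De gemeenteraad keurt het budget voor 2024 goed.",
    "Summarize the context in one sentence and return JSON.",
)
print(prompt)
```

Passing `sep=" "` reproduces the single-line layout exactly as it is written in the format description.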
Model: svercoutere/llama-3-8b-instruct-abb-lora
Base model: unsloth/llama-3-8b-Instruct-bnb-4bit