---
license: apache-2.0
language:
- ar
pipeline_tag: text-generation
tags:
- arabic
- text-generation
---

# ArabianGPT Model Overview

## Introduction

ArabianGPT-0.3B, developed under the ArabianLLM initiatives, is a GPT-2 model optimized for Arabic language modeling. It was built at Prince Sultan University's Robotics and Internet of Things Lab to improve natural language modeling and generation in Arabic, with a focus on the linguistic complexities and nuances of the language.

## Key Features

- **Architecture**: GPT-2
- **Model Size**: 345 million parameters
- **Layers**: 24
- **Model Attention Layers (MAL)**: 16
- **Context Window Size**: 1024 tokens

## Training

- **Dataset**: Scraped texts, including scientific articles and general texts
- **Data Size**: 23 GB
- **Tokenizer**: Aranizer 64K
- **Tokens**: Over 3.3 billion
- **Hardware**: 4 NVIDIA A100 GPUs
- **Training Duration**: 45 days
- **Performance**: Loss of 3.82

## Role in ArabianLLM Initiatives

ArabianGPT-0.3B is crucial for advancing Arabic language processing, addressing challenges unique to Arabic morphology and dialects.

## Usage

Suitable for Arabic text generation tasks. Example usage with the Transformers pipeline:

```python
from transformers import pipeline

# Load the text-generation pipeline; max_new_tokens caps the generated length.
pipe = pipeline("text-generation", model="riotu-lab/ArabianGPT-03B", max_new_tokens=512)

text = ''  # replace with an Arabic prompt
result = pipe(text)
print(result[0]["generated_text"])
```

## Limitations and Ethical Considerations

- The model may misread context or produce low-quality text in some scenarios.
- Use the model ethically, and do not use it to produce or spread misinformation or harmful content.

## Acknowledgments

Special thanks to Prince Sultan University, particularly the Robotics and Internet of Things Lab.

## Contact Information

For inquiries: [riotu@psu.edu.sa](mailto:riotu@psu.edu.sa).

## Disclaimer for the Use of Large Language Models (LLMs) for Text Generation
We disclaim all responsibility for any harm, inaccuracies, or inappropriate content generated by ArabianGPT-0.3B, and users engage with and apply the model's outputs at their own risk.
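
## Appendix: Prompt Length Budget

The Key Features list gives a context window of 1024 tokens, and the usage example requests `max_new_tokens=512`. Since a GPT-2 model cannot attend beyond its context window, the prompt and the generated tokens together need to fit in 1024 positions. The sketch below works through that arithmetic. The helper names `prompt_budget` and `truncate_prompt` are hypothetical and not part of the model or of Transformers, and keeping the most recent tokens is an assumed truncation policy, not one prescribed by the authors.

```python
CONTEXT_WINDOW = 1024  # context window size from the Key Features list
MAX_NEW_TOKENS = 512   # value used in the pipeline usage example


def prompt_budget(context_window: int = CONTEXT_WINDOW,
                  max_new_tokens: int = MAX_NEW_TOKENS) -> int:
    """Return how many prompt tokens fit next to the requested generation length."""
    if max_new_tokens >= context_window:
        raise ValueError("max_new_tokens must be smaller than the context window")
    return context_window - max_new_tokens


def truncate_prompt(token_ids: list[int], budget: int) -> list[int]:
    """Keep only the last `budget` token IDs (assumed policy: drop the oldest tokens)."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    return token_ids[-budget:]


if __name__ == "__main__":
    budget = prompt_budget()
    # Stand-in token IDs; in practice these would come from the model's tokenizer.
    placeholder_ids = list(range(600))
    kept = truncate_prompt(placeholder_ids, budget)
    print(budget, len(kept), kept[0])  # prints: 512 512 88
```

With the settings from the usage example, prompts longer than 512 tokens would need to be shortened or `max_new_tokens` reduced.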