add FalconMamba #1
opened by Gkunsch

README.md CHANGED
@@ -14,11 +14,24 @@ pinned: false

# News

+* 🐍 **[FalconMamba-7B](https://huggingface.co/tiiuae/falcon-mamba-7b) is now available.** The first pure SSM model of the Falcon series, released under the same permissive license. You can interact with it [here](https://huggingface.co/spaces/tiiuae/falcon-mamba-playground).
* 📸 **[Falcon2-11B-vlm](https://huggingface.co/tiiuae/falcon-11B-vlm) is now available.** Built on top of the Falcon2-11B model, and released under the same permissive license, this open source model allows users to interact with image content via text.
* 🚀 **TII has just released a new generation of models, starting with [Falcon2-11B](https://huggingface.co/tiiuae/falcon-11B)**, an 11B-parameter causal decoder-only model trained on over 5,000B tokens of [RefinedWeb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb) enhanced with curated corpora. The model is made available under the [TII Falcon License 2.0](https://falconllm-staging.tii.ae/falcon-2-terms-and-conditions.html), a permissive Apache 2.0-based software license which includes an [acceptable use policy](https://falconllm-staging.tii.ae/falcon-2-acceptable-use-policy.html) that promotes the responsible use of AI.
* 🔥 **TII has open-sourced Falcon-180B for research and commercial utilization!** Access the [180B](https://huggingface.co/tiiuae/falcon-180b), as well as [7B](https://huggingface.co/tiiuae/falcon-7b)/[40B](https://huggingface.co/tiiuae/falcon-40b) models, and explore our high-quality web dataset, [RefinedWeb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb).
* ✨ **Falcon-[40B](https://huggingface.co/tiiuae/falcon-40b)/[7B](https://huggingface.co/tiiuae/falcon-7b) are now available under the Apache 2.0 license**; TII has [waived all royalties and commercial usage restrictions](https://www.tii.ae/news/uaes-falcon-40b-worlds-top-ranked-ai-model-technology-innovation-institute-now-royalty-free).

+# Falcon Mamba
+
+We are excited to announce the release of our groundbreaking LLM with a pure SSM architecture, setting a new benchmark by outperforming all previous SSM models and achieving performance on par with leading transformer-based models.
+
+| **Artefact** | **Link** | **Type** | **Details** |
+|---------------------|------------------------------------------------------------------|-------------------------|-------------------------------------------------------------------|
+| 🏆 **Falcon-Mamba-7B** | [Here](https://huggingface.co/tiiuae/falcon-mamba-7b) | *pretrained model* | 7B-parameter pure SSM trained on ~6,000 billion tokens. |
+| Falcon-Mamba-7B-Instruct | [Here](https://huggingface.co/tiiuae/falcon-mamba-7b-instruct) | *instruction/chat model* | Falcon-Mamba-7B finetuned using only SFT. |
+| Falcon-Mamba-7B-4bit | [Here](https://huggingface.co/tiiuae/falcon-mamba-7b-4bit) | *pretrained model* | 4-bit quantized version using GGUF. |
+| Falcon-Mamba-7B-Instruct-4bit | [Here](https://huggingface.co/tiiuae/falcon-mamba-7b-instruct-4bit) | *instruction/chat model* | 4-bit quantized version using GGUF. |
+
+
# Falcon2 LLM

Falcon2 LLM is TII's new flagship series of large language models. We focused on building smaller models with enhanced performance, enabling cheaper inference that encourages the development of more downstream applications and improves the general usability of our models.
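
As a quick illustration of the artefacts added in this diff (not part of the PR itself), below is a minimal sketch of loading the pretrained Falcon-Mamba-7B checkpoint through the standard `transformers` causal-LM API. It assumes a recent `transformers` release that includes FalconMamba support, the `accelerate` package for `device_map="auto"`, and enough GPU memory for the 7B weights; the prompt and generation settings are arbitrary.

```python
# Minimal sketch: load Falcon-Mamba-7B with Hugging Face transformers.
# Assumes a transformers version with FalconMamba support and that
# accelerate is installed for device_map="auto"; adjust dtype/device as needed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-mamba-7b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumed dtype; use float16/float32 if preferred
    device_map="auto",           # places the weights automatically across available devices
)

prompt = "The Falcon series of language models"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Short greedy generation, kept small for illustration.
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The instruct variant listed in the table can be loaded the same way by swapping in its repository ID, typically applying its chat template to format the conversation before generation.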