---
title: README
emoji: 🚀
colorFrom: red
colorTo: indigo
sdk: static
pinned: false
---

**Do you believe in a better tomorrow? We do. Our team of expert researchers live the dream and work to build it every day.**

* 🦅🐍 **The first SSLM model of the Falcon series has been released open-access, featuring [FalconMamba-7B](https://huggingface.co/collections/tiiuae/falconmamba-7b-66b9a580324dd1598b0f6d4a).**
* 🦅🦅 **The second generation of Falcon models has been released open-access, featuring [Falcon2-11B](https://huggingface.co/tiiuae/falcon-11B) and [Falcon2-11B-vlm](https://huggingface.co/tiiuae/falcon-11B-vlm).**
* 🔥 **[Falcon-180B](https://huggingface.co/tiiuae/falcon-180b) is now available in open-access! [Try it now in our chat demo!](https://huggingface.co/spaces/tiiuae/falcon-180b-demo)**

# News 

* 🐍 **[FalconMamba-7B](https://huggingface.co/tiiuae/falcon-mamba-7b) is now available.** It is the first pure SSM model of the Falcon series, released under the same permissive license. You can interact with it [here](https://huggingface.co/spaces/tiiuae/falcon-mamba-playground), and read the **[FalconMamba Technical Report](https://arxiv.org/abs/2410.05355)** and the **[FalconMamba blogpost](https://huggingface.co/blog/falconmamba)**.
* 📸 **[Falcon2-11B-vlm](https://huggingface.co/tiiuae/falcon-11B-vlm) is now available.** Built on top of the Falcon2-11B model and released under the same permissive license, this open-source model allows users to interact with image content via text.
* 🎉 **TII has just released a new generation of models, starting with [Falcon2-11B](https://huggingface.co/tiiuae/falcon-11B)**, an 11B-parameter causal decoder-only model trained on over 5,000B tokens of [RefinedWeb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb) enhanced with curated corpora. The model is made available under the [TII Falcon License 2.0](https://falconllm-staging.tii.ae/falcon-2-terms-and-conditions.html), a permissive Apache 2.0-based software license which includes an [acceptable use policy](https://falconllm-staging.tii.ae/falcon-2-acceptable-use-policy.html) that promotes the responsible use of AI.
* 💥 **TII has open-sourced Falcon-180B for research and commercial utilization!** Access the [180B](https://huggingface.co/tiiuae/falcon-180b), as well as [7B](https://huggingface.co/tiiuae/falcon-7b)/[40B](https://huggingface.co/tiiuae/falcon-40b) models, and explore our high-quality web dataset, [RefinedWeb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb). 
* ✨ **Falcon-[40B](https://huggingface.co/tiiuae/falcon-40b)/[7B](https://huggingface.co/tiiuae/falcon-7b) are now available under the Apache 2.0 license.** TII has [waived all royalties and commercial usage restrictions](https://www.tii.ae/news/uaes-falcon-40b-worlds-top-ranked-ai-model-technology-innovation-institute-now-royalty-free).

# FalconMamba LLM

We are excited to announce the release of our LLM with a pure SSM architecture, which outperforms all previous SSM models and performs on par with leading transformer-based models.

Papers:
- [FalconMamba Technical Report, Zuo et al. 2024](https://arxiv.org/abs/2410.05355)

More details on the new models and their performance can also be found in our [FalconMamba blogpost](https://huggingface.co/blog/falconmamba).

| **Artefact**        | **Link**                                                         | **Type**                | **Details**                                                       |
|---------------------|------------------------------------------------------------------|-------------------------|-------------------------------------------------------------------|
| 🐍 **FalconMamba-7B**       | [Here](https://huggingface.co/tiiuae/falcon-mamba-7b)                  | *pretrained model*        | 7B parameters pure SSM trained on ~5,800 billion tokens.                   |
| FalconMamba-7B-Instruct  | [Here](https://huggingface.co/tiiuae/falcon-mamba-7b-instruct)         | *instruction/chat model*  | FalconMamba-7B finetuned using only SFT. |
| FalconMamba-7B-pre-decay  | [Here](https://huggingface.co/tiiuae/falcon-mamba-7b-pre-decay)         | *pretrained model*  | FalconMamba-7B pre-decay checkpoint. |
| FalconMamba-7B-4bit  | [Here](https://huggingface.co/tiiuae/falcon-mamba-7b-4bit)         | *pretrained model*  | 4-bit quantized version using GGUF. |
| FalconMamba-7B-Instruct-4bit  | [Here](https://huggingface.co/tiiuae/falcon-mamba-7b-instruct-4bit)         | *instruction/chat model*  | 4-bit quantized version using GGUF. |
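
The non-quantized checkpoints listed above can be tried with the Hugging Face `transformers` library. The following is a minimal sketch, not an official recipe: it assumes a recent `transformers` release with FalconMamba support, PyTorch installed, and enough memory for a 7B model. The helper names `repo_id` and `generate` are illustrative and are not part of any library; the GGUF 4-bit variants are left out because they need a different loading path.

```python
# Repository IDs copied from the table above (non-quantized checkpoints only).
FALCON_MAMBA_REPOS = {
    "base": "tiiuae/falcon-mamba-7b",
    "instruct": "tiiuae/falcon-mamba-7b-instruct",
    "pre-decay": "tiiuae/falcon-mamba-7b-pre-decay",
}


def repo_id(variant: str = "base") -> str:
    """Map a short variant name (hypothetical naming) to its Hub repository ID."""
    try:
        return FALCON_MAMBA_REPOS[variant]
    except KeyError:
        known = ", ".join(sorted(FALCON_MAMBA_REPOS))
        raise ValueError(f"unknown variant {variant!r}; expected one of: {known}") from None


def generate(prompt: str, variant: str = "base", max_new_tokens: int = 32) -> str:
    """Download the chosen checkpoint and continue `prompt`.

    The import is deferred so that merely defining this helper does not
    require transformers to be installed.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    name = repo_id(variant)
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(name)

    # Tokenize, generate a continuation, and decode it back to text.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("The Falcon series of models", variant="base")` downloads the weights on first use, so expect a large download and a slow first call.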


# Falcon2 LLM

Falcon2 LLM is TII's new flagship series of large language models. With this series, we focused on building smaller models with enhanced performance, enabling cheaper inference that can encourage the development of more downstream applications and improve the general usability of our models.

Papers:
- [Falcon2-11B Technical Report, Malartic et al. 2024](https://www.arxiv.org/abs/2407.14885)

More details on the new models and their performance can also be found in our [Falcon2 blogpost](https://huggingface.co/blog/falcon2-11b).

See below for a detailed list of artefacts in the Falcon2 LLM family:

| **Artefact**        | **Link**                                                         | **Type**                | **Details**                                                       |
|---------------------|------------------------------------------------------------------|-------------------------|-------------------------------------------------------------------|
| 🦅🦅 **Falcon-11B**      | [Here](https://huggingface.co/tiiuae/falcon-11B)        | *pretrained model*        | 11B parameters trained on over 5,000 billion tokens.    |
| 🦅📸 **Falcon-11B-vlm**      | [Here](https://huggingface.co/tiiuae/falcon-11B-vlm)        | *vision adapted model*        | Integrates the pretrained CLIP ViT-L/14 vision encoder with our Falcon2-11B chat-finetuned model, and is trained on image-text data.    |

# Falcon LLM

Falcon LLM is TII's flagship series of large language models, built from scratch using a custom data pipeline and distributed training library (see [Almazrouei et al. 2023](https://arxiv.org/abs/2311.16867)).

Papers:
- [RefinedWeb, Penedo et al. 2023](https://proceedings.neurips.cc/paper_files/paper/2023/hash/fa3ed726cc5073b9c31e3e49a807789c-Abstract-Datasets_and_Benchmarks.html)
- [The Falcon Series of Open Language Models, Almazrouei et al. 2023](https://arxiv.org/abs/2311.16867)

To promote collaborations and drive innovation, we have open-sourced a number of artefacts:
* The **Falcon-180B** pretrained and chat models, under the [Falcon-180B TII license](https://huggingface.co/spaces/tiiuae/falcon-180b-license/blob/main/LICENSE.txt). Falcon-180B is the largest and most powerful open-access model available. 
* The **Falcon-7B/40B** pretrained and instruct models, under the Apache 2.0 software license. Falcon-7B/40B models are state-of-the-art for their size, outperforming other open-source models on NLP benchmarks.
* The **RefinedWeb** dataset, a massive web dataset with stringent filtering and large-scale deduplication, enabling models trained on web data alone to match or outperform models trained on curated corpora. See 📓 [the paper](https://arxiv.org/abs/2306.01116) for more information. RefinedWeb is licensed under ODC-By 1.0.
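
Because RefinedWeb is a massive dataset, it is usually more practical to stream a few records than to download it in full. The sketch below assumes the Hugging Face `datasets` library is installed and that the dataset exposes a `train` split; `open_refinedweb` and `first_records` are hypothetical helper names used only for illustration.

```python
from itertools import islice
from typing import Any, Iterable

# Dataset repository ID as linked from this README.
REFINEDWEB_REPO = "tiiuae/falcon-refinedweb"


def open_refinedweb() -> Iterable[dict[str, Any]]:
    """Open RefinedWeb in streaming mode so nothing is downloaded up front.

    The import is deferred so that the helpers can be defined without
    the datasets library being installed.
    """
    from datasets import load_dataset

    # Assumption: the dataset provides a "train" split.
    return load_dataset(REFINEDWEB_REPO, split="train", streaming=True)


def first_records(stream: Iterable[dict[str, Any]], n: int = 3) -> list[dict[str, Any]]:
    """Take the first `n` records from any iterable of records."""
    return list(islice(stream, n))
```

For instance, `for record in first_records(open_refinedweb()): print(sorted(record))` would list the field names of the first three records without assuming what those fields are.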

See below for a detailed list of artefacts in the Falcon LLM family:

| **Artefact**        | **Link**                                                         | **Type**                | **Details**                                                       |
|---------------------|------------------------------------------------------------------|-------------------------|-------------------------------------------------------------------|
| 🥇 **Falcon-180B**      | [Here](https://huggingface.co/tiiuae/falcon-180b)        | *pretrained model*        | 180B parameters trained on 3,500 billion tokens.                    |
| Falcon-180B-Chat | [Here](https://huggingface.co/tiiuae/falcon-180b-chat)        | *chat model*  | Falcon-180B finetuned on a mixture of [Ultrachat](https://huggingface.co/datasets/stingning/ultrachat), [Platypus](https://huggingface.co/datasets/garage-bAInd/Open-Platypus) and [Airoboros](https://huggingface.co/datasets/jondurbin/airoboros-2.1).                         |
| 🥈 **Falcon-40B**      | [Here](https://huggingface.co/tiiuae/falcon-40b)        | *pretrained model*        | 40B parameters trained on 1,000 billion tokens.                    |
| Falcon-40B-Instruct | [Here](https://huggingface.co/tiiuae/falcon-40b-instruct)        | *instruction/chat model*  | Falcon-40B finetuned on the [Baize](https://github.com/project-baize/baize-chatbot) dataset.                         |
| 🥉 **Falcon-7B**       | [Here](https://huggingface.co/tiiuae/falcon-7b)                  | *pretrained model*        | 6.7B parameters trained on 1,500 billion tokens.                   |
| Falcon-7B-Instruct  | [Here](https://huggingface.co/tiiuae/falcon-7b-instruct)         | *instruction/chat model*  | Falcon-7B finetuned on the [Baize](https://github.com/project-baize/baize-chatbot), [GPT4All](https://github.com/nomic-ai/gpt4all), and [GPTeacher](https://github.com/teknium1/GPTeacher) datasets. |
| 📀 **RefinedWeb**      | [Here](https://huggingface.co/datasets/tiiuae/falcon-refinedweb) | *pretraining web dataset* | ~600 billion "high-quality" tokens.                                |
| Falcon-RW-1B        | [Here](https://huggingface.co/tiiuae/falcon-rw-1b)               | *pretrained model*        | 1.3B parameters trained on 350 billion tokens.                     |
| Falcon-RW-7B        | [Here](https://huggingface.co/tiiuae/falcon-rw-7b)               | *pretrained model*        | 7.5B parameters trained on 350 billion tokens.                     |
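
One way to compare the training budgets in the table is to divide training tokens by parameter count. The snippet below is a small arithmetic sketch that uses only the figures quoted in the table; the `tokens_per_parameter` helper is illustrative, and the ratio is a rough descriptive figure rather than a quality measure.

```python
# (parameters in billions, training tokens in billions), copied from the table above.
FALCON_MODELS = {
    "Falcon-180B": (180.0, 3500.0),
    "Falcon-40B": (40.0, 1000.0),
    "Falcon-7B": (6.7, 1500.0),
    "Falcon-RW-1B": (1.3, 350.0),
    "Falcon-RW-7B": (7.5, 350.0),
}


def tokens_per_parameter(params_billion: float, tokens_billion: float) -> float:
    """Return the number of training tokens seen per model parameter."""
    if params_billion <= 0:
        raise ValueError("parameter count must be positive")
    return tokens_billion / params_billion


for name, (params, tokens) in FALCON_MODELS.items():
    ratio = tokens_per_parameter(params, tokens)
    print(f"{name:<14}{ratio:8.1f} tokens per parameter")
```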

# About us

The [Technology Innovation Institute](https://www.tii.ae) (TII) is a leading global research center dedicated to pushing the frontiers of knowledge. Our teams of scientists, researchers and engineers work in an open, flexible and agile environment to deliver discovery science and transformative technologies. Our work means we will not only prepare for the future; we will create it. Working together, we are committed to inspiring innovation for a better tomorrow.

We are part of the Abu Dhabi Government’s Advanced Technology Research Council, which oversees technology research in the emirate. As a disruptor in science, we set new standards and serve as a catalyst for change.

Faced with a future of limitless possibilities and supported by strategically funded investments, we are encouraging a culture of discovery. Our work reinforces Abu Dhabi and the UAE’s status as an R&D hub and a global leader in breakthrough technologies.