slippylolo committed on
Commit
8511e3e
1 Parent(s): c2f4806

Update with Falcon LLM announcement

Files changed (1)
  1. README.md +40 -2
README.md CHANGED
@@ -1,6 +1,6 @@
  ---
  title: README
- emoji: 🐠
  colorFrom: red
  colorTo: indigo
  sdk: static
@@ -9,8 +9,46 @@ pinned: false
 
  **Do you believe in a better tomorrow? We do. Our team of expert researchers live the dream and work to build it every day.**
 
  The [Technology Innovation Institute](https://www.tii.ae) (TII) is a leading global research center dedicated to pushing the frontiers of knowledge. Our teams of scientists, researchers and engineers work in an open, flexible and agile environment to deliver discovery science and transformative technologies. Our work means we will not only prepare for the future; we will create it. Working together, we are committed to inspiring innovation for a better tomorrow.
 
  We are part of Abu Dhabi Government’s Advanced Technology Research Council, which oversees technology research in the emirate. As a disruptor in science, we are setting new standards and serve as a catalyst for change.
 
- Faced with a future of limitless possibilities and supported by strategically funded investments, we are encouraging a culture of discovery. Our work reinforces Abu Dhabi and the UAE’s status as an R&D hub and a global leader in breakthrough technologies.
 
  ---
  title: README
+ emoji: 🚀
  colorFrom: red
  colorTo: indigo
  sdk: static
 
 
  **Do you believe in a better tomorrow? We do. Our team of expert researchers live the dream and work to build it every day.**
 
+
+ # News
+
+ * 💥 **TII has open-sourced Falcon LLM for research and commercial use!** Access the [7B](https://huggingface.co/tiiuae/falcon-7b)/[40B](https://huggingface.co/tiiuae/falcon-40b) models, and explore our high-quality web dataset, [RefinedWeb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb).
+ * 🤗 TII is inviting the global research community and SME entrepreneurs to submit use-case proposals for Falcon LLM. Learn more on the [Falcon LLM website](https://falconllm.tii.ae).
+
+
+ # Falcon LLM
+
+ Falcon LLM is TII's flagship series of large language models, built from scratch using a custom data pipeline and distributed training library. Papers coming soon 😊.
+
+ To promote collaboration and drive innovation, we have open-sourced a number of artefacts:
+ * The **Falcon-7B/40B** pretrained and instruct models, under the [TII Falcon LLM License](https://huggingface.co/tiiuae/falcon-7b/raw/main/LICENSE.txt). The Falcon-7B and Falcon-40B models are state-of-the-art for their size, outperforming most other models on NLP benchmarks.
+ * The **RefinedWeb** dataset, a massive web dataset with stringent filtering and large-scale deduplication, enabling models trained on web data alone to match or outperform models trained on curated corpora. RefinedWeb is licensed under Apache 2.0.
+
+ See below for a detailed list of artefacts in the Falcon LLM family:
+
+ | **Artefact** | **Link** | **Type** | **Details** |
+ |---------------------|------------------------------------------------------------------|---------------------------|-------------------------------------------------------------------|
+ | 🥇 **Falcon-40B** | [Here](https://huggingface.co/tiiuae/falcon-40b) | *pretrained model* | 40B parameters trained on 1,000 billion tokens. |
+ | Falcon-40B-Instruct | [Here](https://huggingface.co/tiiuae/falcon-40b-instruct) | *instruction/chat model* | Falcon-40B finetuned on the [Baize](https://github.com/project-baize/baize-chatbot) dataset. |
+ | 🥈 **Falcon-7B** | [Here](https://huggingface.co/tiiuae/falcon-7b) | *pretrained model* | 6.7B parameters trained on 1,500 billion tokens. |
+ | Falcon-7B-Instruct | [Here](https://huggingface.co/tiiuae/falcon-7b-instruct) | *instruction/chat model* | Falcon-7B finetuned on the [Baize](https://github.com/project-baize/baize-chatbot), [GPT4All](https://github.com/nomic-ai/gpt4all), and [GPTeacher](https://github.com/teknium1/GPTeacher) datasets. |
+ | 📀 **RefinedWeb** | [Here](https://huggingface.co/datasets/tiiuae/falcon-refinedweb) | *pretraining web dataset* | ~600 billion "high-quality" tokens. |
+ | Falcon-RW-1B | [Here](https://huggingface.co/tiiuae/falcon-rw-1b) | *pretrained model* | 1.3B parameters trained on 350 billion tokens. |
+ | Falcon-RW-7B | [Here](https://huggingface.co/tiiuae/falcon-rw-7b) | *pretrained model* | 7.5B parameters trained on 350 billion tokens. |
+
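As a reading aid for the table above, the following Python sketch computes how many training tokens each pretrained model saw per parameter. The parameter and token counts are copied from the table; the `PRETRAINED_MODELS` layout and the `tokens_per_parameter` and `ratios` helpers are illustrative names chosen here, not part of any TII tooling.

```python
# Parameter and token counts copied from the table above, in billions.
# The dictionary layout and helper names are illustrative assumptions.
PRETRAINED_MODELS = {
    "Falcon-40B": {"params_b": 40.0, "tokens_b": 1000.0},
    "Falcon-7B": {"params_b": 6.7, "tokens_b": 1500.0},
    "Falcon-RW-1B": {"params_b": 1.3, "tokens_b": 350.0},
    "Falcon-RW-7B": {"params_b": 7.5, "tokens_b": 350.0},
}


def tokens_per_parameter(params_b: float, tokens_b: float) -> float:
    """Return the number of training tokens seen per model parameter."""
    return tokens_b / params_b


def ratios() -> dict:
    """Map each model name to its tokens-per-parameter ratio."""
    return {
        name: tokens_per_parameter(**spec)
        for name, spec in PRETRAINED_MODELS.items()
    }


if __name__ == "__main__":
    # List models from the most to the least tokens per parameter.
    for name, ratio in sorted(ratios().items(), key=lambda kv: kv[1], reverse=True):
        print(f"{name}: {ratio:.1f} tokens per parameter")
```

With the table's figures, Falcon-40B works out to 25 tokens per parameter, while the smaller models were trained on far more tokens relative to their size.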
+
+ ## TII Falcon LLM License
+
+ We have made our models available under the [TII Falcon LLM License](https://huggingface.co/tiiuae/falcon-7b/raw/main/LICENSE.txt), a fork of Apache 2.0:
+ * You can freely use our models for research and/or personal purposes;
+ * You are allowed to share and build derivatives of these models, but you are required to give attribution and to share them under the same license;
+ * For commercial use, you are exempt from royalty payments if the attributable revenues are below $1M/year; otherwise, you should enter into a commercial agreement with TII.
+
+
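The commercial-use rule summarized above can be sketched as a small decision function. This is a simplification that encodes only the README's bullet points, not the license text itself; the function name, its parameters, and the constant are hypothetical and introduced purely for illustration.

```python
# Revenue threshold taken from the "$1M/year" figure in the summary above.
# The constant and function names are hypothetical, not defined by TII.
ROYALTY_FREE_REVENUE_CAP_USD = 1_000_000


def needs_commercial_agreement(
    commercial_use: bool, attributable_revenue_usd: float = 0.0
) -> bool:
    """Return True when the summary points to a commercial agreement with TII.

    Research and personal use never trigger an agreement in this sketch.
    Commercial use triggers one only when yearly attributable revenue is
    not below the cap.
    """
    if not commercial_use:
        return False
    return attributable_revenue_usd >= ROYALTY_FREE_REVENUE_CAP_USD
```

For example, `needs_commercial_agreement(True, 250_000)` evaluates to `False`, because the revenue is below the cap. Attribution and share-alike obligations apply in every case and are not modelled here.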
+ # About us
+
  The [Technology Innovation Institute](https://www.tii.ae) (TII) is a leading global research center dedicated to pushing the frontiers of knowledge. Our teams of scientists, researchers and engineers work in an open, flexible and agile environment to deliver discovery science and transformative technologies. Our work means we will not only prepare for the future; we will create it. Working together, we are committed to inspiring innovation for a better tomorrow.
 
  We are part of Abu Dhabi Government’s Advanced Technology Research Council, which oversees technology research in the emirate. As a disruptor in science, we are setting new standards and serve as a catalyst for change.
 
+ Faced with a future of limitless possibilities and supported by strategically funded investments, we are encouraging a culture of discovery. Our work reinforces Abu Dhabi and the UAE’s status as an R&D hub and a global leader in breakthrough technologies.