x54-729 committed on
Commit
ae3bb61
1 Parent(s): 8eec32b

update README

Files changed (1)
  1. README.md +27 -31
README.md CHANGED
@@ -20,18 +20,17 @@ license: other
 
 [![evaluation](https://github.com/InternLM/InternLM/assets/22529082/f80a2a58-5ddf-471a-8da4-32ab65c8fd3b)](https://github.com/internLM/OpenCompass/)
 
- [💻Github Repo](https://github.com/InternLM/InternLM) • [🤔Reporting Issues](https://github.com/InternLM/InternLM/issues/new)
 
 </div>
 
 
 ## Introduction
- The second generation of the InternLM model, InternLM2, includes models at two scales: 7B and 20B. For the convenience of users and researchers, we have open-sourced four versions at each scale:
 
- - internlm2-base: A high-quality and highly adaptable model base, serving as an excellent starting point for deep domain adaptation.
- - internlm2 (**recommended**): Built upon internlm2-base, this version has been further pretrained on domain-specific corpora. It shows outstanding performance in evaluations while maintaining robust general language abilities, making it our recommended choice for most applications.
- - internlm2-chat-sft: Based on the base model, it undergoes supervised human alignment training.
- - internlm2-chat (**recommended**): Optimized for conversational interaction on top of internlm2-chat-sft through RLHF, it excels in instruction adherence, empathetic chatting, and tool invocation.
 
 The base model of InternLM2 has the following technical features:
 
@@ -45,15 +44,15 @@ The base model of InternLM2 has the following technical features:
 
 We have evaluated InternLM2 on several important benchmarks using the open-source evaluation tool [OpenCompass](https://github.com/open-compass/opencompass). Some of the evaluation results are shown in the table below. You are welcome to visit the [OpenCompass Leaderboard](https://opencompass.org.cn/rank) for more evaluation results.
 
- | Dataset\Models | InternLM2-7B | InternLM2-Chat-7B | InternLM2-20B | InternLM2-Chat-20B | ChatGPT | GPT-4 |
- | --- | --- | --- | --- | --- | --- | --- |
- | MMLU | 65.8 | 63.7 | 67.7 | 66.5 | 69.1 | 83.0 |
- | AGIEval | 49.9 | 47.2 | 53.0 | 50.3 | 39.9 | 55.1 |
- | BBH | 65.0 | 61.2 | 72.1 | 68.3 | 70.1 | 86.7 |
- | GSM8K | 70.8 | 70.7 | 76.1 | 79.6 | 78.2 | 91.4 |
- | MATH | 20.2 | 23.0 | 25.5 | 31.9 | 28.0 | 45.8 |
- | HumanEval | 43.3 | 59.8 | 48.8 | 67.1 | 73.2 | 74.4 |
- | MBPP(Sanitized) | 51.8 | 51.4 | 63.0 | 65.8 | 78.9 | 79.0 |
 
 
 - The evaluation results were obtained from [OpenCompass](https://github.com/open-compass/opencompass), and the evaluation configuration can be found in the configuration files provided by [OpenCompass](https://github.com/open-compass/opencompass).
@@ -92,34 +91,31 @@ print(output)
 
 The code is licensed under Apache-2.0, while model weights are fully open for academic research and also allow **free** commercial usage. To apply for a commercial license, please fill in the [application form (English)](https://wj.qq.com/s2/12727483/5dba/)/[application form (Chinese)](https://wj.qq.com/s2/12725412/f7c1/). For other questions or collaborations, please contact <[email protected]>.
 
 ## 简介 (Introduction)
- The second-generation InternLM model, InternLM2, comes at two scales: 7B and 20B. To facilitate use and research, we have open-sourced four versions of the model at each scale:
 
- - internlm2-base: a high-quality, highly adaptable model base, and an excellent starting point for deep domain adaptation;
- - internlm2 (**recommended**): built upon internlm2-base and further pretrained on domain-specific corpora, it achieves excellent evaluation results while retaining strong general language ability, and is the base we recommend for most applications;
- - internlm2-chat-sft: based on the base model, trained with supervised human alignment;
- - internlm2-chat (**recommended**): optimized for conversational interaction on top of internlm2-chat-sft through RLHF, with strong instruction following, empathetic chatting, and tool invocation abilities.
 
 The base model of InternLM2 has the following technical features:
 
 - Effective support for ultra-long contexts of up to 200K characters: the model achieves nearly perfect long-text "needle in a haystack" retrieval on 200K-character inputs, and its performance on long-text tasks such as LongBench and L-Eval is also leading among open-source models.
 - Comprehensive performance improvement: all capability dimensions improve over the previous generation, with especially significant gains in reasoning, mathematics, and code.
 
-
 ## InternLM2-1.8B
 
 ### 性能评测 (Performance Evaluation)
 
 We evaluated InternLM2 on several important benchmarks using the open-source evaluation tool [OpenCompass](https://github.com/internLM/OpenCompass/). Some of the evaluation results are shown in the table below. You are welcome to visit the [OpenCompass Leaderboard](https://opencompass.org.cn/rank) for more evaluation results.
 
- | Dataset\Models | InternLM2-7B | InternLM2-Chat-7B | InternLM2-20B | InternLM2-Chat-20B | ChatGPT | GPT-4 |
- | --- | --- | --- | --- | --- | --- | --- |
- | MMLU | 65.8 | 63.7 | 67.7 | 66.5 | 69.1 | 83.0 |
- | AGIEval | 49.9 | 47.2 | 53.0 | 50.3 | 39.9 | 55.1 |
- | BBH | 65.0 | 61.2 | 72.1 | 68.3 | 70.1 | 86.7 |
- | GSM8K | 70.8 | 70.7 | 76.1 | 79.6 | 78.2 | 91.4 |
- | MATH | 20.2 | 23.0 | 25.5 | 31.9 | 28.0 | 45.8 |
- | HumanEval | 43.3 | 59.8 | 48.8 | 67.1 | 73.2 | 74.4 |
- | MBPP(Sanitized) | 51.8 | 51.4 | 63.0 | 65.8 | 78.9 | 79.0 |
 
 - The above results were obtained with [OpenCompass](https://github.com/open-compass/opencompass) (entries marked with `*` are taken from the original papers); detailed test settings can be found in the configuration files provided by [OpenCompass](https://github.com/open-compass/opencompass).
 - Evaluation numbers may vary across versions of [OpenCompass](https://github.com/open-compass/opencompass); please rely on the results from the latest version of [OpenCompass](https://github.com/open-compass/opencompass).
@@ -149,4 +145,4 @@ print(output)
 
 ## 开源许可证 (Open Source License)
 
- The code in this repository is open-sourced under the Apache-2.0 license. The model weights are fully open for academic research, and free commercial use can also be applied for ([application form](https://wj.qq.com/s2/12725412/f7c1/)). For other questions or collaborations, please contact <[email protected]>.
 
 [![evaluation](https://github.com/InternLM/InternLM/assets/22529082/f80a2a58-5ddf-471a-8da4-32ab65c8fd3b)](https://github.com/internLM/OpenCompass/)
 
+ [💻Github Repo](https://github.com/InternLM/InternLM) • [🤔Reporting Issues](https://github.com/InternLM/InternLM/issues/new)
 
 </div>
 
 
 ## Introduction
+ InternLM2-1.8B is the 1.8-billion-parameter version of the second-generation InternLM series. To facilitate use and research, two versions of InternLM2-1.8B have been open-sourced (a loading sketch follows this list):
 
+ - internlm2: Built upon internlm2-base, this version has been further pretrained on domain-specific corpora. It shows outstanding performance in evaluations while maintaining robust general language abilities, making it our recommended choice for most applications.
+ - internlm2-chat: Optimized for conversational interaction on top of internlm2-chat-sft through RLHF, it excels in instruction adherence, empathetic chatting, and tool invocation.
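
For a quick smoke test of the base model, here is a minimal sketch using Hugging Face Transformers. It assumes the weights are published on the Hub as `internlm/internlm2-1_8b`, that the repository ships custom modeling code (hence `trust_remote_code=True`), and that a CUDA GPU is available; adjust the repo id, dtype, or device to your setup.

```python
# Minimal loading/generation sketch; repo id and device choice are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "internlm/internlm2-1_8b"  # assumed Hub repo id

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,   # half precision keeps the 1.8B model small in memory
    trust_remote_code=True,
).cuda().eval()

inputs = tokenizer("A beautiful flower", return_tensors="pt").to(model.device)
with torch.no_grad():
    generated = model.generate(**inputs, max_new_tokens=64, do_sample=False)

output = tokenizer.decode(generated[0], skip_special_tokens=True)
print(output)
```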
 
 The base model of InternLM2 has the following technical features:
 
 
 We have evaluated InternLM2 on several important benchmarks using the open-source evaluation tool [OpenCompass](https://github.com/open-compass/opencompass). Some of the evaluation results are shown in the table below, and a toy illustration of such a run is sketched after the notes that follow. You are welcome to visit the [OpenCompass Leaderboard](https://opencompass.org.cn/rank) for more evaluation results.
 
+ | Dataset\Models | InternLM2-1.8B | InternLM2-Chat-1.8B | InternLM2-7B | InternLM2-Chat-7B |
+ | --- | --- | --- | --- | --- |
+ | MMLU | 46.9 | 47.1 | 65.8 | 63.7 |
+ | AGIEval | 33.4 | 38.8 | 49.9 | 47.2 |
+ | BBH | 37.5 | 35.2 | 65.0 | 61.2 |
+ | GSM8K | 31.2 | 39.7 | 70.8 | 70.7 |
+ | MATH | 5.6 | 11.8 | 20.2 | 23.0 |
+ | HumanEval | 25.0 | 32.9 | 43.3 | 59.8 |
+ | MBPP(Sanitized) | 22.2 | 23.2 | 51.8 | 51.4 |
 
 
 - The evaluation results were obtained from [OpenCompass](https://github.com/open-compass/opencompass), and the evaluation configuration can be found in the configuration files provided by [OpenCompass](https://github.com/open-compass/opencompass).
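
The notes above point to OpenCompass for the exact configurations; the toy sketch below only illustrates the general shape of a generative benchmark run (greedy decoding on one GSM8K-style question, plus a crude answer-extraction heuristic). It is not the OpenCompass pipeline, and the repo id, prompt template, and parsing rule are all assumptions.

```python
# Toy illustration of a generative benchmark check; NOT the OpenCompass pipeline.
# Repo id, prompt template, and answer parsing below are assumptions for illustration.
import re
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "internlm/internlm2-1_8b"  # assumed Hub repo id
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, trust_remote_code=True
).cuda().eval()

question = (
    "Natalia sold clips to 48 of her friends in April, and then she sold "
    "half as many clips in May. How many clips did Natalia sell altogether?"
)
prompt = f"Question: {question}\nAnswer:"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    generated = model.generate(**inputs, max_new_tokens=128, do_sample=False)

# Decode only the newly generated tokens, then grab the last number as the prediction.
completion = tokenizer.decode(
    generated[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
numbers = re.findall(r"-?\d+(?:\.\d+)?", completion)
prediction = numbers[-1] if numbers else None
print(completion)
print("predicted:", prediction, "| reference: 72")
```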
 
 The code is licensed under Apache-2.0, while model weights are fully open for academic research and also allow **free** commercial usage. To apply for a commercial license, please fill in the [application form (English)](https://wj.qq.com/s2/12727483/5dba/)/[application form (Chinese)](https://wj.qq.com/s2/12725412/f7c1/). For other questions or collaborations, please contact <[email protected]>.
 
 ## 简介 (Introduction)
+ InternLM2-1.8B (书生·浦语-1.8B) is the 1.8-billion-parameter version of the second-generation InternLM series. To facilitate use and research, two versions of InternLM2-1.8B are open-sourced (a chat usage sketch follows this list):
 
+ - internlm2: Built upon internlm2-base and further pretrained on domain-specific corpora, it achieves excellent evaluation results while retaining strong general language ability, and is the base we recommend for most applications.
+ - internlm2-chat: Optimized for conversational interaction on top of internlm2-chat-sft through RLHF, with strong instruction following, empathetic chatting, and tool invocation abilities.
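
For the chat variant, a minimal interaction sketch follows. It assumes the weights are published as `internlm/internlm2-chat-1_8b` and that, as with other InternLM chat releases, the repository's remote code exposes a `chat()` helper returning the reply together with the running history; check the model files if your version differs.

```python
# Minimal multi-turn chat sketch; repo id and chat() helper are assumptions
# carried over from other InternLM chat releases.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "internlm/internlm2-chat-1_8b"  # assumed Hub repo id
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, trust_remote_code=True
).cuda().eval()

# First turn: the helper returns the reply plus the accumulated conversation history.
response, history = model.chat(tokenizer, "Hello! Please introduce yourself.", history=[])
print(response)

# Second turn: pass the history back so the model keeps the conversational context.
response, history = model.chat(tokenizer, "Summarize that in one sentence.", history=history)
print(response)
```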
 
 
 The base model of InternLM2 has the following technical features:
 
 - Effective support for ultra-long contexts of up to 200K characters: the model achieves nearly perfect long-text "needle in a haystack" retrieval on 200K-character inputs, and its performance on long-text tasks such as LongBench and L-Eval is also leading among open-source models.
 - Comprehensive performance improvement: all capability dimensions improve over the previous generation, with especially significant gains in reasoning, mathematics, and code.
 
 
 ## InternLM2-1.8B
 
 ### 性能评测 (Performance Evaluation)
 
 We evaluated InternLM2 on several important benchmarks using the open-source evaluation tool [OpenCompass](https://github.com/internLM/OpenCompass/). Some of the evaluation results are shown in the table below. You are welcome to visit the [OpenCompass Leaderboard](https://opencompass.org.cn/rank) for more evaluation results.
 
+ | Dataset\Models | InternLM2-1.8B | InternLM2-Chat-1.8B | InternLM2-7B | InternLM2-Chat-7B |
+ | --- | --- | --- | --- | --- |
+ | MMLU | 46.9 | 47.1 | 65.8 | 63.7 |
+ | AGIEval | 33.4 | 38.8 | 49.9 | 47.2 |
+ | BBH | 37.5 | 35.2 | 65.0 | 61.2 |
+ | GSM8K | 31.2 | 39.7 | 70.8 | 70.7 |
+ | MATH | 5.6 | 11.8 | 20.2 | 23.0 |
+ | HumanEval | 25.0 | 32.9 | 43.3 | 59.8 |
+ | MBPP(Sanitized) | 22.2 | 23.2 | 51.8 | 51.4 |
 
 - The above results were obtained with [OpenCompass](https://github.com/open-compass/opencompass) (entries marked with `*` are taken from the original papers); detailed test settings can be found in the configuration files provided by [OpenCompass](https://github.com/open-compass/opencompass).
 - Evaluation numbers may vary across versions of [OpenCompass](https://github.com/open-compass/opencompass); please rely on the results from the latest version of [OpenCompass](https://github.com/open-compass/opencompass).
 
 ## 开源许可证 (Open Source License)
 
+ The code in this repository is open-sourced under the Apache-2.0 license. The model weights are fully open for academic research, and free commercial use can also be applied for ([application form](https://wj.qq.com/s2/12725412/f7c1/)). For other questions or collaborations, please contact <[email protected]>.