---
title: README
emoji: 👀
colorFrom: green
colorTo: yellow
sdk: static
pinned: false
---

[Multimodal Art Projection (M-A-P)](https://m-a-p.ai) is an open-source AI research community. 

Community members work on a wide range of research topics, including but not limited to pre-training paradigms for foundation models, large-scale data collection and processing, and derived applications in coding, reasoning, and music creativity.

The community is open to researchers interested in any relevant topic. You are welcome to join us!
- [Discord Channel](https://discord.gg/qeRsjRdRHf)
- Our [Full Paper List](https://huggingface.co/collections/m-a-p/m-a-p-full-paper-list-65e070a694c7b01c5547fbff)
- Email: contact@m-a-p.ai

The development log of our Multimodal Art Projection (m-a-p) model family:
- 🔥08/05/2024: We release the fully transparent large language model [MAP-Neo](https://github.com/multimodal-art-projection/MAP-NEO), a series of models for scaling-law exploration and post-training alignment, along with the training corpus [Matrix](https://huggingface.co/datasets/m-a-p/Matrix).
- 🔥11/04/2024: MuPT [**paper**](https://arxiv.org/abs/2404.06393) and [**demo**](https://twitter.com/GeZhang86038849/status/1778089599035941279) are out. [**HF collection**](https://huggingface.co/collections/m-a-p/mupt-65a2cbcf895d1eca73b9f985).
- 🔥08/04/2024: Chinese Tiny LLM is out. [**HF collection**](https://huggingface.co/collections/m-a-p/chinese-tiny-llm-660d0133dff6856f94ce0fc6).
- 🔥28/02/2024: The release of [**ChatMusician**](https://huggingface.co/collections/m-a-p/chatmusician-65de07b3b87b189c2a588329)'s demo, code, model, data, and benchmark. 😆
- 🔥23/02/2024: The release of [**OpenCodeInterpreter**](https://huggingface.co/collections/m-a-p/opencodeinterpreter-65d312f6f88da990a64da456), which beats the GPT-4 code interpreter on HumanEval.
- 23/01/2024: We release [**CMMMU**](https://huggingface.co/datasets/m-a-p/CMMMU) for better evaluation of Chinese LMMs.
- 13/01/2024: We release a series of **Music Pretrained Transformer (MuPT)** checkpoints, with [**sizes up to 1.3B and a context length of 8192**](https://huggingface.co/m-a-p/MuPT_v0_8192_1.3B). Our models are **LLAMA2**-based and pre-trained on the **world's largest symbolic music dataset, with 10B tokens** ([ABC notation format](https://en.wikipedia.org/wiki/ABC_notation)). We currently support the Megatron-LM format and will release Hugging Face checkpoints soon.
- 02/06/2023: We officially release the [MERT pre-print paper](https://arxiv.org/abs/2306.00107) and training [code](https://github.com/yizhilll/MERT).
- 17/03/2023: We release two advanced music understanding models, [MERT-v1-95M](https://huggingface.co/m-a-p/MERT-v1-95M) and [MERT-v1-330M](https://huggingface.co/m-a-p/MERT-v1-330M), trained with a new paradigm and dataset. They outperform the previous models and generalize better to more tasks.
- 14/03/2023: We retrained the MERT-v0 model on an open-source-only music dataset: [MERT-v0-public](https://huggingface.co/m-a-p/MERT-v0-public).
- 29/12/2022: We release a music understanding model, [MERT-v0](https://huggingface.co/m-a-p/MERT-v0), trained with the **MLM** paradigm, which performs better on downstream tasks.
- 29/10/2022: We release a pre-trained MIR model, [music2vec](https://huggingface.co/m-a-p/music2vec-v1), trained with the **BYOL** paradigm.