Shao

Castielll

AI & ML interests

None yet

Organizations

None yet

Castielll's activity

upvoted a paper 2 months ago

Mini-Omni: Language Models Can Hear, Talk While Thinking in Streaming

Paper • 2408.16725 • Published Aug 29 • 52

upvoted 3 papers 3 months ago

Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

Paper • 2408.11039 • Published Aug 20 • 56

VITA: Towards Open-Source Interactive Omni Multimodal LLM

Paper • 2408.05211 • Published Aug 9 • 46

The Llama 3 Herd of Models

Paper • 2407.21783 • Published Jul 31 • 105

upvoted 2 papers 4 months ago

PSLM: Parallel Generation of Text and Speech with LLMs for Low-Latency Spoken Dialogue Systems

Paper • 2406.12428 • Published Jun 18 • 1

Stable Audio Open

Paper • 2407.14358 • Published Jul 19 • 23

upvoted 4 papers 7 months ago

Long-form music generation with latent diffusion

Paper • 2404.10301 • Published Apr 16 • 24

Audio Dialogues: Dialogues dataset for audio and music understanding

Paper • 2404.07616 • Published Apr 11 • 15

DeepSeek-VL: Towards Real-World Vision-Language Understanding

Paper • 2403.05525 • Published Mar 8 • 39

MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation

Paper • 2404.05674 • Published Apr 8 • 13