Puffy Bird

puffy310

AI & ML interests

None yet

puffy310's activity

upvoted a paper about 2 months ago

Qwen2.5-Coder Technical Report

Paper • 2409.12186 • Published Sep 18 • 130

upvoted a paper 3 months ago

FancyVideo: Towards Dynamic and Consistent Video Generation via Cross-frame Textual Guidance

Paper • 2408.08189 • Published Aug 15 • 14

upvoted 9 papers 4 months ago

Let the Expert Stick to His Last: Expert-Specialized Fine-Tuning for Sparse Architectural Large Language Models

Paper • 2407.01906 • Published Jul 2 • 34

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

Paper • 2407.03320 • Published Jul 3 • 92

TokenPacker: Efficient Visual Projector for Multimodal LLM

Paper • 2407.02392 • Published Jul 2 • 21

To Forget or Not? Towards Practical Knowledge Unlearning for Large Language Models

Paper • 2407.01920 • Published Jul 2 • 13

OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation

Paper • 2407.02371 • Published Jul 2 • 49

Step-Controlled DPO: Leveraging Stepwise Error for Enhanced Mathematical Reasoning

Paper • 2407.00782 • Published Jun 30 • 23

We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning?

Paper • 2407.01284 • Published Jul 1 • 75

LiteSearch: Efficacious Tree Search for LLM

Paper • 2407.00320 • Published Jun 29 • 37

Scaling Synthetic Data Creation with 1,000,000,000 Personas

Paper • 2406.20094 • Published Jun 28 • 94

upvoted 9 papers 5 months ago

Adam-mini: Use Fewer Learning Rates To Gain More

Paper • 2406.16793 • Published Jun 24 • 67

Unlocking Continual Learning Abilities in Language Models

Paper • 2406.17245 • Published Jun 25 • 28

Video-Infinity: Distributed Long Video Generation

Paper • 2406.16260 • Published Jun 24 • 28

Sparser is Faster and Less is More: Efficient Sparse Attention for Long-Range Transformers

Paper • 2406.16747 • Published Jun 24 • 18

AutoDetect: Towards a Unified Framework for Automated Weakness Detection in Large Language Models

Paper • 2406.16714 • Published Jun 24 • 10

BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions

Paper • 2406.15877 • Published Jun 22 • 45

Efficient Continual Pre-training by Mitigating the Stability Gap

Paper • 2406.14833 • Published Jun 21 • 19

Scaling Laws for Linear Complexity Language Models

Paper • 2406.16690 • Published Jun 24 • 22

DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence

Paper • 2406.11931 • Published Jun 17 • 57