DocLLM: A layout-aware generative language model for multimodal document understanding Paper • 2401.00908 • Published Dec 31, 2023 • 181
Learning Vision from Models Rivals Learning Vision from Data Paper • 2312.17742 • Published Dec 28, 2023 • 15
PanGu-π: Enhancing Language Model Architectures via Nonlinearity Compensation Paper • 2312.17276 • Published Dec 27, 2023 • 15
Infinite-LLM: Efficient LLM Service for Long Context with DistAttention and Distributed KVCache Paper • 2401.02669 • Published Jan 5 • 14
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism Paper • 2401.02954 • Published Jan 5 • 40
Blending Is All You Need: Cheaper, Better Alternative to Trillion-Parameters LLM Paper • 2401.02994 • Published Jan 4 • 47
CRUXEval: A Benchmark for Code Reasoning, Understanding and Execution Paper • 2401.03065 • Published Jan 5 • 10
Let's Go Shopping (LGS) -- Web-Scale Image-Text Dataset for Visual Concept Understanding Paper • 2401.04575 • Published Jan 9 • 14
Bootstrapping LLM-based Task-Oriented Dialogue Agents via Self-Talk Paper • 2401.05033 • Published Jan 10 • 15
DocGraphLM: Documental Graph Language Model for Information Extraction Paper • 2401.02823 • Published Jan 5 • 34
Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text Paper • 2401.12070 • Published Jan 22 • 42
VideoPrism: A Foundational Visual Encoder for Video Understanding Paper • 2402.13217 • Published Feb 20 • 21
API-BLEND: A Comprehensive Corpora for Training and Benchmarking API LLMs Paper • 2402.15491 • Published Feb 23 • 13
DeepSeek-VL: Towards Real-World Vision-Language Understanding Paper • 2403.05525 • Published Mar 8 • 39