KABI

dongguanting

https://dongguanting.github.io/

AI & ML interests

Information Extraction and Retrieval / Alignment for Large Language Models

dongguanting's activity

upvoted a paper 7 days ago

HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems

Paper • 2411.02959 • Published 8 days ago • 57

upvoted 2 papers 13 days ago

CORAL: Benchmarking Multi-turn Conversational Retrieval-Augmentation Generation

Paper • 2410.23090 • Published 13 days ago • 52

CLEAR: Character Unlearning in Textual and Visual Modalities

Paper • 2410.18057 • Published 20 days ago • 198

upvoted a paper 28 days ago

LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models

Paper • 2410.09732 • Published about 1 month ago • 54

upvoted 2 papers 29 days ago

MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models

Paper • 2410.10139 • Published 30 days ago • 50

Toward General Instruction-Following Alignment for Retrieval-Augmented Generation

Paper • 2410.09584 • Published Oct 12 • 45

upvoted a paper about 1 month ago

MSI-Agent: Incorporating Multi-Scale Insight into Embodied Agents for Superior Planning and Decision-Making

Paper • 2409.16686 • Published Sep 25 • 8

upvoted 5 papers about 2 months ago

Instruction Following without Instruction Tuning

Paper • 2409.14254 • Published Sep 21 • 27

Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models

Paper • 2409.17146 • Published Sep 25 • 100

InfiMM-WebMath-40B: Advancing Multimodal Pre-Training for Enhanced Mathematical Reasoning

Paper • 2409.12568 • Published Sep 19 • 47

MMSearch: Benchmarking the Potential of Large Models as Multi-modal Search Engines

Paper • 2409.12959 • Published Sep 19 • 36

LLMs + Persona-Plug = Personalized LLMs

Paper • 2409.11901 • Published Sep 18 • 30

upvoted 7 papers 2 months ago

PingPong: A Benchmark for Role-Playing Language Models with User Emulation and Multi-Model Evaluation

Paper • 2409.06820 • Published Sep 10 • 62

MemoRAG: Moving towards Next-Gen RAG Via Memory-Inspired Knowledge Discovery

Paper • 2409.05591 • Published Sep 9 • 28

Towards a Unified View of Preference Learning for Large Language Models: A Survey

Paper • 2409.02795 • Published Sep 4 • 72

Draw an Audio: Leveraging Multi-Instruction for Video-to-Audio Synthesis

Paper • 2409.06135 • Published Sep 10 • 14

MMEvol: Empowering Multimodal Large Language Models with Evol-Instruct

Paper • 2409.05840 • Published Sep 9 • 45

How Do Your Code LLMs Perform? Empowering Code Instruction Tuning with High-Quality Data

Paper • 2409.03810 • Published Sep 5 • 30

CogVLM2: Visual Language Models for Image and Video Understanding

Paper • 2408.16500 • Published Aug 29 • 56

upvoted a collection 2 months ago

Qwen2-VL

Vision-language model series based on Qwen2 • 15 items • Updated Sep 18 • 150