Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss Paper • 2410.17243 • Published Oct 2024 • 88
AutoTrain: No-code training for state-of-the-art models Paper • 2410.15735 • Published Oct 2024 • 56
HelpSteer2-Preference: Complementing Ratings with Preferences Paper • 2410.01257 • Published Oct 2 • 19
Llama-3.1-Nemotron-70B Collection SOTA models on Arena Hard and RewardBench as of 1 Oct 2024. • 6 items • Updated 28 days ago • 132
To Code, or Not To Code? Exploring Impact of Code in Pre-training Paper • 2408.10914 • Published Aug 20 • 40
SWE-bench-java: A GitHub Issue Resolving Benchmark for Java Paper • 2408.14354 • Published Aug 26 • 40
The Mamba in the Llama: Distilling and Accelerating Hybrid Models Paper • 2408.15237 • Published Aug 27 • 36
Building and better understanding vision-language models: insights and future directions Paper • 2408.12637 • Published Aug 22 • 117
SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models Paper • 2407.15841 • Published Jul 22 • 39
Benchmarking Trustworthiness of Multimodal Large Language Models: A Comprehensive Study Paper • 2406.07057 • Published Jun 11 • 15
MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations for Medicine Paper • 2408.02900 • Published Aug 6 • 25
Controllable Text Generation for Large Language Models: A Survey Paper • 2408.12599 • Published Aug 22 • 62
xGen-MM (BLIP-3): A Family of Open Large Multimodal Models Paper • 2408.08872 • Published Aug 16 • 97
OpenResearcher: Unleashing AI for Accelerated Scientific Research Paper • 2408.06941 • Published Aug 13 • 30