Agents - a kaizuberbuehler Collection

Agents

updated Oct 2

More Agents Is All You Need

Paper • 2402.05120 • Published Feb 3 • 51
OS-Copilot: Towards Generalist Computer Agents with Self-Improvement

Paper • 2402.07456 • Published Feb 12 • 41
Generative Agents: Interactive Simulacra of Human Behavior

Paper • 2304.03442 • Published Apr 7, 2023 • 11
Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models

Paper • 2310.04406 • Published Oct 6, 2023 • 8
AgentCoder: Multi-Agent-based Code Generation with Iterative Testing and Optimisation

Paper • 2312.13010 • Published Dec 20, 2023 • 4
GAIA: a benchmark for General AI Assistants

Paper • 2311.12983 • Published Nov 21, 2023 • 183
LLM Agent Operating System

Paper • 2403.16971 • Published Mar 25 • 65
Octopus v2: On-device language model for super agent

Paper • 2404.01744 • Published Apr 2 • 57
AutoCrawler: A Progressive Understanding Web Agent for Web Crawler Generation

Paper • 2404.12753 • Published Apr 19 • 41
Scaling Instructable Agents Across Many Simulated Worlds

Paper • 2404.10179 • Published Mar 13 • 26
OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

Paper • 2404.07972 • Published Apr 11 • 44
WILBUR: Adaptive In-Context Learning for Robust and Accurate Web Agents

Paper • 2404.05902 • Published Apr 8 • 20
Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs

Paper • 2404.05719 • Published Apr 8 • 80
AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent

Paper • 2404.03648 • Published Apr 4 • 24
Voyager: An Open-Ended Embodied Agent with Large Language Models

Paper • 2305.16291 • Published May 25, 2023 • 9
LASER: LLM Agent with State-Space Exploration for Web Navigation

Paper • 2309.08172 • Published Sep 15, 2023 • 11
The Rise and Potential of Large Language Model Based Agents: A Survey

Paper • 2309.07864 • Published Sep 14, 2023 • 7
Reflexion: Language Agents with Verbal Reinforcement Learning

Paper • 2303.11366 • Published Mar 20, 2023 • 4
LEGENT: Open Platform for Embodied Agents

Paper • 2404.18243 • Published Apr 28 • 21
Diffusion for World Modeling: Visual Details Matter in Atari

Paper • 2405.12399 • Published May 20 • 27
OpenVLA: An Open-Source Vision-Language-Action Model

Paper • 2406.09246 • Published Jun 13 • 36
SwiftSage: A Generative Agent with Fast and Slow Thinking for Complex Interactive Tasks

Paper • 2305.17390 • Published May 27, 2023 • 2
MMAU: A Holistic Benchmark of Agent Capabilities Across Diverse Domains

Paper • 2407.18961 • Published Jul 18 • 38
AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agents

Paper • 2407.18901 • Published Jul 26 • 31
Large Language Monkeys: Scaling Inference Compute with Repeated Sampling

Paper • 2407.21787 • Published Jul 31 • 3
OmniParser for Pure Vision Based GUI Agent

Paper • 2408.00203 • Published Aug 1 • 23
WebArena: A Realistic Web Environment for Building Autonomous Agents

Paper • 2307.13854 • Published Jul 25, 2023 • 23
Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning

Paper • 2407.20798 • Published Jul 30 • 23
AgentGen: Enhancing Planning Abilities for Large Language Model based Agent via Environment and Task Generation

Paper • 2408.00764 • Published Aug 1 • 1
Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents

Paper • 2408.07060 • Published Aug 13 • 40
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery

Paper • 2408.06292 • Published Aug 12 • 115
SWE-bench-java: A GitHub Issue Resolving Benchmark for Java

Paper • 2408.14354 • Published Aug 26 • 40
AgentClinic: a multimodal agent benchmark to evaluate AI in simulated clinical environments

Paper • 2405.07960 • Published May 13 • 1
On the limits of agency in agent-based models

Paper • 2409.10568 • Published Sep 14 • 12
DSBench: How Far Are Data Science Agents to Becoming Data Science Experts?

Paper • 2409.07703 • Published Sep 12 • 66
HyperAgent: Generalist Software Engineering Agents to Solve Coding Tasks at Scale

Paper • 2409.16299 • Published Sep 9 • 9