- Just How Flexible are Neural Networks in Practice?
  Paper • 2406.11463 • Published • 7
- Not All Language Model Features Are Linear
  Paper • 2405.14860 • Published • 39
- KAN: Kolmogorov-Arnold Networks
  Paper • 2404.19756 • Published • 108
- An Interactive Agent Foundation Model
  Paper • 2402.05929 • Published • 27
Collections including paper arxiv:2402.05929
- An Interactive Agent Foundation Model
  Paper • 2402.05929 • Published • 27
- From LLM to Conversational Agent: A Memory Enhanced Architecture with Fine-Tuning of Large Language Models
  Paper • 2401.02777 • Published
- AgentScope: A Flexible yet Robust Multi-Agent Platform
  Paper • 2402.14034 • Published • 12
- LLMs in the Imaginarium: Tool Learning through Simulated Trial and Error
  Paper • 2403.04746 • Published • 22

- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 143
- ReFT: Reasoning with Reinforced Fine-Tuning
  Paper • 2401.08967 • Published • 27
- Tuning Language Models by Proxy
  Paper • 2401.08565 • Published • 20
- TrustLLM: Trustworthiness in Large Language Models
  Paper • 2401.05561 • Published • 64

- DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
  Paper • 2401.02954 • Published • 40
- Qwen Technical Report
  Paper • 2309.16609 • Published • 34
- GPT-4 Technical Report
  Paper • 2303.08774 • Published • 5
- Gemini: A Family of Highly Capable Multimodal Models
  Paper • 2312.11805 • Published • 45

- GAIA: a benchmark for General AI Assistants
  Paper • 2311.12983 • Published • 183
- ToolTalk: Evaluating Tool-Usage in a Conversational Setting
  Paper • 2311.10775 • Published • 7
- TPTU-v2: Boosting Task Planning and Tool Usage of Large Language Model-based Agents in Real-world Systems
  Paper • 2311.11315 • Published • 6
- An Embodied Generalist Agent in 3D World
  Paper • 2311.12871 • Published • 8

- DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models
  Paper • 2309.14509 • Published • 17
- LLM Augmented LLMs: Expanding Capabilities through Composition
  Paper • 2401.02412 • Published • 36
- DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
  Paper • 2401.06066 • Published • 42
- Tuning Language Models by Proxy
  Paper • 2401.08565 • Published • 20