- Just How Flexible are Neural Networks in Practice?
  Paper • 2406.11463 • Published • 7
- Not All Language Model Features Are Linear
  Paper • 2405.14860 • Published • 39
- KAN: Kolmogorov-Arnold Networks
  Paper • 2404.19756 • Published • 108
- An Interactive Agent Foundation Model
  Paper • 2402.05929 • Published • 27
Collections
Collections including paper arxiv:2401.14953

- Order Matters in the Presence of Dataset Imbalance for Multilingual Learning
  Paper • 2312.06134 • Published • 2
- Efficient Monotonic Multihead Attention
  Paper • 2312.04515 • Published • 6
- Contrastive Decoding Improves Reasoning in Large Language Models
  Paper • 2309.09117 • Published • 37
- Exploring Format Consistency for Instruction Tuning
  Paper • 2307.15504 • Published • 7

- Efficient Tool Use with Chain-of-Abstraction Reasoning
  Paper • 2401.17464 • Published • 16
- Transforming and Combining Rewards for Aligning Large Language Models
  Paper • 2402.00742 • Published • 11
- DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
  Paper • 2402.03300 • Published • 69
- Specialized Language Models with Cheap Inference from Limited Domain Data
  Paper • 2402.01093 • Published • 45

- Learning Universal Predictors
  Paper • 2401.14953 • Published • 18
- Anything in Any Scene: Photorealistic Video Object Insertion
  Paper • 2401.17509 • Published • 16
- SymbolicAI: A framework for logic-based approaches combining generative models and solvers
  Paper • 2402.00854 • Published • 19
- StrokeNUWA: Tokenizing Strokes for Vector Graphic Synthesis
  Paper • 2401.17093 • Published • 18

- Meta-Prompting: Enhancing Language Models with Task-Agnostic Scaffolding
  Paper • 2401.12954 • Published • 28
- Learning Universal Predictors
  Paper • 2401.14953 • Published • 18
- TravelPlanner: A Benchmark for Real-World Planning with Language Agents
  Paper • 2402.01622 • Published • 33
- Do Large Language Models Latently Perform Multi-Hop Reasoning?
  Paper • 2402.16837 • Published • 24

- LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models
  Paper • 2309.12307 • Published • 87
- NEFTune: Noisy Embeddings Improve Instruction Finetuning
  Paper • 2310.05914 • Published • 14
- SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling
  Paper • 2312.15166 • Published • 56
- Soaring from 4K to 400K: Extending LLM's Context with Activation Beacon
  Paper • 2401.03462 • Published • 26

- Eureka: Human-Level Reward Design via Coding Large Language Models
  Paper • 2310.12931 • Published • 26
- GENOME: GenerativE Neuro-symbOlic visual reasoning by growing and reusing ModulEs
  Paper • 2311.04901 • Published • 7
- Hiformer: Heterogeneous Feature Interactions Learning with Transformers for Recommender Systems
  Paper • 2311.05884 • Published • 5
- PolyMaX: General Dense Prediction with Mask Transformer
  Paper • 2311.05770 • Published • 6