Models
Datasets
Spaces
Posts
Docs
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2408.01050

The Impact of Hyperparameters on Large Language Model Inference Performance: An Evaluation of vLLM and HuggingFace Pipelines

Paper • 2408.01050 • Published Aug 2 • 8
POA: Pre-training Once for Models of All Sizes

Paper • 2408.01031 • Published Aug 2 • 26
Reenact Anything: Semantic Video Motion Transfer Using Motion-Textual Inversion

Paper • 2408.00458 • Published Aug 1 • 10
MIO: A Foundation Model on Multimodal Tokens

Paper • 2409.17692 • Published Sep 26 • 49

about 15 hours ago

The Impact of Hyperparameters on Large Language Model Inference Performance: An Evaluation of vLLM and HuggingFace Pipelines

Paper • 2408.01050 • Published Aug 2 • 8
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters

Paper • 2408.03314 • Published Aug 6 • 33
Towards a Unified View of Preference Learning for Large Language Models: A Survey

Paper • 2409.02795 • Published Sep 4 • 72
Paper Copilot: A Self-Evolving and Efficient LLM System for Personalized Academic Assistance

Paper • 2409.04593 • Published Sep 6 • 22

Inference Optimization

The Impact of Hyperparameters on Large Language Model Inference Performance: An Evaluation of vLLM and HuggingFace Pipelines

Paper • 2408.01050 • Published Aug 2 • 8
Efficient Inference of Vision Instruction-Following Models with Elastic Cache

Paper • 2407.18121 • Published Jul 25 • 15
LazyLLM: Dynamic Token Pruning for Efficient Long Context LLM Inference

Paper • 2407.14057 • Published Jul 19 • 44
Q-Sparse: All Large Language Models can be Fully Sparsely-Activated

Paper • 2407.10969 • Published Jul 15 • 20

CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data

Paper • 2404.15653 • Published Apr 24 • 26
MoDE: CLIP Data Experts via Clustering

Paper • 2404.16030 • Published Apr 24 • 12
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning

Paper • 2405.12130 • Published May 20 • 45
Reducing Transformer Key-Value Cache Size with Cross-Layer Attention

Paper • 2405.12981 • Published May 21 • 28

A Survey on Data Selection for Language Models

Paper • 2402.16827 • Published Feb 26 • 4
Instruction Tuning with Human Curriculum

Paper • 2310.09518 • Published Oct 14, 2023 • 3
Fine-Tuning or Retrieval? Comparing Knowledge Injection in LLMs

Paper • 2312.05934 • Published Dec 10, 2023 • 1
Language Models as Agent Models

Paper • 2212.01681 • Published Dec 3, 2022

Company

© Hugging Face

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs