Models
Datasets
Spaces
Posts
Docs
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2311.03099

Model Merging Papers

Collection of relevant papers about model merging

Qualitatively characterizing neural network optimization problems

Paper • 1412.6544 • Published Dec 19, 2014 • 4
Averaging Weights Leads to Wider Optima and Better Generalization

Paper • 1803.05407 • Published Mar 14, 2018 • 2
Merging Models with Fisher-Weighted Averaging

Paper • 2111.09832 • Published Nov 18, 2021 • 1
Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time

Paper • 2203.05482 • Published Mar 10, 2022 • 6

Papers about model merging

referenced in the mergekit repo: https://github.com/cg123/mergekit

Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time

Paper • 2203.05482 • Published Mar 10, 2022 • 6
Editing Models with Task Arithmetic

Paper • 2212.04089 • Published Dec 8, 2022 • 6
Resolving Interference When Merging Models

Paper • 2306.01708 • Published Jun 2, 2023 • 13
Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch

Paper • 2311.03099 • Published Nov 6, 2023 • 28

Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch

Paper • 2311.03099 • Published Nov 6, 2023 • 28

Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch

Paper • 2311.03099 • Published Nov 6, 2023 • 28

Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch

Paper • 2311.03099 • Published Nov 6, 2023 • 28

Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch

Paper • 2311.03099 • Published Nov 6, 2023 • 28

Cool Paper Names

Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model

Paper • 2310.17653 • Published Oct 26, 2023 • 2
Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch

Paper • 2311.03099 • Published Nov 6, 2023 • 28
"I Want It That Way": Enabling Interactive Decision Support Using Large Language Models and Constraint Programming

Paper • 2312.06908 • Published Dec 12, 2023 • 5
SaulLM-7B: A pioneering Large Language Model for Law

Paper • 2403.03883 • Published Mar 6 • 75

Alignment: FineTuning-Preference

S-LoRA: Serving Thousands of Concurrent LoRA Adapters

Paper • 2311.03285 • Published Nov 6, 2023 • 28
Tailoring Self-Rationalizers with Multi-Reward Distillation

Paper • 2311.02805 • Published Nov 6, 2023 • 3
Ultra-Long Sequence Distributed Transformer

Paper • 2311.02382 • Published Nov 4, 2023 • 2
OpenChat: Advancing Open-source Language Models with Mixed-Quality Data

Paper • 2309.11235 • Published Sep 20, 2023 • 16

interesting stuff

Chain-of-Verification Reduces Hallucination in Large Language Models

Paper • 2309.11495 • Published Sep 20, 2023 • 38
Adapting Large Language Models via Reading Comprehension

Paper • 2309.09530 • Published Sep 18, 2023 • 77
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages

Paper • 2309.09400 • Published Sep 17, 2023 • 82
Language Modeling Is Compression

Paper • 2309.10668 • Published Sep 19, 2023 • 82

Company

© Hugging Face

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs