Towards the Law of Capacity Gap in Distilling Language Models • Paper 2311.07052 • Published Nov 13, 2023