Collections including paper arxiv:2205.05055
- Pretraining Data Mixtures Enable Narrow Model Selection Capabilities in Transformer Models
  Paper • 2311.00871 • Published • 2
- Can large language models explore in-context?
  Paper • 2403.15371 • Published • 32
- Data Distributional Properties Drive Emergent In-Context Learning in Transformers
  Paper • 2205.05055 • Published • 2
- Long-context LLMs Struggle with Long In-context Learning
  Paper • 2404.02060 • Published • 35

- Wide Residual Networks
  Paper • 1605.07146 • Published • 2
- Characterizing signal propagation to close the performance gap in unnormalized ResNets
  Paper • 2101.08692 • Published • 2
- Pareto-Optimal Quantized ResNet Is Mostly 4-bit
  Paper • 2105.03536 • Published • 2
- When Vision Transformers Outperform ResNets without Pre-training or Strong Data Augmentations
  Paper • 2106.01548 • Published • 2

- Disentangling Writer and Character Styles for Handwriting Generation
  Paper • 2303.14736 • Published • 2
- A Transformer Architecture for Online Gesture Recognition of Mathematical Expressions
  Paper • 2211.02643 • Published • 2
- A tailored Handwritten-Text-Recognition System for Medieval Latin
  Paper • 2308.09368 • Published • 2
- Scalable handwritten text recognition system for lexicographic sources of under-resourced languages and alphabets
  Paper • 2303.16256 • Published • 2

- Data Incubation -- Synthesizing Missing Data for Handwriting Recognition
  Paper • 2110.07040 • Published • 2
- A Mixture of Expert Approach for Low-Cost Customization of Deep Neural Networks
  Paper • 1811.00056 • Published • 2
- Vulnerability Analysis of Transformer-based Optical Character Recognition to Adversarial Attacks
  Paper • 2311.17128 • Published • 2
- Data Generation for Post-OCR correction of Cyrillic handwriting
  Paper • 2311.15896 • Published • 3

- Measuring the Effects of Data Parallelism on Neural Network Training
  Paper • 1811.03600 • Published • 2
- Adafactor: Adaptive Learning Rates with Sublinear Memory Cost
  Paper • 1804.04235 • Published • 2
- EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
  Paper • 1905.11946 • Published • 3
- Yi: Open Foundation Models by 01.AI
  Paper • 2403.04652 • Published • 62