Collections

Collections including paper arxiv:2311.14495

StableSSM: Alleviating the Curse of Memory in State-space Models through Stable Reparameterization

Paper • 2311.14495 • Published Nov 24, 2023 • 1
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model

Paper • 2401.09417 • Published Jan 17, 2024 • 58
SegMamba: Long-range Sequential Modeling Mamba For 3D Medical Image Segmentation

Paper • 2401.13560 • Published Jan 24, 2024 • 1
Graph-Mamba: Towards Long-Range Graph Sequence Modeling with Selective State Spaces

Paper • 2402.00789 • Published Feb 1, 2024 • 2

Trellis Networks for Sequence Modeling

Paper • 1810.06682 • Published Oct 15, 2018 • 1
Pruning Very Deep Neural Network Channels for Efficient Inference

Paper • 2211.08339 • Published Nov 14, 2022 • 1
LAPP: Layer Adaptive Progressive Pruning for Compressing CNNs from Scratch

Paper • 2309.14157 • Published Sep 25, 2023 • 1
Mamba: Linear-Time Sequence Modeling with Selective State Spaces

Paper • 2312.00752 • Published Dec 1, 2023 • 138

Trellis Networks for Sequence Modeling

Paper • 1810.06682 • Published Oct 15, 2018 • 1
ProSG: Using Prompt Synthetic Gradients to Alleviate Prompt Forgetting of RNN-like Language Models

Paper • 2311.01981 • Published Nov 3, 2023 • 1
Gated recurrent neural networks discover attention

Paper • 2309.01775 • Published Sep 4, 2023 • 7
Inverse Approximation Theory for Nonlinear Recurrent Neural Networks

Paper • 2305.19190 • Published May 30, 2023 • 1

Augmenting Pre-trained Language Models with QA-Memory for Open-Domain Question Answering

Paper • 2204.04581 • Published Apr 10, 2022 • 1
Retrieval-Augmented Multimodal Language Modeling

Paper • 2211.12561 • Published Nov 22, 2022 • 1
When Not to Trust Language Models: Investigating Effectiveness of Parametric and Non-Parametric Memories

Paper • 2212.10511 • Published Dec 20, 2022 • 1
Memorizing Transformers

Paper • 2203.08913 • Published Mar 16, 2022 • 2
