- Multi-Layer Transformers Gradient Can be Approximated in Almost Linear Time
  Paper • 2408.13233 • Published • 20
- Heterogeneous Multi-task Learning with Expert Diversity
  Paper • 2106.10595 • Published • 1
- Residual Mixture of Experts
  Paper • 2204.09636 • Published • 1
- Language-Routing Mixture of Experts for Multilingual and Code-Switching Speech Recognition
  Paper • 2307.05956 • Published • 1
Hazem Essam
hazemessam
AI & ML interests
Protein Language Modeling, Natural Language Processing, Generative Adversarial Networks.
Organizations
Collections 1
Datasets 7
hazemessam/ddg_megadataset • Viewer • Updated • 754k • 28
hazemessam/ddg • Preview • Updated • 101
hazemessam/abyssal_db • Preview • Updated • 31
hazemessam/prostata • Viewer • Updated • 10.5k • 37
hazemessam/fireprot_db • Viewer • Updated • 53.4k • 38
hazemessam/uniprot_sprot • Viewer • Updated • 569k • 47
hazemessam/squad_v2 • Viewer • Updated • 2 • 59 • 1