vilm's Collections
Smol Pretraining
Updated Feb 9
Curated & High-Quality Synthetic Textbook Datasets for Pretraining
Upvotes: 2
vilm/code-textbooks • Updated Jan 20 • 207k rows • 35 downloads • 2 likes
vilm/MathPile-arXiv • Updated Jan 22 • 340k rows • 34 downloads • 2 likes
vilm/MathPile-StackExchange • Updated Jan 22 • 264k rows • 37 downloads • 1 like
vilm/MathPile-ProofWiki • Updated Jan 22 • 23.6k rows • 37 downloads
vilm/MathPile-Textbooks • Updated Jan 22 • 784 rows • 37 downloads
vilm/MathPile-Wikipedia • Updated Jan 22 • 20.9k rows • 36 downloads • 1 like
vilm/RedPajama-v2-small • Updated Jan 20 • 500k rows • 75 downloads • 1 like
vilm/RedPajama-v2-xsmall • Updated Jan 20 • 250k rows • 49 downloads • 1 like
vilm/the-stack-smol-xl-cleaned • Updated Jan 20 • 205k rows • 44 downloads • 1 like
vilm/refinedweb-1m-medium • Updated Jan 20 • 1M rows • 50 downloads • 2 likes
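
The datasets in this collection can be inspected without downloading them in full. The sketch below is one possible way to do that in Python, not something described on the collection page. It assumes the Hugging Face `datasets` library is installed and that each repository exposes a split named `train`. The helper name `preview` and its parameters are hypothetical.

```python
from itertools import islice

# Repository IDs copied from the collection listing.
SMOL_PRETRAINING_DATASETS = [
    "vilm/code-textbooks",
    "vilm/MathPile-arXiv",
    "vilm/MathPile-StackExchange",
    "vilm/MathPile-ProofWiki",
    "vilm/MathPile-Textbooks",
    "vilm/MathPile-Wikipedia",
    "vilm/RedPajama-v2-small",
    "vilm/RedPajama-v2-xsmall",
    "vilm/the-stack-smol-xl-cleaned",
    "vilm/refinedweb-1m-medium",
]


def preview(repo_id, n=3, split="train"):
    """Return the first n records of a dataset, streamed from the Hub.

    Assumption: the repository has a split called `split` ("train" by
    default). The collection page does not state split names.
    """
    # Imported lazily so the list above is usable without the library.
    from datasets import load_dataset

    # streaming=True yields records on demand instead of downloading
    # the whole dataset first.
    stream = load_dataset(repo_id, split=split, streaming=True)
    return list(islice(stream, n))
```

Calling `preview("vilm/code-textbooks")` would return up to three records as dictionaries. The column names depend on each dataset and are not listed on this page, so check the returned keys before relying on them.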