Josh Bauer
josh-sematic
josh-sematic's activity
No "\n\n" in the dataset?! · 1 · #104 opened 2 months ago by ymh233
Deduped version of fineweb on HuggingFace yields "This dataset has 218 files that have been marked as unsafe." · 1 · #103 opened 3 months ago by egor-pakhomov
CC-MAIN-2024-10 · #102 opened 3 months ago by josh-sematic
CC-MAIN-2023-40 · #100 opened 3 months ago by josh-sematic
CC-MAIN-2023-50 · #101 opened 3 months ago by josh-sematic
CC-MAIN-2023-23 · #99 opened 3 months ago by josh-sematic
CC-MAIN-2021-10 · #80 opened 3 months ago by josh-sematic
CC-MAIN-2020-45 · #77 opened 3 months ago by josh-sematic
CC-MAIN-2020-40 · #76 opened 3 months ago by josh-sematic
CC-MAIN-2019-35 · #75 opened 3 months ago by josh-sematic
CC-MAIN-2020-34 · #74 opened 3 months ago by josh-sematic
CC-MAIN-2019-30 · #73 opened 3 months ago by josh-sematic
CC-MAIN-2020-29 · #72 opened 3 months ago by josh-sematic
CC-MAIN-2023-14 · #98 opened 3 months ago by josh-sematic
CC-MAIN-2019-26 · #71 opened 3 months ago by josh-sematic
CC-MAIN-2020-24 · #70 opened 3 months ago by josh-sematic
CC-MAIN-2019-22 · #69 opened 3 months ago by josh-sematic
CC-MAIN-2020-16 · #68 opened 3 months ago by josh-sematic
CC-MAIN-2019-18 · #67 opened 3 months ago by josh-sematic