huggingface/datatrove
Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
last commit: May 6, 2026
stars: 3,066 (7d: +6, 30d: +54, 90d: +171)