gssoc-2026
C++-accelerated data quality toolkit for Python: clean CSVs, profile messy datasets, validate schemas, and work smoothly with pandas.
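
To make the three tasks named above concrete (cleaning a CSV, profiling it, validating it against a schema), here is a minimal sketch that uses only the Python standard library. It does not use or represent this toolkit's API, which the description does not show. The sample data, the `SCHEMA` mapping, and the function names `clean`, `profile`, and `validate` are all hypothetical and chosen only for illustration.

```python
import csv
import io

# Hypothetical sample data; not taken from the project.
RAW = "id,name,age\n1, Alice ,30\n2,Bob,\n2,Bob,\n3,Carol,abc\n"

# Hypothetical schema: column name -> converter that raises ValueError on bad input.
SCHEMA = {"id": int, "name": str, "age": int}


def clean(text):
    """Strip whitespace from every cell and drop exact duplicate rows."""
    seen, rows = set(), []
    for row in csv.DictReader(io.StringIO(text)):
        row = {k: v.strip() for k, v in row.items()}
        key = tuple(row.items())
        if key not in seen:
            seen.add(key)
            rows.append(row)
    return rows


def profile(rows):
    """Count missing (empty) values per column."""
    return {col: sum(1 for r in rows if r[col] == "") for col in rows[0]}


def validate(rows, schema):
    """Return (row_index, column, value) for each non-empty cell that fails its converter."""
    errors = []
    for i, row in enumerate(rows):
        for col, conv in schema.items():
            value = row[col]
            if value == "":
                continue  # missing values are reported by profile(), not here
            try:
                conv(value)
            except ValueError:
                errors.append((i, col, value))
    return errors


rows = clean(RAW)
print(len(rows))               # rows left after de-duplication
print(profile(rows))           # missing-value count per column
print(validate(rows, SCHEMA))  # cells that do not match the schema
```

A compiled backend would presumably speed up passes like these on large files, but that is an assumption about the design rather than something stated here.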