Open Q&A dataset for the Swedish construction industry. 300 Q&As in v1.0, target 1000+. Multi-format (JSON, JSONL, Alpaca, ShareGPT, CSV). CC BY 4.0. Maintained by Zaragoza AB, Helsingborg.
Command-line tool to split documents into chunks and automatically generate question–answer datasets, designed for preparing data to fine-tune large language models (LLMs).