data-preparation

#data-preparation

@tom_doerr: Generates LLM-ready datasets from raw data https://github.com/OpenDCAI/DataFlow…

X AI KOLs Timeline · 2d ago

DataFlow is an open-source tool that uses visual, low-code pipelines to generate, clean, and prepare high-quality LLM training datasets from raw data. A technical report is available on arXiv.

#data-preparation

DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI

Papers with Code Trending · 2025-12-18

DataFlow is an LLM-driven framework for automated data preparation and workflow engineering, featuring nearly 200 reusable operators and six domain-general pipelines that improve LLM performance across tasks like math, code, and Text-to-SQL.
