edgar

#edgar

@rohanpaul_ai: This was long needed for AI in finance. Making SEC filings readable for machines without flattening the accounting logi…

X AI KOLs Following ↗ · 6d ago Cached

Researchers from Stanford, UC, and Nanjing University release SEFD, a dataset of 152B tokens from SEC filings converted to layout-faithful MultiMarkdown, preserving table structure for LLM training with minimal overlap with Common Crawl.


