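@GitHub_Daily: For anyone doing quantitative research, facing a flood of financial research reports and cutting-edge papers every day and picking out the valuable content by hand is like looking for a needle in a haystack. I recently came across an open-source project called QuantMind, built for intelligent knowledge extraction and retrieval in quantitative finance. It can automatically fetch papers, news, blogs, and other content, and turn unstructured documents into a searchable…
Summary
QuantMind is an open-source framework for intelligent knowledge extraction and retrieval in quantitative finance. It automatically fetches unstructured content such as papers and news, builds a queryable structured knowledge base, and supports natural language retrieval.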
Cached: 2026/06/12 12:58
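For anyone doing quantitative research, facing a flood of financial research reports and cutting-edge papers every day and picking out the valuable content by hand is like looking for a needle in a haystack.
I recently came across an open-source project called QuantMind, built for intelligent knowledge extraction and retrieval in quantitative finance.
It can automatically fetch papers, news, blogs, and other content, and turn unstructured documents into a queryable, structured knowledge base.
Combined with large language models fine-tuned for finance, it helps us understand complex content quickly and automatically builds a semantic knowledge graph.
Ask a question in natural language and it retrieves the factor strategies and market insights you need in very little time.
GitHub: http://github.com/LLMQuant/quant-mind…
It ships one-command run scripts and supports single-paper extraction, concurrent batch runs, and even processing instructions given in natural language.
If you regularly process large volumes of financial research reports, or are working on quantitative strategy research, this project can help.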
LLMQuant/quant-mind
Source: https://github.com/LLMQuant/quant-mind
Transform Financial Knowledge into Actionable Intelligence
QuantMind is an intelligent knowledge extraction and retrieval framework for quantitative finance. It transforms unstructured financial content—papers, news, blogs, reports—into a queryable knowledge base, enabling AI-powered research at scale.
📰 News
| 🗞️ News | 📝 Description |
|---|---|
| 🎉 Accepted at NeurIPS 2025 Workshop | Our paper Quant-Mind has been accepted to the NeurIPS 2025 GenAI in Finance Workshop! 🚀 |
| 📢 First Release on GitHub | Quant-Mind is now live on GitHub — please check it out and join us! 🤗 |
🧐 Overview
QuantMind is a next-generation AI platform that ingests, processes, and structures new quantitative-finance research, including papers, news, blogs, and SEC filings, into a semantic knowledge graph. Institutional investors, hedge funds, and research teams can explore the frontier of factor strategies, risk models, and market insights in seconds, unlocking alpha that would otherwise remain buried.
✨ Why QuantMind?
The financial research landscape is overwhelming. Every day, hundreds of papers, articles, and reports are published.
🌐 The Opportunity
- Information Overload: 500 new research papers & reports published daily. Manual review takes weeks—costly, error-prone, and non-scalable
- Massive Market: The financial data & analytics market is expected to grow to US$961.89 billion by 2032, at a compound annual growth rate of 13.5%. Tens of thousands of quant teams & asset managers are hungry for speed
- High ROI: 1% improvement in research efficiency can translate to millions saved or earned in trading performance
💡 QuantMind solves this by
- 🔍 Extracting structured knowledge from any source (PDFs, web pages, APIs)
- 🧠 Understanding content with domain-specific LLMs fine-tuned for finance
- 💾 Storing information in a semantic knowledge graph
- 🚀 Retrieving insights through natural language queries
System Architecture

QuantMind is built on a decoupled, two-stage architecture. This design separates the concerns of data ingestion from intelligent retrieval, ensuring both robustness and flexibility.
Stage 1: Knowledge Extraction
This layer is responsible for collecting, parsing, and structuring raw information into standardized knowledge units.
Source APIs (arXiv, News, Blogs) → Intelligent Parser → Workflow/Agent → Structured Knowledge Base
- Source: Connects to various sources (academic APIs, news feeds, financial blogs, perplexity search source) to pull content
- Parser: Extracts text, tables, and figures from PDFs, HTML, and other formats
- Tagger: Automatically categorizes content into research areas and topics
- Workflow/Agent: Orchestrates the extraction pipeline with quality control and deduplication
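The four roles above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `KnowledgeUnit`, `parse`, `tag`, `run_pipeline`, and the keyword map are hypothetical names, not QuantMind APIs, and a regex tag-stripper plus keyword matching stand in for the real parser and tagger.

```python
import hashlib
import re
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple


@dataclass
class KnowledgeUnit:
    """Hypothetical standardized record produced by the pipeline."""
    source: str
    title: str
    text: str
    tags: List[str] = field(default_factory=list)


# Toy keyword map standing in for a real topic classifier.
TOPIC_KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "momentum": ("momentum", "trend"),
    "risk": ("volatility", "drawdown", "risk"),
}


def parse(raw_html: str) -> str:
    """Parser step: strip HTML tags and collapse whitespace."""
    text = re.sub(r"<[^>]+>", " ", raw_html)
    return re.sub(r"\s+", " ", text).strip()


def tag(text: str) -> List[str]:
    """Tagger step: assign every topic whose keywords appear in the text."""
    lowered = text.lower()
    return sorted(
        topic for topic, words in TOPIC_KEYWORDS.items()
        if any(word in lowered for word in words)
    )


def run_pipeline(items: Iterable[Tuple[str, str, str]]) -> List[KnowledgeUnit]:
    """Workflow step: parse, tag, and drop items whose parsed text repeats."""
    seen = set()
    units = []
    for source, title, raw in items:
        text = parse(raw)
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest in seen:  # deduplication by content hash
            continue
        seen.add(digest)
        units.append(KnowledgeUnit(source, title, text, tag(text)))
    return units


if __name__ == "__main__":
    raw_items = [
        ("blog", "Momentum note", "<p>Cross-sectional <b>momentum</b> works.</p>"),
        ("news", "Same note", "<p>Cross-sectional momentum works.</p>"),
        ("arxiv", "Risk paper", "<h1>Volatility targeting</h1>"),
    ]
    for unit in run_pipeline(raw_items):
        print(unit.source, unit.tags)
```

Hashing the parsed text rather than the raw input means two sources carrying the same article with different markup collapse into one unit, which is one simple way to read "deduplication" here.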
Stage 2: Intelligent Retrieval
This layer transforms structured knowledge into actionable insights through various retrieval mechanisms.
Knowledge Base → Embeddings → Solution Scenarios (DeepResearch, RAG, Data MCP, ...)
- Embedding Generation: Converts knowledge units into high-dimensional vectors for semantic search
- Solution Scenarios: Multiple retrieval patterns including:
  - DeepResearch: Complex multi-hop reasoning across documents
  - RAG: Retrieval-augmented generation for Q&A
  - Data MCP: Structured data access protocols
  - Custom retrieval patterns based on use case
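To make the embedding step concrete, here is a minimal sketch of vector-based semantic search. It is an illustration under stated assumptions, not QuantMind's retrieval code: `embed`, `cosine`, and `retrieve` are hypothetical names, and a normalized bag-of-words vector stands in for the output of a real embedding model.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Dict[str, float]:
    """Toy embedding: a sparse, L2-normalized bag-of-words vector.

    A real system would call an embedding model here instead.
    """
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {tok: c / norm for tok, c in counts.items()} if norm else {}


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """For unit vectors, the dot product equals the cosine similarity."""
    return sum(weight * b.get(tok, 0.0) for tok, weight in a.items())


def retrieve(query: str, docs: List[str], k: int = 2) -> List[Tuple[float, str]]:
    """Return the k documents most similar to the query, best first."""
    q = embed(query)
    scored = [(cosine(q, embed(doc)), doc) for doc in docs]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


if __name__ == "__main__":
    corpus = [
        "momentum factor returns in equities",
        "volatility risk parity allocation",
        "bond yield curve",
    ]
    for score, doc in retrieve("momentum factor", corpus, k=2):
        print(f"{score:.3f}  {doc}")
```

The retrieval patterns listed above (RAG, DeepResearch, and so on) would sit on top of a ranking step like `retrieve`, passing the top-scoring units to a language model as context.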
🚀 Quick Start
We use uv for fast and reliable Python package management.
Prerequisites:
- Python 3.8+
- Git
Installation:
- Install uv (if not already installed):

      # On macOS and Linux
      curl -LsSf https://astral.sh/uv/install.sh | sh

      # On Windows
      powershell -c "irm https://astral.sh/uv/install.ps1 | iex"

      # Or using pip
      pip install uv

- Clone the repository:

      git clone https://github.com/LLMQuant/quant-mind.git
      cd quant-mind

- Create and activate virtual environment:

      # Create a virtual environment
      uv venv

      # Activate it
      # On macOS/Linux:
      source .venv/bin/activate
      # On Windows:
      .venv\Scripts\activate

- Install dependencies:

      uv pip install -e .
📚 Usage Examples
Run a single paper through paper_flow
    import asyncio

    from quantmind.configs import PaperFlowCfg
    from quantmind.configs.paper import ArxivIdentifier
    from quantmind.flows import paper_flow


    async def main() -> None:
        # Run one arXiv paper through the flow.
        paper = await paper_flow(
            ArxivIdentifier(id="2401.12345"),
            cfg=PaperFlowCfg(model="gpt-4o-mini"),
        )
        # Print the result as indented JSON.
        print(paper.model_dump_json(indent=2))


    asyncio.run(main())
Fan out a batch with batch_run
    import asyncio

    from quantmind.configs import PaperFlowCfg
    from quantmind.configs.paper import ArxivIdentifier
    from quantmind.flows import batch_run, paper_flow


    async def main() -> None:
        # Build one identifier per arXiv ID.
        inputs = [ArxivIdentifier(id=aid) for aid in (
            "2401.12345", "2401.12346", "2401.12347",
        )]
        result = await batch_run(
            paper_flow,
            inputs,
            cfg=PaperFlowCfg(model="gpt-4o-mini"),
            concurrency=3,
            on_error="skip",
            on_progress=lambda done, total: print(f"{done}/{total}"),
        )
        # Report how many inputs succeeded and how many failed.
        print(f"ok={result.success_count} failed={result.failure_count}")


    asyncio.run(main())
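For readers curious how a bounded fan-out with `concurrency`, a skip-on-error policy, and a progress callback can work in general, here is a minimal sketch built only on `asyncio`. It is not the `batch_run` implementation: `bounded_fan_out`, `BatchOutcome`, and `fake_flow` are hypothetical names, and the real function's internals and result type may differ.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Awaitable, Callable, List, Optional, Sequence, Tuple


@dataclass
class BatchOutcome:
    """Hypothetical result container for this sketch."""
    successes: List[Any] = field(default_factory=list)
    failures: List[Tuple[Any, Exception]] = field(default_factory=list)


async def bounded_fan_out(
    worker: Callable[[Any], Awaitable[Any]],
    inputs: Sequence[Any],
    concurrency: int = 3,
    on_progress: Optional[Callable[[int, int], None]] = None,
) -> BatchOutcome:
    """Run worker over inputs with at most `concurrency` calls in flight."""
    semaphore = asyncio.Semaphore(concurrency)
    outcome = BatchOutcome()
    done = 0

    async def run_one(item: Any) -> None:
        nonlocal done
        async with semaphore:  # blocks while `concurrency` workers are running
            try:
                outcome.successes.append(await worker(item))
            except Exception as exc:  # skip policy: record the error, keep going
                outcome.failures.append((item, exc))
        done += 1
        if on_progress is not None:
            on_progress(done, len(inputs))

    await asyncio.gather(*(run_one(item) for item in inputs))
    return outcome


async def fake_flow(arxiv_id: str) -> str:
    """Stand-in worker that fails for one specific ID."""
    await asyncio.sleep(0.01)
    if arxiv_id.endswith("46"):
        raise ValueError(f"cannot process {arxiv_id}")
    return f"parsed:{arxiv_id}"


if __name__ == "__main__":
    ids = ["2401.12345", "2401.12346", "2401.12347"]
    result = asyncio.run(bounded_fan_out(fake_flow, ids, concurrency=2))
    print(f"ok={len(result.successes)} failed={len(result.failures)}")
```

In this sketch, successes are collected in completion order rather than input order, so a caller that needs to map results back to inputs would have to return the input alongside each result.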
Resolve free-form intent with magic
    import asyncio

    from quantmind.flows import paper_flow
    from quantmind.magic import resolve_magic_input


    async def main() -> None:
        # Resolve a free-form request into a flow input and a config.
        inp, cfg = await resolve_magic_input(
            "Pull arXiv 2401.12345 about cross-sectional momentum; use gpt-4o-mini.",
            target_flow=paper_flow,
        )
        paper = await paper_flow(inp, cfg=cfg)
        print(paper.model_dump_json(indent=2))


    asyncio.run(main())
Note: QuantMind is mid-migration to OpenAI Agents SDK (see #71). PR5 lands the apex layer (flows/ + magic.py); the remaining work is the mind/memory + store layer scheduled for PR6 and PR7.
🗺️ Roadmap
- Better flow design for user-friendly usage
- First production level example (Quant Paper Agent)
- Migrate Agent layer to OpenAI Agents SDK
- Standardize knowledge format with knowledge/ (Pydantic-based)
- Additional content sources (financial news, blogs, reports)
- Cross-step working memory (mind/memory) for batch document processing
The Vision: An Intelligent Research Framework
This section describes our long-term vision, not current capabilities. While QuantMind today provides a solid knowledge extraction framework, the features described below represent our aspirational goals for future development.
QuantMind is designed with a larger vision: to become a comprehensive intelligence layer for all financial knowledge. We’re building toward a system that understands the interconnections between academic research, market news, analyst reports, and social sentiment—creating a unified knowledge base that powers better financial decisions.
The foundation we’re building today—starting with papers—will expand to encompass the entire financial information ecosystem.
Future Conceptual Example (PR6 brings FilesystemMemory):

    from quantmind.configs.paper import ArxivIdentifier
    from quantmind.flows import paper_flow
    from quantmind.knowledge import Paper
    from quantmind.mind.memory import FilesystemMemory  # PR6


    async def main(arxiv_ids) -> None:
        # One memory directory is shared across every paper in the batch.
        memory = FilesystemMemory("./mem/factor-research/")
        for arxiv_id in arxiv_ids:
            paper: Paper = await paper_flow(ArxivIdentifier(id=arxiv_id), memory=memory)
This future state represents our commitment to moving beyond simple data aggregation and toward genuine machine intelligence in the financial domain.
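As a rough illustration of what a filesystem-backed working memory could look like, the sketch below stores one JSON file per key using only the standard library. It is purely hypothetical: `TinyFileMemory`, `remember`, and `recall` are invented for this example and say nothing about the planned `FilesystemMemory` interface.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, Optional


class TinyFileMemory:
    """Hypothetical JSON-per-key store; not the planned FilesystemMemory API."""

    def __init__(self, root: str) -> None:
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, key: str) -> Path:
        # Replace path separators so every key maps to a flat file name.
        safe = key.replace("/", "_").replace("\\", "_")
        return self.root / f"{safe}.json"

    def remember(self, key: str, note: Dict[str, Any]) -> None:
        """Persist a note under the given key, overwriting any earlier one."""
        self._path(key).write_text(json.dumps(note), encoding="utf-8")

    def recall(self, key: str) -> Optional[Dict[str, Any]]:
        """Return the stored note, or None if the key was never written."""
        path = self._path(key)
        if not path.exists():
            return None
        return json.loads(path.read_text(encoding="utf-8"))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        memory = TinyFileMemory(tmp)
        memory.remember("2401.12345", {"factor": "momentum"})
        print(memory.recall("2401.12345"))
        print(memory.recall("2401.99999"))
```

Because the notes live on disk, a second process or a later batch run pointed at the same directory would see what earlier steps wrote, which is the property a cross-step memory needs.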
🤝 Contributing
We welcome contributions of all forms, from bug reports to feature development.
For Contributors: Please read CONTRIBUTING.md for essential development setup including pre-commit hooks, coding standards, and testing requirements.
Quick Start for Contributors:
- Fork the repository
- Set up development environment:

      uv venv && source .venv/bin/activate
      uv pip install -e .
      ./scripts/pre-commit-setup.sh

- Create feature branch (git checkout -b feat/my-feature)
- Follow conventional commits (feat: add new feature)
- Submit PR with our template
Before Contributing:
- Open an issue to discuss significant changes
- Use our issue templates for bug reports and feature requests
- Ensure all pre-commit hooks pass before submitting PR
License
QuantMind is released under the MIT License—see LICENSE for details.
❤️ Acknowledgements
- arXiv for providing open access to a world of research.
- The open-source community for the tools and libraries that make this project possible.