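@GitHub_Daily: For anyone doing quantitative research, facing a flood of financial research reports and cutting-edge papers every day and filtering out the valuable content by hand is like looking for a needle in a haystack. I recently came across an open-source project called QuantMind that focuses on intelligent knowledge extraction and retrieval for quantitative finance. It can automatically fetch papers, news, blogs, and other content, and turn unstructured documents into a queryable…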

X AI KOLs Timeline 2026/06/12 10:00 Tools

quant-finance open-source knowledge-extraction retrieval natural-language-query semantic-knowledge-graph llm

Summary

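QuantMind is an open-source framework for intelligent knowledge extraction and retrieval in quantitative finance. It automatically fetches unstructured content such as papers and news, builds a queryable structured knowledge base, and supports natural-language retrieval.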

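For anyone doing quantitative research, facing a flood of financial research reports and cutting-edge papers every day and filtering out the valuable content by hand is like looking for a needle in a haystack.

I recently came across an open-source project called QuantMind that focuses on intelligent knowledge extraction and retrieval for quantitative finance. It can automatically fetch papers, news, blogs, and other content, and turn unstructured documents into a queryable structured knowledge base. Combined with large models fine-tuned for the financial domain, it helps us understand complex content quickly and automatically builds a semantic knowledge graph. Ask a question in natural language and it retrieves the factor strategies and market insights you need in very little time.

GitHub: http://github.com/LLMQuant/quant-mind…

It provides a one-click run script, supports single-paper extraction and concurrent batch runs, and can even take processing instructions in natural language. If you regularly work through large volumes of financial research reports, or are doing quantitative strategy research, this project can help.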


LLMQuant/quant-mind

Source: https://github.com/LLMQuant/quant-mind

Transform Financial Knowledge into Actionable Intelligence


QuantMind is an intelligent knowledge extraction and retrieval framework for quantitative finance. It transforms unstructured financial content—papers, news, blogs, reports—into a queryable knowledge base, enabling AI-powered research at scale.

📰 News

🗞️ News	📝 Description
🎉 Accepted at NeurIPS 2025 Workshop	Our paper Quant-Mind has been accepted to the NeurIPS 2025 GenAI in Finance Workshop! 🚀
📢 First Release on GitHub	Quant-Mind is now live on GitHub — please check it out and join us! 🤗

🧐 Overview

QuantMind is a next-generation AI platform that ingests, processes, and structures every new piece of quantitative-finance research, including papers, news, blogs, and SEC filings, into a semantic knowledge graph. Institutional investors, hedge funds, and research teams can now explore the frontier of factor strategies, risk models, and market insights in seconds, unlocking alpha that would otherwise remain buried.

✨ Why QuantMind?

The financial research landscape is overwhelming. Every day, hundreds of papers, articles, and reports are published.

🌐 The Opportunity

Information Overload: 500 new research papers & reports published daily. Manual review takes weeks—costly, error-prone, and non-scalable
Massive Market: The financial data & analytics market is expected to grow to US$961.89 billion by 2032, at a compound annual growth rate of 13.5%. Tens of thousands of quant teams & asset managers are hungry for speed
High ROI: 1% improvement in research efficiency can translate to millions saved or earned in trading performance

💡 QuantMind solves this by

🔍 Extracting structured knowledge from any source (PDFs, web pages, APIs)
🧠 Understanding content with domain-specific LLMs fine-tuned for finance
💾 Storing information in a semantic knowledge graph
🚀 Retrieving insights through natural language queries

System Architecture


QuantMind is built on a decoupled, two-stage architecture. This design separates the concerns of data ingestion from intelligent retrieval, ensuring both robustness and flexibility.

Stage 1: Knowledge Extraction

This layer is responsible for collecting, parsing, and structuring raw information into standardized knowledge units.

Source APIs (arXiv, News, Blogs) → Intelligent Parser → Workflow/Agent → Structured Knowledge Base

Source: Connects to various sources (academic APIs, news feeds, financial blogs, Perplexity search) to pull content
Parser: Extracts text, tables, and figures from PDFs, HTML, and other formats
Tagger: Automatically categorizes content into research areas and topics
Workflow/Agent: Orchestrates the extraction pipeline with quality control and deduplication
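
The four Stage 1 components can be pictured as a chain of small functions. The sketch below is illustrative only: `RawDocument`, `KnowledgeUnit`, `parse`, `tag`, `run_pipeline`, and the keyword table are hypothetical names made up for this example and are not QuantMind's API. It assumes keyword matching as a stand-in for the tagger and a content hash as a stand-in for deduplication.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class RawDocument:
    """Hypothetical raw item as a source might return it."""
    source: str
    body: str


@dataclass
class KnowledgeUnit:
    """Hypothetical structured unit produced by the pipeline."""
    source: str
    text: str
    tags: list = field(default_factory=list)


# Toy keyword table standing in for a real tagger (assumption).
TOPIC_KEYWORDS = {
    "momentum": "factor-strategies",
    "volatility": "risk-models",
}


def parse(doc: RawDocument) -> str:
    """Normalize whitespace; a real parser would handle PDF and HTML."""
    return " ".join(doc.body.split())


def tag(text: str) -> list:
    """Assign topics by simple keyword match."""
    lowered = text.lower()
    return sorted({topic for kw, topic in TOPIC_KEYWORDS.items() if kw in lowered})


def run_pipeline(docs):
    """Parse, deduplicate by content hash, then tag each document."""
    seen = set()
    units = []
    for doc in docs:
        text = parse(doc)
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest in seen:  # skip exact duplicates after normalization
            continue
        seen.add(digest)
        units.append(KnowledgeUnit(doc.source, text, tag(text)))
    return units


docs = [
    RawDocument("arxiv", "Cross-sectional  momentum in equities"),
    RawDocument("blog", "Cross-sectional momentum in equities"),
    RawDocument("news", "Volatility spikes after rate decision"),
]
units = run_pipeline(docs)
print([(u.source, u.tags) for u in units])
```

The second document differs from the first only in whitespace, so it is dropped after normalization. A production pipeline would likely need fuzzier deduplication than an exact hash.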

Stage 2: Intelligent Retrieval

This layer transforms structured knowledge into actionable insights through various retrieval mechanisms.

Knowledge Base → Embeddings → Solution Scenarios (DeepResearch, RAG, Data MCP, ...)

Embedding Generation: Converts knowledge units into high-dimensional vectors for semantic search
Solution Scenarios: Multiple retrieval patterns including:
- DeepResearch: Complex multi-hop reasoning across documents
- RAG: Retrieval-augmented generation for Q&A
- Data MCP: Structured data access protocols
- Custom retrieval patterns based on use case
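
To make the embedding step concrete, here is a minimal sketch of semantic search by vector similarity. It assumes a toy word-count "embedding" in place of a learned high-dimensional model, and the names `embed`, `cosine`, and `search` are hypothetical, not part of QuantMind. The ranking logic (embed the query, score each knowledge unit by cosine similarity, return the best matches) is the part that carries over to real embeddings.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a learned model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query: str, units: list, top_k: int = 1) -> list:
    """Return the top_k knowledge units ranked by similarity to the query."""
    q = embed(query)
    ranked = sorted(units, key=lambda u: cosine(q, embed(u)), reverse=True)
    return ranked[:top_k]


knowledge_units = [
    "momentum factor returns in equity markets",
    "volatility risk model for options",
    "central bank policy and bond yields",
]
best = search("equity momentum factor", knowledge_units)
print(best)
```

In a RAG setup, the retrieved units would then be passed to a language model as context for answering the question.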

🚀 Quick Start

We use uv for fast and reliable Python package management.

Prerequisites:

Python 3.8+
Git

Installation:

Install uv (if not already installed):

```
# On macOS and Linux
curl -LsSf https://astral.sh/uv/install.sh | sh

# On Windows
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"

# Or using pip
pip install uv
```

Clone the repository:

```
git clone https://github.com/LLMQuant/quant-mind.git
cd quant-mind
```

Create and activate virtual environment:

```
# Create a virtual environment
uv venv

# Activate it
# On macOS/Linux:
source .venv/bin/activate

# On Windows:
.venv\Scripts\activate
```

Install dependencies:
```
uv pip install -e .
```

📚 Usage Examples

Run a single paper through `paper_flow`

```
import asyncio

from quantmind.configs import PaperFlowCfg
from quantmind.configs.paper import ArxivIdentifier
from quantmind.flows import paper_flow


async def main() -> None:
    paper = await paper_flow(
        ArxivIdentifier(id="2401.12345"),
        cfg=PaperFlowCfg(model="gpt-4o-mini"),
    )
    print(paper.model_dump_json(indent=2))


asyncio.run(main())
```

Fan out a batch with `batch_run`

```
import asyncio

from quantmind.configs import PaperFlowCfg
from quantmind.configs.paper import ArxivIdentifier
from quantmind.flows import batch_run, paper_flow


async def main() -> None:
    inputs = [ArxivIdentifier(id=aid) for aid in (
        "2401.12345", "2401.12346", "2401.12347",
    )]
    result = await batch_run(
        paper_flow,
        inputs,
        cfg=PaperFlowCfg(model="gpt-4o-mini"),
        concurrency=3,
        on_error="skip",
        on_progress=lambda done, total: print(f"{done}/{total}"),
    )
    print(f"ok={result.success_count} failed={result.failure_count}")


asyncio.run(main())
```
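
For readers curious how a bounded fan-out with a "skip on error" policy can be built, the following is a generic `asyncio` sketch. It is not QuantMind's `batch_run` implementation: `bounded_batch`, `BatchResult`, and `fake_flow` are hypothetical names, and the worker is a stand-in that fails for one input so the skip path is exercised.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class BatchResult:
    """Hypothetical result container; not QuantMind's actual type."""
    successes: list = field(default_factory=list)
    failures: list = field(default_factory=list)


async def bounded_batch(worker, inputs, concurrency=3, on_progress=None):
    """Run worker over inputs with at most `concurrency` tasks in flight."""
    semaphore = asyncio.Semaphore(concurrency)
    result = BatchResult()
    done = 0

    async def run_one(item):
        nonlocal done
        async with semaphore:
            try:
                result.successes.append(await worker(item))
            except Exception as exc:  # "skip" policy: record and continue
                result.failures.append((item, exc))
        done += 1
        if on_progress is not None:
            on_progress(done, len(inputs))

    await asyncio.gather(*(run_one(item) for item in inputs))
    return result


async def fake_flow(arxiv_id):
    """Stand-in worker that fails for one ID to exercise the skip path."""
    await asyncio.sleep(0.01)
    if arxiv_id.endswith("6"):
        raise ValueError(f"could not process {arxiv_id}")
    return arxiv_id


result = asyncio.run(
    bounded_batch(fake_flow, ["2401.12345", "2401.12346", "2401.12347"], concurrency=2)
)
print(f"ok={len(result.successes)} failed={len(result.failures)}")
```

The semaphore caps how many workers run at once, which matters when each worker calls a rate-limited model API. Catching the exception inside each task keeps one failed paper from cancelling the rest of the batch.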

Resolve free-form intent with `magic`

```
import asyncio

from quantmind.flows import paper_flow
from quantmind.magic import resolve_magic_input


async def main() -> None:
    inp, cfg = await resolve_magic_input(
        "Pull arXiv 2401.12345 about cross-sectional momentum; use gpt-4o-mini.",
        target_flow=paper_flow,
    )
    paper = await paper_flow(inp, cfg=cfg)
    print(paper.model_dump_json(indent=2))


asyncio.run(main())
```
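
How `resolve_magic_input` works internally is not described here. As a rough illustration of the mapping involved (free text in, structured identifier and model choice out), the sketch below uses regular expressions as a deliberately simple stand-in. `ParsedIntent`, `parse_intent`, and both patterns are hypothetical and only cover the phrasing used in the example above.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParsedIntent:
    """Hypothetical container for what was recovered from the request."""
    arxiv_id: Optional[str]
    model: Optional[str]


# New-style arXiv IDs look like YYMM.NNNNN (4 or 5 digits after the dot),
# optionally followed by a version suffix such as v2.
ARXIV_ID = re.compile(r"\b(\d{4}\.\d{4,5})(?:v\d+)?\b")
# Matches phrases such as "use gpt-4o-mini"; this pattern is illustrative only.
MODEL = re.compile(r"\buse\s+([\w\-]+(?:\.[\w\-]+)*)", re.IGNORECASE)


def parse_intent(text: str) -> ParsedIntent:
    """Extract an arXiv ID and a model name from free-form text, if present."""
    id_match = ARXIV_ID.search(text)
    model_match = MODEL.search(text)
    return ParsedIntent(
        arxiv_id=id_match.group(1) if id_match else None,
        model=model_match.group(1) if model_match else None,
    )


intent = parse_intent(
    "Pull arXiv 2401.12345 about cross-sectional momentum; use gpt-4o-mini."
)
print(intent)
```

A pattern-based parser like this breaks as soon as the phrasing varies, which is presumably why a dedicated resolver is useful for free-form instructions.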

Note: QuantMind is mid-migration to OpenAI Agents SDK (see #71). PR5 lands the apex layer (flows/ + magic.py); the remaining work is the mind/ memory + store layer scheduled for PR6 and PR7.

🗺️ Roadmap

Better flow design for user-friendly usage
First production level example (Quant Paper Agent)
Migrate Agent layer to OpenAI Agents SDK
Standardize knowledge format with knowledge/ (Pydantic-based)
Additional content sources (financial news, blogs, reports)
Cross-step working memory (mind/memory) for batch document processing

The Vision: An Intelligent Research Framework

This section describes our long-term vision, not current capabilities. While QuantMind today provides a solid knowledge extraction framework, the features described below represent our aspirational goals for future development.

QuantMind is designed with a larger vision: to become a comprehensive intelligence layer for all financial knowledge. We’re building toward a system that understands the interconnections between academic research, market news, analyst reports, and social sentiment—creating a unified knowledge base that powers better financial decisions.

The foundation we’re building today—starting with papers—will expand to encompass the entire financial information ecosystem.

Future Conceptual Example (PR6 brings FilesystemMemory):

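```
import asyncio

from quantmind.configs.paper import ArxivIdentifier
from quantmind.flows import paper_flow
from quantmind.knowledge import Paper
from quantmind.mind.memory import FilesystemMemory  # PR6 (planned, not yet available)


async def main() -> None:
    arxiv_ids = ["2401.12345", "2401.12346"]  # placeholder IDs
    memory = FilesystemMemory("./mem/factor-research/")  # shared across the loop
    for arxiv_id in arxiv_ids:
        paper: Paper = await paper_flow(ArxivIdentifier(id=arxiv_id), memory=memory)


asyncio.run(main())
```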

This future state represents our commitment to moving beyond simple data aggregation and toward genuine machine intelligence in the financial domain.

🤝 Contributing

We welcome contributions of all forms, from bug reports to feature development.

For Contributors: Please read CONTRIBUTING.md for essential development setup including pre-commit hooks, coding standards, and testing requirements.

Quick Start for Contributors:

Fork the repository

Setup development environment:

```
uv venv && source .venv/bin/activate
uv pip install -e .
./scripts/pre-commit-setup.sh
```

Create feature branch (git checkout -b feat/my-feature)
Follow conventional commits (feat: add new feature)
Submit PR with our template

Before Contributing:

Open an issue to discuss significant changes
Use our issue templates for bug reports and feature requests
Ensure all pre-commit hooks pass before submitting PR

License

QuantMind is released under the MIT License—see LICENSE for details.

❤️ Acknowledgements

arXiv for providing open access to a world of research.
The open-source community for the tools and libraries that make this project possible.

