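@QingQ77: A local-first desktop application for managing academic papers, with discovery, management, and visualization of papers from arXiv and other sources. https://github.com/linxiv-dev/linXiv… A local paper management tool for researchers; all data is stored locally and nothing is uploaded to external ser…
Summary
linXiv is a local-first desktop application for managing academic papers. It supports discovery, management, and visualization of papers from arXiv and other sources, and integrates a SQLite database, AI tagging, Obsidian notes, and a paper network graph.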
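Cached: 2026/06/23 10:04
A local-first desktop application for managing academic papers, with discovery, management, and visualization of papers from arXiv and other sources.
https://github.com/linxiv-dev/linXiv…
A local paper management tool for researchers. All data stays on your machine and nothing is uploaded to external services. It brings together a SQLite database, AI tagging with Google Gemini, Obsidian note integration, and a paper network graph visualization in a single Tauri desktop app (React + TypeScript frontend, Python backend). You can upload PDFs directly, create projects, write notes, and add tags, and you can also search for papers via arXiv, DOI, OpenAlex, or CrossRef.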
linxiv-dev/linXiv
Source: https://github.com/linxiv-dev/linXiv
linXiv
A local-first desktop application for discovering, managing, and visualizing academic papers from arXiv and other sources. Combines a local SQLite database, optional AI-powered tagging, Obsidian vault integration, and an interactive network graph (Cytoscape rendering with a D3 force simulation), wrapped in a Tauri desktop shell (React + TypeScript frontend, Python backend).
Upload your PDFs, create projects, and manage notes, tags, and more to organize your files, all locally and without sending your data anywhere. This project aims to be a one-stop shop for researchers who want to manage their literature, with the near-term goal of extending to research groups that want to share knowledge without going through the web.
Development status: The database schema and paper identifier format are actively changing. source_id values are being migrated to a namespaced format (arxiv:2204.12985, doi:10.48550/…, openalex:W3123456789, local:{hash}). Until that work lands, existing papers.db files from before v0.1.2 (the current version) will not be compatible with new builds; delete papers.db and let it rebuild on first run. No stable release has been cut yet.
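To make the namespaced format concrete, here is a minimal sketch of splitting such an identifier into its namespace and value. The helper name parse_source_id and the SourceId type are hypothetical and are not part of linXiv; only the four namespaces come from the examples above.

```python
from typing import NamedTuple

# Namespaces taken from the examples above; the helper itself is hypothetical.
KNOWN_NAMESPACES = {"arxiv", "doi", "openalex", "local"}


class SourceId(NamedTuple):
    namespace: str
    value: str


def parse_source_id(source_id: str) -> SourceId:
    """Split 'namespace:value' on the first colon and validate the namespace."""
    # partition() splits on the first colon only, so values that contain
    # further colons (some DOIs do) are kept intact.
    namespace, sep, value = source_id.partition(":")
    if not sep or not value:
        raise ValueError(f"not a namespaced source_id: {source_id!r}")
    if namespace not in KNOWN_NAMESPACES:
        raise ValueError(f"unknown namespace: {namespace!r}")
    return SourceId(namespace, value)


parsed = parse_source_id("arxiv:2204.12985")
print(parsed.namespace, parsed.value)
```

A bare ID such as 2204.12985 is rejected by this sketch, which mirrors the idea that identifiers without a namespace prefix belong to the older format.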
Features
- Paper search — Search arXiv by keyword, fetch by ID, or look up by DOI; results saved to a local SQLite DB with version tracking
- Interactive graph — Force-directed network of papers and authors (D3 force simulation, Cytoscape rendering); real-time force controls (center, repel, link distance, link strength)
- Projects — Organise papers into projects; add notes per paper scoped to a project; composable SQL query builder (Q) for filtering
- TeX rendering — MathJax renders LaTeX math in titles and abstracts inside the search UI
- AI tools — Google Gemini structured output for tag generation, paper summarization, and semantic similarity
- Obsidian integration — Auto-generate markdown notes with YAML frontmatter for your vault
- PDF & TeX downloads — Batch download PDFs and TeX source tarballs
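The Obsidian integration listed above produces markdown notes with YAML frontmatter. As a rough illustration of what such a note could look like, the sketch below renders one with the standard library only. It is not the template in formats/table_format.md, which is not shown here; the function name render_note and every field name (title, source_id, authors, tags) are assumptions, and the sample values are placeholders.

```python
import json


def render_note(title, authors, tags, source_id, abstract):
    """Render a markdown note with a YAML frontmatter block (field names are illustrative)."""
    # json.dumps yields double-quoted strings, which are also valid YAML scalars,
    # so titles containing colons or quotes do not break the frontmatter.
    def quote(text):
        return json.dumps(text, ensure_ascii=False)

    lines = ["---"]
    lines.append(f"title: {quote(title)}")
    lines.append(f"source_id: {quote(source_id)}")
    lines.append("authors:")
    lines.extend(f"  - {quote(author)}" for author in authors)
    lines.append("tags:")
    lines.extend(f"  - {quote(tag)}" for tag in tags)
    lines.append("---")
    lines.append("")
    lines.append(f"# {title}")
    lines.append("")
    lines.append(abstract)
    return "\n".join(lines) + "\n"


note = render_note(
    title="Example Paper",
    authors=["A. Author", "B. Author"],
    tags=["transformers"],
    source_id="arxiv:0000.00000",  # placeholder ID, not a real paper
    abstract="Abstract text.",
)
print(note)
```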
Project Structure
linXiv/
├── AI_tools.py # Gemini: tag(), summarize(), find_related(); PaperContent input type
├── linxiv_cli.py # CLI entry point (linxiv command via pyproject.toml)
├── linxiv_mcp.py # MCP server for Claude integration
├── config.py # App-wide configuration constants
├── user_settings.py # User-editable settings (API keys, paths)
├── pyproject.toml # Package metadata + CLI/MCP entry points
├── assets/
│ ├── app_icon.png # Application icon
│ └── wide_logo.png # Wide logo (README header)
├── api/
│ ├── __main__.py # Entry point: python -m api
│ ├── app.py # FastAPI routes (REST API incl. /api/graph)
│ ├── graph_payload.py # Graph JSON (tags + projects) for /api/graph
│ └── run_api.py # uvicorn launcher helper
├── sources/
│ ├── base.py # PaperSource protocol + PaperMetadata model
│ ├── arxiv_source.py # ArxivSource: search and fetch from arXiv API
│ ├── crossref_source.py # CrossRefSource: fetch by DOI, search by title
│ ├── openalex_source.py # OpenAlexSource: lookup via OpenAlex
│ ├── doi_resolve.py # DOI resolution (arXiv, Semantic Scholar, CrossRef fallback)
│ ├── fetch_paper_metadata.py # High-level fetch/search helpers + Obsidian note generation
│ ├── pdf_metadata.py # PDF metadata extraction and resolution pipeline
│ └── arxiv_downloads.py # PDF and TeX source download helpers
├── service/
│ ├── paper.py # Paper service: get, get_all, get_many, upsert, graph data
│ ├── author.py # Author service: get, upsert, link/unlink to papers
│ ├── tag.py # Tag service: get, upsert, paper/project tag management
│ ├── note.py # Note service: get, upsert, count by paper/project
│ ├── project.py # Project service: get, upsert, filter, status management
│ ├── export_import.py # Export/import projects as .lxproj archives
│ ├── vault.py # On-disk LaTeX vault for the embedded editor
│ ├── editor_project.py # Note-link layer for the embedded editor
│ ├── files.py # File utilities for paper sources
│ └── models/ # Typed return types (PaperDetails, ProjectDetails, etc.)
├── storage/
│ ├── db.py # SQLite DB: versioned paper storage, graph data queries
│ ├── authors.py # Author CRUD and paper linkage
│ ├── tags.py # Tag CRUD
│ ├── projects.py # Projects: Project data model + CRUD (Status/Q imported)
│ ├── notes.py # Notes: per-paper annotations scoped to projects
│ ├── paths.py # Filesystem paths (project root, DB, PDFs)
│ ├── config/
│ │ ├── core.py # Schema application: apply_sql_schema, init_db
│ │ ├── queries.py # Typed query helpers + composable Q predicate builder
│ │ └── sql/ # SQL table, view, and index definitions
│ └── migrations/ # One-off schema migration scripts
├── formats/
│ ├── bibtex.py # BibTeX import/export
│ ├── csv_fmt.py # CSV import/export
│ ├── json_fmt.py # JSON import/export
│ ├── markdown.py # Markdown / Obsidian import/export
│ ├── table_format.md # YAML frontmatter template for Obsidian notes
│ └── arxiv_paper.md # Plain-text paper card template
├── public/
│ └── graph/ # Graph viewer (graph.html/js/css), loaded in an iframe
├── src/ # React + TypeScript frontend (Vite)
├── src-tauri/ # Tauri shell (Rust) + bundled sidecar binaries
├── tests/ # pytest suite (API, CLI, DB, sources, DOI, notes, projects)
├── docs/ # Development notes and technical debt log
└── pdfs/ # Downloaded PDFs (gitignored)
Setup
Prerequisites
- Python 3.10+
- Node.js 18+ (for frontend / Tauri dev)
- Rust toolchain (for Tauri)
- uv (recommended Python package manager)
Install dependencies
uv sync # Python dependencies (backend + dev)
npm install # Node dependencies (frontend)
Add --extra mcp if you need the MCP server: uv pip install -e ".[mcp]"
Environment variables
Create a .env file in the project root:
GENAI_API_KEY_TAG_GEN=your_google_gemini_api_key
Run
HTTP API (JSON backend)
uv run python -m api # http://127.0.0.1:8000 — see /docs for OpenAPI
CLI
Install once (editable install via uv):
uv pip install -e .
Then run from anywhere:
linxiv --version
# Search papers (arxiv, openalex, or crossref)
linxiv search "attention is all you need" --max 5
linxiv search "diffusion models" --source openalex --max 10
linxiv search "lattice QCD" --source crossref --max 3
# Fetch and save a paper by ID
linxiv fetch 2204.12985
linxiv fetch W3123456789 --source openalex
# List papers in the database
linxiv list --limit 20 --offset 0 --category cs.LG
# Paper management
linxiv paper get 2204.12985
linxiv paper versions 2204.12985
linxiv paper delete 2204.12985
# Tag management
linxiv tag add 2204.12985 transformers attention deep-learning
linxiv tag remove 2204.12985 attention
linxiv tag list 2204.12985
linxiv tag list-all
linxiv tag create my-tag
linxiv tag delete 42
# Project management
linxiv project list
linxiv project list --status active # active | archived | deleted
linxiv project get 1
linxiv project create "Diffusion Models" --description "Score-based generative models"
linxiv project update 1 --name "Diffusion Models v2" --description "Updated"
linxiv project add-paper 1 2006.11239
linxiv project remove-paper 1 2006.11239
linxiv project delete 1
# Note management
linxiv note create 2204.12985 "Key insight: scaled dot-product attention" --title "Reading notes"
linxiv note create 2204.12985 "Follow-up question" --project-id 1
linxiv note get 7
linxiv note list --paper-id 2204.12985
linxiv note list --project-id 1
linxiv note delete 7
# PDF management
linxiv pdf path 2204.12985
linxiv pdf path 2204.12985 --version 2
linxiv pdf download 2204.12985 https://arxiv.org/pdf/2204.12985
linxiv pdf storage
All commands output JSON (or a formatted markdown card for fetch). Pass --help to any subcommand for full options.
MCP server (Claude integration)
Install with the mcp extra:
uv pip install -e ".[mcp]"
Register with Claude Code:
claude mcp add linxiv -- linxiv-mcp
Or add manually to .claude/settings.json:
{
  "mcpServers": {
    "linxiv": {
      "command": "linxiv-mcp"
    }
  }
}
Without an editable install, fall back to uv run:
{ "command": "uv", "args": ["run", "linxiv_mcp.py"], "cwd": "/absolute/path/to/linxiv" }
Once registered, Claude can call the linXiv tools directly — for example search_papers, fetch_paper, and list_papers. Full tool documentation will be added soon.
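If you prefer to script the manual registration, the following sketch merges a linxiv entry into a settings dictionary shaped like the JSON above, without disturbing other servers. The helper add_mcp_server and the "other-tool" entry are hypothetical; reading and writing .claude/settings.json is left out so the example stays self-contained.

```python
import copy
import json


def add_mcp_server(settings, name, command, args=None, cwd=None):
    """Return a copy of settings with one entry added under "mcpServers"."""
    updated = copy.deepcopy(settings)  # leave the caller's dictionary untouched
    entry = {"command": command}
    if args:
        entry["args"] = list(args)
    if cwd:
        entry["cwd"] = cwd
    updated.setdefault("mcpServers", {})[name] = entry
    return updated


# "other-tool" stands in for any server that is already registered.
existing = {"mcpServers": {"other-tool": {"command": "other-mcp"}}}

# Editable install: the linxiv-mcp entry point is on PATH.
installed = add_mcp_server(existing, "linxiv", "linxiv-mcp")

# No editable install: fall back to uv run, as described above.
fallback = add_mcp_server({}, "linxiv", "uv",
                          args=["run", "linxiv_mcp.py"],
                          cwd="/absolute/path/to/linxiv")

print(json.dumps(installed, indent=2))
```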
Building the Tauri App
The Tauri desktop app wraps the React/Vite frontend and bundles the Python backend as sidecar binaries compiled with PyInstaller.
Tauri prerequisites
- Node.js 18+
- Rust toolchain (stable)
- uv
- The Tauri build pulls the tauri-plugin-texbrain crate as a git dependency (github.com/linxiv-dev/tex-brain-linxiv-plugin, pinned in Cargo.lock); no extra checkout is needed
- System Tauri dependencies: follow the Tauri v2 prerequisites guide for your OS (WebKit2GTK on Linux, Xcode Command Line Tools on macOS, Microsoft C++ Build Tools on Windows)
Development
Start the Python API and the Tauri dev window in separate terminals:
# terminal 1 — Python backend
uv run python -m api # http://127.0.0.1:8000
# terminal 2 — Tauri dev window (also starts Vite, hot-reloads on frontend changes)
npm run tauri dev
The Python API sidecar is not bundled in dev mode — the app talks to the locally running API on port 8000.
Production build
The Python entry points (API, CLI, MCP server) are compiled to self-contained binaries with PyInstaller and staged into src-tauri/binaries/ before Tauri bundles the app.
1. Build and stage the Python sidecars:
npm run build:sidecar
This runs PyInstaller on linxiv-api.spec, linxiv-cli.spec, and linxiv-mcp.spec, then copies the outputs to src-tauri/binaries/ with the correct Tauri target-triple suffix.
2. Build the Tauri app:
npm run tauri build
Or run both steps at once:
npm run build:all
The final installer/bundle is written to src-tauri/target/release/bundle/.
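The target-triple suffix mentioned in step 1 follows Tauri's sidecar naming rule: each bundled binary is named with the Rust target triple appended, plus .exe on Windows. How npm run build:sidecar implements this is not shown here, so the sketch below is only an illustration; both function names are hypothetical, and the sample text is a trimmed stand-in for the output of rustc -vV.

```python
def parse_host_triple(rustc_verbose_output: str) -> str:
    """Extract the value of the 'host:' line from the output of `rustc -vV`."""
    for line in rustc_verbose_output.splitlines():
        if line.startswith("host:"):
            return line.split(":", 1)[1].strip()
    raise ValueError("no 'host:' line found in rustc output")


def sidecar_filename(base_name: str, target_triple: str) -> str:
    """Append the target triple, plus .exe for Windows targets."""
    suffix = ".exe" if "windows" in target_triple else ""
    return f"{base_name}-{target_triple}{suffix}"


# Trimmed sample; real `rustc -vV` output contains more lines.
sample = "binary: rustc\nhost: x86_64-unknown-linux-gnu\n"

triple = parse_host_triple(sample)
print(sidecar_filename("linxiv-api", triple))
print(sidecar_filename("linxiv-api", "x86_64-pc-windows-msvc"))
```

In a real build script the sample string would come from running rustc -vV, for example through subprocess.run with captured output.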
Installing the CLI
After installing the desktop app, open Settings and click Install CLI to symlink the bundled linxiv binary to ~/.local/bin/linxiv (Linux/macOS) or add a shim to your PATH (Windows).
Usage
Projects
from storage import Project, filter_projects, Q, Status, get_paper
# Create and save a project
p = Project(name="Diffusion Models", color=0x5b8dee, project_tags=["generative"])
p.save()
# Add papers — add_papers takes integer SOURCE_FKs (papers must already be in the DB)
p.add_papers([get_paper(sid)["source_fk"] for sid in ("2006.11239", "2010.02502", "2112.10752")])
# Query with composable predicates
active = filter_projects(Q("status = ?", Status.ACTIVE))
not_deleted = filter_projects(~Q("status = ?", Status.DELETED))
blue_diffusion = filter_projects(
    Q("status = ?", Status.ACTIVE)
    & Q("color = ?", 0x5b8dee)
    & Q("name LIKE ?", "%diffusion%")
)
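For readers curious how predicates like these can compose, here is a minimal sketch of one possible design: each predicate carries a SQL fragment and its bound parameters, and the &, | and ~ operators combine them. This is an illustration under assumptions, not the Q class in storage/config/queries.py; the table layout, the filter_rows helper, and the use of plain strings in place of Status are all made up for the example.

```python
import sqlite3


class Q:
    """A SQL fragment plus its bound parameters (illustrative, not linXiv's class)."""

    def __init__(self, sql, *params):
        self.sql = sql
        self.params = list(params)

    def __and__(self, other):
        return Q(f"({self.sql}) AND ({other.sql})", *self.params, *other.params)

    def __or__(self, other):
        return Q(f"({self.sql}) OR ({other.sql})", *self.params, *other.params)

    def __invert__(self):
        return Q(f"NOT ({self.sql})", *self.params)


def filter_rows(conn, q):
    # Values travel as bound parameters and are never interpolated into the SQL text.
    sql = f"SELECT name FROM projects WHERE {q.sql} ORDER BY name"
    return conn.execute(sql, q.params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE projects (name TEXT, status TEXT, color INTEGER)")
conn.executemany("INSERT INTO projects VALUES (?, ?, ?)", [
    ("Diffusion Models", "active", 0x5B8DEE),
    ("Old ideas", "deleted", 0x5B8DEE),
    ("Lattice QCD", "active", 0xFF0000),
])

query = Q("status = ?", "active") & Q("color = ?", 0x5B8DEE) & Q("name LIKE ?", "%diffusion%")
rows = filter_rows(conn, query)
not_deleted = filter_rows(conn, ~Q("status = ?", "deleted"))
print(query.sql, query.params, rows)
```

Wrapping every operand in parentheses keeps operator precedence explicit, at the cost of slightly noisier SQL.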
Notes
from storage import Note, get_notes, count_paper_notes, ensure_notes_db, get_paper
ensure_notes_db()
# Notes attach to a paper by its integer SOURCE_FK
sfk = get_paper("2006.11239")["source_fk"]
# Add a project-scoped note on a paper
note = Note(source_fk=sfk, project_id=p.id, title="Key insight", content="...")
note.save()
# Retrieve
project_notes = get_notes(sfk, project_id=p.id)
count = count_paper_notes(sfk, project_id=p.id)
Search and save papers
from sources import search_papers
from storage import init_db, save_papers
init_db()
papers = search_papers("lattice QCD", max_results=25) # returns arxiv.Result objects
save_papers(papers) # persist them to the DB
Add by DOI
from sources import resolve_doi
result = resolve_doi("10.48550/arXiv.1706.03762")
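The DOI in this example is an arXiv-issued one: arXiv registers DOIs of the form 10.48550/arXiv.<id>, so the arXiv ID can be read straight from the DOI. The sketch below shows that one step in isolation. The function arxiv_id_from_doi is hypothetical, and whether resolve_doi in sources/doi_resolve.py works this way is an assumption.

```python
import re
from typing import Optional

# arXiv-issued DOIs look like 10.48550/arXiv.<id>; DOIs are case-insensitive.
_ARXIV_DOI = re.compile(r"^10\.48550/arxiv\.(?P<id>.+)$", re.IGNORECASE)


def arxiv_id_from_doi(doi: str) -> Optional[str]:
    """Return the arXiv ID embedded in an arXiv-issued DOI, or None otherwise."""
    cleaned = doi.strip()
    # Accept the common URL and "doi:" forms as well as the bare DOI.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    match = _ARXIV_DOI.match(cleaned)
    return match.group("id") if match else None


print(arxiv_id_from_doi("10.48550/arXiv.1706.03762"))
```

A DOI from any other registrant returns None here, which is the point where a fallback lookup (the project structure lists Semantic Scholar and CrossRef) would take over.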
AI tools
from AI_tools import tag, summarize, find_related, PaperContent
# paper is assumed to be one of the arxiv.Result objects returned by search_papers()
content = PaperContent(abstract=paper.summary)
tags = tag(content) # ["#quantum_computing", ...]
tags = tag(content, file_path="tags.md") # also appends to file
s = summarize(content)
print(s.tldr)
print(s.key_contributions)
# Semantic edges for the graph
from storage import list_papers
candidates = [(r["paper_id"], r["summary"]) for r in list_papers()]
related_ids = find_related(content, candidates)
Download PDFs
from sources.arxiv_downloads import download_pdf, download_pdf_batch, download_source_batch
download_pdf(paper, dirpath="pdfs/")
download_pdf_batch(papers, dirpath="pdfs/")
download_source_batch(papers, dirpath="source/")
Database queries
from storage import get_paper, get_all_versions, list_papers, get_graph_data
get_paper("2204.12985") # latest version
get_paper("2204.12985", version=2)
get_all_versions("2204.12985") # all stored versions
nodes, edges = get_graph_data() # for the graph viewer
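The "latest version unless a version is given" behaviour above can be pictured with a small in-memory SQLite table keyed on the paper ID and version number. The schema, the function names ending in _sketch, and the sample titles are all hypothetical; the real definitions live under storage/config/sql/ and are not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row

# Hypothetical table: one row per (paper_id, version) pair.
conn.execute("""
    CREATE TABLE paper_versions (
        paper_id TEXT NOT NULL,
        version  INTEGER NOT NULL,
        title    TEXT NOT NULL,
        PRIMARY KEY (paper_id, version)
    )
""")
conn.executemany(
    "INSERT INTO paper_versions VALUES (?, ?, ?)",
    [("2204.12985", 1, "Draft title"), ("2204.12985", 2, "Revised title")],
)


def get_paper_sketch(paper_id, version=None):
    """Return the requested version, or the highest stored version if none is given."""
    if version is None:
        sql = ("SELECT * FROM paper_versions WHERE paper_id = ? "
               "ORDER BY version DESC LIMIT 1")
        row = conn.execute(sql, (paper_id,)).fetchone()
    else:
        sql = "SELECT * FROM paper_versions WHERE paper_id = ? AND version = ?"
        row = conn.execute(sql, (paper_id, version)).fetchone()
    return dict(row) if row else None


def get_all_versions_sketch(paper_id):
    """Return every stored version of a paper, oldest first."""
    rows = conn.execute(
        "SELECT * FROM paper_versions WHERE paper_id = ? ORDER BY version",
        (paper_id,),
    ).fetchall()
    return [dict(row) for row in rows]


print(get_paper_sketch("2204.12985"))
```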
Graph Visualization
Papers (circles, in your theme’s accent color — blue by default) and authors (gold diamonds) form a force-directed network. Edges connect each paper to its authors. The control panel has four real-time sliders:
| Slider | Effect |
|---|---|
| Center force | Pulls/pushes nodes toward the center |
| Repel force | Controls node-to-node repulsion |
| Link distance | Target edge length |
| Link strength | Stiffness of paper–author edges |
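The paper and author network described above can be sketched as a plain node and edge list before any layout is applied. The record fields (id, title, authors), the node ID prefixes, and the function build_graph are assumptions for illustration; the actual payload produced for /api/graph may be shaped differently.

```python
def build_graph(papers):
    """Turn paper records into (nodes, edges): one node per paper and per author."""
    nodes, edges = [], []
    seen_authors = set()
    for paper in papers:
        paper_node = f"paper:{paper['id']}"
        nodes.append({"id": paper_node, "type": "paper", "label": paper["title"]})
        for author in paper["authors"]:
            author_node = f"author:{author}"
            if author not in seen_authors:
                # An author shared by several papers gets a single node.
                seen_authors.add(author)
                nodes.append({"id": author_node, "type": "author", "label": author})
            # Each edge connects a paper to one of its authors.
            edges.append({"source": paper_node, "target": author_node})
    return nodes, edges


# Placeholder records, not real papers.
papers = [
    {"id": "A", "title": "Paper A", "authors": ["Ada", "Bo"]},
    {"id": "B", "title": "Paper B", "authors": ["Bo"]},
]
nodes, edges = build_graph(papers)
print(len(nodes), len(edges))
```

Shared authors are what tie the graph together: in the sample data, Bo links Paper A and Paper B, so a force layout would pull the two papers toward each other.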
Notes
- papers.db, pdfs/, source/, and vault contents are gitignored.
- MathJax, D3, and the Inter UI font are all bundled locally, with no external CDN calls, so the interface works fully offline.
- PaperContent accepts abstract, full_text (TeX source), or pdf (bytes); Gemini will use the richest available source.
Acknowledgements
linXiv owes a debt to Qiqqa, the open-source research management tool originally created by Jimme Jardine.
PDF text and metadata extraction uses pypdf, a pure-Python PDF library maintained by the py-pdf organization. pypdf is licensed under the BSD 3-Clause License.