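@QingQ77: A local-first desktop application for managing academic papers, with support for discovering, managing, and visualizing papers from arXiv and other sources. https://github.com/linxiv-dev/linXiv… A local paper-management tool for researchers: all data is stored locally and nothing is uploaded to external serv…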


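Summary

linXiv is a local-first desktop application for managing academic papers. It supports discovering, managing, and visualizing papers from arXiv and other sources, and integrates a SQLite database, AI tagging, Obsidian notes, and a paper network graph.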


Cached at: 2026/06/23 10:04

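A local-first desktop application for managing academic papers, with support for discovering, managing, and visualizing papers from arXiv and other sources.

https://github.com/linxiv-dev/linXiv…

A local paper-management tool for researchers: all data is stored locally and nothing is uploaded to external services. It combines a SQLite database, AI tagging with Google Gemini, Obsidian note integration, and paper network graph visualization in a single Tauri desktop app (React + TypeScript frontend, Python backend). You can upload PDFs directly, create projects, write notes, and add tags, and you can also search for papers via arXiv, DOI, OpenAlex, or CrossRef.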


linxiv-dev/linXiv

Source: https://github.com/linxiv-dev/linXiv

linXiv


A local-first desktop application for discovering, managing, and visualizing academic papers from arXiv and other sources. Combines a local SQLite database, optional AI-powered tagging, Obsidian vault integration, and an interactive network graph (Cytoscape rendering with a D3 force simulation), wrapped in a Tauri desktop shell (React + TypeScript frontend, Python backend).

Upload your PDFs, create projects, manage notes, tags, and more to organize your files — all locally, without sending your data anywhere. This project aims to be a one-stop-shop for researchers who want to manage their literature, with the near-term goal of extending to research groups who seek to share knowledge without going to the web.

Development status: The database schema and paper identifier format are actively changing. source_id values are being migrated to a namespaced format (arxiv:2204.12985, doi:10.48550/…, openalex:W3123456789, local:{hash}). Until that work lands, papers.db files created before v0.1.2 (the current version) will not be compatible with new builds; delete papers.db and let it rebuild on first run. No stable release has been cut yet.
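To make the namespaced identifier format concrete, here is a minimal sketch of how such IDs could be split and rebuilt. The names `SourceId`, `parse_source_id`, and `format_source_id` are hypothetical and not part of linXiv, and treating a bare legacy ID as an arXiv ID is an assumption made only for this example.

```python
from typing import NamedTuple

# Namespaces listed in the development-status note above.
KNOWN_NAMESPACES = {"arxiv", "doi", "openalex", "local"}


class SourceId(NamedTuple):
    namespace: str
    identifier: str


def parse_source_id(raw: str, default_namespace: str = "arxiv") -> SourceId:
    """Split 'namespace:identifier'; un-prefixed values get default_namespace.

    Assumption: a legacy bare ID such as '2204.12985' is an arXiv ID.
    """
    prefix, sep, rest = raw.partition(":")
    if sep and rest and prefix.lower() in KNOWN_NAMESPACES:
        return SourceId(prefix.lower(), rest)
    return SourceId(default_namespace, raw)


def format_source_id(sid: SourceId) -> str:
    """Rebuild the 'namespace:identifier' string."""
    return f"{sid.namespace}:{sid.identifier}"
```

Splitting only on the first colon keeps identifiers that themselves contain a colon (some DOIs do) intact.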

Features

  • Paper search — Search arXiv by keyword, fetch by ID, or look up by DOI; results saved to a local SQLite DB with version tracking
  • Interactive graph — Force-directed network of papers and authors (D3 force simulation, Cytoscape rendering); real-time force controls (center, repel, link distance, link strength)
  • Projects — Organise papers into projects; add notes per paper scoped to a project; composable SQL query builder (Q) for filtering
  • TeX rendering — MathJax renders LaTeX math in titles and abstracts inside the search UI
  • AI tools — Google Gemini structured output for tag generation, paper summarization, and semantic similarity
  • Obsidian integration — Auto-generate markdown notes with YAML frontmatter for your vault
  • PDF & TeX downloads — Batch download PDFs and TeX source tarballs
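As an illustration of the Obsidian integration listed above, the following sketch renders a markdown note with YAML frontmatter. The function `render_note` and the field names `title`, `authors`, and `tags` are assumptions chosen for the example; the actual template lives in formats/table_format.md and may use different keys.

```python
import json


def render_note(title, authors, tags, abstract):
    """Build a markdown note with a YAML frontmatter block at the top."""

    def scalar(value):
        # A JSON string is also a valid double-quoted YAML scalar,
        # so json.dumps handles quoting and escaping for us.
        return json.dumps(value, ensure_ascii=False)

    lines = ["---", f"title: {scalar(title)}"]
    for key, values in (("authors", authors), ("tags", tags)):
        if not values:
            lines.append(f"{key}: []")
            continue
        lines.append(f"{key}:")
        lines.extend(f"  - {scalar(v)}" for v in values)
    lines += ["---", "", f"# {title}", "", abstract, ""]
    return "\n".join(lines)
```

Usage would look like `render_note("Attention Is All You Need", ["A. Vaswani"], ["transformers"], "...")`, with the result written to a .md file inside the vault.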

Project Structure

linXiv/
├── AI_tools.py                # Gemini: tag(), summarize(), find_related(); PaperContent input type
├── linxiv_cli.py              # CLI entry point (linxiv command via pyproject.toml)
├── linxiv_mcp.py              # MCP server for Claude integration
├── config.py                  # App-wide configuration constants
├── user_settings.py           # User-editable settings (API keys, paths)
├── pyproject.toml             # Package metadata + CLI/MCP entry points
├── assets/
│   ├── app_icon.png           # Application icon
│   └── wide_logo.png          # Wide logo (README header)
├── api/
│   ├── __main__.py            # Entry point: python -m api
│   ├── app.py                 # FastAPI routes (REST API incl. /api/graph)
│   ├── graph_payload.py       # Graph JSON (tags + projects) for /api/graph
│   └── run_api.py             # uvicorn launcher helper
├── sources/
│   ├── base.py                # PaperSource protocol + PaperMetadata model
│   ├── arxiv_source.py        # ArxivSource: search and fetch from arXiv API
│   ├── crossref_source.py     # CrossRefSource: fetch by DOI, search by title
│   ├── openalex_source.py     # OpenAlexSource: lookup via OpenAlex
│   ├── doi_resolve.py         # DOI resolution (arXiv, Semantic Scholar, CrossRef fallback)
│   ├── fetch_paper_metadata.py# High-level fetch/search helpers + Obsidian note generation
│   ├── pdf_metadata.py        # PDF metadata extraction and resolution pipeline
│   └── arxiv_downloads.py     # PDF and TeX source download helpers
├── service/
│   ├── paper.py               # Paper service: get, get_all, get_many, upsert, graph data
│   ├── author.py              # Author service: get, upsert, link/unlink to papers
│   ├── tag.py                 # Tag service: get, upsert, paper/project tag management
│   ├── note.py                # Note service: get, upsert, count by paper/project
│   ├── project.py             # Project service: get, upsert, filter, status management
│   ├── export_import.py       # Export/import projects as .lxproj archives
│   ├── vault.py               # On-disk LaTeX vault for the embedded editor
│   ├── editor_project.py      # Note-link layer for the embedded editor
│   ├── files.py               # File utilities for paper sources
│   └── models/                # Typed return types (PaperDetails, ProjectDetails, etc.)
├── storage/
│   ├── db.py                  # SQLite DB: versioned paper storage, graph data queries
│   ├── authors.py             # Author CRUD and paper linkage
│   ├── tags.py                # Tag CRUD
│   ├── projects.py            # Projects: Project data model + CRUD (Status/Q imported)
│   ├── notes.py               # Notes: per-paper annotations scoped to projects
│   ├── paths.py               # Filesystem paths (project root, DB, PDFs)
│   ├── config/
│   │   ├── core.py            # Schema application: apply_sql_schema, init_db
│   │   ├── queries.py         # Typed query helpers + composable Q predicate builder
│   │   └── sql/               # SQL table, view, and index definitions
│   └── migrations/            # One-off schema migration scripts
├── formats/
│   ├── bibtex.py              # BibTeX import/export
│   ├── csv_fmt.py             # CSV import/export
│   ├── json_fmt.py            # JSON import/export
│   ├── markdown.py            # Markdown / Obsidian import/export
│   ├── table_format.md        # YAML frontmatter template for Obsidian notes
│   └── arxiv_paper.md         # Plain-text paper card template
├── public/
│   └── graph/                 # Graph viewer (graph.html/js/css), loaded in an iframe
├── src/                       # React + TypeScript frontend (Vite)
├── src-tauri/                 # Tauri shell (Rust) + bundled sidecar binaries
├── tests/                     # pytest suite (API, CLI, DB, sources, DOI, notes, projects)
├── docs/                      # Development notes and technical debt log
└── pdfs/                      # Downloaded PDFs (gitignored)

Setup

Prerequisites

  • Python 3.10+
  • Node.js 18+ (for frontend / Tauri dev)
  • Rust toolchain (for Tauri)
  • uv (recommended Python package manager)

Install dependencies

uv sync          # Python dependencies (backend + dev)
npm install      # Node dependencies (frontend)

If you need the MCP server, include the mcp extra: pass --extra mcp to uv sync, or run uv pip install -e ".[mcp]"

Environment variables

Create a .env file in the project root:

GENAI_API_KEY_TAG_GEN=your_google_gemini_api_key

Run

HTTP API (JSON backend)

uv run python -m api   # http://127.0.0.1:8000 — see /docs for OpenAPI

CLI

Install once (editable install via uv):

uv pip install -e .

Then run from anywhere:

linxiv --version

# Search papers (arxiv, openalex, or crossref)
linxiv search "attention is all you need" --max 5
linxiv search "diffusion models" --source openalex --max 10
linxiv search "lattice QCD" --source crossref --max 3

# Fetch and save a paper by ID
linxiv fetch 2204.12985
linxiv fetch W3123456789 --source openalex

# List papers in the database
linxiv list --limit 20 --offset 0 --category cs.LG

# Paper management
linxiv paper get 2204.12985
linxiv paper versions 2204.12985
linxiv paper delete 2204.12985

# Tag management
linxiv tag add 2204.12985 transformers attention deep-learning
linxiv tag remove 2204.12985 attention
linxiv tag list 2204.12985
linxiv tag list-all
linxiv tag create my-tag
linxiv tag delete 42

# Project management
linxiv project list
linxiv project list --status active      # active | archived | deleted
linxiv project get 1
linxiv project create "Diffusion Models" --description "Score-based generative models"
linxiv project update 1 --name "Diffusion Models v2" --description "Updated"
linxiv project add-paper 1 2006.11239
linxiv project remove-paper 1 2006.11239
linxiv project delete 1

# Note management
linxiv note create 2204.12985 "Key insight: scaled dot-product attention" --title "Reading notes"
linxiv note create 2204.12985 "Follow-up question" --project-id 1
linxiv note get 7
linxiv note list --paper-id 2204.12985
linxiv note list --project-id 1
linxiv note delete 7

# PDF management
linxiv pdf path 2204.12985
linxiv pdf path 2204.12985 --version 2
linxiv pdf download 2204.12985 https://arxiv.org/pdf/2204.12985
linxiv pdf storage

All commands output JSON (or a formatted markdown card for fetch). Pass --help to any subcommand for full options.
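Because the commands print JSON, they can be scripted from Python. The helper below is a minimal sketch; `run_json_command` is a hypothetical name, and the shape of the decoded data depends on the subcommand and is not documented here.

```python
import json
import subprocess


def run_json_command(cmd):
    """Run a command and decode its standard output as JSON.

    Raises subprocess.CalledProcessError on a non-zero exit code and
    json.JSONDecodeError if the output is not valid JSON.
    """
    proc = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(proc.stdout)


# Example (requires the linxiv CLI on PATH and a populated database):
# papers = run_json_command(["linxiv", "list", "--limit", "20"])
```

Since fetch prints a markdown card rather than JSON, this helper should not be used for that subcommand.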

MCP server (Claude integration)

Install with the mcp extra:

uv pip install -e ".[mcp]"

Register with Claude Code:

claude mcp add linxiv -- linxiv-mcp

Or add manually to .claude/settings.json:

{
  "mcpServers": {
    "linxiv": {
      "command": "linxiv-mcp"
    }
  }
}

Without an editable install, fall back to uv run:

{ "command": "uv", "args": ["run", "linxiv_mcp.py"], "cwd": "/absolute/path/to/linxiv" }

Once registered, Claude can call the linXiv tools directly — for example search_papers, fetch_paper, and list_papers. Full tool documentation will be added soon.

Building the Tauri App

The Tauri desktop app wraps the React/Vite frontend and bundles the Python backend as sidecar binaries compiled with PyInstaller.

Tauri prerequisites

  • Node.js 18+
  • Rust toolchain (stable)
  • uv
  • The Tauri build pulls the tauri-plugin-texbrain crate as a git dependency (github.com/linxiv-dev/tex-brain-linxiv-plugin, pinned in Cargo.lock) — no extra checkout needed
  • System Tauri dependencies — follow the Tauri v2 prerequisites guide for your OS (WebKit2GTK on Linux, Xcode Command Line Tools on macOS, Microsoft C++ Build Tools on Windows)

Development

Start the Python API and the Tauri dev window in separate terminals:

# terminal 1 — Python backend
uv run python -m api   # http://127.0.0.1:8000

# terminal 2 — Tauri dev window (also starts Vite, hot-reloads on frontend changes)
npm run tauri dev

The Python API sidecar is not bundled in dev mode — the app talks to the locally running API on port 8000.

Production build

The Python entry points (API, CLI, MCP server) are compiled to self-contained binaries with PyInstaller and staged into src-tauri/binaries/ before Tauri bundles the app.

1. Build and stage the Python sidecars:

npm run build:sidecar

This runs PyInstaller on linxiv-api.spec, linxiv-cli.spec, and linxiv-mcp.spec, then copies the outputs to src-tauri/binaries/ with the correct Tauri target-triple suffix.

2. Build the Tauri app:

npm run tauri build

Or run both steps at once:

npm run build:all

The final installer/bundle is written to src-tauri/target/release/bundle/.
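The target-triple suffix mentioned in step 1 follows Tauri's sidecar naming convention, where each external binary is named after the platform it was built for. The sketch below shows one way to compute such a name; the helper names are hypothetical, the base name linxiv-api is assumed from the spec file name, and the real build:sidecar script may work differently.

```python
import re
import subprocess
from pathlib import Path


def host_triple_from(rustc_verbose):
    """Extract the target triple from the 'host:' line of `rustc -vV` output."""
    match = re.search(r"^host:\s*(\S+)", rustc_verbose, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no 'host:' line in rustc output")
    return match.group(1)


def sidecar_name(base, triple):
    """Name a sidecar as <base>-<target-triple>, plus .exe on Windows."""
    suffix = ".exe" if "windows" in triple else ""
    return f"{base}-{triple}{suffix}"


def staged_path(base, binaries_dir="src-tauri/binaries"):
    """Ask the local rustc for the host triple and build the staged path."""
    out = subprocess.run(
        ["rustc", "-vV"], capture_output=True, text=True, check=True
    ).stdout
    return Path(binaries_dir) / sidecar_name(base, host_triple_from(out))
```

Keeping the parsing and naming in pure functions means only `staged_path` needs a Rust toolchain to run.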

Installing the CLI

After installing the desktop app, open Settings and click Install CLI to symlink the bundled linxiv binary to ~/.local/bin/linxiv (Linux/macOS) or add a shim to your PATH (Windows).

Usage

Projects

from storage import Project, filter_projects, Q, Status, get_paper

# Create and save a project
p = Project(name="Diffusion Models", color=0x5b8dee, project_tags=["generative"])
p.save()

# Add papers — add_papers takes integer SOURCE_FKs (papers must already be in the DB)
p.add_papers([get_paper(sid)["source_fk"] for sid in ("2006.11239", "2010.02502", "2112.10752")])

# Query with composable predicates
active = filter_projects(Q("status = ?", Status.ACTIVE))
not_deleted = filter_projects(~Q("status = ?", Status.DELETED))
blue_diffusion = filter_projects(
    Q("status = ?", Status.ACTIVE)
    & Q("color = ?", 0x5b8dee)
    & Q("name LIKE ?", "%diffusion%")
)
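For readers curious how composable predicates like these can work, here is a self-contained sketch of the general technique. `Pred` and `P` are hypothetical stand-ins, not linXiv's actual Q class, and the projects table below is an assumed toy schema. Each predicate carries a SQL fragment plus its bound parameters, and the operators `&`, `|`, and `~` combine both.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Pred:
    """A SQL fragment plus its bound parameters (hypothetical stand-in for Q)."""
    sql: str
    params: tuple = ()

    def __and__(self, other):
        return Pred(f"({self.sql}) AND ({other.sql})", self.params + other.params)

    def __or__(self, other):
        return Pred(f"({self.sql}) OR ({other.sql})", self.params + other.params)

    def __invert__(self):
        return Pred(f"NOT ({self.sql})", self.params)


def P(sql, *params):
    """Convenience constructor mirroring the Q("status = ?", value) call style."""
    return Pred(sql, params)


# Toy schema and data, assumed for this example only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE projects (name TEXT, status TEXT, color INTEGER)")
conn.executemany(
    "INSERT INTO projects VALUES (?, ?, ?)",
    [("Diffusion Models", "active", 0x5B8DEE), ("Old ideas", "deleted", 0)],
)

where = P("status = ?", "active") & ~P("name LIKE ?", "%old%")
rows = conn.execute(
    f"SELECT name FROM projects WHERE {where.sql}", where.params
).fetchall()
```

Values always travel as bound parameters, so combining predicates never interpolates user data into the SQL text.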

Notes

from storage import Note, get_notes, count_paper_notes, ensure_notes_db, get_paper

ensure_notes_db()

# Notes attach to a paper by its integer SOURCE_FK
sfk = get_paper("2006.11239")["source_fk"]

# Add a project-scoped note on a paper
note = Note(source_fk=sfk, project_id=p.id, title="Key insight", content="...")
note.save()

# Retrieve
project_notes = get_notes(sfk, project_id=p.id)
count = count_paper_notes(sfk, project_id=p.id)

Search and save papers

from sources import search_papers
from storage import init_db, save_papers

init_db()
papers = search_papers("lattice QCD", max_results=25)  # returns arxiv.Result objects
save_papers(papers)                                    # persist them to the DB

Add by DOI

from sources import resolve_doi

result = resolve_doi("10.48550/arXiv.1706.03762")
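The DOI in this example is an arXiv DOI, which embeds the arXiv ID after the 10.48550/arXiv. prefix. The sketch below shows how that ID could be recovered before falling back to other resolvers. `normalize_doi` and `arxiv_id_from_doi` are hypothetical helpers, and whether resolve_doi works this way internally is not confirmed by the source.

```python
import re

# arXiv registers DOIs of the form 10.48550/arXiv.<id>; DOIs are case-insensitive.
_ARXIV_DOI = re.compile(r"^10\.48550/arxiv\.(.+)$", re.IGNORECASE)

_PREFIXES = ("https://doi.org/", "http://doi.org/", "https://dx.doi.org/", "doi:")


def normalize_doi(raw):
    """Strip surrounding whitespace and a leading resolver URL or 'doi:' prefix."""
    doi = raw.strip()
    for prefix in _PREFIXES:
        if doi.lower().startswith(prefix):
            return doi[len(prefix):]
    return doi


def arxiv_id_from_doi(raw):
    """Return the arXiv ID embedded in an arXiv DOI, or None for any other DOI."""
    match = _ARXIV_DOI.match(normalize_doi(raw))
    return match.group(1) if match else None
```

A caller could try `arxiv_id_from_doi` first and hand any DOI that returns None to a general-purpose resolver such as CrossRef.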

AI tools

from AI_tools import tag, summarize, find_related, PaperContent

# `paper` is assumed to be an arxiv.Result, e.g. one item returned by search_papers()
content = PaperContent(abstract=paper.summary)

tags = tag(content)                        # ["#quantum_computing", ...]
tags = tag(content, file_path="tags.md")   # also appends to file

s = summarize(content)
print(s.tldr)
print(s.key_contributions)

# Semantic edges for the graph
from storage import list_papers
candidates = [(r["paper_id"], r["summary"]) for r in list_papers()]
related_ids = find_related(content, candidates)

Download PDFs

from sources.arxiv_downloads import download_pdf, download_pdf_batch, download_source_batch

# `paper` is assumed to be a single search result and `papers` a list of them
download_pdf(paper, dirpath="pdfs/")
download_pdf_batch(papers, dirpath="pdfs/")
download_source_batch(papers, dirpath="source/")

Database queries

from storage import get_paper, get_all_versions, list_papers, get_graph_data

get_paper("2204.12985")           # latest version
get_paper("2204.12985", version=2)
get_all_versions("2204.12985")    # all stored versions
nodes, edges = get_graph_data()   # for the graph viewer
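The latest-version default shown above can be pictured with a small SQLite sketch. The table layout, column names, and the function `get_paper_sketch` are assumptions for illustration; the real schema is defined under storage/config/sql/ and is still changing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute(
    """CREATE TABLE papers (
           paper_id TEXT NOT NULL,
           version  INTEGER NOT NULL,
           title    TEXT NOT NULL,
           PRIMARY KEY (paper_id, version)
       )"""
)
conn.executemany(
    "INSERT INTO papers VALUES (?, ?, ?)",
    [("2204.12985", 1, "Draft title"), ("2204.12985", 2, "Revised title")],
)


def get_paper_sketch(paper_id, version=None):
    """Return one stored version as a dict, defaulting to the highest version."""
    if version is None:
        sql = "SELECT * FROM papers WHERE paper_id = ? ORDER BY version DESC LIMIT 1"
        args = (paper_id,)
    else:
        sql = "SELECT * FROM papers WHERE paper_id = ? AND version = ?"
        args = (paper_id, version)
    row = conn.execute(sql, args).fetchone()
    return dict(row) if row else None
```

Using (paper_id, version) as a composite key keeps every stored version side by side instead of overwriting older ones.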

Graph Visualization

Papers (circles, in your theme’s accent color — blue by default) and authors (gold diamonds) form a force-directed network. Edges connect each paper to its authors. The control panel has four real-time sliders:

| Slider | Effect |
| --- | --- |
| Center force | Pulls/pushes nodes toward the center |
| Repel force | Controls node-to-node repulsion |
| Link distance | Target edge length |
| Link strength | Stiffness of paper–author edges |
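To show what the four sliders control, here is a toy single-step force layout in pure Python. It is a simplified sketch, not the D3 force simulation the app uses: it moves positions directly instead of tracking velocities with a decaying alpha, and the function name `tick` and the default values are assumptions.

```python
import math


def tick(pos, links, center=0.05, repel=50.0, link_distance=30.0, link_strength=0.5):
    """Advance a toy force layout by one step and return the new positions.

    pos:   {node_id: (x, y)}
    links: [(paper_id, author_id), ...]
    """
    force = {n: [0.0, 0.0] for n in pos}

    # Center force: pull every node toward the origin.
    for n, (x, y) in pos.items():
        force[n][0] -= center * x
        force[n][1] -= center * y

    # Repel force: inverse-square push between every pair of nodes.
    nodes = list(pos)
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            dx, dy = pos[a][0] - pos[b][0], pos[a][1] - pos[b][1]
            dist = math.hypot(dx, dy) or 1e-6
            push = repel / (dist * dist)
            force[a][0] += push * dx / dist
            force[a][1] += push * dy / dist
            force[b][0] -= push * dx / dist
            force[b][1] -= push * dy / dist

    # Link force: a spring that moves each edge toward link_distance.
    for a, b in links:
        dx, dy = pos[b][0] - pos[a][0], pos[b][1] - pos[a][1]
        dist = math.hypot(dx, dy) or 1e-6
        pull = link_strength * (dist - link_distance)
        force[a][0] += pull * dx / dist
        force[a][1] += pull * dy / dist
        force[b][0] -= pull * dx / dist
        force[b][1] -= pull * dy / dist

    return {n: (pos[n][0] + force[n][0], pos[n][1] + force[n][1]) for n in pos}
```

Calling `tick` repeatedly with the current slider values is what makes the controls feel real-time: each change alters the forces applied on the next step.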

Notes

  • papers.db, pdfs/, source/, and vault contents are gitignored.
  • MathJax, D3, and the Inter UI font are all bundled locally — no external CDN calls, so the interface works fully offline.
  • PaperContent accepts abstract, full_text (TeX source), or pdf (bytes) — Gemini will use the richest available source.

Acknowledgements

linXiv owes a debt to Qiqqa, the open-source research management tool originally created by Jimme Jardine.

PDF text and metadata extraction uses pypdf, a pure-Python PDF library maintained by the py-pdf organization. pypdf is licensed under the BSD 3-Clause License.
