I built a semantic arXiv search engine with AI-generated TL;DRs, claim classification, and paper comparison
Summary
A semantic search engine for arXiv papers featuring AI-generated TL;DRs, claim classification, paper comparison, and more. Built with Next.js, Cloudflare, and open-source models.
Cached at: 06/08/26, 03:21 PM
Teycir/ArxivExplorer
Source: https://github.com/Teycir/ArxivExplorer
Support Development
If this project helps your work, support ongoing maintenance and new features.
ETH Donation Wallet
0x11282eE5726B3370c8B480e321b3B2aA13686582
Scan the QR code or copy the wallet address above.
Fast semantic arXiv paper search with AI-powered summaries — no login required.
“Research papers, decoded.”
Video Demo
Screenshots
Landing Page · Advanced Search Filters · Search papers similar to abstracts · Claim Assessment · Author Pages · Paper Comparison · Explore & Discover
Features
Core Search & Discovery
- Hybrid Search — Combines FTS5 keyword search and Vectorize semantic search for accurate results
- Advanced Filtering — Filter by author (substring match), citation count, category, and date range
- Smart Caching — KV-based caching with 2h TTL for search results, 24h for embeddings
- Related Papers — Pre-computed top-8 semantically similar papers via Vectorize
- Topic Collections — Curated topics with category mappings (stored in the topics table)
- Author Pages — Author statistics, timeline visualization, and all papers
- Full-Text Search — SQLite FTS5 virtual table with automatic triggers
AI-Powered Features
- Pre-Generated Summaries — TL;DR, key contributions, methods, limitations, beginner/technical explanations
- Entity Extraction — Keywords, entities (models/datasets/benchmarks), paper type classification
- Claim Classification — AI-powered support/contradiction analysis for scientific claims
- Smart Abstracts — Enhanced paper metadata with prerequisites and follow-up questions
Paper Management
- Bookmarks — Client-side collections with 90-day TTL (100 bookmark soft cap)
- Export Options — JSON and BibTeX export for collections
- Paper Comparison — Side-by-side comparison view (up to 6 papers)
- Revision History — Track paper updates and version differences
- Share & Copy — Quick copy for arXiv ID and BibTeX entries
Enrichment & Metadata
- Citation Tracking — Semantic Scholar integration with citation count + influential citations
- Citation Snapshots — Historical citation data stored in the citation_snapshots table
- CrossRef Integration — Journal metadata, publisher, license, funders
- OpenAlex Data — Concepts, affiliations, institutional data (ROR IDs)
- Papers With Code — Code repositories, benchmarks, SOTA rankings (schema ready)
User Engagement
- Achievements System — Gamified badges stored client-side with activity tracking
- Recent Searches — Search history with suggestions
- Personalized Feed — Recommendations based on bookmark history
- RSS Feed — /rss.xml with 20 recent papers (1h cache)
Developer Tools
- CLI Interface — arxiv-cli for AI assistants (search, trending, topics, authors)
- Admin API — Vectorize bulk operations, maintenance endpoints, enrichment triggers
SEO & Discoverability
- Dynamic Meta Tags — Open Graph and Twitter Card tags on all paper pages
- Sitemap.xml — Auto-generated sitemap with all papers, topics, and authors
- Robots.txt — Search engine crawler configuration
- Structured Data — JSON-LD schema markup for papers and authors
- SSR Content — Server-side rendered pages with full content for crawlers
- Canonical URLs — Proper canonical tags to prevent duplicate content
- AI Agent Discovery — /ai.txt and /llms.txt routes for LLM tool integration
Performance
- Edge Caching — Cloudflare KV with intelligent TTL strategies
- ISR Rendering — Next.js ISR with 10-minute revalidation
- Zero Login — Instant access to all features
- Global CDN — Cloudflare Workers edge deployment
Security
- Rate Limiting — Per-IP token bucket on all public endpoints (60-100 req/min) with lockout
- SQL Injection Protection — 100% parameterized queries via D1 .prepare().bind()
- Input Sanitization — Strict validation on all user inputs (control chars, length limits, allowlists)
- Timing-Safe Auth — Admin endpoints use crypto.timingSafeEqual (no timing oracles)
- Strict CORS — Explicit origin only (wildcard rejected at startup)
- AI Quota Protection — Hard character limits + rate limiting on /api/classify-claim
- Error Sanitization — Generic 500 messages (internal details logged server-side only)
See SECURITY.md for full details.
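The per-IP token bucket listed under Rate Limiting can be illustrated with a small in-memory sketch. This is not the project's code: the class name, the capacity, and the refill rate are assumptions chosen to resemble the 60 req/min lower bound, and the lockout behaviour is left out.

```typescript
// Hypothetical sketch of a per-IP token bucket; names and numbers are assumptions.
interface Bucket {
  tokens: number;       // tokens currently available
  lastRefillMs: number; // timestamp of the last refill
}

class TokenBucketLimiter {
  private buckets = new Map<string, Bucket>();
  private capacity: number;        // maximum burst size, e.g. 60
  private refillPerSecond: number; // e.g. 1 token/s is roughly 60 req/min

  constructor(capacity: number, refillPerSecond: number) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
  }

  // Returns true if a request from `ip` is allowed at time `nowMs`.
  allow(ip: string, nowMs: number): boolean {
    const bucket = this.buckets.get(ip) ?? { tokens: this.capacity, lastRefillMs: nowMs };
    // Refill in proportion to elapsed time, capped at capacity.
    const elapsedSec = Math.max(0, nowMs - bucket.lastRefillMs) / 1000;
    bucket.tokens = Math.min(this.capacity, bucket.tokens + elapsedSec * this.refillPerSecond);
    bucket.lastRefillMs = nowMs;

    const allowed = bucket.tokens >= 1;
    if (allowed) bucket.tokens -= 1;
    this.buckets.set(ip, bucket);
    return allowed;
  }
}
```

A Worker instance does not share memory with other instances, so a production limiter would need shared state (KV or similar) rather than a local Map.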
Architecture
Built on Cloudflare’s edge platform for global performance:
- Frontend: Next.js deployed as a Cloudflare Worker (via OpenNext + main + assets mode)
- API: Cloudflare Workers
- Database: Cloudflare D1 (SQLite)
- Vector Search: Cloudflare Vectorize
- Cache: Cloudflare KV
- AI: Workers AI (Llama 3.1 + BGE embeddings) for live inference; local Ollama for bulk ingestion
Deployment note: The frontend is deployed as a Worker (not Cloudflare Pages) to avoid the per-request nonce injection that Pages unconditionally adds to script-src, which breaks the app’s CSP.
System Design
Browser → Next.js Worker → API Worker → KV Cache → D1 Database
↓
Vectorize
↑
Ingest Worker (Cron)
↑
Workers AI / local Ollama
Data Pipeline
Papers flow through a multi-stage pipeline:
1. Fetch Stage
Ingest worker polls the arXiv API on cron schedule (0 * * * * hourly) and writes new papers to D1 with summary_ready = 0.
2. Summarize Stage
Either the ingest worker (Workers AI, rate-limited) or the local bulk script (Ollama, unlimited) generates:
- Structured summaries (tldr, contributions, methods, limitations, explanations)
- Paper embeddings for semantic search
- Sets summary_ready = 1 when complete
3. Enrichment Stage (optional)
- Citations: Semantic Scholar API updates citation counts via cron
- CrossRef: DOI-based metadata enrichment (daily cron 30 2 * * *)
- OpenAlex: Concepts, affiliations, open access metadata
- Papers With Code: Code repositories, benchmarks, SOTA rankings
4. Related Papers
Pre-computes top-8 semantically similar papers using Vectorize and stores in related_papers table.
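In the deployed system Vectorize performs the nearest-neighbour query. The brute-force sketch below only shows what "top-8 by cosine similarity" means; the function names and the in-memory Map of embeddings are assumptions for illustration.

```typescript
// Illustrative only: Vectorize does this server-side with an index.
interface Neighbor {
  id: string;
  score: number; // cosine similarity in [-1, 1]
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  if (normA === 0 || normB === 0) return 0; // zero vectors have no direction
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Returns the k papers most similar to `targetId`, excluding the paper itself.
function topKSimilar(targetId: string, embeddings: Map<string, number[]>, k = 8): Neighbor[] {
  const target = embeddings.get(targetId);
  if (!target) return [];
  const scored: Neighbor[] = [];
  embeddings.forEach((vec, id) => {
    if (id !== targetId) scored.push({ id, score: cosineSimilarity(target, vec) });
  });
  return scored.sort((x, y) => y.score - x.score).slice(0, k);
}
```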
Cron Schedule
The ingest worker runs on a single cron trigger:
- * * * * * — Every minute (processes 1 paper per run with 1 retry on failure; citation updates via Semantic Scholar run in the same cron)
CrossRef enrichment is triggered via the admin endpoint (POST /admin/crossref-batch) rather than a separate cron.
Bulk Local Processing
When remote Workers AI hits rate limits, use the local Ollama pipeline to catch up:
# Process all pending/failed papers from remote D1 using local Ollama
ADMIN_SECRET=<secret> npx tsx scripts/process-pending-local.ts
# Push a fully-processed local DB up to remote D1 + Vectorize
ADMIN_SECRET=<secret> npx tsx scripts/push-local-to-remote.ts
# Bulk ingest (fetch + summarize + embed in one pass)
npx tsx scripts/bulk-ingest.ts --days 7 --categories cs.LG,cs.CL
Both scripts use the D1 REST API directly (no wrangler subprocess per paper), which is ~100× faster than the naive approach and avoids shell-escaping issues with special characters in paper text.
Ollama models used locally:
| Role | Model |
|---|---|
| Summarisation | gemma4:e4b (8 B, Q4_K_M) |
| Embeddings | nomic-embed-text (137 M, F16) |
Quick Start
Prerequisites
- Node.js 18+
- Cloudflare account (free tier works)
- Wrangler CLI: npm install -g wrangler
Installation
git clone https://github.com/yourusername/arxiv-explorer.git
cd arxiv-explorer
npm install
wrangler login
# Create infrastructure
wrangler d1 create arxiv-explorer
wrangler kv namespace create CACHE # Wrangler 3.60+; older releases use: wrangler kv:namespace create CACHE
wrangler vectorize create arxiv-papers --dimensions=768 --metric=cosine
# Update wrangler config files with your IDs
# Edit: wrangler.api.toml, wrangler.ingest.toml, wrangler.jsonc
# Apply database schema (canonical version)
wrangler d1 execute arxiv-explorer --remote --file=migrations/schema.sql
# Copy and fill env files
cp .env.local.example .env.local
cp scripts/config.local.example.ts scripts/config.local.ts
# Edit scripts/config.local.ts with your Cloudflare credentials
Development
npm run dev # Next.js dev server
wrangler dev --config wrangler.api.toml # API worker
wrangler dev --config wrangler.ingest.toml # Ingest worker
Visit http://localhost:3000
Deployment
# Full deployment (Next.js + API worker)
./deploy.sh
# Or individually:
npm run deploy # Next.js frontend (Worker mode via OpenNext)
npm run deploy:api # API worker
npm run deploy:ingest # Ingest worker
# Note: deploy.sh does NOT deploy ingest worker
# Deploy ingest worker manually when needed
Project Structure
├── app/ # Next.js 16 app directory
│ ├── page.tsx # Home page
│ ├── search/ # Search results
│ ├── paper/[id]/ # Paper detail pages
│ ├── topic/[slug]/ # Topic pages
│ ├── author/[name]/ # Author pages
│ ├── compare/ # Paper comparison
│ ├── diff/[id]/ # Paper revision history
│ ├── bookmarks/ # Bookmark management
│ ├── explore/ # Explore page
│ ├── achievements/ # Achievement tracking
│ ├── claim/ # Claim classification
│ ├── faq/ # FAQ page
│ ├── how-to-use/ # User guide
│ ├── rss.xml/ # RSS feed route
│ │ └── route.ts
│ ├── ai.txt/ # LLM discovery route
│ │ └── route.ts
│ ├── llms.txt/ # LLM discovery route
│ │ └── route.ts
│ └── components/ # React components
│ ├── SummarySection.tsx
│ ├── PaperCard.tsx
│ ├── SearchFilters.tsx
│ ├── BookmarkButton.tsx
│ ├── CollectionManager.tsx
│ ├── SearchBoxHome.tsx
│ ├── Navbar.tsx
│ ├── Footer.tsx
│ └── ... (40+ components)
├── src/
│ ├── api-worker/ # Cloudflare Workers API
│ │ ├── index.ts # Router
│ │ └── routes/
│ │ ├── search.ts # Hybrid search (FTS5 + semantic)
│ │ ├── paper.ts # Paper details
│ │ ├── related.ts # Related papers
│ │ ├── trending.ts # Trending papers
│ │ ├── topic.ts # Topic endpoints
│ │ ├── topics.ts # List topics
│ │ ├── author.ts # Author endpoints
│ │ ├── authors.ts # List authors
│ │ ├── claim.ts # Claim classification
│ │ ├── admin.ts # Admin endpoints (Vectorize, maintenance)
│ │ ├── stats.ts # Database statistics
│ │ └── sitemap.ts # Sitemap generation
│ ├── ingest-worker/ # Background processing (cron)
│ │ ├── index.ts # Cron entrypoint
│ │ ├── pipeline.ts # Main ingestion pipeline
│ │ ├── fetch-arxiv.ts # arXiv API fetcher
│ │ ├── generate-summary.ts
│ │ ├── generate-embedding.ts
│ │ ├── generate-entities.ts
│ │ ├── update-citations.ts # Semantic Scholar sync
│ │ ├── fetch-crossref.ts # CrossRef enrichment
│ │ ├── fetch-openalex.ts # OpenAlex enrichment
│ │ ├── fetch-pwc.ts # Papers With Code enrichment
│ │ ├── compute-related.ts # Related papers computation
│ │ └── tfidf.ts # TF-IDF utilities
│ └── shared/ # Shared types & utils
│ ├── types.ts # TypeScript interfaces
│ ├── db.ts # Database helpers
│ └── utils.ts # Utilities
├── scripts/
│ ├── push-local-to-remote.ts # Sync local → remote D1 + Vectorize
│ ├── retry-failed-local.ts # Reprocess pending papers via Ollama
│ ├── bulk-ingest.ts # Full bulk ingest pipeline
│ ├── sync-remote-to-local.ts # Sync remote → local
│ ├── backfill-*.ts # Various backfill scripts
│ ├── upload-embeddings.ts # Standalone Vectorize uploader
│ ├── test-*.sh # Test scripts
│ ├── config.local.example.ts # Local config template
│ └── ... (25+ utility scripts)
├── migrations/
│ ├── schema.sql # Canonical D1 schema (single source of truth)
│ ├── 0001_schema.sql # Initial migration (legacy)
│ └── 000*.sql # Other migrations
├── helper/ # API client helpers
├── lib/ # Frontend libraries
├── wrangler.api.toml # API worker config
├── wrangler.ingest.toml # Ingest worker config
├── wrangler.jsonc # Next.js worker config (frontend)
├── next.config.ts # Next.js configuration
├── open-next.config.ts # OpenNext Cloudflare adapter config
└── deploy.sh # Deployment script
API Reference
GET /api/search?q=attention+mechanisms # Hybrid FTS5 + semantic search
GET /api/search?q=...&author=Hinton # Filter by author (substring match)
GET /api/search?q=...&minCitations=10 # Filter by minimum citations
GET /api/search?q=...&category=cs.LG # Filter by arXiv category
GET /api/search?q=...&date=week # Filter by date (day/week/month)
GET /api/search?q=...&author=X&minCitations=Y&... # Combine multiple filters
GET /api/paper/:id # Paper detail + summary
GET /api/paper/:id/related # Semantically similar papers
GET /api/trending # Trending papers (KV cached)
GET /api/topic/:slug # Topic paper collection
GET /api/topics # List all topics
GET /api/author/:name # Author papers and statistics
GET /api/authors # List authors
GET /api/stats # Database statistics
GET /api/sitemap # Sitemap for SEO
GET /rss.xml # RSS feed (20 recent papers, 1h cache)
GET /compare?ids=id1,id2,id3 # Compare up to 6 papers side-by-side
POST /api/classify-claim # AI-powered claim classification
# Admin endpoints (x-admin-secret required)
POST /admin/vectorize/upsert # Bulk embed upsert
POST /admin/retry-failed # Reset summary_ready=2 → 0
POST /admin/backfill-related # Backfill related papers
POST /admin/crossref-batch # CrossRef batch enrichment
POST /admin/related/clear # Clear related papers
POST /admin/related/bulk-insert # Bulk insert related papers
POST /admin/kv/delete # Delete KV cache entries
GET /admin/papers/all # Export all papers
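A client can assemble the search URLs above with a small helper. This is a sketch, not a shipped client: the function name and the SearchFilters type are assumptions, and only the query parameters listed in the reference are used.

```typescript
// Hypothetical helper for building /api/search URLs from the documented parameters.
interface SearchFilters {
  author?: string;
  minCitations?: number;
  category?: string;
  date?: "day" | "week" | "month";
}

function buildSearchUrl(apiBase: string, query: string, filters: SearchFilters = {}): string {
  const params: Array<[string, string]> = [["q", query]];
  if (filters.author) params.push(["author", filters.author]);
  if (filters.minCitations !== undefined) params.push(["minCitations", String(filters.minCitations)]);
  if (filters.category) params.push(["category", filters.category]);
  if (filters.date) params.push(["date", filters.date]);

  // Percent-encode every key and value so spaces and symbols are safe in the URL.
  const qs = params
    .map(([k, v]) => `${encodeURIComponent(k)}=${encodeURIComponent(v)}`)
    .join("&");
  // Strip trailing slashes from the base to avoid "//api/search".
  return `${apiBase.replace(/\/+$/, "")}/api/search?${qs}`;
}
```

The result can be passed to fetch(). The response shape is not documented here, so the sketch stops at the URL.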
Configuration
Environment Variables
# .env.local (Next.js frontend)
NEXT_PUBLIC_API_BASE=https://arxiv-api.yourdomain.workers.dev
API_BASE=https://arxiv-api.yourdomain.workers.dev
// scripts/config.local.ts (for local scripts)
export const CF_TOKEN = 'your-cloudflare-api-token';
export const CF_ACCOUNT_ID = 'your-account-id';
export const CF_D1_ID = 'your-d1-database-id';
Ingestion Settings (wrangler.ingest.toml)
[vars]
ARXIV_FETCH_CATEGORIES = "cs.AI,cs.LG" # Default fetch categories (add more as needed)
ARXIV_FETCH_LIMIT_PER_CATEGORY = "0" # Papers per category per cron (0 = process pending only)
INGEST_MAX_CONCURRENT = "1" # Concurrent AI processing
ARXIV_RATE_LIMIT_DELAY_MS = "3000" # Delay between arXiv requests
SUMMARY_MODEL = "@cf/meta/llama-3.1-8b-instruct" # Workers AI summary model
EMBEDDING_MODEL = "@cf/baai/bge-base-en-v1.5" # Workers AI embedding model
INGEST_PHASE = "hourly" # Phase label (informational only)
POLITE_EMAIL = "[email protected]" # Contact email for arXiv API
# Optional Ollama (local AI)
# OLLAMA_BASE = "https://your-tunnel.trycloudflare.com"
# OLLAMA_SUMMARY_MODEL = "gemma4:e4b"
# OLLAMA_EMBEDDING_MODEL = "nomic-embed-text"
Minutely cron schedule:
- Processes exactly 1 pending paper per run (summary_ready = 0 or failed within 7 days)
- Retries once on failure (2 total attempts)
- Daily quota: 113 papers/day max (5,000 neurons, 50% of daily budget reserved for tooltips)
- Quota tracking via KV with automatic reset at 00:00 UTC
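One way to get an automatic reset at 00:00 UTC is to key the counter by UTC calendar day, so a new day starts from a fresh key. The sketch below assumes that approach. The key format and function names are hypothetical, and a Map stands in for KV, whose real API is asynchronous.

```typescript
const DAILY_PAPER_LIMIT = 113; // the daily maximum stated above

// Hypothetical key format: one counter per UTC calendar day.
function quotaKey(now: Date): string {
  // toISOString() is always UTC, so the first 10 characters are the UTC date.
  return `quota:${now.toISOString().slice(0, 10)}`;
}

// In-memory stand-in for the KV counter. Returns true if a paper may be processed.
function tryConsume(counters: Map<string, number>, now: Date): boolean {
  const key = quotaKey(now);
  const used = counters.get(key) ?? 0;
  if (used >= DAILY_PAPER_LIMIT) return false;
  counters.set(key, used + 1);
  return true;
}
```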
Admin Secret
Required for Vectorize upserts, maintenance endpoints, and enrichment endpoints:
# Set for API worker
wrangler secret put ADMIN_SECRET --config wrangler.api.toml
# Use in local scripts
ADMIN_SECRET=your-secret npx tsx scripts/push-local-to-remote.ts
Database Schema
papers
- arXiv metadata (id, title, authors, abstract, categories, dates, URLs)
- authors_normalized — lowercased for fast prefix search
- citation_count — from Semantic Scholar (updated hourly via cron)
- citations_updated_at — last citation sync timestamp
- summary_ready: 0 = pending · 1 = done · 2 = failed
- Additional fields: comment, journal_ref, doi, primary_category
summaries
- tldr — one-sentence result
- key_contributions — JSON array
- methods — JSON array
- limitations — JSON array
- beginner_explain — plain-language paragraph
- technical_summary — researcher-level paragraph
- model_version — which model generated it
Supporting tables
- paper_categories — normalized category rows (indexed for topic queries)
- papers_fts — FTS5 virtual table with insert/update/delete triggers
- embeddings_meta — tracks embedding generation per paper
- related_papers — pre-computed top-8 semantic neighbors
- topics — curated topic collections with category mappings
- citation_snapshots — historical citation data for velocity tracking
- entity_definitions — terminology definitions for entities
Canonical schema file
The single source of truth is migrations/schema.sql. Additional columns added via incremental migrations (e.g. 0012_summaries_extended.sql adds problem_statement to summaries) must be applied on top with wrangler d1 execute.
Rebuild from scratch
# Apply canonical schema (wipes and recreates all tables)
wrangler d1 execute arxiv-explorer --remote --file=migrations/schema.sql
# Push local data (papers, summaries, categories, FTS, embeddings)
ADMIN_SECRET=<secret> npx tsx scripts/push-local-to-remote.ts
Performance
- Search: <240 ms average (KV cache hit) · <400 ms (D1 fallback)
- Paper detail: <190 ms average (KV cache hit) · <500 ms (D1 fallback)
- Cache hit rate: ~85% (188ms average cache hit time)
- Throughput: 50 req/s under mixed load
- Edge deployment: Global CDN via Cloudflare Workers
- Stress tested: 100 concurrent requests, 0% error rate
Key Features
Citation Tracking
- Source: Semantic Scholar API integration
- Updates: Automatic cron job (part of ingest worker)
- Storage: citation_count and citations_updated_at fields in papers table
- History: Citation snapshots stored in citation_snapshots table
- Rate Limiting: Respects Semantic Scholar rate limits
Paper Collections
- Location: /bookmarks page
- Storage: Client-side localStorage
- Features:
- Create named collections
- Assign bookmarks to collections
- Export as JSON or BibTeX
- Export all bookmarks or by collection
- Capacity: 100 bookmarks (soft cap), 90-day TTL
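For the BibTeX export, an arXiv preprint is commonly written as an @misc entry with eprint, archivePrefix, and primaryClass fields. The sketch below follows that convention. The PaperRef type, the citation-key format, and the function name are assumptions, not the app's actual exporter.

```typescript
// Hypothetical shape of a bookmarked paper; field names are assumptions.
interface PaperRef {
  id: string;              // arXiv ID, e.g. "2302.13971"
  title: string;
  authors: string[];
  year: number;
  primaryCategory: string; // e.g. "cs.LG"
}

// Formats one paper as a BibTeX @misc entry.
// Note: braces inside the title are not escaped in this minimal version.
function toBibtex(p: PaperRef): string {
  const key = `arxiv_${p.id.replace(/[^0-9A-Za-z]/g, "_")}`;
  return [
    `@misc{${key},`,
    `  title         = {${p.title}},`,
    `  author        = {${p.authors.join(" and ")}},`,
    `  year          = {${p.year}},`,
    `  eprint        = {${p.id}},`,
    `  archivePrefix = {arXiv},`,
    `  primaryClass  = {${p.primaryCategory}},`,
    `  url           = {https://arxiv.org/abs/${p.id}}`,
    `}`,
  ].join("\n");
}
```

Exporting a collection would then be a matter of joining the entries with blank lines.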
Advanced Search Filters
- Author Filter: ?author=Hinton — substring match across all authors
- Citation Filter: ?minCitations=10 — minimum citation threshold
- Category Filter: ?category=cs.LG — arXiv category code (cs.LG, cs.CL, cs.CV, etc.)
- Date Filter: ?date=week — time window (day/week/month)
- Combined Filters: All filters work together and with hybrid search
- Caching: Separate KV cache keys per filter combination (2h TTL)
- Example: /api/search?q=transformer&author=Vaswani&minCitations=100&category=cs.LG&date=month
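"Separate KV cache keys per filter combination" implies a deterministic key built from the query and every filter. A minimal sketch of such a key follows. The key layout, the normalisation rules, and the names are assumptions rather than the project's actual format.

```typescript
// Hypothetical cache-key builder; field order is fixed so equal inputs give equal keys.
interface SearchParams {
  q: string;
  author?: string;
  minCitations?: number;
  category?: string;
  date?: string;
}

const SEARCH_TTL_SECONDS = 2 * 60 * 60; // 2h TTL for search results

function searchCacheKey(p: SearchParams): string {
  // Normalise free text so trivially different queries share one cache entry.
  const q = p.q.trim().toLowerCase().replace(/\s+/g, " ");
  const author = (p.author ?? "").trim().toLowerCase();
  const parts = [
    `q=${encodeURIComponent(q)}`,
    `author=${encodeURIComponent(author)}`,
    `minCitations=${p.minCitations ?? 0}`,
    `category=${encodeURIComponent(p.category ?? "")}`,
    `date=${encodeURIComponent(p.date ?? "")}`,
  ];
  // Values are percent-encoded, so "|" cannot appear inside a field.
  return `search:${parts.join("|")}`;
}
```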
RSS Feed
- Endpoint: /rss.xml
- Content: 20 most recent papers with AI-generated summaries
- Format: RSS 2.0 with full TL;DR, key contributions, and methods
- Cache: 1-hour TTL via Cloudflare KV
- Use Case: Subscribe in your RSS reader to stay updated on new papers
- Example: https://arxiv-explorer.yourdomain.com/rss.xml
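An RSS 2.0 item for one paper could be rendered roughly as follows. This is a sketch under assumptions: the FeedPaper type and function names are hypothetical, only the TL;DR is included in the description, and the item links to arxiv.org, whereas the real feed may link to the app's own paper pages.

```typescript
// Escapes the five XML special characters so titles and summaries are safe in markup.
function escapeXml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Hypothetical feed entry shape.
interface FeedPaper {
  id: string;
  title: string;
  tldr: string;
  published: Date;
}

function rssItem(p: FeedPaper): string {
  const link = `https://arxiv.org/abs/${p.id}`; // assumed link target
  return [
    "<item>",
    `  <title>${escapeXml(p.title)}</title>`,
    `  <link>${escapeXml(link)}</link>`,
    `  <guid isPermaLink="true">${escapeXml(link)}</guid>`,
    `  <pubDate>${p.published.toUTCString()}</pubDate>`,
    `  <description>${escapeXml(p.tldr)}</description>`,
    "</item>",
  ].join("\n");
}
```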
Paper Comparison
- Route: /compare?ids=id1,id2,id3
- Capacity: Up to 6 papers side-by-side
- Sections: TL;DR, Key Contributions, Methods, Limitations, Technical Summary
- Layout: Responsive grid adapts to paper count
- Example: /compare?ids=2605.30353,2302.13971,2303.08774
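Parsing the ids parameter with the 6-paper cap might look like the sketch below. The function name and the ID pattern are assumptions: only new-style arXiv IDs (YYMM.NNNNN with an optional version suffix) are accepted here, and the page's real validation may differ.

```typescript
const MAX_COMPARE = 6; // comparison view holds up to 6 papers

// New-style arXiv IDs, e.g. 2302.13971 or 2302.13971v2. Old-style IDs are not handled.
const ARXIV_ID = /^\d{4}\.\d{4,5}(v\d+)?$/;

// Splits "id1,id2,..." into a list of unique, valid IDs, keeping at most MAX_COMPARE.
function parseCompareIds(idsParam: string): string[] {
  const seen = new Set<string>();
  const ids: string[] = [];
  for (const raw of idsParam.split(",")) {
    const id = raw.trim();
    if (!ARXIV_ID.test(id) || seen.has(id)) continue; // skip malformed and duplicate IDs
    seen.add(id);
    ids.push(id);
    if (ids.length === MAX_COMPARE) break;
  }
  return ids;
}
```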
Testing
Integration Tests
cd scripts
./test-integration.sh # Core functionality tests
./test-new-features.sh # New features tests
./test-full.sh # Comprehensive test suite
Stress Testing
cd scripts
./test-stress.sh # Production load testing
API Deep Testing
cd scripts
./test-api-deep.sh # Deep API endpoint testing
CLI Tool for AI Assistants
A command-line interface designed for AI assistants (Claude Code, ChatGPT, etc.) to programmatically search and explore papers.
Installation
# Quick install
./install-cli.sh
# Manual
cd cli
npm run build
npm link
Usage
# Search papers
arxiv-cli search "transformer attention" 5
# Get paper details with AI summary
arxiv-cli paper 2605.30353
# Show trending papers
arxiv-cli trending 10
# Browse topics
arxiv-cli topics
arxiv-cli topic large-language-models 20
# Author papers
arxiv-cli author "Yann LeCun" 10
Output Format
Clean, structured text optimized for AI parsing:
ID: 2605.30353
Title: Physics Is All You Need...
Authors: John Doe, Jane Smith...
Published: 2026-06-03
Categories: cs.LG, cs.AI
TL;DR: This paper introduces...
URL: https://arxiv.org/abs/2605.30353
See cli/README.md for complete documentation.
Troubleshooting
Check paper counts
npx wrangler d1 execute arxiv-explorer --remote --config wrangler.api.toml \
--command="SELECT summary_ready, COUNT(*) as cnt FROM papers GROUP BY summary_ready"
Retry pending/failed papers locally
# Retry up to 50 papers
ADMIN_SECRET=<secret> LIMIT=50 npx tsx scripts/process-pending-local.ts
# Process with higher concurrency (careful with GPU memory)
ADMIN_SECRET=<secret> LIMIT=100 CONCURRENCY=2 npx tsx scripts/process-pending-local.ts
Push local DB to remote
ADMIN_SECRET=<secret> npx tsx scripts/push-local-to-remote.ts
Watch live logs
wrangler tail arxiv-api --format=pretty # API worker
wrangler tail arxiv-ingest --format=pretty # Ingest worker
Sync remote DB to local
npx tsx scripts/pull-remote-to-local.ts
Reset database
./scripts/reset-and-ingest.sh
Design Notes
Why Worker instead of Pages
Deploying the Next.js frontend as a Cloudflare Worker (via OpenNext main + assets) rather than Cloudflare Pages avoids the per-request nonce that Pages unconditionally injects into script-src. That injection happens at the CDN layer before the response reaches the browser, so no amount of middleware or _headers file can override it. The Worker deployment has no such injection and serves the app’s own CSP intact.
The deployment uses:
- @opennextjs/cloudflare adapter
- OpenNext build: npx opennextjs-cloudflare build
- Output: .open-next/worker.js + .open-next/assets/
- Wrangler config: wrangler.jsonc with main and assets bindings
Search Algorithm
- Normalise query
- Check KV cache (2 h TTL)
- Parallel:
- D1 FTS5 keyword search (title boosted 10:1:5)
- Vectorize semantic search (query embedding cached 24 h)
- Merge (25 % keyword · 75 % semantic), deduplicate
- Return top 10, write to KV
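The merge step can be sketched as a weighted sum over normalised scores. The 25/75 weights and the top-10 cut come from the list above. Everything else is an assumption: the names, the max-based normalisation, and the convention that a higher score is better. Raw FTS5 bm25() values are lower-is-better, so they would need to be negated first.

```typescript
// Hypothetical merge of keyword and semantic hits; assumes higher score = better.
interface Hit {
  id: string;
  score: number;
}

// Scales scores to [0, 1] by dividing by the list maximum so both lists are comparable.
function normalize(hits: Hit[]): Map<string, number> {
  const max = hits.reduce((m, h) => Math.max(m, h.score), 0);
  const out = new Map<string, number>();
  hits.forEach((h) => out.set(h.id, max > 0 ? h.score / max : 0));
  return out;
}

function mergeHybrid(keyword: Hit[], semantic: Hit[], limit = 10): Hit[] {
  const kw = normalize(keyword);
  const sem = normalize(semantic);

  // The union of IDs deduplicates papers that appear in both result lists.
  const ids = new Set<string>();
  kw.forEach((_score, id) => ids.add(id));
  sem.forEach((_score, id) => ids.add(id));

  const merged: Hit[] = [];
  ids.forEach((id) => {
    merged.push({ id, score: 0.25 * (kw.get(id) ?? 0) + 0.75 * (sem.get(id) ?? 0) });
  });
  return merged.sort((a, b) => b.score - a.score).slice(0, limit);
}
```

A paper found by only one of the two searches still gets a score, with the missing side counted as zero.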
Caching Strategy
- Lazy KV writes: paper detail written to KV on first access, not at ingestion
- Query embedding cache: popular search vectors cached 24 h in KV
- Trending KV cache: 60-minute TTL, auto-invalidated on new papers
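The lazy-write behaviour is a read-through cache: look up the key, and only on a miss load from the source and store the result with a TTL. The sketch below uses a synchronous in-memory stand-in to stay self-contained. Workers KV itself is asynchronous, and the class and function names here are hypothetical.

```typescript
// In-memory stand-in for a TTL key-value store; real KV calls would be awaited.
interface Entry<T> {
  value: T;
  expiresAtMs: number;
}

class TtlCache<T> {
  private store = new Map<string, Entry<T>>();

  get(key: string, nowMs: number): T | undefined {
    const entry = this.store.get(key);
    if (!entry) return undefined;
    if (entry.expiresAtMs <= nowMs) {
      this.store.delete(key); // expired entries behave like misses
      return undefined;
    }
    return entry.value;
  }

  put(key: string, value: T, ttlSeconds: number, nowMs: number): void {
    this.store.set(key, { value, expiresAtMs: nowMs + ttlSeconds * 1000 });
  }
}

// Returns the cached value, or calls `load` once and caches its result.
function getOrLoad<T>(cache: TtlCache<T>, key: string, ttlSeconds: number, nowMs: number, load: () => T): T {
  const hit = cache.get(key, nowMs);
  if (hit !== undefined) return hit;
  const value = load();
  cache.put(key, value, ttlSeconds, nowMs);
  return value;
}
```

With the TTLs above, trending would use 3600 seconds and query embeddings 86400.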
AI Processing
- Single consolidated prompt per paper → structured JSON output
- Workers AI uses @cf/meta/llama-3.1-8b-instruct for summaries, @cf/baai/bge-base-en-v1.5 for embeddings
- Local Ollama fallback: gemma4:e4b (summaries) + nomic-embed-text (embeddings)
- Failed papers marked summary_ready = 2 and retried on next run
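Structured JSON from a model needs validation before it is stored. The sketch below checks a raw model response against the summary fields listed in the schema section. The function names and the brace-extraction heuristic are assumptions. A null result would correspond to marking the paper summary_ready = 2.

```typescript
// Field names follow the summaries table; the validation logic is illustrative.
interface PaperSummary {
  tldr: string;
  key_contributions: string[];
  methods: string[];
  limitations: string[];
  beginner_explain: string;
  technical_summary: string;
}

function isStringArray(v: unknown): v is string[] {
  return Array.isArray(v) && v.every((x) => typeof x === "string");
}

// Returns a validated summary, or null if the output is not usable.
function parseSummary(raw: string): PaperSummary | null {
  // Models sometimes wrap JSON in prose or code fences; take the outermost {...} span.
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end <= start) return null;

  let data: unknown;
  try {
    data = JSON.parse(raw.slice(start, end + 1));
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;

  const { tldr, key_contributions, methods, limitations, beginner_explain, technical_summary } =
    data as Record<string, unknown>;
  if (typeof tldr !== "string") return null;
  if (typeof beginner_explain !== "string") return null;
  if (typeof technical_summary !== "string") return null;
  if (!isStringArray(key_contributions) || !isStringArray(methods) || !isStringArray(limitations)) return null;

  return { tldr, key_contributions, methods, limitations, beginner_explain, technical_summary };
}
```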
Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
License
This project is licensed under the Business Source License 1.1 (BSL 1.1).
- ✅ Free for personal, academic, and non-commercial use
- ❌ Commercial use requires a separate license
- 📅 Converts to MIT License on 2029-06-01
See LICENSE.md for full terms, or contact the author for commercial licensing.
Acknowledgments
- arXiv for open access to research papers
- Cloudflare for the edge platform
- Next.js / OpenNext for the framework + Worker adapter
- Ollama for local model inference
- SeekYou for BackgroundBeams, DecryptedText, and AnimatedTagline components
🌐 Related Projects
Explore more privacy-first and security tools:
Privacy & Encryption
- Timeseal - Time-locked encryption vault with Dead Man’s Switch. AES-256 split-key crypto, ephemeral seals.
- Sanctum - Zero-trust encrypted vault with cryptographic plausible deniability. XChaCha20-Poly1305, Argon2id.
- GhostChat - True P2P encrypted chat via WebRTC. No servers, no storage, self-destructing messages.
- xmrproof - Monero payment verification, 100% client-side.
- GhostReceipt - Anonymous receipt generation with zero-knowledge proofs.
Security Tools
- BurpAPISecuritySuite - Burp Suite extension for API security testing. 15 attack types, 108+ payloads, BOLA/IDOR detection.
- Mcpwn - Automated security scanner for Model Context Protocol servers. Detects RCE, path traversal, prompt injection.
- DiffCatcher - Git repo discovery, diff capture, code element extraction.
- HoneypotScan - Honeypot detection service for security research.
- CheckAPI - LLM API key validator for multiple providers. Privacy-first, client-side validation.
- SeekYou - Host intelligence aggregator — unified OSINT across 15 sources for IPs, domains, and ASNs.
MCP Security Servers
- burp-mcp-server - MCP server for Burp Suite Professional. Vulnerability scanning via AI assistants.
- nuclei-mcp - MCP server for Nuclei. Multi-target scanning, severity filtering.
- nmap-mcp - MCP server for Nmap. Stealth recon, vuln/NSE scanning.
- frida-mcp - MCP server for Frida. Dynamic instrumentation, SSL pinning bypass.
💼 Services Offered
- 🔒 Privacy-First Development - P2P applications, encrypted communication, zero-knowledge systems
- 🚀 Web Application Development - Full-stack development with Next.js, React, TypeScript
- 🔧 Edge Computing Solutions - Cloudflare Workers, Pages, D1, KV, Durable Objects
- 🛡️ Security Tool Development - Burp extensions, penetration testing tools, automation frameworks
- 🤖 AI Integration - LLM-powered applications, intelligent automation, custom AI solutions
- 🔍 OSINT & Threat Intelligence - Custom reconnaissance tools, threat feed aggregation, IOC correlation
Get in Touch: teycirbensoltane.tn | Available for freelance projects and consulting
Built with 💚 by Teycir Ben Soltane