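@NFTCPS: Damn, a people-lookup tool just dropped! Type in a username and it sweeps 840+ platforms in one go. It's called ALIENS EYE. It isn't dumb, either: instead of guessing from HTTP status codes, it combines a trained ML model with 25 features, and results come in three tiers: Found, Maybe, Not Found,…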

X AI KOLs Timeline Tools

Summary

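ALIENS EYE is an AI-driven, open-source username scanner that uses a machine-learning model and 25 features to check 840+ platforms, with support for proxies, Tor, and multiple export formats.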

Cached: 2026/06/28 08:03

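Damn, a people-lookup tool just dropped!

Type in a username and it sweeps 840+ platforms in one go. It's called ALIENS EYE.

It isn't dumb, either: instead of guessing from HTTP status codes, it combines a trained ML model with 25 features, and results come in three tiers (Found, Maybe, Not Found), each with a confidence score.

A few highlights:

  • Scans 840+ platforms asynchronously in seconds
  • Can route through Tor and proxies to hide yourself
  • Results export to JSON, CSV, HTML, and Markdown

Bottom line: it works well for looking people up. Don't use it for anything shady.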

https://github.com/arxhr007/Aliens_eye…


arxhr007/Aliens_eye

Source: https://github.com/arxhr007/Aliens_eye

ALIENS EYE

Aliens Eye Logo

AI-OSINT Username Scanner

Advanced AI-Powered Social Media Username Finder

Scan 840+ platforms with ML-blended detection

Highlights

  • 840+ platforms scanned asynchronously in seconds
  • ML + heuristic detection: a trained model blended with 25 structural signals (HTTP status, DOM shape, keywords, fingerprints) instead of naive status-code checks
  • Modern terminal UI: live progress, sorted result tables, summary panels (powered by rich)
  • Proxy & Tor support: --proxy socks5://... or just --tor
  • Site filtering: --site github,reddit, --exclude-site, --no-nsfw
  • Self-check: aliens_eye selfcheck validates detection accuracy against accounts known to exist
  • Retrainable: collect your own labeled dataset and retrain the model with aliens_eye train
  • Reports in JSON, CSV, HTML, and Markdown
  • Playwright fallback for JavaScript-heavy pages (optional extra)

Install

pip install aliens-eye

Optional extras:

pip install "aliens-eye[browser]"   # Playwright fallback for hard pages
python -m playwright install chromium

pip install "aliens-eye[train]"     # scikit-learn, for retraining the ML model

Or with Docker:

docker build -t aliens-eye .
docker run --rm -it aliens-eye username

From source:

git clone https://github.com/arxhr007/Aliens_eye.git
cd Aliens_eye
pip install -e .

Usage

# Interactive prompts
aliens_eye

# Single username
aliens_eye username

# Multiple usernames
aliens_eye username1 username2

# Advanced scan level (prefix/suffix variations)
aliens_eye username -l advanced

# Only scan specific sites
aliens_eye username --site github,reddit,gitlab

# Skip NSFW sites
aliens_eye username --no-nsfw

# Route through Tor (needs a local Tor daemon)
aliens_eye username --tor

# Any HTTP or SOCKS proxy
aliens_eye username --proxy socks5://127.0.0.1:1080

# Export everything
aliens_eye username --format all --output results

# Heuristics only, no ML
aliens_eye username --no-ml

# Non-interactive preset: quick / full / aggressive
aliens_eye username --profile quick

# Plain output for scripts and CI (no colors/progress)
aliens_eye username --plain

# View results from a previous scan
aliens_eye -r results/username_advanced_20260611_120000.json

# Validate detection accuracy against known accounts
aliens_eye selfcheck

How detection works

Every response is converted into a 25-dimensional feature vector: HTTP status buckets, username placement (path/title/meta), error and profile keywords, DOM structure (images, forms, profile/error CSS classes), response timing, redirect counts, and per-site fingerprint matches learned from previous scans.
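To make the idea of a feature vector concrete, here is a minimal sketch that derives a handful of the signals described above from a single response. It is not the project's extractor: the function names, the feature names, and the ERROR_KEYWORDS list are all assumptions chosen for illustration, and the real tool uses 25 features rather than the seven shown.

```python
import re
from urllib.parse import urlparse

# Hypothetical keyword list; the real tool's keywords are not given in the source.
ERROR_KEYWORDS = ("not found", "doesn't exist", "no such user")


def status_bucket(status: int) -> str:
    """Collapse an HTTP status code into a coarse bucket such as '2xx'."""
    return f"{status // 100}xx"


def extract_features(username, url, status, html, redirects):
    """Return a small, illustrative subset of signals as a name -> float dict."""
    title_match = re.search(r"<title[^>]*>(.*?)</title>", html, re.I | re.S)
    title = title_match.group(1) if title_match else ""
    lowered = html.lower()
    user = username.lower()
    bucket = status_bucket(status)
    return {
        "status_2xx": float(bucket == "2xx"),
        "status_4xx": float(bucket == "4xx"),
        "user_in_path": float(user in urlparse(url).path.lower()),
        "user_in_title": float(user in title.lower()),
        "error_keyword": float(any(k in lowered for k in ERROR_KEYWORDS)),
        "img_count": float(lowered.count("<img")),
        "redirects": float(redirects),
    }
```

Keeping every signal as a float means the same dictionary can feed either a weighted heuristic score or a linear model without further conversion.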

Two judges then vote:

  1. Heuristic engine — weighted scoring over the features
  2. ML model — logistic regression trained on labeled scans of real (and deliberately fake) accounts, shipped with the package and running in pure Python (no sklearn needed at runtime)

The blended probability maps to Found / Maybe / Not Found with a confidence percentage. If a model file is missing or invalid, the scanner silently falls back to heuristics.
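The blending and fallback behaviour described above can be sketched in pure Python. This is a rough illustration under stated assumptions, not the shipped implementation: the model file layout ("weights" and "bias" keys), the 50/50 blend weight, and the 0.7 / 0.4 thresholds are all hypothetical, since the source does not give the actual values.

```python
import json
import math
from pathlib import Path


def load_model(path):
    """Return {'weights': [...], 'bias': float}, or None if missing or invalid."""
    try:
        raw = json.loads(Path(path).read_text(encoding="utf-8"))
        return {
            "weights": [float(w) for w in raw["weights"]],
            "bias": float(raw["bias"]),
        }
    except (OSError, ValueError, KeyError, TypeError):
        return None  # caller falls back to heuristics only


def ml_probability(model, features):
    """Logistic regression inference: sigmoid(bias + w . x)."""
    z = model["bias"] + sum(w * x for w, x in zip(model["weights"], features))
    z = max(-60.0, min(60.0, z))  # clamp to keep math.exp from overflowing
    return 1.0 / (1.0 + math.exp(-z))


def blend(heuristic_score, ml_prob, ml_weight=0.5):
    """Mix the two judges; with no ML probability, use the heuristic alone."""
    if ml_prob is None:
        return heuristic_score
    return ml_weight * ml_prob + (1.0 - ml_weight) * heuristic_score


def classify(prob, found_at=0.7, maybe_at=0.4):
    """Map a blended probability to (label, confidence percentage)."""
    if prob >= found_at:
        label = "Found"
    elif prob >= maybe_at:
        label = "Maybe"
    else:
        label = "Not Found"
    return label, round(prob * 100)
```

Because inference is only a dot product and a sigmoid, nothing beyond the standard library is needed at runtime, which is consistent with the README's note that sklearn is required only for retraining.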

Retraining the model

pip install "aliens-eye[train]"

# 1. Scan ground-truth accounts + random non-existent usernames to build a dataset
aliens_eye train collect --out dataset.csv --negatives 4

# 2. Fit and export the model
aliens_eye train fit --data dataset.csv --out model.json

# 3. Use it
aliens_eye username --model model.json

Configuration

Aliens Eye merges a JSON config file with CLI flags (CLI wins). Search order without --config: ./config.json, then the platform config dir (e.g. ~/.config/aliens_eye/config.json on Linux, %LOCALAPPDATA%\aliens_eye on Windows).

{
  "concurrent": 50,
  "timeout": 10.0,
  "retries": 2,
  "rate_limit_delay": 0.2,
  "output_dir": "results",
  "output_formats": ["json", "csv", "html", "md"],
  "use_playwright": false,
  "proxy": null,
  "use_ml": true,
  "exclude_nsfw": false,
  "level": "basic"
}
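The merge order described above (config file first, CLI flags on top) might look roughly like the following sketch. Several details are assumptions: the example values above are treated as defaults, the file inside the Windows directory is assumed to be named config.json, XDG_CONFIG_HOME is honoured on Linux, and a CLI value of None is taken to mean "flag not given". The function names are hypothetical.

```python
import json
import os
import sys
from pathlib import Path

# Values copied from the example config; treating them as defaults is an assumption.
DEFAULTS = {
    "concurrent": 50, "timeout": 10.0, "retries": 2, "rate_limit_delay": 0.2,
    "output_dir": "results", "output_formats": ["json", "csv", "html", "md"],
    "use_playwright": False, "proxy": None, "use_ml": True,
    "exclude_nsfw": False, "level": "basic",
}


def candidate_paths(app="aliens_eye"):
    """Search order without --config: ./config.json, then the platform dir."""
    paths = [Path("config.json")]
    if sys.platform.startswith("win"):
        base = os.environ.get("LOCALAPPDATA")
        if base:
            paths.append(Path(base) / app / "config.json")
    else:
        base = os.environ.get("XDG_CONFIG_HOME") or str(Path.home() / ".config")
        paths.append(Path(base) / app / "config.json")
    return paths


def load_config(cli_overrides, explicit_path=None):
    """Merge defaults < config file < CLI flags (CLI wins)."""
    paths = [Path(explicit_path)] if explicit_path else candidate_paths()
    file_values = {}
    for path in paths:
        if path.is_file():
            file_values = json.loads(path.read_text(encoding="utf-8"))
            break  # first file found wins
    merged = {**DEFAULTS, **file_values}
    # Only flags the user actually passed (non-None) override the file.
    merged.update({k: v for k, v in cli_overrides.items() if v is not None})
    return merged
```

For example, with a config file that sets "timeout": 5.0, calling load_config({"timeout": 3.0}) would yield a timeout of 3.0, because the CLI value is applied last.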

Outputs

Results are saved with timestamped filenames:

  • username_level_YYYYMMDD_HHMMSS.json — full detail including per-site feature analysis
  • .csv — flat rows for spreadsheets
  • .html — styled standalone report
  • .md — Markdown summary of Found/Maybe hits
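The naming pattern above can be reproduced with a few lines of Python, which may help when locating reports from scripts. This is a sketch only: report_path is a hypothetical helper, and it assumes all four formats share the same username_level_timestamp stem.

```python
from datetime import datetime
from pathlib import Path


def report_path(output_dir, username, level, ext, when=None):
    """Build <output_dir>/<username>_<level>_<YYYYMMDD_HHMMSS>.<ext>."""
    when = when or datetime.now()
    stamp = when.strftime("%Y%m%d_%H%M%S")
    return Path(output_dir) / f"{username}_{level}_{stamp}.{ext}"


# Same shape as the file passed to `aliens_eye -r` in the usage section.
example = report_path(
    "results", "username", "advanced", "json", datetime(2026, 6, 11, 12, 0, 0)
)
```

Passing an explicit datetime keeps the name deterministic, so every export of one scan can reuse a single timestamp.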

Architecture

The package lives under src/aliens_eye/: core/ (scanner, detector, analyzer, http, exporter, fingerprints), ml/ (inference, training, dataset collection), utils/ (rich console layer), and data/ (sites.json, trained model, ground-truth sets). For internals and flowcharts, see WORKING.md.

Contributing

Issues and PRs welcome — adding sites to src/aliens_eye/data/sites.json, expanding the ground-truth set in selfcheck.json, or improving the model all directly improve detection. Run pytest and ruff check src tests before submitting.

Disclaimer

This tool is for educational purposes and legitimate OSINT research only. You are responsible for complying with laws and site terms of service.
