How to build scalable web apps with OpenAI's Privacy Filter
Source: https://huggingface.co/blog/openai-privacy-filter-web-apps
- The model
- 1. Document Privacy Explorer
- 2. Image Anonymizer
- 3. SmartRedact Paste
- What `gradio.Server` provides
- Try them
- Recommended reading
OpenAI released Privacy Filter on the Hub this week: an open-source personally-identifiable information (PII) detector that labels text across eight categories in a single forward pass over a 128k context (model card). We spent a few hours building with it and landed on three apps, each revealing a different slice of what it can do.
- Document Privacy Explorer: drop in a PDF or DOCX, read the document back with every PII span highlighted in place.
- Image Anonymizer: upload an image, get it back with redacted black bars over names, emails, and account numbers. The image is also editable on a canvas so you can make your own annotations before downloading.
- SmartRedact Paste: paste sensitive text, share a public URL that serves the redacted version, keep a private reveal link for yourself.
All three are built on `gradio.Server`, which lets you pair custom HTML/JS frontends with Gradio's queueing, ZeroGPU allocation, and the `gradio_client` SDK. In all three apps, `gradio.Server` plays the same backend role, and that consistency is what makes it powerful.
The model
Privacy Filter is a 1.5B-parameter model with 50M active parameters, permissively licensed under Apache 2.0. The PII categories are `private_person`, `private_address`, `private_email`, `private_phone`, `private_url`, `private_date`, `account_number`, and `secret`. The context window is 128,000 tokens. The model achieves state-of-the-art performance on the PII-Masking-300k benchmark; full numbers and methodology are in the official release blog.
1. Document Privacy Explorer
Try it at ysharma/OPF-Document-PII-Explorer.
**User problem.** You want to read a PII-heavy document (a contract, a resume, an exported chat log) with every detected span highlighted by category, a filter in the sidebar, and a summary dashboard up top. The reading experience should feel like a normal document, not a form.
**What Privacy Filter does here.** The whole file goes through in a single 128k-context forward pass, so there's no chunking, no stitching, and span offsets line up directly with the rendered text. BIOES decoding keeps span boundaries clean through long ambiguous runs.
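The BIOES scheme tags each token as Begin, Inside, End, Single, or Outside a span; decoding collapses those tags into clean spans. A minimal sketch of the idea over token indices (illustrative only, not the model's actual decoder):

```python
def decode_bioes(tags):
    """Collapse per-token BIOES tags into (start, end, label) spans.

    B = begin, I = inside, E = end, S = single-token span, O = outside.
    `tags` is a list of strings like "B-private_email"; indices are token
    positions, and `end` is exclusive.
    """
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        prefix, _, cat = tag.partition("-")
        if prefix == "S":
            spans.append((i, i + 1, cat))
            start, label = None, None
        elif prefix == "B":
            start, label = i, cat
        elif prefix == "E" and start is not None:
            spans.append((start, i + 1, label))
            start, label = None, None
        elif prefix == "O":
            start, label = None, None
        # "I" continues an open span; malformed sequences are dropped
    return spans
```

Because a span only closes on an explicit `E` or `S` tag, a long run of `I` tags can't accidentally split one entity into several.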
**What `gr.Server` does here.** You could wire this up in Blocks with `gr.HighlightedText` and a sidebar, and it would work. But the reading experience we wanted (serif body, category filters that toggle CSS classes client-side instead of re-running the model, a summary dashboard that doesn't force a page re-render) was easier to hand-author than to compose. `gr.Server` lets us serve the reader view as a single HTML file and expose the model behind one queued endpoint:

```python
import gradio as gr
from fastapi.responses import HTMLResponse
from gradio.data_classes import FileData

server = gr.Server()

@server.get("/", response_class=HTMLResponse)
async def homepage():
    return FRONTEND_HTML  # reader view; see app.py

@server.api(name="analyze_document")
def analyze_document(file: FileData) -> dict:
    text = extract_text(file["path"])              # PyMuPDF / python-docx
    source_text, spans = run_privacy_filter(text)  # single 128k pass
    return {
        "text": source_text,
        "spans": spans,  # [{start, end, label}, ...]
        "stats": compute_stats(source_text, spans),
    }
```
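The `compute_stats` helper called in `analyze_document` lives in `app.py` and isn't shown; a plausible minimal version (an assumption, not the app's actual code) just tallies spans per category for the dashboard:

```python
from collections import Counter

def compute_stats(text: str, spans: list[dict]) -> dict:
    """Summary numbers for the dashboard: total spans, per-category
    counts, and the fraction of characters flagged as PII."""
    counts = Counter(s["label"] for s in spans)
    flagged = sum(s["end"] - s["start"] for s in spans)
    return {
        "total_spans": len(spans),
        "by_category": dict(counts),
        "pii_char_ratio": round(flagged / max(len(text), 1), 4),
    }
```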
Note the decorator: `@server.api(name="analyze_document")`, not a plain `@server.post`. That's the piece that plugs the handler into Gradio's queue, so concurrent uploads are serialized, `@spaces.GPU` composes correctly on ZeroGPU, and the same endpoint is reachable from both the browser and `gradio_client` with no duplicated code. The browser calls it with the Gradio JS client:
```html
<script type="module">
  import { Client, handle_file } from "https://cdn.jsdelivr.net/npm/@gradio/client/dist/index.min.js";

  const client = await Client.connect(window.location.origin);

  async function uploadFile(file) {
    const result = await client.predict("/analyze_document", { file: handle_file(file) });
    renderResults(result.data[0]);  // { text, spans, stats }
  }
</script>
```
2. Image Anonymizer
Try it at ysharma/OPF-Image-Anonymizer.
**User problem.** You want to share an image or screenshot (a Slack thread, a receipt, a Stripe dashboard) with black bars over the PII. You want to toggle bars on and off, drag them to reposition, or draw one by hand for anything the model missed, then export the result.
**What Privacy Filter does here.** Tesseract runs OCR and returns per-word bounding boxes. The backend reconstructs the full text with a char-offset-to-box map, then runs Privacy Filter once over the whole text. Detected character spans are looked up against the word map and joined into pixel rectangles per line.
**What `gr.Server` does here.** `gr.ImageEditor` supports layered annotation and is a reasonable starting point for image redaction. But the workflow we wanted (per-bar category metadata, toggling all bars in a category at once, client-side PNG export at natural resolution with no server round-trip) was cleaner to build on a custom `<canvas>` frontend. `gr.Server` hands back pixel rectangles from one queued endpoint and lets the canvas own everything else:
```python
@server.api(name="anonymize_screenshot")
def anonymize_screenshot(image: FileData) -> dict:
    img = Image.open(image["path"]).convert("RGB")
    full_text, char_to_box = ocr_image(img)  # per-word boxes + char map
    spans = run_privacy_filter(full_text)
    boxes = spans_to_pixel_boxes(spans, char_to_box)
    return {
        "image_data_url": pil_to_base64(img),
        "width": img.width,
        "height": img.height,
        "boxes": boxes,  # [{x, y, w, h, label, text}, ...]
    }
```
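The `spans_to_pixel_boxes` helper is the only geometry in the app. A sketch of the lookup, assuming `char_to_box` is a list of `(start, end, box)` entries per OCR word (the shapes here are assumptions, not the app's actual data model):

```python
def spans_to_pixel_boxes(spans, char_to_box):
    """Map detected character spans to pixel rectangles.

    `char_to_box` holds one (start, end, box) tuple per OCR word, where
    box is a dict {x, y, w, h, line}. Words overlapping a detected span
    are merged into one rectangle per text line.
    """
    boxes = []
    for span in spans:
        # words whose character range overlaps the detected span
        hits = [b for s, e, b in char_to_box
                if s < span["end"] and e > span["start"]]
        by_line = {}
        for b in hits:
            by_line.setdefault(b["line"], []).append(b)
        for line_boxes in by_line.values():
            x0 = min(b["x"] for b in line_boxes)
            y0 = min(b["y"] for b in line_boxes)
            x1 = max(b["x"] + b["w"] for b in line_boxes)
            y1 = max(b["y"] + b["h"] for b in line_boxes)
            boxes.append({"x": x0, "y": y0, "w": x1 - x0, "h": y1 - y0,
                          "label": span["label"]})
    return boxes
```

Merging per line is what turns a multi-word name into one continuous bar instead of several small ones.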
The frontend invokes it with `client.predict("/anonymize_screenshot", { image: handle_file(file) })`, the same pattern as above. Toggles, drags, new-bar drawing, and PNG export all happen in the browser; edits never round-trip to the server.
3. SmartRedact Paste
Try it at ysharma/OPF-SmartRedact-Paste.
**User problem.** You want a pastebin that redacts before sharing. You paste a log line, an email, a support ticket, and get two URLs back. The public one serves the redacted version with `<PRIVATE_PERSON>`, `<PRIVATE_EMAIL>`, and `<ACCOUNT_NUMBER>` placeholders, following the redaction convention from the official blog examples. The private one is gated by a token you keep and shows the original with spans highlighted.
**What Privacy Filter does here.** Swap each detected span for a `<CATEGORY>` placeholder in the stored paste. That's the entire redaction step. Multilingual text (Spanish, French, Chinese, Hindi, and others in the model-card examples) routes through the same call with no change.
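The swap itself is a few lines; a minimal sketch of a `redact` helper (the app's real one is in `app.py`), applying spans right-to-left so earlier character offsets stay valid:

```python
def redact(text: str, spans: list[dict]) -> str:
    """Replace each detected span with an uppercase <CATEGORY> placeholder,
    e.g. private_email -> <PRIVATE_EMAIL>. Each span is a dict with
    character offsets `start`/`end` and a category `label`."""
    out = text
    for span in sorted(spans, key=lambda s: s["start"], reverse=True):
        placeholder = f"<{span['label'].upper()}>"
        out = out[:span["start"]] + placeholder + out[span["end"]:]
    return out
```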
**What `gr.Server` does here.** This app needs two distinct GET routes for the same paste ID, one public and one token-gated, and the URL shape matters because the reveal URL is the thing you keep. `gr.Server` works here because it's a FastAPI app underneath, which is also why `@server.api` and plain `@server.get` can sit side by side in the same process. (The same thing can be built with `gr.Blocks()` by mounting custom routes with FastAPI.)
```python
# Model call → queued endpoint. Hit from the browser via
# client.predict("/create_paste", { text, ttl }).
@server.api(name="create_paste")
def create_paste(text: str, ttl: str = "never") -> dict:
    source_text, spans = run_privacy_filter(text)
    redacted = redact(source_text, spans)  # <CATEGORY> placeholders
    pid, reveal_token = secrets.token_urlsafe(6), secrets.token_urlsafe(22)
    PASTES[pid] = Paste(pid, reveal_token, source_text, redacted, spans,
                        expires_at=_ttl(ttl))  # see app.py
    return {
        "view_path": f"/view/{pid}",
        "reveal_path": f"/view/{pid}?token={reveal_token}",
    }

# View page → plain FastAPI GET. No model, no queue needed, and we
# actually want the bespoke URL shape `/view/{pid}?token=...` that a
# queued endpoint couldn't give us.
@server.get("/view/{pid}", response_class=HTMLResponse)
async def view_paste(pid: str, token: str | None = None):
    p = _store_get(pid)  # see app.py for store
    if p is None:
        return HTMLResponse(_not_found(), status_code=404)
    revealed = bool(token) and secrets.compare_digest(token, p.reveal_token)
    return HTMLResponse(_render_view(p, revealed))
```
A daemon thread evicts expired pastes every 30 seconds. The whole service, including storage, is about 200 lines of application code because everything lives in one process.
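The eviction thread is ordinary stdlib code; a sketch with the sweep factored into a testable function (names here are illustrative, not the app's actual ones):

```python
import threading
import time

def sweep_expired(pastes: dict, now: float) -> int:
    """Delete pastes whose expires_at has passed; return how many."""
    expired = [pid for pid, p in pastes.items()
               if p.expires_at is not None and p.expires_at <= now]
    for pid in expired:
        del pastes[pid]
    return len(expired)

def start_eviction_thread(pastes: dict, interval: float = 30.0) -> threading.Thread:
    def loop():
        while True:
            time.sleep(interval)
            sweep_expired(pastes, time.time())
    t = threading.Thread(target=loop, daemon=True)  # dies with the process
    t.start()
    return t
```

A daemon thread is enough here because the store is in-process and losing unexpired pastes on restart is acceptable for a demo.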
What `gradio.Server` provides
The split across all three apps is the same: anything that touches the model goes through `@server.api`, everything else stays on plain FastAPI routes.

| App | Queued compute (`@server.api`) | Plain FastAPI routes |
| --- | --- | --- |
| Document Privacy Explorer | `analyze_document`: extract, detect, stats | `GET /` serves the custom reader view |
| Image Anonymizer | `anonymize_screenshot`: OCR, detect, spans → pixel boxes | `GET /` + `GET /examples/*` serve the canvas UI and preloaded examples |
| SmartRedact Paste | `create_paste`: detect, redact, mint IDs | `GET /` compose page, `GET /view/{pid}?token=...` public + token-gated views, `GET /api/paste/{pid}` JSON lookup |
`@server.api` gives you Gradio's queue (serialized requests, correct `@spaces.GPU` composition on ZeroGPU, progress events), and it's what the browser hits through `@gradio/client`. The same endpoint is what `gradio_client` users hit from Python: one function, two SDKs, no duplicated code. Plain `@server.get`/`@server.post` are reserved for the static surfaces: HTML pages, file lookups, cheap dict reads. That's the rule of thumb from the `gradio.Server` intro post, and it's what makes these three apps feel consistent even though their UIs are very different.
Try them
Drop in a resume, a screenshot of a Slack thread, a log line with a token in it. The fun part is seeing what Privacy Filter catches (and occasionally misses) on text you actually care about.
Recommended reading
- OpenAI’s release post: Introducing OpenAI Privacy Filter
- Model card: openai/privacy-filter on Hugging Face
- Redaction examples and taxonomy on the model card