A Python script that parses PDF books, extracts key knowledge points, and generates Markdown summaries, with the aim of making reading and knowledge organization more efficient.
Recall is an open-source tool that provides fully local, zero-cost project memory for Claude Code by automatically capturing session history and summarizing it into a compact context file, all without sending data to any external API.
This paper proposes Detect–Remask–Repair, a diffusion-based framework for localized faithfulness repair in summarization when contexts evolve, and introduces the StreamSum benchmark for evaluating such settings. Experiments show it offers controllable trade-offs between faithfulness, speed, and content preservation.
This paper evaluates context engineering configurations for LLM agents in enterprise tool-use workflows, showing that summarization with selective pruning achieves 91.6% accuracy while reducing token usage by over 60% compared to full-context baselines.
A comparison of LLM summarization performance shows Qwen 3 leads the 30B parameter range, followed by Gemma 4, while newer Qwen models may be optimized for agentic tasks.
ColibotAI is an on-device AI tool that translates, summarizes, and explains any text without needing an internet connection.
The author describes inverting the textbook agent memory design from retrieval-on-demand to injection-first to avoid latency and confident empty-context errors, detailing the architecture and a dangerous self-poisoning failure mode with write-back.
The paper introduces NRLB, a multi-agent framework for plain language summarization that simulates diverse reader groups (elementary school students, non-native speakers, readers with attention deficits) to improve readability while maintaining factual accuracy, validated across multiple datasets and human evaluations.
A practitioner shares real-world failure modes of context window management strategies (summarization, RAG, truncation) in AI agents running continuously for 6+ hours, noting that each method degrades decision quality in ways that only become apparent at extended runtime.
This paper introduces a method to improve factual consistency in text summarization by aggregating scores from multiple weak metrics via preference learning, achieving consistent factuality gains across various language models.
MemFail is a diagnostic benchmark that isolates failure modes of LLM memory systems by formalizing summarization, storage, and retrieval operations, and evaluating them with adversarially designed datasets.
This paper investigates the risk of sensitive information inference from exported LLM representations in clinical summarization, showing that reducing leakage from one vector artifact does not guarantee privacy in others. It introduces SurfaceLoRA, a fine-tuning method that reduces race recovery from targeted vectors while preserving utility.
Introduces parallel context compaction for long-horizon LLM agents, enabling fine-grained control over summary volume and reducing end-to-end latency compared to sequential synchronous compaction across multiple backbone models.
A Meta paper shows that coding agents improve significantly when they reuse short summaries of past attempts instead of raw logs, achieving strong gains on SWE-Bench and Terminal-Bench with Claude Opus 4.5.
Meetily is a privacy-first, open-source AI meeting assistant that captures, transcribes, and summarizes meetings entirely locally on the user's infrastructure.
This paper proposes an evidence-based model to automatically generate query keywords from query-free summarization datasets, enabling the creation of query-focused summarization datasets. Experimental results show that summaries generated using evidence-based queries achieve competitive ROUGE scores compared to original queries.
SCURank uses Summary Content Units to rank candidate summaries, enabling small models distilled from multiple LLMs to outperform both models trained with traditional ranking metrics and single-LLM distillates.
A novice asks for recommendations on small language models and prompting strategies to build an employee note summarization engine under 2000 tokens, after experiencing hallucinations with Qwen2.5-7B-Instruct.
This paper introduces the one-sided conversation problem (1SC), addressing how to reconstruct missing dialogue and generate summaries when only one speaker's turns are available in real-world settings like telemedicine and call centers. The authors evaluate prompting and fine-tuned models on multiple datasets, finding that access to future context and utterance length information improves reconstruction, while high-quality summaries can be generated without full dialogue reconstruction.
AICW Summarize Widget lets website visitors summarize page content using their preferred AI tool. It is a simple embeddable widget aimed at improving content accessibility.