qa

#qa

@petergyang: I am interviewing @NousResearch Hermes co-founder @karan4d tomorrow, what topics would you like to see us cover? I alre…

X AI KOLs Following ↗ · yesterday Cached

Peter Yang is interviewing Karan4D, co-founder of NousResearch's Hermes, and asks for topic suggestions from the community.

#qa

@browser_use: Use the QA skill in Browser Use v4. Your agent builds the app. Give Browser Use the URL and let it: > Test the flow and…

X AI KOLs Following ↗ · 2026-06-26 Cached

Browser Use v4 introduces a QA skill that allows your agent to test flows, catch bugs, and evaluate UI by clicking around as a user, closing the feedback loop for developers.

#qa

@mxtaverse: People are losing jobs left and right. Esp front end, designers and QA. Scary situation.

X AI KOLs Following ↗ · 2026-06-23

A tweet observes that people are losing jobs across tech, especially in front-end, design, and QA roles, and calls the situation scary.

#qa

QApilot's CoWork

Product Hunt ↗ · 2026-06-17

QApilot's CoWork claims to triple mobile automation efficiency without expanding the QA team.

#qa

My voice-agent test now includes the 600-second cliff

Reddit r/AI_Agents ↗ · 2026-06-11

The author describes a voice agent call cut off at 600 seconds without warning, and proposes a testing approach to handle max duration gracefully, including pre-cutoff warnings and state preservation.

#qa

Your AI Agent is one bad prompt away from ruining your brand (And why traditional QA is useless)

Reddit r/AI_Agents ↗ · 2026-06-11

The article argues that traditional chatbot QA is broken because it only tests happy paths, and proposes using an AI-powered user simulator that attacks the bot with diverse personas and edge cases to find vulnerabilities before deployment.

#qa

@neural_avb: Locally generating GRPO-like rollouts with my SLM, and using this tiny RM as the rubric. Next I'll be RL training on fr…

X AI KOLs Timeline ↗ · 2026-06-11 Cached

Neural_avb releases a lightweight Answer-eq Reward Model for RL training on QA tasks, claiming 80% agreement with an external judge LM and that it is faster than F1/ROUGE/BERTScore.

#qa

@antirez: Took the good work of the community of DwarfStar and consolidating the Strix Halo support. It looks very good. More QA …

X AI KOLs Following ↗ · 2026-06-07 Cached

Antirez is consolidating community contributions from DwarfStar to improve Strix Halo support, with final QA and merge expected soon.

#qa

A new era for software testing

Hacker News Top ↗ · 2026-06-07 Cached

The article discusses using LLMs as automated QA engineers to perform tasks that are usually done by hand, such as integration and regression testing, potentially raising the bar for software quality.

#qa

Answer Presence Drives RAG Rewriting Gains

Hugging Face Daily Papers ↗ · 2026-06-04 Cached

The paper investigates whether the performance gains from rewriting retrieved passages in RAG QA pipelines are causally driven by the presence of the gold answer string in the rewritten context, using controlled intervention audits across multiple models and datasets.

#qa

@RayFernando1337: The bugs that cause churn almost never show up in a diff, and you only really catch them when you stop reviewing code a…

X AI KOLs Timeline ↗ · 2026-06-02 Cached

A developer shares a workflow using Cursor's Opus 4.8 Max Thinking model with a subagent harness, and introduces a GitHub repository of installable skill files for AI coding agents, including a 'running-bug-review-board' skill that performs live QA testing.

#qa

@justsisyphus: imagine your codex does QA by himself using computer use without manually telling them every fucking time yes that is w…

X AI KOLs Timeline ↗ · 2026-05-31 Cached

LazyCodex is a tool that automates QA using AI computer use, allowing developers to set up automated testing without manual intervention.

#qa

@yihui_indie: I've been away from the workplace for too long. I'm now very curious about QA work in big companies—is it still the same workflow as before? That is, after finding a bug, you file a ticket to the developers. Because I've realized that when I submit a bug to the devs now, the submitted bug itself is a prompt for AI. I think…

X AI KOLs Following ↗ · 2026-05-30 Cached

Having been away from the workplace for a long time, the author wonders whether QA at big companies still follows the same workflow of filing a ticket with the developers after finding a bug. Because a submitted bug report is itself a prompt for AI, the author suggests it might be better to let AI modify the code directly.

#qa

@ndrewpignanelli: Activegraph's website, newsletter, and marketing are all run on Cofounder!

X AI KOLs Timeline ↗ · 2026-05-26 Cached

ActiveGraph introduces a deterministic, non-generative approach to compiling evidence before semantic memory, achieving 85.6% QA accuracy and 86.2% turn answer-in-context on LongMemEval-S.

#qa

@yoheinakajima: ran my first benchmark this weekend (longmemeval) mostly to test activegraph, learned a lot! - this is a stepping stone…

X AI KOLs Timeline ↗ · 2026-05-26 Cached

Yohei Nakajima ran the LongMemEval benchmark on ActiveGraph, achieving 85.6% QA accuracy and 86.2% turn answer-in-context, demonstrating the effectiveness of event-based agent systems for long-term memory.

#qa

Claim-Selective Certification for High-Risk Medical Retrieval-Augmented Generation

arXiv cs.CL ↗ · 2026-05-22 Cached

This paper proposes claim-selective certification for high-risk medical retrieval-augmented generation (RAG), decomposing responses into verifiable claims and scoring them against evidence to produce actions (full, partial, conflict, abstain) using an intent-aware selector, achieving low unsupported-claim risk and high action accuracy.

#qa

@RayFernando1337: You can teach Composer 2.5 to be a really good QA engineer for your team with this prompt: "go ahead and make a QA sect…

X AI KOLs Following ↗ · 2026-05-20 Cached

A tweet shares a prompt that configures Composer 2.5 to act as a QA engineer, creating test documentation and bug reports for development phases.

#qa

@aigclink: An open-source end-to-end video translation + video Q&A Skill: violin. The highlight is not just literal translation but the idea of content re-creation. It integrates ASR, LLM translation, and TTS into one seamless video pipeline Skill. The three modules are chained automatically: input a video and get a dubbed, translated video. Translation style is adjustable, for example...

X AI KOLs Timeline ↗ · 2026-05-15

Violin is an open-source end-to-end video translation and video Q&A tool, integrating ASR, LLM translation, and TTS. It supports style adjustment and content re-creation, and can answer questions about video content.
