Tag: quality-assurance

#quality-assurance

Using AI to write better code more slowly

Lobsters Hottest · 2026-05-25

Nolan Lawson argues that AI coding assistants can be used to write higher-quality code more slowly: running multiple models for thorough code review and bug detection improves codebase health instead of maximizing output speed.

#quality-assurance

How are teams handling prompt QA at scale?

Reddit r/AI_Agents · 2026-05-20

A practitioner at a company handling ~40k conversations/month describes the bottleneck of manual prompt QA and asks how teams are using automated systems to detect regressions and user frustration in production.

#quality-assurance

Drizz

Product Hunt · 2026-05-14

Drizz is a mobile testing tool that autonomously writes, runs, and fixes tests.

#quality-assurance

GPT-5.5 was used to flag fatal errors in FrontierMath problems

Reddit r/singularity · 2026-05-12

Epoch used GPT-5.5 to identify fatal errors in approximately one-third of the FrontierMath benchmark problems, showing that the model can be used to sanity-check the problems in an evaluation benchmark.

#quality-assurance

@kettanaito: More and more people are asking me about testing resources so let's put everything I've written in one post. Bookmark, …

X AI KOLs Following · 2026-05-10

The author consolidates a series of articles on software testing fundamentals, covering topics such as the purpose of testing, assertions, code coverage, and handling flaky tests.

#quality-assurance

Fabraix

Product Hunt · 2026-05-07

Fabraix is a tool that helps developers identify gaps in their AI agents before users encounter them.

#quality-assurance

Kimi vendor verifier – verify accuracy of inference providers

Hacker News Top · 2026-04-20

Moonshot AI has open-sourced the Kimi Vendor Verifier (KVV), a tool designed to help users verify the accuracy and correctness of inference provider implementations for open-source models like Kimi K2. It uses six critical benchmarks to detect infrastructure-level issues such as KV cache bugs, quantization degradation, and parameter misuse.
