Tag
An introduction to the /goal feature of Claude Code, which, given a prompt such as "/goal all tests pass and lint is clean", automates testing and linting; the post likens it to a highly efficient QA engineer.
Blog post offering guidelines for writing Elixir tests, including using @subject module attributes, describe blocks, and avoiding module mocking in favor of dependency injection.
The article discusses the problem of flaky tests in software development and proposes a simple mechanical habit: when using a merge queue, continue to run the full test suite on main and maintain a visible list of recent main failures to help identify and eradicate flaky tests.
A developer argues that end-to-end testing is more important than unit tests in the age of AI coding agents, after an agent-coded refactor passed all unit tests but broke a critical feature that was only caught by manual end-to-end checks.
A tutorial covering various strategies for mocking network calls in Rust, using the example of a Kubernetes controller that emits events, with emphasis on not degrading production code for testability.
The post argues that the primary value of AI in programming is not just writing code faster, but enabling sustainable high-level verification and testing that was previously too costly in terms of human effort.
The author describes implementing golden/visual regression testing for the Waterfall-CAD Haskell library using SVG output and the tasty-golden library.
A shared resource linking to an interview preparation playbook focused on RAG evaluation and testing for LLMs.
Announcement of a course on React end-to-end testing with Playwright, covering browser automation, user journeys, auth, search, test setup, and debugging tips.
Fabraix is a tool that helps developers identify gaps in their AI agents before users encounter them.
The llm-echo 0.5a0 release adds a -o thinking 1 option to help test against LLM 0.32a0 and higher. The plugin provides a fake echo model for automated testing.
Opslane is a new tool that lets developers test Claude-generated code changes in a real browser by reading specs, building acceptance criteria, and running tests against a local dev server.
LangChain published a post explaining how they automated testing of their documentation to ensure technical accuracy for both humans and AI coding agents.
The article discusses the growing importance of reliability, security, and user protections as AI models become more capable and personalized.
Tophat by Shopify is a tool that lets you test mobile CI builds on any device without building locally.
Mitchell Hashimoto introduces Tripwire, a Zig library that injects failures to test error recovery paths, with zero runtime cost when disabled.
AutoDev is an AI-driven software development framework that automates complex engineering tasks, such as code and test generation, within a secure Docker environment. It achieves high performance on the HumanEval dataset by enabling autonomous planning and execution of intricate software engineering tasks.
Insomnia is an open-source, cross-platform API client for GraphQL, REST, WebSockets, SSE, gRPC and other HTTP protocols, offering local, Git, and cloud sync storage options.
Puppeteer is a JavaScript library that provides a high-level API to control Chrome/Chromium over the DevTools Protocol, commonly used for browser automation, web scraping, and testing.
Cypress is a fast, easy, and reliable end-to-end testing framework for anything that runs in a browser, available as an open-source project on GitHub.