measurement

Tag

#measurement

PReMISE: Policy Rubrics as Measurement Specifications for LLM Judges

arXiv cs.AI · 3d ago

Introduces PReMISE, a framework for discovering and auditing policy-level rubrics for LLM judges along four axes: structural adequacy, reliability, preference fit, and adversarial robustness.

#measurement

AI evaluation may bias perceptions: The importance of context in interpreting academic writing

arXiv cs.CL · 2026-05-27

This paper examines how estimates of AI use in scientific writing can be biased when evaluation methods ignore contextual differences across countries and fields, and proposes context-aware benchmarks for more accurate measurement.

#measurement

Our voice agent's p99 was 280ms. Competitor's was 450ms. Users said ours felt slower. We measured why.

Reddit r/AI_Agents · 2026-05-26

A voice agent team found that, despite a lower end-to-end p99 latency (280ms vs. the competitor's 450ms), users perceived their agent as slower because it was slow to respond to barge-in interruptions (380ms vs. 60ms). They identified three fixes (memory pinning, VAD threshold tuning, and smaller TTS chunks) that raised the barge-in rate at 100ms from 41% to 89%, after which users perceived the agent as faster.
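The summary above contrasts two kinds of metric: a tail percentile (p99) and the share of events handled within a fixed budget (here 100ms). The following is a minimal sketch of how the two could be computed from the same samples. The function names and the latency values are hypothetical, made up for illustration, and are not taken from the post.

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that
    at least p percent of all samples are at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

def share_within(samples, threshold_ms):
    """Fraction of samples at or below threshold_ms."""
    return sum(1 for s in samples if s <= threshold_ms) / len(samples)

# Hypothetical barge-in latencies in milliseconds (illustrative only).
barge_in_ms = [40, 55, 60, 80, 95, 120, 150, 300, 380, 400]

print(percentile(barge_in_ms, 99))     # 400
print(share_within(barge_in_ms, 100))  # 0.5
```

A system can look good on one of these numbers and poor on the other, which is why it can help to report both for interaction-critical events such as interruptions.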

#measurement

Screen Ruler

Product Hunt · 2026-05-23

Screen Ruler is a tool that provides on-screen measurements for designers and developers.

#measurement

AI proficiency is becoming a hiring requirement but we still have no real way to measure it

Reddit r/ArtificialInteligence · 2026-05-22

The author explores the difficulty of accurately measuring AI proficiency in hiring, arguing that current certifications and tests focus on memorization rather than practical reasoning and evaluation.

#measurement

All the Fancy Measuring Devices Used in Science Rely on Two Stone-Age Techniques

Wired · 2026-05-22

The article argues that, however sophisticated modern scientific instruments are, all measurement ultimately derives from two ancient techniques, comparison and counting, and illustrates this with examples such as rulers and sundials.

#measurement

Points are a weird and inconsistent unit of measure

Lobsters Hottest · 2026-05-13

A technical deep dive into the historical inconsistency of the typographic point unit, explaining why TeX (72.27 pt/inch) and Inkscape (72 pt/inch) use different definitions, rooted in 19th-century standardization and Donald Knuth's pragmatic adjustment.
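To make the difference between the two definitions concrete, here is a small sketch that converts a length in TeX points (72.27 per inch) to the 72-per-inch points that the summary attributes to Inkscape (TeX itself calls this unit the "big point", `bp`). The constant and function names are hypothetical and chosen only for this illustration.

```python
TEX_PT_PER_INCH = 72.27  # TeX's printer's point
BIG_PT_PER_INCH = 72.0   # 72-per-inch point (TeX's "bp")

def tex_pt_to_inches(tex_pt):
    """Convert a length in TeX points to inches."""
    return tex_pt / TEX_PT_PER_INCH

def tex_pt_to_big_pt(tex_pt):
    """Convert a length in TeX points to 72-per-inch points."""
    return tex_pt_to_inches(tex_pt) * BIG_PT_PER_INCH

# A nominal 10pt length in TeX is slightly under 10 of the 72-per-inch points.
print(round(tex_pt_to_big_pt(10), 4))  # 9.9626
```

The gap is under 0.4%, which is small for a single glyph but can add up to a visible misalignment across a full page when the two conventions are mixed.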
