Here it is: Benchmark-Yourself app - compete against open-source LLMs and get your score - 5 benchmarks available - add your results to your CV or LinkedIn (if you dare)... or just paste them below for community shaming.

Reddit r/LocalLLaMA Tools

Summary

A web app that allows users to benchmark their own performance against open source LLMs on five benchmarks, with the option to add results to a CV or LinkedIn.

[https://benchmark-yourself.streamlit.app/](https://benchmark-yourself.streamlit.app/)

BBQ is 🔥

* Rule 4: Limit Self-Promotion - this is not self-promotion
* The 1/10th rule is a good guideline: self-promotion should not be more than 10% of your content. - my content is high quality and diversified
* Affiliation must be disclosed: No engagement farming, No “I found this..”, etc. - I am not affiliated with Streamlit or oMLX or anything.
