I'm an AI engineer with 8 years in AI infrastructure. Having been burned by hallucination pitfalls in the past, I now rigorously track every benchmark release.