Hermes Labs maps the structured failures hiding in production LLM systems. 1,500+ controlled evaluations. Enterprise behavioral audits. Open-source safety tools. Built on the philosophy of language, not just metrics.
Models agree with users instead of correcting them. Confidence masks compliance. Your AI says "yes" when the answer is "no."
Asymmetric evidential standards. Positive claims get amplified, null results get buried. Your AI has a confirmation bias it can't see.
Meaning shifts across conversation turns without flagging uncertainty. The model silently reinterprets your intent mid-session.
Edge cases in instruction parsing. The model acts with full confidence on an ambiguous interpretation and never signals doubt.
Most AI safety work starts from machine learning. Hermes Labs starts from the centuries-old study of how meaning breaks down.
Wittgenstein showed that language games have rules that aren't written anywhere. Gadamer argued that interpretation is never neutral. Heidegger revealed that understanding requires context that can't be formalized.
These aren't abstract ideas. They describe exactly the failure modes that cause production LLMs to misattribute sources, fabricate citations, and act with full confidence on ambiguous instructions.
"I started Hermes Labs because I kept watching engineering teams deploy language models without understanding how they fail. The traditions that spent centuries studying how meaning breaks down have a lot to say about why LLMs behave the way they do."
Rolando Bosch, Founder

Hermes Labs maps the gap between what AI systems say and what they know. Independent research, open-source tools, enterprise audits.