Language models sometimes make up information that sounds correct but is actually wrong. Think of it like a student guessing on a hard test instead of saying 'I don't know.' This happens because the models are trained to maximize their score, and under most benchmarks guessing earns a better score than admitting uncertainty. Model developers are therefore encouraged to change the way models are tested: instead of punishing a model for admitting it doesn't know an answer, evaluations should reward honest, trustworthy behavior.
Why Language Models Hallucinate
Language models make things up because they are rewarded for guessing answers instead of admitting they don't know.
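To make that incentive concrete, here is a minimal sketch, not taken from the source, comparing the expected score of guessing versus answering 'I don't know' under a plain accuracy metric and under a hypothetical rule that penalizes wrong answers. The probability of a correct guess and the penalty values are illustrative assumptions, not figures from the document.

```python
# Minimal sketch (illustrative assumptions, not the source's numbers) of why
# accuracy-only grading rewards guessing over admitting uncertainty.

def expected_score(p_correct: float, right: float, wrong: float, abstain: float) -> dict:
    """Expected score of guessing vs. abstaining under a given scoring rule."""
    return {
        "guess": p_correct * right + (1 - p_correct) * wrong,
        "abstain": abstain,
    }

p = 0.3  # hypothetical chance the model's guess is correct

# Plain accuracy: +1 for a right answer, 0 otherwise.
accuracy = expected_score(p, right=1.0, wrong=0.0, abstain=0.0)
print("accuracy grading:", accuracy)    # guess = 0.3 > abstain = 0.0 -> guessing wins

# A rule that penalizes confident wrong answers (illustrative values).
penalized = expected_score(p, right=1.0, wrong=-1.0, abstain=0.0)
print("penalized grading:", penalized)  # guess = -0.4 < abstain = 0.0 -> honesty wins
```

Under plain accuracy, any nonzero chance of being right makes guessing the better strategy; a scoring rule that penalizes wrong answers, or credits abstention, removes that incentive.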