LLMs Need to Go Beyond Computational Confidence Metrics to Establish Trust
Published in AAAI Symposium on AI Trustworthiness and Risk Assessment for Challenged Contexts (ATRACC-25), 2025
While Large Language Models (LLMs) have demonstrated impressive capabilities, their widespread deployment is hindered by the lack of trustworthiness of their responses. Although existing trust scores and confidence metrics attempt to quantify uncertainty and to ensure the safety and reliability of LLM responses, each addresses only a single dimension of trust and fails to assess trust holistically, in a user-centric manner. This lack of metric reliability and LLM trustworthiness poses significant risks in critical human-AI interaction applications. We posit that current confidence metrics and trust scores are insufficient to accurately measure trustworthiness and, ultimately, to inform how to establish calibrated user trust in these systems. We further argue that measuring the trustworthiness of generative AI systems requires moving beyond purely computational assessments. We outline frameworks and approaches that future research can incorporate into holistic trustworthy AI assessment and development.
