Metrics for Holistic Evaluation of LLM Reasoning about Action, Change, and Planning
Published in NeurIPS 2025 Workshop on Evaluating the Evolving LLM Lifecycle: Benchmarks, Emergent Abilities, and Scaling, 2025
New, informative metrics for evaluating LLM responses in planning and reasoning tasks.
