AI Law - International Review of Artificial Intelligence Law
CC BY-NC-SA Commercial Licence
ISSN 3035-5451
G. Giappichelli Editore

02/09/2025 - EU study highlights flaws in current AI benchmarking methods (Europe)

Topic: News - Digital Governance

Source: Euractiv

A recent paper from the European Commission's Joint Research Centre warns that current AI benchmarking tools overpromise, are easily manipulated, and often measure irrelevant aspects of AI models. The study stresses that regulators need to evaluate these tools carefully, ensuring that benchmarks reflect real-world capabilities and are transparent, well documented, and culturally inclusive. This is critical because the EU’s AI Act relies on benchmarks to classify AI risks. The US has launched its own AI evaluation tools to maintain leadership. Experts suggest that the EU should require third-party evaluators and fund the AI evaluation ecosystem. The Commission has invested €9 million to support AI model evaluations, aiming to establish trustworthy standards and possibly create a global ‘Brussels effect’ in AI governance.