EU study highlights flaws in current AI benchmarking methods (Europe)


02/09/2025 - EU study highlights flaws in current AI benchmarking methods (Europe)

Topic: News - Digital Governance

A recent paper from the European Commission's Joint Research Centre warns that current AI benchmarking tools overpromise, are easily manipulated, and often measure irrelevant aspects of AI models. The study urges regulators to evaluate these tools carefully and to ensure that benchmarks reflect real-world capabilities and are transparent, well documented, and culturally inclusive. This matters because the EU’s AI Act relies on benchmarks to classify AI risks. The US, meanwhile, has launched its own AI evaluation tools to maintain its leadership. Experts suggest that the EU should require third-party evaluators and fund the AI evaluation ecosystem. The Commission has invested €9 million to support AI model evaluations, aiming to establish trustworthy standards and possibly create a global ‘Brussels effect’ in AI governance.

AI Law - International Review of Artificial Intelligence Law ISSN 3035-5451

