Accountability is the ability to identify who made a decision, for what reasons, how the decision has been put into practice, and to whom the decision-makers are answerable. As part of an accountability mechanism to support trustworthy AI, providers of a high-risk AI system should be able to produce artefacts such as technical documentation (EU AI Act Article 11) and event logs (Article 12). The volume of data created for these accountability artefacts, and the variety of AI components from different vendors assembled into a single AI service, make it difficult for system developers and auditors to answer questions such as "Which models, in the past two weeks, performed less accurately than claimed in their technical documentation?" when each component describes its claimed and recorded accuracy in its own way. We present STAV, a system trustworthiness and accountability vocabulary that standardises common machine learning evaluation metrics, model card items, technical documentation items, and event log items. STAV names and namespaces can be called directly from a Python library and integrated with MLOps frameworks such as MLflow to track machine learning metrics and parameters during system development, register a model in a model registry, and record performance during operation. Documentation of AI components from different vendors can then be queried uniformly using this common metadata scheme. The STAV vocabulary will be available at https://w3id.org/stav/ and the Python library at https://pypi.org/project/stav/.
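The sketch below illustrates the intended integration pattern described above: logging a metric under a shared STAV name through MLflow's standard tracking API. It is a minimal sketch, not the library's confirmed interface; the `stav` import and the `stav.metrics.ACCURACY` identifier are hypothetical names assumed for illustration, while the MLflow calls are the standard tracking functions.

```python
# Minimal sketch: recording a STAV-named metric during an MLflow run.
# Assumption (hypothetical): the stav package exposes vocabulary terms
# such as stav.metrics.ACCURACY as importable string identifiers.
import mlflow
import stav  # hypothetical import of the STAV Python library

with mlflow.start_run(run_name="credit-scoring-v2"):
    # Training parameter recorded under a plain MLflow key.
    mlflow.log_param("model_type", "gradient_boosting")

    # Accuracy measured during evaluation, keyed by the same STAV
    # vocabulary term that other vendors would use, so claimed and
    # recorded values can later be compared with a single query.
    mlflow.log_metric(stav.metrics.ACCURACY, 0.93)
```

Because every component logs the same vocabulary term, an auditor can filter runs on one metric key instead of reconciling each vendor's naming convention.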
