Leest, J., Raibulet, C., Lago, P., Gerostathopoulos, I. (2025). From Tea Leaves to System Maps: A Survey and Framework on Context-aware Machine Learning Monitoring. IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 1-30 [10.1109/TSE.2025.3602520].

From Tea Leaves to System Maps: A Survey and Framework on Context-aware Machine Learning Monitoring

Raibulet, C. (co-first author); 2025

Abstract

Machine learning (ML) models in production fail when their broader systems – from data pipelines to deployment environments – deviate from training assumptions, not merely due to statistical anomalies in input data. Despite extensive work on data drift, data validation, and out-of-distribution detection, ML monitoring research remains largely model-centric while neglecting contextual information: auxiliary signals about the system around the model (external factors, data pipelines, downstream applications). Incorporating this context turns statistical anomalies into actionable alerts and structured root-cause analysis. Drawing on a systematic review of 94 primary studies, we identify three dimensions of contextual information for ML monitoring: the system element concerned (natural environment or technical infrastructure); the aspect of that element (runtime states, structural relationships, prescriptive properties); and the representation used (formal constructs or informal formats). This forms the Contextual System-Aspect-Representation (C-SAR) framework, a descriptive model synthesizing our findings. We identify 20 recurring triplets across these dimensions and map them to the monitoring activities they support. This study provides a holistic perspective on ML monitoring: from interpreting “tea leaves” (i.e., isolated data and performance statistics) to constructing and managing “system maps” (i.e., end-to-end views that connect data, models, and operating context).
Type: Journal article - Review Essay
Keywords: artificial intelligence; machine learning; monitoring; software engineering
Language: English
Date: 26 Aug 2025
Year: 2025
First page: 1
Last page: 30
Access: open
Files in this record:
File: Leest-2025-IEEE Trans Softw Engineer-AAM.pdf
Access: open access
Description: ieee - Post-Publication Policies - https://conferences.ieeeauthorcenter.ieee.org/author-ethics/guidelines-and-policies/post-publication-policies/
Attachment type: Author’s Accepted Manuscript, AAM (Post-print)
License: Publisher-specific open access license
Size: 982.85 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/568963