A novel data-driven approach for Personas validation in healthcare using self-supervised machine learning

Bilo G.
Penultimate author
2025

Abstract

Objective: Persona validation is a challenging task that often relies on costly external validation methods. The aim of this study was to develop a novel method for Personas validation based on data already available during their creation. Methods: A novel approach based on self-supervised machine learning (SSML) was proposed. The data were split into training and test sets (80 % - 20 %), with the training set used for Personas development. The resulting labels were used as input for a 5-fold cross-validation grid search, yielding five different optimal models. The “weak” ground truth for the test set was determined using the trained clustering model and was compared with the prediction obtained by majority voting of the optimal models. Performance was evaluated by means of weighted accuracy, precision, recall and F1 score. Results: The proposed method was evaluated on two very different healthcare datasets consisting of questionnaires. The former included 1070 subjects, resulting in three unbalanced Personas (P0 n = 100; P1 n = 292; P2 n = 464). The latter included 176 subjects, resulting in three slightly unbalanced Personas (P0 n = 58; P1 n = 32; P2 n = 50). The SSML approach correctly differentiated the clusters, with high values of weighted accuracy (88.27 % and 94.12 %), precision (87.11 % and 92.83 %), recall (86.92 % and 91.67 %), and F1 score (86.92 % and 91.76 %). Conclusions: The proposed method generalized well beyond the training data, validating the Personas’ capability of stratifying the characteristics of target populations. Additionally, this method significantly reduced the cost of validating Personas compared with other methods in the current literature.
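The evaluation step described in the Methods (majority voting of the five fold models, compared against the “weak” ground truth from the clustering model) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the tie-breaking rule, and the toy labels are hypothetical, and support-weighted averaging is an assumption, since the abstract does not state which weighting scheme was used.

```python
from collections import Counter


def majority_vote(model_predictions):
    """Combine per-model label lists into one label per sample.

    model_predictions: list of equal-length lists, one list per model.
    Ties go to the smallest label (an arbitrary, assumed rule).
    """
    voted = []
    for sample_votes in zip(*model_predictions):
        counts = Counter(sample_votes)
        top = max(counts.values())
        voted.append(min(label for label, c in counts.items() if c == top))
    return voted


def weighted_scores(y_true, y_pred):
    """Support-weighted precision, recall and F1 over the classes in y_true."""
    n = len(y_true)
    support = Counter(y_true)
    precision = recall = f1 = 0.0
    for label, n_label in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        p_c = tp / predicted if predicted else 0.0
        r_c = tp / n_label
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = n_label / n  # class share of the test set
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return precision, recall, f1


# Hypothetical toy data: 6 test subjects, 3 Personas (0, 1, 2).
weak_truth = [0, 0, 1, 1, 2, 2]  # labels assigned by the clustering model
fold_models = [                  # predictions of the 5 fold classifiers
    [0, 0, 1, 1, 2, 2],
    [0, 1, 1, 1, 2, 2],
    [0, 0, 1, 2, 2, 2],
    [0, 1, 1, 2, 2, 0],
    [0, 1, 1, 2, 2, 2],
]
ensemble = majority_vote(fold_models)
precision, recall, f1 = weighted_scores(weak_truth, ensemble)
print(ensemble, round(precision, 4), round(recall, 4), round(f1, 4))
```

Under support weighting, weighted recall coincides with plain accuracy, so the distinct accuracy and recall values reported in the abstract suggest that the study used a different weighting or a different definition of weighted accuracy.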
Journal article - Scientific article
Clustering; Personalized care; Personas; Self-supervised machine learning; Validation;
English
14-Mar-2025
2025
165
May 2025
104815
open
Tauro, E., Gorini, A., Bilo, G., Caiani, E. (2025). A novel data-driven approach for Personas validation in healthcare using self-supervised machine learning. JOURNAL OF BIOMEDICAL INFORMATICS, 165(May 2025) [10.1016/j.jbi.2025.104815].
Files in this item:
File Size Format
Tauro-2025-Journal of Biomedical Informatics-VoR.pdf

open access

Description: This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Attachment type: Publisher’s Version (Version of Record, VoR)
License: Creative Commons
Size: 1.44 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/551641