Since 1998, AlmaLaurea—a consortium of 80 Italian universities and a member of the Italian National Statistical System—has conducted an annual census on graduates’ employment status. The survey provides estimates of descriptive indicators at both the population level and for specific subpopulations (domains) of interest, such as degree programmes. Some domains have very few observations due to a small population size and non-response. In this paper, we address this estimation problem within a Small Area Estimation framework. Specifically, we propose using generalized linear mixed models that incorporate two variables as proxies for graduates’ response propensity, making the assumption of non-informative non-response more plausible. Degree programme estimates of employment rates are derived as (semi-parametric) empirical best predictions using a finite mixture of logistic regression models, with their mean squared error estimated via a second-order, bias-corrected, analytical estimator. Sensitivity analysis is conducted to assess the explanatory power of variables modelling response propensity and to evaluate potential correlations between area-specific random effects and observed heterogeneity.
Ranalli, G., Pennoni, F., Bartolucci, F., Mira, A. (2025). When non-response makes estimates from a census a small area estimation problem: the case of the survey on graduates’ employment status in Italy. Advances in Data Analysis and Classification, 19(2), 515-543 [10.1007/s11634-025-00630-z].
When non-response makes estimates from a census a small area estimation problem: the case of the survey on graduates’ employment status in Italy
Pennoni, F.
2025
| File | Size | Format |
|---|---|---|
| Ranalli et all-2025-Adv Data Anal Classif-VoR.pdf (open access) | 861.66 kB | Adobe PDF |

Description: This article is licensed under a Creative Commons Attribution 4.0 International License. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Attachment type: Publisher’s Version (Version of Record, VoR)
License: Creative Commons
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


