Mapping the Italian Third Sector: Insights from Web-augmented Administrative Data

Bottai, Carlo; Trentini, Francesco
2025

Abstract

Reliable data on the Italian Third Sector remain limited despite its socioeconomic importance. This paper presents a methodology for assembling a comprehensive dataset on the sector by systematically exploiting administrative data from the Single National Register of the Third Sector (RUNTS). The proposed pipeline automates the acquisition and cleaning of data on more than 134,000 organisations through a tailored web-scraping procedure. To address common shortcomings of administrative data, such as incompleteness and self-reporting bias, a novel web-augmentation strategy is implemented, combining information extracted from organisational websites with a Generative AI-based classification system. This approach allows the identification of beneficiaries, territorial scope, and economic activities and, for economic activities, enables comparison between web-derived and self-declared information. Using data for the second quarter of 2025, the paper provides an up-to-date picture of the Italian Third Sector and offers a new informational base for socioeconomic research and the co-programming of local policies.
Other
Preprint
Social Economy, Third Sector, Administrative Data, Web-Augmentation, Generative AI, ICNPO
English
2025
https://ssrn.com/abstract=5744522
Bottai, C., Trentini, F. (2025). Mapping the Italian Third Sector: Insights from Web-augmented Administrative Data [Other] [10.2139/ssrn.5744522].
open
Files in this item:
File: Bottai-2025-SSRN-preprint.pdf
Access: open access
Attachment type: Submitted Version (Pre-print)
License: Other
Size: 3.3 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/577147