Recently, Cenzato et al. proposed a new text index, called the suffixient array, which is a subset of the suffix array and supports locating a single pattern occurrence or finding its maximal exact matches (MEMs), assuming random access to the input text T[1..n] is available. They show that, given the suffix array, the longest common prefix array, and the Burrows-Wheeler transform (BWT) of the reverse of T[1..n] over an alphabet {1,…,σ}, a suffixient array can be constructed in linear time. However, their construction algorithms require multiple scans of these arrays. When restricted to a single pass over the arrays, they present an alternative construction algorithm running in O(n + r log σ) time, where r is the number of runs in the BWT of the reversed text. In this paper, we present a new one-pass algorithm that constructs a suffixient array in linear time under the standard RAM model.
Bonizzoni, P., Gao, Y., Riccardi, B. (2026). Constructing Suffixient Arrays Revisited. In 37th Annual Symposium on Combinatorial Pattern Matching (CPM 2026) (pp.1-18) [10.4230/lipics.cpm.2026.30].
Constructing Suffixient Arrays Revisited
Bonizzoni, P;Gao, Y
;Riccardi, B
2026
Abstract
Recently, Cenzato et al. proposed a new text index, called the suffixient array, which is a subset of the suffix array and supports locating a single pattern occurrence or finding its maximal exact matches (MEMs), assuming random access to the input text T[1..n] is available. They show that, given the suffix array, the longest common prefix array, and the Burrows-Wheeler transform (BWT) of the reverse of T[1..n] over an alphabet {1,…,σ}, a suffixient array can be constructed in linear time. However, their construction algorithms require multiple scans of these arrays. When restricted to a single pass over the arrays, they present an alternative construction algorithm running in O(n + r log σ) time, where r is the number of runs in the BWT of the reversed text. In this paper, we present a new one-pass algorithm that constructs a suffixient array in linear time under the standard RAM model.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.


