1

Resource-Limited Automated Ki67 Index Estimation in Breast Cancer

Compressibility Measures for Two-Dimensional Data

In this paper we extend to two-dimensional data two recently introduced one-dimensional compressibility measures: the $\gamma$ measure defined in terms of the smallest string attractor, and the $\delta$ measure defined in terms of the number of …

Engineering a Textbook Approach to Index Massive String Dictionaries

We study the problem of engineering space-time efficient indexes that support membership and lexicographic (rank) queries on very large static dictionaries of strings. Our solution is based on a very simple approach that consists of decoupling string …

Learned monotone minimal perfect hashing

A Monotone Minimal Perfect Hash Function (MMPHF) constructed on a set $S$ of keys is a function that maps each key in $S$ to its rank. On keys not in $S$, the function returns an arbitrary value. Applications range from databases, search engines, …

Computing matching statistics on Wheeler DFAs

Matching statistics were introduced to solve the approximate string matching problem, which is a recurrent subroutine in bioinformatics applications. In 2010, Ohlebusch et al. [SPIRE 2010] proposed a time and space efficient algorithm for computing …

A Critical Analysis of Classifier Selection in Learned Bloom Filters: the Essentials

Compressed string dictionaries via data-aware subtrie compaction

String dictionaries are a core component of a plethora of applications, so it is not surprising that they have been widely and deeply investigated in the literature since the introduction of tries in the '60s. We introduce a new approach to trie …

Learned Sorted Table Search and Static Indexes in Small Model Space

Machine Learning Techniques, properly combined with Data Structures, have resulted in Learned Static Indexes, innovative and powerful tools that speed-up Binary Search, with the use of additional space with respect to the table being searched into. …

On the Suitability of Neural Networks as Building Blocks for the Design of Efficient Learned Indexes

With the aim of obtaining time/space improvements in classic Data Structures, an emerging trend is to combine Machine Learning techniques with the ones proper of Data Structures. This new area goes under the name of Learned Data Structures. The …

On the Choice of General Purpose Classifiers in Learned Bloom Filters: An Initial Analysis Within Basic Filters

Bloom Filters are a fundamental and pervasive data structure. Within the growing area of Learned Data Structures, several Learned versions of Bloom Filters have been considered, yielding advantages over classic Filters. Each of them uses a …