On Computing the Shattering Coefficient to Analyze the Bias of Supervised Learning Algorithms with Applications to Deep Learning
Rodrigo Mello
Date: 15:15 – 16:00, Tuesday, 17.03.2020
Location: Minnaert – 2.02
Title: On Computing the Shattering Coefficient to Analyze the Bias of Supervised Learning Algorithms with Applications to Deep Learning
Abstract: Statistical Learning Theory (SLT) provides the foundation to ensure that a supervised algorithm generalizes the mapping f : X → Y, given that f is selected from its search-space bias F. SLT relies on the Shattering coefficient function N(F, n) to upper-bound the empirical risk minimization principle, from which one can estimate the training sample size necessary to ensure probabilistic learning convergence and, most importantly, characterize the capacity of F, including its under- and overfitting abilities on specific target problems. However, the precise computation of the Shattering coefficient is still an open problem, which we address in this paper by first estimating the maximal number of hyperplanes required to shatter any data sample, building on recent contributions by Har-Peled and Jones, and then estimating the Shattering coefficient for both binary and multi-class problems. As main contributions, our approach (i) allows one to study the complexity of the search-space bias F, (ii) estimates the necessary training sample sizes, and (iii) parametrizes the number of hyperplanes needed to address a given supervised task, which is especially appealing for deep neural networks, as evidenced by our experiments.
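As a rough illustration of the kind of counting the abstract refers to (a sketch, not the paper's exact derivation), Cover's classical 1965 result gives the number of dichotomies of n points in general position in R^d that a single affine hyperplane can realize; a hedged upper bound for a classifier built from h hyperplanes then follows by taking the product over the hyperplanes. The function names and the product bound below are illustrative assumptions, not the authors' formulas:

```python
from math import comb

def cover_dichotomies(n: int, d: int) -> int:
    """Cover's (1965) function: the number of distinct binary labelings
    of n points in general position in R^d that a single affine
    hyperplane can realize: 2 * sum_{i=0}^{d} C(n-1, i)."""
    return 2 * sum(comb(n - 1, i) for i in range(d + 1))

def shattering_upper_bound(n: int, d: int, h: int) -> int:
    """Illustrative (loose) upper bound on the shattering coefficient of
    a classifier combining h hyperplanes: each hyperplane contributes at
    most cover_dichotomies(n, d) labelings, so their combinations are
    bounded by the product. This is an assumed sketch, not the exact
    estimate derived in the paper."""
    return cover_dichotomies(n, d) ** h

# 3 points in the plane can be fully shattered by a line (VC dim 3):
print(cover_dichotomies(3, 2))  # 8 = 2^3
# 4 points in general position in the plane: 14 of the 16 labelings
# (the two XOR patterns are not linearly separable):
print(cover_dichotomies(4, 2))  # 14
```

Note how the gap between cover_dichotomies(n, d) and 2^n is exactly what characterizes the capacity of a single hyperplane, and raising the count to the power h makes explicit why adding hyperplanes (or neurons) enlarges the bias F.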