Learning with Feed-forward Neural Networks: Three Schemes to Deal with the Bias/Variance Trade-off
Release | : 2004 |
OCLC | : 650535981 |
Book excerpt: In terms of the Bias/Variance decomposition, very flexible (i.e., complex) Supervised Machine Learning systems may lead to unbiased estimators with high variance. A rigid model, in contrast, may lead to small variance but high bias. There is a trade-off between the bias and variance contributions to the error, and optimal performance is achieved at the point where the two are balanced. In this work we present three schemes related to the control of the Bias/Variance decomposition for Feed-forward Neural Networks (FNNs) with the (sometimes modified) quadratic loss function:
1. An algorithm for sequential approximation with FNNs, named Sequential Approximation with Optimal Coefficients and Interacting Frequencies (SAOCIF). Most of the sequential approximations proposed in the literature select the new frequencies (the non-linear weights) guided by the approximation of the residue of the partial approximation. We propose a sequential algorithm where the new frequency is selected taking into account its interactions with the previously selected ones. The interactions are discovered by means of their optimal coefficients (the linear weights). A number of heuristics can be used to select the new frequencies. The aim is to achieve the same level of approximation with fewer hidden units than when only trying to match the residue as closely as possible. In terms of the Bias/Variance decomposition, it will be possible to obtain simpler models with the same bias. The idea behind SAOCIF can be extended to approximation in Hilbert spaces, maintaining orthogonal-like properties. In this case, the importance of the interacting frequencies lies in the expectation of increasing the rate of approximation.
Experimental results show that the idea of interacting frequencies allows the construction of better approximations than matching the residue.
2. A study and comparison of different criteria to perform Feature Selection (FS) with Multi-Layer Perceptrons (MLPs) and the Sequential Backward Selection (SBS) procedure w.
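
The difference between the two selection rules described in the first scheme (matching the residue versus taking the interactions with previously selected frequencies into account) can be illustrated with a small sketch. This is not the implementation from the work itself: it assumes tanh hidden units, random sampling of candidate frequencies as the selection heuristic, and ordinary least squares to obtain the optimal coefficients. The names `hidden_output` and `fit_sequential`, their parameters, and the toy data are all hypothetical.

```python
import numpy as np


def hidden_output(X, w, b):
    """Output of one tanh hidden unit with frequency (w, b) on inputs X."""
    return np.tanh(X @ w + b)


def fit_sequential(X, y, n_units, n_candidates=50, interacting=True, seed=0):
    """Add hidden units one at a time; return the squared error after each step.

    interacting=True : every candidate is scored after refitting ALL linear
                       coefficients by least squares (optimal coefficients).
    interacting=False: every candidate is scored by how well it alone matches
                       the current residue; earlier coefficients stay fixed.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    H = np.empty((n, 0))   # outputs of the hidden units selected so far
    coef = np.empty(0)     # their linear (output) weights
    errors = []
    for _ in range(n_units):
        residual = y - H @ coef
        best = None
        for _ in range(n_candidates):
            # Assumed heuristic: draw candidate frequencies at random.
            w = 2.0 * rng.normal(size=d)
            b = rng.normal()
            h = hidden_output(X, w, b)
            if interacting:
                H_new = np.column_stack([H, h])
                c, *_ = np.linalg.lstsq(H_new, y, rcond=None)
                err = float(np.sum((y - H_new @ c) ** 2))
            else:
                a = (h @ residual) / (h @ h)  # best single coefficient
                c = np.append(coef, a)
                err = float(np.sum((residual - a * h) ** 2))
            if best is None or err < best[0]:
                best = (err, h, c)
        err, h, coef = best
        H = np.column_stack([H, h])
        errors.append(err)
    return errors


# Toy regression problem (assumed for illustration only).
data_rng = np.random.default_rng(1)
X = data_rng.uniform(-1.0, 1.0, size=(200, 2))
y = np.sin(3.0 * X[:, 0]) + X[:, 1] ** 2

errs_interacting = fit_sequential(X, y, n_units=5, interacting=True)
errs_residue = fit_sequential(X, y, n_units=5, interacting=False)
```

With the same seed both variants see the same candidates at each step, so they coincide on the first unit. On the second unit the interacting rule cannot do worse, because refitting all coefficients includes the residue-matching solution as a special case. After that the selected units may differ, so any comparison at later steps is empirical rather than guaranteed.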