
Cross-validation
This text is a survey on cross-validation. We define all classical cross...

On a Nadaraya-Watson Estimator with Two Bandwidths
In a regression model, we write the Nadaraya-Watson estimator of the reg...

Nonparametric estimation of the first order Sobol indices with bootstrap bandwidth
Suppose that Y = m(X_1, ..., X_p), where (X_1, ..., X_p) are inputs, Y i...

An Efficient Approach for Removing Look-ahead Bias in the Least Square Monte Carlo Algorithm: Leave-One-Out
The least square Monte Carlo (LSM) algorithm proposed by Longstaff and S...

RLeave: an in silico cross-validation protocol for transcript differential expression analysis
Background and Objective: The massive parallel sequencing technology fac...

Cross-validation failure: small sample sizes lead to large error bars
Predictive models ground many state-of-the-art developments in statistic...

Testing Cross-Validation Variants in Ranking Environments
This research investigates how to determine whether two rankings can com...
Bagging cross-validated bandwidth selection in nonparametric regression estimation with applications to large-sized samples
Cross-validation is a well-known and widely used bandwidth selection method in nonparametric regression estimation. However, this technique has two notable drawbacks: (i) the large variability of the selected bandwidths, and (ii) the inability to provide results in a reasonable time for very large sample sizes. To overcome these problems, bagging cross-validation bandwidths are analyzed in this paper. This approach consists in computing the cross-validation bandwidths for a finite number of subsamples and then rescaling the averaged smoothing parameters to the original sample size. Under a random-design regression model, asymptotic expressions up to second order for the bias and variance of the leave-one-out cross-validation bandwidth for the Nadaraya-Watson estimator are obtained. Subsequently, the asymptotic bias and variance and the limit distribution are derived for the bagged cross-validation selector. Suitable choices of the number of subsamples and the subsample size lead to an n^{-1/2} rate for the convergence in distribution of the bagging cross-validation selector, outperforming the rate n^{-3/10} of leave-one-out cross-validation. Several simulations and an illustration on a real dataset related to the COVID-19 pandemic show the behavior of our proposal and its better performance, in terms of statistical efficiency and computing time, when compared to leave-one-out cross-validation.
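The procedure described in the abstract (leave-one-out cross-validation on subsamples, averaging, then rescaling to the full sample size) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The Gaussian kernel, the grid search, the function names, and the simulated data are all assumptions made for the example. The rescaling factor (r/n)^(1/5) is also an assumption: it is the standard n^(-1/5) order of the optimal bandwidth for a second-order kernel, and the abstract does not state the exact factor used in the paper.

```python
import numpy as np


def loo_cv_score(x, y, h):
    """Leave-one-out CV error of the Nadaraya-Watson fit with bandwidth h."""
    u = (x[:, None] - x[None, :]) / h
    w = np.exp(-0.5 * u ** 2)      # Gaussian kernel weights (assumed kernel)
    np.fill_diagonal(w, 0.0)       # exclude each point from its own prediction
    denom = w.sum(axis=1)
    if np.any(denom < 1e-12):      # some point has no effective neighbours
        return np.inf
    pred = (w @ y) / denom
    return float(np.mean((y - pred) ** 2))


def cv_bandwidth(x, y, grid):
    """Bandwidth on the grid that minimises the leave-one-out CV error."""
    scores = [loo_cv_score(x, y, h) for h in grid]
    return float(grid[int(np.argmin(scores))])


def bagged_cv_bandwidth(x, y, r, n_subsamples, grid, rng):
    """Average CV bandwidths over subsamples of size r, rescaled to size n."""
    n = len(x)
    h_sub = []
    for _ in range(n_subsamples):
        idx = rng.choice(n, size=r, replace=False)  # subsample without replacement
        h_sub.append(cv_bandwidth(x[idx], y[idx], grid))
    # Assumed rescaling: optimal bandwidth of order n^(-1/5)
    return float(np.mean(h_sub) * (r / n) ** 0.2)


# Hypothetical simulated data, for illustration only
rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(0.0, 1.0, n)
y = np.sin(2 * np.pi * x) + 0.3 * rng.standard_normal(n)
grid = np.linspace(0.01, 0.5, 40)

h_bag = bagged_cv_bandwidth(x, y, r=200, n_subsamples=20, grid=grid, rng=rng)
print(h_bag)
```

Each subsample needs only an r-by-r weight matrix, so the cost grows with the number of subsamples times r squared rather than with n squared. That is the source of the computing-time advantage for large samples that the abstract refers to.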