Title: Ensemble Partial Least Squares Regression
Description: An algorithmic framework for measuring feature importance, outlier detection, model applicability domain evaluation, and ensemble predictive modeling with (sparse) partial least squares regressions.
Authors: Nan Xiao [aut, cre], Dong-Sheng Cao [aut], Miao-Zhu Li [aut], Qing-Song Xu [aut]
Maintainer: Nan Xiao <[email protected]>
License: GPL-3 | file LICENSE
Version: 6.1
Built: 2024-11-15 03:22:17 UTC
Source: https://github.com/nanxstats/enpls
Methylalkanes retention index dataset from Liang et al.
data("alkanes")
data("alkanes")
A list with 2 components:
x - data frame with 207 rows (samples) and 21 columns (predictors)
y - numeric vector of length 207 (response)
This dataset contains the chromatographic retention index (y) of 207 methylalkanes, modeled by 21 molecular descriptors (x).
Molecular descriptor types:
Chi path, cluster and path/cluster indices
Kappa shape indices
E-state indices
Molecular electricity distance vector index
Yi-Zeng Liang, Da-Lin Yuan, Qing-Song Xu, and Olav Martin Kvalheim. "Modeling based on subspace orthogonal projections for QSAR and QSPR research." Journal of Chemometrics 22, no. 1 (2008): 23–35.
data("alkanes") str(alkanes)
data("alkanes") str(alkanes)
K-fold cross validation for ensemble partial least squares regression.
cv.enpls(x, y, nfolds = 5L, verbose = TRUE, ...)
x | Predictor matrix.
y | Response vector.
nfolds | Number of cross-validation folds. Default is 5L.
verbose | Whether to print the progress of cross-validation. Default is TRUE.
... | Arguments to be passed to enpls.fit.
A list containing:
ypred - a matrix with two columns: real y and predicted y
residual - cross-validation residuals (y.pred - y.real)
RMSE - root mean squared error
MAE - mean absolute error
Rsquare - R-squared
To maximize the probability that each observation is selected into the test set at least once (so that its prediction uncertainty can be measured), try setting a larger reptimes.
Nan Xiao <https://nanx.me>
See enpls.fit for ensemble partial least squares regressions.
data("alkanes") x <- alkanes$x y <- alkanes$y set.seed(42) cvfit <- cv.enpls(x, y, reptimes = 10) print(cvfit) plot(cvfit)
data("alkanes") x <- alkanes$x y <- alkanes$y set.seed(42) cvfit <- cv.enpls(x, y, reptimes = 10) print(cvfit) plot(cvfit)
K-fold cross validation for ensemble sparse partial least squares regression.
cv.enspls(x, y, nfolds = 5L, verbose = TRUE, ...)
x | Predictor matrix.
y | Response vector.
nfolds | Number of cross-validation folds. Default is 5L.
verbose | Whether to print the progress of cross-validation. Default is TRUE.
... | Arguments to be passed to enspls.fit.
A list containing:
ypred - a matrix with two columns: real y and predicted y
residual - cross-validation residuals (y.pred - y.real)
RMSE - root mean squared error
MAE - mean absolute error
Rsquare - R-squared
To maximize the probability that each observation is selected into the test set at least once (so that its prediction uncertainty can be measured), try setting a larger reptimes.
Nan Xiao <https://nanx.me>
See enspls.fit for ensemble sparse partial least squares regressions.
# This example takes one minute to run
## Not run:
data("logd1k")
x <- logd1k$x
y <- logd1k$y

set.seed(42)
cvfit <- cv.enspls(x, y, reptimes = 10)
print(cvfit)
plot(cvfit)
## End(Not run)
Model applicability domain evaluation with ensemble partial least squares.
enpls.ad(
  x,
  y,
  xtest,
  ytest,
  maxcomp = NULL,
  cvfolds = 5L,
  space = c("sample", "variable"),
  method = c("mc", "boot"),
  reptimes = 500L,
  ratio = 0.8,
  parallel = 1L
)
x | Predictor matrix of the training set.
y | Response vector of the training set.
xtest | List, with the i-th component being the i-th test set's predictor matrix (see example code below).
ytest | List, with the i-th component being the i-th test set's response vector (see example code below).
maxcomp | Maximum number of components included within each model. If not specified, will use the maximum number possible (considering cross-validation and special cases where n is smaller than p).
cvfolds | Number of cross-validation folds used in each model for automatic parameter selection. Default is 5L.
space | Space in which to apply the resampling method. Can be the sample space ("sample") or the variable space ("variable"). Default is "sample".
method | Resampling method. "mc" (Monte-Carlo resampling) or "boot" (bootstrapping). Default is "mc".
reptimes | Number of models to build with Monte-Carlo resampling or bootstrapping. Default is 500L.
ratio | Sampling ratio used when method = "mc". Default is 0.8.
parallel | Integer. Number of CPU cores to use. Default is 1L.
A list containing:
tr.error.mean - absolute mean prediction error for the training set
tr.error.median - absolute median prediction error for the training set
tr.error.sd - prediction error sd for the training set
tr.error.matrix - raw prediction error matrix for the training set
te.error.mean - list of absolute mean prediction errors for the test set(s)
te.error.median - list of absolute median prediction errors for the test set(s)
te.error.sd - list of prediction error sds for the test set(s)
te.error.matrix - list of raw prediction error matrices for the test set(s)
Note that when space = "variable", method can only be "mc", since bootstrapping in the variable space would create duplicated variables, which causes problems.
Nan Xiao <https://nanx.me>
data("alkanes") x <- alkanes$x y <- alkanes$y # training set x.tr <- x[1:100, ] y.tr <- y[1:100] # two test sets x.te <- list( "test.1" = x[101:150, ], "test.2" = x[151:207, ] ) y.te <- list( "test.1" = y[101:150], "test.2" = y[151:207] ) set.seed(42) ad <- enpls.ad( x.tr, y.tr, x.te, y.te, space = "variable", method = "mc", ratio = 0.9, reptimes = 50 ) print(ad) plot(ad) # the interactive plot requires a HTML viewer ## Not run: plot(ad, type = "interactive") ## End(Not run)
data("alkanes") x <- alkanes$x y <- alkanes$y # training set x.tr <- x[1:100, ] y.tr <- y[1:100] # two test sets x.te <- list( "test.1" = x[101:150, ], "test.2" = x[151:207, ] ) y.te <- list( "test.1" = y[101:150], "test.2" = y[151:207] ) set.seed(42) ad <- enpls.ad( x.tr, y.tr, x.te, y.te, space = "variable", method = "mc", ratio = 0.9, reptimes = 50 ) print(ad) plot(ad) # the interactive plot requires a HTML viewer ## Not run: plot(ad, type = "interactive") ## End(Not run)
Ensemble partial least squares regression.
enpls.fit(
  x,
  y,
  maxcomp = NULL,
  cvfolds = 5L,
  reptimes = 500L,
  method = c("mc", "boot"),
  ratio = 0.8,
  parallel = 1L
)
x | Predictor matrix.
y | Response vector.
maxcomp | Maximum number of components included within each model. If not specified, will use the maximum number possible (considering cross-validation and special cases where n is smaller than p).
cvfolds | Number of cross-validation folds used in each model for automatic parameter selection. Default is 5L.
reptimes | Number of models to build with Monte-Carlo resampling or bootstrapping. Default is 500L.
method | Resampling method. "mc" (Monte-Carlo resampling) or "boot" (bootstrapping). Default is "mc".
ratio | Sampling ratio used when method = "mc". Default is 0.8.
parallel | Integer. Number of CPU cores to use. Default is 1L.
A list containing all partial least squares model objects.
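To illustrate the general idea behind this list of models, here is a hypothetical sketch of the Monte-Carlo ensemble scheme (not the package's internal code; it assumes the pls package and fixes ncomp = 2 for brevity):

library(pls)

data("alkanes")
x <- alkanes$x
y <- alkanes$y

# Repeatedly draw a Monte-Carlo subsample and fit one PLS model on it;
# the collection of fitted models forms the ensemble.
reptimes <- 10
ratio <- 0.8
models <- vector("list", reptimes)
for (i in seq_len(reptimes)) {
  idx <- sample(nrow(x), size = floor(nrow(x) * ratio))
  dat <- data.frame(y = y[idx], x[idx, ])
  models[[i]] <- plsr(y ~ ., data = dat, ncomp = 2)
}
# An ensemble prediction is an aggregate (mean or median)
# of the per-model predictions.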
Nan Xiao <https://nanx.me>
See enpls.fs for measuring feature importance with ensemble partial least squares regressions. See enpls.od for outlier detection with ensemble partial least squares regressions.
data("alkanes") x <- alkanes$x y <- alkanes$y set.seed(42) fit <- enpls.fit(x, y, reptimes = 50) print(fit) predict(fit, newx = x)
data("alkanes") x <- alkanes$x y <- alkanes$y set.seed(42) fit <- enpls.fit(x, y, reptimes = 50) print(fit) predict(fit, newx = x)
Measuring feature importance with ensemble partial least squares.
enpls.fs(
  x,
  y,
  maxcomp = NULL,
  cvfolds = 5L,
  reptimes = 500L,
  method = c("mc", "boot"),
  ratio = 0.8,
  parallel = 1L
)
x | Predictor matrix.
y | Response vector.
maxcomp | Maximum number of components included within each model. If not specified, will use the maximum number possible (considering cross-validation and special cases where n is smaller than p).
cvfolds | Number of cross-validation folds used in each model for automatic parameter selection. Default is 5L.
reptimes | Number of models to build with Monte-Carlo resampling or bootstrapping. Default is 500L.
method | Resampling method. "mc" (Monte-Carlo resampling) or "boot" (bootstrapping). Default is "mc".
ratio | Sampling ratio used when method = "mc". Default is 0.8.
parallel | Integer. Number of CPU cores to use. Default is 1L.
A list containing two components:
variable.importance - a vector of variable importance
coefficient.matrix - original coefficient matrix
Nan Xiao <https://nanx.me>
See enpls.od for outlier detection with ensemble partial least squares regressions. See enpls.fit for fitting ensemble partial least squares regression models.
data("alkanes") x <- alkanes$x y <- alkanes$y set.seed(42) fs <- enpls.fs(x, y, reptimes = 50) print(fs) plot(fs)
data("alkanes") x <- alkanes$x y <- alkanes$y set.seed(42) fs <- enpls.fs(x, y, reptimes = 50) print(fs) plot(fs)
Mean Absolute Error (MAE)
enpls.mae(yreal, ypred)
yreal | True response vector.
ypred | Predicted response vector.
MAE
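MAE is the standard mean absolute error; a one-line equivalent of the formula:

# MAE = mean(|yreal - ypred|)
mean(abs(yreal - ypred))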
Nan Xiao <https://nanx.me>
Outlier detection with ensemble partial least squares.
enpls.od(
  x,
  y,
  maxcomp = NULL,
  cvfolds = 5L,
  reptimes = 500L,
  method = c("mc", "boot"),
  ratio = 0.8,
  parallel = 1L
)
x | Predictor matrix.
y | Response vector.
maxcomp | Maximum number of components included within each model. If not specified, will use the maximum number possible (considering cross-validation and special cases where n is smaller than p).
cvfolds | Number of cross-validation folds used in each model for automatic parameter selection. Default is 5L.
reptimes | Number of models to build with Monte-Carlo resampling or bootstrapping. Default is 500L.
method | Resampling method. "mc" (Monte-Carlo resampling) or "boot" (bootstrapping). Default is "mc".
ratio | Sampling ratio used when method = "mc". Default is 0.8.
parallel | Integer. Number of CPU cores to use. Default is 1L.
A list containing four components:
error.mean - error mean for all samples (absolute value)
error.median - error median for all samples
error.sd - error sd for all samples
predict.error.matrix - the original prediction error matrix
To maximize the probability that each observation is selected into the test set at least once (so that its prediction uncertainty can be measured), try setting a larger reptimes.
Nan Xiao <https://nanx.me>
See enpls.fs for measuring feature importance with ensemble partial least squares regressions. See enpls.fit for fitting ensemble partial least squares regression models.
data("alkanes") x <- alkanes$x y <- alkanes$y set.seed(42) od <- enpls.od(x, y, reptimes = 50) print(od) plot(od) plot(od, criterion = "sd")
data("alkanes") x <- alkanes$x y <- alkanes$y set.seed(42) od <- enpls.od(x, y, reptimes = 50) print(od) plot(od) plot(od, criterion = "sd")
Compute Root Mean Squared Error (RMSE).
enpls.rmse(yreal, ypred)
yreal | True response vector.
ypred | Predicted response vector.
RMSE
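RMSE is the standard root mean squared error; a one-line equivalent of the formula:

# RMSE = sqrt(mean((yreal - ypred)^2))
sqrt(mean((yreal - ypred)^2))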
Nan Xiao <https://nanx.me>
Root Mean Squared Logarithmic Error (RMSLE)
enpls.rmsle(yreal, ypred)
yreal | True response vector.
ypred | Predicted response vector.
RMSLE
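RMSLE applies the log(1 + x) transform before taking the RMSE; a one-line sketch of the usual definition:

# RMSLE = sqrt(mean((log(yreal + 1) - log(ypred + 1))^2))
sqrt(mean((log1p(yreal) - log1p(ypred))^2))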
Nan Xiao <https://nanx.me>
Model applicability domain evaluation with ensemble sparse partial least squares.
enspls.ad(
  x,
  y,
  xtest,
  ytest,
  maxcomp = 5L,
  cvfolds = 5L,
  alpha = seq(0.2, 0.8, 0.2),
  space = c("sample", "variable"),
  method = c("mc", "boot"),
  reptimes = 500L,
  ratio = 0.8,
  parallel = 1L
)
x | Predictor matrix of the training set.
y | Response vector of the training set.
xtest | List, with the i-th component being the i-th test set's predictor matrix (see example code below).
ytest | List, with the i-th component being the i-th test set's response vector (see example code below).
maxcomp | Maximum number of components included within each model. If not specified, will use 5L.
cvfolds | Number of cross-validation folds used in each model for automatic parameter selection. Default is 5L.
alpha | Parameter (grid) controlling sparsity of the model. If not specified, default is seq(0.2, 0.8, 0.2).
space | Space in which to apply the resampling method. Can be the sample space ("sample") or the variable space ("variable"). Default is "sample".
method | Resampling method. "mc" (Monte-Carlo resampling) or "boot" (bootstrapping). Default is "mc".
reptimes | Number of models to build with Monte-Carlo resampling or bootstrapping. Default is 500L.
ratio | Sampling ratio used when method = "mc". Default is 0.8.
parallel | Integer. Number of CPU cores to use. Default is 1L.
A list containing:
tr.error.mean - absolute mean prediction error for the training set
tr.error.median - absolute median prediction error for the training set
tr.error.sd - prediction error sd for the training set
tr.error.matrix - raw prediction error matrix for the training set
te.error.mean - list of absolute mean prediction errors for the test set(s)
te.error.median - list of absolute median prediction errors for the test set(s)
te.error.sd - list of prediction error sds for the test set(s)
te.error.matrix - list of raw prediction error matrices for the test set(s)
Note that when space = "variable", method can only be "mc", since bootstrapping in the variable space would create duplicated variables, which causes problems.
Nan Xiao <https://nanx.me>
data("logd1k") # remove low variance variables x <- logd1k$x[, -c(17, 52, 59)] y <- logd1k$y # training set x.tr <- x[1:300, ] y.tr <- y[1:300] # two test sets x.te <- list( "test.1" = x[301:400, ], "test.2" = x[401:500, ] ) y.te <- list( "test.1" = y[301:400], "test.2" = y[401:500] ) set.seed(42) ad <- enspls.ad( x.tr, y.tr, x.te, y.te, maxcomp = 3, alpha = c(0.3, 0.6, 0.9), space = "variable", method = "mc", ratio = 0.8, reptimes = 10 ) print(ad) plot(ad) # the interactive plot requires a HTML viewer ## Not run: plot(ad, type = "interactive") ## End(Not run)
data("logd1k") # remove low variance variables x <- logd1k$x[, -c(17, 52, 59)] y <- logd1k$y # training set x.tr <- x[1:300, ] y.tr <- y[1:300] # two test sets x.te <- list( "test.1" = x[301:400, ], "test.2" = x[401:500, ] ) y.te <- list( "test.1" = y[301:400], "test.2" = y[401:500] ) set.seed(42) ad <- enspls.ad( x.tr, y.tr, x.te, y.te, maxcomp = 3, alpha = c(0.3, 0.6, 0.9), space = "variable", method = "mc", ratio = 0.8, reptimes = 10 ) print(ad) plot(ad) # the interactive plot requires a HTML viewer ## Not run: plot(ad, type = "interactive") ## End(Not run)
Ensemble sparse partial least squares regression.
enspls.fit(
  x,
  y,
  maxcomp = 5L,
  cvfolds = 5L,
  alpha = seq(0.2, 0.8, 0.2),
  reptimes = 500L,
  method = c("mc", "boot"),
  ratio = 0.8,
  parallel = 1L
)
x | Predictor matrix.
y | Response vector.
maxcomp | Maximum number of components included within each model. If not specified, will use 5L.
cvfolds | Number of cross-validation folds used in each model for automatic parameter selection. Default is 5L.
alpha | Parameter (grid) controlling sparsity of the model. If not specified, default is seq(0.2, 0.8, 0.2).
reptimes | Number of models to build with Monte-Carlo resampling or bootstrapping. Default is 500L.
method | Resampling method. "mc" (Monte-Carlo resampling) or "boot" (bootstrapping). Default is "mc".
ratio | Sampling ratio used when method = "mc". Default is 0.8.
parallel | Integer. Number of CPU cores to use. Default is 1L.
A list containing all sparse partial least squares model objects.
Nan Xiao <https://nanx.me>
See enspls.fs for measuring feature importance with ensemble sparse partial least squares regressions. See enspls.od for outlier detection with ensemble sparse partial least squares regressions.
data("logd1k") x <- logd1k$x y <- logd1k$y set.seed(42) fit <- enspls.fit( x, y, reptimes = 5, maxcomp = 3, alpha = c(0.3, 0.6, 0.9) ) print(fit) predict(fit, newx = x)
data("logd1k") x <- logd1k$x y <- logd1k$y set.seed(42) fit <- enspls.fit( x, y, reptimes = 5, maxcomp = 3, alpha = c(0.3, 0.6, 0.9) ) print(fit) predict(fit, newx = x)
Measuring feature importance with ensemble sparse partial least squares.
enspls.fs(
  x,
  y,
  maxcomp = 5L,
  cvfolds = 5L,
  alpha = seq(0.2, 0.8, 0.2),
  reptimes = 500L,
  method = c("mc", "boot"),
  ratio = 0.8,
  parallel = 1L
)
x | Predictor matrix.
y | Response vector.
maxcomp | Maximum number of components included within each model. If not specified, will use 5L.
cvfolds | Number of cross-validation folds used in each model for automatic parameter selection. Default is 5L.
alpha | Parameter (grid) controlling sparsity of the model. If not specified, default is seq(0.2, 0.8, 0.2).
reptimes | Number of models to build with Monte-Carlo resampling or bootstrapping. Default is 500L.
method | Resampling method. "mc" (Monte-Carlo resampling) or "boot" (bootstrapping). Default is "mc".
ratio | Sampling ratio used when method = "mc". Default is 0.8.
parallel | Integer. Number of CPU cores to use. Default is 1L.
A list containing two components:
variable.importance - a vector of variable importance
coefficient.matrix - original coefficient matrix
Nan Xiao <https://nanx.me>
See enspls.od for outlier detection with ensemble sparse partial least squares regressions. See enspls.fit for fitting ensemble sparse partial least squares regression models.
data("logd1k") x <- logd1k$x y <- logd1k$y set.seed(42) fs <- enspls.fs(x, y, reptimes = 5, maxcomp = 2) print(fs, nvar = 10) plot(fs, nvar = 10) plot(fs, type = "boxplot", limits = c(0.05, 0.95), nvar = 10)
data("logd1k") x <- logd1k$x y <- logd1k$y set.seed(42) fs <- enspls.fs(x, y, reptimes = 5, maxcomp = 2) print(fs, nvar = 10) plot(fs, nvar = 10) plot(fs, type = "boxplot", limits = c(0.05, 0.95), nvar = 10)
Outlier detection with ensemble sparse partial least squares.
enspls.od(
  x,
  y,
  maxcomp = 5L,
  cvfolds = 5L,
  alpha = seq(0.2, 0.8, 0.2),
  reptimes = 500L,
  method = c("mc", "boot"),
  ratio = 0.8,
  parallel = 1L
)
x | Predictor matrix.
y | Response vector.
maxcomp | Maximum number of components included within each model. If not specified, will use 5L.
cvfolds | Number of cross-validation folds used in each model for automatic parameter selection. Default is 5L.
alpha | Parameter (grid) controlling sparsity of the model. If not specified, default is seq(0.2, 0.8, 0.2).
reptimes | Number of models to build with Monte-Carlo resampling or bootstrapping. Default is 500L.
method | Resampling method. "mc" (Monte-Carlo resampling) or "boot" (bootstrapping). Default is "mc".
ratio | Sampling ratio used when method = "mc". Default is 0.8.
parallel | Integer. Number of CPU cores to use. Default is 1L.
A list containing four components:
error.mean - error mean for all samples (absolute value)
error.median - error median for all samples
error.sd - error sd for all samples
predict.error.matrix - the original prediction error matrix
To maximize the probability that each observation is selected into the test set at least once (so that its prediction uncertainty can be measured), try setting a larger reptimes.
Nan Xiao <https://nanx.me>
See enspls.fs for measuring feature importance with ensemble sparse partial least squares regressions. See enspls.fit for fitting ensemble sparse partial least squares regression models.
data("logd1k") x <- logd1k$x y <- logd1k$y set.seed(42) od <- enspls.od( x, y, reptimes = 5, maxcomp = 3, alpha = c(0.3, 0.6, 0.9) ) plot(od, prob = 0.1) plot(od, criterion = "sd", sdtimes = 1)
data("logd1k") x <- logd1k$x y <- logd1k$y set.seed(42) od <- enspls.od( x, y, reptimes = 5, maxcomp = 3, alpha = c(0.3, 0.6, 0.9) ) plot(od, prob = 0.1) plot(od, criterion = "sd", sdtimes = 1)
Distribution coefficients at pH 7.4 (logD7.4) dataset from Wang et al.
data(logd1k)
A list with 2 components:
x - data frame with 1,000 rows (samples) and 80 columns (predictors)
y - numeric vector of length 1,000 (response)
The first 1,000 compounds in the original dataset were selected.
This dataset contains distribution coefficients at pH 7.4 (logD7.4) for 1,000 compounds, and 80 molecular descriptors computed with RDKit.
Jian-Bing Wang, Dong-Sheng Cao, Min-Feng Zhu, Yong-Huan Yun, Nan Xiao, and Yi-Zeng Liang. "In silico evaluation of logD7.4 and comparison with other prediction methods." Journal of Chemometrics 29, no. 7 (2015): 389–398.
data(logd1k)
str(logd1k)
Plot cv.enpls object
## S3 method for class 'cv.enpls'
plot(x, xlim = NULL, ylim = NULL, alpha = 0.8, main = NULL, ...)
x | An object of class cv.enpls.
xlim | Vector of length 2. x-axis limits of the plot.
ylim | Vector of length 2. y-axis limits of the plot.
alpha | An alpha transparency value for points, a real number in (0, 1].
main | Plot title, not used currently.
... | Additional graphical parameters, not used currently.
Nan Xiao <https://nanx.me>
See cv.enpls for cross-validation of ensemble partial least squares regression models.
data("alkanes") x <- alkanes$x y <- alkanes$y set.seed(42) cvfit <- cv.enpls(x, y, reptimes = 10) plot(cvfit)
data("alkanes") x <- alkanes$x y <- alkanes$y set.seed(42) cvfit <- cv.enpls(x, y, reptimes = 10) plot(cvfit)
Plot cv.enspls object
## S3 method for class 'cv.enspls'
plot(x, xlim = NULL, ylim = NULL, alpha = 0.8, main = NULL, ...)
x | An object of class cv.enspls.
xlim | Vector of length 2. x-axis limits of the plot.
ylim | Vector of length 2. y-axis limits of the plot.
alpha | An alpha transparency value for points, a real number in (0, 1].
main | Plot title, not used currently.
... | Additional graphical parameters, not used currently.
Nan Xiao <https://nanx.me>
See cv.enspls for cross-validation of ensemble sparse partial least squares regression models.
# This example takes one minute to run
## Not run:
data("logd1k")
x <- logd1k$x
y <- logd1k$y

set.seed(42)
cvfit <- cv.enspls(x, y, reptimes = 10)
plot(cvfit)
## End(Not run)
Plot enpls.ad object
## S3 method for class 'enpls.ad'
plot(x, type = c("static", "interactive"), main = NULL, ...)
x | An object of class enpls.ad.
type | Plot type. Can be "static" or "interactive". Default is "static".
main | Plot title, not used currently.
... | Additional graphical parameters, not used currently.
Nan Xiao <https://nanx.me>
See enpls.ad for model applicability domain evaluation with ensemble partial least squares regressions.
data("alkanes") x <- alkanes$x y <- alkanes$y # training set x.tr <- x[1:100, ] y.tr <- y[1:100] # two test sets x.te <- list( "test.1" = x[101:150, ], "test.2" = x[151:207, ] ) y.te <- list( "test.1" = y[101:150], "test.2" = y[151:207] ) set.seed(42) ad <- enpls.ad( x.tr, y.tr, x.te, y.te, space = "variable", method = "mc", ratio = 0.9, reptimes = 50 ) plot(ad) # the interactive plot requires a HTML viewer ## Not run: plot(ad, type = "interactive") ## End(Not run)
data("alkanes") x <- alkanes$x y <- alkanes$y # training set x.tr <- x[1:100, ] y.tr <- y[1:100] # two test sets x.te <- list( "test.1" = x[101:150, ], "test.2" = x[151:207, ] ) y.te <- list( "test.1" = y[101:150], "test.2" = y[151:207] ) set.seed(42) ad <- enpls.ad( x.tr, y.tr, x.te, y.te, space = "variable", method = "mc", ratio = 0.9, reptimes = 50 ) plot(ad) # the interactive plot requires a HTML viewer ## Not run: plot(ad, type = "interactive") ## End(Not run)
Plot enpls.fs object
## S3 method for class 'enpls.fs'
plot(
  x,
  nvar = NULL,
  type = c("dotplot", "boxplot"),
  limits = c(0, 1),
  main = NULL,
  ...
)
x | An object of class enpls.fs.
nvar | Number of top variables to show. If not specified, all variables are shown.
type | Plot type. Can be "dotplot" or "boxplot". Default is "dotplot".
limits | Vector of length 2. Set boxplot limits (in quantile) to remove the extreme outlier coefficients.
main | Plot title, not used currently.
... | Additional graphical parameters, not used currently.
Nan Xiao <https://nanx.me>
See enpls.fs for measuring feature importance with ensemble partial least squares regressions.
data("alkanes") x <- alkanes$x y <- alkanes$y set.seed(42) fs <- enpls.fs(x, y, reptimes = 50) plot(fs) plot(fs, nvar = 10) plot(fs, type = "boxplot") plot(fs, type = "boxplot", limits = c(0.05, 0.95))
data("alkanes") x <- alkanes$x y <- alkanes$y set.seed(42) fs <- enpls.fs(x, y, reptimes = 50) plot(fs) plot(fs, nvar = 10) plot(fs, type = "boxplot") plot(fs, type = "boxplot", limits = c(0.05, 0.95))
Plot enpls.od object
## S3 method for class 'enpls.od'
plot(
  x,
  criterion = c("quantile", "sd"),
  prob = 0.05,
  sdtimes = 3L,
  alpha = 1,
  main = NULL,
  ...
)
x | An object of class enpls.od.
criterion | Criterion of being classified as an outlier. Can be "quantile" or "sd". Default is "quantile".
prob | Quantile probability as the cut-off value. Default is 0.05.
sdtimes | Times of standard deviation as the cut-off value. Default is 3L.
alpha | An alpha transparency value for points, a real number in (0, 1].
main | Plot title.
... | Additional graphical parameters for plot().
Nan Xiao <https://nanx.me>
See enpls.od for outlier detection with ensemble partial least squares regressions.
data("alkanes") x <- alkanes$x y <- alkanes$y set.seed(42) od <- enpls.od(x, y, reptimes = 50) plot(od, criterion = "quantile") plot(od, criterion = "sd")
data("alkanes") x <- alkanes$x y <- alkanes$y set.seed(42) od <- enpls.od(x, y, reptimes = 50) plot(od, criterion = "quantile") plot(od, criterion = "sd")
Plot enspls.ad object
## S3 method for class 'enspls.ad'
plot(x, type = c("static", "interactive"), main = NULL, ...)
x | An object of class enspls.ad.
type | Plot type. Can be "static" or "interactive". Default is "static".
main | Plot title.
... | Additional graphical parameters for plot().
Nan Xiao <https://nanx.me>
See enspls.ad for model applicability domain evaluation with ensemble sparse partial least squares regressions.
data("logd1k") # remove low variance variables x <- logd1k$x[, -c(17, 52, 59)] y <- logd1k$y # training set x.tr <- x[1:300, ] y.tr <- y[1:300] # two test sets x.te <- list( "test.1" = x[301:400, ], "test.2" = x[401:500, ] ) y.te <- list( "test.1" = y[301:400], "test.2" = y[401:500] ) set.seed(42) ad <- enspls.ad( x.tr, y.tr, x.te, y.te, maxcomp = 3, alpha = c(0.3, 0.6, 0.9), space = "variable", method = "mc", ratio = 0.8, reptimes = 10 ) plot(ad) # the interactive plot requires a HTML viewer ## Not run: plot(ad, type = "interactive") ## End(Not run)
data("logd1k") # remove low variance variables x <- logd1k$x[, -c(17, 52, 59)] y <- logd1k$y # training set x.tr <- x[1:300, ] y.tr <- y[1:300] # two test sets x.te <- list( "test.1" = x[301:400, ], "test.2" = x[401:500, ] ) y.te <- list( "test.1" = y[301:400], "test.2" = y[401:500] ) set.seed(42) ad <- enspls.ad( x.tr, y.tr, x.te, y.te, maxcomp = 3, alpha = c(0.3, 0.6, 0.9), space = "variable", method = "mc", ratio = 0.8, reptimes = 10 ) plot(ad) # the interactive plot requires a HTML viewer ## Not run: plot(ad, type = "interactive") ## End(Not run)
Plot enspls.fs object
## S3 method for class 'enspls.fs'
plot(
  x,
  nvar = NULL,
  type = c("dotplot", "boxplot"),
  limits = c(0, 1),
  main = NULL,
  ...
)
x | An object of class enspls.fs.
nvar | Number of top variables to show. If not specified, all variables are shown.
type | Plot type. Can be "dotplot" or "boxplot". Default is "dotplot".
limits | Vector of length 2. Set boxplot limits (in quantile) to remove the extreme outlier coefficients.
main | Plot title, not used currently.
... | Additional graphical parameters, not used currently.
Nan Xiao <https://nanx.me>
See enspls.fs for measuring feature importance with ensemble sparse partial least squares regressions.
data("logd1k") x <- logd1k$x y <- logd1k$y set.seed(42) fs <- enspls.fs(x, y, reptimes = 5, maxcomp = 2) plot(fs, nvar = 10) plot(fs, type = "boxplot", limits = c(0.05, 0.95), nvar = 10)
data("logd1k") x <- logd1k$x y <- logd1k$y set.seed(42) fs <- enspls.fs(x, y, reptimes = 5, maxcomp = 2) plot(fs, nvar = 10) plot(fs, type = "boxplot", limits = c(0.05, 0.95), nvar = 10)
Plot enspls.od object
## S3 method for class 'enspls.od'
plot(
  x,
  criterion = c("quantile", "sd"),
  prob = 0.05,
  sdtimes = 3L,
  alpha = 1,
  main = NULL,
  ...
)
x | An object of class enspls.od.
criterion | Criterion of being classified as an outlier. Can be "quantile" or "sd". Default is "quantile".
prob | Quantile probability as the cut-off value. Default is 0.05.
sdtimes | Times of standard deviation as the cut-off value. Default is 3L.
alpha | An alpha transparency value for points, a real number in (0, 1].
main | Plot title.
... | Additional graphical parameters for plot().
Nan Xiao <https://nanx.me>
See enspls.od for outlier detection with ensemble sparse partial least squares regressions.
data("logd1k") x <- logd1k$x y <- logd1k$y set.seed(42) od <- enspls.od(x, y, reptimes = 4, maxcomp = 2) plot(od, criterion = "quantile", prob = 0.1) plot(od, criterion = "sd", sdtimes = 1)
data("logd1k") x <- logd1k$x y <- logd1k$y set.seed(42) od <- enspls.od(x, y, reptimes = 4, maxcomp = 2) plot(od, criterion = "quantile", prob = 0.1) plot(od, criterion = "sd", sdtimes = 1)
Make predictions on new data with a fitted enpls.fit object.
## S3 method for class 'enpls.fit'
predict(object, newx, method = c("mean", "median"), ...)
object | An object of class enpls.fit.
newx | New data to predict with.
method | Use "mean" or "median" to aggregate the predictions from all models in the ensemble. Default is "mean".
... | Additional parameters for predict().
A numeric vector containing the predicted values.
Nan Xiao <https://nanx.me>
See enpls.fit for fitting ensemble partial least squares regression models.
data("alkanes") x <- alkanes$x y <- alkanes$y set.seed(42) fit <- enpls.fit(x, y, reptimes = 50) y.pred <- predict(fit, newx = x) plot(y, y.pred, xlim = range(y), ylim = range(y)) abline(a = 0L, b = 1L) y.pred.med <- predict(fit, newx = x, method = "median") plot(y, y.pred.med, xlim = range(y), ylim = range(y)) abline(a = 0L, b = 1L)
data("alkanes") x <- alkanes$x y <- alkanes$y set.seed(42) fit <- enpls.fit(x, y, reptimes = 50) y.pred <- predict(fit, newx = x) plot(y, y.pred, xlim = range(y), ylim = range(y)) abline(a = 0L, b = 1L) y.pred.med <- predict(fit, newx = x, method = "median") plot(y, y.pred.med, xlim = range(y), ylim = range(y)) abline(a = 0L, b = 1L)
Make predictions on new data with a fitted enspls.fit object.
## S3 method for class 'enspls.fit'
predict(object, newx, method = c("mean", "median"), ...)
object | An object of class enspls.fit.
newx | New data to predict with.
method | Use "mean" or "median" to aggregate the predictions from all models in the ensemble. Default is "mean".
... | Additional parameters for predict().
A numeric vector containing the predicted values.
Nan Xiao <https://nanx.me>
See enspls.fit for fitting ensemble sparse partial least squares regression models.
data("logd1k") x <- logd1k$x y <- logd1k$y set.seed(42) fit <- enspls.fit(x, y, reptimes = 5, maxcomp = 2) y.pred <- predict(fit, newx = x) plot(y, y.pred, xlim = range(y), ylim = range(y)) abline(a = 0L, b = 1L) y.pred.med <- predict(fit, newx = x, method = "median") plot(y, y.pred.med, xlim = range(y), ylim = range(y)) abline(a = 0L, b = 1L)
data("logd1k") x <- logd1k$x y <- logd1k$y set.seed(42) fit <- enspls.fit(x, y, reptimes = 5, maxcomp = 2) y.pred <- predict(fit, newx = x) plot(y, y.pred, xlim = range(y), ylim = range(y)) abline(a = 0L, b = 1L) y.pred.med <- predict(fit, newx = x, method = "median") plot(y, y.pred.med, xlim = range(y), ylim = range(y)) abline(a = 0L, b = 1L)
Print cv.enpls object.
## S3 method for class 'cv.enpls'
print(x, ...)
x | An object of class cv.enpls.
... | Additional parameters for print().
Nan Xiao <https://nanx.me>
See cv.enpls for cross-validation of ensemble partial least squares regression models.
data("alkanes") x <- alkanes$x y <- alkanes$y set.seed(42) cvfit <- cv.enpls(x, y, reptimes = 10) cvfit
data("alkanes") x <- alkanes$x y <- alkanes$y set.seed(42) cvfit <- cv.enpls(x, y, reptimes = 10) cvfit
Print cv.enspls object.
## S3 method for class 'cv.enspls'
print(x, ...)
x | An object of class cv.enspls.
... | Additional parameters for print().
Nan Xiao <https://nanx.me>
See cv.enspls for cross-validation of ensemble sparse partial least squares regression models.
# This example takes one minute to run
## Not run:
data("logd1k")
x <- logd1k$x
y <- logd1k$y

set.seed(42)
cvfit <- cv.enspls(x, y, reptimes = 10)
print(cvfit)
## End(Not run)
Print enpls.ad object.
## S3 method for class 'enpls.ad'
print(x, ...)
x | An object of class enpls.ad.
... | Additional parameters for print().
Nan Xiao <https://nanx.me>
See enpls.ad for model applicability domain evaluation with ensemble partial least squares regressions.
data("alkanes") x <- alkanes$x y <- alkanes$y # training set x.tr <- x[1:100, ] y.tr <- y[1:100] # two test sets x.te <- list( "test.1" = x[101:150, ], "test.2" = x[151:207, ] ) y.te <- list( "test.1" = y[101:150], "test.2" = y[151:207] ) set.seed(42) ad <- enpls.ad( x.tr, y.tr, x.te, y.te, space = "variable", method = "mc", ratio = 0.9, reptimes = 50 ) ad
data("alkanes") x <- alkanes$x y <- alkanes$y # training set x.tr <- x[1:100, ] y.tr <- y[1:100] # two test sets x.te <- list( "test.1" = x[101:150, ], "test.2" = x[151:207, ] ) y.te <- list( "test.1" = y[101:150], "test.2" = y[151:207] ) set.seed(42) ad <- enpls.ad( x.tr, y.tr, x.te, y.te, space = "variable", method = "mc", ratio = 0.9, reptimes = 50 ) ad
Print coefficients of each model in the enpls.fit object.
## S3 method for class 'enpls.fit'
print(x, ...)
x | An object of class enpls.fit.
... | Additional parameters for print().
Nan Xiao <https://nanx.me>
See enpls.fit for fitting ensemble partial least squares regression models.
data("alkanes") x <- alkanes$x y <- alkanes$y set.seed(42) fit <- enpls.fit(x, y, reptimes = 50) fit
data("alkanes") x <- alkanes$x y <- alkanes$y set.seed(42) fit <- enpls.fit(x, y, reptimes = 50) fit
Print enpls.fs object.
## S3 method for class 'enpls.fs'
print(x, sort = TRUE, nvar = NULL, ...)
x | An object of class enpls.fs.
sort | Should the variables be sorted in decreasing order of importance? Default is TRUE.
nvar | Number of top variables to show. Ignored if sort = FALSE.
... | Additional parameters for print().
Nan Xiao <https://nanx.me>
See enpls.fs for measuring feature importance with ensemble partial least squares regressions.
data("alkanes") x <- alkanes$x y <- alkanes$y set.seed(42) fs <- enpls.fs(x, y, reptimes = 100) print(fs) print(fs, nvar = 10L)
data("alkanes") x <- alkanes$x y <- alkanes$y set.seed(42) fs <- enpls.fs(x, y, reptimes = 100) print(fs) print(fs, nvar = 10L)
Print enpls.od object.
## S3 method for class 'enpls.od'
print(x, ...)
x | An object of class enpls.od.
... | Additional parameters for print().
Nan Xiao <https://nanx.me>
See enpls.od for outlier detection with ensemble partial least squares regressions.
data("alkanes") x <- alkanes$x y <- alkanes$y set.seed(42) od <- enpls.od(x, y, reptimes = 40) od
data("alkanes") x <- alkanes$x y <- alkanes$y set.seed(42) od <- enpls.od(x, y, reptimes = 40) od
Print enspls.ad object.
## S3 method for class 'enspls.ad'
print(x, ...)
x | An object of class enspls.ad.
... | Additional parameters for print().
Nan Xiao <https://nanx.me>
See enspls.ad for model applicability domain evaluation with ensemble sparse partial least squares regressions.
data("logd1k") # remove low variance variables x <- logd1k$x[, -c(17, 52, 59)] y <- logd1k$y # training set x.tr <- x[1:300, ] y.tr <- y[1:300] # two test sets x.te <- list( "test.1" = x[301:400, ], "test.2" = x[401:500, ] ) y.te <- list( "test.1" = y[301:400], "test.2" = y[401:500] ) set.seed(42) ad <- enspls.ad( x.tr, y.tr, x.te, y.te, maxcomp = 3, alpha = c(0.3, 0.6, 0.9), space = "variable", method = "mc", ratio = 0.8, reptimes = 10 ) print(ad)
data("logd1k") # remove low variance variables x <- logd1k$x[, -c(17, 52, 59)] y <- logd1k$y # training set x.tr <- x[1:300, ] y.tr <- y[1:300] # two test sets x.te <- list( "test.1" = x[301:400, ], "test.2" = x[401:500, ] ) y.te <- list( "test.1" = y[301:400], "test.2" = y[401:500] ) set.seed(42) ad <- enspls.ad( x.tr, y.tr, x.te, y.te, maxcomp = 3, alpha = c(0.3, 0.6, 0.9), space = "variable", method = "mc", ratio = 0.8, reptimes = 10 ) print(ad)
Print coefficients of each model in the enspls.fit object.
## S3 method for class 'enspls.fit'
print(x, ...)
x | An object of class enspls.fit.
... | Additional parameters for print().
Nan Xiao <https://nanx.me>
See enspls.fit for fitting ensemble sparse partial least squares regression models.
data("logd1k") x <- logd1k$x y <- logd1k$y set.seed(42) fit <- enspls.fit( x, y, reptimes = 5, maxcomp = 3, alpha = c(0.3, 0.6, 0.9) ) print(fit)
data("logd1k") x <- logd1k$x y <- logd1k$y set.seed(42) fit <- enspls.fit( x, y, reptimes = 5, maxcomp = 3, alpha = c(0.3, 0.6, 0.9) ) print(fit)
Print enspls.fs object.
## S3 method for class 'enspls.fs'
print(x, sort = TRUE, nvar = NULL, ...)
x | An object of class enspls.fs.
sort | Should the variables be sorted in decreasing order of importance? Default is TRUE.
nvar | Number of top variables to show. Ignored if sort = FALSE.
... | Additional parameters for print().
Nan Xiao <https://nanx.me>
See enspls.fs for measuring feature importance with ensemble sparse partial least squares regressions.
data("logd1k") x <- logd1k$x y <- logd1k$y set.seed(42) fs <- enspls.fs( x, y, reptimes = 5, maxcomp = 3, alpha = c(0.3, 0.6, 0.9) ) print(fs, nvar = 10L)
data("logd1k") x <- logd1k$x y <- logd1k$y set.seed(42) fs <- enspls.fs( x, y, reptimes = 5, maxcomp = 3, alpha = c(0.3, 0.6, 0.9) ) print(fs, nvar = 10L)
Print enspls.od object.
## S3 method for class 'enspls.od'
print(x, ...)
x | An object of class enspls.od.
... | Additional parameters for print().
Nan Xiao <https://nanx.me>
See enspls.od for outlier detection with ensemble sparse partial least squares regressions.
data("logd1k") x <- logd1k$x y <- logd1k$y set.seed(42) od <- enspls.od( x, y, reptimes = 5, maxcomp = 3, alpha = c(0.3, 0.6, 0.9) ) print(od)
data("logd1k") x <- logd1k$x y <- logd1k$y set.seed(42) od <- enspls.od( x, y, reptimes = 5, maxcomp = 3, alpha = c(0.3, 0.6, 0.9) ) print(od)