Friday, June 24, 2022

Machine learning-based regression for cross validation / prediction

Random Forest Regression sklearn.ensemble.RandomForestRegressor

- Pros: Powerful, especially for large datasets. Non-parametric, and handles non-linearity well.
- Cons: It is known that Random Forest Regression cannot extrapolate. That is, predictions are always within the range of target values seen in the training data (unlike linear regression). See [1] for a good, detailed description of this. Support vector regression, linear regression, and deep learning are suggested alternatives that can extrapolate.
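The extrapolation failure above can be seen in a few lines. The sketch below (synthetic data; all variable names are illustrative) trains both models on a noiseless linear target over x in [0, 10], then queries far outside that range:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression

# Train on x in [0, 10] with a simple linear target y = 2x.
rng = np.random.default_rng(0)
X_train = rng.uniform(0, 10, size=(200, 1))
y_train = 2 * X_train.ravel()

rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_train, y_train)
lin = LinearRegression().fit(X_train, y_train)

# Query a point far outside the training range.
X_out = np.array([[50.0]])
rf_pred = rf.predict(X_out)[0]    # stays near max(y_train), i.e. about 20
lin_pred = lin.predict(X_out)[0]  # extrapolates toward 100

print(f"Random Forest: {rf_pred:.1f}, Linear: {lin_pred:.1f}")
```

Each tree can only return an average of training targets from a leaf, so the forest's prediction is bounded by the training targets no matter how far the query lies outside them, while the linear model follows the fitted slope.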

GAM / SVR comparison 

According to [2], GAM (generalised additive model) and SVR (support vector regression) predictions perform comparably.


SVR sklearn.svm.SVR

Hyperparameter tuning of SVR with GridSearchCV
A good parameter range to search appears to be C between 1 and 100, epsilon between 1e-3 and 10, together with the kernel choice. Described in [3].
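A minimal sketch of such a grid search on synthetic data; the grid values follow the ranges quoted above, and everything else (data, grid granularity, cv=5) is an illustrative assumption:

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import GridSearchCV

# Synthetic non-linear data (illustrative only).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(150, 1))
y = np.sin(X).ravel() + 0.1 * rng.standard_normal(150)

# Grid built from the suggested ranges: C in [1, 100], epsilon in [1e-3, 10],
# plus the kernel as a categorical choice.
param_grid = {
    "C": [1, 10, 100],
    "epsilon": [1e-3, 1e-2, 1e-1, 1, 10],
    "kernel": ["rbf", "linear"],
}

search = GridSearchCV(SVR(), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_)
```

GridSearchCV refits the best combination on the full data by default, so `search.predict(...)` can be used directly after fitting.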


References:

[1] Derrick Mwiti. Random Forest Regression: When Does It Fail and Why? https://neptune.ai/blog/random-forest-regression-when-does-it-fail-and-why

[2] Non-Linear Regression with R. https://minimatech.org/non-linear-regression-with-r/

[3] Tuning of hyperparameters of SVR. https://stackoverflow.com/questions/69669827/tuning-of-hyperparameters-of-svr