Construction of Machine Learning Models for Biomass of Chinese Fir Plantations Based on Boruta Feature Selection and Optuna Hyperparameter Optimization
Abstract
Using data on Cunninghamia lanceolata (Chinese fir) plantations from permanent sample plots of the 8th National Forest Inventory (NFI-8) in Guangdong Province, this study applied the Boruta algorithm to screen stand and climate factors. Hyperparameters were tuned with Optuna, with Random Search as a baseline for comparison, to build Random Forest (RF), Support Vector Regression (SVR), and Artificial Neural Network (ANN) models. Model performance was evaluated using R², RMSE, and MAE. With only the Boruta-selected stand factors (N, D, H, P, and Age) as predictors, R² exceeded 0.76 for all three models. Boruta retained 11 climate factors, of which 8 were temperature-related and 3 were precipitation-related. Adding all climate factors to the selected stand factors further improved model performance, with the RF model improving the most. Models built on the selected stand factors plus the selected climate factors outperformed those built on the selected stand factors plus all climate factors, with the ANN model showing the largest gain. For models built on the selected stand and climate factors, Optuna consistently yielded better results than Random Search. After Boruta selection and Optuna optimization, the ANN model showed the greatest relative improvement from tuning, while the RF model had the best predictive performance (R² = 0.9271). Overall, the models ranked RF, ANN, SVR from best to worst.
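The three evaluation metrics named above can be computed as in the following minimal sketch. It assumes the standard definitions of R², RMSE, and MAE; the function name `regression_metrics` and the sample values are hypothetical, chosen only for illustration, and are not data or code from the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (R2, RMSE, MAE) for paired observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    mean_obs = sum(y_true) / n
    # Residual and total sums of squares
    ss_res = sum((o - p) ** 2 for o, p in zip(y_true, y_pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are identical")
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(o - p) for o, p in zip(y_true, y_pred)) / n
    return r2, rmse, mae


# Made-up biomass values for illustration only (not from the study)
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
r2, rmse, mae = regression_metrics(observed, predicted)
```

R² is unitless, whereas RMSE and MAE are in the units of the response variable, so the latter two are only comparable across models fitted to the same biomass data.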