Models of forest growth and yield provide important information on stand and tree development and on how these developments interact with silvicultural treatments. Such models have traditionally been developed under statistical assumptions such as independence of observations, uncorrelated error terms, and error terms with constant variance; when these assumptions are violated, problems of multicollinearity, autocorrelation, or heteroscedasticity, respectively, may arise. These problems adversely affect parameter estimates and must be avoided. In recent years, the artificial neural network (ANN) model, thanks to superior features such as strong predictive ability and freedom from statistical assumptions, has been commonly used in forestry modeling. However, while goodness-of-fit measures have been considered in the assessment of ANN models, checking the biological plausibility of model predictions has often been ignored. In this study, variable-density yield models were developed using nonlinear regression and ANN techniques. These modeling techniques were compared based on goodness-of-fit measures and the principles of forest yield. The results showed that ANN models were more successful than regression models in meeting expected biological patterns.
Regression models have been widely used to maximize the explained variation in forest measurements, such as individual tree heights, individual tree diameter increments, stand basal areas, stand volumes, and site indexes (
Artificial neural network (ANN) models that are independent of statistical assumptions have been widely used in forestry modeling. In earlier studies, ANN models were commonly employed to classify forest characteristics (
Most researchers have paid attention to goodness-of-fit statistics in evaluating the performance of ANN models (
The objectives of this study were to (i) develop variable-density yield models using a modified Gompertz model and an ANN model and (ii) compare their performances based on goodness-of-fit statistics and the principles of forest yield.
In this study, natural monospecific Crimean pine stands (
Forest yield models should ideally be developed using data from permanent sample plots. However, this study used data from temporary sample plots because no permanent sample plots existed in the natural monospecific Crimean pine stands of the study area. The sample plots inventoried were selected to be representative of the range of site indexes, stand densities, and stand ages. A total of 180 sample plots were randomly chosen; in each, a circular sampling scheme was adopted so that stands with different densities within the same site index, and with different ages within the same site index and stand density, would be adequately sampled. If the number of samples is insufficient in terms of site index, stand density, and age, a variable-density yield model may not meet the principles of forest yield. The area of the sample plots was 400, 600, or 800 m^{2} according to the crown closure of the plots. This study followed the forest management guidelines of Turkey in determining the area of the sample plots (
The selection of a suitable yield function is essential for modeling forest development. In forestry modeling, sigmoid functions such as the Lundqvist-Korf, Richards, and Hossfeld IV/McDill-Amateis functions have been widely used (
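As an illustration, a sigmoid yield function of this family can be coded as a small function. The sketch below uses one common parameterization of the base Gompertz curve; the modified form actually fitted in this study differs, and the parameter names here are illustrative only:

```python
import math

def gompertz(t, b1, b2, b3):
    """Base three-parameter Gompertz curve (one common parameterization).

    t  : stand age (years)
    b1 : asymptote, the upper limit of the stand variable
    b2 : shape parameter shifting the curve along the age axis
    b3 : rate parameter controlling how fast the asymptote is approached
    """
    return b1 * math.exp(-b2 * math.exp(-b3 * t))

# The curve rises sigmoidally from near zero toward the asymptote b1.
early, late = gompertz(10, 30.0, 5.0, 0.05), gompertz(120, 30.0, 5.0, 0.05)
```

The asymptote constraint (predictions never exceed b1) is one of the biological properties a yield model is expected to satisfy.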
where
In general, the variable-density yield models include
It is crucial to construct an ANN model that is consistent with the principles of forest yield. If an ANN’s parameters are not properly managed, the resulting model will, in general, produce unreasonable results (
An ANN model includes three fully connected layers: the input layer, the hidden layer, and the output layer. Each input enters the input layer, with an appropriate weight, without any data processing. The weighted inputs, with bias values added, are passed through an activation function in the hidden layer, and the resulting data are propagated forward to the next layer. This process continues until the final output has been produced.
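The forward pass just described can be sketched as follows. A sigmoid hidden activation and a linear output neuron are assumed (the output activation is an assumption of this sketch), and the example uses a 2-2-1 architecture with six weights and three biases, i.e., nine parameters in total:

```python
import numpy as np

def forward(x, W1, b1, W2, b2):
    """Forward pass through a single-hidden-layer, fully connected ANN.

    The input vector x enters the input layer unchanged; the hidden layer
    applies a sigmoid to the weighted inputs plus biases, and the result
    is propagated forward to the output neuron.
    """
    hidden = 1.0 / (1.0 + np.exp(-(W1 @ x + b1)))  # sigmoid activation
    return W2 @ hidden + b2                        # linear output (assumed)

# 2 inputs, 2 hidden neurons, 1 output: 6 weights + 3 biases = 9 parameters.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 2)), rng.normal(size=2)
W2, b2 = rng.normal(size=(1, 2)), rng.normal(size=1)
y = forward(np.array([0.3, 0.7]), W1, b1, W2, b2)
```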
The various types of activation functions are selected according to the type of problem. The sigmoid (
The inputs can be normalized to reduce the network complexity and improve the robustness of the network against outliers (
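Min-max scaling to a fixed range is one common normalization choice (the exact scheme used in the study is not specified here). The example values below are taken from the stand-age column of the descriptive statistics:

```python
import numpy as np

def minmax_normalize(x, lo=0.0, hi=1.0):
    """Min-max scaling of an input variable to the range [lo, hi].

    Normalizing all inputs to a common range keeps variables with large
    units (e.g., stem number per hectare) from dominating weight updates.
    """
    x = np.asarray(x, dtype=float)
    return lo + (x - x.min()) * (hi - lo) / (x.max() - x.min())

ages = np.array([13.5, 74.11, 161.67])  # min, mean, max stand age (years)
scaled = minmax_normalize(ages)          # values mapped into [0, 1]
```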
Cross-validation methods such as leave-one-out and
Overfitting is a serious problem for ANN models. If an ANN model that fits the training dataset well fails on the testing dataset, the model suffers from overfitting. In general, this problem is difficult to avoid for ANN models trained on a small number of observations. On the other hand, if the number of parameters in an ANN model is higher than the number of observations, the ANN model does not adequately fit the training dataset (
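For a fully connected network with one hidden layer, the parameter count follows directly from the layer sizes, so the balance between parameters and observations is easy to check. The sketch below reproduces the counts reported for the fitted networks in this study (five inputs with 8, 5, and 7 hidden neurons give 57, 36, and 50 parameters):

```python
def n_parameters(n_inputs, n_hidden, n_outputs=1):
    """Weights and biases in a fully connected one-hidden-layer ANN.

    weights: n_inputs * n_hidden + n_hidden * n_outputs
    biases : n_hidden + n_outputs
    """
    return (n_inputs + 1) * n_hidden + (n_hidden + 1) * n_outputs

# Five inputs: 8 hidden neurons -> 57, 5 -> 36, 7 -> 50 parameters,
# all well below the 180 observations available.
counts = [n_parameters(5, h) for h in (8, 5, 7)]
```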
Another way to avoid overfitting is to select a suitable regularization training function, such as the Levenberg-Marquardt, Bayesian, or variable learning rate gradient descent function. In this study, the Bayesian function was chosen because it can capture nonlinear patterns in the dataset and produce an
As with the regression approach, the initial objective in an ANN model is to minimize the errors (eqn. 7). This study minimized the mean of the squared errors as the objective function; other performance functions, such as the mean absolute error, can also serve as the objective. The overall prediction performance of the ANN models was therefore measured by the mean squared error, which indicates the variation in the errors (
where
In the Bayesian regularization technique, an additional term is added to the objective function (
where
The regularization parameters
where
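In its standard formulation (the form implemented by common Bayesian regularization routines, including MATLAB’s `trainbr`), the regularized objective combines the sum of squared errors with a weight-decay penalty; this sketch assumes that standard form:

```latex
F(\mathbf{w}) \;=\; \beta E_D + \alpha E_W,
\qquad
E_D = \sum_{i=1}^{n} \bigl(y_i - \hat{y}_i\bigr)^{2},
\qquad
E_W = \sum_{j=1}^{m} w_j^{2}
```

Here $\alpha$ and $\beta$ are the regularization parameters: a large $\alpha/\beta$ ratio shrinks the weights and smooths the network response, while a small ratio prioritizes fitting the training data.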
Choosing the best number of hidden neurons, learning rate, and momentum factor is important for enhancing the learning capacity of ANN models. With too many hidden neurons, ANN models tend to memorize the training data rather than generalize the relationships between inputs and output. Conversely, with too few hidden neurons, ANN models cannot capture the patterns in the dataset (
In order to achieve the optimum number of hidden neurons, learning rate, and momentum value, different combinations of these factors were studied. In this trial-and-error process, the number of hidden neurons varied from 2 to 10 and the learning rate and momentum factor varied from 0.1 to 0.9.
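This trial-and-error process amounts to a grid search over the three settings. In the sketch below, `evaluate` is a hypothetical placeholder for the actual training run (Bayesian regularization in MATLAB in this study); any trainer returning a validation error can be plugged in:

```python
from itertools import product

# Grid of settings as described above: hidden neurons 2..10,
# learning rate and momentum each 0.1..0.9 in steps of 0.1.
hidden_sizes = range(2, 11)
rates = [round(0.1 * i, 1) for i in range(1, 10)]
momenta = [round(0.1 * i, 1) for i in range(1, 10)]

def evaluate(n_hidden, lr, momentum):
    """Hypothetical placeholder: train the network with these settings
    and return its validation mean squared error."""
    raise NotImplementedError

candidates = list(product(hidden_sizes, rates, momenta))
# best = min(candidates, key=lambda c: evaluate(*c))
```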
In this study, a total of 180 observations for each stand variable were used to evaluate the performance of the ANN models. To improve the performance of the ANN models and avoid overfitting, five-fold cross-validation was employed using the entire dataset. The ANN models were developed using the MATLAB^{®} software (
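A minimal sketch of the five-fold split of the 180 plots is given below; the actual MATLAB implementation may shuffle and partition differently. Each fold serves once as the test set while the remaining folds form the training set, so every observation is tested exactly once:

```python
import numpy as np

def five_fold_indices(n, k=5, seed=0):
    """Split n observation indices into k shuffled, near-equal folds."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    return np.array_split(idx, k)

folds = five_fold_indices(180)  # 180 sample plots -> 5 folds of 36
for i, test_idx in enumerate(folds):
    train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
    # fit the model on train_idx, evaluate it on test_idx
```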
The statistical evaluation of the regression and ANN models was done using the adjusted coefficient of determination (R_{adj}^{2},
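These two statistics can be computed as sketched below; the adjusted R^2 uses one common degrees-of-freedom correction, which may differ in detail from the exact formula applied in the study:

```python
import numpy as np

def rmse(y, yhat):
    """Root mean squared error."""
    y, yhat = np.asarray(y, float), np.asarray(yhat, float)
    return float(np.sqrt(np.mean((y - yhat) ** 2)))

def r2_adj(y, yhat, p):
    """Adjusted coefficient of determination for a model with p
    parameters fitted to n observations (one common form)."""
    y, yhat = np.asarray(y, float), np.asarray(yhat, float)
    n = y.size
    ss_res = np.sum((y - yhat) ** 2)          # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)      # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return float(1.0 - (1.0 - r2) * (n - 1) / (n - p))
```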
where
The goodness-of-fit statistics, estimated coefficients, and
The goodness-of-fit statistics and network parameters of the best ANN models for each stand variable are shown in
The ANN models had more parameters than the nonlinear regression models had coefficients. However, both approaches provided similar statistical performance for
According to
In most studies using ANN models, the findings have been discussed only with consideration of the goodness-of-fit statistics (
The Gompertz function was modified to develop the dynamic variable-density yield models in this study. The variables
ANN models were more dynamic than nonlinear regression models in describing patterns of
Developing a dynamic nonlinear regression model can be difficult because suitable starting values for the parameters must be determined (
The accuracy of ANN models is highly dependent on the selection of network parameters (
This study demonstrated that simple ANN models with one hidden layer and only a few hidden neurons (n<10) were adequate for accurately predicting the
Many researchers have paid attention to goodness-of-fit statistics when evaluating ANN models and comparing them with other modeling approaches. However, biological rationales are often overlooked. This study showed that two different modeling approaches with similar error statistics may provide different curves depending on the
The authors thank the General Directorate of Forestry for the legal permission of field studies, as well as Dr. Sinan Bulut, forest engineer Ömer Bulut and Kutay Kaya for their help in collecting data.
This study was a part of a PhD thesis supported financially by the Scientific Research Project Unit of Çankiri Karatekin University, Turkey (grant: OF061218D05) and the Turkish Scientific and Technical Research Institute (TUBITAK, grant: TOVAG119O061).
FB completed the statistical analyses and wrote the manuscript draft. IE and AG contributed to the idea, reviewed and edited the manuscript. All authors have read and approved the final manuscript.
The locations of the study area and sample plots.
A simple representation of an ANN model with one hidden layer containing two hidden neurons (or nodes). This network includes a total of nine parameters (six weight and three bias parameters).
Curves of nonlinear regression models for (a)
Curves of ANN models for (a)
Scatterplots of predicted
Scatterplots of predicted
Descriptive statistics of the Crimean pine stands studied. (Dq): quadratic mean diameter; (Hq): quadratic mean height; (A): age; (G): basal area; (N): stem number; (V): volume derived from the double-entry tree volume equation V = 0.000202 · d^{1.602553} · h^{0.969154} (
Variables  n  Min  Max  Mean  STD
Dq (cm)  180  2.00  56.30  27.14  10.89
Hq (m)  180  2.00  31.30  15.25  6.31
A (year)  180  13.50  161.67  74.11  32.87
G (m^{2} ha^{-1})  180  2.00  70.00  31.11  14.46
N (number ha^{-1})  180  137.50  1550.00  630.02  327.59
V (m^{3} ha^{-1})  180  7.00  921.67  323.79  216.15
SI (m)  180  10.00  34.00  21.16  6.68
SD  180  1.25  16.92  6.24  2.70
Coefficient estimates (with
Model  Coefficient  Estimate  SE  t-value  p-value  R^{2}_{adj}  RMSE
G  b_{1}  12.542  0.252  49.691  <0.0001  0.96  3.24
-  a_{1}  1.926  0.18  10.726  <0.0001
-  a_{2}  0.527  0.04  13.257  <0.0001
-  a_{3}  0.00004  0  4.0689  <0.0001
-  a_{4}  0.0005  0  9.7649  <0.0001
-  b_{3}  21.846  2.471  8.8406  <0.0001
N  b_{1}  19.732  0.808  24.412  <0.0001  0.80  108.15
-  a_{1}  2.214  0.382  5.79  <0.0001
-  a_{2}  0.129  0.054  2.4  <0.0001
-  a_{3}  0.001  0.001  1.832  <0.0001
-  a_{4}  0.005  0.001  3.916  <0.0001
-  b_{3}  54.144  16.15  3.352  <0.0001
V  b_{1}  9.565  0.427  22.388  <0.0001  0.85  144.18
-  a_{1}  1.111  0.325  3.415  <0.0001
-  a_{2}  0.662  0.165  4.004  <0.0001
-  a_{3}  0.111  0.026  4.266  <0.0001
-  a_{4}  0.34  0.058  5.857  <0.0001
-  b_{3}  67.998  6.493  10.472  <0.0001
The number of parameters, learning rate, momentum, and goodness-of-fit statistics of ANN-based variable-density yield models developed for predicting basal area (
Parameter  G  N  V
Number of inputs  5  5  5
Number of network parameters  57  36  50
Number of hidden neurons  8  5  7
Learning rate  0.1  0.1  0.1
Momentum value  0.5  0.5  0.5
R^{2}_{adj}  0.97  0.97  0.86
RMSE  2.77  35.79  143.87
Appendix 1  Syntax to facilitate the determination of initial values of a nonlinear function and the statistical significance of estimated parameters, based on the GenSA package in R.