Estimation of canopy attributes of wild cacao trees using digital cover photography and machine learning algorithms
iForest - Biogeosciences and Forestry, Volume 14, Issue 6, Pages 517-521 (2021)
doi: https://doi.org/10.3832/ifor3936-014
Published: Nov 17, 2021 - Copyright © 2021 SISEF
Short Communications
Abstract
Surveying canopy attributes while conducting fieldwork in the rain forest is time-consuming. Low-cost imagery such as digital cover photography is a potential source of information to speed up vegetation assessments and reduce costs during expeditions. This study presents an image-based, non-destructive method to estimate canopy attributes of wild cacao trees in two regions of the rain forest in Colombia, using digital cover photography and machine learning algorithms. Upward-looking photographs taken at the base of each cacao tree were classified with machine learning algorithms to estimate gap fraction (GF), foliage cover (FC), crown cover (CC), crown porosity (CP), clumping index (Ω), and leaf area index (LAI) of the canopy cover. We used the wild cacao trees found on forestry plots as a case study to test the application of low-cost imagery to the extraction and analysis of canopy attributes. Canopy attributes were successfully extracted from the canopy cover imagery, with a classification accuracy of 92% for the structural attributes of the canopy. Canopy cover attributes allowed us to differentiate between the canopy structures of the Amazon and Pacific rainforest sites, suggesting that wild cacao trees are associated with different vegetation types. We also compared the classification results for the computer extraction of canopy attributes with a digital canopy cover benchmark. We conclude that our approach was effective for quickly surveying canopy features of crop wild relatives of cacao and of the vegetation associated with them. This study allows highly reproducible estimates of canopy attributes using cover photography and state-of-the-art machine learning algorithms such as deep learning Convolutional Neural Networks.
Keywords
Canopy Attributes, Cover Photography, Colombia, Machine Learning, Deep Learning
Introduction
Colombia is considered one of the main centers of diversity for crop wild relatives of cacao ([14]). The genera Theobroma and Herrania, as well as wild species of Theobroma cacao L., are the main taxonomic entities of cacao ([11]). They grow in remote areas of rainforests where much of their diversity is present, but accessing those regions is challenging. Studying crop wild relatives is a priority for the conservation of genetic resources ([20]). Unfortunately, the available information about these crop wild relatives of high agricultural, economic, and cultural importance is limited. Accurate estimation of forest canopy structure is central to a wide range of ecological studies and applications. Because direct measurements are difficult, indirect methods have been widely used. Canopy photographic methods are among the most widely used on account of their simple, fast, and cost-effective procedures.
In the past, tree crown attributes have been estimated using vertical digital photography ([3], [22]). Digital cover photography (DCP) is a high-resolution, restricted view-angle method that provides mainly vertical sampling of the canopy ([25], [8], [9], [1], [10]) and is an emerging method to estimate canopy attributes ([6], [10]). Estimating canopy attributes with DCP overcomes the difficulties of hemispherical photography, which is sensitive to image processing steps that are tedious and time-consuming ([7], [1]). Cover photographs also provide higher resolution than hemispherical photographs. In terms of image processing, machine learning algorithms have been used for forest canopy imputation from Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) data ([24]), as well as from other remote sensing images ([21]). For instance, canopy hemispherical photography (CHP) segmentation and gap fraction (GF) calculation were performed using deep learning neural networks ([19]). Deep learning regression has also been used to make hemispherical photography independent of sunlight illumination conditions ([12]). Here, we use upward-looking DCP (as in [7]) rather than downward-looking photographs because of load constraints during fieldwork in the expeditions, as well as constraints in human resources.
This work addresses the issue of how to compute canopy cover properties using DCP and machine learning algorithms. Classification accuracy is estimated using cross-validation and comparison to a digital cover photography benchmark ([15]).
Materials and methods
Study area
The study of crop wild relatives (CWR) of cacao trees was performed in the rainforest of three Colombian departments: Caquetá, Putumayo, and Chocó (Fig. S1 in Supplementary material) between 2018 and 2019. The first cacao-BIO expedition traveled to the Caguán and upper Caquetá rivers in the Caquetá and Putumayo departments, where five parcels, each surrounding a wild Theobroma tree, were examined across different landscapes (flooded, firm ground, and riverbanks). The second expedition took place in La Victoria municipality, part of the Canton de San Pablo (Chocó department), where three parcels, each around a wild cacao tree in the rain forest, were examined. The expeditions collected a total of eight upward-looking DCPs, one for each parcel.
Estimation of canopy properties
A key step in estimating canopy attributes is separating large canopy gaps from normal-size gaps ([25], [6], [7], [1]). One of the latest tools is a free Python library called “Canopy Cover” (CaCo - ⇒ https://github.com/alivernini/caco), which segments DCP images into canopy and gaps and then uses statistical methods to separate large gaps from normal-size gaps ([1] - see Fig. 1 for an illustration of the canopy, small and large gaps). CaCo does not currently provide all the canopy attributes computed here, since it focuses on providing statistics of the gaps found. In this work, six machine learning algorithms are used to classify upward-looking DCPs into sky (gaps), leaves, and trunks: K-Nearest Neighbors (KNN - [2]), Support Vector Machines (SVM - [4]), Random Forests (RF - [16]), Extreme Gradient Boosting (XGBoost - [5]), Multilayer Perceptron (MLP - [23]) and deep learning Convolutional Neural Networks (CNN - [18]). The CaCo algorithm that statistically separates large gaps from normal-size canopy gaps was applied here to the sky class detected by the supervised machine learning algorithms. The Python code used here is available on GitHub (⇒ https://github.com/julioduarte2020/CanopyCover), which includes a modified version of CaCo that estimates all canopy attributes. The innovation of this work is the use of DCP and supervised machine learning algorithms to classify the images and then estimate the canopy attributes.
Fig. 1 - Canopy large and small gaps (black: canopy; white: small gaps; gray: large gaps) in the rain forest in Colombia.
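The separation of large gaps from normal-size gaps can be illustrated with a minimal sketch. It assumes the rule stated with the attribute definitions in this paper (a gap is large when its size is more than one standard deviation above the mean gap size), uses 4-connected flood fill to delimit gaps, and uses the population standard deviation; the connectivity, the choice of standard deviation, the function names, and the toy mask are all assumptions made for illustration and are not taken from CaCo's code.

```python
from collections import deque
from statistics import mean, pstdev


def gap_sizes(sky):
    """Return the pixel count of each 4-connected sky region in a 2D grid (1 = sky)."""
    rows, cols = len(sky), len(sky[0])
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if not sky[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill over one connected gap.
            seen[r][c] = True
            queue = deque([(r, c)])
            size = 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and sky[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            sizes.append(size)
    return sizes


def split_gaps(sizes):
    """Split gap sizes into (large, normal); assumes at least one gap exists."""
    # Assumed threshold: mean gap size plus one population standard deviation.
    threshold = mean(sizes) + pstdev(sizes)
    large = [s for s in sizes if s > threshold]
    normal = [s for s in sizes if s <= threshold]
    return large, normal


# Hypothetical toy mask: one 3x3 opening and five single-pixel gaps.
mask = [
    [1, 1, 1, 0, 0, 1],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 0, 0, 1, 0, 1],
]
sizes = gap_sizes(mask)
large, normal = split_gaps(sizes)
g_total = sum(sizes)   # all gap (sky) pixels
g_large = sum(large)   # pixels that belong to large gaps
print(sizes, large, g_total, g_large)
```

On a real DCP the mask would come from the sky class of the classified image, and the two pixel totals would feed the canopy attribute calculations.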
Samples of the sky, trunk, and leaves were selected on each DCP image using the free software MultiSpec (⇒ https://engineering.purdue.edu/~biehl/MultiSpec/) to form the training data. Five-fold cross-validation was used to estimate the performance of each classifier, i.e., the selected samples were randomly split into training samples (80%) and testing samples (20%) five times, so that the test folds together cover all the training data. The performance of each classifier was measured in terms of classification accuracy, sensitivity, and specificity. The KNN, SVM, and RF classifiers were implemented in Python using the scikit-learn library. XGBoost was implemented in Python using the XGBoost library. The MLP and CNN classifiers were implemented in Python using the Keras library with a TensorFlow backend. For the KNN classifier, encouraging results were obtained with 5 neighbors and a leaf size of 100. The linear SVM classifier was used with default settings. For the RF classifier, the best results were obtained using 100 estimators and otherwise default settings. For XGBoost, encouraging results were obtained using 100 estimators and trees as the booster. For the MLP classifier, the best results were obtained using two dense layers of size three with batch normalization ([17]) and ReLU activation ([13]). For the CNN classifier, encouraging results were obtained using a sliding window of 9 pixels around each pixel; a first convolutional layer with a kernel of size 3 and 20 filters, batch normalization, and a dropout layer ([26]) of 0.2; and a second convolutional layer with a kernel of size 5 and 40 filters, batch normalization, dropout of 0.2, and max-pooling of size 2×2. These two convolutional layers are followed by a flatten layer and then by two dense layers, each half the size of its input, with batch normalization, dropout of 0.2, and ReLU activation.
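The evaluation protocol can be sketched in a few lines of standard-library Python. This is not the authors' code: it replaces the scikit-learn KNN with a brute-force 5-nearest-neighbor vote (so the leaf size of 100, which only tunes scikit-learn's tree-based search, has no counterpart here), and the RGB samples are synthetic values invented for the example. The helper names `knn_predict`, `five_fold_indices`, and `sensitivity_specificity` are hypothetical.

```python
import random
from collections import Counter


def knn_predict(train_x, train_y, query, k=5):
    """Majority vote among the k nearest training pixels (squared RGB distance)."""
    nearest = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), label)
        for x, label in zip(train_x, train_y)
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def five_fold_indices(n, seed=0):
    """Shuffle the sample indices and deal them into five disjoint test folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::5] for i in range(5)]


def sensitivity_specificity(y_true, y_pred, positive):
    """One-vs-rest sensitivity and specificity for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)


# Synthetic, well-separated RGB samples (made-up colors, 30 per class).
rng = random.Random(42)
centers = {"sky": (200, 220, 250), "leaves": (40, 120, 40), "trunk": (90, 60, 40)}
X, y = [], []
for label, center in centers.items():
    for _ in range(30):
        X.append(tuple(ch + rng.randint(-10, 10) for ch in center))
        y.append(label)

# Each fold is held out once (20% test) while the other four folds train (80%).
folds = five_fold_indices(len(X))
y_true, y_pred = [], []
for test_idx in folds:
    held_out = set(test_idx)
    train_x = [X[i] for i in range(len(X)) if i not in held_out]
    train_y = [y[i] for i in range(len(X)) if i not in held_out]
    for i in test_idx:
        y_true.append(y[i])
        y_pred.append(knn_predict(train_x, train_y, X[i]))

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print("accuracy:", accuracy)
for label in centers:
    print(label, sensitivity_specificity(y_true, y_pred, label))
```

Pooling the held-out predictions from all five folds before computing the metrics is one common convention; averaging per-fold scores is another, and the text does not say which was used.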
With each DCP image classified into trunk, leaves, and sky, the following canopy attributes can be estimated ([1] - eqn. 1 to eqn. 5):
where FC is the foliage cover, CC is the crown cover, CP is the crown porosity, Ω is the clumping index, LAI is the leaf area index; gT is the total number of pixels of gaps (sky); L is the number of pixels of all leaves; pC is the number of pixels in the image minus the number of pixels of the trunk, i.e., the number of pixels of the canopy; gT/pC = GF is the total gap fraction; gL is the number of pixels of large gaps, defined as those gaps whose size is more than one standard deviation above the mean size of all gaps ([1]); and k is the coefficient of extinction, which is assumed to be 0.5 as in Alivernini et al. ([1]).
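As an illustration of how these quantities combine, the sketch below computes the attributes from pixel counts. It assumes the standard cover-photography formulas that match the symbol definitions above (FC = 1 - gT/pC, CC = 1 - gL/pC, CP = 1 - FC/CC, LAI = -CC·ln(CP)/k, Ω = (1 - CP)·ln(1 - FC)/(ln(CP)·FC)); these forms are an assumption about what eqn. 1 to eqn. 5 contain, and the function name and pixel counts are hypothetical.

```python
import math


def canopy_attributes(g_total, g_large, p_canopy, k=0.5):
    """Canopy attributes from pixel counts, using the assumed standard DCP formulas.

    g_total  -- number of gap (sky) pixels, gT
    g_large  -- number of pixels in large gaps, gL
    p_canopy -- image pixels minus trunk pixels, pC
    k        -- extinction coefficient (0.5 in the text)
    """
    if not 0 <= g_large < g_total < p_canopy:
        # CP must lie strictly between 0 and 1 for the logarithms to be defined.
        raise ValueError("need 0 <= gL < gT < pC")
    gf = g_total / p_canopy                # total gap fraction
    fc = 1.0 - gf                          # foliage cover
    cc = 1.0 - g_large / p_canopy          # crown cover
    cp = 1.0 - fc / cc                     # crown porosity
    lai = -cc * math.log(cp) / k           # leaf area index
    omega = (1.0 - cp) * math.log(1.0 - fc) / (math.log(cp) * fc)  # clumping index
    return {"GF": gf, "FC": fc, "CC": cc, "CP": cp, "LAI": lai, "Omega": omega}


# Hypothetical counts: 10,000 canopy pixels, 2,000 sky pixels, 1,200 in large gaps.
attrs = canopy_attributes(g_total=2000, g_large=1200, p_canopy=10000)
for name, value in attrs.items():
    print(f"{name}: {value:.4f}")
```

The guard reflects a practical limit of these formulas: an image with no normal-size gaps (gL equal to gT) gives a crown porosity of zero, for which LAI and Ω are undefined.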
Benchmark
Besides cross-validation classification accuracy and accuracy with respect to the training data, we also tested the two best classifiers (CNN and RF), as well as CaCo, against a digital canopy cover benchmark ([15]). The benchmark consists of 315 DCP images distributed over seven test sites (45 images per site), taken in a hemispherical arrangement (zenith angles between 2.5° and 72.5° at intervals of 5°), with the total gap fraction (GF) estimated from the 3D point cloud acquired by Terrestrial Laser Scanning (TLS). From these 315 images, we selected the 63 DCP images that are upward-looking (zenith angles between 2.5° and 12.5°), together with their respective GF and effective leaf area index (LAIe) measures. The LAIe can be computed as ([10] - eqn. 6):
The LAIe of the benchmark was computed as -ln(GF)/k, using the benchmark GF and k = 0.5, so that it follows the same equations used here. The LAIe for the DCP images was computed as LAI · Ω, for consistency with eqn. 1 to eqn. 5. Twenty-one images of the benchmark were chosen to select training samples for the trunk, leaves, and sky, based on the availability of those classes on each image.
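The correspondence between the two LAIe expressions can be checked with a short derivation. It assumes the standard cover-photography forms LAI = -CC·ln(CP)/k, Ω = (1 - CP)·ln(1 - FC)/(ln(CP)·FC), CP = 1 - FC/CC and FC = 1 - GF; these are assumptions about the equations referred to in the text, and the benchmark value is taken as the positive quantity -ln(GF)/k.

```latex
\begin{align}
\mathrm{LAI}\cdot\Omega
  &= \left(-\frac{CC\,\ln(CP)}{k}\right)\cdot
     \frac{(1-CP)\,\ln(1-FC)}{\ln(CP)\,FC} \\
  &= -\frac{CC\,(1-CP)\,\ln(1-FC)}{k\,FC} \\
  &= -\frac{\ln(1-FC)}{k}
     \qquad \text{since } 1-CP = \frac{FC}{CC} \\
  &= -\frac{\ln(GF)}{k}
     \qquad \text{since } 1-FC = GF .
\end{align}
```

Under these assumptions the image-based LAIe depends only on the estimated gap fraction, so differences between estimated and benchmark LAIe trace back to differences in GF.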
Results
Performance of canopy classification algorithms
Fig. 2 shows the performance of each classifier, where CNN and RF have the best performance in terms of classification accuracy, sensitivity, and specificity.
Canopy attributes
Fig. 3 shows the classification accuracy for the three best classifiers CNN, RF, and XGBoost, on each test site. As can be seen from these results, CNN seems to perform best for test sites 1, 2, 4, 5, and 6 and worst for site 8. RF and XGBoost have similar performance across all sites, being better than CNN on sites 3, 7, and 8. In general, CNN performs well on all sites except on site 8, where it falls behind RF and XGBoost by 13% in accuracy.
The only difference we found between the image from site 8 and the images from the other sites is that the site 8 image is very sunny, which probably explains why CNN does not perform well on it.
Fig. S2-S6 (Supplementary material) show the estimated canopy attributes using CNN, RF, XGBoost, and CaCo on each test site. The first three sites correspond to Chocó and the last five sites correspond to Caquetá and Putumayo. The different classifiers showed that, in general, the Chocó canopies are thicker and denser than those of the Caquetá and Putumayo sites.
Fig. 4 plots the TLS total gap fraction (GF) of the benchmark (x-axis) against the GF estimated (y-axis) using (a) CNN, (b) RF, and (c) CaCo. In this figure, RF obtains the best R² statistic, followed by CaCo and CNN, while CaCo obtains the best slope. Fig. 5 plots the LAIe computed from the benchmark GF (x-axis) against the LAIe estimated from the images (y-axis) using (a) CNN, (b) RF, and (c) CaCo. In this figure, RF obtains the best R² statistic, followed by CaCo and CNN, while CNN obtains the best slope, followed by RF and CaCo.
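The two statistics quoted for these comparisons, the slope and R², can be obtained from an ordinary least-squares fit of estimated against benchmark values. The sketch below assumes that this is how the figures were summarized, which the text does not state explicitly; the function name and the sample values are made up for illustration and are not data from the benchmark.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept, with R squared."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For simple linear regression, R squared equals the squared correlation.
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical gap fractions (not taken from the paper).
benchmark_gf = [0.05, 0.10, 0.20, 0.30, 0.40]
estimated_gf = [0.07, 0.11, 0.19, 0.33, 0.38]
slope, intercept, r_squared = linear_fit(benchmark_gf, estimated_gf)
print(f"slope={slope:.3f} intercept={intercept:.3f} R2={r_squared:.3f}")
```

A slope close to 1 indicates that the estimates scale with the benchmark without systematic over- or underestimation, while R² measures how tightly the points follow the fitted line, which is why a method can lead on one statistic and not the other.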
Discussion
Canopy cover attributes estimated using CaCo tend to vary less from one site to the next, while those estimated using the CNN, RF, and XGBoost classifiers tend to vary more, indicating a greater sensitivity to the varying conditions on each test site (Fig. S2-S6 in Supplementary material). The estimation of canopy cover attributes, as given in eqn. 1 to eqn. 5, depends on a good classification of the sky, tree trunks, and leaves, as provided by the three best classifiers, i.e., CNN, RF, and XGBoost. In contrast, CaCo only considers two classes: sky and canopy. However, CaCo results are not far from the CNN, RF, and XGBoost results, showing that even though CaCo was designed for sky-canopy classification, it also provides good results. In fact, in some cases CaCo results are closer to those of CNN than those of RF or XGBoost, which also indicates a high sensitivity of the estimated canopy attributes to classification accuracy. RF and XGBoost have similar classification performance (Fig. 3) and similar estimated canopy attributes. The canopy cover benchmark estimate of the GF uses a sky-canopy segmentation algorithm ([10]) similar to that of CaCo ([1]); despite this, CNN and RF also obtained good prediction accuracies for GF and LAIe overall.
The proposed technique can be used in agroforestry systems to estimate canopy attributes from upward- or downward-looking DCP images, which would allow determining whether, for instance, cacao trees are grown on a well-shaded farm, or whether cultural practices such as pruning the canopy of trees need to be scheduled. Canopy cover should be higher the warmer and drier the climate is, and there is a non-linear relationship between shade and yield (⇒ https://climatesmartcocoa.guide/entry-points/shading-and-agroforestry/). The percentage of shade can be easily computed from the classification images obtained using the proposed supervised classification method.
Conclusions
A method to estimate canopy cover attributes from upward-looking DCP and machine learning algorithms has been proposed here. Given that canopy cover attributes are very sensitive to classification accuracy, it is of utmost importance to obtain good classification accuracy for the sky, tree trunks, and leaves. Deep learning convolutional neural networks provided, in general, the best classification results compared to other well-known classification methods. Given that we compared CNN, RF, and CaCo against a known benchmark and the results are satisfactory, there is confidence that the canopy attributes estimated using DCP images and machine learning algorithms are close to reality.
Acknowledgments
This work is part of the “Expedicion Colombia CacaoBIO” project carried out between AGROSAVIA and the Andes University under the special cooperation agreement of the Colombian government no. FP44842-142-2018. The Administrative Department of Science, Innovation of Colombia is acknowledged for financing the project. We thank the professional Angela Sanchez Galán (University of The Andes) for her support in the field activities.
We thank Dr. Francesco Chianucci (CREA-FL, Arezzo, Italy) for providing us with the digital canopy cover benchmark as well as the GF and LAIe measurements for those images, and for his feedback on our research through peer-review.
Authors’ Info
Authors’ Affiliation
Corporación Colombiana de Investigación Agropecuaria - AGROSAVIA, Centro de Investigación Tibaitatá, Km 14 vía Mosquera, Bogotá, Cundinamarca (Colombia)
Corporación Colombiana de Investigación Agropecuaria - AGROSAVIA, Sede Central, Km 14 vía Mosquera, Bogotá (Colombia)
Corporación Colombiana de Investigación Agropecuaria - AGROSAVIA, Centro de Investigación Nataima, Km 9 vía Espinal-Chicoral, Tolima, Sede Florencia, Caquetá (Colombia)
Corporación Colombiana de Investigación Agropecuaria - AGROSAVIA, Centro de Investigación La Libertad, Km 14 vía Villavicencio, Puerto López, Meta (Colombia)
Paper Info
Citation
Duarte-Carvajalino JM, Paramo-Alvarez M, Ramos-Calderón PF, González-Orozco CE (2021). Estimation of canopy attributes of wild cacao trees using digital cover photography and machine learning algorithms. iForest 14: 517-521. - doi: 10.3832/ifor3936-014
Academic Editor
Nicola Puletti
Paper history
Received: Jul 23, 2021
Accepted: Sep 08, 2021
First online: Nov 17, 2021
Publication Date: Dec 31, 2021
Publication Time: 2.33 months
Copyright Information
© SISEF - The Italian Society of Silviculture and Forest Ecology 2021
Open Access
This article is distributed under the terms of the Creative Commons Attribution-Non Commercial 4.0 International (https://creativecommons.org/licenses/by-nc/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.