
AN OPEN-SOURCE CANOPY CLASSIFICATION SYSTEM USING MACHINE-LEARNING TECHNIQUES WITHIN A PYTHON FRAMEWORK

Owen Smith [1], Huidae Cho [2]

Institute for Environmental and Spatial Analysis, University of North Georgia, Oakwood, GA 30566, USA
[1] ocsmit7654@ung.edu, [2] huidae.cho@ung.edu

KEY WORDS: Canopy Classification, Remote Sensing Analysis, Machine Learning, Open Source, Python Module

ABSTRACT:

Studying deforestation has been an important topic in forestry research. In particular, canopy classification using remotely sensed data plays an essential role in monitoring tree canopy on a large scale. As remote sensing technologies advance, the quality and resolution of satellite imagery have significantly improved. Oftentimes, leveraging high-resolution imagery such as the National Agriculture Imagery Program (NAIP) imagery requires proprietary software. However, the lack of insight into the inner workings of such software and the inability to modify its code lead many researchers towards open-source solutions. In this research, we introduce CanoClass, an open-source cross-platform canopy classification system written in Python. CanoClass utilizes the Random Forest and Extra Trees algorithms provided by scikit-learn to classify canopy using remote sensing imagery. Based on our benchmark tests, this new canopy classification system was 283 % to 464 % faster than commercial Feature Analyst, but it produced comparable results with a similarity of 87.56 % to 87.62 %.

1. INTRODUCTION

Forested areas play an integral role in the maintenance of both local and global environments. They account for the bulk of Earth's carbon sequestration for mitigating anthropogenic processes (Bala et al., 2007; Platz, 2015; Reed and Kaye, 2020; Shen et al., 2020), provide natural erosion and runoff control for flooding events, which have been growing in frequency because of climate change (Benito et al., 2003; Sriwongsitanon and Taesombat, 2011), and can offer respite for urban heat islands (Wong and Yu, 2005; Rani et al., 2018; Bosch et al., 2020). The effective creation of canopy data is of utmost importance to analyze the aforementioned processes in addition to forest patterns such as disturbance, mortality, and the societal and economic effects forests can provide (Senf et al., 2018; Senf and Seidl, 2020). Deforestation monitoring is an essential part of maintaining any environment as the loss of forested lands leads to increased carbon dioxide being placed into the atmosphere while simultaneously eliminating carbon storage (Bala et al., 2007). At smaller scales, deforestation leads to increased runoff rates and subsequently increased erosion, especially in areas where no plant reclamation is initiated (Benito et al., 2003). As improvements are made in the fields of geospatial science and remote sensing, an increasing emphasis is put on accurate forest canopy detection, among other ecological factors, for the purpose of monitoring and predicting canopy change (Franklin, 2001). However, monitoring forest ecosystems accurately to mitigate these effects on a large scale can be a time-consuming and difficult process to complete (Basu et al., 2015), and there can be many inhibiting factors such as access to high-resolution imagery data and access to software capable of processing the amount of data required. Subsequently, with the increase of high spatial resolution data available both publicly and commercially, a need arises for implementations capable of reproducible and efficient classification schemes designed specifically for tree canopy detection.

In this research, we developed the CanoClass Python module for canopy classification built on top of scikit-learn (Pedregosa et al., 2011) and the Geospatial Data Abstraction Library (GDAL) (GDAL/OGR Contributors, 2020). Dallaqua et al. (2018) have used scikit-learn for deforestation monitoring in which a committee system was developed. CanoClass is built around the classification of the National Agriculture Imagery Program (NAIP) imagery and is capable of classifying both 1-m and the submeter resolution imagery made available in the 2019 iterations of the NAIP program, leading to increased accuracies (USDA, 2020). Section 2 introduces our Python module and elaborates on how it works. Sections 3 and 4, respectively, present data and discuss results for a case study. Finally, we conclude our research in Section 5.

2. METHODS

2.1 Challenges in Canopy Classification

In this study, we used remote sensing data to classify canopy. However, as the resolution of remote sensing data increases and the extent of the study area grows, classifying geospatial data as canopy versus non-canopy becomes computationally challenging. Challenges faced in classifying canopy include the misclassification of water as canopy, noisy or cluttered outputs in higher resolution data sets, and linear artifact creation along data set edges. Furthermore, as data sets grow, so does the computational time required to process them. For example, as we will discuss later, we used 1-m NAIP imagery for our case study. However, since a single 1-m NAIP raster tile contains approximately 50 million individual cells, processing multiple NAIP tiles for a large study area can lead to considerable computational times if the workflow is not optimized. To address these issues, we employed machine-learning techniques where we first compute vegetation indices using multi-band remote sensing data, and apply classification and post-processing algorithms designed specifically for canopy classification problems.

The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLVI-4/W2-2021

FOSS4G 2021 - Academic Track, 27 September-2 October 2021, Buenos Aires, Argentina

This contribution has been peer-reviewed.

https://doi.org/10.5194/isprs-archives-XLVI-4-W2-2021-175-2021 | © Author(s) 2021. CC BY 4.0 License.
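The cell count quoted in Section 2.1 can be checked with a short back-of-the-envelope sketch. It uses the approximate 7 km by 7 km NAIP tile extent given in Section 2.5; the helper names, the 8-bit band depth, and the 0.6-m example resolution are illustrative assumptions, not values taken from CanoClass.

```python
def tile_cells(width_m: float, height_m: float, resolution_m: float) -> int:
    """Number of raster cells covering a rectangular tile at a given cell size."""
    cols = round(width_m / resolution_m)
    rows = round(height_m / resolution_m)
    return rows * cols


def tile_bytes(cells: int, bands: int = 4, bytes_per_value: int = 1) -> int:
    """Uncompressed in-memory size, assuming 8-bit values in each band."""
    return cells * bands * bytes_per_value


# Approximate NAIP tile extent from Section 2.5 at 1-m resolution.
cells_1m = tile_cells(7_000, 7_000, 1.0)
print(f"1-m tile: {cells_1m:,} cells, {tile_bytes(cells_1m) / 1e6:.0f} MB for 4 bands")

# Hypothetical submeter case (0.6 m is an assumed example resolution).
cells_sub = tile_cells(7_000, 7_000, 0.6)
print(f"0.6-m tile: {cells_sub:,} cells, {tile_bytes(cells_sub) / 1e6:.0f} MB for 4 bands")
```

Under these assumptions a single 1-m tile holds 49 million cells, which is consistent with the figure of approximately 50 million, and the count grows with the inverse square of the cell size.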


2.2 Vegetation Indices

Vegetation indices are extensively used when one attempts to separate vegetation from other types of land cover. These indices typically use the near-infrared (NIR) band in their equations because radiation at wavelengths of 0.75 μm to 0.8 μm in this band is strongly reflected by photosynthetically active vegetation and strongly absorbed by bodies of water, which helps distinguish vegetation from water and impervious surfaces (Tucker, 1979). To account for atmospheric effects, we used the Atmospherically Resistant Vegetation Index (ARVI) (Kaufman and Tanre, 1992).

The ARVI is written as

$$\mathrm{ARVI} = \frac{\mathrm{NIR} - 2\,\mathrm{Red} + \mathrm{Blue}}{\mathrm{NIR} + 2\,\mathrm{Red} + \mathrm{Blue}} \tag{1}$$

where it uses the blue band in conjunction with the red and NIR bands to provide correction for atmospheric effects. The ARVI is four times less sensitive on average to atmospheric effects than the Normalized Difference Vegetation Index (NDVI), the most widely used vegetation index (Kaufman and Tanre, 1992).
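Equation (1) can be evaluated cell by cell on band arrays. The following is a minimal NumPy sketch, not the CanoClass implementation: the function name `arvi` and the sample values are hypothetical, and the signs follow Equation (1) as it can be read from this section. Kaufman and Tanre (1992) define the index through a combined red-blue term, so the signs should be checked against that reference before relying on the values.

```python
import numpy as np


def arvi(nir, red, blue):
    """Evaluate Equation (1) per cell; cells with a zero denominator become NaN."""
    nir = np.asarray(nir, dtype=np.float64)
    red = np.asarray(red, dtype=np.float64)
    blue = np.asarray(blue, dtype=np.float64)

    numerator = nir - 2.0 * red + blue
    denominator = nir + 2.0 * red + blue

    # Pre-fill with NaN so that cells skipped by `where` stay undefined.
    out = np.full(numerator.shape, np.nan)
    np.divide(numerator, denominator, out=out, where=denominator != 0)
    return out


# Hypothetical 8-bit band values for three cells; the last cell is all zeros.
nir_band = np.array([200, 20, 0], dtype=np.uint8)
red_band = np.array([50, 30, 0], dtype=np.uint8)
blue_band = np.array([40, 60, 0], dtype=np.uint8)

index = arvi(nir_band, red_band, blue_band)
print(index)
```

Casting to floating point before the arithmetic matters for 8-bit imagery, because unsigned integer subtraction would otherwise wrap around instead of producing negative values.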

2.3 Remote Sensing Data

We used four-band (red, green, blue, and NIR) NAIP imagery for developing and testing our canopy classification framework. Previous studies that utilized open-source classification systems have not been able to achieve accuracy at a level that NAIP imagery can provide, having been limited to using only public access imagery such as Landsat 8 (Roy et al., 2014) (30-m resolution) or Moderate Resolution Imaging Spectroradiometer (MODIS) (NASA, 2020) (250-m to 1000-m resolution) (Simić de Torres, 2016; Dallaqua et al., 2018). While NAIP imagery is on a 3-year cycle and cannot match the temporal frequency in which satellite imagery is taken, it is taken during seasons in which agriculture is growing in the United States, ensuring similar characteristics between data sets (USDA, 2020). Furthermore, cloud masking will not be needed for processing as NAIP imagery's quality control removes any image that has more than 10 % cloud cover per quarter quad (QQ), rendering the need for a cloud mask negligible (USDA, 2020). The lack of cloud cover in NAIP imagery will remove issues that previous studies had with utilizing vegetation indices because the presence of clouds would render the index increasingly unreliable the more cloud cover the scene contained. It is important to note that, while NAIP was used to develop and test the canopy classification framework and most batch processing functions are developed to use NAIP imagery, the individual classification and training modules are capable of working with any remotely sensed vegetation indices developed from other satellites such as Landsat 8 and Sentinel-2 (Drusch et al., 2012).

2.4 Classification Algorithms

We considered two classification methods: the Random Forest and Extra Trees classifiers. Both are capable of utilizing multi-core processing for increased computational speed, putting them at an advantage over other algorithms.

2.4.1 Random Forest Algorithm

The Random Forest (RF) algorithm is a combined multi-tree predictor built upon bootstrap aggregating where each decision node is split using a random selection of features and the most popular class is subsequently chosen based on a vote after the specified number of trees are generated (Breiman, 2001). In cases of land-cover classification, the RF algorithm is found to be as effective as, if not more effective than, other popular similar ensemble algorithms such as boosting and bagging (Breiman, 2001; Gislason et al., 2006). In addition, the RF algorithm has been found to have a lighter computational load than the popular AdaBoost algorithm (Freund and Schapire, 1996). The lighter computational load of the RF classifier is thanks to the random selection of variables to split decision trees. It minimizes the correlation between trees and utilizes bootstrapping, which means that a portion as opposed to the entire data set is used for each tree. However, the RF algorithm can use a considerable amount of memory because a matrix of the number of samples by the number of trees is stored in memory (Gislason et al., 2006). With memory usage in mind, the RF classifier is still an ideal algorithm for use with large data sets since it does not overfit as the algorithm follows the law of large numbers (Etemadi, 1981) and is considerably less sensitive to noise than other boosting or bagging algorithms (Breiman, 2001). RF algorithms have been used with success when classifying vegetation and land cover, and, in the case of canopy classification, are often found to outperform other algorithms (Coulston et al., 2012).
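The RF behaviour described above can be illustrated with scikit-learn's `RandomForestClassifier`. This is a minimal sketch under stated assumptions rather than the CanoClass code: the training samples are synthetic single-feature index values, and the class means, sample counts, and raster size are hypothetical choices made only for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Hypothetical training samples: one vegetation index value per cell.
canopy = rng.normal(loc=0.6, scale=0.05, size=(500, 1))
non_canopy = rng.normal(loc=0.1, scale=0.05, size=(500, 1))
X = np.vstack([canopy, non_canopy])
y = np.concatenate([np.ones(500, dtype=np.uint8), np.zeros(500, dtype=np.uint8)])

clf = RandomForestClassifier(
    n_estimators=100,      # number of trees that vote on each cell
    max_features="sqrt",   # random subset of features tried at each split
    bootstrap=True,        # each tree is grown on a bootstrap sample
    n_jobs=-1,             # use all available cores
    random_state=0,
)
clf.fit(X, y)

# Classify a hypothetical index raster by flattening it to (cells, features).
index_raster = rng.uniform(-0.2, 0.9, size=(64, 64))
predicted = clf.predict(index_raster.reshape(-1, 1)).reshape(index_raster.shape)
print(predicted.shape, predicted.dtype)
```

Flattening the raster to a two-dimensional sample matrix and reshaping the prediction afterwards keeps the classifier independent of the raster layout, and `n_jobs=-1` is the setting that enables the multi-core processing mentioned in Section 2.4.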

2.4.2 Extra Trees Algorithm

The Extra Trees (ET) algorithm is similar to the RF algorithm in that it is a multi-tree predictor built using an ensemble of decision trees and the most popular class is chosen based on an aggregation of trees. However, in contrast to RF, the ET algorithm splits the nodes of the tree completely at random whereas the RF algorithm cuts the node at the locally optimal combination of features and split (Breiman, 2001; Geurts et al., 2006). Additionally, the ET classifier uses the entirety of the sample and not just the bootstrap to grow trees, which means that each tree is independent or uncorrelated to the last (Geurts et al., 2006). The ET algorithm has higher bias and lower variance than the standard RF algorithm because of the increased randomness of the split nodes (Geurts et al., 2006). These differences lead to the ET algorithm's biggest strength, which is its computational efficiency that can be attributed to its increased randomness and simplistic approach to node splitting when compared to the RF algorithm. In empirical comparisons, the ET algorithm is on average about three times faster than the RF algorithm applied to the same data set (Geurts et al., 2006). The computational efficiency in addition to its usefulness when classifying high-dimensional objects such as imagery makes the ET algorithm an ideal algorithm for an efficient classification system (Xu et al., 2010; Lawson et al., 2017).
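The two differences described above, random split points and the use of the whole sample, correspond to how scikit-learn configures `ExtraTreesClassifier` relative to `RandomForestClassifier`. The sketch below fits both on the same synthetic data and measures how often they agree. The data, the helper `make_samples`, and the parameter values are hypothetical, and no timing is reported because relative speed depends on the data and the machine.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier

rng = np.random.default_rng(42)


def make_samples(n_per_class):
    """Hypothetical one-feature samples (for example, an index value per cell)."""
    canopy = rng.normal(0.6, 0.05, size=(n_per_class, 1))
    other = rng.normal(0.1, 0.05, size=(n_per_class, 1))
    X = np.vstack([canopy, other])
    y = np.concatenate([np.ones(n_per_class, dtype=int),
                        np.zeros(n_per_class, dtype=int)])
    return X, y


X_train, y_train = make_samples(500)
X_test, y_test = make_samples(200)

# RF: bootstrap sample per tree, best split among a random feature subset.
rf = RandomForestClassifier(n_estimators=100, bootstrap=True,
                            n_jobs=-1, random_state=0)
# ET: whole training sample per tree, split thresholds drawn at random.
et = ExtraTreesClassifier(n_estimators=100, bootstrap=False,
                          n_jobs=-1, random_state=0)

rf.fit(X_train, y_train)
et.fit(X_train, y_train)

agreement = np.mean(rf.predict(X_test) == et.predict(X_test))
et_accuracy = et.score(X_test, y_test)
print(f"RF/ET agreement: {agreement:.3f}, ET accuracy: {et_accuracy:.3f}")
```

Because both estimators share the same `fit` and `predict` interface, a classification workflow can switch between them by changing a single constructor call.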

2.5 CanoClass Python Module

CanoClass was developed to bridge the gaps between remotely sensed imagery, classification, and post-processing required for large data sets through the development of multi-phase processing modules. As a Python module, CanoClass is separated into two sections: NAIP classification and the classification of all other remote sensing products that offer 4-band imagery. The separation of NAIP from other remote sensing imagery is due to the differences in processing created for NAIP imagery, in particular the batch processing created for NAIP imagery to allow for its scalable application. Batch processes for imagery such as Landsat 8 and MODIS are not created chiefly because of the difference in scale between NAIP and other imagery. The extent of a single Landsat 8 image is 185 km by 180 km while that of a single NAIP image is approximately 7 km by 7 km, making the scalability of Landsat 8 less important than other remote imagery as it already encompasses such a large area in comparison. The processing of both NAIP and other imagery is shown in Figure 1 and described in detail below.

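[Figure 1. Processing workflow for NAIP and other imagery. Recoverable step labels: Classification; Reproject and Clip to QQ; Mosaic; Clip to RO (label truncated).]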