
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS

A 3D CNN-LSTM-Based Image-to-Image Foreground Segmentation

Thangarajah Akilan, Student Member, IEEE, Qingming Jonathan Wu, Senior Member, IEEE, Amin Safaei, Senior Member, IEEE, Jie Huo, Student Member, IEEE, and Yimin Yang, Member, IEEE

Abstract: The video-based separation of foreground (FG) and background (BG) has been widely studied due to its vital role in many applications, including intelligent transportation and video surveillance. Most of the existing algorithms are based on traditional computer vision techniques that perform pixel-level processing, assuming that FG and BG possess distinct visual characteristics. Recently, state-of-the-art solutions exploit deep learning models originally targeted at image classification. A major drawback of such a strategy is the poor delineation of FG regions due to missing temporal information, as these models segment the FG with a single-frame object detection strategy. To grapple with this issue, we excogitate a 3D convolutional neural network (3D CNN) with long short-term memory (LSTM) pipelines that harness seminal ideas, viz., fully convolutional networking, 3D transpose convolution, and residual feature flows. Thence, an FG-BG segmenter is implemented in an encoder-decoder fashion and trained on representative FG-BG segments. The model devises a strategy called double encoding and slow decoding, which fuses the learned spatio-temporal cues with appropriate feature maps in both the down-sampling and up-sampling paths to achieve a well-generalized FG object representation. Finally, from the Sigmoid confidence map generated by the 3D CNN-LSTM model, the FG is identified automatically by using Nobuyuki Otsu's method and an empirical global threshold. The analysis of experimental results via standard quantitative metrics on 16 benchmark datasets, including both indoor and outdoor scenes, validates that the proposed 3D CNN-LSTM achieves competitive performance in terms of figure of merit evaluated against prior and state-of-the-art methods. Besides, a failure analysis is conducted on 20 video sequences from the DAVIS 2016 dataset.

Index Terms: Deep learning, foreground-background segmentation, intelligent systems, LSTM, spatiotemporal cues.

I. INTRODUCTION

IN THE field of Intelligent Transportation Systems (ITS), including autonomous driving, driver assistance, Simultaneous Localization and Mapping (SLAM), and pedestrian and vehicle detection, FG-BG segmentation using visual cues has been an integral subsystem. Video-based intelligent systems have become ubiquitous due to a myriad of easily accessible low-priced camera modules. Such applications face the crucial challenge of processing a massive volume of data from multiple feeds at the same time. They are also required to tackle varying environmental factors, like illumination changes, dynamic backgrounds, and so forth [1], [2]. These demands complicate the real-time operation of the systems. In the analysis of traffic flow or human activity, the performance of an ITS substantially depends on the robustness of FG-BG segmentation.

Besides being a core unit of video analytic intelligent frameworks, FG-BG segmentation is also an inherent part of various machine-/computer-vision problems, for instance, attention-aware video analysis [1], [3], video saliency-based object segmentation and retrieval [4]-[6], image quality assessment [7], visual tracking [8], and human-robot or machine interaction [9]. The primary objective of FG-BG segmentation is to place a tight mask where the appearance of an object, a vehicle, or a human is monitored. Such a mask is more informative than a bounding box, as it allows a close localization of the FG objects. An example of FG detection and BG suppression is shown in Fig. 1, where the FG-BG separation is employed to identify the traffic flow on a busy highway. It can be achieved by employing several algorithms categorized into five groups: i) Sample-based [10]-[15], ii) Probabilistic-based [16]-[21], iii) Subspace-based [22]-[24], iv) Codebook-based [25]-[27], and v) Neural network (NN)-based [28]-[33].

Fig. 1. Traffic flow identification: (a) Multi-channel [R, G, B] spatio-temporal input, (b) 3D CNN-LSTM, (c) FG score map, and (d) identified traffic flow (white - FG and dark - BG).

Manuscript received January 26, 2018; revised July 20, 2018, October 9, 2018 and December 6, 2018; accepted February 16, 2019. This work was supported in part by the Canada Research Chair Program and in part by the NSERC Discovery Grant. The Associate Editor for this paper was C. Guo. (Corresponding author: Qingming Jonathan Wu.)
T. Akilan, Q. J. Wu, and J. Huo are with the Department of Electrical and Computer Engineering, University of Windsor, Windsor, ON N9B 3P4, Canada (e-mail: thangara@uwindsor.ca; jwu@uwindsor.ca; huo11@uwindsor.ca).
A. Safaei is with Toronto MicroElectronics Inc., Mississauga, ON L5T 2H7, Canada (e-mail: safaei.a.s@ieee.org).
Y. Yang is with the Computer Science Department, Lakehead University, Thunder Bay, ON P7B 5E1, Canada (e-mail: yyang48@lakeheadu.ca).
Digital Object Identifier 10.1109/TITS.2019.2900426

1524-9050 © 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

Fig. 2. An overview of the proposed 3D CNN-LSTM image-to-image network: 3DConv - 3D convolution w/t downsampling, 3DConvT - 3D transpose conv (up-sampling), BN - Batch normalization, E(·) - binary cross-entropy error.

The sample-based algorithms create a generalized BG model based on the evidence collected at the local level, the global level, or a hybrid of the two from the past set of N frames, i.e., for each pixel/super-pixel location or region there are N samples stored. If there are k samples in the BG model that have a distance smaller than a set threshold to the incoming pixel/super-pixel or region in the current frame, then it is classified as BG, otherwise as FG. The probabilistic models work on the principle of stochastic processes, like Gaussian Mixture Models (GMM) [17], [34] and Conditional Random Field (CRF)-based algorithms [35]. The subspace-based approaches perform a transformation of data to a subspace, such as an Eigenspace or a Principal Component Analysis (PCA)-based subspace. Then, they form a BG model using the subspace and estimate the FG. The Codebook approach generates a dictionary that consists of color, intensity, temporal features, or similar representations. The same properties of a new pixel are compared with the dictionary values to determine its status. The NN-based models formulate FG-BG segmentation as a structured input-output matching problem. Such models have gained their reputation after a series of breakthrough performances in the ImageNet Large-Scale Visual Recognition Challenge (ILSVRC) since 2012. The NN-based techniques have been exploited for visual semantics/labeling [36], [37], medical image partitioning [38], [39], and recently for video FG-BG segmentation [1] as well.

The main challenges in CNN-based FG detection and BG suppression are dealing with time-dependent motion and the dithering effect at the bordering pixels of the FG objects. We address these issues by excogitating a 3D EnDec CNN that utilizes a strategy called double encoding with micro-autoencoders and slow decoding using residual connections, like in ResNet [40], for lost feature recovery, together with 3D Conv and LSTM units to handle local to global long-short term spatio-temporal motion of the FG objects. To facilitate the training process, we take advantage of intra-domain transfer learning.
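The sample-based rule described above (N stored samples per pixel, k matches within a distance threshold) can be illustrated with a small NumPy sketch. It is not the implementation of any cited method: the function name `classify_pixels`, the grayscale absolute-difference distance, the parameter values, and the reading of k as a minimum match count are all assumptions made for illustration.

```python
import numpy as np

def classify_pixels(frame, samples, dist_thresh=20.0, min_matches=2):
    """Return a boolean mask that is True where a pixel is labelled FG.

    frame:   (H, W) grayscale frame (assumed input format).
    samples: (N, H, W) stack of N stored background samples per pixel.
    dist_thresh, min_matches: hypothetical values, not taken from the paper.
    """
    # Distance from each incoming pixel to each of its N stored samples.
    dist = np.abs(samples.astype(np.float64) - frame.astype(np.float64))
    # Number of stored samples that lie closer than the set threshold.
    matches = (dist < dist_thresh).sum(axis=0)
    # Enough close samples -> BG; otherwise the pixel is treated as FG.
    return matches < min_matches

rng = np.random.default_rng(0)
background = np.full((4, 4), 100.0)
# Ten noisy observations of a static background.
samples = background + rng.uniform(-5, 5, size=(10, 4, 4))
frame = background.copy()
frame[1, 2] = 200.0          # one pixel departs strongly from the BG model
mask = classify_pixels(frame, samples)
```

Here only the altered pixel ends up in the FG mask, because none of its stored samples falls within the threshold, while every other pixel matches all ten samples.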

In summary, this paper focuses on improving a Vanilla image-to-image Conv-LSTM model for enhanced FG object localization. To this end, the key contributions of this paper are as follows:
i. It introduces a novel technique named double-encoding, using autoencoder-like micro modules, and slow-decoding, using feature-passing residual connections. Here, an input feature at a stage of the down-sampling process is encoded twice before it fully reaches the next level of dimension-reduced feature map, while the up-sampling process decodes the feature maps with two sets of residual feature flows from the down-sampling stages for every new spatial dimension of the feature space.
ii. The time-dependent video cues are handled by 3D convolutions to capture the short temporal motions, while the long-short term temporal motions are captured by LSTM modules in the down-sampling and up-sampling stages, respectively.
iii. It provides empirical evidence to show the effectiveness of the proposed model compared to a Vanilla Conv-LSTM network.
iv. It carries out exhaustive testing on various video datasets from the benchmark database called change detection 2014 (CDnet) [20] and a failure analysis on DAVIS-2016 dynamic camera video sequences [41].

The rest of this paper is organized as follows: Section II reviews related literature. Section III elaborates the architectural information. Section IV describes the experimental set-up, analyses the performance, and highlights some key characteristics of the compared existing methods. Finally, Section VI concludes the paper with future directions.

II. REVIEW: CNN FOR SEGMENTATION

Deep CNNs have shown state-of-the-art performance in object segmentation/detection/localization over traditional methods, like GMM [5], [17], Graph-cut, Nonparametric models [15], Visual background extractor (ViBe) [11], and Pixel-Based Adaptive Segmenter (PBAS) [42]. Here, the FCN [36] is a pioneering CNN architecture that reinterprets the standard visual classification network as layers of full 2D convolutional computations without flattened fully connected layers. This model introduces feature-level augmentations through skip connections that combine deep, coarse, semantic detail and shallow, fine, appearance cues from chosen mid-layers. In contrast, our model performs 3D convolutions with LSTM modules and does the coarse-level feature fusion in a structured manner, as shown in Fig. 2. The introduction of micro autoencoder blocks is based on the philosophy of increasing the network depth instead of widening it for a better feature generalization. Hence, the residual feature flows counter the vanishing gradient of deep networks by carrying important information from earlier to later layers. Although such shortcuts seem like an addition to the conventional CNN connections, they ease training and reduce the number of parameters [40].

An illustration of the ResNet connection is depicted in Fig. 3(a), where X is an input feature, H(X) is a desired transformation, and F(X) is a residual mapping. In [40], the feature fusion operation H(X) = F(X) + X is performed by a shortcut connection and element-wise addition. Contrastingly, our model stacks the features depth-wise as H(X) = F(X) ⊕ X, like in Fig. 3(b), where ⊕ denotes coarse-level feature concatenation. This allows fewer filters in the conv layers, resulting in less computation.

Fig. 3. CNN feature flows: (a) ResNet flow, and (b) the residual feature mapping of our 3D CNN-LSTM FG segmenter.
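The difference between the two fusion operations can be seen from their output shapes. The following NumPy sketch is only an illustration under stated assumptions: `toy_mapping` is a hypothetical stand-in for a learned residual mapping F, and the channels-last layout (frames, height, width, channels) is chosen for convenience rather than taken from the paper.

```python
import numpy as np

def residual_add(x, f):
    """ResNet-style fusion: H(X) = F(X) + X (F(X) must have the shape of X)."""
    return f(x) + x

def residual_concat(x, f):
    """Concatenation-style fusion: stack F(X) and X along the channel axis."""
    return np.concatenate([f(x), x], axis=-1)

def toy_mapping(x):
    # Hypothetical stand-in for a learned mapping F; it keeps the shape of x.
    return 0.5 * x

x = np.ones((2, 8, 8, 16))            # (frames, height, width, channels)
h_add = residual_add(x, toy_mapping)
h_cat = residual_concat(x, toy_mapping)
print(h_add.shape)  # (2, 8, 8, 16)
print(h_cat.shape)  # (2, 8, 8, 32)
```

Addition keeps the channel count fixed, whereas concatenation doubles it in this toy case, so a layer that feeds a concatenated output can reach a given feature depth while producing fewer channels itself.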

Ronneberger et al. [39] restructure the FCN as an EnDec CNN, referred to as U-net, for biomedical cell segmentation. In that model, the activation maps after each convolution (conv) in the encoding stage are concatenated with the spatially matching activation maps in the decoding stage. This allows the network to exploit the original contextual information to supplement the features after upsampling in the higher layers. In other words, it is a remedy for the spatial resolution lost due to pooling operations or consecutive convolutional kernel striding. The proposed 3D CNN-LSTM model abstracted by Fig. 2 has the following variations from the U-net:
i. The max-pooling operations achieve invariant features but take a toll on object localization accuracy [43]. To circumvent this, we perform the subsampling process through 3D strided conv (kernel size of 3 and stride of 2).
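To make the subsampling effect of a strided convolution concrete, the sketch below computes the output length along one axis and applies a one-dimensional strided cross-correlation in NumPy. Only the kernel size of 3 and the stride of 2 come from the text; the padding values, the fixed kernel weights, and the function names are assumptions for illustration, and a real 3D conv layer applies the same size rule independently along time, height, and width with learned weights.

```python
import numpy as np

def conv_output_size(n, kernel=3, stride=2, padding=0):
    """Output length of a convolution along one axis."""
    return (n + 2 * padding - kernel) // stride + 1

def strided_conv1d(x, w, stride=2):
    """Valid (no padding) 1D cross-correlation with the given stride."""
    k = len(w)
    out_len = conv_output_size(len(x), kernel=k, stride=stride)
    # Slide the kernel over x, jumping `stride` positions between windows.
    return np.array([np.dot(x[i * stride:i * stride + k], w)
                     for i in range(out_len)])

x = np.arange(9, dtype=float)
w = np.array([1.0, 0.0, -1.0])        # hypothetical fixed kernel
y = strided_conv1d(x, w)
half = conv_output_size(64, padding=1)  # padding of 1 is an assumption
```

With a padding of 1, an axis of length 64 is reduced to 32, so the layer halves the resolution as a pooling step would, but with weights that are learned instead of a fixed maximum.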
