arXiv:1809.09299v1 [cs.CV] 25 Sep 2018

Triply Supervised Decoder Networks for Joint Detection and Segmentation

Jiale Cao1, Yanwei Pang1, Xuelong Li2

1School of Electrical and Information Engineering, Tianjin University

2Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences

connor@tju.edu.cn, pyw@tju.edu.cn, xuelongli@opt.ac.cn

Abstract

Joint object detection and semantic segmentation can be applied to many fields, such as self-driving cars and unmanned surface vessels. Important initial progress toward this goal has been achieved by simply sharing the deep convolutional features between the two tasks. However, this simple scheme is unable to make full use of the fact that detection and segmentation are mutually beneficial. To overcome this drawback, we propose a framework called TripleNet, in which triple supervisions, namely detection-oriented supervision, class-aware segmentation supervision, and class-agnostic segmentation supervision, are imposed on each layer of the decoder network. Class-agnostic segmentation supervision provides objectness prior knowledge for both semantic segmentation and object detection. Besides the three types of supervision, two light-weight modules (i.e., the inner-connected module and attention skip-layer fusion) are also incorporated into each layer of the decoder. In the proposed framework, detection and segmentation can sufficiently boost each other. Moreover, class-agnostic and class-aware segmentation on each decoder layer are not performed at the test stage, so no extra computational cost is introduced at the test stage. Experimental results on the VOC2007 and VOC2012 datasets demonstrate that the proposed TripleNet is able to improve both detection and segmentation accuracy without adding extra computational cost.
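The triple supervision described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the loss weights, the use of label 0 for background, and the simplification that every decoder layer is supervised at the same label resolution are all assumptions made for illustration. The detection loss is treated as a precomputed number per layer.

```python
import math

BACKGROUND = 0  # assumed label id for background pixels


def to_class_agnostic(labels):
    """Collapse a class-aware label map into an objectness map.

    0 stays background; every object class becomes 1.
    """
    return [[0 if c == BACKGROUND else 1 for c in row] for row in labels]


def pixel_cross_entropy(probs, labels):
    """Mean cross-entropy over pixels.

    probs[y][x] is a list of per-class probabilities for that pixel.
    """
    total, count = 0.0, 0
    for prob_row, label_row in zip(probs, labels):
        for p, c in zip(prob_row, label_row):
            total += -math.log(max(p[c], 1e-12))
            count += 1
    return total / count


def triple_loss(layers, labels, w_det=1.0, w_aware=1.0, w_agnostic=1.0):
    """Sum the three supervisions over all decoder layers.

    Each entry of `layers` is a dict (hypothetical layout) holding a
    precomputed detection loss ("det_loss") and per-pixel probabilities
    from the class-aware ("aware") and class-agnostic ("agnostic")
    segmentation heads of that decoder layer.
    """
    objectness = to_class_agnostic(labels)
    total = 0.0
    for layer in layers:
        total += w_det * layer["det_loss"]
        total += w_aware * pixel_cross_entropy(layer["aware"], labels)
        total += w_agnostic * pixel_cross_entropy(layer["agnostic"], objectness)
    return total
```

Because the per-layer segmentation terms appear only in this training objective, dropping them at test time leaves the inference path unchanged, which is consistent with the claim of no extra test-time cost.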

1. Introduction

Object detection and semantic segmentation are two fundamental and important tasks in computer vision. In recent years, object detection [36, 29, 26] and semantic segmentation [30, 5, 1] with deep convolutional networks [20, 39, 14, 17] have each achieved great progress. Most state-of-the-art methods focus on only a single task and do not join object detection and semantic segmentation together. However, joint object detection and semantic segmentation is necessary and important in many applications, such as self-driving cars and unmanned surface vessels.

Figure 1. Some architectures of joint detection and segmentation. (a) Naive joint network: the last layer of the encoder is used for detection and segmentation [2]. (b) Refined joint network: the branch for detection is refined by the branch for segmentation [31, 47]. (c) BlitzNet: each layer of the decoder detects objects of different scales, and the fused layer is used for segmentation [7]. (d) The proposed PairNet: each layer of the decoder is used simultaneously for detection and segmentation. (e) The proposed TripleNet, which has three types of supervision and some light-weight modules.

In fact, object detection and semantic segmentation are highly related. On the one hand, semantic segmentation, usually used as multi-task supervision, can help improve object detection [31, 24]. On the other hand, object detection can be used as prior knowledge to help improve the performance of semantic segmentation [14, 34].

Due to application requirements and task relevance, joint object detection and semantic segmentation has gradually attracted the attention of researchers. Fig. 1 summarizes three typical methods of joint object detection and semantic segmentation.
