Study on Fashion Image Retrieval Methods for Efficient Fashion
Despite recent advances, FIR still has limitations for application to real-world visual searches. The main reason for this is not only the trade-off between the
Memory-Augmented Attribute Manipulation Networks for Interactive
Fashion search with attribute manipulation. The user provides a clothing image with an additional description of wanted attributes not yet included in the
Which Is Plagiarism: Fashion Image Retrieval Based on Regional
With the proposed regional attention, we compare images region by region and find it is better than typical attention for plagiarized-clothes retrieval
Generative Attribute Manipulation Scheme for Flexible Fashion Search
Fashion Search; Generative Adversarial Networks; Attribute Manipulation; prototype image generation and metric learning for fashion search.
Studio2Shop: from studio photo shoots to fashion articles
Shaw. New means of making fashion searchable and helping shoppers find the articles ... shop where the query picture is a photo shoot image of.
A new clothing image retrieval algorithm based on sketch
tive sketch-based clothing image retrieval system to search clothing images by using mobile visual sensors. Although sketching on mobile sensors is an expres-.
Learning Attribute Representations With Localization for Flexible
credible fashion search platform should be able to (1) find images that share the same attributes as the query image, and (2) allow users to manipulate certain
Where to Buy It: Matching Street Clothing Photos in Online Shops
Given a real-world photo of a clothing item, e.g., taken on the street
Personal Clothing Retrieval on Photo Collections by Color and
the query image. However, visual search of dressed clothing in photo collections is a challenging task
Cross-Domain Image Retrieval With a Dual Attribute-Aware Ranking
user photo depicting a clothing image our goal is to re- image search [36] aims at identifying a product
scenarios. In addition, we have also obtained corresponding fine-grained clothing attributes (e.g., clothing color, collar pattern, sleeve shape, sleeve length, etc.) from the available online product descriptions, without significant annotation cost. As data pre-processing, in order to remove the impact of cluttered backgrounds, which predominantly exist in the offline images, we employ an enhanced R-CNN detector to localize the clothing area in the image, with some refinements particularly made for the clothing detection problem.

For addressing the problem of cross-domain retrieval, we propose a novel Dual Attribute-aware Ranking Network (DARN) for retrieval feature learning. DARN consists of two sub-networks with similar structure. The images of each domain are fed into one of the two sub-networks. This specific design aims to diminish the discrepancy between online and offline images. The two sub-networks are designed to be driven by semantic attribute learning, so we call them attribute-aware networks. The intuition is to create a powerful semantic representation of clothing in each domain, by leveraging the vast amounts of data annotated with fine-grained clothing attributes. Tree-structure layers are embedded into each sub-network for the comprehensive integration of attributes and their full relations. Specifically, the low-level layers of each sub-network are shared for learning the low-level representation. Then, a set of fully connected layers in a tree structure is used to construct the high-level component, with each branch modelling one attribute. Based on the learned semantic features from each sub-network, we impose a rank objective to further enhance the retrieval feature representation. Specifically, the triplet ranking loss is used to constrain the feature similarity of triplets, i.e., the feature distance between an online-offline image pair must be smaller than that between the offline image and any other dissimilar online image.

Generally, the retrieval features from DARN have several advantages compared with the deep features of other works [19, 8]. (1) By using the dual-structure network, our model can handle the cross-domain problem more appropriately. (2) In each sub-network, the scenario-specific semantic representation of clothing is elaborately captured by the tree-structure attribute layers. (3) On top of the semantic representation, the visual similarity constraint enables more effective feature learning for the retrieval problem.

In summary, the main contributions of our paper are:
1. We collect a unique dataset composed of cross-scenario image pairs with fine-grained attributes. The number of online images is about 450,000, with an additional 90,000 offline counterparts collected. Each image has about 5-9 semantic attribute categories, with more than a hundred possible attribute values. This online-offline image pair dataset provides a training/testing platform for many real-world applications related to clothing analytics. We are planning to release the full dataset to the community for research purposes only.
2. We propose the Dual Attribute-Aware Ranking Network, which simultaneously integrates the attribute and visual similarity constraints into the retrieval feature learning. We design tree-structure layers to comprehensively capture attributes and their full relations, which provides a new insight on multi-label learning. We also introduce the triplet loss function, which perfectly fits into the deep network training.

3. We conduct extensive experiments proving the effectiveness and robustness of the framework and each one of its components for the clothing retrieval problem. The top-20 retrieval accuracy is doubled when using the proposed DARN rather than using pre-trained CNN features only (0.570 vs. 0.268). The proposed method is general and could be applied to other cross-domain image retrieval problems.

2. Related Work
Fashion Datasets. Recently, several datasets containing a wide variety of clothing images captured from fashion websites have been carefully annotated with attribute labels [45, 9, 32, 18]. These datasets are primarily designed for training and evaluation of clothing parsing and attribute estimation algorithms. In contrast, our data is comprised of a large set of clothing image pairs depicting user photos and corresponding garments from online shopping, in addition to fine-grained attributes. Notably, this real-world data is essential to bridge the gap between the two domains.

Visual Analysis of Clothing. Many methods have been recently proposed for the automated analysis of clothing images, spanning a wide range of application domains. In particular, clothing recognition has been used for context-aided people identification [13], fashion style recognition [21], occupation recognition [39], and social tribe prediction [26]. Clothing parsing methods, which produce semantic labels for each pixel in the input image, have received significant attention in the past few years [45, 9]. In the surveillance domain, matching clothing images across cameras is a fundamental task for the well-known person re-identification problem [28, 37].
Recently, there is a growing interest in methods for clothing retrieval [20, 33, 31, 44] and outfit recommendation [18]. Most of those methods do not model the discrepancy between the user photos and online clothing images. An exception is the work of Liu et al. [31], which follows a very different methodology than ours, based on part-based alignment and features derived from sparse reconstruction, and does not exploit the richness of our data obtained by mining images from customer reviews.

Visual Attributes. Research on attribute-based visual representations has received renewed attention from the computer vision community in the past few years [27, 11, 34, 43]. Attributes usually refer to semantic properties of objects or scenes that are shared across categories. Among other applications, attributes have been used for zero-shot learning [27], image ranking and retrieval [38, 22, 17], fine-grained categorization [3], scene understanding [35], and sentence generation from images [25].
Related to our application domain, Kovashka et al. [22] developed a system called "WhittleSearch", which is able to answer queries such as "Show me shoe images like these, but sportier". They used the concept of relative attributes proposed by Parikh and Grauman [34] for relevance feedback. Attributes for clothing have been explored in several recent papers [4, 5, 2]. They allow users to search visual content based on fine-grained descriptions, such as a "blue striped polo-style shirt".

Attribute-based representations have also shown compelling results for matching images of people across domains [37, 29]. The work by Donahue and Grauman [7] demonstrates that richer supervision conveying annotator rationales based on visual attributes can be considered as a form of privileged information [42]. Along this direction, in our work, we show that cross-domain image retrieval can benefit from feature learning that simultaneously optimizes a loss function that takes into account visual similarity and attribute classification.

Deep Learning. Deep convolutional neural networks
have achieved dramatic accuracy improvements in many areas of computer vision [23, 14, 40]. The work of Zhang et al. [46] combined poselet classifiers [2] with convolutional nets to achieve compelling results in human attribute prediction. Sun et al. [40] discovered that attributes can be implicitly encoded in high-level features of networks for identity discrimination. In our work, we instead explicitly use attribute prediction as a regularizer in deep networks for cross-domain image retrieval.

Existing approaches for image retrieval based on deep learning have outperformed previous methods based on other image representations [1]. However, they are not designed to handle the problem of cross-domain image retrieval. Several domain adaptation methods based on deep learning have been recently proposed [16, 6]. Related to our work, Chen et al. [5] use a double-path network with alignment cost layers for attribute prediction. In contrast, our work addresses the problem of cross-domain retrieval feature learning, proposing a novel network architecture that learns effective features for measuring visual similarity across domains. We note that other domain adaptation methods [24, 15] could even be applied on top of our learned
features to further refine retrieval results.

Attribute category    Examples (total number)
Clothes Button        Double Breasted, Pullover, ... (12)
Clothes Category      T-shirt, Skirt, Leather Coat, ... (20)
Clothes Color         Black, White, Red, Blue, ... (56)
Clothes Length        Regular, Long, Short, ... (6)
Clothes Pattern       Pure, Stripe, Lattice, Dot, ... (27)
Clothes Shape         Slim, Straight, Cloak, Loose, ... (10)
Collar Shape          Round, Lapel, V-Neck, ... (25)
Sleeve Length         Long, Three-quarter, Sleeveless, ... (7)
Sleeve Shape          Puff, Raglan, Petal, Pile, ... (16)

Table 1. Clothing attribute categories and example values. The number in brackets is the total number of values for each category.

Figure 2. Some examples of online-offline image pairs, containing images of different human pose, illumination, and varying background. Particularly, the offline images contain many selfies with high occlusion.

3. Data Collection
We have collected about 453,983 online upper-clothing images in high resolution (about 800×500 on average) from several online-shopping websites. Generally, each image contains a single frontal-view person. From the surrounding text of the images, semantic attributes (e.g., clothing color, collar shape, sleeve shape, clothing style) are extracted as key-value pairs, where the key corresponds to an attribute category (e.g., color) and the value is the attribute label (e.g., red, black, white, etc.). Then, we manually pruned the noisy labels, merged similar labels based on human perception, and removed those with a small number of samples. After that, 9 categories of clothing attributes are extracted and the total number of attribute values is 179. As an example, there are 56 values for the color attribute. The specified attribute categories and example attribute values are presented in Table 1. This large-scale dataset annotated with fine-grained clothing attributes is used to learn a powerful semantic representation of clothing, as we will describe in the next section.

Recall that the goal of our retrieval problem is to find the online shopping images that correspond to a given query photo in the "street" domain uploaded by the user. To analyze the discrepancy between the images in the shopping scenario (online images) and the street scenario (offline images), we collect a large set of offline images with their online counterparts. The key insight to collect this dataset is that there are many customer review websites where users post photos of the clothing they have purchased. As the link to the corresponding clothing images from the shopping store is available, it is possible to collect a large set of online-offline image pairs.

Figure 3. The distribution of online-offline image pairs across the 20 upper-clothing categories (e.g., T-shirt, shirt, sweater, down jacket, woolen coat).

We initially crawled 381,975 online-offline image pairs of different categories from the customer review pages. Then, after a data curation process, where several annotators helped removing unsuitable images, the data was reduced to 91,390 image pairs. For each of these pairs, fine-grained clothing attributes were extracted from the online image descriptions. Some examples of cropped online-offline image pairs are presented in Figure 2. As can be seen, each pair of images depicts the same clothing, but in different scenarios, exhibiting variations in pose, lighting, and background clutter. The distribution of the collected online-offline images is illustrated in Figure 3. Generally, the numbers of images of the different categories in both scenarios are almost of the same order of magnitude, which is helpful for training the retrieval model.

In summary, our dataset is suitable for the clothing retrieval problem for several reasons. First, the large amount of images enables effective training of retrieval models, especially deep neural network models. Second, the information about fine-grained clothing attributes allows learning of semantic representations of clothing. Last but not least, the online-offline image pairs bridge the gap between the shopping scenario and the street scenario, providing rich information for real-world applications.

4. Technical Approach
The unique dataset introduced in the previous section serves as the fuel to power up our attribute-driven feature learning approach for cross-domain retrieval. Next we describe the main components of our proposed approach, and how they are assembled to create a real-world cross-domain clothing retrieval system.

4.1. Dual Attribute-aware Ranking Network

In this section, the Dual Attribute-aware Ranking Network (DARN) is introduced for retrieval feature learning. Compared to existing deep features [19, 8], DARN simultaneously integrates semantic attributes with visual similarity constraints into the feature learning stage, while at the same time modeling the discrepancy between domains.

Network Structure. The structure of DARN is illustrated in Figure 4. Two sub-networks with similar Network-in-Network (NIN) models [30] are constructed as its foundation. During training, the images from the online shopping domain are fed into one sub-network, and the images from the street domain are fed into the other. Each sub-network aims to represent the domain-specific information and generate high-level comparable features as output. The NIN model in each sub-network consists of five stacked convolutional layers followed by MLPConv layers as defined in [30], and two fully connected layers (FC1, FC2). To increase the representation capability of the intermediate layer, the fourth layer, named Conv4, is followed by two MLPConv layers.
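To make the stacked-layer design concrete, the spatial size of each feature map follows the standard convolution/pooling arithmetic: out = ⌊(in + 2·pad − kernel) / stride⌋ + 1. The plain-Python sketch below applies this to the first layer of a sub-network; the `conv_out` helper is ours, and the stride and pooling parameters used here are illustrative assumptions, not values stated in the text:

```python
def conv_out(size, kernel, stride, pad=0):
    """Spatial output size of a convolution or pooling layer
    (integer division discards partial windows)."""
    return (size + 2 * pad - kernel) // stride + 1

# Illustration for a 227x227 input through a 7x7 convolution and a
# 3x3 max-pooling layer; the stride-2 values are assumptions.
after_conv1 = conv_out(227, kernel=7, stride=2)          # 111
after_pool1 = conv_out(after_conv1, kernel=3, stride=2)  # 55
print(after_conv1, after_pool1)  # prints "111 55"
```

Chaining this arithmetic over the five convolutional layers gives the feature-map sizes that the fully connected layers (FC1, FC2) ultimately consume.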
On top of each sub-network, we add tree-structured fully-connected layers to encode information about semantic attributes. Given the semantic features learned by the two sub-networks, we further impose a triplet-based ranking loss function, which separates the dissimilar images with a fixed margin under the framework of learning to rank. The details of semantic information embedding and the ranking loss are introduced next.

Semantic Information Embedding. In the clothing domain, attributes often refer to the specific description of certain parts (e.g., collar shape, sleeve length) or clothing (e.g., clothes color, clothes style). Complementary to the visual appearance, this information can be used to form a powerful semantic representation for the clothing retrieval problem. We embed tree-structure layers to comprehensively capture the information of attributes and their full relations.

Specifically, we transmit the FC2 response of each sub-network to the tree-structured layers, with one fully-connected branch to model each attribute separately. In this tree-structured network, the visual features from the low-level layers are shared among attributes, while the semantic features from the high-level layers are learned separately. The neuron number in the output layer of each branch equals the number of corresponding attribute values (see Table 1). Since each attribute has a single value, the cross-entropy loss is used in each branch. Note that the values of some attributes may be missing for some clothing images. In this case, the gradients from the corresponding branches are simply set to zero.
Figure 4. The DARN structure. Each sub-network takes a 227×227 input and stacks five convolutional layers (Conv1: 7×7×3×96; Conv2: 5×5×96×256; Conv3: 3×3×256×512; Conv4: 3×3×512×1024; Conv5: 3×3×384×512) interleaved with max-pooling layers, followed by two 4096-d fully connected layers and the tree-structured attribute branches.
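The two training signals described above, the per-branch cross-entropy (with gradients zeroed for branches whose attribute value is missing) and the triplet ranking loss (the online-offline pair distance must be smaller, by a fixed margin, than the distance to any dissimilar online image), can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the function names, the Euclidean distance, and the default margin are assumptions, and real training would compute these losses over network features in a deep learning framework.

```python
import math

def triplet_ranking_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: require d(anchor, positive) + margin
    <= d(anchor, negative). Here anchor is the offline (street) feature,
    positive its online counterpart, negative a dissimilar online feature.
    The margin value is an assumption for illustration."""
    dist = lambda a, b: math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return max(0.0, dist(anchor, positive) - dist(anchor, negative) + margin)

def masked_cross_entropy(branch_logits, branch_labels):
    """Sum of per-attribute-branch cross-entropy losses. A branch whose
    label is missing (None) is skipped, i.e., it contributes no gradient."""
    total = 0.0
    for logits, label in zip(branch_logits, branch_labels):
        if label is None:
            continue  # missing attribute value: zero out this branch
        z = max(logits)  # shift for a numerically stable log-sum-exp
        log_sum_exp = z + math.log(sum(math.exp(v - z) for v in logits))
        total += log_sum_exp - logits[label]  # -log softmax[label]
    return total

# A triplet whose positive is already much closer than its negative
# incurs zero loss:
print(triplet_ranking_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0]))  # prints 0.0
```

During training, the masked cross-entropy drives the attribute branches of each sub-network, while the triplet term is applied to the shared feature layer across the two sub-networks.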