Aligning Books and Movies: Towards Story-like Visual Explanations by Watching Movies and Reading Books

Yukun Zhu*1, Ryan Kiros*1, Richard Zemel1, Ruslan Salakhutdinov1, Raquel Urtasun1, Antonio Torralba2, Sanja Fidler1
1University of Toronto, 2Massachusetts Institute of Technology
Abstract
Books are a rich source of both fine-grained information, how a character, an object or a scene looks, as well as high-level semantics, what someone is thinking or feeling and how these states evolve through a story. This paper aims to align books to their movie releases in order to provide rich descriptive explanations for visual content that go semantically far beyond the captions available in current datasets. To align movies and books we exploit a neural sentence embedding that is trained in an unsupervised way from a large corpus of books, as well as a video-text neural embedding for computing similarities between movie clips and sentences in the book. We propose a context-aware CNN to combine information from multiple sources. We demonstrate good quantitative performance for movie/book alignment and show several qualitative examples that showcase the diversity of tasks our model can be used for.

1. Introduction
A truly intelligent machine needs to not only parse the surrounding 3D environment, but also understand why people take certain actions, what they will do next, what they could possibly be thinking, and even try to empathize with them. In this quest, language will play a crucial role in grounding visual information to high-level semantic concepts. Only a few words in a sentence may convey really rich semantic information. Language also represents a natural means of interaction between a naive user and our vision algorithms, which is particularly important for applications such as social robotics or assistive driving.

Combining images or videos with language has gotten significant attention in the past year, partly due to the creation of CoCo [20], Microsoft's large-scale captioned image dataset. The field has tackled a diverse set of tasks such as captioning [15, 13, 40, 39, 24], alignment [13, 17, 38], Q&A [22, 21], visual model learning from textual descriptions [9, 29], and semantic visual search with natural multi-sentence queries [19].

* Denotes equal contribution.

Figure 1: Shot from the movie Gone Girl, along with the subtitle, aligned with the book. We reason about the visual and dialog (text) alignment between the movie and a book.

Books provide us with very descriptive text that conveys both fine-grained visual details (how things look) and high-level semantics (what people think and feel, and how their states evolve through a story). This source of knowledge, however, does not come with associated visual information that would enable us to ground it with natural language. Grounding descriptions in books to vision would allow us to get textual explanations or stories about the visual world rather than the short captions available in current datasets. It could also provide us with a very large amount of data (with tens of thousands of books available online). In this paper, we exploit the fact that many books have
been turned into movies. Books and their movie releases share a lot of common knowledge while also being complementary in many ways. For instance, books provide detailed descriptions about the intentions and mental states of the characters, while movies are better at capturing visual aspects of the settings.

The first challenge we need to address, and the focus of this paper, is to align books with their movie releases in order to obtain rich descriptions for the visual content. We aim to align the two sources with two types of information: visual, where the goal is to link a movie shot to a book paragraph, and dialog, where we want to find correspondences between sentences in the movie's subtitle and sentences in the book (Fig. 1). We introduce a novel sentence similarity measure based on a neural sentence embedding trained on millions of sentences from a large corpus of books. On the visual side, we extend the neural image-sentence embeddings to the video domain and train the model on DVS descriptions of movie clips. Our approach combines different similarity measures and takes into account contextual information contained in the nearby shots and book sentences. Our final alignment model is formulated as an energy minimization problem that encourages the alignment to follow a similar timeline. To evaluate the book-movie alignment model we collected a dataset with 11 movie/book pairs annotated with 2,070 shot-to-sentence correspondences. We demonstrate good quantitative performance and show several qualitative examples that showcase the diversity of tasks our model can be used for. All our data and code are available at http://www.cs.utoronto.ca/~mbweb/.

The alignment model enables multiple applications. Imagine an app which allows the user to browse the book as the scenes unroll in the movie: perhaps its ending or acting are ambiguous, and one would like to query the book for answers. Vice-versa, while reading the book one might want to switch from text to video, particularly for the juicy scenes. We also show other applications of learning from movies and books, such as book retrieval (finding the book that goes with a movie and finding other similar books), and captioning CoCo images with story-like descriptions.

2. Related Work
Most effort in the domain of vision and language has been devoted to the problem of image captioning. Older work translated visual content into textual descriptions [7, 18]. Recently, several approaches based on RNNs emerged, generating captions via a learned joint image-text embedding [15, 13, 40, 24]. These approaches have also been extended to generate descriptions of short video clips [39]. In [27], the authors go beyond describing what is happening in an image and provide explanations about why something is happening. Related to ours is also work on image retrieval [11], which aims to find an image that best depicts a complex description.

For text-to-image alignment, [17, 8] find correspondences between nouns and pronouns in a caption and visual objects using several visual and textual potentials. Lin et al. [19] do so for videos. In [23], the authors align cooking videos with the recipes. Bojanowski et al. [2] localize actions from an ordered list of labels in video clips. In [13, 34], the authors use RNN embeddings to find the correspondences. [41] combines neural embeddings with soft attention in order to align the words to image regions.

Early work on movie-to-text alignment includes dynamic time warping for aligning movies to scripts with the help of subtitles [6, 5]. Sankar et al. [31] further developed a system which identified sets of visual and audio features to align movies with scripts. Such alignment has been exploited to provide weak labels for person naming tasks [6, 33, 28].
Closest to our work is [38], which aligns plot synopses to shots in TV series for story-based content retrieval. This work adopts a similarity function between sentences in plot synopses and shots based on person identities and keywords in subtitles. Our work differs from theirs in several important aspects. First, we tackle the more challenging problem of movie/book alignment. Unlike plot synopses, which closely follow the storyline of movies, books are more verbose and might vary in the storyline from their movie release. Furthermore, we use learned neural embeddings to compute the similarities rather than hand-designed similarity functions.
Parallel to our work, [37] aims to align scenes in movies to chapters in the book. However, their approach operates on a very coarse level (chapters), while ours does so on the sentence/paragraph level. Their dataset thus evaluates on 90 scene-chapter correspondences, while our dataset draws 1,800 shot-to-paragraph alignments. Furthermore, the approaches are inherently different. [37] matches the presence of characters in a scene to those in a chapter, as well as uses hand-crafted similarity measures between sentences in the subtitles and dialogs in the books, similarly to [38].

Rohrbach et al. [30] recently released the Movie Description dataset, which contains clips from movies, each time-stamped with a sentence from DVS (Descriptive Video Service). The dataset contains clips from over 100 movies and provides a great resource for captioning techniques. Our effort here is to align movies with books in order to obtain longer, richer and more high-level video descriptions.

We start by describing our new dataset, and then explain our proposed approach.

3. The MovieBook and BookCorpus Datasets
We collected two large datasets, one for movie/book alignment and one with a large number of books.

The MovieBook Dataset. Since no prior work or data exist on the problem of movie/book alignment, we collected a new dataset with 11 movies and corresponding books. For each movie we also have subtitles, which we parse into a set of time-stamped sentences. Note that no speaker information is provided in the subtitles. We parse each book into sentences and paragraphs.

Our annotators had the movie and a book opened side by side. They were asked to iterate between browsing the book and watching a few shots/scenes of the movie, and trying to find correspondences between them. In particular, they marked the exact time (in seconds) of correspondence in the movie and the matching line number in the book file, indicating the beginning of the matched sentence. On the video side, we assume that the match spans across a shot (a video unit with smooth camera motion). If the match was longer in duration, the annotator also indicated the ending time. Similarly for the book, if more sentences matched, the annotator indicated from which to which line a match occurred. Each alignment was tagged as a visual, dialog, or an audio match. Note that even for dialogs, the movie and book versions are semantically similar but not exactly the same. Thus deciding on what defines a match or not is also somewhat subjective and may slightly vary across our annotators. Altogether, the annotators spent 90 hours labeling.

Title | #sent. | #words | #unique words | avg. #words per sent. | max #words per sent. | #paragraphs | #shots | #sent. in subtitles | #dialog align. | #visual align.
No Country for Old Men | 8,050 | 69,824 | 1,704 | 10 | 68 | 3,189 | 1,348 | 889 | 223 | 47
Harry Potter and the Sorcerers Stone | 6,458 | 78,596 | 2,363 | 15 | 227 | 2,925 | 2,647 | 1,227 | 164 | 73
The Green Mile | 9,467 | 133,241 | 3,043 | 17 | 119 | 2,760 | 2,350 | 1,846 | 208 | 102
One Flew Over the Cuckoo's Nest | 7,103 | 112,978 | 2,949 | 19 | 192 | 2,236 | 1,671 | 1,553 | 64 | 25

Table 1: Statistics for our MovieBook Dataset with ground-truth for alignment between books and their movie releases.

# of books | # of sentences | # of words | # of unique words | mean # of words per sentence | median # of words per sentence
11,038 | 74,004,228 | 984,846,357 | 1,316,420 | 13 | 11

Table 2: Summary statistics of our BookCorpus dataset. We use this corpus to train the sentence embedding model.

Table 1 presents our dataset, while Fig. 6 shows a few ground-truth alignments. The number of sentences per book varies from 638 to 15,498, even though the movies are similar in duration. This indicates a huge diversity in descriptiveness across literature, and presents a challenge for matching. Sentences also vary in length, with those in Brokeback Mountain being twice as long as those in The Road. The longest sentence in American Psycho has 422 words and spans over a page in the book.

Aligning movies with books is challenging even for humans, mostly due to the scale of the data. Each movie is on average 2h long and has 1,800 shots, while a book has on average 7,750 sentences. Books also have different styles of writing, formatting and language, and may contain slang ("going" vs "goin'", or even "was" vs "'us"), etc. Table 1 shows that finding visual matches was particularly challenging. This is because descriptions in books can be either very short and hidden within longer paragraphs or even within a longer sentence, or very verbose, in which case they get obscured by the surrounding text and are hard to spot. Of course, how closely the movie follows the book is also up to the director, which can be seen through the number of alignments that our annotators found across the different movie/book pairs.

BookCorpus. In order to train our sentence similarity model we collected a corpus of 11,038 books from the web. These are free books written by yet unpublished authors. We only included books that had more than 20K words in order to filter out perhaps noisier shorter stories. The dataset has books in 16 different genres (e.g., Romance). Table 2 highlights the summary statistics of our corpus.
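The annotation format described above (a start time in seconds, an optional end time when the match spans more than one shot, a starting and optional ending line in the book file, and a match type) can be captured in a small record. The class and field names below are illustrative assumptions, not the schema actually used by the annotators.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class AlignmentRecord:
    """One ground-truth correspondence between a movie and its book.

    Names are hypothetical; the paper specifies the content
    (times in seconds, line numbers in the book file, a match
    type) but not a concrete file format.
    """
    start_time_s: float          # start of the matched span in the movie
    end_time_s: Optional[float]  # set only when the match outlasts one shot
    start_line: int              # line in the book file where the match begins
    end_line: Optional[int]      # set only when several sentences match
    match_type: str              # "visual", "dialog", or "audio"

    def duration(self) -> Optional[float]:
        """Length of the matched movie span, if an end time was marked."""
        if self.end_time_s is None:
            return None
        return self.end_time_s - self.start_time_s

# Example: a dialog match covering 6.5 seconds and three book lines.
rec = AlignmentRecord(412.0, 418.5, 1045, 1047, "dialog")
```

A single-shot match would simply leave `end_time_s` and `end_line` unset.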
4. Aligning Books and Movies
Our approach aims to align a movie with a book by exploiting visual information as well as dialogs. We take shots as video units and sentences from subtitles to represent dialogs. Our goal is to match these to the sentences in the book. We propose several measures to compute similarities between pairs of sentences as well as between shots and sentences. We use our novel deep neural embedding, trained on our large corpus of books, to predict similarities between sentences. Note that an extended version of the sentence embedding is described in detail in [16], showing how to deal with million-word vocabularies and demonstrating its performance on a large variety of NLP benchmarks. For comparing shots with sentences we extend the neural embedding of images and text [15] to operate in the video domain. We next develop a novel contextual alignment model that combines information from various similarity measures and a larger time-scale in order to make better local alignment predictions. Finally, we propose a simple pairwise Conditional Random Field (CRF) that smooths the alignments by encouraging them to follow a linear timeline, both in the video and book domain.

We first describe our sentence similarity measure and our video-to-text embedding. We next propose our contextual model that combines similarities, and discuss the CRF in more detail.

4.1. Skip-Thought Vectors
In order to score the similarity between two sentences, we exploit our architecture for learning unsupervised representations of text [16]. The model is loosely inspired by the skip-gram [25] architecture for learning representations of words. In the word skip-gram model, a word w_i is chosen and must predict its surrounding context (e.g. w_{i+1} and w_{i-1} for a context window of size 1). Our model works in a similar way but at the sentence level. That is, given a sentence tuple (s_{i-1}, s_i, s_{i+1}), our model first encodes the sentence s_i into a fixed vector, then, conditioned on this vector, tries to reconstruct the sentences s_{i-1} and s_{i+1}, as shown in Fig. 2. The motivation for this architecture is inspired by the distributional hypothesis: sentences that have similar surrounding context are likely to be both semantically and syntactically similar. Thus, two sentences that have similar syntax and semantics are likely to be encoded to a similar vector. Once the model is trained, we can map any sentence through the encoder to obtain vector representations, then score their similarity through an inner product.

Figure 2: Sentence neural embedding [16]. Given a tuple (s_{i-1}, s_i, s_{i+1}) of contiguous sentences, where s_i is the i-th sentence of a book, the sentence s_i is encoded and tries to reconstruct the previous sentence s_{i-1} and the next sentence s_{i+1}. Unattached arrows are connected to the encoder output. Colors indicate which components share parameters. <eos> is the end-of-sentence token.

Query: he started the car, left the parking lot and merged onto the highway a few miles down the road.
- he drove down the street off into the distance.
- he shut the door and watched the taxi drive off.
- she watched the lights flicker through the trees as the men drove toward the road.

Query: a messy business to be sure, but necessary to achieve a fine and noble end.
- the most effective way to end the battle.
- they saw their only goal as survival and logically planned a strategy to achieve it.
- there would be far fewer casualties and far less destruction.

Table 3: Qualitative results from the sentence skip-gram model. For each query sentence on the left, we retrieve the 4 nearest neighbor sentences (by inner product) chosen from books the model has not seen before. More results in the supplementary.

The learning signal of the model depends on having contiguous text, where sentences follow one another in sequence. A natural corpus for training our model is thus a large collection of books. Given the size and diversity of genres, our BookCorpus allows us to learn very general representations of text. For instance, Table 3 illustrates the nearest neighbours of query sentences, taken from held-out books that the model was not trained on. These qualitative results demonstrate that our intuition is correct, with the resulting nearest neighbors corresponding largely to syntactically and semantically similar sentences. Note that the sentence embedding is general and can be applied to other domains not considered in this paper, which is explored in [16].

To construct an encoder, we use a recurrent neural network, inspired by the success of encoder-decoder models for neural machine translation [12, 3, 1, 35]. Two kinds of activation functions have recently gained traction: long short-term memory (LSTM) [10] and the gated recurrent unit (GRU) [4]. Both types of activation successfully solve the vanishing gradient problem through the use of gates to control the flow of information. The LSTM unit explicitly employs a cell that acts as a carousel with an identity weight. The flow of information through a cell is controlled by input, output and forget gates, which control what goes into a cell, what leaves a cell and whether to reset the contents of the cell. The GRU does not use a cell but employs two gates: an update and a reset gate. In a GRU, the hidden state is a linear combination of the previous hidden state and the proposed hidden state, where the combination weights are controlled by the update gate. GRUs have been shown to perform just as well as LSTM on several sequence prediction tasks [4] while being simpler. Thus, we use GRU as the activation function for our encoder and decoder RNNs.

Suppose we are given a sentence tuple (s_{i-1}, s_i, s_{i+1}); let w_t^i denote the t-th word of s_i and let x_t^i be its word embedding. We break the model description into three parts: the encoder, decoder and objective function.

Encoder. Let w_1^i, ..., w_N^i denote the words in sentence s_i, with N the number of words in the sentence. The encoder produces a hidden state h_t^i at each time step, which forms the representation of the sequence w_1^i, ..., w_t^i. Thus, the hidden state h_N^i is the representation of the whole sentence. The GRU produces the next hidden state as a linear combination of the previous hidden state and the proposed state update (we drop subscript i):

    h_t = (1 - z_t) ⊙ h_{t-1} + z_t ⊙ h̄_t    (1)

where h̄_t is the proposed state update at time t, z_t is the update gate and ⊙ denotes a component-wise product. The update gate takes values between zero and one. In the extreme cases, if the update gate is the vector of ones, the previous hidden state is completely forgotten and h_t = h̄_t. Alternatively, if the update gate is the zero vector, then the hidden state from the previous time step is simply copied over, that is h_t = h_{t-1}. The update gate is computed as

    z_t = σ(W_z x_t + U_z h_{t-1})    (2)

where W_z and U_z are the update gate parameters. The proposed state update is given by

    h̄_t = tanh(W x_t + U(r_t ⊙ h_{t-1}))    (3)

where r_t is the reset gate, which is computed as

    r_t = σ(W_r x_t + U_r h_{t-1})    (4)
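As a minimal sketch, the GRU recurrence of Eqs. (1)-(4), together with the inner-product scoring of encoded sentences used for the retrieval in Table 3, can be written in a few lines of NumPy. All names, dimensions and the parameter packing below are illustrative assumptions, not the authors' implementation; trained weights and real word embeddings would replace the random values.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, params):
    """One GRU step following Eqs. (1)-(4).

    params packs (Wz, Uz, Wr, Ur, W, U); '*' below is the
    component-wise product written as a circle-dot in the text.
    """
    Wz, Uz, Wr, Ur, W, U = params
    z = sigmoid(Wz @ x_t + Uz @ h_prev)            # Eq. (2): update gate
    r = sigmoid(Wr @ x_t + Ur @ h_prev)            # Eq. (4): reset gate
    h_bar = np.tanh(W @ x_t + U @ (r * h_prev))    # Eq. (3): proposed state
    return (1.0 - z) * h_prev + z * h_bar          # Eq. (1): new hidden state

def encode(word_vectors, params, hidden_dim):
    """Run the GRU over a sentence's word embeddings;
    the final hidden state h_N represents the sentence."""
    h = np.zeros(hidden_dim)
    for x_t in word_vectors:
        h = gru_step(x_t, h, params)
    return h

# Toy usage: random stand-ins for word embeddings of two sentences,
# scored by inner product. Dimensions are arbitrary.
rng = np.random.default_rng(0)
d, hdim = 4, 3
params = tuple(rng.normal(scale=0.1, size=(hdim, k)) for k in (d, hdim) * 3)
s1 = encode(rng.normal(size=(5, d)), params, hdim)
s2 = encode(rng.normal(size=(7, d)), params, hdim)
similarity = float(s1 @ s2)
```

Two sanity checks follow directly from the text: forcing the update gate toward zero copies the previous hidden state, and forcing it toward one replaces the state with the proposed update.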
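Returning to the alignment model of Section 4: as a simplified stand-in for the energy minimization that encourages alignments to follow a similar timeline (not the paper's actual CRF formulation), one can score every shot-sentence pair and penalize large jumps in the book, then solve for the best assignment with dynamic programming. The function name and the linear jump penalty are assumptions for illustration.

```python
import numpy as np

def align_shots_to_sentences(sim, jump_penalty=0.5):
    """Pick one book sentence per shot, maximizing the sum of
    similarities minus a penalty on timeline jumps between
    consecutive shots. Simplified stand-in for CRF smoothing.

    sim: (num_shots, num_sentences) similarity matrix.
    """
    n_shots, n_sents = sim.shape
    idx = np.arange(n_sents)
    # jumps[p, q] = distance from previous pick p to next pick q
    jumps = np.abs(idx[None, :] - idx[:, None])
    score = sim[0].copy()            # best score ending at each sentence
    back = np.zeros((n_shots, n_sents), dtype=int)
    for t in range(1, n_shots):
        cand = score[:, None] - jump_penalty * jumps  # (prev, next)
        back[t] = np.argmax(cand, axis=0)             # best predecessor
        score = cand[back[t], idx] + sim[t]
    # backtrack the highest-scoring path
    path = [int(np.argmax(score))]
    for t in range(n_shots - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```

With a strongly diagonal similarity matrix the recovered path follows the book in order, which is the behavior the timeline term is meant to encourage.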