Computing Word-Pair Antonymy

Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, pages 982-991, Honolulu, October 2008. © 2008 Association for Computational Linguistics

Saif Mohammad†, Bonnie Dorr‡, Graeme Hirst§

†‡ Laboratory for Computational Linguistics and Information Processing
†‡ Institute for Advanced Computer Studies and ‡ Computer Science
†‡ University of Maryland and ‡ Human Language Technology Center of Excellence
{saif,bonnie}@umiacs.umd.edu

§ Department of Computer Science
University of Toronto
gh@cs.toronto.edu

Abstract

Knowing the degree of antonymy between words has widespread applications in natural language processing. Manually-created lexicons have limited coverage and do not include most semantically contrasting word pairs. We present a new automatic and empirical measure of antonymy that combines corpus statistics with the structure of a published thesaurus. The approach is evaluated on a set of closest-opposite questions, obtaining a precision of over 80%. Along the way, we discuss what humans consider antonymous and how antonymy manifests itself in utterances.

1 Introduction

Native speakers of a language intuitively recognize different degrees of antonymy: whether two words are strongly antonymous (hot-cold, good-bad, friend-enemy), just semantically contrasting (enemy-fan, cold-lukewarm, ascend-slip) or not antonymous at all (penguin-clown, cold-chilly, boat-rudder). Over the years, many definitions of antonymy have been proposed by linguists (Cruse, 1986; Lehrer and Lehrer, 1982), cognitive scientists (Kagan, 1984), psycholinguists (Deese, 1965), and lexicographers (Egan, 1984), which differ from each other in small and large respects. In its strictest sense, antonymy applies to gradable adjectives, such as hot-cold and tall-short, where the two words represent the two ends of a semantic dimension. In a broader sense, it includes other adjectives, nouns, and verbs as well (life-death, ascend-descend, shout-whisper). In its broadest sense, it applies to any two words that represent contrasting meanings. We will use the term degree of antonymy to encompass the complete semantic range: a combined measure of the contrast in meaning conveyed by two words and the tendency of native speakers to call them opposites. The higher the degree of antonymy between a target word pair, the greater the semantic contrast between them and the greater their tendency to be considered antonym pairs by native speakers.

Automatically determining the degree of antonymy between words has many uses including detecting and generating paraphrases (The dementors caught Sirius Black / Black could not escape the dementors) and detecting contradictions (Marneffe et al., 2008; Voorhees, 2008) (Kyoto has a predominantly wet climate / It is mostly dry in Kyoto). Of course, such "contradictions" may be a result of differing sentiment, new information, non-coreferent mentions, or genuinely contradictory statements. Antonyms often indicate the discourse relation of contrast (Marcu and Echihabi, 2002). They are also useful for detecting humor (Mihalcea and Strapparava, 2005), as satire and jokes tend to have contradictions and oxymorons. Lastly, it is useful to know which words are semantically contrasting to a target word, even if simply to filter them out. For example, in the automatic creation of a thesaurus it is necessary to distinguish near-synonyms from word pairs that are semantically contrasting. Measures of distributional similarity fail to do so. Detecting antonymous words is not sufficient to solve most of these problems, but it remains a crucial, and largely unsolved, component.

Lexicons of pairs of words that native speakers consider antonyms have been created for certain languages, but their coverage has been limited. Further, as each term of an antonymous pair can have many semantically close terms, the contrasting word pairs far outnumber those that are commonly considered antonym pairs, and they remain unrecorded. Even though a number of computational approaches have been proposed for semantic closeness, and some for hypernymy-hyponymy (Hearst, 1992), measures of antonymy have been less successful. To some extent, this is because antonymy is not as well understood as other classical lexical-semantic relations.

We first very briefly summarize insights and intuitions about this phenomenon, as proposed by linguists and lexicographers (Section 2). We discuss related work (Section 3). We describe the resources we use (Section 4) and present experiments that examine the manifestation of antonymy in text (Sections 5 and 6). We then propose a new empirical approach to determine the degree of antonymy between two words (Section 7). We compiled a dataset of 950 closest-opposite questions, which we used for evaluation (Section 8). We conclude with a discussion of the merits and limitations of this approach and outline future work.

2 The paradoxes of antonymy

Antonymy, like synonymy and hyponymy, is a lexical-semantic relation that, strictly speaking, applies to two lexical units (combinations of surface form and word sense). (That said, for simplicity and where appropriate we will use the term "antonymous words" as a proxy for "antonymous lexical units".) However, accepting this leads to two interesting and seemingly paradoxical questions (described below in the two subsections).

2.1 Why are some pairs better antonyms?

Native speakers of a language consider certain contrasting word pairs to be antonymous (for example, large-small), and certain other seemingly equivalent word pairs as less so (for example, large-little). A number of reasons have been suggested:

(1) Cruse (1986) observes that if the meaning of the target words is completely defined by one semantic dimension and the words represent the two ends of this semantic dimension, then they tend to be considered antonyms. We will refer to this semantic dimension as the dimension of opposition.

(2) If on the other hand, as Lehrer and Lehrer (1982) point out, there is more to the meaning of the antonymous words than the dimension of opposition (for example, more semantic dimensions or added connotations), then the two words are not so strongly antonymous. Most people do not think of chubby as a direct antonym of thin because it has the additional connotation of being cute and informal.

(3) Cruse (1986) also postulates that word pairs are not considered strictly antonymous if it is difficult to identify the dimension of opposition (for example, city-farm).

(4) Charles and Miller (1989) claim that two contrasting words are identified as antonyms if they occur together in a sentence more often than chance. However, Murphy and Andrew (1993) claim that the greater-than-chance co-occurrence of antonyms in sentences is because together they convey contrast well, which is rhetorically useful, and not really the reason why they are considered antonyms in the first place.

2.2 Are semantic closeness and antonymy opposites?

Two words (more precisely, two lexical units) are considered to be close in meaning if there is a lexical-semantic relation between them. Lexical-semantic relations are of two kinds: classical and non-classical. Examples of classical relations include synonymy, hyponymy, troponymy, and meronymy. Non-classical relations, as pointed out by Morris and Hirst (2004), are much more common and include concepts pertaining to another concept (kind, chivalrous, formal pertaining to gentlemanly), and commonly co-occurring words (for example, problem-solution pairs such as homeless, shelter). Semantic distance (or closeness) in this broad sense is known as semantic relatedness. Two words are considered to be semantically similar if they are associated via the synonymy, hyponymy-hypernymy, or the troponymy relation. So terms that are semantically similar (plane-glider, doctor-surgeon) are also semantically related, but terms that are semantically related may not always be semantically similar (plane-sky, surgeon-scalpel).

Antonymy is unique among these relations because it simultaneously conveys both a sense of closeness and of distance (Cruse, 1986). Antonymous concepts are semantically related but not semantically similar.

3 Related work

Charles and Miller (1989) proposed that antonyms occur together in a sentence more often than chance. This is known as the co-occurrence hypothesis. They also showed that this was empirically true for four adjective antonym pairs. Justeson and Katz (1991) demonstrated the co-occurrence hypothesis for 35 prototypical antonym pairs (from an original set of 39 antonym pairs compiled by Deese (1965)) and also for an additional 22 frequent antonym pairs. All of these pairs were adjectives. Fellbaum (1995) conducted similar experiments on 47 noun, verb, adjective, and adverb pairs (noun-noun, noun-verb, noun-adjective, verb-adverb and so on) pertaining to 18 concepts (for example, lose(v)-gain(n) and loss(n)-gain(n), where lose(v) and loss(n) pertain to the concept of "failing to have/maintain"). However, non-antonymous semantically related words such as hypernyms, holonyms, meronyms, and near-synonyms also tend to occur together more often than chance. Thus, separating antonyms from them has proven to be difficult.

Lin et al. (2003) used patterns such as "from X to Y" and "either X or Y" to separate antonym word pairs from distributionally similar pairs. They evaluated their method on 80 pairs of antonyms and 80 pairs of synonyms taken from the Webster's Collegiate Thesaurus (Kay, 1988). In this paper, we propose a method to determine the degree of antonymy between any word pair and not just those that are distributionally similar. Turney (2008) proposed a uniform method to solve word analogy problems that require identifying synonyms, antonyms, hypernyms, and other lexical-semantic relations between word pairs. However, the Turney method is supervised whereas the method proposed in this paper is completely unsupervised.

Harabagiu et al. (2006) detected antonyms for the purpose of identifying contradictions by using WordNet chains: synsets connected by the hypernymy-hyponymy links and exactly one antonymy link. Lucerto et al. (2002) proposed detecting antonym pairs using the number of words between two words in text and also cue words such as but, from, and and. Unfortunately, they evaluated their method on only 18 word pairs. Neither of these methods determines the degree of antonymy between words and they have not been shown to have substantial coverage. Schwab et al. (2002) create an "antonymous vector" for a target word. The closer this vector is to the context vectors of the other target word, the more antonymous the two target words are. However, the antonymous vectors are manually created. Further, the approach is not evaluated beyond a handful of word pairs.

Work in sentiment detection and opinion mining aims at determining the polarity of words. For example, Pang, Lee and Vaithyanathan (2002) detect that adjectives such as dazzling, brilliant, and gripping cast their qualifying nouns positively whereas adjectives such as bad, cliched, and boring portray the noun negatively. Many of these gradable adjectives have antonyms, but these approaches do not attempt to determine pairs of positive and negative polarity words that are antonyms.

4 Resources

4.1 Published thesauri

Published thesauri, such as the Roget's and Macquarie, divide the vocabulary into about a thousand categories. Words within a category tend to be near-synonymous or semantically similar. One may also find antonymous and semantically related words in the same category, but this is rare. The intuition is that words within a category represent a coarse concept. Words with more than one meaning may be found in more than one category; these represent its coarse senses. Within a category, the words are grouped into paragraphs. Words in the same paragraph tend to be closer in meaning than those in different paragraphs. We will take advantage of the structure of the thesaurus in our approach.

4.2 WordNet

Unlike the traditional approach to antonymy, WordNet encodes antonymy as a lexical relationship: a relation between two words (not concepts) (Gross et al., 1989). Even though a synset (a WordNet concept) may be represented by more than one word, individual words across synsets are marked as (direct) antonyms. Gross et al. argue that other words in the synsets form "indirect antonyms".

Even after including the indirect antonyms, WordNet's coverage is limited. As Marcu and Echihabi (2002) point out, WordNet does not encode antonymy across part-of-speech (for example, legally-embargo). Further, the noun-noun, verb-verb, and adjective-adjective antonym pairs of WordNet largely ignore near-opposites as revealed by our experiments (Section 8 below). Also, WordNet (or any other manually-created repository of antonyms for that matter) does not encode the degree of antonymy between words. Nevertheless, we investigate the usefulness of WordNet as a source of seed antonym pairs for our approach.

4.3 Co-occurrence statistics

The distributional hypothesis of closeness states that words that occur in similar contexts tend to be semantically close (Firth, 1957). Distributional measures of distance, such as those proposed by Lin (1998), quantify how similar the two sets of contexts of a target word pair are. Equation 1 is a modified form of Lin's measure that ignores syntactic dependencies and hence it estimates semantic relatedness rather than semantic similarity:

$$\mathrm{Lin}(w_1, w_2) = \frac{\sum_{w \in T(w_1) \cap T(w_2)} \bigl( I(w_1, w) + I(w_2, w) \bigr)}{\sum_{w' \in T(w_1)} I(w_1, w') + \sum_{w'' \in T(w_2)} I(w_2, w'')} \qquad (1)$$

Here $w_1$ and $w_2$ are the target words; $I(x, y)$ is the pointwise mutual information between $x$ and $y$; and $T(x)$ is the set of all words $y$ that have positive pointwise mutual information with the word $x$ ($I(x, y) > 0$).
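As an illustration of Equation 1, the following minimal Python sketch computes the modified Lin measure from a precomputed table of pointwise mutual information values. The names `positive_contexts`, `lin_relatedness`, and `toy_pmi`, and all of the numbers in the toy table, are assumptions made for this example and do not come from the paper.

```python
def positive_contexts(pmi, x):
    """Return T(x): the context words with positive PMI with x, mapped to that PMI."""
    return {y: value for y, value in pmi.get(x, {}).items() if value > 0}


def lin_relatedness(pmi, w1, w2):
    """Modified Lin measure of Equation 1, computed from a nested PMI dictionary."""
    t1 = positive_contexts(pmi, w1)
    t2 = positive_contexts(pmi, w2)
    shared = t1.keys() & t2.keys()  # T(w1) intersected with T(w2)
    numerator = sum(t1[w] + t2[w] for w in shared)
    denominator = sum(t1.values()) + sum(t2.values())
    return numerator / denominator if denominator else 0.0


# Hypothetical PMI values, invented purely for illustration.
toy_pmi = {
    "hot": {"water": 2.0, "weather": 1.0, "sauce": 1.0},
    "cold": {"water": 2.0, "weather": 1.0, "war": 1.0, "the": -0.5},
}

score = lin_relatedness(toy_pmi, "hot", "cold")
print(score)
```

With these toy values the shared contexts are water and weather, so the numerator is 6 and the denominator is 8. The context with negative PMI is excluded from T(cold), as the definition of T(x) requires.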

Mohammad and Hirst (2006) showed that these distributional word-distance measures perform poorly when compared with WordNet-based concept-distance measures. They argued that this is because the word-distance measures clump together the contexts of the different senses of the target words. They proposed a way to obtain distributional distance between word senses, using any of the distributional measures such as cosine or that proposed by Lin, and showed that this approach performed markedly better than the traditional word-distance approach. They used thesaurus categories as very coarse word senses. Equation 2 shows how Lin's formula is used to determine distributional distance between two thesaurus categories $c_1$ and $c_2$:

$$\mathrm{Lin}(c_1, c_2) = \frac{\sum_{w \in T(c_1) \cap T(c_2)} \bigl( I(c_1, w) + I(c_2, w) \bigr)}{\sum_{w' \in T(c_1)} I(c_1, w') + \sum_{w'' \in T(c_2)} I(c_2, w'')} \qquad (2)$$

Here $T(c)$ is the set of all words $w$ that have positive pointwise mutual information with the thesaurus category $c$ ($I(c, w) > 0$). We adopt this method for use in our approach to determine word-pair antonymy.

5 The co-occurrence hypothesis of antonyms

As a first step towards formulating our approach, we investigated the co-occurrence hypothesis on a significantly larger set of antonym pairs than those studied before. We randomly selected a thousand antonym pairs (nouns, verbs, and adjectives) from WordNet and counted the number of times (1) they occurred individually and (2) they co-occurred in the same sentence within a window of five words, in the British National Corpus (BNC) (Burnard, 2000). We then calculated the mutual information for each of these word pairs and averaged it. We randomly generated another set of a thousand word pairs, without regard to whether they were antonymous or not, and used it as a control set. The average mutual information between the words in the antonym set was 0.94 with a standard deviation of 2.27. The average mutual information between the words in the control set was 0.01 with a standard deviation of 0.37. Thus antonymous word pairs occur together much more often than chance irrespective of their intended senses (p < 0.01). Of course, a number of non-antonymous words also tend to co-occur more often than chance; these are commonly known as collocations. Thus, strong co-occurrence is not a sufficient condition for detecting antonyms, but these results show that it can be a useful cue.
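The counting step of this experiment can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `cooccurrence_pmi`, the probability estimates (counts divided by the total number of tokens), the base-2 logarithm, and the choice to return `None` for pairs that never co-occur are all assumptions, since the text does not specify these details.

```python
import math
from collections import Counter


def cooccurrence_pmi(sentences, pairs, window=5):
    """Estimate PMI for word pairs that co-occur in a sentence within `window` words."""
    word_counts = Counter()
    pair_counts = Counter()
    total_tokens = 0
    wanted = {frozenset(pair) for pair in pairs}

    for tokens in sentences:
        total_tokens += len(tokens)
        word_counts.update(tokens)
        for i, left in enumerate(tokens):
            # Look only at the next `window` tokens of the same sentence.
            for right in tokens[i + 1 : i + 1 + window]:
                key = frozenset((left, right))
                if key in wanted:
                    pair_counts[key] += 1

    scores = {}
    for w1, w2 in pairs:
        joint = pair_counts[frozenset((w1, w2))]
        if joint == 0:
            scores[(w1, w2)] = None  # assumption: PMI left undefined without co-occurrence
            continue
        p_joint = joint / total_tokens
        p1 = word_counts[w1] / total_tokens
        p2 = word_counts[w2] / total_tokens
        scores[(w1, w2)] = math.log2(p_joint / (p1 * p2))
    return scores


# Tiny invented corpus: three tokenized sentences.
toy_sentences = [
    ["hot", "and", "cold", "water"],
    ["cold", "war"],
    ["hot", "sauce"],
]
toy_scores = cooccurrence_pmi(toy_sentences, [("hot", "cold"), ("hot", "war")])
print(toy_scores)
```

Averaging the defined scores over an antonym set and over a randomly generated control set would then give the two means compared in the text.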

6 The substitutional and distributional hypotheses of antonyms

Charles and Miller (1989) also proposed that in most contexts, antonyms may be interchanged. The meaning of the utterance will be inverted, of course, but the sentence will remain grammatical and linguistically plausible. This came to be known as the substitutability hypothesis. However, their experiments did not support this claim. They found that given a sentence with the target adjective removed, most people did not confound the missing word with its antonym. Justeson and Katz (1991) later showed that in sentences that contain both members of an antonymous adjective pair, the target adjectives do indeed occur in similar syntactic structures at the phrasal level. From this (and to some extent from the co-occurrence hypothesis), we can derive the distributional hypothesis of antonyms: antonyms occur in similar contexts more often than non-antonymous words.

We used the same set of one thousand antonym pairs and one thousand control pairs as in the previous experiment to gather empirical proof of the distributional hypothesis. For each word pair from the antonym set, we calculated the distributional distance between each of their senses using Mohammad and Hirst's (2006) method of concept distance along with the modified form of Lin's (1998) distributional measure (Equation 2). The distance between the closest senses of the word pairs was averaged for all thousand antonyms. The process was then repeated for the control set.

The control set had an average semantic closeness of 0.23 with a standard deviation of 0.11 on a scale from 0 (unrelated) to 1 (identical). On the other hand, antonymous word pairs had an average semantic closeness of 0.30 with a standard deviation of 0.23.¹ This demonstrates that relative to other word pairs, antonymous words tend to occur in similar contexts (p < 0.01). However, near-synonymous and similar word pairs also occur in similar contexts (the distributional hypothesis of closeness). Thus, just like the co-occurrence hypothesis, occurrence in similar contexts is not sufficient, but rather yet another useful cue towards detecting antonyms.

¹ It should be noted that absolute values in the range between 0 and 1 are meaningless by themselves. However, if a set of word pairs is shown to consistently have higher values than another set, then we can conclude that the members of the former set tend to be semantically closer than those of the latter.

7 Our approach

We now present an empirical approach to determine the degree of antonymy between words. In order to maximize applicability and usefulness in natural language applications, we model the broad sense of antonymy. Given a target word pair, the approach determines whether they are antonymous or not, and if they are antonymous whether they have a high, medium, or low degree of antonymy. More precisely, the approach presents a way to determine whether one word pair is more antonymous than another.

The approach relies on the structure of the published thesaurus as well as the co-occurrence and distributional hypotheses. As mentioned earlier, a thesaurus organizes words in sets representing concepts or categories. We first determine pairs of thesaurus categories that are contrasting in meaning (Section 7.1). We then use the co-occurrence and distributional hypotheses to determine the degree of antonymy (Section 7.2).

7.1 Detecting contrasting categories

We propose two ways of detecting thesaurus category pairs that represent contrasting concepts (we will call these pairs contrasting categories): (1) using a seed set of antonyms and (2) using a simple heuristic that exploits how thesaurus categories are ordered.

7.1.1 Seed sets

Affix-generated seed set. Antonym pairs such as hot-cold and dark-light occur frequently in text, but in terms of type-pairs they are outnumbered by those created using affixes, such as un- (clear-unclear) and dis- (honest-dishonest). Further, this phenomenon is observed in most languages (Lyons, 1977).

Table 1 lists sixteen morphological rules that tend to generate antonyms in English. These rules were applied to each of the words in the Macquarie Thesaurus and if the resulting term was also a valid word in the thesaurus, then the word-pair was added to the affix-generated seed set. These sixteen rules generated 2,734 word pairs. Of course, not all of them are antonymous, for example sect-insect and coy-decoy. However, these are relatively few in number and were found to have only a small impact on the results.

Table 1: Sixteen affix rules to generate antonym pairs. Here 'X' stands for any sequence of letters common to both words w1 and w2.

w1       w2        example pair
X        abX       normal-abnormal
X        antiX     clockwise-anticlockwise
X        disX      interest-disinterest
X        imX       possible-impossible
X        inX       consistent-inconsistent
X        malX      adroit-maladroit
X        misX      fortune-misfortune
X        nonX      aligned-nonaligned
X        unX       biased-unbiased
lX       illX      legal-illegal
rX       irX       regular-irregular
imX      exX       implicit-explicit
inX      exX       introvert-extrovert
upX      downX     uphill-downhill
overX    underX    overdone-underdone
Xless    Xful      harmless-harmful
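The rule application described above can be sketched in Python as follows. This is a minimal illustration rather than the authors' code: the names `PREFIX_RULES`, `SUFFIX_RULES`, and `affix_seed_pairs` are hypothetical, and the toy vocabulary stands in for the Macquarie Thesaurus word list. One deliberate deviation is labeled in the code: the rule listed as rX/irX is implemented as r/irr, since that is what the example pair regular-irregular requires.

```python
# Each rule maps an affix on w1 to an affix on w2; "" means no affix on w1.
PREFIX_RULES = [
    ("", "ab"), ("", "anti"), ("", "dis"), ("", "im"), ("", "in"),
    ("", "mal"), ("", "mis"), ("", "non"), ("", "un"),
    ("l", "ill"),
    ("r", "irr"),  # assumption: written rX/irX in Table 1, but regular-irregular needs "irr"
    ("im", "ex"), ("in", "ex"), ("up", "down"), ("over", "under"),
]
SUFFIX_RULES = [("less", "ful")]


def affix_seed_pairs(vocabulary):
    """Return (w1, w2) pairs where an affix rule turns w1 into another vocabulary word."""
    vocab = set(vocabulary)
    pairs = set()
    for word in vocab:
        for prefix1, prefix2 in PREFIX_RULES:
            if word.startswith(prefix1) and len(word) > len(prefix1):
                candidate = prefix2 + word[len(prefix1):]
                if candidate in vocab and candidate != word:
                    pairs.add((word, candidate))
        for suffix1, suffix2 in SUFFIX_RULES:
            if word.endswith(suffix1) and len(word) > len(suffix1):
                candidate = word[: -len(suffix1)] + suffix2
                if candidate in vocab:
                    pairs.add((word, candidate))
    return pairs


# Toy vocabulary, invented for illustration.
toy_vocabulary = [
    "clear", "unclear", "sect", "insect", "legal", "illegal",
    "regular", "irregular", "harmless", "harmful", "uphill", "downhill",
]
seed_pairs = affix_seed_pairs(toy_vocabulary)
print(sorted(seed_pairs))
```

The toy run also reproduces the kind of false positive mentioned in the text: sect-insect is generated because both words are in the vocabulary, even though the pair is not antonymous.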

WordNet seed set. We compiled a list of 20,611 semantically contrasting word pairs from WordNet. If two words from two synsets in WordNet are connected by an antonymy link, then every possible word pair across the two synsets was considered to be semantically contrasting. A large number of them include multiword expressions. For only 10,807 of the 20,611 pairs were both words found in the Macquarie Thesaurus, the vocabulary used for our experiments. We will refer to them as the WordNet seed set.

Then, given these two seed sets, if any word in thesaurus category C1 is antonymous to any word in category C2 as per a seed antonym pair, then the two categories are marked as contrasting. It should be noted, however, that the seed antonym pair may be antonymous only in certain senses. For example, consider the antonym pair work-play. Here, play is antonymous to work only in its ACTIVITY FOR FUN sense and not its DRAMA sense. In such cases, we employ the distributional hypothesis of closeness: two words are antonymous to each other in those senses which are closest in meaning to each other. Since the thesaurus category pertaining to WORK is relatively closer in meaning to the ACTIVITY FOR FUN sense than the DRAMA sense, those two categories will be considered contrasting and not the categories pertaining to WORK and DRAMA.

If no word in C1 is antonymous to any word in C2, then the categories are considered not contrasting.
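The basic marking rule can be sketched as below. This is a simplified illustration under stated assumptions: the function name `contrasting_categories`, the dictionary layout, and the toy categories are hypothetical, and the sketch deliberately omits the sense-selection step (choosing the closest senses, as in the work-play example), so it would over-generate for seed words that appear in several categories.

```python
from itertools import product


def contrasting_categories(categories, seed_pairs):
    """Mark category pairs as contrasting when a seed pair links a word in each.

    categories: dict mapping a category name to the set of words it lists.
    seed_pairs: iterable of (w1, w2) seed antonym pairs.
    Returns a set of frozensets, each holding two category names.
    """
    word_to_categories = {}
    for name, words in categories.items():
        for word in words:
            word_to_categories.setdefault(word, set()).add(name)

    contrasting = set()
    for w1, w2 in seed_pairs:
        cats1 = word_to_categories.get(w1, ())
        cats2 = word_to_categories.get(w2, ())
        for c1, c2 in product(cats1, cats2):
            if c1 != c2:
                contrasting.add(frozenset((c1, c2)))
    return contrasting


# Toy thesaurus with invented category names.
toy_categories = {
    "HEAT": {"hot", "warm"},
    "COLD": {"cold", "chilly"},
    "SIZE": {"big", "large"},
}
marked = contrasting_categories(toy_categories, [("hot", "cold")])
print(marked)
```

Category pairs that no seed pair links, such as HEAT and SIZE here, are simply absent from the result, which corresponds to being considered not contrasting.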

As the seed sets, both automatically generated and manually created, are relatively large in comparison to the total number of categories in the Macquarie Thesaurus (812), this simple approach has reasonable coverage and accuracy.

7.1.2 Order of thesaurus categories

Most published thesauri are ordered such that contrasting categories tend to be adjacent. This is not a hard-and-fast rule, and often a category may be contrasting in meaning to several other categories. Further, often adjacent categories are not semantically contrasting. However, since this was an easy-enough heuristic to implement, we investigated the usefulness of considering adjacent categories as contrasting. We will refer to this as the adjacency heuristic.

7.2 Determining the degree of antonymy

Once we know which category pairs are contrasting (using the methods from the previous subsection), we determine the degree of antonymy between the two categories (Section 7.2.1). The aim is to assign contrasting category pairs a non-zero value signifying the degree of contrast. In turn, we will use that information to determine the degree of antonymy between any word pair whose members belong to two contrasting categories (Sections 7.2.2 and 7.2.3).

7.2.1 Category level

Using the distributional hypothesis of antonyms, we claim that the degree of antonymy between two contrasting concepts (thesaurus categories) is directly proportional to the distributional closeness of the two concepts. In other words, the more the words representing two contrasting concepts occur in similar contexts, the more the two concepts are considered to be antonymous.

Again we used Mohammad and Hirst's (2006) method along with Lin's (1998) distributional measure to determine the distributional closeness of two thesaurus concepts. Co-occurrence statistics required for the approach were computed from the BNC. Words that occurred within a window of 5 words were considered to co-occur.

7.2.2 Lexical unit level

Recall that strictly speaking, antonymy (like other lexical-semantic relations) applies to lexical units (a combination of surface form and word sense). If two words are used in senses pertaining to contrasting categories (as per the methods described in Section 7.1), then we will consider them to be antonymous (degree of antonymy is greater than zero). If two words are used in senses pertaining to non-contrasting categories, then we will consider them to be not antonymous (degree of antonymy is equal to 0).

If the target words belong to the same thesaurus paragraphs as any of the seed antonyms linking the two contrasting categories, then the words are considered to have a high degree of antonymy. This is because words that occur in the same thesaurus paragraph tend to be semantically very close in meaning. Relying on the co-occurrence hypothesis, we claim that for word pairs listed in contrasting categories, the greater their tendency to co-occur in text, the higher their degree of antonymy. We use mutual information to capture the tendency of word-word co-occurrence.

If the target words do not both belong to the same paragraphs as a seed antonym pair, but occur in contrasting categories, then the target words are considered to have a low or medium degree of antonymy (less antonymous than the word pairs discussed above). Such word pairs that have a higher tendency to co-occur are considered to have a medium degree of antonymy, whereas those that have a lower tendency to co-occur are considered to have a low degree of antonymy.

Co-occurrence statistics for this purpose were collected from the Google n-gram corpus (Brants and Franz, 2006).² Words that occurred within a window of 5 words were considered to be co-occurring.
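The tiers described in this subsection can be expressed as a sortable score, as in the sketch below. The function name `antonymy_score`, the tuple encoding, and in particular the `pmi_threshold` that separates medium from low are assumptions made for illustration; the text does not state how the boundary between a higher and a lower tendency to co-occur is drawn.

```python
LEVELS = ("none", "low", "medium", "high")


def antonymy_score(in_contrasting_categories, same_paragraphs_as_seed, pmi,
                   pmi_threshold=1.0):
    """Return (tier, pmi) so that Python tuple comparison ranks word pairs.

    tier indexes LEVELS: 0 none, 1 low, 2 medium, 3 high.
    pmi_threshold is a hypothetical cut-off, not a value from the paper.
    """
    if not in_contrasting_categories:
        return (0, 0.0)  # degree of antonymy is zero
    if same_paragraphs_as_seed:
        return (3, pmi)  # high; co-occurrence orders pairs within the tier
    tier = 2 if pmi >= pmi_threshold else 1
    return (tier, pmi)


# Invented inputs for three hypothetical word pairs.
high_pair = antonymy_score(True, True, 0.2)
medium_pair = antonymy_score(True, False, 4.0)
low_pair = antonymy_score(True, False, 0.3)
unrelated_pair = antonymy_score(False, False, 5.0)
print(LEVELS[high_pair[0]], LEVELS[medium_pair[0]], LEVELS[low_pair[0]])
```

Encoding the tier first means a pair that shares paragraphs with a seed pair always outranks one that does not, regardless of co-occurrence strength, which matches the statement that the latter pairs are less antonymous.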

7.2.3 Word level

Even though antonymy applies to pairs of word and sense combinations, most available texts are not sense-annotated. If antonymous occurrences are to be exploited for any of the purposes listed in the beginning of this paper, then the text must be sense disambiguated. However, word sense disambiguation is a hard problem. Yet, and to some extent because unsupervised word sense disambiguation systems perform poorly, much can be gained by using simple heuristics. For example, it has been shown that cohesive text tends to have words that are close in meaning rather than unrelated words. This, along with the distributional hypothesis of antonyms, and the findings by Justeson and Katz (1991) (antonymous concepts tend to occur more often than chance in the same sentence), suggests that if we find a word pair in a sentence such that two of its senses are strongly contrasting (as per the algorithm described in Section 7.2.2), then it is probable that the two words are used in those contrasting senses.

² The Google n-gram corpus is created from a text collection of over 1 trillion words. We intend to use the same corpus (and not the BNC) to determine semantic distance as well, in the near future.

8 Evaluation

8.1 Task and data

In order to best evaluate a computational measure of antonymy, we need a task that not only requires knowing whether two words are antonymous but also whether one word pair is more antonymous than another pair. Therefore, we evaluated our system on a set of closest-opposite questions. Each question has one target word and five alternatives. The objective is to identify that alternative which is the closest opposite of the target. For example, consider:

adulterate: a. renounce b. forbid c. purify d. criticize e. correct

Here the target word is adulterate. One of the alternatives provided is correct, which as a verb has a meaning that contrasts with that of adulterate; however, purify has a greater degree of antonymy with adulterate than correct does and must be chosen in order for the instance to be marked as correctly answered. This evaluation is similar to how others have evaluated semantic distance algorithms on TOEFL synonym questions (Turney, 2001), except that in those cases the system had to choose the alternative which is closest in meaning to the target.

We looked on the World Wide Web for large sets of closest antonym questions. We found two independent sets of questions designed to prepare students for the Graduate Record Examination.³ The first set consists of 162 questions. We used this set to develop our approach and will refer to it as the development set. Even though the algorithm does not have any tuned parameters per se, the development set helped determine which cues of antonymy were useful and which were not. The second set has 1208 closest-opposite questions. We discarded questions that had a multiword target or alternative. After removing duplicates we were left with 950 questions, which we used as the unseen test set.

Table 2: Results obtained on closest-opposite questions.

                                  development data      test data
                                  P      R      F       P      R      F
a. random baseline                0.20   0.20   0.20    0.20   0.20   0.20
b. affix-generated seeds only     0.72   0.53   0.61    0.71   0.51   0.60
c. WordNet seeds only             0.79   0.52   0.63    0.75   0.50   0.60
d. both seed sets                 0.77   0.65   0.70    0.73   0.60   0.65
e. adjacency heuristic only       0.81   0.43   0.56    0.83   0.46   0.59
f. affix seed set + heuristic     0.75   0.60   0.67    0.76   0.61   0.68
g. both seed sets + heuristic     0.76   0.66   0.70    0.76   0.64   0.70

Interestingly, the data contains many instances that have the same target word used in different senses. For example:

(1) obdurate: a. meager b. unsusceptible c. right d. tender e. intelligent
(2) obdurate: a. yielding b. motivated c. moribund d. azure e. hard
(3) obdurate: a. transitory b. commensurate c. complaisant d. similar e. uncommunicative

In (1), obdurate is used in the HARDENED IN FEELINGS sense and the closest opposite is tender. In (2), it is used in the RESISTANT TO PERSUASION sense and the closest opposite is yielding. In (3), it is used in the PERSISTENT sense and the closest opposite is transitory.

The datasets also contain questions in which one or more of the alternatives is a near-synonym of the target word. For example:

astute: a. shrewd b. foolish c. callow d. winning e. debating

Observe that shrewd is a near-synonym of astute. The closest opposite of astute is foolish. A manual check of a randomly selected set of 100 test-set questions revealed that, on average, one in four had a near-synonym as one of the alternatives.

³ Both datasets are apparently in the public domain and will be made available on request.

8.2 Experiments

We used the algorithm proposed in Section 7 to automatically solve the closest-opposite questions. Since individual words may have more than one meaning, we relied on the hypothesis that the intended senses of the alternatives are those which are most antonymous to one of the senses of the target word. (This follows from the discussion earlier in Section 7.2.3.) So for each of the alternatives we used the target word as context (but not the other alternatives). We think that using a larger context to determine antonymy will be especially useful when the target words are found in sentences and natural text, something we intend to explore in the future.
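Answering a closest-opposite question then reduces to picking the alternative with the highest degree of antonymy with the target. The sketch below assumes a scoring function `degree(target, alternative)` that returns 0 when there is no evidence of antonymy; that function, the name `closest_opposite`, and the toy scores are hypothetical stand-ins for the measure of Section 7.

```python
def closest_opposite(target, alternatives, degree):
    """Return the alternative most antonymous to target, or None to abstain."""
    scored = [(degree(target, alternative), alternative) for alternative in alternatives]
    best_score, best_alternative = max(scored, key=lambda item: item[0])
    # Abstain when no alternative shows any evidence of antonymy.
    return best_alternative if best_score > 0 else None


# Invented degrees of antonymy for the example question in Section 8.1.
toy_degrees = {
    ("adulterate", "purify"): 0.9,
    ("adulterate", "correct"): 0.4,
}


def toy_degree(target, alternative):
    return toy_degrees.get((target, alternative), 0.0)


answer = closest_opposite(
    "adulterate",
    ["renounce", "forbid", "purify", "criticize", "correct"],
    toy_degree,
)
print(answer)
```

In this toy setup both purify and correct contrast with adulterate, and the higher score selects purify. A question whose alternatives all score 0 returns `None`, that is, the question is left unattempted.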

Table 2 presents results obtained on the development and test data using different combinations of the seed sets and the adjacency heuristic. If the system did not find any evidence of antonymy between the target and any of its alternatives, then it refrained from attempting that question. We therefore report precision (number of questions answered correctly / number of questions attempted), recall (number of questions answered correctly / total number of questions), and F-score values ($2 \times P \times R / (P + R)$).
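These three definitions can be written out as a short Python sketch, with `None` standing for a question the system did not attempt. The function name `precision_recall_f` and the toy answers are assumptions for illustration only.

```python
def precision_recall_f(answers, gold):
    """Compute precision, recall, and F-score when the system may abstain.

    answers: dict mapping question id to the chosen alternative, or None if unattempted.
    gold: dict mapping question id to the correct alternative.
    """
    attempted = [q for q in gold if answers.get(q) is not None]
    correct = sum(1 for q in attempted if answers[q] == gold[q])
    precision = correct / len(attempted) if attempted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Invented example: four questions, three attempted, two answered correctly.
toy_gold = {"q1": "purify", "q2": "foolish", "q3": "tender", "q4": "yielding"}
toy_answers = {"q1": "purify", "q2": "callow", "q3": None, "q4": "yielding"}
p, r, f = precision_recall_f(toy_answers, toy_gold)
print(round(p, 3), round(r, 3), round(f, 3))
```

Here precision is 2/3, recall is 2/4, and the F-score is 4/7. Abstaining lowers recall but leaves precision untouched, which is why a cautious configuration can score high precision with modest coverage.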

Observe that all results are well above the random baseline of 0.20 (obtained when a system randomly guesses one of the five alternatives to be the answer). Also, using only the small set of sixteen affix rules, the system performs almost as well as when it uses 10,807 WordNet antonym pairs. Using both the affix-generated and the WordNet seed sets, the system obtains markedly improved precision and coverage. Using only the adjacency heuristic gave best precision values (upwards of 0.8) with substantial coverage (attempting close to half the questions). However, best overall performance was obtained using both seed sets and the adjacency heuristic (F-score of 0.7).

8.3 Discussion

These results show that, to some degree, the automatic approach does indeed mimic human intuitions of antonymy. In tasks that require higher precision, using only the adjacency heuristic is best, whereas in tasks that require both precision and coverage, the seed sets may be included. Even when both seed sets were included, only four instances in the development set and twenty in the test set had target-answer pairs that matched a seed antonym pair. For all remaining instances, the approach had to generalize to determine the closest opposite. This also shows that even the seemingly large number of direct and indirect antonyms from WordNet (more than 10,000) are by themselves insufficient.

The comparable performance obtained using the affix rules alone suggests that even in languages without a wordnet, substantial accuracies may be achieved. Of course, improved results when using WordNet antonyms as well suggest that the information they provide is complementary.

Error analysis revealed that at times the system failed to identify that a category pertaining to the target word contrasted with a category pertaining to the answer. Additional methods to identify seed antonym pairs will help in such cases. Certain other errors occurred because one or more alternatives other than the official answer were also antonymous to the target. For example, the system chose accept as the opposite of chasten instead of reward.

9Conclusion

We have proposed an empirical approach to antonymy that combines corpus co-occurrence statistics with the structure of a published thesaurus. The method can determine the degree of antonymy or contrast between any two thesaurus categories (sets of words representing a coarse concept) and between any two word pairs. We evaluated the approach on a large set of closest-opposite questions wherein the system not only identified whether two words are antonymous but also distinguished between pairs of antonymous words of different degrees. It achieved an F-score of 0.7 in this task where the random baseline was only 0.2. When aiming for high precision it scores over 0.8, but there is some drop in the number of questions attempted. In the process of developing this approach we validated the co-occurrence hypothesis proposed by Charles and Miller (1989) on a large set of 1000 noun, verb, and adjective pairs. We also gave empirical proof that antonym pairs tend to be used in similar contexts (the distributional hypothesis for antonyms).
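The trade-off between precision and the number of questions attempted can be made concrete with a small Python sketch. It assumes that precision is computed over the questions the system attempts, that recall is computed over all questions, and that the F-score is their harmonic mean. The function name `precision_recall_f` and the counts in the usage lines are hypothetical and are not the figures from the evaluation.

```python
def precision_recall_f(num_correct, num_attempted, num_total):
    """Score a system that may leave some multiple-choice questions unanswered.

    Assumed definitions for this sketch:
      precision = correct / attempted
      recall    = correct / total
      F         = harmonic mean of precision and recall
    """
    precision = num_correct / num_attempted if num_attempted else 0.0
    recall = num_correct / num_total if num_total else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Hypothetical counts: 100 questions, 80 attempted, 64 answered correctly.
p, r, f = precision_recall_f(num_correct=64, num_attempted=80, num_total=100)
print(f"precision={p:.2f} recall={r:.2f} F={f:.2f}")
```

Under these definitions, a system that abstains on uncertain questions can raise precision while recall, and with it the F-score, falls.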

Our future goals include porting this approach to a cross-lingual framework in order to determine antonymy in a resource-poor language by combining its text with a thesaurus from a resource-rich language. We will use antonym pairs to identify contrast relations between sentences to in turn improve automatic summarization. We also intend to use the approach proposed here in tasks where keyword matching is especially problematic, for example, separating paraphrases from contradictions.

Acknowledgments

We thank Smaranda Muresan, Siddharth Patwardhan, members of the CLIP lab at the University of Maryland, College Park, and the anonymous reviewers for their valuable feedback. This work was supported, in part, by the National Science Foundation under Grant No. IIS-0705832, in part, by the Human Language Technology Center of Excellence, and in part, by the Natural Sciences and Engineering Research Council of Canada. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the sponsor.

References

Thorsten Brants and Alex Franz. 2006. Web 1T 5-gram version 1. Linguistic Data Consortium.

Lou Burnard. 2000. Reference Guide for the British National Corpus (World Edition). Oxford University Computing Services.

Walter G. Charles and George A. Miller. 1989. Contexts of antonymous adjectives. Applied Psycholinguistics, 10:357–375.

David A. Cruse. 1986. Lexical semantics. Cambridge University Press.

James Deese. 1965. The structure of associations in language and thought. The Johns Hopkins Press.

Rose F. Egan. 1984. Survey of the history of English synonymy. Webster's New Dictionary of Synonyms, pages 5a–25a.

Christiane Fellbaum. 1995. Co-occurrence and antonymy. International Journal of Lexicography, 8:281–303.

John R. Firth. 1957. A synopsis of linguistic theory 1930–55. In Studies in Linguistic Analysis, pages 1–32, Oxford: The Philological Society. (Reprinted in F. R. Palmer (ed.), Selected Papers of J. R. Firth 1952–1959, Longman).

Derek Gross, Ute Fischer, and George A. Miller. 1989. Antonymy and the representation of adjectival meanings. Journal of Memory and Language, 28(1):92–106.

Sanda M. Harabagiu, Andrew Hickl, and Finley Lacatusu. 2006. Negation, contrast and contradiction in text processing. In Proceedings of the 23rd National Conference on Artificial Intelligence (AAAI-06), Boston, MA.

Marti Hearst. 1992. Automatic acquisition of hyponyms from large text corpora. In Proceedings of the Fourteenth International Conference on Computational Linguistics, pages 539–546, Nantes, France.

John S. Justeson and Slava M. Katz. 1991. Co-occurrences of antonymous adjectives and their contexts. Computational Linguistics, 17:1–19.

Jerome Kagan. 1984. The Nature of the Child. Basic Books.

Maire Weir Kay, editor. 1988. Webster's Collegiate Thesaurus. Merriam-Webster.

Adrienne Lehrer and K. Lehrer. 1982. Antonymy. Linguistics and Philosophy, 5:483–501.

Dekang Lin, Shaojun Zhao, Lijuan Qin, and Ming Zhou. 2003. Identifying synonyms among distributionally similar words. In Proceedings of the 18th International Joint Conference on Artificial Intelligence (IJCAI-03), pages 1492–1493, Acapulco, Mexico.

Dekang Lin. 1998. Automatic retrieval and clustering of similar words. In Proceedings of the 17th International Conference on Computational Linguistics (COLING-98), pages 768–773, Montreal, Canada.

Cupertino Lucerto, David Pinto, and Héctor Jiménez-Salazar. 2002. An automatic method to identify antonymy. In Workshop on Lexical Resources and the Web for Word Sense Disambiguation, pages 105–111, Puebla, Mexico.

John Lyons. 1977. Semantics, volume 1. Cambridge University Press.

Daniel Marcu and Abdesammad Echihabi. 2002. An unsupervised approach to recognizing discourse relations. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL-02), Philadelphia, PA.

Marie-Catherine de Marneffe, Anna Rafferty, and Christopher D. Manning. 2008. Finding contradictions in text. In Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics (ACL-08), Columbus, OH.

Rada Mihalcea and Carlo Strapparava. 2005. Making computers laugh: Investigations in automatic humor recognition. In Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing, pages 531–538, Vancouver, Canada.

Saif Mohammad and Graeme Hirst. 2006. Distributional measures of concept-distance: A task-oriented evaluation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, Sydney, Australia.

Jane Morris and Graeme Hirst. 2004. Non-classical lexical semantic relations. In Proceedings of the Workshop on Computational Lexical Semantics, HLT, Boston, MA.

Gregory L. Murphy and Jane M. Andrew. 1993. The conceptual basis of antonymy and synonymy in adjectives. Journal of Memory and Language, 32(3):1–19.

Bo Pang, Lillian Lee, and Shivakumar Vaithyanathan. 2002. Thumbs up?: sentiment classification using machine learning techniques. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 79–86, Philadelphia, PA.

Didier Schwab, Mathieu Lafourcade, and Violaine Prince. 2002. Antonymy and conceptual vectors. In Proceedings of the 19th International Conference on Computational Linguistics (COLING-02), pages 904–910.

Peter Turney. 2001. Mining the web for synonyms: PMI-IR versus LSA on TOEFL. In Proceedings of the Twelfth European Conference on Machine Learning, pages 491–502, Freiburg, Germany.

Peter Turney. 2008. A uniform approach to analogies, synonyms, antonyms, and associations. In Proceedings of the 22nd International Conference on Computational Linguistics (COLING-08), pages 905–912, Manchester, UK.

Ellen M. Voorhees. 2008. Contradictions and justifications: Extensions to the textual entailment task. In Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics (ACL-08), Columbus, OH.

