The SENSEVAL-3 Multilingual English-Hindi Lexical Sample Task

Timothy Chklovski
Information Sciences Institute
University of Southern California
Marina del Rey, CA 90292
timc@isi.edu

Rada Mihalcea
Department of Computer Science
University of North Texas
Dallas, TX 76203
rada@cs.unt.edu

Ted Pedersen
Department of Computer Science
University of Minnesota
Duluth, MN 55812
tpederse@d.umn.edu

Amruta Purandare
Department of Computer Science
University of Minnesota
Duluth, MN 55812
pura0010@d.umn.edu

Abstract

This paper describes the English-Hindi Multilingual lexical sample task in SENSEVAL-3. Rather than tagging an English word with a sense from an English dictionary, this task seeks to assign the most appropriate Hindi translation to an ambiguous target word. Training data was solicited via the Open Mind Word Expert (OMWE) from Web users who are fluent in English and Hindi.

1 Introduction

The goal of the MultiLingual lexical sample task is to create a framework for the evaluation of systems that perform Machine Translation, with a focus on the translation of ambiguous words. The task is very similar to the lexical sample task, except that rather than using the sense inventory from a dictionary we follow the suggestion of (Resnik and Yarowsky, 1999) and use the translations of the target words into a second language. In this task for SENSEVAL-3, the contexts are in English, and the "sense tags" for the English target words are their translations in Hindi.

This paper outlines some of the major issues that arose in the creation of this task, and then describes the participating systems and summarizes their results.

2 Open Mind Word Expert

The annotated corpus required for this task was built using the Open Mind Word Expert system (Chklovski and Mihalcea, 2002), adapted for multilingual annotations¹.

¹ Multilingual Open Mind Word Expert can be accessed at http://teach-computers.org/word-expert/english-hindi

To overcome the current lack of tagged data and the limitations imposed by the creation of such data using trained lexicographers, the Open Mind Word Expert system enables the collection of semantically annotated corpora over the Web. Tagged examples are collected using a Web-based application that allows contributors to annotate words with their meanings.

The tagging exercise proceeds as follows. For each target word the system extracts a set of sentences from a large textual corpus. These examples are presented to the contributors, together with all possible translations for the given target word. Users are asked to select the most appropriate translation for the target word in each sentence. The selection is made using check-boxes, which list all possible translations, plus two additional choices, "unclear" and "none of the above." Although users are encouraged to select only one translation per word, the selection of two or more translations is also possible. The results of the classification submitted by other users are not presented to avoid artificial biases.

3 Sense Inventory Representation

The sense inventory used in this task is the set of Hindi translations associated with the English words in our lexical sample. Selecting an appropriate English-Hindi dictionary was a major decision early in the task, and it raised a number of interesting issues.

We were unable to locate any machine readable or electronic versions of English-Hindi dictionaries, so it became apparent that we would need to manually enter the Hindi translations from printed materials. We briefly considered the use of Optical Character Recognition (OCR), but found that our available tools did not support Hindi. Even after deciding to enter the Hindi translations manually, it wasn't clear how those words should be encoded. Hindi is usually represented in Devanagari script, which has a large number of possible encodings, and no clear standard has emerged as yet.

SENSEVAL-3: Third International Workshop on the Evaluation of Systems for the Semantic Analysis of Text, Association for Computational Linguistics, Barcelona, Spain, July 2004

We decided that Romanized or transliterated Hindi text would be the most portable encoding, since it can be represented in standard ASCII text. However, it turned out that the number of English-Hindi bilingual dictionaries is much smaller than the number of Hindi-English dictionaries, and the number that use transliterated text is smaller still.

Still, we located one promising candidate, the English-Hindi Hippocrene Dictionary (Raker and Shukla, 1996), which represents Hindi in a transliterated form. However, we found that many English words only had two or three translations, making it too coarse grained for our purposes².

In the end we selected the Chambers English-Hindi dictionary (Awasthi, 1997), which is a high quality bilingual dictionary that uses Devanagari script. We identified 41 English words from the Chambers dictionary to make up our lexical sample. Then one of the task organizers, who is fluent in English and Hindi, manually transliterated the approximately 500 Hindi translations of the 41 English words in our lexical sample from the Chambers dictionary into the ITRANS format (http://www.aczone.com/itrans/). ITRANS software was used to generate Unicode for display in the OMWE interfaces, although the sense tags used in the task data are the Hindi translations in transliterated form.

4 Training and Test Data

The MultiLingual lexical sample is made up of 41 words: 18 nouns, 15 verbs, and 8 adjectives. This sample includes English words that have varying degrees of polysemy as reflected in the number of possible Hindi translations, which range from a low of 3 to a high of 39.

Text samples made up of several hundred instances for each of 31 of the 41 words were drawn from the British National Corpus, while samples for the other 10 words came from the SENSEVAL-2 English lexical sample data. The BNC data is in a "raw" text form, where the part of speech tags have been removed. However, the SENSEVAL-2 data includes the English sense-tags as determined by human taggers.

² We have made available transcriptions of the entries for approximately 70 Hippocrene nouns, verbs, and adjectives at http://www.d.umn.edu/~pura0010/hindi.html, although these were not used in this task.

After gathering the instances for each word in the lexical sample, we tokenized each instance and removed those that contain collocations of the target word. For example, the training/test instances for arm.n do not include examples for contact arm, pickup arm, etc., but only examples that refer to arm as a single lexical unit (not part of a collocation). In our experience, disambiguation accuracy on collocations of this sort is close to perfect, and we aimed to concentrate the annotation effort on the more difficult cases.
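The collocation filter described above can be sketched roughly as follows. This is a minimal illustration and not the organizers' actual tooling: the tokenizer, the COLLOCATIONS table, and the function names are all assumptions, and the only collocations listed are the two that the text names for arm.n.

```python
import re

# Hypothetical collocation list; the text names only "contact arm" and
# "pickup arm" as examples for the target word "arm".
COLLOCATIONS = {"arm": ["contact arm", "pickup arm"]}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def contains_collocation(tokens, collocation):
    """Return True if the collocation's words occur contiguously in tokens."""
    parts = collocation.split()
    n = len(parts)
    return any(tokens[i:i + n] == parts for i in range(len(tokens) - n + 1))


def filter_instances(instances, target):
    """Keep instances in which the target is not part of a listed collocation."""
    kept = []
    for text in instances:
        tokens = tokenize(text)
        if not any(contains_collocation(tokens, c) for c in COLLOCATIONS[target]):
            kept.append(text)
    return kept


# Made-up example sentences, for illustration only.
instances = [
    "She broke her arm in the fall.",
    "The pickup arm of the record player was bent.",
    "He rested one arm on the table.",
]
kept = filter_instances(instances, "arm")
print(kept)
```

Only the second sentence is dropped here, because it is the only one containing a listed collocation.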

The data was then annotated with Hindi translations by web volunteers using the Open Mind Word Expert (bilingual edition). At various points in time we offered gift certificates as a prize for the most productive tagger in a given day, in order to spur participation. A total of 40 volunteers contributed to this task.

To create the test data we collected two independent tags per instance, and then discarded any instances where the taggers disagreed. Thus, each instance that remains in the test data has complete agreement between two taggers. For the training data, we only collected one tag per instance, and therefore this data may be noisy. Participating systems could choose to apply their own filtering methods to identify and remove the less reliably annotated examples.
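The agreement rule used for the test data can be illustrated with a short sketch. The instance identifiers and tag names below are hypothetical placeholders. Treating each tag as the set of translations a contributor selected, and requiring the two sets to match exactly, is an assumption of this sketch; the text says only that instances on which the taggers disagreed were discarded.

```python
def build_test_set(double_tagged):
    """Keep instances whose two independent tags agree; drop the rest.

    double_tagged maps an instance id to a pair of tags, where each tag is
    the set of translations that one contributor selected.
    """
    test_set = {}
    for instance_id, (tag_a, tag_b) in double_tagged.items():
        if tag_a == tag_b:  # assumed agreement criterion: identical selections
            test_set[instance_id] = tag_a
    return test_set


# Hypothetical instance ids and placeholder tags, for illustration only.
double_tagged = {
    "bank.n.001": (frozenset({"translation_A"}), frozenset({"translation_A"})),
    "bank.n.002": (frozenset({"translation_B"}), frozenset({"translation_A"})),
    "bank.n.003": (frozenset({"translation_B"}), frozenset({"translation_B"})),
}
test_set = build_test_set(double_tagged)
print(sorted(test_set))
```

The second instance is discarded because its two tags differ, so only the first and third remain.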

After tagging by the Web volunteers, there were two data sets provided to task participants: one where the English sense of the target word is unknown, and another where it is known in both the training and test data. These are referred to as the translation only (t) data and the translation and sense (ts) data, respectively. The t data is made up of instances drawn from the BNC as described above, while the ts data is made up of the instances from SENSEVAL-2. Evaluations were run separately for each of these two data sets, which we refer to as the t and ts subtasks.

The t data contains 31 ambiguous words: 15 nouns, 10 verbs, and 6 adjectives. The ts data contains 10 ambiguous words: 3 nouns, 5 verbs, and 2 adjectives, all of which have been used in the English lexical sample task of SENSEVAL-2. These words, the number of possible translations, and the number of training and test instances are shown in Table 1. The total number of training instances in the two sub-tasks is 10,449, and the total number of test instances is 1,535.

5 Participating Systems

Five teams participated in the t subtask, submitting a total of eight systems. Three teams (a subset of those five) participated in the ts subtask, submitting a total of six systems. All submitted systems employed supervised learning, using the training examples provided. Some teams used additional resources as noted in the more detailed descriptions below.

Table 1: Target words in the SENSEVAL-3 English-Hindi task

Lexical Unit    Translations    Train    Test

TRANSLATION ONLY (T-DATA)
band.n                     8      224      91
bank.n                    21      332      52
case.n                    13      348      42
different.a                4      320      25
eat.v                      3      271      48
field.n                   14      300     100
glass.n                    8      379      13
hot.a                     18      348      32
line.n                    39      360      11
note.v                    11      220      12
operate.v                  9      280      50
paper.n                    8      264      73
plan.n                     8      210      35
produce.v                  7      265      67
rest.v                    14      172      10
rule.v                     8      160      18
shape.n                    8      320      32
sharp.a                   16      248      48
smell.v                    5      210      17
solid.a                   16      327      37
substantial.a             15      250     100
suspend.v                  4      370      28
table.n                   21      378      16
talk.v                     6      341      35
taste.n                    6      350      40
terrible.a                 4      200      99
tour.n                     5      240       9
vision.n                  14      318      20
volume.n                   9      309      54
watch.v                   10      300     100
way.n                     16      331      22
TOTAL                    348     8945    1336

TRANSLATION AND SENSE ONLY (TS-DATA)
bar.n                     19      278      39
begin.v                    6      360      15
channel.n                  6       92      16
green.a                    9      175      26
nature.n                  15       71      14
play.v                    14      152      10
simple.a                   9      166      19
treat.v                    7      100      32
wash.v                    16       10      11
work.v                    24      100      17
TOTAL                    125     1504     199

5.1 NUS

The NUS team from the National University of Singapore participated in both the t and ts subtasks. The t system (nusmlst) uses a combination of knowledge sources as features, and the Support Vector Machine (SVM) learning algorithm. The knowledge sources used include part of speech of neighboring words, single words in the surrounding context, local collocations, and syntactic relations. The ts system (nusmlsts) does the same, but adds the English sense of the target word as a knowledge source.

5.2 LIA-LIDILEM

The LIA-LIDILEM team from the Université d'Avignon and the Université Stendhal Grenoble had two systems which participated in both the t and ts subtasks. In the ts subtask, only the English sense tags were used, not the Hindi translations.

The FL-MIX system uses a combination of three probabilistic models, which compute the most probable sense given a six word window of context. The three models are a Poisson model, a Semantic Classification Tree model, and a K nearest neighbors search model. This system also used a part of speech tagger and a lemmatizer.

The FC-MIX system is the same as the FL-MIX system, but replaces context words by more general synonym-like classes computed from a word aligned English-French corpus which numbers approximately 850,000 words in each language.

5.3 HKUST

The HKUST team from the Hong Kong University of Science and Technology had three systems that participated in both the t and ts subtasks.

The HKUST_me_t and HKUST_me_ts systems are maximum entropy classifiers. The HKUST_comb_t and HKUST_comb_ts systems are voted classifiers that combine a new Kernel PCA model with a maximum entropy model and a boosting-based model. The HKUST_comb2_t and HKUST_comb2_ts systems are voted classifiers that combine a new Kernel PCA model with a maximum entropy model, a boosting-based model, and a Naive Bayesian model.
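As a rough illustration of what a voted classifier does, the sketch below combines the labels proposed by several component models by plurality vote. The description above does not specify the HKUST voting scheme, so the unweighted vote and the tie-breaking rule here are assumptions, and the labels are placeholders.

```python
from collections import Counter


def vote(predictions):
    """Return the label proposed by the most component models.

    Ties are broken in favour of the model listed first; this tie-breaking
    rule is an assumption of the sketch.
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


# Hypothetical outputs of three component models for one test instance.
model_outputs = ["translation_A", "translation_B", "translation_A"]
print(vote(model_outputs))
```

With three voters a tie can only occur when all three disagree, in which case this sketch falls back on the first model's label.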

5.4 UMD

The UMD team from the University of Maryland entered one system (UMD-SST) in the t subtask. UMD-SST is a supervised sense tagger based on the Support Vector Machine learning algorithm, and is described more fully in (Cabezas et al., 2001).

5.5 Duluth

The Duluth team from the University of Minnesota, Duluth had one system (Duluth-ELSS) that participated in the t subtask. This system is an ensemble of three bagged decision trees, each based on a different type of lexical feature. This system was known as Duluth3 in SENSEVAL-2, and it is described more fully in (Pedersen, 2001).

6 Results

All systems attempted all of the test instances, so precision and recall are identical, hence we report the single Accuracy figure. Tables 2 and 3 show results for the t and ts subtasks, respectively.

Table 2: t Subtask Results

System              Accuracy
nusmlst                 63.4
HKUST_comb_t            62.0
HKUST_comb2_t           61.4
HKUST_me_t              60.6
FL-MIX                  60.3
FC-MIX                  60.3
UMD-SST                 59.4
Duluth-ELSS             58.2
Baseline (majority)     51.9

Table 3: ts Subtask Results

System              Accuracy
nusmlsts                67.3
FL-MIX                  64.1
FC-MIX                  64.1
HKUST_comb_ts           63.8
HKUST_comb2_ts          63.8
HKUST_me_ts             60.8
Baseline (majority)     55.8
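The reason precision and recall coincide under full coverage can be seen in a small sketch. It assumes the usual scoring convention in which precision is the number correct divided by the number attempted, and recall is the number correct divided by the total number of instances. The majority baseline is sketched as always predicting the most frequent training tag, which is also an assumption, and all tags are placeholders.

```python
from collections import Counter


def precision_recall(gold, predicted):
    """Score predictions; predicted[i] is None when an instance is unattempted."""
    attempted = [(g, p) for g, p in zip(gold, predicted) if p is not None]
    correct = sum(1 for g, p in attempted if g == p)
    precision = correct / len(attempted) if attempted else 0.0
    recall = correct / len(gold)
    return precision, recall


def majority_baseline(train_tags, n_test):
    """Predict the most frequent training tag for every test instance."""
    most_common_tag = Counter(train_tags).most_common(1)[0][0]
    return [most_common_tag] * n_test


# Placeholder tags, for illustration only.
train_tags = ["translation_A", "translation_A", "translation_B",
              "translation_A", "translation_B"]
gold = ["translation_A", "translation_A", "translation_B", "translation_A"]

# Full coverage: every instance attempted, so precision equals recall.
full = majority_baseline(train_tags, len(gold))
p_full, r_full = precision_recall(gold, full)

# Partial coverage: one instance skipped, so the two measures diverge.
partial = ["translation_A", None, "translation_B", "translation_B"]
p_part, r_part = precision_recall(gold, partial)

print(p_full, r_full)
print(p_part, r_part)
```

When no prediction is None the two denominators are the same, so a single accuracy figure carries all the information.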

We note that the participating systems all exceeded the baseline (majority) classifier, by margins ranging from 5.0 to 11.5 percentage points, suggesting that the sense distinctions made by the translations are clear and provide sufficient information for supervised methods to learn effective classifiers.

Interestingly, the average results on the ts data are higher than the average results on the t data, which suggests that sense information is likely to be helpful for the task of targeted word translation. Additional investigations are however required to draw some final conclusions.
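The comparison of average results can be reproduced directly from the accuracy figures in Tables 2 and 3. The sketch below is only an arithmetic check; the variable names are illustrative and the system names are written with underscores for readability.

```python
# Accuracy figures copied from Table 2 (t subtask) and Table 3 (ts subtask).
T_RESULTS = {
    "nusmlst": 63.4, "HKUST_comb_t": 62.0, "HKUST_comb2_t": 61.4,
    "HKUST_me_t": 60.6, "FL-MIX": 60.3, "FC-MIX": 60.3,
    "UMD-SST": 59.4, "Duluth-ELSS": 58.2,
}
T_BASELINE = 51.9

TS_RESULTS = {
    "nusmlsts": 67.3, "FL-MIX": 64.1, "FC-MIX": 64.1,
    "HKUST_comb_ts": 63.8, "HKUST_comb2_ts": 63.8, "HKUST_me_ts": 60.8,
}
TS_BASELINE = 55.8


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


t_mean = mean(T_RESULTS.values())
ts_mean = mean(TS_RESULTS.values())

print(f"t subtask:  mean {t_mean:.1f}, margin over baseline {t_mean - T_BASELINE:.1f}")
print(f"ts subtask: mean {ts_mean:.1f}, margin over baseline {ts_mean - TS_BASELINE:.1f}")
```

The mean is about 60.7 on the t data and about 64.0 on the ts data. Because the two subtasks cover different target words and have different baselines, the gap between the means is not a controlled comparison, which is consistent with the caution expressed above.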

7 Conclusion

The Multilingual Lexical Sample task in SENSEVAL-3 featured English ambiguous words that were to be tagged with their most appropriate Hindi translation. The objective of this task is to determine the feasibility of translating words of various degrees of polysemy, focusing on the translation of specific lexical items. The results of the five teams that participated in this event tentatively suggest that machine learning techniques can significantly improve over the most frequent sense baseline. Additionally, this task has highlighted the creation of testing and training data by leveraging the knowledge of bilingual Web volunteers. The training and test data sets used in this exercise are available online from http://www.senseval.org and http://teach-computers.org.

Acknowledgments

Many thanks to all those who contributed to the Multilingual Open Mind Word Expert project, making this task possible. We are also grateful to all the participants in this task, for their hard work and involvement in this evaluation exercise. Without them, all these comparative analyses would not be possible.

We are particularly grateful to a research grant from the University of North Texas that provided the funding for contributor prizes, and to the National Science Foundation for their support of Amruta Purandare under a Faculty Early CAREER Development Award (#0092784).

References

S. Awasthi, editor. 1997. Chambers English-Hindi Dictionary. South Asia Books, Columbia, MO.

C. Cabezas, P. Resnik, and J. Stevens. 2001. Supervised sense tagging using Support Vector Machines. In Proceedings of the Senseval-2 Workshop, Toulouse, July.

T. Chklovski and R. Mihalcea. 2002. Building a sense tagged corpus with the Open Mind Word Expert. In Proceedings of the ACL Workshop on Word Sense Disambiguation: Recent Successes and Future Directions, Philadelphia.

T. Pedersen. 2001. Machine learning with lexical features: The Duluth approach to Senseval-2. In Proceedings of the Senseval-2 Workshop, pages 139-142, Toulouse, July.

J. Raker and R. Shukla, editors. 1996. Hippocrene Standard Dictionary English-Hindi Hindi-English (With Romanized Pronunciation). Hippocrene Books, New York, NY.

P. Resnik and D. Yarowsky. 1999. Distinguishing systems and distinguishing senses: New evaluation methods for word sense disambiguation. Natural Language Engineering, 5(2):113-133.

