
Emoji-Powered Representation Learning for Cross-Lingual Sentiment Classification (Extended Abstract)

Zhenpeng Chen1, Sheng Shen2, Ziniu Hu3, Xuan Lu4, Qiaozhu Mei4 and Xuanzhe Liu1

1Key Lab of High-Confidence Software Technology, MoE (Peking University), Beijing, China
2University of California, Berkeley, USA
3University of California, Los Angeles, USA
4University of Michigan, Ann Arbor, USA

czp@pku.edu.cn, sheng.s@berkeley.edu, bull@cs.ucla.edu, {luxuan, qmei}@umich.edu, xzl@pku.edu.cn

Abstract

Sentiment classification typically relies on a large amount of labeled data. In practice, the availability of labels is highly imbalanced among different languages. To tackle this problem, cross-lingual sentiment classification approaches aim to transfer knowledge learned from one language that has abundant labeled examples (i.e., the source language, usually English) to another language with fewer labels (i.e., the target language). The source and the target languages are usually bridged through off-the-shelf machine translation tools. Through such a channel, cross-language sentiment patterns can be successfully learned from English and transferred into the target languages. This approach, however, often fails to capture sentiment knowledge specific to the target language. In this paper, we employ emojis, which are widely available in many languages, as a new channel to learn both the cross-language and the language-specific sentiment patterns. We propose a novel representation learning method that uses emoji prediction as an instrument to learn respective sentiment-aware representations for each language. The learned representations are then integrated to facilitate cross-lingual sentiment classification.

1 Introduction

Sentiment analysis has been a critical component in many applications such as recommender systems [Sun et al., 2019], personalized content delivery [Harakawa et al., 2018], and online advertising [Qiu et al., 2010]. However, existing work on sentiment analysis mainly deals with English texts [Deriu et al., 2017]. Although some efforts have also been made with other languages, sentiment analysis for non-English languages is far behind. This creates a considerable inequality in the quality of the aforementioned Web services received by non-English users, especially considering that 74.8% of Internet users are non-English speakers.[1] The cause of this inequality is quite simple: effective sentiment analysis tools are often built upon supervised learning techniques, and there are far more labeled examples in English than in other languages.

[*] This paper is an abridged version of a paper with the same title, which won the Best Full Paper Award at the WWW 2019 conference [Chen et al., 2019]. This work was supported by the Key-Area Research and Development Program of Guangdong Province under grant No. 2020B010164002 and the Beijing Outstanding Young Scientist Program under grant No. BJJWZYJH01201910001004. Corresponding author: Xuanzhe Liu.
[1] https://www.internetworldstats.com/stats7.htm

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence (IJCAI-20), Sister Conferences Best Papers Track

A straightforward solution is to transfer the knowledge learned from a label-rich language (i.e., the source language, usually English) to another language that has fewer labels (i.e., the target language), an approach known as cross-lingual sentiment classification [Chen et al., 2017]. In practice, its biggest challenge is how to fill the linguistic gap between English and the target language. Most recent studies have been using off-the-shelf machine translation tools to generate pseudo-parallel corpora and then learn bilingual representations for the downstream sentiment classification task [Xiao and Guo, 2013; Zhou et al., 2016]. More specifically, many of these methods enforce the aligned bilingual texts to share a unified embedding space, and sentiment analysis of the target language is conducted in that space.

Although this approach looks sensible, the performance of these machine translation-based methods often falls short. Indeed, a major obstacle of cross-lingual sentiment analysis is the so-called language discrepancy problem [Chen et al., 2017], which machine translation does not tackle well. More specifically, sentiment expressions often differ a lot across languages. Machine translation is able to retain the general expressions of sentiments that are shared across languages (e.g., "angry" and its direct translations, for negative sentiment), but it usually loses or even alters the sentiments in language-specific expressions [Mohammad et al., 2016]. As an example, in Japanese, the common expression "湯水のように使う" indicates a negative sentiment, describing the excessive usage or waste of a resource. However, its literal translation in English, "use it like hot water," loses the negative sentiment. The reason behind this pitfall is easy to explain: machine translation tools are usually trained on parallel corpora that are built in the first place to capture patterns shared across languages instead of patterns specific to individual languages. In other words, the problem is due to the failure to retain language-specific sentiment knowledge when unilaterally pursuing generalization across languages. A new bridge needs to be built beyond machine translation, one which not only transfers "general sentiment knowledge" from the source language but also captures "private sentiment knowledge" of the target language. That bridge can be built with emojis.

In this paper, we tackle the problem of cross-lingual sentiment analysis by employing emojis as an instrument. Emojis are considered an emerging ubiquitous language used worldwide [Chen et al., 2018]; in our approach they serve both as a proxy of sentiment labels and as a bridge between languages. Their functionality of expressing emotions motivates us to employ emojis as complementary labels for sentiments, while their ubiquity makes it feasible to learn emoji-sentiment representations for almost every active language. Coupled with machine translation, the cross-language patterns of emoji usage can complement the pseudo-parallel corpora and narrow the language gap, and the language-specific patterns of emoji usage help address the language discrepancy problem.

We propose ELSA, a novel framework of Emoji-powered representation learning for cross-Lingual Sentiment Analysis [Chen et al., 2019]. Through ELSA, language-specific representations are first derived based on modeling how emojis are used alongside words in each language. These per-language representations are then integrated and refined to predict the rich sentiment labels in the source language, through the help of machine translation. Different from the mandatorily aligned bilingual representations in existing studies, the joint representation learned through ELSA catches not only the general sentiment patterns across languages, but also the language-specific patterns. In this way, the new representation and the downstream tasks are no longer dominated by the source language.

We evaluate the performance of ELSA[2] on a benchmark Amazon review dataset, which covers nine tasks combined from three target languages (i.e., Japanese, French, and German) and three domains (i.e., book, DVD, and music). Results indicate that ELSA outperforms existing approaches on all of these tasks in terms of classification accuracy.

2 The ELSA Approach

Cross-lingual sentiment classification aims to use the labeled data in a source language (i.e., English) to learn a model that can classify the sentiment of test data in a target language. In our setting, besides labeled English documents (L_S), we also have large-scale unlabeled data in English (U_S) and in the target language (U_T). Furthermore, there exist unlabeled data containing emojis, both in English (E_S) and in the target language (E_T). In practice, these unlabeled, emoji-rich data can be easily obtained from online social media such as Twitter. Our task is to build a model that can classify the sentiment polarity of documents in the target language solely based on the labeled data in the source language (i.e., L_S) and the different kinds of unlabeled data (i.e., U_S, U_T, E_S, and E_T). Finally, we use a held-out set of labeled documents in the target language (L_T), which can be small, to evaluate the model.

[2] The benchmark datasets, scripts, and pre-trained models are available at https://github.com/sInceraSs/ELSA.

The workflow of ELSA is illustrated in Figure 1(a), with the following steps. In step 1 and step 2, we build sentence representation models for both the source and the target languages. In step 3, we translate each labeled English document into the target language, sentence by sentence, through Google Translate. Both the English sentences and their translations are fed into the representation models learned in steps 1 and 2 to obtain their per-language representations (step 4 and step 5). Then in step 6 and step 7 we aggregate these sentence representations back to form two compact representations for each training document, one in English and the other in the target language. In step 8, we use the two representations as features to predict the real sentiment label of each document and obtain the final sentiment classifier. In the test phase, for a new document in the target language, we translate it into English and then follow the previous steps to obtain its representation (step 9), based on which we predict the sentiment label using the classifier (step 10).

2.1 Representation Learning

Representations of documents need to be learned before we train the sentiment classifier. Since emojis are widely used to express sentiments across languages, we learn sentiment-aware representations of documents using emoji prediction as an instrument. Specifically, in a distantly supervised way, we use emojis as surrogate sentiment labels and learn sentence embeddings by predicting which emojis are used in a sentence. This representation learning process is conducted separately in the source and the target languages to capture language-specific sentiment expressions. The architecture of the representation learning model is illustrated in Figure 1(b).
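As a concrete illustration of this distant-supervision setup, the sketch below turns raw posts into (plain text, emoji) training pairs: emojis are stripped from the text, and each removed emoji becomes a surrogate label. The tiny emoji set and whitespace tokenizer are simplified placeholders, not the paper's actual preprocessing.

```python
# Sketch: building (plain_text, emoji) pairs for emoji prediction,
# i.e., distant supervision with emojis as surrogate labels.
# The emoji set and the whitespace tokenizer are illustrative only.

EMOJIS = {"😂", "😠", "❤️", "😢"}  # illustrative subset of the label set

def make_pairs(post):
    """Return one (plain_text, emoji) pair per emoji occurring in the post."""
    tokens = post.split()
    emojis = [t for t in tokens if t in EMOJIS]
    plain = " ".join(t for t in tokens if t not in EMOJIS)
    return [(plain, e) for e in emojis]

pairs = make_pairs("so happy about this 😂 ❤️")
# each pair: the emoji-free text x, and one contained emoji e
```

A post containing several emojis yields several training pairs, all sharing the same plain text x, matching the (x, e) notation introduced below.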

Word Embedding Layer. The word embeddings are pre-trained with the skip-gram algorithm based on either U_S or U_T, which encodes every word into a continuous vector space.

Bi-Directional LSTM Layer. LSTM is particularly suited to sequential input: at each step (e.g., a word token), LSTM combines the current input and knowledge from the previous steps to update the states of the hidden layer. Let us denote each sentence in E_S or E_T as (x, e), where x = [d_1; d_2; ...; d_L] is a sequence of word vectors representing the plain text (obtained by removing emojis) and e is one emoji contained in the text. At each step t, we can extract the latent vector from the LSTM. In order to capture the information from the context both preceding and following a word, we use the bi-directional LSTM. We concatenate the latent vectors from both directions to construct a bi-directional encoded vector h_i for every single word vector d_i, which is:

h_i = [→h_i; ←h_i].

Attention Layer. The attention layer takes the outputs of both the embedding layer and the two LSTM layers as input, through a skip-connection, which enables unimpeded information flow in the whole training process. The i-th word of the input sentence can be represented as u_i:

u_i = [d_i; h_i^1; h_i^2],

Figure 1: The ELSA approach. (a) The workflow of ELSA. (b) Network architecture for representation learning through emoji prediction.

where d_i, h_i^1, and h_i^2 denote the encoded vectors of words extracted in the word embedding layer and the first and second bi-directional LSTMs, respectively. Since not all words contribute equally to predicting emojis or expressing sentiments, we employ the attention mechanism [Bahdanau et al., 2014] to determine the importance of every single word. The attention score of the i-th word is calculated by

a_i = exp(W_a u_i) / Σ_{j=1}^{L} exp(W_a u_j),

where W_a is the weight matrix used by the attention layer. Then each sentence can be represented as the weighted sum of all words in it, using the attention scores as weights. We denote the sentence representation as v.

Softmax Layer. The sentence representation is then fed into the softmax layer, which returns a probability vector Y. Each element of this vector indicates the probability that the sentence contains a specific emoji. Finally, we learn the model parameters by minimizing the cross entropy between the output probability vectors and the one-hot vectors of the emoji contained in each sentence. After learning the parameters, we can extract the output of the attention layer to represent each input sentence. Through this emoji-prediction process, words with distinctive sentiments can be identified, and the plain text surrounding the same emojis will be represented similarly. Given that the sentiment labels are limited, once the emoji-powered sentence representations are trained, they are frozen in the downstream sentiment prediction task to avoid over-fitting.
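The skip-connection and attention computation can be reproduced in a few lines of NumPy. The sketch below uses random stand-ins for the word embeddings d_i and the two bi-LSTM outputs h_i^1 and h_i^2, and a random vector w_a in place of the learned weight matrix W_a; only the algebra matches the equations above.

```python
import numpy as np

# Sketch of the attention layer: skip-connected word vectors u_i,
# softmax attention scores a_i, and the weighted sentence vector v.
# All inputs are random stand-ins for the learned embeddings/LSTM states.

rng = np.random.default_rng(0)
L, d_emb, d_lstm = 5, 8, 6             # sentence length and layer widths

d  = rng.normal(size=(L, d_emb))       # word embeddings d_i
h1 = rng.normal(size=(L, 2 * d_lstm))  # first bi-LSTM outputs h_i^1
h2 = rng.normal(size=(L, 2 * d_lstm))  # second bi-LSTM outputs h_i^2

u = np.concatenate([d, h1, h2], axis=1)  # u_i = [d_i; h_i^1; h_i^2]
w_a = rng.normal(size=u.shape[1])        # stand-in for W_a

scores = u @ w_a
a = np.exp(scores) / np.exp(scores).sum()  # a_i = exp(W_a u_i) / Σ_j exp(W_a u_j)
v = a @ u                                  # sentence representation v
```

Note that v has the same dimensionality as each u_i, since it is a convex combination of the word vectors.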

2.2 Training the Sentiment Classifier

Based on the pre-trained, per-language sentence representations, we then learn document representations and conduct cross-lingual sentiment classification. First, for each English document D_s ∈ L_S, we use the pre-trained English representation model to embed every single sentence in it. Second, we aggregate these sentence representations to derive a compact document representation. Because different parts of a docu-