S Bandyopadhyay, D S Sharma and R Sangal. Proc. of the 14th Intl. Conference on Natural Language Processing, pages 75-84, Kolkata, India. December 2017. © 2016 NLP Association of India (NLPAI)

Quantitative Characterization of Code Switching Patterns in Complex Multi-Party Conversations: A Case Study on Hindi Movie Scripts

Adithya Pratapa
Microsoft Research, India
adithyapratapa@gmail.com

Monojit Choudhury
Microsoft Research, India
monojitc@microsoft.com

Abstract

In this paper, we present a framework for quantitative characterization of code-switching patterns in multi-party conversations, which allows us to compare and contrast the socio-cultural and functional aspects of code-switching within a set of cultural contexts. Our method applies some of the proposed metrics for quantification of code-switching (Gamback and Das, 2016; Guzman et al., 2017) at the level of entire conversations, dyads and participants. We apply this technique to analyze the conversations from 18 recent Hindi movies. In the process, we are able to tease apart the use of code-switching as a device for establishing identity, socio-cultural contexts of the characters and the events in a movie.

1 Introduction

Code-switching (henceforth CS) or code-mixing refers to the juxtaposition of linguistic units from more than one language in a single conversation, or in a single utterance. Linguists have extensively studied the structural (i.e., the grammatical constraints on CS) and functional (i.e., the motivation and intention behind CS) aspects of CS in various mediums, contexts, languages and geographies (Myers-Scotton, 2005; Auer, 1995, 2013). However, most of these studies are limited to qualitative analysis of small datasets, which makes it hard to make statistically valid quantitative claims over the nature and distribution of CS.

Recently, due to the availability of large code-switched datasets, gathered mostly from social media, there have been some quantitative studies on socio-linguistic and functional aspects of CS (Rudra et al., 2016; Rijhwani et al., 2017; Guzman et al., 2017). Nevertheless, there are no large-scale studies of CS in conversations, primarily because currently the only available large-scale datasets come from social media. These are either micro-blogs without any conversational context or data from Facebook or WhatsApp with very short conversations. On the other hand, functions of CS are most relevant and discernible in relatively long multi-party conversations embedded in a social context. For instance, it is well documented (Auer, 2013) that CS is motivated by complex social functions, such as identity, social power and style accommodation, which are difficult to elicit and establish from short social media texts.

In this work, we propose a set of techniques for analyzing CS styles and functions in conversations grounded over social networks. Our approach de[...] the Code-mixing Index (CMI) (Gamback and Das, 2016) and corpus-level metrics proposed in Guzman et al. (2017), applied to conversations at the level of dyads, participants, conversation scenes and the entire social network of the participants.
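To make the idea of applying a mixing metric at several levels concrete, here is a minimal sketch. It assumes the commonly cited utterance-level form of the index, CMI = 100 × (1 − max_i w_i / (n − u)), where w_i is the number of tokens in language i, n is the total number of tokens and u is the number of language-independent tokens. The tag names (`HI`, `EN`, `OTHER`), the utterance fields and the plain averaging over groups are illustrative assumptions, not the definitions used in this paper.

```python
from collections import Counter, defaultdict

# Tags treated as language-independent (assumed tag set, not from the paper).
NEUTRAL = {"OTHER"}

def cmi(tags):
    """Utterance-level Code-Mixing Index: 100 * (1 - max_i w_i / (n - u))."""
    lang_counts = Counter(t for t in tags if t not in NEUTRAL)
    lang_tokens = sum(lang_counts.values())  # this is n - u
    if lang_tokens == 0:
        return 0.0  # no language-tagged tokens, so no mixing
    return 100.0 * (1.0 - max(lang_counts.values()) / lang_tokens)

def mean_cmi_by(utterances, key):
    """Average utterance CMI over groups selected by `key` (assumed aggregation)."""
    groups = defaultdict(list)
    for utt in utterances:
        groups[key(utt)].append(cmi(utt["tags"]))
    return {g: sum(v) / len(v) for g, v in groups.items()}

# Hypothetical toy conversation between two participants.
utterances = [
    {"speaker": "A", "addressee": "B", "tags": ["HI", "HI", "EN", "EN"]},
    {"speaker": "B", "addressee": "A", "tags": ["HI", "HI", "HI", "OTHER"]},
    {"speaker": "A", "addressee": "B", "tags": ["EN", "HI", "HI", "HI"]},
]

# Participant level: group by speaker.
per_speaker = mean_cmi_by(utterances, lambda u: u["speaker"])
# Dyad level: group by the unordered speaker/addressee pair.
per_dyad = mean_cmi_by(
    utterances, lambda u: frozenset((u["speaker"], u["addressee"]))
)
```

Grouping by a key function keeps the same metric reusable for scenes or whole conversations: only the key changes, not the computation.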

We apply this new approach to analyze scripts of 18 recent Hindi movies with various degrees and styles of Hindi-English CS. Through this analysis technique, we are able to bring out the social functions of CS at different levels.

The primary contributions of this work are: (a) development of a set of quantitative conversation analysis techniques for CS; (b) some visualization techniques for CS patterns in conversations that can help linguists and social scientists to get a holistic view of the switching styles in interactions; (c) analysis of CS patterns in recent Hindi movies that adds to the existing rich literature of similar but small scale qualitative studies of CS in Indian cinema.

The rest of this paper is organized as follows: Sec 2 describes related work on functions of CS with particular emphasis on CS in Indian cinema. Sec 3 introduces our analysis technique, which is later applied and illustrated in the context of movie scripts in Sec 5 and 6. Sec 4 introduces the movie dataset, preprocessing of the scripts and word-level language labeling of the dialogues. Sec 7 concludes the paper by summarizing the contributions and discussing potential future work.

2 Related Work

In this section, we will start with a brief review of the linguistics literature on functional and socio-linguistic aspects of CS, followed by a discussion on recent computational models. In order to put the case-study on Hindi movies in perspective, we will also review relevant literature on CS in Indian cinema.

2.1 Functions of Code-Switching

Code-switching is a common phenomenon in all multilingual communities, though usually it is unpredictable whether in a given context a speaker will code-switch or not (Auer, 1995). Nevertheless, linguists have observed that there are preferred languages for communicating certain kinds of functions. For instance, certain speech activities might be exclusively or more commonly related to a certain language choice (e.g., Fishman (1971) reports use of English for professional purposes and Spanish for informal chat for English-Spanish bilinguals from Puerto Rico). Language switching is also used as a signaling device that serves specific communicative functions (Barredo, 1997; Sanchez, 1983; Nishimura, 1995; Maschler, 1991, 1994) such as: (a) reported speech, (b) narrative to evaluative switch, (c) reiterations or emphasis, (d) topic shift, (e) puns and language play, (f) topic/comment structuring, etc. Attempts at predicting the preferred language, or even exhaustively listing such functions, have failed. However, linguists agree that language alternation in multilingual communities is not a random process.

Code-switching is also strongly linked to social identity and the principle of linguistic style accommodation (Melhim and Rahman, 1991; Auer, 2013). For instance, two Hindi-English bilingual speakers could code-switch just to establish a connection or in-group identity because CS is the norm for a large section of urban Indians, and English is attached to aspirational values by a large section of the Indian society (see Sec. 2.3 for detailed discussion on this).

2.2 Computational and Quantitative Studies

Over the last decade, research in computational processing of code-switching has gained significant interest (Solorio and Liu, 2008, 2010; Vyas et al., 2014; Peng et al., 2014; Sharma et al., 2016). In particular, word-level language identification, which is the first step towards processing of CS text, has received a lot of attention (see Rijhwani et al. (2017) for a review). In this work, we use the word-level language labeler by Gella et al. (2013) for labeling the Hindi movie dialogues.
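As an illustration of why word-level labels are the first processing step, the sketch below takes the output of any such labeler and derives two simple quantities from it: whether an utterance is code-switched at all, and how many switch points it contains. The example sentence, its labels and the tag names (`HI`, `EN`, `OTHER`) are hypothetical, and the functions are stand-ins for downstream processing, not the labeler of Gella et al. (2013).

```python
def language_tags(tags, neutral=("OTHER",)):
    """Keep only language-bearing tags, dropping punctuation and similar tokens."""
    return [t for t in tags if t not in neutral]

def is_code_switched(tags, neutral=("OTHER",)):
    """An utterance counts as code-switched if it contains more than one language."""
    return len(set(language_tags(tags, neutral))) > 1

def switch_points(tags, neutral=("OTHER",)):
    """Count language changes between consecutive language-tagged tokens."""
    langs = language_tags(tags, neutral)
    return sum(1 for prev, cur in zip(langs, langs[1:]) if prev != cur)

# Hypothetical labeled utterance: (token, language tag) pairs.
labeled = [
    ("yaar", "HI"), ("this", "EN"), ("movie", "EN"), ("bahut", "HI"),
    ("boring", "EN"), ("hai", "HI"), ("!", "OTHER"),
]
tags = [tag for _, tag in labeled]

print(is_code_switched(tags), switch_points(tags))
```

Neutral tokens are skipped before comparing neighbours so that punctuation between two words of the same language does not register as a switch.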

Nevertheless, to the best of our knowledge, there has been very little work on automatic identification of functional aspects of CS or any large-scale data-driven study of its socio-linguistic aspects. Of the few studies that exist, the most notable are the ones by Rudra et al. (2016) on language preference by Hindi-English bilinguals on Twitter and Rijhwani et al. (2017) on extent and patterns of CS across European languages from 24 cities. Rudra et al. (2016) analyzed 430K unique tweets for opinion and sentiment, and concluded that Hindi-English bilinguals prefer to express negative opinions in Hindi; they further report that a large fraction of the CS tweets exhibited the narrative-evaluative function. Rijhwani et al. (2017) examined more than 50M tweets from across the world; the study shows that the percentage of CS tweets varies from 1 to 11% across the cities, and more CS is observed in the cities where English is not the primary language of communication. They also show that English-Spanish CS patterns in a predominantly Spanish speaking region (e.g., Barcelona) are different from those where English is the primary language (e.g., Houston).

In an excellent survey on computational sociolinguistics, Nguyen et al. (2016) report a few other studies on socio-linguistic aspects of multilingual communities.

2.3 Code-switching in Indian Cinema

Hindi-English CS, commonly called Hinglish, is extremely widespread in India. There is historical attestation, as well as recent studies on the growing use of Hinglish in general conversation, and in entertainment and media (see Parshad et al. (2016) and references therein). Several recent studies (Bali et al., 2014; Barman et al., 2014; Sequiera et al., 2015) also provide evidence of Hinglish and other instances of CS on online social media, such as Twitter and Facebook.

Hindi movies provide a rich data source for studying CS in the Indian context. According to the Conversational Analysis approach to CS (Auer, 2013; Wei, 2002), in any given context a particular language is preferred or unmarked. Therefore, "speakers, and in turn script writers, choose marked or unmarked codes on the basis of which one will bring them the best outcomes" (Vaish, 2011). Myers-Scotton (2005) suggested that the matrix or unmarked code for Hindi movies is Hindi. Therefore, any switch to English has some communicative purpose. Lösch (2007) uses this idea to analyze the dialogues of the movie Monsoon Wedding (2001) and concludes