The corpus question: lesson and example. Marking scheme: out of 4 points for
There are two ways to do this: either single out one text from the corpus and comment briefly on its originality
AN INTRODUCTION TO CORPUS LINGUISTICS
Using Corpora in the Language Learning Classroom: Corpus Linguistics for Teachers. Gena R. Bennett pora there is a “method” to employ. The Corpus ...
Corpus sampling
This difficulty may account partly for the reaction against corpus-based linguistics during the Chomsky-dominated decades of the 1960s and 1970s. Intro-.
ON THE PRESENTATION OF THE CORPUS
Which method(s) for approaching a corpus at the baccalauréat? 1. Explore the corpus. 2. Characterize the documents. 3. Rank the documents.
An automated method to build a corpus of rhetorically-classified
26 Jun 2014 · Abstract. The rhetorical classification of sentences in biomedical texts is an important task in the recognition of the components of
Writing the introduction to the corpus question: an example
Writing the introduction to the corpus question: an example. Proposed topic: how do these texts view working-class women?
Statistics in Corpus Linguistics
Statistical techniques intro- Corpus linguistics is a scientific method of language analysis. ... Statistics and scientific method: an introduction.
Document synthesis
the conclusion). You can begin with an affirmative sentence and then continue with a negative-interrogative one. Example: Admittedly
Methodology: "Methodical analysis of a corpus of works and
Methodology. "Methodical analysis of a corpus of works and reflection on certain aspects of artistic creation". FIRST QUESTION OF THE EXAM
Towards a Methodology for a Corpus-Based Approach to
Evaluation: parameters, methods
[PDF] The corpus question: lesson and example. Marking scheme: out of 4 points for
Introduction: a concise presentation of the corpus, adding some information (do not simply rephrase or paraphrase the paratext) on
[PDF] Method sheet: the corpus question - Créer son blog
Method 1) Write an introduction: • "This corpus consists of X texts" • Then cite each text with its title, author and publication date
[PDF] Method for the corpus question - Lettrines
An introduction (one paragraph) presents the corpus. It gives the names of the authors and works as well as the genre and period to which they belong
[PDF] Method for the corpus question - Zone littéraire
- Introduction: it must be brief. Present the documents in the corpus according to the grouping you have identified (similarities and differences) and not
[PDF] THE CORPUS QUESTION IN THE WRITTEN EXAM - METHOD
To write the introduction: 1) Recall the titles of the works from which the corpus texts are taken and the names of their authors (this information appears
Method for the corpus question - Maxicours
Once this draft work is done, move on to writing. The introduction and conclusion must be short (unlike the work of
[PDF] Methodology of the corpus question
This corpus relates to one or more objects of study in the curriculum. The candidate must answer one or two questions that lead them to compare the texts
[PDF] Methodology 1) Studying a corpus of texts - cloudfrontnet
Make a first overall approach to the corpus: to which object(s) The introduction leads into the topic and sets out the central question by taking up the quotation or
[PDF] Digital corpora as an aid to academic writing - HAL
reflections, methods. Corpora, university writing and specialized vocabulary. Introduction of phraseology. 4 Conclusion. C Cavalla - ACFAS 2019
[PDF] Introduction 1 Presentation of the corpus - Université Côte d'Azur
11 Jul 2012 · First, the aim will be to give an account of the various studies in the French-speaking world. It will involve, for example, studying the
How do you write the introduction to a corpus question?
For the introduction, simply present the object of study (theatre, argumentation) and the theme. Then rephrase the question indirectly. For example: it will be worth asking which registers the authors use in these different argumentative texts.
How do you write a good corpus answer?
Read all the documents and paratexts to find common points. Restate the main idea of each text. Define the general theme of the corpus. Compare the documents: look at how these ideas qualify, complement or, on the contrary, contradict one another.
What is a corpus (with an example)?
A corpus is a set of documents, artistic or otherwise (texts, images, videos, etc.), brought together for a specific purpose. Corpora are used in several fields: literary studies, linguistics, the sciences, philosophy, etc.
- If you define your corpus around a theme or a concept, the best method is to define a set of relevant keywords and synonyms with which to query the directories, catalogues and databases of early printed books.
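The keyword-and-synonym approach to assembling a thematic corpus can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the theme, the keyword list, the three catalogue records and the function names are hypothetical, and real catalogues or databases would be queried through their own search interfaces rather than an in-memory list. Matching is done on accent-stripped, lowercased substrings, which is a simplification.

```python
import unicodedata


def normalize(text: str) -> str:
    """Lowercase and strip accents so that 'Théâtre' matches 'theatre'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).lower()


def select_corpus(records, keywords):
    """Return the records whose title contains at least one keyword or synonym."""
    wanted = [normalize(k) for k in keywords]
    return [r for r in records if any(k in normalize(r["title"]) for k in wanted)]


# Hypothetical theme (the theatre) with a few related terms
KEYWORDS = ["théâtre", "comédie", "tragédie"]

# Hypothetical catalogue records, invented for this example
CATALOGUE = [
    {"title": "Le Théâtre de la foire", "year": 1721},
    {"title": "Traité de géométrie", "year": 1690},
    {"title": "Tragédie nouvelle", "year": 1705},
]

corpus = select_corpus(CATALOGUE, KEYWORDS)
print([r["title"] for r in corpus])
```

Because the match is a plain substring test, a short keyword can also match inside an unrelated longer word; a word-boundary match would be a reasonable refinement if that becomes a problem.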
Statistics in Corpus Linguistics
Do you use language corpora in your research or study, but find that you struggle with statistics? This practical introduction will equip you to understand the key principles of statistical thinking and apply these concepts to your own research, without the need for prior statistical knowledge. The book gives step-by-step guidance through the process of statistical analysis and provides multiple examples of how statistical techniques can be used to analyse and visualize linguistic data. It also includes a useful selection of discussion questions and exercises which you can use to check your understanding. The book comes with a companion website, which provides additional materials (including answers to exercises, datasets, advanced materials, teaching slides etc.) and Lancaster Stats Tools online (http://corpora.lancs.ac.uk/stats), a free click-and-analyse statistical tool for easy calculation of the statistical measures discussed in the book.
Vaclav Brezina is a senior lecturer at the Department of Linguistics and English Language, Lancaster University. He specializes in corpus linguistics, statistics and applied linguistics, and has designed a number of different tools for corpus analysis.
Statistics in Corpus Linguistics
A Practical Guide
VACLAV BREZINA
Lancaster University
University Printing House, Cambridge CB2 8BS, United Kingdom
One Liberty Plaza, 20th Floor, New York, NY 10006, USA
477 Williamstown Road, Port Melbourne, VIC 3207, Australia
314-321, 3rd Floor, Plot 3, Splendor Forum, Jasola District Centre, New Delhi-110025, India
79 Anson Road, #06-04/06, Singapore 079906
Cambridge University Press is part of the University of Cambridge.
It furthers the University's mission by disseminating knowledge in the pursuit of education, learning, and research at the highest international levels of excellence.
www.cambridge.org
Information on this title: www.cambridge.org/9781107125704
DOI: 10.1017/9781316410899
© Vaclav Brezina 2018
This publication is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.
First published 2018
Printed in the United Kingdom by TJ International Ltd. Padstow Cornwall
A catalogue record for this publication is available from the British Library.
Library of Congress Cataloging-in-Publication Data
Names: Brezina, Vaclav, 1979- author.
Title: Statistics in corpus linguistics : a practical guide / Vaclav Brezina, Lancaster University.
Description: Cambridge ; New York : Cambridge University Press, 2018. | Includes bibliographical references and index.
Identifiers: LCCN 2018007010 | ISBN 9781107125704 (alk. paper)
Subjects: LCSH: Corpora (Linguistics) | Linguistics-Statistical methods.
Classification: LCC P128.C68 B76 2018 | DDC 410.1/88-dc23
LC record available at https://lccn.loc.gov/2018007010
ISBN 978-1-107-12570-4 Hardback
ISBN 978-1-107-56524-1 Paperback
Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party internet websites referred to in this publication and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.
List of Figures
List of Tables
About This Book
Acknowledgements

1.1 What Is This Chapter About?
1.2 What Is Statistics? Science, Corpus Linguistics and Statistics
1.3 Basic Statistical Terminology
1.4 Building of Corpora and Research Design
1.5 Exploring Data and Data Visualization
1.6 Application and Further Examples: Do Fiction Writers Use More Adjectives than Academics?
1.7 Exercises
Things to Remember
Advanced Reading

2.1 What Is This Chapter About?
2.2 Tokens, Types, Lemmas and Lexemes
2.3 Words in a Frequency List
2.4 The Whelk Problem: Dispersion
2.5 Which Words Are Important? Average Reduced Frequency
2.6 Lexical Diversity: Type/Token Ratio (TTR), STTR and MATTR
2.7 Application and Further Examples: Do the British Talk about Weather All the Time?
2.8 Exercises
Things to Remember
Advanced Reading

[...] Reliability of Manual Coding
3.1 What Is This Chapter About?
3.2 Collocations and Association Measures
3.3 Collocation Graphs and Networks: Exploring Cross-associations
3.4 Keywords and Lockwords
3.5 Inter-rater Agreement Measures
3.6 Application and Further Examples: What Do Readers of British Newspapers Think about Immigration?
3.7 Exercises
Things to Remember
Advanced Reading

4.1 What Is This Chapter About?
4.2 Analysing a Lexico-grammatical Feature
4.3 Cross-tabulation, Percentages and Chi-squared Test
4.4 Logistic Regression
4.5 Application: That or Which?
4.6 Exercises
Things to Remember
Advanced Reading

5.1 What Is This Chapter About?
5.2 Relationships between Variables: Correlations
5.3 Classification: Hierarchical Agglomerative Cluster Analysis
5.4 Multidimensional Analysis (MD)
5.5 Application: Registers in New Zealand English
5.6 Exercises
Things to Remember
Advanced Reading

[...] Variation
6.1 What Is This Chapter About?
6.2 Individual Style and Social Variation: Where Does a Sociolinguistic Variable Start?
6.3 Group Comparison: T-Test, ANOVA, Mann-Whitney U Test, Kruskal-Wallis Test
6.4 Individual Style: Correspondence Analysis
6.5 Linguistic Context: Mixed-Effects Models
6.6 Application: Who Is This Person from the White House?
6.7 Exercises
Things to Remember
Advanced Reading

7.1 What Is This Chapter About?
7.2 Time as a Variable: Measuring and Visualizing Time
7.3 Finding and Interpreting Differences: Percentage Change and the Bootstrap Test
7.4 Grouping Time Periods: Neighbouring Cluster Analysis
7.5 Modelling Changes in Discourse: Peaks and Troughs and UFA
7.6 Application: Colours in the Seventeenth Century
7.7 Exercises
Things to Remember
Advanced Reading

[...] Thinking, Meta-analysis and Effect Sizes
8.1 What Is This Chapter About?
8.2 Ten Principles of Statistical Thinking
8.3 Meta-analysis: Statistical Synthesis of Research Results
8.4 Effect Sizes: A Guide for Meaningful Use
8.5 Exercises
Things to Remember
Advanced Reading

Final Remarks
References
Index
Figures
1.1 The relationship between the relative frequency of adjectives and verbs
1.2 Process of statistical analysis
1.3 Example of a dataset
1.4 The distribution of the first-person pronoun in the Trinity Lancaster Corpus
1.5 Standard normal distribution
1.6 Dispersion of adjective frequencies in 11 corpus files
1.7 Confidence intervals: two situations
1.8 Research designs in corpus linguistics
1.9 Bar chart: variable x in three corpora
1.10 Boxplot: variable x in three corpora
1.11 Error bars: variable x in three corpora
1.12 Histogram: the definite article in BE06
1.13 Histogram: the f-word in BNC64
1.14 Scatterplot: the and I in BNC64
1.15 Scatterplot: the, I and you in BNC64
1.16 Top ten places connected with 'going' or 'travelling' in the BNC
1.17 Other types of visualizations
1.18 The use of adjectives by fiction and academic writers: boxplot
1.19 The use of adjectives by fiction and academic writers: error bars
1.20 Great Britain: main island
2.1 Distribution of word frequencies in the BNC
2.2 Example corpus: calculation of SD
2.3 Distribution of words w1 and w2
3.1 Frequency and exclusivity scale
3.2 Collocation graph: 'love' in BE06 (10a-log Dice (7), L3-R3, C5-NC5)
3.3 Collocation networks: concept demonstration
3.4 Third-order collocates of time in LOB (3a-MI(5), R5-L5, C4-NC4; no filter applied)
3.5 Collocation network of 'university' based on BE06 (3b-MI(3), L5-R5, C8-NC8)
3.6 Collocation networks around 'immigrants' in the Guardian (3a-MI(6), R5-L5, C10-NC10; no filter applied)
3.7 Collocation networks around 'immigrants' in the Daily Mail (3a-MI(6), R5-L5, C20-NC20; no filter applied)
3.8 Selected collocation networks
4.1 The definite and indefinite articles in BNC subcorpora
4.2 The vs a(n) dataset: linguistic feature design (an excerpt)
4.3 A mosaic plot: article type by contextual determination
4.4 Logistic regression: a basic schema
4.5 Article use in English: a dataset (an excerpt)
4.6 A sentence from this book corrected for 'grammar'
4.7 Visualization of the relationship between which and that and a separator
4.8 Must, have to and need to in British English (BE06)
5.1 Nouns and adjectives in BE06
5.2 Verbs and adjectives in BE06
5.3 Pronouns and coordinators in BE06
5.4 Correlation: five data points
5.5 Correlation: covariance
5.6 Statistically significant (p < 0.05) Pearson's correlations in relation to the number of observations
5.7 Multi-panel scatterplot: nouns, adjectives, verbs, pronouns and coordinators
5.8 Correlation matrix: nouns, adjectives, verbs, pronouns and coordinators
5.9 Colour terms in the BNC
5.10 Creating clusters: Steps 1-4
5.11 Creating clusters: final result
5.12 Colour terms: a tree plot (dendrogram) - z-score 2 normalized, Euclidean distance, SLINK method
5.13 Tree plot: SLINK method
5.14 Tree plot: CLINK method
5.15 Tree plot: average linkage method
5.16 Tree plot: Ward's method
5.17 A dataset for multidimensional analysis (a small extract)
5.18 Data reduction: ten variables into two factors
5.19 Promax factor rotation
5.20 Factor extraction: scree plot
5.21 Mean scores of registers placed on Dimension 1: Involved vs Informational
5.22 Correlation matrix: 44 variables
5.23 Correlation between mean word length and contractions: register clusters
5.24 Cluster plot: registers in New Zealand English
5.25 Dimension 1: New Zealand English - full MD analysis
5.26 Dimension 2: New Zealand English - full MD analysis
5.27 Relationship between mean word length (number of characters) and mean sentence length (number of words) in BNC
5.28 Relationship between the use of the past and the present tense in BE06
5.29 Relationship between the use of adjectives and colour terms in BE06
5.30 Relationship between text length (tokens) and type-token ratio (TTR) in BNC
5.31 Dimension 3
5.32 Dimension 4
6.1 Distribution of personal pronouns in BNC64 female speakers
6.2 ANOVA calculation: between-group variance (top), within-group variance (bottom)
6.3 Dataset from BNC64 - relative frequencies and ranks: use of personal pronouns
6.4 Distribution of ain't in BNC64 speakers: social-class effect
6.5 Ain't in BNC64: 95% CI
6.6 A correspondence plot: word classes in the speech of individual speakers
6.7 Speaker (row) profiles: Euclidean distance
6.8 Speaker (row) profiles: chi-squared distance
6.9 Sociolinguistic dataset: internal and external factors (an excerpt)
6.10 Mixed-effects models: output
6.11 Correspondence analysis: use of word classes by White House press secretaries
6.12 Correspondence analysis: use of epistemic markers in BNC64
7.1 Modals in the Brown family corpora
7.2 Modals in the Brown family corpora: an alternative interpretation
7.3 Google n-gram viewer: 'man' and 'woman'
7.4 Modals in the Brown family corpora: original (top) and rescaled (bottom)
7.5 Modals in British English: (a) boxplots; (b) 95% CI error bars
7.6 Candlestick plot: the development of individual modals 1931-2006
7.7 Bootstrapping: demonstration of the concept
7.8 Example of a dataset for the bootstrap test: its in EEBO
7.9 Data points over time: an invented example
7.10 Two clustering principles: (a) hierarchical agglomerative clustering; (b) variability-based neighbour clustering
7.11 Dendrograms: (a) hierarchical agglomerative clustering; (b) variability-based neighbour clustering
7.12 Dendrogram: use of the possessive pronoun its in the seventeenth century
7.13 Scree plot: use of the possessive pronoun its in the seventeenth century
7.14 Resulting peaks and troughs graphs: settings as indicated
7.15 Results of UFA for war 1940-2009 (3a-MI(3), L5-R5, C10relative-NC10relative; AC1)
7.16 Frequency of colour terms in the seventeenth century
7.17 Candlestick plot: colours in the seventeenth century
7.18 Results of UFA for red 1600-99 (3a-MI(3), L5-R5, C10relative-NC10relative; AC1)
7.19 VNC: red in the seventeenth century
7.20 Number of tweets related to an episode of the UK X-Factor (16/11/2014, 7-11pm)
7.21 Development of frequencies of handsome, pretty and beautiful followed by a male (M) or female (F) person in the seventeenth century
7.22 Development of frequencies of the possessive pronoun its in the seventeenth century
7.23 Four frequency change scenarios
7.24 Handsome in the seventeenth century
7.25 Pretty in the seventeenth century
8.1 Overview of genres in BE06 (Baker 2009)
8.2 Past tense in different written genres of BE06
8.3 Past tense (a) and present tense (b) in different written genres of BE06: boxplot rendition
8.4 Finding the Globe
8.5 Forest plot: meta-analysis of four studies
8.6 Comparison of two subcorpora
8.7 Forest plot: example 1
8.8 Forest plot: example 2
Tables
1.1 The effect size r and its standard interpretation
1.2 Brown family sampling frame
1.3 Frequencies of selected words and expressions in three English corpora
1.4 Different levels of analysis in corpus linguistics
1.5 Subcorpora in mini-research
2.1 Type, lemma and lexeme: advantages and disadvantages
2.2 Top ten words in the BNC
2.3 Example corpus: one million tokens
2.4 Calculation of DP with the example corpus
2.5 BE06
2.6 Weather-related lemmas in BE06
2.7 Ranks of weather-related lemmas in BE06
2.8 BNC: distribution of four selected words
3.1 Observed frequencies
3.2 Expected frequencies: random occurrence baseline
3.3 Association measures: overview
3.4 Ranking of collocates of 'new' in BE06 (L3-R3)
3.5 Collocation parameters notation (CPN)
3.6 AmE06: American English keywords
3.7 Decisions about keywords: BASIC options
3.8 Comparison of selected lexical items in BE06 and AmE06