Bootstrap Methods for Multi-Task Dependency Parsing in Low-resource Conditions
by KyungTae Lim
Abstract
Dependency parsing is an essential component of several NLP applications owing to its ability to capture complex relational information in a sentence. Due to the wide availability of dependency treebanks, most dependency parsing systems are built using supervised learning techniques. These systems require a significant amount of annotated data and are thus targeted toward the specific languages for which this type of data is available. Unfortunately, producing sufficient annotated data for low-resource languages is time- and resource-consuming. To address this issue, the present study investigates three bootstrapping methods, namely, (1) multilingual transfer learning, (2) deep contextualized embedding, and (3) co-training. Multilingual transfer learning is a typical supervised learning approach that transfers dependency knowledge using multilingual training data based on multilingual lexical representations. Deep contextualized embedding maximizes the use of lexical features during supervised learning based on enhanced sub-word representations and a language model (LM). Lastly, co-training is a semi-supervised learning method that improves parsing accuracy using unlabeled data. Our approaches have the advantage of requiring only a small bilingual dictionary or easily obtainable unlabeled resources (e.g., Wikipedia) to improve parsing accuracy in low-resource conditions. We evaluated our parser on the 57 official CoNLL shared task languages as well as on Komi, a language for which we developed training and evaluation corpora for low-resource scenarios. The evaluation results demonstrate the outstanding performance of our approaches in both low- and high-resource dependency parsing in the 2017 and 2018 CoNLL shared tasks. We also conducted a survey of both model transfer learning and semi-supervised methods for low-resource dependency parsing, in which the effect of each method under different conditions was extensively investigated.
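The first of these methods builds multilingual lexical representations from a small bilingual dictionary; the thesis's own model is described in Chapter 4. As a minimal illustrative sketch only (not the actual implementation), the hypothetical `procrustes_align` below shows one common way to map two monolingual embedding spaces into a shared space: learn an orthogonal transform from dictionary pairs via the closed-form SVD (Procrustes) solution.

```python
import numpy as np

def procrustes_align(src_vecs, tgt_vecs):
    """Learn an orthogonal matrix W minimizing ||src @ W.T - tgt||_F
    over translation pairs from a small bilingual dictionary.
    src_vecs, tgt_vecs: arrays of shape (n_pairs, dim)."""
    m = tgt_vecs.T @ src_vecs            # (dim, dim) cross-covariance
    u, _, vt = np.linalg.svd(m)          # closed-form Procrustes solution
    return u @ vt                        # orthogonal (dim, dim) mapping

# Toy check: a 2-D "target" space that is the source space rotated 90 degrees.
rng = np.random.default_rng(0)
src = rng.normal(size=(10, 2))           # toy source-language embeddings
rot = np.array([[0.0, -1.0], [1.0, 0.0]])  # ground-truth mapping
tgt = src @ rot.T                        # toy target-language embeddings

W = procrustes_align(src, tgt)
mapped = src @ W.T                       # source vectors in the target space
print(np.allclose(mapped, tgt, atol=1e-8))  # → True
```

After such an alignment, source- and target-language words live in one space, which is what lets a parser trained on the source language be applied to the target language.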
par KyungTae Lim
Résumé
Note : le résumé étendu en français se trouve en annexe, à la section B.1.

L'analyse en dépendances est une composante essentielle de nombreuses applications de TAL (Traitement Automatique des Langues), dans la mesure où il s'agit de fournir une analyse des relations entre les principaux éléments de la phrase. La plupart des systèmes d'analyse en dépendances sont issus de techniques d'apprentissage supervisées, à partir de grands corpus annotés. Ce type d'analyse est dès lors limité à quelques langues seulement, qui disposent des ressources adéquates. Pour les langues peu dotées, la production de données annotées est le plus souvent une tâche impossible, faute de moyens et d'annotateurs disponibles. Afin de résoudre ce problème, la thèse examine trois méthodes d'amorçage.

Acknowledgments
I am extremely happy and fortunate to have met all the members of Lattice. Words can't express my deep appreciation to my supervisor, Thierry. He is not only an adviser to me but more of a life mentor. Thierry patiently guided and helped me to grow continuously all throughout my PhD. Back in the first year of my PhD, I focused only on improving my implementation skills by participating in the CoNLL shared task. Thierry guided me along the right path and provided a conducive environment, and through his help and non-stop effort, I reached my dream in the shared task. I have good memories of the ACL 2017 conference; I was in Vancouver to present our shared task results and met many brilliant researchers who had participated in the same shared task. They are professionals not only in the technical aspects but are also incredibly passionate about sharing their ideas. This motivated me to become one of them, and I wanted to share my own ideas with them by publishing conference papers. I knew it is not easy to publish an article at a good conference, but it was much harder than I expected. Sometimes I got frustrated and too emotional. Whenever I got overly emotional, Thierry encouraged me and helped me remember why I had started all of this, and that thought kept me motivated. Thanks to him. He was kind and understanding every time I was not quite myself and needed a person to lean on. I also want to thank my doctoral committee members, Benjamin and Remi. Their thoughtful comments and wisdom led to my paper being accepted at CICLing. I would also like to thank Jamie and Jay-Yoon at CMU, who gave me most of the ideas for the co-training work in the AAAI conference paper. Thanks, Niko and Alex; they always drove me crazy about parsing low-resource languages. I am grateful to the many LATTICE lab members: Loïc, Pablo, Martine, Sophie, Clément, Frédérique, Fabien, and others. Whenever I was in trouble, they always supported me.
As most of you know, it is tough to make a living in Paris as an international student aged over 30. Back in 2017, when I first came to Paris, many lab members helped me find accommodation, helped me study French, and even tried to find a French class for my wife. During my time as a student, I was fortunate to have many friends and professors supporting me. I would like to give special thanks to the Paris NLP study group: Djame, Benoît, Éric, Benjamin, Pedro, Gael, Clementine, and others. From them I learned not only NLP theories but also a way of thinking from a linguistic point of view, and the fun memories, such as our regular beer time, will be cherished forever. Finally, I wouldn't be where I am today without the support of my family. I have always kept in mind the many sacrifices and the commitment of my parents and my wife. I want to thank everyone and say that I love you with all my heart. My journey with my wife in Paris will always be an unforgettable memory until the end of my life.

Contents
1 Introduction 1
1.1 Research Questions 3
1.2 Contributions 6
1.3 Thesis Structure 7
1.4 Publications Related to the Thesis 9
2 Background 11
2.1 Syntactic Representation 11
2.2 Dependency Parsing 17
2.2.1 Transition-based Parsing 19
2.2.2 Graph-based Parsing 23
2.2.3 Neural Network based Parsers 24
2.2.4 A Typical Neural Dependency Parser: the BIST-Parser 29
2.2.5 Evaluation Metrics 34
2.3 Transfer Learning for Dependency Parsing 35
2.4 Semi-Supervised Learning for Dependency Parsing 39
3 A Baseline Monolingual Parser, Derived from the BIST Parser 41
3.1 A Baseline Parser Derived from the BIST Parser 43
3.2 Experiments during the CoNLL 2017 Shared Task 47
3.2.1 The CoNLL 2017 Shared Task 49
3.2.2 Experimental Setup 50
3.2.3 Results 51
3.3 Summary 55
4 A Multilingual Parser based on Transfer Learning 56
4.1 Our Approach 59
4.2 A Multilingual Dependency Parsing Model 61
4.2.1 Cross-Lingual Word Representations 62
4.2.2 Cross-Lingual Dependency Parsing Model 64
4.3 Experiments on Komi and Sami 67
4.3.1 Experiment Setup 67
4.3.2 Results 68
4.4 Experiments on the CoNLL 2017 Data 70
4.4.1 Experiment Setup 70
4.4.2 Results 73
4.5 Summary 75
5 A Deep Contextualized Tagger and Parser 76
5.1 Multi-Attentive Character-Level Representations 79
5.2 Deep Contextualized Representation (ELMo) 86
5.3 Deep Contextualized Tagger 88
5.3.1 Two Taggers from Character Models 89
5.3.2 Joint POS Tagger 90
5.3.3 Experiments and Results 91
5.4 A Deep Contextualized Multi-task Parser 97
5.4.1 Multi-Task Learning for Tagging and Parsing 100
5.4.2 Experiments on the CoNLL 2018 Shared Task 103
5.4.3 Results and Analysis 106
5.5 Summary 115
6 A Co-Training Parser on Meta Structure 116
6.1 Parsing on Meta Structure 119
6.1.1 The Baseline Model 121
6.1.2 Supervised Learning on Meta Structure (meta-base) 123
6.2 Parsing on Co-Training 124
6.2.1 Co-meta 125
6.2.2 Joint Semi-Supervised Learning 126
6.3 Experiments 127
6.3.1 Data Sets 127
6.3.2 Evaluation Metrics 127
6.3.3 Experimental Setup 128
6.4 Results and Analysis 129
6.4.1 Results in Low-Resource Settings 132
6.4.2 Results in High-Resource Settings 137
6.5 Summary 139
7 Multilingual Co-Training 141
7.1 Integration of Co-Training and Multilingual Transfer Learning 142
7.2 Experiments 143
7.2.1 Preparation of Language Resources 143
7.2.2 Experiment Strategies 144
7.3 Results 144
7.4 Summary 146
8 Conclusion 147
8.1 Summary of the Thesis 147
8.2 Discussion over the Research Questions of the Thesis 148
8.3 Perspectives 153
A Universal Dependency 155
A.1 The CoNLL-U Format 155
A.2 Tagsets 157
B Résumé en français de la thèse 160
B.1 Introduction 160
B.2 État de l'art 166
B.3 Mise au point d'un modèle lexical multilingue 169
B.3.1 Préparation de ressources linguistiques 170
B.3.2 Projection de plongements de mots pour obtenir une ressource multilingue 170
B.3.3 Corpus annotés au format Universal Dependencies 172
B.4 Modèle d'analyse en dépendances crosslingue 172
B.4.1 Architecture du système d'analyse 173
B.4.2 Modèle d'analyse 174
B.5 Expériences 175
B.6 Résultats et analyse 178
B.7 Conclusion 182
References 188
List of Figures
2-1 Syntactic representation of the sentence "The big dog chased the cat". On the left a constituent analysis, on the right the dependency analysis. 12
2-2 An example of an English Universal Dependency corpus. 17
2-3 Representation of the structure of the sentence "I prefer the morning flight through Denver" using a dependency representation. The goal of a parser is to produce this kind of representation for unseen sentences, i.e., find relations among words and represent these relations with directed labeled arcs. We call this a typed dependency structure because the labels are drawn from a fixed inventory of grammatical relations (taken from the Stanford lecture: https://web.stanford.edu/~jurafsky/slp3/15.pdf). 18
2-4 Basic transition-based parser (taken from the Stanford lecture: https://web.stanford.edu/~jurafsky/slp3/15.pdf). 19
2-5 An example of a dependency tree and the transition-based parsing process (taken from (Zhang et al., 2019)). 21
2-6 An example of graph-based dependency parsing (taken from (Yu, 2018)). 25
2-7 An example of binary feature representations (from https://blog. ). 26
2-8 An example of the continuous representations (same source as for the previous figure). 26
2-9 An example of the skip-gram model. Here, it predicts the center (focus) word "learning" based on the context words (same source as for the previous figure). 27
2-10 Illustration of the neural model scheme of the graph-based parser when calculating the score of a given parse tree (this figure and caption are taken from the original paper (Kiperwasser and Goldberg, 2016a)). The parse tree is depicted below the sentence. Each dependency arc in the sentence is scored using an MLP that is fed by the BiLSTM encoding of the words at the arc's end points (the colors of the arcs correspond to colors of the MLP inputs above), and the individual arc scores are summed to produce the final score. All the MLPs share the same parameters. The figure depicts a single-layer BiLSTM, while in practice they use two layers. When parsing a sentence, they compute scores for all possible n^2 arcs, and find the best scoring tree using a dynamic-programming algorithm. 31
2-11 Illustration of multilingual transfer learning in NLP (the figure is based on (Jamshidi et al., 2017)). 37
2-12 "How does transfer learning transfer knowledge in parsing?". A parser learns the shared parameters (Wd) based on supervised learning. Since the learning is a data-driven task, the source-language inputs can help tune the parameters (Wd) for the target language (the figure is taken from (Yu et al., 2018)). 38
3-1 Overall system structure for training language models. (1) Embedding Layer: vectorized features that are fed into the bidirectional LSTM. (2) Bidirectional LSTM: trains a representation of each token as vector values based on a bidirectional LSTM neural network. (3) Multi-Layer Perceptron: builds candidate parse trees based on the features trained (changed) by the bidirectional LSTM layer, and then calculates probabilistic scores for each candidate. Finally, if a tree has multiple roots, it is revised, or the best parse tree is selected. 44
4-1 An example of the cross-lingual representation learning method between English (source language) and French (target language). 63
4-2 An example of our cross-lingual dependency parsing for Russian (source language) and Komi (target language). 66
5-1 An example of the word-based character model with a single attention representation (Dozat et al., 2017b). 83
5-2 An example of the word-based character model with three attention representations. 84
5-3 (A) Structure of the tagger proposed by Dozat et al. (2017b) using a word-based character model and (B) structure of the tagger proposed by Bohnet et al. (2018a) using a sentence-based character model with meta-LSTM. 85
5-4 Overall structure of our contextualized tagger with three different classifiers. 90
5-5 An example of the procedure to generate a weighted POS embedding. 91
5-6 Overall structure of our multi-task dependency parser. 101
6-1 An example of word similarity captured by different views (from the CS224N Stanford lecture: http://web.stanford.edu/class/cs224n/). 121
6-2 Overall structure of our baseline model. This system generates word- and character-level representation vectors, and concatenates them as a unified word embedding for every token in a sentence. To transform this embedding into a context-sensitive one, the system encodes it with an individual BiLSTM for each tagger and parser. 122
6-3 Overall structure of our Co-meta model. The system consists of three different pairs of taggers and parsers that are trained using limited context information. Based on the input representation of the word, character, and meta, each model draws a differently shaped parse tree. Finally, our co-training module induces the models to learn from each other using each model's predicted result. 124
6-4 An example of the label selection method for ensemble and voting. 133
6-5 Evaluation results for Chinese (zh_gsd) based on different sizes of the unlabeled set and proposed models. We apply ensemble-based Co-meta with a fixed size of 50 training sentences while varying the unlabeled set size. 134
6-6 Evaluation results for Chinese (zh_gsd) based on different sizes of the training set and proposed models. We apply ensemble-based Co-meta with a fixed size of 12k unlabeled sentences while varying the training set size. 136
7-1 The overall structure of our Co-metaM model. This system generates word- and character-level representation vectors and concatenates them into a unified word embedding for every token in a sentence. The word-level representation can be a multilingual embedding as proposed in Section 4.2. Thus, this system can train a dependency model using both labeled and unlabeled resources from several languages. 142
A-1 An example of tokenization of Universal Dependency. 155
A-2 An example of syntactic annotation of Universal Dependency. 156
B-1 Architecture du réseau de neurones. 174
List of Tables
3.1 Official results with rank. (number): number of corpora. 50
3.2 Official results with monolingual models (1). 52
3.3 Official results with monolingual models (2). 53
3.4 Relative contribution of the different representation methods on the English development set (English_EWT). 54
3.5 Contribution of the multi-source trainable methods on the English development set (English_EWT). 54
4.1 Dictionary sizes and sizes of the bilingual word embeddings generated by each dictionary. 64
4.2 Labeled attachment scores (LAS) and unlabeled attachment scores (UAS) for Northern Sami (sme). 68
4.3 The highest results of this experiment (Finnish-Sami model) compared with the top 3 results for Sami from the CoNLL 2017 Shared Task. 69
4.4 Labeled attachment scores (LAS) and unlabeled attachment scores (UAS) for Komi (kpv). We did not conduct training for the "kpv + eng + rus" language combination because of the unrealistic training scenario (it takes more than 40 GB of memory for training). 69
4.5 Languages trained by a multilingual model. Embedding model: languages that were used for making multilingual word embeddings. Bilingual Dic: resources used to generate bilingual dictionaries. Training corpora: training corpora that were used. 7 languages: English, Italian, French, Spanish, Portuguese, German, Swedish. (number): the multiplication factor used to expand the total amount of corpus. 72
4.6 Official experiment results with rank. (number): number of corpora. 74
4.7 Official experiment results processed by multilingual models. 74
5.1 Hyperparameter Details