UNITEX 3.0

USER MANUAL

Université Paris-Est Marne-la-Vallée
http://www-igm.univ-mlv.fr/~unitex
unitex@univ-mlv.fr

Sébastien Paumier

English translation of version 1.2 by the local grammar group at the (Wolfgang Flury, Franz Guenthner, Friederike Malchok, Clemens Marschner, Sebastian Nagel, Johannes Stiehler)

http://www.cis.uni-muenchen.de/

Contents

Introduction

11

What's new since version 2.0?

12

Content

13

Unitex contributors

14

If you use Unitex in research projects...

15

1 Installation of Unitex

17

1.1 Licenses

17

1.2 Java runtime environment

17

1.3 Installation on Windows

18

1.4 Installation on Linux

18

1.5 Installation on MacOS X

19

1.5.1 Using the Apple Java 1.6 runtime

19

1.5.2 SoyLatte

20

1.5.3 How to compile Unitex C++ programs on a Macintosh

24

1.5.4 How to make all files visible on Mac OS

26

1.6 First use

27

1.7 Adding new languages

27

1.8 Uninstalling Unitex

28

1.9 Unitex for developers

28

2 Loading a text

31

2.1 Selecting a language

31

2.2 Text formats

31

2.3 Editing text files

34

2.4 Opening a text

34

2.5 Preprocessing a text

35

2.5.1 Normalization of separators

37

2.5.2 Splitting into sentences

37

2.5.3 Normalization of non-ambiguous forms

39

2.5.4 Splitting a text into tokens

40

2.5.5 Applying dictionaries

42

2.5.6 Analysis of compound words in Dutch, German, Norwegian and Russian

45

2.6 Opening a tagged text

45

3 Dictionaries

47

3.1 The DELA dictionaries

47

3.1.1 The DELAF format

47

3.1.2 The DELAS format

50

3.1.3 Dictionary Contents

51

3.2 Looking up a word in a dictionary

53

3.3 Checking dictionary format

54

3.4 Sorting

55

3.5 Automatic inflection

57

3.5.1 Inflection of simple words

57

3.5.2 Inflection of compound words

61

3.5.3 Inflection of Semitic languages

62

3.6 Compression

62

3.7 Applying dictionaries

64

3.7.1 Priorities

64

3.7.2 Application rules for dictionaries

65

3.7.3 Dictionary graphs

65

3.7.4 Morphological dictionary graphs

67

3.8 Bibliography

69

4 Searching with regular expressions

71

4.1 Definition

71

4.2 Tokens

71

4.3 Lexical masks

72

4.3.1 Special symbols

72

4.3.2 References to information in the dictionaries

73

4.3.3 Grammatical and semantic constraints

73

4.3.4 Inflectional constraints

74

4.3.5 Negation of a lexical mask

74

4.4 Concatenation

76

4.5 Union

77

4.6 Kleene star

77

4.7 Morphological filters

77

4.8 Search

79

4.8.1 Search configuration

79

4.8.2 Presentation of the results

80

4.8.3 Statistics

85

5 Local grammars

89

5.1 The local grammar formalism

89

5.1.1 Algebraic grammars

89

5.1.2 Extended algebraic grammars

90

5.2 Editing graphs

90

5.2.1 Creating a graph

90

5.2.2 Sub-Graphs

95

5.2.3 Manipulating boxes

98

5.2.4 Transducers

99

5.2.5 Using Variables

100

5.2.6 Copying lists

102

5.2.7 Special Symbols

103

5.2.8 Toolbar Commands

104

5.3 Display options

106

5.3.1 Sorting the lines of a box

106

5.3.2 Zoom

106

5.3.3 Antialiasing

107

5.3.4 Box alignment

108

5.3.5 Display options, fonts and colors

109

5.4 Exporting graphs

111

5.4.1 Inserting a graph into a document

111

5.4.2 Printing a Graph

112

6 Advanced use of graphs

113

6.1 Types of graphs

113

6.1.1 Inflection transducers

113

6.1.2 Preprocessing graphs

114

6.1.3 Graphs for normalizing the text automaton

115

6.1.4 Syntactic graphs

116

6.1.5 ELAG grammars

116

6.1.6 Parameterized graphs

117

6.2 Compilation of a grammar

117

6.2.1 Compilation of a graph

117

6.2.2 Approximation with a finite state transducer

117

6.2.3 Constraints on grammars

118

6.2.4 Error detection

122

6.3 Contexts

122

6.3.1 Right contexts

122

6.3.2 Left contexts

125

6.4 The morphological mode

129

6.4.1 Why?

129

6.4.2 The rules

129

6.4.3 Morphological dictionaries

130

6.4.4 Dictionary entry variables

131

6.5 Exploring grammar paths

132

6.6 Graph collections

134

6.7 Rules for applying transducers

135

6.7.1 Insertion to the left of the matched pattern

135

6.7.2 Application while advancing through the text

136

6.7.3 Priority of the leftmost match

136

6.7.4 Priority of the longest match

137

6.7.5 Transducer outputs with variables

137

6.8 Output variables

141

6.9 Operations on variables

142

6.9.1 Testing variables

142

6.9.2 Comparing variables

143

6.10 Applying graphs to texts

143

6.10.1 Configuration of the search

143

6.10.2 Advanced search options

145

6.10.3 Concordance

148

6.10.4 Modification of the text

149

6.10.5 Extracting occurrences

150

6.10.6 Comparing concordances

150

6.10.7 Debug mode

151

7 Text automaton

153

7.1 Displaying the text automaton

153

7.2 Construction

155

7.2.1 Construction rules for text automata

155

7.2.2 Normalization of ambiguous forms

156

7.2.3 Normalization of clitic pronouns in Portuguese

157

7.2.4 Keeping the best paths

159

7.3 Resolving Lexical Ambiguities with ELAG

163

7.3.1 Grammars For Resolving Ambiguities

163

7.3.2 Compiling ELAG Grammars

164

7.3.3 Resolving Ambiguities

166