[PDF] Formal Languages and Automata Theory PDF

In this chapter we introduce formal language theory the computational theories of languages and grammars The models are actually inspired by

[PDF] Formal Languages and Automata Theory

5 nov 2010 · It is observed that a formal learning of a language has the following three steps 1 Learning its alphabet - the symbols that are used in the

[PDF] Chapter 2 Formal Languages - IFI UZH

ing the basic concepts and formalism of formal language theory using analogies with natural language and common knowledge of grammar Section 2 3 follows

[PDF] Introduction to the Theory of Formal Languages

Formal Language Theory Wiebke Petersen A formal language L is a set of words over an alphabet ? i e L ? ?? Examples:

[PDF] An Introduction to Formal Languages and Automata - Spartans Fall-14

his book is designed for an introductory course on formal languages automata transducers play no significant role in formal language theory

[PDF] Automata Theory and Formal Languages - CORE

Preface 7 Chapter 1 Formal Grammars and Languages 9 1 1 Free Monoids 9 1 2 Formal Grammars 10 1 3 The Chomsky Hierarchy

[PDF] FORMAL LANGUAGES AND AUTOMATA - Gopalan Colleges

formal languages? And their relative expressive power? (Language Hierarchy) Automata theory also studies if there exist any effective algorithm or not

[PDF] Chapter 1 Basics of Formal Language Theory - UPenn CIS

CIS511 Introduction to the Theory of Computation Formal Languages and Automata Models of Computation Jean Gallier May 27 2010

[PDF] Theory of Automata Formal Languages and Computation

This book deals with a fascinating and important subject which has the fundamentals of computer hardware software and some of their applications

[PDF] Formal Grammars and Languages - UCR CS

Formal language theory as a discipline is generally regarded as growing from the work of linguist Noam Chomsky in the 1950s when he attempted to give a

Chapter 6 Formal Language Theory - itscaltechedu

In this chapter we introduce formal language theory the computational theories of languages and grammars The models are actually inspired by formal logic enriched with insights from the theory of computation We begin with the de?nition of a language and then proceed to a rough characterization of the basic Chomsky hierarchy

Formal Languages and Automata Theory

D. Goswami and K. V. Krishna

November 5, 2010

1 Mathematical Preliminaries 3

2 Formal Languages 4

2.1 Strings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

2.2 Languages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

2.3 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

2.4 Finite Representation . . . . . . . . . . . . . . . . . . . . . . . 13

2.4.1 Regular Expressions . . . . . . . . . . . . . . . . . . . 13

3 Grammars 18

3.1 Context-Free Grammars . . . . . . . . . . . . . . . . . . . . . 19

3.2 Derivation Trees . . . . . . . . . . . . . . . . . . . . . . . . . . 26

3.2.1 Ambiguity . . . . . . . . . . . . . . . . . . . . . . . . . 31

3.3 Regular Grammars . . . . . . . . . . . . . . . . . . . . . . . . 32

3.4 Digraph Representation . . . . . . . . . . . . . . . . . . . . . 36

4 Finite Automata 38

4.1 Deterministic Finite Automata . . . . . . . . . . . . . . . . . 39

4.2 Nondeterministic Finite Automata . . . . . . . . . . . . . . . 49

4.3 Equivalence of NFA and DFA . . . . . . . . . . . . . . . . . . 54

4.3.1 Heuristics to Convert NFA to DFA . . . . . . . . . . . 58

4.4 Minimization of DFA . . . . . . . . . . . . . . . . . . . . . . . 61

4.4.1 Myhill-Nerode Theorem . . . . . . . . . . . . . . . . . 61

4.4.2 Algorithmic Procedure for Minimization . . . . . . . . 65

4.5 Regular Languages . . . . . . . . . . . . . . . . . . . . . . . . 72

4.5.1 Equivalence of Finite Automata and Regular Languages 72

4.5.2 Equivalence of Finite Automata and Regular Grammars 84

4.6 Variants of Finite Automata . . . . . . . . . . . . . . . . . . . 89

4.6.1 Two-way Finite Automaton . . . . . . . . . . . . . . . 89

4.6.2 Mealy Machines . . . . . . . . . . . . . . . . . . . . . . 91

5 Properties of Regular Languages 94

5.1 Closure Properties . . . . . . . . . . . . . . . . . . . . . . . . 94

5.1.1 Set Theoretic Properties . . . . . . . . . . . . . . . . . 94

5.1.2 Other Properties . . . . . . . . . . . . . . . . . . . . . 97

5.2 Pumping Lemma . . . . . . . . . . . . . . . . . . . . . . . . . 104

Chapter 1 Mathematical Preliminaries

Chapter 2 Formal Languages

A language can be seen as a system suitable for expression of certain ideas, facts and concepts. For formalizing the notion of a language one must cover all the varieties of languages such as natural (human) languages and program- ming languages. Let us look at some common features across the languages. One may broadly see that a language is a collection of sentences; a sentence is a sequence of words; and a word is a combination of syllables. If one con- siders a language that has a script, then it can be observed that a word is a sequence of symbols of its underlying alphabet. It is observed that a formal learning of a language has the following three steps. 1. Learning its alphabet - the symbols that are used in the language. 2. Its words - as various sequences of symbols of its alphabet. 3. Formation of sentences - sequence of various words that follow certain rules of the language. In this learning, step 3 is the most di±cult part. Let us postpone to discuss construction of sentences and concentrate on steps 1 and 2. For the time being instead of completely ignoring about sentences one may look at the common features of a word and a sentence to agree upon both are just se- quence of some symbols of the underlying alphabet. For example, the English sentence "The English articles - a, an and the - are categorized into two types: indefinite and definite." may be treated as a sequence of symbols from the Roman alphabet along with enough punctuation marks such as comma, full-stop, colon and further one more special symbol, namelyblank-spacewhich is used to separate two words. Thus, abstractly, a sentence or a word may be interchangeably used 4 for a sequence of symbols from an alphabet. With this discussion we start with the basic de¯nitions of alphabets and strings and then we introduce the notion of language formally. Further, in this chapter, we introduce some of the operations on languages and discuss algebraic properties of languages with respect to those operations. We end the chapter with an introduction to ¯nite representation of languages via regular expressions.

2.1 Strings

We formally de¯ne analphabetas a non-empty ¯nite set. We normally use the symbolsa;b;c;:::with or without subscripts or 0;1;2;:::, etc. for the elements of an alphabet. Astringover an alphabet § is a ¯nite sequence of symbols of §. Although one writes a sequence as (a1;a2;:::;an), in the present context, we prefer to write it asa1a2¢¢¢an, i.e. by juxtaposing the symbols in that order. Thus, a string is also known as awordor asentence. Normally, we use lower case letters towards the end of English alphabet, namelyz;y;x;w;etc., to denote strings.

Example 2.1.1.

Let § =fa;bgbe an alphabet; thenaa;ab;bba;baaba;::: are some examples of strings over §. Since the empty sequence is a ¯nite sequence, it is also a string. Which is ( ) in earlier notation; but with the notation adapted for the present context we require a special symbol. We use", to denote the empty string. The set of all strings over an alphabet § is denoted by § if § =f0;1g, then

Although the set §

in¯nite for any alphabet §. In order to understand some such fundamen- tal facts we introduce some string operations, which in turn are useful to manipulate and generate strings. One of the most fundamental operations used for string manipulation is concatenation. Letx=a1a2¢¢¢anandy=b1b2¢¢¢bmbe two strings. The concatenationof the pairx,ydenoted byxyis the string a

1a2¢¢¢anb1b2¢¢¢bm:

5 Clearly, the binary operation concatenation on § x(yz) = (xy)z: Thus,x(yz) may simply be written asxyz. Also, since"is the empty string, it satis¯es the property "x=x"=x The operation concatenation is not commutative on §

For a stringxand an integern¸0, we write

x n+1=xnxwith the base conditionx0=": That is,xnis obtained by concatenatingncopies ofx. Also, whenevern= 0, the stringx1¢¢¢xnrepresents the empty string". Letxbe a string over an alphabet §. Fora2§, the number of occur- rences ofainxshall be denoted byjxja. Thelengthof a stringxdenoted byjxjis de¯ned as jxj=X a2§jxja: Essentially, the length of a string is obtained by counting the number of symbols in the string. For example,jaabj= 3,jaj= 1. Note thatj"j= 0. If we denoteAnto be the set of all strings of lengthnover §, then one can easily ascertain that n¸0A n: We say thatxis asubstringofyifxoccurs iny, that isy=uxvfor some stringsuandv. The substringxis said to be apre¯xofyifu=".

Similarly,xis asu±xofyifv=".

Generalizing the notation used for number of occurrences of symbolain a stringx, we adopt the notationjyjxas the number of occurrences of a string xiny.

2.2 Languages

We have got acquainted with the formal notion of strings that are basic elements of a language. In order to de¯ne the notion of a language in a broad spectrum, it is felt that it can be any collection of strings over an alphabet. 6

Example 2.2.1.

1. The emptyset?is a language over any alphabet. Similarly,f"gis also a language over any alphabet. 2. The set of all strings overf0;1gthat start with 0. 3. The set of all strings overfa;b;cghavingacas a substring.

Remark2.2.2.

Note that?6=f"g, because the language?does not contain any string butf"gcontains a string, namely". Also it is evident thatj?j= 0; whereas,jf"gj= 1. Since languages are sets, we can apply various well known set operations such as union, intersection, complement, di®erence on languages. The notion of concatenation of strings can be extended to languages as follows.

The concatenation of a pair of languagesL1,L2is

1L2=fxyjx2L1^y2L2g:

Example 2.2.3.

IfL1=f0;1;01gandL2=f1;00g, then

1L2=f01;11;011;000;100;0100g.

ForL1=fb;ba;babgandL2=f";b;bb;abbg, we have

Remark2.2.4.

1. Since concatenation of strings is associative, so is the concatenation of languages. That is, for all languagesL1;L2andL3, (L1L2)L3=L1(L2L3):

Hence, (L1L2)L3may simply be written asL1L2L3.

2. The number of strings inL1L2is always less than or equal to the product of individual numbers, i.e. jL1L2j · jL1jjL2j: 3. L

1µL1L2if and only if"2L2.

Proof.

The \if part" is straightforward; for instance, if"2L2, then for anyx2L1, we havex=x"2L1L2. On the other hand, suppose " =2L2. Now, note that a stringx2L1of shortest length inL1cannot be inL1L2. This is because, ifx=yzfor somey2L1and a nonempty stringz2L2, thenjyjSimilarly,"2L1if and only ifL2µL1L2. We writeLnto denote the language which is obtained by concatenating ncopies ofL. More formally, L

0=f"gand

L n=Ln¡1L, forn¸1. In the context of formal languages, another important operation is Kleene as L n¸0L n:

Example 2.2.5.

Kleene star of the languagef01gis

f";01;0101;010101;:::g=f(01)njn¸0g. 2. Since an arbitrary string inLnis of the formx1x2¢¢¢xn, forxi2Land L n¸0L n, one can easily observe that L

Remark2.2.6.

Note that, the Kleene star of the languageL=f0;1gover the alphabet § =f0;1gis L =f"g [ f0;1g [ f00;01;10;11g [ ¢¢¢ =f";0;1;00;01;10;11;¢¢¢g = the set of all strings over §:

Thus, the earlier introduced notation §

Kleene star by considering § as a language over §. 8 Thepositive closureof a languageLis denoted byL+is de¯ned as L n¸1L n: We often can easily describe various formal languages in English by stat- ing the property that is to be satis¯ed by the strings in the respective lan- guages. It is not only for elegant representation but also to understand the properties of languages better, describing the languages in set builder form is desired. Consider the set of all strings overf0;1gthat start with 0. Note that can be represented by

Examples

1. The set of all strings overfa;b;cgthat haveacas substring can be written as

This can also be written as

stating that the set of all strings overfa;b;cgin which the number of occurrences of substringacis at least 1. 2. The set of all strings over some alphabet § with even number ofa0s is

Equivalently,

3. The set of all strings over some alphabet § with equal number ofa0s andb0s can be written as 4. The set of all palindromes over an alphabet § can be written as wherexRis the string obtained by reversingx. 9 5. The set of all strings over some alphabet § that have anain the 5th position from the right can be written as 6. The set of all strings over some alphabet § with no consecutivea0s can be written as 7. The set of all strings overfa;bgin which every occurrence ofbis not before an occurrence ofacan be written as fambnjm;n¸0g: Note that, this is the set of all strings overfa;bgwhich do not contain baas a substring.

2.3 Properties

The usual set theoretic properties with respect to union, intersection, comple- ment, di®erence, etc. hold even in the context of languages. Now we observe certain properties of languages with respect to the newly introduced oper- ations concatenation, Kleene closure, and positive closure. In what follows, L;L

1;L2,L3andL4are languages.

P1 Recall that concatenation of languages is associative. P2 Since concatenation of strings is not commutative, we haveL1L26=L2L1, in general. P3

Lf"g=f"gL=L.

L?=?L=?.

Proof.

Letx2L?; thenx=x1x2for somex12Landx22?. But?

being emptyset cannot hold any element. Hence there cannot be any elementx2L?so thatL?=?. Similarly,?L=?as well. P5

Distributive Properties:

1. (L1[L2)L3=L1L3[L2L3. 10

Proof.

Supposex2(L1[L2)L3

=)x=x1x2;for somex12L1[L2);and somex22L3 =)x=x1x2;for somex12L1orx12L2;andx22L3 =)x=x1x2;for somex12L1andx22L3; orx12L2andx22L3 =)x2L1L3orx2L2L3 =)x2L1L3[L2L3:

Conversely, supposex2L1L3[L2L3=)x2L1L3orx2L2L3.

Without loos of generality, assumex62L1L3. Thenx2L2L3. =)x=x3x4;for somex32L2andx42L3 =)x=x3x4;for somex32L1[L2;and somex42L3 =)x2(L1[L2)L3:

Hence, (L1[L2)L3=L1L3[L2L3.

2. L

1(L2[L3) =L1L2[L1L3.

Proof.

Similar to the above.

From these properties it is clear that the concatenation is distributive over ¯nite unions. Moreover, we can observe that concatenation is also distributive over countably in¯nite unions. That is, L i¸1L i! i¸1LL iand i¸1L i! L=[ i¸1L iL P6

IfL1µL2andL3µL4, thenL1L3µL2L4.

P7 P8 P9 P10 L 11

Proof.

L +. On the other hand,x2L+impliesx=x1¢¢¢xmwithm¸1 andxi2Lfor alli. Now writex0=x1¢¢¢xm¡1so thatx=x0xm. P11 P12 L P13

Proof.

y v i2L2. Note thatviui+12L2L1, for alliwith 1·i·n¡1. Hence, x=yz= (y1¢¢¢yn)z= (u1v1¢¢¢unvn)z=u1(v1u2¢¢¢vn¡1unvnz)2 L P14

Proof.

L Thus, L 12

2.4 Finite Representation

Pro¯ciency in a language does not expect one to know all the sentences of the language; rather with some limited information one should be able to come up with all possible sentences of the language. Even in case of programming languages, a compiler validates a program - a sentence in the programming language - with a ¯nite set of instructions incorporated in it. Thus, we are interested in a ¯nite representation of a language - that is, by giving a ¯nite amount of information, all the strings of a language shall be enumerated/validated. Now, we look at the languages for which ¯nite representation is possible. Given an alphabet §, to start with, the languages with single stringfxg and?can have ¯nite representation, sayxand?, respectively. In this way, ¯nite languages can also be given a ¯nite representation; say, by enumerating all the strings. Thus, giving ¯nite representation for in¯nite languages is a nontrivial interesting problem. In this context, the operations on languages may be helpful. For example, the in¯nite languagef";ab;abab;ababab;:::gcan be con- Kleene star operation we can have ¯nite representation for some in¯nite lan- guages. While operations are under consideration, to give ¯nite representation for languages one may ¯rst look at the indivisible languages, namely?;f"g;and fag, for alla2§, as basis elements. over the basis elements. For example, ifx=abathen choosefagandfbg; and concatenatefagfbgfagto getfabag. Any ¯nite language over §, say fx1;:::;xngcan be obtained by considering the unionfx1g [ ¢¢¢ [ fxng. In this section, we look at the aspects of considering operations over basis elements to represent a language. This is one aspect of representing a language. There are many other aspects to give ¯nite representations; some such aspects will be considered in the later chapters.

2.4.1 Regular Expressions

We now consider the class of languages obtained by applying union, con- catenation, and Kleene star for ¯nitely many times on the basis elements. These languages are known as regular languages and the corresponding ¯nite representations are known as regular expressions.

De¯nition 2.4.1(Regular Expression).

We de¯ne aregular expressionover

an alphabet § recursively as follows. 13 1. ?;", anda, for eacha2§, are regular expressions representing the languages?;f"g, andfag, respectively. 2. Ifrandsare regular expressions representing the languagesRandS, respectively, then so are (a) (r+s) representing the languageR[S, (b) (rs) representing the languageRS, and (c) In a regular expression we keep a minimum number of parenthesis which are required to avoid ambiguity in the expression. For example, we may simply writer+stin case of (r+(st)). Similarly,r+s+tfor ((r+s)+t).

De¯nition 2.4.2.

Ifris a regular expression, then the language represented byris denoted byL(r). Further, a languageLis said to beregularif there is a regular expressionrsuch thatL=L(r).

Remark2.4.3.

1. A regular language over an alphabet § is the one that can be obtained from the emptyset,f"g, andfag, fora2§, by ¯nitely many applica- tions of union, concatenation and Kleene star. 2. The smallest class of languages over an alphabet § which contains ?;f"g, andfagand is closed with respect to union, concatenation, and Kleene star is the class of all regular languages over §.

Example 2.4.4.

As we observed earlier that the languages?;f"g,fag, and all ¯nite sets are regular.

Example 2.4.5.

fanjn¸0gis regular as it can be represented by the

Example 2.4.6.

Example 2.4.7.

The set of all strings overfa;bgwhich containabas a substring is regular. For instance, the set can be written as 14

Example 2.4.8.

The languageLoverf0;1gthat contains 01 or 10 as sub- string is regular. L=fxj01 is a substring ofxg [ fxj10 is a substring ofxg

Since §

point, one can easily notice that (0 + 1) is a regular expression representingL.

Example 2.4.9.

The set of all strings overfa;bgwhich do not containab as a substring. By analyzing the language one can observe that precisely the language is as follows.quotesdbs_dbs14.pdfusesText_20

[PDF] formal languages and automata theory book

[PDF] formal report example for students pdf

[PDF] formal report writing example for students

[PDF] formalin fixation time calculator

[PDF] formalin solution

[PDF] format for project writing pdf

[PDF] format line numbers in word 2016

[PDF] formation developpeur web a distance

[PDF] formatting document in ms word in hindi

[PDF] formatting techniques in tableau

[PDF] forme algébrique d'un nombre complexe exercice

[PDF] forme bilinéaire et quadratique exercices corrigés

[PDF] forme bilinéaire symétrique définie positive

[PDF] forme canonique second degré exercice corrigé

[PDF] forme indéterminée 0/0

[PDF] [PDF] Formal Languages and Automata Theory

Formal Languages and Automata Theory

D. Goswami and K. V. Krishna

November 5, 2010

Contents

1 Mathematical Preliminaries 3

2 Formal Languages 4

2.1 Strings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

2.2 Languages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

2.3 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

2.4 Finite Representation . . . . . . . . . . . . . . . . . . . . . . . 13

2.4.1 Regular Expressions . . . . . . . . . . . . . . . . . . . 13

3 Grammars 18

3.1 Context-Free Grammars . . . . . . . . . . . . . . . . . . . . . 19

3.2 Derivation Trees . . . . . . . . . . . . . . . . . . . . . . . . . . 26

3.2.1 Ambiguity . . . . . . . . . . . . . . . . . . . . . . . . . 31

3.3 Regular Grammars . . . . . . . . . . . . . . . . . . . . . . . . 32

3.4 Digraph Representation . . . . . . . . . . . . . . . . . . . . . 36

4 Finite Automata 38

4.1 Deterministic Finite Automata . . . . . . . . . . . . . . . . . 39

4.2 Nondeterministic Finite Automata . . . . . . . . . . . . . . . 49

4.3 Equivalence of NFA and DFA . . . . . . . . . . . . . . . . . . 54

4.3.1 Heuristics to Convert NFA to DFA . . . . . . . . . . . 58

4.4 Minimization of DFA . . . . . . . . . . . . . . . . . . . . . . . 61

4.4.1 Myhill-Nerode Theorem . . . . . . . . . . . . . . . . . 61

4.4.2 Algorithmic Procedure for Minimization . . . . . . . . 65

4.5 Regular Languages . . . . . . . . . . . . . . . . . . . . . . . . 72

4.5.1 Equivalence of Finite Automata and Regular Languages 72

4.5.2 Equivalence of Finite Automata and Regular Grammars 84

4.6 Variants of Finite Automata . . . . . . . . . . . . . . . . . . . 89

4.6.1 Two-way Finite Automaton . . . . . . . . . . . . . . . 89

4.6.2 Mealy Machines . . . . . . . . . . . . . . . . . . . . . . 91

5 Properties of Regular Languages 94

5.1 Closure Properties . . . . . . . . . . . . . . . . . . . . . . . . 94

5.1.1 Set Theoretic Properties . . . . . . . . . . . . . . . . . 94

5.1.2 Other Properties . . . . . . . . . . . . . . . . . . . . . 97

5.2 Pumping Lemma . . . . . . . . . . . . . . . . . . . . . . . . . 104

Chapter 1

Mathematical Preliminaries

Chapter 2

Formal Languages

2.1 Strings

Example 2.1.1.

Although the set §

1a2¢¢¢anb1b2¢¢¢bm:

For a stringxand an integern¸0, we write

Similarly,xis asu±xofyifv=".

2.2 Languages

Example 2.2.1.

Remark2.2.2.

The concatenation of a pair of languagesL1,L2is

1L2=fxyjx2L1^y2L2g:

Example 2.2.3.

IfL1=f0;1;01gandL2=f1;00g, then

1L2=f01;11;011;000;100;0100g.

ForL1=fb;ba;babgandL2=f";b;bb;abbg, we have

Remark2.2.4.

Hence, (L1L2)L3may simply be written asL1L2L3.

1µL1L2if and only if"2L2.

Proof.

0=f"gand

Example 2.2.5.

Kleene star of the languagef01gis

Remark2.2.6.

Thus, the earlier introduced notation §

Examples

This can also be written as

Equivalently,

2.3 Properties

1;L2,L3andL4are languages.

Lf"g=f"gL=L.

L?=?L=?.

Proof.

Letx2L?; thenx=x1x2for somex12Landx22?. But?

Distributive Properties:

Proof.

Supposex2(L1[L2)L3

Conversely, supposex2L1L3[L2L3=)x2L1L3orx2L2L3.

Hence, (L1[L2)L3=L1L3[L2L3.

1(L2[L3) =L1L2[L1L3.