[PDF] LEX & YACC Tutorial



Previous PDF Next PDF







LEX & YACC Tutorial

Yacc (Yet Another Compiler-Compiler) generates parser based on an analytic grammar 3 Flex is Free scanner alternative to Lex Bison is Free parser generator program written for the GNU project alternative to Yacc



Lex and Yacc: A Brisk Tutorial - University of Arizona

Using Yacc Suppose the grammar spec is in a file foo y Then: – Thecommand‘yacc foo y’yieldsa filey tab ccon-taining the parser constructed by yacc – Thecommand‘yacc -d foo y’constructsafile y tab h that can be #include’d into the scanner generated by lex – Thecommand‘yacc -v foo y’additionallyconstructs



Yacc: Yet Another Compiler-Compiler

Yacc is written in a portable dialect of C1 and the actions, and output subroutine, are in C as well Moreover, many of the syntactic conventions of Yacc follow C The heart of the input specification is a collection of grammar rules Each rule describes an allow-able structure and gives it a name For example, one grammar rule might be



Introduction to YACC

32 Introduction to YACC Fall 2012 Use token to define your terminals, yacc –d will create y tab h and define the token for you (as #define) Along with the token, you can have exactly one piece of information passed onto the stack That piece of information can change depending upon the token (or rule matched) Use



Introduction History - Western Michigan University

– YACC (Yet Another Compiler Compiler) is a program designed to compile a LALR(1) grammar and to produce the source code of the syntactic analyzer of a language produced by this grammar History • Yacc original written by Stephen C Johnson, 1975 •Variants: – lex, yacc (AT&T) – bison: a yacc replacement (GNU)



lex & yacc, 2nd Edition

Chapter 3, Using Yacc, gives a full example using lex and yacc to develop a fully functional desktop calculator Chapter 4, A Menu Generation Language, demonstrates how to use lex and yacc to develop a menu generator Chapter 5, Parsing SQL, develops a parser for the full SQL relational data base language



ANTLR, Yacc, and Bison

Yacc Specification • A yacc specification consists of three parts: yacc declarations, and C declarations within { } translation rules user-defined auxiliary procedures • The translation rules are productions with actions: production 1 { semantic action 1} production 2 { semantic action 2} production n { semantic action n}



“Calculator” example

A yacc "state" is a set of "dotted rules" – a grammar rules with a "dot” somewhere in the right hand side (In some yacc printouts, "_" is the dot ) Intuitively, "A → α_β" in a state means this rule, up to and including α is consistent with input seen so far; next terminal in the input might derive from the left end of β

[PDF] allocution bienvenue association

[PDF] fin de la démocratie athénienne

[PDF] l'apogée d'athènes

[PDF] fondation d'athènes

[PDF] apogée d'athènes date

[PDF] auteurs francophones connus

[PDF] liste des auteurs africains et leurs oeuvres pdf

[PDF] auteurs francophones contemporains

[PDF] littérature francophone est elle une littérature française

[PDF] auteurs francophones africains

[PDF] littérature francophone définition

[PDF] auteurs francophones les plus lus

[PDF] le jeu des acteurs au théâtre dissertation

[PDF] l avant scène théâtre

[PDF] actes sud theatre

LEX & YACC Tutorial

1February 28, 2008

Tom St. John

Outline

Overview of Lex and Yacc

?Structure of Lex Specification ?Structure of Yacc Specification 2 ?Some Hints for Lab1

Overview

Lex (A LEXical Analyzer Generator) generates lexical analyzers (scanners or Lexers) Yacc (Yet Another Compiler-Compiler) generates parser based on an analytic grammar 3 generates parser based on an analytic grammar

Flex is

Free scanner

alternative to Lex

Bison is

Free parser

generator program written for the GNU project alternative to Yacc

Scanner, Parser, Lex and Yacc

Scanner

Parser

symbol table token

Source

Program

4 lex.yy.c name.tab.c name.tab.h

Lex spec (.l)

Yacc spec (name.y)

Lex/ flex

Yacc/ bison

C Compiler

Skeleton of a Lex Specification (.l file)

x.l < C global variables, prototypes, comments > lex.yy.c is generated after running > lex x.lThis part will be embedded into lex.yy.c 5 [DEFINITION SECTION]%%[RULES SECTION]%%C auxiliary subroutines embedded into lex.yy.cDefine how to scan and what action to take for each token Any user code.

Lex Specification: Definition Section%{%{You should include this!Yacc will generate this file automatically.

6 #include "zcalc.tab.h" #include "zcalc.h" #include %}#include "zcalc.tab.h"#include "zcalc.h"#include %}

User-defined header file

Lex Specification: Rules Section

pattern { corresponding actions } pattern { corresponding actions } pattern { corresponding actions } pattern { corresponding actions }?

Format

RegularExpression

C Expression

7 [1-9][0-9]* { yylval.dval = atoi (yytext); return NUMBER; }[1-9][0-9]* { yylval.dval = atoi (yytext); return NUMBER;

Example

Unsigned integer will be

accepted as a tokenExpression

You need to define these two in .y file

Two Notes on Using Lex1. Lex matches token with

longest match

Input: abc

Rule: [a-z]+ Token: abc(not "a" or "ab")

82. Lex uses the

first applicable rule

Input: post

Rule1:

"post" {printf ("Hello,"); }

Rule2:

[a-zA-Z]+ {printf ("World!"); } It will print Hello, (not "World!")

Skeleton of a Yacc Specification (.y file)

x.y < C global variables, prototypes, comments > %}x.tab.c is generated after running> yacc x.y 9 [DEFINITION SECTION]%%[PRODUCTION RULES SECTION]%%C auxiliary subroutines Declaration of tokensrecognized in Parser (Lexer).How to understand the input, and what actions to take for each "sentence". [1-9][0-9]* { yylval.dval = atoi (yytext); return NUMBER; }[1-9][0-9]* { yylval.dval = atoi (yytext); return NUMBER;

Yacc Specification: Definition Section (1)

#include "zcalc.h"#include %{#include "zcalc.h"#include zcalc.l 10 #include int flag = 0; %}%union { int dval; ... %token NUMBER#include int flag = 0; %}%union { int dval; ... %token NUMBERzcalc.y

Yacc Specification: Definition Section (2)

%left '-' '+'

%left '*' '/' '%'%left '-' '+'%left '*' '/' '%'Define operator's precedence and associativity- We can solve problem in slide 13

11 %type expression statement statement_list %type logical_expr%type expression statement statement_list%type logical_expr

Define nonterminal's name

- With this name, you will define rules in rule section

Yacc Specification: Production Rule Section (1)

Formatnontermname : symbol1 symbol2 ... { corresponding actions } | symbol3 symbol4 ... { corresponding actions } nontermname : symbol1 symbol2 ... { corresponding actions } | symbol3 symbol4 ... { corresponding actions } or 12 nontermname2 : ... nontermname2 : ...

Regular expressionC expression

Yacc Specification: Production Rule Section (2)

Examplestatement : expression { printf (" = %g\n", $1); }expression : expression '+' expression { $$ = $1 + $3; }

| expression '-' expression { $$ = $1 - $3; } | expression '*' expression { $$ = $1 * $3; } | NUMBER statement : expression { printf (" = %g\n", $1); } expression : expression '+' expression { $$ = $1 + $3; } | expression '-' expression { $$ = $1 - $3; } | expression '*' expression { $$ = $1 * $3; } | NUMBER 13 | NUMBER| NUMBER

What will happen if we have input "2+3*4"?$$: final value by performing non-terminal's action, Only for writing, not reading

$n: value of the n th concatenated element

Avoiding Ambiguous Expression

That's the reason why we need to define

operator's precedence in definition section

Hints for Lab1Exercise 2?

Q: How to recognize "prefix", "postfix" and "infix" in Lexer?

A: Step1: Add these rules to your .l file:%%

"prefix" { return PREFIX;} "prefix" { return PREFIX;}

Should be put in the rule section

14Step2: declare PREFIX, POSTFIX and INFIX as "token" in

your .y file "prefix" { return PREFIX;} "postfix" { return POSTFIX; } "infix" { return INFIX;} %%"prefix" { return PREFIX;} "postfix" { return POSTFIX; } "infix" { return INFIX;}

Should be put in the rule section

Case-sensitive

Exercise 2?

Q: How to combine three modes together?

A: You may have following grammar in your yacc fileint flag = 0; // Default setting%%..int flag = 0; // Default setting%%..Hints for Lab1

15 ..statement: PREFIX { flag = 0; } |

INFIX { flag = 1; } |

POSTFIX { flag = 2; } |

expression expression: expr_pre | expr_in | expr_post; expr_pre: '+' expr_pre expr_pre { if(flag == 0) $$ = $2 + $3; } expr_in: expr_in '+' expr_in { if(flag == 1) $$ = $1 + $3; } ... ..statement: PREFIX { flag = 0; } |

INFIX { flag = 1; } |

POSTFIX { flag = 2; } |

expression expression: expr_pre | expr_in | expr_post; expr_pre: '+' expr_pre expr_pre { if(flag == 0) $$ = $2 + $3; } expr_in: expr_in '+' expr_in { if(flag == 1) $$ = $1 + $3; }

Exercise 3?

Q: What action do we use to define the octal and

hexadecimal token? A: You can simply use 'strtol' functions for this.Hints for Lab1 16

long strtol(const char *nptr, char **endptr, int base); long strtol(const char *nptr, char **endptr, int base);

Exercise 4-5Q: How to build up and print AST1. Define the struct for AST and linked list structure having

AST nodes.

Instead of using struct,

typedef struct EXP{ struct EXP* exp1;typedef struct EXP{ struct EXP* exp1;Hints for Lab1

172. In yacc file, your statement and expressions should be 'ast' type (no longer dval type).

Instead of using struct,

if you use union here,

It's easier to handle the terminal

nodes (name and numbers) typedef struct EXP{ struct EXP* exp1; struct EXP* exp2; struct OP operator; } AST; typedef struct EXP{ struct EXP* exp1; struct EXP* exp2; struct OP operator; } AST; Exercise 4-53. Functions for making expression. It can be different functions by the type of the node (kinds of expression, number, name and so on). You can make functions like,Hints for Lab1

18The action field for each production in your yacc

file can call any function you have declared . Just as a sentence is recursively parsed, your AST is recursively built-up and traversed.

makeExpression(struct EXP* exp1, struct EXP* exp2, struct OP operator)makeExpression(struct EXP* exp1, struct EXP* exp2, struct OP operator)

A case study - The Calculator

#include "zcalc.tab.h" #include "y.tab.h" { yylval.dval = atof(yytext); return NUMBER; } #include "zcalc.h" %union { double dval; struct symtab *symp; } %token NAME %token NUMBER %left '+' '-' %type expression %%statement_list : statement ' \n' | statement_list statement ' \n' zcalc.l zcalc.y

Yacc -d zcalc.y

19 [ \t] ; [a-zA-Z][a-zA-Z0-(]* { struct symtab *sp = symlook(yytext); yylval.symp = sp; return NAME; statement_list : statement ' \n' | statement_list statement ' \n' statement : NAME '=' expression {$1->value = $3;} | expression { printf (" = %g\n", $1); } expression : expression '+' expression { $$ = $1 + $3; } | expression '-' expression { $$ = $1 - $3; } | NUMBER { $$ = $1; } | NAME { $$ = $1->value; } struct symtab * symlook( char *s ) { /* this function looks up the symbol table and check whether the symbol s is already there. If not, add s into symbol table. */ int main() { yyparse(); return 0;

References

Lex and Yacc Pagehttp://dinosaur.compilertools.net 20quotesdbs_dbs9.pdfusesText_15