LEX & YACC Tutorial
Yacc (Yet Another Compiler-Compiler) generates parser based on an analytic grammar 3 Flex is Free scanner alternative to Lex Bison is Free parser generator program written for the GNU project alternative to Yacc
Lex and Yacc: A Brisk Tutorial - University of Arizona
Using Yacc Suppose the grammar spec is in a file foo y Then: – Thecommand‘yacc foo y’yieldsa filey tab ccon-taining the parser constructed by yacc – Thecommand‘yacc -d foo y’constructsafile y tab h that can be #include’d into the scanner generated by lex – Thecommand‘yacc -v foo y’additionallyconstructs
Yacc: Yet Another Compiler-Compiler
Yacc is written in a portable dialect of C1 and the actions, and output subroutine, are in C as well Moreover, many of the syntactic conventions of Yacc follow C The heart of the input specification is a collection of grammar rules Each rule describes an allow-able structure and gives it a name For example, one grammar rule might be
Introduction to YACC
32 Introduction to YACC Fall 2012 Use token to define your terminals, yacc –d will create y tab h and define the token for you (as #define) Along with the token, you can have exactly one piece of information passed onto the stack That piece of information can change depending upon the token (or rule matched) Use
Introduction History - Western Michigan University
– YACC (Yet Another Compiler Compiler) is a program designed to compile a LALR(1) grammar and to produce the source code of the syntactic analyzer of a language produced by this grammar History • Yacc original written by Stephen C Johnson, 1975 •Variants: – lex, yacc (AT&T) – bison: a yacc replacement (GNU)
lex & yacc, 2nd Edition
Chapter 3, Using Yacc, gives a full example using lex and yacc to develop a fully functional desktop calculator Chapter 4, A Menu Generation Language, demonstrates how to use lex and yacc to develop a menu generator Chapter 5, Parsing SQL, develops a parser for the full SQL relational data base language
ANTLR, Yacc, and Bison
Yacc Specification • A yacc specification consists of three parts: yacc declarations, and C declarations within { } translation rules user-defined auxiliary procedures • The translation rules are productions with actions: production 1 { semantic action 1} production 2 { semantic action 2} production n { semantic action n}
“Calculator” example
A yacc "state" is a set of "dotted rules" – a grammar rules with a "dot” somewhere in the right hand side (In some yacc printouts, "_" is the dot ) Intuitively, "A → α_β" in a state means this rule, up to and including α is consistent with input seen so far; next terminal in the input might derive from the left end of β
[PDF] fin de la démocratie athénienne
[PDF] l'apogée d'athènes
[PDF] fondation d'athènes
[PDF] apogée d'athènes date
[PDF] auteurs francophones connus
[PDF] liste des auteurs africains et leurs oeuvres pdf
[PDF] auteurs francophones contemporains
[PDF] littérature francophone est elle une littérature française
[PDF] auteurs francophones africains
[PDF] littérature francophone définition
[PDF] auteurs francophones les plus lus
[PDF] le jeu des acteurs au théâtre dissertation
[PDF] l avant scène théâtre
[PDF] actes sud theatre
LEX & YACC Tutorial
1February 28, 2008
Tom St. John
Outline
Overview of Lex and Yacc
?Structure of Lex Specification ?Structure of Yacc Specification 2 ?Some Hints for Lab1Overview
Lex (A LEXical Analyzer Generator) generates lexical analyzers (scanners or Lexers) Yacc (Yet Another Compiler-Compiler) generates parser based on an analytic grammar 3 generates parser based on an analytic grammarFlex is
Free scanner
alternative to LexBison is
Free parser
generator program written for the GNU project alternative to YaccScanner, Parser, Lex and Yacc
Scanner
Parser
symbol table tokenSource
Program
4 lex.yy.c name.tab.c name.tab.hLex spec (.l)
Yacc spec (name.y)
Lex/ flex
Yacc/ bisonC Compiler
Skeleton of a Lex Specification (.l file)
x.l < C global variables, prototypes, comments > lex.yy.c is generated after running > lex x.lThis part will be embedded into lex.yy.c 5 [DEFINITION SECTION]%%[RULES SECTION]%%C auxiliary subroutines embedded into lex.yy.cDefine how to scan and what action to take for each token Any user code.Lex Specification: Definition Section%{%{You should include this!Yacc will generate this file automatically.
6 #include "zcalc.tab.h" #include "zcalc.h" #includeUser-defined header file
Lex Specification: Rules Section
pattern { corresponding actions } pattern { corresponding actions } pattern { corresponding actions } pattern { corresponding actions }?Format
RegularExpression
C Expression
7 [1-9][0-9]* { yylval.dval = atoi (yytext); return NUMBER; }[1-9][0-9]* { yylval.dval = atoi (yytext); return NUMBER;Example
Unsigned integer will be
accepted as a tokenExpressionYou need to define these two in .y file
Two Notes on Using Lex1. Lex matches token with
longest matchInput: abc
Rule: [a-z]+ Token: abc(not "a" or "ab")82. Lex uses the
first applicable ruleInput: post
Rule1:
"post" {printf ("Hello,"); }Rule2:
[a-zA-Z]+ {printf ("World!"); } It will print Hello, (not "World!")Skeleton of a Yacc Specification (.y file)
x.y < C global variables, prototypes, comments > %}x.tab.c is generated after running> yacc x.y 9 [DEFINITION SECTION]%%[PRODUCTION RULES SECTION]%%C auxiliary subroutines Declaration of tokensrecognized in Parser (Lexer).How to understand the input, and what actions to take for each "sentence". [1-9][0-9]* { yylval.dval = atoi (yytext); return NUMBER; }[1-9][0-9]* { yylval.dval = atoi (yytext); return NUMBER;Yacc Specification: Definition Section (1)
#include "zcalc.h"#includeYacc Specification: Definition Section (2)
%left '-' '+'%left '*' '/' '%'%left '-' '+'%left '*' '/' '%'Define operator's precedence and associativity- We can solve problem in slide 13
11 %typeDefine nonterminal's name
- With this name, you will define rules in rule sectionYacc Specification: Production Rule Section (1)
Formatnontermname : symbol1 symbol2 ... { corresponding actions } | symbol3 symbol4 ... { corresponding actions } nontermname : symbol1 symbol2 ... { corresponding actions } | symbol3 symbol4 ... { corresponding actions } or 12 nontermname2 : ... nontermname2 : ...Regular expressionC expression
Yacc Specification: Production Rule Section (2)
Examplestatement : expression { printf (" = %g\n", $1); }expression : expression '+' expression { $$ = $1 + $3; }
| expression '-' expression { $$ = $1 - $3; } | expression '*' expression { $$ = $1 * $3; } | NUMBER statement : expression { printf (" = %g\n", $1); } expression : expression '+' expression { $$ = $1 + $3; } | expression '-' expression { $$ = $1 - $3; } | expression '*' expression { $$ = $1 * $3; } | NUMBER 13 | NUMBER| NUMBERWhat will happen if we have input "2+3*4"?$$: final value by performing non-terminal's action, Only for writing, not reading
$n: value of the n th concatenated elementAvoiding Ambiguous Expression
That's the reason why we need to define
operator's precedence in definition sectionHints for Lab1Exercise 2?
Q: How to recognize "prefix", "postfix" and "infix" in Lexer?A: Step1: Add these rules to your .l file:%%
"prefix" { return PREFIX;} "prefix" { return PREFIX;}Should be put in the rule section
14Step2: declare PREFIX, POSTFIX and INFIX as "token" in
your .y file "prefix" { return PREFIX;} "postfix" { return POSTFIX; } "infix" { return INFIX;} %%"prefix" { return PREFIX;} "postfix" { return POSTFIX; } "infix" { return INFIX;}Should be put in the rule section
Case-sensitiveExercise 2?
Q: How to combine three modes together?
A: You may have following grammar in your yacc fileint flag = 0; // Default setting%%..int flag = 0; // Default setting%%..Hints for Lab1
15 ..statement: PREFIX { flag = 0; } |INFIX { flag = 1; } |
POSTFIX { flag = 2; } |
expression expression: expr_pre | expr_in | expr_post; expr_pre: '+' expr_pre expr_pre { if(flag == 0) $$ = $2 + $3; } expr_in: expr_in '+' expr_in { if(flag == 1) $$ = $1 + $3; } ... ..statement: PREFIX { flag = 0; } |INFIX { flag = 1; } |
POSTFIX { flag = 2; } |
expression expression: expr_pre | expr_in | expr_post; expr_pre: '+' expr_pre expr_pre { if(flag == 0) $$ = $2 + $3; } expr_in: expr_in '+' expr_in { if(flag == 1) $$ = $1 + $3; }Exercise 3?
Q: What action do we use to define the octal and
hexadecimal token? A: You can simply use 'strtol' functions for this.Hints for Lab1 16long strtol(const char *nptr, char **endptr, int base); long strtol(const char *nptr, char **endptr, int base);
Exercise 4-5Q: How to build up and print AST1. Define the struct for AST and linked list structure having
AST nodes.
Instead of using struct,
typedef struct EXP{ struct EXP* exp1;typedef struct EXP{ struct EXP* exp1;Hints for Lab1172. In yacc file, your statement and expressions should be 'ast' type (no longer dval type).
Instead of using struct,
if you use union here,It's easier to handle the terminal
nodes (name and numbers) typedef struct EXP{ struct EXP* exp1; struct EXP* exp2; struct OP operator; } AST; typedef struct EXP{ struct EXP* exp1; struct EXP* exp2; struct OP operator; } AST; Exercise 4-53. Functions for making expression. It can be different functions by the type of the node (kinds of expression, number, name and so on). You can make functions like,Hints for Lab118The action field for each production in your yacc
file can call any function you have declared . Just as a sentence is recursively parsed, your AST is recursively built-up and traversed.makeExpression(struct EXP* exp1, struct EXP* exp2, struct OP operator)makeExpression(struct EXP* exp1, struct EXP* exp2, struct OP operator)