
COMPILER CONSTRUCTION

William M. Waite
Department of Electrical Engineering
University of Colorado
Boulder, Colorado 80309, USA
email: William.Waite@colorado.edu

Gerhard Goos
Institut Programmstrukturen und Datenorganisation
Fakultät für Informatik
Universität Karlsruhe
D-76128 Karlsruhe, Germany
email: ggoos@ipd.info.uni-karlsruhe.de

Compiler Construction, a modern text written by two leaders in the field, demonstrates how a compiler is built. Describing the necessary tools and how to create and use them, the authors compose the task into modules, placing equal emphasis on the action and data aspects of compilation. Attribute grammars are used extensively to provide a uniform treatment of semantic analysis, competent code generation and assembly. The authors also explain how intermediate representations can be chosen automatically on the basis of attribute dependence. Thus, all aspects of the subject are presented in terms of a uniform model subject to automation. This approach imparts a vivid understanding of the compilation and the decisions that must be made when designing a compiler.

From the back page of the printed book.

All known errors from the first and second printing (1994 and 1995) have been fixed. While every precaution has been taken in preparation of this book, the authors assume no responsibility for errors or omissions, or damages resulting from the use of the information contained here.

© 1984-1994 by Springer-Verlag, Berlin, New York Inc. ISBN 0-387-90821-8 and ISBN 3-540-90821
© 1995 by William M. Waite and Gerhard Goos.

All rights reserved. No part of this book may be translated, reproduced, archived or sold in any form without written permission from one of the authors.

The content of Compiler Construction is made available via the Web by permission of the authors as a service to the community and only for educational purposes. The book may be accessed freely via Web browsers. The URL is ftp://i44ftp.info.uni-karlsruhe.de/pub/papers/ggoos/CompilerConstruction.ps.gz.

Karlsruhe, 22nd February 1996

To all who know more than one language

Preface

Compilers and operating systems constitute the basic interfaces between a programmer and the machine for which he is developing software. In this book we are concerned with the construction of the former. Our intent is to provide the reader with a firm theoretical basis for compiler construction and sound engineering principles for selecting alternate methods, implementing them, and integrating them into a reliable, economically viable product. The emphasis is upon a clean decomposition employing modules that can be re-used for many compilers, separation of concerns to facilitate team programming, and flexibility to accommodate hardware and system constraints. A reader should be able to understand the questions he must ask when designing a compiler for language X on machine Y, what tradeoffs are possible, and what performance might be obtained. He should not feel that any part of the design rests on whim; each decision must be based upon specific, identifiable characteristics of the source and target languages or upon design goals of the compiler.

The vast majority of computer professionals will never write a compiler. Nevertheless, study of compiler technology provides important benefits for almost everyone in the field.

- It focuses attention on the basic relationships between languages and machines. Understanding of these relationships eases the inevitable transitions to new hardware and programming languages and improves a person's ability to make appropriate tradeoffs in design and implementation.
- It illustrates application of software engineering techniques to the solution of a significant problem. The problem is understandable to most users of computers, and involves both combinatorial and data processing aspects.
- Many of the techniques used to construct a compiler are useful in a wide variety of applications involving symbolic data. In particular, every man-machine interface constitutes a form of programming language and the handling of input involves these techniques.

We believe that software tools will be used increasingly to support many aspects of compiler construction. Much of Chapters 7 and 8 is therefore devoted to parser generators and analyzers for attribute grammars. The details of this discussion are only interesting to those who must construct such tools; the general outlines must be known to all who use them. We also realize that construction of compilers by hand will remain an important alternative, and thus we have presented manual methods even for those situations where tool use is recommended.

Virtually every problem in compiler construction has a vast number of possible solutions. We have restricted our discussion to the methods that are most useful today, and make no attempt to give a comprehensive survey. Thus, for example, we treat only the LL and LR parsing techniques and provide references to the literature for other approaches. Because we do not constantly remind the reader that alternative solutions are available, we may sometimes appear overly dogmatic although that is not our intent.


Chapters 5 and 8, and Appendix B, state most theoretical results without proof. Although this makes the book unsuitable for those whose primary interest is the theory underlying a compiler, we felt that emphasis on proofs would be misplaced. Many excellent theoretical texts already exist; our concern is reduction to practice.

A compiler design is carried out in the context of a particular language/machine pair. Although the principles of compiler construction are largely independent of this context, the detailed design decisions are not. In order to maintain a consistent context for our major examples, we therefore need to choose a particular source language and target machine. The source language that we shall use is defined in Appendix A. We chose not to use an existing language for several reasons, the most important being that a new language enabled us to control complexity: Features illustrating significant questions in compiler design could be included while avoiding features that led to burdensome but obvious detail. It also allows us to illustrate how a compiler writer derives information about a language, and provides an example of an informal but relatively precise language definition.

We chose the machine language of the IBM 370 and its imitators as our target. This architecture is widely used, and in many respects it is a difficult one to deal with. The problems are representative of many computers, the important exceptions being those (such as the Intel 8086) without a set of general registers. As we discuss code generation and assembly strategies we shall point out simplifications for more uniform architectures like those of the DEC PDP11 and Motorola 68000.

We assume that the reader has a minimum of one year of experience with a block-structured language, and some familiarity with computer organization. Chapters 5 and 8 use notation from logic and set theory, but the material itself is straightforward. Several important algorithms are based upon results from graph theory summarized in Appendix B.

This book is based upon many compiler projects and upon the lectures given by the authors at the Universität Karlsruhe and the University of Colorado. For self-study, we recommend that a reader with very little background begin with Section 1.1, Chapters 2 and 3, Section 12.1 and Appendix A. His objective should be to thoroughly understand the relationships between typical programming languages and typical machines, relationships that define the task of the compiler. It is useful to examine the machine code produced by existing compilers while studying this material. The remainder of Chapter 1 and all of Chapter 4 give an overview of the organization of a compiler and the properties of its major data structures, while Chapter 14 shows how three production compilers have been structured. From this material the reader should gain an appreciation for how the various subtasks relate to one another, and the important characteristics of the interfaces between them.

Chapters 5, 6 and 7 deal with the task of determining the structure of the source program. This is perhaps the best-understood of all compiler tasks, and the one for which the most theoretical background is available. The theory is summarized in Chapter 5, and applied in Chapters 6 and 7. Readers who are not theoretically inclined, and who are not concerned with constructing parser generators, should skim Chapter 5. Their objectives should be to understand the notation for describing grammars, to be able to deal with finite automata, and to understand the concept of using a stack to resolve parenthesis nesting. These readers should then concentrate on Chapter 6, Section 7.1 and the recursive descent parse algorithm of Section 7.2.2.

The relationship between Chapter 8 and Chapter 9 is similar to that between Chapter 5 and Chapter 7, but the theory is less extensive and less formal. This theory also underlies parts of Chapters 10 and 11. We suggest that the reader who is actually engaged in compiler construction devote more effort to Chapters 8-11 than to Chapters 5-7. The reason is that parser generators can be obtained "off the shelf" and used to construct the lexical and syntactic analysis modules quickly and reliably. A compiler designer must typically devote


most of his effort to specifying and implementing the remainder of the compiler, and hence familiarity with Chapters 8-11 will have a greater effect on his productivity.

The lecturer in a one-semester, three-hour course that includes exercises is compelled to restrict himself to the fundamental concepts. Details of programming languages (Chapter 2), machines (Chapter 3) and formal languages and automata theory (Chapter 5) can only be covered in a cursory fashion or must be assumed as background. The specific techniques for parser development and attribute grammar analysis, as well as the whole of Chapter 13, must be reserved for a separate course. It seems best to present theoretical concepts from Chapter 5 in close conjunction with the specific methods of Chapters 6 and 7, rather than as a single topic. A typical outline is:

1. The Nature of the Problem (4 hours)
   1.1. Overview of compilation (Chapter 1)
   1.2. Languages and machines (Chapters 2 and 3)
2. Compiler Data Structures (Chapter 4) (4 hours)
3. Structural Analysis (10 hours)
   3.1. Formal Systems (Chapter 5)
   3.2. Lexical analysis (Chapter 6)
   3.3. Parsing (Chapter 7)
   Review and Examination (2 hours)
4. Consistency Checking (10 hours)
   4.1. Attribute grammars (Chapter 8)
   4.2. Semantic analysis (Chapter 9)
5. Code Generation (Chapter 10) (8 hours)
6. Assembly (Chapter 11) (2 hours)
7. Error Recovery (Chapter 12) (3 hours)
   Review (2 hours)

The students do not write a compiler during this course. For several years it has been run concurrently with a practicum in which the students implement the essential parts of a LAX compiler. They are given the entire compiler, with stubs replacing the parts they are to write. In contrast to project courses in which the students must write a complete compiler, this approach has the advantage that they need not be concerned with unimportant organizational tasks. Since only the central problems need be solved, one can deal with complex language properties. At the same time, students are forced to read the environment programs and to adhere to interface specifications. Finally, if a student cannot solve a particular problem it does not cause his entire project to fail since he can take the solution given by the instructor and proceed.

Acknowledgements

This book is the result of many years of collaboration. The necessary research projects and travel were generously supported by our respective universities, the Deutsche Forschungsgemeinschaft and the National Science Foundation.

It is impossible to list all of the colleagues and students who have influenced our work. We would, however, like to specially thank four of our doctoral students, Lynn Carter, Bruce Haddon, Uwe Kastens and Johannes Rohrich, for both their technical contributions and their willingness to read the innumerable manuscripts generated during the book's gestation. Mae Jean Ruehlman and Gabriele Sahr also have our gratitude for learning more than they ever wanted to know about computers and word processing as they produced and edited those manuscripts.


Contents

Preface

1 Introduction and Overview
  1.1 Translation and Interpretation
  1.2 The Tasks of a Compiler
  1.3 Data Management in a Compiler
  1.4 Compiler Structure
  1.5 Notes and References
2 Properties of Programming Languages
  2.1 Overview
    2.1.1 Syntax, Semantics and Pragmatics
    2.1.2 Syntactic Properties
    2.1.3 Semantic Properties
  2.2 Data Objects and Operations
    2.2.1 Elementary Types
    2.2.2 Composite Types
    2.2.3 Strings
    2.2.4 Pointers
    2.2.5 Type Equivalence
  2.3 Expressions
  2.4 Control Structures
  2.5 Program Environments and Abstract Machine States
    2.5.1 Constants, Variables and Assignment
    2.5.2 The Environment
    2.5.3 Binding
  2.6 Notes and References
3 Properties of Real and Abstract Machines
  3.1 Basic Characteristics
    3.1.1 Storage Classes
    3.1.2 Access Paths
    3.1.3 Operations
  3.2 Representation of Language Elements
    3.2.1 Elementary Objects
    3.2.2 Composite Objects
    3.2.3 Expressions
    3.2.4 Control Structures
  3.3 Storage Management
    3.3.1 Static Storage Management
    3.3.2 Dynamic Storage Management Using a Stack
    3.3.3 Dynamic Storage Management Using a Heap
  3.4 Mapping Specifications
  3.5 Notes and References
4 Abstract Program Representation
  4.1 Intermediate Languages
    4.1.1 Token Sequence
    4.1.2 Structure Tree
    4.1.3 Computation Graph
    4.1.4 Target Tree
  4.2 Global Tables
    4.2.1 Symbol Table
    4.2.2 Constant Table
    4.2.3 Definition Table
  4.3 Notes and References
5 Elements of Formal Systems
  5.1 Descriptive Tools
    5.1.1 Strings and Rewriting Systems
    5.1.2 Grammars
    5.1.3 Derivations and Parse Trees
    5.1.4 Extended Backus-Naur Form
  5.2 Regular Grammars and Finite Automata
    5.2.1 Finite Automata
    5.2.2 State Diagrams and Regular Expressions
  5.3 Context-Free Grammars and Pushdown Automata
    5.3.1 Pushdown Automata
    5.3.2 Top-Down Analysis and LL(k) Grammars
    5.3.3 Bottom-Up Analysis and LR(k) Grammars
  5.4 Notes and References
6 Lexical Analysis
  6.1 Modules and Interfaces
    6.1.1 Decomposition of the Grammar
    6.1.2 Lexical Analyzer Interface
  6.2 Construction
    6.2.1 Extraction and Representation
    6.2.2 State Minimization
    6.2.3 Programming the Lexical Analyzer
  6.3 Notes and References
7 Parsing
  7.1 Design
    7.1.1 The Parser Interface
    7.1.2 Selection of the Parsing Algorithm
    7.1.3 Parser Construction
  7.2 LL(1) Parsers
    7.2.1 Strong LL(k) Grammars
    7.2.2 The Parse Algorithm
    7.2.3 Computation of FIRST and FOLLOW Sets
  7.3 LR Parsers
    7.3.1 The Parse Algorithm
    7.3.2 SLR(1) and LALR(1) Grammars
    7.3.3 Shift-Reduce Transitions
    7.3.4 Chain Production Elimination
    7.3.5 Implementation
  7.4 Notes and References
8 Attribute Grammars
  8.1 Basic Concepts of Attribute Grammars
  8.2 Traversal Strategies
    8.2.1 Partitioned Attribute Grammars
    8.2.2 Derived Traversals
    8.2.3 Pre-Specified Traversals
  8.3 Implementation Considerations
    8.3.1 Algorithm Coding
    8.3.2 Attribute Storage
  8.4 Notes and References
9 Semantic Analysis
  9.1 Description of Language Properties via Attribute Grammars
    9.1.1 Scope and Name Analysis
    9.1.2 Types
    9.1.3 Declarations
    9.1.4 Expressions and Statements
  9.2 Implementation of Semantic Analysis
  9.3 Notes and References
10 Code Generation
  10.1 Memory Mapping
  10.2 Target Attribution
    10.2.1 Register Allocation
    10.2.2 Targeting
    10.2.3 Use of Algebraic Identities
  10.3 Code Selection
    10.3.1 Machine Simulation
    10.3.2 Code Transformation
  10.4 Notes and References
11 Assembly
  11.1 Internal Address Resolution
    11.1.1 Label Value Determination
    11.1.2 Span-Dependent Instructions
    11.1.3 Special Problems
  11.2 External Address Resolution
    11.2.1 Cross-Referencing
    11.2.2 Library Search
  11.3 Instruction Encoding
    11.3.1 Target Code
    11.3.2 The Encoding Process
  11.4 Notes and References
12 Error Handling
  12.1 General Principles
    12.1.1 Errors, Symptoms, Anomalies and Limitations
    12.1.2 Responses
    12.1.3 Communication with the User
  12.2 Compiler Error Recovery
    12.2.1 Semantic Errors
    12.2.2 Syntactic Errors
    12.2.3 Lexical Errors
  12.3 Run-Time Errors
    12.3.1 Static Error Location
    12.3.2 Establishing the Dynamic Environment
    12.3.3 Debugging Aids
  12.4 Notes and References
13 Optimization
  13.1 The Computation Graph
  13.2 Local Optimization
    13.2.1 Value Numbering
    13.2.2 Coding
    13.2.3 Peephole Optimization
    13.2.4 Local Register Allocation
  13.3 Global Optimization
    13.3.1 Global Data Flow Analysis
    13.3.2 Code Motion
    13.3.3 Strength Reduction
    13.3.4 Global Register Allocation
  13.4 Efficacy and Cost
14 Implementation
  14.1 Implementation Decisions
    14.1.1 Criteria
    14.1.2 Pass Structure
    14.1.3 Table Representation
  14.2 Case Studies
    14.2.1 GIER ALGOL
    14.2.2 Zurich Pascal
    14.2.3 IBM FORTRAN H
  14.3 Notes and References
A The Sample Programming Language LAX
  A.1 Basic Symbols
  A.2 Program Structure
    A.2.1 Programs
    A.2.2 Visibility Rules
    A.2.3 Blocks
    A.2.4 Statement Lists
    A.2.5 Iterations
    A.2.6 Labels and Jumps
  A.3 Declarations
    A.3.1 Values, Types and Objects
    A.3.2 Variable Declarations
    A.3.3 Identity Declarations
    A.3.4 Procedure Declarations
    A.3.5 Type Declarations
  A.4 Expressions
    A.4.1 Evaluation of Expressions
    A.4.2 Coercions
    A.4.3 Operations
    A.4.4 Names
    A.4.5 Procedure Calls
    A.4.6 Clauses
B Useful Algorithms For Directed Graphs
  B.1 Terminology
  B.2 Directed Graphs as Data Structures
  B.3 Partitioning Algorithms
    B.3.1 Strongly Connected Components
    B.3.2 Refinement
    B.3.3 Coloring
  B.4 Notes and References
References

Chapter 1

Introduction and Overview

The term compilation denotes the conversion of an algorithm expressed in a human-oriented source language to an equivalent algorithm expressed in a hardware-oriented target language. We shall be concerned with the engineering of compilers: their organization, algorithms, data structures and user interfaces.

1.1 Translation and Interpretation

Programming languages are tools used to construct formal descriptions of finite computations (algorithms). Each computation consists of operations that transform a given initial state into some final state. A programming language provides essentially three components for describing such computations:

- Data types, objects and values with operations defined upon them.
- Rules fixing the chronological relationships among specified operations.
- Rules fixing the (static) structure of a program.

These components together constitute the level of abstraction on which we can formulate algorithms in the language. We shall discuss abstractions for programming languages in detail in Chapter 2.

The collection of objects existing at a given point in time during the computation constitutes the state, s, of the computation at that time. The set, S, of all states that could occur during computations expressed in the language is called the state space of the language. The meaning of an algorithm is the (partially-defined) function f: S -> S by which it transforms initial states to final states.

Figure 1.1 illustrates the concept of a state. Figure 1.1a is a fragment of a program written in Pascal. Since this fragment does not declare the identifiers i and j, we add the fact that both are integer variables. The values of i and j before the given fragment begins to execute constitute the initial state; their values after execution ceases constitute the final state. Figure 1.1b illustrates the state transformations carried out by the fragment, starting from a particular initial state.
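The state transformation of Figure 1.1 can be traced mechanically. The sketch below (Python, used here only for illustration; the book's fragment is written in Pascal) records the values of the two integer variables after each assignment, reproducing the particular sequence of states of Figure 1.1b:

```python
def run(i, j):
    """Trace the Figure 1.1a fragment:
    while i <> j do if i > j then i := i - j else j := j - i"""
    states = [(i, j)]              # the initial state
    while i != j:
        if i > j:
            i = i - j
        else:
            j = j - i
        states.append((i, j))      # the state after each assignment
    return states

print(run(36, 24))  # [(36, 24), (12, 24), (12, 12)]
```

The final state (12, 12) happens to be the greatest common divisor of the initial values, computed by repeated subtraction.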

Let f be the function defined by the state transformation of some particular algorithm A. If we are to preserve the meaning of A when compiling it to a new language then the state transformation function f' of the translated algorithm A' must, in some sense, 'agree' with f. Since the state space, S', of the target language may differ from that of the source language, we must first decide upon a function, M, to map each state s in S to a subset M(s) of S'. The function f' then preserves the meaning of f if f'(M(s)) is a subset of M(f(s)) for all allowable initial states s in S.


while i <> j do
  if i > j then i := i - j
  else j := j - i;

a) An algorithm

Initial: i = 36, j = 24
         i = 12, j = 24
Final:   i = 12, j = 12

b) A particular sequence of states

Figure 1.1: Algorithms and States

For example, consider the language of a simple computer with a single accumulator and two data locations called I and J respectively (Exercise 1.3). Suppose that M maps a particular state of the algorithm given in Figure 1.1a to a set of machine states in which I contains the value of the variable i, J contains the value of the variable j, and the accumulator contains any arbitrary value. Figure 1.2a shows a translation of Figure 1.1a for this machine; a partial state sequence is given in Figure 1.2b.

LOOP  LOAD  I
      SUB   J
      JZERO EXIT
      JNEG  SUBI
      STORE I
      JUMP  LOOP
SUBI  LOAD  J
      SUB   I
      STORE J
      JUMP  LOOP
EXIT

a) An algorithm

Initial: I = 36, J = 24, ACC = ?
         I = 36, J = 24, ACC = 36
         I = 36, J = 24, ACC = 12
         ...
Final:   I = 12, J = 12, ACC = 0

b) A sequence of states corresponding to Figure 1.1b

Figure 1.2: A Translation of Figure 1.1

In determining the state sequence of Figure 1.1b, we used only the concepts of Pascal as specified by the language definition. For every programming language, PL, we can define an abstract machine: The operations, data structures and control structures of PL become the memory elements and instructions of the machine. A 'Pascal machine' is therefore an imaginary computer with Pascal operations as its machine instructions and the data objects possible in Pascal as its memory elements. Execution of an algorithm written in PL on such a machine is called interpretation; the abstract machine is an interpreter.
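The agreement between f and f' can be checked concretely by simulating the hypothetical accumulator machine of Figure 1.2. The sketch below (Python for illustration; the instruction names are those of the figure, and the encoding as labeled tuples is an assumption made here) executes the translated program and confirms that it reaches the final state I = 12, J = 12, ACC = 0 shown in Figure 1.2b, in agreement with the source-level run of Figure 1.1b:

```python
# The program of Figure 1.2a as (label, opcode, operand) triples.
PROGRAM = [
    ("LOOP", "LOAD", "I"), (None, "SUB", "J"), (None, "JZERO", "EXIT"),
    (None, "JNEG", "SUBI"), (None, "STORE", "I"), (None, "JUMP", "LOOP"),
    ("SUBI", "LOAD", "J"), (None, "SUB", "I"), (None, "STORE", "J"),
    (None, "JUMP", "LOOP"), ("EXIT", None, None),
]

def execute(mem):
    """Run PROGRAM on memory {'I': ..., 'J': ...}; return (I, J, ACC)."""
    labels = {lab: pc for pc, (lab, _, _) in enumerate(PROGRAM) if lab}
    acc, pc = 0, 0
    while pc < len(PROGRAM):
        _, op, arg = PROGRAM[pc]
        pc += 1
        if op == "LOAD":     acc = mem[arg]          # ACC := location
        elif op == "SUB":    acc -= mem[arg]         # ACC := ACC - location
        elif op == "STORE":  mem[arg] = acc          # location := ACC
        elif op == "JUMP":   pc = labels[arg]        # unconditional jump
        elif op == "JZERO" and acc == 0: pc = labels[arg]
        elif op == "JNEG" and acc < 0:   pc = labels[arg]
        elif op is None:     break                   # reached EXIT
    return mem["I"], mem["J"], acc

print(execute({"I": 36, "J": 24}))  # (12, 12, 0)
```

Because the accumulator may hold any value in the states of M(s), the final ACC = 0 is consistent with M applied to the source's final state, as the meaning-preservation condition requires.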


A pure interpreter analyzes the character form of each source language instruction every time that instruction is executed. If the given instruction is only to be executed once, pure interpretation is the least expensive method of all. Hence it is often used for job control languages and the 'immediate commands' of interactive languages. When instructions are to be executed repeatedly, a better approach is to analyze the character form of the source program only once, replacing it with a sequence of symbols more amenable to interpretation. This analysis is simply a translation of the source language into some target language, which is then interpreted.

The translation from the source language to the target language can take place as each instruction of the program is executed for the first time (interpretation with substitution). Thus only that part of the program actually executed will be translated; during testing this may be only a fraction of the entire program. Also, the character form of the source program can often be stored more compactly than the equivalent target program. The disadvantage of interpretation with substitution is that both the compiler and interpreter must be available during execution. In practice, however, a system of this kind should not be significantly larger than a pure interpreter for the same language.

Examples may be found of virtually all levels of interpretation. At one extreme are the systems in which the compiler merely converts constants to internal form, fixes the meaning of identifiers and perhaps transforms infix notation to postfix (APL and SNOBOL4 are commonly implemented this way); at the other are the systems in which the hardware, assisted by a small run-time system, forms the interpreter (FORTRAN and Pascal implementations usually follow this strategy).

1.2 The Tasks of a Compiler

A compilation is usually implemented as a sequence of transformations (SL, L1), (L1, L2), ..., (Lk, TL); each Li is called an intermediate language. Intermediate languages are conceptual tools used in decomposing the task of compiling from the source language to the target language. The design of a particular compiler determines which (if any) intermediate language programs actually appear as concrete text or data structures during compilation.

Any compilation can be broken down into two major tasks:

- Analysis: Discover the structure and primitives of the source program, determining its meaning.
- Synthesis: Create a target program equivalent to the source program.

This breakdown is useful because it separates our concerns about the source and target languages.

The analysis concerns itself solely with the properties of the source language. It converts the program text submitted by the programmer into an abstract representation embodying the essential properties of the algorithm. This abstract representation may be implemented in many ways, but it is usually conceptualized as a tree. The structure of the tree represents the control and data flow aspects of the program, and additional information is attached to the nodes to describe other aspects vital to the compilation. In Chapter 2 we review the general characteristics of source languages, pointing out the properties relevant for the compiler writer. Figure 1.3 illustrates the general idea with an abstraction of the algorithm of Figure 1.1a.

Figure 1.3a describes the control and data flow of the algorithm by means of the 'k-th descendant of' relation. For example, to carry out the algorithm described by a subtree
Introduction and Overview

[Figure 1.3a (tree diagram): control and data flow of the program fragment, built from while, if, asgn, exp, name and idn nodes]

a) Control and data flow

Node  Additional Information
idn   identifier; corresponding declaration
name  type of the variable
exp   type of the expression value

b) Additional information about the source program

Node   Additional Information
name   corresponding data location
if     address of code to carry out the else part
while  address of the expression evaluation code
c) Additional information about the target program

Figure 1.3: An Abstract Program Fragment

rooted in a while node we first evaluate the expression described by the subtree that is the first descendant of the while node. If this expression yields true then we carry out the algorithm described by the subtree that is the second descendant. Similarly, to evaluate the expression described by an expression subtree, we evaluate the first and third descendants and then apply the operator described by the second descendant to the results.

The algorithm of Figure 1.1a is not completely characterized by Figure 1.3a. Information must be added (Figure 1.3b) to complete the description. Note that some of this information (the actual identifier for each idn) is taken directly from the source text. The remainder is obtained by processing the tree. For example, the type of the expression value depends upon the operator and the types of the operands.

Synthesis proceeds from the abstraction developed during analysis. It augments the tree by attaching additional information (Figure 1.3c) that reflects the source-to-target mapping discussed in the previous section. For example, the access function for the variable i in Figure 1.1a would become the address of data location I according to the mapping M assumed by Figure 1.2. Similarly, the address of the else part of the conditional was represented by the label SUBI. Chapter 3 discusses the general characteristics of machines, highlighting properties that are important in the development of source-to-target mappings.

Formal definitions of the source language and the source-to-target mapping determine the structure of the tree and the computation of the additional information. The compiler simply implements the indicated transformations, and hence the abstraction illustrated in Figure 1.3 forms the basis for the entire compiler design. In Chapter 4 we discuss this abstraction in detail, considering possible intermediate languages and the auxiliary data structures used in transforming between them.

Analysis is the more formalized of the two major compiler tasks. It is generally broken


down into two parts, the structural analysis to determine the static structure of the source program, and the semantic analysis to fix the additional information and check its consistency. Chapter 5 summarizes some results from the theory of formal languages and shows how they are used in the structural analysis of a program. Two subtasks of the structural analysis are identified on the basis of the particular formalisms employed: Lexical analysis (Chapter 6) deals with the basic symbols of the source program, and is described in terms of finite-state automata; syntactic analysis, or parsing, (Chapter 7) deals with the static structure of the program, and is described in terms of pushdown automata. Chapter 8 extends the theoretical treatment of Chapter 5 to cover the additional information attached to the components of the structure, and Chapter 9 applies the resulting formalism (attribute grammars) to semantic analysis.

There is little in the way of formal models for the entire synthesis process, although algorithms for various subtasks are known. We view synthesis as consisting of two distinct subtasks, code generation and assembly. Code generation (Chapter 10) transforms the abstract source program appearing at the analysis/synthesis interface into an equivalent target machine program. This transformation is carried out in two steps: First we map the algorithm from source concepts to target concepts, and then we select a specific sequence of target machine instructions to implement that algorithm.

Assembly (Chapter 11) resolves all target addressing and converts the target machine instructions into an appropriate output format. We should stress that by using the term 'assembly' we do not imply that the code generator will produce symbolic assembly code for input to the assembly task. Instead, it delivers an internal representation of target instructions in which most addresses remain unresolved. This representation is similar to that resulting from analysis of symbolic instructions during the first pass of a normal symbolic assembler. The output of the assembly task should be in the format accepted by the standard link editor or loader on the target machine.

Errors may appear at any time during the compilation process. In order to detect as many errors as possible in a single run, repairs must be made such that the program is consistent, even though it may not reflect the programmer's intent. Violations of the rules of the source language should be detected and reported during analysis. If the source algorithm uses concepts of the source language for which no target equivalent has been defined in a particular implementation, or if the target algorithm exceeds limitations of a specific target language interpreter (e.g. requires more memory than a specific computer provides), this should be reported during synthesis. Finally, errors must be reported if any storage limits of the compiler itself are violated.

In addition to the actual error handling, it is useful for the compiler to provide extra information for run-time error detection and debugging. This task is closely related to error handling, and both are discussed in Chapter 12.

A number of strategies may be followed in an attempt to improve the target program relative to some specified measure of cost. (Code size and execution speed are typical cost measures.) These strategies may involve deeper analysis of the source program, more complex mapping functions, and transformations of the target program. We shall treat the first two in our discussions of analysis and code generation respectively; the third is the subject of Chapter 13.

1.3 Data Management in a Compiler

As with other large programs, data management and access account for many of the problems to be solved by the design of a compiler. In order to control complexity, we separate the


functional aspects of a data object from the implementation aspects by regarding it as an instance of an abstract data type. (An abstract data type is defined by a set of creation, assignment and access operators and their interaction; no mention is made of the concrete implementation technique.) This enables us to concentrate upon the relationships between tasks and data objects without becoming enmeshed in details of resource allocation that reflect the machine upon which the compiler is running (the compiler host) rather than the problem of compilation.

A particular implementation is chosen for a data object on the basis of the relationship between its pattern of usage and the resources provided by the compiler host. Most of the basic issues involved become apparent if we distinguish three classes of data:

Local data of compiler tasks

Program text in various intermediate representations

Tables containing information that represents context-dependence in the program text

Storage for local data can be allocated statically or managed via the normal stacking mechanisms of a block-structured language. Such strategies are not useful for the program text, however, or for the tables containing contextual information. Because of memory limitations, we can often hold only a small segment of the program text in directly-accessible storage. This constrains us to process the program sequentially, and prevents us from representing it directly as a linked data structure. Instead, a linear notation that represents a specific traversal of the data structure (e.g. prefix or postfix) is often employed. Information to be used beyond the immediate vicinity of the place where it was obtained is stored in tables. Conceptually, this information is a component of the program text; in practice it often occupies different data structures because it has different access patterns. For example, tables must often be accessed randomly. In some cases it is necessary to search them, a process that may require a considerable fraction of the total compilation time. For this reason we do not usually consider the possibility of spilling tables to a file.

The size of the program text and that of most tables grows linearly with the length of the original source program. Some data structures (e.g. the parse stack) only grow with the complexity of the source program. (Complexity is generally related to nesting of constructs such as procedures and loops. Thus long, straight-line programs are not particularly complex.) Specification of bounds on the size of any of these data structures leads automatically to restrictions on the class of translatable programs. These restrictions may not be onerous to a human programmer but may seriously limit programs generated by pre-processors.

1.4 Compiler Structure

A decomposition of any problem identifies both tasks and data structures. For example, in Section 1.2 we discussed the analysis and synthesis tasks. We mentioned that the analyzer converted the source program into an abstract representation and that the synthesizer obtained information from this abstract representation to guide its construction of the target algorithm. Thus we are led to recognize a major data object, which we call the structure tree, in addition to the analysis and synthesis tasks.

We define one module for each task and each data structure identified during the decomposition. A module is specified by an interface that defines the objects and actions it makes available, and the global data and operations it uses. It is implemented (in general) by a collection of procedures accessing a common data structure that embodies the state of the module. Modules fall into a spectrum with single procedures at one end and simple data objects at the other. Four points on this spectrum are important for our purposes:


Procedure: An abstraction of a single "memoryless" action (i.e. an action with no internal state). It may be invoked with parameters, and its effect depends only upon the parameter values. (Example: A procedure to calculate the square root of a real value.)

Package: An abstraction of a collection of actions related by a common internal state. The declaration of a package is also its instantiation, and hence only one instance is possible. (Example: The analysis or structure tree module of a compiler.)

Abstract data type: An abstraction of a data object on which a number of actions can be performed. Declaration is separate from instantiation, and hence many instances may exist. (Example: A stack abstraction providing the operations push, pop, top, etc.)

Variable: An abstraction of a data object on which exactly two operations, fetch and store, can be performed. (Example: An integer variable in most programming languages.)

Abstract data types can be implemented via packages: The package defines a data type to represent the desired object, and procedures for all operations on the object. Objects are then instantiated separately. When an operation is invoked, the particular object to which it should be applied is passed as a parameter to the operation procedure.

The overall compiler structure that we shall use in this book is outlined in Figures 1.4 through 1.8. Each of these figures describes a single step in the decomposition. The central block of the figure specifies the problem being decomposed at this step. To the left are the data structures from which information is obtained, and to the right are those to which information is delivered. Below is the decomposition of the problem, with boxes representing subtasks. Data structures used for communication among these subtasks are listed at the bottom of the figure. Each box and each entry in any of the three data lists corresponds to a module of the compiler. It is important to note that Figures 1.4 through 1.8 reflect only the overall structure of the compiler; they are not flowcharts and they do not specify module interfaces.

[Figure: INPUT: Source text; OUTPUT: Target code, Error reports; decomposition of Compilation into Analysis and Synthesis; LOCAL: Structure tree]

Figure 1.4: Decomposition of the Compiler

Our decomposition is based upon our understanding of the compilation problem and our perception of the best techniques currently available for its solution. The choice of precise boundaries is driven by control and data flow considerations, primarily minimization of flow at interfaces. Specific criteria that influenced our decisions will be discussed throughout the text.

The decomposition is virtually independent of the underlying implementation, and of the specific characteristics of the source language and target machine. Clearly these factors


[Figure: INPUT: Source text; OUTPUT: Target code, Error reports; decomposition of Analysis into Structural Analysis and Semantic Analysis]

Figure 1.5: Decomposition of the Analysis Task

influence the complexity of the modules that we have identified, in some cases reducing them to trivial stubs, but the overall structure remains unchanged.

[Figure: INPUT: Source text; OUTPUT: Error reports, Connection sequence; decomposition of Structural Analysis into Lexical Analysis and Parsing; LOCAL: Token sequence]

Figure 1.6: Decomposition of the Structural Analysis Task

Independence of the modules from the concrete implementation is obtained by assuming that each module is implemented on its own abstract machine, which provides the precise operations needed by the module. The local data structures of Figures 1.4-1.8 are thus components of the abstract machine on which the given subproblem is solved.

[Figure: INPUT: Structure tree; OUTPUT: Error reports, Target code; decomposition of Synthesis into Code Generation and Assembly; LOCAL: Target tree]

Figure 1.7: Decomposition of the Synthesis Task

One can see the degree of freedom remaining in the implementation by noting that our diagrams never prescribe the time sequence of the subproblem solutions. Thus, for example, analysis and synthesis might run sequentially. In this case the structure tree must be completely built as a linked data structure during analysis, written to a file if necessary, and then processed during synthesis. Analysis and synthesis might, however, run concurrently


and interact as coroutines: As soon as the analyzer has extracted an element of the structure tree, the synthesizer is activated to process this element further. In this case the structure tree will never be built as a concrete object, but is simply an abstract data structure; only the element being processed exists in concrete form.

[Figure: INPUT: Structure tree; OUTPUT: Error reports, Target tree; decomposition of Code Generation into Target Mapping and Code Selection; LOCAL: Computation graph]

Figure 1.8: Decomposition of the Code Generation Task

In particular, our decomposition has nothing to do with the possible division of a compiler into passes. (We consider a pass to be a single, sequential scan of the entire text in either direction. A pass either transforms the program from one internal representation to another or performs specified changes while holding the representation constant.) The pass structure commonly arises from storage constraints in main memory and from input/output considerations, rather than from any logical necessity to divide the compiler into several sequential steps. One module is often split across several passes, and/or tasks belonging to several modules are carried out in the same pass. Possible criteria will be illustrated by concrete examples in Chapter 14. Proven programming methodologies indicate that it is best to regard pass structure as an implementation question. This permits development of program families with the same modular decomposition but different pass organization. The above consideration of coroutines and other implementation models illustrates such a family.

1.5 Notes and References

Compiler construction is one of the areas of computer science that early workers tried to consider systematically. Knuth [1962] reports some of those efforts. Important sources from the first half of the 60's are an issue of the Communications of the ACM [1961], the report of a conference sponsored by the International Computing Centre [ICC, 1962] and the collection of papers edited by Rosen [1967]. Finally, Annual Review in Automatic Programming contains a large number of fundamental papers in compiler construction.

The idea of an algorithmic conversion of expressions to a machine-oriented form originated in the work of Rutishauser [1952]. Although most of our current methods bear only a distant resemblance to those of the 50's and early 60's, we have inherited a view of the description of programming languages that provides the foundation of compiler construction today: Intermediate languages were first proposed as interfaces in the compilation process by a SHARE committee [Mock et al., 1958]; the extensive theory of formal languages, first developed by the linguist Noam Chomsky [1956], was employed in the definition of ALGOL 60 [1963]; the use of pushdown automata as models for syntax analysis appears in the work of Samelson and Bauer [1960].

The book by Randell and Russell [1964] remains a useful guide for a quick implementation of ALGOL 60 that does not depend upon extensive tools. Grau et al. [1967] describe


an ALGOL 60 implementation in an extended version of ALGOL 60. The books by Gries [1971], Aho and Ullman [1972, 1977] and Bauer and Eickel [1976] represent the state of the art in the mid 1970's.

Recognition that parsing can be understood via models from the theory of formal languages led to a plethora of work in this area and provided the strongest motivation for the further development of that theory. From time to time the impression arises that parsing is the only relevant component of compiler construction. Parsing unquestionably represents one of the most important control mechanisms of a compiler. However, while just under one third of the papers collected in Pollack's 1972 bibliography are devoted to parsing, there was not one reference to the equally important topic of code generation. Measurements [Horning et al., 1972] have shown that parsing represents approximately 9% of a compiler's code and 11% of the total compilation time. On the other hand, code generation and optimization account for 50-70% of the compiler. Certainly this discrepancy is due, in part, to the great advances made in the theory of parsing; the value of this work should not be underestimated. We must stress, however, that a more balanced viewpoint is necessary if progress is to be maintained.

Modular decomposition [Parnas, 1972, 1976] is a design technique in which intermediate stages are represented by specifications of the external behavior (interfaces) of program modules. The technique of data-driven decomposition was discussed by Liskov and Zilles [1974] and a summary of program module characteristics was given by Goos and Kastens [1978]. This latter paper shows how the various kinds of program modules are constructed in several programming languages. Our diagrams depicting single decompositions are loosely based upon some ideas of Constantine et al. [1974].

Exercises

1.1 Consider the Pascal algorithm of Figure 1.1a.
(a) What are the elementary objects and operations?
(b) What are the rules for chronological relations?
(c) What composition rules are used to construct the static program?

1.2 Determine the state transformation function, f, for the algorithm of Figure 1.1a. What initial states guarantee termination? How do you characterize the corresponding final states?

1.3 Consider a simple computer with an accumulator and two data locations. The instruction set is:

LOAD d:  Copy the contents of data location d to the accumulator.
STORE d: Copy the contents of the accumulator to data location d.
SUB d:   Subtract the contents of data location d from the accumulator, leaving the result in the accumulator. (Ignore any possibility of overflow.)
JUMP i:  Execute instruction i next.
JZERO i: Execute instruction i next if the accumulator contents are zero.
JNEG i:  Execute instruction i next if the accumulator contents are less than zero.

(a) What are the elementary objects?
(b) What are the elementary actions?
(c) What composition rules are used?
(d) Complete the state sequence of Figure 1.2b.
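The accumulator machine of Exercise 1.3 can be simulated directly. The following Python sketch is one possible interpretation; the encoding of a program as a list of (opcode, operand) pairs is our own assumption, not part of the exercise:

```python
# Minimal simulator for the accumulator machine of Exercise 1.3.
# State: accumulator plus data locations d0 and d1.
# Operands are data-location indices for LOAD/STORE/SUB and
# instruction indices for the jumps. (This encoding is assumed.)

def run(program, data):
    acc = 0
    pc = 0                        # index of the next instruction
    while pc < len(program):
        op, arg = program[pc]
        pc += 1
        if op == "LOAD":
            acc = data[arg]
        elif op == "STORE":
            data[arg] = acc
        elif op == "SUB":
            acc -= data[arg]      # overflow ignored, as the exercise says
        elif op == "JUMP":
            pc = arg
        elif op == "JZERO":
            if acc == 0:
                pc = arg
        elif op == "JNEG":
            if acc < 0:
                pc = arg
        else:
            raise ValueError("unknown opcode: " + op)
    return acc, data

# Example program: repeatedly subtract d1 from d0 until the result
# would go negative, leaving d0 = d0 mod d1.
mod_program = [
    ("LOAD", 0),    # 0: acc := d0
    ("SUB", 1),     # 1: acc := acc - d1
    ("JNEG", 5),    # 2: stop once acc went negative
    ("STORE", 0),   # 3: d0 := acc
    ("JUMP", 0),    # 4: loop
]                   # 5: falling off the end halts the machine
```

Running `run(mod_program, [7, 3])` leaves 1 (that is, 7 mod 3) in data location 0.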

Chapter 2

Properties of Programming Languages

Programming languages are often described by stating the meaning of the constructs (expressions, statements, clauses, etc.) interpretively. This description implicitly defines an interpreter for an abstract machine whose machine language is the programming language.

The output of the analysis task is a representation of the program to be compiled in terms of the operations and data structures of this abstract machine. By means of code generation and the run-time system, these elements are modeled by operation sequences and data structures of the computer and its basic software (operating system, etc.)

In this chapter we explore the properties of programming languages that determine the construction and possible forms of the associated abstract machines, and demonstrate the correspondence between the elements of the programming language and the abstract machine. On the basis of this discussion, we select the features of our example source language, LAX. A complete definition of LAX is given in Appendix A.

2.1 Overview

The basis of every language implementation is a language definition. (See the Bibliography for a list of the language definitions that we shall refer to in this book.) Users of the language read the definition as a user manual: What is the practical meaning of the primitive elements? How can they be meaningfully used? How can they be combined in a meaningful way? The compiler writer, on the other hand, is interested in the question of which constructions are permitted. Even if he cannot at the moment see any useful application of a construct, or if the construct leads to serious implementation difficulties, he must implement it exactly as specified by the language definition. Descriptions such as programming textbooks, which are oriented towards the meaningful applications of the language elements, do not clearly define the boundaries between what is permitted and what is prohibited. Thus it is difficult to make use of such descriptions as bases for the construction of a compiler. (Programming textbooks are also informal, and often cover only a part of the language.)

2.1.1 Syntax, Semantics and Pragmatics

The syntax of a language determines which character strings constitute well-formed programs in the language and which do not. The semantics of a language describe the meaning of a program in terms of the basic concepts of the language. Pragmatics relate the basic concepts


of the language to concepts outside the language (to concepts of mathematics or to the objects and operations of a computer, for example).

Semantics include properties that can be deduced without executing the program as well as those only recognizable during execution. Following Griffiths [1973], we denote these properties static and dynamic semantics respectively. The assignment of a particular property to one or the other of these classes is partially a design decision by the compiler writer. For example, some implementations of ALGOL 60 assign the distinction between integer and real to the dynamic semantics, although this distinction can normally be made at compile time and thus could belong to the static semantics.

Pragmatic considerations appear in language definitions as unelaborated statements of existence, as references to other areas of knowledge, as appeals to intuition, or as explicit statements. Examples are the statements '[Boolean] values are the truth values denoted by the identifiers true and false' (Pascal Report, Section 6.1.2), 'their results are obtained in the sense of numerical analysis' (ALGOL 68 Revised Report, Section 2.1.3.1.e) or 'decimal numbers have their conventional meaning' (ALGOL 60 Report, Section 2.5.3). Most pragmatic properties are hinted at through a suggestive choice of words that are not further explained. Statements that certain constructs only have a defined meaning under specified conditions also belong to the pragmatics of a language. In such cases the compiler writer is usually free to fix the meaning of the construct under other conditions. The richer the pragmatics of a language, the more latitude a compiler writer has for efficient implementation and the heavier the burden on the user to write his program to give the same answers regardless of the implementation.

We shall set the following goals for our analysis of a language definition:

Stipulation of the syntactic rules specifying construction of programs.

Stipulation of the static semantic rules. These, in conjunction with the syntactic rules, determine the form into which the analysis portion of the compiler transforms the source program.

Stipulation of the dynamic semantic rules and differentiation from pragmatics. These determine the objects and operations of the language-oriented abstract machine, which can be used to describe the interface between the analysis and synthesis portions of the compiler: The analyzer translates the source program into an abstract target program that could run on the abstract machine.

Stipulation of the mapping of the objects and operations of the abstract machine onto the objects and operations of the hardware and operating system, taking the pragmatic meanings of these primitives into account. This mapping will be carried out partly by the code generator and partly by the run-time system; its specification is the basis for the decisions regarding the partitioning of tasks between these two phases.

2.1.2 Syntactic Properties

The syntactic rules of a language belong to distinct levels according to their meaning. The lowest level contains the 'spelling rules' for basic symbols, which describe the construction of keywords, identifiers and special symbols. These rules determine, for example, whether keywords have the form of identifiers (begin) or are written with special delimiters ('BEGIN', .BEGIN), whether lower case letters are permitted in addition to upper case, and which spellings (<=, .LE., 'NOT' 'GREATER') are permitted for symbols such as ≤ that cannot be reproduced on all I/O devices. A common property of these rules is that they do not affect the meaning of the program being represented. (In this book we have distinguished keywords by using boldface type. This convention is used only to enhance readability, and does not imply anything about the actual representation of keywords in program text.)
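As a concrete illustration of such spelling rules, a scanner might map each permitted spelling onto one internal basic symbol before any further processing. The table below is our own illustrative sketch, not taken from the book:

```python
# Spelling rules map alternative external spellings onto one internal
# basic symbol; the choice of spelling never changes the meaning of
# the program. (Illustrative table only; the symbol names are made up.)

BASIC_SYMBOL = {
    "<=":      "le",
    ".LE.":    "le",          # FORTRAN-style spelling of the same symbol
    "<":       "lt",
    ".LT.":    "lt",
    "begin":   "beginsym",    # keyword in identifier form
    "'BEGIN'": "beginsym",    # keyword with special delimiters
    ".BEGIN":  "beginsym",
}

def basic_symbol(spelling):
    """Return the internal basic symbol for an external spelling."""
    return BASIC_SYMBOL[spelling]
```

All spellings of the same symbol yield the same internal code, e.g. `basic_symbol("<=") == basic_symbol(".LE.")`, so later phases of the compiler never see the difference.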


The second level consists of the rules governing representation and interpretation of constants, for example rules about the specification of exponents in floating point numbers or the allowed forms of integers (decimal, hexadecimal, etc.) These rules affect the meanings of programs insofar as they specify the possibilities for direct representation of constant values. The treatment of both of these syntactic classes is the task of lexical analysis, discussed in Chapter 6.

The third level of syntactic rules is termed the concrete syntax. Concrete syntax rules describe the composition of language constructs such as expressions and statements from basic symbols. Figure 2.1a shows the parse tree (a graphical representation of the application of concrete syntax rules) of the Pascal statement 'if a or b and c then ... else ...'. Because the goal of the compiler's analysis task is to determine the meaning of the source program, semantically irrelevant complications such as operator precedence and certain keywords can be suppressed. The language constructs are described by an abstract syntax that specifies the compositional structure of a program while leaving open some aspects of its concrete representation as a string of basic symbols. Application of the abstract syntax rules can be illustrated by a structure tree (Figure 2.1b).

[Figure 2.1a (tree diagram): parse tree of the statement, with nodes statement, if, expression, simple expression, then, else, or, and, term, factor, variable and identifier (a, b, c)]

a) Parse tree (application of concrete syntax rules)

[Figure 2.1b (tree diagram): structure tree of the same statement, with nodes statement, expression, or, and, term, factor, variable and identifier (a, b, c)]

b) Structure tree (application of abstract syntax rules)

Figure 2.1: Concrete and Abstract Syntax
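The contrast between the two trees of Figure 2.1 can be sketched in code. The nested-tuple encoding below is our own illustration of the idea, not the book's representation:

```python
# Concrete vs. abstract syntax for the Pascal condition 'a or b and c'.
# The parse tree records every concrete rule application (expression,
# simple expression, term, factor, ...); the structure tree keeps only
# the composition that matters for the meaning.

# Parse-tree fragment: each node names the concrete syntax rule applied,
# so chains like term -> factor -> variable -> identifier appear explicitly.
parse_tree = (
    "expression",
    ("simple expression",
     ("term", ("factor", ("variable", ("identifier", "a")))),
     "or",
     ("term",
      ("factor", ("variable", ("identifier", "b"))),
      "and",
      ("factor", ("variable", ("identifier", "c"))))),
)

# Corresponding structure tree: operators become interior nodes and the
# single-descendant rule chains disappear.
structure_tree = ("or", "a", ("and", "b", "c"))

def abstract(node):
    """Collapse a parse-tree fragment into its structure tree."""
    if isinstance(node, str):
        # Bare operator symbols are handled by the enclosing node.
        return None if node in ("or", "and") else node
    kind, *children = node
    if kind == "identifier":
        return children[0]
    ops = [c for c in children if c in ("or", "and")]
    parts = [p for p in (abstract(c) for c in children) if p is not None]
    if ops:                          # binary operator node
        return (ops[0], parts[0], parts[1])
    return parts[0]                  # chain rule: pass the child through
```

Applying `abstract(parse_tree)` reproduces `structure_tree`, illustrating how the analysis task suppresses semantically irrelevant detail such as the precedence chain.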


2.1.3 Semantic Properties

Most current programming languages specify algorithms operationally, in contrast to 'very high level' languages that allow the user to formally describe a problem and leave the implementation to the compiler. Essential semantic elements of operational languages are:

Data objects and structures upon which operations take place

Operations and construction rules for expressions and other operative statements

Constructs providing flow of control, the dynamic composition of program fragments

Data objects appear as explicit constants, as values of variables and as results of operations. At any point in the execution of a program the totality of variable values represents the state of the abstract machine. This state constitutes the environment for execution of further operations.

Included in the set of operations are the access functions such as indexing of an array or selection of a field of a record, and operations such as the addition or comparison of two values. These operations do not alter the state of the abstract machine. Assignment is an example of an operation with a side effect that alters the contents of a variable, a component of the state of the abstract machine. Most programming languages contain a large number of such state-changing operations, all of which may be regarded as assignment combined with other operations. Usually these operations are formulated as statements without results. Most COBOL 'verbs' designate such statements. Finally, operations include block entry and exit, procedure call and return, and creation of variables. These operations, which we associate with control of the state, change the state by creating and deleting objects (variables, parameters, etc.) and altering the allowable access functions.

Flow of control includes conditional expressions or statements, case selection, iteration, jumps and so forth. These elements appear in various forms in most programming languages, and frequently take into account some special implementation possibility or practice. For example, the conditional statement

if truth_value then s1 else s2;

and the case selection

case truth_value of true: s1; false: s2 end;

have identical effects in Pascal. As we shall see later, however, the two constructs would probably be implemented differently.

In considering semantic properties, it is important for the compiler writer to systematically collect the countless details such as properties of data objects, operations and side effects, possibilities for iteration, and so forth, into some schema. The clarity and adequacy of this schema determines the quality of the compiler because the compiler structure is derived from it. A shoddy schema makes well-nigh impossible a convincing argument that the compiler translates the source language fully and completely.

For many languages, including ALGOL 60, ALGOL 68, Pascal and Ada, good schemata are comparatively easy to obtain because the language definitions are suitably structured. Other language definitions take the form of a collection of language element descriptions with many exception rules; a systematic treatment of such languages is often impossible.
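Why the two constructs would likely be implemented differently can be sketched as follows; the instruction names and the Python encoding are our own illustration, not taken from the book:

```python
# One plausible lowering of the two semantically equivalent constructs.
# The conditional statement tests the value and branches; the case
# selection can instead dispatch through a table of code addresses.

def lower_if(cond, s1, s2):
    # if cond then s1 else s2
    return [
        ("JZERO", cond, "L_else"),   # conditional branch if cond is false
        ("EXEC", s1),
        ("JUMP", "L_end"),
        ("LABEL", "L_else"),
        ("EXEC", s2),
        ("LABEL", "L_end"),
    ]

def lower_case(cond, s1, s2):
    # case cond of true: s1; false: s2 end
    # The selector value directly indexes a branch table.
    return [
        ("JUMPTABLE", cond, {True: "L_true", False: "L_false"}),
        ("LABEL", "L_true"),
        ("EXEC", s1),
        ("JUMP", "L_end"),
        ("LABEL", "L_false"),
        ("EXEC", s2),
        ("LABEL", "L_end"),
    ]
```

Both sequences execute s1 when the selector is true and s2 when it is false, but the first uses conditional branches while the second dispatches through a table; for selectors with many values the difference in cost can be substantial.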


2.2 Data Objects and Operations

The most important characteristics of a programming language are the available data objects and the operations that may be executed upon them. The term `object' means a concrete instance of an abstract value. Many such instances of the same value may exist at the same time. The set of values possible in a language, such as numbers, character strings, records and so forth, is usually infinite, although a given program naturally uses only a finite number of them.

Objects and values may be classified according to many criteria. For example, their internal (to the computer) or external representation, the algorithm used to access them, or the access rights might be used. Each such classification leads to an attribute of the object. The most important classification is a partition of the set of values according to the applicable operations; the corresponding attribute is called the type or mode of the value. Examples are the numeric types integer and real, to which the basic arithmetic operations may be applied. (The special role of zero in division is not covered by this classification.)

A rough subdivision of object types can be made on the basis of the possible access functions. If an object can be accessed only in its entirety, we say that its type is elementary. If, however, the object consists of a collection of distinct components, which may be altered individually, then we say that its type is composite. Thus if a programming language were to explain floating point operations in terms of updating operations on fraction and exponent individually, floating point values would be composite. This is not usually done; the floating point operations can only yield complete floating numbers, and hence real is an elementary type.

Every operation interprets its operands in a specified manner. The assignment of a type to a value fixes this interpretation and admits only those operations for which this interpretation is meaningful. As usual with such attributes, there are many possible choices for the binding time, the point at which a particular attribute is ascribed to a particular object: If the type is first fixed upon execution of an operation, and if practically any operation can be applied to any object (so long as its length is appropriate), then we term the language typeless or type-free; otherwise it is called a typed language. If the type of an object can be determined explicitly from the program text, we speak of manifest type; the type is latent if it cannot be determined until the program is executed. (A language whose types are manifest throughout is sometimes called a strongly-typed language, while one whose types are latent is called weakly-typed.) Objects with latent types must be provided with an explicit type indication during execution.

Most assembly languages are examples of typeless languages. In contrast, ALGOL 60, FORTRAN and COBOL are languages with manifest types: All variables are declared (either explicitly or implicitly) to have values of a certain type, and there are different forms of denotation for constants of different types. SNOBOL4 has neither declarations nor implied type specifications for its variables; on the contrary, the type may change during execution. Thus SNOBOL4 has latent types. The union modes in ALGOL 68 and the variant records of Pascal and Ada take an intermediate position. A variable of such a `discriminated union' has a latent type, but the possible value types may only be drawn from an explicitly-stated set.

In a typeless language, the internal representation (`coding') of an object is the concern of the programmer; the implementor of a typed language can fix the coding because he is fully aware of all desired interpretations. Erroneous coding by the programmer is thus impossible. Further, inconsistent creation or use of a data object can be detected automatically, and hence the class of automatically-detected errors is broadened. With manifest types such errors appear during compilation; with latent types they are first detected during execution. Moreover, in a language with latent types the erroneous creation of an object is only detected
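The difference in detection time can be illustrated with a modern latent-typed language. The sketch below uses Python as a stand-in for a language like SNOBOL4 (the function and variable names are invented for illustration): the erroneous object is created without complaint, and the type error surfaces only when the object is subsequently used.

```python
def average(values):
    # Nothing checks at creation time that `values` holds only numbers;
    # with latent types the check happens inside each operation.
    return sum(values) / len(values)

good = [1, 2, 3]
bad = [1, "two", 3]   # erroneous creation passes unnoticed

print(average(good))  # 2.0

try:
    average(bad)      # error first detected upon subsequent use
except TypeError as err:
    print("detected during execution:", err)
```

In a language with manifest types, such as Pascal, the analogous declaration would already be rejected during compilation; the check that Python performs inside every operation is exactly the dynamic type checking that costs computation time.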


upon subsequent use, and the necessary dynamic type checking increases the computation time.

2.2.1 Elementary Types

Our purpose in this section and the next is to give an overview of the types usually found in programming languages and explore their `normal' properties. The reader should note in particular how these properties may be deduced from the language definition.

The elementary types can be partitioned according to the (theoretical) size of their value sets. A type is called finite if only a fixed number of values of this type exist; otherwise the type is (potentially) infinite. Finite types can be defined by enumeration of all of the values of the type. Examples are the type Boolean, whose value set is {true, false}, and the type character, with the entire set of characters permitted by an implementation as its value set. Almost all operations and properties of a type with n values can be defined by giving a 1-1 correspondence with the natural numbers 0, ..., n-1 and then defining operations using these ordinal numbers. This possibility does not imply that such a mapping is actually specified in every language; on the contrary, finite types are introduced primarily to represent value sets for which a numerical interpretation is meaningless. For example, the revised ALGOL 68 report defines no correspondence between truth values and the integers, but leaves its precise specification to the
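The ordinal construction just described is easy to make concrete. The following Python sketch (the class and method names are our own, purely illustrative) defines a finite type by enumerating its values and derives operations such as successor and ordering from the 1-1 correspondence with the ordinal numbers 0, ..., n-1:

```python
class FiniteType:
    """A finite type defined by enumerating its n values.

    Operations are defined via the 1-1 correspondence between
    the values and the ordinal numbers 0 .. n-1.
    """
    def __init__(self, *values):
        self.values = values                                  # ordinal -> value
        self.ordinal = {v: i for i, v in enumerate(values)}   # value -> ordinal

    def succ(self, v):
        # Successor via ordinals; undefined for the last value.
        i = self.ordinal[v]
        if i + 1 == len(self.values):
            raise ValueError("no successor for the last value")
        return self.values[i + 1]

    def less(self, a, b):
        # Ordering induced by the enumeration, not by the values themselves.
        return self.ordinal[a] < self.ordinal[b]

# The type Boolean, with value set {false, true}:
boolean = FiniteType("false", "true")
print(boolean.succ("false"))          # true
print(boolean.less("false", "true"))  # True
```

Note that the enumeration order alone induces the operations; no numeric meaning is attached to the values, which is precisely why finite types suit value sets for which a numerical interpretation is meaningless.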
