CSE351, Spring 2017
L07: Assembly Programming I
Assembly Programming I
CSE 351 Spring 2017
Instructor:
  Ruth Anderson
Teaching Assistants:
  Dylan Johnson
  Kevin Bi
  Linxing Preston Jiang
  Cody Ohlsen
  Yufang Sun
  Joshua Curtis
Administrivia
• Lab 1 due Friday (4/14)
  • Prelim submission (3+ of bits.c) due TONIGHT (4/10).
  • Turn in whatever you have at that time (dropbox closes at 11:59pm), no lates. Worth a small part (no more than 10%) of total points for Lab 1.
• Homework 2 due next Wednesday (4/19)
Roadmap
C:
    car *c = malloc(sizeof(car));
    c->miles = 100;
    c->gals = 17;
    float mpg = get_mpg(c);
    free(c);
Java:
    Car c = new Car();
    c.setMiles(100);
    c.setGals(17);
    float mpg = c.getMPG();
Assembly language:
    get_mpg:
        pushq %rbp
        movq  %rsp, %rbp
        ...
        popq  %rbp
        ret
Machine code:
    0111010000011000
    100011010000010000000010
    1000100111000010
    110000011111101000011111
OS:
Computer system:
Course topics:
  • Memory & data
  • Integers & floats
  • x86 assembly
  • Procedures & stacks
  • Executables
  • Arrays & structs
  • Memory & caches
  • Processes
  • Virtual memory
  • Memory allocation
  • Java vs. C
Translation
• What makes programs run fast(er)?
[Figure: a user program in C (.c file) passes through the C compiler and the assembler to become a .exe file that runs on the hardware. Stage labels: Code Time, Compile Time, Run Time.]
HW Interface Affects Performance
[Figure: source code, compiler, architecture, and hardware, with examples of each]
• Source code: Different applications or algorithms
  • C Language: Program A, Program B, Your program
• Compiler: Perform optimizations, generate instructions
  • GCC, Clang
• Architecture: Instruction set
  • x86-64
  • ARMv8 (AArch64/A64)
• Hardware: Different implementations
  • Intel Pentium 4, Intel Core 2, Intel Core i7, AMD Opteron, AMD Athlon
  • ARM Cortex-A53, Apple A7
Instruction Set Architectures
• The ISA defines:
  • The system's state (e.g. registers, memory, program counter)
  • The instructions the CPU can execute
  • The effect that each of these instructions will have on the system state
[Figure: CPU (PC, Registers) connected to Memory]
Instruction Set Philosophies
• Complex Instruction Set Computing (CISC): Add more and more elaborate and specialized instructions as needed
  • Lots of tools for programmers to use, but hardware must be able to handle all instructions
  • x86-64 is CISC, but only a small subset of instructions encountered with Linux programs
• Reduced Instruction Set Computing (RISC): Keep instruction set small and regular
  • Easier to build fast hardware
  • Let software do the complicated operations by composing simpler ones
General ISA Design Decisions
• Instructions
  • What instructions are available? What do they do?
  • How are they encoded? Instructions are data!
• Registers
  • How many registers are there?
  • How wide are they? Size of a word
• Memory
  • How do you specify a memory location? Different ways to build up an address
Mainstream ISAs
• x86-64 Instruction Set
  • Macbooks & PCs (Core i3, i5, i7, M)
• ARM Instruction Set
  • Smartphone-like devices (iPhone, iPad, Raspberry Pi)
• MIPS Instruction Set
  • Digital home & networking equipment (Blu-ray, PlayStation 2)
Definitions
• Architecture (ISA): The parts of a processor design that one needs to understand to write assembly code
  • "What is directly visible to software"
• Microarchitecture: Implementation of the architecture
  • CSE/EE 469, 470
• Are the following part of the architecture?
  • Number of registers? Yes
  • How about CPU frequency? No
  • Cache size? Memory size? No
Assembly Programmer's View
[Figure: the CPU (PC, Registers, Condition Codes) exchanges addresses, data, and instructions with Memory (Code, Data, Stack)]
• Programmer-visible state
  • PC: the Program Counter (%rip in x86-64)
    • Address of next instruction
  • Named registers
    • Together in "register file"
    • Heavily used program data
  • Condition codes
    • Store status information about most recent arithmetic operation
    • Used for conditional branching
• Memory
  • Byte-addressable array
  • Code and user data
  • Includes the Stack (for supporting procedures)
x86-64 Assembly "Data Types"
• Integral data of 1, 2, 4, or 8 bytes
  • Data values
  • Addresses (untyped pointers)
• Floating point data of 4, 8, 10 or 2x8 or 4x4 or 8x2 (not covered in CSE 351)
  • Different registers for those (e.g. %xmm1, %ymm2)
  • Come from extensions to x86 (SSE, AVX, ...)
• No aggregate types such as arrays or structures
  • Just contiguously allocated bytes in memory
• Two common syntaxes
  • "AT&T": used by our course, slides, textbook, gnu tools, ...
  • "Intel": used by Intel documentation, Intel tools, ...
  • Must know which you're reading
What is a Register?
• A location in the CPU that stores a small amount of data, which can be accessed very quickly (once every clock cycle)
• Registers have names, not addresses
  • In assembly, they start with % (e.g. %rsi)
• Registers are at the heart of assembly programming
  • They are a precious commodity in all architectures, but especially x86
x86-64 Integer Registers - 64 bits wide
• Can reference low-order 4 bytes (also low-order 2 & 1 bytes)

  64-bit  low 4 bytes      64-bit  low 4 bytes
  %rax    %eax             %r8     %r8d
  %rbx    %ebx             %r9     %r9d
  %rcx    %ecx             %r10    %r10d
  %rdx    %edx             %r11    %r11d
  %rsi    %esi             %r12    %r12d
  %rdi    %edi             %r13    %r13d
  %rsp    %esp             %r14    %r14d
  %rbp    %ebp             %r15    %r15d
Some History: IA32 Registers - 32 bits wide

  32-bit  16-bit  8-bit high  8-bit low  Name origin (mostly obsolete)
  %eax    %ax     %ah         %al        accumulate
  %ecx    %cx     %ch         %cl        counter
  %edx    %dx     %dh         %dl        data
  %ebx    %bx     %bh         %bl        base
  %esi    %si                            source index
  %edi    %di                            destination index
  %esp    %sp                            stack pointer
  %ebp    %bp                            base pointer

• Figure labels: general purpose; 16-bit virtual registers (backwards compatibility)
Memory vs. Registers

                      Memory                                       Registers
  Addresses vs. Names 0x7FFFD024C3DC                               %rdi
  Big vs. Small       ~8 GiB                                       (16 x 8 B) = 128 B
  Slow vs. Fast       ~50-100 ns                                   sub-nanosecond timescale
  Dynamic vs. Static  Can "grow" as needed while program runs      fixed number in hardware
Three Basic Kinds of Instructions
1) Transfer data between memory and register
  • Load data from memory into register
    • %reg = Mem[address]
  • Store register data into memory
    • Mem[address] = %reg
2) Perform arithmetic operation on register or memory data
  • c = a + b;    z = x << y;    i = h & g;
3) Control flow: what instruction to execute next
  • Unconditional jumps to/from procedures
  • Conditional branches
Remember: Memory is indexed just like an array of bytes!
Operand Types
• Immediate: Constant integer data
  • Examples: $0x400, $-533
  • Like C literal, but prefixed with '$'
  • Encoded with 1, 2, 4, or 8 bytes depending on the instruction
• Register: 1 of 16 integer registers
  • Examples: %rax, %r13
  • But %rsp reserved for special use
  • Others have special uses for particular instructions
• Memory: Consecutive bytes of memory at a computed address
  • Simplest example: (%rax)
  • Various other "address modes"
[Figure: the integer registers %rax, %rcx, %rdx, %rbx, %rsi, %rdi, %rsp, %rbp, %rN]
Moving Data
• General form: mov_ source, destination
  • Missing letter (_) specifies size of operands
  • Note that due to backwards-compatible support for 8086 programs (16-bit machines!), "word" means 16 bits = 2 bytes in x86 instruction names
  • Lots of these in typical code
• movb src, dst
  • Move 1-byte "byte"
• movw src, dst
  • Move 2-byte "word"
• movl src, dst
  • Move 4-byte "long word"
• movq src, dst
  • Move 8-byte "quad word"
movq Operand Combinations

  Source  Dest  Src, Dest            C Analog
  Imm     Reg   movq $0x4, %rax      var_a = 0x4;
  Imm     Mem   movq $-147, (%rax)   *p_a = -147;
  Reg     Reg   movq %rax, %rdx      var_d = var_a;
  Reg     Mem   movq %rax, (%rdx)    *p_d = var_a;
  Mem     Reg   movq (%rax), %rdx    var_d = *p_a;

• Cannot do memory-memory transfer with a single instruction
  • How would you do it?
Question
• Which of the following statements is TRUE?
  A. For float f, (f+2 == f+1+1) always returns TRUE
  B. The width of a "word" is part of a system's architecture (as opposed to microarchitecture)
  C. Having more registers increases the performance of the hardware, but decreases the performance of the software
  D. Mem to Mem (src to dst) is the only disallowed operand combination in x86-64
Summary
• Converting between integral and floating point data types does change the bits
  • Floating point rounding is a HUGE issue!
    • Limited mantissa bits cause inaccurate representations
    • Floating point arithmetic is NOT associative or distributive
• x86-64 is a complex instruction set computing (CISC) architecture
• Registers are named locations in the CPU for holding and manipulating data
  • x86-64 uses 16 64-bit wide registers
• Assembly operands include immediates, registers, and data at specified memory locations