DATA STRUCTURE - RECURSION BASICS Some computer programming languages allows a module or function to call itself This technique is known as recursion
We will take only a superficial look at recursion in this course since it provides a very neat way to represent certain useful numerical functions and data
What is the data structures used to perform recursion? Ans: Stack Because of its LIFO (Last In First Out) property it remembers its 'caller' so knows whom to
performed, using as few resources, both execution time and memory space, Some of the more commonly used data structures include lists, arrays, stacks,
What is the data structures used to perform recursion? Stack Because of its LIFO (Last In First Out) property it remembers its 'caller' so knows whom to return
memory needed to store data and the kind of data that will be stored in that memory location What is the data structures used to perform recursion?
Recursionisacentralconceptincomputationinwhichthesolutionofaproblemdependsonthesolutionofsmallercopies
ofthesameproblem.Recursionisaconceptuallydifferentapproachtothinkingaboutnumericalalgorithms. Itstandsin
contrasttoiterationwhichisthemorefamiliarwayofthinkinaboutalgorithms(atleasttomostpeoplewithbackgrounds
inmathematics,physicsorengineering).Allrecursiveprocedureshaveanalternativeformultionasiterativeprocedures.
Wewilltakeonlyasuperficiallookatrecursioninthiscoursesinceitprovidesaveryneatwaytorepresentcertainuseful
numericalfunctionsanddatastructures.ComputerSciencetextsgointomuchgreaterdepth. Mostmodernprogramminglanguagessupportrecursionbyallowingafunctiontocallitself.Arecursivefunctionusuallyconsistsofoneormorerecursivecases(valuesofthe inputsforwhichthefunctioncallsitself)and oneormore
so-calledbasecases(valuesofthe inputsforwhichthefunctionreturnsaresultstriviallywithoutrecurring).Animportant
considerationwhendesigningrecursivefunctionsistomakesurethatabasecaseisalwaysreachedforallowedvaluesoftheinputdata.Failuretodothiscaneasilyresultinnon-terminatinginfiniterecursionswhichcanberatherdifficultto
debug.Considerthefollowingclassicalexamplewhichimplementsthefactorialfunctionasarecursivefunction: intf(intn) { if (n == 0) // Base case {return1; } else {returnn*f(n-1);//Recursivecase } } Ifyou haveneverencounteredtheconceptofarecursivefunctionbefore,ittakessomethoughttoseethatf(n)=n!. Onewaytounderstandwhatishappeningistowriteout thesetoffunctionexecutionswhichoccurwhenthefunctioniscalled.Consider, forexamplewhenwecalculatef(4).Usingbracketstokeeptrackofthelevelsoftherecursionwehave
f(4) = 4 *f(3) //Input3callsrecursivecase = 4 *(3*(f(2))//Input2callsrecursivecase = 4 *(3*(2*f(1))//Input1callsrecursivecase = 4 *(3*(2*(1*f(0))))//Input0callsbase casereturning1 = 4 *(3*(2*(1*1))) //Multiplications nowgo frominside the out = 4 *(3*(2*1)) = 4 *(3*2) = 4 *6 = 24Wehavealreadyseensomelesstrivialexamplesofrecursivefunctionsinthe"divide-and-conquer"algorithmsintro-
ducedinSecs.1.2and1.3.Recursionisthenaturalframeworkforthinkingaboutdivide-and-conqueralgorithms. IfF(n)
representsthenumberofFLOPsrequiredtosolve aproblemofsizen,wehaveseenthatfordivide-and-conqueralgo-rithms,F(n)satisfiesarecurrencerelation.Intheliteratureontheanalysisofthecomputationalcomplexityofalgorithms,
theso-called"MasterTheorem"providesasymptoticboundsonthecomplexityofrecursivelydefineddivide-and-conquer
algorithms.Herewejustprovidethestatementforthepurposesofapplication. 12apointerpointsto.Theoperationofobtainingthevalueofavariablereferencedbyapointerisknownasdereferencing
thepointer.Mostoftheapplicationsofpointersexploitthefactthatitisoftenmuchcheaperintimeandspacetocopyand
dereferencepointersthanitistocopyandaccessthedatatowhichthepointerspoint.InC/C++thesyntaxfordefininganddereferencingapointerinvolvesthesymbol*.Thesyntaxforobtainingtheaddress
ofavariabletostoreinapointerinvolvesthesymbol&.Hereisanexample: inta = 6; //Defineanintaand set itsvalueto be 6 int *ptr=NULL;//Defineapointerptr and set itsvalueto be NULL ptr=&a;//Set ptrtopointto a cout << *ptr<Anyattempttousesuchuninitializedpointerscancauseunexpectedbehavior,eitherbecausetheinitialvalueisnota
validaddress,orbecauseusingitmaydamageotherpartsoftheprogram.Theresultisoftenasegmentationfaultorotherstorageviolation.Evenworse,insystemswithexplicitmemoryallocation(inCthisisimplementedusingthemalloc()
functionanditsvariants),itispossibletocreatea"dangling"pointerbyaccidentallydeallocatingthememoryregionitpoints
into.Thistypeofpointerisdangerousandsubtlebecauseadeallocatedmemoryregionmaycontainthesamedataasitdidbeforeitwasdeallocatedbutmaybethenreallocatedandoverwritten.Thiscanleadtothemosteviltypesof"random"
bugsinwhichaprogramsometimesrunscorrectly(ifthememorypointedtobythedanglingpointerhappensnottobeoverwrittenbeforeitisused)butsometimesrunsincorrectlyorcrasheswithasegmentationfault(ifthememorypointerto
bythedanglingpointerisreallocatedforsomeotherpurposebeforeitisused). Letusnowlookathowpointerscanbeusedtodouseful things.so-calledsinglylinkedlist),eachnodeconsistsofsomedataandapointertothenextnodeinthelist.Usuallythelastnode
inthelisthasaNULLpointerindicatingtheendofthelist:Insomerespectsalinkedlistsislike aconventionallineararrayinthesensethat isstoresasequenceofdataobjects.
Ithassomeadvantagesoveraconventionallineararrayhowever: •Linkedlistsaredynamicdatastructures.Thememoryisallocatedastheitemsareaddedtothelistsoitisnot necessarytoknowthelengthofthelistinadvance.•Elementscanbeefficientlyinsertedorremovedfromthelistwithoutreallocationorreorganizationoftherestofthe
elements.Thisisbecausethelistelementsdonotneedtobestored contiguouslyinmemory. •Theyformthebuildingblocksofmorecomplicateddatastructuresliketreesandnetworkswhicharenoteasily implementedaslineararrays.•Thebiggestdisadvantageis thatlinkedlistsonlyprovidesequentialaccesstolistelements.Togettonodeirequires
traversingthelistfromthebeginning.Thismakesdirectrandomaccesstolistelementsveryinefficient. •Becauseeachnodestoresapointerinadditiontoanitemofdata,alinkedlisttakesupmorememorythanalinear arraytostoreanequivalentamountofdata. Sinceeachnodeofthelistpointstoanothernodewhichpointstoanothernodeetc,itisnaturaltodefinealinkedlistasarecursivedatastructure:alinkedlistiseitherempty,oritisanodewhichpointstoalinkedlist.HereisasimpleC++
classwhichimplementssuchadefinitionforalinkedlistwhichstoresintegerdata:pointertoalistand aninteger,insertsanewitematthetailofthelistandthenreturnsapointertothemodifiedlist.One
waytodothisisviarecursion: List *List::insertTail(List*list,intN) { if(list==NULL)// Base case { List *new_node =newList();//Createanewnode new_node->n=N;//Storethedata returnnew_node;//Return the newnode } else//Recursivecase {//Recurseonlist->next list->next=insertTail(list->next,N); returnlist; } }Ifyoucanunderstandthispieceofcodethenyoureallyunderstandrecursivefunctions.Doesthiscodereallyinsertanew
nodeatthetailofthelist?Considerthefollowingargument: •Iflisthaslength0,thebasecaseholds.Themethodreturnstoalistconsistingofonlyanewnodecontainingthe integerN.Thisisthecorrectanswer. •Iflisthaslengthn>0thentherecursivecallisappliedtotheListlist.nextwhichhaslengthn-1. •Letusassumethatthisrecursivecallgivesthecorrectresultforalistoflengthn-1.ThatistosayweassumetherecursivecallreturnsapointertotheListlist.next(containingallnodesafterthefirst)withanewnodecontaining
theintegerNinsertedatitsrear.Giventhisassumption,storingapointertothis Listintolist->nextandreturninglist
producesthecorrectresultforthelistoflengthn. •Byinduction,wethereforeobtainthecorrectresultforaListofanyfinitelength.Manyoperationsonlinkedlistsaremosteasilywrittenusingrecursivemethods.Thisisnatural,becauselinkedlistsare
themselvesdefinedrecursively.Abasicimplementationoftheaboveclasscanbe downloadedfromtheclasswebsitefor
youtoplayaroundwith.ItisfairlyeasytoimplementlinkedlistsinalternativewaystotheapproachdescribedinSec.2.2whichavoidrecursion.
Somemorecomplicateddatastructureshowever,areverydifficuilttoimplementwithoutrecursion.Oneofthesimplestand
mostusefulofsuchstructuresisabinarytree.Abinarytreeisatreedatastructureinwhicheachnodehasatmosttwochildren,whicharereferredtoastheleftchildandtherightchild.Thetopnodeinthetreeiscalledtherootnode.Aswith
linkedlists,abinarytreedatastructureusuallycontainssomedataateachnodeinadditiontopointerstotheleftandright
childnodes.Hereisasketchofasmalltreewith8nodesstoringthelistofdataitems{d1,d2,...d8}in"breadth-first"
searchorder:Binarytreeshaveusefulapplicationsinsearchandsortingproblems.Oneparticularlyimportantapplicationoftreesin
searchisastructurecalledabinarysearchtree.ConsideranarrayofNkey-valuepairs,{(k1,d1),(k2,d2),...(kN,dN)}
whichcanbesortedonthekey values.Thisistosay,thelistcanbearrangedsothatk1≤k2≤...≤kN.Forarandom
key value,k,betweenk1andkNwe wishtoreturnthedatavalueassociatedwiththekeyk.Thenaiveapproachwould betosimplyiteratethroughthesortedlistonebyone,comparingthekeyofeachlistelementwithkuntilweeitherfind onethatmatchesorreachavaluewhichexceedsk(inwhichcasethekeykisnotinthelist).Onaveragethisisclearly anO(N)operation.Bystoringthelistinabinarysearchtree,thisoperationcanbeaccomplishedinO(log(N))time.Abinarysearchtree,sometimesalsocalledanorderedorsortedbinarytree,isabinarytreewithakey-valuepairateach
nodewithoutthespecialpropertythatthekeyofanynodeislargerthanthekeysofallnodesinthatnode"sleftchildand
smallerthanthekeysofallnodesinthatnode"srightchild.Asmallmodificationoftheaboveincorporatesthesearchkey
(wetakethekeytobe aninteger): class BinaryTree { intkey;examiningtherootnode. Ifthetreeisnull,thekeywearesearchingfordoesnotexistinthetree.Otherwise,ifthekey
equalsthatoftheroot,thesearchissuccessfulandwereturnthevalueatthatnode. Ifthekeyislessthanthatoftheroot,
wesearchtheleftsubtree.Similarly,ifthekeyisgreaterthanthatoftheroot,wesearchtherightsubtree.Thisprocess
isrepeateduntilthekeyisfoundortheremainingsubtreeisnull.Ifthesearchedkeyisnotfoundbeforea nullsubtreeis
reached,thentheitemmustnotbepresentinthetree.Thisiseasilyexpressedasarecursivealgorithmwhichreturnsa pointertothenodecontainingthematchingkey(ortheNULLpointerinthecasethatthekeyisnotmatched):Assumingthatthetreeisbalanced,boththeleftandrightchildtreeswillhaveapproximatelyhalfthenumberofnodes.The
amountofworkwhichwehavetodoateachiterationistherefore(atmost)4comparisonsfollowedbyasearchonatree ofhalfthesize.MathematicallyThefollowingproblemarisessofrequentlyinsimulationofstochasticprocesses(see[2])thatitisworthdevotingalittle
timetostudyhowtosolveitefficiently.Wearegivenalistofpairs,{(x1,k1),(x2,k2),...(xN,kN)},inwhicheach xiisarealnumberandeachkiisaninteger.Intheapplicationtostochasticsimulation,wehave alistofpossibleevents
whichcanoccurinthesystemlabelledk1,...,kNand anassociatedlistofratesx1,...,xN,atwhichtheseevents occur.Fromthelistofrateswecancalculatethelistofpartialsums, {si:i=0,1...N}wheresi=i? j=1x j, (wetakes0=0).Thecomputationalchallengeisthefollowing:givena numberx?(0,sN],canweefficientlyfindisuchthatsi1 applicationstostochasticsimulationweusuallywanttoreturntheindexkisincethisdetermineswhicheventhappendnext. SuchastructureiscalledaFenwicktree. Ifweleaveasidethequestionofhowtobuildthetreestartingfromtheinitial list,theprocessofsearchingfortheintervalcontainingaparticularvaluexisrathersimilartothebinarytreesearch.We comparextothesumofthevaluesstoredinleftchildandifxislessthanthisvaluethenwesearchforxontheleftbranch. Ifnot,thenwesubtractfromxthevalueofallthenodesontheleftbranchandsearchfortheresultontherightbranch.The subtractionisthebitwhichiseasytogetwronghere.Checkthediagramabovetoconvinceyourselfthatthisiscorrect. treeisavailablefromtheclasswebsiteforyoutoplayaroundwith.Inparticular,thisimplementationcontainsarecursive functionwhichbuildsthetreeinthefirstplace.BuildingthetreeisanO(N)operation.Anotherimportantpointabout FenwicktreesandbinarysearchtreesingeneralwhichmakesthempracticalforefficientimplementationofMonteCarlo simulationsis thatonceatreehasbeenbuilt,theoperationsofinsertion,deletionandupdatingofnodescanalsobe done inO(log(N))time.Thismeansthatitisnotnecessarytorebuildthetreeeachtimetheunderlyingarrayofrateschanges. andsearchbecomesequivalenttolinearsearch).Inpractice,thetreemayneedtoberebalancedatoccasionalintervals.Notes2:Introductiontodatastructures
MA934NumericalMethodsNotes2
Figure2.2:Comparisonoftheperformanceofbinarytreesearch(bluesquares)andlinearsearch(redcircles)forthe problemofsearchingtherunningtotalatanarrayofrealnumbers. DatumBinaryTree::search(BinaryTree*T,doublex)
{ if(T->L==NULL&&T->R==NULL) { // Basecase:we havereacheda leafnode.Return thedatastoredthere returnT->data; } else { //Recursivecase:decidewhetherto go left orright. doubley=(T->L->data).x;//Get the sumoftheleftchild if(x <= y) { //Searchleftchild return search(T->L, x); } else { //Searchright child return search(T->R,x-y); } } } Whendoinglargescalestochasticsimulations,theextraworkassociatedwithimplementingsuchadatastructureis wellworththeincreaseinspeedoftheresultingcode.Fig.2.2showstheresultsofdoingsomeperformancemeasure- mentsofsearchonaFenwicktreecomparedtonaivelinearsearch.Ifyou havenotusedbinarysearchalgorithmsinyour programmingbefore,theincreaseinspeedforlargearraysistrulyawesometobehold!AnimplementationofaFenwick Notes2:Introductiontodatastructures
MA934NumericalMethodsNotes2
Bibliography
[1]K.BogartandC.Stein.CS21/math19-coursenotes.https://math.dartmouth.edu/archive/ m19w03/public_html/book.html,2003. [2]J.L.Blue, I.Beichl,andF.Sullivan.Fastermontecarlosimulations.Phys.Rev.E,51(2):R867-R868,1995. Notes2:Introductiontodatastructures