J. M. Bernardo, M. J. Bayarri, J. O. Berger, A. P. Dawid,
D. Heckerman, A. F. M. Smith and M. West (Eds.)

Bayesian Variable Selection for Random Intercept Modeling of Gaussian and non-Gaussian Data

Department of Applied Statistics and Econometrics, Johannes Kepler Universität Linz, Austria


The paper considers Bayesian variable selection for random intercept models both for Gaussian and non-Gaussian data. For Gaussian data the model reads

$$y_{it} = x_{it}\alpha + \beta_i + \varepsilon_{it}, \qquad \varepsilon_{it} \sim N(0, \sigma^2_\varepsilon), \tag{1}$$

where $y_{it}$ are repeated responses observed for $N$ units (e.g. subjects) $i = 1, \ldots, N$ on $T_i$ occasions $t = 1, \ldots, T_i$. $x_{it}$ is the $(1 \times d)$ design matrix for an unknown regression coefficient $\alpha = (\alpha_1, \ldots, \alpha_d)'$ of dimension $d$, including the overall intercept. For each unit, $\beta_i$ is a subject-specific deviation from the overall intercept.

For efficient estimation it is necessary to specify the distribution of heterogeneity $p(\beta_1, \ldots, \beta_N)$. As usual we assume that $\beta_1, \ldots, \beta_N | \theta$ are independent given a random hyperparameter $\theta$ with prior $p(\theta)$. Marginally, the random intercepts $\beta_1, \ldots, \beta_N$ are dependent and $p(\beta_1, \ldots, \beta_N)$ acts as a smoothing prior which ties the random intercepts together and encourages shrinkage of $\beta_i$ toward the overall intercept by "borrowing strength" from observations of other subjects. A very popular choice is the following standard random intercept model:

$$\beta_i | Q \sim N(0, Q), \qquad Q \sim G^{-1}(c_0, C_0), \tag{2}$$

which is based on assuming conditional normality of the random intercept.

Several papers deal with the issue of specifying alternative smoothing priors $p(\beta_1, \ldots, \beta_N)$, because misspecifying this distribution may lead to inefficient and, for random intercept models for non-Gaussian data, even to inconsistent estimation of the regression coefficient $\alpha$, see e.g. Neuhaus et al. (1992). Recently, Komárek and Lesaffre (2008) suggested using finite mixtures of normal priors for $p(\beta_i | \theta)$ to handle this issue. In the present paper we also deviate from the commonly used normal prior (2) and consider more general priors. However, in addition to correct estimation of $\alpha$, our focus will be on Bayesian variable selection.

The Bayesian variable selection approach is commonly applied to a standard regression model where $\beta_i$ is equal to 0 in (1) for all units and aims at separating

S. Frühwirth-Schnatter and H. Wagner

non-zero regression coefficients $\alpha_j \neq 0$ from zero regression coefficients $\alpha_j = 0$. By choosing an appropriate prior $p(\alpha)$, it is possible to shrink some coefficients $\alpha_r$ toward 0 and in this way identify relevant coefficients. Common shrinkage priors are spike-and-slab priors (Mitchell and Beauchamp, 1988; George and McCulloch, 1993, 1997; Ishwaran and Rao, 2005), where a spike at 0 (either a Dirac measure or a density with very small variance) is combined with a slab, i.e. a density with large variance. Alternatively, unimodal shrinkage priors have been applied, like the double exponential or Laplace prior leading to the Bayesian Lasso (Park and Casella, 2008) or the more general normal-gamma prior (Griffin and Brown, 2010); see also Fahrmeir et al. (2010) for a recent review.

Subsequently we consider variable selection for the random intercept model (1). Although this also concerns $\alpha$, we will focus on variable selection for the random effects which, to date, has been discussed only by a few papers. Following Kinney and Dunson (2007), Frühwirth-Schnatter and Tüchler (2008), and Tüchler (2008) we could consider variable selection for the random intercept model as a problem of variance selection. Under prior (2), for instance, a single binary indicator $\delta$ could be introduced where $\delta = 0$ corresponds to $Q = 0$, while $\delta = 1$ allows $Q$ to be different from 0. This implies variable selection for the random intercept, because setting $\delta = 0$ forces all $\beta_i$ to be zero, while for $\delta = 1$ all random intercepts $\beta_1, \ldots, \beta_N$ are allowed to be different from 0.

In the present paper we are interested in a slightly more general variable selection problem for random effects. Rather than discriminating as above between a model where all random effects are zero and a model where all random effects are different from 0, it might be of interest to make unit-specific selection of random effects in order to identify units which are "average" in the sense that they do not deviate from the overall mean, i.e. $\beta_i = 0$, and units which deviate significantly from the "average", i.e. $\beta_i \neq 0$.

In analogy to variable selection in the standard regression model, we will show that individual shrinkage for the random effects can be achieved through appropriate selection of the prior $p(\beta_i | \theta)$ of the random effects. For instance, if $p(\beta_i | Q)$ is a Laplace rather than a normal prior as in (2) with a random hyperparameter $Q$, we obtain a Bayesian Lasso random effects model where the smoothing additionally allows individual shrinkage of the random intercept toward 0 for specific units. However, as for a standard regression model, too much shrinkage takes place for the non-zero random effects under the Laplace prior. For this reason we investigate alternative shrinkage-smoothing priors for the random intercept model, like the spike-and-slab random effects model, which is closely related to the finite mixtures of random effects model investigated by Frühwirth-Schnatter et al. (2004) and Komárek and Lesaffre (2008).
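As a minimal sketch of the data-generating process, model (1) with the standard prior (2) can be simulated as follows. The function name, the balanced design with $T_i = T$, the standard normal covariates, and the reading of $G^{-1}(c_0, C_0)$ as an inverted Gamma distribution with shape $c_0$ and scale $C_0$ are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def simulate_random_intercept(N, T, alpha, c0, C0, sigma2_eps, rng):
    """Draw one data set from model (1) under the normal smoothing prior (2).

    Hypothetical helper: balanced panel (T_i = T for all units) and
    standard normal covariates are illustrative assumptions.
    """
    alpha = np.asarray(alpha, dtype=float)
    d = alpha.size
    n_obs = N * T
    # Q ~ G^{-1}(c0, C0), assumed shape c0 and scale C0,
    # i.e. 1/Q ~ Gamma(shape=c0, rate=C0).
    Q = 1.0 / rng.gamma(shape=c0, scale=1.0 / C0)
    # Random intercepts beta_i | Q ~ N(0, Q).
    beta = rng.normal(0.0, np.sqrt(Q), size=N)
    # unit[k] is the index i of the unit that observation k belongs to.
    unit = np.repeat(np.arange(N), T)
    # First column of X is the overall intercept.
    X = np.column_stack([np.ones(n_obs), rng.normal(size=(n_obs, d - 1))])
    eps = rng.normal(0.0, np.sqrt(sigma2_eps), size=n_obs)
    y = X @ alpha + beta[unit] + eps
    return y, X, unit, beta, Q
```

Because all $\beta_i$ share the same draw of $Q$, they are dependent once $Q$ is integrated out, which is the smoothing effect described above.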



[...] effects approach could be applied, meaning that each unit-specific parameter $\beta_i$ is treated just as another regression coefficient and the high-dimensional parameter $\alpha^\star = (\alpha, \beta_1, \ldots, \beta_N)$ is estimated from a large regression model without any random intercept:

$$y_{it} = x^\star_{it}\alpha^\star + \varepsilon_{it}, \qquad \varepsilon_{it} \sim N(0, \sigma^2_\varepsilon). \tag{3}$$

We could then perform variable selection for $\alpha^\star$ in the large regression model (3), in which case a binary variable selection indicator $\delta_i$ is introduced for each random effect $\beta_i$ individually. This appears to be the solution to the variable selection problem addressed in the introduction; however, variable selection in (3) is not entirely standard: first, the dimension of $\alpha^\star$ grows with the number $N$ of units; second, an information imbalance between the regression coefficients $\alpha_j$ and the random intercepts $\beta_i$ is present, because the number of observations is $\sum_{i=1}^{N} T_i$ for $\alpha_j$, but only $T_i$ for $\beta_i$. This makes it difficult to choose the prior $p(\alpha^\star)$. Under a (Dirac) spike-and-slab prior for $p(\alpha^\star)$, for instance, a prior has to be chosen for all non-zero coefficients in $\alpha^\star$. An asymptotically optimal choice in a standard regression model is Zellner's g-prior; however, the information imbalance between $\alpha_j$ and $\beta_i$ makes it impossible to choose a value for $g$ which is suitable for all non-zero elements of $\alpha^\star$.

The information imbalance suggests choosing the prior for the regression coefficients independently from the prior for the random intercepts, i.e. $p(\alpha^\star) = p(\alpha)\,p(\beta_1, \ldots, \beta_N)$. Variable selection for $\beta_i$ in the large regression model (3) is then controlled through the choice of $p(\beta_1, \ldots, \beta_N)$, which is exactly the same problem as choosing the smoothing prior in the original random intercept model (1). This motivated us to use common shrinkage priors in Bayesian variable selection as smoothing priors in the random intercept model and to study how this choice affects shrinkage for the random intercept.

Practically all priors have a hierarchical representation where

$$\beta_i | \psi_i \sim N(0, \psi_i), \qquad \psi_i | \theta \sim p(\psi_i | \theta), \tag{4}$$

$\beta_i | \psi_i$ and $\beta_j | \psi_j$ are independent and $p(\psi_i | \theta)$ depends on a hyperparameter $\theta$. The goal is to identify choices of $p(\psi_i | \theta)$ which lead to strong shrinkage if many random intercepts are close to zero, but introduce little bias if all units are heterogeneous.

Note that the marginal distribution

$$p(\beta_i | \theta) = \int p(\beta_i | \psi_i)\, p(\psi_i | \theta)\, d\psi_i$$

is non-Gaussian and that the joint density $p(\beta_1, \ldots, \beta_N)$ is a smoothing prior in the standard sense only if at least some components of the hyperparameter $\theta$ are random.
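The non-Gaussian marginal can be illustrated by Monte Carlo. The sketch below assumes that $\beta_i$ is conditionally normal with variance $\psi_i$ and, as one example, uses an exponential mixing distribution on $\psi_i$, which is the standard scale-mixture representation of the Laplace prior. The function names, the sample size and the unit mean of the exponential distribution are illustrative assumptions.

```python
import numpy as np

def draw_scale_mixture(draw_psi, n, rng):
    """Draw beta_i by first drawing psi_i, then beta_i | psi_i ~ N(0, psi_i)."""
    psi = draw_psi(n, rng)
    return rng.normal(0.0, np.sqrt(psi))

def excess_kurtosis(x):
    """Sample excess kurtosis; close to 0 for Gaussian draws."""
    z = x - x.mean()
    return np.mean(z**4) / np.mean(z**2) ** 2 - 3.0

rng = np.random.default_rng(0)
n = 200_000

# Degenerate mixing (psi_i fixed at 1): the marginal stays Gaussian.
fixed = draw_scale_mixture(lambda n, rng: np.full(n, 1.0), n, rng)

# Exponential mixing on the variance: the marginal is Laplace,
# with heavier tails than the Gaussian.
mixed = draw_scale_mixture(lambda n, rng: rng.exponential(1.0, size=n), n, rng)

print("excess kurtosis, fixed psi:      ", excess_kurtosis(fixed))
print("excess kurtosis, exponential psi:", excess_kurtosis(mixed))
```

The excess kurtosis is near zero when $\psi_i$ is fixed and clearly positive once $\psi_i$ is random, reflecting the heavier tails of the mixture.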



3.1. Non-Gaussian Shrinkage Priors

This subsection deals with unimodal non-Gaussian shrinkage priors which put a lot of prior mass close to 0, but have heavy tails. Such a prior encourages shrinkage of insignificant random effects toward 0 and, at the same time, allows the remaining random effects to deviate considerably from 0. For such a prior, the posterior mode of $p(\beta_i | y_i, \theta)$ is typically equal to 0 with positive probability. We call such a prior a non-Gaussian shrinkage prior.

Choosing the inverted Gamma prior $\psi_i | \nu, Q \sim G^{-1}(\nu, Q)$ leads to the Student-t random intercept model where

$$\beta_i | \nu, Q \sim t_{2\nu}(0, Q/\nu). \tag{5}$$
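Relation (5) can be checked numerically. The following sketch compares draws from the two-stage representation with direct Student-t draws, assuming that $\beta_i | \psi_i \sim N(0, \psi_i)$, that $G^{-1}(\nu, Q)$ denotes an inverted Gamma distribution with shape $\nu$ and scale $Q$, and that the second argument of $t_{2\nu}(0, Q/\nu)$ is the squared scale. The function names and the values of $\nu$ and $Q$ are hypothetical.

```python
import numpy as np

def draw_beta_hierarchical(nu, Q, n, rng):
    """Two-stage draw: psi_i ~ G^{-1}(nu, Q), then beta_i | psi_i ~ N(0, psi_i)."""
    # Assumed parameterization: 1/psi_i ~ Gamma(shape=nu, rate=Q).
    psi = 1.0 / rng.gamma(shape=nu, scale=1.0 / Q, size=n)
    return rng.normal(0.0, np.sqrt(psi))

def draw_beta_student(nu, Q, n, rng):
    """Direct draw from t_{2 nu}(0, Q / nu), with Q / nu read as squared scale."""
    return np.sqrt(Q / nu) * rng.standard_t(df=2.0 * nu, size=n)

rng = np.random.default_rng(42)
nu, Q, n = 3.0, 2.0, 400_000
probs = np.array([0.05, 0.25, 0.5, 0.75, 0.95])

q_hier = np.quantile(draw_beta_hierarchical(nu, Q, n, rng), probs)
q_t = np.quantile(draw_beta_student(nu, Q, n, rng), probs)

# Largest absolute difference between the two sets of empirical quantiles.
max_gap = np.max(np.abs(q_hier - q_t))
print("quantiles (hierarchical):", np.round(q_hier, 3))
print("quantiles (Student-t):   ", np.round(q_t, 3))
print("largest gap:", max_gap)
```

Under these assumptions the two sets of empirical quantiles agree up to Monte Carlo error.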



