Difference-in-Differences in Stata 17

16 June 2021. Two-way fixed effects, also known as generalized DID (default). Allows 2x2 design. Provides a wide range of standard errors.


Difference in differences (DID) The coefficient for 'did' is the differences-in-differences estimator. ... The command diff is user-defined for Stata.

Differences-in-Differences (using Stata)

Differences-in-Differences. (using Stata) Difference in differences (DID) ... The coefficient for 'did' is the differences-in-differences estimator.

Simplifying the estimation of difference in differences treatment

22 Jan. 2013 Propensity Score (Heckman et al. 1997

Diff: Simplifying the Estimation of Difference-in-differences

12 March 2014 Although the latest version of Stata is equipped with the command teffects, which estimates the treatment effects on a cross-sectional basis


1 March 2018 Regression Discontinuity. • Today we'll focus on difference-in-differences. - Reminder on basic concepts/theory. - Applications in Stata.

Bacon decomposition for understanding differences-in-differences

differences-in-differences with variation in treatment timing. July 11 2019. Stata Conference. Andrew Goodman-Bacon (Vanderbilt University).

csdid: Difference-in-Differences with Multiple Time Periods in Stata

Today's talk is all about how to implement it with our Stata command csdid. Framework and Assumptions.

Stata Tutorial

Do-files are ASCII files that contain Stata commands to run specific procedures. ... used to indicate a significant difference (some use ±3).

Module 2.5: Difference-in-Differences Designs

• We reproduce only part of the Stata code below; please refer to the do-file for the complete code and accompanying notes. • Open the dataset and

Title stata.com didregress: Difference-in-differences estimation

These two differences give the DID method its name and highlight its intuitive appeal. More appealing is the fact that you can get the effect of interest, the ATET, from one parameter in a linear regression. Below we illustrate how to use didregress and xtdidregress. For more information about the methods used below, see [TE] DID intro.

(v 33) - Princeton University

This document shows how to perform difference-in-differences regression in the following two situations: (1) the event happened at the same time for all treated groups; (2) the event is staggered across groups. Event happens at the same time for all treated groups. Data preparation. The before/after variable: create an indicator variable where:

Introduction to Difference in Differences (DID) Analysis

• Difference-in-Differences (DID) analysis is a useful statistical technique that analyzes data from a nonequivalent control group design and makes a causal inference about the effect of an independent variable (e.g., an event, treatment, or policy) on an outcome variable. • The analytic concept of DID is very easy to comprehend within the framework

Diff: simplifying the causal inference analysis with - Stata

Difference in differences: Quantile Kernel PSM Diff-in-diff
diff fte, t(treated) p(t) qdid(0.50) cov(bk kfc roys) kernel id(id)
*** KERNEL PROPENSITY SCORE MATCHING QUANTILE DIFFERENCE-IN-DIFFERENCES ***
Number of observations: 801
            Baseline   Follow-up   Total
Control:    78         77          155
Treated:    326        320         646

... differences estimator ('did' in the previous example). The effect is significant at 10% with the treatment having a negative effect.
Using the command diff: the command diff is user-defined for Stata. To install, type: ssc install diff
Dummies for treatment and time: see previous slide.
diff y, t(treated) p(time)
Number of observations in the DIFF-IN-DIFF: 70 Baseline Follow-up


IAPRI-MSU Technical Training


1 March 2018
Indaba Agricultural Policy Research Institute
Lusaka, Zambia

Recall from July/September trainings on introduction to impact evaluation (1)


Recall from July/September trainings on introduction to impact evaluation (2)

• We did a brief overview of common methods of impact evaluation (IE)
  - Randomized evaluation
  - Propensity Score Matching
  - Difference-in-Differences
  - Instrumental Variables
  - Regression Discontinuity

• Today we'll focus on difference-in-differences
  - Reminder on basic concepts/theory
  - Applications in Stata

Learning objectives

• By the end of today's session, you should be able to:

1. Understand the key assumption of DID models
2. Write down the regression equation for a DID
3. Identify which parameter in the regression
4. Estimate a DID model in Stata and interpret the




• Khandker, S.R., Koolwal, G.B. and Samad, H.A., 2010.

Other suggested readings and references - DID sections in:
• Angrist, J.D., & Pischke, J.S. (2009). Mostly harmless
• Angrist, J.D., & Pischke, J.S. (2015). Mastering 'metrics: The
• Imbens, G.W., & Wooldridge, J. (2007). What's new in
• Ravallion, M. (2008). Evaluating anti-poverty programs.

REVIEW: With vs. Without

• The key comparison we want to make in IE is between outcomes WITH VS. WITHOUT the intervention (project/program/policy)
• Impact = "With" outcome - "without" outcome

Participants' income WITH the program?
• $Y_4$

Participants' income WITHOUT the program (counterfactual income)?
• $Y_2$

Program impact?
• $Y_4 - Y_2$

Introducing Difference-in-Differences (DID)

• Who has used DID before and what were you studying?
• What is the DID approach to constructing a comparison group/
  - "The DID estimator relies on a comparison of participants and
  - DID Impact = (avg. ΔY participants) - (avg. ΔY non-participants)
• (avg. $Y_T$ after - avg. $Y_T$ before) - (avg. $Y_c$ after - avg. $Y_c$ before)
• → why it's called difference-in-differences or double difference


DID - visual representation


DID impact = (avg. $Y_T$ after - avg. $Y_T$ before) - (avg. $Y_c$ after - avg. $Y_c$ before)

Change in participants' income?
• $Y_4 - Y_0$

Change in non-participants' (control) income?
• $Y_3 - Y_1$

DID impact?
• $(Y_4 - Y_0) - (Y_3 - Y_1)$

DID key assumption: parallel (common) trends

• Parallel trends = "unobserved characteristics affecting program participation do not vary over time with treatment status" (Khandker et al. 2010, p. 73)
• I.e., trends in the outcome variable would be the same in the two groups without treatment (Angrist & Pischke 2009); or
• "... treatment and control outcomes move in parallel in the absence of treatments" (Angrist & Pischke 2015, p. 178)
• Implies $(Y_1 - Y_0) = (Y_3 - Y_2)$


Change in participants' income (after - before)?
• $Y_4 - Y_0$

Change in non-participants' (control) income (after - before)?
• $Y_3 - Y_1$

DID impact?
• $(Y_4 - Y_0) - (Y_3 - Y_1)$
• $= Y_4 - Y_0 - Y_3 + Y_1 = Y_4 - Y_3 + Y_1 - Y_0$

Substituting $(Y_1 - Y_0) = (Y_3 - Y_2)$ (parallel trend assumption):
• $= Y_4 - Y_3 + Y_3 - Y_2 = Y_4 - Y_2$
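The algebra above can be checked with a few numbers. The following is a minimal Python sketch using made-up income values (the figures and the function name `did_estimate` are hypothetical and do not come from the training data). It computes the DID estimate from the four observed means, confirms that it equals the true impact $Y_4 - Y_2$ when the parallel trends condition holds, and shows the bias that appears when it does not.

```python
def did_estimate(y0, y1, y3, y4):
    """DID impact = (Y4 - Y0) - (Y3 - Y1).

    y0: participants' income before
    y4: participants' income after (with the program)
    y1: non-participants' income before
    y3: non-participants' income after
    """
    return (y4 - y0) - (y3 - y1)


# Hypothetical values chosen so that parallel trends holds:
# (Y1 - Y0) == (Y3 - Y2), i.e. the gap between groups is constant without treatment.
y0, y1 = 100.0, 120.0   # before: participants, non-participants
y2, y3 = 130.0, 150.0   # after: participants WITHOUT program (counterfactual), non-participants
y4 = 145.0              # after: participants WITH program

assert (y1 - y0) == (y3 - y2)        # parallel trends condition
true_impact = y4 - y2                # what we would like to measure
did = did_estimate(y0, y1, y3, y4)   # what DID measures from observed data
print(did, true_impact)

# If participants would have grown faster even without the program,
# the counterfactual is higher and DID overstates the impact.
y2_not_parallel = 135.0
bias = did - (y4 - y2_not_parallel)
print(bias)
```

Note that $Y_2$ is never observed in practice; it appears here only because the data are invented, which is what makes the comparison with the true impact possible.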



What happens if trends are not parallel?

• DID estimate is biased
• → See next 2 slides and handout based on figures in



[Figure: case where selection bias is the same and DID estimate = Treatment effect. Source: Ravallion (2008)]

[Figure: case where DID estimate and Treatment effect differ (here DID estimate < Treatment effect). Source: Ravallion (2008)]


Can partially test for parallel trends if have multiple pre-treatment waves of data

• Use 3 waves of data:
  - 2 waves prior to implementation of NAAIAP (2004 & 2007 TAPRA surveys)
  - 1 wave after (during) implementation (2010 TAPRA survey)
• Regress change in outcome variable (2007 minus 2004) on dummy for if HH

• While no guarantee that trends would have been parallel 2007 to 2010 in
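A pre-trend check of this kind can be sketched in a few lines. The Python example below uses invented household records (not TAPRA data) and assumes the dummy marks households that later participate in the program; the name `pretrend_gap` is hypothetical. With a single binary regressor, the OLS slope equals the difference in mean pre-period changes between the two groups, so no regression routine is needed for the point estimate.

```python
from statistics import mean

# Hypothetical household records: outcome in two PRE-treatment waves and a
# dummy for later participation. These values are invented for illustration.
households = [
    {"y2004": 10.0, "y2007": 12.0, "participant": 1},
    {"y2004": 8.0,  "y2007": 11.0, "participant": 1},
    {"y2004": 15.0, "y2007": 17.0, "participant": 0},
    {"y2004": 9.0,  "y2007": 12.0, "participant": 0},
]


def pretrend_gap(rows):
    """Slope from regressing (y2007 - y2004) on the participant dummy.

    With one binary regressor this equals the difference in mean
    pre-period changes between participants and non-participants.
    """
    change_t = [r["y2007"] - r["y2004"] for r in rows if r["participant"] == 1]
    change_c = [r["y2007"] - r["y2004"] for r in rows if r["participant"] == 0]
    return mean(change_t) - mean(change_c)


gap = pretrend_gap(households)
print(gap)
```

A gap near zero is consistent with parallel pre-treatment trends. In a real application the regression would also report a standard error, and the judgment would rest on whether the coefficient is statistically distinguishable from zero.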


DID simple numerical example



DID impact = (avg. $Y_T$ after - avg. $Y_T$ before) - (avg. $Y_c$ after - avg. $Y_c$ before)

DID data requirements

• Repeated cross-sections
  - Separate random samples from the relevant
OR
• Panel data
  - Random sample from the relevant population and


Regression DID - Basic Setup

$Y_{it} = \alpha + \gamma\,\text{Treated}_i + \lambda\,\text{After}_t + \delta\,(\text{Treated}_i \times \text{After}_t) + \varepsilon_{it}$

Where:
• i indexes the cross-sectional unit and t indexes time
• Treated = 1 if unit is ultimately treated (exposed to project/program/policy change),
• After = 1 if time period is after the project/program/policy change,
• Treated × After is the interaction of these two variables
• *Note: Notation above is for when "treatment" or the project/program/policy change

• Which parameter represents the causal effect of interest (assuming the key assumptions hold)?

Regression DID - Basic Setup - with higher level project/program/policy change

$Y_{idt} = \alpha + \gamma\,\text{Treated}_d + \lambda\,\text{After}_t + \delta\,(\text{Treated}_d \times \text{After}_t) + \varepsilon_{idt}$

• Which parameter represents the causal effect of interest (assuming the key assumptions hold)?
• This is the more common instance in which DID is used
• Would want to cluster your standard errors at the district level
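To see why the coefficient on the Treated × After interaction is the DID estimate, the basic 2x2 regression can be run on a tiny invented dataset. The sketch below is a plain-Python illustration, not Stata output: the eight observations are hypothetical, and `ols` is a small helper written for this example that solves the normal equations directly. It does not compute standard errors, clustered or otherwise.

```python
def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Hypothetical observations: (treated, after, outcome)
data = [
    (0, 0, 10.0), (0, 0, 12.0),   # control, before  (mean 11)
    (0, 1, 14.0), (0, 1, 16.0),   # control, after   (mean 15)
    (1, 0, 20.0), (1, 0, 22.0),   # treated, before  (mean 21)
    (1, 1, 30.0), (1, 1, 34.0),   # treated, after   (mean 32)
]

# Regressors: constant, Treated, After, Treated x After
X = [[1.0, t, a, t * a] for t, a, _ in data]
y = [out for _, _, out in data]
alpha, gamma, lam, delta = ols(X, y)

# The same quantity from the four cell means
did_from_means = (32.0 - 21.0) - (15.0 - 11.0)
print(round(delta, 6), did_from_means)
```

In this saturated 2x2 model the intercept recovers the control group's pre-period mean, the Treated coefficient the pre-period gap, the After coefficient the control group's change, and the interaction coefficient the difference in changes.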



Regression DID - Basic Setup - with covariates


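$Y_{idt} = \alpha + \gamma\,\text{Treated}_d + \lambda\,\text{After}_t + \delta\,(\text{Treated}_d \times \text{After}_t) + \mathbf{X}_{idt}\boldsymbol{\beta} + \varepsilon_{idt}$

• Bold represents vectors
• Which parameter represents the causal effect of interest (assuming the key assumptions hold)?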



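Panel FE setup without control variables

$Y_{it} = \alpha + \lambda\,\text{After}_t + \delta\,\text{Treated}_{it} + c_i + u_{it}$

• First difference to remove $c_i$ (or estimate via FE): $\Delta Y_i = \lambda + \delta\,\Delta\text{Treated}_i + \Delta u_i$
• Which parameter is the causal effect of interest (assuming the key assumptions hold)?

Panel FE setup with control variables

$Y_{it} = \alpha + \mathbf{Year}_t\boldsymbol{\lambda} + \delta\,\text{Treated}_{it} + \mathbf{X}_{it}\boldsymbol{\beta} + c_i + u_{it}$

• Where Year is a vector of year dummies
• Estimate via FE
• Which parameter is the causal effect of interest (assuming the key assumptions hold)?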
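The role of first differencing in the panel setup without controls can be illustrated with a small simulation. This Python sketch assumes a two-period panel in which nobody is treated in period 1, sets the error term to zero for clarity, and uses invented parameter values and unit effects; none of these numbers come from the training materials.

```python
from statistics import mean

ALPHA, LAM, DELTA = 5.0, 2.0, 3.0   # hypothetical "true" parameters

# (unit fixed effect c_i, treated in period 2?); nobody is treated in period 1
units = [(1.0, 1), (4.0, 1), (-2.0, 0), (3.0, 0)]


def outcome(c_i, after, treated_it):
    # Y_it = alpha + lambda*After_t + delta*Treated_it + c_i   (u_it set to 0)
    return ALPHA + LAM * after + DELTA * treated_it + c_i


# First differences: c_i cancels because it does not change over time.
d_y = [outcome(c, 1, tr) - outcome(c, 0, 0) for c, tr in units]
d_treated = [tr for _, tr in units]

# OLS of dY on a constant and a binary dTreated reduces to group means of dY.
lam_hat = mean(dy for dy, dt in zip(d_y, d_treated) if dt == 0)
delta_hat = mean(dy for dy, dt in zip(d_y, d_treated) if dt == 1) - lam_hat

# A naive comparison of period-2 levels mixes delta with differences in c_i.
naive = (mean(outcome(c, 1, 1) for c, tr in units if tr == 1)
         - mean(outcome(c, 1, 0) for c, tr in units if tr == 0))
print(lam_hat, delta_hat, naive)
```

The first-difference estimate recovers the assumed effect exactly because the unit effects drop out, while the cross-sectional comparison is off by the gap in average $c_i$ between the two groups.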

Examples & tweaking the variable names/notation to fit your particular situation (1)


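General: $Y_{it} = \alpha + \gamma\,\text{Treated}_i + \lambda\,\text{After}_t + \delta\,(\text{Treated}_i \times \text{After}_t) + \varepsilon_{it}$

Example: Wooldridge (2002) - new garbage incinerator and housing values in a city in Massachusetts
• New garbage incinerator construction began in 1981
• Have housing value data from 1978 and 1981 plus info on distance from house
• Let "nearinc" = 1 if house is near (within 3 miles/4.83 km) incinerator, = 0 o.w. (far from incinerator)
• Let "y81" = 1 if year is 1981, and = 0 o.w. (year is 1978)
• How would you write the DID regression equation in this case? What is the key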

Specific: $\text{rprice}_{it} = \alpha + \gamma\,\text{nearinc}_i + \lambda\,\text{y81}_t + \delta\,(\text{nearinc}_i \times \text{y81}_t) + \varepsilon_{it}$

• Where
  - "rprice" is the home value in real US dollars
  - "nearinc" = 1 if house is near (within 3 miles/4.83 km) incinerator, = 0 o.w. (far from incinerator)
  - "y81" = 1 if year is 1981, and = 0 o.w. (year is 1978)

Examples & tweaking the variable names/notation to fit your particular situation (2)


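General: $Y_{it} = \alpha + \gamma\,\text{Treated}_i + \lambda\,\text{After}_t + \delta\,(\text{Treated}_i \times \text{After}_t) + \varepsilon_{it}$

Example: Angrist & Pischke (2009) - understanding cholera transmission in mid-1800s
• Have district-level data from London on death rates and water company in 1849 &
• Let
  - "deathr" be the death rate in district d
  - "Lambeth" = 1 if district d gets its water from the Lambeth Company (which
  - "y1852" = 1 if year is 1852, and = 0 o.w. (year is 1849)
• Write down the DID equation for this scenario and using these variables

Specific: $\text{deathr}_{dt} = \alpha + \gamma\,\text{Lambeth}_d + \lambda\,\text{y1852}_t + \delta\,(\text{Lambeth}_d \times \text{y1852}_t) + \varepsilon_{dt}$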


Examples & tweaking the variable names/notation to fit your particular situation (3)


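General: $Y_{idt} = \alpha + \gamma\,\text{Treated}_d + \lambda\,\text{After}_t + \delta\,(\text{Treated}_d \times \text{After}_t) + \varepsilon_{idt}$

Example: Angrist & Pischke (2009) - effect of a minimum wage increase on fast food employment
• Have data from neighboring states (NJ & PA). Both states have $4.25 minimum wage in
• Let
  - "employ" be the employment level of each restaurant
  - "NJ" = 1 if state is NJ, = 0 if state is PA
  - "Nov92" = 1 if time is November 1992, and = 0 o.w. (time is February 1992)
• Write down the DID equation for this scenario and using these variables. Think carefully

Specific: $\text{employ}_{ist} = \alpha + \gamma\,\text{NJ}_s + \lambda\,\text{Nov92}_t + \delta\,(\text{NJ}_s \times \text{Nov92}_t) + \varepsilon_{ist}$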

DID good to consider if have natural experiment

• Big picture question: why do many HHs sell low, buy high (w.r.t. crop prices)?
• Hypothesis: "short-term expenditure needs force poor households to sell
• Natural experiment: Malawi changed primary school calendar
  - 2009: start in December. 2010: start in September.
  - → HHs have to make school-related expenditures much earlier in 2010 than


DID & natural experiment example (cont'd)


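General: $Y_{it} = \alpha + \gamma\,\text{Treated}_i + \lambda\,\text{After}_t + \delta\,(\text{Treated}_i \times \text{After}_t) + \varepsilon_{it}$

Specific: $\text{Cropsales}_{it} = \alpha + \gamma\,\text{Children}_i + \lambda\,\text{y2010}_t + \delta\,(\text{Children}_i \times \text{y2010}_t) + \varepsilon_{it}$

Where
• Cropsales = cumulative value of HH crop sales through August of year t
• Children = # of children in primary school (0, 1, 2, 3, etc.). Or could do 0/1
• y2010 = 1 if year is 2010; = 0 if year is 2009
• Estimate separately for HHs above vs. below the poverty line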

What other situations/examples appropriate for DID can you think of?

• And how would you set up your DID regression equation?
• General: $Y = \alpha + \gamma\,\text{Treated} + \lambda\,\text{After} + \delta\,(\text{Treated} \times \text{After}) + \varepsilon$


DID vs. other methods

• Key difference between PSM and DID: PSM assumes selection on observables only, DID allows selection to be a function of time-constant unobserved factors (a.k.a. time invariant unobserved heterogeneity)
  - Where have you heard this term before?
  - What if selection is a function of time-varying unobservables?
• Another key difference:
  - Randomized evaluations & PSM - cross-sectional data
  - Need repeated cross sections or panel data for DID


Theory wrap-up

(Source: Angrist & Pischke 2015, pp. 203-204) "Master Stevefu: Wrap it up for me, Grasshopper."


In Stata - DID (panel or pooled cross-sections)


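$Y_{it} = \alpha + \gamma\,\text{Treated}_i + \lambda\,\text{After}_t + \delta\,(\text{Treated}_i \times \text{After}_t) + \varepsilon_{it}$
(Treated is time-constant)

• Which parameter is the causal effect of interest (assuming the key assumptions hold)?
• The one on the i.Treated#i.After variable

In Stata - FE (panel data)

$Y_{it} = \alpha + \lambda\,\text{After}_t + \delta\,\text{Treated}_{it} + c_i + u_{it}$
(Treated is time-varying)

• Which parameter is the causal effect of interest (assuming the key assumptions hold)?
• The one on the i.Treated variable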


Stata exercises

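1. Wooldridge (2002) new garbage incinerator & housing values

a. Use KIELMC.DTA in "data" folder to estimate:
$\text{rprice}_{it} = \alpha + \gamma\,\text{nearinc}_i + \lambda\,\text{y81}_t + \delta\,(\text{nearinc}_i \times \text{y81}_t) + \varepsilon_{it}$

b. Interpret the key coefficient of interest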

Stata exercises

a. Use hh_9198.dta in "data/Khandkeretal2010datafiles" folder to estimate: $\text{lexptot}_{it} = \alpha + \gamma\,\text{dfmfd98}_i \dots$
