March 30, 2020

Anne Case

Please address questions to:

Downloads available:



I. US Mortality Data

A. US Couma mortality estimates

We obtained permission to use county identifiers on individual death records from the National Vital Statistics System for the period 1990-2016, and created couma-level mortality statistics by cause, for white non-Hispanics, in 10-year age groups (25-34, 35-44, ..., 65-74). Information on the application process is available here:

County-couma Stata data crosswalks are available in the data appendix to Case and Deaton, Mortality and Morbidity in the 21st Century, Brookings Papers on Economic Activity, Spring 2017, available online:


B. US Population Estimates for use with individual death records


Population estimates by county-year were downloaded from the CDC Bridged-Race Population series



C. Creation of mortality rates by cause


Causes of death are identified using ICD codes.

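Here is the Stata code used to identify deaths by cause. (Note: ICD_3 is a 3-character string for ICD10 codes, e.g. "F11". Each statement ends with a semicolon, so the code assumes that #delimit ; is in effect.):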


/*** poison ***/

gen pois = 0;

/* ICD-9 codes (icd9 is the string code, icd9n its numeric value) */
replace pois=1 if icd9n>=2920 & icd9n<=2929;

replace pois=1 if icd9n==304 & length(icd9)==3;
replace pois=1 if icd9n>=3040 & icd9n<=3049;

replace pois=1 if icd9n>=3052 & icd9n<=3059;

replace pois=1 if icd9n>=850 & icd9n<=858 & length(icd9)==3;
replace pois=1 if icd9n>=8500 & icd9n<=8589;

replace pois=1 if icd9n==860 & length(icd9)==3;
replace pois=1 if icd9n>=8600 & icd9n<=8609;

replace pois=1 if icd9n==980 & length(icd9)==3;
replace pois=1 if icd9n>=9800 & icd9n<=9805;

/* ICD-10 codes */
replace pois=1 if (ICD_3=="F11"|ICD_3=="F12"|ICD_3=="F13"|ICD_3=="F14"|ICD_3=="F15"|ICD_3=="F16"|
    ICD_3=="F18"|ICD_3=="F19");

/* Y10-Y15 */
forval i = 10(1)15 {;
    replace pois = 1 if icd10=="Y`i'";
};

/* X40-X45 */
forval i = 40(1)45 {;
    replace pois = 1 if icd10=="X`i'";
};

/* Y45, Y47, Y49 */
forval i = 45(2)49 {;
    replace pois = 1 if icd10=="Y`i'";
};





/****** suicide ******/

gen suicide = 0;

/* ICD-9 codes */
replace suicide=1 if icd9n>=9500 & icd9n<=9599;
replace suicide=1 if icd9n>=950 & icd9n<=959 & length(icd9)==3;

/* ICD-10 codes: X60-X84 and Y870 */
forval i = 60(1)84 {;
    replace suicide = 1 if icd10=="X`i'";
};

replace suicide = 1 if icd10=="Y870";




/****** liver ********/

gen liver = 0;

/* ICD-9 codes */
replace liver=1 if icd9n>=2910 & icd9n<=2919;

replace liver=1 if icd9n==303 & length(icd9)==3;
replace liver=1 if icd9n==3050;

replace liver=1 if icd9n>=5710 & icd9n<=5713;
replace liver=1 if icd9n==5719;

/* ICD-10 codes: F10, K70, and K700-K709 */
replace liver=1 if ICD_3=="F10";
replace liver=1 if icd10=="K70";

forval i = 0(1)9 {;
    replace liver=1 if icd10=="K70`i'";
};



/****** "heart disease as defined by CDC" ******/

gen heart = 0;

/* ICD-9 codes */
replace heart=1 if icd9n>=390 & icd9n<=398 & length(icd9)==3;
replace heart=1 if icd9n>=3900 & icd9n<=3989;

replace heart=1 if (icd9n==402|icd9n==404) & length(icd9)==3;
replace heart=1 if icd9n>=4020 & icd9n<=4029;
replace heart=1 if icd9n>=4040 & icd9n<=4049;

replace heart=1 if (icd9n>=410 & icd9n<=429) & length(icd9)==3;
replace heart=1 if icd9n>=4100 & icd9n<=4299;

/* ICD-10 codes */
/* I00-I09 (i = 0 to 9) and I010-I099 (i = 10 to 99) */
forval i = 0(1)99 {;
    replace heart=1 if icd10=="I0`i'";
};

/* I11 and I13 */
forval i = 11(2)13 {;
    replace heart=1 if icd10=="I`i'";
};

/* I110-I119 and I130-I139 */
forval i = 0(1)9 {;
    replace heart=1 if icd10=="I11`i'";
    replace heart=1 if icd10=="I13`i'";
};

/* I20-I51 */
forval i = 20(1)51 {;
    replace heart=1 if icd10=="I`i'";
};

/* I200-I519 */
forval i = 200(1)519 {;
    replace heart=1 if icd10=="I`i'";
};







/*** cancer (malignant neoplasms) ***/

gen cancer=0;

replace cancer =1 if icd9n>=140 & icd9n<=208 & length(icd9)==3;
replace cancer =1 if icd9n>=1400 & icd9n<=2089;
replace cancer =1 if xcode=="C";

/* note: xcode is the first character of the ICD10 code */

/* lung cancer */
gen lungC=0;

replace lungC =1 if icd9n>=1622 & icd9n<=1629 & length(icd9)==4;
replace lungC=1 if substr(icd10,1,3)=="C34";
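The code above relies on several helper variables (icd9n, ICD_3, xcode) whose construction is not shown. The following is a minimal sketch of one way they could be derived, assuming icd9 and icd10 are string variables holding the cause-of-death code; it is not the authors' code, and the example records are made up. It is written with Stata's default line delimiter rather than semicolons.

```stata
* Hypothetical example records: two ICD-9 codes and two ICD-10 codes
clear
input str4 icd9 str4 icd10
"304"  ""
"3040" ""
""     "F112"
""     "C349"
end

* Numeric value of the ICD-9 code; missing when icd9 is empty
gen icd9n = real(icd9)

* First three characters of the ICD-10 code
gen ICD_3 = substr(icd10, 1, 3)

* First character of the ICD-10 code
gen xcode = substr(icd10, 1, 1)

list
```

Under this construction, a three-digit code such as "304" and a four-digit code such as "3040" yield different values of icd9n, which may be why the code above pairs three-digit comparisons with length(icd9)==3.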





Geographic identifiers on death certificates are US counties. We combined counties into units that can be matched with Public Use Microdata Areas (PUMAs) used by the US Census Bureau.


We call these new units coumas. Each couma is at least the size of a county and at least the size of a PUMA (and so contains at least 100,000 people). Because PUMAs change after a decennial census, we match data from 1990-1999 into one set of coumas, 2000-2011 into a second set, and 2012-2016 into a third set. Some counties (e.g., Philadelphia) contain many PUMAs. Others (e.g., in southwestern Montana) are sparsely populated and need to be combined to match one PUMA. In some cases, to obtain clean boundaries, we need to combine multiple PUMAs and multiple counties into one couma. All observations from a county belong to the same couma. The crosswalks between counties and coumas are provided in the documentation to Case and Deaton, Mortality and Morbidity in the 21st Century, Brookings Papers on Economic Activity, Spring 2017, and are available here:


Note that identification is by state and couma: a couma is identified by its state together with its couma code.
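As an illustration of how a county-couma crosswalk might be used, the sketch below aggregates county-level deaths and population to coumas and computes a mortality rate per 100,000. All variable names (state, county, couma, deaths, pop) and all numbers are hypothetical, and a real application would select the crosswalk vintage (1990-1999, 2000-2011, or 2012-2016) that matches each year.

```stata
* Hypothetical county-year file with death counts and population
clear
input byte state int county int year long deaths long pop
1 1 2000 30  60000
1 3 2000 20  40000
1 5 2000 90 150000
end
tempfile countydata
save `countydata'

* Hypothetical crosswalk: counties 1 and 3 form couma 1, county 5 is couma 2
clear
input byte state int county int couma
1 1 1
1 3 1
1 5 2
end

* Attach the couma code to every county-year record
merge 1:m state county using `countydata', assert(match) nogenerate

* Sum counts within state-couma-year, then form the rate
collapse (sum) deaths pop, by(state couma year)
gen rate = 100000 * deaths / pop

* Expected: couma 1 has rate 50, couma 2 has rate 60
list
```

Summing deaths and population before dividing weights each county by its population, which a simple average of county rates would not do.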






II. US Economic Data

We pulled data on employment, unemployment and labor force participation from the Local Area Unemployment Statistics, by county:

From these we created a couma-level unemployment rate, and the employment-to-population ratio for ages 16-64 and for ages 16 and above.
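A minimal sketch of this aggregation, under the assumption that county counts are summed within a couma before the ratios are formed. The variable names (unemployed, employed, laborforce, pop16plus) and the numbers are hypothetical, and the population denominator is assumed to come from a separate population file.

```stata
* Hypothetical county-level counts, already matched to coumas
clear
input byte state int couma long unemployed long employed long laborforce long pop16plus
1 1  400  9600 10000 16000
1 1 3200 16800 20000 28000
1 2  300  5700  6000 10000
end

* Sum the counts within each state-couma
collapse (sum) unemployed employed laborforce pop16plus, by(state couma)

* Unemployment rate and employment-to-population ratio, in percent
gen urate  = 100 * unemployed / laborforce
gen emppop = 100 * employed / pop16plus

* Expected: couma 1 has urate 12 and emppop 60; couma 2 has urate 5 and emppop 57
list
```

In couma 1 the pooled unemployment rate is 12 percent, whereas a simple average of the two county rates (4 and 16 percent) would give 10 percent.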


We pulled county-level data on poverty counts and on the population in the SNAP program from the Small Area Income and Poverty Estimates:

From these we created a couma-level poverty rate, and a couma-level rate of participation in SNAP.


We used data made available from the Gallup Daily Poll, which asked a question on whether respondents experienced pain "a lot of the day yesterday." The variable C_pctPAIN is the fraction of white non-Hispanics ages 25-74 in the couma who responded that they were in pain yesterday.
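A minimal sketch of how such a fraction could be computed from respondent-level data. The variable names wnh (white non-Hispanic indicator), age, and pain are hypothetical, the records are made up, and the sketch ignores survey weights, which are not described here.

```stata
* Hypothetical respondent-level records
clear
input byte state int couma byte wnh byte age byte pain
1 1 1 30 1
1 1 1 50 0
1 1 1 70 0
1 1 1 45 1
1 1 0 40 1
1 1 1 80 1
end

* Keep white non-Hispanics ages 25-74
keep if wnh == 1 & inrange(age, 25, 74)

* Fraction reporting pain, by state and couma
collapse (mean) C_pctPAIN = pain, by(state couma)

* Expected: one row with C_pctPAIN equal to 0.5
list
```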