HCRIS - 2552-10
HCRIS is an annual file offering microdata from US hospital financial
reports. From this page you can access data from 2010+. All the data here is
available on the CMS website. An earlier compendium of HCRIS data,
covering 1996-2018, is also available.
Here we provide reformatted and more convenient versions of the Form 2552-10
files. Each fiscal year comes from CMS as 3 CSV files in a zipped container.
One file has slowly changing hospital information such as name and NPI, one has
the numeric variables and one has the alphanumeric variables. These last two
are not rectangular files but in an unusual "long" format. That is, where you
might expect a record to include a hospital id and various hospital variables,
such as wages, supplies, fees, etc., each record includes only a hospital id, a
worksheet id, line id, column id and finally a single item value. So there are
as many records per hospital as there are filled-in lines on the forms. The last
three id variables identify the place on the form 2552-10 where the hospital
was asked to place the information, and can be concatenated into an ersatz
variable name. For most variables no other name is provided. Because of the
inconvenience of the "long" format, the files are offered here in "wide"
format, that is, as rectangular files with one record per hospital in each
file. The reason for the "long" format is unstated; in some circumstances it
can save disk space.
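The long-to-wide reshaping described above can be sketched in a few lines of Python. This is only an illustration, not the code used to build the files offered here: the column order of the sample rows, the sample values, and the names `long_to_wide`, `write_wide` and `hosp_id` are all assumptions, and the variable name is a plain concatenation of the three location ids rather than the shorter names used in our files.

```python
import csv
import io
from collections import defaultdict

# Hypothetical long-format records: hospital id, worksheet id, line id,
# column id, item value. Column order and values are assumed for illustration.
LONG_CSV = """\
1001,S300002,00100,00200,5000000
1001,S300002,00200,00200,120000
1002,S300002,00100,00200,7500000
"""


def long_to_wide(rows):
    """Collect (id, worksheet, line, column, value) records into one dict per id."""
    wide = defaultdict(dict)
    for hosp_id, wksht, line, col, value in rows:
        # Concatenate the three location ids into an ersatz variable name.
        name = f"{wksht}_{line}_{col}"
        wide[hosp_id][name] = value
    return dict(wide)


def write_wide(wide, out):
    """Write a rectangular CSV: one row per hospital, one column per variable."""
    # Union of all variable names, sorted for a stable column order.
    names = sorted({n for rec in wide.values() for n in rec})
    writer = csv.DictWriter(out, fieldnames=["hosp_id"] + names, restval="")
    writer.writeheader()
    for hosp_id, rec in sorted(wide.items()):
        writer.writerow({"hosp_id": hosp_id, **rec})


wide = long_to_wide(csv.reader(io.StringIO(LONG_CSV)))
buf = io.StringIO()
write_wide(wide, buf)
print(buf.getvalue())
```

Items a hospital did not fill in simply come out as empty cells in the wide file, which is where the extra disk space goes.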
Notes
- Most variable names are constructed from worksheet locations. For example:
label s3_2_c2_1 = "Total Salaries S300002 00100 00200";
The last three fields of the label (worksheet, line and column) are the values that identify
an item in the long format files; they are turned into the variable name used in our files.
- We start in 2010 because that is the first year for Form 2552-10.
Before that the form is 2552-96.
- Annual files range up to 652 megabytes - too much for a spreadsheet but easily
handled by any statistical package.
- Some years have fewer responding hospitals:
2323 2323 237572 hosp10_2010_RPT.CSV
6150 6150 630432 hosp10_2011_RPT.CSV
6227 6227 639727 hosp10_2012_RPT.CSV
6248 6248 641591 hosp10_2013_RPT.CSV
6249 6249 640188 hosp10_2014_RPT.CSV
6257 6257 631974 hosp10_2015_RPT.CSV
6204 6204 600737 hosp10_2016_RPT.CSV
6121 6121 572401 hosp10_2017_RPT.CSV
974 974 90747 hosp10_2018_RPT.CSV
3408 3408 318546 hosp10_2019_RPT.CSV
12 12 1147 hosp10_2020_RPT.CSV
50173 50173 5005062 total
Each file includes hospitals with various fiscal year endpoints. The most
recent file is likely incomplete, but is offered on the CMS website before
the completion of the calendar year for what it is worth. Presumably it is
too early for 2020, but why is 2018 so small?
- Comments are welcome. Please write data@nber.org or call Daniel Feenberg
at 617-863-0343.
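The label convention in the first note can be read mechanically to map long-format locations back to the wide-format variable names. The sketch below assumes that every label statement looks like the single example shown in that note (a name, an equals sign, and a quoted description ending in the three location fields); the regular expression and the names `parse_label` and `lookup` are hypothetical and not taken from the source code offered on this page.

```python
import re

# Pattern inferred from the one example in the notes; real label files may differ.
LABEL_RE = re.compile(
    r'(?P<name>\w+)\s*=\s*"(?P<desc>.*)\s+'
    r'(?P<wksht>\S+)\s+(?P<line>\S+)\s+(?P<col>\S+)"\s*;'
)


def parse_label(stmt):
    """Split a label statement into name, description and the three location ids."""
    m = LABEL_RE.search(stmt)
    if m is None:
        raise ValueError(f"unrecognized label statement: {stmt!r}")
    return m.groupdict()


parsed = parse_label('label s3_2_c2_1 = "Total Salaries S300002 00100 00200";')

# Lookup from (worksheet, line, column) in the long files to the wide variable name.
lookup = {(parsed["wksht"], parsed["line"], parsed["col"]): parsed["name"]}
print(lookup)
```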
Downloads
- The files as downloaded from the CMS website.
- The same files, unzipped.
- The files transformed into sas7bdat format.
- Extracts of 1,700+ commonly used variables:
- Wide Format (Annual Files):
- Panel Format (2010+ in one file)
Variable list
Record Counts
Source code
last update 11 Nov 2020 by drf