Orbis at NBER

The NBER collection of ORBIS data is available to NBER affiliated faculty who are members of the ORBIS group. For information contact Sebnem Kalemli-Ozcan.

This page documents /home/data/orbis3. An earlier version of this data is documented here. Orbis2 was defective and should not be used. In the interest of reproducibility, we will not remove obsolete files.

The data has been processed in several stages. All the intermediate stages are available, but users should not normally need them. If you do find you need one of the earlier stages, please consult with us first.

  1. Stage 1 - (not online, but available) .rar files as supplied by BvD.
  2. Stage 2 - /home/data/orbis3/*/txt .txt files expanded from Stage 1. Multipart stage 1 files are assembled into a single file.
  3. Stage 3 - /home/data/orbis3/world/*.dta .dta files imported from Stage 2. A number of variables are dropped at this point.
  4. Stage 4 - /home/data/orbis3/bycnty/CC/* .dta files separated by country. A number of long string variables are dropped, as are all the "interim" files and financial files in Eurodollars. New files are added that merge industry class information into the financial files. The ownership files are supplemented with entity data. These new files are indicated by an "x" prefix.
  5. Stage 5 - /home/data/orbis3/bycnty/CC/*-panel.dta hindgfr.dta and hindgfrusd.dta are reduced by the elimination of duplicate and redundant records.

    If you need other formats or have questions about the conversion process, including concerns that information may have been lost, please contact Daniel Feenberg.

    Because these files are still quite large, we remind you that the Stata -use- statement can filter by variable and observation. So it isn't necessary to load or store any variables or observations you won't be using in your analysis.
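As a minimal sketch of that filtering, the do-file fragment below first lists a file's variables without loading any data, then loads only a few variables and observations. The country directory (DE) and the file are chosen only for illustration, and the variable names bvdidnumber, closingdate, and totalassets are placeholders, not confirmed Orbis names; substitute the names reported by -describe-.

```stata
* Sketch only: the variable names below are hypothetical placeholders.
local f "/home/data/orbis3/bycnty/DE/hindgfrusd-panel.dta"

* List the variables in the file without reading the data into memory.
describe using "`f'"

* Load three variables, and only observations with non-missing totalassets.
use bvdidnumber closingdate totalassets if !missing(totalassets) using "`f'", clear
```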

    You may notice that many of the Orbis variables are long strings with few possible values, such as the name of a stock exchange (given 71 bytes). The Stata -encode- command can reduce those to 1 or 2 bytes quite easily.
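One way to do this is sketched below, assuming a string variable named exchange (a placeholder, not a confirmed Orbis name). -encode- maps each distinct string to an integer with a value label, and -compress- then shrinks the new variable to the smallest storage type that holds its codes.

```stata
* Sketch only: "exchange" is a hypothetical long string variable.
encode exchange, generate(exchange_id)   // integer codes with value labels
drop exchange                            // discard the original string
compress exchange_id                     // store as byte or int if the codes fit
```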

    Agreement forms

    Vendor Documentation

    Local Documentation

    Notes:

    • Balance sheet data for most countries begins in 2005. For European countries, coverage begins in 1998. Ownership data for all countries begins in 2007.
    • Both ownership and balance sheet data end around June 2019. An important caveat is that, because of lags in reporting and/or in data compilation and standardization by BvD, the data for 2018 and 2019 are not as complete as for prior years.
    • Countries are referred to by 2 letter ISO country code.
    • Paths are relative to /home/data/orbis3/ on the NBER Linux cluster.
    • Files with all countries are in ./world
    • Files divided by country are in ./bycnty/CC/
    • Stata .do files used in the conversion to .dta are in ./code
    • Stata .dta files with variable information are in ./docs
    • Stata .log files from the conversion are in ./logs
    • Variable and file names are all lowercase. Country directory names are uppercase.
    • Most variable names are created by Stata from variable descriptions. BvD did not provide any variable names to go with the descriptions.
    • BvD did not provide any additional documentation.
    • A single copy of all the files in .dta format totals 2.3TB but compresses to 254GB in our ZFS file store. Nevertheless, Stata decompresses the data in core. The largest single countries are Russia and the US, with 28GB and 20GB of compressed data, respectively, and ten times that amount in core.
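The path conventions above can be combined in a loop. The sketch below stacks one file from a few country directories. The countries and the choice of keyfinusd.dta are illustrative, the file is assumed to exist in each directory, and the variable name cc_dir is introduced here. For large countries, add a variable list or an -if- qualifier to -use- to limit memory use.

```stata
* Sketch only: countries, file, and the variable cc_dir are illustrative.
local root "/home/data/orbis3/bycnty"

clear
tempfile stacked
save "`stacked'", emptyok            // start from an empty dataset

foreach cc in DE FR IT {             // country directory names are uppercase
    use "`root'/`cc'/keyfinusd.dta", clear
    generate str2 cc_dir = "`cc'"    // record the source directory
    append using "`stacked'"
    save "`stacked'", replace
}
```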

    Descriptive Data

    Notes:
    • Most of these files had many stray double quote marks. Stata was instructed to ignore them.

    File Description              Filename              Comment
    Additional company info       addinfo.dta
    All addresses                 addresses.dta
    Auditor current               auditor.dta
    Bankers current               bankers.dta
    Bankers previous              bankpre.dta
    Bvd9                          bvd9.dta
    BvD ID and Name               bvdname.dta
    Contact info                  contact.dta
    DMC current only              dmccurr.dta           very long lines (>100K)
    DMC previous                  dmcprev.dta           very long lines (>100K)
    Identifiers                   identifiers.dta
    Industry classifications      indclass.dta
    Legal info                    legal.dta
    Other_advisors                advisors.dta
    Overviews                     overviews.dta
    StatusHistory                 status.dta
    Stock exchanges and indexes   exchanges.dta
    Trade description             trade.dta
    Industry Class Core           indclass_core.dta     Notes

    Financials - Detailed cash flow and interim June text

    Notes:
    • An optional "x" preceding the file name indicates a version of the file with industry class information merged in; records with no industry class information are dropped from it.

    File Description                  Filename             Comment
    Cash_flow_non_US-industries       {x}cfnus.dta
    Cash_flow_non_US-industries-USD   {x}cfnusindusd.dta
    Cash_flow_US-industries           {x}cfusind.dta
    Cash_flow_US-industries-USD       {x}cfusindusd.dta
    Detailed_format-industries        {x}dfind.dta
    Detailed_format-industries-USD    {x}dfindusd.dta

    Financials - Global format incl histo for industries June text

    Notes:
    • Where the file description was the same as a file in the Detailed group, an "h" was prefixed to the name.
    • The original files are in ./global/rar and ./global/txt

    File Description                              Filename                  Comment
    Banks Global financials and ratios            {x}hbnkgfr.dta
    Banks Global financials and ratios USD        {x}hbnkgfrusd.dta
    Industry Global Financials and ratios         {x}hindgfr.dta
    Industry Global Financials and ratios USD     {x}hindgfrusd.dta
    Industry Global Financials and ratios         {x}hindgfr-panel.dta
    Industry Global Financials and ratios USD     {x}hindgfrusd-panel.dta
    Insurances Global Financials and ratios       {x}hinsgfr.dta
    Insurances Global Financials and ratios USD   {x}hinsgfrusd.dta
    Key financials                                {x}keyfin.dta
    Key financials USD                            {x}keyfinusd.dta

    Ownership Histo June text

    Notes:
    • Each ownership link is kept twice - once in the Ownership (Shareholder) Links file and once in the Subsidiary Links file according to the country code of the owner or subsidiary.
    • The optional preceding "x" indicates a file with firm type information merged in from the entities file.

    File Description   Filename            Comment
    Entities           entities.dta
    {x}links_20NN      {x}ownlinksNN.dta   NN runs from 07 to 19.
    {x}links_20NN      {x}sublinksNN.dta
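
Because the yearly link files differ only in the two-digit suffix NN, they can be addressed in a loop. The sketch below builds each file name with a zero-padded suffix and checks whether the file exists. The country directory (DE) is illustrative, and the files are assumed to sit directly in that directory.

```stata
* Sketch only: DE is an illustrative country directory.
local dir "/home/data/orbis3/bycnty/DE"

forvalues y = 7/19 {
    local nn = string(`y', "%02.0f")      // zero-pad: 7 becomes "07"
    capture confirm file "`dir'/ownlinks`nn'.dta"
    if _rc == 0 {
        display "found:   ownlinks`nn'.dta"
    }
    else {
        display "missing: ownlinks`nn'.dta"
    }
}
```

The same loop works for the subsidiary files by replacing ownlinks with sublinks, or for the merged versions by adding the "x" prefix.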

    Last update 20 April 2020 by drf