  When Should We Believe Research Using ORBIS Firm Data?

Arndt, H. L. R. (2023). When Should We Believe Research Using ORBIS Firm Data? SSRN. doi:10.2139/ssrn.4584106.

Locators

Description:
Full text via SSRN
OA-Status:
Gold

Creators

 Creators:
Arndt, H. Lukas R. (1), Author
Affiliations:
(1) International Max Planck Research School on the Social and Political Constitution of the Economy, MPI for the Study of Societies, Max Planck Society

Content

Free keywords: ORBIS, big data, data quality, research ethics
 Abstract: Since around 2010, commercial providers have sold large off-the-shelf firm datasets covering a wide variety of variables for millions of firms. Such datasets are the basis for high-impact scientific work and have also been used frequently in economics, finance, political economy, economic sociology, geography, and other disciplines. One example is the ORBIS firm data, which claimed to cover more than 400 million firms globally in 2022. After examining and using the dataset in a four-year research project, I make the case that using the data for research poses significant problems, depending on the question to be answered and on the capacity and skills available to make the data usable. Some of these problems are already known from the literature; others are not. The most pressing point is that their magnitude is still understated. I gather solutions and applications (or references to them) that overcome a few examples of the identified problems. Overall, however, I argue that the complexity of the data, existing errors and inconsistencies, missing relevant information, the interrelation of these problems, and the insufficient documentation and support provided with the data allow for confidence in the validity of results only under very specific conditions. In most cases these conditions will apply only to a much smaller sample than the claimed hundreds of millions of firms, which undermines the main reason to work with the data in the first place. Evaluating whether the conditions for valid inferences are met requires extensive knowledge of the data. ORBIS in its current state should be used for research only if sufficient time, resources, and skills are available to fully understand the data and to overcome its problems, where possible, convincingly and demonstrably.
Furthermore, these issues have serious implications for the feasibility of reviewing and evaluating research that uses the ORBIS data, and therefore for reproducibility, unless researchers make significant additional effort to document the data preparation process and the decisions it involves. This article offers reviewers and potential authors some insights for better evaluating the use and preparation of ORBIS data. The additional effort required of researchers, together with the price of access to the data in the first place, further speaks against its scientific use.

Details

Language(s): eng - English
 Dates: 2023-03-31, 2023-10-25
 Publication Status: Published online
 Pages: 39
 Publishing info: -
 Table of Contents: -
 Rev. Type: -
 Identifiers: DOI: 10.2139/ssrn.4584106
 Degree: -

Source 1

Title: SSRN
Source Genre: Web Page