Statistical model selection with “Big Data”
Big Data offer potential benefits for statistical modelling, but confront problems including an excess of false positives, mistaking correlations for causes, ignoring sampling biases and selecting by inappropriate methods. We consider the many important requirements when searching for a data-based r...
Main Authors: | Jurgen A. Doornik, David F. Hendry |
---|---|
Format: | Article |
Language: | English |
Published: | Taylor & Francis Group, 2015-12-01 |
Series: | Cogent Economics & Finance |
Subjects: | Big Data, model selection, location shifts, Autometrics, computational problems |
Online Access: | http://dx.doi.org/10.1080/23322039.2015.1045216 |
_version_ | 1818025497498484736 |
---|---|
author | Jurgen A. Doornik; David F. Hendry |
author_sort | Jurgen A. Doornik |
collection | DOAJ |
description | Big Data offer potential benefits for statistical modelling, but confront problems including an excess of false positives, mistaking correlations for causes, ignoring sampling biases and selecting by inappropriate methods. We consider the many important requirements when searching for a data-based relationship using Big Data, and the possible role of Autometrics in that context. Paramount considerations include embedding relationships in general initial models, possibly restricting the number of variables to be selected over by non-statistical criteria (the formulation problem), using good quality data on all variables, analyzed with tight significance levels by a powerful selection procedure, retaining available theory insights (the selection problem) while testing for relationships being well specified and invariant to shifts in explanatory variables (the evaluation problem), using a viable approach that resolves the computational problem of immense numbers of possible models. |
first_indexed | 2024-12-10T04:17:03Z |
format | Article |
id | doaj.art-31675d0732a14fbda0eccd21536555bd |
institution | Directory Open Access Journal |
issn | 2332-2039 |
language | English |
last_indexed | 2024-12-10T04:17:03Z |
publishDate | 2015-12-01 |
publisher | Taylor & Francis Group |
record_format | Article |
series | Cogent Economics & Finance |
spelling | doaj.art-31675d0732a14fbda0eccd21536555bd; 2022-12-22T02:02:33Z; eng; Taylor & Francis Group; Cogent Economics & Finance; 2332-2039; 2015-12-01; 3; 1; 10.1080/23322039.2015.1045216; 1045216; Statistical model selection with “Big Data”; Jurgen A. Doornik (Institute for New Economic Thinking, Oxford Martin School); David F. Hendry (Institute for New Economic Thinking, Oxford Martin School); http://dx.doi.org/10.1080/23322039.2015.1045216; Big Data; model selection; location shifts; Autometrics; computational problems |
title | Statistical model selection with “Big Data” |
title_sort | statistical model selection with big data |
topic | Big Data; model selection; location shifts; Autometrics; computational problems |
url | http://dx.doi.org/10.1080/23322039.2015.1045216 |