Validation of causal inference data using DirectLiNGAM in an environmental small-scale model and calculation settings

Data science is increasingly needed in environmental fields that handle marine, weather, and soil data. Although some datasets are large, they are often small because the observations and the analyses themselves are limited. In such cases, the data are stat...

Bibliographic Details
Main Authors: Atsushi Kurotani, Hirokuni Miyamoto, Jun Kikuchi
Format: Article
Language: English
Published: Elsevier 2024-06-01
Series: MethodsX
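Subjects: DirectLiNGAM: A causal inference by direct estimation approach for learning the basic LiNGAM model with non-Gaussian data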
Online Access: http://www.sciencedirect.com/science/article/pii/S2215016123005241
_version_ 1827389287503495168
author Atsushi Kurotani
Hirokuni Miyamoto
Jun Kikuchi
collection DOAJ
description Data science is increasingly needed in environmental fields that handle marine, weather, and soil data. Although some datasets are large, they are often small because the observations and the analyses themselves are limited. In such cases, the data are evaluated statistically by differential analysis of increases or decreases in factor levels, from which the essential factors are estimated. However, there is no consistent approach for assessing strong associations among factors as a group. Causal inference can produce useful results from small data, and those results are expected to provide important information for understanding potentially strong associations between factors without necessarily relying on inference from big data. Here, we describe essential checkpoints and settings for calculation with a direct method for learning a linear non-Gaussian structural equation model (DirectLiNGAM), together with methods for validating DirectLiNGAM results on small-scale model data, as an additional discussion of the DirectLiNGAM portion of the related research article. Thus, this study provides statistical validation methods for association networks, treatments, and interventions, supporting structural inference on a group of essential factors. • Causal inference with DirectLiNGAM • Validation of correlation coefficient and feature importance • Validation using causal effect object and propensity scores
first_indexed 2024-03-08T16:32:26Z
format Article
id doaj.art-92236e4041d64ede9fb340c67a2c0a7a
institution Directory Open Access Journal
issn 2215-0161
language English
last_indexed 2024-03-08T16:32:26Z
publishDate 2024-06-01
publisher Elsevier
record_format Article
series MethodsX
spelling doaj.art-92236e4041d64ede9fb340c67a2c0a7a 2024-01-06T04:38:58Z eng Elsevier MethodsX 2215-0161 2024-06-01 12 102528
Validation of causal inference data using DirectLiNGAM in an environmental small-scale model and calculation settings
Atsushi Kurotani (0), Hirokuni Miyamoto (1), Jun Kikuchi (2)
(0) Research Center for Agricultural Information Technology, National Agriculture and Food Research Organization, Tsukuba, Ibaraki 305-0856, Japan; Tokyo University of Agriculture and Technology, Koganei, Tokyo 184-0012, Japan
(1) Graduate School of Horticulture, Chiba University, Matsudo, Chiba 271-8501, Japan; RIKEN Center for Integrated Medical Science, Yokohama, Kanagawa 230-0045, Japan; Corresponding author.
(2) RIKEN Center for Sustainable Resource Science, Yokohama, Kanagawa 230-0045, Japan
http://www.sciencedirect.com/science/article/pii/S2215016123005241
DirectLiNGAM: A causal inference by direct estimation approach for learning the basic LiNGAM model with non-Gaussian data
title Validation of causal inference data using DirectLiNGAM in an environmental small-scale model and calculation settings
topic DirectLiNGAM: A causal inference by direct estimation approach for learning the basic LiNGAM model with non-Gaussian data
url http://www.sciencedirect.com/science/article/pii/S2215016123005241