Exploratory Analysis for Big Social Data Using Deep Network

Exploratory analysis is an important way to gain understanding and find unknown relationships from various data sources, especially in the era of big data. Traditional paradigms of social science data analysis follow the steps of feature selection, modeling, and prediction. In this paper, we propose...

Bibliographic Details
Main Authors: Chao Wu, Guolong Wang, Jiangcheng Zhu, Piyawat Lertvittayakumjorn, Simon Hu, Chilie Tan, Hong Mi, Yadan Xu, Jun Xiao
Format: Article
Language: English
Published: IEEE 2019-01-01
Series: IEEE Access
Subjects: Data-driven, new paradigm, social science
Online Access: https://ieeexplore.ieee.org/document/8643028/
author Chao Wu
Guolong Wang
Jiangcheng Zhu
Piyawat Lertvittayakumjorn
Simon Hu
Chilie Tan
Hong Mi
Yadan Xu
Jun Xiao
author_sort Chao Wu
collection DOAJ
description Exploratory analysis is an important way to gain understanding and find unknown relationships from various data sources, especially in the era of big data. Traditional paradigms of social science data analysis follow the steps of feature selection, modeling, and prediction. In this paper, we propose a new paradigm that does not require feature selection so that data can speak for itself without manually picking out features. In addition, we propose using the deep network as a methodology to explore previously unknown relationships and to capture the complexity and non-linearity between target variables and a large number of input features for big social data. The new paradigm is intended as a relatively generic approach that can be widely used in different scenarios. To validate the feasibility of the paradigm, we use country-level indicator forecasting as a case study. The process includes: 1) data collection and preparation and 2) modeling and experiment. The data collection and preparation part builds a data warehouse and conducts the extract-transform-load process to eliminate data format inconsistency. The modeling and experiment part covers model setup and changes to the model structure to achieve relatively high accuracy on prediction results at both the model level and the case level. We find some patterns in how network capacity modification and time interval difference influence the test results, both of which deserve further research.
first_indexed 2024-12-20T02:23:00Z
format Article
id doaj.art-27b879b996c04d138f301945d6c38d6d
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-20T02:23:00Z
publishDate 2019-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-27b879b996c04d138f301945d6c38d6d
2022-12-21T19:56:46Z
eng
IEEE
IEEE Access
2169-3536
2019-01-01
Volume 7, pages 21446-21453
DOI 10.1109/ACCESS.2019.2898238
8643028
Exploratory Analysis for Big Social Data Using Deep Network
Chao Wu: School of Public Affairs, Zhejiang University, Hangzhou, China
Guolong Wang (https://orcid.org/0000-0003-4905-6040): School of Public Affairs, Zhejiang University, Hangzhou, China
Jiangcheng Zhu: College of Control Science and Technology, Zhejiang University, Hangzhou, China
Piyawat Lertvittayakumjorn: Data Science Institute, Imperial College London, London, U.K.
Simon Hu: ZJU-UIUC Institute, International Campus, Zhejiang University, Haining, China
Chilie Tan: Tongdun Technology, Hangzhou, China
Hong Mi: School of Public Affairs, Zhejiang University, Hangzhou, China
Yadan Xu: International Campus, Zhejiang University, Haining, China
Jun Xiao: College of Computer Science and Technology, Zhejiang University, Hangzhou, China
https://ieeexplore.ieee.org/document/8643028/
Data-driven
new paradigm
social science
title Exploratory Analysis for Big Social Data Using Deep Network
topic Data-driven
new paradigm
social science
url https://ieeexplore.ieee.org/document/8643028/