Missing Link Prediction Using Non-Overlapped Features and Multiple Sources of Social Networks

Current methods for missing link prediction in social networks use data from users who overlap between two social network sources to recommend links between unconnected users. To improve missing link prediction, this paper proposes using information from non-overlapping users...

Bibliographic Details
Main Authors: Pokpong Songmuang, Chainarong Sirisup, Aroonwan Suebsriwichai
Format: Article
Language: English
Published: MDPI AG 2021-05-01
Series: Information
Subjects: Social Network; missing link; link prediction; machine learning
Online Access: https://www.mdpi.com/2078-2489/12/5/214
author Pokpong Songmuang
Chainarong Sirisup
Aroonwan Suebsriwichai
collection DOAJ
description Current methods for missing link prediction in social networks use data from users who overlap between two social network sources to recommend links between unconnected users. To improve missing link prediction, this paper proposes using information from non-overlapping users as additional features when training a machine-learning prediction model. The proposed features are designed to be used together with the common features, as extra features that help tune a better classification model. The social network data sources used in this paper are Twitter and Facebook, where Twitter is the main data source for prediction and Facebook is the supporting data source. For evaluation, the study compares different machine-learning techniques, feature settings, and network-density levels of the data sources. The experimental results show that the prediction model combining the proposed features and the common features with the Random Forest technique performed best in terms of the percentage of missing links recovered and the F1 score. Compared with the multi-social-network-source baseline, the combined-feature model yields a higher percentage of recovered links by an average of 23.25% and a higher F1-measure by an average of 19.80%.
first_indexed 2024-03-10T11:17:32Z
format Article
id doaj.art-deb74c95037b444c90045754cd8e1d9f
institution Directory of Open Access Journals
issn 2078-2489
language English
last_indexed 2024-03-10T11:17:32Z
publishDate 2021-05-01
publisher MDPI AG
record_format Article
series Information
spelling doaj.art-deb74c95037b444c90045754cd8e1d9f
2023-11-21T20:18:40Z
eng
MDPI AG
Information
2078-2489
2021-05-01
12
5
214
10.3390/info12050214
Missing Link Prediction Using Non-Overlapped Features and Multiple Sources of Social Networks
Pokpong Songmuang (0)
Chainarong Sirisup (1)
Aroonwan Suebsriwichai (2)
Faculty of Science and Technology, Thammasat University, Pathumthani 12121, Thailand
Faculty of Science and Technology, Thammasat University, Pathumthani 12121, Thailand
Faculty of Science and Technology, Thammasat University, Pathumthani 12121, Thailand
Current methods for missing link prediction in social networks use data from users who overlap between two social network sources to recommend links between unconnected users. To improve missing link prediction, this paper proposes using information from non-overlapping users as additional features when training a machine-learning prediction model. The proposed features are designed to be used together with the common features, as extra features that help tune a better classification model. The social network data sources used in this paper are Twitter and Facebook, where Twitter is the main data source for prediction and Facebook is the supporting data source. For evaluation, the study compares different machine-learning techniques, feature settings, and network-density levels of the data sources. The experimental results show that the prediction model combining the proposed features and the common features with the Random Forest technique performed best in terms of the percentage of missing links recovered and the F1 score. Compared with the multi-social-network-source baseline, the combined-feature model yields a higher percentage of recovered links by an average of 23.25% and a higher F1-measure by an average of 19.80%.
https://www.mdpi.com/2078-2489/12/5/214
Social Network
missing link
link prediction
machine learning
title Missing Link Prediction Using Non-Overlapped Features and Multiple Sources of Social Networks
topic Social Network
missing link
link prediction
machine learning
url https://www.mdpi.com/2078-2489/12/5/214