Two stages hybrid model of fuzzy linear regression with support vector machines for colorectal cancer

Bibliographic Details
Main Author: Shafi, Muhammad Ammar
Format: Thesis
Language: English
Published: 2020
Online Access: http://eprints.uthm.edu.my/32/1/24p%20MUHAMMAD%20AMMAR%20SHAFI.pdf
http://eprints.uthm.edu.my/32/2/MUHAMMAD%20AMMAR%20SHAFI%20COPYRIGHT%20DECLARATION.pdf
http://eprints.uthm.edu.my/32/3/MUHAMMAD%20AMMAR%20SHAFI%20WATERMARK.pdf
Description
Summary: Fuzzy linear regression analysis has become popular among researchers and is a standard model for analyzing data subject to vagueness. However, the factors and symptoms that predict the tumor size of colorectal cancer (CRC) are still ambiguous. Problems arise with linear regression when the data are uncertain or imprecise. Because fuzzy set theory can handle data that are not precise point values (uncertain data), fuzzy linear regression was applied. In this study, two new hybrid models, namely the multiple linear regression clustering with support vector machine model (MLRCSVM) and the fuzzy linear regression with symmetric parameter with support vector machine model (FLRWSPCSVM), were proposed to analyze colorectal cancer data. The parameters, errors and an explanation of the five procedures for both new models are also included. These models apply five statistical models: multiple linear regression, fuzzy linear regression, fuzzy linear regression with symmetric parameter, fuzzy linear regression with asymmetric parameter and the support vector machine model. The proposed models were first applied to 1000 simulated data points. Secondary data on 180 colorectal cancer patients who received treatment in a general hospital, with twenty-five independent variables of different combinations of variable types, were then considered to find the best model for predicting the tumor size of CRC. The main objectives of this study are to determine the best model for predicting the tumor size of CRC and to identify the factors and symptoms that contribute to the size of CRC. All the models were compared using four statistical measures: mean square error (MSE), root mean square error (RMSE), mean absolute error (MAE) and mean absolute percentage error (MAPE). The results showed that FLRWSPCSVM was the best model, with the lowest MSE, RMSE, MAE and MAPE values of 100.605, 10.030, 7.556 and 14.769, respectively. Hence, the size of colorectal cancer could be predicted by managing the twenty-five independent variables.
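
The four error measures used to compare the models can be sketched in a few lines of Python. This is a minimal illustration that assumes the standard textbook definitions, with MAPE expressed in percent. The summary does not give the exact formulas used in the thesis, and the function name `error_metrics` and the sample values are hypothetical, not taken from the thesis data.

```python
import math

def error_metrics(actual, predicted):
    """Return MSE, RMSE, MAE and MAPE (in percent) for paired values.

    Assumes the standard definitions; the thesis may define them differently.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)  # RMSE is the square root of MSE
    mae = sum(abs(e) for e in errors) / n
    # MAPE divides by the actual values, so it is undefined if any of them is zero.
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    return {"MSE": mse, "RMSE": rmse, "MAE": mae, "MAPE": mape}

# Hypothetical tumor sizes, for illustration only (not data from the thesis).
actual = [40.0, 50.0, 80.0, 100.0]
predicted = [44.0, 45.0, 88.0, 90.0]
metrics = error_metrics(actual, predicted)
print(metrics)
```

Under these definitions RMSE is always the square root of MSE, which agrees with the reported values (the square root of 100.605 is approximately 10.030).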