Exploring the application of contrastive learning on two-sample hypothesis test and tabular data

The purpose of this study is to investigate the application of contrastive learning in two-sample tests for image data and feature enhancement for tabular data. This research is motivated by the potential of contrastive learning to improve the performance and accuracy of statistical tests and data c...

Bibliographic Details
Main Author: Wan, Bingbing
Other Authors: Lihui Chen
Format: Thesis-Master by Coursework
Language: English
Published: Nanyang Technological University 2024
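Subjects: Computer and Information Science; Engineering; Contrastive learning; Two-sample tests; Maximum Mean Discrepancy (MMD); Feature extraction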
Online Access: https://hdl.handle.net/10356/179367
_version_ 1826122316966264832
author Wan, Bingbing
author2 Lihui Chen
collection NTU
description The purpose of this study is to investigate the application of contrastive learning in two-sample tests for image data and feature enhancement for tabular data. This research is motivated by the potential of contrastive learning to improve the performance and accuracy of statistical tests and data classification. The main research problem addressed in this thesis is whether contrastive learning can enhance the performance of two-sample tests for images and improve feature quality for tabular data. To address this problem, we first verified the strong test power of pairing contrastive learning with the Maximum Mean Discrepancy (MMD) [1] two-sample test method. We then introduced a novel method called the contrastive two-sample test. Additionally, we enhanced the features for tabular data using contrastive learning techniques. The experiments and comparisons were conducted on various datasets to evaluate the effectiveness of these approaches. The results of our experiments demonstrated that the contrastive learning approach significantly improved the performance of two-sample tests on images and slightly improved classification accuracies on tabular data. Specifically, the accuracy of image-based tests increased, indicating a more robust method for statistical testing in visual contexts. For tabular data, the enhancements led to more refined features that marginally boosted classification performance, showcasing the versatility of contrastive learning. These findings suggest that contrastive learning can be a valuable tool for improving the reliability of two-sample tests on image data and enhancing features on tabular data. This dual applicability highlights its potential in a variety of data types, making it a promising area for further research. Future research could explore its application to other types of data such as text and voice, potentially broadening the scope and impact of contrastive learning methodologies.
first_indexed 2024-10-01T05:46:29Z
format Thesis-Master by Coursework
id ntu-10356/179367
institution Nanyang Technological University
language English
last_indexed 2024-10-01T05:46:29Z
publishDate 2024
publisher Nanyang Technological University
record_format dspace
spelling ntu-10356/179367 2024-08-02T15:43:22Z Exploring the application of contrastive learning on two-sample hypothesis test and tabular data Wan, Bingbing Lihui Chen School of Electrical and Electronic Engineering ELHCHEN@ntu.edu.sg Computer and Information Science Engineering Contrastive learning Two-sample tests Maximum Mean Discrepancy (MMD) Feature extraction Master's degree 2024-07-31T07:37:13Z 2024-07-31T07:37:13Z 2024 Thesis-Master by Coursework Wan, B. (2024). Exploring the application of contrastive learning on two-sample hypothesis test and tabular data. Master's thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/179367 https://hdl.handle.net/10356/179367 en 3511: CL for summarization application/pdf Nanyang Technological University
title Exploring the application of contrastive learning on two-sample hypothesis test and tabular data
topic Computer and Information Science
Engineering
Contrastive learning
Two-sample tests
Maximum Mean Discrepancy (MMD)
Feature extraction
url https://hdl.handle.net/10356/179367
work_keys_str_mv AT wanbingbing exploringtheapplicationofcontrastivelearningontwosamplehypothesistestandtabulardata
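
The description above pairs learned features with the Maximum Mean Discrepancy (MMD) two-sample test. As a rough illustration of what such a test computes, the following is a minimal sketch, not the thesis's implementation: it assumes the inputs are already feature vectors (for example, embeddings from some contrastive encoder, which is not shown), uses a Gaussian kernel with a median-heuristic bandwidth, and calibrates the p-value by permutation. All function names here are hypothetical.

```python
import numpy as np


def pairwise_sq_dists(a, b):
    """Squared Euclidean distances between every row of a and every row of b."""
    sq = (a ** 2).sum(axis=1)[:, None] + (b ** 2).sum(axis=1)[None, :] - 2.0 * (a @ b.T)
    return np.maximum(sq, 0.0)  # clip tiny negatives caused by rounding


def mmd2_biased(k, n):
    """Biased MMD^2 estimate from a pooled kernel matrix whose first n rows belong to X."""
    return k[:n, :n].mean() + k[n:, n:].mean() - 2.0 * k[:n, n:].mean()


def mmd_permutation_test(x, y, n_perm=200, seed=0):
    """Return (MMD^2 statistic, permutation p-value) for feature samples x and y."""
    rng = np.random.default_rng(seed)
    z = np.vstack([x, y])  # pooled sample, X rows first
    n = len(x)
    sq = pairwise_sq_dists(z, z)
    # Median heuristic (an assumption of this sketch): bandwidth taken from
    # the median squared distance over distinct pairs of pooled points.
    upper = np.triu_indices(len(z), k=1)
    bandwidth = np.sqrt(np.median(sq[upper]))
    k = np.exp(-sq / (2.0 * bandwidth ** 2))  # Gaussian (RBF) kernel matrix
    observed = mmd2_biased(k, n)
    exceed = 0
    for _ in range(n_perm):
        # Shuffling the pooled indices simulates the null hypothesis P = Q.
        perm = rng.permutation(len(z))
        if mmd2_biased(k[np.ix_(perm, perm)], n) >= observed:
            exceed += 1
    p_value = (1 + exceed) / (1 + n_perm)
    return observed, p_value


if __name__ == "__main__":
    demo_rng = np.random.default_rng(1)
    feats_p = demo_rng.normal(0.0, 1.0, size=(100, 5))  # stand-in features from P
    feats_q = demo_rng.normal(1.5, 1.0, size=(100, 5))  # stand-in features from Q
    print(mmd_permutation_test(feats_p, feats_q))
```

The kernel matrix is computed once and only re-indexed for each permutation, which keeps the permutation loop cheap. In a setting like the one described, the raw inputs would first be mapped to features before being passed to `mmd_permutation_test`.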