A Novel Approach to Regression: Exploring the Similarity Space with Ordinary Least Squares on Database Records
The proliferation of textual data, notably in the form of database records, calls for innovative methods of analysis that go beyond traditional numerical techniques. While least squares regression has been a cornerstone in quantitative data analysis, its applicability to textual data remains largely...
Main Authors: | Ondřej Rozinek, Monika Borkovcova |
---|---|
Format: | Article |
Language: | English |
Published: | FRUCT 2023-11-01 |
Series: | Proceedings of the XXth Conference of Open Innovations Association FRUCT |
Subjects: | similarity space, linear regression, similarity search |
Online Access: | https://www.fruct.org/publications/volume-34/acm34/files/Roz.pdf |
_version_ | 1797354941095870464 |
---|---|
author | Ondřej Rozinek; Monika Borkovcova |
author_facet | Ondřej Rozinek; Monika Borkovcova |
author_sort | Ondřej Rozinek |
collection | DOAJ |
description | The proliferation of textual data, notably in the form of database records, calls for innovative methods of analysis that go beyond traditional numerical techniques. While least squares regression has been a cornerstone in quantitative data analysis, its applicability to textual data remains largely unexplored. This study aims to bridge this gap by introducing a similarity-based least squares method tailored for textual data. Drawing on the principles of similarity measures in text, such as semantic and syntactic closeness, we propose an extension to the conventional least squares framework. Our approach incorporates word-based similarity metrics into the least squares objective function, enabling the analysis of textual data in a manner coherent with its qualitative nature. The developed methodology is rigorously evaluated using both synthetic and real-world database records, demonstrating its efficacy in uncovering intricate relationships within textual data. Our findings open new avenues for textual data analysis, blending the precision of classical statistical methods with the subtleties of text similarity. |
first_indexed | 2024-03-08T13:57:06Z |
format | Article |
id | doaj.art-5632f330cb574fba98b224c8b729cc37 |
institution | Directory Open Access Journal |
issn | 2305-7254 2343-0737 |
language | English |
last_indexed | 2024-03-08T13:57:06Z |
publishDate | 2023-11-01 |
publisher | FRUCT |
record_format | Article |
series | Proceedings of the XXth Conference of Open Innovations Association FRUCT |
spelling | doaj.art-5632f330cb574fba98b224c8b729cc37; 2024-01-15T12:32:23Z; eng; FRUCT; Proceedings of the XXth Conference of Open Innovations Association FRUCT; 2305-7254; 2343-0737; 2023-11-01; 342277; https://youtu.be/h8gHeaYM13s; 10.5281/zenodo.10426332; A Novel Approach to Regression: Exploring the Similarity Space with Ordinary Least Squares on Database Records; Ondřej Rozinek 0; Monika Borkovcova 1; University of Pardubice; University of Pardubice; The proliferation of textual data, notably in the form of database records, calls for innovative methods of analysis that go beyond traditional numerical techniques. While least squares regression has been a cornerstone in quantitative data analysis, its applicability to textual data remains largely unexplored. This study aims to bridge this gap by introducing a similarity-based least squares method tailored for textual data. Drawing on the principles of similarity measures in text, such as semantic and syntactic closeness, we propose an extension to the conventional least squares framework. Our approach incorporates word-based similarity metrics into the least squares objective function, enabling the analysis of textual data in a manner coherent with its qualitative nature. The developed methodology is rigorously evaluated using both synthetic and real-world database records, demonstrating its efficacy in uncovering intricate relationships within textual data. Our findings open new avenues for textual data analysis, blending the precision of classical statistical methods with the subtleties of text similarity.; https://www.fruct.org/publications/volume-34/acm34/files/Roz.pdf; similarity space; linear regression; similarity search |
title | A Novel Approach to Regression: Exploring the Similarity Space with Ordinary Least Squares on Database Records |
topic | similarity space; linear regression; similarity search |
url | https://www.fruct.org/publications/volume-34/acm34/files/Roz.pdf |
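
The description above mentions incorporating word-based similarity metrics into the least squares objective, but this record does not give the authors' formulation. As a rough illustration only, and not the method from the paper, the sketch below maps each text record to its word-level Jaccard similarity against a few reference records and then fits ordinary least squares on those similarity features. The function names, the choice of Jaccard similarity, the reference records, and the toy data are all assumptions made for this example.

```python
import numpy as np


def jaccard(a: str, b: str) -> float:
    """Word-level Jaccard similarity: shared words / all distinct words."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa and not wb:
        return 1.0  # two empty records are treated as identical
    return len(wa & wb) / len(wa | wb)


def similarity_features(records, references):
    """Design matrix: an intercept column plus one similarity column per reference."""
    sims = np.array([[jaccard(r, ref) for ref in references] for r in records])
    return np.hstack([np.ones((len(records), 1)), sims])


def fit_ols(X, y):
    """Ordinary least squares via NumPy's least-squares solver."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


# Hypothetical toy data: text records with a numeric target (e.g. a price).
records = [
    "cotton shirt",
    "red cotton shirt",
    "wool coat",
    "long wool coat",
    "cotton coat",
]
prices = np.array([20.0, 22.0, 80.0, 85.0, 45.0])

# Hypothetical reference records that span the "similarity space".
references = ["cotton shirt", "wool coat"]

X = similarity_features(records, references)
coef = fit_ols(X, prices)
pred = X @ coef

print("coefficients:", coef)
print("predictions:", pred)
```

In this sketch a new record would be scored by computing its similarities to the same reference records and multiplying by `coef`. A different text similarity (edit distance, embedding cosine similarity) could be swapped into `jaccard` without changing the regression step.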