A Novel Approach to Regression: Exploring the Similarity Space with Ordinary Least Squares on Database Records


Bibliographic Details
Main Authors: Ondřej Rozinek, Monika Borkovcova
Format: Article
Language: English
Published: FRUCT 2023-11-01
Series: Proceedings of the XXth Conference of Open Innovations Association FRUCT
Subjects: similarity space; linear regression; similarity search
Online Access: https://www.fruct.org/publications/volume-34/acm34/files/Roz.pdf
_version_ 1797354941095870464
author Ondřej Rozinek
Monika Borkovcova
author_facet Ondřej Rozinek
Monika Borkovcova
author_sort Ondřej Rozinek
collection DOAJ
description The proliferation of textual data, notably in the form of database records, calls for innovative methods of analysis that go beyond traditional numerical techniques. While least squares regression has been a cornerstone in quantitative data analysis, its applicability to textual data remains largely unexplored. This study aims to bridge this gap by introducing a similarity-based least squares method tailored for textual data. Drawing on the principles of similarity measures in text, such as semantic and syntactic closeness, we propose an extension to the conventional least squares framework. Our approach incorporates word-based similarity metrics into the least squares objective function, enabling the analysis of textual data in a manner coherent with its qualitative nature. The developed methodology is rigorously evaluated using both synthetic and real-world database records, demonstrating its efficacy in uncovering intricate relationships within textual data. Our findings open new avenues for textual data analysis, blending the precision of classical statistical methods with the subtleties of text similarity.
first_indexed 2024-03-08T13:57:06Z
format Article
id doaj.art-5632f330cb574fba98b224c8b729cc37
institution Directory Open Access Journal
issn 2305-7254
2343-0737
language English
last_indexed 2024-03-08T13:57:06Z
publishDate 2023-11-01
publisher FRUCT
record_format Article
series Proceedings of the XXth Conference of Open Innovations Association FRUCT
spelling doaj.art-5632f330cb574fba98b224c8b729cc37
2024-01-15T12:32:23Z
eng
FRUCT
Proceedings of the XXth Conference of Open Innovations Association FRUCT
2305-7254
2343-0737
2023-11-01
342277
https://youtu.be/h8gHeaYM13s
10.5281/zenodo.10426332
A Novel Approach to Regression: Exploring the Similarity Space with Ordinary Least Squares on Database Records
Ondřej Rozinek (University of Pardubice)
Monika Borkovcova (University of Pardubice)
https://www.fruct.org/publications/volume-34/acm34/files/Roz.pdf
similarity space
linear regression
similarity search
spellingShingle Ondřej Rozinek
Monika Borkovcova
A Novel Approach to Regression: Exploring the Similarity Space with Ordinary Least Squares on Database Records
Proceedings of the XXth Conference of Open Innovations Association FRUCT
similarity space
linear regression
similarity search
title A Novel Approach to Regression: Exploring the Similarity Space with Ordinary Least Squares on Database Records
title_sort novel approach to regression exploring the similarity space with ordinary least squares on database records
topic similarity space
linear regression
similarity search
url https://www.fruct.org/publications/volume-34/acm34/files/Roz.pdf