Transferability Based on Drug Structure Similarity in the Automatic Classification of Noncompliant Drug Use on Social Media: Natural Language Processing Approach

Background: Medication noncompliance is a critical issue because of the increased number of drugs sold on the web. Web-based drug distribution is difficult to control, causing problems such as drug noncompliance and abuse. The existing medication compliance surveys lack complet...

Bibliographic Details
Main Authors:	Tomohiro Nishiyama, Shuntaro Yada, Shoko Wakamiya, Satoko Hori, Eiji Aramaki
Format:	Article
Language:	English
Published:	JMIR Publications 2023-05-01
Series:	Journal of Medical Internet Research
Online Access:	https://www.jmir.org/2023/1/e44870

_version_	1797734268589309952
author	Tomohiro Nishiyama Shuntaro Yada Shoko Wakamiya Satoko Hori Eiji Aramaki
collection	DOAJ
description	Background: Medication noncompliance is a critical issue because of the increased number of drugs sold on the web. Web-based drug distribution is difficult to control, causing problems such as drug noncompliance and abuse. Existing medication compliance surveys lack completeness because they cannot cover patients who do not go to the hospital or who do not provide accurate information to their doctors, so a social media–based approach is being explored to collect information about drug use. Social media data, which include information on drug usage by users, can be used to detect drug abuse and medication compliance in patients.
Objective: This study aimed to assess how the structural similarity of drugs affects the efficiency of machine learning models for text classification of drug noncompliance.
Methods: This study analyzed 22,022 tweets about 20 different drugs. The tweets were labeled as noncompliant use or mention, noncompliant sales, general use, or general mention. The study compares 2 methods for training machine learning models for text classification: single-sub-corpus transfer learning, in which a model is trained on tweets about a single drug and then tested on tweets about other drugs, and multi-sub-corpus incremental learning, in which models are trained on tweets about drugs in order of their structural similarity. The performance of a machine learning model trained on a single subcorpus (a data set of tweets about a specific category of drugs) was compared to the performance of a model trained on multiple subcorpora (data sets of tweets about multiple categories of drugs).
Results: The performance of the model trained on a single subcorpus varied depending on the specific drug used for training. The Tanimoto similarity (a measure of the structural similarity between compounds) was weakly correlated with the classification results. When the number of subcorpora was small, the model trained by transfer learning on a corpus of drugs with close structural similarity performed better than the model trained by randomly adding a subcorpus.
Conclusions: The results suggest that structural similarity improves the classification performance of messages about unknown drugs if the drugs in the training corpus are few. On the other hand, there is little need to consider the influence of the Tanimoto structural similarity if a sufficient variety of drugs is ensured.
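The Tanimoto similarity mentioned in the description can be illustrated with a minimal sketch. It assumes each drug is represented by a set of "on" fingerprint bit indices; the record does not say which fingerprint the authors used, and the drug names, bit sets, and the helper rank_by_similarity below are hypothetical. The ranking step only illustrates the idea of ordering training subcorpora by similarity to a target drug and is not the authors' procedure.

```python
def tanimoto(a: set[int], b: set[int]) -> float:
    """Tanimoto coefficient |A & B| / |A | B| of two sets of on-bit indices."""
    if not a and not b:
        return 1.0  # assumed convention for two empty fingerprints
    return len(a & b) / len(a | b)


# Hypothetical fingerprints (sets of on-bit indices), for illustration only.
fingerprints = {
    "drug_a": {1, 4, 7, 9},
    "drug_b": {1, 4, 7, 12},
    "drug_c": {2, 3, 12},
}


def rank_by_similarity(target: str, fps: dict[str, set[int]]) -> list[str]:
    """Return the other drugs ordered from most to least similar to target."""
    others = [name for name in fps if name != target]
    return sorted(others, key=lambda n: tanimoto(fps[target], fps[n]), reverse=True)


# Candidate order in which subcorpora could be added for the target drug.
order = rank_by_similarity("drug_a", fingerprints)
print(order)
```

A value of 1 means the two bit sets are identical and 0 means they share no bits, so sorting in descending order puts the structurally closest drugs first.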
first_indexed	2024-03-12T12:40:45Z
format	Article
id	doaj.art-c9eb6d7db34249819db5881d979ea1b3
institution	Directory Open Access Journal
issn	1438-8871
language	English
last_indexed	2024-03-12T12:40:45Z
publishDate	2023-05-01
publisher	JMIR Publications
record_format	Article
series	Journal of Medical Internet Research
spelling	doaj.art-c9eb6d7db34249819db5881d979ea1b3; 2023-08-28T23:51:48Z; eng; JMIR Publications; Journal of Medical Internet Research; 1438-8871; 2023-05-01; 25; e44870; 10.2196/44870; Transferability Based on Drug Structure Similarity in the Automatic Classification of Noncompliant Drug Use on Social Media: Natural Language Processing Approach; Tomohiro Nishiyama (https://orcid.org/0000-0003-1538-8266); Shuntaro Yada (https://orcid.org/0000-0002-6209-1054); Shoko Wakamiya (https://orcid.org/0000-0002-9371-1340); Satoko Hori (https://orcid.org/0000-0002-4596-5418); Eiji Aramaki (https://orcid.org/0000-0003-0201-3609); https://www.jmir.org/2023/1/e44870
title	Transferability Based on Drug Structure Similarity in the Automatic Classification of Noncompliant Drug Use on Social Media: Natural Language Processing Approach
url	https://www.jmir.org/2023/1/e44870

Similar Items