The promise of zero-shot learning for alcohol image detection: comparison with a task-specific deep learning algorithm

Bibliographic Details
Main Authors: Abraham Albert Bonela, Aiden Nibali, Zhen He, Benjamin Riordan, Dan Anderson-Luxford, Emmanuel Kuntsche
Format: Article
Language: English
Published: Nature Portfolio 2023-07-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-023-39169-4
Collection: DOAJ
Description: Exposure to alcohol content in media increases alcohol consumption and related harm. With exponential growth of media content, it is important to use algorithms to automatically detect and quantify alcohol exposure. Foundation models such as Contrastive Language-Image Pretraining (CLIP) can detect alcohol exposure through Zero-Shot Learning (ZSL) without any additional training. In this paper, we evaluated the ZSL performance of CLIP against a supervised algorithm called Alcoholic Beverage Identification Deep Learning Algorithm Version-2 (ABIDLA2), which is specifically trained to recognise alcoholic beverages in images, across three tasks. We found ZSL achieved similar performance compared to ABIDLA2 in two out of three tasks. However, ABIDLA2 outperformed ZSL in a fine-grained classification task in which determining subtle differences among alcoholic beverages (including containers) is essential. We also found that phrase engineering is essential for improving the performance of ZSL. To conclude, like ABIDLA2, ZSL with little phrase engineering can achieve promising performance in identifying alcohol exposure in images. This makes it easier for researchers, with little or no programming background, to implement ZSL effectively to obtain insightful analytics from digital media. Such analytics can assist researchers and policy makers to propose regulations that can prevent alcohol exposure and eventually prevent alcohol consumption.
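The zero-shot step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the image and each candidate phrase have already been encoded into vectors, and it uses hypothetical 3-dimensional toy vectors in place of real CLIP embeddings. The phrases, the function names, and the `logit_scale` value are illustrative assumptions (CLIP applies a learned temperature of a similar magnitude). The image is assigned to the phrase whose embedding is most similar to it, with no training involved.

```python
import math


def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity between two vectors."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def zero_shot_classify(image_embedding, phrase_embeddings, logit_scale=100.0):
    """Return a probability for each candidate phrase.

    logit_scale is an assumed temperature, not a value taken from the paper.
    """
    phrases = list(phrase_embeddings)
    logits = [
        logit_scale * cosine_similarity(image_embedding, phrase_embeddings[p])
        for p in phrases
    ]
    return dict(zip(phrases, softmax(logits)))


# Hypothetical toy vectors standing in for encoder outputs (NOT real CLIP embeddings).
PHRASE_EMBEDDINGS = {
    "a photo of an alcoholic beverage": [0.9, 0.1, 0.2],
    "a photo without any alcohol": [0.1, 0.9, 0.3],
}
IMAGE_EMBEDDING = [0.8, 0.2, 0.1]

probabilities = zero_shot_classify(IMAGE_EMBEDDING, PHRASE_EMBEDDINGS)
predicted_phrase = max(probabilities, key=probabilities.get)
print(predicted_phrase)
```

Under this framing, phrase engineering amounts to changing the keys of `PHRASE_EMBEDDINGS` (and re-encoding them) until the phrases separate the classes well; the scoring code itself stays the same.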
Record ID: doaj.art-1fd92654a08d4c69aab543f19cb28940
Institution: Directory Open Access Journal
ISSN: 2045-2322
First indexed: 2024-03-12T22:17:34Z
Last indexed: 2024-03-12T22:17:34Z
Volume: 13
Issue: 1
Pages: 1-9
Author affiliations:
Abraham Albert Bonela: Centre for Alcohol Policy Research, La Trobe University
Aiden Nibali: Computer Science and Information Technology, La Trobe University
Zhen He: Computer Science and Information Technology, La Trobe University
Benjamin Riordan: Centre for Alcohol Policy Research, La Trobe University
Dan Anderson-Luxford: Centre for Alcohol Policy Research, La Trobe University
Emmanuel Kuntsche: Centre for Alcohol Policy Research, La Trobe University