A Siamese Transformer Network for Zero-Shot Ancient Coin Classification

Ancient numismatics, the study of ancient coins, has in recent years become an attractive domain for the application of computer vision and machine learning. Though rich in research problems, the predominant focus in this area to date has been on the task of attributing a coin from an image, that is...

Bibliographic Details
Main Authors: Zhongliang Guo, Ognjen Arandjelović, David Reid, Yaxiong Lei, Jochen Büttner
Format: Article
Language: English
Published: MDPI AG 2023-05-01
Series: Journal of Imaging
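Subjects: Siamese neural network; matching; deep learning; computer vision; machine learning; low-shot learning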
Online Access: https://www.mdpi.com/2313-433X/9/6/107
author Zhongliang Guo
Ognjen Arandjelović
David Reid
Yaxiong Lei
Jochen Büttner
collection DOAJ
description Ancient numismatics, the study of ancient coins, has in recent years become an attractive domain for the application of computer vision and machine learning. Though rich in research problems, the predominant focus in this area to date has been on the task of attributing a coin from an image, that is, of identifying its issue. This may be considered the cardinal problem in the field, and it continues to challenge automatic methods. In the present paper, we address a number of limitations of previous work. Firstly, the existing methods approach the problem as a classification task. As such, they are unable to deal with classes with no or few exemplars (which would be most, given over 50,000 issues of Roman Imperial coins alone), and require retraining when exemplars of a new class become available. Hence, rather than seeking to learn a representation that distinguishes a particular class from all the others, herein we seek a representation that is overall best at distinguishing classes from one another, thus relinquishing the demand for exemplars of any specific class. This leads to our adoption of the paradigm of pairwise coin matching by issue, rather than the usual classification paradigm, and the specific solution we propose in the form of a Siamese neural network. Furthermore, while adopting deep learning, motivated by its successes in the field and its unchallenged superiority over classical computer vision approaches, we also seek to leverage the advantages that transformers have over the previously employed convolutional neural networks, and in particular their non-local attention mechanisms, which ought to be particularly useful in ancient coin analysis by associating semantically but not visually related distal elements of a coin’s design. Evaluated on a large data corpus of 14,820 images and 7,605 issues, using transfer learning and only a small training set of 542 images of 24 issues, our Double Siamese ViT model is shown to surpass the state of the art by a large margin, achieving an overall accuracy of 81%. Moreover, our further investigation of the results shows that the majority of the method’s errors are unrelated to the intrinsic aspects of the algorithm itself, but are rather a consequence of unclean data, which is a problem that can be easily addressed in practice by simple pre-processing and quality checking.
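The pairwise matching paradigm described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' Double Siamese ViT model: the function names (`toy_embed`, `similarity`, `same_issue`, `attribute`), the 0.9 threshold, and the toy feature vectors are all assumptions made for illustration, and `toy_embed` merely stands in for a learned encoder that is shared by both branches of a Siamese network.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Vector = List[float]
Embedder = Callable[[Sequence[float]], Vector]


def toy_embed(features: Sequence[float]) -> Vector:
    """Hypothetical stand-in for a shared encoder: L2-normalise the input."""
    norm = math.sqrt(sum(x * x for x in features)) or 1.0
    return [x / norm for x in features]


def similarity(a: Sequence[float], b: Sequence[float],
               embed: Embedder = toy_embed) -> float:
    """Cosine similarity of two inputs passed through the SAME encoder."""
    ea, eb = embed(a), embed(b)
    return sum(x * y for x, y in zip(ea, eb))


def same_issue(a: Sequence[float], b: Sequence[float],
               threshold: float = 0.9, embed: Embedder = toy_embed) -> bool:
    """Pairwise decision: do the two coins appear to share an issue?"""
    return similarity(a, b, embed) >= threshold


def attribute(query: Sequence[float],
              references: Dict[str, Sequence[float]],
              embed: Embedder = toy_embed) -> Tuple[str, float]:
    """Attribute a query coin to the best-matching reference issue.

    Adding a new issue only requires adding a reference exemplar;
    nothing is retrained.
    """
    scores = {issue: similarity(query, ref, embed)
              for issue, ref in references.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]
```

Because the decision is made per pair rather than per class, an issue absent from training can still be matched once a single reference exemplar of it is available, which is the property the abstract contrasts with conventional classification.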
first_indexed 2024-03-11T02:17:31Z
format Article
id doaj.art-8c99a298c1294b7883058b48e1e96de3
institution Directory of Open Access Journals
issn 2313-433X
language English
last_indexed 2024-04-25T01:29:13Z
publishDate 2023-05-01
publisher MDPI AG
record_format Article
series Journal of Imaging
spelling doaj.art-8c99a298c1294b7883058b48e1e96de3
2024-03-08T12:44:48Z
eng
MDPI AG
Journal of Imaging
2313-433X
2023-05-01
Volume 9, Issue 6, Article 107
10.3390/jimaging9060107
A Siamese Transformer Network for Zero-Shot Ancient Coin Classification
Zhongliang Guo (School of Computer Science, University of St Andrews, Scotland KY16 9AJ, UK)
Ognjen Arandjelović (School of Computer Science, University of St Andrews, Scotland KY16 9AJ, UK)
David Reid (School of Computer Science, University of St Andrews, Scotland KY16 9AJ, UK)
Yaxiong Lei (School of Computer Science, University of St Andrews, Scotland KY16 9AJ, UK)
Jochen Büttner (Max Planck Institute for the History of Science, Boltzmannstraße 22, 14195 Berlin, Germany)
https://www.mdpi.com/2313-433X/9/6/107
title A Siamese Transformer Network for Zero-Shot Ancient Coin Classification
topic Siamese neural network
matching
deep learning
computer vision
machine learning
low-shot learning
url https://www.mdpi.com/2313-433X/9/6/107