DECIMER—hand-drawn molecule images dataset
Abstract The translation of images of chemical structures into machine-readable representations of the depicted molecules is known as optical chemical structure recognition (OCSR). There has been a lot of progress over the last three decades in this field, but the development of systems for the reco...
Main Authors: | Henning Otto Brinkhaus, Achim Zielesny, Christoph Steinbeck, Kohulan Rajan |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2022-06-01 |
Series: | Journal of Cheminformatics |
Online Access: | https://doi.org/10.1186/s13321-022-00620-9 |
_version_ | 1811248089552388096 |
---|---|
author | Henning Otto Brinkhaus Achim Zielesny Christoph Steinbeck Kohulan Rajan |
author_facet | Henning Otto Brinkhaus Achim Zielesny Christoph Steinbeck Kohulan Rajan |
author_sort | Henning Otto Brinkhaus |
collection | DOAJ |
description | The translation of images of chemical structures into machine-readable representations of the depicted molecules is known as optical chemical structure recognition (OCSR). There has been a lot of progress over the last three decades in this field, but the development of systems for the recognition of complex hand-drawn structure depictions is still at the beginning. Currently, there is no data for the systematic evaluation of OCSR methods on hand-drawn structures available. Here we present DECIMER — Hand-drawn molecule images, a standardised, openly available benchmark dataset of 5088 hand-drawn depictions of diversely picked chemical structures. Every structure depiction in the dataset is mapped to a machine-readable representation of the underlying molecule. The dataset is openly available and published under the CC-BY 4.0 licence which applies very few limitations. We hope that it will contribute to the further development of the field. |
first_indexed | 2024-04-12T15:20:42Z |
format | Article |
id | doaj.art-995414f823a94f8ea5c6c01bd506c97b |
institution | Directory of Open Access Journals |
issn | 1758-2946 |
language | English |
last_indexed | 2024-04-12T15:20:42Z |
publishDate | 2022-06-01 |
publisher | BMC |
record_format | Article |
series | Journal of Cheminformatics |
spelling | doaj.art-995414f823a94f8ea5c6c01bd506c97b; 2022-12-22T03:27:27Z; eng; BMC; Journal of Cheminformatics; 1758-2946; 2022-06-01; 14114; 10.1186/s13321-022-00620-9; DECIMER—hand-drawn molecule images dataset; Henning Otto Brinkhaus [0]; Achim Zielesny [1]; Christoph Steinbeck [2]; Kohulan Rajan [3]; [0] Institute for Inorganic and Analytical Chemistry, Friedrich-Schiller-University Jena; [1] Institute for Bioinformatics and Chemoinformatics, Westphalian University of Applied Sciences; [2] Institute for Inorganic and Analytical Chemistry, Friedrich-Schiller-University Jena; [3] Institute for Inorganic and Analytical Chemistry, Friedrich-Schiller-University Jena; The translation of images of chemical structures into machine-readable representations of the depicted molecules is known as optical chemical structure recognition (OCSR). There has been a lot of progress over the last three decades in this field, but the development of systems for the recognition of complex hand-drawn structure depictions is still at the beginning. Currently, there is no data for the systematic evaluation of OCSR methods on hand-drawn structures available. Here we present DECIMER — Hand-drawn molecule images, a standardised, openly available benchmark dataset of 5088 hand-drawn depictions of diversely picked chemical structures. Every structure depiction in the dataset is mapped to a machine-readable representation of the underlying molecule. The dataset is openly available and published under the CC-BY 4.0 licence which applies very few limitations. We hope that it will contribute to the further development of the field.; https://doi.org/10.1186/s13321-022-00620-9 |
spellingShingle | Henning Otto Brinkhaus Achim Zielesny Christoph Steinbeck Kohulan Rajan DECIMER—hand-drawn molecule images dataset Journal of Cheminformatics |
title | DECIMER—hand-drawn molecule images dataset |
title_full | DECIMER—hand-drawn molecule images dataset |
title_fullStr | DECIMER—hand-drawn molecule images dataset |
title_full_unstemmed | DECIMER—hand-drawn molecule images dataset |
title_short | DECIMER—hand-drawn molecule images dataset |
title_sort | decimer hand drawn molecule images dataset |
url | https://doi.org/10.1186/s13321-022-00620-9 |
work_keys_str_mv | AT henningottobrinkhaus decimerhanddrawnmoleculeimagesdataset AT achimzielesny decimerhanddrawnmoleculeimagesdataset AT christophsteinbeck decimerhanddrawnmoleculeimagesdataset AT kohulanrajan decimerhanddrawnmoleculeimagesdataset |