A database of thermally activated delayed fluorescent molecules auto-generated from scientific literature with ChemDataExtractor

Bibliographic Details
Main Authors: Dingyun Huang, Jacqueline M. Cole
Format: Article
Language: English
Published: Nature Portfolio 2024-01-01
Series: Scientific Data
Online Access: https://doi.org/10.1038/s41597-023-02897-3
Description: A database of thermally activated delayed fluorescent (TADF) molecules was automatically generated from the scientific literature. It consists of 25,482 data records with an overall precision of 82%. Among these, 5,349 records have chemical names in the form of SMILES strings, which are represented with 91% accuracy; these are grouped in a subsidiary database. Each data record contains one of the following four properties: maximum emission wavelength (λ_EM), photoluminescence quantum yield (PLQY), singlet-triplet energy splitting (ΔE_ST), and delayed lifetime (τ_D). The databases were created through text mining using ChemDataExtractor, a chemistry-aware natural-language-processing toolkit, which has been adapted for TADF research. The text-mined corpus consisted of 2,733 papers from the Royal Society of Chemistry and Elsevier. To the best of the authors' knowledge, these are the first databases to have been auto-generated for TADF molecules from existing publications. The databases have been publicly released for experimental and computational applications in the TADF research field.
ISSN: 2052-4463
Author affiliations: Dingyun Huang and Jacqueline M. Cole, Cavendish Laboratory, University of Cambridge, J. J. Thomson Avenue