New developments on the cheminformatics open workflow environment CDK-Taverna

Abstract. Background: The computational processing and analysis of small molecules is at the heart of cheminformatics and structural bioinformatics and their application in e.g. metabolomics or drug discovery. Pipelining or workflow tools allow for the Lego™-...

Bibliographic Details
Main Authors: Truszkowski Andreas, Jayaseelan Kalai, Neumann Stefan, Willighagen Egon L, Zielesny Achim, Steinbeck Christoph
Format: Article
Language: English
Published: BMC 2011-12-01
Series: Journal of Cheminformatics
Online Access: http://www.jcheminf.com/content/3/1/54
_version_ 1818551408987734016
author Truszkowski Andreas
Jayaseelan Kalai
Neumann Stefan
Willighagen Egon L
Zielesny Achim
Steinbeck Christoph
collection DOAJ
description Abstract
Background: The computational processing and analysis of small molecules is at the heart of cheminformatics and structural bioinformatics and their application in e.g. metabolomics or drug discovery. Pipelining or workflow tools allow for the Lego™-like, graphical assembly of I/O modules and algorithms into a complex workflow which can be easily deployed, modified and tested without the hassle of implementing it in a monolithic application. The CDK-Taverna project aims at building a free open-source cheminformatics pipelining solution by combining different open-source projects such as Taverna, the Chemistry Development Kit (CDK) and the Waikato Environment for Knowledge Analysis (WEKA). A first integrated version 1.0 of CDK-Taverna was recently released to the public.
Results: The CDK-Taverna project was migrated to the most up-to-date versions of its foundational software libraries, with a complete re-engineering of its worker architecture (version 2.0). 64-bit computing and multi-core usage by parallel threads are now supported to allow for fast in-memory processing and analysis of large sets of molecules. Earlier deficiencies, such as workarounds for iterative data reading, have been removed. The reaction enumeration features related to combinatorial chemistry are considerably enhanced. Additional functionality for calculating a natural product likeness score for small molecules is implemented to identify possible drug candidates. Finally, the data analysis capabilities are extended with new workers that provide access to the open-source WEKA library for clustering and machine learning as well as training and test set partitioning. The new features are outlined with usage scenarios.
Conclusions: CDK-Taverna 2.0, as an open-source cheminformatics workflow solution, has matured into a freely available and increasingly powerful tool for the biosciences. The combination of the new CDK-Taverna worker family with the already available workflows developed by a lively Taverna community and published on myexperiment.org enables molecular scientists to quickly calculate, process and analyse molecular data as typically found in e.g. today's systems biology scenarios.
first_indexed 2024-12-12T08:59:31Z
format Article
id doaj.art-d3e1fab405ff4b47ba147a8bd96cbdce
institution Directory Open Access Journal
issn 1758-2946
language English
last_indexed 2024-12-12T08:59:31Z
publishDate 2011-12-01
publisher BMC
record_format Article
series Journal of Cheminformatics
spelling doaj.art-d3e1fab405ff4b47ba147a8bd96cbdce 2022-12-22T00:29:53Z eng BMC Journal of Cheminformatics 1758-2946 2011-12-01 3 1 54 10.1186/1758-2946-3-54 New developments on the cheminformatics open workflow environment CDK-Taverna http://www.jcheminf.com/content/3/1/54
title New developments on the cheminformatics open workflow environment CDK-Taverna
url http://www.jcheminf.com/content/3/1/54