Redesign of Handling of Tautomerism for InChI V2
Description
The IUPAC International Chemical Identifier (InChI) algorithm is now well established as a powerful means of representing the basic chemical structure of a well-defined, small (<1024 atoms) organic molecule as a unique, machine-readable character string suitable for electronic data storage, searching, and exchange. The IUPAC Division VIII InChI Subcommittee is now starting work on a complete overhaul of the InChI algorithm, that is, on initial planning for a version 2 of InChI. A crucial part of this work is to address the known shortcomings of the current InChI algorithm in its handling (or lack thereof) of various types of tautomerism.
Important issues to be addressed in this context are discussed in, for example, M. Sitzmann, W.-D. Ihlenfeldt, M. C. Nicklaus, “Tautomerism in large databases”, J Comput Aided Mol Des (2010) 24:521-551; https://dx.doi.org/10.1007/s10822-010-9346-4.
The present project is devoted to analyzing how InChI currently handles the various types of tautomerism, the deficiencies of that handling, and the connection of tautomerism with metal disconnection and protonation/deprotonation; comparing InChI’s current algorithm with approaches published in the literature and/or used in other databases and software; and compiling a list of new requirements for how an InChI V2 algorithm should handle tautomerism and related issues.
Information | |
---|---|
Content Type | OER |
Date Published | August 1, 2012 |
Content Tags | Database Applications, IUPAC Project, InChI Development, Tautomers |