Matlab InChIKey Scripts

This is a collection of Matlab scripts for working with InChIKeys: IKextract, IKfreqFH, IKstring, and IKmusic.

IKextract, InChIKey Extract, extracts InChIKeys from chemical structure-data files (SDFs). This script was successfully used to extract over 90 million InChIKeys (unique chemical identifiers) from over 5000 PubChem SD files. Users can also extract other data from SDFs by specifying the desired SD tag.
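The tag-based extraction described above can be illustrated with a short sketch. This is Python rather than Matlab and is not the IKextract code itself; the function name `extract_sd_tag` and the toy records are hypothetical. It assumes the usual SD file layout, in which a data item starts with a `> <TAG>` header line followed by its value, and it uses PubChem's `PUBCHEM_IUPAC_INCHIKEY` tag.

```python
def extract_sd_tag(sdf_text, tag):
    """Collect the value line that follows each '> <tag>' header.

    Hypothetical helper for illustration; assumes single-line values.
    """
    values = []
    lines = sdf_text.splitlines()
    for i, line in enumerate(lines):
        stripped = line.strip()
        # SD data headers start with '>' and carry the tag in angle brackets.
        if stripped.startswith(">") and f"<{tag}>" in stripped:
            if i + 1 < len(lines):
                value = lines[i + 1].strip()
                if value:  # skip empty data items
                    values.append(value)
    return values


# Toy two-record SD text; the molfile blocks are replaced by placeholders.
sample = """\
962
  (molfile block omitted in this toy record)
M  END
> <PUBCHEM_COMPOUND_CID>
962

> <PUBCHEM_IUPAC_INCHIKEY>
XLYOFNOQVPJJNP-UHFFFAOYSA-N

$$$$
702
  (molfile block omitted in this toy record)
M  END
> <PUBCHEM_COMPOUND_CID>
702

> <PUBCHEM_IUPAC_INCHIKEY>
LFQSCWFLJHTTHZ-UHFFFAOYSA-N

$$$$
"""

inchikeys = extract_sd_tag(sample, "PUBCHEM_IUPAC_INCHIKEY")
cids = extract_sd_tag(sample, "PUBCHEM_COMPOUND_CID")
print(inchikeys)
print(cids)
```

Passing a different tag name, as in the `cids` line, corresponds to extracting other data from the same files.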

IKfreqFH, InChIKey frequency of first hash block, extracts the first hash block of InChIKeys and sorts them by frequency. Because the first block (the 14 characters before the first hyphen) encodes molecular connectivity, this method is useful for analyzing the variety of chemical connectivity in large datasets.
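As a rough illustration of the first-block frequency idea, the following Python sketch splits each InChIKey at its first hyphen and counts how often each first block occurs. It is not the IKfreqFH script; the function name `first_block_frequencies` and the short example list are assumptions made for this example.

```python
from collections import Counter


def first_block_frequencies(inchikeys):
    """Return (first_block, count) pairs, most frequent first.

    Hypothetical helper for illustration only.
    """
    # The first hash block is everything before the first hyphen.
    blocks = (key.split("-", 1)[0] for key in inchikeys)
    return Counter(blocks).most_common()


# Example keys: three share a first block (same connectivity,
# different stereochemistry layer); one is unrelated.
keys = [
    "QNAYBMKLOCPYGJ-REOHZSRUSA-N",
    "QNAYBMKLOCPYGJ-UWTATZPHSA-N",
    "QNAYBMKLOCPYGJ-UHFFFAOYSA-N",
    "XLYOFNOQVPJJNP-UHFFFAOYSA-N",
]

frequencies = first_block_frequencies(keys)
print(frequencies)
# [('QNAYBMKLOCPYGJ', 3), ('XLYOFNOQVPJJNP', 1)]
```

First blocks with high counts point to connectivity skeletons that appear many times in the dataset, for example as different stereoisomers.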

IKstring, InChIKey String, searches for strings within InChIKeys. I use it to search the more than 90 million InChIKeys in PubChem.
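A substring search of this kind can be sketched in a few lines of Python. This is only an illustration, not the IKstring script; the function name `search_inchikeys` is hypothetical, and the upper-casing of the query is an assumption based on InChIKeys using uppercase letters only.

```python
def search_inchikeys(inchikeys, query):
    """Return the InChIKeys that contain the query string.

    Hypothetical helper for illustration only.
    """
    query = query.upper()  # InChIKeys use uppercase letters only
    return [key for key in inchikeys if query in key]


keys = [
    "QNAYBMKLOCPYGJ-REOHZSRUSA-N",
    "QNAYBMKLOCPYGJ-UHFFFAOYSA-N",
    "XLYOFNOQVPJJNP-UHFFFAOYSA-N",
]

hits = search_inchikeys(keys, "uhfffaoysa")
print(hits)
# ['QNAYBMKLOCPYGJ-UHFFFAOYSA-N', 'XLYOFNOQVPJJNP-UHFFFAOYSA-N']
```

For a list on the scale of tens of millions of keys, the same test could be applied line by line while reading a file, so that the full list never has to be held in memory at once.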

IKmusic, InChIKey Music, creates music from InChIKeys. A unique song is created for each InChIKey (i.e., every unique chemical substance has a different song!).

Information
Content Type Non OER
Author Vincent F. Scalfani
DOI https://www.mathworks.com/matlabcentral/fileexchange/62870-matlab-inchikey-scripts-ikextract-ikfreqfh-ikstring-and-ikmusic
Content Link https://www.mathworks.com/matlabcentral/fileexchange/62870-matlab-inchikey-scripts-ikextract-ikfreqfh-ikstring-and-ikmusic
Content Status publish
Content Tags Cheminformatics, Data Extraction, Data Management, Matlab