I was just wondering, is there any way to convert IUPAC or common molecular names to SMILES? I want to do this without having to manually convert every single one utilizing online systems. Any input would be much appreciated!
For background, I am currently working with Python and RDKit, so I wasn’t sure whether RDKit could do this and I was just unaware of it. My current data is in CSV format.
Thank you!
Answer
RDKit can't convert names to SMILES. The Chemical Identifier Resolver (CIR) can convert names and other identifiers (such as CAS numbers), and it has an API, so you can do the conversion with a script.
from urllib.request import urlopen
from urllib.parse import quote

def CIRconvert(ids):
    # Ask the Chemical Identifier Resolver for the SMILES of one identifier.
    try:
        url = 'http://cactus.nci.nih.gov/chemical/structure/' + quote(ids) + '/smiles'
        ans = urlopen(url).read().decode('utf8')
        return ans
    except Exception:
        # Unresolvable identifiers and network errors both end up here.
        return 'Did not work'

identifiers = ['3-Methylheptane', 'Aspirin', 'Diethylsulfate', 'Diethyl sulfate', '50-78-2', 'Adamant']

for ids in identifiers:
    print(ids, CIRconvert(ids))
Output
3-Methylheptane CCCCC(C)CC
Aspirin CC(=O)Oc1ccccc1C(O)=O
Diethylsulfate CCO[S](=O)(=O)OCC
Diethyl sulfate CCO[S](=O)(=O)OCC
50-78-2 CC(=O)Oc1ccccc1C(O)=O
Adamant Did not work
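Since your data is in a CSV file, here is a rough sketch of how the lookup could be applied to a whole column. The column names `name` and `smiles`, the helper `add_smiles_column`, and the stand-in `fake_resolve` are all made up for illustration; `fake_resolve` only exists so the example runs without network access. Results are cached so a name that appears several times is only looked up once.

```python
import csv
import io

def add_smiles_column(rows, resolve, name_field="name", smiles_field="smiles"):
    """Return new row dicts with a SMILES column filled in by `resolve`.

    `rows` is an iterable of dicts (e.g. from csv.DictReader) and `resolve`
    is any function mapping an identifier string to a SMILES string.
    """
    cache = {}  # avoid repeating the lookup for duplicate names
    out = []
    for row in rows:
        name = row[name_field].strip()
        if name not in cache:
            cache[name] = resolve(name)
        out.append({**row, smiles_field: cache[name]})
    return out

# Hypothetical offline stand-in for the web lookup, using two results
# from the output above; anything else is reported as a failure.
def fake_resolve(name):
    known = {
        "Aspirin": "CC(=O)Oc1ccccc1C(O)=O",
        "3-Methylheptane": "CCCCC(C)CC",
    }
    return known.get(name, "Did not work")

# Assumed CSV layout: a single column called "name".
sample_csv = "name\nAspirin\n3-Methylheptane\nAspirin\nAdamant\n"
rows = list(csv.DictReader(io.StringIO(sample_csv)))

converted = add_smiles_column(rows, fake_resolve)
for row in converted:
    print(row["name"], row["smiles"])
```

With real data you would open your file instead of `sample_csv` and pass `CIRconvert` in place of `fake_resolve`. Failed lookups keep the 'Did not work' marker, so they are easy to filter out afterwards.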