Comparison of InChI to other chemical formats
The following table compares InChI to a few common chemical formats.
| | InChI | InChIKey | SMILES | Molfile | CML |
|---|---|---|---|---|---|
| Linearized | Yes | Yes | Yes | No | No |
| Unique, canonical | Yes | No [10] | Possibly [1] | No | No |
| Human readable | Hardly [2] | Impossible | Easily [3] | Hardly [2] | Hardly [2] |
| Includes atom coordinates | No | No | No | Yes [4] | Yes [4] |
| Length (characters per atom) [5] | ~2 | ~1 [11] | 1-2 | ~50 | ~50 |
| Software support (0-1) | 0.3 [6] | 0.1 [12] | 0.2 [7] | 1 [8] | 0.5 [9] |
1. SMILES is normally not unique, but a canonical form is possible that is unique for each structure.
2. These formats store the information about atoms and bonds separately, so reading them by hand requires at least a piece of paper, a pen, some knowledge of the format, and a lot of patience.
3. SMILES was designed to be read and written by humans and is therefore relatively straightforward to read, provided the user knows a few basic principles of the format.
4. These formats do not require atom coordinates to be present, but usually contain them.
5. This is only a very rough estimate and may vary significantly, especially for the longer formats.
6. InChI is gaining support from software producers and is now understood by most major chemical editors.
7. A few programs support SMILES.
8. Molfile is supported by most chemical packages.
9. CML is supported by several chemical programs, but is far less common than Molfile.
10. Every molecule has only one valid InChIKey; however, two different molecules may have the same InChIKey.
11. InChIKey has a fixed length of 27 characters. For typical organic molecules this works out to around 1 character per atom, but it can be as much as 27 characters per atom for single-atom molecules. For large molecules the ratio approaches 0 characters per atom.
12. InChIKey is very new, but we expect it to be supported very soon by all software that already supports InChI.
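
The characters-per-atom figures given for InChIKey follow directly from its fixed length. The short Python sketch below illustrates that arithmetic only; the function name `inchikey_chars_per_atom` is hypothetical, and the atom counts are supplied by hand rather than parsed from a structure. The InChIKey of ethanol is included as a sample to check the 27-character length.

```python
# Minimal sketch: how the "characters per atom" ratio of a fixed-length
# InChIKey changes with molecule size. Atom counts are given by hand;
# nothing here parses or generates InChI or InChIKey strings.

INCHIKEY_LENGTH = 27  # fixed length stated in the notes above

# Standard InChIKey of ethanol, used only as a sample string.
ETHANOL_INCHIKEY = "LFQSCWFLJHTTHZ-UHFFFAOYSA-N"


def inchikey_chars_per_atom(atom_count: int) -> float:
    """Return InChIKey characters per atom for a molecule of the given size."""
    if atom_count <= 0:
        raise ValueError("atom_count must be a positive integer")
    return INCHIKEY_LENGTH / atom_count


if __name__ == "__main__":
    print(f"sample key length: {len(ETHANOL_INCHIKEY)}")
    # Single atom, a typical small organic molecule, and a large molecule.
    for atoms in (1, 27, 1000):
        ratio = inchikey_chars_per_atom(atoms)
        print(f"{atoms:5d} atoms: {ratio:.3f} characters per atom")
```

The ratio is 27 for a single atom, 1 for a molecule of 27 atoms, and falls toward 0 as the molecule grows, which is why the table can only give a rough "~1" for this column.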