Comparison of InChI to other chemical formats
The following table compares InChI to a few common chemical formats.
Comparison of InChI to other formats
| | InChI | InChIKey | SMILES | Molfile | CML |
|---|---|---|---|---|---|
| Linearized | Yes | Yes | Yes | No | No |
| Unique, canonical | Yes | No [10] | Possibly [1] | No | No |
| Human readable | Hardly [2] | Impossible | Easily [3] | Hardly [2] | Hardly [2] |
| Includes atom coordinates | No | No | No | Yes [4] | Yes [4] |
| Length (characters per atom) [5] | ~2 | ~1 [11] | 1-2 | ~50 | ~50 |
| Software support (0-1) | 0.3 [6] | 0.1 [12] | 0.2 [7] | 1 [8] | 0.5 [9] |

Notes:
1. SMILES is normally not unique, but it can be written in a canonical form that is unique for each structure.
2. These formats store the information about atoms and bonds separately, so reading them requires at least a piece of paper, a pen, some knowledge of the format and a lot of patience.
3. SMILES was designed to be read and written by humans and is therefore relatively straightforward to read, provided the user knows a few basic principles of the format.
4. These formats do not require atom coordinates to be present, but usually contain them.
5. This is only a very rough estimate and may vary significantly, especially for the longer formats.
6. InChI is gaining support from software producers, but is not understood by most major chemical editors.
7. A few programs support SMILES.
8. Molfile is supported by most chemical packages.
9. CML is supported by several chemical programs, but is far less common than Molfile.
10. Each molecule has exactly one valid InChIKey, but two different molecules may share the same InChIKey.
11. InChIKey has a fixed length of 25 characters. For typical organic molecules this works out to around 1 character per atom, but it can be as much as 25 characters per atom for single-atom molecules. For large molecules the ratio approaches 0 characters per atom.
12. InChIKey is very new, but we expect it to be supported very soon by all software that already supports InChI.
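
The arithmetic behind the InChIKey length figure can be sketched in a few lines of Python. This is only an illustration of the fixed-length note above: the 25-character key length is taken from that note, while the function name `chars_per_atom` and the sample atom counts are assumptions made for the example.

```python
# Fixed InChIKey length as stated in the note above (an assumption of this sketch).
INCHIKEY_LENGTH = 25


def chars_per_atom(format_length: int, atom_count: int) -> float:
    """Return the number of characters spent per atom for a given record length."""
    if atom_count <= 0:
        raise ValueError("atom_count must be positive")
    return format_length / atom_count


if __name__ == "__main__":
    # Hypothetical molecule sizes: a single atom, a typical small organic
    # molecule, and two progressively larger structures.
    for atoms in (1, 25, 100, 1000):
        ratio = chars_per_atom(INCHIKEY_LENGTH, atoms)
        print(f"{atoms:>5} atoms: {ratio:.3f} characters per atom")
```

Because the key length is constant, the ratio falls as the molecule grows. For the formats whose length scales with the number of atoms, the ratio stays roughly constant instead.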