

InChIKey is a new format directly derived from InChI. It has several features that distinguish it from InChI and make it attractive for slightly different purposes.
The following table summarizes the basic differences between InChI and InChIKey.
| Property | InChI | InChIKey |
| --- | --- | --- |
| Readable | yes | no |
| One string for molecule | yes | yes |
| One molecule for string | yes | no |
| Fixed length | no | yes |
| Transfer safe | no | yes |
| Consistency check | no | yes |
InChIKey is a fixed-length format directly derived from InChI. It is based on a strong hash (SHA-256 algorithm) of an InChI string.
Because InChIKey is a hash, there is no guarantee that two distinct molecules will have different InChIKeys. At the time of writing no such collisions are known, but as the number of encoded structures grows they are bound to occur eventually. On the other hand, it is possible that the first collision will not be found in the next 100 or 1000 years. The nature of the hash algorithm also means that it is virtually impossible to deduce the original InChI from an InChIKey (hash functions are designed to be one-way). The only practical approach is the brute-force method of trying the InChIs of all known chemical compounds.
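To give a rough feeling for how likely a collision is, the following sketch applies the standard birthday-bound approximation to the first, 14-letter part of the key. It is an idealized estimate only: it assumes the hash values are uniformly distributed and uses 26^14 as an upper bound on the number of distinct first parts, which is an assumption made here and not a figure taken from the InChIKey specification.

```python
import math


def collision_probability(n_items: int, n_values: int) -> float:
    """Approximate probability that at least two of n_items share a value.

    Uses the birthday-bound approximation 1 - exp(-n(n-1) / (2N)),
    which assumes values are drawn uniformly from n_values possibilities.
    """
    exponent = n_items * (n_items - 1) / (2 * n_values)
    return -math.expm1(-exponent)


# Assumption: at most 26**14 distinct values for a 14-letter uppercase string.
FIRST_PART_VALUES = 26 ** 14

for n in (10**6, 10**8, 10**9):
    p = collision_probability(n, FIRST_PART_VALUES)
    print(f"{n:>13,} structures: p = {p:.3e}")
```

The estimate covers only the first part of the key; a collision of the full InChIKey additionally requires the second part to match, so it would be rarer still under the same assumptions.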
The nature of InChIKey makes it ideal for database storage, especially for indexing purposes. On the other hand it cannot be used as the only format for chemical structure storage because it is not convertible to the original structure.
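As a minimal sketch of this storage pattern, the example below keeps the full InChI as the structure record and uses the fixed-length InChIKey only as an indexed lookup column. The table name, column names and the two keys are hypothetical placeholders invented for illustration (they are not real InChIKeys); only the two InChI strings (methane and water) are genuine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The InChI holds the structure; the InChIKey is only the lookup handle.
conn.execute(
    "CREATE TABLE compound ("
    " inchikey TEXT NOT NULL CHECK (length(inchikey) = 25),"
    " inchi TEXT NOT NULL)"
)
# Deliberately not UNIQUE: hash collisions are unlikely but possible.
conn.execute("CREATE INDEX idx_compound_inchikey ON compound (inchikey)")

# Placeholder keys with the 14 + hyphen + 10 layout, NOT real InChIKeys.
KEY_METHANE = "A" * 14 + "-" + "A" * 10
KEY_WATER = "B" * 14 + "-" + "B" * 10

conn.executemany(
    "INSERT INTO compound (inchikey, inchi) VALUES (?, ?)",
    [
        (KEY_METHANE, "InChI=1S/CH4/h1H4"),
        (KEY_WATER, "InChI=1S/H2O/h1H2"),
    ],
)


def find_inchi(key: str) -> list[str]:
    """Return every stored InChI whose InChIKey equals the given key."""
    cursor = conn.execute(
        "SELECT inchi FROM compound WHERE inchikey = ?", (key,)
    )
    return [row[0] for row in cursor]
```

Returning a list instead of a single value leaves room for the rare case in which two different structures share one key.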
InChIKey is also a very good format for online publishing in the form of metadata. Its short, fixed length and built-in consistency check help ensure that search engines read and index it properly, which might not be true for long InChIs.
You may use our online InChIKey generator to convert InChIs to InChIKeys or our online InChIKey checker to check the consistency of an InChIKey.
The 25-character InChIKey is made of two parts connected by a hyphen. The first part is 14 characters long and is based on the connectivity and proton layers of an InChI string. The second part contains 9 characters that are related to all other InChI layers (isotopes, stereochemistry, etc.), followed by a final checksum character computed from the rest of the InChIKey.
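The layout described above can be sketched as a small parser. This is a structural check only: it does not recompute the checksum, because the checksum algorithm is not described here. The function and field names are hypothetical, and the sketch assumes the checksum character is also an uppercase letter.

```python
import re
from typing import NamedTuple

# 14 letters, a hyphen, 9 letters, then 1 check character (25 in total).
_INCHIKEY_LAYOUT = re.compile(r"([A-Z]{14})-([A-Z]{9})([A-Z])")


class InChIKeyParts(NamedTuple):
    connectivity: str   # based on the connectivity and proton layers
    other_layers: str   # isotopes, stereochemistry, etc.
    check_char: str     # checksum character (not verified here)


def split_inchikey(key: str) -> InChIKeyParts:
    """Split a 25-character InChIKey into its parts or raise ValueError."""
    match = _INCHIKEY_LAYOUT.fullmatch(key)
    if match is None:
        raise ValueError(f"not a 25-character InChIKey layout: {key!r}")
    return InChIKeyParts(*match.groups())
```

For a placeholder string such as `"ABCDEFGHIJKLMN-OPQRSTUVWX"` (not a real key), `split_inchikey` returns the three fields `ABCDEFGHIJKLMN`, `OPQRSTUVW` and `X`.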
Both parts of the InChIKey are based on a truncated SHA-256 hash of the corresponding InChI layers. Only uppercase ASCII letters are used to encode the data, which ensures that indexing engines will not split the key and also avoids problems with case-insensitive systems.
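The general idea of truncating a SHA-256 digest and writing it in uppercase letters can be illustrated with the toy function below. It is not the official InChIKey encoding and will not reproduce real InChIKeys: the base-26 conversion and the name `toy_letter_hash` are assumptions chosen purely for illustration.

```python
import hashlib
import string


def toy_letter_hash(text: str, length: int) -> str:
    """Hash text with SHA-256 and keep `length` uppercase letters.

    Illustrative only: the digest is read as one large integer and its
    lowest `length` base-26 digits are mapped to A-Z; the remaining
    digits are discarded, which is the truncation step.
    """
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    value = int.from_bytes(digest, "big")
    letters = []
    for _ in range(length):
        value, remainder = divmod(value, 26)
        letters.append(string.ascii_uppercase[remainder])
    return "".join(letters)


# Same input always gives the same fixed-length, letters-only string.
print(toy_letter_hash("InChI=1S/CH4/h1H4", 14))
```

Because the output contains no digits, punctuation or lowercase letters, it survives tokenization and case folding unchanged, which is the property the paragraph above relies on.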