These algorithms map a character string to a string of hexadecimal characters that is highly likely to be unique. The resulting hash is used to identify a source text (and its underlying source language).
Details
Secure Hash Algorithm 1
Method sha1 corresponds to SHA-1 (Secure Hash Algorithm version 1), a cryptographic hashing function. While it is now superseded by more secure variants (SHA-256, SHA-512, etc.), it is still useful for non-sensitive purposes. It is fast, makes accidental collisions extremely unlikely, and can handle very large inputs. It emits strings of 40 hexadecimal characters.
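To illustrate the fixed 40-character output, here is a minimal Python sketch built on the standard hashlib module. The helper name sha1_hex is hypothetical, and encoding the input as UTF-8 before hashing is an assumption, not something this page specifies.

```python
import hashlib


def sha1_hex(text: str) -> str:
    """Return the SHA-1 digest of `text` as lowercase hexadecimal characters.

    Assumption: the input is encoded as UTF-8 bytes before hashing.
    """
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


# SHA-1 digests are 160 bits wide, i.e. 40 hexadecimal characters,
# regardless of the input's length.
short_digest = sha1_hex("abc")
long_digest = sha1_hex("abc" * 100_000)
print(short_digest)
print(len(short_digest), len(long_digest))
```

Both digests have the same width even though the second input is far longer, which is the property that makes the output convenient as an identifier.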
Cumulative UTF-8 Sum
This method is experimental. Use with caution.
Method utf8 is a simple method derived from cumulative sums of UTF-8 code points (converted to integers). It is slightly faster than method sha1 for small inputs and emits hashes with a width proportional to the underlying input's length. It is used for testing purposes internally.
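The exact computation is not spelled out here, so the following Python sketch is only one possible reading of "cumulative sums of code points", not the actual implementation. The function name cumulative_utf8_sum, the choice to add up the running sums, and the hexadecimal formatting are all assumptions made for illustration.

```python
from itertools import accumulate


def cumulative_utf8_sum(text: str) -> str:
    """Illustrative hash: total of the running sums of the code points.

    Assumptions (not confirmed by the documentation): running sums are
    added together, and the total is formatted as hexadecimal.
    """
    # ord() yields each character's code point as an integer.
    code_points = [ord(ch) for ch in text]
    # Running (cumulative) sums: c1, c1 + c2, c1 + c2 + c3, ...
    running_sums = accumulate(code_points)
    return format(sum(running_sums), "x")


# "a" is 97 and "b" is 98, so the running sums are 97 and 195.
# Their total is 292, which is 124 in hexadecimal.
print(cumulative_utf8_sum("ab"))
```

Unlike SHA-1, a scheme like this produces longer hashes for longer inputs, and distinct inputs can collide far more easily, which is consistent with the method being experimental and reserved for testing.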