These algorithms map a character string to a string of hexadecimal characters (a hash) that is highly likely to be unique. The hash is used to uniquely identify a source text (and the underlying source language).

Usage

algorithms()

Details

Secure Hash Algorithm 1

Method sha1 corresponds to SHA-1 (Secure Hash Algorithm version 1), a cryptographic hashing function. It has been superseded by more secure variants (SHA-256, SHA-512, etc.) and is no longer considered safe against deliberately crafted collisions, but it is still useful for non-sensitive purposes. It is fast, accidental collisions are extremely unlikely, and it can handle very large inputs. It always emits strings of 40 hexadecimal characters.

Cumulative UTF-8 Sum

[Experimental]

This method is experimental. Use with caution.

Method utf8 is a simple method derived from cumulative sums of UTF-8 code points (converted to integers). It is slightly faster than method sha1 for small inputs and emits hashes with a width proportional to the underlying input's length. It is used internally for testing purposes.
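
The exact computation is not spelled out here, so the following Python sketch shows only one possible reading of the idea: take the running (cumulative) sums of the input's code points and write each sum in hexadecimal. The function name cumulative_utf8_hash and the way the sums are joined are hypothetical, and the package's actual method may differ.

```python
from itertools import accumulate


def cumulative_utf8_hash(text: str) -> str:
    """Hypothetical illustration, not the package's implementation.

    Concatenates the hexadecimal form of each cumulative sum of the
    input's code points, so the output grows with the input's length.
    """
    code_points = (ord(ch) for ch in text)  # characters converted to integers
    running_sums = accumulate(code_points)  # cumulative sums
    return "".join(format(total, "x") for total in running_sums)


# Code points 97, 98, 99 give running sums 97, 195, 294,
# that is 61, c3, 126 in hexadecimal.
print(cumulative_utf8_hash("abc"))  # 61c3126
```

Unlike SHA-1, a scheme of this kind offers no fixed output width and no cryptographic guarantees, which is consistent with reserving it for testing.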