The sum utility is identical to the cksum utility, except that it defaults to using historic algorithm 1, as described below. It is provided for compatibility only.
The md5 utility takes as input a message of arbitrary length and produces as output a 128-bit "fingerprint" or "message digest" of the input. It is conjectured that it is computationally infeasible to produce two messages having the same message digest, or to produce any message having a given prespecified target message digest. The MD5 algorithm is intended for digital signature applications, where a large file must be "compressed" in a secure manner before being encrypted with a private (secret) key under a public-key encryption system such as RSA.
The md2 and md4 utilities behave in exactly the same manner as md5 but use different algorithms.
The sha1 and rmd160 utilities also produce message digests; however, the output from these two programs is 160 bits in length, as opposed to 128.
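As a rough illustration of the digest lengths described above, the following Python sketch hashes a short message with the standard hashlib module rather than with the md5 or sha1 utilities themselves. The helper name digest_bits is hypothetical. Only MD5 and SHA-1 are shown, because hashlib does not guarantee the other algorithms on every platform.

```python
import hashlib


def digest_bits(name, message):
    """Return (digest length in bits, hex digest) for a hashlib algorithm name."""
    h = hashlib.new(name, message)
    return h.digest_size * 8, h.hexdigest()


if __name__ == "__main__":
    for name in ("md5", "sha1"):
        bits, hexdigest = digest_bits(name, b"abc")
        print(f"{name}: {bits} bits, {hexdigest}")
```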
The options are as follows:
md5 *.tgz > MD5
sha1 *.tgz > SHA1
If the resulting checksums are stored in the files MD5 and SHA1, then use the following command to verify them:
cat MD5 SHA1 | cksum -c
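The verification step can be sketched in Python as follows. This is not how cksum itself is implemented. It assumes, as an assumption not stated in this text, that each checklist line has the form "MD5 (name) = hexdigest" or "SHA1 (name) = hexdigest". The names verify_checklist, read_file, and LINE_RE are hypothetical.

```python
import hashlib
import re

# Assumed checklist line format: "ALGORITHM (filename) = hexdigest".
LINE_RE = re.compile(r"^(MD5|SHA1) \((.+)\) = ([0-9a-fA-F]+)$")


def read_file(name):
    """Read a whole file as bytes."""
    with open(name, "rb") as f:
        return f.read()


def verify_checklist(lines, read_bytes=read_file):
    """Return a list of (name, algorithm, ok) for each recognised line."""
    results = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # skip lines this sketch does not understand
        algo, name, expected = m.groups()
        actual = hashlib.new(algo.lower(), read_bytes(name)).hexdigest()
        results.append((name, algo, actual == expected.lower()))
    return results
```

Passing sys.stdin as the lines argument would mimic reading a concatenated checklist from standard input. The read_bytes parameter exists only so the function can be exercised without touching the file system.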
Algorithm 1 is the algorithm used by historic BSD systems as the sum(1) algorithm and by historic AT&T System V UNIX systems as the sum(1) algorithm when using the -r option. This is a 16-bit checksum, with a right rotation before each addition; overflow is discarded.
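A minimal Python sketch of algorithm 1 as described above: rotate the 16-bit checksum right by one bit, add the next byte, and discard overflow. The function name bsd_sum is hypothetical, and the sketch computes only the checksum field, not the block count.

```python
def bsd_sum(data: bytes) -> int:
    """16-bit rotating checksum: rotate right, add the byte, drop overflow."""
    checksum = 0
    for byte in data:
        # Rotate right by one bit within a 16-bit word.
        checksum = (checksum >> 1) | ((checksum & 1) << 15)
        # Add the next byte and discard anything above 16 bits.
        checksum = (checksum + byte) & 0xFFFF
    return checksum
```

For the two-byte input b"ab", the first step yields 97; rotating 97 right gives 32816, and adding 98 gives 32914.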
Algorithm 2 is the algorithm used by historic AT&T System V UNIX systems as the default sum(1) algorithm. This is a 32-bit checksum, and is defined as follows:
s = sum of all bytes; r = s % 2^16 + (s % 2^32) / 2^16; cksum = (r % 2^16) + r / 2^16;
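The definition above translates almost directly into Python. The function name sysv_sum is hypothetical, and the divisions are taken to be integer divisions, which is an assumption about the notation.

```python
def sysv_sum(data: bytes) -> int:
    """Checksum per the formula: sum the bytes, then fold the high half twice."""
    s = sum(data)
    r = s % 2**16 + (s % 2**32) // 2**16
    return (r % 2**16) + r // 2**16
```

The second fold matters only when the first one carries past 16 bits; for example, a byte sum of 131071 gives r = 65536 and a final checksum of 1.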
Algorithms 1 and 2 both write to the standard output the same fields as the default algorithm, except that the size of the file in bytes is replaced with the size of the file in blocks. For historic reasons, the block size is 1024 for algorithm 1 and 512 for algorithm 2. Partial blocks are rounded up.
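The block count with partial blocks rounded up can be sketched as a ceiling division. The function name block_count is hypothetical.

```python
def block_count(size_in_bytes: int, algorithm: int) -> int:
    """Number of blocks reported for a file, rounding partial blocks up."""
    block_size = {1: 1024, 2: 512}[algorithm]
    # Ceiling division without floating point.
    return (size_in_bytes + block_size - 1) // block_size
```

Under this sketch, a 1025-byte file is reported as 2 blocks by algorithm 1 and as 3 blocks by algorithm 2.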
The following options apply only when using one of the message digest algorithms:
The default CRC used is based on the polynomial used for CRC error checking in the networking standard ISO/IEC 8802-3:1989. The CRC checksum encoding is defined by the generating polynomial:
G(x) = x^32 + x^26 + x^23 + x^22 + x^16 + x^12 + x^11 + x^10 + x^8 + x^7 + x^5 + x^4 + x^2 + x + 1
Mathematically, the CRC value corresponding to a given file is defined by the following procedure:
The n bits to be evaluated are considered to be the coefficients of a mod 2 polynomial M(x) of degree n-1. These n bits are the bits from the file, with the most significant bit being the most significant bit of the first octet of the file and the last bit being the least significant bit of the last octet, padded with zero bits (if necessary) to achieve an integral number of octets, followed by one or more octets representing the length of the file as a binary value, least significant octet first. The smallest number of octets capable of representing this integer is used.
M(x) is multiplied by x^32 (i.e., shifted left 32 bits) and divided by G(x) using mod 2 division, producing a remainder R(x) of degree at most 31.
The coefficients of R(x) are considered to be a 32-bit sequence.
The bit sequence is complemented and the result is the CRC.
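The procedure above can be sketched bit by bit in Python. This is an illustrative, unoptimized version rather than the utility's actual code; the names posix_cksum and feed are hypothetical. The constant 0x04C11DB7 encodes G(x) without its x^32 term. For an empty input this sketch appends no length octet; appending a single zero octet would give the same result, because the register is still zero at that point.

```python
POLY = 0x04C11DB7  # G(x) with the leading x^32 term implied


def feed(crc: int, octet: int) -> int:
    """Shift one octet into the 32-bit remainder, most significant bit first."""
    crc ^= octet << 24
    for _ in range(8):
        if crc & 0x80000000:
            crc = ((crc << 1) ^ POLY) & 0xFFFFFFFF
        else:
            crc = (crc << 1) & 0xFFFFFFFF
    return crc


def posix_cksum(data: bytes) -> int:
    """CRC of the data followed by its length, complemented at the end."""
    crc = 0
    for octet in data:
        crc = feed(crc, octet)
    # Append the length, least significant octet first, using as few octets
    # as possible.
    n = len(data)
    while n:
        crc = feed(crc, n & 0xFF)
        n >>= 8
    # Complement the 32-bit remainder.
    return crc ^ 0xFFFFFFFF
```

Starting the register at zero and XORing each octet into its top byte is equivalent to dividing M(x) multiplied by x^32 by G(x), so no explicit padding with 32 zero bits is needed.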
The cksum and sum utilities exit 0 on success, and >0 if an error occurs.
The default calculation is identical to that given in pseudo-code in the following ACM article.