Check digit

A check digit is a form of redundancy check used for error detection on identification numbers, such as bank account numbers, which are used in an application where they will at least sometimes be input manually. It is analogous to a binary parity bit used to check for errors in computer-generated data. It consists of one or more digits computed by an algorithm from the other digits in the sequence input.
With a check digit, one can detect simple errors in the input of a series of characters such as a single mistyped digit or some permutations of two successive digits.

Design

Check digit algorithms are generally designed to capture human transcription errors. In order of complexity, these include the following:

letter/digit errors, such as l → 1 or O → 0
single-digit errors, such as 1 → 2
transposition errors, such as 12 → 21
twin errors, such as 11 → 22
jump transposition errors, such as 132 → 231
jump twin errors, such as 131 → 232
phonetic errors, such as 60 → 16

In choosing a system, a high probability of catching errors is traded off against implementation difficulty; simple check digit systems are easily understood and implemented by humans but do not catch as many errors as complex ones, which require sophisticated programs to implement.
A desirable feature is that left-padding with zeros should not change the check digit. This allows variable length numbers to be used and the length to be changed.
If a single check digit is added to the original number, the system will not always capture multiple errors, such as two replacement errors, though double errors will typically be caught about 90% of the time.
A very simple check digit method would be to take the sum of all digits modulo 10. This would catch any single-digit error, as such an error would always change the sum, but does not catch any transposition errors as re-ordering does not change the sum.
A slightly more complex method is to take the weighted sum of the digits, modulo 10, with different weights for each number position.
For example, if the weights for a four-digit number were 5, 3, 2, 7 and the number to be coded was 4871, then one would take 5×4 + 3×8 + 2×7 + 7×1 = 65 and reduce it modulo 10, so the check digit would be 5, giving 48715.
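The plain and weighted sums described above can be sketched in Python as follows. The function names are illustrative and not taken from any standard; the weights are the ones from the four-digit example.

```python
def sum_check_digit(digits: str) -> int:
    """Check digit as the plain sum of the digits, modulo 10."""
    return sum(int(d) for d in digits) % 10


def weighted_check_digit(digits: str, weights: list[int]) -> int:
    """Check digit as the weighted sum of the digits, modulo 10."""
    if len(digits) != len(weights):
        raise ValueError("one weight is needed per digit")
    return sum(w * int(d) for w, d in zip(weights, digits)) % 10


# Weighted sum for 4871 with weights 5, 3, 2, 7 is 65, so the check digit is 5.
print(weighted_check_digit("4871", [5, 3, 2, 7]))  # 5

# The plain sum cannot see a transposition: both orderings give the same digit.
print(sum_check_digit("4871") == sum_check_digit("4781"))  # True
```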
Systems with weights of 1, 3, 7, or 9, with the weights on neighboring numbers being different, are widely used: for example, 31 31 weights in UPC codes, 13 13 weights in EAN numbers, and the 371 371 371 weights used in United States bank routing transit numbers. This system detects all single-digit errors and around 90% of transposition errors. 1, 3, 7, and 9 are used because they are coprime with 10, so changing any digit changes the check digit; using a coefficient that is divisible by 2 or 5 would lose information and thus not catch some single-digit errors. Using different weights on neighboring numbers means that most transpositions change the check digit; however, because all weights differ by an even number, this does not catch transpositions of two digits that differ by 5, since the 2 and 5 multiply to yield 10.
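The claim about transpositions of digits that differ by 5 can be checked by brute force. The sketch below uses an illustrative helper (not part of any standard) to list the ordered pairs of distinct digits whose swap leaves the weighted sum unchanged for two neighbouring positions.

```python
from itertools import permutations


def undetected_transpositions(w_left: int, w_right: int) -> list[tuple[int, int]]:
    """Ordered pairs (a, b), a != b, whose swap does not change the
    weighted sum modulo 10 for neighbouring weights w_left and w_right."""
    missed = []
    for a, b in permutations(range(10), 2):
        before = (w_left * a + w_right * b) % 10
        after = (w_left * b + w_right * a) % 10
        if before == after:
            missed.append((a, b))
    return missed


missed = undetected_transpositions(3, 1)
print(len(missed))                              # 10 of the 90 ordered pairs
print(all(abs(a - b) == 5 for a, b in missed))  # True
```

With 10 of 90 pairs missed, roughly 89% of adjacent transpositions are detected, which is consistent with the "around 90%" figure.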
The ISBN-10 code instead uses modulo 11, which is prime, and all the number positions have different weights 1, 2, ... 10. This system thus detects all single-digit substitution and transposition errors, but at the cost of the check digit possibly being 10, represented by "X". ISBN-13 instead uses the GS1 algorithm used in EAN numbers.
More complicated algorithms include the Luhn algorithm, which captures 98% of adjacent-digit transposition errors (it does not detect 09 ↔ 90), and the still more sophisticated Verhoeff algorithm, which catches all single-digit substitution and transposition errors, and many more complex errors. The Damm algorithm, another method based on abstract algebra, likewise detects all single-digit errors and all adjacent transposition errors. These three methods use a single check digit and will therefore fail to capture around 10% of more complex errors. To reduce this failure rate, it is necessary to use more than one check digit and/or to use a wider range of characters in the check digit, for example letters plus numbers.
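As an illustration of the first of these, the following is a minimal sketch of the Luhn algorithm as it is commonly described: starting from the rightmost digit of the number without its check digit, every second digit is doubled, the decimal digits of the products are summed, and the check digit brings the total to a multiple of 10. The function names and the sample number are illustrative.

```python
def luhn_check_digit(payload: str) -> int:
    """Luhn check digit for a digit string given without its check digit."""
    total = 0
    # Walk the payload from right to left; the rightmost digit is doubled.
    for i, ch in enumerate(reversed(payload)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9  # same as adding the two decimal digits of the product
        total += d
    return (10 - total % 10) % 10


def luhn_is_valid(number: str) -> bool:
    """True if the last digit of number is the Luhn check digit of the rest."""
    return luhn_check_digit(number[:-1]) == int(number[-1])


print(luhn_check_digit("7992739871"))                  # 3
print(luhn_is_valid("79927398713"))                    # True
# Swapping 0 and 9 in neighbouring positions goes unnoticed.
print(luhn_check_digit("09") == luhn_check_digit("90"))  # True
```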

Examples

UPC, EAN, GLN, GTIN, numbers administered by GS1

The final digit of a Universal Product Code, International Article Number, Global Location Number or Global Trade Item Number is a check digit computed as follows:

Add the digits in the odd-numbered positions from the left together and multiply by three.
Add the digits in the even-numbered positions to the result.
Take the remainder of the result divided by 10. If the remainder is equal to 0 then use 0 as the check digit, and if not 0 subtract the remainder from 10 to derive the check digit.

A GS1 check digit calculator and detailed documentation are available online at GS1's website. Another official calculator page shows that the mechanism for GTIN-13 is the same as for the Global Location Number (GLN).
For instance, the UPC-A barcode for a box of tissues is "036000241457". The last digit is the check digit "7", and if the other numbers are correct then the check digit calculation must produce 7.

Add the odd number digits: 0+6+0+2+1+5 = 14.
Multiply the result by 3: 14 × 3 = 42.
Add the even number digits: 3+0+0+4+4 = 11.
Add the two results together: 42 + 11 = 53.
To calculate the check digit, take the remainder of 53 / 10, which is also known as 53 modulo 10, and if not 0, subtract from 10. That is, 53 / 10 = 5 remainder 3, and 10 - 3 = 7. Therefore, the check digit value is 7.

Another example: to calculate the check digit for the following food item "01010101010x".

Add the odd number digits: 0+0+0+0+0+0 = 0.
Multiply the result by 3: 0 × 3 = 0.
Add the even number digits: 1+1+1+1+1 = 5.
Add the two results together: 0 + 5 = 5.
To calculate the check digit, take the remainder of 5 / 10, which is also known as 5 modulo 10, and if not 0, subtract from 10. That is, 5 / 10 = 0 remainder 5, and 10 - 5 = 5. Therefore, the check digit x value is 5.
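The two worked examples can be reproduced with a short Python sketch. The function name is illustrative. As an assumption made here, the weights 3, 1, 3, ... are applied starting from the rightmost digit before the check digit; for the 11 leading digits of a UPC-A code this coincides with weighting the odd-numbered positions from the left by three.

```python
def gs1_check_digit(payload: str) -> int:
    """Check digit for a GS1-style number given without its final digit.

    Weights alternate 3, 1, 3, ... starting from the rightmost payload
    digit (an alignment assumed for this sketch).
    """
    total = 0
    for i, ch in enumerate(reversed(payload)):
        weight = 3 if i % 2 == 0 else 1
        total += weight * int(ch)
    # A remainder of 0 gives check digit 0; otherwise subtract from 10.
    return (10 - total % 10) % 10


print(gs1_check_digit("03600024145"))  # 7
print(gs1_check_digit("01010101010"))  # 5
```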

ISBN 10

The final character of a ten-digit International Standard Book Number is a check digit computed so that multiplying each digit by its position in the number (counting from the right) and taking the sum of these products modulo 11 is 0. The rightmost digit, which is multiplied by 1, is the check digit, chosen to make the sum correct. It may need to have the value 10, which is represented as the letter X. For example, take the ISBN 0-201-53082-1: the sum of products is 0×10 + 2×9 + 0×8 + 1×7 + 5×6 + 3×5 + 0×4 + 8×3 + 2×2 + 1×1 = 99 ≡ 0 (mod 11), so the ISBN is valid. Positions can also be counted from the left, in which case the check digit is multiplied by 10; the validity check then reads 0×1 + 2×2 + 0×3 + 1×4 + 5×5 + 3×6 + 0×7 + 8×8 + 2×9 + 1×10 = 143 ≡ 0 (mod 11).
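The validity test can be sketched in Python as below, using the digits 0201530821 from the sum above. The function name is illustrative, and accepting hyphens in the input is a convenience assumed here.

```python
def isbn10_is_valid(isbn: str) -> bool:
    """True if the weighted sum of the ten characters is 0 modulo 11."""
    chars = isbn.replace("-", "").upper()
    if len(chars) != 10:
        return False
    total = 0
    # Weights run 10, 9, ..., 1 from left to right.
    for weight, ch in zip(range(10, 0, -1), chars):
        if ch == "X" and weight == 1:
            value = 10  # X is only allowed as the check character
        elif ch in "0123456789":
            value = int(ch)
        else:
            return False
        total += weight * value
    return total % 11 == 0


print(isbn10_is_valid("0-201-53082-1"))  # True
print(isbn10_is_valid("0-201-53082-2"))  # False
```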

ISBN 13

ISBN 13 is equal to the EAN-13 code found underneath a book's barcode. Its check digit is generated in a similar way to the UPC.
The check digit is computed as follows:

Add the digits in the odd-numbered positions from the left together.
Add the digits in the even-numbered positions together, and multiply by three, and add this to the result.
Take the remainder of the result divided by 10. If the remainder is equal to 0 then use 0 as the check digit, and if not 0 subtract the remainder from 10 to derive the check digit.

For example, take the ISBN 978-0-7475-3269-9, belonging to Harry Potter and the Philosopher's Stone. 9 is the check digit here, so the calculations must yield 9 at the end.

Add the odd number digits: 9+8+7+7+3+6 = 40.
Add the even number digits: 7+0+4+5+2+9 = 27.
Multiply the result by 3: 27 × 3 = 81.
Add the two results together: 40 + 81 = 121.
To calculate the check digit, take the remainder of 121 / 10, which is also known as 121 modulo 10, and if not 0, subtract from 10. That is, 121 / 10 = 12 remainder 1, and 10 - 1 = 9. Therefore, the check digit value is 9.
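The same steps can be written as a short Python sketch. The function name is illustrative; the input is the first twelve digits from the worked example, 978074753269.

```python
def isbn13_check_digit(first12: str) -> int:
    """Check digit for the first 12 digits of an ISBN-13."""
    digits = [int(ch) for ch in first12.replace("-", "")]
    if len(digits) != 12:
        raise ValueError("expected the first 12 digits of an ISBN-13")
    odd_sum = sum(digits[0::2])   # positions 1, 3, 5, ... from the left
    even_sum = sum(digits[1::2])  # positions 2, 4, 6, ... from the left
    remainder = (odd_sum + 3 * even_sum) % 10
    return 0 if remainder == 0 else 10 - remainder


print(isbn13_check_digit("978074753269"))  # 9
```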

NCDA

The NOID Check Digit Algorithm (NCDA), in use since 2004, is designed for application in persistent identifiers and works with variable-length strings of letters and digits, called extended digits. It is widely used with the ARK identifier scheme and, to a lesser extent, with schemes such as the Handle System and DOI. An extended digit is constrained to betanumeric characters, which are alphanumerics minus vowels and the letter 'l'. This restriction helps when generating opaque strings that are unlikely to form words by accident and will not contain both O and 0, or l and 1. Having a prime radix of R=29, the betanumeric repertoire permits the algorithm to guarantee detection of single-character and transposition errors for strings shorter than R=29 characters. The algorithm generalizes to any character repertoire with a prime radix R and strings shorter than R characters.
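A position-weighted sum of this kind can be sketched as follows. Several details are assumptions not stated in the text above and should be checked against the NOID documentation: the ordering of the 29-character repertoire (digits first, then the remaining consonants, with 'y' excluded along with the vowels), weights equal to the 1-based position from the left, and treating characters outside the repertoire (such as "/") as zero. The function name and the sample identifier are illustrative.

```python
# Assumed ordering of the betanumeric repertoire (29 characters).
BETANUMERIC = "0123456789bcdfghjkmnpqrstvwxz"


def ncda_check_char(identifier: str) -> str:
    """Check character from a position-weighted sum modulo 29 (sketch)."""
    total = 0
    for position, ch in enumerate(identifier, start=1):
        # Characters outside the repertoire are assumed to count as zero.
        value = max(BETANUMERIC.find(ch), 0)
        total += position * value
    return BETANUMERIC[total % len(BETANUMERIC)]


print(ncda_check_char("13030/xf93gt2"))  # q
```

Because the radix 29 is prime and every position weight below 29 is nonzero modulo 29, changing one character or swapping two different characters always changes the sum, which is the property the text describes.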

Other examples of check digits

International

The International SEDOL number.
The final digit of an ISSN code or IMO Number.
The International Securities Identifying Number.
Object Management Group FIGI standard final digit.
The International CAS registry number's final digit.
Modulo 10 check digits in credit card account numbers, calculated by the Luhn algorithm. The Luhn algorithm is also used in the Norwegian KID numbers used in bank giros and in the IMEI of mobile phones.
The final digit in the EAN/UPC serialisation of a Global Trade Item Number. It applies to GTIN-8, GTIN-12, GTIN-13 and GTIN-14.
The final digit of a DUNS number.
The third and fourth digits in an International Bank Account Number.
The final digit in an International Standard Text Code.
The final character encoded in a magnetic stripe card is a computed Longitudinal redundancy check.

In the US

The tenth digit of the National Provider Identifier for the US healthcare industry.
The final digit of a POSTNET code.
The North American CUSIP number.
The final digit of the ABA routing transit number, a bank code used in the United States.
The ninth digit of a Vehicle Identification Number.
Mayo Clinic patient identification numbers used in Arizona and Florida include a trailing check digit.
The eleventh digit of a Customs & Border Protection entry number.

In Central America

The Guatemalan Tax Number based on modulo 11.

In Africa

The South African identity (ID) number uses the Luhn algorithm to calculate its 13th and final digit.

In Eurasia

The UK NHS Number uses the modulo 11 algorithm.
The Spanish fiscal identification number.
The Dutch Burgerservicenummer uses the modulo 11 algorithm.
The ninth digit of an Israeli Teudat Zehut number.
The 13th digit of the Serbian and Former Yugoslav Unique Master Citizen Number.
The last two digits of the 11-digit Turkish Identification Number.
The ninth character in the 14-character EU cattle passport number.
The ninth digit in an Icelandic Kennitala.
Modulo 97 check digits in Belgian and Serbian bank account numbers. Serbia sometimes also uses modulo 11, for reference numbers.
The ninth digit in a Hungarian TAJ number.
For the residents of India, the unique identity number named Aadhaar has a trailing 12th digit that is calculated with the Verhoeff algorithm.
The Intellectual Property Office of Singapore has confirmed a new format for application numbers of registrable intellectual property. It will include a check character calculated with the Damm algorithm.
The last digit of the Chinese citizen ID number is calculated by modulo 11-2, as specified in the Chinese GuoBiao standard GB11643-1999, which adopts ISO 7064:1983. 'X' is used if the calculated check digit is 10.
The 11th digit of Estonian Isikukood.
The last letter on vehicle registration plates of Singapore.

In Oceania

The Australian tax file number.
The seventh character of a New Zealand NHI Number.
The last digit in a New Zealand locomotive's Traffic Monitoring System number.

Algorithms

Notable algorithms include: