Validation
The consequence of the complexity outlined above is that an exception can be found for almost every rule concerning UK postcodes. Automatic validation of postcodes on the basis of pattern feasibility is therefore almost impossible to design, and the system contains no self-validating feature such as a check digit. Completely accurate validation is possible only by attempting to deliver mail to the address and verifying with the recipient. Validation is usually performed against a copy of the "Postcode Address File" (PAF), which is generated by Royal Mail and contains about 27 million UK commercial and residential addresses, covered by more than 1.7 million postcodes. However, even the PAF cannot be relied on entirely, both because it contains errors and because new postcodes are occasionally created and used before copies of the PAF can be distributed to users.
It is possible to validate the format of a postcode using the rules described in British Standard BS 7666. In general, the format is one of "A9 9AA", "A99 9AA", "A9A 9AA", "AA9 9AA", "AA99 9AA" or "AA9A 9AA", where A is an alphabetic character and 9 is a numeric character. The set of alphabetic characters permitted depends on the position in which they appear. As these formats show, the first character is always alphabetic, and the final three characters are always a numeric character followed by two alphabetic characters. A regular expression given in the comments of the BS 7666 schema implements full checking of all the stated BS 7666 postcode format rules. That regular expression can be restated as a "traditional" regular expression:
This can be further reduced, through removal and combination of redundant alternatives and character classes, down to:
NB: British Forces Post Office postcodes do not follow the BS 7666 rules, but have the format "BFPO NNNN" or "BFPO c/o NNNN", where NNNN is 1 to 4 numerical digits.
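The BFPO format described above can be checked with a very small pattern. The following Python sketch is illustrative only: the function name `looks_like_bfpo` is hypothetical, and it assumes a single space as the separator and a lowercase "c/o", neither of which is specified above.

```python
import re

# Assumption: exactly one space separates the parts and "c/o" is lowercase.
# NNNN is 1 to 4 numeric digits, as described in the text.
BFPO_PATTERN = re.compile(r"BFPO (?:c/o )?[0-9]{1,4}")


def looks_like_bfpo(postcode: str) -> bool:
    """Return True if the string has the shape 'BFPO NNNN' or 'BFPO c/o NNNN'."""
    return BFPO_PATTERN.fullmatch(postcode.strip()) is not None
```

A check of this kind confirms only the shape of the string; it does not show that the BFPO number is actually in use.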
An alternative short regular expression from the BS 7666 schema is:
The above expressions fail to exclude many non-existent area codes (such as A, AA, Z and ZY). A more refined regex, which excludes all invalid areas and some invalid districts, is:
The preceding expression also matches the legacy GIR 0AA and the new BF and BX non-geographic postcodes.
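As a rough illustration of the six general formats listed earlier ("A9 9AA" to "AA9A 9AA"), the following Python sketch builds a pattern directly from those templates. It is not any of the BS 7666 expressions: it ignores the per-position restrictions on letters, does not check area codes, and does not cover GIR 0AA or BFPO postcodes. The names `TEMPLATES`, `template_to_regex` and `has_general_format` are hypothetical, and trimming whitespace and upper-casing the input are assumptions.

```python
import re

# The six general formats given in the text: A = letter, 9 = digit.
TEMPLATES = ["A9 9AA", "A99 9AA", "A9A 9AA", "AA9 9AA", "AA99 9AA", "AA9A 9AA"]


def template_to_regex(template: str) -> str:
    """Translate a template such as 'A9 9AA' into a regex fragment."""
    parts = {"A": "[A-Z]", "9": "[0-9]", " ": " "}
    return "".join(parts[ch] for ch in template)


# One alternative per template; fullmatch() requires the whole string to fit.
GENERAL_FORMAT = re.compile(
    "|".join(f"(?:{template_to_regex(t)})" for t in TEMPLATES)
)


def has_general_format(postcode: str) -> bool:
    """Shape check only. Assumption: input is trimmed and upper-cased first."""
    return GENERAL_FORMAT.fullmatch(postcode.strip().upper()) is not None
```

Because only the shape is tested, a string such as "ZZ99 9ZZ" passes even though it need not correspond to a real postcode, which is why validation against the PAF is still needed in practice.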