Comment 1:
Hi Gustav, What a coincidence. Today we have been doing something similar with the Danish cpr-nr, which is 10 digits: 'ddmmyyzzzz'. When we treated this as a numeric field with 10 digits, our algorithms broke down when the day of the month was after the 21st. I think the problem is the largest integer allowed in many PC languages (here Delphi/Pascal): the MaxInt constant gives the largest allowed value for an Integer, and it is normally (2^31)-1 = 2147483647, so a 10-digit number starting with 22 or higher no longer fits.
If you use a float you will get other problems. The solution for us was to store the complete personal number as a string variable and then extract the relevant number using substr and real.
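A minimal sketch of the string approach, in Python rather than Delphi. The function names and the example numbers are made up for illustration (they are not real cpr numbers), and the slicing stands in for substr:

```python
INT32_MAX = 2**31 - 1  # 2147483647, the usual limit of a 32-bit signed integer


def fits_int32(cpr: str) -> bool:
    """Return True if the 10-digit number would fit in a 32-bit signed integer."""
    return int(cpr) <= INT32_MAX


def split_cpr(cpr: str) -> tuple[int, int, int, str]:
    """Split 'ddmmyyzzzz' into day, month, two-digit year and the last four digits."""
    if len(cpr) != 10 or not cpr.isdigit():
        raise ValueError("expected exactly 10 digits")
    # The last part stays a string so that leading zeros are not lost.
    return int(cpr[0:2]), int(cpr[2:4]), int(cpr[4:6]), cpr[6:10]


if __name__ == "__main__":
    for cpr in ("2112501234", "2201501234", "0101500012"):
        print(cpr, fits_int32(cpr), split_cpr(cpr))
```

Kept as a string, the number never has to fit in an Integer at all, and a leading zero in the day survives.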
Best wishes Claus
Seniorstatistician Claus Holst Institute of Preventive Medicine Center for Health and Society Copenhagen
The institute: http://www.ipm.regionh.dk EU projects: http://www.nugenob.org http://www.diogenes-eu.org/
Comment 2: In EpiData we recently changed the default maximum length for integers to 9 digits, with the same default in both analysis and entry. (Although in analysis, if you use the gen command, the length would be 4.)
The integer length limit is smaller than the MaxInt constant mentioned by Claus Holst because of the origin of the rec file format in Epi Info v6, which handled integers up to a length of 4 as true integers but stored larger integers internally as floats.
We can discuss whether a length of 9 is sufficient. But it is my experience with all software that merging on large integers tends to give problems of precision. Therefore my suggested solution for Gustav's problem would be, as Claus Holst also suggests, to use a string variable.
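To illustrate the kind of precision problem meant here, a small Python sketch. It assumes a 4-byte (single precision) float purely for illustration; which float width the rec format or any given package actually uses is not stated in this thread, and an 8-byte double would hold 10 digits exactly. The helper name `via_float32` and the key values are made up:

```python
import struct


def via_float32(n: int) -> int:
    """Round-trip an integer through a 4-byte IEEE 754 float (assumed width)."""
    packed = struct.pack("<f", float(n))
    return int(struct.unpack("<f", packed)[0])


if __name__ == "__main__":
    a, b = 2201011234, 2201011235  # two different made-up 10-digit keys
    # Both keys collapse to the same stored value, so a merge would confuse them.
    print(via_float32(a) == via_float32(b))
    # Stored as strings, the keys stay distinct.
    print(str(a) == str(b))
```

With string keys the comparison is character by character, so no rounding can creep in.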
As an extension (not related to the integer/string problem), the Luhn algorithm (http://en.wikipedia.org/wiki/Luhn_algorithm) could be implemented as a user-written CHK command based on the procedures written for EpiData Entry. This is documented on the page www.epidata.dk/documentation.php with examples: the Soundex, Metaphone and Gumm algorithms.
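For reference, a minimal sketch of the Luhn check in Python. This is not EpiData CHK code and not the documented EpiData procedures; the function names are made up, and the number is taken as a string in line with the advice above:

```python
def luhn_is_valid(number: str) -> bool:
    """Return True if the digit string, check digit included, passes the Luhn test."""
    if not number.isdigit():
        raise ValueError("expected digits only")
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as adding the two digits of the product
        total += d
    return total % 10 == 0


def luhn_check_digit(payload: str) -> int:
    """Return the check digit to append to a digit string without one."""
    if not payload.isdigit():
        raise ValueError("expected digits only")
    total = 0
    # The rightmost payload digit is doubled, since the check digit will follow it.
    for i, ch in enumerate(reversed(payload)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return (10 - total % 10) % 10


if __name__ == "__main__":
    print(luhn_check_digit("7992739871"))
    print(luhn_is_valid("79927398713"))
```

The number 7992739871 is the worked example on the Wikipedia page linked above.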
regards
Jens Lauritsen EpiData Association