[EpiData-list] How to: checking for duplicate records

epidata-list at lists.umanitoba.ca epidata-list at lists.umanitoba.ca
Wed Jan 18 00:16:04 CST 2006


Dear Jamie,

Indeed, some of the line breaks threw it off a bit. I have reformatted the
script below, but I still get that strange TRUNC error:
. list if k = (( k[_n-1]) or (k[_n+1]))
Operator TRUNC is incompatible with String
Operation aborted
--------------
Read "C:\Program Files\EpiData\testdata\bromar.rec"
* Assume you wish to see records with the same values of id, sex, age and dectime
* I prefer "gen" to "define k _____", but this is not important.
gen s(80) k = "id: " + string(id)
k = trim(k) + "s:" + trim(string(sex) )
* the first "trim(k)" is important!
k = trim(k) + "a:" + trim(string(age) )
k = trim(k) + "d:" + trim(string(dectime) )
sort k
* now list any records where two consecutive records have the same value of k
* notice [ _n-1] indicates the previous record, [ _n+1] the next record
* notice also the many parentheses. They could be removed, but they make
* clear what is what.
list if k = (( k[_n-1]) or (k[_n+1]))
* An alternative would be to create a frequency table, but with many records
* (here 4000)
* this is not going to work. So instead we use:
define x #
x = 1
* now count how many times each value of k occurs
* do not show the table (it takes too long), hence /notable.
aggregate k  /sum=x /close /notable
list n k if n > 1
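
For comparison, the two approaches in the script above (comparing each sorted key with its neighbours, and counting occurrences of each key) can be sketched outside EpiData. This is only an illustration in Python, not EpiData syntax: the sample rows and the helper name make_key are invented here, and only the field names come from the script. Note that the sketch compares k with the previous key and with the next key separately and then combines the two results; the failing list command instead applies "or" directly to two strings, which may be what the TRUNC message is objecting to, though that is an assumption.

```python
from collections import Counter

# Hypothetical sample rows standing in for bromar.rec; the field names follow
# the script above, but the values are invented for illustration only.
records = [
    {"id": 1, "sex": 1, "age": 34, "dectime": 2.5},
    {"id": 2, "sex": 2, "age": 51, "dectime": 4.0},
    {"id": 1, "sex": 1, "age": 34, "dectime": 2.5},  # duplicate of the first row
    {"id": 3, "sex": 1, "age": 34, "dectime": 2.5},
]


def make_key(rec):
    """Build a composite string key in the same spirit as k in the script."""
    return "id:{}s:{}a:{}d:{}".format(
        rec["id"], rec["sex"], rec["age"], rec["dectime"]
    )


# Method 1: sort the keys, then compare each key with its neighbours.
keys = sorted(make_key(r) for r in records)
neighbour_dups = []
for i, k in enumerate(keys):
    same_as_prev = i > 0 and k == keys[i - 1]
    same_as_next = i < len(keys) - 1 and k == keys[i + 1]
    # Two separate comparisons joined by "or", not "or" applied to the strings.
    if same_as_prev or same_as_next:
        neighbour_dups.append(k)

# Method 2: count how often each key occurs and keep those seen more than once.
counts = Counter(make_key(r) for r in records)
counted_dups = {k: n for k, n in counts.items() if n > 1}

print(neighbour_dups)
print(counted_dups)
```

The first method lists every record that belongs to a duplicated group, while the second gives one line per duplicated key together with its count.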

But I would like to get this running, as it is exactly what I wanted to have
listed. How is your solution different, and how do I interpret your table?
Thanks, Max

epidata-list at lists.umanitoba.ca wrote:

> Some of the line breaks in the example did not come through properly. 
> This is how I ran it:
> ***********
> Read "C:\Program Files\EpiData\validate\estimates\bromar.rec"
> gen s(25) k = "s:" + trim(string(sex) )
> k = trim(k) + "a:" + trim(string(age) )
> k = trim(k) + "d:" + trim(string(dectime) )
> sort k
> gen x=1
> aggregate k  /sum=x /close /notable
> freq n
> ***********
> It works fine. I think you were not executing the 'aggregate' statement.
> With /close, the 'aggregate' command closes your working file and 
> creates a new one with the following fields:
> k, n, nx, sumx
> For this example, n, nx and sumx are the same values.
>
> Jamie Hockin
> Public Health Agency of Canada
>
> _______________________________________________
> EpiData-list mailing list
> EpiData-list at lists.umanitoba.ca
> http://lists.umanitoba.ca/mailman/listinfo/epidata-list



