consistency checks on longitudinal data
Hello all,

I'm setting up a multi-centre longitudinal study where data will be entered by health care workers with very little experience of research or possibly of computers. Every two weeks they will be visited by researchers. The data will be entered long, i.e. with one record for each time point. I need the data entry to be very simple, and the researchers will run consistency checks to find problems. Some things I want to do are:

1) check for duplicates on a few fields (of course there is a unique ID; this would be to pick up duplicate data with different IDs)
2) check number of time points per patient ID
3) check dates are within range
4) check that date for time point 2 comes after date for time point 1 for the same patient ID

I have used EpiData for entry for a few years but I haven't used Analysis. I would be glad of any advice.

Thanks,
Vicky Simms
Vicky,

There are some useful tips in the list archives from April 2007 (thread "Getting values from other records") that will help with 2).
In general, you can make use of the ability of EpiData Analysis to reference values in other records in the database. You do this by using square brackets after a variable name. var[i] is the value of var in record i; var[_n] is the value of var in the current record, so var[_n-1] and var[_n+1] are the values in the previous and next records. For example, for problem 1), say you want to use the fields age, sex and district to check for records where ID might be miscoded:
gen s asd=string(age)+sex+district
sort asd id
define checkrec #
checkrec=0
if (asd=asd[_n-1]) and (id<>id[_n-1]) then checkrec=1
if (asd=asd[_n+1]) and (id<>id[_n+1]) then checkrec=1
select checkrec=1
list
This will list all records where asd matches the record before or after, but id does not match.
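If it helps to cross-check the logic outside EpiData Analysis, the same neighbour comparison can be sketched in Python. This is only an illustration, not EpiData syntax: the record layout (a list of dicts with id, age, sex and district), the function names and the sample values are all assumptions made up for the example.

```python
def asd_key(rec):
    # Combined key, mirroring: gen s asd=string(age)+sex+district
    return str(rec["age"]) + rec["sex"] + rec["district"]


def flag_possible_miscoded_ids(records):
    """Return records whose key matches the record before or after but whose id differs."""
    # Mirrors "sort asd id": order by the combined key, then by id.
    ordered = sorted(records, key=lambda r: (asd_key(r), r["id"]))
    flagged = []
    for i, rec in enumerate(ordered):
        neighbours = []
        if i > 0:
            neighbours.append(ordered[i - 1])   # previous record
        if i + 1 < len(ordered):
            neighbours.append(ordered[i + 1])   # next record
        if any(asd_key(n) == asd_key(rec) and n["id"] != rec["id"] for n in neighbours):
            flagged.append(rec)
    return flagged


# Hypothetical sample data: ids 1 and 2 share age, sex and district.
records = [
    {"id": 1, "age": 34, "sex": "F", "district": "North"},
    {"id": 2, "age": 34, "sex": "F", "district": "North"},
    {"id": 3, "age": 51, "sex": "M", "district": "South"},
]
flagged_ids = [r["id"] for r in flag_possible_miscoded_ids(records)]
print(flagged_ids)
```

As with the EpiData version, a flagged pair is only a candidate for review; two different patients can legitimately share age, sex and district.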
3) should be done at entry time.

4) can be done using the same strategy as above:

sort id visit
gen b checkrec=(id=id[_n-1]) and (visitdate<visitdate[_n-1])
list id visit visitdate if checkrec
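The date-order check, together with a simple count of time points per patient for 2), can also be sketched in Python as a cross-check. Again this is an illustration rather than EpiData syntax: the field names id, visit and visitdate follow the commands above, while the function names and the sample dates are assumptions made up for the example.

```python
from collections import Counter
from datetime import date


def out_of_order_visits(records):
    """Return records dated earlier than the previous visit of the same patient."""
    # Mirrors "sort id visit" followed by a comparison with the previous record.
    ordered = sorted(records, key=lambda r: (r["id"], r["visit"]))
    flagged = []
    for prev, cur in zip(ordered, ordered[1:]):
        if cur["id"] == prev["id"] and cur["visitdate"] < prev["visitdate"]:
            flagged.append(cur)
    return flagged


def visits_per_patient(records):
    """Count how many records (time points) each patient id has."""
    return Counter(r["id"] for r in records)


# Hypothetical sample data: patient 1 has visit 2 dated before visit 1.
records = [
    {"id": 1, "visit": 1, "visitdate": date(2007, 5, 1)},
    {"id": 1, "visit": 2, "visitdate": date(2007, 4, 17)},
    {"id": 2, "visit": 1, "visitdate": date(2007, 5, 2)},
    {"id": 2, "visit": 2, "visitdate": date(2007, 5, 16)},
    {"id": 2, "visit": 3, "visitdate": date(2007, 5, 30)},
]
bad = [(r["id"], r["visit"]) for r in out_of_order_visits(records)]
counts = visits_per_patient(records)
print(bad, dict(counts))
```

Like the EpiData command, this uses a strict comparison, so two visits entered with the same date are not flagged; change `<` to `<=` if same-day visits should also be reviewed.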
Jamie