January 2006 - EpiData-list - University of Manitoba Mailing Lists

Re: [EpiData-list] wild cards in variable names?
by epidata-list＠lists.umanitoba.ca 30 Jan '06

> I have 70 variables, named var1, var2, . . . var70
> Each has a corresponding numerical variable, which is calculated from
> the value of var, during data entry, via an IF THEN ELSE statement in
> the check file:
>
> var1num, var2num, . . . var70num
>
> I would like each varXnum to be zero by default, when a new record is
> opened.

Easy (try):

before file
  defaultvalue var1num-var70num 0
end

You must get the latest build of v3.1 (Jan 2006) for this to work. Also, the variables must be placed in sequence.

The first implementation of defaultvalue was a bit different. Updated installation files, with the text below in the help files, will be uploaded within about 30 minutes.

The command DEFAULTVALUE can be used to assign a certain value instead of “blank” for any fields. Default values can be defined like this:

In a BEFORE FILE or BEFORE RECORD block you add:

DEFAULTVALUE ALL | ALLSTRINGS | ALLSTRING | ALLNUMERIC x

Similarly for lists of fields:

DEFAULTVALUE field1-field10,field12 x
DEFAULTVALUE field14-field20,field11 "x"

In a field block:

DEFAULTVALUE x

A DEFAULTVALUE expands the range, legal or comment legal definitions of a field. If DEFAULTVALUE 9 is part of a field block then the value 9 is allowed to be entered even if e.g. RANGE 1-5 is also defined.

Field block specification overrules a general definition in a BEFORE FILE or BEFORE RECORD block.

DEFAULTVALUEs will be exported as part of any COMMENT LEGAL definitions for the field in question. The category text assigned will be “Default”.

*NOTE: The value assignment only takes place in new records. Not in already entered records.*

Examples:

(in before file block):
DEFAULTVALUE ALL 9 fills all numerical and string fields with 9
DEFAULTVALUE ALL “No info” ALLstrings fills all string fields with “No info”

(in a field block):
DEFAULTVALUE 2001 assigns the value 2001 as default.

(in a string field):
DEFAULTVALUE “No Information” assigns the value “No Information” as default.

Regards
Jens Lauritsen
EpiData Association

ps. The revising of the website is almost done (a few pages are left).
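
The remark that the variables "must be placed in sequence" suggests that a range such as var1num-var70num is resolved by field position in the file rather than by name pattern. The following Python sketch illustrates that reading only; it is not EpiData code, and the function names (expand_range, new_record) and the field layout are assumptions made for the example.

```python
def expand_range(field_order, first, last):
    """Return every field from `first` to `last`, by position in the file."""
    start = field_order.index(first)
    end = field_order.index(last)
    if start > end:
        raise ValueError(f"{first} comes after {last} in the file")
    return field_order[start:end + 1]


def new_record(field_order, defaults):
    """Create a blank record (None = blank), then apply default values."""
    record = {name: None for name in field_order}
    record.update(defaults)
    return record


# Hypothetical layout: the 70 text fields first, then the 70 numeric fields.
field_order = [f"var{i}" for i in range(1, 71)]
field_order += [f"var{i}num" for i in range(1, 71)]

# Equivalent in spirit to: defaultvalue var1num-var70num 0
targets = expand_range(field_order, "var1num", "var70num")
defaults = {name: 0 for name in targets}
record = new_record(field_order, defaults)

print(len(targets))       # number of fields that received a default
print(record["var1num"])  # default applied
print(record["var1"])     # still blank
```

Under this reading, an interleaved layout (var1, var1num, var2, var2num, ...) would pull the text fields into the range as well, which would explain why the numeric fields need to sit next to each other.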

wild cards in variable names?
by epidata-list＠lists.umanitoba.ca 26 Jan '06

I have 70 variables, named var1, var2, . . . var70

Each has a corresponding numerical variable, which is calculated from the value of var, during data entry, via an IF THEN ELSE statement in the check file:

var1num, var2num, . . . var70num

I would like each varXnum to be zero by default, when a new record is opened. I found the BEFORE RECORD command, which it seems would do the job. But I want to save on typing, so is there a wildcard character, such as * , so I can type just one command in a BEFORE RECORD block, like this:

BEFORE RECORD
  var*num = 0
END

Thanks.

--
Christopher W. Ryan, MD
SUNY Upstate Medical University Clinical Campus at Binghamton
and Wilson Family Practice Residency, Johnson City, NY
cryanatbinghamtondotedu
GnuPG and PGP public keys available at http://pgp.mit.edu

"If you want to build a ship, don't drum up the men to gather wood, divide the work and give orders. Instead, teach them to yearn for the vast and endless sea." [Antoine de St. Exupery]

Final problem in "How to: checking for duplicate records"
by epidata-list＠lists.umanitoba.ca 22 Jan '06

Apologies for having expressed the "compare current record with previous record or next record" as

list if k = ( ( k[_n-1] ) or ( k[_n-1]) )

which should have been:

list if ( ( k = k[_n-1] ) or ( k = k[_n+1]) )

The aggregate (tab stat) does not have a /save option. The most efficient way is to:

aggregate k /sum=x /close /notable
select if n > 1
savedata "newfile.rec"

Then newfile.rec would contain only replicates.

Jens Lauritsen
EpiData Association

ps. The revised web site www.epidata.dk is still under construction. I will give notice when the new design has been completely implemented.
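
For readers who want to see the logic of the corrected neighbour comparison outside EpiData, here is a minimal Python sketch. It assumes the keys are already sorted (as after "sort k"); the function name and the sample keys are invented for illustration and are not taken from bromar.rec.

```python
def flag_neighbour_duplicates(keys):
    """Return indexes of records whose key equals the previous or next key.

    `keys` must already be sorted, so equal keys sit next to each other.
    """
    flagged = []
    for i, key in enumerate(keys):
        same_as_prev = i > 0 and key == keys[i - 1]
        same_as_next = i + 1 < len(keys) and key == keys[i + 1]
        if same_as_prev or same_as_next:
            flagged.append(i)
    return flagged


# Made-up keys in the same spirit as the k variable built in this thread.
keys = sorted(["s:1a:40", "s:2a:55", "s:2a:55", "s:2a:51", "s:1a:40", "s:2a:60"])
print(flag_neighbour_duplicates(keys))
```

Each key is compared with its neighbour on both sides, which is the point of the correction: the comparison has to be repeated for the previous and the next record rather than comparing k with an "or" of two values.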

How to: checking for duplicate records
by epidata-list＠lists.umanitoba.ca 19 Jan '06

Yes - I got this error too. The LIST statement should read:

list if (k = k[_n-1]) or (k = k[_n+1])

* if you only want the first of two records to appear:
list if k = k[_n+1]

*** Of course you have to insert some duplicate records into a copy of bromar.rec first ***

|Indeed some of the line breaks threw it a bit. I reformatted below - but
|still get that strange TRUNC error like:
|. list if k = (( k[_n-1]) or (k[_n+1]))
|Operator TRUNC is incompatible with String
|Operation aborted

How to: checking for duplicate records
by epidata-list＠lists.umanitoba.ca 18 Jan '06

Some of the line breaks in the example did not come through properly. This is how I ran it:

***********
Read "C:\Program Files\EpiData\validate\estimates\bromar.rec"
gen s(25) k = "s:" + trim(string(sex) )
k = trim(k) + "a:" + trim(string(age) )
k = trim(k) + "d:" + trim(string(dectime) )
sort k
gen x=1
aggregate k /sum=x /close /notable
freq n
***********

It works fine. I think you were not executing the 'aggregate' statement. With /close, the 'aggregate' command closes your working file and creates a new one with the following fields:

k, n, nx, sumx

For this example, n, nx and sumx are the same values.

Jamie Hockin
Public Health Agency of Canada

checking for duplicate records
by epidata-list＠lists.umanitoba.ca 17 Jan '06

Is there an easy way to check for duplicate records during data cleaning after data entry? Currently we export EpiData sets to MS Excel and run a little routine - but maybe someone has a solution within EpiData.

Tx
Max

Converting string to number
by epidata-list＠lists.umanitoba.ca 17 Jan '06

I tried the solutions given but they didn't work - perhaps I have made some errors. Here is the printout:

. read
Loading data C:\Data\Test.rec, please wait..
File name : C:\Data\Test.rec
Fields: 1 Total records: 3 Valid records: 3

**********************************************
NOTE: Test.rec contains the variable var1 (string) and the following cases:
10/01
11/02
11/03
**********************************************

. Run "C:\DOCUME~1\STIG~1.UHL\LOKALA~1\Temp\TMP67F.Tmp"
. define var2 ##
Var Name VAR2 of type Integer Var length: 2 decimals 0
. var2 = string(copy(var1,1,2))
Data type mismatch
. close
File closed

. read
Loading data C:\Data\Test.rec, please wait..
File name : C:\Data\Test.rec
Fields: 1 Total records: 3 Valid records: 3
. Run "C:\DOCUME~1\STIG~1.UHL\LOKALA~1\Temp\TMP680.Tmp"
. gen i var2 = string(copy(var1,1,2))
Data type mismatch

------------------------------------------------------------------------
Stig Uhlin                  tel 090-786 60 71
Stig.Uhlin(a)umdac.umu.se
UMDAC                       070-591 03 04
901 87 Umeå                 fax 090-786 67 62
------------------------------------------------------------------------
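
The "Data type mismatch" above looks consistent with string() producing text while var2 is an integer, although that is an inference from this printout and not a statement about EpiData Analysis internals. As a language-neutral illustration of the intended operation (take the first two characters of var1 and store them as a number), here is a small Python sketch; the helper name leading_int and its parameters are hypothetical.

```python
def leading_int(text, start=1, length=2):
    """Convert `length` characters of `text`, from 1-based `start`, to an int.

    Returns None when the selected piece is not a valid integer.
    """
    piece = text[start - 1:start - 1 + length].strip()
    try:
        return int(piece)
    except ValueError:
        return None


# The three cases listed for var1 in Test.rec above.
values = ["10/01", "11/02", "11/03"]
print([leading_int(v) for v in values])
```

The two steps are kept separate on purpose: first cut out the substring, then convert it to a number. Returning None for a non-numeric piece stands in for a blank (missing) value.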

How to: checking for duplicate records
by epidata-list＠lists.umanitoba.ca 16 Jan '06

The decision should be to define the time for this:

a. At entry of data
b. As part of the analysis or quality control during entry

For A:
........................................................................

Two commands are important here:

Key unique
Autosearch list

The "Key Unique" will prevent more than one instance of a value in a given field from being entered. "Autosearch list" will show you any other record with the same value as the current field already entered in the file.

A "key unique" status can also be used when creating combined fields. E.g. if you wish to combine id + area + date of visit, then the key status will be checked on the combination of these.

qes:

id person id ####
area township #
visit date of visit #

chk file: id and area are defined mustenter, and so is date of visit.

* control assumed a string variable
visit
  mustenter
  after entry
    control = "id" + string(id) + " area:" + string(area) + "visit:" + visit
  end
end

control
  * notice as well "key unique" as "noenter"
  key unique
  noenter
end

........................................................................
Analysis or quality control.

To test this I added five double records to the bromar.rec file by copy and paste, with the same id numbers. I also added 5 records but changed the id to a new value.

First the easy one. Let us find the five where id number is the same:

* Assume you wish to see variables with same value of id sex age and dectime
* I prefer the "gen" to the "define k _____ " but this is not important.
gen s(80) k = "id: " + string(id)
k = trim(k) + "s:" + trim(string(sex) ) // the first "trim(k)" is important !!!
k = trim(k) + "a:" + trim(string(age) )
k = trim(k) + "d:" + trim(string(dectime) )
sort k

* list now any two records where two consecutive records have the same value
* notice [ _n-1] indicates previous record [ _n+1] next record
* notice also the many parentheses. They could be removed but make sure of what is what.
list if k = ( ( k[_n-1] ) or ( k[_n-1]) )

* An alternative would be to create a frequency table, but with many records (here 4000)
* this is not going to work. So we use instead:
define x #
x = 1
* now count how many times a given value in k was there
* do not show the table (takes too long) therefore /notable
aggregate k /sum=x /close /notable

. list n k if n > 1

N  K
2  id30 s:2a:55d:4.4833
3  id5 s:2a:51d:3.8667
2  id75 s:2a:52d:4.9167

So we know that records are the same for id 30, 5 and 75.

The same principle can be used for any number of variables which will fit within 80 characters.

The problem is how many records should be checked for possible double entry. For my example, the number of records to check in roughly 4100 records is:

Combine id and 3 variables: 5 (correct)
Three variables only (sex age dectime): 741 combinations with more than one record
Four variables only (sex age dectime km): 154 combinations with more than one record

So the question "Is there an easy way" is answered. It is easy if you have an id or a number of variables to compare by. Most likely more than 4-5 variables.

Jens Lauritsen
EpiData Association
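
The key-and-count idea in this message (build one string per record from the fields of interest, then count how often each string occurs) is not specific to EpiData. A minimal Python sketch of the same principle follows; the function names and the sample records are invented for illustration and do not come from bromar.rec.

```python
from collections import Counter


def composite_key(record, fields):
    """Join the chosen fields into one string, like the k variable above."""
    return " ".join(f"{name}:{str(record[name]).strip()}" for name in fields)


def find_duplicates(records, fields):
    """Return {key: count} for every key that occurs more than once."""
    counts = Counter(composite_key(r, fields) for r in records)
    return {key: n for key, n in counts.items() if n > 1}


# Made-up records; ids 30 and 5 are deliberately repeated.
records = [
    {"id": 30, "sex": 2, "age": 55},
    {"id": 30, "sex": 2, "age": 55},
    {"id": 5, "sex": 2, "age": 51},
    {"id": 75, "sex": 2, "age": 52},
    {"id": 5, "sex": 2, "age": 51},
    {"id": 5, "sex": 2, "age": 51},
]

for key, n in find_duplicates(records, ["id", "sex", "age"]).items():
    print(n, key)
```

As in the message, the choice of fields decides how useful the result is: leaving out the id and comparing only a few variables will report many combinations that are shared by genuinely different records.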

Converting string to number
by epidata-list＠lists.umanitoba.ca 13 Jan '06

> In Epi6 I often used the following
>
> define var1 ##
> let var1=stringvar[startposition,number ]
>
> to convert a part of a string variable to an integer variable. What is the
> equivalent in EpiStat?

The way to find this information in EpiData Analysis is the following:

1. Press F1 (latest version 1.1).
2. This will either show the complete "commands and functions" reference, where a section shows functions, or a specific command, if a word at the beginning of the command prompt is found as a key word in the commands reference.

The answer to your question can then be found by writing in the command prompt:

functions // and then press F1

or by just pressing F1 and looking for the functions near the bottom of the file.

Answer:

define var1 ##
var1 = string(var2)
*or if you wish only a part of var2:
var1 = string(copy(var2,1,1))

or simpler with command "gen":

gen i var1 = string(var2)
*or if you wish only a part of var2:
gen i var1 = string(copy(var2,1,1))

"i" stands for integer variable

regards
Jens Lauritsen
EpiData Association, Denmark

New website design and double entry problem
by epidata-list＠lists.umanitoba.ca 13 Jan '06

A "cure" for the double entry problem with special calculations has been found for EpiData Entry. A new build will be put out within the next few days.

On the www.epidata.dk website a new front page has been created. I expect to adapt the menu on the left side within a week. Users having problems seeing parts of the page, please send a message to the list or to "info at epidata.dk".

regards
Jens Lauritsen
EpiData Association
