> I have 70 variables, named var1, var2, . . . var70
> Each has a corresponding numerical variable, which is calculated from
> the value of var, during data entry, via an IF THEN ELSE statement in
> the check file:
>
> var1num, var2num, . . . var70num
>
> I would like each varXnum to be zero by default, when a new record is
> opened.
Easy. Try:
before file
defaultvalue var1num-var70num 0
end
You must get the latest build of v3.1 (Jan 2006) for this to work. Also, the
variables must be placed in sequence. The first implementation of
defaultvalue was a bit different; updated installation files, with the
text below in the help files, will be uploaded within about 30 minutes.
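To see why the sequence requirement matters, here is a minimal Python sketch (not EpiData code). It assumes that a range such as var1num-var70num is resolved by position in the file, which is my reading of "must be placed in sequence". The function name and both field layouts are hypothetical.

```python
def expand_field_range(fields_in_file_order, first, last):
    """Return every field from `first` to `last`, by position in the file."""
    start = fields_in_file_order.index(first)
    end = fields_in_file_order.index(last)
    if end < start:
        raise ValueError(f"{last} comes before {first} in the file")
    return fields_in_file_order[start:end + 1]

# Hypothetical layout 1: the numeric fields are consecutive, so the range is clean.
consecutive = ["id", "var1", "var2", "var3", "var1num", "var2num", "var3num"]
print(expand_field_range(consecutive, "var1num", "var3num"))

# Hypothetical layout 2: each varXnum follows its varX, so the same range
# would also sweep up var2 and var3.
interleaved = ["id", "var1", "var1num", "var2", "var2num", "var3", "var3num"]
print(expand_field_range(interleaved, "var1num", "var3num"))
```

With the interleaved layout the range would also cover the non-numeric fields, so under this assumption the varXnum fields should sit next to each other in the file.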
The command DEFAULTVALUE can be used to assign a certain value instead
of “blank” for any fields.
Default values can be defined like this:
In a BEFORE FILE or BEFORE RECORD block you add:
DEFAULTVALUE ALL | ALLSTRINGS | ALLSTRING | ALLNUMERIC x
Similarly for lists of fields:
DEFAULTVALUE field1-field10,field12 x
DEFAULTVALUE field14-field20,field11 "x"
In a field block:
DEFAULTVALUE x
A DEFAULTVALUE expands the range, legal or comment legal definitions of
a field. If DEFAULTVALUE 9 is part of a field block, then the value 9 may
be entered even if, e.g., RANGE 1-5 is also defined. A field block
specification overrules a general definition in a BEFORE FILE or
BEFORE RECORD block.
DEFAULTVALUEs will be exported as part of any COMMENT LEGAL
definitions for the field in question.
The category text assigned will be “Default”.
*NOTE: The value assignment only takes place in new records. Not in
already entered records.*
Examples:
(in before file block):
DEFAULTVALUE ALL 9 fills all numerical and string fields with 9
DEFAULTVALUE ALLSTRINGS “No info” fills all string fields with “No info”
(in a field block):
DEFAULTVALUE 2001 assigns the value 2001 as default.
(in a string field):
DEFAULTVALUE “No Information” assigns the value “No Information” as default.
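The three rules in the help text (a field block overrules the general definition, the default is accepted even outside a RANGE, and defaults are assigned only in new records) can be modelled in a few lines of Python. This is only a sketch of the rules as described above, not of how EpiData implements them. All function names are hypothetical.

```python
def default_for(field, general_default, field_defaults):
    """A field-block default overrules the general one."""
    return field_defaults.get(field, general_default)

def is_allowed(value, low, high, default=None):
    """A value passes if it is inside the range or equals the default."""
    in_range = low <= value <= high
    return in_range or (default is not None and value == default)

def open_record(fields, general_default, field_defaults, existing=None):
    """Defaults are filled in only for new records, never for entered ones."""
    if existing is not None:
        return dict(existing)
    return {f: default_for(f, general_default, field_defaults) for f in fields}

fields = ["var1num", "year"]
new_record = open_record(fields, 0, {"year": 2001})
old_record = open_record(fields, 0, {"year": 2001},
                         existing={"var1num": 3, "year": 1999})
print(new_record)   # general default for var1num, field-block default for year
print(old_record)   # unchanged
print(is_allowed(9, 1, 5, default=9), is_allowed(9, 1, 5))
```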
Regards
Jens Lauritsen
EpiData Association
ps. The revising of the website is almost done (a few pages are left).
I have 70 variables, named var1, var2, . . . var70
Each has a corresponding numerical variable, which is calculated from
the value of var, during data entry, via an IF THEN ELSE statement in
the check file:
var1num, var2num, . . . var70num
I would like each varXnum to be zero by default, when a new record is
opened. I found the BEFORE RECORD command, which it seems would do the
job. But I want to save on typing, so is there a wildcard character,
such as * , so I can type just one command in a BEFORE RECORD block,
like this:
BEFORE RECORD
var*num = 0
END
Thanks.
--
Christopher W. Ryan, MD
SUNY Upstate Medical University Clinical Campus at Binghamton
and Wilson Family Practice Residency, Johnson City, NY
cryanatbinghamtondotedu
GnuPG and PGP public keys available at http://pgp.mit.edu
"If you want to build a ship, don't drum up the men to gather wood,
divide the work and give orders. Instead, teach them to yearn for the
vast and endless sea." [Antoine de St. Exupery]
Apologies for having expressed the
"compare current record with previous record or next record"
as list if k = ( ( k[_n-1] ) or ( k[_n-1]) )
which should have been:
list if ( ( k = k[_n-1] ) or ( k = k[_n+1]) )
The aggregate command (tab stat) does not have a /save option. The most
efficient approach is:
aggregate k /sum=x /close /notable
select if n > 1
savedata "newfile.rec"
Then the newfile.rec would contain only replicates.
Jens Lauritsen
EpiData Association
ps. The revised web site www.epidata.dk is still under construction. I will
post a notice when the new design has been completely implemented.
Yes - I got this error too. The LIST statement should read:
list if (k = k[_n-1]) or (k = k[_n+1])
* if you only want the first of two records to appear:
list if k = k[_n+1]
*** Of course you have to insert some duplicate records into a copy of bromar.rec first ***
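For readers who want to see the neighbour comparison outside EpiData, here is a small Python sketch of the same idea: sort the keys, then compare each one with the record before and after it. The function names and the sample keys are hypothetical.

```python
def adjacent_duplicates(keys):
    """Mimic: list if (k = k[_n-1]) or (k = k[_n+1]) on sorted data."""
    ordered = sorted(keys)
    hits = []
    for i, k in enumerate(ordered):
        same_as_prev = i > 0 and k == ordered[i - 1]
        same_as_next = i + 1 < len(ordered) and k == ordered[i + 1]
        if same_as_prev or same_as_next:
            hits.append(k)
    return hits

def equal_to_next(keys):
    """Mimic: list if k = k[_n+1] (drops the last record of each run)."""
    ordered = sorted(keys)
    return [k for i, k in enumerate(ordered[:-1]) if k == ordered[i + 1]]

# Hypothetical keys: one unique, one pair, one triple.
keys = ["s:2a:55", "s:1a:40", "s:2a:55", "s:2a:51", "s:2a:51", "s:2a:51"]
print(adjacent_duplicates(keys))
print(equal_to_next(keys))
```

Note that comparing only with the next record gives exactly one line per duplicate when records come in pairs; a value that occurs three times would be listed twice.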
|Indeed some of the line breaks threw it a bit. I reformatted below - but
|still get that strange TRUNC error like:
|. list if k = (( k[_n-1]) or (k[_n+1]))
|Operator TRUNC is incompatible with String
|Operation aborted
Some of the line breaks in the example did not come through properly.
This is how I ran it:
***********
Read "C:\Program Files\EpiData\validate\estimates\bromar.rec"
gen s(25) k = "s:" + trim(string(sex) )
k = trim(k) + "a:" + trim(string(age) )
k = trim(k) + "d:" + trim(string(dectime) )
sort k
gen x=1
aggregate k /sum=x /close /notable
freq n
***********
It works fine. I think you were not executing the 'aggregate' statement.
With /close, the 'aggregate' command closes your working file and
creates a new one with the following fields:
k, n, nx, sumx
For this example, n, nx and sumx are the same values.
Jamie Hockin
Public Health Agency of Canada
Is there an easy way to check for duplicate records during data
cleaning after data entry? Currently we export EpiData sets to MS
Excel and run a little routine, but maybe someone has a solution
within EpiData. Tx Max
I tried the solutions given but they didn't work;
perhaps I have made some errors. Here is the printout:
. read
Loading data C:\Data\Test.rec, please wait..
File name : C:\Data\Test.rec
Fields: 1 Total records: 3 Valid records: 3
*******************************************************************************************
NOTE: Test.rec contains the variable var1 (string) and the following cases:
10/01
11/02
11/03
*******************************************************************************************
. Run "C:\DOCUME~1\STIG~1.UHL\LOKALA~1\Temp\TMP67F.Tmp"
. define var2 ##
Var Name VAR2 of type Integer
Var length: 2 decimals 0
. var2 = string(copy(var1,1,2))
Data type mismatch
. close
File closed
. read
Loading data C:\Data\Test.rec, please wait..
File name : C:\Data\Test.rec
Fields: 1 Total records: 3 Valid records: 3
. Run "C:\DOCUME~1\STIG~1.UHL\LOKALA~1\Temp\TMP680.Tmp"
. gen i var2 = string(copy(var1,1,2))
Data type mismatch
------------------------------------------------------------------------
Stig Uhlin tel 090-786 60 71 Stig.Uhlin(a)umdac.umu.se
UMDAC 070-591 03 04
901 87 Umeå fax 090-786 67 62
------------------------------------------------------------------------
The first decision is when to do the check:
a. At entry of data
b. As part of the analysis or quality control during entry
For A:
.......................................................................................................................................
Two commands are important here:
Key unique
Autosearch list
The "Key Unique" will prevent the same value from being entered more than
once in a given field.
"Autosearch list" will show you any other record already entered in the
file with the same value as the current field.
A "key unique" status can also be used when creating combined fields.
E.g. if you wish to combine id + area + date of visit, then uniqueness
will be checked on the combination of these.
qes file:
id     person id      ####
area   township       #
visit  date of visit  #
chk file:
* id, area and date of visit are defined as mustenter.
* control is assumed to be a string variable
visit
mustenter
after entry
control = "id" + string(id) + " area:" + string(area) + "visit:" + visit
end
end
control
* notice both "key unique" and "noenter"
key unique
noenter
end
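
The logic of a unique key on a combined field can be sketched in Python as follows. This only illustrates the idea (build one control string per record, refuse a second record with the same string). It says nothing about how EpiData does it internally. The class and function names are hypothetical, and the visit value is treated as a string, as in the check file above.

```python
def control_key(record_id, area, visit):
    """Build the combined control string, as in the AFTER ENTRY block."""
    return "id" + str(record_id) + " area:" + str(area) + "visit:" + visit

class UniqueKeyIndex:
    """Remembers every key seen so far and rejects repeats."""

    def __init__(self):
        self._seen = set()

    def add(self, key):
        if key in self._seen:
            raise ValueError(f"duplicate key: {key!r}")
        self._seen.add(key)

index = UniqueKeyIndex()
index.add(control_key(1001, 3, "12/01/2006"))
index.add(control_key(1001, 3, "19/01/2006"))      # same person, new visit: accepted
try:
    index.add(control_key(1001, 3, "12/01/2006"))  # same combination: rejected
except ValueError as err:
    print(err)
```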
.......................................................................................................................................
For B: analysis or quality control.
To test this I added five duplicate records to the bromar.rec file by copy
and paste, keeping the same id numbers. I also added 5 records but
changed the id to a new value.
First the easy one. Let us find the five where id number is the same:
* Assume you wish to see variables with same value of id sex age and dectime
* I prefer the "gen" to the "define k _____ " but this is not important.
gen s(80) k = "id: " + string(id)
k = trim(k) + "s:" + trim(string(sex) ) // the first trim(k) is important !!!
k = trim(k) + "a:" + trim(string(age) )
k = trim(k) + "d:" + trim(string(dectime) )
sort k
* now list any records where two consecutive records have the same value
* notice [_n-1] indicates the previous record, [_n+1] the next record
* notice also the many parentheses. They could be removed but make clear what is what.
list if k = ( ( k[_n-1] ) or ( k[_n-1]) )
* An alternative would be to create a frequency table, but with many records (here 4000)
* this is not going to work. So we use instead:
define x #
x = 1
* now count how many times a given value in k was there
* do not show the table (takes too long) therefore /notable
aggregate k /sum=x /close /notable
. list n k if n > 1
N  K
2  id30 s:2a:55d:4.4833
3  id5 s:2a:51d:3.8667
2  id75 s:2a:52d:4.9167
So we know that there are duplicate records for id 30, 5 and 75.
The same principle can be used for any number of variables which will
fit within 80 characters.
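For comparison, the count-by-key principle looks like this in Python. It is a sketch only: the records are made-up values shaped like the listing above, and make_key and replicated_keys are hypothetical names. Labelling each part of the key plays the same role as the "s:", "a:" and "d:" prefixes, so that values from different fields cannot run together.

```python
from collections import Counter

def make_key(rec, fields):
    """Join labelled, trimmed values into one comparison key."""
    return " ".join(f"{name}:{str(rec[name]).strip()}" for name in fields)

def replicated_keys(records, fields):
    """Count each key and keep only those seen more than once (n > 1)."""
    counts = Counter(make_key(r, fields) for r in records)
    return {key: n for key, n in counts.items() if n > 1}

# Made-up records: id 30 twice, id 5 three times, id 12 once.
records = (
    [{"id": 30, "sex": 2, "age": 55, "dectime": 4.4833}] * 2
    + [{"id": 5, "sex": 2, "age": 51, "dectime": 3.8667}] * 3
    + [{"id": 12, "sex": 1, "age": 40, "dectime": 2.5}]
)

for key, n in replicated_keys(records, ["id", "sex", "age", "dectime"]).items():
    print(n, key)
```

Leaving "id" out of the field list gives the comparison on the remaining variables only.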
The problem is how many records must be checked for possible
double entry.
For my example, the number to check in roughly 4100 records is:
Combine id and 3 variables: 5 (correct)
Three variables only (sex age dectime) : 741 combinations with more
than one record
Four variables only (sex age dectime km) : 154 combinations with more
than one record
So the question "Is there an easy way" is answered: it is easy if you
have an id or a number of variables to compare by. Most likely you will
need more than 4-5 variables.
Jens Lauritsen
EpiData Association
In Epi6 I often used the following
define var1 ##
let var1=stringvar[startposition,number ]
to convert a part of a string variable to an integer variable.
What is the equivalent in EpiStat?
The way to find this information in EpiData Analysis is the following:
1. Press F1 (latest version 1.1).
2. This will show either the complete "commands and functions" reference,
which has a section on functions, or a specific command, if a word at the
beginning of the command prompt is found as a keyword in the commands reference.
The answer to your question can then be found by
writing in the command prompt:
functions // and then press F1
or by just pressing F1 and looking for the functions near the bottom of the file.
Answer:
define var1 ##
var1 = string(var2)
*or if you wish only a part of var2:
var1 = string(copy(var2,1,1))
or simpler with command "gen"
gen i var1 = string(var2)
*or if you wish only a part of var2:
gen i var1 = string(copy(var2,1,1))
"i" stands for integer variable
regards
Jens Lauritsen
EpiData Association, Denmark
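To make the intended result concrete, here is the same conversion in Python (not EpiData syntax). It assumes that copy(text, start, count) takes a 1-based start position and a character count, like the Epi6 form stringvar[startposition,number]. The helper name and the sample values are hypothetical.

```python
def leading_int(text, start, count):
    """Take `count` characters from 1-based position `start` and convert to int."""
    piece = text[start - 1:start - 1 + count]
    return int(piece)

# Sample string values of the form nn/nn.
values = ["10/01", "11/02", "11/03"]
print([leading_int(v, 1, 2) for v in values])
print([leading_int(v, 4, 2) for v in values])
```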
A "cure" for the double entry problem with special calculations has been found for EpiData Entry.
A new build will be put out within the next few days.
On the www.epidata.dk website a new front page has been created. I expect to adapt the
menu on the left side within a week.
Users who have problems seeing parts of the page, please send a message to the list
or to "info at epidata.dk"
regards
Jens Lauritsen
EpiData Association