Since the release of the first public test build of EpiData Analysis in
October last year, good progress has been made. In judging that progress
it is important to keep in mind the nature of the EpiData Association, the
size of the economic contributions for development, and the pricing
principle: freeware.
The latest build is available from http://www.epidata.dk/testing.php
The list of what is fixed and what remains, by build number, is at:
http://www.epidata.dk/analysisinfo/docs/versioninfo.htm
The bug reporting database is shown at:
http://asp.epidata.dk/support/default.asp
The routine tests now include more than 350 conditions, covering overall,
stratified and segmented analyses and comparing the results with external
standard estimates. The aim of the complete test suite is to certify all
available commands for a chosen number of representative datasets.
The definite condition to meet before release is that the quality system
certifies both stability and correctness of estimation, and also that the
implemented principles are logical in structure and documented such that
decisions and requirements are clear to the end user.
An example of this is the recently prepared detailed description of the
principles behind the HTML and style sheet output. The number of
combinations of fonts, sizes and character sets is very large, which
requires control at the level of the end user. This control is available
now, but for persons with little or no knowledge of HTML and style sheet
principles the control of output can be a mystery.
The status list includes:
a. The main design principles (user interface, output formulation, number
of commands, handling of missing data) are final and most of it is working.
b. Finalisation of a simple way for users to define the interface with
local fonts etc. as part of the installation is almost finished.
c. A coherent strategy for handling missing values across commands is
being implemented.
d. Some issues with table estimation exist, e.g. with the correct
orientation (turning) of outcome and exposure. Some "Floating point" and
"Pointer" error messages occur with some combinations of data.
e. Some aspects of graphing are causing problems (this also relates to
point b).
Re c: e.g. if the user requests frequency tables for say 5 variables:
1. freq a b c d e
2. freq a b c d e /nomissing
The first will show all data for all records, whereas the second will
require complete data for all five variables. This principle should be the
same across all commands, and its implementation is needed as part of
testing that the data are handled correctly.
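The difference between the two requests can be illustrated outside EpiData
with a small Python sketch. This is not how EpiData Analysis is
implemented; the function freq(), the nomissing argument and the example
records are made up for illustration, and None stands for a missing value.

```python
from collections import Counter

# Hypothetical example records; None marks a missing value.
records = [
    {"a": 1, "b": 2},
    {"a": 1, "b": None},
    {"a": None, "b": 2},
]


def freq(records, variables, nomissing=False):
    """Return one Counter of values per variable.

    With nomissing=True only records that are complete on *all* listed
    variables are counted; otherwise every record is counted and missing
    values are tallied under None.
    """
    if nomissing:
        records = [
            r for r in records
            if all(r[v] is not None for v in variables)
        ]
    return {v: Counter(r[v] for r in records) for v in variables}


all_records = freq(records, ["a", "b"])
complete_only = freq(records, ["a", "b"], nomissing=True)
```

With these three records, the plain request counts all three records for
each variable, while the nomissing request counts only the one record that
has both a and b filled in.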
Kind regards
--
Jens Lauritsen
Coordinator and initiator of EpiData Project
http://www.epidata.dk
Dirk Enzmann wrote:
> Using the following .qes and .chk files with EpiData version 3.1,
> EpiData's behavior is strange:
>
> If I use the [-] numeric key to enter missing values in the field
> "schtype", EpiData jumps to the field "male" (i.e. "class" receives no
> focus). If I use the [-] key in field "male", EpiData issues the error
> message "'-' is not a valid floating point value" although missing
> values are defined.
>
> The problem seems to be the movement to the next field after using the
> [-] key, because if I drop field "coder" the problem of field "male"
> does not occur.
>
> Is there a problem in the check file I did overlook or is this a
> problem of EpiData version 3.1? Are there any solutions/suggestions?
>
There are two problems in the above:
1. I've noticed that the [-] key sometimes gives this error - the field
is not accepted (the missing values are not recognized as legal). This
seems to be a random occurrence. Closing the file and starting data
entry again has solved the problem for me.
2. The problem of EpiData skipping a field after using [-] to enter a
missing value can be addressed for now by checking whether the skipped
field is blank before proceeding (see below). This problem only arises
when the default missing value has the maximum number of digits in a
numeric field. In Dirk's case, the missing value is 9997 or 9999 in a ####
field. Another solution is to make these special missing value codes
negative numbers (e.g. -1) or smaller numbers (e.g. 999), which are
shorter than the field width. Then there is no skipping. This problem
does not occur with fields of length one.
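The rule of thumb in point 2 can be written down as a small Python check
for screening missing value codes before choosing them. It only encodes
the observation described above (a code that fills the whole field seems
to trigger the skip, except in one-character fields); it says nothing
about EpiData's internals, and the function names are made up.

```python
def fills_field(code: int, width: int) -> bool:
    """True if typing `code` uses every position of a numeric field."""
    return len(str(code)) >= width


def may_skip_next_field(code: int, width: int) -> bool:
    """Apply the observed rule: full-width codes in fields longer than one."""
    return width > 1 and fills_field(code, width)


# Candidate missing value codes for a #### field (width 4).
candidates = [9997, 9999, 999, -1]
risky = [c for c in candidates if may_skip_next_field(c, 4)]
```

For a #### field this flags 9997 and 9999 but not 999 or -1, which matches
the two workarounds suggested above.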
Jamie Hockin
Public Health Agency of Canada
---------------------------------
* -- .chk --------------------------------
LABELBLOCK
LABEL country
* [0.2 ]
430 Austria
320 Belgium
3870 "Bosnia and Herzegovina"
11 Canada
3850 Croatia
3570 Cyprus
4200 "Czech Republic"
450 Denmark
440 "England and Wales"
3720 Estonia
3580 Finland
330 France
490 Germany
360 Hungary
3540 Iceland
3530 Ireland
390 Italy
3710 Latvia
3700 Lithuania
310 Netherlands
442 "Northern Ireland"
470 Norway
480 Poland
70 Russia
441 Scotland
3810 Serbia
3860 Slovenia
460 Sweden
410 Switzerland
10 "United States"
580 Venezuela
END
LABEL schtype
* [0.3 ]
1 "slot1 (lowest grade)"
2 slot2
3 slot3
4 slot4
5 slot5
6 slot6
7 slot7
9998 "other schooltype"
9999 "no answer"
END
LABEL class
* [0.4 ]
9997 "ambiguous answer"
9999 "no answer"
END
LABEL male
* [1.0 ]
1 male
2 female
7 "ambiguous answer"
9 "no answer"
END
END
* ========================================
BEFORE FILE
HELP "ISRD2 pilot study nation"
END
* ========================================
casenum
* [0.1 ]
KEY
Autosearch casenum
RANGE 1 9999
MUSTENTER
TYPE STATUSBAR "CaseNum = "
END
country
* [0.2 ]
* RANGE 1 9996
BEFORE ENTRY
IF (ISBLANK(country)) THEN
COMMENT LEGAL USE country SHOW
ENDIF
END
COMMENT LEGAL USE country
MUSTENTER
REPEAT
TYPE COMMENT
END
schtype
* [0.3 ]
RANGE 1 9996
MISSINGVALUE -1
* COMMENT LEGAL USE schtype
MUSTENTER
* TYPE COMMENT
END
class
* [0.4 ]
RANGE 1 9996
MISSINGVALUE 9997 9999
* COMMENT LEGAL USE class
MUSTENTER
AFTER ENTRY
IF (class = 9997) THEN
WRITENOTE "ambiguous answer"
ENDIF
END
* TYPE COMMENT
END
male
* [1.0 ]
RANGE 1 2
MISSINGVALUE 7 9
* COMMENT LEGAL USE male
* before entry
* if (isblank(class)) then
* goto class
* endif
* end
MUSTENTER
* TYPE COMMENT
AFTER ENTRY
IF (male = 7) THEN
WRITENOTE "ambiguous answer"
ENDIF
END
END
coder
* [66.0s]
BEFORE ENTRY
if (isblank(male)) then
goto male
endif
TYPE "(alphanumeric)" GREEN
END
MUSTENTER
WRITENOTE "@coder"
END
* -- end .chk --------------------------------
Using the following .qes and .chk files with EpiData version 3.1,
EpiData's behavior is strange:
If I use the [-] numeric key to enter missing values in the field
"schtype", EpiData jumps to the field "male" (i.e. "class" receives no
focus). If I use the [-] key in field "male", EpiData issues the error
message "'-' is not a valid floating point value" although missing
values are defined.
The problem seems to be the movement to the next field after using the
[-] key, because if I drop field "coder" the problem of field "male"
does not occur.
Is there a problem in the check file I did overlook or is this a problem
of EpiData version 3.1? Are there any solutions/suggestions?
BTW: Unfortunately, if I want to label the missing values (important
feedback to the coder when there are different types of "missingness"), I
cannot use EpiData's facility for entering missing values with the [-]
key, because I cannot (better: should not) use COMMENT LEGAL and
MISSINGVALUE with overlapping sets of values for the same field. This
underscores the importance of the suggestions in my last mail (July 1,
2005) to the list.
Here is the code of the .qes and .chk files:
* -- .qes --------------------------------
casenum country specific casenumber [0.1 ] ####
country country code [0.2 ] ####
schtype school type [0.3 ] ####
class class [0.4 ] ####
male gender [1.0 ] #
coder Coder Initials: [66.0s] <A >
* -- end .qes ----------------------------
* -- .chk --------------------------------
LABELBLOCK
LABEL country
* [0.2 ]
430 Austria
320 Belgium
3870 "Bosnia and Herzegovina"
11 Canada
3850 Croatia
3570 Cyprus
4200 "Czech Republic"
450 Denmark
440 "England and Wales"
3720 Estonia
3580 Finland
330 France
490 Germany
360 Hungary
3540 Iceland
3530 Ireland
390 Italy
3710 Latvia
3700 Lithuania
310 Netherlands
442 "Northern Ireland"
470 Norway
480 Poland
70 Russia
441 Scotland
3810 Serbia
3860 Slovenia
460 Sweden
410 Switzerland
10 "United States"
580 Venezuela
END
LABEL schtype
* [0.3 ]
1 "slot1 (lowest grade)"
2 slot2
3 slot3
4 slot4
5 slot5
6 slot6
7 slot7
9998 "other schooltype"
9999 "no answer"
END
LABEL class
* [0.4 ]
9997 "ambiguous answer"
9999 "no answer"
END
LABEL male
* [1.0 ]
1 male
2 female
7 "ambiguous answer"
9 "no answer"
END
END
* ========================================
BEFORE FILE
HELP "ISRD2 pilot study nation"
END
* ========================================
casenum
* [0.1 ]
KEY
Autosearch casenum
RANGE 1 9999
MUSTENTER
TYPE STATUSBAR "CaseNum = "
END
country
* [0.2 ]
* RANGE 1 9996
BEFORE ENTRY
IF (ISBLANK(country)) THEN
COMMENT LEGAL USE country SHOW
ENDIF
END
COMMENT LEGAL USE country
MUSTENTER
REPEAT
TYPE COMMENT
END
schtype
* [0.3 ]
RANGE 1 9996
MISSINGVALUE 9999
* COMMENT LEGAL USE schtype
MUSTENTER
* TYPE COMMENT
END
class
* [0.4 ]
RANGE 1 9996
MISSINGVALUE 9999 9997
* COMMENT LEGAL USE class
MUSTENTER
AFTER ENTRY
IF (class = 9997) THEN
WRITENOTE "ambiguous answer"
ENDIF
END
* TYPE COMMENT
END
male
* [1.0 ]
RANGE 1 2
MISSINGVALUE 9 7
* COMMENT LEGAL USE male
MUSTENTER
* TYPE COMMENT
AFTER ENTRY
IF (male = 7) THEN
WRITENOTE "ambiguous answer"
ENDIF
END
END
coder
* [66.0s]
BEFORE ENTRY
TYPE "(alphanumeric)" GREEN
END
MUSTENTER
WRITENOTE "@coder"
END
* -- end .chk --------------------------------
Dirk
*************************************************
Dr. Dirk Enzmann
Institute of Criminal Sciences
Dept. of Criminology
Edmund-Siemers-Allee 1
D-20146 Hamburg
Germany
phone: +49-040-42838.7498 (office)
+49-040-42838.4591 (Billon)
fax: +49-040-42838.2344
email: dirk.enzmann(a)jura.uni-hamburg.de
www:
http://www2.jura.uni-hamburg.de/instkrim/kriminologie/Mitarbeiter/Enzmann/E…
*************************************************
I noticed some differences between EpiData 3.0 and 3.1 and a possible
bug in version 3.1:
1) If I use the check command TYPE STATUSBAR "CaseNum = ", the value of
the field (here: casenum) is shown in the status bar only while the focus
is on that field. If I move to other fields, the value is no longer
visible. This occurs in version 3.1 but not in 3.0, so it seems to be an
error in version 3.1.
2) Versions 3.0 and 3.1 differ in the way they display values defined by
RANGE, MISSINGVALUE, and/or COMMENT LEGAL (field).
For example, I have a multiple response question whose first question is
a filter (if it is checked, the following multiple response questions do
not apply). Or there is no filter variable, but I want the coder to decide
first whether any of the multiple response questions are checked (this is
what I use in the following example). If not, all values should be set to
missing (9 - no answer) and the focus should move to the next block of
questions. If, on the other hand, at least one answer is checked, the
focus should move to the first of the multiple response questions and the
user has to enter one of the following values: 0 (not checked),
1 (checked), or 7 (ambiguous answer). When all data are entered, I want
the value labels to be defined as follows (because I want the SPSS syntax
to attach value labels to missing values as well; 8 is necessary because
the whole block of questions may not be applicable):
* ---------------------------------------
LABEL notice
0 "somebody found out"
1 "nobody found out"
7 "ambiguous answer"
8 "not applicable"
9 "no answer"
END
* ---------------------------------------
However, to help the coder, I want him/her to see not these labels but the
following while entering the data:
-----------------------------------------
0 - (not checked)
1 - no
7 - (ambiguous answer)
-----------------------------------------
With EpiData version 3.0 this is possible if I use the following check
commands:
* =======================================
beerdt
MISSINGVALUE 9 8 7 * <- (1)
BEFORE ENTRY
HELP "MULTIPLE RESPONSE QUESTION\n\nQuestion 46.6: Anything checked? (y/n)" KEYS="YN"
IF (RESULTVALUE=2) THEN
beerdt=9
beerdtpa=9
beerdtpo=9
beerdtte=9
beerdtse=9
GOTO beerpun
ENDIF
COMMENT LEGAL SHOW * <- (2)
0 "(not checked)"
1 no
7 "(ambiguous answer)"
END
END
COMMENT LEGAL USE notice * <- (3)
MUSTENTER
TYPE COMMENT RED
JUMPS
1 beerpun
9 beerpun
END
AFTER ENTRY
IF (beerdt=0) THEN
clear beerdtpa
clear beerdtpo
clear beerdtte
clear beerdtse
ENDIF
IF (beerdt=1) THEN
beerdtpa=0
beerdtpo=0
beerdtte=0
beerdtse=0
ENDIF
IF (beerdt = 7) THEN
WRITENOTE "ambiguous answer"
ENDIF
IF (beerdt=9) THEN
beerdtpa=9
beerdtpo=9
beerdtte=9
beerdtse=9
ENDIF
END
END
* =======================================
However, with EpiData 3.1, I cannot use "MISSINGVALUE 9 8 7" (see: * <- (1))
_and_ "COMMENT LEGAL SHOW" (see: * <- (2)) or "COMMENT LEGAL USE notice"
(see: * <- (3)) simultaneously. If I use (1) and (2), the user sees:
----------------------
0 - (not checked)
1 - no
7 - (ambiguous answer)
9 - missing
8 - missing
7 - missing
----------------------
and is able to enter 9 (which should not be used) or 8 (which would be
wrong). If I only use (2), the value (9) or (8) cannot be set by the
"BEFORE ENTRY .. END" command or by an IF .. THEN condition in another
field. To use (3) would also be no solution to this problem. Thus, only
version 3.0 seems to allow a solution to this problem.
My main question is:
Are there any reasons why I should prefer version 3.1 to version 3.0? Or
which check commands do I need to solve the problem in version 3.1?
With respect to the missing values, it would be extremely helpful if
EpiData could create SPSS syntax that defines missing values according
to the definitions in the EpiData check file. However, if (as in
version 3.1) I cannot use MISSINGVALUE and "COMMENT LEGAL USE labelname"
simultaneously (because I want to label the missing values and to show
the coder the meaning of the values entered), there are no missing
value definitions that EpiData could use to create the SPSS syntax file.
I think that there are two solutions to this problem:
(1) If MISSINGVALUE and "COMMENT LEGAL USE labelname" are used
simultaneously, the values defined in the LABEL block get priority when
the legal values are displayed (and the MISSINGVALUE values are not
duplicated in the display). This would solve the problem described above
as well.
(2) Alternatively, if the MISSINGVALUE command is commented out, EpiData
uses only the values defined by RANGE or COMMENT LEGAL for data entry, but
uses the values in the MISSINGVALUE command that is preceded by "*" for
creating the missing value definition in the SPSS syntax. For example:
* ----------------------------------------
LABELBLOCK
LABEL variable
1 no
2 yes
8 "not applicable"
9 "no answer"
END
END
variable
COMMENT LEGAL USE variable
* MISSINGVALUE 9 8
END
* ----------------------------------------
would create the following SPSS syntax:
* ----------------------------------------
VALUE LABELS variable (1) 'no'
(2) 'yes'
(8) 'not applicable'
(9) 'no answer'.
MISSING VALUES variable (9,8).
* ----------------------------------------
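To make the requested output concrete, here is a minimal Python sketch that
builds the syntax lines shown above from a label set and a list of missing
value codes. It is only an illustration of the proposal, not EpiData's
export routine; the function name spss_syntax() and its arguments are
made up, and the layout simply copies the example above.

```python
def spss_syntax(name, labels, missing):
    """Build VALUE LABELS and MISSING VALUES lines for one variable.

    labels  -- dict mapping value -> label text, in display order
    missing -- list of values to declare as missing
    """
    lines = []
    pairs = [f"({value}) '{text}'" for value, text in labels.items()]
    if pairs:
        # First pair shares the line with the command, the rest follow.
        lines.append(f"VALUE LABELS {name} {pairs[0]}")
        lines.extend(pairs[1:])
        lines[-1] += "."
    if missing:
        codes = ",".join(str(m) for m in missing)
        lines.append(f"MISSING VALUES {name} ({codes}).")
    return "\n".join(lines)


syntax = spss_syntax(
    "variable",
    {1: "no", 2: "yes", 8: "not applicable", 9: "no answer"},
    [9, 8],
)
print(syntax)
```

The label set and the missing value codes are the ones from the check file
example above; the point is only that both pieces of information have to
be available in the check file for such syntax to be generated.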
Regards,
Dirk