[EpiData-list] stattables error and test principles.

Sat Jul 8 12:53:34 CDT 2006

Susanne Widmar pointed out an "estimation" error in a command:

> As you can see the missing values are included in the calculation for 
> the 90th percentile. It's obviously a bug. However you may not detect 
> the bug if you're interested in the interquartile range:

I quickly ran the same analysis with the most recent build available on
the test page (build 68), and it does not show the problem:
. version
Current version: 1.1 Release 8 (Build 68)
Latest public release 1.1 Release 1 (Build 62)

. stattables sex /idr=km
SEX        N   NKM  P10KM   P90KM
Kvinde   490   423  18.00  265.00
Mand    3537  3197   0.00  265.00

. stattables sex /iqr=km
SEX        N   NKM  P25KM   P75KM
Kvinde   490   423  42.00  237.00
Mand    3537  3197  36.00  192.00

. describe km if sex = 1
Variable   N=490       Sum    Mean    (95%    cfi)  Min   p5    p10    p25  Median     p75     p90     p95     Max
KM           423   50288.0  118.88  109.42  128.35  0.0  0.0  18.00  42.00   73.00  237.00  265.00  277.00  286.00

. describe km if sex = 2
Variable  N=3537       Sum    Mean    (95%    cfi)  Min   p5    p10    p25  Median     p75     p90     p95     Max
KM          3197  330796.0  103.47  100.23  106.71  0.0  0.0    0.0  36.00   71.00  192.00  265.00  265.00  286.00
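
The figures in the session above can be cross-checked against each
other. The following is a minimal Python sketch (not EpiData Analysis
syntax). The numbers are transcribed by hand from the output, sex = 1
and sex = 2 are matched to Kvinde and Mand on their N, and the
dictionary layout and the function name are assumptions made for
illustration only:

```python
# Values transcribed from the session output; the layout is illustrative.
stattables = {
    "Kvinde": {"N": 490, "NKM": 423, "P10": 18.0, "P25": 42.0, "P75": 237.0, "P90": 265.0},
    "Mand": {"N": 3537, "NKM": 3197, "P10": 0.0, "P25": 36.0, "P75": 192.0, "P90": 265.0},
}
describe = {
    "Kvinde": {"N": 490, "obs": 423, "sum": 50288.0, "mean": 118.88,
               "P10": 18.0, "P25": 42.0, "P75": 237.0, "P90": 265.0},
    "Mand": {"N": 3537, "obs": 3197, "sum": 330796.0, "mean": 103.47,
             "P10": 0.0, "P25": 36.0, "P75": 192.0, "P90": 265.0},
}


def inconsistencies(stat, desc):
    """Return (group, item) pairs where the two outputs disagree."""
    problems = []
    for group, s in stat.items():
        d = desc[group]
        # Counts: total records and records with a non-missing km value.
        if s["N"] != d["N"] or s["NKM"] != d["obs"]:
            problems.append((group, "counts"))
        # Percentiles reported by both commands must be identical.
        for p in ("P10", "P25", "P75", "P90"):
            if s[p] != d[p]:
                problems.append((group, p))
        # The mean must equal sum / non-missing count, to two decimals.
        if abs(d["sum"] / d["obs"] - d["mean"]) > 0.005:
            problems.append((group, "mean"))
    return problems


problems = inconsistencies(stattables, describe)
missing = {group: s["N"] - s["NKM"] for group, s in stattables.items()}

print(problems)  # []
print(missing)   # {'Kvinde': 67, 'Mand': 340}
```

The empty list means the two commands agree on every figure compared,
and the mean is consistent with the sum divided by the non-missing
count, so the missing values are not part of these calculations.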

The bug shown has apparently been corrected in build 68. To avoid such
errors, a comprehensive test suite has been developed for analysis.
It currently covers around 650 different conditions, and a test for
the problem above has been added.
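
A test of this kind could look roughly as follows. This is a minimal
Python sketch, not the actual EpiData test suite. It assumes that
missing values are stored as a code (9999 here, a hypothetical value)
and it uses a nearest-rank percentile, which may differ from the
definition used in EpiData Analysis. The function names and the data
are invented for illustration:

```python
MISSING = 9999  # hypothetical code for a missing value, not taken from EpiData


def nearest_rank(values, p):
    """p-th percentile (0 < p <= 100) of a non-empty list, nearest-rank definition."""
    ordered = sorted(values)
    rank = -(-p * len(ordered) // 100)  # ceiling of p * n / 100 without floats
    return ordered[rank - 1]


def percentiles(values, ps, drop_missing=True):
    """Percentiles of values; drop_missing=False imitates the reported error."""
    data = [v for v in values if v != MISSING] if drop_missing else list(values)
    return {p: nearest_rank(data, p) for p in ps}


# Synthetic data: 17 observed values (10, 20, ..., 170) and 3 missing.
km = list(range(10, 180, 10)) + [MISSING] * 3

correct = percentiles(km, (25, 75, 90))
buggy = percentiles(km, (25, 75, 90), drop_missing=False)


def test_missing_values_are_excluded():
    # Expected values worked out by hand on the 17 observed values.
    assert correct == {25: 50, 75: 130, 90: 160}
    # With missing values included, P90 lands on the missing code itself ...
    assert buggy[90] == MISSING
    # ... while the quartiles still look like plausible data.
    assert buggy[25] == 50 and buggy[75] == 150


test_missing_values_are_excluded()
```

In this example the quartiles of the faulty calculation remain
plausible, which mirrors the remark in the report that the error is
easy to miss when only the interquartile range is inspected.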

I will update the public release to a later build very soon to remedy
the problem.

Other known problems are:
- Estimation of OR in sparse tables
- The aggregate (stattables) command can create variable names longer
  than 10 characters

Apart from that I am not aware of any current estimation problems. If
other definite bugs are known (not requests for new functions), please
report them to the new bug database at http://www.epidata.dk/php/mantis

Development is currently focused on "long term plans", including
rewriting the software for further documentation and testing,
experimentation with collaborative editing of documentation, and other
aspects of the development plan for 2006-2010.

The "in-house" build of analysis has now reached 85, whereas build 68
is the latest for public testing.

When the aspects currently in progress are completely finalised, a new
build will be made available for testing.

Among the many new aspects are more user control over the interface
and a completely rewritten table command, which includes exact testing
of stratified tables, calculation of the gamma coefficient for ordinal
data and extensive sorting. Working with summarised data is next in
line.

Other new aspects are further documentation tools within analysis
(value labels, missing value assignment), renaming of variables and
enhanced control over graphics, e.g. axes and ranges.

Kind regards

Jens Lauritsen
EpiData Association

