Re: [EpiData-list] Formatting and exporting data

epidata-list at lists.umanitoba.ca epidata-list at lists.umanitoba.ca
Tue Dec 6 01:37:49 CST 2005


Dear Jens
I fully agree with and support your approach: preserve maximum accuracy with the data at hand, and make the display flexible, ideally in a generic way or else per command.
Additionally, the exact accuracy of the data should be preserved when exporting or saving in another data format.
with best wishes 
marcel

***********
Marcel Zwahlen, PhD
Department of Social and Preventive Medicine
University Berne 
Finkenhubelweg 11
CH-3012 Bern
Switzerland
phone     ++41-31-631.3554
facsimile ++41-31-631.3520
email: zwahlen at ispm.unibe.ch
http://www.ispm.unibe.ch/ 

-----Original Message-----
From: epidata-list at lists.umanitoba.ca [mailto:epidata-list at lists.umanitoba.ca]
Sent: Saturday, 3 December 2005 11:45
To: epidata-list at lists.umanitoba.ca
Subject: [EpiData-list] Formatting and exporting data

The question from Susanne Widmar about how to format data is a good one.
Allow me to give a lengthy explanation, since this is an issue on which we should agree on a uniform principle for import, export and use of data in EpiData Analysis.

This relates to two aspects:
1. Accuracy of data - data types (binary, integer, real, double...)
      - we wish to replicate the precision of the collected data
2. How many decimals are shown on output from commands
      - we wish to see as much as needed for the interpretation
      - often less than aspect 1.

In Stata, the variable's format controls the number of decimals for some commands, such as the CI command, whereas for other commands the display format is part of the command specification.

The same "confusion" now exists in EpiData Analysis, where formatting is shown and controlled only partly consistently.

In my view, output formatting should be decided by the command, not by the data.
In v1.1 build 54+ of EpiData Analysis (now available for testing) I have implemented new options for frequencies:
freq v1   /D0              // would give percentages with no decimals
freq v1   /D1              // would give percentages with one decimal
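
To illustrate the principle outside EpiData Analysis, here is a minimal Python sketch (not the EpiData implementation; the function name freq_table and its decimals parameter are hypothetical). The percentages are computed at full precision, and the number of decimals is applied only when the output is formatted, much as the /D0 and /D1 options do for freq.

```python
from collections import Counter


def freq_table(values, decimals=1):
    """Return (value, count, percent_text) rows for a frequency table.

    Hypothetical helper: counts and percentages are computed at full
    precision; `decimals` only affects the displayed percentage text.
    """
    counts = Counter(values)
    total = sum(counts.values())
    rows = []
    for value in sorted(counts):
        percent = 100.0 * counts[value] / total  # full precision kept here
        rows.append((value, counts[value], f"{percent:.{decimals}f}"))
    return rows


if __name__ == "__main__":
    v1 = [1, 1, 2]
    for decimals in (0, 1):  # comparable in spirit to /D0 and /D1
        print(f"decimals={decimals}")
        for value, count, pct in freq_table(v1, decimals=decimals):
            print(f"  {value}  {count}  {pct}%")
```

The data in v1 are never rounded; only the text produced for display changes with the option.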

For cross tables there are the settings for tables:
TABLE PERCENT FORMAT COL    P1()
TABLE PERCENT FORMAT ROW    P1{}
TABLE PERCENT FORMAT TOTAL    P1[]
in which the user can define the number of decimals for the percentages

For other commands (describe, means..) the display format is either fixed or depends on the size of the estimates. User specification of output format will be implemented later.

For import and export to Stata we could rewrite the procedures to work on datatype rather than format as it is now, if this is possible. We could then set all numerical variables to format %9.0g. Another strategy could be to allow EpiData Analysis to save data in Stata format (which Bill Gould, CEO of StataCorp, has given permission to do).

I suggest that users comment on the uniform strategy:
a. All calculation is based on maximum accuracy with the data at hand.
b. Output formatting should be part of each command.
c. Data storage (type of variable) should be as strict as possible.
    Accomplishing this is more important than speed of import and export.
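
As a rough illustration of point c and of exporting by datatype rather than by format, the Python sketch below picks the narrowest Stata storage type that holds a numeric variable without loss. It is an assumption-laden sketch, not the EpiData export code: the function names are hypothetical, the integer ranges are the ones Stata documents for byte, int and long, and missing values are not handled.

```python
import struct

# Integer storage ranges as documented by Stata (values above the upper
# bound are reserved for missing-value codes).
STATA_INT_RANGES = [
    ("byte", -127, 100),
    ("int", -32767, 32740),
    ("long", -2147483647, 2147483620),
]


def fits_float32(x):
    """True if x survives a round trip through a 4-byte float unchanged."""
    try:
        return struct.unpack("<f", struct.pack("<f", x))[0] == x
    except OverflowError:  # magnitude too large for a 4-byte float
        return False


def strictest_type(values):
    """Hypothetical helper: narrowest Stata type that stores values exactly."""
    if not values:
        raise ValueError("need at least one value")
    if all(float(v).is_integer() for v in values):
        lo, hi = min(values), max(values)
        for name, type_min, type_max in STATA_INT_RANGES:
            if type_min <= lo and hi <= type_max:
                return name
        return "double"  # whole numbers too large for long
    if all(fits_float32(float(v)) for v in values):
        return "float"
    return "double"


if __name__ == "__main__":
    for sample in ([0, 1, 100], [0, 101], [40000], [0.5, 1.25], [0.1]):
        print(sample, "->", strictest_type(sample))
```

Scanning every value costs time on import and export, which is the trade-off accepted in point c: an exact storage type matters more than speed.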

Regards
Jens Lauritsen
EpiData Association





