Re: [EpiData-list] Formatting and exporting data
Dear Jens,

I fully agree with and support your approach: preserve maximum accuracy with the data at hand and make the display somewhat flexible, ideally either in a generic way or per command. Additionally, the exact accuracy of the data should be preserved when exporting or saving in another data format.

With best wishes,
Marcel
***********
Marcel Zwahlen, PhD
Department of Social and Preventive Medicine
University Berne
Finkenhubelweg 11
CH-3012 Bern
Switzerland
phone ++41-31-631.3554
facsimile ++41-31-631.3520
email: zwahlen@ispm.unibe.ch
http://www.ispm.unibe.ch/
-----Original Message-----
From: epidata-list@lists.umanitoba.ca [mailto:epidata-list@lists.umanitoba.ca]
Sent: Saturday, 3 December 2005 11:45
To: epidata-list@lists.umanitoba.ca
Subject: [EpiData-list] Formatting and exporting data
The question raised by Susanne Widmar about how to format data is a good one. Allow me to give a lengthy explanation, since this is an issue on which we should agree on a uniform principle for import, export and use of data in EpiData Analysis.
This relates to two aspects:

1. Accuracy of data, i.e. data types (binary, integer, real, double...): we wish to replicate the precision of the collected data.
2. How many decimals are shown on output from commands: we wish to see as much as needed for the interpretation, often less than in aspect 1.
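The distinction between the two aspects can be sketched in a few lines of Python. This is only an illustration, not EpiData Analysis code; the value and the helper name `show` are made up for the example.

```python
# Illustration only: the stored value keeps its full precision (aspect 1),
# while the number of decimals is chosen when the value is displayed (aspect 2).
stored = 12.345678  # hypothetical measurement as collected


def show(value, decimals):
    """Return the value as text with the requested number of decimals."""
    return f"{value:.{decimals}f}"


one_decimal = show(stored, 1)
three_decimals = show(stored, 3)
print(one_decimal, three_decimals)

# Formatting produced new strings; the stored value itself is unchanged.
assert stored == 12.345678
```

The point of the sketch is that rounding happens only in the text that is printed, so later calculations still use the full value.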
In Stata, the variable's display format controls the number of decimals for some commands, such as the CI command, whereas for other commands the display format is part of the command specification.
The same "confusion" now exists in EpiData Analysis, where output formatting is shown and controlled only partly consistently.
In my view, output formatting should be decided by the command, not by the data. In v1.1 build 54+ of EpiData Analysis (now available for testing) I have implemented new options for frequencies:

    freq v1 /D0    // would give percentages with no decimals
    freq v1 /D1    // would give percentages with one decimal
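As a rough Python analogue of the idea that the command, not the data, decides the number of decimals, consider the sketch below. It is not how EpiData Analysis implements `freq`; the function name, the `decimals` parameter and the data are assumptions made for illustration.

```python
from collections import Counter


def freq(values, decimals=1):
    """Return (value, count, percent-as-text) rows.

    The decimals argument only affects how the percentage is displayed;
    the counts and the underlying data are not modified.
    """
    counts = Counter(values)
    total = sum(counts.values())
    rows = []
    for value in sorted(counts):
        percent = 100.0 * counts[value] / total
        rows.append((value, counts[value], f"{percent:.{decimals}f}"))
    return rows


v1 = [1, 1, 2, 2, 2, 3, 3]  # hypothetical variable
rows_d0 = freq(v1, decimals=0)  # comparable in spirit to /D0
rows_d1 = freq(v1, decimals=1)  # comparable in spirit to /D1

for value, count, percent in rows_d1:
    print(value, count, percent)
```

Both calls work on the same data; only the text of the percentage column differs.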
For cross tables there are the settings for tables, in which the user can define the number of decimals for the percentages:

    TABLE PERCENT FORMAT COL P1()
    TABLE PERCENT FORMAT ROW P1{}
    TABLE PERCENT FORMAT TOTAL P1[]
For other commands (describe, means..) the display format is either fixed or depends on the size of the estimates. User specification of output format will be implemented later.
For import and export to Stata we could, if possible, rewrite the procedures to work on data type rather than on format, as they do now. We could then set all numerical variables to format %9.0g. Another strategy could be to allow EpiData Analysis to save data in Stata format (which Bill Gould, CEO of StataCorp, has given permission to do).
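Why the data type matters more than the display format on export can be shown with a small Python sketch. It does not read or write Stata files and is not the EpiData export procedure; it only packs a number into a 4-byte float and an 8-byte double with the standard `struct` module and reads it back. The helper name `round_trip` is made up for the example.

```python
import struct


def round_trip(value, code):
    """Pack a number into the given binary type and read it back."""
    return struct.unpack(code, struct.pack(code, value))[0]


value = 0.1
as_double = round_trip(value, "<d")  # 8-byte double
as_float = round_trip(value, "<f")   # 4-byte float

print(repr(as_double))
print(repr(as_float))
```

The double comes back identical to the original value, while the 4-byte float comes back slightly different. A display format chosen afterwards cannot restore precision that was lost when the storage type was chosen.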
I suggest that users comment on the uniform strategy:

a. All calculation is based on maximum accuracy with the data at hand.
b. Output formatting should be part of each command.
c. Data storage (type of variable) should be as strict as possible.

Achieving this is more important than the speed of import and export.
Regards,
Jens Lauritsen
EpiData Association