[EpiData-list] Reading the new XML files into R

epidata-list at lists.umanitoba.ca
Sun Jun 12 09:00:20 CDT 2011


It is interesting to see how the "read into R" work has progressed in recent days.
Just a few comments:

Basically, any external system reading the xml files should read at least items 1-3 of:
1. The data file structure (type and number of fields).
2. The contained data
3. The metadata - that is, defined value labels, defined missing 
values (now contained as part of the value labels), and the questions (or 
variable labels).
But possibly also, depending on purpose:
4. System data, such as the defined delimiters for decimals and dates 
(written in the header section of the xml).
5. Project information
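
As a rough illustration of items 1-5, the sketch below pulls each kind of 
information out of a small XML document using Python's standard library. 
The element and attribute names (Settings, Field, ValueLabels, Record and 
so on) are invented for this example and are not the actual EpiData file 
layout; only the order of steps is meant to carry over.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layout: every element and attribute name below is an
# assumption made for this sketch, not the real EpiData file format.
SAMPLE = """
<Project name="Demo study">
  <Settings decimalSeparator="," dateSeparator="/"/>
  <Fields>
    <Field name="id" type="integer" question="Record number"/>
    <Field name="sex" type="integer" question="Sex of patient" labels="sexlbl"/>
    <Field name="weight" type="float" question="Weight in kg"/>
  </Fields>
  <ValueLabels id="sexlbl">
    <Label value="1" text="Male"/>
    <Label value="2" text="Female"/>
    <Label value="9" text="Unknown" missing="true"/>
  </ValueLabels>
  <Records>
    <Record><id>1</id><sex>2</sex><weight>61,5</weight></Record>
    <Record><id>2</id><sex>9</sex></Record>
  </Records>
</Project>
"""


def read_project(xml_text):
    """Collect structure, data, metadata, settings and project info."""
    root = ET.fromstring(xml_text)

    # 1. Data file structure: one dict of attributes per field.
    fields = [dict(f.attrib) for f in root.iter("Field")]

    # 2. Data: one dict per record, holding only the fields that are present.
    records = [{child.tag: child.text for child in rec}
               for rec in root.iter("Record")]

    # 3. Metadata: value labels, plus the values flagged as missing.
    labels, missing = {}, {}
    for label_set in root.iter("ValueLabels"):
        set_id = label_set.get("id")
        items = list(label_set.iter("Label"))
        labels[set_id] = {item.get("value"): item.get("text") for item in items}
        missing[set_id] = [item.get("value") for item in items
                           if item.get("missing") == "true"]

    # 4. System data: decimal and date delimiters from the header.
    settings = dict(root.find("Settings").attrib)

    # 5. Project information.
    project_info = dict(root.attrib)

    return {"fields": fields, "records": records, "labels": labels,
            "missing": missing, "settings": settings, "project": project_info}


project = read_project(SAMPLE)
```

Values are kept as text here; converting them with the declared field type 
and the decimal delimiter from the settings would be a separate step.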

Mark Myatt, who was active in the early EpiData development, has written an 
introduction to R, including data examples. You can find it at: 
http://www.brixtonhealth.com/Rex.zip

Regarding the specific discussions lately here:
a.
David writes:

iv) deal with records of different lengths.

The reason we decided to write only the variables that contain data in 
the xml structure is that this makes the data files much smaller. However, I 
think we could consider whether this should be a user-defined setting, such 
that all fields are written in each record, regardless of whether they 
contain data or not.
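
On the reading side, records of different lengths can be handled by padding 
each record against the full field list taken from the structure section. 
A minimal sketch in Python, assuming each record has already been parsed 
into a dictionary holding only the fields that contain data (the field 
names and the helper pad_records are hypothetical):

```python
def pad_records(field_names, records, fill=None):
    """Return one row per record with every field present, in field order."""
    known = set(field_names)
    rows = []
    for rec in records:
        # A field that is not in the declared structure signals a bad file.
        unknown = set(rec) - known
        if unknown:
            raise ValueError(f"record has undefined fields: {sorted(unknown)}")
        # Fields without data in this record get the fill value.
        rows.append([rec.get(name, fill) for name in field_names])
    return rows


field_names = ["id", "sex", "weight"]
records = [
    {"id": "1", "sex": "2", "weight": "61.5"},
    {"id": "2"},  # only one field was written for this record
]
rows = pad_records(field_names, records)
```

The resulting rows are rectangular, so they map directly onto a data frame, 
with the fill value playing the role of NA in R.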

b.
Once you are done with the script/principles, please write it up in a 
document that we can either link to or save in the wiki under examples.

regards
Jens Lauritsen
EpiData Association


