[EpiData-list] Epidata XML to R: how to handle labels? - and missing values

epidata-list at lists.umanitoba.ca
Tue Jun 14 08:58:10 CDT 2011


Great work on this so far, David. This will be useful, because you can get more of the metadata into R directly.
I'm not sure what you mean by "other data types". As I see it, if an EpiData field uses value labels, then that field (variable) should be a factor in R. 
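
To make the value-label idea concrete, here is a minimal sketch of the mapping. It is written in Python only so that it is self-contained; in R the same thing is what factor(x, levels = ..., labels = ...) does. The function name and the example label set are hypothetical, not taken from EpiData or from David's code.

```python
def apply_value_labels(codes, value_labels):
    """Map raw codes to their labels, the way an R factor would.

    codes        -- raw field values as read from the file
    value_labels -- dict of code -> label (a hypothetical value-label set)

    Codes without a label, and missing values, come back as None,
    which mirrors R turning values outside the levels into NA.
    """
    levels = list(value_labels.values())
    labelled = [value_labels.get(code) for code in codes]
    return labelled, levels


# Hypothetical label set for illustration only.
sex_labels = {1: "Male", 2: "Female"}
labelled, levels = apply_value_labels([1, 2, None, 2, 9], sex_labels)
```

The unlabelled code 9 and the missing value both end up as None here, so nothing is silently assigned to a real category.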

Boolean fields are not well documented, because you cannot create them in EpiData Manager. They are a holdover from the current version of EpiData, and so far they persist in the new version when imported from old .rec files. The field type is 0 (zero) and the valid values are Y, N or missing.
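
A rough sketch of how an importer might treat such a field follows, again in Python for self-containment. It assumes that a missing value arrives as an empty string or as no value at all; how missing is actually stored in the XML is not something I am stating here, and the function name is hypothetical. In R the results would correspond to TRUE, FALSE and NA.

```python
def parse_boolean_field(raw):
    """Convert a Y/N/missing boolean field value to True/False/None.

    Assumption: a missing value is represented by None or by an
    empty (or whitespace-only) string.
    """
    if raw is None:
        return None
    text = raw.strip().upper()
    if text == "":
        return None
    if text == "Y":
        return True
    if text == "N":
        return False
    # Anything else is not a valid value for this field type.
    raise ValueError(f"invalid boolean field value: {raw!r}")


values = [parse_boolean_field(v) for v in ["Y", "N", "", None, "y"]]
```

Raising an error on anything other than Y, N or missing is a design choice in this sketch; an importer could equally warn and return a missing value.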

I don't think there was an answer to someone's question about the field ST. This is the record status (0=active, 1=deleted). In the future, another code will be used for "verified". One import option would be to ignore deleted records, which is the default in EpiData Analysis.
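
As an illustration of that import option, here is a small Python sketch. It assumes the records have already been read into a list of dictionaries with an integer "ST" entry; that representation, the function name and the keep_deleted parameter are all hypothetical.

```python
DELETED = 1  # record status code from above: 0 = active, 1 = deleted


def drop_deleted(records, keep_deleted=False):
    """Return the records to import, skipping deleted ones by default.

    Only status 1 is dropped, so a future "verified" status code
    would still be kept.
    """
    if keep_deleted:
        return list(records)
    return [rec for rec in records if rec["ST"] != DELETED]


# Made-up records for illustration only.
records = [
    {"id": 1, "ST": 0},
    {"id": 2, "ST": 1},
    {"id": 3, "ST": 0},
]
active = drop_deleted(records)
```

Testing for "not deleted" rather than "active" is deliberate here, so that a new status code does not cause records to vanish on import.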

Since date, time and decimal separators are metadata, the R function should read these from the XML and convert to whatever R requires. This should not require user choice.
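
A sketch of the conversion step, in Python for self-containment, is below. It assumes the separators have already been read from the file's metadata and are passed in as plain strings; where they live in the XML is not shown. It also assumes dates are in day, month, year order, which is an assumption of this sketch and not something the file format is claimed to guarantee. The function names are hypothetical.

```python
from datetime import datetime


def parse_decimal(text, decimal_sep):
    """Parse a number written with the file's decimal separator."""
    return float(text.replace(decimal_sep, "."))


def parse_date(text, date_sep):
    """Parse a date with the file's date separator.

    Assumption: day-month-year order with a four-digit year.
    """
    sep = date_sep.replace("%", "%%")  # keep strptime from reading it as a directive
    return datetime.strptime(text, f"%d{sep}%m{sep}%Y").date()


def parse_time(text, time_sep):
    """Parse an hours-minutes-seconds time with the file's time separator."""
    sep = time_sep.replace("%", "%%")
    return datetime.strptime(text, f"%H{sep}%M{sep}%S").time()


# Separators as they might be read from the metadata (made-up values).
number = parse_decimal("3,14", decimal_sep=",")
day = parse_date("14/06/2011", date_sep="/")
clock = parse_time("08.58.10", time_sep=".")
```

The point is that the user never supplies the separators; they travel with the data file and the import function applies them.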

I'll be trying this out, even though I don't use R much anymore. We taught EpiData to a group last week, among whom were several R users. They liked the simplicity of EpiData Analysis, but will be very happy with the direct import to R.

Jamie Hockin
Ottawa

> Quick reply: yes, strings are easy (converting to factors); it's the other data types that need some consideration. Regarding the Stata export, I did at one point wonder whether what I was doing was completely unnecessary and whether the Stata export would be sufficient, but I think that having something native to R provides more possibilities.
> 
> I'm proceeding with the idea of reporting the labels and providing functions to ease working with them. I'm nearly there ...
> 
> David
