[EpiData-list] Reading the new XML files into R

epidata-list at lists.umanitoba.ca
Sat Jun 11 09:44:19 CDT 2011


Many thanks for this. I have imported my tiny sample file, but I don't seem
able to do anything with it. Is this related to you saying you haven't
started working with value labels?

Here is some output from R

> x <- read.epidata.xml("test.epx", dec.sep = ".")
> x$datafile_id_0
        Date   Species Count st
1 2009-11-23 Blackbird    34  0
2 2010-11-23    Thrush    57  0
3 2011-12-24 Blackbird   130  0
4 2006-11-23 Blackbird   134  0
5 2011-06-23    Thrush    34  0
6 2005-05-23   Sparrow    24  0
> summary(x)
              Length Class      Mode
datafile_id_0 4      data.frame list
> boxplot(x$Count)
Error in plot.window(xlim = xlim, ylim = ylim, log = log, yaxs = pars$yaxs) :
  need finite 'ylim' values
In addition: Warning messages:
1: In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL'
2: In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL'
3: In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL'
4: In min(x) : no non-missing arguments to min; returning Inf
5: In max(x) : no non-missing arguments to max; returning -Inf

I'm also not sure what the st column is.

You will gather I am right on the edge of my understanding of R here.
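
A possible reading of the error above, offered as a sketch rather than a
diagnosis of the package: summary(x) reports a single element, datafile_id_0,
of class data.frame, which suggests that read.epidata.xml() returned a list
wrapping the data frame. In that case x$Count is NULL, which would explain
the boxplot() error, and the column has to be reached through
x$datafile_id_0. The object below is a hand-built stand-in that uses the
values printed above, so it runs without test.epx or the GitHub code.

```r
# Stand-in for the object returned by read.epidata.xml(): a list holding one
# data frame (values copied from the printed output; this is an assumption
# about the structure, based only on what summary(x) shows).
x <- list(datafile_id_0 = data.frame(
  Date    = as.Date(c("2009-11-23", "2010-11-23", "2011-12-24",
                      "2006-11-23", "2011-06-23", "2005-05-23")),
  Species = c("Blackbird", "Thrush", "Blackbird",
              "Blackbird", "Thrush", "Sparrow"),
  Count   = c(34, 57, 130, 134, 34, 24),
  st      = 0
))

# The list itself has no element called Count, so this is NULL.
is.null(x$Count)

# Pull the data frame out of the list first, then use its columns.
d <- x$datafile_id_0
summary(d$Count)
boxplot(d$Count)
boxplot(Count ~ Species, data = d)   # one box per species
```

Assigning the data frame to its own name (d here) avoids typing
x$datafile_id_0 each time.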


On 11 June 2011 14:44, <epidata-list at lists.umanitoba.ca> wrote:

> On Sat, Jun 11, 2011 at 10:53:08AM +0200, epidata-list at lists.umanitoba.ca wrote:
> > > I have started working on an R package that imports an EpiData XML file
> > > directly into R using the R XML package. So far it creates a data frame
> > > in R and uses the field information to convert the data to appropriate
> > > R data types. I haven't started working with the value labels yet.
> [...]
> > This sounds excellent, and I would be happy to give it a trial once you
> > get to the stage of sharing, if that would be useful.
> OK, here you go. I've put the code on github:
> https://github.com/daudi/Epidata-XML-to-R
> It's not a proper package yet, just some functions in a single file for
> now. And it probably isn't particularly great code (I'm still learning how
> to use XML), but it works. I've got some code in there for logging and
> debugging that can come out later (i.e. save(), status.log()). git clone it
> or just download the R file.
> Some TODOs:
> i) handle value labels;
> ii) map the remaining data types;
> iii) tighten up the code, e.g. replace for loops;
> iv) deal with records of different lengths.
> The last one is an issue that needs some thinking about. If all
> rows/records have the same number of columns it works okay. But if you
> create a screen, enter some data, then add a new field and only enter data
> in the subsequent records then the first set of records will have fewer
> fields than the second set of records. R will recycle values and issue a
> warning. Detecting this and then dealing with it will mean changing the
> function that gets the records and I need to ponder this one.
> David
> --
> _______________________________________________
> EpiData-list mailing list
> EpiData-list at lists.umanitoba.ca
> http://lists.umanitoba.ca/mailman/listinfo/epidata-list
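
On TODO iv) in the quoted message (records of different lengths), one
possible approach is to pad every record out to the full set of field names
with NA before binding the rows, so that nothing gets recycled. The sketch
below assumes each record arrives as a named character vector; that is an
assumption, not how the GitHub code necessarily stores records. The helper
name pad_records() and the Notes field are made up for illustration.

```r
# Hypothetical records: the first was entered before the "Notes" field
# was added to the form, so it has one field fewer than the second.
records <- list(
  c(Date = "2009-11-23", Species = "Blackbird", Count = "34"),
  c(Date = "2011-06-23", Species = "Thrush", Count = "34", Notes = "garden")
)

# Illustrative helper (not part of the GitHub code): align every record to
# the union of field names, filling absent fields with NA.
pad_records <- function(records) {
  fields <- unique(unlist(lapply(records, names)))
  rows <- lapply(records, function(r) {
    out <- r[fields]        # fields missing from this record come back as NA
    names(out) <- fields
    out
  })
  as.data.frame(do.call(rbind, rows), stringsAsFactors = FALSE)
}

d <- pad_records(records)
d
```

All columns are still character at this point; converting them to the
appropriate R types would happen afterwards, as the existing code already
does for complete records.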
