Thanks Torsten (and Jens in the other reply). A tool or inconsistency report sounds like a good approach. I encountered this issue when playing with the code to import the EpiData XML file into R. When reading such an XML file into R, I think the options are:
1) do nothing to detect this and have the import fail and spit out a cryptic R error message because the factor levels won't match (not a good approach);
2) detect it, report it, and do not apply the labels for this field;
3) detect it, report it, and attempt to apply the labels where possible.
I think option 3 will be the least surprising for the user. I'll add code to do this after I've finished adding in the rest of the metadata (study info etc.).
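Option 3 could look roughly like the sketch below. It is written in Python rather than R only to keep it short and language-neutral, and it is not the actual import code: the function name `apply_labels`, the label texts, and the in-memory representation of a field and its label set are all assumptions made for illustration.

```python
def apply_labels(values, labels):
    """Apply value labels where possible and report the rest.

    values: raw entries for one field (None means missing).
    labels: hypothetical mapping from raw value to label text.
    Returns (labelled, unmatched), where unmatched lists the distinct
    non-missing values that have no entry in the label set.
    """
    # Detect: distinct entered values that the label set does not cover.
    unmatched = sorted({v for v in values if v is not None and v not in labels})
    # Apply where possible: unlabelled values are kept as entered,
    # so no data is overwritten or dropped.
    labelled = [labels.get(v, v) for v in values]
    return labelled, unmatched


# The example from this thread: "cows" was entered before the label set existed.
values = ["dogs", "cats", "cows", "dogs", None]
labels = {"dogs": "Dogs", "cats": "Cats"}  # label texts are made up

labelled, unmatched = apply_labels(values, labels)
if unmatched:
    # Report: tell the user which values could not be labelled.
    print(f"warning: values without a label: {unmatched}")
print(labelled)
```

Keeping the unlabelled values as entered, rather than turning them into missing values, is what separates this from simply forcing the field into the declared label set.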
David --
David Whiting, PhD | Senior Epidemiology & Public Health Specialist tel +32-2-6437945 | mob +32-496-266436 | David.Whiting@idf.org
International Diabetes Federation 166 Chaussée de La Hulpe, B-1170 Brussels, Belgium tel +32-2-5385511 | fax +32-2-5385114 info@idf.org | www.idf.org | VAT BE 0433.674.528
IDF | Promoting diabetes care, prevention and a cure worldwide
________________________________________
From: epidata-list-bounces@lists.umanitoba.ca [epidata-list-bounces@lists.umanitoba.ca] On Behalf Of epidata-list@lists.umanitoba.ca [epidata-list@lists.umanitoba.ca]
Sent: 17 June 2011 20:02
To: epidata-list@lists.umanitoba.ca
Subject: SV: [EpiData-list] It is possible to have labels that are inconsistent with previously entered data
Hi David
It is correct that neither Manager nor EntryClient checks the datafile for consistency upon opening.
We are thinking that this type of check will be available as a tool within Manager and/or Analysis, since these are the programs used by the project maintainers.
Please feel free to discuss if this is sufficient or another solution is better.
Kind regards, Torsten Bonde Christiansen. Epidata Association.
----- Reply message -----
From: epidata-list@lists.umanitoba.ca
Date: Fri, Jun 17, 2011 19:45
Subject: [EpiData-list] It is possible to have labels that are inconsistent with previously entered data
To: epidata-list@lists.umanitoba.ca
Hi,
Using the new data manager and client it is possible to create a data entry system, enter data, then add labels that don't match the data already entered. E.g. enter "dogs", "cats", "cows", then add a label set for that field that only has "dogs" and "cats". This results in the labels defined in the XML file not matching the previously entered data. One solution is "don't do that!" but at the moment it is too easy to do so, possibly accidentally. I've certainly seen situations where a data entry system has evolved. Removing a set of labels removes the link between the labels and the field (and warns before doing so), and I wonder if EpiData should also warn when creating this inconsistency. It could potentially mean a lot of checking if the data file is large and the set of labels is used for many fields. I wouldn't want the system to overwrite any data, but perhaps it could have a way of reporting which fields and values are inconsistent with the label set.
David