Re: [EpiData-list] Importing data to Epidata (v.2.0.5.51)

25 Jun 2015


The short answer is:
There is no safe way of changing from one variable (field) type to another. The reason is that in many cases we would have to make decisions that could alter data in a way the user did not intend.
Good practice for importing data from a spreadsheet was summarised by Jamie Hockin, but the aspect of deciding on field type was not mentioned. This is step 7 below.
The practice is repeated below as a step list (hopefully in a structured and understandable way):
1. Open and work in your spreadsheet file.
2. Check each column, one by one, and make sure that all cells in that column have the same format. E.g. all cells in column M should be formatted as dates if M contains dates.
3. Add one rightmost column, call it "dummy", and insert a 1 in that column in every row that contains data.
4. Make sure all column names (in row 1) are valid. E.g. "age_of_this_patient" must be changed to "age", and "Thisverylongname" should be shortened to "thisname".
5. I change the decimal separator to . and I omit all thousand separators from formats.
6. I make sure that any boolean ("y/n") data are changed to 0 1 or 0 1 9, where 9 indicates blank. Boolean should be avoided, since many software packages, e.g. those from M$, record a missing value as "no" on export or conversion instead of as "no value present".
7. I insert a row after the name row where I specifically set the content so that variables are converted to the correct type in Manager. E.g. in Denmark we have civil registration numbers which are 10 digits long, and these should be read as strings so they can be used appropriately as a key; e.g. insert xxx in that cell.
8. Then I mark the whole dataset as a block and copy to clipboard
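Steps 2 and 4 to 6 above can also be checked outside the spreadsheet. The following is only a rough Python sketch, not part of EpiData: the function names, the name-length limit of 10, the "letters, digits and underscore" rule and the dd/mm/yyyy date format are all assumptions for illustration and should be adjusted to your own data and EpiData version.

```python
import re
from datetime import datetime

MAX_NAME_LEN = 10  # assumption: set this to the limit your EpiData version enforces


def valid_name(name, max_len=MAX_NAME_LEN):
    """Step 4: short name, starting with a letter, only letters/digits/underscore."""
    return bool(re.fullmatch(r"[A-Za-z][A-Za-z0-9_]*", name)) and len(name) <= max_len


def normalise_number(text, decimal=",", thousands="."):
    """Step 5: drop thousand separators, then use '.' as decimal separator."""
    return text.strip().replace(thousands, "").replace(decimal, ".")


def recode_yes_no(value):
    """Step 6: map y/n to 1/0 and blank to 9."""
    v = value.strip().lower()
    if v in ("y", "yes"):
        return "1"
    if v in ("n", "no"):
        return "0"
    if v == "":
        return "9"
    raise ValueError(f"unexpected yes/no value: {value!r}")


def cell_kind(text, date_format="%d/%m/%Y"):
    """Classify one cell as blank, number, date or text."""
    t = text.strip()
    if t == "":
        return "blank"
    try:
        float(t)
        return "number"
    except ValueError:
        pass
    try:
        datetime.strptime(t, date_format)
        return "date"
    except ValueError:
        return "text"


def mixed_columns(header, rows):
    """Step 2: report columns whose non-blank cells are not all of one kind."""
    problems = {}
    for i, name in enumerate(header):
        kinds = {cell_kind(row[i]) for row in rows} - {"blank"}
        if len(kinds) > 1:
            problems[name] = kinds
    return problems
```

Any column reported by `mixed_columns` should be fixed in the spreadsheet before copying to the clipboard in step 8.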
After that, import in Manager should be straightforward. As the first part of the definition I would then, in Manager:
a. look at data in browse. In particular I look at the first, the last and a few records in the middle to make sure that the import is working correctly.
b. save the file in Manager
c. open the file in EntryClient
d. mark the added record (point 7 above) for deletion
e. save the file
f. pack the file in Manager, where I delete the record marked for deletion.
g. Then I finalise the documentation of the data, e.g. add project descriptions and value labels or further controls such as range, calculations etc.
h. I save the project
i. I now run the "Data Content Validation" to find values not conforming to the content definition from point g.
j. I run the "count by ID" documentation feature to make sure all IDs are as expected.
k. Any errors found in data or documentation are then fixed appropriately.
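For readers who want to see what the checks in points i and j amount to, here is a small Python sketch on already-exported records. It is not how EpiData implements "Data Content Validation" or "count by ID"; the function names, the field names `id` and `age`, and the rule that each ID should occur exactly once are assumptions for illustration.

```python
from collections import Counter


def unexpected_id_counts(records, id_field="id", expected=1):
    """Return {id: count} for every ID that does not occur `expected` times."""
    counts = Counter(rec[id_field] for rec in records)
    return {key: n for key, n in counts.items() if n != expected}


def out_of_range(records, field, low, high):
    """Return records whose non-blank value in `field` lies outside [low, high]."""
    bad = []
    for rec in records:
        text = rec[field].strip()
        if text == "":
            continue  # blanks are a missing-data question, not a range error
        if not low <= float(text) <= high:
            bad.append(rec)
    return bad


# Hypothetical example records, each a dict of field name -> text value.
records = [
    {"id": "1", "age": "34"},
    {"id": "2", "age": "340"},
    {"id": "2", "age": ""},
]
duplicates = unexpected_id_counts(records)
suspicious = out_of_range(records, "age", 0, 120)
```

Here `duplicates` lists ID "2" as occurring twice, and `suspicious` holds the record with age 340; both would then be handled under point k.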
The same routine is applied whenever I am asked to look at data, which was entered or defined by other persons.
regards Jens Lauritsen
EpiData Association
On 25 Jun 2015, EpiData development and support epidata-list@lists.umanitoba.ca wrote:
...
I thought so too, thanks Jamie.
Can anyone from the development team comment on the issue of changing the
field type and whether it is safe/wise to do so directly in XML? I tested
it and it worked out fine, at least it appeared so, but there may be issues
which are not immediately visible for a user.
Thanks!
Pekka