Importing data to Epidata (v.2.0.5.51)
Hello,
I am a newcomer to Epidata, so I apologize if my questions are not sophisticated. I am exploring the latest version and it is fantastic with many ingenious features, great work and great programming philosophy, thank you!
As the version is new, I understand that the documentation is not yet fully developed, and hence if someone can help me with these questions, I think they can be of use to many others as well.
What I want to do is to import a spreadsheet with field labels and data to Epidata. The spreasheet contains field names on the top row and records on the following rows. Once I have imported the initial data, I would like to ammend the records with additional fields, i.e. have the initial spreadsheet data as the seed data in records. I think this is a very normal practice to do, but I have run into several difficulties:
1. I have tried to import the spreadsheet with "File/Import file" but I don't seem to get it working. I have turned the spreasheet from Excel to CSV, but the dialogue box that opens doesn't recognize data. I made a workaround by copying a part of the spreadsheet and used "File/Import from clipboard" and that did work. The downside is that you have to do it part by part, and the fundamental problem is why it doesn't work in the first place. Is Epidata Manager very picky as to what field separators are used in CSV? I have tried changing field separators but with no luck. Clipboard seems to be the only workaround, but even still I have to do it in parts. (I am using Ubuntu and the latest Linux 64-bit version)
2. With the copy-paste method I noticed that Epidata was smartly able to interpret some fields with the right type (e.g. date, but not all date-fields, but I believe this is the issue of my spreadsheet), but not all. I have not discovered a way to change the field type afterwards, is it at all possible? The only hack I figured out was to manually change the field type in XML code directly, but I would be very hesitant to do it. If I touch XML code directly, will that mess up everything, or is it the simple fix to the problem?
3. Value-labels: My spreadsheet contains no value labels, obviously. It has for example field "Country of origin" and the value is string "Nepal". Is there a way for me to easily convert strings of text into value labels as a part of the import process? String values are consistent (the spreadsheet is an output file from a database), so I should imagine it would be possible do so.
4. Provided I get questions 1,2 and 3 adequately resolved, I should be fine with the initial data import. But what do I do if I want to later import additional records from the same spreadsheet structure to my Epidata project? Is this something that Append is supposed to do? There I have the same issue with the spreadsheet compatibility: The CSV file does not open in a recognisable way, hence I haven't been able to play around and test it either.
Help is much appreciated, thank you!
Pekka
I have found that import from the clipboard works well if the Excel data are prepared for the import. Here is what I do.
In Excel, format the columns they way you want them to appear in EpiData (text columns, numbers with set decimal places, dates as YYYY/MM/DD is best). I usually then format the columns to fit the data No spaces in your column names - these should be valid EpiData field names for consistency Make sure that the first and last rows of data are complete; I often put dummy data in these and then delete the records in EpiData. I believe the main thing here is to have valid data (not empty cells) in the last column of data for the first and/or last rows. Ideally, have no blank cells - put missing values in. This is probably not necessary, as EpiData handles empty cells well.
You may find that fields are not the length that you want. You can safely edit these fields to change the size. Text, number and data fields should all come through with the correct type.
I don’t know of any simple way to convert data for an Excel column to coded values with ‘value labels’. Unless your data set is massive, there is no real penalty to keeping the values in full.
When I had to automate repeated conversion of messy Excel data to EpiData, I made a new sheet within the Excel file and used formulas to put copies of the original data into the new sheet, with all of the cleaning. For example, we had some inconsistencies in coding province, so I used a lookup table to convert the original values (Ontario, Ont, ON) to be consistent (ON). This is one way to do your data coding, by the way, but you still cannot automate getting the value labels into EpiData. In my case, the data set was small and existing data might change, so I just kept importing the whole thing. You can also just convert the new data, import to a new EpiData project and then append this to your complete project.
You will have problems with import from the clipboard if you have commas in your text fields. In my testing, the import fails without an error message.
EpiData is very fussy about importing data, as it should be, since it does such a good job of maintaining the integrity of data when you use it for data entry. Import of csv data does work, but you have to be very sure to have clean data and to not have stray delimiters in your data. In any case, your data should be in good shape since it originates in a database (what database?? Maybe there is another more direct way to get into EpiData) -- Jamie
Pekka wrote: What I want to do is to import a spreadsheet with field labels and data to Epidata. The spreasheet contains field names on the top row and records on the following rows. Once I have imported the initial data, I would like to ammend the records with additional fields, i.e. have the initial spreadsheet data as the seed data in records. I think this is a very normal practice to do, but I have run into several difficulties:
1. I have tried to import the spreadsheet with "File/Import file" but I don't seem to get it working. I have turned the spreasheet from Excel to CSV, but the dialogue box that opens doesn't recognize data. I made a workaround by copying a part of the spreadsheet and used "File/Import from clipboard" and that did work. The downside is that you have to do it part by part, and the fundamental problem is why it doesn't work in the first place. Is Epidata Manager very picky as to what field separators are used in CSV? I have tried changing field separators but with no luck. Clipboard seems to be the only workaround, but even still I have to do it in parts. (I am using Ubuntu and the latest Linux 64-bit version)
2. With the copy-paste method I noticed that Epidata was smartly able to interpret some fields with the right type (e.g. date, but not all date-fields, but I believe this is the issue of my spreadsheet), but not all. I have not discovered a way to change the field type afterwards, is it at all possible? The only hack I figured out was to manually change the field type in XML code directly, but I would be very hesitant to do it. If I touch XML code directly, will that mess up everything, or is it the simple fix to the problem?
3. Value-labels: My spreadsheet contains no value labels, obviously. It has for example field "Country of origin" and the value is string "Nepal". Is there a way for me to easily convert strings of text into value labels as a part of the import process? String values are consistent (the spreadsheet is an output file from a database), so I should imagine it would be possible do so.
4. Provided I get questions 1,2 and 3 adequately resolved, I should be fine with the initial data import. But what do I do if I want to later import additional records from the same spreadsheet structure to my Epidata project? Is this something that Append is supposed to do? There I have the same issue with the spreadsheet compatibility: The CSV file does not open in a recognisable way, hence I haven't been able to play around and test it either.
participants (1)
-
EpiData development and support