[EpiData-list] follow up: big data sets

epidata-list at lists.umanitoba.ca epidata-list at lists.umanitoba.ca
Fri Aug 5 17:41:30 CDT 2011


Thanks for that analysis, Jamie. The file I am working with has 887 fields, 
and freezes upon opening. If I leave it alone, it eventually loads after 
several hours. Keeping my fingers crossed that future program releases 
will be able to handle it better.

Jonathan




From:   epidata-list at lists.umanitoba.ca
To:     epidata-list at lists.umanitoba.ca
Date:   2011-08-04 11:14 AM
Subject:        Re: [EpiData-list] follow up: big data sets
Sent by:        epidata-list-bounces at lists.umanitoba.ca



Excellent analysis, Jamie.

Regards,




On 03/08/11 11:57, epidata-list at lists.umanitoba.ca wrote:
> I tried the new Entryclient with two large data sets.
>
> Set 1: 500 integer fields, 500 records, uncompressed size: 1.1 MB as .epx, 45K as .epz
> - load the file: 150 seconds
> - jump from record 1 to last record: 1 or 2 seconds
> - * to get empty record: 5 seconds
> - save file: 3 seconds
> - load in Analysis (running under Wine, so a bit slower than on an
>   equally powered PC): 5 seconds
> - freq v500: immediate
> - no difference with .epz file
>
> Set 2: 20 integer fields, 10,560 records, uncompressed size: 1.1 MB
> - load the file: immediate
> - jump to end or new record: immediate
> - load in Analysis: 7 seconds
> - freq v1: immediate
>
> I expect the typical "large" data set would be somewhere between these
> two. Most large data sets I have used are like Set 2 and have been
> extracts of national databases. I might use the Entryclient on these, but
> unlikely. I've seen others using data like Set 1, but never really
> understood why they used such large data sets as most data items were
> unused and interesting only in a clinical review of individual subjects.
>
> (technical note: time to execute a JavaScript loadxml() < 500 msec)
>
> Jamie
>
> _______________________________________________
> EpiData-list mailing list
> EpiData-list at lists.umanitoba.ca
> http://lists.umanitoba.ca/mailman/listinfo/epidata-list
>
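
For anyone who wants to check how much of the load time is plain XML
parsing rather than building the entry form, here is a minimal sketch in
Python. It is only a stand-in under stated assumptions: the element names
(dataset, record, v1, v2, ...) and the helpers build_dataset_xml and
time_parse are made up for illustration and are not the real .epx schema
or any EpiData code. Only the field and record counts are taken from
Jamie's Set 1 and Set 2.

```python
import random
import time
import xml.etree.ElementTree as ET


def build_dataset_xml(n_fields, n_records, seed=0):
    """Return synthetic XML text with n_records records of n_fields integers.

    The element names are hypothetical and do NOT follow the .epx schema.
    """
    rng = random.Random(seed)
    root = ET.Element("dataset")
    for r in range(n_records):
        rec = ET.SubElement(root, "record", id=str(r + 1))
        for f in range(n_fields):
            field = ET.SubElement(rec, "v%d" % (f + 1))
            field.text = str(rng.randint(0, 9))
    return ET.tostring(root, encoding="unicode")


def time_parse(xml_text):
    """Parse the XML text and return (elapsed_seconds, number_of_records)."""
    start = time.perf_counter()
    root = ET.fromstring(xml_text)
    elapsed = time.perf_counter() - start
    return elapsed, len(root.findall("record"))


if __name__ == "__main__":
    # Shapes taken from the two test sets described above.
    for label, n_fields, n_records in [("Set 1", 500, 500),
                                       ("Set 2", 20, 10560)]:
        text = build_dataset_xml(n_fields, n_records)
        elapsed, count = time_parse(text)
        print("%s: %d records, %.1f MB of XML, parsed in %.3f s"
              % (label, count, len(text) / 1e6, elapsed))
```

Timings will vary by machine, and this measures only a generic XML parse,
not what Entryclient actually does when it opens a file.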


-- 
Omar Bautista González
- Contributor to community self-management, from the Dominican Republic



