follow up: big data sets
I tried the new Entryclient with two large data sets.
Set 1: 500 integer fields, 500 records, size: 1.1 MB uncompressed as .epx, 45 KB as .epz
- load the file: 150 seconds
- jump from record 1 to last record: 1 or 2 seconds
- * to get empty record: 5 seconds
- save file: 3 seconds
- load in Analysis (running under Wine, so a bit slower than on an equally powered PC): 5 seconds
- freq v500: immediate
- no difference with .epz file
Set 2: 20 integer fields, 10,560 records, uncompressed size: 1.1 MB
- load the file: immediate
- jump to end or new record: immediate
- load in Analysis: 7 seconds
- freq v1: immediate
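As a rough check on how comparable the two sets are, the number of stored values in each can be worked out from the field and record counts given above. This is only a sketch: the helper name `total_values` is made up for illustration, and the counts are copied from the two sets.

```python
# Field and record counts copied from the two test sets above.
test_sets = {
    "Set 1": {"fields": 500, "records": 500},
    "Set 2": {"fields": 20, "records": 10_560},
}


def total_values(fields, records):
    """Number of individual data values: one value per field per record."""
    return fields * records


for name, shape in test_sets.items():
    count = total_values(shape["fields"], shape["records"])
    print(f"{name}: {count:,} values")
```

Both sets come out at roughly the same number of values (250,000 and 211,200), which fits with the identical 1.1 MB size. That suggests, without proving it, that the slow load of Set 1 is tied to the number of fields rather than to the amount of data.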
I expect the typical "large" data set would fall somewhere between these two. Most large data sets I have used are like Set 2 and have been extracts of national databases. I might use the Entryclient on these, but it is unlikely. I've seen others using data like Set 1, but never really understood why they used such large data sets, as most data items were unused and interesting only in a clinical review of individual subjects.
(technical note: time to execute a JavaScript loadxml(): < 500 ms)
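A similar measurement can be sketched in Python to separate raw XML parsing time from everything else the client does on load. This is a minimal sketch under stated assumptions: the XML layout, the element names `dataset` and `record`, and the helper names are all hypothetical and are not the real .epx schema. Timings will vary by machine.

```python
import time
import xml.etree.ElementTree as ET


def build_synthetic_xml(n_fields, n_records):
    """Build an XML string with a made-up layout (NOT the real .epx schema).

    Each record is one element carrying one attribute per field (v1, v2, ...).
    """
    root = ET.Element("dataset")
    for r in range(n_records):
        rec = ET.SubElement(root, "record")
        for f in range(1, n_fields + 1):
            rec.set(f"v{f}", str((r + f) % 10))
    return ET.tostring(root, encoding="unicode")


def time_parse(xml_text):
    """Parse the XML text and return the root element and elapsed seconds."""
    start = time.perf_counter()
    root = ET.fromstring(xml_text)
    elapsed = time.perf_counter() - start
    return root, elapsed


# Same shape as Set 1: 500 fields, 500 records.
xml_text = build_synthetic_xml(500, 500)
root, elapsed = time_parse(xml_text)
print(f"parsed {len(root)} records in {elapsed * 1000:.0f} ms")
```

If the parse step alone is quick here as well, that would be consistent with the note above that the XML parsing itself is not where the load time goes.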
Jamie
Excellent analysis, Jamie.
Regards,
On 03/08/11 11:57, epidata-list@lists.umanitoba.ca wrote:
EpiData-list mailing list EpiData-list@lists.umanitoba.ca http://lists.umanitoba.ca/mailman/listinfo/epidata-list
Thanks for that analysis Jamie. The file I am working with is 887 fields, and freezes upon opening. If I leave it alone, it eventually loads after several hours. Keeping my fingers crossed that future program releases will be able to handle it better.
Jonathan
From: epidata-list@lists.umanitoba.ca
To: epidata-list@lists.umanitoba.ca
Date: 2011-08-04 11:14 AM
Subject: Re: [EpiData-list] follow up: big data sets
Sent by: epidata-list-bounces@lists.umanitoba.ca