October 2006 - EpiData-list - University of Manitoba Mailing Lists

Set and options, graphs and recent questions
by epidata-list＠lists.umanitoba.ca 09 Oct '06

The point about illegal dates brought up by Robert is a good one to include in future work.

> Loading data C:\c temp\date 1.REC, please wait..
> invalid year 1111, month 11, day 11

It can be tedious work to clean out such dates, which often also result from registry extracts of, e.g., patient data.

----------------------------------------------------------------------

A few comments on the set / option question in Analysis:

Rule 1: EpiData Analysis should be based on proper methodology. Therefore the intention with development is always:
a. Develop a good internal structure which does the job correctly.
b. Create commands which make sense to users (easy to understand).
c. Add proper options to the command to give the user more control.
d. Sometimes create parallel commands if the users are divided into two groups or the same internal command can be used for two completely different purposes (e.g. aggregate and stattables do the same in the current version).
e. As part of the development of each command, make sure to develop test procedures to assess quality and precision.

Re e: Currently we have around 550 different tests that are run after changing anything inside the programme, but we have not had the time to develop these tests for the new table and graphics module, which has to be done before v2.0beta is released for user testing.

Rule 2: Case sensitivity of options and set: EpiData Analysis will never be case sensitive, since this creates problems for many users. Actually, just now there is an internal problem with <A > fields, which I am not sure always behave correctly in all functions. Users are advised to use the function upper( ) with such fields.

Rule 3: Beginners should be helped in using the software. Aside from writing documentation and introduction notes (see www.epidata.dk/documentation.php), it is important to find a strategy for creating dialogs which assist users in getting a proper analysis done. More of this work will come in the future.
Maybe we should have two types of dialogs: one with only the very few basic options and another, for the experienced, with all options available.

Pedro's suggestion of

> But at least you must try to follow a rule: e.g. the first letter of the
> meaningful word affected by the options.

makes good sense. But sometimes the "sense" is bound to one language, and then which of the many languages should we choose? The choice will be much like what you have seen, but avoiding lengthy options: e.g. /m (or /M) instead of /missing, or /q instead of /quiet.

> Options should affect only:
>
> 1.- what kind of records are affected by the command (e.g. those including
> missing values; or those marked as deleted);
> 2.- what information is included in the output (e.g. Row percents, CI,
> Statistics, etc..);
> 3.- the way the output is sorted (e.g. by labels, by count, by RR)

We can discuss whether some options divide the user base into two groups, such that a general "set" must be defined, e.g. set deleted = on or similar.

Finally, let me say about confidence intervals that I meant to write:

    TABLE CI FORMAT = "C()-"

which would give, e.g., "(0.21-0.56)", whereas "C " would give "0.21 0.56" in the tables.

--------------------------------------------------------

Regarding scatter graph and regression lines:

This is a good suggestion for a new feature. I suggest that such new features are reported to the Mantis database found at the EpiData site in the subfolder "php/mantis". Currently there is a very unrestrictive sign-on principle for the system (see comment below). Please respect the system as a creative tool in further development.

--------------------------------------------------------

Restrictions implemented

Unfortunately a number of users (possibly only one), at first claiming to come from Hungary, have found it interesting to send advertisements for pharmaceutical products to the EpiData Newslist, increasing the number of members there by about 25-50 persons.
This is unfortunately one of the annoying aspects of releasing software in an open-spirit environment, since I had to spend about 4-5 hours finding a solution to hopefully block further attempts and to remedy the "damage done". Hours which would have been more fruitfully used on development, user help or going for a walk near the sea in the early autumn.

--------------------------------------------------------

Kind regards
Jens Lauritsen
EpiData Association

> Subject: [EpiData-list] RE: EpiData-list Digest, Vol 36, Issue 6
> From: epidata-list(a)lists.umanitoba.ca
> Date: Sun, 8 Oct 2006 10:03:28 +0200
> To: <epidata-list(a)lists.umanitoba.ca>
>
> > Scatter Graph
> > Is it possible to show "linear regression" line in options graph
> > like Epi-Info command scatter var1 var2/r
> >
> > Thank you
> >
> > B. Branger, Nantes, France
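The two CI format strings in the message above can be illustrated with a small Python sketch. This is only one possible reading of the format string, not EpiData's actual parser: it assumes that "()" in the spec means "wrap in parentheses", that "-" means "use a hyphen as separator" (otherwise a space), and that the number of decimals is passed separately. The function name `format_ci` and its parameters are hypothetical.

```python
def format_ci(low, high, spec, decimals=2):
    """Format a confidence interval from a spec such as "C()-" or "C ".

    Assumed reading of the spec (not taken from EpiData itself):
    "()" present -> wrap the interval in parentheses
    "-" present  -> separate the limits with a hyphen, else with a space
    """
    separator = "-" if "-" in spec else " "
    body = f"{low:.{decimals}f}{separator}{high:.{decimals}f}"
    return f"({body})" if "(" in spec and ")" in spec else body


print(format_ci(0.21, 0.56, "C()-"))  # (0.21-0.56)
print(format_ci(0.21, 0.56, "C "))    # 0.21 0.56
```

Both outputs match the examples given in the message; any other spec would need rules that are not stated in this thread.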

query re: exporting to excel and then using pivot tables
by epidata-list＠lists.umanitoba.ca 09 Oct '06

Hello everyone... I do appreciate this forum. I have exported to Excel a db with data from 140 questionnaires. The export worked fine, except that when I create a pivot table, the columns in the pivot appear to add themselves up. I have never seen this before when using pivot tables, so I'm assuming it's related to the EpiData export.

That is to say: in Excel, on a 2x2 pivot table where I'm looking at gender, 'male = 1' and 'female = 2' for data entry. If there are 11 'females' that should be represented in a box, it gives '22'... I assume somewhere the 11 females are being summed (2 x 11). I hope this is clear. Does anyone have any ideas?

Thank you,
Todd
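One possible explanation, not confirmed in this thread, is that the pivot table summarises the numeric gender code with Sum rather than Count, which is what Excel tends to choose for numeric fields. The Python sketch below only illustrates the arithmetic; the sample of 9 males and 11 females is made up, and the function names are hypothetical.

```python
from collections import Counter

# Gender codes as entered: 1 = male, 2 = female (from the post).
# Hypothetical sample: 9 males and 11 females.
gender = [1] * 9 + [2] * 11


def sum_by_code(values):
    """What a pivot cell shows when the field is summarised by Sum."""
    totals = {}
    for v in values:
        totals[v] = totals.get(v, 0) + v
    return totals


def count_by_code(values):
    """What a pivot cell shows when the field is summarised by Count."""
    return dict(Counter(values))


print(sum_by_code(gender))    # females: 11 records * code 2 = 22
print(count_by_code(gender))  # females: 11 records
```

If this is the cause, switching the field's summary from Sum to Count in the pivot table should give 11 rather than 22.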

Invalid Date
by epidata-list＠lists.umanitoba.ca 09 Oct '06

Jamie,

Thanks for the quick response. Yes, we are using EpiData 3.1 for all data entry up to this point. We have not yet reached the stage of double entry validation, but I need to append the data from 18 computers and calculate the number of questionnaires entered to pay the individual staff. I am also writing my analysis syntax on the unclean data so it won't take so long to write the report when the data entry/cleaning is finished. Yes, I should have put a consistency check in for those dates, but I did not. Now I have a mountain of first-entry data that I have to fight with.

I am using Analysis to write and run the APPEND.PGM. I thought it would work through EPI-C, but I cannot get EPI-C to recognize the append command. These types of .pgm's used to run through EPI-INFO, but don't appear to run in EPI-DATA. If I am wrong, please correct me.

date.qes
    date <dd/mm/yyyy>

date 1.rec & date 2.rec

In Analysis:

    read "date 1"
    Loading data C:\c temp\date 1.REC, please wait..
    invalid year 1111, month 11, day 11
    invalid year 2222, month 2, day 22
    invalid year 4000, month 3, day 12
    invalid year 500, month 3, day 12
    invalid year 6000, month 3, day 12
    File name : C:\c temp\date 1.REC
    date
    Fields: 2 Total records: 6 Valid records: 6

    . append "date 2"
    Loading data C:\c temp\date 2.REC, please wait..
    invalid year 212, month 12, day 12
    Appended file: Fields: 2 Total records: 1 Valid records: 1
    Combined file: Fields: 2 Total records: 7

    . savedata "dateall"
    Saving data to: C:\c temp\dateall.rec
    invalid date
    Failed to save data to C:\c temp\dateall.rec

This is the invalid date that I refer to. Thanks for your time.
Robert

> Message: 2
> Date: Sat, 07 Oct 2006 08:27:16 -0400
> From: epidata-list(a)lists.umanitoba.ca
> Subject: Re: [EpiData-list] Invalid date error
>
> Were the data entered in EpiData? If so, how could there be invalid
> dates? If the data were entered using another program you should still
> be able to validate the data with EpiData Entry, using a .chk file with
> a CONSISTENCYBLOCK and a CHECK command. In the Entry Help file use the
> Contents link to "Document datafile" and "Logical consistency check".
>
> It's a good idea to clean up this type of error before using Analysis.
>
> Jamie
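As a stopgap outside EpiData, dates like the ones in the log above could be screened in a text export before the files are appended. The Python sketch below assumes the dates are available as dd/mm/yyyy strings and that a plausible study period is 1900 to 2100; both are assumptions for illustration, and the actual year range EpiData accepts is not stated in this thread. The name `check_date` is hypothetical.

```python
from datetime import date

# Assumed plausible range for survey dates; adjust to the study period.
MIN_YEAR, MAX_YEAR = 1900, 2100


def check_date(text):
    """Return None if a dd/mm/yyyy string is acceptable, else a reason."""
    try:
        day, month, year = (int(part) for part in text.split("/"))
        date(year, month, day)  # raises ValueError for impossible dates
    except ValueError:
        return "not a calendar date"
    if not MIN_YEAR <= year <= MAX_YEAR:
        return f"year {year} outside {MIN_YEAR}-{MAX_YEAR}"
    return None


# Dates taken from the log in the message above, plus one valid control.
samples = ["11/11/1111", "22/02/2222", "12/03/4000",
           "12/03/0500", "12/12/0212", "09/10/2006"]
for s in samples:
    problem = check_date(s)
    print(s, "->", problem or "ok")
```

Records flagged this way would still have to be corrected in the original .rec files, for example by going back to the paper questionnaires.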

Comments about SET and OPTIONS
by epidata-list＠lists.umanitoba.ca 08 Oct '06

Jens wrote:

"Another intention is to have more options and fewer commands, but to assist users by more intuitive dialogs. E.g. as now where graph commands can be specified with a structured dialog where most options are available."

I am not sure what you mean. In my opinion the only unnecessary command right now in EpiData Analysis is EPiTables, because you could very easily modify the TAB command using an option, /Epi for instance, if you want to add calculation of RR, OR, etc. In addition, including dialogs can be very useful for beginners, but still try to keep them as simple as possible. In my experience the existing one for graphs is a little confusing: there are a lot of options you don't know how to use, and most users are not going to explore them.

"The problem with the current way of implementation is that there is a non-unified approach to what is defined at command level and what at "set" level. Mostly due to the way it was done in Epi6. And this confuses users."

That's clear, and it is better to have a defined approach.

..................................................................................................................

"option: A specification for a given command. These are a combination of / and letters or numbers. All options are made as short as possible and there is no intention to try to make the option name understandable, but in the documentation they should all be explained."

But at least you must try to follow a rule: e.g. the first letter of the meaningful word affected by the option. If options are case sensitive then a rule should be followed, e.g. always in CAPITAL LETTERS.

"A given designation for an option should mean the same in all commands if possible. e.g. /m is always allow missing values"

Agree. Options affect the current command and only this instance of the command. Options should affect only:

1.- what kind of records are affected by the command (e.g. those including missing values; or those marked as deleted);
2.- what information is included in the output (e.g. Row percents, CI, Statistics, etc..);
3.- the way the output is sorted (e.g. by labels, by count, by RR)

Options shall also be part of the set group, so that the default option is the one defined by the SET parameters (e.g. SET MISSING=ON will affect all the commands as if a /m were included in the command).

"set: a general specification for running the programme or setting formats. The first word of a "set" tells what this specifies. General which cannot obey this rule should be avoided. e.g. set display databrowser = on/off (will always show data browser in the background)."

..................................................................................................................

"All showing of labels, values etc are defined by these "set" commands:"

Agree.

"Creation and display of statistical tests, percentages etc are moved to each command:"

Agree (see above).

"Format of Confidence Intervals are
TABLE CI HEADER (95% CI)
TABLE CI FORMAT C2-()"

Don't understand. Do you mean: CI-( )?

I hope other users want to share their experiences and opinions about these topics.

Saludos, and thank you Jens for your efforts and this wonderful program.

Pedro Arias
Epidemiologist
Co-Translator of EpiData into Spanish
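The proposal above, that SET parameters supply defaults and a per-command option overrides them for one run only, can be sketched in Python. This is an illustration of the idea, not of how EpiData Analysis is implemented; the dictionary keys and the function name `effective_options` are hypothetical.

```python
# Hypothetical global defaults, as a SET command might define them.
set_defaults = {"missing": False, "row_percent": False}


def effective_options(defaults, command_options):
    """Options given on the command override the SET defaults,
    but only for this one run of the command."""
    merged = dict(defaults)          # copy, so the defaults are untouched
    merged.update(command_options)   # per-command options win
    return merged


# "tab agegrp sex /m": /m switches missing values on for this command only.
print(effective_options(set_defaults, {"missing": True}))
# The defaults themselves are unchanged for the next command.
print(set_defaults)
```

With this design, SET MISSING=ON would simply change the default, and every later command would behave as if /m had been given.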

RE: EpiData-list Digest, Vol 36, Issue 6
by epidata-list＠lists.umanitoba.ca 08 Oct '06

Scatter Graph

Is it possible to show a "linear regression" line in the graph options, like the Epi-Info command

    scatter var1 var2/r

Thank you

B. Branger, Nantes, France

Design decision and information (EpIData Analysis)
by epidata-list＠lists.umanitoba.ca 07 Oct '06

In the design of v2.0 of EpiData Analysis I wish to simplify and clarify the use of options and "set" definitions into a unified approach. And this principle is more important than backwards continuation of an unclear principle. The reason is, among others, that v2.0 will include more statistics, e.g. gamma coefficient for ordinal data and exact statistics for tables in general.

Another intention is to have more options and fewer commands, but to assist users by more intuitive dialogs. E.g. as now, where graph commands can be specified with a structured dialog where most options are available. The intention is that v2.0 will include more functionality at the command level, but the dialogs helping beginners will come later.

The problem with the current way of implementation is that there is a non-unified approach to what is defined at command level and what at "set" level. Mostly due to the way it was done in Epi6. And this confuses users.

When the first release of v2.0 is ready all users will be able to comment on the actual implementation of the principles. The release of v2.0 for test will be made when we have reached a certain level of clarification. We are not quite there yet. But good progress is being made.

I would like users to comment on these decisions:
.................................................................................................................
option:
A specification for a given command. These are a combination of / and letters or numbers. All options are made as short as possible and there is no intention to try to make the option name understandable, but in the documentation they should all be explained. A given designation for an option should mean the same in all commands if possible.
e.g. /m is always allow missing values

set:
A general specification for running the programme or setting formats. The first word of a "set" tells what this specifies. General which cannot obey this rule should be avoided.
e.g. set display databrowser = on/off (will always show data browser in the background).
.................................................................................................................
All showing of labels, values etc are defined by these "set" commands:

    set var label = on/off   (show or hide variable label)
    set var name = on/off    (removes name of var if the same as first word in variable label)
    set var value = on/off   (controls whether actual numerical values are shown)
    set var label = on/off   (controls showing of "comment legal/value labels")

e.g. set var value = off: show only value labels in tables (but show values if no value labels are defined)
.................................................................................................................
Creation and display of statistical tests, percentages etc are moved to each command, e.g.:

    means age sex          (only descriptive summaries are shown)
    means age sex /t       (will display t test if one group and f test + Bartlett's test with more groups)
    tables agegrp sex      (only shows counts)
    tab agegrp sex /chi    (shows chi square values)
    tab agegrp sex /r /c   (would add row and column percentages)
    tab agegrp sex /t      (would add total percentages with no decimal points)
.................................................................................................................
Format of estimates and confidence intervals and table column headers are defined as "set", e.g.:

    TABLE PERCENT HEADER ROW %
    TABLE PERCENT FORMAT ROW P1{}
.................................................................................................................
Format of Confidence Intervals are:

    TABLE CI HEADER (95% CI)
    TABLE CI FORMAT C2-()
.................................................................................................................

Regards

Jens Lauritsen
EpiData Association

Invalid date error
by epidata-list＠lists.umanitoba.ca 07 Oct '06

Dear Epidata Listserve,

The function on Epidata that only allows valid dates is not allowing me to save data after the append command. The data entry keyers sometimes enter a wrong date. The invalid date does not become a problem (I believe) until I run the savedata command. Then the invalid date error pops up and the data is not saved.

    read "WOMAN 1.rec" /close
    append "WOMAN 2.rec"
    append "WOMAN 3.rec"
    etc...
    savedata "WOMANALL"

Is there any way to turn off the valid date requirement or work around this?

Thanks for the program and your support.

Robert Johnston
Phnom Penh, Cambodia.

Multiple Jumps
by epidata-list＠lists.umanitoba.ca 06 Oct '06

Hi,

I have a problem with the JUMP command. I have 3 questionnaires of different sizes and need EpiData to jump from an identification variable to the following variables. The problem is that I need multiple jumps. Something like this:

    If code=0 then go to questions 1, 5, 8, 9
    If code=1 then go to questions 1, 2, 3, 4, 5, 6, 7, 8, 9, 10

and so on. Unfortunately the jump command only specifies the next jump and not the later ones.

Thanks in advance.

Greetings,
Benjamin Wecker
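One way to look at the problem above: a route through the questionnaire can be broken into single jumps, where each question only needs to know the next question for the current value of code. The Python sketch below builds such a table from the two routes given in the message. It illustrates the logic only; how each entry would be written in an EpiData check file is not shown here, and the names `ROUTES` and `jump_table` are hypothetical.

```python
# Questions to ask for each value of the identification variable "code".
# The two routes are the ones given in the message.
ROUTES = {
    0: [1, 5, 8, 9],
    1: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
}


def jump_table(routes):
    """For each code, map every question to the next question on its route."""
    table = {}
    for code, questions in routes.items():
        steps = dict(zip(questions, questions[1:]))
        steps[questions[-1]] = None  # end of the questionnaire
        table[code] = steps
    return table


table = jump_table(ROUTES)
print(table[0])  # {1: 5, 5: 8, 8: 9, 9: None}
```

Read this way, "multiple jumps" become one conditional jump per question: after question 1 with code=0 go to 5, after question 5 with code=0 go to 8, and so on.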

Re: [EpiData-list] Finding DEL records
by epidata-list＠lists.umanitoba.ca 03 Oct '06

This is easy in Analysis:

    SET READ DELETED=ON
    READ yourfile
    SELECT RECDELETED
    LIST *

or any other commands. RECDELETED is a boolean that is true for deleted records. You need to use the set command to be able to access the deleted records.

I don't know of a way to do this with Entry.

Jamie

> From: epidata-list(a)lists.umanitoba.ca
> Date: 2006/10/02 Mon PM 08:02:35 EST
> To: <epidata-list(a)lists.umanitoba.ca>
> Subject: [EpiData-list] Finding DEL records
>
> Does anyone know how to do a search for records that have been marked as deleted, i.e. where the red X at the bottom of the data entry form has been clicked and DEL now shows up next to it.
>
> Thanks,
> Suzanna

Finding DEL records
by epidata-list＠lists.umanitoba.ca 02 Oct '06

Does anyone know how to do a search for records that have been marked as deleted, i.e. where the red X at the bottom of the data entry form has been clicked and DEL now shows up next to it?

Thanks,
Suzanna
