How to: How is sort order of tables defined ? (get tables in right order)
One way of getting other people to see problems is to have analysis creating random data with a combination of the command generate (which produces an empty data file) and the random functions. Then we do not need the attached files.
The reason for NOT allowing attachments is that this is a very good way of protecting all on the list from virus spreading
E.g. for the question from today run these commands: First: create a chk file with the labels: ....................... t.chk ................ labelblock label s 1 "London" 2 "Copenhagen" 3 "Madrid" 4 "New Delhi" 5 "Madras" 6 "Pretoria" 7 "Kiev" 8 "Hong Kong" 9 "Melbourne" 10 "Sao Paolo" 11 "Santiago" 12 "Ottawa" 13 "New Orleans" 14 "Bangkok" 15 "Honolulu" 16 "Havana" 0 "Oslo" 17 "Odense" 18 "Paris" 19 "Budapest" end end sitecode comment legal use s end sex comment legal 0 male 1 female end end ..... save the text above as t.chk ........................
............ Then create a pgm file: t.pgm: ................................ * Example file for creation of test data: close * create an empty data set with 1000 observations: generate 1000 define sex # define sitecode ## define sitecode2 ## * now assign values to the variables (random numbers) sex = ran(2) // sex will get numbers 0 and 1 sitecode = random(20) // sitecode will get numbers from 0 to 19 sitecode2 = sitecode savedata t * Now we have saved a data file with 1000 numbers, * but we need to read it again to get the labels read t /close * turn on labels, but turn off numbers for the tables: set table label = on set table value = off tab sex sitecode tab sex sitecode2 * notice that the order is the same as the numerical code for both variables * now we also turn on the numbers for the sitecode: set table value = on tab sex sitecode ................................ end t.pgm
When you run the t.pgm you will notice that the first table of sex and sitecode shows this: SEX SITECODE male % female % Total % Oslo 25 {52.1} 23 {47.9} 48 {100.0} London 26 {48.1} 28 {51.9} 54 {100.0} .... snip ....................... Whereas the last with the values turned on shows: set table value = on tab sex sitecode SEX SITECODE 0 male % 1 female % Total % 0 Oslo 25 {52.1} 23 {47.9} 48 {100.0} 1 London 26 {48.1} 28 {51.9} 54 {100.0} 10 Sao Paolo 26 {47.3} 29 {52.7}
Which means that for crosstables there is apparently a sorting going on on the first column. Notice that the two variables sitecode and sitecode2 are of type numerical. If we instead had used string variables: define v2 __ v2 = string(sitecode) tab v2 * then the order of v2 would be sorted meaning the "0 oslo - 1 london - 2 sao paolo" sequence * but without the text. To remedy this the trick would be to say: v2 = " " + v2 if length(trim(v2)) < 2 Then a new "tab v2" would show the correct table
------------------------------------------ Lesson: Obviously more user control of sort order of cross tables is on the wish list. But the feature is not part of the current most wanted list from my point of view. We have been experimenting a bit with sorting, but it is not all that easy. By converting btw. variable types and turning values and labels on and off you can get a lot of control as it is now.
A feature which is expected to be available later this year is to be able to take the text label from a variable and save in a string variable, e.g. v2string = label(sitecode) or v2string = string(sitecode) + label(sitecode) (this is not implemented now)
An undocumented sorting feature is available - but will not be documented until I am sure how to control sorting. Users are welcome to use it. The table will be sorted in reverse numerical order. But the problem with labels and values above persist. table sex v2 /rd or tab sex v2 /Rowdes
Jens Lauritsen EpiData Association
participants (1)
-
epidata-list@lists.umanitoba.ca