For a medication study we wish to test the hypothesis of no difference in the use of additional morphine between patients given two different pain regimens.
The data collected for each patient are the date and time of each additional medication from day one until day seven. The number of additional doses given varies from zero to many. Therefore we record the data as:
id : patient id
d : date of medication
m : dosage of the medication

Basic data are saved for each patient:
grp : treatment group, A vs. B
dayop : date of surgery

For the analysis we also need:
variable m : sum of additional morphine given
variable d : day of treatment
This would seem easy:

read medication

* add day of surgery etc.:
merge id /table=basicdata
gen i day = d - dayop

* now aggregate on day of medication
aggregate id day grp /sum=m

* save for later:
savedata daymedication /replace

* this creates the summary variable summ for m
* summ contains the sum of medication on each day for each patient
means summ /by=grp /t : overall
means summ /by=grp /t if day = 1 : example comparison on day one.
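For readers who do not use EpiData, the merge-and-aggregate step above can be sketched in Python. This is only an illustration under stated assumptions: the records, dates and group labels are hypothetical, and the dictionary `summ` merely mirrors the role of the summ variable in the script.

```python
from collections import defaultdict
from datetime import date

# Hypothetical medication records: (patient id, date of medication d, dosage m).
medication = [
    (1, date(2005, 3, 2), 5.0),
    (1, date(2005, 3, 2), 2.5),
    (1, date(2005, 3, 3), 5.0),
    (2, date(2005, 3, 5), 10.0),
]

# Hypothetical basic data: patient id -> (grp, dayop).
basicdata = {
    1: ("A", date(2005, 3, 1)),
    2: ("B", date(2005, 3, 4)),
}

# Sum the dosage per patient, day of treatment and group.
summ = defaultdict(float)
for pid, d, m in medication:
    grp, dayop = basicdata[pid]
    day = (d - dayop).days  # same arithmetic as: day = d - dayop
    summ[(pid, day, grp)] += m

for key in sorted(summ):
    print(key, summ[key])
```

Note that the result only holds patient-days on which something was given; a patient with no records never shows up in it.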
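But this will give a wrong result: only patients who were given medication are recorded in the medication file, so patients with no additional medication are left out of the means instead of being counted as zero.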
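A small numeric illustration of the problem, written in Python with made-up numbers (the patient ids and amounts are assumptions, not data from the study): leaving out the patients who received nothing inflates the group mean.

```python
from statistics import mean

# Hypothetical day-one totals for the patients found in the medication file.
recorded = {1: 10.0, 2: 5.0, 4: 15.0}

# Hypothetical full list of patients in the same group;
# patients 3 and 5 received no additional medication.
all_patients = [1, 2, 3, 4, 5]

# Mean over recorded patients only (what the naive analysis computes).
mean_recorded_only = mean(list(recorded.values()))

# Mean over all patients, counting unrecorded patients as zero.
mean_all = mean([recorded.get(pid, 0.0) for pid in all_patients])

print(mean_recorded_only, mean_all)
```

With these numbers the naive mean is 10.0 while the mean over all five patients is 6.0.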
We therefore need to restructure the analysis:
A. Create a complete file for all patients:

* generate file of patient id numbers
* (here 22 patients on day one and two):
generate 44
gen i idn = recnumber +1

* we need two lines for each patient:
gen i id = (idn div 2)
gen i grp = 1

* our dataset is such that the first 11 patients are group 1 and the remaining group 2:
if id > 11 then grp = 2
gen i day = 1
if (id = id[recnumber-1]) then day = 2
drop idn

labeldata "Key file for all patients"
savedata key /replace
* this file now contains exactly one record per patient for each of day one and day two.
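The record-number arithmetic used to build the key file can be checked with a short Python sketch. It assumes, as the script does, that record numbers start at 1 and that the integer division behaves like `div`; the list `key` and the constants are names chosen here for illustration only.

```python
N_PATIENTS = 22  # patients in the study, as in the script
N_DAYS = 2       # day one and day two

key = []
for recnumber in range(1, N_PATIENTS * N_DAYS + 1):
    idn = recnumber + 1
    pid = idn // 2                   # 1, 1, 2, 2, ..., 22, 22
    grp = 1 if pid <= 11 else 2      # first 11 patients are group 1
    # day 2 when the previous record belongs to the same patient, else day 1
    day = 2 if key and key[-1]["id"] == pid else 1
    key.append({"id": pid, "grp": grp, "day": day})

print(key[0], key[1], key[-1])
```

Each patient id from 1 to 22 ends up with exactly two records, one per day, which is what the later merge relies on.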
B. Now we can analyse the data again (note the daymedication file created above):

* get the complete file:
read key /close

* add medication information:
merge id day /table=daymedication

* add basic information on each patient:
merge id /table=basicdata

* for patients with no medication summ is equal to . (missing)
recode summ .=0

* now summ contains the sum of medication on each day for each patient
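The merge-then-recode idea (keep every record of the key file, look up the medication sum, and replace missing with zero) can be sketched in Python as follows. The three patients and their amounts are hypothetical, and a dictionary lookup stands in for the merge on id and day.

```python
# Hypothetical key file: one (id, day) pair per patient and day.
key = [(pid, day) for pid in (1, 2, 3) for day in (1, 2)]

# Hypothetical aggregated medication file: (id, day) -> summ.
# Patient 2 never received additional medication, so has no entries.
daymedication = {(1, 1): 10.0, (1, 2): 5.0, (3, 1): 7.5}

# Keep every key record; a missing medication sum becomes 0.
complete = [
    {"id": pid, "day": day, "summ": daymedication.get((pid, day), 0.0)}
    for pid, day in key
]

for row in complete:
    print(row)
```

The result has one row per patient and day, including rows with summ equal to zero, so group means are taken over all patients.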
* show the observations for all days:
* generate indicator variable based on day and group
gen i g = 10*(grp-1) + day
labelvalue g /1="Case Day 1" /2="Day 2" /11="Control day 1" /12="Day 2"

* verify distribution of summ as a probit plot
cdfplot summ /p

* for aggregation in the cdfplot the variable must be integer:
gen i mx = integer(summ*1000)
cdfplot mx /p /agg

* for my specific data showing a straight line
* (with a slight "too many at zero" problem)

* show as box and dotplot:
box summ /by=g /n /out
dotplot summ /by=g /sizex=600 /xa /di=0.25 /ti="Amount given - patient level"
* finally test the hypothesis
means summ /by=grp /t : overall
means summ /by=grp if day = 1 : comparison on day one.
means summ /by=grp if day = 2 : comparison on day two.

Jens Lauritsen
Coordinator and initiator of EpiData Project
http://www.epidata.dk