Tables of summary statistics

As discussed in earlier examples, svyby estimates statistics in subpopulations, and svymean and svytotal give proportions or totals for each level when applied to factor variables. This example shows how to construct reasonably attractive tables of summary statistics from the output of these functions. The examples use the dclus1 survey design object created in an earlier example.

The first example estimates the proportions in the cells of a contingency table: school type (elementary, middle, high) by whether the school met its "comparable improvement" target. The first step is to construct a single factor variable that specifies all the cells in the table and to pass it to svymean to estimate the proportions:

 > a <- svymean(~interaction(stype, comp.imp), design = dclus1)
 > a
                                       mean     SE
 interaction(stype, comp.imp)E.No  0.174863 0.0260
 interaction(stype, comp.imp)H.No  0.038251 0.0161
 interaction(stype, comp.imp)M.No  0.060109 0.0246
 interaction(stype, comp.imp)E.Yes 0.612022 0.0417
 interaction(stype, comp.imp)H.Yes 0.038251 0.0161
 interaction(stype, comp.imp)M.Yes 0.076503 0.0217
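
The row order in this output follows from how interaction orders its levels: the first factor varies fastest. A minimal sketch in base R, using made-up toy factors (stype_toy and comp_toy are hypothetical stand-ins, not the survey data), illustrates the ordering:

```r
# Toy stand-ins for the two survey variables (hypothetical values).
stype_toy <- factor(c("E", "H", "M", "E"), levels = c("E", "H", "M"))
comp_toy  <- factor(c("No", "Yes", "Yes", "Yes"), levels = c("No", "Yes"))

# interaction() builds one factor with a level for every cell of the table,
# including cells that do not occur in the data.
cells <- interaction(stype_toy, comp_toy)

# Levels run through the first factor fastest: E.No, H.No, M.No, E.Yes, ...
cell_levels <- levels(cells)
print(cell_levels)
```

This ordering matters whenever labels are supplied by hand: they have to be listed in the same order as the factor levels.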

This contains all the numbers we need, but the formatting leaves something to be desired. The ftable function reshapes output like this into a flattened table. We specify the variable names and the labels for each level in the rownames argument:

 > b <- ftable(a, rownames = list(stype = c("E", "H",
     "M"), comp.imp = c("No", "Yes")))
 > b
               stype          E          H          M
 comp.imp
 No       mean       0.17486339 0.03825137 0.06010929
          SE         0.02599552 0.01607602 0.02457177
 Yes      mean       0.61202186 0.03825137 0.07650273
          SE         0.04167572 0.01605469 0.02167084

The major remaining fault in the table is that too many digits are given. We can convert to percentages and then round to one decimal place:

 > round(100 * b, 1)
               stype    E    H    M
 comp.imp
 No       mean       17.5  3.8  6.0
          SE          2.6  1.6  2.5
 Yes      mean       61.2  3.8  7.7
          SE          4.2  1.6  2.2
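
As a rough cross-check that does not need the survey package, the six printed proportions can be reshaped by hand in base R. The sketch below types in the estimates from the svymean output (the object names are illustrative, not part of the original example) and relies on the first-factor-fastest cell order to fill the matrix column by column:

```r
# Proportions copied from the svymean output, in its printed order
# (E.No, H.No, M.No, E.Yes, H.Yes, M.Yes).
props <- c(0.174863, 0.038251, 0.060109, 0.612022, 0.038251, 0.076503)

# matrix() fills column by column, so stype indexes rows and comp.imp columns.
cell_props <- matrix(props, nrow = 3,
                     dimnames = list(stype = c("E", "H", "M"),
                                     comp.imp = c("No", "Yes")))

# Convert to percentages with one decimal place, as in the text.
cell_pct <- round(100 * cell_props, 1)
print(t(cell_pct))  # transposed so that comp.imp runs down the rows

# The cells partition the population, so the percentages should total about 100.
total_pct <- sum(cell_pct)
print(total_pct)
```

If the labels were given in the wrong order the total would still be about 100, so it is also worth comparing one or two cells against the original output.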

The second example deals with a table of means produced by svyby. This is more straightforward, since svyby already knows the variable names and levels. First we estimate the mean of 1999 and 2000 API by school type (stype) and by whether the school met its school-wide growth target (sch.wide). Note that this call uses the design object rclus1 rather than dclus1:

 > a <- svyby(~api99 + api00, ~stype + sch.wide, rclus1, svymean, keep.var = TRUE)
 > a
       stype sch.wide statistic.api99 statistic.api00      SE1      SE2
 E.No      E       No        601.6667        596.3333 70.04669 64.50553
 E.Yes     E      Yes        608.3485        653.6439 23.67277 22.37296
 H.No      H       No        662.0000        659.3333 40.92204 37.80385
 H.Yes     H      Yes        577.6364        607.4545 57.38815 53.97142
 M.No      M       No        611.3750        606.3750 48.19716 48.27853
 M.Yes     M      Yes        607.2941        643.2353 49.49574 49.34813

Now we convert to a table:

 > ftable(a)
               sch.wide        No                 Yes
                            api99     api00     api99     api00
 stype
 E     svymean          601.66667 596.33333 608.34848 653.64394
       SE                47.27582  43.49010  21.52493  20.31720
 H     svymean          662.00000 659.33333 577.63636 607.45455
       SE                29.23003  27.00275  46.50125  43.70468
 M     svymean          611.37500 606.37500 607.29412 643.23529
       SE                41.11886  41.11686  42.53046  42.12850
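
To see the reshaping step in isolation, a data frame with one row per cell can be flattened in base R. This is only an analogy: ftable(a) above dispatches to a method supplied by the survey package, whereas the sketch below uses xtabs on values of api00 typed in from the svyby output, and the name api00_means is hypothetical:

```r
# Means of api00 copied from the svyby output, one row per cell.
api00_means <- data.frame(
  stype    = rep(c("E", "H", "M"), each = 2),
  sch.wide = rep(c("No", "Yes"), times = 3),
  api00    = c(596.3333, 653.6439, 659.3333, 607.4545, 606.3750, 643.2353)
)

# xtabs() sums api00 within each cell; with one row per cell,
# the sum is just the mean that was typed in.
api00_table <- xtabs(api00 ~ stype + sch.wide, data = api00_means)

# ftable() flattens the two-way table for printing.
print(ftable(round(api00_table, 1)))
```

The survey method does more than this, since it also carries the standard errors and both variables into the same table.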

For variety, we trim the excess digits using the digits argument to print (the previous approach using round would also work).

 > print(ftable(a), digits = 3)
               sch.wide    No         Yes
                        api99 api00 api99 api00
 stype
 E     svymean          601.7 596.3 608.3 653.6
       SE                47.3  43.5  21.5  20.3
 H     svymean          662.0 659.3 577.6 607.5
       SE                29.2  27.0  46.5  43.7
 M     svymean          611.4 606.4 607.3 643.2
       SE                41.1  41.1  42.5  42.1

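Note that digits specifies the minimum number of significant digits, rather than decimal places. As the output shows, entries such as 601.7 keep four significant digits because the same number of decimal places is used throughout the table, enough to give the smallest entries (for example 20.3) three significant digits.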
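
The difference between significant digits and decimal places can be seen on two numbers taken from the table above. This base R sketch compares format with a digits argument, round, and signif; the variable names are illustrative:

```r
# Two entries from the table: a mean and a standard error.
x <- c(601.66667, 20.31720)

# digits = 3 asks for at least 3 significant digits in every element.
# The smaller number needs one decimal place for that, and the same
# number of decimal places is then used for the whole vector.
by_digits <- format(x, digits = 3)

# round() works in decimal places instead.
by_round <- round(x, 1)

# signif() applies 3 significant digits to each element separately.
by_signif <- signif(x, 3)

print(by_digits)
print(by_round)
print(by_signif)
```

Here format keeps one decimal place for both values, while signif reduces the first value to 602.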

Thomas Lumley

Last modified: Thu Jul 28 10:48:04 PDT 2005