Summary table with gtsummary

Preparation for MSDM CEP

Author

Jae Jung

Published

March 29, 2025

1 Loading Packages

```{r loading up packages}
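library(tidyverse) # meta-package that attaches the core tidyverse packages (dplyr, ggplot2, forcats, ...)
library(gt)        # table formatting: as used below with tab_header() and md()
library(gtsummary) # summary and regression tables; also provides the trial data
library(sjPlot)    # view_df() for a variable overview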
```

2 gtsummary package

3 Experimental Design:

Trial data

trial: a data frame with 200 rows, one row per patient

  • trt: Chemotherapy Treatment
  • age: Age
  • marker: Marker Level (ng/mL): the concentration of a specific protein or substance, measured in nanograms per milliliter (ng/mL), that can be detected in a blood sample and may indicate the presence or progression of cancer.
  • stage: T Stage: describes the size and extent of the primary tumor. It is part of the TNM staging system, which is the most common way to stage cancer.

| T stage | Meaning |
|---|---|
| T0 | No evidence of a tumor |
| T1 | A small tumor |
| T2 | A larger tumor that has grown into nearby tissue |
| T3 | A larger tumor that has grown into nearby tissue |
| T4 | A larger or more advanced tumor that has grown into nearby tissue |
| TX | The tumor cannot be measured |
| Tis | The tumor is still within the confines of the normal glands and cannot metastasize |

  • grade: Grade: describes how abnormal the cancer cells look under a microscope compared to normal cells, with higher grades indicating more rapid growth and a greater likelihood of spread.
  • response: Tumor Response
  • death: Patient Died
  • ttdeath: Months to Death/Censor

3.1 tbl_summary()

  • Designed for Descriptive Statistics
```{r}
# data()
trial
class(trial)
glimpse(trial)
view_df(trial)

trial2 <- trial |> 
  select(trt, age, grade, response)

trial2 |> 
  tbl_summary()
```
# A tibble: 200 × 8
   trt      age marker stage grade response death ttdeath
   <chr>  <dbl>  <dbl> <fct> <fct>    <int> <int>   <dbl>
 1 Drug A    23  0.16  T1    II           0     0    24  
 2 Drug B     9  1.11  T2    I            1     0    24  
 3 Drug A    31  0.277 T1    II           0     0    24  
 4 Drug A    NA  2.07  T3    III          1     1    17.6
 5 Drug A    51  2.77  T4    III          1     1    16.4
 6 Drug B    39  0.613 T4    I            0     1    15.6
 7 Drug A    37  0.354 T1    II           0     0    24  
 8 Drug A    32  1.74  T1    I            0     1    18.4
 9 Drug A    31  0.144 T1    II           0     0    24  
10 Drug B    34  0.205 T3    I            0     1    10.5
# ℹ 190 more rows
[1] "tbl_df"     "tbl"        "data.frame"
Rows: 200
Columns: 8
$ trt      <chr> "Drug A", "Drug B", "Drug A", "Drug A", "Drug A", "Drug B", "…
$ age      <dbl> 23, 9, 31, NA, 51, 39, 37, 32, 31, 34, 42, 63, 54, 21, 48, 71…
$ marker   <dbl> 0.160, 1.107, 0.277, 2.067, 2.767, 0.613, 0.354, 1.739, 0.144…
$ stage    <fct> T1, T2, T1, T3, T4, T4, T1, T1, T1, T3, T1, T3, T4, T4, T1, T…
$ grade    <fct> II, I, II, III, III, I, II, I, II, I, III, I, III, I, I, III,…
$ response <int> 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0…
$ death    <int> 0, 0, 0, 1, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0, 0…
$ ttdeath  <dbl> 24.00, 24.00, 24.00, 17.64, 16.43, 15.64, 24.00, 18.43, 24.00…
Data frame: trial

| ID | Name | Label | Values | Value Labels |
|---|---|---|---|---|
| 1 | trt | Chemotherapy Treatment | | <output omitted> |
| 2 | age | Age | range: 6-83 | |
| 3 | marker | Marker Level (ng/mL) | range: 0.0-3.9 | |
| 4 | stage | T Stage | | T1, T2, T3, T4 |
| 5 | grade | Grade | | I, II, III |
| 6 | response | Tumor Response | range: 0-1 | |
| 7 | death | Patient Died | range: 0-1 | |
| 8 | ttdeath | Months to Death/Censor | range: 3.5-24.0 | |


| Characteristic | N = 200¹ |
|---|---|
| Chemotherapy Treatment | |
|     Drug A | 98 (49%) |
|     Drug B | 102 (51%) |
| Age | 47 (38, 57) |
|     Unknown | 11 |
| Grade | |
|     I | 68 (34%) |
|     II | 68 (34%) |
|     III | 64 (32%) |
| Tumor Response | 61 (32%) |
|     Unknown | 7 |

¹ n (%); Median (Q1, Q3)

4 Differences between groups

4.1 Wilcoxon Rank sum test

  • tbl_summary() reports the median (Q1, Q3) for continuous variables by default, and add_p() pairs it with the Wilcoxon rank sum test
```{r}
trial |> 
  select(trt, age, marker, ttdeath) |> 
  tbl_summary(by = trt) |> 
  add_p() 

# modifying
trial |> 
  select(trt, age, marker, ttdeath) |>
  tbl_summary(by = trt) |> 
  add_p(
    pvalue_fun = label_style_pvalue(digits = 2)
    ) |> 
  add_overall() # add overall statistics
```
| Characteristic | Drug A, N = 98¹ | Drug B, N = 102¹ | p-value² |
|---|---|---|---|
| Age | 46 (37, 60) | 48 (39, 56) | 0.7 |
|     Unknown | 7 | 4 | |
| Marker Level (ng/mL) | 0.84 (0.23, 1.60) | 0.52 (0.18, 1.21) | 0.085 |
|     Unknown | 6 | 4 | |
| Months to Death/Censor | 23.5 (17.4, 24.0) | 21.2 (14.5, 24.0) | 0.14 |

¹ Median (Q1, Q3)
² Wilcoxon rank sum test

| Characteristic | Overall, N = 200¹ | Drug A, N = 98¹ | Drug B, N = 102¹ | p-value² |
|---|---|---|---|---|
| Age | 47 (38, 57) | 46 (37, 60) | 48 (39, 56) | 0.72 |
|     Unknown | 11 | 7 | 4 | |
| Marker Level (ng/mL) | 0.64 (0.22, 1.41) | 0.84 (0.23, 1.60) | 0.52 (0.18, 1.21) | 0.085 |
|     Unknown | 10 | 6 | 4 | |
| Months to Death/Censor | 22.4 (15.9, 24.0) | 23.5 (17.4, 24.0) | 21.2 (14.5, 24.0) | 0.14 |

¹ Median (Q1, Q3)
² Wilcoxon rank sum test

4.2 Welch Two Samples T-test

```{r}
# be careful: the table shows mean (SD), but add_p() still uses the default Wilcoxon rank sum test
trial |> 
  select(trt, age, marker, ttdeath) |> 
  tbl_summary(
    by = trt,
    statistic = list(
      c(age, marker, ttdeath) ~ "{mean} ({sd})"),
    missing = "no"
  ) |> 
  add_p() # uses Wilcoxon rank sum test

trial |> 
  select(trt, age, marker, ttdeath) |> 
  tbl_summary(
    by = trt,
    statistic = list(
      c(age, marker, ttdeath) ~ "{mean} ({sd})"),
    missing = "no"
  ) |> 
  add_difference(
    pvalue_fun = label_style_pvalue(digits = 2)
    ) |> # Welch two-sample t-test
  #modify_caption("**Table 1. Effectiveness of Drugs**") |> 
  as_gt() |> 
  tab_header(title = md("**Table 1. Effectiveness of Drugs**"),
             subtitle = "Patient Characteristics")
```
| Characteristic | Drug A, N = 98¹ | Drug B, N = 102¹ | p-value² |
|---|---|---|---|
| Age | 47 (15) | 47 (14) | 0.7 |
| Marker Level (ng/mL) | 1.02 (0.89) | 0.82 (0.83) | 0.085 |
| Months to Death/Censor | 20.2 (5.0) | 19.0 (5.5) | 0.14 |

¹ Mean (SD)
² Wilcoxon rank sum test

4.3 Chi-square with tbl_summary

```{r}
trial |> 
  select(trt, stage, grade) |> 
  tbl_summary(by = trt) |> 
  add_p(pvalue_fun = label_style_pvalue(digits = 2)) |> 
  add_overall()
```
| Characteristic | Overall, N = 200¹ | Drug A, N = 98¹ | Drug B, N = 102¹ | p-value² |
|---|---|---|---|---|
| T Stage | | | | 0.87 |
|     T1 | 53 (27%) | 28 (29%) | 25 (25%) | |
|     T2 | 54 (27%) | 25 (26%) | 29 (28%) | |
|     T3 | 43 (22%) | 22 (22%) | 21 (21%) | |
|     T4 | 50 (25%) | 23 (23%) | 27 (26%) | |
| Grade | | | | 0.87 |
|     I | 68 (34%) | 35 (36%) | 33 (32%) | |
|     II | 68 (34%) | 32 (33%) | 36 (35%) | |
|     III | 64 (32%) | 31 (32%) | 33 (32%) | |

¹ n (%)
² Pearson’s Chi-squared test

4.4 Customizing tbl_summary() output

```{r}
trial2 |> 
  tbl_summary(
    by = trt,
    type = age ~ "continuous2",
    statistic = 
      list(age ~ c("{mean} ({sd})", "{min}, {max}"),
           response ~ "{n}/{N} ({p}%)"),
    label = grade ~ "Pathological Tumor Grade",
    digits = age ~ 2
    ) |> 
  add_p(pvalue_fun = label_style_pvalue(digits = 2)) |> 
  add_q(method = "bonferroni") # p-values adjusted for multiple comparison
```
| Characteristic | Drug A, N = 98¹ | Drug B, N = 102¹ | p-value² | q-value³ |
|---|---|---|---|---|
| Age | | | 0.72 | >0.99 |
|     Mean (SD) | 47.01 (14.71) | 47.45 (14.01) | | |
|     Min, Max | 6.00, 78.00 | 9.00, 83.00 | | |
|     Unknown | 7 | 4 | | |
| Pathological Tumor Grade | | | 0.87 | >0.99 |
|     I | 35 (36%) | 33 (32%) | | |
|     II | 32 (33%) | 36 (35%) | | |
|     III | 31 (32%) | 33 (32%) | | |
| Tumor Response | 28/95 (29%) | 33/98 (34%) | 0.53 | >0.99 |
|     Unknown | 3 | 4 | | |

¹ n (%); n/N (%)
² Wilcoxon rank sum test; Pearson’s Chi-squared test
³ Bonferroni correction for multiple testing
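
The q-values in the table above all display as >0.99. A minimal sketch of why, using base R's p.adjust() on the rounded p-values copied from that table: Bonferroni multiplies each p-value by the number of tests and caps the result at 1. The object names (p_values, q_manual, q_bonf) are illustrative, and because the inputs are rounded the result is approximate.

```{r}
# p-values copied (rounded) from the table above: age, grade, response
p_values <- c(age = 0.72, grade = 0.87, response = 0.53)

# Bonferroni by hand: multiply by the number of tests, then cap at 1
q_manual <- pmin(p_values * length(p_values), 1)

# p.adjust() from the stats package applies the same correction
q_bonf <- p.adjust(p_values, method = "bonferroni")

q_manual
q_bonf
```

With only three tests, every product already exceeds 1, so all three adjusted values are capped at 1.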

4.4.1 as_gt()

  • How to combine gtsummary tables with gt functions: convert with as_gt(), then add gt layers such as tab_header()
```{r}
trial |> 
  select(trt, marker, response) |> 
  tbl_summary(by = trt)

trial |> 
  select(trt, marker, response) |> 
  tbl_summary(
    by = trt,
    statistic = list(
      marker ~ "{mean} ({sd})",
      response ~ "{p}%"),
    missing = "no"
  ) |> 
  add_difference(
    pvalue_fun = label_style_pvalue(digits = 2)
  ) |> 
  #modify_caption("**Table 1. Effectiveness of Drugs**") |> 
  as_gt() |> 
  tab_header(title = md("**Table 1. Effectiveness of Drugs**"),
             subtitle = "Patient Characteristics")
```
| Characteristic | Drug A, N = 98¹ | Drug B, N = 102¹ |
|---|---|---|
| Marker Level (ng/mL) | 0.84 (0.23, 1.60) | 0.52 (0.18, 1.21) |
|     Unknown | 6 | 4 |
| Tumor Response | 28 (29%) | 33 (34%) |
|     Unknown | 3 | 4 |

¹ Median (Q1, Q3); n (%)

4.5 Chi-Square with tbl_cross()

  • Cross-tabulation with trial data
```{r}
trial |> 
  tbl_cross(
    row = trt,
    col = stage,
    percent = "column"
  ) |> 
  add_p() |> 
  bold_labels()

# cf

trial |> 
  select(trt, stage) |> 
  tbl_summary(by = stage) |> 
  add_p() |> 
  bold_labels()
```
| | T Stage: T1 | T Stage: T2 | T Stage: T3 | T Stage: T4 | Total | p-value¹ |
|---|---|---|---|---|---|---|
| Chemotherapy Treatment | | | | | | 0.9 |
|     Drug A | 28 (53%) | 25 (46%) | 22 (51%) | 23 (46%) | 98 (49%) | |
|     Drug B | 25 (47%) | 29 (54%) | 21 (49%) | 27 (54%) | 102 (51%) | |
| Total | 53 (100%) | 54 (100%) | 43 (100%) | 50 (100%) | 200 (100%) | |

¹ Pearson’s Chi-squared test

| Characteristic | T1, N = 53¹ | T2, N = 54¹ | T3, N = 43¹ | T4, N = 50¹ | p-value² |
|---|---|---|---|---|---|
| Chemotherapy Treatment | | | | | 0.9 |
|     Drug A | 28 (53%) | 25 (46%) | 22 (51%) | 23 (46%) | |
|     Drug B | 25 (47%) | 29 (54%) | 21 (49%) | 27 (54%) | |

¹ n (%)
² Pearson’s Chi-squared test

Tip
  • A chi-square test can be done with either tbl_summary() or tbl_cross(), but the latter produces a more attractive table.
  • The former is better when the table includes multiple types of statistical tests.
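
Both functions report the same Pearson's chi-squared test, so the p-value can be cross-checked outside gtsummary. The sketch below re-enters the treatment-by-stage counts from the cross-tabulation above into a base R matrix (the object names stage_by_trt and chi_res are illustrative) and calls chisq.test(); the result should land near the 0.9 shown in the tables.

```{r}
# Counts copied from the tbl_cross() output above
# (rows: treatment, columns: T stage)
stage_by_trt <- matrix(
  c(28, 25, 22, 23,
    25, 29, 21, 27),
  nrow = 2, byrow = TRUE,
  dimnames = list(trt = c("Drug A", "Drug B"),
                  stage = c("T1", "T2", "T3", "T4"))
)

# Pearson's chi-squared test; the continuity correction only applies to 2 x 2 tables
chi_res <- chisq.test(stage_by_trt)

chi_res$parameter  # degrees of freedom: (2 - 1) * (4 - 1)
chi_res$p.value
```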

5 Survey Data

5.1 tbl_summary()

```{r}
gss_cat
class(gss_cat) # builtin data from forcats package
view_df(gss_cat)

glimpse(gss_cat)
gss_cat |> 
  tbl_summary()

# by race

gss_cat |> 
  #count(race)
  mutate(race = fct_drop(race)) |> 
  tbl_summary(by = race) 
```
# A tibble: 21,483 × 9
    year marital         age race  rincome        partyid    relig denom tvhours
   <int> <fct>         <int> <fct> <fct>          <fct>      <fct> <fct>   <int>
 1  2000 Never married    26 White $8000 to 9999  Ind,near … Prot… Sout…      12
 2  2000 Divorced         48 White $8000 to 9999  Not str r… Prot… Bapt…      NA
 3  2000 Widowed          67 White Not applicable Independe… Prot… No d…       2
 4  2000 Never married    39 White Not applicable Ind,near … Orth… Not …       4
 5  2000 Divorced         25 White Not applicable Not str d… None  Not …       1
 6  2000 Married          25 White $20000 - 24999 Strong de… Prot… Sout…      NA
 7  2000 Never married    36 White $25000 or more Not str r… Chri… Not …       3
 8  2000 Divorced         44 White $7000 to 7999  Ind,near … Prot… Luth…      NA
 9  2000 Married          44 White $25000 or more Not str d… Prot… Other       0
10  2000 Married          47 White $25000 or more Strong re… Prot… Sout…       3
# ℹ 21,473 more rows
[1] "tbl_df"     "tbl"        "data.frame"
Data frame: gss_cat
ID Name Label Values Value Labels
1 year range: 2000-2014
2 marital No answer
Never married
Separated
Divorced
Widowed
Married
3 age range: 18-89
4 race Other
Black
White
Not applicable
5 rincome No answer
Don't know
Refused
$25000 or more
$20000 - 24999
$15000 - 19999
$10000 - 14999
$8000 to 9999
$7000 to 7999
$6000 to 6999
$5000 to 5999
$4000 to 4999
$3000 to 3999
$1000 to 2999
Lt $1000
<... truncated>
6 partyid No answer
Don't know
Other party
Strong republican
Not str republican
Ind,near rep
Independent
Ind,near dem
Not str democrat
Strong democrat
7 relig No answer
Don't know
Inter-nondenominational
Native american
Christian
Orthodox-christian
Moslem/islam
Other eastern
Hinduism
Buddhism
Other
None
Jewish
Catholic
Protestant
<... truncated>
8 denom No answer
Don't know
No denomination
Other
Episcopal
Presbyterian-dk wh
Presbyterian, merged
Other presbyterian
United pres ch in us
Presbyterian c in us
Lutheran-dk which
Evangelical luth
Other lutheran
Wi evan luth synod
Lutheran-mo synod
<... truncated>
9 tvhours range: 0-24
Rows: 21,483
Columns: 9
$ year    <int> 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000, 20…
$ marital <fct> Never married, Divorced, Widowed, Never married, Divorced, Mar…
$ age     <int> 26, 48, 67, 39, 25, 25, 36, 44, 44, 47, 53, 52, 52, 51, 52, 40…
$ race    <fct> White, White, White, White, White, White, White, White, White,…
$ rincome <fct> $8000 to 9999, $8000 to 9999, Not applicable, Not applicable, …
$ partyid <fct> "Ind,near rep", "Not str republican", "Independent", "Ind,near…
$ relig   <fct> Protestant, Protestant, Protestant, Orthodox-christian, None, …
$ denom   <fct> "Southern baptist", "Baptist-dk which", "No denomination", "No…
$ tvhours <int> 12, NA, 2, 4, 1, NA, 3, NA, 0, 3, 2, NA, 1, NA, 1, 7, NA, 3, 3…
Characteristic N = 21,483¹
year
    2000 2,817 (13%)
    2002 2,765 (13%)
    2004 2,812 (13%)
    2006 4,510 (21%)
    2008 2,023 (9.4%)
    2010 2,044 (9.5%)
    2012 1,974 (9.2%)
    2014 2,538 (12%)
marital
    No answer 17 (<0.1%)
    Never married 5,416 (25%)
    Separated 743 (3.5%)
    Divorced 3,383 (16%)
    Widowed 1,807 (8.4%)
    Married 10,117 (47%)
age 46 (33, 59)
    Unknown 76
race
    Other 1,959 (9.1%)
    Black 3,129 (15%)
    White 16,395 (76%)
    Not applicable 0 (0%)
rincome
    No answer 183 (0.9%)
    Don't know 267 (1.2%)
    Refused 975 (4.5%)
    $25000 or more 7,363 (34%)
    $20000 - 24999 1,283 (6.0%)
    $15000 - 19999 1,048 (4.9%)
    $10000 - 14999 1,168 (5.4%)
    $8000 to 9999 340 (1.6%)
    $7000 to 7999 188 (0.9%)
    $6000 to 6999 215 (1.0%)
    $5000 to 5999 227 (1.1%)
    $4000 to 4999 226 (1.1%)
    $3000 to 3999 276 (1.3%)
    $1000 to 2999 395 (1.8%)
    Lt $1000 286 (1.3%)
    Not applicable 7,043 (33%)
partyid
    No answer 154 (0.7%)
    Don't know 1 (<0.1%)
    Other party 393 (1.8%)
    Strong republican 2,314 (11%)
    Not str republican 3,032 (14%)
    Ind,near rep 1,791 (8.3%)
    Independent 4,119 (19%)
    Ind,near dem 2,499 (12%)
    Not str democrat 3,690 (17%)
    Strong democrat 3,490 (16%)
relig
    No answer 93 (0.4%)
    Don't know 15 (<0.1%)
    Inter-nondenominational 109 (0.5%)
    Native american 23 (0.1%)
    Christian 689 (3.2%)
    Orthodox-christian 95 (0.4%)
    Moslem/islam 104 (0.5%)
    Other eastern 32 (0.1%)
    Hinduism 71 (0.3%)
    Buddhism 147 (0.7%)
    Other 224 (1.0%)
    None 3,523 (16%)
    Jewish 388 (1.8%)
    Catholic 5,124 (24%)
    Protestant 10,846 (50%)
    Not applicable 0 (0%)
denom
    No answer 117 (0.5%)
    Don't know 52 (0.2%)
    No denomination 1,683 (7.8%)
    Other 2,534 (12%)
    Episcopal 397 (1.8%)
    Presbyterian-dk wh 244 (1.1%)
    Presbyterian, merged 67 (0.3%)
    Other presbyterian 47 (0.2%)
    United pres ch in us 110 (0.5%)
    Presbyterian c in us 104 (0.5%)
    Lutheran-dk which 267 (1.2%)
    Evangelical luth 122 (0.6%)
    Other lutheran 30 (0.1%)
    Wi evan luth synod 71 (0.3%)
    Lutheran-mo synod 212 (1.0%)
    Luth ch in america 71 (0.3%)
    Am lutheran 146 (0.7%)
    Methodist-dk which 239 (1.1%)
    Other methodist 33 (0.2%)
    United methodist 1,067 (5.0%)
    Afr meth ep zion 32 (0.1%)
    Afr meth episcopal 77 (0.4%)
    Baptist-dk which 1,457 (6.8%)
    Other baptists 213 (1.0%)
    Southern baptist 1,536 (7.1%)
    Nat bapt conv usa 40 (0.2%)
    Nat bapt conv of am 76 (0.4%)
    Am bapt ch in usa 130 (0.6%)
    Am baptist asso 237 (1.1%)
    Not applicable 10,072 (47%)
tvhours 2 (1, 4)
    Unknown 10,146
¹ n (%); Median (Q1, Q3)

Characteristic Other (N = 1,959¹) Black (N = 3,129¹) White (N = 16,395¹)
year


    2000 175 (8.9%) 429 (14%) 2,213 (13%)
    2002 167 (8.5%) 410 (13%) 2,188 (13%)
    2004 201 (10%) 377 (12%) 2,234 (14%)
    2006 592 (30%) 634 (20%) 3,284 (20%)
    2008 183 (9.3%) 281 (9.0%) 1,559 (9.5%)
    2010 183 (9.3%) 311 (9.9%) 1,550 (9.5%)
    2012 196 (10%) 301 (9.6%) 1,477 (9.0%)
    2014 262 (13%) 386 (12%) 1,890 (12%)
marital


    No answer 2 (0.1%) 2 (<0.1%) 13 (<0.1%)
    Never married 633 (32%) 1,305 (42%) 3,478 (21%)
    Separated 110 (5.6%) 196 (6.3%) 437 (2.7%)
    Divorced 212 (11%) 495 (16%) 2,676 (16%)
    Widowed 70 (3.6%) 262 (8.4%) 1,475 (9.0%)
    Married 932 (48%) 869 (28%) 8,316 (51%)
age 37 (29, 48) 42 (31, 55) 48 (35, 61)
    Unknown 8 14 54
rincome


    No answer 14 (0.7%) 35 (1.1%) 134 (0.8%)
    Don't know 45 (2.3%) 45 (1.4%) 177 (1.1%)
    Refused 92 (4.7%) 150 (4.8%) 733 (4.5%)
    $25000 or more 621 (32%) 886 (28%) 5,856 (36%)
    $20000 - 24999 112 (5.7%) 220 (7.0%) 951 (5.8%)
    $15000 - 19999 134 (6.8%) 180 (5.8%) 734 (4.5%)
    $10000 - 14999 126 (6.4%) 210 (6.7%) 832 (5.1%)
    $8000 to 9999 41 (2.1%) 56 (1.8%) 243 (1.5%)
    $7000 to 7999 24 (1.2%) 27 (0.9%) 137 (0.8%)
    $6000 to 6999 26 (1.3%) 35 (1.1%) 154 (0.9%)
    $5000 to 5999 27 (1.4%) 40 (1.3%) 160 (1.0%)
    $4000 to 4999 34 (1.7%) 38 (1.2%) 154 (0.9%)
    $3000 to 3999 35 (1.8%) 59 (1.9%) 182 (1.1%)
    $1000 to 2999 47 (2.4%) 71 (2.3%) 277 (1.7%)
    Lt $1000 36 (1.8%) 51 (1.6%) 199 (1.2%)
    Not applicable 545 (28%) 1,026 (33%) 5,472 (33%)
partyid


    No answer 25 (1.3%) 36 (1.2%) 93 (0.6%)
    Don't know 0 (0%) 0 (0%) 1 (<0.1%)
    Other party 22 (1.1%) 22 (0.7%) 349 (2.1%)
    Strong republican 81 (4.1%) 56 (1.8%) 2,177 (13%)
    Not str republican 156 (8.0%) 88 (2.8%) 2,788 (17%)
    Ind,near rep 118 (6.0%) 92 (2.9%) 1,581 (9.6%)
    Independent 612 (31%) 491 (16%) 3,016 (18%)
    Ind,near dem 285 (15%) 352 (11%) 1,862 (11%)
    Not str democrat 437 (22%) 746 (24%) 2,507 (15%)
    Strong democrat 223 (11%) 1,246 (40%) 2,021 (12%)
relig


    No answer 14 (0.7%) 16 (0.5%) 63 (0.4%)
    Don't know 3 (0.2%) 3 (<0.1%) 9 (<0.1%)
    Inter-nondenominational 2 (0.1%) 29 (0.9%) 78 (0.5%)
    Native american 16 (0.8%) 0 (0%) 7 (<0.1%)
    Christian 74 (3.8%) 141 (4.5%) 474 (2.9%)
    Orthodox-christian 1 (<0.1%) 2 (<0.1%) 92 (0.6%)
    Moslem/islam 42 (2.1%) 35 (1.1%) 27 (0.2%)
    Other eastern 10 (0.5%) 2 (<0.1%) 20 (0.1%)
    Hinduism 62 (3.2%) 1 (<0.1%) 8 (<0.1%)
    Buddhism 72 (3.7%) 10 (0.3%) 65 (0.4%)
    Other 29 (1.5%) 18 (0.6%) 177 (1.1%)
    None 323 (16%) 384 (12%) 2,816 (17%)
    Jewish 8 (0.4%) 10 (0.3%) 370 (2.3%)
    Catholic 916 (47%) 207 (6.6%) 4,001 (24%)
    Protestant 387 (20%) 2,271 (73%) 8,188 (50%)
    Not applicable 0 (0%) 0 (0%) 0 (0%)
denom


    No answer 14 (0.7%) 17 (0.5%) 86 (0.5%)
    Don't know 6 (0.3%) 15 (0.5%) 31 (0.2%)
    No denomination 99 (5.1%) 240 (7.7%) 1,344 (8.2%)
    Other 180 (9.2%) 468 (15%) 1,886 (12%)
    Episcopal 9 (0.5%) 38 (1.2%) 350 (2.1%)
    Presbyterian-dk wh 15 (0.8%) 8 (0.3%) 221 (1.3%)
    Presbyterian, merged 1 (<0.1%) 2 (<0.1%) 64 (0.4%)
    Other presbyterian 2 (0.1%) 2 (<0.1%) 43 (0.3%)
    United pres ch in us 2 (0.1%) 6 (0.2%) 102 (0.6%)
    Presbyterian c in us 6 (0.3%) 5 (0.2%) 93 (0.6%)
    Lutheran-dk which 6 (0.3%) 6 (0.2%) 255 (1.6%)
    Evangelical luth 1 (<0.1%) 2 (<0.1%) 119 (0.7%)
    Other lutheran 0 (0%) 0 (0%) 30 (0.2%)
    Wi evan luth synod 0 (0%) 1 (<0.1%) 70 (0.4%)
    Lutheran-mo synod 2 (0.1%) 2 (<0.1%) 208 (1.3%)
    Luth ch in america 2 (0.1%) 2 (<0.1%) 67 (0.4%)
    Am lutheran 3 (0.2%) 5 (0.2%) 138 (0.8%)
    Methodist-dk which 3 (0.2%) 35 (1.1%) 201 (1.2%)
    Other methodist 2 (0.1%) 5 (0.2%) 26 (0.2%)
    United methodist 11 (0.6%) 49 (1.6%) 1,007 (6.1%)
    Afr meth ep zion 0 (0%) 31 (1.0%) 1 (<0.1%)
    Afr meth episcopal 0 (0%) 76 (2.4%) 1 (<0.1%)
    Baptist-dk which 37 (1.9%) 697 (22%) 723 (4.4%)
    Other baptists 5 (0.3%) 47 (1.5%) 161 (1.0%)
    Southern baptist 30 (1.5%) 355 (11%) 1,151 (7.0%)
    Nat bapt conv usa 1 (<0.1%) 35 (1.1%) 4 (<0.1%)
    Nat bapt conv of am 1 (<0.1%) 58 (1.9%) 17 (0.1%)
    Am bapt ch in usa 4 (0.2%) 76 (2.4%) 50 (0.3%)
    Am baptist asso 6 (0.3%) 97 (3.1%) 134 (0.8%)
    Not applicable 1,511 (77%) 749 (24%) 7,812 (48%)
tvhours 2 (1, 4) 3 (2, 5) 2 (1, 4)
    Unknown 932 1,429 7,785
¹ n (%); Median (Q1, Q3)

5.2 tbl_cross()

```{r}
levels(gss_cat$race)
unique(gss_cat$race)

gss_cat_md <- gss_cat |> 
  mutate(race = fct_drop(race)) 

levels(gss_cat_md$race)
levels(gss_cat_md$marital)

## cross-tab
gss_cat_md |> 
  filter(marital != "No answer") |> # reduce the number of cells
  mutate(marital = fct_drop(marital)) |> 
  tbl_cross(
    row = marital,
    col = race,
    percent = "column",
    missing = "no"
  ) |> 
  add_p() |> 
  bold_labels()
```
[1] "Other"          "Black"          "White"          "Not applicable"
[1] White Black Other
Levels: Other Black White Not applicable
[1] "Other" "Black" "White"
[1] "No answer"     "Never married" "Separated"     "Divorced"     
[5] "Widowed"       "Married"      

| | race: Other | race: Black | race: White | Total | p-value¹ |
|---|---|---|---|---|---|
| marital | | | | | <0.001 |
|     Never married | 633 (32%) | 1,305 (42%) | 3,478 (21%) | 5,416 (25%) | |
|     Separated | 110 (5.6%) | 196 (6.3%) | 437 (2.7%) | 743 (3.5%) | |
|     Divorced | 212 (11%) | 495 (16%) | 2,676 (16%) | 3,383 (16%) | |
|     Widowed | 70 (3.6%) | 262 (8.4%) | 1,475 (9.0%) | 1,807 (8.4%) | |
|     Married | 932 (48%) | 869 (28%) | 8,316 (51%) | 10,117 (47%) | |
| Total | 1,957 (100%) | 3,127 (100%) | 16,382 (100%) | 21,466 (100%) | |

¹ Pearson’s Chi-squared test

6 More on tbl_summary()

6.1 Modifying tbl_summary() function arguments

```{r}
trial2 <- trial |> 
  select(trt, age, grade)

trial2 |> 
  tbl_summary(by = trt)

trial2 |> 
  tbl_summary(
    by = trt,
    statistic = list(
      all_continuous() ~ "{mean} ({sd})",
      all_categorical() ~ "{n} / {N} ({p}%)"
    ),
    digits = all_continuous() ~ 2,
    label = grade ~ "Tumor Grade",
    missing_text = "(Missing)"
  ) |> 
  add_p() |> 
  add_overall()
```
| Characteristic | Drug A, N = 98¹ | Drug B, N = 102¹ |
|---|---|---|
| Age | 46 (37, 60) | 48 (39, 56) |
|     Unknown | 7 | 4 |
| Grade | | |
|     I | 35 (36%) | 33 (32%) |
|     II | 32 (33%) | 36 (35%) |
|     III | 31 (32%) | 33 (32%) |

¹ Median (Q1, Q3); n (%)

| Characteristic | Overall, N = 200¹ | Drug A, N = 98¹ | Drug B, N = 102¹ | p-value² |
|---|---|---|---|---|
| Age | 47.24 (14.31) | 47.01 (14.71) | 47.45 (14.01) | 0.7 |
|     (Missing) | 11 | 7 | 4 | |
| Tumor Grade | | | | 0.9 |
|     I | 68 / 200 (34%) | 35 / 98 (36%) | 33 / 102 (32%) | |
|     II | 68 / 200 (34%) | 32 / 98 (33%) | 36 / 102 (35%) | |
|     III | 64 / 200 (32%) | 31 / 98 (32%) | 33 / 102 (32%) | |

¹ Mean (SD); n / N (%)
² Wilcoxon rank sum test; Pearson’s Chi-squared test

6.2 Formatting table with tbl_summary() functions

```{r}
glimpse(trial2)

# Customizing table
trial |> 
  select(trt, age, grade, response) |> 
  #filter(!is.na(age)) |> 
  tbl_summary(
    by = trt,
    missing = "no") |> 
  #show_header_names()
  add_p(pvalue_fun = label_style_pvalue(digits = 2)) |> 
  add_overall() |> 
  add_n() |> 
  add_stat_label(label = all_categorical() ~ "No. (%)") |> 
  modify_header(label ~ "**Variables**") |> 
  modify_spanning_header(c("stat_1", "stat_2") ~ "**Treatment Received**") |> 
  modify_footnote(
    all_stat_cols() ~ "Median(IQR) or Frequency (%)"
  ) |> 
  modify_caption("**Table 1. Patient Characteristics**") |> 
  bold_labels() 
```
Rows: 200
Columns: 3
$ trt   <chr> "Drug A", "Drug B", "Drug A", "Drug A", "Drug A", "Drug B", "Dru…
$ age   <dbl> 23, 9, 31, NA, 51, 39, 37, 32, 31, 34, 42, 63, 54, 21, 48, 71, 3…
$ grade <fct> II, I, II, III, III, I, II, I, II, I, III, I, III, I, I, III, II…

Table 1. Patient Characteristics

| Variables | N | Overall, N = 200¹ | Treatment Received: Drug A, N = 98¹ | Treatment Received: Drug B, N = 102¹ | p-value² |
|---|---|---|---|---|---|
| Age, Median (Q1, Q3) | 189 | 47 (38, 57) | 46 (37, 60) | 48 (39, 56) | 0.72 |
| Grade, No. (%) | 200 | | | | 0.87 |
|     I | | 68 (34%) | 35 (36%) | 33 (32%) | |
|     II | | 68 (34%) | 32 (33%) | 36 (35%) | |
|     III | | 64 (32%) | 31 (32%) | 33 (32%) | |
| Tumor Response, No. (%) | 193 | 61 (32%) | 28 (29%) | 33 (34%) | 0.53 |

¹ Median(IQR) or Frequency (%)
² Wilcoxon rank sum test; Pearson’s Chi-squared test

6.3 t-test

```{r}
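# Custom statistic for add_stat(): Welch t-test of `variable` across the `by` groups,
# returned as one formatted string such as "t=..., p=..."
my_ttest2 <- function(data, variable, by, ...) {
  t.test(data[[variable]] ~ as.factor(data[[by]])) |>
    broom::tidy() |>
    dplyr::mutate(
      stat = glue::glue("t={style_sigfig(statistic)}, {style_pvalue(p.value, prepend_p = TRUE)}")
    ) |>
    dplyr::pull(stat)
}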

# t-test
trial |> 
  select(age, marker, trt) |> 
  tbl_summary(
    by = trt,
    missing = "no"
  ) |> 
  add_stat(fns = everything() ~ my_ttest2) |> 
  modify_header(add_stat_1 = "**Treatment Comparison**")

# add_difference
trial |> 
  select(age, marker, trt) |> 
  tbl_summary(
    by = trt,
    missing = "no"
  ) |> 
  add_difference()

# change default stat to mean (sd)
trial |> 
  select(age, marker, trt) |> 
  tbl_summary(
    by = trt,
    missing = "no",
    statistic = list(
      all_continuous() ~ "{mean} ({sd})",
      all_categorical() ~ "{n} / {N} ({p}%)"
    ),
  ) |> 
  add_difference()
```
| Characteristic | Drug A, N = 98¹ | Drug B, N = 102¹ | Treatment Comparison |
|---|---|---|---|
| Age | 46 (37, 60) | 48 (39, 56) | t=-0.21, p=0.8 |
| Marker Level (ng/mL) | 0.84 (0.23, 1.60) | 0.52 (0.18, 1.21) | t=1.6, p=0.12 |

¹ Median (Q1, Q3)

6.4 add_difference()

  • t-tests for continuous variables
  • test for equality of proportions
```{r}
trial |>
  select(trt, age, marker, response, death) |>
  tbl_summary(
    by = trt,
    statistic =
      list(
        all_continuous() ~ "{mean} ({sd})",
        all_dichotomous() ~ "{p}%"
      ),
    missing = "no"
  ) |>
  add_n() |>
  add_difference()

## controlling decimal points
trial |>
  select(trt, age, marker, response, death) |>
  tbl_summary(
    by = trt,
    statistic =
      list(
        all_continuous() ~ "{mean} ({sd})",
        all_dichotomous() ~ "{p}%"
      ),
    digits = list(all_continuous() ~ 2,
                  all_dichotomous() ~ 2),
    missing = "no"
  ) |>
  add_n() |>
  add_difference() |>
  modify_fmt_fun(
    c(conf.low, conf.high) ~ label_style_number(digits = 2),       
    p.value = label_style_pvalue(digits = 2)    
  )
```
| Characteristic | N | Drug A, N = 98¹ | Drug B, N = 102¹ | Difference² | 95% CI² | p-value² |
|---|---|---|---|---|---|---|
| Age | 189 | 47 (15) | 47 (14) | -0.44 | -4.6, 3.7 | 0.8 |
| Marker Level (ng/mL) | 190 | 1.02 (0.89) | 0.82 (0.83) | 0.20 | -0.05, 0.44 | 0.12 |
| Tumor Response | 193 | 29% | 34% | -4.2% | -18%, 9.9% | 0.6 |
| Patient Died | 200 | 53% | 59% | -5.8% | -21%, 9.0% | 0.5 |

Abbreviation: CI = Confidence Interval
¹ Mean (SD); %
² Welch Two Sample t-test; 2-sample test for equality of proportions with continuity correction

| Characteristic | N | Drug A, N = 98¹ | Drug B, N = 102¹ | Difference² | 95% CI² | p-value² |
|---|---|---|---|---|---|---|
| Age | 189 | 47.01 (14.71) | 47.45 (14.01) | -0.44 | -4.57, 3.69 | 0.83 |
| Marker Level (ng/mL) | 190 | 1.02 (0.89) | 0.82 (0.83) | 0.20 | -0.05, 0.44 | 0.12 |
| Tumor Response | 193 | 29.47% | 33.67% | -4.2% | -0.18, 0.10 | 0.64 |
| Patient Died | 200 | 53.06% | 58.82% | -5.8% | -0.21, 0.09 | 0.50 |

Abbreviation: CI = Confidence Interval
¹ Mean (SD); %
² Welch Two Sample t-test; 2-sample test for equality of proportions with continuity correction
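
The footnote names the test applied to the dichotomous rows. As a rough cross-check of the Tumor Response row, the sketch below runs base R's prop.test() on the responder counts reported earlier in this document (28/95 for Drug A, 33/98 for Drug B). The object names are illustrative, and the comparison assumes add_difference() reports Drug A minus Drug B, which is what the signs in the table suggest.

```{r}
# Responders and patients with a non-missing response, copied from the
# "Tumor Response 28/95 (29%) 33/98 (34%)" row shown in section 4.4
responders <- c(drug_a = 28, drug_b = 33)
n_observed <- c(drug_a = 95, drug_b = 98)

# 2-sample test for equality of proportions; the continuity correction is on by default
prop_res <- prop.test(x = responders, n = n_observed)

# Difference in proportions, Drug A minus Drug B, in percentage points
diff_prop <- unname(prop_res$estimate[1] - prop_res$estimate[2])
round(100 * diff_prop, 1)

prop_res$conf.int  # 95% CI for the difference, on the proportion scale
prop_res$p.value
```

The confidence interval is returned on the proportion scale, which is also how the second table above displays it once modify_fmt_fun() replaces the default percent formatting.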

7 ANCOVA Table

```{r}
# ANCOVA adjusted for grade and stage
trial |>
  #select(trt, age, marker, grade, stage)
  tbl_summary(
    by = trt,
    statistic = list(all_continuous() ~ "{mean} ({sd})"),
    missing = "no",
    include = c(age, marker, ttdeath, trt)
  ) |>
  add_n() |>
  add_difference(adj.vars = c(grade, stage))
```
| Characteristic | N | Drug A, N = 98¹ | Drug B, N = 102¹ | Adjusted Difference² | 95% CI² | p-value² |
|---|---|---|---|---|---|---|
| Age | 189 | 47 (15) | 47 (14) | -0.36 | -4.5, 3.8 | 0.9 |
| Marker Level (ng/mL) | 190 | 1.02 (0.89) | 0.82 (0.83) | 0.19 | -0.05, 0.43 | 0.12 |
| Months to Death/Censor | 200 | 20.2 (5.0) | 19.0 (5.5) | 1.0 | -0.38, 2.5 | 0.15 |

Abbreviation: CI = Confidence Interval
¹ Mean (SD)
² ANCOVA

8 Regression model with tbl_regression()

8.1 Traditional Logistic model

```{r}
m1 <- trial |> glm(
  response ~ age + stage,
  data = _,
  family = binomial(link = "logit")
)

summary(m1)
```

Call:
glm(formula = response ~ age + stage, family = binomial(link = "logit"), 
    data = trial)

Coefficients:
            Estimate Std. Error z value Pr(>|z|)  
(Intercept) -1.48622    0.62023  -2.396   0.0166 *
age          0.01939    0.01147   1.691   0.0909 .
stageT2     -0.54143    0.44000  -1.231   0.2185  
stageT3     -0.05953    0.45042  -0.132   0.8948  
stageT4     -0.23109    0.44823  -0.516   0.6062  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 228.58  on 182  degrees of freedom
Residual deviance: 223.93  on 178  degrees of freedom
  (17 observations deleted due to missingness)
AIC: 233.93

Number of Fisher Scoring iterations: 4

8.2 Table Using tbl_regression

```{r}
m1 |> 
  tbl_regression() # same estimates as summary(m1) above, reported as log(OR)

# customize
m1 |> 
  tbl_regression(
    exponentiate = TRUE,
    pvalue_fun = label_style_pvalue(digits = 2)
  ) |> 
  add_global_p() |> 
  bold_p(t = 0.10) |> 
  bold_labels() |> 
  add_glance_table(
    include = c(nobs, logLik, AIC, BIC)
  )
```
| Characteristic | log(OR) | 95% CI | p-value |
|---|---|---|---|
| Age | 0.02 | 0.00, 0.04 | 0.091 |
| T Stage | | | |
|     T1 | | | |
|     T2 | -0.54 | -1.4, 0.31 | 0.2 |
|     T3 | -0.06 | -0.95, 0.82 | 0.9 |
|     T4 | -0.23 | -1.1, 0.64 | 0.6 |

Abbreviations: CI = Confidence Interval, OR = Odds Ratio

| Characteristic | OR | 95% CI | p-value |
|---|---|---|---|
| Age | 1.02 | 1.00, 1.04 | 0.087 |
| T Stage | | | 0.62 |
|     T1 | | | |
|     T2 | 0.58 | 0.24, 1.37 | |
|     T3 | 0.94 | 0.39, 2.28 | |
|     T4 | 0.79 | 0.33, 1.90 | |
| No. Obs. | 183 | | |
| Log-likelihood | -112 | | |
| AIC | 234 | | |
| BIC | 250 | | |

Abbreviations: CI = Confidence Interval, OR = Odds Ratio
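
The only difference between the log(OR) and OR columns is exponentiation. A small sketch, assuming the coefficients are copied by hand from the summary(m1) output above (the vector names log_or and odds_ratio are illustrative):

```{r}
# Log-odds coefficients copied from the summary(m1) output above
log_or <- c(age = 0.01939, stageT2 = -0.54143,
            stageT3 = -0.05953, stageT4 = -0.23109)

# exponentiate = TRUE reports exp(coefficient), i.e. the odds ratio
odds_ratio <- exp(log_or)
round(odds_ratio, 2)
```

The rounded values line up with the OR column of the second table (1.02, 0.58, 0.94, 0.79).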

8.3 Multiple regression with OLS

  • tbl_regression()
```{r}
mpg_model <- mpg |> 
  lm(hwy ~ displ+ cyl + drv, data = _)

mpg_model |> 
  summary() # same estimates as the tbl_regression() table below

mpg_model |> 
  tbl_regression() |> 
  add_n() |> 
  add_vif() |> 
  bold_labels()
```

Call:
lm(formula = hwy ~ displ + cyl + drv, data = mpg)

Residuals:
    Min      1Q  Median      3Q     Max 
-8.7095 -2.0282 -0.1297  1.3760 13.8110 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  33.0915     1.0306  32.108  < 2e-16 ***
displ        -1.1245     0.4614  -2.437   0.0156 *  
cyl          -1.4526     0.3334  -4.357 1.99e-05 ***
drvf          5.0446     0.5134   9.826  < 2e-16 ***
drvr          4.8851     0.7116   6.864 6.20e-11 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.968 on 229 degrees of freedom
Multiple R-squared:  0.7559,    Adjusted R-squared:  0.7516 
F-statistic: 177.2 on 4 and 229 DF,  p-value: < 2.2e-16

| Characteristic | N | Beta | 95% CI | p-value | GVIF | Adjusted GVIF¹ |
|---|---|---|---|---|---|---|
| displ | 234 | -1.1 | -2.0, -0.22 | 0.016 | 9.4 | 3.1 |
| cyl | 234 | -1.5 | -2.1, -0.80 | <0.001 | 7.6 | 2.8 |
| drv | 234 | | | | 2.0 | 1.2 |
|     4 | | | | | | |
|     f | | 5.0 | 4.0, 6.1 | <0.001 | | |
|     r | | 4.9 | 3.5, 6.3 | <0.001 | | |

Abbreviations: CI = Confidence Interval, GVIF = Generalized Variance Inflation Factor
¹ GVIF^[1/(2*df)]
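
The footnote formula can be applied by hand to see where the Adjusted GVIF column comes from. In this sketch the GVIF values are the rounded ones from the table above, and the degrees of freedom are assumed from the model terms: 1 each for the numeric predictors displ and cyl, and 2 for drv, which has three levels. The object names are illustrative.

```{r}
# GVIF values copied (rounded) from the table above
gvif <- c(displ = 9.4, cyl = 7.6, drv = 2.0)

# Degrees of freedom per term: numeric predictors use 1,
# a three-level factor such as drv uses 2
df_term <- c(displ = 1, cyl = 1, drv = 2)

# Adjusted GVIF = GVIF^(1 / (2 * df)); for df = 1 this is the square root of the GVIF
adjusted_gvif <- gvif^(1 / (2 * df_term))
round(adjusted_gvif, 1)
```

The adjustment puts terms with different degrees of freedom on a comparable scale, which is why drv is judged on 1.2 and not on 2.0.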

9 Univariate Regression

  • the same as simple regression
  • Univariate regression analyzes the relationship between one dependent variable and one independent variable, whereas “regression” in general refers to a broader class of statistical methods that relate a dependent variable to one or more independent variables.

tbl_uvregression() is a wrapper for tbl_regression() and, as a result, accepts nearly identical function arguments.

9.1 Dichotomous DV

```{r}
trial |>
  tbl_uvregression(
    method = glm,
    y = response,
    include = c(age, stage),
    method.args = list(family = binomial),
    exponentiate = TRUE,
    pvalue_fun = label_style_pvalue(digits = 2)
  ) |>
  add_global_p() |> 
  add_q() |> 
  bold_p(t = 0.10, q = TRUE) |> 
  bold_labels()
```
| Characteristic | N | OR | 95% CI | p-value | q-value¹ |
|---|---|---|---|---|---|
| Age | 183 | 1.02 | 1.00, 1.04 | 0.091 | 0.18 |
| T Stage | 193 | | | 0.58 | 0.58 |
|     T1 | | | | | |
|     T2 | | 0.63 | 0.27, 1.46 | | |
|     T3 | | 1.13 | 0.48, 2.68 | | |
|     T4 | | 0.83 | 0.36, 1.92 | | |

Abbreviations: CI = Confidence Interval, OR = Odds Ratio
¹ False discovery rate correction for multiple testing
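
The q-value column can be reproduced approximately with base R. This sketch assumes the correction behind the footnote is the Benjamini-Hochberg method (which is what p.adjust() calls "fdr") and uses the rounded global p-values from the table above; the vector names are illustrative.

```{r}
# Global p-values copied (rounded) from the table above
p_values <- c(age = 0.091, stage = 0.58)

# "fdr" is an alias for the Benjamini-Hochberg method in p.adjust()
q_values <- p.adjust(p_values, method = "fdr")
round(q_values, 2)
```

With two tests, the smaller p-value is multiplied by 2 / 1 and the larger by 2 / 2, so only the Age value changes.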

9.2 Continuous DV

```{r}
mpg |> 
  tbl_uvregression(
    method = lm,
    y = hwy,
    include = c(displ, cyl, drv),
    pvalue_fun = label_style_pvalue(digits = 2)
  ) 

# cf
mpg |> 
  lm(hwy ~ displ, data = _) |> 
  summary()

mpg |> 
  lm(hwy ~ cyl, data = _) |> 
  summary()

mpg |> 
  lm(hwy ~ drv, data = _) |> 
  summary()
```
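| Characteristic | N | Beta | 95% CI | p-value |
|---|---|---|---|---|
| displ | 234 | -3.5 | -3.9, -3.1 | <0.001 |
| cyl | 234 | -2.8 | -3.1, -2.5 | <0.001 |
| drv | 234 | | | |
|     4 | | | | |
|     f | | 9.0 | 7.9, 10 | <0.001 |
|     r | | 1.8 | 0.03, 3.6 | 0.047 |

Abbreviation: CI = Confidence Interval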

Call:
lm(formula = hwy ~ displ, data = mpg)

Residuals:
    Min      1Q  Median      3Q     Max 
-7.1039 -2.1646 -0.2242  2.0589 15.0105 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  35.6977     0.7204   49.55   <2e-16 ***
displ        -3.5306     0.1945  -18.15   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.836 on 232 degrees of freedom
Multiple R-squared:  0.5868,    Adjusted R-squared:  0.585 
F-statistic: 329.5 on 1 and 232 DF,  p-value: < 2.2e-16


Call:
lm(formula = hwy ~ cyl, data = mpg)

Residuals:
    Min      1Q  Median      3Q     Max 
-8.7579 -2.4968  0.2421  2.4379 15.2421 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  40.0190     0.9591   41.72   <2e-16 ***
cyl          -2.8153     0.1571  -17.92   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.865 on 232 degrees of freedom
Multiple R-squared:  0.5805,    Adjusted R-squared:  0.5787 
F-statistic: 321.1 on 1 and 232 DF,  p-value: < 2.2e-16


Call:
lm(formula = hwy ~ drv, data = mpg)

Residuals:
    Min      1Q  Median      3Q     Max 
-11.160  -2.175  -1.000   1.960  15.840 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  19.1748     0.4037  47.501   <2e-16 ***
drvf          8.9856     0.5668  15.852   <2e-16 ***
drvr          1.8252     0.9134   1.998   0.0469 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 4.097 on 231 degrees of freedom
Multiple R-squared:  0.5307,    Adjusted R-squared:  0.5266 
F-statistic: 130.6 on 2 and 231 DF,  p-value: < 2.2e-16
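
The three `lm()` calls above are what `tbl_uvregression()` repeats once per predictor. A minimal sketch of that loop in base R follows, assuming the `mpg` data from ggplot2. The names `predictors`, `uv_fits`, and `uv_coefs` are illustrative and not part of gtsummary.

```{r}
library(ggplot2) # provides the mpg data set

predictors <- c("displ", "cyl", "drv")

# Fit one simple regression of hwy on each predictor in turn
uv_fits <- lapply(predictors, function(x) {
  lm(reformulate(x, response = "hwy"), data = mpg)
})
names(uv_fits) <- predictors

# Keep the slope estimates only (drop the intercept), rounded to one decimal
uv_coefs <- lapply(uv_fits, function(fit) round(coef(fit)[-1], 1))
uv_coefs
```

The rounded slopes (-3.5 for `displ`, -2.8 for `cyl`, 9.0 and 1.8 for the `f` and `r` levels of `drv`) are the Beta values shown in the `tbl_uvregression()` table.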

10 Combining tables in columns or rows

  • tbl_merge(): places tables side by side, adding columns
  • tbl_stack(): places tables one below the other, adding rows
```{r}
r1 <- mpg |> 
  lm(hwy ~ displ + cyl + drv, data = _) |> 
  tbl_regression() |> 
  add_n() |> 
  add_vif() |> 
  bold_labels()

r2 <- mpg |> 
  lm(cty ~ displ + cyl + drv, data = _) |> 
  tbl_regression() |> 
  add_n() |> 
  add_vif() |> 
  bold_labels()

tbl_merge(list(r1, r2))
tbl_merge(list(r1, r2), tab_spanner = c("Highway Mileage", "City Mileage"))

tbl_stack(list(r1, r2))
tbl_stack(list(r1, r2), group_header = c("Highway Mileage", "City Mileage"))
```
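| Characteristic | Table 1: N | Table 1: Beta | Table 1: 95% CI | Table 1: p-value | Table 1: GVIF | Table 1: Adjusted GVIF¹ | Table 2: N | Table 2: Beta | Table 2: 95% CI | Table 2: p-value | Table 2: GVIF | Table 2: Adjusted GVIF¹ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| displ | 234 | -1.1 | -2.0, -0.22 | 0.016 | 9.4 | 3.1 | 234 | -0.74 | -1.4, -0.06 | 0.032 | 9.4 | 3.1 |
| cyl | 234 | -1.5 | -2.1, -0.80 | <0.001 | 7.6 | 2.8 | 234 | -1.3 | -1.8, -0.81 | <0.001 | 7.6 | 2.8 |
| drv | 234 | | | | 2.0 | 1.2 | 234 | | | | 2.0 | 1.2 |
| 4 | | | | | | | | | | | | |
| f | | 5.0 | 4.0, 6.1 | <0.001 | | | | 2.5 | 1.7, 3.3 | <0.001 | | |
| r | | 4.9 | 3.5, 6.3 | <0.001 | | | | 2.2 | 1.1, 3.2 | <0.001 | | |

Abbreviations: CI = Confidence Interval, GVIF = Generalized Variance Inflation Factor
¹ `GVIF^[1/(2*df)]`

| Characteristic | Highway Mileage: N | Highway Mileage: Beta | Highway Mileage: 95% CI | Highway Mileage: p-value | Highway Mileage: GVIF | Highway Mileage: Adjusted GVIF¹ | City Mileage: N | City Mileage: Beta | City Mileage: 95% CI | City Mileage: p-value | City Mileage: GVIF | City Mileage: Adjusted GVIF¹ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| displ | 234 | -1.1 | -2.0, -0.22 | 0.016 | 9.4 | 3.1 | 234 | -0.74 | -1.4, -0.06 | 0.032 | 9.4 | 3.1 |
| cyl | 234 | -1.5 | -2.1, -0.80 | <0.001 | 7.6 | 2.8 | 234 | -1.3 | -1.8, -0.81 | <0.001 | 7.6 | 2.8 |
| drv | 234 | | | | 2.0 | 1.2 | 234 | | | | 2.0 | 1.2 |
| 4 | | | | | | | | | | | | |
| f | | 5.0 | 4.0, 6.1 | <0.001 | | | | 2.5 | 1.7, 3.3 | <0.001 | | |
| r | | 4.9 | 3.5, 6.3 | <0.001 | | | | 2.2 | 1.1, 3.2 | <0.001 | | |

Abbreviations: CI = Confidence Interval, GVIF = Generalized Variance Inflation Factor
¹ `GVIF^[1/(2*df)]`

| Characteristic | N | Beta | 95% CI | p-value | GVIF | Adjusted GVIF¹ |
|---|---|---|---|---|---|---|
| displ | 234 | -1.1 | -2.0, -0.22 | 0.016 | 9.4 | 3.1 |
| cyl | 234 | -1.5 | -2.1, -0.80 | <0.001 | 7.6 | 2.8 |
| drv | 234 | | | | 2.0 | 1.2 |
| 4 | | | | | | |
| f | | 5.0 | 4.0, 6.1 | <0.001 | | |
| r | | 4.9 | 3.5, 6.3 | <0.001 | | |
| displ | 234 | -0.74 | -1.4, -0.06 | 0.032 | 9.4 | 3.1 |
| cyl | 234 | -1.3 | -1.8, -0.81 | <0.001 | 7.6 | 2.8 |
| drv | 234 | | | | 2.0 | 1.2 |
| 4 | | | | | | |
| f | | 2.5 | 1.7, 3.3 | <0.001 | | |
| r | | 2.2 | 1.1, 3.2 | <0.001 | | |

Abbreviations: CI = Confidence Interval, GVIF = Generalized Variance Inflation Factor
¹ `GVIF^[1/(2*df)]`

| Characteristic | N | Beta | 95% CI | p-value | GVIF | Adjusted GVIF¹ |
|---|---|---|---|---|---|---|
| Highway Mileage | | | | | | |
| displ | 234 | -1.1 | -2.0, -0.22 | 0.016 | 9.4 | 3.1 |
| cyl | 234 | -1.5 | -2.1, -0.80 | <0.001 | 7.6 | 2.8 |
| drv | 234 | | | | 2.0 | 1.2 |
| 4 | | | | | | |
| f | | 5.0 | 4.0, 6.1 | <0.001 | | |
| r | | 4.9 | 3.5, 6.3 | <0.001 | | |
| City Mileage | | | | | | |
| displ | 234 | -0.74 | -1.4, -0.06 | 0.032 | 9.4 | 3.1 |
| cyl | 234 | -1.3 | -1.8, -0.81 | <0.001 | 7.6 | 2.8 |
| drv | 234 | | | | 2.0 | 1.2 |
| 4 | | | | | | |
| f | | 2.5 | 1.7, 3.3 | <0.001 | | |
| r | | 2.2 | 1.1, 3.2 | <0.001 | | |

Abbreviations: CI = Confidence Interval, GVIF = Generalized Variance Inflation Factor
¹ `GVIF^[1/(2*df)]`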
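
A common use of `tbl_merge()` is to set univariable and multivariable estimates next to each other. The sketch below assumes the same `mpg` predictors used above. The object names `uv`, `mv`, and `uv_mv` and the spanner labels are illustrative choices.

```{r}
library(gtsummary)
library(ggplot2) # provides the mpg data set

# One simple regression per predictor
uv <- mpg |>
  tbl_uvregression(method = lm, y = hwy, include = c(displ, cyl, drv))

# All three predictors in a single model
mv <- lm(hwy ~ displ + cyl + drv, data = mpg) |>
  tbl_regression()

# Side by side: rows are matched on the variable, spanners label each block
uv_mv <- tbl_merge(
  list(uv, mv),
  tab_spanner = c("Univariable", "Multivariable")
)
uv_mv
```

Merging suits this case because both tables describe the same variables. Stacking would repeat the variable rows once per model.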

11 Themes

```{r}
# Default theme
mpg |> 
  lm(hwy ~ displ + cyl + drv, data = _) |> 
  tbl_regression() |> 
  add_n() |> 
  add_vif() |> 
  bold_labels()

# Journal of the American Medical Association (JAMA) theme
theme_gtsummary_journal(journal = "jama")

mpg |> 
  lm(hwy ~ displ + cyl + drv, data = _) |> 
  tbl_regression() |> 
  add_n() |> 
  add_vif() |> 
  bold_labels()

# Clear the JAMA theme before setting the next one
reset_gtsummary_theme()

# The Quarterly Journal of Economics theme
theme_gtsummary_journal(journal = "qjecon")

mpg |> 
  lm(hwy ~ displ + cyl + drv, data = _) |> 
  tbl_regression() |> 
  add_n() |> 
  add_vif() |> 
  bold_labels()

# Restore the default theme so later tables are unaffected
reset_gtsummary_theme()
```
| Characteristic | N | Beta | 95% CI | p-value | GVIF | Adjusted GVIF¹ |
|---|---|---|---|---|---|---|
| displ | 234 | -1.1 | -2.0, -0.22 | 0.016 | 9.4 | 3.1 |
| cyl | 234 | -1.5 | -2.1, -0.80 | <0.001 | 7.6 | 2.8 |
| drv | 234 | | | | 2.0 | 1.2 |
| 4 | | | | | | |
| f | | 5.0 | 4.0, 6.1 | <0.001 | | |
| r | | 4.9 | 3.5, 6.3 | <0.001 | | |

Abbreviations: CI = Confidence Interval, GVIF = Generalized Variance Inflation Factor
¹ `GVIF^[1/(2*df)]`

| Characteristic | N | Beta (95% CI) | p-value | GVIF | Adjusted GVIF¹ |
|---|---|---|---|---|---|
| displ | 234 | -1.1 (-2.0 to -0.22) | 0.016 | 9.4 | 3.1 |
| cyl | 234 | -1.5 (-2.1 to -0.80) | <0.001 | 7.6 | 2.8 |
| drv | 234 | | | 2.0 | 1.2 |
| 4 | | | | | |
| f | | 5.0 (4.0 to 6.1) | <0.001 | | |
| r | | 4.9 (3.5 to 6.3) | <0.001 | | |

Abbreviations: CI = Confidence Interval, GVIF = Generalized Variance Inflation Factor
¹ `GVIF^[1/(2*df)]`

| Characteristic | N | Beta (SE)¹ | GVIF | Adjusted GVIF² |
|---|---|---|---|---|
| displ | 234 | -1.1* (0.461) | 9.4 | 3.1 |
| cyl | 234 | -1.5*** (0.333) | 7.6 | 2.8 |
| drv | 234 | | 2.0 | 1.2 |
| 4 | | | | |
| f | | 5.0*** (0.513) | | |
| r | | 4.9*** (0.712) | | |

Abbreviations: GVIF = Generalized Variance Inflation Factor, SE = Standard Error
¹ `*p<0.05; **p<0.01; ***p<0.001`
² `GVIF^[1/(2*df)]`
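
A journal theme stays active until `reset_gtsummary_theme()` is called, so a forgotten reset silently restyles every later table. One way to guard against that is sketched below. `with_journal_theme()` and `make_hwy_table()` are hypothetical helpers written for this example, not gtsummary functions. The sketch assumes no other theme needs to be preserved, because the reset clears all active themes.

```{r}
library(gtsummary)
library(ggplot2) # provides the mpg data set

# Hypothetical helper: build a table under a journal theme, then always reset
with_journal_theme <- function(journal, make_table) {
  theme_gtsummary_journal(journal = journal)
  # Runs when the function exits, even if make_table() throws an error
  on.exit(reset_gtsummary_theme(), add = TRUE)
  make_table()
}

# Hypothetical table builder reused for each theme
make_hwy_table <- function() {
  lm(hwy ~ displ + cyl + drv, data = mpg) |>
    tbl_regression() |>
    bold_labels()
}

jama_tbl    <- with_journal_theme("jama", make_hwy_table)
qje_tbl     <- with_journal_theme("qjecon", make_hwy_table)
default_tbl <- make_hwy_table() # built after the reset, so default styling applies
```

Printing `jama_tbl`, `qje_tbl`, and `default_tbl` in turn should give the same three layouts as the tables above, without relying on manual resets between them.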

12 References