Day 12: Inference for a single proportion or difference of two (independent) proportions (Sections 8.1-8.2)

BSTA 511/611

Week 7
Author: Meike Niederhausen, PhD

Affiliation: OHSU-PSU School of Public Health

Published: November 13, 2024

Load packages

  • Packages need to be loaded every time you restart R or render a Qmd file
Code
# run these every time you open Rstudio
library(tidyverse)    
library(oibiostat)
library(janitor)
library(rstatix)
library(knitr)
library(gtsummary)
library(moderndive)
library(gt)
library(broom) 
library(here) 
library(pwr) # new-ish
  • You can check whether a package has been loaded or not
    • by looking at the Packages tab and
    • seeing whether it has been checked off or not

MoRitz’s tip of the day: code folding

  • With code folding we can hide or show the code in the html output by clicking on the Code buttons in the html file.

  • Note the </> Code button on the top right of the html output.

  • See the new options in the yaml above (in the .qmd file).

code-fold: show code-tools: true source: repo

See more information at https://quarto.org/docs/output-formats/html-code.html#folding-code

Where are we?

CI’s and hypothesis tests for different scenarios:

$$\text{point estimate} \pm z^*(\text{or } t^*) \cdot SE, \qquad \text{test stat} = \frac{\text{point estimate} - \text{null value}}{SE}$$

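| Day | Book | Population parameter | Symbol | Point estimate | Symbol | SE |
|---|---|---|---|---|---|---|
| 10 | 5.1 | Pop mean | $\mu$ | Sample mean | $\bar{x}$ | $\frac{s}{\sqrt{n}}$ |
| 10 | 5.2 | Pop mean of paired diff | $\mu_d$ or $\delta$ | Sample mean of paired diff | $\bar{x}_d$ | $\frac{s_d}{\sqrt{n}}$ |
| 11 | 5.3 | Diff in pop means | $\mu_1-\mu_2$ | Diff in sample means | $\bar{x}_1-\bar{x}_2$ | $\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}$ or pooled |
| 12 | 8.1 | Pop proportion | $p$ | Sample prop | $\hat{p}$ | ??? |
| 12 | 8.2 | Diff in pop proportions | $p_1-p_2$ | Diff in sample proportions | $\hat{p}_1-\hat{p}_2$ | ??? |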

Goals for today (Sections 8.1-8.2)

  • Statistical inference for a single proportion or the difference of two (independent) proportions
    1. Sampling distribution for a proportion or difference in proportions

    2. What are H0 and Ha?

    3. What are the SE’s for $\hat{p}$ and $\hat{p}_1-\hat{p}_2$?

    4. Hypothesis test

    5. Confidence Interval

    6. How are the SE’s different for a hypothesis test & CI?

    7. How to run proportions tests in R

    8. Power & sample size for proportions tests (extra material)

Motivating example

One proportion

  • A 2010 study found that out of 269 male college students, 35% had participated in sports betting in the previous year.
    • What is the CI for the proportion?
    • The study also reported that 36% of noncollege young males had participated in sports betting. Is the proportion for male college students different from 0.36?

Two proportions

  • There were 214 men in the sample of noncollege young males (36% participated in sports betting in the previous year).
  • Compare the difference in proportions between the college and noncollege young males.
    • CI & Hypothesis test

Barnes GM, Welte JW, Hoffman JH, Tidwell MC. Comparisons of gambling and alcohol use among college students and noncollege young people in the United States. J Am Coll Health. 2010 Mar-Apr;58(5):443-52. doi: 10.1080/07448480903540499. PMID: 20304756; PMCID: PMC4104810.

Steps in a Hypothesis Test

  1. Set the level of significance α

  2. Specify the null ( H0 ) and alternative ( HA ) hypotheses

    1. In symbols
    2. In words
    3. Alternative: one- or two-sided?
  3. Calculate the test statistic.

  4. Calculate the p-value based on the observed test statistic and its sampling distribution

  5. Write a conclusion to the hypothesis test

    1. Do we reject or fail to reject H0?
    2. Write a conclusion in the context of the problem

Step 2: Null & Alternative Hypotheses

Null and alternative hypotheses in words and in symbols.

One sample test

  • H0: The population proportion of young male college students that participated in sports betting in the previous year is 0.36.

  • HA: The population proportion of young male college students that participated in sports betting in the previous year is not 0.36.

$$\begin{aligned} H_0 &: p = 0.36 \\ H_A &: p \neq 0.36 \end{aligned}$$

Two samples test

  • H0: The difference in population proportions of young male college and noncollege students that participated in sports betting in the previous year is 0.

  • HA: The difference in population proportions of young male college and noncollege students that participated in sports betting in the previous year is not 0.

$$\begin{aligned} H_0 &: p_{coll} - p_{noncoll} = 0 \\ H_A &: p_{coll} - p_{noncoll} \neq 0 \end{aligned}$$

One proportion inference

Sampling distribution of $\hat{p}$

  • $\hat{p}=\frac{X}{n}$ where $X$ is the number of “successes” and $n$ is the sample size.
  • $X \sim Bin(n,p)$, where $p$ is the population proportion.
  • For $n$ “big enough”, the normal distribution can be used to approximate a binomial distribution:

$$Bin(n,p) \rightarrow N\Big(\mu = np, \ \sigma = \sqrt{np(1-p)}\Big)$$

  • Since $\hat{p}=\frac{X}{n}$ is a linear transformation of $X$, we have for large $n$:

$$\hat{p} \sim N\Big(\mu_{\hat{p}} = p, \ \sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}\Big)$$

  • How we apply this result to CI’s and test statistics is different!!!
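The normal approximation above can be checked with a quick simulation. The sketch below is only illustrative: n = 269 and p = 0.36 are borrowed from the sports betting example, while the seed and the number of simulated samples are arbitrary choices that are not part of the original slides.

```r
# Simulate many sample proportions and compare them with the normal approximation
set.seed(511)            # arbitrary seed, only for reproducibility
n      <- 269            # sample size (from the sports betting example)
p      <- 0.36           # assumed "true" population proportion
n_sims <- 10000          # arbitrary number of simulated samples

# each simulated p-hat is X/n with X ~ Bin(n, p)
p_hats <- rbinom(n_sims, size = n, prob = p) / n

sim_mean  <- mean(p_hats)             # should be close to p
sim_sd    <- sd(p_hats)               # should be close to sqrt(p(1-p)/n)
theory_sd <- sqrt(p * (1 - p) / n)    # SD from the sampling distribution

c(sim_mean = sim_mean, sim_sd = sim_sd, theory_sd = theory_sd)
```

A histogram of `p_hats` (for example with `hist(p_hats)`) should look roughly bell-shaped and centered near p.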

Step 3: Test statistic

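Sampling distribution of $\hat{p}$ if we assume $H_0: p=p_0$ is true:

$$\hat{p} \sim N\Big(\mu_{\hat{p}} = p, \ \sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}\Big) \sim N\Big(\mu_{\hat{p}} = p_0, \ \sigma_{\hat{p}} = \sqrt{\frac{p_0(1-p_0)}{n}}\Big)$$

Test statistic for a one sample proportion test:

$$\text{test stat} = \frac{\text{point estimate} - \text{null value}}{SE} = z_{\hat{p}} = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}$$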


Example: A 2010 study found that out of 269 male college students, 35% had participated in sports betting in the previous year.

What is the test statistic when testing $H_0: p=0.36$ vs. $H_A: p \neq 0.36$?

Code
p0 <- 0.36
n <- 269
n*.35
[1] 94.15
Code
(ph <- 94/n)
[1] 0.3494424
Code
(SEp <- sqrt(p0*(1-p0)/n))
[1] 0.02926612
Code
(zp <- (ph-p0)/SEp)
[1] -0.3607455

$$z_{\hat{p}} = \frac{94/269 - 0.36}{\sqrt{\frac{0.36(1-0.36)}{269}}} = -0.3607455$$

Step “3b”: Conditions satisfied?

Conditions:

  1. Independent observations
    • The observations were collected independently.
  2. The number of expected successes and expected failures is at least 10.
    • $np_0 \geq 10, \ \ n(1-p_0) \geq 10$

Example: A 2010 study found that out of 269 male college students, 35% had participated in sports betting in the previous year.

Testing $H_0: p=0.36$ vs. $H_A: p \neq 0.36$.

Are the conditions satisfied?
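The success-failure condition can be checked with a couple of lines of arithmetic. This is a minimal sketch using the example's n = 269 and null value 0.36; independence cannot be verified by code and has to be judged from how the data were collected.

```r
# Expected counts under H0 for the one-sample test
n  <- 269
p0 <- 0.36

expected_successes <- n * p0          # 96.84
expected_failures  <- n * (1 - p0)    # 172.16

# TRUE for both entries means the condition is met
c(successes = expected_successes, failures = expected_failures) >= 10
```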

Step 4: p-value

The p-value is the probability of obtaining a test statistic just as extreme or more extreme than the observed test statistic assuming the null hypothesis H0 is true.

Calculate the p-value:

$$2 \cdot P(\hat{p} < 0.35) = 2 \cdot P\Bigg(Z_{\hat{p}} < \frac{94/269 - 0.36}{\sqrt{\frac{0.36(1-0.36)}{269}}}\Bigg) = 2 \cdot P(Z_{\hat{p}} < -0.3607455) = 0.7182897$$

Code
2*pnorm(-0.3607455)
[1] 0.7182897
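The calculation `2*pnorm(z)` works here because the observed test statistic is negative. A slightly more general sketch, which gives the two-sided p-value for a test statistic of either sign, takes the absolute value first (the object names below are just illustrative):

```r
# Two-sided p-value that works for negative or positive test statistics
z_obs <- -0.3607455

p_value     <- 2 * pnorm(-abs(z_obs))   # same result as 2*pnorm(z_obs) here
p_value_pos <- 2 * pnorm(-abs(-z_obs))  # a z of +0.3607455 gives the same p-value

c(p_value = p_value, p_value_pos = p_value_pos)
```

Without `abs()`, a positive test statistic would produce a "p-value" larger than 1.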

Step 5: Conclusion to hypothesis test

$$\begin{aligned} H_0 &: p = 0.36 \\ H_A &: p \neq 0.36 \end{aligned}$$

  • Recall the p-value = 0.7182897
  • Use α = 0.05.
  • Do we reject or fail to reject H0?

Conclusion statement:

  • Stats class conclusion
    • There is insufficient evidence that the (population) proportion of young male college students that participated in sports betting in the previous year is different from 0.36 ( p-value = 0.72).
  • More realistic manuscript conclusion:
    • In a sample of 269 male college students, 35% had participated in sports betting in the previous year, which is not significantly different from 36% ( p-value = 0.72).

95% CI for population proportion

What to use for SE in CI formula?

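$$\hat{p} \pm z^* \cdot SE_{\hat{p}}$$

Sampling distribution of $\hat{p}$:

$$\hat{p} \sim N\Big(\mu_{\hat{p}} = p, \ \sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}\Big)$$

Problem: We don’t know what $p$ is - it’s what we’re estimating with the CI.
Solution: approximate $p$ with $\hat{p}$:

$$SE_{\hat{p}} = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$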


Example: A 2010 study found that out of 269 male college students, 35% had participated in sports betting in the previous year.
Find the 95% CI for the population proportion.

$$94/269 \pm 1.96 \cdot SE_{\hat{p}}, \qquad SE_{\hat{p}} = \sqrt{\frac{(94/269)(1-94/269)}{269}}$$

Interpretation:
We are 95% confident that the (population) proportion of young male college students that participated in sports betting in the previous year is in (0.29, 0.41).
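The interval can be computed directly from the formula. This is a minimal sketch with the rounded count of 94 successes out of 269; the object names are illustrative, and `qnorm(0.975)` is used in place of the rounded 1.96.

```r
# 95% CI for one proportion, using p-hat in the SE
x <- 94
n <- 269

p_hat  <- x / n
se_ci  <- sqrt(p_hat * (1 - p_hat) / n)   # SE based on p-hat, not p0
z_star <- qnorm(0.975)                    # approximately 1.96

ci <- p_hat + c(-1, 1) * z_star * se_ci
round(ci, 2)
```

Note that `prop.test()` (used later) reports a Wilson score interval rather than this one, so its endpoints differ slightly.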

Conditions for one proportion: test vs. CI

Hypothesis test conditions

  1. Independent observations
    • The observations were collected independently.
  2. The number of expected successes and expected failures is at least 10.

$$np_0 \geq 10, \ \ n(1-p_0) \geq 10$$

Confidence interval conditions

  1. Independent observations
    • The observations were collected independently.
  2. The number of successes and failures is at least 10:

$$n\hat{p} \geq 10, \ \ n(1-\hat{p}) \geq 10$$

Inference for difference of two independent proportions $\hat{p}_1-\hat{p}_2$

Sampling distribution of $\hat{p}_1-\hat{p}_2$

  • $\hat{p}_1=\frac{X_1}{n_1}$ and $\hat{p}_2=\frac{X_2}{n_2}$,
    • $X_1$ & $X_2$ are the number of “successes”
    • $n_1$ & $n_2$ are the sample sizes of the 1st & 2nd samples

  • Each $\hat{p}$ can be approximated by a normal distribution, for “big enough” $n$
  • Since the difference of independent normal random variables is also normal, it follows that for “big enough” $n_1$ and $n_2$

$$\hat{p}_1-\hat{p}_2 \sim N\Bigg(\mu_{\hat{p}_1-\hat{p}_2} = p_1-p_2, \ \ \sigma_{\hat{p}_1-\hat{p}_2} = \sqrt{\frac{p_1(1-p_1)}{n_1}+\frac{p_2(1-p_2)}{n_2}}\Bigg)$$

where $p_1$ & $p_2$ are the population proportions of the 1st & 2nd populations, respectively.

  • How we apply this result to CI’s and test statistics is different!!!

Step 3: Test statistic (1/2)

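Sampling distribution of $\hat{p}_1-\hat{p}_2$:

$$\hat{p}_1-\hat{p}_2 \sim N\Bigg(\mu_{\hat{p}_1-\hat{p}_2} = p_1-p_2, \ \ \sigma_{\hat{p}_1-\hat{p}_2} = \sqrt{\frac{p_1(1-p_1)}{n_1}+\frac{p_2(1-p_2)}{n_2}}\Bigg)$$

Since we assume $H_0: p_1-p_2=0$ is true, we “pool” the proportions of the two samples to calculate the SE:

$$\text{pooled proportion} = \hat{p}_{pool} = \frac{\text{total number of successes}}{\text{total number of cases}} = \frac{x_1+x_2}{n_1+n_2}$$

Test statistic:

$$\text{test statistic} = z_{\hat{p}_1-\hat{p}_2} = \frac{\hat{p}_1-\hat{p}_2-0}{\sqrt{\frac{\hat{p}_{pool}(1-\hat{p}_{pool})}{n_1}+\frac{\hat{p}_{pool}(1-\hat{p}_{pool})}{n_2}}}$$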

Step 3: Test statistic (2/2)

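$$\text{test statistic} = z_{\hat{p}_1-\hat{p}_2} = \frac{\hat{p}_1-\hat{p}_2-0}{\sqrt{\frac{\hat{p}_{pool}(1-\hat{p}_{pool})}{n_1}+\frac{\hat{p}_{pool}(1-\hat{p}_{pool})}{n_2}}}$$

$$\text{pooled proportion} = \hat{p}_{pool} = \frac{\text{total number of successes}}{\text{total number of cases}} = \frac{x_1+x_2}{n_1+n_2}$$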


Example: A 2010 study found that out of 269 male college students, 35% had participated in sports betting in the previous year, and out of 214 noncollege young males 36% had.
What is the test statistic when testing $H_0: p_{coll}-p_{noncoll}=0$ vs. $H_A: p_{coll}-p_{noncoll} \neq 0$?

$$z_{\hat{p}_1-\hat{p}_2} = \frac{94/269-77/214-0}{\sqrt{0.354(1-0.354)\big(\frac{1}{269}+\frac{1}{214}\big)}} = -0.2367497$$
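The same calculation can be done step by step in R. A minimal sketch, using the rounded counts 94 (of 269) and 77 (of 214); the object names are illustrative.

```r
# Two-sample test statistic with the pooled proportion
x1 <- 94; n1 <- 269    # college: number who bet, sample size
x2 <- 77; n2 <- 214    # noncollege: number who bet, sample size

p_hat1 <- x1 / n1
p_hat2 <- x2 / n2

# pooled proportion: total successes / total cases
p_pool <- (x1 + x2) / (n1 + n2)

# SE under H0 uses the pooled proportion for both groups
se_pool <- sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))

z <- (p_hat1 - p_hat2 - 0) / se_pool
c(p_pool = p_pool, se_pool = se_pool, z = z)
```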

Step “3b”: Conditions satisfied?

Conditions:

  • Independent observations & samples
    • The observations were collected independently.
    • In particular, observations from the two groups weren’t paired in any meaningful way.
  • The number of expected successes and expected failures is at least 10 for each group - using the pooled proportion:
    • $n_1\hat{p}_{pool} \geq 10, \ \ n_1(1-\hat{p}_{pool}) \geq 10$
    • $n_2\hat{p}_{pool} \geq 10, \ \ n_2(1-\hat{p}_{pool}) \geq 10$

Example: A 2010 study found that out of 269 male college students, 35% had participated in sports betting in the previous year, and out of 214 noncollege young males 36% had.
Testing $H_0: p_{coll}-p_{noncoll}=0$ vs. $H_A: p_{coll}-p_{noncoll} \neq 0$.
Are the conditions satisfied?
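The expected counts for both groups can be checked in one step. This sketch assumes the rounded counts 94 and 77; independence of observations and of the two samples still has to be judged from the study design.

```r
# Expected successes and failures in each group, using the pooled proportion
n1 <- 269
n2 <- 214
p_pool <- (94 + 77) / (n1 + n2)

expected <- c(
  coll_success    = n1 * p_pool,
  coll_failure    = n1 * (1 - p_pool),
  noncoll_success = n2 * p_pool,
  noncoll_failure = n2 * (1 - p_pool)
)

expected
all(expected >= 10)   # TRUE means the success-failure condition is met
```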

Step 4: p-value

The p-value is the probability of obtaining a test statistic just as extreme or more extreme than the observed test statistic assuming the null hypothesis H0 is true.

Calculate the p-value:

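$$2 \cdot P(\hat{p}_1-\hat{p}_2 < 0.35-0.36) = 2 \cdot P\Bigg(Z_{\hat{p}_1-\hat{p}_2} < \frac{94/269-77/214-0}{\sqrt{0.354(1-0.354)\big(\frac{1}{269}+\frac{1}{214}\big)}}\Bigg) = 2 \cdot P(Z_{\hat{p}_1-\hat{p}_2} < -0.2367497)$$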

Code
2*pnorm(-0.2367497)
[1] 0.812851

Step 5: Conclusion to hypothesis test

$$\begin{aligned} H_0 &: p_{coll} - p_{noncoll} = 0 \\ H_A &: p_{coll} - p_{noncoll} \neq 0 \end{aligned}$$

  • Recall the p-value = 0.812851
  • Use α = 0.05.
  • Do we reject or fail to reject H0?

Conclusion statement:

  • Stats class conclusion
    • There is insufficient evidence that the (population) proportions of young male college and noncollege students that participated in sports betting in the previous year are different ( p-value = 0.81).
  • More realistic manuscript conclusion:
    • 35% of young male college students (n=269) and 36% of noncollege young males (n=214) participated in sports betting in the previous year ( p-value = 0.81).

95% CI for population difference in proportions

What to use for SE in CI formula?

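$$\hat{p}_1-\hat{p}_2 \pm z^* \cdot SE_{\hat{p}_1-\hat{p}_2}$$

SE in sampling distribution of $\hat{p}_1-\hat{p}_2$

$$\sigma_{\hat{p}_1-\hat{p}_2} = \sqrt{\frac{p_1(1-p_1)}{n_1}+\frac{p_2(1-p_2)}{n_2}}$$

Problem: We don’t know what $p_1$ and $p_2$ are - their difference is what we’re estimating with the CI.
Solution: approximate $p_1$, $p_2$ with $\hat{p}_1$, $\hat{p}_2$:

$$SE_{\hat{p}_1-\hat{p}_2} = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1}+\frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$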


Example: A 2010 study found that out of 269 male college students, 35% had participated in sports betting in the previous year, and out of 214 noncollege young males 36% had. Find the 95% CI for the difference in population proportions.

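$$\frac{94}{269}-\frac{77}{214} \pm 1.96 \cdot SE_{\hat{p}_1-\hat{p}_2}$$

$$SE_{\hat{p}_1-\hat{p}_2} = \sqrt{\frac{94/269 \cdot (1-94/269)}{269}+\frac{77/214 \cdot (1-77/214)}{214}}$$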

Interpretation:
We are 95% confident that the difference in (population) proportions of young male college and noncollege students that participated in sports betting in the previous year is in (-0.096, 0.076).
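The interval follows directly from the two formulas above. A minimal sketch, again with the rounded counts 94 (of 269) and 77 (of 214), illustrative object names, and `qnorm(0.975)` in place of the rounded 1.96:

```r
# 95% CI for the difference in proportions (college - noncollege)
x1 <- 94; n1 <- 269
x2 <- 77; n2 <- 214

p_hat1 <- x1 / n1
p_hat2 <- x2 / n2

# unpooled SE: each group uses its own sample proportion
se_diff <- sqrt(p_hat1 * (1 - p_hat1) / n1 + p_hat2 * (1 - p_hat2) / n2)

ci_diff <- (p_hat1 - p_hat2) + c(-1, 1) * qnorm(0.975) * se_diff
round(ci_diff, 3)
```

Because the interval contains 0, it is consistent with failing to reject $H_0$ in the hypothesis test.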

Conditions for difference in proportions: test vs. CI

Hypothesis test conditions

  1. Independent observations & samples
    • The observations were collected independently.
    • In particular, observations from the two groups weren’t paired in any meaningful way.
  2. The number of expected successes and expected failures is at least 10 for each group - using the pooled proportion:
    • $n_1\hat{p}_{pool} \geq 10, \ \ n_1(1-\hat{p}_{pool}) \geq 10$
    • $n_2\hat{p}_{pool} \geq 10, \ \ n_2(1-\hat{p}_{pool}) \geq 10$

Confidence interval conditions

  1. Independent observations & samples
    • The observations were collected independently.
    • In particular, observations from the two groups weren’t paired in any meaningful way.
  2. The number of successes and failures is at least 10 for each group.
    • $n_1\hat{p}_1 \geq 10, \ \ n_1(1-\hat{p}_1) \geq 10$
    • $n_2\hat{p}_2 \geq 10, \ \ n_2(1-\hat{p}_2) \geq 10$

1- and 2-sample proportions tests in R

  • prop.test
  • Here we use prop.test with a dataset
    • Create a dataset based on the summary stats if you do not have one
  • The input of prop.test is then a table() of the dataset
  • Continuity correction
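When only summary statistics are available, `prop.test()` can also be given the counts directly: `x` as the number of successes and `n` as the number of trials. A minimal sketch, using the rounded count of 94 out of 269 from the example (the object name `res_counts` is just illustrative):

```r
# One-sample proportion test from counts, without building a dataset
res_counts <- prop.test(x = 94, n = 269,
                        p = 0.36,
                        alternative = "two.sided",
                        correct = FALSE)

res_counts$estimate   # sample proportion 94/269
res_counts$p.value    # two-sided p-value
```

This should give the same test result as the table-based approach shown in the next slides.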

R: 1-sample proportion test (1/3)

Create a dataset based on the results:

Code
.35*269 # number of "successes"
[1] 94.15
Code
# round this value

SportsBet1 <- tibble(
  Coll = c(rep("Bet", 94), 
           rep("NotBet",269-94))
  )
glimpse(SportsBet1)
Rows: 269
Columns: 1
$ Coll <chr> "Bet", "Bet", "Bet", "Bet", "Bet", "Bet", "Bet", "Bet", "Bet", "B…
Code
SportsBet1 %>% tabyl(Coll)
   Coll   n   percent
    Bet  94 0.3494424
 NotBet 175 0.6505576

R code for proportions test requires input as a base R table:

Code
table(SportsBet1$Coll)

   Bet NotBet 
    94    175 

R: 1-sample proportion test (2/3)

Here the input x to prop.test is a table:

Code
prop.test(x = table(SportsBet1$Coll),
       alternative = "two.sided",
       p = 0.36,
       correct = FALSE)

    1-sample proportions test without continuity correction

data:  table(SportsBet1$Coll), null probability 0.36
X-squared = 0.13014, df = 1, p-value = 0.7183
alternative hypothesis: true p is not equal to 0.36
95 percent confidence interval:
 0.2949476 0.4081767
sample estimates:
        p 
0.3494424 
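The output reports `X-squared` rather than a z statistic. Without the continuity correction, this chi-squared statistic is the square of the z test statistic calculated by hand earlier, which is why the p-values agree. A short sketch to confirm this (counts are passed directly to `prop.test()` here for brevity):

```r
# z statistic by hand vs. X-squared from prop.test
z <- (94 / 269 - 0.36) / sqrt(0.36 * (1 - 0.36) / 269)

res    <- prop.test(x = 94, n = 269, p = 0.36, correct = FALSE)
chi_sq <- unname(res$statistic)

c(z = z, z_squared = z^2, X_squared = chi_sq)
```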

R: 1-sample proportion test: with vs. without CC (3/3)

Apply a continuity correction (CC) to the p-value calculation.

Code
prop.test(x = table(SportsBet1$Coll), alternative = "two.sided",
       p = 0.36, correct = FALSE) %>% tidy() %>% gt()
estimate statistic p.value parameter conf.low conf.high method alternative
0.3494424 0.1301373 0.7182897 1 0.2949476 0.4081767 1-sample proportions test without continuity correction two.sided
Code
prop.test(x = table(SportsBet1$Coll), alternative = "two.sided",
       p = 0.36, correct = TRUE) %>% tidy() %>% gt()
estimate statistic p.value parameter conf.low conf.high method alternative
0.3494424 0.08834805 0.7662879 1 0.2931841 0.4100774 1-sample proportions test with continuity correction two.sided

Differences are small when sample sizes are large.
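To see what the continuity correction does, the two statistics above can be reproduced by hand. The sketch below subtracts 0.5 from the absolute difference between observed and expected counts before squaring; it assumes that this difference is larger than 0.5, as it is in this example, and the object names are illustrative.

```r
# Chi-squared statistic with and without the 0.5 continuity correction
x <- 94; n <- 269; p0 <- 0.36

observed <- c(x, n - x)              # successes, failures
expected <- c(n * p0, n * (1 - p0))  # expected counts under H0

stat_no_cc <- sum(abs(observed - expected)^2 / expected)
stat_cc    <- sum((abs(observed - expected) - 0.5)^2 / expected)

# p-values from the chi-squared distribution with 1 df
p_no_cc <- pchisq(stat_no_cc, df = 1, lower.tail = FALSE)
p_cc    <- pchisq(stat_cc, df = 1, lower.tail = FALSE)

c(stat_no_cc = stat_no_cc, stat_cc = stat_cc, p_no_cc = p_no_cc, p_cc = p_cc)
```

The fixed 0.5 adjustment matters less as the counts grow, which matches the remark that the differences are small for large samples.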

R: 2-samples proportions test (1/3)

We first need a dataset based on the results:

Code
.35*269 # number of "successes"
[1] 94.15
Code
.36*214 # round these values
[1] 77.04
Code
SportsBet2 <- tibble(
  Group = c(rep("College", 269), 
         rep("NonCollege", 214)),
  Bet = c(rep("yes", 94), 
          rep("no", 269-94),
          rep("yes", 77), 
          rep("no", 214-77))
)
glimpse(SportsBet2)
Rows: 483
Columns: 2
$ Group <chr> "College", "College", "College", "College", "College", "College"…
$ Bet   <chr> "yes", "yes", "yes", "yes", "yes", "yes", "yes", "yes", "yes", "…
Code
SportsBet2 %>% tabyl(Group, Bet)
      Group  no yes
    College 175  94
 NonCollege 137  77

R code for proportions test requires input as a base R table:

Code
table(SportsBet2$Group, SportsBet2$Bet)
            
              no yes
  College    175  94
  NonCollege 137  77

R: 2-samples proportions test (2/3)

Here the input x to prop.test is a table:

Code
prop.test(x = table(SportsBet2$Group, SportsBet2$Bet),
       alternative = "two.sided",
       correct = FALSE)

    2-sample test for equality of proportions without continuity correction

data:  table(SportsBet2$Group, SportsBet2$Bet)
X-squared = 0.05605, df = 1, p-value = 0.8129
alternative hypothesis: two.sided
95 percent confidence interval:
 -0.07554399  0.09628540
sample estimates:
   prop 1    prop 2 
0.6505576 0.6401869 
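Note that `prop 1` and `prop 2` in the output above are about 0.65 and 0.64: because "no" is the first column of the table, `prop.test()` treats "no" as the success, so the estimates and CI are for the proportions that did not bet. The p-value is unaffected. One possible way to get estimates for the proportion that bet is to put the "yes" counts in the first column, sketched here with a hand-built matrix (the name `bet_counts` is illustrative):

```r
# 2 x 2 matrix of counts with successes ("yes") in the first column
bet_counts <- matrix(c(94, 175,
                       77, 137),
                     nrow = 2, byrow = TRUE,
                     dimnames = list(Group = c("College", "NonCollege"),
                                     Bet   = c("yes", "no")))

res_yes <- prop.test(bet_counts, alternative = "two.sided", correct = FALSE)

res_yes$estimate   # proportions that bet in each group
res_yes$conf.int   # CI for p_coll - p_noncoll
res_yes$p.value    # same p-value as before
```

The resulting CI is the mirror image (signs flipped) of the one printed above.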

R: 2-samples proportions test: with vs. without CC (3/3)

Apply a continuity correction (CC) to the p-value calculation.

Code
prop.test(x = table(SportsBet2$Group, SportsBet2$Bet), alternative = "two.sided", 
          correct = FALSE) %>% tidy() %>% gt()
estimate1 estimate2 statistic p.value parameter conf.low conf.high method alternative
0.6505576 0.6401869 0.05605044 0.8128509 1 -0.07554399 0.0962854 2-sample test for equality of proportions without continuity correction two.sided
Code
prop.test(x = table(SportsBet2$Group, SportsBet2$Bet), alternative = "two.sided", 
          correct = TRUE) %>% tidy() %>% gt()
estimate1 estimate2 statistic p.value parameter conf.low conf.high method alternative
0.6505576 0.6401869 0.01987511 0.8878864 1 -0.07973918 0.1004806 2-sample test for equality of proportions with continuity correction two.sided

Differences are small when sample sizes are large.

Power & sample size
for testing proportions

Sample size calculation for testing one proportion

  • Recall in our sports betting example that the null $p_0=0.36$ and the observed proportion was $\hat{p}=0.35$.
    • The p-value from the hypothesis test was not significant.
    • How big would the sample size n need to be in order for the p-value to be significant?
  • Calculate n
    • given $\alpha$, power ( $1-\beta$ ), “true” alternative proportion $p$, and null $p_0$:

$$n = p(1-p)\Bigg(\frac{z_{1-\alpha/2}+z_{1-\beta}}{p-p_0}\Bigg)^2$$

Code
p <- 0.35
p0 <- 0.36
alpha <- 0.05
beta <- 0.20  #power=1-beta; want >=80% power
n <- p*(1-p)*((qnorm(1-alpha/2) + qnorm(1-beta)) /
                (p-p0))^2
n
[1] 17856.2
Code
ceiling(n) 
[1] 17857

We would need a sample size of at least 17,857!
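To see how strongly the required sample size depends on the assumed alternative, the formula can be wrapped in a small function. The function name `n_one_prop` and the alternative values 0.33 and 0.30 are illustrative choices, not part of the original example.

```r
# Sample size for a two-sided one-proportion test (formula above), rounded up
n_one_prop <- function(p, p0, alpha = 0.05, power = 0.80) {
  n <- p * (1 - p) *
    ((qnorm(1 - alpha / 2) + qnorm(power)) / (p - p0))^2
  ceiling(n)
}

n_one_prop(p = 0.35, p0 = 0.36)   # the example from this slide

# required n shrinks quickly as the alternative moves away from p0
sapply(c(0.35, 0.33, 0.30), n_one_prop, p0 = 0.36)
```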

Power calculation for testing one proportion

Conversely, we can calculate how much power we had in our example given the sample size of 269.

  • Calculate power,
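    • given $\alpha$, $n$, “true” alternative proportion $p$, and null $p_0$

$$1-\beta = \Phi\big(z-z_{1-\alpha/2}\big)+\Phi\big(-z-z_{1-\alpha/2}\big), \quad \text{where } z = \frac{p-p_0}{\sqrt{\frac{p(1-p)}{n}}}$$

$\Phi$ is the cumulative distribution function (CDF) of the standard normal distribution, i.e. $\Phi(a) = P(Z \leq a)$ (pnorm in R)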

Code
p <- 0.35; p0 <- 0.36; alpha <- 0.05; n <- 269
(z <- (p-p0)/sqrt(p*(1-p)/n))
[1] -0.343863
Code
(Power <- pnorm(z - qnorm(1-alpha/2)) +  pnorm(-z - qnorm(1-alpha/2)))
[1] 0.06365242

If the population proportion is 0.35 instead of 0.36, we only have a 6.4% chance of correctly rejecting H0 when the sample size is 269.
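The power formula can also be written as a function so that power can be compared across sample sizes. The function name `power_one_prop` and the intermediate sample sizes 1000 and 5000 are illustrative assumptions.

```r
# Power of a two-sided one-proportion test (formula above)
power_one_prop <- function(n, p, p0, alpha = 0.05) {
  z      <- (p - p0) / sqrt(p * (1 - p) / n)
  z_crit <- qnorm(1 - alpha / 2)
  pnorm(z - z_crit) + pnorm(-z - z_crit)
}

power_one_prop(n = 269, p = 0.35, p0 = 0.36)   # the example from this slide

# power increases with n; 17857 is the sample size found earlier for 80% power
power_one_prop(n = c(269, 1000, 5000, 17857), p = 0.35, p0 = 0.36)
```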

R package pwr for power analyses

  • Specify all parameters except for the one being solved for.

  • One proportion

pwr.p.test(h = NULL, n = NULL, sig.level = 0.05, power = NULL,       alternative = c("two.sided","less","greater"))

  • Two proportions (same sample sizes)

pwr.2p.test(h = NULL, n = NULL, sig.level = 0.05, power = NULL,       alternative = c("two.sided","less","greater"))

  • Two proportions (different sample sizes)

pwr.2p2n.test(h = NULL, n1 = NULL, n2 = NULL, sig.level = 0.05, power = NULL,       alternative = c("two.sided", "less","greater"))


h is the effect size, and calculated using an arcsine transformation:

$$h = \text{ES.h(p1, p2)} = 2\arcsin(\sqrt{p_1})-2\arcsin(\sqrt{p_2})$$
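The effect size can be computed by hand in base R from the arcsine formula, which may help when interpreting the `h` printed by the pwr functions. The sketch below uses the example's proportions 0.36 and 0.35; the last line is a normal-approximation formula for the one-sample sample size that is assumed here for illustration and should only be close to, not identical to, what `pwr.p.test()` reports.

```r
# Effect size h via the arcsine transformation
p1 <- 0.36
p2 <- 0.35
h  <- 2 * asin(sqrt(p1)) - 2 * asin(sqrt(p2))
h

# approximate n for 80% power at alpha = 0.05 (two-sided, one sample)
n_approx <- ((qnorm(1 - 0.05 / 2) + qnorm(0.80)) / h)^2
n_approx
```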

See PASS documentation for

pwr: sample size for one proportion test

pwr.p.test(h = NULL, n = NULL, sig.level = 0.05, power = NULL,       alternative = c("two.sided","less","greater"))

  • h is the effect size: h = ES.h(p1, p2)
    • p1 and p2 are the two proportions being tested
    • one of them is the null proportion p0, and the other is the alternative proportion

Specify all parameters except for the sample size:

Code
library(pwr)

p.n <- pwr.p.test(
  h = ES.h(p1 = 0.36, p2 = 0.35),
  sig.level = 0.05, 
  power = 0.80, 
  alternative = "two.sided")
p.n

     proportion power calculation for binomial distribution (arcsine transformation) 

              h = 0.02089854
              n = 17971.09
      sig.level = 0.05
          power = 0.8
    alternative = two.sided
Code
plot(p.n)

pwr: power for one proportion test

pwr.p.test(h = NULL, n = NULL, sig.level = 0.05, power = NULL,       alternative = c("two.sided","less","greater"))

  • h is the effect size: h = ES.h(p1, p2)
    • p1 and p2 are the two proportions being tested
    • one of them is the null proportion p0, and the other is the alternative proportion

Specify all parameters except for the power:

Code
library(pwr)

p.power <- pwr.p.test(
  h = ES.h(p1 = 0.36, p2 = 0.35),
  sig.level = 0.05, 
  # power = 0.80, 
  n = 269,
  alternative = "two.sided")
p.power

     proportion power calculation for binomial distribution (arcsine transformation) 

              h = 0.02089854
              n = 269
      sig.level = 0.05
          power = 0.06356445
    alternative = two.sided
Code
plot(p.power)

pwr: sample size for two proportions test

  • Two proportions (same sample sizes)

pwr.2p.test(h = NULL, n = NULL, sig.level = 0.05, power = NULL,       alternative = c("two.sided","less","greater"))

  • h is the effect size: h = ES.h(p1, p2); p1 and p2 are the two proportions being tested

Specify all parameters except for the sample size:

Code
p2.n <- pwr.2p.test(
  h = ES.h(p1 = 0.36, p2 = 0.35),
  sig.level = 0.05, 
  power = 0.80, 
  alternative = "two.sided")
p2.n

     Difference of proportion power calculation for binomial distribution (arcsine transformation) 

              h = 0.02089854
              n = 35942.19
      sig.level = 0.05
          power = 0.8
    alternative = two.sided

NOTE: same sample sizes

Note: n in output is the number per sample!

Code
plot(p2.n)

pwr: power for two proportions test

  • Two proportions (different sample sizes)

pwr.2p2n.test(h = NULL, n1 = NULL, n2 = NULL, sig.level = 0.05, power = NULL,       alternative = c("two.sided", "less","greater"))

  • h is the effect size: h = ES.h(p1, p2); p1 and p2 are the two proportions being tested

Specify all parameters except for the power:

Code
p2.n2 <- pwr.2p2n.test(
  h = ES.h(p1 = 0.36, p2 = 0.35),
  n1 = 214,
  n2 = 269,
  sig.level = 0.05, 
  # power = 0.80, 
  alternative = "two.sided")
p2.n2

     difference of proportion power calculation for binomial distribution (arcsine transformation) 

              h = 0.02089854
             n1 = 214
             n2 = 269
      sig.level = 0.05
          power = 0.05598413
    alternative = two.sided

NOTE: different sample sizes

Note: n in output is the number per sample!

Code
plot(p2.n2)

Where are we?

CI’s and hypothesis tests for different scenarios:

$$\text{point estimate} \pm z^*(\text{or } t^*) \cdot SE, \qquad \text{test stat} = \frac{\text{point estimate} - \text{null value}}{SE}$$

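| Day | Book | Population parameter | Symbol | Point estimate | Symbol | SE |
|---|---|---|---|---|---|---|
| 10 | 5.1 | Pop mean | $\mu$ | Sample mean | $\bar{x}$ | $\frac{s}{\sqrt{n}}$ |
| 10 | 5.2 | Pop mean of paired diff | $\mu_d$ or $\delta$ | Sample mean of paired diff | $\bar{x}_d$ | $\frac{s_d}{\sqrt{n}}$ |
| 11 | 5.3 | Diff in pop means | $\mu_1-\mu_2$ | Diff in sample means | $\bar{x}_1-\bar{x}_2$ | $\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}$ or pooled |
| 12 | 8.1 | Pop proportion | $p$ | Sample prop | $\hat{p}$ | $\sqrt{\frac{p(1-p)}{n}}$ |
| 12 | 8.2 | Diff in pop proportions | $p_1-p_2$ | Diff in sample proportions | $\hat{p}_1-\hat{p}_2$ | $\sqrt{\frac{p_1(1-p_1)}{n_1}+\frac{p_2(1-p_2)}{n_2}}$ |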