
This document is prepared automatically using the following R command.

Problem

Confidence interval of a sample proportion

The approach that we used to solve this problem is valid when the following conditions are met.

  • The sampling method must be simple random sampling. This condition is satisfied; the problem statement says that we used simple random sampling.

  • Each sample point can result in just two possible outcomes. We call one of these outcomes a success and the other, a failure.

  • The sample should include at least 10 successes and 10 failures. Suppose we classify a “more local news” response as a success, and any other response as a failure. Then we have 0.73 \(\times\) 100 = 73 successes and 0.27 \(\times\) 100 = 27 failures, which is plenty of both.

  • The population size is at least 20 times as big as the sample size. If the population size is much larger than the sample size, we can use an approximate formula for the standard deviation or the standard error. This condition is satisfied, so we will use one of the simpler approximate formulas.
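The sample-size conditions above can be checked with a few lines of base R. This is a minimal sketch: `n` and `p` come from the problem, while `N_assumed` is a hypothetical population size used only to illustrate the 20-times rule, since the source does not state the actual population size.

```r
# Values taken from the problem: sample size and sample proportion.
n <- 100
p <- 0.73

# Expected counts of successes and failures in the sample.
successes <- n * p        # about 73
failures  <- n * (1 - p)  # about 27

# The normal approximation is usually considered reasonable
# when both counts are at least 10.
enough_successes <- successes >= 10
enough_failures  <- failures >= 10

# ASSUMPTION: hypothetical population size, not given in the source.
N_assumed <- 10000
large_population <- N_assumed >= 20 * n

print(c(successes = successes, failures = failures))
print(c(enough_successes = enough_successes,
        enough_failures = enough_failures,
        large_population = large_population))
```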

Solution

This approach consists of four steps:

  • state the hypotheses

  • formulate an analysis plan

  • analyze sample data

  • interpret results.

1. State the hypotheses

The first step is to state the null hypothesis and an alternative hypothesis.

\[\text{Null hypothesis } (H_0): P = 0.8\]
\[\text{Alternative hypothesis } (H_1): P \neq 0.8\]

Note that these hypotheses constitute a two-tailed test. The null hypothesis will be rejected if the sample proportion is too big or if it is too small.

2. Formulate an analysis plan

For this analysis, the significance level is 0.01. The test method, shown in the next section, is a one-sample z-test.

Select a confidence level

In this analysis, the confidence level is defined for us in the problem. We are working with a 99% confidence level, which is the complement of the 0.01 significance level.
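The link between the significance level, the confidence level, and the two-tailed critical value can be sketched in base R. The variable names here are illustrative and are not taken from the package.

```r
# Significance level from the analysis plan.
alpha <- 0.01

# The confidence level is the complement of the significance level.
conf_level <- 1 - alpha

# For a two-tailed z-test, alpha is split equally between both tails,
# so the critical value is the (1 - alpha / 2) quantile of the
# standard normal distribution.
critical <- qnorm(1 - alpha / 2)

print(conf_level)
print(critical)
```

Under this plan, the null hypothesis would be rejected only if the absolute value of the z statistic exceeded `critical`.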

3. Analyze sample data

Using sample data, we calculate the standard deviation (sd) and compute the z-score test statistic (z).

\[sd=\sqrt{\frac{P\times(1-P)}{n}}\] \[sd=\sqrt{\frac{0.8\times(1-0.8)}{100}}=0.04\] \[z=\frac{p-P}{sd}=\frac{0.73-0.8}{0.04}=-1.75\] where \(P\) is the hypothesized value of population proportion in the null hypothesis, \(p\) is the sample proportion, and \(n\) is the sample size.
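The two formulas above can be reproduced directly in base R. This is a minimal sketch using the values from the problem; the name `sd_p` is chosen here only to avoid masking R's built-in `sd()` function.

```r
# Values from the problem statement.
P <- 0.8    # hypothesized population proportion under H0
p <- 0.73   # observed sample proportion
n <- 100    # sample size

# Standard deviation of the sampling distribution under H0.
sd_p <- sqrt(P * (1 - P) / n)

# z-score test statistic.
z <- (p - P) / sd_p

print(sd_p)
print(z)
```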

Since we have a two-tailed test, the P-value is the probability that the z statistic is less than -1.75 or greater than 1.75.

We can use the following R code to find the P-value.

\[\text{P-value}=\texttt{pnorm(-abs(-1.75))}\times 2=0.08\]
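Written out as a runnable snippet, the same calculation looks like the sketch below. The variable name `p_value` is illustrative; it is kept distinct from `p`, which denotes the sample proportion elsewhere in this document.

```r
# z statistic computed in the previous step.
z <- -1.75

# Two-tailed P-value: the area below -|z| doubled, which by symmetry
# also covers the area above +|z|.
p_value <- 2 * pnorm(-abs(z))

print(p_value)            # unrounded value
print(round(p_value, 2))  # rounded to two decimals, as in the text
```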

Alternatively, we can use the normal distribution curve to find the P-value.

draw_n(z=x$result$z,alternative=x$result$alternative)
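For readers without the package at hand, a comparable picture can be drawn with base R graphics. This is only a rough sketch of a shaded two-tailed region; it is not the implementation of `draw_n()`, and the plotting range and colours are arbitrary choices.

```r
# z statistic from the analysis step.
z <- -1.75

# Grid of z values and the standard normal density over it.
x <- seq(-4, 4, length.out = 400)
plot(x, dnorm(x), type = "l",
     xlab = "z", ylab = "Density",
     main = "Two-tailed P-value for z = -1.75")

# Grid points falling in each tail beyond |z|.
left  <- x[x <= -abs(z)]
right <- x[x >= abs(z)]

# Shade both tails; their combined area is the two-tailed P-value.
polygon(c(min(left), left, max(left)),
        c(0, dnorm(left), 0), col = "grey70", border = NA)
polygon(c(min(right), right, max(right)),
        c(0, dnorm(right), 0), col = "grey70", border = NA)

# Mark the observed statistic and its mirror image.
abline(v = c(-abs(z), abs(z)), lty = 2)
```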

4. Interpret results.

Since the P-value (0.08) is greater than the significance level (0.01), we cannot reject the null hypothesis.
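The decision rule can be expressed as a short comparison in base R. This is a minimal sketch; the `decision` variable and its wording are illustrative.

```r
# Significance level and two-tailed P-value from the earlier steps.
alpha   <- 0.01
p_value <- 2 * pnorm(-abs(-1.75))

# Reject H0 only when the P-value falls below the significance level.
decision <- if (p_value < alpha) "reject H0" else "fail to reject H0"

print(decision)
```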

Result of propCI()

$data

# A tibble: 1 × 1
  value
  <lgl>
1 NA   

$result
  alpha   n df    p   P   se critical        ME     lower     upper
1  0.01 100 99 0.73 0.8 0.04 2.575829 0.1030332 0.6269668 0.8330332
                      CI     z     pvalue alternative
1 0.73 [99CI 0.63; 0.83] -1.75 0.08011831   two.sided

$call
propCI(n = 100, p = 0.73, P = 0.8, alpha = 0.01)

attr(,"measure")
[1] "prop"
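The `critical`, `ME`, `lower`, and `upper` columns of `$result` can be reproduced in base R from the values shown in the output. This is a sketch of the arithmetic only, not the internals of `propCI()`. It takes `se = 0.04` as printed in the output, which matches the standard deviation computed from the hypothesized proportion.

```r
# Values as printed in the $result output.
alpha <- 0.01
p     <- 0.73
se    <- 0.04

# Two-tailed critical z value for a 99% confidence level.
critical <- qnorm(1 - alpha / 2)

# Margin of error and the interval bounds around the sample proportion.
ME    <- critical * se
lower <- p - ME
upper <- p + ME

print(c(critical = critical, ME = ME, lower = lower, upper = upper))
```

The interval from about 0.63 to 0.83 contains the hypothesized value 0.8, which is consistent with not rejecting the null hypothesis at the 0.01 level.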

Reference

The contents of this document are modified from StatTrek.com. Berman H.B., “AP Statistics Tutorial”, [online] Available at: https://stattrek.com/hypothesis-test/proportion.aspx?tutorial=AP [Accessed: 1/23/2022].