R has built-in support for many hypothesis tests, though we can also compute the results manually.
One-Sample Proportion Test
See: Z-test > One Sample Proportion Test
prop.test(x, n, p, alt = "two.sided", conf.level = 0.95, correct = TRUE)
- where
  - x is the observed count of successes (a count rather than a proportion!)
  - n is the sample size
  - p is the probability of success according to the null hypothesis
  - alt is the direction of the alternative hypothesis. Can be "less", "greater", or "two.sided"
  - conf.level = 0.95 is the confidence level of the returned confidence interval
  - correct = TRUE is whether to apply Yates' continuity correction
For example:
prop.test(19, 220, 0.1, alt = "two.sided", correct = FALSE)
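A minimal sketch of reading the result: prop.test() returns an object of class "htest", and its components can be extracted by name. The variable name res is only for illustration.

```r
# Run the example test and keep the result (an "htest" object)
res = prop.test(19, 220, 0.1, alt = "two.sided", correct = FALSE)

res$estimate   # sample proportion, 19 / 220
res$statistic  # chi-squared statistic (the squared z statistic when correct = FALSE)
res$p.value    # p-value of the test
res$conf.int   # confidence interval for the true proportion at conf.level
```

With one sample and correct = FALSE, the reported X-squared is the square of the z statistic from the manual calculation, so both routes should give the same two-sided p-value.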
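To compute the p-value manually, we first compute the test statistic as (OV - EV) / SE, where OV is the observed value and EV and SE are the expected value and standard error of the sample sum under the null hypothesis: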
box = ... # expected population distribution e.g. c(rep(0, 9), 1)
n = ... # Sample size. e.g. 42
OV = ... # observed value
box_mean = mean(box)
box_sd = rafalib::popsd(box) # Note that we need population standard deviation!
# EV and SE of the sample sum
EV = n * box_mean
SE = sqrt(n) * box_sd
# calculate test statistic
test_stat = (OV - EV)/SE
Then we use pnorm to calculate the chance of getting a test statistic as extreme or more extreme. The exact formula depends on the direction of the alternative hypothesis, that is, whether the test is one-tailed or two-tailed:
p_value = 2 * pnorm(-abs(test_stat)) # two-sided; valid for either sign of test_stat
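As a sketch of how the tail changes the pnorm call, the three cases can be written out for the prop.test example from earlier (19 successes out of 220, null proportion 0.1). This assumes a 0/1 box such as c(rep(0, 9), 1), so the population SD is computed in base R as sqrt(p0 * (1 - p0)) instead of calling rafalib::popsd. The names p0, p_less, p_greater, p_two_sided and p_builtin are illustrative.

```r
n = 220    # sample size
OV = 19    # observed number of successes
p0 = 0.1   # proportion of 1s in the box under the null hypothesis

EV = n * p0                          # expected value of the sample sum
SE = sqrt(n) * sqrt(p0 * (1 - p0))   # SE of the sample sum (population SD of a 0/1 box)
test_stat = (OV - EV) / SE

p_less = pnorm(test_stat)                          # alternative: less
p_greater = pnorm(test_stat, lower.tail = FALSE)   # alternative: greater
p_two_sided = 2 * pnorm(-abs(test_stat))           # alternative: two.sided

# Compare with the built-in test, continuity correction switched off
p_builtin = as.numeric(prop.test(OV, n, p0, alt = "two.sided", correct = FALSE)$p.value)
all.equal(p_two_sided, p_builtin)
```

The two one-sided p-values add up to 1, and the two-sided value is twice the smaller of them.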
T-Test
Use t.test()
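A minimal sketch of a one-sample t-test. The vector x is simulated purely for illustration, and the null mean of 5 is an arbitrary assumption, not a value from these notes.

```r
set.seed(1)                         # make the simulated data reproducible
x = rnorm(30, mean = 5.5, sd = 1)   # hypothetical sample

# H0: the population mean is 5; H1: it is not
res = t.test(x, mu = 5, alternative = "two.sided", conf.level = 0.95)

res$statistic   # t statistic
res$parameter   # degrees of freedom, length(x) - 1
res$p.value     # p-value of the test
```

As with prop.test, alternative can also be "less" or "greater" for a one-tailed test.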
Check Normality: Shapiro-Wilk Test
See: Shapiro-Wilk Test
shapiro.test(vec)
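A short sketch of running the check and reading the result. The data in vec are simulated here only so that the snippet runs on its own.

```r
set.seed(1)
vec = rnorm(50)   # hypothetical sample; shapiro.test accepts 3 to 5000 values

res = shapiro.test(vec)
res$statistic     # W statistic; values near 1 are consistent with normality
res$p.value       # a small p-value is evidence against normality
```

The null hypothesis of the Shapiro-Wilk test is that the data come from a normal distribution, so a large p-value means only that normality was not rejected.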