(two sample t-test for groups in data frame
here)
First we create a data frame, on which we can work
x <- rnorm(100)
y <- sample(1:3, 100, replace=T)
z <- sample(1:2, 100, replace=T)
df <- data.frame(group=y,sex=z,sds=x)
head(df)
group sex sds
1 1 2 1.3412663
2 1 1 0.8000326
3 2 2 0.0371029
4 1 2 0.2064684
5 1 2 1.6429816
6 3 1 1.3271138
In the following I use
ddply() from the plyr-package, so if you did not install and/or load it you have to do it now - the command
library() without any argument gives you a list of the installed packages:
install.packages("plyr") # install the package
library(plyr) # load it
Here is the function for doing the t.tests:
t.test.plyr <- function(x, var, mean=0 ){
y <- rep(NA,10)
y[6] <- nrow(x)[1] # count observations
if(nrow(x) < 2) return(y) # exits if too less observations
res <- t.test(x[var], mu=mean) # doing the test
y[1] <- res$statistic # extract values of interest
y[2] <- res$p.value
y[3] <- res$estimate
y[4] <- res$conf.int[1]
y[5] <- res$conf.int[2]
y[7] <- res$parameter
y[8] <- res$method
y[9] <- res$alternative
y[10] <- res$null.value
names(y) <- c("statistic","p.value","estimate","conf.int1", "conf.int2", "nobs","dof","method","alternative","null.value")
y
}
where
t.test.plyr() is a function of:
- x - the data frame
- var - the variable which should be testet
- mean - the mean of the t-test
now we can use ddply in the following way:
result <- ddply(df, .(group, sex), t.test.plyr, "sds")
result
group sex statistic p.value estimate
1 1 1 0.774917157596946 0.445379429071851 0.132214420988956
2 1 2 -0.415594591752269 0.68359099812399 -0.0987341821354776
3 2 1 0.130257003222609 0.899578855830346 0.0602600950757679
4 2 2 -0.0579045286518569 0.954965406753824 -0.0150105343694587
5 3 1 1.18044920537142 0.256202807853187 0.406567809727671
6 3 2 1.49215606704641 0.15126489039304 0.294865619363907
conf.int1 conf.int2 nobs dof method alternative
1 -0.218494854049804 0.482923696027717 27 26 One Sample t-test two.sided
2 -0.605109702463049 0.407641338192093 16 15 One Sample t-test two.sided
3 -1.00655416438508 1.12707435453662 9 8 One Sample t-test two.sided
4 -0.592608791299465 0.562587722560548 11 10 One Sample t-test two.sided
5 -0.327541518602652 1.14067713805799 16 15 One Sample t-test two.sided
6 -0.117342538637965 0.70707377736578 21 20 One Sample t-test two.sided
null.value
1 0
2 0
3 0
4 0
5 0
6 0
If you want to change the
mu in you t.test, add the argument mean:
result <- ddply(df, .(group, sex), t.test.plyr, "sds", mean=1)
result
If there is no need or you do not want a data frame as result you can just use
tapply(). tapply results in a list with as many elements as subsets, each element contains the results of each t.test (as list).
res <- tapply(df$sds, list(df$sex,df$group), t.test)
res
1 2 3
1 List,9 List,9 List,9
2 List,9 List,9 List,9
So the first element of
res contains the t.test results for sex=1 and group=1 etc pp
res[[1]]
One Sample t-test
data: X[[1L]]
t = 0.7749, df = 26, p-value = 0.4454
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
-0.2184949 0.4829237
sample estimates:
mean of x
0.1322144