What made me learn R in my second semester of statistics was that my professor actually gave us references with the exact commands, all the variables, and what they meant. He EXPLAINED it. He didn't make us scour the internets for what the hell we were supposed to do for a two-sample t-test. He gave us the commands. He expected us to know how to use them. Did you explain to your boss that Student's t-test is basically an ANOVA for only 2 samples, and that with more than two groups you run an ANOVA and follow it up with Tukey's HSD (which is really not that hard in R)? This is the thingamabobber I typed up for the sophomore, published here just in case anyone else on Hubski wants a mini-statistics and mini-R lesson:
IMPORTING YOUR DATA INTO R

Say you have a set of data. Before you do anything with this data at all, make sure the variables are in columns, not rows (i.e. the headers are at the top of each column and the data goes down the column). This makes working with the data a lot easier.

Let's assume you use Microsoft Excel 2007 or later, because that's common, and that you have this little sheet of data, called dogbones.csv (save it as a .csv, because R hates Excel. Also, save it in your My Documents folder, because on Windows that is usually where R looks for files by default). Even though the data's going to be typed out here in rows because this is an email, imagine that it is in columns.

Dogs: 1 2 3 4 5
FemurLengths: 14 15 16 17 18

So dog 1 has femur length 14, dog 5 has femur length 18, and so on.

To import this data into R, there are a few ways to do it. If you have a small dataset with one IV and one DV, you can type:

dogs <- c(1, 2, 3, 4, 5)
fmrlengths <- c(14, 15, 16, 17, 18)

and ignore the more complicated dataset commands.

If you have a huge dataset, you type:

dogbones <- read.csv("dogbones.csv", header=TRUE)

(As an aside here, before you import a data set into R, it's a lot easier if the variable names are kinda condensed.)

To get the same sets as 'dogs' and 'fmrlengths', you type:

dogs <- dogbones$Dogs
fmrlengths <- dogbones$FemurLengths

TESTS

One-sample T-test: Let's say we have a sample from a population and we want to see whether its mean is significantly different from 3.

t.test(population, mu=3)

Two-sample T-test: Let's say we're comparing heights of men and women, and that these variables are coded as 'hmales' and 'hfemales', and that we're assuming equal variance. All you type is:

t.test(hmales, hfemales, var.equal=TRUE)

Pearson's chi-square: For a 2-way table called 'townsmog' where the rows are towns A and B and the columns are 'bothered by pollution' and 'not bothered by pollution':

chisq.test(townsmog)

Correlation test: Is 'money' correlated with 'intelligence'?

cor.test(money, intelligence)

Linear regression test: Assuming 'money' is correlated with 'intelligence', how much does 'intelligence' determine 'money', and is this significant?

moneyIQreg <- lm(money ~ intelligence)
summary(moneyIQreg)
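If you want to run the whole lesson end to end, here is a rough sketch with made-up numbers. Only the commands and the names dogbones, dogs, fmrlengths, population, hmales, hfemales, townsmog, money, intelligence and moneyIQreg come from the lesson. Every data value, plus the names csv_path, heights, femurs and breed, is invented for illustration. It writes dogbones.csv to a temp folder so it doesn't care where your working directory is. It also checks the "t-test is an ANOVA for 2 samples" idea (with equal variance assumed, the ANOVA F should equal t squared) and shows Tukey's HSD on a hypothetical three-group dataset.

```r
# Sketch only: every number below is made up so the commands have data to run on.

# Write the dog data to a temporary CSV so nothing depends on the working directory
csv_path <- file.path(tempdir(), "dogbones.csv")
write.csv(data.frame(Dogs = 1:5, FemurLengths = c(14, 15, 16, 17, 18)),
          csv_path, row.names = FALSE)
dogbones <- read.csv(csv_path, header = TRUE)
dogs <- dogbones$Dogs
fmrlengths <- dogbones$FemurLengths

# One-sample t-test against a mean of 3 (hypothetical sample)
population <- c(2.9, 3.4, 3.1, 2.7, 3.6, 3.2)
one_sample <- t.test(population, mu = 3)

# Two-sample t-test with equal variance assumed (hypothetical heights, cm)
hmales <- c(175, 180, 178, 182, 177)
hfemales <- c(162, 168, 165, 170, 164)
two_sample <- t.test(hmales, hfemales, var.equal = TRUE)

# The same two groups as a one-way ANOVA: F should equal t squared
heights <- data.frame(height = c(hmales, hfemales),
                      sex = factor(rep(c("male", "female"), each = 5)))
height_aov <- aov(height ~ sex, data = heights)
f_value <- summary(height_aov)[[1]][["F value"]][1]

# Tukey's HSD is the follow-up once there are 3+ groups (hypothetical breeds)
femurs <- data.frame(
  femur = c(14, 15, 16, 17, 18, 20, 21, 22, 23, 24, 15, 16, 17, 18, 19),
  breed = factor(rep(c("beagle", "collie", "terrier"), each = 5))
)
breed_tukey <- TukeyHSD(aov(femur ~ breed, data = femurs))

# Pearson's chi-square on a 2-way table of counts (hypothetical counts);
# for a 2x2 table chisq.test applies Yates' continuity correction by default
townsmog <- matrix(c(30, 20, 15, 35), nrow = 2, byrow = TRUE,
                   dimnames = list(town = c("A", "B"),
                                   answer = c("bothered", "not bothered")))
smog_test <- chisq.test(townsmog)

# Correlation and linear regression (hypothetical values)
intelligence <- c(95, 100, 105, 110, 120, 130)
money <- c(30, 34, 33, 41, 45, 52)
money_cor <- cor.test(money, intelligence)
moneyIQreg <- lm(money ~ intelligence)

print(two_sample)
print(breed_tukey)
print(summary(moneyIQreg))
```

Each test's output includes a p-value, and the Tukey output lists one row per pair of groups with its own adjusted p-value, which tells you which groups actually differ once the ANOVA says something does.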
There were underlying issues that went further than just which type of test to use; I meant it more as an example. Also, while I oh-so cherish my time spent in R, I'm not doing anything that necessitates that level of control, nor is that particular skill set (working in R, not stats knowledge) something that needs to be maintained. I also spent a bunch of time messing around in LaTeX, thinking that would come in handy later, or at least be interesting enough to merit working in, but in the end I have a fancy-looking CV that is really annoying to update.
I use Word. I'm so glad biology doesn't have a bizarre practice of using LaTeX for everything. What the fuck, physics.