Welcome to week two! Today we will be using Rmarkdown to run some simple statistics. This module could be an entire class, but we are going to briefly review the topics below in a day:
Overview:
1. Basic helpful functions in R
2. Testing assumptions
3. Comparing means
3.1 T-tests
3.2 Non-parametric test
3.3 ANOVA
What is Rmarkdown? Rmarkdown is a file type that allows users to save and execute code and share the results as reports. Read more about Rmarkdown with this tutorial
variable <- function(argument)
Basic mathematical functions for summary statistics
+ `sum()`
+ `mean()`
+ `min()`
+ `max()`
+ `sd()`
+ `median()`
+ `summary()`
Functions to create & explore data structure
+ `c()` - concatenate function
+ `:` - colon operator for sequence generation
+ `seq()` - sequence generation
+ `rep()` - replicate elements of vectors and lists
+ `View()` - invoke a data viewer
+ `length()` - length of an object
+ `class()` - object class
+ `head()` - return the first 6 rows of an object
+ `tail()` - return the last 6 rows of an object
+ `dim()` - returns the dimensions of an object
+ `nrow()` - number of rows
+ `ncol()` - number of columns
+ `str()` - display the structure of each column of an object
+ `names()` - display the names of an object, typically the column names
+ `nlevels()` - categorical levels within an object
+ `cbind()` - combine objects by columns
+ `rbind()` - combine objects by rows
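Most of these get demonstrated on our dataset below, but the sequence and combine helpers don't, so here is a quick sketch with made-up values:

seq(1, 10, by = 2) # sequence from 1 to 10 in steps of 2: 1 3 5 7 9
rep(c("a", "b"), times = 3) # repeat the vector three times: "a" "b" "a" "b" "a" "b"
m <- cbind(x = 1:3, y = 4:6) # combine two vectors column-wise into a matrix
rbind(m, c(7, 8)) # append a row to the matrix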
Let’s learn about our dataframe using some of the functions we learned last week
#view the top 6 rows
head(c4c)
## Id Sex Forearm Weight Age Date Prev Para Size
## 1 fb_1 f 64.02 38.58 1.000 2015-01-11 0 0 Medium
## 2 fb_2 f 65.12 39.59 1.478 2015-01-12 1 38 Medium
## 3 fb_3 f 65.41 40.42 1.570 2015-01-13 0 0 Medium
## 4 fb_4 f 66.08 42.23 1.578 2015-01-14 0 0 Medium
## 5 fb_5 f 66.09 42.63 1.598 2015-01-15 0 0 Medium
## 6 fb_6 f 66.32 43.67 1.610 2015-01-16 0 0 Medium
#structure
str(c4c)
## 'data.frame': 1200 obs. of 9 variables:
## $ Id : chr "fb_1" "fb_2" "fb_3" "fb_4" ...
## $ Sex : chr "f" "f" "f" "f" ...
## $ Forearm: num 64 65.1 65.4 66.1 66.1 ...
## $ Weight : num 38.6 39.6 40.4 42.2 42.6 ...
## $ Age : num 1 1.48 1.57 1.58 1.6 ...
## $ Date : chr "2015-01-11" "2015-01-12" "2015-01-13" "2015-01-14" ...
## $ Prev : int 0 1 0 0 0 0 0 0 0 0 ...
## $ Para : int 0 38 0 0 0 0 0 0 0 0 ...
## $ Size : chr "Medium" "Medium" "Medium" "Medium" ...
#let's correct the variables (review from last week)
c4c$Sex <- as.factor(c4c$Sex)
c4c$Size <- as.factor(c4c$Size)
c4c$Weight <- as.numeric(c4c$Weight)
It looks like there are NA values in the Weight column of our dataset. Let’s remove the NA values from that column.
#use is.na() method
c4c <- c4c[!is.na(c4c$Weight),]

#use subset() method
c4c <- subset(c4c, !is.na(Weight))

#use tidyverse method
c4c <- c4c %>%
  drop_na(Weight)
# basic stats - let's get basic stats for the weight variable
var(c4c$Weight)
## [1] 287.5528
mean(c4c$Weight)
## [1] 57.03703
max(c4c$Weight)
## [1] 87.52
min(c4c$Weight)
## [1] 14.2
sd(c4c$Weight)
## [1] 16.95738
median(c4c$Weight)
## [1] 58.27
range(c4c$Weight)
## [1] 14.20 87.52
quantile(c4c$Weight)
## 0% 25% 50% 75% 100%
## 14.20 48.99 58.27 63.52 87.52
# or just run summary()
summary(c4c$Weight)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 14.20 48.99 58.27 57.04 63.52 87.52
Useful functions for plotting in base R: by using the ? in R you can look up the details of the different base plotting functions.
One of our critical assumptions for one-sample and two-sample t-tests is that the distribution of the sample means is normal. We have learned that if the distribution of our sample is normal we are guaranteed that the distribution of our sample means will be normal. Additionally, we have learned that the Central Limit Theorem tells us that if our sample size is big enough, the distribution of our sample means will be normal even if the distribution of our sample (or the population from which we drew this sample) is non-normal. However, it is often the case in ecology that our samples are very non-normal and the Central Limit Theorem does not apply. This may be because the data are strongly skewed or because there are outliers in the data. In this lab you will learn how to deal with non-normal data and data with extreme outliers.
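If you want to see the Central Limit Theorem for yourself, here is a small simulation sketch (entirely made-up data): even though the exponential distribution is strongly skewed, the means of repeated samples from it pile up in a roughly bell-shaped curve.

# CLT demo: distribution of sample means from a skewed distribution
set.seed(42)
sample_means <- replicate(1000, mean(rexp(50))) # 1000 means, each from n = 50 exponential draws
hist(sample_means, breaks = 20, main = "Distribution of sample means (n = 50)")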
We first have to check whether our data are normal using the following qualitative and quantitative methods.
Histogram: If the histogram is not “bell-shaped” this is your first indication that your data may be non-normal.
QQplot: If all data do not stay within the confidence band your data may be non-normal.
Shapiro-Wilk Test: A quantitative test that tells you whether your data are normal or not. \(H_0\): Data are normal, \(H_A\): Data are non-normal. If p < 0.05, reject your \(H_0\) and conclude that the data are not normal; if p > 0.05, “conclude” that your data are normal. Remember not to count on the results from this test alone, and always to visually examine a QQplot or histogram.
Checking normality with residuals
This allows us to check normality of more complicated relationships rather than just checking one variable at a time.
Outliers can drastically influence the results of t-tests and other parametric tests. It is always important to identify outliers before you begin your analysis. While there are formal methods for determining what constitutes an outlier, a simple method is to make a boxplot of your data. A boxplot will identify any points that do not lie within a given range above or below the median of your data. This range can vary depending on how you build your boxplots, but typically the width of the boxplots is related to the first and third quartiles of your data. In R, the default range is 1.5 times the distance between the first and third quartiles.
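As a quick illustration (hypothetical data), boxplot() returns the flagged points in its $out element, using the default range = 1.5 rule described above:

set.seed(7)
x <- c(rnorm(50, mean = 10, sd = 2), 25) # 50 typical values plus one extreme value
bp <- boxplot(x, range = 1.5, main = "Boxplot outlier check") # range = 1.5 is the default
bp$out # values plotted beyond the whiskers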
While sometimes outliers can be erroneous values (i.e. you measured something incorrectly) they often tell you something important about the population you are measuring. Therefore, you cannot drop an outlier from your analysis without having good reason to do so. Good reasons could include measurement error or an underlying difference between that individual and the rest of the population. Dropping an outlier because it makes your life and the analysis easier is not a good reason…though we all wish it was. However, it is good practice to run your analysis with and without your outlier to determine if/how the outlier changes your results.
When you transform your data you apply some function (often non-linear) to your data. Wait… why the heck can you do this? If you think about it, the scale that we use to measure things is completely arbitrary (think about Celsius vs Fahrenheit). We could just as easily measure things in a base 2 system and there would, in principle, be nothing wrong with that. If this helps, you can imagine that by transforming your data you are moving to an alien world where they measure everything on a different scale (e.g. on a log scale). Here are some of the reasons why we transform our data:
Here are some common transformations and the scenarios in which you may use them. The variable Y represents your original data and Y’ represents your transformed data.
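For instance, a log transformation \(Y' = \log(Y)\) is often used for right-skewed data, and a square-root transformation \(Y' = \sqrt{Y}\) for count data. Here is a sketch with made-up skewed data:

set.seed(11)
Y <- rexp(100, rate = 0.5) # hypothetical right-skewed variable
Y_log <- log(Y) # log transformation (use log(Y + 1) if zeros are present)
Y_sqrt <- sqrt(Y) # square-root transformation, common for counts
par(mfrow = c(1, 3)) # compare the three shapes side by side
hist(Y, main = "Original")
hist(Y_log, main = "Log transformed")
hist(Y_sqrt, main = "Square-root transformed")
par(mfrow = c(1, 1))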
Transformations are not magic. After you have transformed your data you must retest all of your assumptions. Moreover, transforming your data inherently changes your null hypothesis. For example, if I log transformed my number of flowers variable and then tested whether the means were different between the edge and the interior of a habitat patch, my \(H_0\) becomes that there is no difference in the log-mean.
Any conclusions you make about your data have to be in terms of the transformation. For example, you would conclude that the log number of flowers on the edge was significantly more than the log number of flowers in the interior.
If a transformation does not fix your problem (e.g. make your data more normal, reduce the heterogeneity of variance, etc.) you cannot proceed with your parametric analysis. Fortunately, there are some non-parametric options that you can consider. Remember that non-parametric tests do not assume that the data follow any particular distribution.
To know what normal (Gaussian) data should look like, we will generate a sample from a known normal distribution, along with a sample from an exponential distribution for contrast
# Generate randomly sampled data to visualize
norm_samp = rnorm(1000, mean=21, sd=3) #normal distribution
exp_samp = rexp(1000) #exponential distribution
# Visualization with a Histogram
hist(norm_samp,
     breaks=20, #with n = 1000, 20 bins gives bin widths that make sense; this normal variable could be "age" for example
     col='darkgray',
     ylab="Frequency",
     main="Histogram of Normally Distributed Data")

hist(exp_samp,
     breaks=20, #same binning for comparison
     col='darkgray',
     ylab="Frequency",
     main="Histogram of Non-Normally Distributed Data")
The `norm_samp` data looks normally distributed; the `exp_samp` data does not look normally distributed.
# Visualization with a qqPlot
# Now we’ll generate a qqplot using qqPlot function in car package.
# Normal data
qqPlot(norm_samp, main = "qqPlot of Normally Distributed Data") #qqplot function
## [1] 464 313
# Exponential data
qqPlot(exp_samp, main = "qqPlot of Non-Normally Distributed Data") #qqplot function
## [1] 802 977
For a qqplot, if all of your data points fall in between the dashed confidence bands you can be pretty confident that your data follow a normal distribution.
For `norm_samp`, most data points (circles) fall within the confidence bands, so the data are normally distributed, although points 464 and 313 (flagged in the output above) come closest to being outliers (i.e. falling outside of the dashed confidence lines).
For `exp_samp`, many data points (circles) fall outside the confidence bands, so the data are not normally distributed. There are clear outliers, such as points 802 and 977 (flagged in the output above).
The null hypothesis of this test is that your data are normally distributed. The alternative hypothesis of this test is that your data are not normally distributed. Therefore, if this test gives us a p-value that is non-significant (i.e. p > 0.05) we can be reasonably confident that our data are normal.
p < 0.05: you reject the null; your variable is not normally distributed.
p > 0.05: you fail to reject the null; your variable is normally distributed.
Example 1. Let’s use the Shapiro-Wilk test to see if norm_samp comes from a normal distribution:
# EXAMPLE 1
shapiro.test(norm_samp) #run a shapiro-wilk test on norm_samp
##
## Shapiro-Wilk normality test
##
## data: norm_samp
## W = 0.99833, p-value = 0.4472
How to interpret shapiro test results
Remember p > 0.05 means you fail to reject the null hypothesis based on your data and tests
This Shapiro-Wilk result makes sense, since my histogram and qqPlot looked normally distributed
Example 2. Let’s use the Shapiro-Wilk test to see if exp_samp comes from a normal distribution:
# EXAMPLE 2
shapiro.test(exp_samp) #run a shapiro-wilk test on exp_samp
##
## Shapiro-Wilk normality test
##
## data: exp_samp
## W = 0.80627, p-value < 2.2e-16
How to interpret shapiro test results
Remember p < 0.05 means you reject the null hypothesis based on your data and tests
This Shapiro-Wilk result makes sense, since my histogram and qqPlot did not look normally distributed
Once we have determined whether our data are normal, we can choose an appropriate statistical test to compare the means of our data
When we want to ask a question about the mean of a sample, but we do not know the true variance of the population (i.e. we have to estimate the variance from the sample), we need to use a one-sample t-test. The one-sample t-test assumes that the data are a random sample and that the variable is normally distributed in the population (or that the sample size is large enough for the Central Limit Theorem to apply).
The hypotheses for a two-sided one sample t-test are:
\(H_0 : \mu = a\)
\(H_A : \mu \neq a\)
where \(a\) is your hypothesized mean.
To test these hypotheses, calculate a t statistic using the following formula:
\(t = \frac{\bar Y - a}{s/\sqrt{n}}\)
where \(\bar Y\) is the sample mean, \(s\) is the standard deviation of your sample, and n is your sample size.
The resulting t-statistic has \(n-1\) degrees of freedom (df). You can test your null hypothesis in three different, but equivalent, ways: compare the t-statistic to the critical value, compute the p-value for the t-statistic, or check whether the hypothesized mean \(a\) falls outside the confidence interval for \(\mu\).
R code:
# t.test(DATASET$VARIABLE, mu=MEAN)
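Here is a minimal runnable sketch with simulated (made-up) data, testing a hypothesized mean of 20:

set.seed(1)
y <- rnorm(30, mean = 21, sd = 3) # hypothetical sample
t.test(y, mu = 20) # two-sided one-sample t-test of H0: true mean = 20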
If you want to compare the means between two groups and each data point in one group is uniquely paired with a data point in the other group, you can use the paired t-test. Take the difference between each pair of data points and then perform a one-sample t-test with \(H_0 : \mu_d = 0\). The assumptions are those of the one-sample t-test applied to the differences: the pairs are a random sample and the differences are normally distributed.
R code
#first check for normality
#dataset$difference <- (dataset$variable1 - dataset$variable2)
#hist(dataset$difference)
#shapiro.test(dataset$difference)
#t.test(dataset$variable1, dataset$variable2, paired = TRUE)
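And a runnable sketch with made-up before/after measurements:

set.seed(2)
before <- rnorm(20, mean = 50, sd = 5) # hypothetical first measurement on each individual
after <- before + rnorm(20, mean = 2, sd = 2) # paired second measurement
t.test(before, after, paired = TRUE) # equivalent to a one-sample t-test on the differences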
The two-sample t-test compares whether the means of two groups are significantly different from each other. This test assumes that both samples are random and independent of one another, that the variable is normally distributed in each group, and that the two groups have equal variance.
If we have two groups called group 1 and group 2, the two-sided hypotheses for the two-sample t-test are:
\(H_0 : \mu_1 = \mu_2\)
\(H_A : \mu_1 \neq \mu_2\)
Before we can test these hypotheses we need to ensure that our assumptions are met. You can test the normality assumption using any of the procedures that you already know (e.g. qqplots or the Shapiro-Wilk statistic). To test the homogeneity (equal variance) assumption we will use something called Levene’s Test. The null hypothesis of the Levene’s test is that the variances of the two groups are equal. The alternative hypothesis is that they are not equal.
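Here is a sketch of Levene's Test on made-up groups, using leveneTest() from the car package (the same function we use later in this lab):

library(car) # provides leveneTest()
set.seed(4)
dat <- data.frame(group = factor(rep(c("A", "B"), each = 30)),
                  value = c(rnorm(30, sd = 2), rnorm(30, sd = 2)))
leveneTest(value ~ group, data = dat) # H0: the group variances are equal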
If we fail to reject the null hypothesis of the Levene’s test we can pool our variances and use a standard two-sample t-test. We would compute this t statistic using the following equations:
\(SE_{\bar Y_1 - \bar Y_2} = \sqrt{s^2_p(\frac{1}{n_1}+\frac{1}{n_2})}\)
where
\(s^2_p = \frac{df_1s^2_1+df_2s^2_2}{df_1+df_2}\)
is the pooled sample variance (described in your book). With these equations we can calculate the t statistic:
\(t = \frac{\bar Y_1 - \bar Y_2}{SE_{\bar Y_1 - \bar Y_2}}\)
where this t statistic has a df of \(n_1 + n_2 - 2\).
If we reject the null hypothesis of the Levene’s test we use something called Welch’s t-test. We calculate the t statistic with an equation similar to the one above, but the denominator is changed to account for the different variances between the two groups. Moreover, the df is also calculated a bit differently. For the most part, you can ignore these technicalities. You just need to understand that a different test is used when between-group variances are unequal. If possible, we would prefer our two groups to have equal variance because our statistical test will have more power.
Similar to the one-sample t-test, you can draw conclusions using one of the equivalent methods (critical value, p-value, or confidence interval):
R code
# Two-sample t-test with unequal variances using Welch's t-test
#t.test(GROUP1$VARIABLE, GROUP2$VARIABLE, var.equal = FALSE)

# One-sided test: is the mean of Group 1 significantly less than the mean of Group 2?
# t.test(GROUP1$VARIABLE, GROUP2$VARIABLE, alternative = "less", var.equal = FALSE)
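And a runnable sketch of the pooled-variance version with simulated groups:

set.seed(3)
g1 <- rnorm(25, mean = 10, sd = 2) # hypothetical group 1
g2 <- rnorm(25, mean = 12, sd = 2) # hypothetical group 2
t.test(g1, g2, var.equal = TRUE) # pooled-variance two-sample t-test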
Resources for more organized and visually appealing Rmarkdown documents
Rmarkdown guide slides https://commonmark.org/help/tutorial/02-emphasis.html
Rmarkdown cookbook https://bookdown.org/yihui/rmarkdown-cookbook/
Our parasite data is not normal, so we will not be using a t-test to compare parasite load means. Instead we will use data from the lterdatasampler package in R.
Dataset description
hbr_maples: Sugar maple seedlings at Hubbard Brook Experimental Forest (New Hampshire) in calcium-treated and reference watersheds in August 2003 and June 2004
More info found [here](https://lter.github.io/lterdatasampler/articles/hbr_maples_vignette.html)
Citation: Horst A, Brun J (2022). lterdatasampler: Educational dataset examples from the Long Term Ecological Research program. R package version 0.1.0, https://github.com/lter/lterdatasampler.
# The data we will use for this exercise is preloaded into R! Let's take a look at it using glimpse
#using glimpse function to explore the data
glimpse(hbr_maples)
## Rows: 359
## Columns: 11
## $ year <dbl> 2003, 2003, 2003, 2003, 2003, 2003, 2003, 2003, 20…
## $ watershed <fct> Reference, Reference, Reference, Reference, Refere…
## $ elevation <fct> Low, Low, Low, Low, Low, Low, Low, Low, Low, Low, …
## $ transect <fct> R1, R1, R1, R1, R1, R1, R1, R1, R1, R1, R1, R1, R1…
## $ sample <fct> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,…
## $ stem_length <dbl> 86.9, 114.0, 83.5, 68.1, 72.1, 77.7, 85.5, 81.6, 9…
## $ leaf1area <dbl> 13.837, 14.572, 12.451, 9.974, 6.838, 9.660, 8.823…
## $ leaf2area <dbl> 12.130, 15.267, 9.731, 10.068, 5.480, 7.643, 9.233…
## $ leaf_dry_mass <dbl> 0.0453, 0.0476, 0.0423, 0.0397, 0.0204, 0.0317, 0.…
## $ stem_dry_mass <dbl> 0.0300, 0.0338, 0.0248, 0.0194, 0.0180, 0.0246, 0.…
## $ corrected_leaf_area <dbl> 29.104, 32.976, 25.319, 23.179, 15.455, 20.440, 21…
str(hbr_maples)
## tibble [359 × 11] (S3: tbl_df/tbl/data.frame)
## $ year : num [1:359] 2003 2003 2003 2003 2003 ...
## $ watershed : Factor w/ 2 levels "Reference","W1": 1 1 1 1 1 1 1 1 1 1 ...
## $ elevation : Factor w/ 2 levels "Low","Mid": 1 1 1 1 1 1 1 1 1 1 ...
## $ transect : Factor w/ 12 levels "R1","R2","R3",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ sample : Factor w/ 20 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
## $ stem_length : num [1:359] 86.9 114 83.5 68.1 72.1 77.7 85.5 81.6 92.9 59.6 ...
## $ leaf1area : num [1:359] 13.84 14.57 12.45 9.97 6.84 ...
## $ leaf2area : num [1:359] 12.13 15.27 9.73 10.07 5.48 ...
## $ leaf_dry_mass : num [1:359] 0.0453 0.0476 0.0423 0.0397 0.0204 0.0317 0.0382 0.0179 0.0286 0.0125 ...
## $ stem_dry_mass : num [1:359] 0.03 0.0338 0.0248 0.0194 0.018 0.0246 0.0316 0.015 0.0291 0.0149 ...
## $ corrected_leaf_area: num [1:359] 29.1 33 25.3 23.2 15.5 ...
First, let’s create an exploratory visualization of the data using ggplot. Is there evidence for differences in maple seedling height (millimeters) between the calcium-treated (W1) and untreated (reference) watersheds? We can visualize our data using a boxplot
ggplot(data = hbr_maples, aes(x = watershed, y = stem_length)) +
  geom_boxplot(aes(color = watershed, shape = watershed), #adds color based on watershed
               alpha = 0.8, #makes color transparent
               width = 0.5) +
  geom_jitter(aes(color = watershed), #geom_jitter adds data points to the boxplots
              alpha = 0.5,
              show.legend = FALSE,
              position = position_jitter(width = 0.2, seed = 0)) +
  labs(x = "Watershed",
       y = "Stem length (millimeters)",
       title = "Stem Lengths of Sugar Maple Seedlings",
       subtitle = "Hubbard Brook LTER") +
  facet_wrap(~year) + #facet_wrap creates two side-by-side plots to visualize differences between years
  theme_minimal()
In both years of the study (2003 and 2004) it appears that seedling heights shift toward longer stem lengths in the calcium-treated watershed (W1), compared to the untreated watershed.
Are the data normal? Let’s look at the histogram and qqplot and then run a Shapiro-Wilk test
#simple histogram to view the distribution of our Y variable using geom_histogram()
ggplot(hbr_maples, aes(x = stem_length)) +
  geom_histogram()

#let's make our histogram a little bit fancier, adding titles and color coding by watershed
ggplot(hbr_maples, aes(x = stem_length, color = watershed, fill = watershed)) +
  geom_histogram(alpha = 0.5, position = "identity") +
  geom_density(aes(color = watershed), size = 1) +
  labs(x = "Stem Length (mm)",
       y = "Frequency",
       title = "Distribution of Sugar Maple seedling stem lengths",
       subtitle = "Hubbard Brook LTER") +
  theme_minimal()
Next assumption: The two samples have equal variance
Having explored distributions in the histograms above, we can conclude that the data are approximately normal. As a next step, we will use var.test() to test for equal variances of seedling height (stem_length, in millimeters) observations between the watersheds, in each year.
Null and alternative hypotheses:
\(H_0\): Ratio of variances is equal to 1 (equal variances); \(H_A\): Ratio of variances is not equal to 1 (unequal variances)
Test this using the tidyverse method
hbr_maples %>%
  filter(year == 2003) %>% #only selects observations from 2003
  var.test(stem_length ~ watershed, data = .) #var.test tests for equal variance
##
## F test to compare two variances
##
## data: stem_length by watershed
## F = 0.94458, num df = 119, denom df = 119, p-value = 0.7563
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
## 0.6583073 1.3553360
## sample estimates:
## ratio of variances
## 0.944578
hbr_maples %>%
  filter(year == 2004) %>% #only selects observations from 2004
  var.test(stem_length ~ watershed, data = .) #var.test tests for equal variance
##
## F test to compare two variances
##
## data: stem_length by watershed
## F = 1.2701, num df = 58, denom df = 59, p-value = 0.3626
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
## 0.7573898 2.1323658
## sample estimates:
## ratio of variances
## 1.270104
With p-values > 0.05 in both years, we fail to reject the null hypothesis that the ratio of the variances of the two watersheds is equal to 1, and continue with the assumption of equal variances.
Now, we’ll do a two-sided two-sample t-test using t.test() to determine if there is a statistically significant difference in seedling height between the watersheds, for each year.
Null and alternative hypotheses:
\(H_0 : \mu_1 = \mu_2\)
\(H_A : \mu_1 \neq \mu_2\)
By default, t.test() will use a Welch t-test that does not assume equal variances (default argument is var.equal = FALSE). We set var.equal = TRUE based on the outcome of our F-test above:
hbr_maples %>%
  filter(year == 2003) %>%
  t.test(stem_length ~ watershed,
         var.equal = TRUE,
         data = .)
##
## Two Sample t-test
##
## data: stem_length by watershed
## t = -3.7797, df = 238, p-value = 0.0001985
## alternative hypothesis: true difference in means between group Reference and group W1 is not equal to 0
## 95 percent confidence interval:
## -10.497532 -3.304134
## sample estimates:
## mean in group Reference mean in group W1
## 80.98500 87.88583
hbr_maples %>%
  filter(year == 2004) %>%
  t.test(stem_length ~ watershed,
         var.equal = TRUE,
         data = .)
##
## Two Sample t-test
##
## data: stem_length by watershed
## t = -4.3092, df = 117, p-value = 3.432e-05
## alternative hypothesis: true difference in means between group Reference and group W1 is not equal to 0
## 95 percent confidence interval:
## -16.982759 -6.287862
## sample estimates:
## mean in group Reference mean in group W1
## 85.88136 97.51667
Based on the outcome of these t-tests, we can reject the null hypothesis and conclude that mean seedling heights in the reference watershed (no calcium treatment) differ significantly from those in the calcium-treated watershed (W1) in both 2003 and 2004.
Non-parametric tests make no assumptions about the distributions of the data and thus are robust to violations of normality. When our data are not normal and we can’t fix them with a transformation we typically turn to non-parametric tests. You typically do non-parametric tests on your untransformed data. Transforming your data will not change the results of a non-parametric test, but it will change your interpretation of the results. Who wants to talk about medians of square-root transformed data when you don’t have to? Keep in mind that non-parametric tests can be used with data that are normally distributed. However, parametric tests that assume normality have more power, so you should use those if your data are normal.
We are only going to consider one non-parametric test in this lab: the Mann-Whitney U-test, aka Wilcoxon Test. This test can be used in place of a two-sample t-test when your assumptions of normality are not met. However, there are still some assumptions for the Mann-Whitney U-test: the two samples must be random and independent, and the two distributions must have the same shape.
You have already learned about these tests in class. They rank your data (e.g. smallest value has a rank of 1 and the largest value has a rank of n) and then look at the distributions of these ranks. Note that while you can make similar conclusions with non-parametric tests, the \(H_0\) and \(H_A\) are not specified in terms of the mean, but rather whether the distributions of the two groups differ in central tendency (i.e. the median).
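To see the ranking idea concretely (made-up values):

x <- c(3, 10, 7) # hypothetical group 1
y <- c(5, 8, 12) # hypothetical group 2
rank(c(x, y)) # pooled ranks the test works with: 1 5 3 2 4 6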
The null hypothesis for a Mann-Whitney U-test is:
\(H_0\) : Median of group 1 equals the median of group 2
\(H_A\) : Median of group 1 does not equal the median of group 2
Now let’s walk through an example of how to do the Mann-Whitney U-test in R. We are going to use our parasite data! For this analysis, we are going to see if there is a significant difference in male and female parasite load in infected individuals.
c4c_pos <- c4c %>%
  filter(Para > 0) %>% #creating a subset with only infected individuals
  mutate(log_para = log(Para), # adding a column with log transformation
         sq_para = sqrt(Para)) # adding a column with square root transformation
Boxplot first
#boxplot with untransformed data
boxplot(c4c_pos$Para~c4c_pos$Sex,
main="Parasite load in Males and Females",
xlab="Sex",
ylab ="Parasite load")
#boxplot with log transformed parasite burden
boxplot(c4c_pos$log_para~c4c_pos$Sex,
main="Parasite load in Males and Females",
xlab="Sex",
ylab ="Log Parasite load")
Histograms: let's see if the data look normally distributed
par(mfrow=c(3,1)) #sets up your plot (3 x 1)

#histogram with untransformed Y
hist(c4c_pos$Para, probability = T,
     col = "plum1",
     main="Parasite load distribution",
     xlab="Parasite load")

#histogram with square root transformed Y
hist(c4c_pos$sq_para, probability = T,
     col = "azure1",
     main="Square root transformed parasite distribution",
     xlab="Parasite load")

#histogram with log transformed Y
hist(c4c_pos$log_para, probability = T,
     col = "pink",
     main="Log transformed parasite load distribution",
     xlab="Parasite load")
#qqPlot
qqPlot(c4c_pos$log_para)
## [1] 63 8
#shapiro test
shapiro.test(c4c_pos$log_para)
##
## Shapiro-Wilk normality test
##
## data: c4c_pos$log_para
## W = 0.89433, p-value = 7.727e-07
#p-value = 7.727e-07, we reject the null that our data is normally distributed.
Since our data are not normal even after transformation, we will go back to using the untransformed data for a non-parametric test. But we also need to check for equal variance. We can do that with a Levene’s test!
#Run Levene's Test; (y~x) or (y,x)
leveneTest(Para~Sex, data = c4c_pos)
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 1 0.0244 0.8763
## 98
Levene’s Test gives p = 0.876, so we cannot reject the null hypothesis; the data are consistent with the variances of these distributions being equal. We therefore can use a Mann-Whitney U-test, which assumes that the distributions have the same shape.
# Run a Mann-Whitney U-test/Wilcoxon Test
wilcox.test(Para ~ Sex, data=c4c_pos)
##
## Wilcoxon rank sum test with continuity correction
##
## data: Para by Sex
## W = 1243.5, p-value = 0.8142
## alternative hypothesis: true location shift is not equal to 0
Based on the above results we can conclude that males and females do not have significantly different median parasite loads. Note that the W statistic (or the U statistic) is analogous to a t or Z statistic: it comes from a distribution from which you can determine its associated p-value.
We can also find the medians of each group:
median(c4c_pos$Para[c4c_pos$Sex == "m"], na.rm=T) # na.rm is removing NAs
## [1] 36
median(c4c_pos$Para[c4c_pos$Sex == "f"], na.rm=T) # na.rm is removing NAs
## [1] 23
From this, we can see that males actually have a higher median parasite burden than females do, but the difference is not significant.
We have now seen many different ways to compare the means of two groups. However, what if we have more than two groups? Unfortunately, we can’t just use a bunch of pairwise t-tests because this would inflate our Type I (\(\alpha\)) error. ANOVA is the answer. It gives us a robust and powerful way to determine whether the means of two or more groups are different from each other. There are a couple terms that we need to familiarize ourselves with before we start using ANOVAs.
Factor: Our variable of interest that we are changing. For example, if we were comparing the mean hearing frequency between different felines, the factor would be feline and it would have different levels (i.e. lion, tiger, and puma).
Level/group/treatment: The levels of a factor are the groups that we are comparing. For example, the levels of the factor feline are lion, tiger, and puma.
Here are the assumptions of the ANOVA and how we test them: the residuals are normally distributed (checked with a histogram, qqplot, or Shapiro-Wilk test on the residuals), and the variance is equal across groups (checked with Levene’s test or the residuals vs. fitted plot).
If we are comparing the means of k populations, the \(H_0\) and \(H_A\) of a one-way ANOVA are:
\(H_0\) : The mean of group 1 = mean of group 2 = … = mean of group k \((\mu_1 = \mu_2 = \dots = \mu_k)\)
\(H_A\) : At least one group mean \(\mu_j\) is not equal to the others
These hypotheses assert that any observation i from group (level) j, \(y_{ij}\), can be calculated using the following equation: \(y_{ij} = \mu_j + \epsilon_{ij}\)
where \(\epsilon_{ij}\) is an error associated with the ith measurement from group j and is normally distributed with mean 0 and some variance \(\sigma^2_\epsilon\). The variance \(\sigma^2_\epsilon\) is constant for all levels. This is just fancy notation for restating our two assumptions: the errors (residuals) are normally distributed, and their variance is equal across groups.
We could equivalently state the null hypotheses for the ANOVA as
\(H_0 : \alpha_1 = \alpha_2 = \dots = \alpha_j = \dots = \alpha_k = 0\)
\(H_A\) : At least one \(\alpha_j\) is not equal to 0
In this case, \(\alpha_j\) is just the deviation of group j from the grand mean \(\mu\). The grand mean is the mean of all observations pooled across groups. If all these deviations are 0, there is no difference between our means. These hypotheses assert that any observation i from group (level) j, \(y_{ij}\), can be calculated using the following equation: \(y_{ij} = \mu + \alpha_j + \epsilon_{ij}\)
The effects formulation and the means formulation above are equivalent.
If we reject our \(H_0\), we have no information about which means are different, just that at least one is different.
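To make the notation concrete, here is a small simulation sketch of the means model \(y_{ij} = \mu_j + \epsilon_{ij}\) with made-up group means:

set.seed(5)
k <- 3; n <- 10 # 3 hypothetical groups, 10 observations each
mu <- c(10, 12, 15) # true group means
group <- factor(rep(1:k, each = n))
y <- mu[group] + rnorm(k * n, mean = 0, sd = 2) # errors: mean 0, constant variance
summary(aov(y ~ group)) # one-way ANOVA on the simulated data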
Resources for more R ANOVA tutorials
ANOVA blog https://medium.com/@StepUpAnalytics/anova-one-way-vs-two-way-6b3ff87d3a94
Introduction to ANOVA video https://www.youtube.com/watch?v=FLyCZnjuJVA&feature=youtu.be
Is there a significant difference in adult salamander weight among the years 2011-2019?
We are going to use another dataset provided by the lterdatasampler package in R.
Dataset description
The and_vertebrates dataset contains length and weight observations for Coastal Cutthroat Trout and two salamander species (Coastal Giant Salamander, and Cascade Torrent Salamander) in previously clear cut (c. 1963) and old growth coniferous forest sections of Mack Creek in HJ Andrews Experimental Forest, Willamette National Forest, Oregon.
Citation: Gregory, S.V. and I. Arismendi. 2020. Aquatic Vertebrate Population Study in Mack Creek, Andrews Experimental Forest, 1987 to present ver 14. Environmental Data Initiative. https://doi.org/10.6073/pasta/7c78d662e847cdbe33584add8f809165
Hypotheses
\(H_0 : \alpha_{2011} = \alpha_{2012} = \dots = \alpha_{2019} = 0\)
\(H_A\) : At least one \(\alpha_j\) is not equal to 0
Always make sure you know what your factor and its levels are before beginning an ANOVA. Otherwise, things can get really confusing. Okay, let’s begin the analysis.
First, let's load in the dataset and_vertebrates and look at it to get a sense of what's going on.
##read in your data and check out how it's structured
str(and_vertebrates)
## tibble [32,209 × 16] (S3: tbl_df/tbl/data.frame)
## $ year : num [1:32209] 1987 1987 1987 1987 1987 ...
## $ sitecode : chr [1:32209] "MACKCC-L" "MACKCC-L" "MACKCC-L" "MACKCC-L" ...
## $ section : chr [1:32209] "CC" "CC" "CC" "CC" ...
## $ reach : chr [1:32209] "L" "L" "L" "L" ...
## $ pass : num [1:32209] 1 1 1 1 1 1 1 1 1 1 ...
## $ unitnum : num [1:32209] 1 1 1 1 1 1 1 1 1 1 ...
## $ unittype : chr [1:32209] "R" "R" "R" "R" ...
## $ vert_index : num [1:32209] 1 2 3 4 5 6 7 8 9 10 ...
## $ pitnumber : num [1:32209] NA NA NA NA NA NA NA NA NA NA ...
## $ species : chr [1:32209] "Cutthroat trout" "Cutthroat trout" "Cutthroat trout" "Cutthroat trout" ...
## $ length_1_mm: num [1:32209] 58 61 89 58 93 86 107 131 103 117 ...
## $ length_2_mm: num [1:32209] NA NA NA NA NA NA NA NA NA NA ...
## $ weight_g : num [1:32209] 1.75 1.95 5.6 2.15 6.9 5.9 10.5 20.6 9.55 13 ...
## $ clip : chr [1:32209] "NONE" "NONE" "NONE" "NONE" ...
## $ sampledate : Date[1:32209], format: "1987-10-07" "1987-10-07" ...
## $ notes : chr [1:32209] NA NA NA NA ...
#let's create a subset with only coastal giant salamanders and 3 levels of unit type (cascade, pool, and side channel). We also want to only include adult salamanders in this analysis; adult mass ranges from 22 to 114 g.
new_vert <- and_vertebrates %>%
  filter(species == "Coastal giant salamander", #selects only salamanders
         unittype %in% c("C", "P", "SC"),
         year > 2010,
         weight_g > 22) %>%
  drop_na(unittype) #drops NAs
new_vert$year <- as.factor(new_vert$year)

#let's look at the structure of our subset
str(new_vert)
## tibble [184 × 16] (S3: tbl_df/tbl/data.frame)
## $ year : Factor w/ 9 levels "2011","2012",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ sitecode : chr [1:184] "MACKCC-L" "MACKCC-L" "MACKCC-L" "MACKCC-L" ...
## $ section : chr [1:184] "CC" "CC" "CC" "CC" ...
## $ reach : chr [1:184] "L" "L" "L" "L" ...
## $ pass : num [1:184] 1 1 1 2 2 1 1 1 1 1 ...
## $ unitnum : num [1:184] 1 2 3 2 3 5 5 5 6 7 ...
## $ unittype : chr [1:184] "C" "SC" "C" "SC" ...
## $ vert_index : num [1:184] 48 48 42 38 36 18 31 40 9 18 ...
## $ pitnumber : num [1:184] 16396930 26173229 26200280 26197920 26178692 ...
## $ species : chr [1:184] "Coastal giant salamander" "Coastal giant salamander" "Coastal giant salamander" "Coastal giant salamander" ...
## $ length_1_mm: num [1:184] 117 91 99 120 118 108 97 89 119 120 ...
## $ length_2_mm: num [1:184] 204 165 177 187 202 187 168 153 196 204 ...
## $ weight_g : num [1:184] 48.7 28 30.7 55.9 50.6 ...
## $ clip : chr [1:184] "NONE" "NONE" "NONE" "NONE" ...
## $ sampledate : Date[1:184], format: "2011-09-12" "2011-09-12" ...
## $ notes : chr [1:184] "RECAP" NA "BLEEDER" "CUTTAIL" ...
Now let's visualize the data with a boxplot:
ggplot(data = new_vert, aes(x = year, y = weight_g, fill = year)) +
  geom_boxplot() +
  labs(x = "Year",
       y = "Weight (grams)",
       title = "Adult coastal giant salamander weight by year",
       subtitle = "Mack Creek in HJ Andrews Experimental Forest, Willamette National Forest, Oregon") +
  theme_minimal()
Notice that in general, the variances of each level seem pretty comparable in the boxplot. Moreover, the data don’t look like they violate normality too severely, though there is one outlier.
To test our assumptions we need to look at the residuals of our ANOVA model. Residuals are defined as: \(e_{ij} = y_{ij} - \bar{y}_j\)
In words, this says the residual \(e_{ij}\) is equal to the observed value \(y_{ij}\) minus the predicted value \(\hat{y}_{ij}\), which for an ANOVA is the group mean \(\bar{y}_j\).
Residuals are useful for testing our assumptions because they are estimates of our error terms \(\epsilon_{ij}\). Therefore we should expect that our residuals are normally distributed and that they have approximately equal variances.
For more info on residuals: http://www.statsmakemecry.com/smmctheblog/confusing-stats-terms-explained-residual.html
To get our residuals, we first need to run our ANOVA. We will use the function aov(), which is very similar in terms of output and structure to lm(), but will calculate ANOVA results instead of regression results!
# HINT: aov(y ~ x, data = dataset)
salamander <- aov(weight_g~as.factor(year), data=new_vert) #run the ANOVA

par(mfrow=c(2,2)) # places 4 plots in 1 output view
plot(salamander) #look at the diagnostic plots; focus on the first two plots (fitted vs residuals & QQ plot)
While all the plots on this graph are useful, the two that we will focus on are the plots in the first row.
The first plot is called a residual plot and it is used to test our Homogeneity of Variance (homoskedasticity) assumption (i.e. plot titled “Residuals vs. Fitted”). It plots the values predicted by our ANOVA model (see equation [eq:means] and [eq:effects]) against the residuals of the model. If the residuals fall along the zero line for each treatment and don’t have any distinct pattern then we can be pretty confident that our assumption of equal variances is met. If this plot shows a distinct pattern, such as a wedge shape, there is good evidence that our assumptions of homogeneity of variance are not met and we might want to try a transformation.
The other plot that is useful is the qqplot of the residuals (i.e. the plot titled “Normal Q-Q”). You have already seen a lot of the qqPlot, so you know that if it deviates from a straight line the residuals likely do not follow a normal distribution. This would violate the normality assumption of the ANOVA.
We have run the ANOVA model! Be warned, we cannot interpret the results until we check the residuals! R has been incredibly helpful and already calculated the residuals for us; below we save them in the variable res. With this done, let’s check some of our assumptions.
Normality: Run a Shapiro-Wilk test on res.
par(mfrow=c(1,1)) # resets output view
res <- salamander$residuals #retrieve the residuals
# Tool 1 for checking normality
hist(res)
# Tool 2 for checking normality
qqPlot(res)
## [1] 99 180
# Tool 3 for checking normality
shapiro.test(res) ## W = 0.97056, p-value = 0.0006311, not normal!
##
## Shapiro-Wilk normality test
##
## data: res
## W = 0.97056, p-value = 0.0006311
Time to transform the data and retest all our assumptions.
new_vert$log_weight <- log(new_vert$weight_g)

#retest all the assumptions
salamander_2 <- aov(log_weight~as.factor(year), data=new_vert) #run the ANOVA
par(mfrow=c(2,2)) # places 4 plots in 1 output view
plot(salamander_2) #look at the diagnostic plots; focus on the first two plots (fitted vs residuals & QQ plot)

par(mfrow=c(1,1)) # resets output view
res2 <- salamander_2$residuals #retrieve the residuals
# Tool 1 for checking normality
hist(res2)
# Tool 2 for checking normality
qqPlot(res2)
## [1] 165 23
# Tool 3 for checking normality
shapiro.test(res2) ## W = 0.99125, p-value = 0.328, p > 0.05, our data is normal!
##
## Shapiro-Wilk normality test
##
## data: res2
## W = 0.99125, p-value = 0.328
The p-value (0.328) > 0.05, so we cannot reject the null; this suggests that the residuals are normal. Now let’s check our next assumption.
Homogeneity of Variance: Run a Levene’s Test on the stacked data just like you have been doing for the last few weeks. I get a p-value of 0.08664.
leveneTest(log_weight~as.factor(year), data = new_vert) # F value = 1.7661 , 0.08664
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 8 1.7661 0.08664 .
## 175
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
P-value (0.08664) > 0.05, so the variances are equal!
Once we have confirmed that our assumptions are met we can analyze the ANOVA. For the log-transformed salamander data, we met both our assumptions of normality and homogeneity of variance, so we can analyze the model.
# we actually already ran the ANOVA when we created the residuals; as a reminder we did it with:
# HINT: aov(y ~ x, data = dataset)
salamander_2 <- aov(log_weight~year, data=new_vert)

#So now we just need to get the results with:
summary(salamander_2)
## Df Sum Sq Mean Sq F value Pr(>F)
## year 8 4.144 0.5180 4.049 0.000193 ***
## Residuals 175 22.387 0.1279
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
As described in class, this table contains all of the information that was required to run the ANOVA. We see that we obtained an F-value of 4.049 and a p-value (0.000193) less than \(\alpha = 0.05\), allowing us to reject the null hypothesis and conclude that mean adult salamander weight differed significantly in at least one year between 2011 and 2019.
We just concluded from the above ANOVA that at least one of the means differ, but we often want to see which groups actually differ. We know we can’t do a bunch of t-tests because that will inflate our Type I error rate. We are instead going to use a method called a Tukey-Kramer test, which controls our Type I error rate. Here are some important notes about the Tukey-Kramer test:
The Tukey-Kramer test assumes that you have already run an ANOVA and rejected your null hypothesis
The Tukey-Kramer test does pairwise comparisons for all the groups we are testing and controls the family-wise Type I error rate so it is at most 𝛼=0.05 (it could be less).
When your ANOVA is unbalanced (unequal sample sizes in each group), the Tukey-Kramer test is conservative, meaning that the family-wise Type I error is less than 𝛼 and it is harder for us to reject the null.
The Tukey-Kramer test makes all of the same assumptions as the ANOVA
To run a Tukey-Kramer test for our salamander ANOVA
# HINT: TukeyHSD(aov_model)
TukeyHSD(salamander_2)
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = log_weight ~ year, data = new_vert)
##
## $year
## diff lwr upr p adj
## 2012-2011 0.08258972 -0.193983105 0.3591626 0.9904323
## 2013-2011 0.05499478 -0.268538811 0.3785284 0.9998288
## 2014-2011 0.20922045 -0.133020962 0.5514619 0.6007748
## 2015-2011 0.15881195 -0.164721642 0.4823455 0.8340924
## 2016-2011 0.30456963 -0.074143310 0.6832826 0.2268798
## 2017-2011 0.23647394 -0.077130174 0.5500780 0.3080292
## 2018-2011 0.42911525 0.093721430 0.7645091 0.0027494
## 2019-2011 0.46485788 0.106530971 0.8231848 0.0022273
## 2013-2012 -0.02759494 -0.351128534 0.2959387 0.9999992
## 2014-2012 0.12663073 -0.215610684 0.4688721 0.9631990
## 2015-2012 0.07622223 -0.247311365 0.3997558 0.9981232
## 2016-2012 0.22197991 -0.156733033 0.6006928 0.6547741
## 2017-2012 0.15388421 -0.159719897 0.4674883 0.8343610
## 2018-2012 0.34652553 0.011131707 0.6819193 0.0369532
## 2019-2012 0.38226816 0.023941248 0.7405951 0.0268674
## 2014-2013 0.15422567 -0.226970663 0.5354220 0.9384482
## 2015-2013 0.10381717 -0.260676240 0.4683106 0.9930173
## 2016-2013 0.24957485 -0.164677431 0.6638271 0.6199053
## 2017-2013 0.18147915 -0.174229993 0.5371883 0.8023314
## 2018-2013 0.37412046 -0.000940159 0.7491811 0.0511291
## 2019-2013 0.40986310 0.014161653 0.8055645 0.0360698
## 2015-2014 -0.05040850 -0.431604831 0.3307878 0.9999748
## 2016-2014 0.09534918 -0.333673137 0.5243715 0.9987634
## 2017-2014 0.02725348 -0.345552355 0.4000593 0.9999998
## 2018-2014 0.21989480 -0.171417955 0.6112075 0.7049357
## 2019-2014 0.25563743 -0.155501028 0.6667759 0.5780618
## 2016-2015 0.14575768 -0.268494600 0.5600100 0.9728124
## 2017-2015 0.07766198 -0.278047162 0.4333711 0.9989123
## 2018-2015 0.27030330 -0.104757328 0.6453639 0.3699983
## 2019-2015 0.30604593 -0.089655516 0.7017474 0.2748831
## 2017-2016 -0.06809570 -0.474640281 0.3384489 0.9998468
## 2018-2016 0.12454562 -0.299034335 0.5481256 0.9913643
## 2019-2016 0.16028825 -0.281672311 0.6022488 0.9673397
## 2018-2017 0.19264131 -0.173888388 0.5591710 0.7750893
## 2019-2017 0.22838394 -0.159241109 0.6160090 0.6483222
## 2019-2018 0.03574263 -0.369713480 0.4411987 0.9999989
A fancier way of doing this:
# Use the function glht in the multcomp package
# HINT: glht(ANOVA_MODEL, linfct = mcp(FACTOR = "Tukey"))
# ?glht # you'll see you need to have a factor for your mcp() argument

Post_hoc <- glht(salamander_2, linfct = mcp(year = "Tukey")) # if year is not a factor when you run the salamander_2 anova, glht() will not work
summary(Post_hoc)
##
## Simultaneous Tests for General Linear Hypotheses
##
## Multiple Comparisons of Means: Tukey Contrasts
##
##
## Fit: aov(formula = log_weight ~ year, data = new_vert)
##
## Linear Hypotheses:
## Estimate Std. Error t value Pr(>|t|)
## 2012 - 2011 == 0 0.08259 0.08805 0.938 0.99009
## 2013 - 2011 == 0 0.05499 0.10300 0.534 0.99982
## 2014 - 2011 == 0 0.20922 0.10896 1.920 0.59421
## 2015 - 2011 == 0 0.15881 0.10300 1.542 0.83021
## 2016 - 2011 == 0 0.30457 0.12057 2.526 0.22245
## 2017 - 2011 == 0 0.23647 0.09984 2.369 0.30225
## 2018 - 2011 == 0 0.42912 0.10678 4.019 0.00281 **
## 2019 - 2011 == 0 0.46486 0.11408 4.075 0.00226 **
## 2013 - 2012 == 0 -0.02759 0.10300 -0.268 1.00000
## 2014 - 2012 == 0 0.12663 0.10896 1.162 0.96205
## 2015 - 2012 == 0 0.07622 0.10300 0.740 0.99805
## 2016 - 2012 == 0 0.22198 0.12057 1.841 0.64896
## 2017 - 2012 == 0 0.15388 0.09984 1.541 0.83048
## 2018 - 2012 == 0 0.34653 0.10678 3.245 0.03598 *
## 2019 - 2012 == 0 0.38227 0.11408 3.351 0.02625 *
## 2014 - 2013 == 0 0.15423 0.12136 1.271 0.93665
## 2015 - 2013 == 0 0.10382 0.11604 0.895 0.99276
## 2016 - 2013 == 0 0.24957 0.13188 1.892 0.61432
## 2017 - 2013 == 0 0.18148 0.11325 1.603 0.79770
## 2018 - 2013 == 0 0.37412 0.11941 3.133 0.04970 *
## 2019 - 2013 == 0 0.40986 0.12598 3.253 0.03484 *
## 2015 - 2014 == 0 -0.05041 0.12136 -0.415 0.99997
## 2016 - 2014 == 0 0.09535 0.13659 0.698 0.99872
## 2017 - 2014 == 0 0.02725 0.11869 0.230 1.00000
## 2018 - 2014 == 0 0.21989 0.12458 1.765 0.69919
## 2019 - 2014 == 0 0.25564 0.13089 1.953 0.57166
## 2016 - 2015 == 0 0.14576 0.13188 1.105 0.97189
## 2017 - 2015 == 0 0.07766 0.11325 0.686 0.99887
## 2018 - 2015 == 0 0.27030 0.11941 2.264 0.36400
## 2019 - 2015 == 0 0.30605 0.12598 2.429 0.26980
## 2017 - 2016 == 0 -0.06810 0.12943 -0.526 0.99984
## 2018 - 2016 == 0 0.12455 0.13485 0.924 0.99103
## 2019 - 2016 == 0 0.16029 0.14071 1.139 0.96629
## 2018 - 2017 == 0 0.19264 0.11669 1.651 0.77052
## 2019 - 2017 == 0 0.22838 0.12341 1.851 0.64273
## 2019 - 2018 == 0 0.03574 0.12908 0.277 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## (Adjusted p values reported -- single-step method)
confint(Post_hoc) # shows your lower and upper CI bounds
##
## Simultaneous Confidence Intervals
##
## Multiple Comparisons of Means: Tukey Contrasts
##
##
## Fit: aov(formula = log_weight ~ year, data = new_vert)
##
## Quantile = 3.1302
## 95% family-wise confidence level
##
##
## Linear Hypotheses:
## Estimate lwr upr
## 2012 - 2011 == 0 0.0825897 -0.1930279 0.3582074
## 2013 - 2011 == 0 0.0549948 -0.2674214 0.3774110
## 2014 - 2011 == 0 0.2092205 -0.1318390 0.5502799
## 2015 - 2011 == 0 0.1588120 -0.1636043 0.4812282
## 2016 - 2011 == 0 0.3045696 -0.0728354 0.6819746
## 2017 - 2011 == 0 0.2364739 -0.0760471 0.5489950
## 2018 - 2011 == 0 0.4291152 0.0948798 0.7633507
## 2019 - 2011 == 0 0.4648579 0.1077685 0.8219473
## 2013 - 2012 == 0 -0.0275949 -0.3500112 0.2948213
## 2014 - 2012 == 0 0.1266307 -0.2144287 0.4676902
## 2015 - 2012 == 0 0.0762222 -0.2461940 0.3986385
## 2016 - 2012 == 0 0.2219799 -0.1554251 0.5993849
## 2017 - 2012 == 0 0.1538842 -0.1586368 0.4664053
## 2018 - 2012 == 0 0.3465255 0.0122900 0.6807610
## 2019 - 2012 == 0 0.3822682 0.0251788 0.7393575
## 2014 - 2013 == 0 0.1542257 -0.2256542 0.5341055
## 2015 - 2013 == 0 0.1038172 -0.2594174 0.4670518
## 2016 - 2013 == 0 0.2495748 -0.1632468 0.6623965
## 2017 - 2013 == 0 0.1814792 -0.1730015 0.5359598
## 2018 - 2013 == 0 0.3741205 0.0003552 0.7478858
## 2019 - 2013 == 0 0.4098631 0.0155283 0.8041979
## 2015 - 2014 == 0 -0.0504085 -0.4302883 0.3294713
## 2016 - 2014 == 0 0.0953492 -0.3321915 0.5228898
## 2017 - 2014 == 0 0.0272535 -0.3442648 0.3987718
## 2018 - 2014 == 0 0.2198948 -0.1700665 0.6098561
## 2019 - 2014 == 0 0.2556374 -0.1540811 0.6653560
## 2016 - 2015 == 0 0.1457577 -0.2670639 0.5585793
## 2017 - 2015 == 0 0.0776620 -0.2768187 0.4321426
## 2018 - 2015 == 0 0.2703033 -0.1034620 0.6440686
## 2019 - 2015 == 0 0.3060459 -0.0882889 0.7003808
## 2017 - 2016 == 0 -0.0680957 -0.4732362 0.3370448
## 2018 - 2016 == 0 0.1245456 -0.2975714 0.5466627
## 2019 - 2016 == 0 0.1602882 -0.2801459 0.6007224
## 2018 - 2017 == 0 0.1926413 -0.1726225 0.5579052
## 2019 - 2017 == 0 0.2283839 -0.1579024 0.6146703
## 2019 - 2018 == 0 0.0357426 -0.3683132 0.4397984
cld(Post_hoc) #compact letter display of Tukey groups
## 2011 2012 2013 2014 2015 2016 2017 2018 2019
## "a" "a" "a" "ab" "ab" "ab" "ab" "b" "b"
plot(Post_hoc, main = "Fig 1: 95% confidence level", cex.axis = .5) #confidence intervals plotted
# (has a different letter and CIs don't overlap with 0 and p-values significant)
The plot shows the confidence intervals generated from the Tukey-Kramer test for every pairwise combination of the groups. To interpret the table, notice that we see every pairwise combination of the levels of our factor. For each of these pairs, the Tukey-Kramer test has run a two-sample t-test while controlling for the family-wise Type I error.
It is also really useful to look at the letters at the bottom of the output. If the levels within your factor do not share a letter, they are significantly different.
Similarly, Fig 1 displays the 95% family-wise CIs for each pairwise comparison. If an interval does not overlap 0, we can conclude that those two levels are different.