Aims of this exercise
This exercise aims to further sharpen your skills in
- translating the research question into a null and an alternative hypothesis for a t-test
- critically evaluating the assumptions of t-tests, and
- selecting the appropriate test for answering the research question.
The shrimps dataset
This dataset concerns the accumulation of PCBs (polychlorinated biphenyls) in the adipose tissue of shrimps. PCBs are often present in coolants and are known to accumulate easily in the adipose tissue of shrimps. In this experiment, two groups of 18 shrimp samples (100 grams each) were cultivated under different conditions: a control condition and a condition in which the medium was polluted with PCBs. Note that the PCB concentrations were measured in pg/g adipose tissue.
Research question
Is there an effect of the growth condition on the PCB concentration in the adipose tissue of shrimps?
Load libraries:
library(tidyverse)
Import the data
shrimps <- read_tsv(
  "https://raw.githubusercontent.com/statOmics/PSLSData/main/shrimps.txt"
)
glimpse(shrimps)
## Rows: 36 Columns: 2
## ── Column specification ──────────────────────────────────────────────
## Delimiter: "\t"
## dbl (2): PCB.conc, group
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
## Rows: 36
## Columns: 2
## $ PCB.conc <dbl> 29.7, 24.5, 97.7, 39.1, 22.6, 32.4, 27.7, 100.1, 40…
## $ group <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
Data tidying
shrimps <- shrimps %>%
  mutate(group = as.factor(group))
Data exploration
The first step is to explore the data.
Visualize the data:
shrimps %>%
  ggplot(aes(x = group, y = PCB.conc, fill = group)) +
  scale_fill_manual(values = c("darkorchid", "olivedrab")) +
  theme_bw() +
  # Hide the boxplot's own outlier symbols; geom_jitter() already draws every point
  geom_boxplot(outlier.shape = NA) +
  geom_jitter(width = 0.2) +
  ggtitle("Boxplot of the PCB concentrations in two groups of shrimps") +
  ylab("PCB concentration (pg/g)") +
  # Mark the group means with an open diamond
  stat_summary(
    fun = mean, geom = "point",
    shape = 5, size = 3, color = "black"
  )
We can see that group 1 contains four very clear outliers. These values were double-checked (e.g. for typing errors), but no reason was found to believe that they are incorrect.
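The visual impression from the boxplot can be backed up with a few numbers per group. The sketch below is one possible way to do this, not part of the original solution: the helper names `is_iqr_outlier()` and `summarise_pcb()` are hypothetical, and the 1.5 × IQR rule is assumed as the working definition of an outlier. The toy values are made up purely to show the call.

```r
library(dplyr)

# Hypothetical helper: flag values lying more than 1.5 * IQR beyond the quartiles.
is_iqr_outlier <- function(x) {
  q <- unname(quantile(x, c(0.25, 0.75), na.rm = TRUE))
  fence <- 1.5 * (q[2] - q[1])
  x < q[1] - fence | x > q[2] + fence
}

# Hypothetical helper: sample size, mean, median and outlier count per group.
# Expects the columns `group` and `PCB.conc`, as in the shrimps data.
summarise_pcb <- function(data) {
  data %>%
    group_by(group) %>%
    summarise(
      n = n(),
      mean_pcb = mean(PCB.conc),
      median_pcb = median(PCB.conc),
      n_outliers = sum(is_iqr_outlier(PCB.conc)),
      .groups = "drop"
    )
}

# Made-up toy values, only to demonstrate the call; NOT the shrimps data.
toy <- tibble(
  group = factor(c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2)),
  PCB.conc = c(1, 2, 3, 4, 100, 10, 11, 12, 13, 14)
)
toy_summary <- summarise_pcb(toy)
toy_summary
```

With the tidied data, the same summary would be obtained with `summarise_pcb(shrimps)`. A mean that lies far above the median in a group is a sign that a few large values are pulling it upwards.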
Analysis
A natural way to test the research hypothesis is an unpaired two-sample t-test, which assesses whether the mean PCB concentration differs significantly between the two groups of samples. Before we can do this, we must check whether all the required assumptions are met.
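In symbols, the research question could be translated as follows. The notation is introduced here for illustration and does not appear in the original exercise: $\mu_1$ and $\mu_2$ denote the mean PCB concentrations in the populations behind groups 1 and 2, $\bar{Y}_g$ and $S_g^2$ the sample mean and sample variance in group $g$, and $n_1 = n_2 = 18$. Because the question asks whether there is any effect, the alternative is two-sided.

```latex
H_0: \mu_1 = \mu_2
\qquad \text{versus} \qquad
H_1: \mu_1 \neq \mu_2

T = \frac{\bar{Y}_1 - \bar{Y}_2}{S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}},
\qquad
S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}
```

If $H_0$ holds and the assumptions listed below are met, $T$ follows a t-distribution with $n_1 + n_2 - 2 = 34$ degrees of freedom.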
Assumptions
- The observations are independent of each other (in both groups)
- The data (PCB.conc) must be normally distributed (in both groups)
- The variance is equal in the two groups.
The first assumption is met, as we randomly selected shrimps and assigned them to one of the two growth conditions. No underlying correlation patterns are expected.
We can check the second assumption with a QQ-plot.
shrimps %>%
  ggplot(aes(sample = PCB.conc)) +
  geom_qq() +
  geom_qq_line() +
  facet_grid(~ group)
The QQ-plots show strong deviations from normality: many data points lie far from the quantile-quantile line. We therefore conclude that the data are not normally distributed. In addition, the boxplots suggest that the variability differs between the two groups.
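The difference in variability suggested by the boxplots can also be put into numbers. The following is a minimal sketch under stated assumptions: the helper name `compare_spread()` is hypothetical, the toy values are made up, and the variance ratio is only a descriptive aid, not a formal test of the equal-variance assumption.

```r
library(dplyr)

# Hypothetical helper: variance and IQR per group, plus each group's variance
# relative to the smallest group variance.
# Expects the columns `group` and `PCB.conc`, as in the shrimps data.
compare_spread <- function(data) {
  data %>%
    group_by(group) %>%
    summarise(
      variance = var(PCB.conc),
      iqr = IQR(PCB.conc),
      .groups = "drop"
    ) %>%
    mutate(variance_ratio = variance / min(variance))
}

# Made-up toy values, only to demonstrate the call; NOT the shrimps data.
toy <- tibble(
  group = factor(c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2)),
  PCB.conc = c(1, 2, 3, 4, 5, 2, 4, 6, 8, 10)
)
spread <- compare_spread(toy)
spread
```

With the tidied data, the call would be `compare_spread(shrimps)`. Because the variance is itself sensitive to outliers, the IQR is reported alongside it as a more robust measure of spread.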
Given the location of the outliers, a transformation will not help here. Therefore, the t-test is not appropriate for these data. We will revisit this dataset in exercise 9.1 and consider an alternative analysis using a non-parametric test.
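One way to check the claim about transformations yourself is to redraw the QQ-plots after a log transformation. The sketch below is an illustration rather than part of the original solution: the helper name `qq_log_by_group()` is hypothetical, the natural log is an assumed choice of transformation, all concentrations are assumed to be strictly positive, and the toy values are made up.

```r
library(dplyr)
library(ggplot2)

# Hypothetical helper: QQ-plots of the log-transformed concentrations per group.
# Expects the columns `group` and `PCB.conc`; all PCB.conc values must be > 0.
qq_log_by_group <- function(data) {
  data %>%
    mutate(log_PCB = log(PCB.conc)) %>%
    ggplot(aes(sample = log_PCB)) +
    geom_qq() +
    geom_qq_line() +
    facet_grid(~ group) +
    ylab("log(PCB concentration)")
}

# Made-up toy values, only to demonstrate the call; NOT the shrimps data.
toy <- tibble(
  group = factor(c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2)),
  PCB.conc = c(1, 2, 3, 4, 100, 10, 11, 12, 13, 14)
)
p <- qq_log_by_group(toy)
p
```

With the tidied data, the call would be `qq_log_by_group(shrimps)`. If the points still deviate clearly from the line in either panel, the transformation has not restored normality.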