Aims of this exercise
This exercise aims to further sharpen your skills in
- translating the research question into a null and an alternative hypothesis for a t-test
- critically evaluating the assumptions of t-tests, and
- selecting the appropriate test for answering the research question.
The shrimps dataset
This dataset concerns the accumulation of PCBs (polychlorinated biphenyls)
in the adipose tissue of shrimps. PCBs are often present in coolants and are
known to accumulate easily in the adipose tissue of shrimps. In this experiment,
two groups of 18 shrimp samples (100 grams each) were cultivated
under different conditions: a control condition and a condition
in which the medium was polluted with PCBs. The PCB concentrations were
measured in pg/g adipose tissue.
Research question
Is there an effect of the
growth condition on the PCB concentration in the adipose
tissue of shrimps?
Load libraries:
library(tidyverse)
Import the data
shrimps <- read_tsv(
  "https://raw.githubusercontent.com/statOmics/PSLSData/main/shrimps.txt"
)
glimpse(shrimps)
## Rows: 36 Columns: 2
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: "\t"
## dbl (2): PCB.conc, group
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
## Rows: 36
## Columns: 2
## $ PCB.conc <dbl> 29.7, 24.5, 97.7, 39.1, 22.6, 32.4, 27.7, 100.1, 40.1, 23.3, …
## $ group <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2…
Data tidying
shrimps <- shrimps %>%
  mutate(group = as.factor(group))
Data exploration
The first step is to explore the data.
Visualize the data:
shrimps %>%
  ggplot(aes(x = group, y = PCB.conc, fill = group)) +
  scale_fill_manual(values = c("darkorchid", "olivedrab")) +
  theme_bw() +
  geom_boxplot(outlier.shape = NA) +
  geom_jitter(width = 0.2) +
  ggtitle("Boxplot of the PCB concentrations in two groups of shrimps") +
  ylab("PCB concentration (pg/g)") +
  stat_summary(
    fun = mean, geom = "point",
    shape = 5, size = 3, color = "black"
  )

We can see that group 1 contains four very clear outliers.
These values were double-checked (e.g., for
typing errors), but no reason was found to believe
that they are incorrect.
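The visual impression of outliers can be complemented with a numeric check. The sketch below is not part of the original exercise: the helper name `count_boxplot_outliers` is hypothetical, and it applies the common 1.5 x IQR rule with R's default quantiles, which may differ slightly from the hinges that `geom_boxplot()` uses. It assumes a data frame with the columns `group` and `PCB.conc`, as in `shrimps`.

```r
library(dplyr)

# Hypothetical helper (an assumption, not from the exercise): per group,
# report the sample size, median, mean, and the number of observations
# lying more than 1.5 * IQR below the first or above the third quartile.
count_boxplot_outliers <- function(data) {
  data %>%
    group_by(group) %>%
    summarise(
      n_obs = n(),
      median_conc = median(PCB.conc),
      mean_conc = mean(PCB.conc),
      n_outliers = sum(
        PCB.conc < quantile(PCB.conc, 0.25) - 1.5 * IQR(PCB.conc) |
          PCB.conc > quantile(PCB.conc, 0.75) + 1.5 * IQR(PCB.conc)
      ),
      .groups = "drop"
    )
}
```

Calling `count_boxplot_outliers(shrimps)` returns one row per group. A mean that lies well above the median in a group is a further hint that a few large values dominate that group.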
Analysis
A natural way to
address the research question is an unpaired
two-sample t-test, which assesses whether the mean
PCB concentration differs significantly between the two groups
of samples. Before we can do this, we must check whether all
required assumptions are met.
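For reference, the research question can be translated into hypotheses as follows. The notation is an assumption introduced here: $\mu_1$ and $\mu_2$ denote the population mean PCB concentrations (in pg/g adipose tissue) in group 1 and group 2. Because the question asks whether there is any effect, without specifying a direction, a two-sided alternative seems the natural choice.

```latex
H_0: \mu_1 = \mu_2
\quad \text{versus} \quad
H_1: \mu_1 \neq \mu_2
```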
Assumptions
- The observations are independent of each other (in both groups)
- The data (PCB.conc) must be normally distributed (in both groups)
- The variance is equal in the two groups.
The first assumption is met, as we randomly selected shrimps and
submitted them to one of two growth conditions. No underlying
correlation patterns are expected.
We can check the second assumption with a QQ-plot.
shrimps %>%
  ggplot(aes(sample = PCB.conc)) +
  geom_qq() +
  geom_qq_line() +
  facet_grid(~ group)

We clearly see strong deviations from
normality: many data points do not lie near the quantile-quantile
line. We therefore conclude that the data are not normally distributed.
In addition, the boxplots suggest that the
variability differs between the two groups.
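The difference in variability that the boxplots suggest can also be summarized numerically. The following is a minimal sketch rather than part of the original solution: the helper names `spread_by_group` and `sd_ratio` are hypothetical, and the input is assumed to have the columns `group` and `PCB.conc`, as in `shrimps`.

```r
library(dplyr)

# Hypothetical helper: standard deviation and interquartile range per group.
# The IQR is less sensitive to outliers than the standard deviation.
spread_by_group <- function(data) {
  data %>%
    group_by(group) %>%
    summarise(
      sd_conc = sd(PCB.conc),
      iqr_conc = IQR(PCB.conc),
      .groups = "drop"
    )
}

# Hypothetical helper: ratio of the largest to the smallest group
# standard deviation. A ratio of 1 means the group spreads are identical.
sd_ratio <- function(data) {
  s <- spread_by_group(data)$sd_conc
  max(s) / min(s)
}
```

Running `spread_by_group(shrimps)` and `sd_ratio(shrimps)` gives a rough indication of how far the data are from equal variances. With outliers present, the standard deviation and the IQR may tell different stories, so it is worth looking at both.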
Given the location of the outliers, a transformation will not help here. Therefore,
the t-test is not appropriate for these data. We will revisit this dataset in exercise
9.1 and consider an alternative analysis using a
non-parametric test.
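The statement about transformations can be checked visually. The sketch below is an illustration, not part of the original solution: it redraws the QQ-plots after a natural-log transformation, which is just one common choice for positive concentrations. The helper name `qq_log_by_group` is hypothetical, and the input is assumed to have the columns `group` and `PCB.conc`.

```r
library(ggplot2)

# Hypothetical helper: QQ-plots of the log-transformed PCB concentrations,
# one panel per group, built in the same way as the QQ-plots above.
qq_log_by_group <- function(data) {
  ggplot(data, aes(sample = log(PCB.conc))) +
    geom_qq() +
    geom_qq_line() +
    facet_grid(~ group) +
    ylab("log(PCB concentration)")
}
```

Calling `qq_log_by_group(shrimps)` draws the plot. If the points still deviate clearly from the line after the transformation, this supports the conclusion that the t-test is not appropriate here.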