In this hands-on exercise session you will perform the data exploration for four different studies: the NHANES, pertussis, diabetes and FEV examples.


1 NHANES example

The National Health and Nutrition Examination Survey (NHANES) contains data that have been collected since 1960. For this exercise, we will use the data that were collected between 2009 and 2012 for 10,000 U.S. civilians. The dataset contains a large number of physical, demographic, nutritional and lifestyle-related parameters.

The first step in a data analysis is data exploration.

In this exercise, you will learn how to:

  • import data into R
  • tidy and wrangle data
  • explore and visualize data

by following and interpreting the code of a worked-out example.

  • Preliminary: Preliminary

  • Exercise: Exercise1

  • Data path:

    https://raw.githubusercontent.com/statOmics/PSLSData/main/NHANES.csv
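The import step listed above can be sketched in base R as follows. This is a minimal sketch, not the code of the worked-out example: the helper name `import_and_inspect` is hypothetical, and it assumes the file is a comma-separated table with a header row. It deliberately avoids extra packages so that it runs in a plain R installation.

```r
# Hypothetical helper: read a comma-separated file (local path or URL)
# and print a first overview of its structure.
import_and_inspect <- function(path) {
  # Assumption: the file is comma-separated and its first row holds column names
  data <- read.csv(path, header = TRUE, stringsAsFactors = FALSE)

  # Size of the dataset
  cat("Rows:", nrow(data), " Columns:", ncol(data), "\n")

  # Columns with the most missing values, a common first tidying check
  n_missing <- sort(colSums(is.na(data)), decreasing = TRUE)
  print(head(n_missing))

  # Return the data invisibly so it can be assigned without being printed
  invisible(data)
}
```

For this exercise, the call would look like `nhanes <- import_and_inspect("https://raw.githubusercontent.com/statOmics/PSLSData/main/NHANES.csv")`, after which `head(nhanes)` and `str(nhanes)` give a closer look at the individual variables.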


2 Pertussis example

Researchers wanted to study the immune response to pertussis. They set up an experiment with 40 rats: 16 rats were infected with pertussis and 24 rats received a control treatment. The researchers measured the white blood cell concentration (WBC) in each rat (counts per mm\(^3\)).

The data consist of two variables:

  • WBC: white blood cell count (counts/mm\(^3\)).

  • trt: treatment

    • control: rat received control treatment
    • pertussis: rat was infected with pertussis

After completing this exercise, you will be able to

  • implement a good data exploration for a two group comparison in R and
  • interpret the plots and results.

Files:

  • Exercise: Exercise2

  • Data path:

    https://raw.githubusercontent.com/statOmics/PSLSData/main/wbcon.csv

  • Solution: Solution2
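A data exploration for a two-group comparison typically combines per-group summary statistics with a plot that shows the raw observations. The sketch below illustrates this in base R. The column names `WBC` and `trt` are taken from the variable description above; the function name `explore_two_groups` is hypothetical, and this is not the official solution.

```r
# Hypothetical helper: summarise and plot WBC for each treatment group.
# Expects a data frame with a numeric column WBC and a grouping column trt.
explore_two_groups <- function(data) {
  # Split the measurements by treatment group
  by_group <- split(data$WBC, data$trt)

  # Sample size, mean, median and standard deviation per group
  stats <- data.frame(
    trt    = names(by_group),
    n      = sapply(by_group, length),
    mean   = sapply(by_group, mean),
    median = sapply(by_group, median),
    sd     = sapply(by_group, sd),
    row.names = NULL
  )

  # Boxplot per group with the individual observations overlaid,
  # so that skewness and outliers remain visible
  boxplot(by_group, xlab = "Treatment", ylab = "WBC (counts/mm^3)")
  stripchart(by_group, vertical = TRUE, method = "jitter",
             pch = 16, add = TRUE)

  stats
}
```

Assuming the data file is comma-separated with a header row, the function could be applied as `explore_two_groups(read.csv("https://raw.githubusercontent.com/statOmics/PSLSData/main/wbcon.csv"))`. Comparing the means with the medians and the spread of the two groups is a useful starting point for interpreting the plot.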


3 Diabetes example

The diabetes dataset comes from a small experiment with 8 patients who were subjected to a glucose tolerance test.

Patients had to fast for eight hours before the test. When the patients entered the hospital their baseline glucose level was measured (mmol/l).

Patients then had to drink 250 ml of a syrupy glucose solution containing 100 grams of sugar. Two hours later, their blood glucose level was measured again.

The data consist of three variables:

  • before: glucose concentration after 8 hours of fasting (mmol/l)
  • after: glucose concentration 2 hours after drinking the glucose solution (mmol/l)
  • patient: identifier for the patient

In this exercise, you will acquire the skills to

  • recognize paired data
  • conduct a data exploration in R for data from paired experimental designs
  • interpret the results of a data exploration for paired experimental designs

Files:

  • Exercise: Exercise3

  • Data path:

    https://raw.githubusercontent.com/statOmics/PSLSData/main/diabetes.txt

  • Solution: Solution3
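Because each patient is measured twice, the two measurements of a patient belong together, and the exploration usually focuses on the within-patient difference. The sketch below shows one way to do this in base R. The column names `before` and `after` come from the variable description above; the function name `explore_paired` is hypothetical, and this is not the official solution.

```r
# Hypothetical helper: explore paired before/after measurements.
# Expects a data frame with numeric columns before and after (one row per patient).
explore_paired <- function(data) {
  # Within-patient change: each patient serves as their own control
  data$diff <- data$after - data$before
  print(summary(data$diff))

  # One line per patient connecting the before and after measurement;
  # rows of the transposed matrix are time points, columns are patients
  y <- t(as.matrix(data[, c("before", "after")]))
  matplot(x = 1:2, y = y, type = "b", pch = 16, lty = 1, col = "grey40",
          xaxt = "n", xlim = c(0.8, 2.2),
          xlab = "", ylab = "Glucose (mmol/l)")
  axis(1, at = 1:2, labels = c("before", "after"))

  # Return the data with the added difference column
  invisible(data)
}
```

Assuming the text file is whitespace- or tab-delimited with a header row (not confirmed here), a call could look like `explore_paired(read.table("https://raw.githubusercontent.com/statOmics/PSLSData/main/diabetes.txt", header = TRUE))`. Lines that mostly run in the same direction indicate a consistent change within patients, even when the variability between patients is large.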


4 FEV example

The forced expiratory volume (FEV) is a measure of how much air a person can exhale (in liters) during a forced breath. In this dataset, the FEV of 606 children between the ages of 6 and 17 was measured. The dataset also provides additional information on these children: their age, their height, their gender and, most importantly, whether the child is a smoker or a non-smoker. The goal of this study was to find out whether smoking has an effect on the FEV of children.

In this exercise, you will learn how plots can help you discover confounding in a real dataset.

  • Exercise: Exercise4

  • Data path:

    https://raw.githubusercontent.com/statOmics/PSLSData/main/fev.txt

  • Solution: Solution4
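Confounding can show up when the groups being compared also differ in another variable that is itself related to the response. The sketch below places three plots side by side to check for this. It is a hedged illustration, not the official solution: the function name `explore_confounding` is hypothetical, and the default column names `fev` and `smoking` are assumptions about the file that should be checked against `names()` of the imported data. Only `age` is named in the description above.

```r
# Hypothetical helper: visual check for confounding.
# The default column names fev and smoking are assumptions, not confirmed.
explore_confounding <- function(data, response = "fev",
                                exposure = "smoking", confounder = "age") {
  op <- par(mfrow = c(1, 3))   # three panels side by side
  on.exit(par(op))             # restore the graphics settings afterwards

  groups <- factor(data[[exposure]])

  # 1. Marginal comparison: response per exposure group, ignoring the confounder
  boxplot(split(data[[response]], groups),
          xlab = exposure, ylab = response, main = "Marginal")

  # 2. Is the potential confounder distributed differently across groups?
  boxplot(split(data[[confounder]], groups),
          xlab = exposure, ylab = confounder, main = "Confounder by group")

  # 3. Response against the confounder, coloured by exposure group
  plot(data[[confounder]], data[[response]], col = as.integer(groups),
       pch = 16, xlab = confounder, ylab = response, main = "Conditional")
  legend("topleft", legend = levels(groups),
         col = seq_along(levels(groups)), pch = 16)

  # Mean response per exposure group within each level of the confounder
  # (NA marks combinations without observations)
  tapply(data[[response]], list(data[[confounder]], data[[exposure]]), mean)
}
```

Assuming the file is whitespace- or tab-delimited with a header row, a call could look like `explore_confounding(read.table("https://raw.githubusercontent.com/statOmics/PSLSData/main/fev.txt", header = TRUE))`. If the second panel shows that the groups differ in age and the third shows that FEV changes with age, the marginal comparison in the first panel mixes both effects.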

