This assignment covers material from Topics 1-3 (Descriptive Statistics, Resampling
Methods, Linear Regression) and is worth 50 marks. Your solutions should be properly
presented, and it is important that you double-check your spelling and grammar and
thoroughly proofread your assignment before submitting. Instructions for assignment
submission are presented in the "Assignment 1" link and must be strictly adhered to.
No marks will be awarded to assignments that are submitted after the due date and time.
Questions
In this assignment, we will examine a random subset of a dataset that contains information on athletes who competed in an event in the "Olympics" (both winter and summer versions) over the 120-year history of the modern games.
1. (20 marks)
Consider the variable Weight in the "Olympics" dataset, which records the weight
of the athlete.
(a) Use appropriate graphical displays and measures of centrality and dispersion
to summarise the Weight variable. Provide a reasonable explanation for why
the Weight data might have the distribution you observe. (6 marks)
(b) For the most appropriate measure of centrality and measure of dispersion
you have selected for Weight, produce a table of the form shown below that
presents:
• the particular measures (i.e., statistics) you have chosen,
• those measures (i.e., statistics) as calculated for the variable Weight,
• the jackknife and bootstrap estimators for those statistics,
• the jackknife and bootstrap standard errors for those statistics, and
• the jackknife and bootstrap estimates of bias for those statistics.
Do these measures of centrality and dispersion appear to be biased or unbiased
estimators? (8 marks)
Measure of Centrality: <Name of measure of centrality>
<Value of measure of centrality when applied to original data>

                  Jackknife    Bootstrap
Estimator
Standard error
Bias

Measure of Dispersion: <Name of measure of dispersion>
<Value of measure of dispersion when applied to original data>

                  Jackknife    Bootstrap
Estimator
Standard error
Bias
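As a rough illustration of the quantities the table asks for, the following R sketch computes jackknife and bootstrap summaries for an arbitrary statistic. Everything named here is an assumption made for illustration: the simulated vector `x` (not the assignment dataset), the helper names `jackknife_summary` and `bootstrap_summary`, and the choice of `B = 2000` resamples. The jackknife estimator is taken to be the bias-corrected one, and the bootstrap estimator is taken to be the mean of the bootstrap replicates; the definitions in your course notes may differ and should take precedence.

```r
# Hypothetical data standing in for a numeric variable; NOT the assignment dataset.
set.seed(1)
x <- rgamma(200, shape = 20, rate = 0.3)

# Jackknife summaries for any statistic `stat` (a function of a numeric vector).
jackknife_summary <- function(x, stat) {
  n <- length(x)
  theta_hat <- stat(x)
  # Leave-one-out replicates: the statistic recomputed with observation i removed.
  theta_loo <- vapply(seq_len(n), function(i) stat(x[-i]), numeric(1))
  theta_bar <- mean(theta_loo)
  bias <- (n - 1) * (theta_bar - theta_hat)
  se <- sqrt((n - 1) / n * sum((theta_loo - theta_bar)^2))
  # Bias-corrected jackknife estimator (assumed convention).
  c(estimator = theta_hat - bias, se = se, bias = bias)
}

# Bootstrap summaries based on B resamples drawn with replacement.
bootstrap_summary <- function(x, stat, B = 2000) {
  theta_hat <- stat(x)
  theta_star <- replicate(B, stat(sample(x, replace = TRUE)))
  # Mean of the replicates is used as the bootstrap estimator (assumed convention).
  c(estimator = mean(theta_star),
    se = sd(theta_star),
    bias = mean(theta_star) - theta_hat)
}

jack_mean <- jackknife_summary(x, mean)
boot_mean <- bootstrap_summary(x, mean)
jack_mean
boot_mean
```

Passing a different function as `stat` (for example `median` or `IQR`) reuses the same machinery, which is the main reason for writing the helpers generically.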
(c) Produce graphical displays of the sampling distributions of the measure of
centrality and measure of dispersion you have selected for Weight. Comment
on the shapes of these distributions. Additionally, produce a 95% bootstrap
percentile confidence interval for both your measure of centrality and measure
of dispersion and interpret them. If there is anything unusual about the 95%
bootstrap percentile confidence intervals, comment on that. (6 marks)
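A bootstrap percentile interval simply takes the 2.5% and 97.5% quantiles of the bootstrap replicates. The sketch below shows the mechanics on simulated data; the vector `x`, the use of the median as the statistic, and `B = 2000` are illustrative assumptions rather than choices prescribed by the assignment.

```r
# Hypothetical data; NOT the assignment dataset.
set.seed(2)
x <- rgamma(200, shape = 20, rate = 0.3)

B <- 2000
# Bootstrap replicates of the statistic (the median is used purely as an example).
boot_medians <- replicate(B, median(sample(x, replace = TRUE)))

# Graphical display of the approximate sampling distribution.
hist(boot_medians,
     main = "Bootstrap distribution of the median",
     xlab = "Median")

# 95% percentile interval: the 2.5% and 97.5% quantiles of the replicates.
ci <- quantile(boot_medians, probs = c(0.025, 0.975))
ci
```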
2. (18 marks)
Now consider the relationship between weight (Weight) and the type of medal won (Medal).
Consider carefully the categories of Medal present in the dataset.
(a) Clearly and accurately state the
• linearity,
• independence,
• normality, and
• equal variances (i.e., homoscedasticity)
assumptions of linear regression as they pertain to these data, and assess them
for a linear model of Weight on Medal. This assessment should include reference
to appropriate graphical displays.
(8 marks)
(b) Consider common transformations of the data and present the form of the
linear model which you believe would be best when attempting to assess the
relationship between Weight and Medal. Present and discuss relevant diagnostic
plots for assessing the assumptions of linear regression for this model,
clearly noting any violations of assumptions that may still exist. (6 marks)
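For orientation only, here is a minimal R sketch of fitting a linear model with a categorical predictor, applying one common transformation, and drawing two standard diagnostic plots. The data frame `toy`, its columns `y` and `group`, and the choice of a log transformation are all hypothetical; they are not taken from the assignment dataset and are not a recommendation for which model to choose.

```r
# Hypothetical data: a right-skewed response and a three-level factor.
set.seed(3)
group <- factor(sample(c("A", "B", "C"), 300, replace = TRUE))
y <- rlnorm(300, meanlog = 4 + 0.05 * as.integer(group), sdlog = 0.2)
toy <- data.frame(y = y, group = group)

# Untransformed model and a log-transformed alternative (illustrative only).
fit_raw <- lm(y ~ group, data = toy)
fit_log <- lm(log(y) ~ group, data = toy)

# Residuals vs fitted (equal variance) and normal Q-Q (normality)
# for the transformed model.
par(mfrow = c(1, 2))
plot(fit_log, which = 1:2)
```

Calling `plot(fit_raw, which = 1:2)` in the same way gives the corresponding plots for the untransformed model, so the two sets of diagnostics can be compared side by side.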
(c) Assuming that the model presented in Part (b) is wholly appropriate (i.e., there
are no violations of the assumptions of linear regression), provide a table of
relevant R output for that model and comment on whether there is a significant
"effect" of winning a medal on the weight of an athlete. If so, how would you
interpret this "effect"? (4 marks)
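One way to obtain a compact table of regression output in R is to extract the coefficient matrix from `summary()`. The sketch below uses a made-up two-group data frame `toy` (hypothetical names and values, unrelated to the assignment data) purely to show where the estimates, standard errors, t statistics, and p-values come from.

```r
# Hypothetical two-group data; names and values are illustrative only.
set.seed(4)
toy <- data.frame(group = factor(rep(c("none", "some"), each = 50)))
toy$y <- rnorm(100, mean = ifelse(toy$group == "some", 72, 70), sd = 8)

fit <- lm(y ~ group, data = toy)

# Coefficient table: estimate, standard error, t value, and p-value per term.
coef_table <- summary(fit)$coefficients
round(coef_table, 3)

# 95% confidence intervals for the coefficients.
confint(fit)
```

With R's default treatment contrasts, the row for a factor level is the estimated difference in mean response between that level and the baseline level, on whatever scale the response was modelled.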
3. (8 marks)
To assess the level of preservatives used in mass-produced breads, researchers randomly
sampled four loaves of bread of different brands and varieties that are stocked
by a large supermarket chain. They let the loaves sit in a controlled environment
at 27°C until mould appeared. The number of days until mould appeared for each
of the four loaves of bread is as follows:
2 3 8 6
By hand (i.e., no computer allowed), calculate the jackknife estimator and standard
error of the median time until mould appears, showing all working. Is the estimator
unbiased?
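For reference, one common set of jackknife definitions is sketched below for a statistic \hat{\theta} computed from n observations. The notation is an assumption and may not match the course notes, which should take precedence. Note also that the median of an even-sized sample is conventionally the average of the two middle values, whereas each leave-one-out sample here has an odd number of values.

```latex
\begin{align*}
\hat{\theta}_{(i)} &= \text{the statistic computed with observation } i \text{ removed},
  \qquad i = 1, \dots, n, \\
\bar{\theta}_{(\cdot)} &= \frac{1}{n} \sum_{i=1}^{n} \hat{\theta}_{(i)}, \\
\widehat{\mathrm{bias}}_{\mathrm{jack}} &= (n - 1)\left(\bar{\theta}_{(\cdot)} - \hat{\theta}\right), \\
\hat{\theta}_{\mathrm{jack}} &= \hat{\theta} - \widehat{\mathrm{bias}}_{\mathrm{jack}}
  = n\hat{\theta} - (n - 1)\,\bar{\theta}_{(\cdot)}, \\
\widehat{\mathrm{se}}_{\mathrm{jack}} &= \sqrt{\frac{n - 1}{n} \sum_{i=1}^{n}
  \left(\hat{\theta}_{(i)} - \bar{\theta}_{(\cdot)}\right)^{2}}.
\end{align*}
```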
4. (4 marks)
Presentation marks:
These marks are allocated based on:
• structure, clarity, and tidiness of presented solutions/answers,
• correctness in spelling and grammar, and
• readability of R code (which includes usage of informative variable names and
commenting).