This assessment involves writing a report that summarises a data-science-related investigation that you have
conducted on data that you have collected yourself. The investigation must involve the main topics covered in the
subject, most notably data pre-processing (representation, wrangling, tidying) and exploratory data
visualisation using R/RStudio.
It brings together everything you have learned this semester, including all aspects of visualisation, data
processing and wrangling. However, the pre-processing and exploratory steps to be carried out will not be
specified; you have to make independent choices and decisions.
We have provided you with four datasets, each with its own readme file. The data and the descriptor files are stored in
the Capstone Data subfolder within the Assessment folder in LearnJCU. You may also find your own data using
good practices. This could include data from the UC Irvine Machine Learning Repository. Your dataset must contain
at least 1000 observations of 5 variables, unless the targeted data science problem relates
to spatio-temporal data, in which case fewer than 5 variables may be allowed.
Preferably, you should use a dataset relevant to your place of work. Do not use data from textbooks or from R
packages. Do not use data from the same public sources that have been used in the subject (e.g. UCI repository).
You can use public data, but the data should be appropriate for addressing a relevant data science problem.
You don’t need to solve this entire data science problem in your investigation, but you need to clearly indicate
what the targeted problem is and how your project can contribute towards addressing it.
You have to write a report with details about the problem in question, the data, the methods, results, analyses and
findings. You might like to look online for research papers for examples of how to shape your report. Obviously,
many of these papers will have involved extensive work to collect their data; we don’t expect that from you.
We also don’t expect you to win a Nobel Prize with this assessment. Ideally, you will be able to demonstrate that:
(a) you have grasped important concepts associated with this subject, most notably data pre-processing and
exploratory visualisation; and (b) you can communicate your investigation in a formal written manner.