Task 2.1) Conduct and report on an exploratory data analysis (EDA) of the housing.csv data set using
the RapidMiner Studio data mining tool and RapidMiner Studio operators.
Provide the following for Task 2.1:
(i) A screen capture of the final EDA process, with a brief description of the EDA process.
(ii) Summarise the key results of the exploratory data analysis in Table 2.1 Results of Exploratory
Data Analysis for housing.csv. Table 2.1 should include key characteristics of each
variable in the housing.csv data set, such as maximum and minimum values, average, standard
deviation, most frequent values (mode), missing values and invalid values.
(iii) Discuss the key results of the exploratory data analysis presented in Table 2.1 and provide a
rationale for selecting the top 5 variables for predicting the sale price of a property (Price),
focusing in particular on the relationships of the independent variables with each other and with
the dependent variable Price, drawing on the results of the EDA and relevant literature on
what determines property prices.
(20 marks, 250 words)
Hint: The Statistics Tab and Chart Tab in RapidMiner Studio provide a lot of descriptive statistical
information and the ability to create useful charts such as bar charts, scatter plots and box plots
for EDA. You might also like to run correlations and/or chi-square tests as
appropriate to determine which variables contribute most to predicting property sale price (Price).
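
As a cross-check outside RapidMiner, the per-variable summary asked for in Table 2.1 and the correlation of each variable with Price can be sketched in plain Python. This is only an illustration under stated assumptions: the rows below are made-up sample values, and every column name except Price (LandSize, Bedrooms) is hypothetical because the brief does not list the columns of housing.csv.

```python
import statistics
from collections import Counter

# Hypothetical sample rows, NOT taken from housing.csv.
# Only the Price column name comes from the brief; None marks a missing value.
rows = [
    {"Price": 250000.0, "LandSize": 400.0, "Bedrooms": 3.0},
    {"Price": 310000.0, "LandSize": 520.0, "Bedrooms": 3.0},
    {"Price": 420000.0, "LandSize": 650.0, "Bedrooms": 4.0},
    {"Price": 180000.0, "LandSize": None, "Bedrooms": 2.0},
    {"Price": 390000.0, "LandSize": 610.0, "Bedrooms": 3.0},
]

def summarise(rows, column):
    """Return the Table 2.1 style statistics for one numeric column."""
    values = [r[column] for r in rows if r[column] is not None]
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "mode": Counter(values).most_common(1)[0][0],
        "missing": len(rows) - len(values),
    }

def pearson(rows, x_col, y_col):
    """Pearson correlation, using only rows where both values are present."""
    pairs = [(r[x_col], r[y_col]) for r in rows
             if r[x_col] is not None and r[y_col] is not None]
    xs, ys = zip(*pairs)
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# One summary per variable, plus each predictor's correlation with Price.
table = {col: summarise(rows, col) for col in rows[0]}
correlations = {col: pearson(rows, col, "Price")
                for col in rows[0] if col != "Price"}
```

Ranking the predictors by the absolute value of their correlation with Price gives one possible starting point for the top 5 shortlist. Predictors that correlate strongly with each other would still need to be checked separately, since the brief asks about relationships among the independent variables as well.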
Task 2.2) Build and report on a Linear Regression model for predicting property sale price (Price)
using a RapidMiner data mining process, an appropriate set of data mining operators and a reduced set
of variables from the housing.csv data set, as determined by your exploratory data analysis in Task 2.1.
Provide the following for Task 2.2:
(i) A screen capture of the Final Linear Regression Model process, with a brief description of the
process.
(ii) Table 2.2, named Results of Final Linear Regression Model for Task 2.2, for the housing.csv
data set.
(iii) Discuss the results of the Final Linear Regression Model for the housing.csv data set, drawing on
key outputs (coefficients, standardised coefficients, t-statistic values, p-values,
significance levels, etc.) for predicting property sale price (Price) and on relevant supporting
literature on the interpretation of a Linear Regression Model.
(16 marks, 150 words)
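
To make the outputs named in (iii) concrete, the sketch below computes a coefficient, a standardised coefficient and a t-statistic by hand for a regression with a single predictor. It is a simplified illustration, not the RapidMiner Linear Regression operator: the task itself calls for several predictors, the data values are invented, and the predictor name land_size is hypothetical. P-values are left out because they need a t-distribution, which the Python standard library does not provide.

```python
import math
import statistics

# Hypothetical (predictor, Price) observations, NOT taken from housing.csv.
land_size = [400.0, 520.0, 650.0, 610.0, 480.0, 700.0]
price = [250000.0, 310000.0, 420000.0, 390000.0, 300000.0, 450000.0]

def simple_ols(xs, ys):
    """Ordinary least squares for one predictor, with the slope's t-statistic."""
    n = len(xs)
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    # Residual variance uses n - 2 degrees of freedom (two fitted parameters).
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    se_slope = math.sqrt(sse / (n - 2) / sxx)
    return {
        "intercept": intercept,
        "slope": slope,
        # Slope rescaled to standard-deviation units of x and y.
        "std_coef": slope * statistics.stdev(xs) / statistics.stdev(ys),
        "se_slope": se_slope,
        "t_stat": slope / se_slope,
        "df": n - 2,
    }

result = simple_ols(land_size, price)
```

With 4 degrees of freedom, the two-sided 5% critical value of the t-distribution is about 2.776, so a t_stat whose absolute value exceeds that would be read as significant at the 5% level. In the single-predictor case the standardised coefficient equals the Pearson correlation between the predictor and Price, which ties this output back to the correlations from Task 2.1.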
Include all appropriate outputs, such as RapidMiner processes, graphs and tables, that support key
aspects of the exploratory data analysis and linear regression model analysis of the housing.csv data
set in your Assignment 2 report.
