i. Write code in R or Python that calculates the correlation between “Column2” and “Column3” of a “dataframe”. [1 Mark]
ii. The above dataset has been loaded for you in R or Python in a variable named “dataframe”. Write code that selects only the rows for which parameter is “Alpha”. [1 Mark]
iii. Most work in Python or R uses the system's internal memory, and with large datasets situations may arise in which the Python or R workspace cannot hold all the data in memory. Removing unused objects is one solution. Write a command that removes the rows with the value “Beta”. [1 Mark]
iv. State and explain the techniques and tools (R or Python packages) that are used to preprocess data so that it is ready for data mining. [5 Marks]

(b) Suppose that your local bank has a data mining system. The bank has been studying your debit card usage patterns. Noticing that you make many transactions at home renovation stores, the bank decides to contact you, offering information regarding its special loans for home improvements.
i. Briefly explain how this may conflict with your right to privacy. [2 Marks]
ii. Describe a privacy-preserving data mining method that may allow the bank to perform customer pattern analysis without infringing on its customers’ right to privacy. [2 Marks]

(c) Data quality can be assessed in terms of several issues, including accuracy, completeness, and consistency. For each of the above three issues:
i. Briefly discuss how data quality assessment can depend on the intended use of the data, giving examples. [2 Marks]
ii. Propose TWO other dimensions of data quality. [2 Marks]

(d) In real-world data, tuples with missing values for some attributes are a common occurrence. Describe any TWO methods for handling this problem. [2 Marks]

(e) Briefly describe any TWO issues to consider during data integration. Give an example for each. [2 Marks]

(f) What are the differences between the three main types of data warehouse usage, namely:
i. Information processing [1 Mark]
ii. Analytical processing [1 Mark]
iii. Data mining [1 Mark]
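
For parts i to iii of the first question, one possible approach in Python with pandas is sketched below. The dataset itself is not shown here, so the small table is a hypothetical stand-in, and the column name `parameter` is an assumption taken from the wording of part ii.

```python
import pandas as pd

# Hypothetical stand-in for the dataset, which is not reproduced in the paper.
# The column name "parameter" is assumed from the wording of part ii.
dataframe = pd.DataFrame({
    "parameter": ["Alpha", "Beta", "Alpha", "Gamma"],
    "Column2": [1.0, 2.0, 3.0, 4.0],
    "Column3": [2.0, 4.0, 6.0, 8.0],
})

# i. Pearson correlation between Column2 and Column3
correlation = dataframe["Column2"].corr(dataframe["Column3"])

# ii. Keep only the rows where parameter equals "Alpha"
alpha_rows = dataframe[dataframe["parameter"] == "Alpha"]

# iii. Drop the rows where parameter equals "Beta"
without_beta = dataframe[dataframe["parameter"] != "Beta"]

print(correlation)
print(alpha_rows)
print(without_beta)
```

Reassigning the filtered result to `dataframe` itself (rather than a new name) would let Python release the original table once no other reference to it remains, which is closer to the memory concern raised in part iii.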
