“Zorg en Zekerheid is a medium-sized and independent regional health insurer in the
Netherlands, with more than 460 employees and more than 380,000 policyholders. The
company is committed to providing accessible and affordable healthcare. The majority of
policyholders and healthcare providers submit claims for treatments that have actually
taken place. However, a small number commit fraud — for example by adapting invoices.
There are instances of policyholders who, after returning from vacation, submit invoices
for medical costs made abroad. Further examination shows that the invoiced amount has
been altered and is many times higher than the original amount. There are also instances
of “up-coding” — a form of fraud committed by healthcare providers performing simple
services but claiming for more complex alternatives, which results in higher costs.”
Zorg en Zekerheid needed a more accurate and efficient solution to detect claims fraud.
Assume you have been hired by Zorg en Zekerheid to work on this project — to build a
predictive modelling solution that translates many data elements from a diverse range of
sources into quantitative risk ratings for fraudulent claims.
Tasks
Having covered the CRISP-DM methodology at length at the university, you decide to
apply it to this project.
You are free to make reasonable assumptions about possible data sources. However, keep
in mind that the use of some data may not be permitted in the Netherlands. You need not
know the precise laws in the Netherlands or elsewhere; just highlight any legal or ethical
concerns that arise.
(a) Elaborate on the Business Understanding: determine business objectives and
possible ways to achieve them. Assess the situation, making assumptions where
necessary, and determine data mining goals. [35%]
(b) Discuss the next stages of Data Understanding and Data Preparation. What does your
plan for these stages look like? Think of additional data sources that might be useful
for this problem. Be creative but realistic. Describe all data sources in terms of their
expected properties (structured, unstructured, 4Vs). Comment on practical
challenges that may arise from using these sources. [30%]
(c) What variable do you expect to use as the target? What specific challenges might your
predictive analytics for detecting fraudulent claims face when using past data?
Why will you need to partition the data for predictive modelling? Will over-sampling
be needed? [35%]
