When we review association patterns for interesting relationships, we commonly rely on objective measures. They are needed because, as the text indicates, the relationships may be hidden in a large data set. Taken together, however, these measures may give inconsistent information about how interesting a relationship is. With large data sets, the data scientist may also depend too much on objective measures and fail to explore alternatives that could provide a better analysis.
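To make the point about inconsistency concrete, here is a minimal sketch that computes three common objective measures (support, confidence, and lift) for a single rule. The function name `rule_measures` and the patient counts are made up for illustration and do not come from the text or from any real data set.

```python
def rule_measures(n_both, n_antecedent_only, n_consequent_only, n_neither):
    """Support, confidence, and lift for a rule A -> B from a 2x2 table of counts."""
    total = n_both + n_antecedent_only + n_consequent_only + n_neither
    support = n_both / total                             # P(A and B)
    confidence = n_both / (n_both + n_antecedent_only)   # P(B | A)
    consequent_rate = (n_both + n_consequent_only) / total  # P(B)
    lift = confidence / consequent_rate                  # P(B | A) / P(B)
    return {"support": support, "confidence": confidence, "lift": lift}


# Hypothetical counts for 1,000 patients (illustration only):
# 150 have the symptom and the disease, 50 have the symptom only,
# 650 have the disease only, and 150 have neither.
measures = rule_measures(150, 50, 650, 150)

for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

With these counts the rule has a confidence of 0.75, which looks strong, yet its lift is below 1 because the disease is even more common among patients overall (80%) than among those with the symptom. Two objective measures therefore disagree about whether the rule is interesting.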
This kind of analysis has been used at length in the medical field. Any hospital has a massive data set to work with in the form of its patients’ medical records. Presently, most hospitals and healthcare facilities use electronic medical records (EMR). This makes such a project much faster, as the researchers do not have to go through boxes of patient files but can have a program do this portion of the work for them.
At times, a doctor may not be sure of the disease based on the symptoms the patient is presenting. For this exercise, you will theoretically review such a data set and arrive at rules relating symptoms to disease. You want to find the best rules to match the symptoms with the disease, in the form Symptom(s) → Disease. Feel free to use the format in the text (p. 361) or another presentation format. Please do this for Hypertension, Diabetes, Congestive Heart Failure, Broken Bone, and two others of your choice.
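For orientation only, the sketch below shows how a Symptom(s) → Disease rule could be scored if records were available. Everything in it is an assumption made for illustration: the six toy records are invented and are not clinical data, the helper names `evaluate_rule` and `confidence_level` are hypothetical, and the 0.5 and 0.8 cutoffs for the low, medium, and high labels are arbitrary choices rather than values from the text.

```python
# Invented toy records: (set of presenting symptoms, recorded diagnosis).
records = [
    ({"thirst", "frequent urination"}, "Diabetes"),
    ({"thirst", "fatigue"}, "Diabetes"),
    ({"thirst", "headache"}, "Hypertension"),
    ({"headache", "dizziness"}, "Hypertension"),
    ({"limb pain", "swelling"}, "Broken Bone"),
    ({"swelling", "shortness of breath"}, "Congestive Heart Failure"),
]


def evaluate_rule(symptoms, disease, records):
    """Return (support, confidence) for the rule symptoms -> disease."""
    symptoms = set(symptoms)
    # Diagnoses of every record that shows all of the rule's symptoms.
    matching = [diagnosis for present, diagnosis in records if symptoms <= present]
    if not matching:
        return 0.0, 0.0
    hits = sum(1 for diagnosis in matching if diagnosis == disease)
    support = hits / len(records)      # share of all records matching the whole rule
    confidence = hits / len(matching)  # share of symptom matches with this disease
    return support, confidence


def confidence_level(confidence, medium=0.5, high=0.8):
    """Map a numeric confidence to a label; the cutoffs are assumed, not standard."""
    if confidence >= high:
        return "high"
    if confidence >= medium:
        return "medium"
    return "low"


for symptoms, disease in [
    ({"thirst"}, "Diabetes"),
    ({"thirst", "frequent urination"}, "Diabetes"),
    ({"headache"}, "Diabetes"),
]:
    support, confidence = evaluate_rule(symptoms, disease, records)
    label = confidence_level(confidence)
    print(f"{sorted(symptoms)} -> {disease}: "
          f"support={support:.2f}, confidence={confidence:.2f} ({label})")
```

In this toy data, adding a second symptom raises the confidence of the Diabetes rule while lowering its support, which is the kind of trade-off worth discussing when you justify your own rules.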
For the exercise, do not try to find electronic records to work on. You may do research online. Please explain why you chose the particular symptoms and why you assigned each rule its confidence level (low, medium, or high). If you have any questions, please let me know.