Javier Rojas
MS Candidate in Big Data Analytics
CIS627 - Big Data Analytics Capstone
Patients at risk of deterioration in the hospital setting may not be identified easily or promptly by healthcare professionals.
To address this, the Pediatric Early Warning System (PEWS) score was created as a tool that allows these professionals to quickly quantify severity of illness and identify critically ill patients before they possibly experience a Code Blue event, in which the patient requires immediate resuscitation.
The level of severity depends on the patient's total score, which is based on his or her vital signs; for this project, these are diastolic blood pressure, systolic blood pressure, temperature, oxygen saturation (SpO2), heart rate, and respiration rate.
The total score is obtained by adding the values assigned to each vital sign result, each on a scale from 0 (normal) to 3 (severe).
The literature indicates that a score of five or more is associated with an increased probability of a Code Blue event occurring; as such, this threshold was accounted for in this project.
Nonetheless, most PEWS implementations used today have low sensitivity and are labor intensive. Thus, there is a need for an early warning system that can improve communication between the nursing staff and physicians and can identify a higher-risk patient population more effectively and efficiently.
The objective is to improve the sensitivity of the existing PEWS early warning scores by applying statistics, subsampling, and machine learning to the analysis of raw patient data, which contains vital signs, lab results, and diagnostics.
To accomplish this goal, I intend to construct a machine learning model that can serve as an automated scoring system with a higher sensitivity than the current PEWS for identifying, in a timely manner, patients at risk of entering Code Blue.
To prepare the data for analysis, I used R's tidyverse functions, such as filter(), select(), mutate(), group_by(), slice(), and arrange(), along with the pipe (%>%) operator, to apply the intended exclusion criteria.
I implemented Support Vector Machine and Logistic Regression models using the caret (classification and regression training) and e1071 (support vector machine) packages.
I applied 10-fold cross-validation through the caret package's trainControl() function, which defines the rules for model training and the conditions for how resampling and grid search are performed.
I then called the package's train() function to evaluate each model using the cross-validation and resampling settings specified in trainControl(), measuring the effect of tuning these parameters on performance.
Next, I relied on the ROCR package to calculate the Area Under the Curve (AUC) and to create the Receiver Operating Characteristic (ROC) curve: predict() produces predicted values from the model in question, prediction() transforms those predicted values and the true labels into a standardized prediction object, and performance() creates performance objects for various types of predictor evaluations.
Lastly, I used a ggplot() object created with the ggplot2 and plotROC libraries to plot the ROC curve.
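The data-preparation step can be sketched as below; the completeness rule and the one-row-per-patient step are illustrative assumptions, not the actual exclusion criteria used:

```r
# Hypothetical sketch of applying exclusion criteria with dplyr verbs.
library(tidyverse)

vitals_clean <- vitals_raw %>%
  select(MRN, starts_with("VS_")) %>%                    # keep ID plus vital-sign columns
  filter(if_all(starts_with("VS_"), ~ !is.na(.x))) %>%   # drop records with missing vitals
  mutate(across(starts_with("VS_"), as.numeric)) %>%     # coerce vitals to numeric
  group_by(MRN) %>%
  slice(1) %>%                                           # one row per patient
  ungroup() %>%
  arrange(MRN)
```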
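The model-training steps above can be sketched as follows; `train_df` with a two-level factor outcome `Class` ("Normal"/"Worst") is an assumed name, and caret's "svmRadial" method (kernlab backend) stands in for the SVM:

```r
library(caret)

# 10-fold cross-validation; class probabilities enable ROC-based metrics.
ctrl <- trainControl(method = "cv", number = 10,
                     classProbs = TRUE, summaryFunction = twoClassSummary)

# Support Vector Machine (radial kernel) and Logistic Regression.
svm_fit <- train(Class ~ ., data = train_df, method = "svmRadial",
                 metric = "ROC", trControl = ctrl)
lr_fit  <- train(Class ~ ., data = train_df, method = "glm",
                 metric = "ROC", trControl = ctrl)
```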
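The ROC/AUC step can be sketched as below; `lr_fit` (a fitted caret model) and `test_df` with its `Class` label column are assumed names:

```r
library(ROCR)
library(ggplot2)
library(plotROC)

# Predicted probability of the positive ("Worst") class on held-out data.
probs <- predict(lr_fit, newdata = test_df, type = "prob")[, "Worst"]

# ROCR: build a prediction object, then extract the AUC.
pred <- prediction(probs, test_df$Class == "Worst")
auc  <- performance(pred, "auc")@y.values[[1]]

# Plot the ROC curve with plotROC's geom_roc().
roc_df <- data.frame(D = as.integer(test_df$Class == "Worst"), M = probs)
ggplot(roc_df, aes(d = D, m = M)) +
  geom_roc(n.cuts = 0) +
  ggtitle(sprintf("ROC curve (AUC = %.3f)", auc))
```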
## # A tibble: 40 × 182
## MRN VS_1t_1 VS_2t_1 VS_3t_1 VS_4t_1 VS_5t_1 VS_6t_1 VS_1t_2 VS_2t_2
## <int> <int> <int> <dbl> <int> <int> <int> <int> <int>
## 1 1023323 64 99 98.7 94 110 26 73 113
## 2 1028322 82 123 98.4 100 77 22 75 117
## 3 1129490 85 106 97.9 95 147 22 80 121
## 4 1130377 71 102 98.1 98 101 26 76 127
## 5 1137850 67 107 97.8 100 107 25 67 107
## 6 1177721 79 113 98.2 98 96 20 82 116
## 7 1186850 65 99 97.0 100 68 20 52 91
## 8 1225722 48 98 36.1 100 108 24 68 96
## 9 1258815 58 97 97.7 100 88 18 58 97
## 10 1259031 62 102 101.3 98 128 24 56 89
## # ... with 30 more rows, and 173 more variables: VS_3t_2 <dbl>,
## # VS_4t_2 <int>, VS_5t_2 <int>, VS_6t_2 <int>, VS_1t_3 <int>,
## # VS_2t_3 <int>, VS_3t_3 <dbl>, VS_4t_3 <int>, VS_5t_3 <int>,
## # VS_6t_3 <int>, VS_1t_4 <int>, VS_2t_4 <int>, VS_3t_4 <dbl>,
## # VS_4t_4 <int>, VS_5t_4 <int>, VS_6t_4 <int>, VS_1t_5 <int>,
## # VS_2t_5 <int>, VS_3t_5 <dbl>, VS_4t_5 <int>, VS_5t_5 <int>,
## # VS_6t_5 <int>, VS_1t_6 <int>, VS_2t_6 <int>, VS_3t_6 <dbl>,
## # VS_4t_6 <int>, VS_5t_6 <int>, VS_6t_6 <int>, VS_1t_7 <int>,
## # VS_2t_7 <int>, VS_3t_7 <dbl>, VS_4t_7 <int>, VS_5t_7 <int>,
## # VS_6t_7 <int>, VS_1t_8 <int>, VS_2t_8 <int>, VS_3t_8 <dbl>,
## # VS_4t_8 <int>, VS_5t_8 <int>, VS_6t_8 <int>, VS_1t_9 <int>,
## # VS_2t_9 <int>, VS_3t_9 <dbl>, VS_4t_9 <int>, VS_5t_9 <int>,
## # VS_6t_9 <int>, VS_1t_10 <int>, VS_2t_10 <int>, VS_3t_10 <dbl>,
## # VS_4t_10 <int>, VS_5t_10 <int>, VS_6t_10 <int>, VS_1t_11 <int>,
## # VS_2t_11 <int>, VS_3t_11 <dbl>, VS_4t_11 <int>, VS_5t_11 <int>,
## # VS_6t_11 <int>, VS_1t_12 <int>, VS_2t_12 <int>, VS_3t_12 <dbl>,
## # VS_4t_12 <int>, VS_5t_12 <int>, VS_6t_12 <int>, VS_1t_13 <int>,
## # VS_2t_13 <int>, VS_3t_13 <dbl>, VS_4t_13 <int>, VS_5t_13 <int>,
## # VS_6t_13 <int>, VS_1t_14 <int>, VS_2t_14 <int>, VS_3t_14 <dbl>,
## # VS_4t_14 <int>, VS_5t_14 <int>, VS_6t_14 <int>, VS_1t_15 <int>,
## # VS_2t_15 <int>, VS_3t_15 <dbl>, VS_4t_15 <int>, VS_5t_15 <int>,
## # VS_6t_15 <int>, VS_1t_16 <int>, VS_2t_16 <int>, VS_3t_16 <dbl>,
## # VS_4t_16 <int>, VS_5t_16 <int>, VS_6t_16 <int>, VS_1t_17 <int>,
## # VS_2t_17 <int>, VS_3t_17 <dbl>, VS_4t_17 <int>, VS_5t_17 <int>,
## # VS_6t_17 <int>, VS_1t_18 <int>, VS_2t_18 <int>, VS_3t_18 <dbl>,
## # VS_4t_18 <int>, VS_5t_18 <int>, VS_6t_18 <int>, ...
SVM
## Confusion Matrix and Statistics
##
## Reference
## Prediction Normal Worst
## Normal 14 7
## Worst 8 11
##
## Accuracy : 0.625
## 95% CI : (0.458, 0.7727)
## No Information Rate : 0.55
## P-Value [Acc > NIR] : 0.2142
##
## Kappa : 0.2462
## Mcnemar's Test P-Value : 1.0000
##
## Sensitivity : 0.6111
## Specificity : 0.6364
## Pos Pred Value : 0.5789
## Neg Pred Value : 0.6667
## Prevalence : 0.4500
## Detection Rate : 0.2750
## Detection Prevalence : 0.4750
## Balanced Accuracy : 0.6237
##
## 'Positive' Class : Worst
##
LR
## Confusion Matrix and Statistics
##
## Reference
## Prediction Normal Worst
## Normal 11 8
## Worst 11 10
##
## Accuracy : 0.525
## 95% CI : (0.3613, 0.6849)
## No Information Rate : 0.55
## P-Value [Acc > NIR] : 0.6844
##
## Kappa : 0.0547
## Mcnemar's Test P-Value : 0.6464
##
## Sensitivity : 0.5556
## Specificity : 0.5000
## Pos Pred Value : 0.4762
## Neg Pred Value : 0.5789
## Prevalence : 0.4500
## Detection Rate : 0.2500
## Detection Prevalence : 0.5250
## Balanced Accuracy : 0.5278
##
## 'Positive' Class : Worst
##
Although the results indicate the models perform better than random guessing, I was not completely satisfied, so I approached the same problem by working directly with the maximum PEWS score ever assigned to each patient.
If a patient's maximum PEWS score was less than 5, the patient was assigned a "normal" label; otherwise, a "worst" label.
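This labeling rule can be sketched as follows, assuming a per-patient `max_pews` column (the column and data frame names are hypothetical):

```r
library(dplyr)

# Label each patient by their maximum PEWS score: < 5 is "Normal",
# 5 or more is "Worst".
patients_labeled <- patients %>%
  mutate(Class = factor(if_else(max_pews < 5, "Normal", "Worst"),
                        levels = c("Normal", "Worst")))
```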
## Confusion Matrix and Statistics
##
## Reference
## Prediction Normal Worst
## Normal 27 2
## Worst 2 9
##
## Accuracy : 0.9
## 95% CI : (0.7634, 0.9721)
## No Information Rate : 0.725
## P-Value [Acc > NIR] : 0.006632
##
## Kappa : 0.7492
## Mcnemar's Test P-Value : 1.000000
##
## Sensitivity : 0.8182
## Specificity : 0.9310
## Pos Pred Value : 0.8182
## Neg Pred Value : 0.9310
## Prevalence : 0.2750
## Detection Rate : 0.2250
## Detection Prevalence : 0.2750
## Balanced Accuracy : 0.8746
##
## 'Positive' Class : Worst
##
## Confusion Matrix and Statistics
##
## Reference
## Prediction Normal Worst
## Normal 12 5
## Worst 17 6
##
## Accuracy : 0.45
## 95% CI : (0.2926, 0.6151)
## No Information Rate : 0.725
## P-Value [Acc > NIR] : 0.99994
##
## Kappa : -0.0304
## Mcnemar's Test P-Value : 0.01902
##
## Sensitivity : 0.5455
## Specificity : 0.4138
## Pos Pred Value : 0.2609
## Neg Pred Value : 0.7059
## Prevalence : 0.2750
## Detection Rate : 0.1500
## Detection Prevalence : 0.5750
## Balanced Accuracy : 0.4796
##
## 'Positive' Class : Worst
##
## # A tibble: 8,217 × 183
## Patient.ID VS_1t_1 VS_2t_1 VS_3t_1 VS_4t_1 VS_5t_1 VS_6t_1 VS_1t_2
## <int> <int> <int> <dbl> <dbl> <int> <int> <int>
## 1 835205 72 119 98.1 100 98 18 85
## 2 1127414 67 108 97.9 85 80 24 67
## 3 1224603 82 132 98.5 99 82 20 82
## 4 948156 73 105 100.4 100 77 19 73
## 5 1067551 59 94 98.4 100 88 18 59
## 6 775157 83 125 97.9 97 103 18 83
## 7 60081351 67 101 98.0 99 71 20 62
## 8 1057692 55 86 97.5 99 129 24 55
## 9 1285366 68 104 98.2 99 121 22 49
## 10 941231 74 121 36.3 100 120 16 78
## # ... with 8,207 more rows, and 175 more variables: VS_2t_2 <int>,
## # VS_3t_2 <dbl>, VS_4t_2 <dbl>, VS_5t_2 <int>, VS_6t_2 <int>,
## # VS_1t_3 <int>, VS_2t_3 <int>, VS_3t_3 <dbl>, VS_4t_3 <dbl>,
## # VS_5t_3 <int>, VS_6t_3 <int>, VS_1t_4 <int>, VS_2t_4 <int>,
## # VS_3t_4 <dbl>, VS_4t_4 <dbl>, VS_5t_4 <int>, VS_6t_4 <int>,
## # VS_1t_5 <int>, VS_2t_5 <int>, VS_3t_5 <dbl>, VS_4t_5 <dbl>,
## # VS_5t_5 <int>, VS_6t_5 <int>, VS_1t_6 <int>, VS_2t_6 <int>,
## # VS_3t_6 <dbl>, VS_4t_6 <dbl>, VS_5t_6 <int>, VS_6t_6 <int>,
## # VS_1t_7 <int>, VS_2t_7 <int>, VS_3t_7 <dbl>, VS_4t_7 <dbl>,
## # VS_5t_7 <int>, VS_6t_7 <int>, VS_1t_8 <int>, VS_2t_8 <int>,
## # VS_3t_8 <dbl>, VS_4t_8 <dbl>, VS_5t_8 <int>, VS_6t_8 <int>,
## # VS_1t_9 <int>, VS_2t_9 <int>, VS_3t_9 <dbl>, VS_4t_9 <dbl>,
## # VS_5t_9 <int>, VS_6t_9 <int>, VS_1t_10 <int>, VS_2t_10 <int>,
## # VS_3t_10 <dbl>, VS_4t_10 <dbl>, VS_5t_10 <int>, VS_6t_10 <int>,
## # VS_1t_11 <int>, VS_2t_11 <int>, VS_3t_11 <dbl>, VS_4t_11 <dbl>,
## # VS_5t_11 <int>, VS_6t_11 <int>, VS_1t_12 <int>, VS_2t_12 <int>,
## # VS_3t_12 <dbl>, VS_4t_12 <dbl>, VS_5t_12 <int>, VS_6t_12 <int>,
## # VS_1t_13 <int>, VS_2t_13 <int>, VS_3t_13 <dbl>, VS_4t_13 <dbl>,
## # VS_5t_13 <int>, VS_6t_13 <int>, VS_1t_14 <int>, VS_2t_14 <int>,
## # VS_3t_14 <dbl>, VS_4t_14 <dbl>, VS_5t_14 <int>, VS_6t_14 <int>,
## # VS_1t_15 <int>, VS_2t_15 <int>, VS_3t_15 <dbl>, VS_4t_15 <dbl>,
## # VS_5t_15 <int>, VS_6t_15 <int>, VS_1t_16 <int>, VS_2t_16 <int>,
## # VS_3t_16 <dbl>, VS_4t_16 <dbl>, VS_5t_16 <int>, VS_6t_16 <int>,
## # VS_1t_17 <int>, VS_2t_17 <int>, VS_3t_17 <dbl>, VS_4t_17 <dbl>,
## # VS_5t_17 <int>, VS_6t_17 <int>, VS_1t_18 <int>, VS_2t_18 <int>,
## # VS_3t_18 <dbl>, VS_4t_18 <dbl>, VS_5t_18 <int>, ...
##
## Normal Worst
## 0.9889254 0.0110746
These results prompted me to break up the patients into age groups and the PEWS results into thresholds of interest. The PEWS thresholds were <=3, <=4, and <=5, where patients falling within a threshold were labeled "normal" and those outside it "worst". The age groups were as follows: Infant (<1 yr), 316 patients; Toddler (1-2 yr), 2,015 patients; Preschool (3-5 yr), 1,434 patients; School-age (6-11 yr), 1,709 patients; Adolescent (12-15 yr), 1,131 patients.
Nonetheless, this did not resolve the class-imbalance problem, and I eventually resorted to considering two scenarios: manual balancing (working with the unbalanced design) and ROSE sampling (which combines under-sampling with the generation of additional synthetic data).
In addition, I noticed that I had incorrectly reported 10-fold cross-validation with 20 repeats: I had specified "cv" instead of "repeatedcv" for the method parameter of trainControl(). After correcting this, I reduced the number of repeats to 5 because the repeated procedure was computationally time-consuming.
With these scenarios in place, I compared both models under the "Manual" and "ROSE" sampling schemes for one of the age groups, the school-age (6-11 yr) group.
We can observe in the following results that applying ROSE sampling significantly improved the heavily imbalanced class distribution encountered originally.
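The corrected control settings and the ROSE scenario can be sketched as below; the sampling happens inside each resample, and the fitted model names passed to resamples() are assumptions:

```r
library(caret)

# Repeated 10-fold cross-validation (5 repeats) with ROSE sampling applied
# within each resample to rebalance the training folds.
ctrl_rose <- trainControl(method = "repeatedcv", number = 10, repeats = 5,
                          sampling = "rose",
                          classProbs = TRUE, summaryFunction = twoClassSummary)

# Collect both models' resampling results for a side-by-side summary.
resultsb <- resamples(list(LR = lr_fit_rose, SVM = svm_fit_rose))
summary(resultsb)
```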
## # A tibble: 1,709 × 182
## VS_1t_1 VS_2t_1 VS_3t_1 VS_4t_1 VS_5t_1 VS_6t_1 VS_1t_2 VS_2t_2 VS_3t_2
## <int> <int> <dbl> <dbl> <int> <int> <int> <int> <dbl>
## 1 67 108 97.9 85 80 24 67 108 97.9
## 2 82 132 98.5 99 82 20 82 132 98.5
## 3 67 101 98.0 99 71 20 62 110 98.0
## 4 55 86 97.5 99 129 24 55 86 97.5
## 5 91 103 97.4 98 107 18 91 103 97.8
## 6 70 109 98.1 99 96 20 63 102 99.0
## 7 64 99 98.2 100 130 18 64 99 98.2
## 8 59 96 98.0 99 91 20 60 97 98.2
## 9 66 102 98.4 100 67 18 66 102 98.4
## 10 53 90 97.8 93 100 24 53 90 97.8
## # ... with 1,699 more rows, and 173 more variables: VS_4t_2 <dbl>,
## # VS_5t_2 <int>, VS_6t_2 <int>, VS_1t_3 <int>, VS_2t_3 <int>,
## # VS_3t_3 <dbl>, VS_4t_3 <dbl>, VS_5t_3 <int>, VS_6t_3 <int>,
## # VS_1t_4 <int>, VS_2t_4 <int>, VS_3t_4 <dbl>, VS_4t_4 <dbl>,
## # VS_5t_4 <int>, VS_6t_4 <int>, VS_1t_5 <int>, VS_2t_5 <int>,
## # VS_3t_5 <dbl>, VS_4t_5 <dbl>, VS_5t_5 <int>, VS_6t_5 <int>,
## # VS_1t_6 <int>, VS_2t_6 <int>, VS_3t_6 <dbl>, VS_4t_6 <dbl>,
## # VS_5t_6 <int>, VS_6t_6 <int>, VS_1t_7 <int>, VS_2t_7 <int>,
## # VS_3t_7 <dbl>, VS_4t_7 <dbl>, VS_5t_7 <int>, VS_6t_7 <int>,
## # VS_1t_8 <int>, VS_2t_8 <int>, VS_3t_8 <dbl>, VS_4t_8 <dbl>,
## # VS_5t_8 <int>, VS_6t_8 <int>, VS_1t_9 <int>, VS_2t_9 <int>,
## # VS_3t_9 <dbl>, VS_4t_9 <dbl>, VS_5t_9 <int>, VS_6t_9 <int>,
## # VS_1t_10 <int>, VS_2t_10 <int>, VS_3t_10 <dbl>, VS_4t_10 <dbl>,
## # VS_5t_10 <int>, VS_6t_10 <int>, VS_1t_11 <int>, VS_2t_11 <int>,
## # VS_3t_11 <dbl>, VS_4t_11 <dbl>, VS_5t_11 <int>, VS_6t_11 <int>,
## # VS_1t_12 <int>, VS_2t_12 <int>, VS_3t_12 <dbl>, VS_4t_12 <dbl>,
## # VS_5t_12 <int>, VS_6t_12 <int>, VS_1t_13 <int>, VS_2t_13 <int>,
## # VS_3t_13 <dbl>, VS_4t_13 <dbl>, VS_5t_13 <int>, VS_6t_13 <int>,
## # VS_1t_14 <int>, VS_2t_14 <int>, VS_3t_14 <dbl>, VS_4t_14 <dbl>,
## # VS_5t_14 <int>, VS_6t_14 <int>, VS_1t_15 <int>, VS_2t_15 <int>,
## # VS_3t_15 <dbl>, VS_4t_15 <dbl>, VS_5t_15 <int>, VS_6t_15 <int>,
## # VS_1t_16 <int>, VS_2t_16 <int>, VS_3t_16 <dbl>, VS_4t_16 <dbl>,
## # VS_5t_16 <int>, VS_6t_16 <int>, VS_1t_17 <int>, VS_2t_17 <int>,
## # VS_3t_17 <dbl>, VS_4t_17 <dbl>, VS_5t_17 <int>, VS_6t_17 <int>,
## # VS_1t_18 <int>, VS_2t_18 <int>, VS_3t_18 <dbl>, VS_4t_18 <dbl>,
## # VS_5t_18 <int>, VS_6t_18 <int>, VS_1t_19 <int>, ...
##
## Normal Worst
## 0.4839087 0.5160913
The following table summaries were obtained by comparing both models' overall accuracies and kappa statistics; kappa measures how well a model's predictions agree with the true values. One common interpretation of the kappa statistic is summarized as follows:
Poor agreement = less than 0.20
Fair agreement = 0.20 to 0.40
Moderate agreement = 0.40 to 0.60
Good agreement = 0.60 to 0.80
Very good agreement = 0.80 to 1.00
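For convenience, this scale can be encoded as a small helper function (illustrative only):

```r
# Map a kappa value onto the agreement scale above.
kappa_label <- function(k) {
  cut(k, breaks = c(-Inf, 0.20, 0.40, 0.60, 0.80, Inf),
      labels = c("Poor", "Fair", "Moderate", "Good", "Very good"),
      right = FALSE)
}

kappa_label(0.7492)  # the SVM kappa from the earlier confusion matrix -> "Good"
```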
Manual Balancing
##
## Call:
## summary.resamples(object = resultsa)
##
## Models: LR, SVM
## Number of resamples: 30
##
## Accuracy
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## LR 0.9532 0.9825 0.9883 0.9864 0.9942 1 0
## SVM 0.9825 0.9883 0.9941 0.9916 0.9942 1 0
##
## Kappa
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## LR -0.010340 -0.007859 -0.005882 0.1082 0 0.6640 2
## SVM -0.007859 -0.005882 0.000000 0.0156 0 0.4956 3
ROSE sampling
##
## Call:
## summary.resamples(object = resultsb)
##
## Models: LR, SVM
## Number of resamples: 30
##
## Accuracy
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## LR 0.8363 0.8772 0.8921 0.8925 0.9104 0.9593 0
## SVM 0.8480 0.8889 0.8977 0.8982 0.9104 0.9415 0
##
## Kappa
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## LR 0.6720 0.7544 0.7842 0.7850 0.8200 0.9185 0
## SVM 0.6952 0.7777 0.7949 0.7961 0.8209 0.8831 0
## Confusion Matrix and Statistics
##
## Reference
## Prediction Normal Worst
## Normal 793 24
## Worst 34 858
##
## Accuracy : 0.9661
## 95% CI : (0.9563, 0.9741)
## No Information Rate : 0.5161
## P-Value [Acc > NIR] : <2e-16
##
## Kappa : 0.932
## Mcnemar's Test P-Value : 0.2373
##
## Sensitivity : 0.9728
## Specificity : 0.9589
## Pos Pred Value : 0.9619
## Neg Pred Value : 0.9706
## Prevalence : 0.5161
## Detection Rate : 0.5020
## Detection Prevalence : 0.5219
## Balanced Accuracy : 0.9658
##
## 'Positive' Class : Worst
##
LR (ROC & AUC)
## The Area Under ROC curve for this model is 0.9658383
Table 1: The results in the table above correspond to the best ML results, in terms of sensitivity, for each age group across the PEWS thresholds considered; all come from the implemented Logistic Regression model. TP and FN stand for true positives (correctly classified "worst" patients) and false negatives (incorrectly classified "worst" patients), while TN and FP stand for true negatives (correctly classified "normal" patients) and false positives (incorrectly classified "normal" patients).
“ROC 1 (Manual - Best)”
Figure 1: The best corresponding true positive rates for each age group, in ascending order, displayed as an ROC curve.
“Accuracy 1 (Manual - Best)”
Figure 2: The best corresponding accuracies for each age group, in ascending order, displayed as a bar plot.
“ROC 2 (ROSE - Best)”
Figure 3: The best corresponding true positive rates for each age group, in ascending order, displayed as an ROC curve.
“Accuracy 2 (ROSE - Best)”
Figure 4: The best corresponding accuracies for each age group, in ascending order, displayed as a bar plot.
The results demonstrated so far have been promising in terms of correctly classifying patients considered "normal" or "worst". Nonetheless, there still remains the task of analyzing the factor of time, which here corresponds to the moment each patient's observation was recorded.
To start on this task, I took one patient who had suffered a cardiac arrest and considered the vital signs that were highly correlated with one another, based on the correlogram created below:
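A sketch of how such a correlogram can be produced; the corrplot package and the `vitals_one` matrix name are assumptions:

```r
library(corrplot)

# Correlation matrix of one patient's vital-sign series, visualized as a
# correlogram; pairwise-complete observations tolerate missing values.
corrplot(cor(vitals_one, use = "pairwise.complete.obs"), method = "circle")
```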
Once these vital signs were determined, I plotted the results with the help of the qcc (Quality Control Charts) package in R, which flags runs in each vital sign that violate the Nelson Rules: a set of eight decision rules for detecting "out-of-control" or non-random conditions on control charts, where the magnitude of some variable is plotted against time.
The rules are based on the mean value and the standard deviation of the samples, and they are as follows:
Rule 1: One point is more than 3 standard deviations from the mean.
Rule 2: Nine (or more) points in a row are on the same side of the mean.
Rule 3: Six (or more) points in a row are continually increasing (or decreasing).
Rule 4: Fourteen (or more) points in a row alternate in direction, increasing then decreasing.
Rule 5: Two (or three) out of three points in a row are more than 2 standard deviations from the mean in the same direction.
Rule 6: Four (or five) out of five points in a row are more than 1 standard deviation from the mean in the same direction.
Rule 7: Fifteen points in a row are all within 1 standard deviation of the mean on either side of the mean.
Rule 8: Eight points in a row exist, but none within 1 standard deviation of the mean, and the points are in both directions from the mean.
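The control-chart step can be sketched as below; note that the qcc package flags points beyond the control limits and violating runs rather than reporting all eight Nelson Rules individually. The `hr` vector (one patient's heart-rate series) is a hypothetical name:

```r
library(qcc)

# Individuals (X) chart for a single patient's heart-rate observations.
chart <- qcc(hr, type = "xbar.one", plot = TRUE)

chart$violations$beyond.limits   # indices of points beyond the control limits
chart$violations$violating.runs  # indices of points in violating runs
```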
## $beyond.limits
## [1] 80
##
## $violating.runs
## [1] 7 52 53 79 80 81 82 83 84 85 86 95 96 115 116 117 118
## [18] 119 120 121 122 123 124 125 126 127 128 129 140 141 151 152 153 154
## [35] 155 156 157 158 159 160 161 171 172 186
## $beyond.limits
## integer(0)
##
## $violating.runs
## [1] 40 41 42 43 44 45 46 47 48 49 50 51 52 53 66 67 68
## [18] 69 70 79 80 81 82 92 93 94 95 96 104 105 106 107 108 109
## [35] 110 111 123 124 125 140 141 142 143 151 152 153 154 155 156 157 158
## [52] 159 160 161 171 172 173 174 175 176 186 187 188 189 190 191 192 193
## $beyond.limits
## [1] 132 133 144
##
## $violating.runs
## [1] 7 8 9 10 11 12 147 148 149 150 151 152 174 175 176 177 178
## [18] 179 180 181 182 183 184 185 186 19 20 31 32 33 76 77 96 97
## [35] 98 99 100 101 102 103 104 105 106 114 115 116 117 118 119 120 159
## $beyond.limits
## integer(0)
##
## $violating.runs
## [1] 7 147 148 186 187 31 32 33 47 48 49 76 77 78 79 103 104
## [18] 118 119 120 121 122 123 124 125 174 175
The indices of points beyond the control limits are highlighted in red, and those triggering one of the Nelson Rules are highlighted in orange.
Note that, at certain moments, there is a wide gap in time between one observation and the next; the results are nonetheless plotted sequentially, in the order they were observed.
The final step is to find innovative ways of quantifying these results in a manner that can be integrated into the implemented model's analysis, with the end goal of detecting at-risk patients a few hours ahead of time.
Sax Frederic L, C.M.E., Medical Patients at high risk for catastrophic deterioration. Crit Care Med, 1987. 15: p. 510-15.
Smith AF, W.J., Can some in-hospital cardio-respiratory arrests be prevented? A prospective survey. Resuscitation, 1998. 37: p. 133-7.
Subbe, C.P., et al., Validation of a modified Early Warning Score in medical admissions. Qjm, 2001. 94(10): p. 521-6.
Gardner-Thorpe, J., et al., The value of Modified Early Warning Score (MEWS) in surgical in-patients: a prospective observational study. Ann R Coll Surg Engl, 2006. 88(6): p. 571-5.
Odetola, F.O., A. Gebremariam, and G.L. Freed, Patient and hospital correlates of clinical outcomes and resource utilization in severe pediatric sepsis. Pediatrics, 2007. 119(3): p. 487-94.
Parshuram Christopher S, H.J., Middaugh Kristen, Development and initial validation of the Bedside Paediatric Early Warning System score. Crit Care Forum, 2009. 13(4): p. 1-10.
Monaghan, A., Detecting and managing deterioration in children. Paediatric Nurs, 2005. 17(1): p. 32-5.
Tucker KM, B.T., Baker RB, Demeritt B, Vossmeyer MT, Prospective evaluation of a pediatric inpatient early warning scoring system. J Spec Pediatr Nurs, 2009. 14(2): p. 79-85.
Skaletzky SM, Raszynski A, Totapally BR. Validation of a modified pediatric early warning system score: a retrospective case-control study. Clin Pediatr (Phila). 2012 May;51(5):431-5. doi: 10.1177/0009922811430342. Epub 2011 Dec 8. PubMed PMID: 22157421.
Mitchell TM. Machine learning. New York: McGraw-Hill; 1997. xvii, 414 p.
Michalski RSa, Carbonell JG, Mitchell TM, Anderson JR. Machine learning : an artificial intelligence approach. Palo Alto, Calif.: Tioga Pub. Co.; 1983. v. p.
Bezdek JC. Pattern Analysis. In: Pedrycz W, Bonissone PP, Ruspini EH, editors. Handbook of Fuzzy Computation. Bristol: Institute of Physics; 1998. p. F6.1.-F6..20.
Wu MC, Lee S, Cai T, Li Y, Boehnke M, Lin X. Rare-variant association testing for sequencing data with the sequence kernel association test. Am J Hum Genet. 2011;89(1):82-93.
Lunardon, N., Menardi, G., & Torelli, N. (n.d.). ROSE: A Package for Binary Imbalanced Learning. Retrieved from https://journal.r-project.org/archive/2014-1/menardi-lunardon-torelli.pdf
Compare The Performance of Machine Learning Algorithms in R - Machine Learning Mastery. (n.d.). Retrieved June 21, 2017, from http://machinelearningmastery.com/compare-the-performance-of-machine-learning-algorithms-in-r/