When analyzing an NBA team’s season performance, the first things most people look at are its win-loss record and where the team ranks in its conference and division. To explain these outcomes, we must consider what is known in basketball as “The Four Factors”. These four metrics are a team’s effective field goal percentage (EFG), free throw rate (FTR), turnovers per possession (TPP) and offensive rebounding percentage (ORP). Research has found that they summarize a team’s overall offensive and defensive performance, which can help us predict whether the team being analyzed wins or loses a game.
With the above in mind, in this assignment I will analyze the data in the .csv file that records the performance metrics of the NBA’s Miami Heat and of their opponents for each game of the 2010-2011 season. The file also records whether the Miami Heat won or lost each game. Using this data, I intend to verify whether the metrics described in the introduction do indeed serve as crucial indicators of whether the Miami Heat won or lost a game.
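# Load the game-by-game data for the 2010-2011 Miami Heat season
MHdata<-read.csv("MiamiHeat.csv")
# Attach the data frame so its columns can be referenced by name
attach(MHdata)
# List the available columns
names(MHdata)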
## [1] "Game" "Date" "Location" "Opp"
## [5] "Win" "FG" "FGA" "EFG"
## [9] "FG3" "FG3A" "FT" "FTA"
## [13] "FTR" "Rebounds" "DR" "OffReb"
## [17] "ORP" "Assists" "Steals" "Blocks"
## [21] "Turnovers" "TPP" "Fouls" "Points"
## [25] "OppFG" "OppFGA" "DEFG" "OppFG3"
## [29] "OppFG3A" "OppFT" "OppFTA" "DFTR"
## [33] "OppOffReb" "OppDR" "OppRebounds" "DORP"
## [37] "OppAssists" "OppSteals" "OppBlocks" "OppTurnovers"
## [41] "DTPP" "OppFouls" "Diff" "OppPoints"
\[ \begin{aligned} EFG &= (FGM + 0.5 \times 3PM)/FGA \\ FTR &= FTM/FGA \\ TPP &= TO_t/POSS_t \\ ORP &= OREB_t/(OREB_t + DREB_o) \end{aligned} \]
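As a small illustration of these formulas, the sketch below evaluates each factor for a single made-up game. All of the numbers are hypothetical and are not taken from MiamiHeat.csv; the variable names (`TPM` for 3PM, `DREBo` for the opponent's defensive rebounds, and an assumed possession count `POSS`) are chosen only for this example.

```r
# Hypothetical single-game totals (illustrative values, NOT from MiamiHeat.csv)
FGM   <- 40   # field goals made
TPM   <- 8    # three-pointers made (3PM in the formula; R names cannot start with a digit)
FGA   <- 80   # field goals attempted
FTM   <- 20   # free throws made
TO    <- 14   # team turnovers
POSS  <- 100  # team possessions (assumed value)
OREB  <- 10   # team offensive rebounds
DREBo <- 30   # opponent defensive rebounds

# The Four Factors, computed exactly as in the formulas above
EFG <- (FGM + 0.5 * TPM) / FGA
FTR <- FTM / FGA
TPP <- TO / POSS
ORP <- OREB / (OREB + DREBo)

round(c(EFG = EFG, FTR = FTR, TPP = TPP, ORP = ORP), 3)
```

The defensive counterparts (DEFG, DFTR, DTPP, DORP) would follow the same formulas with the opponent's totals in place of the team's.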
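# Regress Win (0/1) on the offensive and defensive Four Factors
MHdata.lr<-lm(Win~EFG+FTR+TPP+ORP+DEFG+DFTR+DTPP+DORP, MHdata)
# Coefficient table and overall fit statistics
summary(MHdata.lr)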
##
## Call:
## lm(formula = Win ~ EFG + FTR + TPP + ORP + DEFG + DFTR + DTPP +
## DORP, data = MHdata)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.62507 -0.20060 -0.00156 0.22559 0.53717
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.562436 0.681544 0.825 0.41193
## EFG 0.034768 0.005851 5.943 8.83e-08 ***
## FTR 0.011826 0.003507 3.372 0.00119 **
## TPP -0.025934 0.009562 -2.712 0.00833 **
## ORP 0.015724 0.004924 3.193 0.00208 **
## DEFG -0.024623 0.005771 -4.267 5.86e-05 ***
## DFTR -0.007630 0.004535 -1.683 0.09673 .
## DTPP 0.029401 0.011464 2.565 0.01238 *
## DORP -0.019203 0.007982 -2.406 0.01867 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.2824 on 73 degrees of freedom
## Multiple R-squared: 0.6571, Adjusted R-squared: 0.6195
## F-statistic: 17.49 on 8 and 73 DF, p-value: 3.042e-14
The summary reports an overall p-value of 3.042e-14, far below 0.05. This is strong evidence against the null hypothesis that “The Four Factors” have no influence on whether the team won or lost.
In addition, the residual standard error is 0.2824 on 73 degrees of freedom. It estimates the typical amount by which the observed response (Win) deviates from the fitted regression line.
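To make the definition concrete, the following sketch shows how a residual standard error is obtained from the residuals of a fitted `lm` object: the square root of the residual sum of squares divided by the residual degrees of freedom. It uses simulated data with a hypothetical 0/1 response and a single predictor, not the Miami Heat data, so the numeric value will differ from 0.2824.

```r
# Simulated stand-in data (hypothetical; NOT the Miami Heat file)
set.seed(1)
n <- 82
x <- rnorm(n)
y <- as.integer(0.3 * x + rnorm(n, sd = 0.3) > 0)  # artificial 0/1 response

fit <- lm(y ~ x)

# Residual standard error by hand: sqrt(RSS / residual degrees of freedom)
rse_manual <- sqrt(sum(residuals(fit)^2) / df.residual(fit))

# The same quantity as reported by summary()
rse_builtin <- summary(fit)$sigma

all.equal(rse_manual, rse_builtin)
```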
Another quantity we can base our analysis on is the F-statistic, which indicates whether there is a relationship between our predictors and the response variable. The further the F-statistic is above 1, the stronger the evidence against the null hypothesis (H0 : There is no relationship between Wins and The Four Factors). However, how large the F-statistic needs to be depends on both the number of data points and the number of predictors. In our example, the F-statistic is 17.49 on 8 and 73 degrees of freedom, which is much larger than 1 given the size of our data.
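As a rough check on these figures, the overall F-statistic can be recovered from the reported Multiple R-squared and the degrees of freedom, and its p-value from the upper tail of the F distribution. The sketch below uses only the rounded values printed in the summary, so the result agrees with the reported 17.49 only up to rounding; the variable names are chosen for this example.

```r
# Values copied from the summary output above
r2       <- 0.6571  # Multiple R-squared
k        <- 8       # number of predictors (numerator degrees of freedom)
df_resid <- 73      # residual degrees of freedom

# Overall F-statistic expressed through R-squared
f_stat <- (r2 / k) / ((1 - r2) / df_resid)

# Upper-tail probability under the F(8, 73) distribution
p_value <- pf(f_stat, df1 = k, df2 = df_resid, lower.tail = FALSE)

f_stat
p_value
```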
plot(MHdata.lr,which=2)
\[ \begin{aligned} 1 \text{ (win)}, &\quad \text{if } Y \geq 0.5 \\ 0 \text{ (lose)}, &\quad \text{if } Y < 0.5 \end{aligned} \]
W<-predict(MHdata.lr)
W
## 1 2 3 4 5 6
## 0.13269009 0.78132622 0.95123613 1.30552487 1.42991846 0.25140707
## 7 8 9 10 11 12
## 0.82616845 0.62507351 0.52359223 0.67270045 1.08612284 0.50900481
## 13 14 15 16 17 18
## 0.37440558 0.11820039 0.07903593 0.63322954 0.32236433 0.85766581
## 19 20 21 22 23 24
## 1.17671705 1.20109335 0.79246505 0.74260814 1.10038613 1.12185162
## 25 26 27 28 29 30
## 1.22360394 0.74669935 0.85471420 0.89322030 0.56740869 0.25398047
## 31 32 33 34 35 36
## 0.69653393 0.73396158 1.00886933 0.71526181 0.98459868 1.08788289
## 37 38 39 40 41 42
## 0.80914410 0.46282776 0.69663953 0.42968176 -0.17096526 0.19913253
## 43 44 45 46 47 48
## 0.33728369 1.02536644 0.45818570 0.61723071 0.66298055 1.09670835
## 49 50 51 52 53 54
## 0.82244105 0.99424404 0.91460493 0.72885842 0.96939169 0.27773834
## 55 56 57 58 59 60
## 0.56606081 0.77047909 0.95177207 0.16775382 0.73226896 0.32528730
## 61 62 63 64 65 66
## 0.30743325 -0.16142810 0.42300456 0.35383120 0.64394860 1.41455770
## 67 68 69 70 71 72
## 1.20404909 0.17077875 1.12150279 0.67615083 0.74946765 1.03229808
## 73 74 75 76 77 78
## 0.90179323 -0.22789145 1.10748411 1.06604379 1.10264805 0.25052020
## 79 80 81 82
## 0.86055585 1.02492776 0.91420409 0.83748038
# Count the games classified as wins under the rule Y >= 0.5
length(which(W >= 0.5))
## [1] 59
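One possible next step is to compare the classified outcomes with the recorded results, game by game. The sketch below applies the 0.5 rule to the first six fitted values printed above and cross-tabulates them against a set of actual outcomes. The `actual_win` vector is hypothetical and stands in for the `Win` column, whose values are not shown in this report.

```r
# First six fitted values, copied from the predict() output above
fitted_vals <- c(0.13269009, 0.78132622, 0.95123613,
                 1.30552487, 1.42991846, 0.25140707)

# HYPOTHETICAL actual results for those games (placeholder for the Win column)
actual_win <- c(0, 1, 1, 1, 1, 1)

# Apply the classification rule: 1 (win) if Y >= 0.5, otherwise 0 (lose)
predicted_win <- as.integer(fitted_vals >= 0.5)

# Cross-tabulate predicted against actual outcomes
confusion <- table(Predicted = predicted_win, Actual = actual_win)
confusion

# Share of games classified correctly
accuracy <- mean(predicted_win == actual_win)
accuracy
```

With the full data set, the same comparison could be made by replacing the two vectors with `W` and the `Win` column.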