For this assignment, I will be answering the following questions:
10.1
From our textbook: p. 188, Question 10.1
Set up an additive model for the ashina data, which is part of the ISwR package. This data allows for additive effects of subject, period, and treatment. Compare the results with those obtained from t tests.

Result Interpretation:
Analyzing the results of the additive model, we first see that the coefficient for treat is -42.87, suggesting that the treatment group experienced a 42.87-unit decrease in vas compared with the control group. The coefficient for period is 80.50, which means period 2 scored 80.50 units higher on vas than period 1. Moving on to the subjects, the significance codes stand out: the intercept, treat, subject3, subject5, subject7, subject8, and subject10 are all flagged **, i.e. significant at the 0.01 level, so we can infer that their effects are not likely due to random chance.

The residuals range from -48.94 to +48.94. This perfect symmetry is not actually a sign that the model is missing something: because each subject contributes two observations and the model includes a subject effect, each subject's pair of residuals must sum to zero, so the minimum residual is always the negative of the maximum. The R-squared value is 0.7566, meaning the model accounts for 75.66% of the variability in vas scores; while a value closer to 1 would indicate a tighter fit, this is a reasonable amount of explained variation. Lastly, with an F-statistic of 2.914 and an associated p-value of 0.02229, the model as a whole is statistically significant.
Moving on to the t test for treatment, the p-value (0.02099) shows a significant difference between the treatment and control groups. The t test for period, however, is only marginally significant (p-value 0.0672).
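For reference, here is a sketch of how this fit can be set up in R. The reshaping below follows the usual ISwR approach to the two-period crossover; the variable names (`vas`, `subject`, `treat`, `period`, `d`) are my own choices, not from the textbook:

```r
library(ISwR)
data(ashina)  # columns: vas.active, vas.plac, grp

n <- nrow(ashina)  # 16 subjects, each measured under both treatments
vas     <- c(ashina$vas.active, ashina$vas.plac)
subject <- factor(rep(1:n, 2))
treat   <- factor(rep(c("active", "placebo"), each = n))
# grp records the period in which the active treatment was given,
# so the placebo occasion falls in the other period (3 - grp)
period  <- factor(c(ashina$grp, 3 - ashina$grp))

# additive model with subject, period, and treatment effects
summary(lm(vas ~ subject + period + treat))

# paired t test for the treatment effect
t.test(ashina$vas.active, ashina$vas.plac, paired = TRUE)

# t test for a period effect: the active-minus-placebo differences
# differ between the two treatment orderings by twice the period effect
d <- ashina$vas.active - ashina$vas.plac
t.test(d[ashina$grp == 1], d[ashina$grp == 2])
```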
10.3 Consider the following definitions
Note:
rnorm() is a built-in R function that generates a vector of normally distributed random numbers. Given a sample size n (and, optionally, a mean and standard deviation, which default to 0 and 1), it returns a vector of n draws.
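A quick illustration (the seed here is arbitrary, used only so the draws are reproducible):

```r
set.seed(1)                   # arbitrary seed for reproducibility
rnorm(5)                      # five draws from the standard normal N(0, 1)
rnorm(5, mean = 10, sd = 2)   # mean and sd can also be specified
```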
Your assignment:
Generate the model matrices for models z ~ a*b, z ~ a:b, etc. In your blog posting discuss the implications. Carry out the model fits and notice which models contain singularities.
Hint
We are looking for...
model.matrix(~ a:b); lm(z ~ a:b)
R code:
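Since the code itself didn't make it into the post, here is a sketch of what was run. The definitions of a, b, and z follow the textbook's exercise setup as I recall it (gl() builds the two factors); z is random, so exact coefficient values will differ unless a seed is fixed, and the seed below is an assumption of mine:

```r
set.seed(123)       # assumed seed, only for reproducibility
a <- gl(2, 2, 8)    # factor levels: 1 1 2 2 1 1 2 2
b <- gl(2, 4, 8)    # factor levels: 1 1 1 1 2 2 2 2
z <- rnorm(8)

# design matrices for the competing model formulas
model.matrix(~ a * b)   # intercept, a2, b2, a2:b2 -- full rank
model.matrix(~ a:b)     # intercept plus all four cell dummies -- rank deficient

model1 <- lm(z ~ a * b)
model2 <- lm(z ~ a:b)   # interaction without main effects
model3 <- lm(z ~ a + b)

# a singular fit shows up as an NA (aliased) coefficient
any(is.na(coef(model1)))   # FALSE
any(is.na(coef(model2)))   # TRUE: one cell dummy is redundant
any(is.na(coef(model3)))   # FALSE
```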
Result Interpretation:
Through executing the code, the only model that contained singularities was model2, which used the formula a:b; the rest of the models reported no singularities. What are the implications of this? First, it means model2's design matrix has perfect collinearity: one column can be predicted exactly from the others. An a:b term without main effects generates a dummy column for every cell of the a-by-b cross, and together with the intercept those columns are linearly dependent. Because of this, there are infinitely many coefficient vectors that fit the data equally well, so R cannot uniquely determine the coefficients (it reports one of them as NA), and the individual estimates are unreliable.
As for the other models, which showed no singularities, their coefficient estimates are uniquely determined and can be interpreted directly. Seeing that only one parameterization had perfect collinearity, it makes sense to work with the non-singular models, such as a*b or a + b, for any further analysis.
~ Katie