Thursday, September 21, 2023

LIS4273 - Module 5 Assignment

 For this post, I will be answering the following questions.

Question 1:

The director of manufacturing at a cookie factory needs to determine whether a new machine is producing a particular type of cookie according to the manufacturer's specifications, which indicate that cookies should have a mean breaking strength of 70 pounds and a standard deviation of 3.5 pounds. A sample of 49 cookies reveals a sample mean breaking strength of 69.1 pounds.

A. State the null and alternative hypotheses.

B. Is there evidence that the machine is not meeting the manufacturer's specifications for average strength? Use a 0.05 level of significance.

C. Compute the p value and interpret its meaning.

D. What would be your answer in (B) if the standard deviation was specified at 1.75 pounds?

E. What would be your answer in (B) if the sample mean was 69 pounds and the standard deviation was 3.5 pounds?

Answer:

A. Null hypothesis: The new machine is making a particular type of cookie according to the manufacturer's specifications with the mean breaking strength of 70 pounds.

    Alternative hypothesis: The new machine is NOT making a particular type of cookie according to the manufacturer's specifications with the mean breaking strength of 70 pounds.

We can write the null and alternative hypotheses as follows:

H0: μ = 70

H1: μ ≠ 70

Because the alternative is "not equal to," the hypotheses correspond to a two-tailed test.

B. Using the values that were given to us, we must compute the test statistic to determine if there is evidence that the machine is not meeting manufacturer’s specifications.

The formula for the test statistic is as follows:

z = (x̅ - μ) / (σ / sqrt(n))

Here x̅ is the sample mean (69.1), μ is the hypothesized population mean (70), σ is the population standard deviation (3.5), and n is the sample size (49 cookies). The calculation can be performed as follows: (69.1 - 70) / (3.5 / sqrt(49)) = -0.9 / 0.5 = -1.8

We must now determine the critical values at the 0.05 level of significance, which we can do in R now that we have the test statistic.
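A minimal sketch of this step in R, assuming a two-tailed z-test with a known population standard deviation. The variable names are illustrative and not taken from the original code.

```r
# Values given in the question
x_bar <- 69.1   # sample mean
mu0   <- 70     # hypothesized mean under the null
sigma <- 3.5    # population standard deviation
n     <- 49     # sample size
alpha <- 0.05   # significance level

# Test statistic
z <- (x_bar - mu0) / (sigma / sqrt(n))

# Two-tailed critical value: reject when |z| exceeds it
z_crit <- qnorm(1 - alpha / 2)

# Decision based on the critical value
reject_null <- abs(z) > z_crit

print(c(z = z, z_crit = z_crit))
print(reject_null)
```

With these inputs the test statistic is -1.8 and the critical values are about ±1.96, so the statistic does not fall in the rejection region.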

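The output tells us that the p_value is 0.07186064. For a two-tailed test at the 0.05 level, the critical values are ±1.96, and our test statistic of -1.8 falls between them. We therefore fail to reject the null hypothesis: there is not enough evidence to conclude that the machine is failing to meet the specifications.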

C. Compute the p value and interpret its meaning.

We can calculate the p_value using this code:
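A short sketch of that calculation, assuming the two-tailed z-test from part B. The names `z` and `p_value` are illustrative.

```r
# Test statistic from part B
z <- (69.1 - 70) / (3.5 / sqrt(49))

# Two-tailed p-value: probability in both tails beyond |z|
p_value <- 2 * pnorm(-abs(z))

print(p_value)
```

Using `-abs(z)` keeps the calculation correct whether the sample mean falls above or below the hypothesized mean.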

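The p_value is 0.07186064. Since it is greater than the significance level (alpha) of 0.05, we fail to reject the null hypothesis. In other words, if the machine really were meeting the 70-pound specification, a sample mean at least this far from 70 would still turn up about 7.2% of the time.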

D. What would be your answer in (B) if the standard deviation were specified as 1.75 pounds?

With a standard deviation of 1.75 pounds, the test statistic becomes -3.6 and the p-value 0.0003182172. Since the p-value is below 0.05, we would reject the null hypothesis.

E. What would be your answer in (B) if the sample mean were specified as 69 pounds and the standard deviation is 3.5 pounds?

With a sample mean of 69 pounds, the test statistic becomes -2 and the p-value 0.04550026. Since the p-value is just below 0.05, we would reject the null hypothesis.
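Parts D and E repeat the same test with different inputs, so one way to sketch them is with a small helper function. The function `z_test` and its argument names are hypothetical, written here for illustration only.

```r
# Hypothetical helper: two-tailed z-test with known population sd
z_test <- function(x_bar, mu0, sigma, n, alpha = 0.05) {
  z <- (x_bar - mu0) / (sigma / sqrt(n))
  p_value <- 2 * pnorm(-abs(z))
  list(z = z, p_value = p_value, reject = p_value < alpha)
}

# Part D: standard deviation changed to 1.75
part_d <- z_test(x_bar = 69.1, mu0 = 70, sigma = 1.75, n = 49)

# Part E: sample mean changed to 69, standard deviation back to 3.5
part_e <- z_test(x_bar = 69, mu0 = 70, sigma = 3.5, n = 49)

str(part_d)
str(part_e)
```

Halving the standard deviation doubles the size of the test statistic, which is why part D rejects so decisively while part E only just crosses the 0.05 threshold.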

Question 2:

If x̅ = 85, σ = standard deviation = 8, and n = 64, set up a 95% confidence interval estimate of the population mean μ.

Answer:

For a 95% confidence interval, the critical z-value is 1.96. The standard error is σ / sqrt(n) = 8 / sqrt(64) = 1, so the margin of error is 1.96 × 1 = 1.96. With these values in mind, we can plug them into the following equation to determine the range within which we expect the true population mean to fall at this level of confidence:

x̅ ± z × σ / sqrt(n) = 85 ± 1.96 × 8 / sqrt(64) = 85 ± 1.96


With 95% confidence, the population mean μ lies within the range of 83.04 to 86.96.
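The same interval can be sketched in R, assuming a z-based interval with known σ. Variable names are illustrative.

```r
# Values given in the question
x_bar <- 85
sigma <- 8
n     <- 64
conf  <- 0.95

# Critical z-value for the chosen confidence level (about 1.96)
z_star <- qnorm(1 - (1 - conf) / 2)

# Margin of error = critical value times standard error
margin <- z_star * sigma / sqrt(n)

ci <- c(lower = x_bar - margin, upper = x_bar + margin)
print(round(ci, 2))
```

Rounded to two decimals, this gives the 83.04 to 86.96 range stated above.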

Question 3:

Use the dataset found in the week 5 folder.

The accompanying data are: x = girls, y = boys (goals, time spent on assignment)

A. Calculate the correlation coefficient for this dataset

Using the code provided with the question, we generate a plot of the dataset. We also generate a correlation matrix to read off the value of the coefficient.

Correlation matrix:

When we compare girls and boys in terms of time spent and goals, we immediately see that the correlation is extremely high. Boy goals and girl goals are practically 1 to 1, and time spent is also very strongly correlated, with a value of 0.9991175.

B. Pearson correlation coefficient

To calculate the Pearson correlation coefficient, we can use the following code:

Looking at the output, we can say the Pearson correlation coefficient is 1.
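The week 5 dataset and the original code are not reproduced in this post, so the following is only a sketch of how such a calculation can look in base R. The vectors `girls` and `boys` hold made-up placeholder values, not the assignment data, so the result will not match the figures reported above.

```r
# Placeholder values for illustration only (NOT the week 5 dataset)
girls <- c(4, 5, 6, 7, 8)
boys  <- c(3, 5, 6, 8, 9)

# Pearson correlation via the built-in function
r_builtin <- cor(girls, boys, method = "pearson")

# The same coefficient from its definition:
# covariation of x and y divided by the product of their spreads
r_manual <- sum((girls - mean(girls)) * (boys - mean(boys))) /
  sqrt(sum((girls - mean(girls))^2) * sum((boys - mean(boys))^2))

# Correlation matrix for all columns of a data frame
cor_matrix <- cor(data.frame(girls, boys))

print(c(builtin = r_builtin, manual = r_manual))
print(cor_matrix)
```

Passing a whole data frame to `cor()` returns the matrix form, which is handy when there are more than two columns to compare.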

C. Create plot of the correlation

Executing this code brings up the following plot, which shows the correlation values numerically.


If we swap panel.cor for panel.shade, we can see the high correlation shading in action:

~ Katie

