Complete a hypothesis test on the two variables at the 0.05 level.
(a) Name the two variables. Answer: gender and religious preference
Chi-Square Test of Goodness of Fit
An urban economist wishes to test the claim that the distribution of residents across the regions of the United States is different today than it was in 1999. In 1999, 19.6% of the population resided in the Northeast, 23.0% resided in the Midwest, 35.4% resided in the South, and 22.0% resided in the West (based on data obtained from the Census Bureau). The economist randomly selects 2000 households in the United States and obtains the frequency distribution shown below.
Use a chi-square test of goodness of fit to determine whether the distribution of residents in the USA is different today from the distribution in 1999 at the alpha level of 0.05.
(a) What is the null hypothesis?
(b) What is the df? Answer:
(c) What is the calculated chi-square? Answer:
(d) What is the "p" value? Answer:
(e) What is the theoretical chi-square? Answer:
(f) Is the p value smaller than alpha?
(g) Is the calculated chi-square larger than the theoretical chi-square? Answer:
(h) What is the conclusion? Answer:
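The steps in (a) through (h) can be sketched in a few lines of Python. This is only an illustration: the function and variable names are invented here, and the observed counts are left as an input because the frequency table is not reproduced in this text. The closed-form p-value used below holds only for 3 degrees of freedom (four regions minus one), and 7.815 is the standard table value for df = 3 at alpha = 0.05.

```python
import math

# Regional proportions for 1999, taken from the problem statement.
PROPORTIONS_1999 = {"Northeast": 0.196, "Midwest": 0.230, "South": 0.354, "West": 0.220}
SAMPLE_SIZE = 2000
CRITICAL_VALUE_DF3_ALPHA_05 = 7.815  # chi-square table value, df = 3, alpha = 0.05


def expected_counts(proportions, n):
    """Expected count per region if the 1999 distribution still holds."""
    return {region: p * n for region, p in proportions.items()}


def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((observed[k] - expected[k]) ** 2 / expected[k] for k in expected)


def p_value_df3(chi2):
    """Upper-tail probability for a chi-square variable with exactly 3 df."""
    if chi2 <= 0:
        return 1.0
    return math.erfc(math.sqrt(chi2 / 2)) + math.sqrt(2 * chi2 / math.pi) * math.exp(-chi2 / 2)


def goodness_of_fit(observed, proportions=PROPORTIONS_1999, alpha=0.05):
    """Run the test on a dict of observed counts keyed by region (4 regions assumed)."""
    n = sum(observed.values())
    expected = expected_counts(proportions, n)
    chi2 = chi_square_statistic(observed, expected)
    p = p_value_df3(chi2)
    return {
        "df": len(expected) - 1,
        "chi_square": chi2,
        "p_value": p,
        "critical_value": CRITICAL_VALUE_DF3_ALPHA_05,
        "reject_null": p < alpha,
    }


# Expected counts under the null hypothesis for the sample of 2000 households.
for region, count in expected_counts(PROPORTIONS_1999, SAMPLE_SIZE).items():
    print(region, round(count, 1))
```

Passing the observed counts from the survey table to goodness_of_fit fills in parts (b) through (h). The null hypothesis is rejected when the p value is below alpha, which is equivalent to the calculated chi-square exceeding the critical value.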
Correlation
1. Find the correlation coefficient between age (in years) and systolic blood pressure of the following individuals. Answer: 0.7905
Age             16   25   39   45   49
Blood Pressure  109  122  143  132  199
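2. Which value of r indicates the strongest correlation: r = 0.731 or r = -0.845? Answer: -0.845 (strength depends on the absolute value of r, not its sign)
3. Which of the following could not represent a correlation coefficient?
(a) 0.926 (b) 1.054 (c) -0.7003 (d) -0.0000005
Answer: b (a correlation coefficient must lie between -1 and 1)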
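The three correlation questions can be checked with a short Python sketch. The helper names below are made up for illustration; the formula is the usual Pearson correlation, computed from deviations about the means.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def is_valid_r(value):
    """A correlation coefficient must lie in the interval [-1, 1]."""
    return -1.0 <= value <= 1.0


def strongest(values):
    """The strongest correlation is the one with the largest absolute value."""
    return max(values, key=abs)


# Question 1: age versus systolic blood pressure.
ages = [16, 25, 39, 45, 49]
pressures = [109, 122, 143, 132, 199]
r = pearson_r(ages, pressures)
print("r =", r)

# Question 2: which value indicates the strongest correlation.
print("strongest:", strongest([0.731, -0.845]))

# Question 3: which candidates cannot be correlation coefficients.
candidates = {"a": 0.926, "b": 1.054, "c": -0.7003, "d": -0.0000005}
print("invalid:", [label for label, value in candidates.items() if not is_valid_r(value)])
```

The computed r agrees with the stated answer to about three decimal places; the last digit depends on whether the result is rounded or truncated.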
Simple Linear Regression
1. The regression equation relating the selling price of a house in dollars (y) to the number of square feet of heated space (x) is y (hat) = 50.729x + 1004.50
(a) Find the selling price for a house with 2000 square feet of heated space.
Answer: $102,462.50
(b) Find the selling price of a house that did not have any heated space.
Answer: $1004.50
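Both answers follow from substituting x into the regression equation. A minimal sketch, with names chosen here for illustration:

```python
# Coefficients from the regression equation in the problem statement.
SLOPE = 50.729        # dollars per square foot of heated space
INTERCEPT = 1004.50   # predicted price when x = 0


def predicted_price(square_feet):
    """Evaluate y-hat = 50.729x + 1004.50 for a given heated area."""
    return SLOPE * square_feet + INTERCEPT


for area in (2000, 0):
    print(area, "sq ft ->", f"${predicted_price(area):,.2f}")
```

Part (b) is simply the intercept. A house with no heated space is probably outside the range of the data used to fit the line, so that prediction should be read with caution.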
2. The ages (years) of seven men and their systolic blood pressure are given in the table below.
Age             17   26   37   48   50   68   72
Blood Pressure  110  124  146  140  200  192  200
(a) Find the equation of the regression line with x (age) as the explanatory variable and y (blood pressure) as the response variable. Answer: y (hat) = 1.662x + 83.338
(b) What is the correlation coefficient between the two variables? Answer: 0.8958
(c) What is the coefficient of determination? Answer: 80.25%
(d) How much of the variation in systolic blood pressure is unexplained by this model? Answer: 19.75%
(e) Predict the blood pressure for a male age 60. Answer: 183.08
(f) Could there be a lurking variable at work here? If so, what might it be? Answer: Yes, weight
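Parts (a) through (e) can be reproduced with an ordinary least-squares fit. The sketch below uses plain Python and an illustrative helper name, fit_line; it is one way to do the arithmetic, not necessarily how the answer key was produced.

```python
import math


def fit_line(xs, ys):
    """Least-squares line y = slope * x + intercept, plus the correlation r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


ages = [17, 26, 37, 48, 50, 68, 72]
pressures = [110, 124, 146, 140, 200, 192, 200]

slope, intercept, r = fit_line(ages, pressures)
r_squared = r ** 2                        # part (c): share of variation explained
unexplained = 1 - r_squared               # part (d): share left unexplained
prediction_at_60 = slope * 60 + intercept  # part (e), using unrounded coefficients

print("slope:", slope, "intercept:", intercept)
print("r:", r, "r^2:", r_squared, "unexplained:", unexplained)
print("predicted pressure at age 60:", prediction_at_60)
```

The prediction in (e) uses the unrounded slope and intercept, which is why it comes out at about 183.08 rather than the 183.06 obtained from the rounded equation.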
Multiple Linear Regression
1. Below are data showing the carbon monoxide, tar, and nicotine content and the weight, in milligrams, of 13 brands of U.S. cigarettes. Source: FTC
Carbon Monoxide (x1)   Tar (x2)   Nicotine (x3)   Weight (x4)
13.6                   14.1       0.86            985.3
16.6                   16.0       1.06            1093.8
10.2                   8.0        0.67            928.0
5.4                    4.1        0.40            946.2
15.0                   15.0       1.04            888.5
9.0                    8.8        0.76            1026.7
12.3                   12.4       0.95            922.5
16.3                   16.6       1.12            937.2
15.4                   14.9       1.02            885.8
13.0                   13.7       1.01            964.3
14.4                   15.1       0.90            931.6
10.0                   7.8        0.57            970.5
10.2                   11.4       0.78            1124.0
(a) Which variable has the highest correlation with x1, and what is the coefficient? Answer: x2, r = 0.967
(b) Write the regression equation with x1 as the response variable and x2 as the explanatory variable. Answer: x1 = 2.365 + 0.827x2
(c) How much of the variation in x1 is explained by x2 using this model? Answer: 93.4%
(d) Write the regression equation with x1 as the response variable and x2, x3, and x4 as the explanatory variables. Answer: x1 = 6.318 + 0.822x2 + 0.031x3 - 0.004x4
(e) How much of the variation in x1 is explained by x2, x3, and x4? Answer: 94.2%
(f) Write a regression equation with x2 as the response variable and x4 as the explanatory variable. Answer: x2 = 14.885 - 0.003x4
(g) Using this last equation, predict the amount of tar produced by a cigarette weighing 950 milligrams. Answer: 12.202 mg (computed with the unrounded coefficients; the rounded slope of -0.003 gives 12.035 mg)
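The answers above can be checked with a short Python sketch. Everything here is illustrative: the helper names are invented, the simple fits use the standard least-squares formulas, and the three-predictor fit solves the normal equations by Gaussian elimination, which is adequate for a data set this small but is not how a statistics package would necessarily do it.

```python
import math

# Rows: (carbon monoxide x1, tar x2, nicotine x3, weight x4), from the table.
DATA = [
    (13.6, 14.1, 0.86, 985.3), (16.6, 16.0, 1.06, 1093.8), (10.2, 8.0, 0.67, 928.0),
    (5.4, 4.1, 0.40, 946.2), (15.0, 15.0, 1.04, 888.5), (9.0, 8.8, 0.76, 1026.7),
    (12.3, 12.4, 0.95, 922.5), (16.3, 16.6, 1.12, 937.2), (15.4, 14.9, 1.02, 885.8),
    (13.0, 13.7, 1.01, 964.3), (14.4, 15.1, 0.90, 931.6), (10.0, 7.8, 0.57, 970.5),
    (10.2, 11.4, 0.78, 1124.0),
]
co, tar, nicotine, weight = ([row[i] for row in DATA] for i in range(4))


def fit_line(xs, ys):
    """Simple least-squares fit: returns (slope, intercept, r)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx, sxy / math.sqrt(sxx * syy)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for i in range(col + 1, n):
            factor = m[i][col] / m[col][col]
            for j in range(col, n + 1):
                m[i][j] -= factor * m[col][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_multiple(predictors, ys):
    """Multiple regression via normal equations: returns ([intercept, b1, ...], R^2)."""
    rows = [[1.0] + [col[i] for col in predictors] for i in range(len(ys))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    beta = solve(xtx, xty)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - sum(b * v for b, v in zip(beta, r))) ** 2 for r, y in zip(rows, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return beta, 1 - ss_res / ss_tot


# (a) correlation of x1 with each other variable
correlations = {name: fit_line(col, co)[2]
                for name, col in (("x2", tar), ("x3", nicotine), ("x4", weight))}
best = max(correlations, key=lambda name: abs(correlations[name]))
# (b), (c) x1 on x2
slope_b, intercept_b, r_b = fit_line(tar, co)
# (d), (e) x1 on x2, x3, x4
beta, r2_multi = fit_multiple([tar, nicotine, weight], co)
# (f), (g) x2 on x4, then predict tar at 950 mg
slope_f, intercept_f, _ = fit_line(weight, tar)
tar_at_950 = slope_f * 950 + intercept_f

print(best, correlations[best], slope_b, intercept_b, r_b ** 2)
print(beta, r2_multi)
print(slope_f, intercept_f, tar_at_950)
```

Adding x3 and x4 to the model cannot lower the share of variation explained, so the R^2 from part (e) should come out at least as large as the one from part (c).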