Hi there,
I need help with my assignment for Business statistics. It is to do with regression analysis and hypothesis testing. A good knowledge of Excel is required for this task.
I would like to know how to work out these questions too.

Data on Various Household Financial Variables for Randomly Selected Households							
							
Household	Location	Family_Size	Ownership	Income	Utilities	Debt	Monthly_Payment
1	2	2	1	96709	252	5692	1585
2	2	6	0	77470	216	4267	1314
3	4	3	0	65746	207	2903	383
4	1	1	1	56610	249	3896	1002
5	3	3	0	59185	217	3011	743
6	1	4	0	56441	208	3718	991
7	1	1	1	47969	243	5907	849
8	1	1	1	55487	242	2783	752
9	2	3	1	59947	256	6275	1498
10	1	6	0	68808	222	4845	991
11	1	1	1	53113	251	5267	1163
12	4	3	0	48319	209	2256	619
13	1	1	1	48064	250	3918	1434
14	4	2	0	50309	190	879	997
15	4	4	1	25853	254	4606	948
16	1	3	1	49728	272	5427	1378
17	3	2	0	79123	201	3482	902
18	1	4	1	46265	256	6658	1626
19	1	2	1	49809	253	4234	834
20	4	4	1	43640	259	5393	729
21	1	1	1	57395	248	4456	1473
22	3	1	1	52847	245	3999	1038
23	2	2	1	59486	251	2598	933
24	2	4	1	109665	255	6261	903
25	2	1	1	63825	250	5402	1346
26	1	3	0	124842	202	3911	574
27	2	2	1	88270	257	5889	1407
28	4	4	1	33646	264	4275	606
29	2	3	1	112446	254	5737	699
30	1	3	1	56067	264	5538	988
31	4	3	0	31042	214	1852	746
32	1	3	1	84272	264	7020	1510
33	2	4	1	126416	261	7212	1091
34	4	2	0	44395	211	1796	402
35	3	5	0	51533	225	3452	696
36	2	3	1	93507	249	6428	1548
37	1	4	1	75551	258	4469	919
38	2	2	1	58666	245	5516	1669
39	1	3	1	77550	256	5239	1049
40	4	3	1	34760	259	3731	493
41	3	7	1	65830	286	7228	722
42	1	3	1	56375	261	3441	881
43	2	5	1	101743	264	6067	1424
44	1	1	1	29805	253	5462	1161
45	4	1	0	35289	192	1734	457
46	2	5	1	116010	277	6153	1458
47	3	2	0	53739	199	3763	977
48	2	2	1	68806	251	4930	1453
49	2	1	1	66770	257	6025	1658
50	2	1	1	59321	247	5314	1664
51	2	4	1	80712	260	5179	1062
52	2	2	1	93156	253	5768	1660
53	3	5	0	65291	215	2767	699
54	3	5	1	58389	279	5311	573
55	1	3	0	107937	207	3934	748
56	1	2	1	54362	244	4648	1318
57	4	5	0	37741	223	3030	572
58	3	2	1	32848	254	4113	720
59	4	2	0	27791	206	1428	669
60	2	3	1	147155	260	6434	1272
61	3	2	1	75351	247	4428	1084
62	2	3	1	105213	257	6675	1476
63	1	2	0	57571	195	1821	1033
64	1	1	1	43847	244	4264	847
65	3	1	1	39862	250	4809	1472
66	4	2	0	44437	201	3705	520
67	4	2	0	46295	195	3661	501
68	2	3	1	104286	259	7043	1292
69	1	5	0	81276	221	1739	490
70	3	4	1	70304	260	5895	1435
71	2	2	1	66346	247	6584	1914
72	1	4	1	117620	249	6165	1123
73	3	2	1	89772	259	7177	1293
74	4	3	0	63834	212	3949	776
75	2	2	1	151964	246	6309	875
76	2	4	1	101510	254	6956	1534
77	2	4	1	97868	261	7363	1283
78	1	3	1	102142	272	4848	1138
79	3	2	1	33105	249	3634	976
80	1	4	1	62697	251	5235	1302
81	2	1	1	26910	244	3752	1028
82	3	5	1	64085	278	5430	640
83	2	2	1	110645	254	5429	989
84	1	2	1	96192	245	6172	1751
85	3	5	1	48304	265	5131	752
86	4	3	0	68714	207	3187	428
87	1	5	0	118372	218	2615	888
88	1	3	1	93026	254	7415	1469
89	2	2	1	112609	255	4752	943
90	4	3	0	59901	202	983	577
91	4	1	1	27377	254	3233	815
92	2	2	0	78960	199	3285	1134
93	2	3	1	114934	259	4497	1278
94	3	3	1	64753	265	5736	1014
95	2	2	1	64154	259	5048	1088
96	2	2	1	130387	241	6145	1137
97	4	6	1	41335	278	6188	959
98	2	4	1	118377	265	7345	831
99	3	3	0	76648	205	3144	578
100	2	6	1	75512	275	5991	1040


Car demand				
				
Year	Sales	RP	RDI	IR
1993	7115274	108	1668	7.9
1994	8676408	112	1728	5.7
1995	9321305	111	1797	5.3
1996	9618508	111	1916	8
1997	7448339	118	1897	10.8
1998	7049843	128	1932	7.9
1999	8606856	136	2001	6.8
2000	9104932	143	2067	6.8
2001	9304247	154	2167	9.11
2002	8316018	166	2212	12.7
2003	6578359	179	2214	15.3
2004	6206688	190	2249	18.9
2005	5756614	198	2262	14.9
2006	6795226	203	2331	10.8
2007	7951786	209	2470	12
2008	8204694	215	2542	9.9
2009	8222475	224	2645	8.3
2010	7080889	233	2676	8.2

ETF2121/ETF5912 Data Analysis in Business
1
ASSIGNMENT 2
Due by Monday 5pm of Week 12
? This assignment comprises 15% of the assessment for ETF2121 and 15% of the
assessment for ETF5912, and it consists of questions on Statistical Inference and
Regression Analysis. The data files for this assignment must be downloaded from
the “Assignments” tab of the ETF2121/ETF5912 Moodle site.
? You must submit a printed “hard copy” of your written work (with an Assignment
Cover Sheet - from the “Assignments” tab on Moodle) by 5pm on the due date.
Submit it to your tutor in your tutorial (or to your tutor’s mailbox, 5th floor H Block).
ENSURE that you
* submit a printed “hard copy” of your assignment to your tutor.
* obtain a receipt for your submitted assignment by uploading it to Moodle.
This receipt will prove that you submitted your work, and that it was on
time. (Go to the “Assignments” tab and click on the “Assignment 2” link to
upload.)
? There is a penalty of 10% of the marks for each day the assignment is overdue.
Extensions beyond the due date will only be allowed in special circumstances.
Contact your lecturer.
? Do not submit your assignment in a folder – stapled pages are easier for the
marker. Save trees! Double-sided printing is encouraged.
? Keep your marked assignment work until after publication of final results for the unit.
? The assignment should be no more than 15 pages. PAGES EXCEEDING THE
PAGE LIMIT WILL NOT BE MARKED.
? Assignments without signed cover sheets WILL NOT BE ACCEPTED.
? For each question
o Use the relevant Excel functions and procedures.
o You may insert text boxes at relevant points in the spreadsheet for any
comments and explanations.
o In point form, give details on how the procedure was implemented. You
may use a text box within the spreadsheet for this or type your details in a
WORD document and insert them at the relevant point among your print
outs. Alternatively, you may cut and paste some of the data and your
output and insert it into the WORD document.
o Do not print out whole spreadsheets, but just sections showing relevant
data and output.
? Some Excel help is available via the “EXCEL” tab on Moodle.
ETF2121/ETF5912 Data Analysis in Business
2
Question 1
Use the relevant Excel functions and procedures to generate your output.
Consider a random sample of 100 households from a middle-class neighbourhood that
was the recent focus of an economic development study conducted by the local
government. Specifically, for each of the 100 households in the sample that is in one of
four local government locations, information was gathered on the gross annual income of
the household (in $) and on each of several other variables. The data are given in the
EXCEL file A2_1. Economic researchers would like to test for existence of a significance
difference between the mean annual income levels of householders in every pair locations
(that is, first and second; first and third; first and fourth; second and third; second and
fourth; third and fourth).
a) Before conducting any hypothesis test on the difference between the various pairs
of mean income levels, perform a test for equal population variances in income
for each pair of locations. Comment on your findings.
b) Given your conclusions in Part (a), perform a test for the existence of a difference
in mean annual income levels in each pair of locations. Comment on your
findings.
Question 2
Use the relevant Excel functions and procedures to generate your output.
Continuing on from Question 1, economic researchers want to understand the relationship
between the size of monthly mortgage or rent payment for household in this particular
neighbourhood and the following set of variables: family size, location of the household
with the neighbourhood, an indication whether those surveyed own or rent their home;
gross annual household income (in $); average monthly expenditure on utilities (in $) and
the total indebtedness (in $ and excluding the value of monthly mortgage or rent
payment) of the household. The data are in the EXCEL file A2_1.
a) Fit a regression model to predict the size of the monthly mortgage or rent
payments using all the explanatory variables. Explain whether multicollinearity is
a problem.
b) Fit the best regression model to predict the size of the monthly mortgage or rent
payments. Explain why this is the best regression model.
c) Comment on the goodness of fit of the model from Part (b).
d) Interpret each of the regression coefficients for the fitted model in Part (b).
e) For the fitted model in Part (b), are there any possible outliers? If there are, take
the necessary action.
f) For the fitted model in Part (b), after taking Part (e) into account, check for
violations of the regression assumptions.
ETF2121/ETF5912 Data Analysis in Business
3
Question 3
Use the relevant Excel functions and procedures to generate your output.
The EXCEL file A2_3 contains the following information for the years 1993 to 2010:
? Domestic Car sales (in thousands)
? Real price index for new car prices (where 1987 = 100 is the base index)
? Real disposable income (in 1992 dollars)
? Interest rate
a) Fit a multiple linear regression model to predict domestic car sales. Analyse the
fitted model.
b) Fit a multiplicative model to predict domestic car sales. Interpret the coefficients.
c) Compare the two estimated models from Parts (a) and (b). Explain which is the
better model.
d) During a year in which the real price index is 250, the interest rate is 12% and the
real disposable income per person is $3,500, and using the multiplicative model,
there is a 5% chance that car sales will be less than what particular value. (Hint:
Use the standard error of prediction and the fact that the errors are approximately
normally distributed)