# ISSCM 491 Statistics for managers Excel PHstat My Occupational group is Group # 3 Admin Support – everything else should be pretty straight forward in the

ISSCM 491 Statistics for managers Excel PHstat My Occupational group is Group # 3 Admin Support

– everything else should be pretty straight forward in the excel spreadsheet it shows and tells you exactly what to do. – the file for this assignment is hw7.xlsx

Also the Survey Data ISSCM491 file is attached as (if needed) to look at this for this homework assignment (if it is) mentioned inside hw7.xlsx

Townes.HW1.xlsx will show you the The Correct way on how to answer and submit this assignment, so you could take a look at that to help how to format to answer the questions. Homework 7

(This homework has four questions)

Note: The answers to all questions should be provided (written down).

Do not show them on the computer output.

Question 1.

You are interested in determining the relationship between the employee INCOME (Y), and of

the years of WORK (X1), years of EDUCation (X2) and GENDER (X3) (male or female) for the

employees in your occupational group. (You will be using the file that you have created for your

occupational group as part of homework 1). You can use PHStat output to answer these questions.

Suppose you wanted to use all three variables (X1= years of WORK, X2= EDUCation and

X3=GENDER) as the independent variables to predict the INCOME of the employees

in your occupational group.

(a) Using the method of least squares, determine the equation for the estimated

regression line Y = b0 + b1X1 + b2X2 + b3X3

(b) State the meaning of the slope b3 for the GENDER of the employees.

(c) Using the estimated multiple regression equation, calculate the predicted INCOME of a

FEMALE employee with 20 years of WORK experience and 14 years of EDUCation.

(d) If you had used the average INCOME as your estimate, what would be the total sum of

squared errors (SST)?

(e) If you use multiple regression prediction values as your estimate for employee INCOMES,

what would be the squared errors – error sum of squares (SSE)?

(f) How much of the total variation is explained by the years of WORK, EDUCation, and

GENDER (SSR)?

(g) What percentage of the total variation in employee INCOMES is explained by the years

of WORK, EDUCATION, and GENDER (r2Y.123)?

(h) Construct 95% confidence interval for the mean INCOME of the MALE employees

with 20 years WORK experience and 14 years of EDUCation.

(i) In testing Ho that b3=0 against the two sided alternative, should the Null hypothesis

be rejected or accepted at a sinficance level of 0.05?

What does it mean to reject the Null hypothesis (b3=0) in this case?

(j) Based on the hypothesis testing result of (i) above, does the GENDER have a significant

effect on the INCOME of the employees in your occupational group?

(k) Now, compare your conclusion in (j) above with the conclusion you had in your answer

to Question #6 of homework 3. Are your answers consistent?

Page 1 of 6 (Homework 7)

Important:

PHStat REGRESSION procedure, when CIE & PIE is selected, sometimes fails to work.

If that happens, do the following:

(a) Open a new file and copy your data into this new file

(b) Repeat PHStat –> Multiple-Sample Test –> One-Way ANOVA

(c) If the same problem continues

(i) Close and restart your computer

(ii) Open a new file and copy your data to this new file

(iii) Repeat PHStat –> Multiple-Sample Test –> One-Way ANOVA

Question 2.

The marketing manager of StarDisk company wanted to study the effect of price on the

demand for their 1GB USB Flash drives. The company has selected a sample of 11 stores

with equal sales potential each, and used different weekly prices/promotions in each store.

The prices charged (X) and the weekly demand (Y) for the flash drives in these stores

are given below:

1

2

3

4

5

6

7

8

9

10

11

(a)

(b)

(c)

(d)

(e)

(f)

(g)

(h)

(i)

(j)

(k)

(l)

Y

287

288

290

295

300

300

300

299

297

296

295

X

7,5

8,0

8,5

9,0

9,5

10,0

10,5

11,0

11,5

12,0

12,5

Y = Demand

X = Price

Determine the equation for the estimated regression line Y = b0 + b1X1 (Call this Model 1)

State the meaning of the slope b1 for the PRICE.

Is PRICE a significant factor in determining the sales? Use a significance level (a) of 0.05?

What percentage of the total variation in SALES is explained by the PRICE (What is r2)?

Construct 95% confidence interval for the mean SALES of the 1 GB Flasdrives when

the price is $10.

Develop a regression model that includes price (X), and the squard prices (X2) to predict

Y = b0 + b1X1 + b2X12 to predict the weekly sales. Determine the equation for the

estimated regression line. (Call this Model 2)

(Note that you need to create a new column that will have the square prices. That coulmn

will be the second X variable in the table)

Is squared prices (X12) a significant factor in determining the sales? Use a significance

level (a) of 0.05?

Predict the weekly sales for the 1GB Flashdrive for a price of $10 by using Model 2.

Using Model 2, construct 95% confidence interval for the mean SALES of the 1 GB USB Flash

drives when the price is $10.

What percentage of the total variation in SALES is explained by the PRICE and

Squared PRICES (What is r2 equal to?)

Which regression model is more appropriate, Model 1 or Model 2? Why?

What price do you suggest the company charge for 1 GB USB Flash drives? Why?

Page 2 of 6 (Homework 7)

Question 3.

Note this his problem was covered in homework 3 and in Example 1 of class 4.

The CFO of a Healthcare organization wanted to examine whether two of the

hospitals (Hospitals A and B) operated by the organization differed on the average

cost per case. The financial manager has randomly taken a random sample of 6 cases

from hospital A and hospital B and assigned the following costs to each discharge.

Cases

1

2

3

4

5

6

Hospital A Hospital B

4,5

2,0

5,0

6,0

5,5

4,0

3,5

3,5

6,5

4,5

5,0

4,0

We have already shown in class that this problem could be solved by using

(a) Hypothesis testing for the difference between two means

(b) One-Way Analysis of Variance

Your job is to check if regression analysis can also be used to answer this question.

For regression purposes, the observations can be (re)stated as follows:

Cost

4,5

5,0

5,5

3,5

6,5

5,0

2,0

6,0

4,0

3,5

4,5

4,0

Hospital

A

A

A

A

A

A

B

B

B

B

B

B

First, you need to assign dummy variables for hospitals (0=Hospital A, 1=Hospital B) to use

regression to predict the costs.

(a) Determine the equation for the estimated regression line Y = b0 + b1X1, where X1 is

a dummy variable used for the hospitals.

(b) State the meaning of the slope b1 for the dummy variable (used for hospitals).

(c) Is the coefficient b1 significant at significance level (a) of 0.05? What is the P-Value

associated with it?

(d) Using your answer to (c) above, answer the following question:

Is there strong evidence to show that there is a significant cost

difference in the average costs between hospitals A and B?

(e) Is this answer consistent with the answers we have when obtained by using hypothesis

testing and ANOVA approaches covered in Class 4?

Page 3 of 6 (Homework 7)

=3.80-3.85 = -0.05

Question 4.

(This is a nonlinear regression question)

Enterprise Industries produces Fresh, a brand of liquid laundry detergent.

In order to more effectively manage its inventory, the company would like

to better predict demand for Fresh. To develop a prediction model, the

company has gathered data concerning demand for Fresh over the last 30

sales periods (each sales period is considered to be a four-week period).

The demand data are given in Table 1 with the following variables:

Variables:

Y the demand for large size bottle of Fresh (in hundreds of thousands

of bottles) in the sales period

X1 The price (in dollars) of Fresh as offered by Enterprise Industries in

the sales period

X2 The average industry price (in dollars) of competitors’ similar

detergents in the sales period

X3 Enterprise Industries’ advertising expenditure (in hundreds of

thousands of dollars) to promote Fresh in the sales period

The Enterprise Industries believes on theoretical grounds that the single

independent variable X4 = X2-X1 (The “price difference” in the sales

period) adequately describes the effects of X1 and X2 on Y. That is,

perhaps demand for Fresh depends more on how the price for Fresh

compares to competitors’ price than it does on the absolute levels of the

prices for Fresh and other competing detergents. This makes sense since

most consumers must buy a certain amount of detergent no matter what

the price might be.

First, you need to complete column X4 in the Table 1.

I have completed the first two rows.

Questions:

(1) Answer the following questions by using model Y = b0 + b1X4 + b2X3

(a) What is the estimated regression equation?

(b) Interpret the slopes of b1 and b2.

(c) Do the p-values for the independent variables in this model indicate

that these independent variables (X4, X3) are significant?

(d) What percent of the total variation in observed demands is explained

by this model? (What is r2 equal to?)

(2) Consider the demand for Fresh Detergent in the future sales period

when Enterprise Industries’ price for Fresh will be X1 = 3.70,

Table 1

FRESH DETERGENT DATA

Price

X1

3,85

3,75

3,70

3,70

3,60

3,60

3,60

3,80

3,80

3,85

3,90

3,90

3,70

3,75

3,75

3,80

3,70

3,80

3,70

3,80

3,80

3,75

3,70

3,55

3,60

3,65

3,70

3,75

3,80

3,70

IndPrice Demand

X2

Y

3,80

7,38

4,00

8,51

4,30

9,52

3,70

7,50

3,85

9,33

3,80

8,28

3,75

8,75

3,85

7,87

3,65

7,10

4,00

8,00

4,10

7,89

4,00

8,15

4,10

9,10

4,20

8,86

4,10

8,90

4,10

8,87

4,20

9,26

4,30

9,00

4,10

8,75

3,75

7,95

3,75

7,65

3,65

7,27

3,90

8,00

3,65

8,50

4,10

8,75

4,25

9,21

3,65

8,27

3,75

7,67

3,85

7,93

4,25

9,26

X2-X1

X4

-0,05

0,25

AdvExp

X3

5,50

6,75

7,25

5,50

7,00

6,50

6,75

5,25

5,25

6,00

6,50

6,25

7,00

6,90

6,80

6,80

7,10

7,00

6,80

6,50

6,25

6,00

6,50

7,00

6,80

6,80

6,50

5,75

5,80

6,80

of competitors’ similar detergent will be X2 = 3.90 and Enterprise

the average price Industries’ advertising expenditure for Fresh

will be X3 = 6.50.

(a) What is the point prediction of this demand?

(b) Identify the 95 percent confidence interval estimation for the avergae demand.

Page 4 of 6 (Homework 8)

USING PHStat TO OBTAIN SIMPLE REGRESSION

Choose the following:

Check ANOVA and Coefficients Tables

PHStat —> Regression —> Simple Linear Regression… —>

Output

(Fill out the window as shown below)

Data

X Value

Confidence Level

1.652,96

95%

Intermediate Calculations

Sample Size

70

Degrees of Freedom

68

t Value

1,9955

XBar, Sample Mean of X

214,9286

Sum of Squared Differences from

7.606.537

XBar

Standard Error of the Estimate 15,9436

h Statistic

0,0143

Predicted Y (YHat)

214,929

17

For Average Y

Interval Half Width

3,803

Confidence Interval Lower Limit 211,126

Confidence Interval Upper Limit 218,731

18a

For Individual Response Y

Interval Half Width

32,041

Prediction Interval Lower Limit 182,887

Prediction Interval Upper Limit 246,970

Regression Statistics

Multiple R

0,8617

R Square

0,7425

Adjusted R Square

0,7388

Standard Error

15,9436

Observations

70

18b

15

16

14cba

ANOVA

Regression

Residual

Total

Intercept

AREA

df

1

68

69

Coefficients

81,110

0,081

11

SS

49.853,253

17.285,390

67.138,643

Standard Error

9,744

0,006

MS

49.853,253

254,197

F

196,121

Signif F

0,00000

t Stat

8,324

14,004

P-value

0,00000

0,00000

Lower 95%

19a

61,67

0,07

Upper 95%

100,55

0,09

19b

USING PHStat TO OBTAIN MULTIPLE LINEAR REGRESSION

Choose the following:

PHStat —> Regression —> Multiple Regression… —>

Check ANOVA and Coefficients Tables

(Fill out the window as shown below)

Confidence Interval Estimate and Prediction Interval

Example 1-2

Data

Confidence Level

AREA given value

ASSESS given value

0,95

1

1.652,96

137,60

Enter

these

Predicted Y (YHat)

214,906

27

For Average Predicted Y (YHat)

Interval Half Width

3,193

Confidence Interval Lower Limit

211,714

Confidence Interval Upper Limit

218,099

For Individual Response Y

Interval Half Width

26,901

Prediction Interval Lower Limit

188,005

Prediction Interval Upper Limit

241,807

SUMMARY OUTPUT

Regression Statistics

Multiple R

R Square

Adjusted R Square

Standard Error

Observations

28a

28b

Example 1-2

0,9063

0,8213

0,8160

13,3820

70

25

26

24cba

ANOVA

df

2

67

69

Regression

Residual

Total

Intercept

AREA

ASSESS

Coefficients

53,5824

0,0646

0,3961

21

SS

55.140,48

11.998,17

67.138,64

Stand Error

9,6202

0,0057

0,0729

MS

27.570,24

179,077

t Stat

5,5698

11,3225

5,4337

F

153,9574

Significance F

P-value

0,0000

0,0000

0,0000

Lower 99% Upper 99%

28,0771

79,0877

0,0495

0,0798

0,2028

0,5893

29a

0,0000

29b

nter

ese

Homework 1

Note that each student will prepare and submit the homework assignments individually.

It is important that you complete this first assignment since the remaining assignments will be using the

work on this assignment. You will be using MS Excel or PHStat software. It is acceptable, if you prefer

to use another software for the study. Please let me know if you need help with the use of Excel.

You will be using the survey data given in ISSCM491 file for most of the homework problems.

These 800 employees have different occupations. The categorical variables used for occupation are

0. Managerial 1. Professional 2. Production 3. Admin. Support 4. Laborer

5. Finance 6. Corporate 8. Information Systems 9. Technical 10. Service

You will be working with the survey data of a specific occupation. The occupational group data that

you will be working is provided on the Announcements page of the course web site.

For all the questions during this quarter, the term Occupational Group refers to

the occupational group that you will be working.

The survey data (file ISSCM491.xlsx) can be downloaded from the Assignments page of the course website.

The Excel file ISSCM491.xlsx has the following three worksheets

Includes the survey information (questions).

Has the survey data as observed.

Has the same survey data sorted according to the occupational groups.

(This sorted data will be helpful to obtain the sample values of your occupational group).

Questions Part I:

1. Extract the employees of your group and save them with a meaningful file name

such as manag.xls, profe.xls, prod.xls etc.

2. Calculate the Descriptive Statistics for INCOME, Years WORKed and EDUCation for your Group.

3. Find the frequency distribution and Histogram for the INCOME of your Group

(See next page for using the Descriptive Statistics and Histogram options of MS Excel).

(a) Calculate the Descriptive Statistics for INCOME, Years WORKed and EDUCation for FEMALE

employees in your Group.

(b) Calculate the Descriptive Statistics for INCOME, Years WORKed and EDUCation for MALE

employees in your Group.

5. Select a random sample of 16 employees from this 800 observations.

(Note that this sample will be randomly selected from the whole 800 sample survey results,

not from the group you are working with).

(a) Explain how you have selected your sample to ensure that it is a random sample.

(b) What is the average (mean) AGE of the employees in your sample?

4.

Questions Part II: (These are general questions that do not use the sample survey data.

Your are asked to identify the population and suggest a sample selection for each case).

6. You would like to know, if there is a convincing evidence that newborns from mothers who smoke

have a different average birth weight than newborns from mothers who don’t smoke?

(a) What is the population for this study?

(b) How do you select the sample for this study?

7. Suppose a pharmaceutical company had developed a new drug for lowering blood pressure,

and they are preparing a clinical trial (experiment) to test the drug’s effectiveness.

(a) What is the population in this study?

(b) How do you proceed with this study? i.e., How do you collect the sample for this study?

Submit the file containing your answers to questions (1), (2), (3), (4), (5), (6) and (7) electronically

on the course website on Sakai.

Please write your name and your Occupational Group number on the first page of your submission.

Page 1 of 3 (Homework 1)

USING PHStat to calculate DECSRIPTIVE STATISTICS

PHStat–> Descritive Statistisc –> Descriptive Summary –>

Page 2

Page 2 of 3 (Homework 1)

To calculate Descriptive Statistics Using Excel

Step 1

Step 2

Select Data pane

Click Data Analysis …

Page 3

(Data analysis dialog window will open)

If Data Analysis… is not available on your tools menu, then the Analysis Toolpack has not been

installed in your system. See the FAQ on the course website.

Step 3

Choose Descriptive Statistics by clicking it once and then clicking the OK button

(The Descriptive Statistics dialog window will open. Provide the necessary information as shown below)

Step 4

Click the OK button. A new worksheet will be created with the results.

To construct a Histogram and Frequency Distribution Using Excel

Step 1

Step 2

Step 3

Select Data pane

Click Data Analysis …

(Data analysis dialog window will open)

Select Histogram option and click the OK button

(Next you will get the Histogram window. Provide the necessary information as shown below).

Step 4

Click the OK button. A new worksheet will be created with the results.

Page 3 of 3 (Homework 1)

QUESTION 1

1

2

3

4

5

6

7

8

9

ID_No

OCCUP

Resp

INCOME

Years

WORKed

EDUC

GENDER

AGE

Work

HRS

No

EARN

3

3

3

3

3

3

3

3

3

3

3

3

3

3

3

3

3

3

3

3

3

3

3

3

3

3

3

3

3

3

3

3

3

3

3

3

3

3

35.2

25.8

34.3

34.1

29.3

31.7

30.6

29.7

37.1

27.8

33.1

29.3

26.7

25.9

35.8

30.0

26.2

44.6

32.7

30.0

29.7

30.5

28.9

23.2

48.7

24.4

23.4

25.3

25.5

72.8

27.8

25.4

25.0

50.7

27.1

27.7

25.3

28.1

15

20

7

33

10

15

20

27

21

20

25

8

20

9

45

15

10

10

13

25

15

20

12

22

8

6

26

17

21

19

10

15

11

24

15

20

3

22

12

14

12

14

13

12

14

12

13

12

16

12

12

12

12

12

12

16

15

13

12

12

12

12

12

16

12

13

14

12

15

14

12

18

13

13

12

13

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

33

50

24

50

26

53

52

43

39

40

42

28

37

35

63

42

35

27

29

61

32

55

29

38

32

29

42

38

47

36

27

37

32

44

41

40

20

65

50

40

49

42

40

40

40

40

36

45

36

50

40

43

35

40

40

40

42

40

40

40

32

40

40

40

40

35

60

40

45

40

40

40

40

45

40

40

1

1

2

3

1

1

1

2

3

1

1

1

2

1

1

2

2

2

2

1

2

1

2

2

1

3

1

1

3

1

1

1

2

2

2

3

2

1

2

14

21

31

50

77

94

98

99

100

108

110

112

113

114

126

137

140

150

163

166

176

187

198

200

212

213

218

220

236

249

250

252

269

280

300

307

312

350

359

360

390

400

450

500

550

600

650

700

750

800

19

29

40

58

82

109

115

134

207

222

233

254

315

385

3

3

3

3

3

3

3

3

3

3

3

3

3

3

3

3

3

3

3

3

3

3

3

3

3

3

3

32.5

25.0

34.2

31.9

25.5

46.1

28.1

25.6

37.3

24.2

33.2

20.3

26.8

43.1

24.7

25.4

32.5

27.9

36.8

33.9

60.5

26.3

55.8

29.3

62.7

56.6

29.3

Total 2134.90

Average 32.84

25

6

30

21

34

36

4

24

20

20

37

20

27

7

15

16

10

17

18

10

20

24

24

16

31

32

18

12

12

12

12

12

16

14

14

12

17

13

13

12

16

16

12

12

14

13

13

16

12

12

9

14

12

12

0

0

0

0

0

0

0

0

0

0

0

0

0

1

1

1

1

1

1

1

1

1

1

1

1

1

1

61

23

55

31

56

56

30

42

38

38

58

57

69

25

35

32

27

27

37

29

37

41

41

33

48

48

35

40

44

50

47

40

40

48

37

40

37

32

32

29

40

40

40

40

40

40

41

50

72

45

56

80

40

40

1

2

2

3

2

1

2

1

4

1

1

2

1

1

2

1

3

2

4

2

2

1

1

1

2

4

2

1216.00

18.71

851.00

13.09

14.00

0.22

2602.00

40.03

2745.00

42.23

113.00

1.74

Marques Townes

Occup = 3

10

11

12

13

14

15

16

Fam

Inc

Job

Sat

Job

Char

Get

Ahead

Traumat

Events

Employ

Years

Number

Prom

25.2

17.4

27.4

69.6

19.6

35.7

47.3

57.8

31.2

17.8

34.2

19.3

26.5

15.9

38.7

23.0

43.4

69.3

43.1

20.0

38.9

20.5

21.9

16.7

38.7

28.3

29.7

15.3

68.7

63.4

17.8

15.4

25.4

49.2

32.4

31.4

27.1

18.2

1

3

3

2

2

2

2

1

2

1

4

2

2

1

1

1

2

1

2

1

2

2

1

2

2

3

1

1

4

1

1

1

2

3

2

2

1

1

1

1

4

5

3

5

5

5

5

1

5

1

1

5

2

2

1

5

2

1

5

1

5

5

1

1

4

5

5

4

5

1

3

1

5

5

4

5

2

1

2

1

1

2

1

1

1

1

2

1

1

1

3

3

1

1

3

2

1

2

1

1

1

1

1

1

3

1

3

1

1

2

1

1

1

1

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

1

0

0

1

0

0

0

1

0

0

0

1

0

0

0

0

0

0

4

10

5

3.5

4

12.5

3

9.41

16

0.5

12

3

10

1.5

22

7.08

2.5

5

4.33

18.33

11.58

15

3

1

6

3

1.08

4

1

2.16

2

15

10

22

4

7

1.5

13

3

0

1

1

5

1

1

1

0

0

1

0

0

1

1

0

3

4

0

0

3

2

1

0

0

0

1

1

0

0

0

3

0

0

0

1

0

1

Bins

15.0

25.0

35.0

45.0

55.0

65.0

75.0

22.5

37.3

27.9

40.2

44.5

36.9

41.8

16.4

72.6

49.2

23.2

42.5

16.8

33.1

46.9

15.4

32.5

46.5

34.0

43.0

59.2

16.3

49.4

23.2

56.1

71.0

45.0

1

3

2

1

2

1

2

4

1

2

2

2

1

4

2

1

1

2

3

2

2

1

1

1

4

1

2

5

2

1

1

5

1

1

1

4

5

5

1

5

5

5

1

5

5

4

4

5

5

5

4

1

5

1

3

1

2

1

1

1

2

2

1

1

2

2

1

3

2

2

1

1

1

3

1

3

1

1

3

1

1

0

0

0

0

0

0

1

0

0

0

1

1

0

0

3

0

0

2

0

0

0

1

0

0

0

0

0

17

2.5

30

10

21

30

1

2

10.5

5

5

3

25

1

0.75

0.16

5

0.5

16

3

8.5

18

8

4.5

9

25

8

1

0

3

0

0

3

1

0

4

0

0

0

3

1

2

0

0

0

0

2

2

0

4

0

1

1

0

2284.80

35.15

119.00

1.83

217.00

3.34

99.00

1.5…

Purchase answer to see full

attachment