# Statistics Probability Problems Set 1 Questions

Answer those questions and show steps.

Your first homework submission involves:
1. Problem Set 1 (below)
2. In-Class Worksheets (3 worksheets; these will be graded based on completion).
If your homework is completely typed (including equations), you will receive up to 10 bonus pts. You cannot
earn above 100% on a homework. Homework must be handed in during class for full credit. You may email
your homework if you are unable to make it to class (homework must show up in my mailbox before class
starts), but you will receive a 5-point deduction and will be ineligible for bonus points.
You will receive full credit for the problems from Problem Set 1 if you show ALL your work (i.e., I want to
see all the steps). You can find the z-table and t-table posted on Carmen. Each problem in this problem set
has the same weight.
PROBLEM SET 1
Due Feb. 5 (section TTR)
Due Feb. 6 (section WF)
1. Expected value, variance, and standard deviation are really important to investors. Suppose you
are considering investing \$2 in a Power Ball ticket. The Grand Prize is \$723 million. The
table below lists the Grand Prize and the other prizes, each won by matching a particular combination
of white and red balls, with the probability of winning each. To win the Grand Prize, your ticket needs
to match 5 white balls and a red ball; to win \$1 million, you need to match 5 white balls; etc.

| Prizes | Prob. of winning |
| --- | --- |
| \$723,000,000.00 | 3.4223E-09 |
| \$1,000,000.00 | 1.71115E-07 |
| \$50,000.00 | 1.09513E-06 |
| \$100.00 | 2.73776E-05 |
| \$100.00 | 6.89888E-05 |
| \$7.00 | 0.001721882 |
| \$7.00 | 0.001423832 |
| \$4.00 | 0.010755001 |
| \$4.00 | 0.02543235 |

a) Use the table of probabilities associated with each prize to calculate the expected value of a
Power Ball ticket, given that the cost of a single ticket is \$2. (Hint: Do not forget that you spent
\$2 and that the probability of losing \$2 is 1 minus the sum of all the probabilities that you win
something.)
b) What are the variance and standard deviation of a Power Ball ticket?
c) Would this be a good investment? Why?
d) What is the Grand Prize at which the expected return from the ticket is \$0?
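The hint in part (a) amounts to treating the ticket as a discrete random variable whose outcomes are net payoffs (prize minus ticket cost), with the leftover probability assigned to losing the ticket price. A minimal Python sketch of that bookkeeping follows. It uses a made-up two-prize lottery rather than the Power Ball table, and the function name `ticket_moments` is an illustrative assumption, not something from the course.

```python
import math


def ticket_moments(prizes, probs, cost):
    """Return (expected value, variance, std. dev.) of one ticket's net payoff.

    Net payoff is prize - cost when a prize is won and -cost otherwise;
    the probability of winning nothing is 1 minus the sum of `probs`.
    """
    p_lose = 1.0 - sum(probs)
    outcomes = [prize - cost for prize in prizes] + [-cost]
    weights = list(probs) + [p_lose]

    mean = sum(x * p for x, p in zip(outcomes, weights))
    variance = sum((x - mean) ** 2 * p for x, p in zip(outcomes, weights))
    return mean, variance, math.sqrt(variance)


# Hypothetical toy lottery (NOT the Power Ball table): a $10 prize with
# probability 0.1 and a $4 prize with probability 0.2, for a $2 ticket.
mean, variance, sd = ticket_moments([10.0, 4.0], [0.1, 0.2], cost=2.0)
print(mean, variance, sd)
```

The same function can be fed the prize and probability columns of the table. Because the expected value is linear in the Grand Prize, part (d) reduces to setting the mean to zero and solving for that one prize.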
2. An econometrics class has 80 students, and the mean student weight is 145 lbs. A random
sample of 4 students is selected from the class, and their average weight is calculated.
a. Will the average weight of the students in the sample equal 145 lbs? Why or why not?
b. Use this example to explain why the sample average is a random variable.
3. Suppose that $Y_1, Y_2, \dots, Y_n$ are i.i.d. random variables drawn from a normal $N(1, 4)$ distribution.
a. What are the mean and the variance of the sample average $\bar{Y}$ when i) n=10, ii) n=100, iii)
n=1000?
b. What is the distribution of the sample average when i) n=10, ii) n=100, iii) n=1000?
c. Sketch the probability density of the sample average for i) n=10, ii) n=100, iii) n=1000.
i. What happens to the probability density as you increase the number of
observations?
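For i.i.d. draws with mean μ and variance σ², the sample average has mean μ and variance σ²/n. A small Python sketch of how that variance shrinks as n grows is given here. It assumes N(1, 4) denotes mean 1 and variance 4 (the common N(μ, σ²) convention, which the problem does not state explicitly), and the helper name `sample_average_moments` is illustrative.

```python
def sample_average_moments(mu, sigma2, n):
    """Mean and variance of the average of n i.i.d. draws with mean mu and variance sigma2."""
    return mu, sigma2 / n


# Assumption: N(1, 4) means mean 1 and variance 4.
for n in (10, 100, 1000):
    mean, var = sample_average_moments(1.0, 4.0, n)
    print(f"n={n}: mean={mean}, variance={var}")
```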
4. If Y is distributed N(1,4), find the following probabilities (show all of your work; do not use a
calculator):
a. $\Pr(Y \le 3)$
b. $\Pr(Y \ge 3)$
c. $\Pr(Y \ge 5)$
d. $\Pr(2 \le Y \le 3)$
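Each of these probabilities comes from standardizing, z = (y − μ)/σ, and then reading the standard normal CDF from the z-table. A minimal Python sketch for checking table lookups afterwards is shown next. It uses a hypothetical N(0, 9) variable instead of the homework's distribution and assumes the second argument of N(·,·) is the variance; `normal_cdf` is an illustrative name.

```python
import math


def normal_cdf(y, mu, sigma2):
    """Pr(Y <= y) for Y ~ N(mu, sigma2), via the standard normal CDF."""
    z = (y - mu) / math.sqrt(sigma2)  # standardize: subtract mean, divide by std. dev.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical example (not one of the homework parts): Y ~ N(0, 9).
p_le = normal_cdf(3.0, 0.0, 9.0)                   # Pr(Y <= 3), z = 1
p_ge = 1.0 - p_le                                  # Pr(Y >= 3)
p_between = p_le - normal_cdf(0.0, 0.0, 9.0)       # Pr(0 <= Y <= 3)
print(round(p_le, 4), round(p_ge, 4), round(p_between, 4))
```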
5. Below is a table with some data:

| X | Y |
| --- | --- |
| 7 | 9 |
| 8 | 6 |
| 3 | 2 |
| 2 | 5 |

a) Calculate the mean and variance of X and Y.
b) How can you determine the relationship between X and Y without doing any
calculations?
c) What is the correlation between X and Y?
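The quantities in parts (a) and (c) can be sketched in a few lines of Python. This is only an illustration on made-up data (not the table above), and it assumes the sample versions with the n − 1 divisor; if the course uses the population divisor n, the variance changes accordingly while the correlation does not.

```python
import math


def mean(values):
    return sum(values) / len(values)


def sample_variance(values):
    """Sample variance with the n - 1 divisor (an assumption about the course convention)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def correlation(x, y):
    """Sample correlation: sample covariance over the product of sample std. devs."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / math.sqrt(sample_variance(x) * sample_variance(y))


# Hypothetical data: y is exactly twice x, so the correlation should be 1.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]
print(mean(x), sample_variance(x), correlation(x, y))
```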
6.
a. Give an example of an estimator that we used in class and that is not the sample average.
b. Name the properties of an estimator.
c. Why do we use the sample average as an estimator of the mean?
d. What is an estimate? Provide an example of an estimate.
7. In a population, the average is 100 and the variance is 43. You have collected random samples and
would like to calculate:
a. $\Pr(\bar{Y} \le 101)$ when n=100
b. $\Pr(\bar{Y} > 98)$ when n=165
c. $\Pr(101 \le \bar{Y} \le 103)$ when n=64
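With samples this large, a common approach is to treat the sample average as approximately normal with mean μ and variance σ²/n (the central limit theorem) and standardize using the standard error. A minimal Python sketch of that step is below, with hypothetical numbers rather than the ones in this problem; `prob_sample_mean_le` is an illustrative name.

```python
import math
from statistics import NormalDist


def prob_sample_mean_le(c, mu, sigma2, n):
    """Approximate Pr(Ybar <= c) using Ybar ~ N(mu, sigma2 / n)."""
    standard_error = math.sqrt(sigma2 / n)
    return NormalDist(mu, standard_error).cdf(c)


# Hypothetical population: mean 50, variance 16, samples of size 64,
# so the standard error is 4 / 8 = 0.5 and c = 51 corresponds to z = 2.
p_le = prob_sample_mean_le(51.0, 50.0, 16.0, 64)
p_gt = 1.0 - prob_sample_mean_le(49.0, 50.0, 16.0, 64)   # Pr(Ybar > 49)
print(round(p_le, 4), round(p_gt, 4))
```

Interval probabilities such as part (c) follow by subtracting two such CDF values.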
8. Data on fifth-grade test scores (reading and mathematics) for 420 school districts in California
yield an average score of $\bar{Y} = 654.2$ and a standard deviation of $s_Y = 19.1$.
a. Test whether the mean test score in the population is 654.2 at the 10%, 5%, and 1% significance levels.
b. Test whether the mean test score in the population is less than 654.2 at the 10%, 5%, and
1% significance levels.
c. Construct the 90%, 95%, and 99% confidence intervals for the mean test score in the
population.
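For parts (a) through (c), the building blocks are the standard error s/√n, the t-statistic, and a critical value. A hedged Python sketch is given here, assuming the sample is large enough to use normal critical values in place of the t-table, and using hypothetical numbers rather than the ones above; both function names are illustrative.

```python
import math
from statistics import NormalDist


def t_statistic(ybar, mu0, s, n):
    """t-statistic for H0: the population mean equals mu0."""
    return (ybar - mu0) / (s / math.sqrt(n))


def mean_confidence_interval(ybar, s, n, level):
    """Large-sample confidence interval ybar +/- z * s / sqrt(n)."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # two-sided critical value
    half_width = z * s / math.sqrt(n)
    return ybar - half_width, ybar + half_width


# Hypothetical sample: mean 70, std. dev. 10, n = 400 (standard error 0.5).
print(t_statistic(71.0, 70.0, 10.0, 400))
print(mean_confidence_interval(70.0, 10.0, 400, 0.95))
```

A two-sided test at a given significance level rejects when the absolute t-statistic exceeds the matching critical value, which is the same as the hypothesized mean falling outside the corresponding confidence interval.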
When the districts were divided into those with small classes (< 20 students per teacher) and those
with large classes (≥ 20 students per teacher), the following results were found:
d) Is there statistically significant evidence that the districts with smaller classes have higher
average test scores? Explain.
9. For this problem, you will use STATA and the data set for the project. The goal is to test mean
differences in wages between German-born individuals and immigrants who are full-time employed,
working 30 or more hours, and earning positive wages.
a. How many observations do you have in the data set when you load it?
b. Keep in the sample only full-time employed individuals.
i. use command: tab pgemplst (this will provide a frequency table of individuals according to
their employment status)
ii. use command: keep if pgemplst==“here you need a number that specifies only full-time employed
individuals”
iii. How many obs. do you have after keeping only full-time employed?
c. Keep in the sample only individuals who are working 30 or more hours.
i. Identify the name of the variable that measures “Actual Weekly Time”
ii. use command: keep if var_name>=30
iii. How many obs. do you have after keeping only individuals working 30 or more
hours?
d. Identify the variable that measures weekly earnings.
i. Rename the variable to wage (command: rename old_name wage).
ii. What is the average wage? What is the standard deviation? What are the min and the
max?
iii. Do you have any individuals earning non-positive wages? If yes, how many?
(command: sum wage if wage==0)
iv. Remove individuals with 0 earnings (command: keep if “_____”)
e. Generate a dummy variable immigrant that takes the value 1 if the individual is an immigrant and
0 if German.
i. Generate variable immigrant. (command: gen immigrant=.)
ii. Assign 1 to binary variable immigrant (command: replace immigrant=1 if
germborn==2)
iii. Assign 0 to binary variable immigrant (command: replace immigrant=0 if
germborn==1)
iv. What percentage of your data set is made up of immigrants?
v. What is the mean wage for an immigrant? (command: sum wage if immigrant==1)
vi. What is the mean wage for a German?
vii. Is there statistically significant evidence that German-born individuals earn higher
wages than immigrants? (command: ttest ________, by(______) unequal)
10. If you are working in a group, you will need to answer this question jointly.
a. What is the topic you would like to investigate in your project? What is your hypothesis?
b. Name at least two variables from the data set that will be used in your project.