# Computer Exercise 3

Seatbelts is a balanced panel from 50 U.S. states, plus the District of Columbia, for the years 1983-1997. These data were provided by Professor Liran Einav of Stanford University and were used in his paper with Alma Cohen, "The Effects of Mandatory Seat Belt Laws on Driving Behavior and Traffic Fatalities," The Review of Economics and Statistics, 2003, Vol. 85, No. 4, pp. 828-843.

The article is available at http://www.stanford.edu/~leinav/pubs/RESTAT2003.pdf

• state = State (in the csv file, but not in the txt file)
• year = Year
• fips = State ID Code
• vmt = Millions of traffic miles per year. (Note: Number of fatalities = fatalityrate $\times$ vmt)
• fatalityrate = Number of fatalities per million of traffic miles
• sb_usage = Seat belt usage rate
• speed65 = Binary variable for 65 mile per hour speed limit
• speed70 = Binary variable for 70 or higher mile per hour speed limit
• drinkage21 = Binary variable for age 21 drinking age
• ba08 = Binary variable for blood alcohol limit $\leq 0.08\%$
• income = Per capita income
• age = Mean age
• primary = Binary variable for primary enforcement of seat belt laws
• secondary = Binary variable for secondary enforcement of seat belt laws

## Reading the data into R

file<-"http://cc.oulu.fi/~jklemela/econometrics/SeatBelts.csv"
data<-read.csv(file)   # assumes the csv file has a header row


## Reading the data into SAS

FILENAME myurl URL 'http://cc.oulu.fi/~jklemela/econometrics/SeatBelts.txt';

DATA SeatBelts;
INFILE myurl firstobs=2;
INPUT year fips vmt fatalityrate sb_usage speed65 speed70
drinkage21 ba08 income age primary secondary;
RUN;


## Exercise 5

Choose fatalityrate as the y-variable and sb_usage, speed65, speed70, drinkage21, ba08, log(income), and age as the x-variables. Run an OLS regression and list the least squares estimates of the coefficients. Carry out the t-tests and the F-test.
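Written out, the regression described above can be sketched as follows. The coefficient symbols $\beta_0,\dots,\beta_7$, the error term $\varepsilon_{it}$, and the state and year indices $i$ and $t$ are notation introduced here for illustration, not taken from the exercise.

$$
\begin{aligned}
\mathrm{fatalityrate}_{it} = {} & \beta_0 + \beta_1\,\mathrm{sb\_usage}_{it} + \beta_2\,\mathrm{speed65}_{it} + \beta_3\,\mathrm{speed70}_{it} + \beta_4\,\mathrm{drinkage21}_{it} \\
& + \beta_5\,\mathrm{ba08}_{it} + \beta_6 \log(\mathrm{income}_{it}) + \beta_7\,\mathrm{age}_{it} + \varepsilon_{it}
\end{aligned}
$$

In this notation each t-test concerns a single hypothesis $H_0\colon \beta_j = 0$, while the F-test concerns the joint hypothesis $H_0\colon \beta_1 = \dots = \beta_7 = 0$.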

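file<-"http://cc.oulu.fi/~jklemela/econometrics/SeatBelts.csv"
data<-read.csv(file)         # assumes the csv file has a header row

y<-data[,5]                  # fatalityrate
sp.usage<-data[,6]           # sb_usage
speed65<-data[,7]
speed70<-data[,8]
drinkage21<-data[,9]
ba08<-data[,10]
log.income<-log(data[,11])   # log of per capita income
age<-data[,12]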

reg.model<-lm(y ~ sp.usage+speed65+speed70+drinkage21+ba08+log.income+age)

summary(reg.model)

plot(reg.model)



Call:
lm(formula = y ~ sp.usage + speed65 + speed70 + drinkage21 +
ba08 + log.income + age)

Residuals:
       Min         1Q     Median         3Q        Max
-0.0109890 -0.0025729 -0.0002924  0.0027982  0.0132925

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.2008411  0.0104949  19.137  < 2e-16 ***
sp.usage     0.0039306  0.0015512   2.534 0.011557 *
speed65      0.0001167  0.0005145   0.227 0.820618
speed70      0.0023957  0.0006527   3.670 0.000266 ***
drinkage21   0.0012843  0.0011180   1.149 0.251131
ba08        -0.0013994  0.0005679  -2.464 0.014047 *
log.income  -0.0182520  0.0011886 -15.356  < 2e-16 ***
age         -0.0001334  0.0001391  -0.959 0.337916
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.004339 on 548 degrees of freedom
(209 observations deleted due to missingness)
Multiple R-squared: 0.4279,	Adjusted R-squared: 0.4206
F-statistic: 58.56 on 7 and 548 DF,  p-value: < 2.2e-16


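ota<-!is.na(sp.usage)        # rows where sb_usage is observed
K<-8                         # number of coefficients, including the intercept
y<-data[ota,5]
n<-length(y)
x<-matrix(0,n,K)             # design matrix
x[,1]<-1                     # intercept column
x[,2]<-sp.usage[ota]
x[,3]<-speed65[ota]
x[,4]<-speed70[ota]
x[,5]<-drinkage21[ota]
x[,6]<-ba08[ota]
x[,7]<-log.income[ota]
x[,8]<-age[ota]

A<-t(x)%*%x
invA<-solve(A,diag(1,K))     # inverse of X'X
b<-invA%*%t(x)%*%y           # least squares estimate (X'X)^{-1} X'y
b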

[1,]  0.2008410835
[2,]  0.0039306452
[3,]  0.0001167195
[4,]  0.0023956645
[5,]  0.0012843398
[6,] -0.0013993839
[7,] -0.0182519776
[8,] -0.0001333946
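The matrix computation above reproduces only the coefficient estimates $b = (X'X)^{-1}X'y$. As a minimal sketch, the standard errors and t-statistics reported by `summary()` can be obtained in the same way from $s^2 (X'X)^{-1}$, where $s^2$ is the residual sum of squares divided by $n-K$. The example below uses simulated data (hypothetical values, not the SeatBelts data) so that it runs on its own; the names `res`, `s2`, `se`, `t_stat`, `p_val`, and `fit` are introduced here for illustration.

```r
# Simulated data (hypothetical; not the SeatBelts data)
set.seed(1)
n <- 200
K <- 3
x <- cbind(1, rnorm(n), rnorm(n))          # design matrix with intercept column
y <- drop(x %*% c(0.5, 1, -2) + rnorm(n))  # response with known coefficients

# OLS coefficients from the normal equations, as in the text
invA <- solve(t(x) %*% x)
b <- invA %*% t(x) %*% y

# Residual variance estimate and coefficient standard errors
res <- y - drop(x %*% b)
s2 <- sum(res^2) / (n - K)
se <- sqrt(diag(s2 * invA))

# t-statistics and two-sided p-values from the t-distribution with n-K df
t_stat <- drop(b) / se
p_val <- 2 * pt(abs(t_stat), df = n - K, lower.tail = FALSE)

# Cross-check against lm()
fit <- summary(lm(y ~ x[, 2] + x[, 3]))
cbind(estimate = drop(b), se = se, t = t_stat, p = p_val)
fit$coefficients
```

The two printed tables should agree up to rounding error. With the exercise data, the same lines can be applied to the `x`, `y`, `n`, and `K` defined above.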



Let us compare the p-values:

sp_usage_estimate<-0.0039306
sp_usage_standard_error<-0.0015512
df<-548   #degrees of freedom

t_statistics<-sp_usage_estimate/sp_usage_standard_error
t_statistics
#[1] 2.533909

#Pr(>|t|)
# t-distribution
2*(1-pt(t_statistics,df=548))
#[1] 0.01155724

# normal distribution
2*(1-pnorm(t_statistics))
#[1] 0.01127979

# t-distribution with 54 degrees of freedom (a smaller df gives a larger p-value)
2*(1-pt(t_statistics,df=54))
#[1] 0.01421385
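In symbols, the three computations above are the following, where $T_\nu$ denotes the distribution function of the t-distribution with $\nu$ degrees of freedom and $\Phi$ that of the standard normal distribution (notation introduced here for illustration):

$$
t = \frac{b_j}{\mathrm{se}(b_j)}, \qquad
p_{t,\nu} = 2\bigl(1 - T_{\nu}(|t|)\bigr), \qquad
p_{N} = 2\bigl(1 - \Phi(|t|)\bigr).
$$

Since $T_\nu \to \Phi$ as $\nu \to \infty$, the normal approximation is close to the exact value for $\nu = 548$ and visibly less accurate for $\nu = 54$.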

# F-statistic: 58.56 on 7 and 548 DF,  p-value: < 2.2e-16
F<-58.56
1-pf(F, df1=7, df2=548)
#[1] 0

Q<-7
W<-Q*F
1-pchisq(W, df=7)
#[1] 0
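Both results above print as 0 because `pf()` and `pchisq()` return a value that rounds to 1 in double precision, so the subtraction from 1 loses the tail probability entirely. A minimal sketch of an alternative is to request the upper tail directly with the `lower.tail = FALSE` argument of `pf()` and `pchisq()`. The names `F_stat`, `p_F`, and `p_W` are introduced here for illustration.

```r
F_stat <- 58.56   # F-statistic reported by summary(reg.model)
Q <- 7            # number of restrictions (slope coefficients)

# Upper-tail probabilities computed directly, avoiding 1 - (value close to 1)
p_F <- pf(F_stat, df1 = Q, df2 = 548, lower.tail = FALSE)
p_W <- pchisq(Q * F_stat, df = Q, lower.tail = FALSE)

p_F
p_W
```

Both values should be positive but far below the threshold 2.2e-16 that `summary()` prints, which is consistent with the reported `p-value: < 2.2e-16`.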



Let us try SAS.

FILENAME myurl URL 'http://cc.oulu.fi/~jklemela/econometrics/SeatBelts.txt';

DATA SeatBelts;
INFILE myurl firstobs=2;
INPUT number $ year fips vmt fatalityrate sb_usage speed65 speed70 drinkage21 ba08 income age primary secondary;
logincome = log(income);
RUN;

PROC reg data=SeatBelts;
model fatalityrate =  sb_usage speed65 speed70 drinkage21 ba08 logincome age;
RUN;



                        Parameter Estimates
                        Parameter       Standard
Variable        DF       Estimate          Error    t Value    Pr > |t|

Intercept        1        0.20084        0.01049      19.14      <.0001
sb_usage         1        0.00393        0.00155       2.53      0.0116
speed65          1     0.00011672     0.00051450       0.23      0.8206
speed70          1        0.00240     0.00065274       3.67      0.0003
drinkage21       1        0.00128        0.00112       1.15      0.2511
ba08             1       -0.00140     0.00056794      -2.46      0.0140
logincome        1       -0.01825        0.00119     -15.36      <.0001
age              1    -0.00013339     0.00013908      -0.96      0.3379