\documentclass[14pt]{extarticle}

\usepackage{hyperref}

\usepackage{Sweave}
\begin{document}
\title{Class: standard errors}
\author{Dean Foster}
\maketitle

Practice 2 due today.  Assignment from Berndt due Monday.

\section{Story: Pair programming}
\begin{itemize}
\item Mythical man month
\item ``If you double the number of programmers, the amount of time it
takes doubles.''  Huh?
\item Invention of {\em Extreme Programming.}
\begin{itemize}
\item Always have working code
\item Always modify working code to make it simpler
\item Small changes and lots of testing
\end{itemize}
\item One Version is called {\em pair programming}
\begin{itemize}
\item The person who ``knows'' the code doesn't have the typewriter
\item The person who ``knows'' the code can only make suggestions
\item By the end of a day working together, the other person knows the code
\end{itemize}
\item Lots of experimental data on how effective pair programming is.
Not as many controlled experiments as there should be.  Oh well.
\end{itemize}

\section{Status so far}

The model
\begin{displaymath}
Y_i = \alpha + \beta x_i + \epsilon_i
\end{displaymath}
where $\epsilon_i$ are iid and
\begin{displaymath}
\epsilon_i \sim N(0,\sigma^2).
\end{displaymath}

\begin{itemize}
\item First we discussed fitting ($\alpha + \beta x_i$)
\item Then we discussed the residuals
\item Now we want to discuss how to estimate the error in $\hat\beta$
\end{itemize}
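
As a reminder of where we are headed: for the simple regression model
above, the usual estimate of the slope's standard error is
\begin{displaymath}
SE(\hat\beta) = \frac{\hat\sigma}{\sqrt{\sum_i (x_i - \bar{x})^2}},
\qquad \hbox{where} \qquad
\hat\sigma^2 = \frac{1}{n-2}\sum_i \hat\epsilon_i^2.
\end{displaymath}
Note that it shrinks both as $n$ grows and as the $x_i$'s spread out.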

\section{Why we care}

If the normal linear model holds, we know that $\hat\beta$ has close
to a normal distribution.  In particular,
$\frac{\hat\beta - \beta}{SE(\hat\beta)}$ has a t-distribution.  We often want
to know how accurately we know $\beta$ purely for its own sake.  For
example, if our model is $Y =$ sales, and $X =$ advertisements, then
$\beta =$ sales per ad.  So if we know that each sale generates \$10
of profit and each ad costs \$1 (think web-based advertisements), then
we need $\beta > .1$ before we make more money in sales than we spent
on advertising.
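
Spelling out the arithmetic:
\begin{displaymath}
\hbox{profit per ad} = \$10\,\beta - \$1 > 0
\quad\Longleftrightarrow\quad
\beta > \frac{1}{10}.
\end{displaymath}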

We can also use $\hat\beta$ to make predictions:
\begin{displaymath}
\hat{Y} = \hat\alpha + \hat\beta x_i
\end{displaymath}
So before we can know how accurate a forecast is, we need to know how
accurate $\hat\beta$ is.

Both of these require knowing that $\hat\beta$ is a good estimate of
$\beta$, and exactly how good an estimate it actually is.  So we
need the standard error of $\hat\beta$.

\section{Standard errors rely on the linear model assumption}

The standard errors generated by R/JMP/any statistics package all
require that the standard linear model holds, i.e.\ that the residuals
are IID normal.  If they aren't:
\begin{itemize}
\item predictions are wrong
\item CI are wrong
\item hypothesis tests are wrong
\item accuracy of slopes are unknown
\end{itemize}
We are left with just guessing.

In previous classes, all we did is notice problems.  Now we want to
fix them (if possible).

One problem that we can fix is heteroskedasticity.  Suppose
that we are regressing salary ($Y$) on runs ($X$).  Then we might expect
that
\begin{displaymath}
Y = \alpha + \beta x
\end{displaymath}
will display heteroskedastic errors.  In particular, we might expect
that the errors grow with $x$.

\paragraph{log-log model}  If we use logs, we can consider the model
\begin{displaymath}
\log(Y) = \alpha + \beta \log(x) + \epsilon
\end{displaymath}
In this model, we now expect the errors to be homoskedastic.
\begin{eqnarray*}
\log(Y) & = & \alpha + \beta \log(x) + \epsilon\\
e^{\log(Y)} & = & e^{\alpha + \beta \log(x) + \epsilon}\\
Y & = & e^{\alpha} e^{\beta \log(x)} e^{\epsilon}\\
Y & = & e^{\alpha} (e^{\log(x)})^\beta  e^{\epsilon}\\
Y & = & k x^\beta  e^{\epsilon}
\end{eqnarray*}
where we substituted $k$ for $e^{\alpha}$ to make the equation look
prettier.

The problem with this analysis is that we can no longer address the
question, ``How much is a hit worth?''  Instead, we can say, ``How
many log(dollars) is a log(hit) worth?''  If this sounds reasonable to
you, then you are definitely an economist:  a slope between
$\log(Y)$ and $\log(x)$ is called the elasticity.
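
The elasticity has a percentage interpretation: differentiating the
log--log model gives
\begin{displaymath}
\frac{dY}{Y} = \beta\,\frac{dx}{x},
\end{displaymath}
so a 1\% increase in $x$ is associated with (approximately) a
$\beta$\% increase in $Y$.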

\paragraph{Intrinsically linear models}  Cobb--Douglas production
functions are a cool example of a linear model that doesn't look like
a linear model:  $\log(\hbox{output}) = \alpha \log(\hbox{labor}) +
\beta \log(\hbox{capital})$ etc...

\paragraph{Weighted least squares:} Alternatively, we can do weighted
least squares.  If we believe that the errors scale with $x$, then we
could write our model precisely as:
\begin{displaymath}
Y = \alpha + \beta x + x \epsilon
\end{displaymath}
So when $x$ is large the errors are large, and when $x$ is small the
errors are small, since $\mathrm{Var}(x\epsilon) = x^2\sigma^2$.  This
would show up nicely in a plot of $\epsilon^2$ vs $x$.

We can modify our equation by dividing both sides by $x$:
\begin{displaymath}
Y/x = \alpha/x + \beta +  \epsilon
\end{displaymath}
Now if we do a regression of $Y/x$ on $1/x$ we have a homoskedastic
regression.  This methodology is called weighted least squares.  Since
each observation's standard deviation is proportional to $x$, the
weights in this case are $1/x^2$.
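
To see why dividing by $x$ amounts to weighting, note that the
transformed regression minimizes
\begin{displaymath}
\sum_i \left(\frac{Y_i}{x_i} - \frac{\alpha}{x_i} - \beta\right)^2
= \sum_i \frac{1}{x_i^2}\,\left(Y_i - \alpha - \beta x_i\right)^2,
\end{displaymath}
which is exactly least squares with weight $1/x_i^2$ on observation $i$.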

There are two ways of doing a weighted least squares.
\begin{itemize}
\item  First, you can ask R to simply use a column of $1/x^2$ as
weights. Sample heteroskedastic data with evil LS line:
\begin{Schunk}
\begin{Sinput}
> x <- 1:20
> y <- 10 + x * (3 + rnorm(20))
> plot(x, y)
> abline(lm(y ~ x)$coef)
\end{Sinput}
\end{Schunk}
\includegraphics{class_standard_errors-001}

Now with the weighted line:
\begin{Schunk}
\begin{Sinput}
> plot(x, y)
> abline(lm(y ~ x, weights = 1/x^2)$coef)
\end{Sinput}
\end{Schunk}
\includegraphics{class_standard_errors-002}

This is nice since the equation it generates still uses $\alpha$
and $\beta$ exactly as we have been using them in the equations so far.
\item Second, you can create the two new variables $Y/x$ and $1/x$ and
do the regression yourself.  This gives you control and will work in
any regression package (e.g.\ Excel).  But you then have to
interpret the slope and intercept carefully.  So it looks like:
\begin{Schunk}
\begin{Sinput}
> newx = 1/x
> newy = y/x
> plot(newx, newy)
> abline(lm(newy ~ newx)$coef)
\end{Sinput}
\end{Schunk}
\includegraphics{class_standard_errors-003}

Compare the coefficients:
\begin{Schunk}
\begin{Sinput}
> summary(lm(y ~ x, weights = 1/x^2))
\end{Sinput}
\begin{Soutput}
Call:
lm(formula = y ~ x, weights = 1/x^2)

Residuals:
    Min      1Q  Median      3Q     Max
-1.6074 -0.9099 -0.1673  0.8402  1.9102

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   9.3621     1.1956   7.830 3.32e-07 ***
x             3.2447     0.3378   9.606 1.65e-08 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.165 on 18 degrees of freedom
Multiple R-squared: 0.8368,	Adjusted R-squared: 0.8277
F-statistic: 92.28 on 1 and 18 DF,  p-value: 1.651e-08
\end{Soutput}
\begin{Sinput}
> summary(lm(newy ~ newx))
\end{Sinput}
\begin{Soutput}
Call:
lm(formula = newy ~ newx)

Residuals:
    Min      1Q  Median      3Q     Max
-1.6074 -0.9099 -0.1673  0.8402  1.9102

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   3.2447     0.3378   9.606 1.65e-08 ***
newx          9.3621     1.1956   7.830 3.32e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.165 on 18 degrees of freedom
Multiple R-squared: 0.7731,	Adjusted R-squared: 0.7605
F-statistic: 61.32 on 1 and 18 DF,  p-value: 3.321e-07
\end{Soutput}
\end{Schunk}

To check that $1/x$ gives the wrong weights:
\begin{Schunk}
\begin{Sinput}
> summary(lm(y ~ x, weights = 1/x))
\end{Sinput}
\begin{Soutput}
Call:
lm(formula = y ~ x, weights = 1/x)

Residuals:
Min     1Q Median     3Q    Max
-5.294 -2.722 -0.490  1.772  6.716

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept)    8.389      2.957   2.837   0.0109 *
x              3.420      0.387   8.837 5.78e-08 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.847 on 18 degrees of freedom
Multiple R-squared: 0.8127,	Adjusted R-squared: 0.8023
F-statistic: 78.09 on 1 and 18 DF,  p-value: 5.777e-08
\end{Soutput}
\end{Schunk}

\end{itemize}
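
The claim that the two routes give identical coefficients can be
checked numerically.  Here is a minimal sketch in Python/numpy (rather
than R), mirroring the simulated data above; the variable names are
mine, not part of the class code:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.arange(1.0, 21.0)                 # x = 1..20, as in the R example
y = 10 + x * (3 + rng.normal(size=20))   # heteroskedastic: noise scales with x

# Route 1: plain OLS on the transformed variables y/x and 1/x.
# Design columns are [1/x, 1]; the coefficient on 1/x is alpha,
# and the "intercept" of the transformed regression is beta.
A = np.column_stack([1 / x, np.ones_like(x)])
alpha1, beta1 = np.linalg.lstsq(A, y / x, rcond=None)[0]

# Route 2: weighted least squares with weights 1/x^2, solved
# directly via the weighted normal equations (X' W X) b = X' W y.
X = np.column_stack([np.ones_like(x), x])
W = np.diag(1 / x**2)
alpha2, beta2 = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)

print(alpha1, beta1)  # estimates from the transformed regression
print(alpha2, beta2)  # WLS estimates -- identical up to rounding
```

Both routes minimize the same weighted sum of squares, so the
estimates agree to machine precision; only the labels (slope vs.\
intercept) are swapped in the transformed regression.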

\end{document}