\documentclass[14pt]{extarticle}
\renewcommand{\baselinestretch}{1.2}
\usepackage{hyperref}
\usepackage{Sweave}
\begin{document}

\title{Class: Doglegs / piecewise linear / Bent stick}
\maketitle

\section*{Story time: Why R?}
\begin{itemize}
\item Information wants to be free
  \begin{itemize}
  \item The Whole Earth Catalog
  \item The original Google (1966)
  \item Stewart Brand
  \end{itemize}
\item Academics write papers for free
\item Most musicians (by head count) write music for free
\item Book authors no longer need book publishers
\item So, copyright is kinda obsolete
\item I'm fairly radical on this and push against commercial publishers in academia
\item Hence R
\end{itemize}

\section*{Administrivia}
\begin{itemize}
\item You should read and get ready to do the first real homework this weekend. To get ready, do the first two exercises.
\end{itemize}

\section*{Today's Topic: When Taylor fails}
Taylor doesn't always work.
\begin{itemize}
\item It works for analytic functions
\item It works for nice functions
\item It ``almost'' works for measurable functions
\item But not to worry: in the modern age we can quote Littlewood, who said that {\it almost} all functions are {\it almost} locally linear.
\item But it can break
\end{itemize}

Nasty stuff:
\begin{itemize}
\item Jumps (notice the function is still ``smooth'' almost everywhere: only a countable number of jumps!)
\item Piecewise linear functions (again smooth almost everywhere)
\item Exercise for math types: Show $f$ is piecewise linear iff $f'$ is piecewise constant.
\end{itemize}

\section{So it isn't linear: still use Taylor}
In most settings, your best bet is still to keep things smooth by using polynomials. This is done in R with the commands:
\begin{Schunk}
\begin{Sinput}
> x <- 10 * (1:20)
> y <- rnorm(20)
> plot(x, y)
> # cubic polynomial fit; fitted() returns the fitted values
> fit <- fitted(lm(y ~ x + I(x^2) + I(x^3)))
> lines(x, fit)
\end{Sinput}
\end{Schunk}
\includegraphics{class_doglegs-001}

\section{So it isn't linear: Doglegs}
We want to fit a bent stick.
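
In symbols, one hedged way to write a bent stick with a single bend is the sketch below. The coefficient names $\beta_0$, $\beta_1$, $\beta_2$ and the bend location $k$ are notation assumed here for illustration, not taken from the class:
\[
  f(x) = \beta_0 + \beta_1 x + \beta_2 \, (x - k) \, \mathbf{1}\{x > k\}.
\]
To the left of $k$ the slope is $\beta_1$; to the right it is $\beta_1 + \beta_2$. The two pieces join at $x = k$ because the correction term $(x - k)\,\mathbf{1}\{x > k\}$ is zero there. Taking $k = 50$, $\beta_0 = 0$, $\beta_1 = 1$ and $\beta_2 = 5$ gives the curve drawn by the R example that follows.
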
That is our goal today. So something like:
\begin{Schunk}
\begin{Sinput}
> x <- 1:100
> # indicator: TRUE to the right of the bend at 50
> large.x <- (x > 50)
> # distance past the bend, zero to the left of it
> deviation <- (x - 50) * large.x
> y <- x + 5 * deviation
> plot(x, y, type = "l")
\end{Sinput}
\end{Schunk}
\includegraphics{class_doglegs-002}

\section{Work through generating a bent stick with the class}
\begin{itemize}
\item Introduce indicator functions
\item Introduce logical connectives
\item Talk about deviations / corrections
\end{itemize}

\section{How to fit?}
\begin{itemize}
\item Fitting is via multiple regression
\begin{Schunk}
\begin{Sinput}
> y <- x + 5 * deviation + 30 * rnorm(100)
> plot(x, y)
> summary(lm(y ~ x + deviation))
\end{Sinput}
\begin{Soutput}
Call:
lm(formula = y ~ x + deviation)

Residuals:
    Min      1Q  Median      3Q     Max 
-62.043 -19.783  -2.228  18.913  69.506 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  -1.1645     7.8648  -0.148    0.883    
x             1.0387     0.2291   4.535 1.65e-05 ***
deviation     4.7651     0.4049  11.769  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 29.22 on 97 degrees of freedom
Multiple R-squared:  0.9308,    Adjusted R-squared:  0.9294
F-statistic: 652.5 on 2 and 97 DF,  p-value: < 2.2e-16
\end{Soutput}
\begin{Sinput}
> fit <- fitted(lm(y ~ x + deviation))
> lines(x, fit)
\end{Sinput}
\end{Schunk}
\includegraphics{class_doglegs-003}
\end{itemize}

\section{Why use doglegs?}
\begin{itemize}
\item Look impressive
\item Easy to interpret
\end{itemize}

\section{More than one dogleg}
\begin{itemize}
\item What does it take to add two doglegs?
\item Simple: just more variables
\item The locations of the bends are called ``knots''
\end{itemize}

\section{More graceful doglegs}
\begin{itemize}
\item To get the first derivative to match
\item Draw picture
\item Figure out the equation (i.e.
deviation squared)
\item Fitting as before:
\begin{Schunk}
\begin{Sinput}
> plot(x, y)
> # squared deviation: the slope is continuous at the knot
> deviation2 <- deviation^2
> fit2 <- fitted(lm(y ~ x + deviation2))
> lines(x, fit2)
\end{Sinput}
\end{Schunk}
\includegraphics{class_doglegs-004}
\end{itemize}

\section{Putting it all together}
\begin{itemize}
\item Called splines
\item Add as many knots as you like
\item Fit the whole thing in a regression
\item Lots of choices
  \begin{itemize}
  \item number of knots
  \item degree (piecewise linear, quadratic, cubic)
  \item rules for significance
  \end{itemize}
\end{itemize}

\section{Called smoothing}
R commands:
\begin{itemize}
\item \texttt{smooth} (Tukey's version)
\item \texttt{lowess} (I think this is what JMP uses also)
\item \texttt{spline} (where we ended today in class)
\item \texttt{approx} (linear interpolation)
\end{itemize}

\end{document}