Biostatistics 230 Statistical Graphs
Biostat 230 will be offered in Spring 2015. The course is being updated for Spring 2015; stay tuned for updates. It is open to all Biostatistics graduate students and to quantitatively trained graduate students in other departments. Grading involves homework, labs, data analysis projects, simulation projects, saving the world, and a final graphical data analysis project using your own data.
News & Announcements:
- The make-your-own labs have been sent back. They were fantastic! I learned a ton. Even if it appears that I made lots of comments on how to improve your (next) lab, please know that I enjoyed reading everybody's lab. Let me know if you enjoyed this sort of assignment. It was so wildly successful that I'm inclined to include it in other classes.
- Thanks for a great class!
- Kernel Density Illustrations. New document
2015 Class Schedule, Email Addresses, and Office Hours:
|Lecture||10:00 - 11:50||Monday||61-269 CHS|
|Lecture||10:00 - 10:50||Wednesday||61-269 CHS|
|Computer Lab 1||11:00 - 11:50||Wednesday||A1-241 CHS|
|Labs meet every Wednesday, from the first week (Mar 31/Apr 1) through week 9.|
|Office Hours||Tuesday 2:00 - 2:50||Wednesday 1:00 - 1:50||51-269 CHS|
|Finals Week Office Hours||Monday, Tuesday 10:00am - 10:50am|
Syllabus and Textbook
Dilbert cartoon illustrating what you're up against.
Dilbert cartoon illustrating how bad it could be.
Spring 2015 Course Information Sheet
Spring 2015 Goals of the Course
Latest version of the R manual. If R is unfamiliar, it is strongly recommended that you go through this manual up to about page 78. More detailed guidance can be found in Lab 0.
Handy short 4 page reference card to all of R. Ok. Much of R.
No text. See the course information sheet for additional book resources. There are several good books out there: A good reference is R Graphics, 2nd edition by Paul Murrell, Chapman and Hall (CRC Press). See also Lattice: Multivariate Data Visualization with R by Deepayan Sarkar.
Murrell's web site for the second edition of the book.
R software homepage
RStudio homepage. Highly recommended.
The software for the class is R. The latest version of the R software is available for download. It runs on Mac, Windows and Unix machines.
Download R and install it on your home computer.
- To download R, click on the link to R.
- Pick a "CRAN Mirror." I usually use Berkeley when I download the latest version of R.
- Under Precompiled Binary Distributions, click on "Windows" (or other).
- Click on "base".
- Click on "Download R 3.1.3 for Windows".
- Save to disk and run the executable.
This gives you R and the main packages that run in R. There are tons of additional packages as well. Feel free to download them and try them out.
Once R is running, you can end the program by typing "q()" at the command line (without the quotes, of course). The command line is in the window labeled "R Console". Alternatively, click in the "R Console" window, then go to the File menu and select "Exit". When asked whether to save your session, I recommend answering "No" when you are in the computer lab.
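If you would rather skip the save question altogether, the exit can be written as one line. This is only a small sketch, not a course requirement; the `interactive()` guard is an added assumption so that the line does nothing if it ends up in a script run in batch mode.

```r
# Quit R without saving the workspace, so the save question is never asked.
# The interactive() guard (an assumption added here) keeps this from ending
# a non-interactive batch job by accident.
if (interactive()) {
  q(save = "no")
}
```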
After installing R, you may install the R package that accompanies Murrell's book. Start R and go to the Packages menu. Choose "Set CRAN mirror..." and pick a USA mirror. Go back to the Packages menu and select "Install Packages ...". A menu box will appear. Scroll down a long way until you find "RGraphics". Select it and click "OK". Then, under the Packages menu, select "Load package" and choose "RGraphics". To produce figure 1.1 from Murrell's book, type 'figure1.1()' (without the quotes) into R. The other figures work the same way.
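The menu steps can also be done from the R Console. The following is a rough sketch under a few assumptions: the helper name `draw_book_figure` is made up for illustration and is not part of RGraphics, and the mirror URL is just one example of a CRAN mirror. Defining the function does not install or draw anything; that only happens when you call it.

```r
# Hypothetical helper (not part of RGraphics or the course materials):
# installs RGraphics if it is missing, loads it, and draws one book figure.
draw_book_figure <- function(fig = "figure1.1",
                             repos = "https://cloud.r-project.org") {
  # Install only when the package is not already on this machine.
  # The repos URL is an example mirror; any CRAN mirror should do.
  if (!requireNamespace("RGraphics", quietly = TRUE)) {
    install.packages("RGraphics", repos = repos)
  }
  # Same effect as choosing "Load package" from the Packages menu.
  library(RGraphics)
  # Look up the figure function by name and call it, e.g. figure1.1().
  draw <- getExportedValue("RGraphics", fig)
  draw()
}
```

Typing `draw_book_figure()` should then produce figure 1.1. Passing another name, such as `draw_book_figure("figure1.2")`, assumes the package names its other figure functions in the same pattern.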
Some other courses/websites on statistical graphics
- Hadley Wickham's website
- The ggplot2 website
- Cleveland at Purdue
- Frank Harrell at Vanderbilt
- Michael Friendly at York
"... do wonks really use PowerPoint? I think most of them would recoil in horror at the thought. PowerPoint decks are the favored tool of the well-coiffed marketing weenies, not the number crunchers. True wonks would be a lot more likely to either (a) spend hours lovingly kerning their equations in LaTeX and producing 3-D scatterplots in R, or (b) spend five minutes pounding out something unreadable in Emacs, accompanied by a crude line chart generated by some completely inappropriate shell script." -- Kevin Drum Sept. 25, 2012
- Password access to this site will disappear at the end of finals week!
- Sarkar's Lattice Book Web Page
- UBS Prices and Earnings 2003. The 2003 version of the BigMac data source.
- UBS Prices and Earnings 2006. The 2006 version of the BigMac data source.
- Lecture notes on Lowess
- HW #1. Due Wed Apr 15, 2015. Hand in the bulk of it in class and email the Excel sheet.
- Excel Spreadsheet for reporting results.
- HW #2. Due Wed Apr 22, 2015.
- HW #3. Due Wed Apr 29, 2015.
- HW #4. Due Wed May 6, 2015.
- HW #5. Due Wed May 13, 2015.
- Project: Create your own lab. Due Wed May 27, 2015.
- Final Data Analysis Project. Abstract due May 4. Project due Tuesday June 9 at 3:30pm: email a PDF or leave a hard copy in my mailbox in CHS 51-254.
- Writing Advice
2015 Homework Due Dates
- HW 1. Wed April 15. Email a PDF or hand in during class.
- HW 2. Wed April 22. Email a PDF or hand in during class.
- HW 3. Wed April 29. Email a PDF or hand in during class.
- Final Project Abstract. May 4. Send a PDF by email.
- HW 4. Wed May 6. Email a PDF or hand in during class.
- HW 5. Wed May 13. Email a PDF or hand in during class.
- Project: Create your own lab. Wed May 27. Email a PDF and a Word or LaTeX file.
- Final Project. Email a single PDF (only) or leave a hard copy in my mailbox in CHS 51-254 by 3:30pm California time, Tuesday June 9.
2015 Computer Labs (Password protected).
- Lab 0. Beginning R. You may use this as an intro if you haven't learned R previously.
- Lab 1. Start here. Managing Data, Creating and Labeling Scatterplots.
- Lab 2. Looking at one variable.
- Lab 3. Lattice Graphics.
- Lab 4. Regression Graphics.
- Lab 5. ggplot2.
- Lab 6. Graphics for Longitudinal Data.
- Lab 7. Multivariate Graphics.
- Lab 8. More on ggplot2.
- Older lab.
- Lab. Multivariate Graphics.
Additional Code Samples and Data Sets
- R code to combine two data files, delete duplicate records.
- Code for all Murrell RGraphics Chapter 1 figures.
- Australian Institute of Sport (AIS) Data. R save file
- Australian Institute of Sport Data Documentation.
- Introductory lecture.
- Lectures on tables.
- Lectures on Exploratory Data Analysis and Regression
- 2011 Web links examples of good and bad graphics.
- Drawing conclusions from graphics
- Plots for Single Samples. 3 Lecture notes, 2 papers. I suggest that you read the papers.
- Longitudinal data graphics
- Math Stat for Kernel density estimates, Histograms and Lowess
- Writing Advice
- Grand Finale: