Open and Closed Intervals: A Problem for ML Inference But Not Bayes

Does maximum likelihood inference have a support problem? 

Maximum likelihood (ML) has a problem with parameters that take values in open sets (Is that all of them? Almost!). Bayesian inference doesn't obviously have this problem. 

Briefly: using maximum likelihood, how many of you out there know how to put a standard deviation/standard error or a confidence interval on a probability when you get a result of y = 0 out of n = 20 independent Bernoulli(\pi) trials? How about when you get a variance estimate of \tau-hat = 0 in a random intercept model for longitudinal data? Self-check: I know how to set a CI for \pi; I did it once on a test too, fortunately a take-home test. But I don't know how to do this for the variance parameter. Maybe I could figure it out, inverting a likelihood ratio test or something. And I haven't a clue how to construct an SE in these situations. Even if we can solve the question for \pi and \tau, the serious problem is constructing a confidence interval for \pi_1 - \pi_2, the difference in probability of success between treatment and control groups. For the random intercept model, the problem is comparing the treatment mean \mu_2 to the control mean \mu_1, where we've randomized groups to treatment or control and need to estimate the intraclass correlation coefficient to properly estimate the SE of the estimated mean difference.
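One standard fix for the binomial case (not the only one) is to invert the exact binomial test rather than rely on the Wald SE, which degenerates when y = 0. A minimal sketch in Python (assuming scipy is available) of the Wald failure and the Clopper-Pearson exact interval:

```python
from scipy.stats import beta

n, y, alpha = 20, 0, 0.05

# Wald approach: p_hat = 0, so SE = sqrt(p_hat*(1-p_hat)/n) = 0 -> degenerate CI [0, 0]
p_hat = y / n
se_wald = (p_hat * (1 - p_hat) / n) ** 0.5

# Clopper-Pearson exact interval via the beta quantile representation
lower = beta.ppf(alpha / 2, y, n - y + 1) if y > 0 else 0.0
upper = beta.ppf(1 - alpha / 2, y + 1, n - y) if y < n else 1.0
# For y = 0 the upper limit reduces to 1 - (alpha/2)**(1/n), about 0.168 here
print(se_wald, lower, upper)
```

The exact interval exists for y = 0, but notice it had to be built by a different route than the usual estimate-plus-SE recipe, which is exactly the complaint.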

But these difficulties are not problems I ever have using Bayesian inference. Never. Not once. Data, yes: I've had y = 0 out of n = 20 walk in the door as part of a short-term consulting problem. And variance estimates of zero seem to happen every second analysis in group randomized trials when we are using SAS or similar software. We spend a ton of time trying to fix the resulting inferences. But as a Bayesian, I spend my time on modeling data, not on fixing the problems with ML inference.

In Bayesian inference, we put a prior density on an open set such as (0,1) for a probability or the positive real line for a variance. Posterior densities live on the same open set -- Bayesian inference works fine. For ML inference, estimates live on the closed set: probabilities may be estimated in the closed interval [0,1], and variances are estimated on zero plus the positive real line. The ML paradigm creates estimates and then uses those estimates in further needed calculations, such as standard errors and confidence intervals. ML estimates on the boundary can't be used in the usual SE or CI formulas; they do not come with natural standard error estimates. This causes enormous headaches for ML inference, though it appears to be a blessing for statisticians who need tenure, as it gives them plenty of grist for writing papers about cones and all sorts of cool (or obscure, depending on your viewpoint) mathematics.
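To make the contrast concrete, here is a minimal sketch of the Bayesian side, assuming a uniform Beta(1,1) prior purely for illustration: with y = 0 of n = 20, the posterior is Beta(1, 21), which lives on (0,1) and yields a perfectly ordinary credible interval even though the MLE sits on the boundary.

```python
from scipy.stats import beta

n, y = 20, 0
a, b = 1 + y, 1 + n - y              # Beta(1, 21) posterior under a uniform prior

post_mean = a / (a + b)              # 1/22, strictly positive, not 0
lo, hi = beta.ppf([0.025, 0.975], a, b)  # central 95% credible interval on (0,1)
print(post_mean, lo, hi)
```

No boundary, no special case, no separate formula for y = 0: the same two lines work for every value of y.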

It's a matter of support. In Bayesian inference you have to specify the support of the prior density. In classical statistics, getting the support right is usually a matter of finding out your approach is giving silly answers and then fixing the approach. These fixes are ad hoc and require lots of papers to be published, creating an inflated working class of (inflated) tenured frequentist statisticians. :-) As the silliness gets more subtle, more work goes into the fixes than would have gone into rejecting the approach in the first place. Go back in time to yesteryear when the method of moments was popular. A serious problem occurred in variance component models: negative estimates of variance. Negative estimates of variance! "But," sputters the frequentist statistician, "we don't do that any more." Fine, but you've still got a support problem.
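The negative-variance phenomenon is easy to reproduce. In a balanced one-way random effects model, the method-of-moments (ANOVA) estimator of the between-group variance is (MSB - MSW)/m, which goes negative whenever the group means are closer together than the within-group noise predicts. A toy illustration (the data are contrived to make the point):

```python
import numpy as np

# 5 groups of 2 observations each, all group means equal: a worst case for MoM
data = np.tile([-1.0, 1.0], (5, 1))
k, m = data.shape

grand = data.mean()
group_means = data.mean(axis=1)
msb = m * ((group_means - grand) ** 2).sum() / (k - 1)            # between mean square, 0 here
msw = ((data - group_means[:, None]) ** 2).sum() / (k * (m - 1))  # within mean square, 2 here

sigma_b2_hat = (msb - msw) / m   # method-of-moments between-group "variance": -1.0
print(sigma_b2_hat)
```

A Bayesian model with a prior supported on the positive reals simply cannot produce this estimate.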

ML statisticians should start wearing support hose to help support their support problem. 

Guns are the Tools that People Use Most Often to Kill People

Intro. A good friend has a habit of re-posting annoying and silly claims about guns on Facebook. No, I don't want to de-friend him. Mostly I ban the source of the claims, but he finds new sources of material. When I attempt to verify some claims, they are easily refuted by a quick Google search and the first link to Wikipedia.

The most specious of gun-related claims is 'guns don't kill people, people kill people'. If you want to play it that way, fine. Cars don't drive people, people drive people! Scissors don't cut paper, people cut paper! Knives don't chop vegetables, people chop vegetables! A gun is a tool, as are cars, scissors and knives. I use my car most days to get around town; I use scissors perhaps weekly to open mail or cut sheets of paper to size; I use knives daily to prepare and eat food. Cars, scissors and knives are tools that occasionally get used to kill people. Still, soccer moms/dads and taxi drivers don't kill many folks intentionally with their cars, second graders use scissors for art projects without managing to kill anyone, and chefs and cooks and food-eaters of all kinds use knives without stopping someone's beating heart.

Data. Being a statistician, what I did was get data from Wikipedia on numbers of murders and population broken out by state. Wikipedia got its data from the FBI Uniform Crime Reporting Statistics. The FBI data for 2010 seemed to be missing Florida, but Florida publishes the data for us, and the Florida and FBI sources match the data in the Wikipedia article. That particular Wikipedia article also gave 2010 state populations. I subtracted gun murders from total murders to give non-gun murders by state and calculated gun and non-gun murder rates.

Results. The first figure plots numbers of murders (gun and non-gun separately) in a state against state population on a log-log plot with lowess curves (iter=0) for both murder types. Gun murder counts are generally greater than murder counts using all other tools combined. For most of the state population range, the lowess curves are linear and parallel, suggesting that the number of murders of either type goes up as a power of the population. Fitting a simple GLM suggests the power is about 1.1 for both gun and non-gun murders, only slightly larger than 1. 
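That power-law claim corresponds to a straight line on the log-log scale: if murders ≈ c·population^b, then log(murders) = log c + b·log(population), and the fitted slope b is the power. A sketch on simulated data with a known exponent of 1.1 (illustrative only, not the FBI figures):

```python
import numpy as np

rng = np.random.default_rng(0)
pop = rng.uniform(0.6e6, 37e6, size=50)   # synthetic "state" populations
# Murder counts generated with a known power of 1.1 plus multiplicative noise
murders = 1e-6 * pop ** 1.1 * rng.lognormal(0.0, 0.1, size=50)

# Least squares on the log-log scale recovers the exponent (roughly 1.1)
b, log_c = np.polyfit(np.log(pop), np.log(murders), 1)
print(round(b, 2))
```

With real counts, a Poisson GLM with log(population) as a covariate does the same job while respecting the count nature of the data; the slope is interpreted the same way.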

Only for the smallest population states do the two lowess curves intersect. Checking the data, eight mostly northerly, mostly smaller states have non-gun murder counts higher than their gun murder counts: Hawaii, Maine, New Hampshire, North Dakota, Oregon, Utah, Vermont and West Virginia. The total number of murders with all tools in those 8 states was 262, with 34 more non-gun murders than gun murders. In contrast, there were 13,540 murders in the other 42 states.

We can also look at gun and non-gun murder rates by state; these are given in the second figure. Non-gun murder rates barely increase with population size, while gun murder rates do increase with population size, until somewhere just after 5 million people, when the rate of growth tapers off.

Conclusion. For 2010 there were 9,304 gun murders, and 4,498 non-gun murders in the 50 United States. In the United States in 2010, the gun is the primary tool people use for killing people. People use guns to kill people more than twice as often as they use all other tools combined. People use guns to kill people. 


More on Writing, Guest Post

This from Dr. Robert Bolan of the LAGLC. 

I agree with Rob’s choices of writing references. Strunk & White and Zinsser are indispensable and, perhaps not so surprisingly, they are written well enough that they actually can be read and not only used as quick lookup sources. Of course there are others, but these are touchstones of proper English grammar and word usage.

So much of good writing, as Rob suggests, is trying to achieve absolute clarity with the words you choose and how you string them together. Economy is a sacred principle in good writing. Use the right words and use as few as possible. Also, rearrange sentences to get the flow right. For guidance on these skills I like Getting the Words Right: How to Revise, Edit & Rewrite by Theodore A. Rees Cheney. For assistance in technical writing there are several references. I like Merriam-Webster’s Manual for Writers & Editors. For sheer brilliance and clarity of advice, check out Robertson Davies’ Reading and Writing, a slim volume you can read in two or three sessions (read slowly, let sink in, do not gulp this one down). And finally, I offer my fervent belief that scientific writing, although requiring parsimony and precision, need not be dry and devoid of style. Read anything by John Gardner on writing, Stephen King on writing, Eudora Welty on writing, or any novelist or essayist whose style you admire. And then when you’re done with all that, read Paradise Lost—aloud—not for comprehension but for the sheer thunderous music of it. Philip Pullman, who wrote the introduction to my edition of Milton’s masterpiece, remarked that "the experience of reading poetry aloud when you don’t fully understand it is a curious and complicated one. It’s like suddenly discovering you can play the organ." You will likely be thinking that poetry has nothing to do with scientific writing. I disagree. All writing exists for the purpose of communication. Again, scientific writing need not be sterile—although that often appears to be the gold standard for editors. If you have something important to say, you must say it clearly, of course. But cadence and musicality, sparingly used, can deliver your meaning with an elegance that will, unbeknownst to the reader, nestle it into place with crystal clarity. Compose, don’t just write.

Obsessive attention to nuance and detail in writing can be a curse as well as a virtue, and every true writer can identify with the following. A friend of Oscar Wilde’s is reported to have asked him what he did yesterday.  Wilde replied: "In the morning I took out a comma. In the afternoon I put it back in again."

Me again regarding this last: A wonderful short essay on being too critical of yourself early in the writing process is Gail Godwin's The Watcher at the Gate. 

Some Questions from an Undergraduate for a Biostatistics Graduate Admissions Chair

These are questions I answered recently for an undergrad who had questions about how best to prepare herself for graduate school. I thought the answers might be more widely of interest, and with a lot of editing, am including the questions and answers here. Disclaimer: Your mileage may vary. A lot. 

What do graduate programs look for in an applicant? How does admissions work? 

Programs differ immensely. I imagine that I got into my own graduate program only because I happened to be taking a course from the department chair the quarter I applied. My grad school application would not get me into the UCLA Biostat PhD program at this time, though it might get me into our MS program. [There's a pretty harsh moral in there somewhere.] Biostat programs often have fewer math requirements than statistics programs do, but they also may or may not teach as much math stat as a stat program. Ours does teach plenty of math stat. A lot of programs (stat and biostat alike) admit to the MS, then pass you through to the PhD program if you do well. We admit a few people to the PhD, many more to the MS, and we admit to the PhD from our own MS. I believe this is how many programs work. (Another model is that some programs admit only PhD students, but people who don't make it through the PhD are sent off with an MS degree as a consolation prize.) There are no doubt other models.

Is it more important that I take particular mathematics courses before sending in my application or get good grades in the ones that I do take?

I'd vote for good grades. A good grade when you've only taken one or two (upper division) math courses is an 'A'. If you're going for a PhD, though, you'll need to show that you can do PhD work, and that means taking a few difficult math courses. Undergrads usually don't take what I would call "really difficult statistics courses." But that is certainly university/program specific. If you take a lot of math classes, then you have enough to average out the occasional bad grade. But don't ask me to tell you how to balance out the occasional bad grade (i.e., a B) with more mathematics courses.

What do you look for in a graduate student?

You need to put in many hours (cf. Outliers by Malcolm Gladwell) to get really good at something. So putting in the time is worthwhile, starting now. For example, no graduate program teaches you everything about a subject; if you're going to have a well-rounded education after receiving your PhD, you're going to have to teach yourself more than you learn in class. How do you teach yourself material? Start now: trial and error, learn stuff, do badly on some things, better on others, and most importantly, keep on trying. So that's what I really want in a doctoral student: curiosity, an ability to figure stuff out, an inquiring mind, stick-to-it-iveness, someone interesting we'd really like to have around for several years. Sadly, that's not what is tested for on the GRE.

Do you, as an admissions officer, look at the courses applicants are "currently enrolled in" even if there is not a grade attached to that course?

Yes, absolutely. But mainly when it matters to my admissions decision. Here's one hypothetical example: consider someone with all B's in their first 2 years of college who suddenly 'finds themself' and gets all A's their junior year; if they are a math major, I really want to know about those senior-year Fall grades. If they are all A's, that person could be considered for the PhD program. If they are a mix of B's and some A's, maybe I'm willing to admit to the MS program only. For someone with all A's in their first three years of undergrad (yes, they exist), I probably don't need to see senior-year Fall grades to make an admissions decision.

Here's another common hypothetical: someone who qualifies for one degree program but is enrolled in courses that, if she gets A's in them, would qualify her for a different degree program. And suppose she prefers that different degree program. Then I'm looking at the courses, and I actually need to wait to see the grades before I can make a sensible admissions decision.

What math courses specifically do you look for in a student's application/ suggest I take before I apply?

For the MS program you need to take UCLA's Math 31AB, 32AB and 33AB sequence. [Those of you elsewhere can look up what those courses are and find the matching courses at your own university.] Next is any/every junior/senior level math class. Once people are enrolled in our MS program but are interested in the PhD program, there isn't lots of time to take many math courses, so we basically only recommend that they take real analysis (131A, and 131B) and linear algebra (115A, and sometimes 115B). I guess those qualify as the courses to take if you can only manage a small number of courses. But a lot of junior/senior level math courses can be useful, depending on what sort of statistics one ends up in. Also, real analysis can be a real bear of a course, and it may be helpful to ease into it by taking something else at the junior/senior level first. For direct admission to the PhD program, more math (with good grades, natch) is always better. Remember: you can never be too thin, too rich or know too much math.

Where Have All the Tenured Women Gone?

Ingram Olkin has an excellent editorial about gender equality in statistics departments in the US.  Everyone should read it. 

Each university has its own business structure, and UCLA has its own structure. Biostatistics also differs from statistics in a lot of important respects, particularly with regard to soft and hard money. I looked at UCLA Biostat based on information from our web site. Counting everyone listed as Professor in our directory, we had 2 female out of 17 Full Professors, 4/4 Associate Professors, and 3/5 Assistant Professors. These numbers include joint and secondary appointments, and part-time and non-tenured professors. Doing my best to count faculty with tenure in Biostat, I get 1/8 Full, 2/2 Associate and 0/1 Assistant Professors. It may sound funny to hedge ("Doing my best"), but I really didn't know how many people we had until I counted just now. For the tenured count, I again took my best guess. (Mistakes may get made!) One full male prof is actually split between two departments, so saying we have 1 female out of 7.5 tenured full professors might be slightly more accurate.

UCLA Biostat may not be doing too badly at the junior ranks in terms of gender diversification. Numbers are obviously not the whole story; atmosphere, support and opportunities also matter immensely. Senior faculty ranks are clearly lopsidedly male.

We all have to continue to actively support gender equality in hiring and promotion. Semper Vigilo.


What do I do? How do I apply statistics in my job? How did I get started?

I've been invited to a panel discussion by the UCLA undergraduate statistics club. Some of the questions I was told to expect are down below. By answering the questions here, there's a chance of a more literate answer, and other students will be able to read the answers as well.

What do you do on a day-to-day basis?

I'm not sure there's a day-to-day answer to this question! My days are quite varied and full. Some constants are:

  • Teaching classes, office hours, answering student emails. My classes are (i) longitudinal data analysis, (ii) Bayesian data analysis, (iii) multivariate analysis, (iv) statistical graphics. I occasionally present a one or two-day short course on longitudinal analysis.
  • Helping my non-statistician colleagues with their scientific research in many ways: 
    • By applying appropriate statistical methodologies in the analyses of their data,
    • Training their graduate students or biostat graduate students (depending on who is analyzing the data) to analyze their data,
    • Helping them design studies to collect the most useful data possible.
    • Helping them write grants to get money to do their research.
  • Advising my doctoral students on their dissertation work. This can include editing their writing, listening to where they are going with their research, making suggestions on where they might go with their research, advising on employment. 
  • Doing (bio)statistical research. Most of this is done jointly with my students and with some friends. It involves having an idea or three, writing the idea down, working out the details, running examples, writing the paper, then submitting the paper and nursing the paper through the submission process to acceptance. 
  • Lately I've been working on my statistics blog. 
  • Administrative jobs. Every business needs to be managed and run, and academia is no exception. I chair the admissions committee for our department and do other jobs around the department.
  • A number of students from around the university have discovered my Bayes and longitudinal courses, and I get to talk to them about their cutting-edge research in various disciplines. There's very little more fun than talking to a highly motivated young person about their research.
  • Refereeing biostatistics papers, and acting as an associate editor for Biometrics.

How do you apply statistics in your job?

  • Teaching statistics, analyzing data with my colleagues, advising doctoral students about statistics, designing studies, doing power calculations, and developing new statistical models and computational methods.

How did you get started in statistics?

That's a long story and I was lucky.

As a junior at the University of Minnesota, I was tired of being in school and wanted to graduate and start a career. Problem was, I needed a major, and my current major (physics) wasn't going to work. I opened my copy of the University catalog and read the requirements for every department, starting at the letter A and working my way forward alphabetically to M. At M, I realized I had always liked mathematics and, even better, had been good at it. Better still, I could graduate in 12 months if everything (me and scheduling) worked exactly right. Later that day I called my roommate and told him I was going to major in math. His response was to recommend taking statistics so I could get a job; the idea was that being an actuary paid well, although it had a reputation of being rather dull. Graduation required that I take three year-long sequences, and I took mathematical statistics, probability theory and real analysis. Those choices turned out to be ideal preparation for graduate school in statistics. That Fall I started mathematical statistics with Don Berry, a Bayesian statistician famous for his advocacy of adaptive Bayesian clinical trials. Don thought I showed promise and recruited me to graduate school in statistics at the U of M.

After starting graduate school, I discovered that some of my past activities provided useful preparation for statistics. I was a game player; I'd play chess, backgammon and bridge every chance I got. From chess I got the ability to calculate, to look ahead and to predict. Backgammon and bridge teach probability, and all three games teach understanding of other people and their motivations. From bridge I learned Bayesian thinking. One situation in bridge is called a finesse, where the goal is to find a particular queen in either your left-hand or right-hand opponent's hand. The instructions given to me were: if you think your left-hand opponent has the queen, you play this way; if you think the right-hand opponent has the queen, you play a different way. At the time, as a novice bridge player, I did not know what to do with that instruction. Later on, I realized that type of thinking was Bayesian in nature.

From backgammon I learned Monte Carlo simulation. Backgammon is a gambling game combining skill and chance. In any backgammon position played for monetary stakes, the value of the position is the amount you should pay an opponent if the game is ended at that point without completion. My friends and I would come across a particular position and wonder what the right move was. We would play the game from the given position with first one move, then repeat with the other move. Backgammon uses the roll of two six-sided dice to determine what moves can potentially be played at each turn; skill is used to pick the best among the allowed moves given the roll. In complex positions, the best (or correct) move may be unclear, and we would use Monte Carlo simulation of both move choices to determine the value of the game following each move. The move with the higher value is the better move.
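That rollout idea is easy to sketch. Here is a toy version (a made-up dice game, not real backgammon): simulate many completions of the game after each candidate move and compare the average payoffs.

```python
import random

random.seed(42)

def roll():
    """Roll two six-sided dice and return the sum."""
    return random.randint(1, 6) + random.randint(1, 6)

def equity(threshold, n_sims=100_000):
    """Monte Carlo estimate of expected payoff for this made-up game:
    win 1 if the roll reaches the threshold, lose 1 otherwise."""
    total = 0
    for _ in range(n_sims):
        total += 1 if roll() >= threshold else -1
    return total / n_sims

# Two candidate "moves" leave you needing different rolls to win
move_a = equity(8)   # true value: 15/36 - 21/36 = -1/6
move_b = equity(7)   # true value: 21/36 - 15/36 = +1/6
print(move_a, move_b)
```

Here the exact equities are computable, so the simulation is checkable; in a real backgammon position they are not, which is exactly when the rollout earns its keep.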

Over my undergraduate years I worked in two different high energy physics labs, helping make detectors for high energy physics research. I also worked in a reading research psychology lab, and I worked for a geophysicist who studied chemical compositions of meteorites to understand how the solar system came into being. The geophysics and reading research labs had expensive VAX computers running Unix for the sole purpose of data acquisition, management and analysis -- except on weekends, when the VAX might be switched over to play Star Trek. The high energy physicists spent enormous sums (millions) on constructing equipment to collect data. Data was clearly very valuable, and I gained a healthy respect for data, even if I didn't know much about it at the time.

Among other lessons, these experiences taught me that scientists have very strong opinions, and that those opinions might rule over the data on occasion. In psychology, I saw a well-respected senior researcher try to understand why an experiment came out wrong, and how to get it to come out right. He eventually reran the experiment with different subjects and it came out right. After I learned Bayesian statistics, I had a language and tools to think about this behavior. I also learned that scientists get it wrong sometimes. I remember the geophysics research group standing around talking (while I listened) about how another highly respected research group at a different major university had published a paper in a major journal and had gotten the conclusion dead wrong. Important takeaways from this exposure to science and scientists were that data was important and valuable, but data was not everything. Opinions mattered deeply, yet scientists can make bad mistakes.


Soccer Intervention, A Story in Semi-Demi Hierarchical Models with a Different Number of Hierarchies in Treatment and Control

I was doing a data analysis and power calculation for a proposed group randomized study and came across an interesting feature: the resulting model for the data will necessarily be different for treatment and control. Treatment will have 3 hierarchical levels; control will have 2 levels.

In this study, 24 neighborhoods are randomized to Soccer League + Vocational Training (SL-V), Soccer League (SL) or Control Condition (CC), 8 neighborhoods each. Within an SL-V or SL neighborhood, young men will be assigned to particular soccer teams run by a well-trained soccer coach. Neighborhoods are different, and teams will be quite different depending on the coach. Outcomes are to be measured longitudinally on young men. Thus any analysis of young men's outcomes needs to account for the nesting structure:

  1. Observations within young men,
  2. Young men within teams,
  3. Teams within neighborhoods.

But this is for the SL-V and SL conditions. The CC condition has no assignment to soccer teams, and thus no nesting of young men within teams, only within neighborhoods. 

Two observations on a given young man will be more similar than observations from two different young men. Two young men from the same soccer team will be more similar than two young men from different teams. And two young men from the same neighborhood are more similar to each other than two young men from different neighborhoods. But the control condition has no soccer league, and no assignment to soccer teams. Men in the control group are thus nested within neighborhoods only, they have no soccer teams to be nested in. Therefore the correlation structure/nesting structure is different for the CC neighborhoods compared to SL-V and SL neighborhoods. 

Consider the simplest of random effects models for the resulting data, where we have random intercepts for young men (YM), teams (T) and neighborhoods (N). There will be variances sigma_YM, sigma_T and sigma_N. 

Soccer neighborhoods have sigma_T and sigma_N. Control CC neighborhoods have sigma_N only. Will sigma_N for CC neighborhoods be the same as sigma_N for soccer neighborhoods? That is, will the sigma_T random effects only add variance to the soccer neighborhoods, or could they supplant some of the sigma_N variance in those neighborhoods? Do I need separate sigma_N's for CC and SL-V/SL neighborhoods? This is a neighborhood intervention, neighborhoods are not that large, and the intervention conditions could have rather substantial feedback loops within the neighborhood.
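One way to see what the differing nesting implies is to simulate from the simplest version of the model; the component values below are illustrative standard deviations, not estimates from any study. Soccer-arm outcomes get neighborhood + team + young-man + residual pieces; control-arm outcomes omit the team piece.

```python
import numpy as np

rng = np.random.default_rng(7)
sigma_N, sigma_T, sigma_YM, sigma_e = 0.5, 0.7, 1.0, 1.0  # illustrative SDs

def simulate_arm(has_teams, n_nbhd=8, n_team=3, n_ym=10, n_obs=4):
    """Simulate longitudinal outcomes under nested random intercepts.
    Control arms (has_teams=False) have no team level at all."""
    y = []
    for _ in range(n_nbhd):
        b_n = rng.normal(0, sigma_N)                     # neighborhood intercept
        for _ in range(n_team if has_teams else 1):
            b_t = rng.normal(0, sigma_T) if has_teams else 0.0  # team intercept
            for _ in range(n_ym):
                b_ym = rng.normal(0, sigma_YM)           # young-man intercept
                y.extend(b_n + b_t + b_ym + rng.normal(0, sigma_e, n_obs))
    return np.array(y)

soccer = simulate_arm(True)     # 8 x 3 x 10 x 4 = 960 observations
control = simulate_arm(False)   # 8 x 1 x 10 x 4 = 320 observations
# In expectation the arms differ in total variance by sigma_T**2
print(soccer.var(), control.var())
```

The simulation side is trivial; the point of the post is that the estimation side, where the likelihood must carry a team variance in two arms and not the third, is where ML software balks.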

There will be 8 neighborhoods for each of the three conditions, SL-V, SL and CC. That's not a lot of degrees of freedom for estimating the variances. And the SL-V and SL interventions could be different enough that they have different sigma_N and sigma_T variances from each other. And getting around to sigma_YM, those could well be different between conditions as well. If neighborhoods are different enough, I suppose sigma_YM could even differ by neighborhood or by condition.

Could be a fun study to analyze. 

Bayesian methods to the rescue. Using maximum likelihood (ML) will be highly problematic with 24 neighborhoods and 2 to 3 teams per neighborhood. First, ML software doesn't let us build a model and then estimate it; the models are all canned, and I'm not sure having no sigma_T random effects for the CC young men is even specifiable in standard ML software. There must be a way to trick SAS Proc Mixed or Glimmix into allowing this, but I can't think of it off the top of my head. Perhaps assigning every person in CC to the same team? Or every young man in a CC neighborhood to the same team? Those aren't quite right; what if sigma_T is larger than sigma_N? When ML software sets variances to zero, we have an even bigger problem, as small variances and zero variances are very different and give very different answers for things like confidence intervals, degrees of freedom and p-values.

If I need separate sigma_N variances for CC, SL and SL-V, that's only 8 neighborhoods per condition, and again ML often has trouble estimating variances from small degrees of freedom (df).

Bayes software and methods are more flexible, perhaps because they're not as highly evolved as ML software. Proper priors for the variances are necessary, but sensible proper priors for variances are not that hard to specify in hierarchical models, particularly for binary outcomes, and even for count and continuous outcomes it's not that hard. I'm okay, for purposes of this conversation, with a flat prior for the fixed effects parameters. Bayesian MCMC inference is straightforward, and the question becomes setting up the models and priors for the first time, then running a zillion copies of the analysis, as we have many, many outcomes to look at. One issue is how much tuning the variance priors need across different outcomes. I think I can deal with that. In contrast, ML software doesn't acknowledge that this model exists; ML will frequently set variance estimates to zero, necessitating teeth-gnashing about how to proceed; and with complex random effects models and many small degrees of freedom, one wonders how sensible or accurate all those Welch/Satterthwaite approximations might be. Bayes methods will grab all the information available in the data, accommodate all the uncertainty, and not flinch at small degrees of freedom. Unless our prior explicitly allows for variance parameters equal to zero (as in a mixture of a continuous density and a lump at zero), the Bayesian model will keep all variances non-zero. Asymptotic approximations not required. Success, and we can worry about the science, not about recalcitrant software and algorithms.

Advice to a Prospective Biostatistician

This is advice to a prospective student wondering whether to go into public health/epi or biostatistics. I'm willing to blindly argue for biostatistics, but prospective students might find it more useful if I frame the issues so they can decide for themselves. 

Biostatistician or subject matter specialist? Do you want to work in one discipline or do you like to move around? Not sure what exact field you want to work in?

From the perspective of scientists, biostatisticians are natural generalists. Biostatistics is great for people who like science, or better, for people who like sciences. As a kid, when asked what I wanted to be when I grew up, I had a long list of 'ists' that I wanted to be. As a biostatistician, I can work with an environmental health scientist this morning, then work on maternal and child health in South African townships in the afternoon. I teach classes to apprentice statisticians. These apprentice statisticians work with scientists from a variety of fields on a wonderfully varied set of research questions and data analyses. In some classes, I teach scientists: political scientists and epidemiologists, educators and emergency room docs, health policy wonks and psychologists, geneticists and nutritionists and the occasional urban planner. And after they've taken my class, we can talk and communicate at a higher level than we could before they took my class. And we've written papers together, because they still don't know enough statistics to solve their data analysis problems but now they can handle the software and understand the issues if we talk things through together.

The world needs both specialists and generalists. In the discipline of biostatistics, biostatisticians do specialize. I specialize in Bayesian statistics, hierarchical modeling and longitudinal data. Other people specialize in psychiatric statistics or causal inference or survival analysis. 

Scientists ask biostatisticians how to analyze data, and to analyze their specific data. They ask us how to design studies, and to design their specific study. We write data analysis plans, we do power calculations, we analyze data, we report results. We teach statistics, and we figure out better ways to analyze data.

Statistics or biostatistics? Biostatistics is not much different from statistics. In biostat, we're usually concentrating on analyzing data from public health, medicine, biology and public policy. Statisticians can do that as well. 

Did you want to go to med school? But you preferred math? Biostat may be for you!

How do we analyze the data? How should we analyze the data? What can we do in the time available and with the budget allotted and with the skills we currently have? What's the answer? How do we even get to an answer? What does the answer mean? How good is this answer? Those are the questions biostatisticians discuss and work on and ponder and figure out how to answer. We do this in the context of particular data sets, working closely with scientists and advocates and doctors and teachers to help them understand the truth of their data and the truth of the world seen through their data. Biostatisticians also do this generically, at a higher level, thinking about how to analyze data sets like this data set. We develop statistical methods that will help analyze classes of data sets, not just a single data set.

Public health trains scientists, advocates and educators. Epidemiologists or community health scientists or anthropologists or environmental health specialists are usually scientists, and some are advocates and some are educators and some are combinations of all three. Health policy produces managers and also scientists. Biostatisticians are scientists, but we are general scientists. We know how to do science and we know how to think about science. We have a general outline of science sitting in our heads; when we learn about a particular study or discipline, we fill in that outline with more details, but the outline never goes away. In grant writing, we know the endgame: we know what we will do with the data we're proposing to collect. The endgame tells us whether the experimental design will return the information the scientist needs, and it points us to better designs to collect data. We know where we're going, so we can advise on how to get there.

In academia especially, biostatisticians get to do all the fun stuff: designing studies, analyzing data, talking to students and other statisticians and scientific colleagues about what it all means, writing up results. We leave the difficult, annoying, time-consuming, mind-numbing stuff to our colleagues: figuring out what questions to ask our subjects and how to phrase them, actually collecting the data, keeping the mice and feeding them, storing the data, though we dive in and consult as needed.

What do you want to be when you grow up? I want to be an environmental scientist, a paleontologist, a nutritionist, and an epidemiologist. I want to talk to physicists and computer scientists, biologists and geneticists, engineers and business people, field workers and economists. I want to be a biostatistician. 

Intuitively, it is clear that it is obvious that any idiot can see

In technical writing, three terms/phrases not to be used:

  • Intuitively, ...
  • It is clear that ...
  • It is obvious that ...

You might just as well write

  • Any idiot can plainly see ...

These phrases may be true for you, the writer. However, the reader won't have your depth of understanding of the subject matter and especially of the current notation and material. When the reader does not find the truth as plain as the unvarnished nose on your face, using these phrases directly affronts dear reader. As figure 17 plainly demonstrates both to the unaided eye and to any imbecile, obviously you shouldn't use these phrases. Even my pet gerbil knows using these phrases will insult anyone with a kindergarten education and clearly get your paper rejected posthaste.

Acknowledgments: I first heard this advice indirectly from Don Berry via a student taking a course from him. Berry gave extensive feedback on a writing assignment. The 'even my pet gerbil knows' phrase is courtesy of Andy Unger, MN chess expert. 

Measurement and Measurement Error, Weight, Success and Failure

This blogger currently weighs 200 pounds. It's inscribed in my database, so it must be true. 200 is the latest in a series of daily morning readings taken wearing the same clothing, at the same time of day. But how is that 200 measured? And is 200 good or bad? Can 200 be trusted? My last four daily morning readings were 201, 202, 201, and now 200.

Today I stepped on the scale 3 times in a row. A needle swings around and points at a number that I can't read accurately because I'm too tall (or my eyes with glasses are not fully corrected for reading scale fine-print at a distance of 5 feet 7.5 inches not adjusting for head tilt). The scale read 200, 201 and 200, very clearly in 3 consecutive procedures: step on scale, let scale (and me) settle, step off scale, let scale (and me) settle. At a guess, scales have a measurement error of roughly 1 pound. At least, that's my statistical conclusion about my scale, having been informally thinking about measurement error and scales (or this scale anyway) and weight for many years. As a statistician, I think of measurement error as something with a standard deviation (SD), but measurement error could be assessed in other ways, perhaps as a (minimum, maximum) pair, or as the range = maximum minus minimum, where the minimum and maximum are the absolute farthest outside readings possible given a true value x, with minimum < x < maximum.
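The SD-versus-range distinction is easy to compute for the three repeated readings in this post (a quick illustration of the two ways of summarizing measurement error just described):

```python
import statistics

# Three repeated readings of the same true weight, from this morning's scale.
readings = [200, 201, 200]

# Measurement error summarized as a standard deviation...
sd = statistics.stdev(readings)

# ...or, more crudely, as the range = maximum minus minimum.
spread = max(readings) - min(readings)

print(round(sd, 2), spread)   # prints 0.58 1
```

So these three readings suggest a measurement SD of a bit over half a pound and a range of one pound, consistent with the rough one-pound guess above. Three readings is of course a tiny sample for estimating an SD; years of informal observation are doing most of the work.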

Why did I step on the scale 3 times? (1) It's Saturday, I'm not in a rush, and I can take the time. (2) Actually recording 200 has important meaning as a milestone. (3) Because 200 is a change from the previous day, it is more important as a conclusion than if 200 were the same as the past 3 weights. (4) Because 200 is a change, it is less believable than if it were the same as the previous measurement. (5) I want to be careful to not get over-excited or discouraged, and not to be or get over-optimistic or under-pessimistic.

Let's unpack those reasons. Reason (1) is cost. I'm not rushing to beat traffic, and can take the time to be careful in the measurement. Reasons (2) and (3) are about utility. Both reasons note that a conclusion that I weigh 200 (and 200 is less than and different from yesterday's 201 reading) is more important to me personally than if the weight were 198 or 202. Reason (4) reflects a hazily-thought-of but simple model for weight that says today's weight should be similar to yesterday's weight. Even if I'm losing weight, that's still a sensible model in the short term. It can be improved, but it provides a reasonable guide to thinking about weight readings in the short term. Finally, reason (5) is thinking of the future. Emotion is the enemy of careful measurement. If I get excited and start looking forward to weighing a svelte 180 or something clearly ridiculous in the short term, then I'll quickly get discouraged when the scale never reads in the low 190s, much less 180. I will then fail at any reasonable short term goals; I will stop recording my weight, and I will stop trying to eat healthy enough to continue to lose weight. In contrast, if my reading were the same as the last 3 readings, there'd be not much need to be careful; my weight has plateaued, the reading is believable, and I'd step on, read the number, step off and go on my way.
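Reason (4)'s today-should-look-like-yesterday model can be made concrete as a one-step normal-normal update: yesterday's recorded weight plus some daily drift acts as a prior for today's true weight, and today's scale reading is a noisy observation. The drift SD and measurement SD below are my assumptions for illustration, not numbers anyone has estimated carefully:

```python
# Prior: yesterday's recorded 201, plus an ASSUMED daily drift SD of 0.7 lb.
prior_mean, prior_sd = 201.0, 0.7

# Observation: today's reading of 200, with the roughly 1-lb scale error.
obs, obs_sd = 200.0, 1.0

# Standard normal-normal (precision-weighted) update.
w_prior = 1 / prior_sd ** 2
w_obs = 1 / obs_sd ** 2
post_mean = (w_prior * prior_mean + w_obs * obs) / (w_prior + w_obs)
post_sd = (w_prior + w_obs) ** -0.5

print(round(post_mean, 2), round(post_sd, 2))
```

Under these assumptions the posterior mean lands between 200 and 201, closer to yesterday's value because the prior carries more precision than a single noisy reading. That's exactly why a reading that changes from yesterday is "less believable" than one that repeats it, and worth the extra step-on-step-off replications.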

Yesterday blogwife was available for reading the scale. She thought the reading was right in between 200 and 201. I stepped on the scale 3 times and decided to record 201. If this is a momentary blip down, I don't want to get discouraged if the next 7 daily weights are 201. And if tomorrow my weight is 199, then it's even more fun to have dropped 2 pounds instead of 1 pound. Notice that today's reading involved a decision. Of those last 4 weights, each of the 201 measurements involved a decision to not record 200. Only the 202 was clearly not a 200. Today's 200, yesterday's 201 and Wednesday's 201 readings all involved a decision. Those negative decisions made it easier to decide to record 200 today; I'd been holding off on recording a 200, but finally decided to go for it.

Notice that utility as much as accuracy went into the decision to record 201 yesterday and 200 today. Measurement often involves decisions, and the considerations that go into those decisions are typically not recorded. My scale isn't perfectly accurate, but a bigger source of error is that, for personal reasons, I may report a number other than what I read off the scale.

People don't like to report bad results. Think about gambling wins and losses. When was the last time your friends went to Las Vegas and reported losing $100 at the slot machines? They're much more likely to tell you about the $5 gain on their previous trip than about the loss on this trip. Las Vegas is the beneficiary of much free advertising because of this! We all hear about the gains, often multiple times, but we rarely hear about the losses.

To answer the last question: Is 200 good or bad? For research purposes, it's probably important that we not put value judgments on the data we collect from people in our studies. But for me, and for weight generally, it's interesting, but whether 200 is 'good' or 'bad' can clearly be answered, and it depends on where we came from and where we're going. If I'm coming at 200 from above, then 200 is 'good'. If I'm coming at 200 from below, '200' is bad. Assuming I'm healthy. If I have a dread disease that causes weight loss, dropping weight is likely a bad thing, and gaining weight likely a good thing. Good and bad are relative, and depend on context. Today, in my context, 200 is good. When I hit 200 on the way up, 200 was bad. Once past 200, 200 became good again. How about that. Cultural relativism writ small! 