What do I do? How do I apply statistics in my job? How did I get started?

I've been invited to a panel discussion by the UCLA undergraduate statistics club. Some of the questions I was told to expect are below. By answering them here, I have a chance at a more literate answer, and other students will be able to read the answers as well. 

What do you do on a day-to-day basis?

I'm not sure there's a day-to-day answer to this question! My days are quite varied and full. Some constants are:

  • Teaching classes, office hours, answering student emails. My classes are (i) longitudinal data analysis, (ii) Bayesian data analysis, (iii) multivariate analysis, (iv) statistical graphics. I occasionally present a one or two-day short course on longitudinal analysis.
  • Helping my non-statistician colleagues with their scientific research in many ways: 
    • By applying appropriate statistical methodologies in the analyses of their data,
    • Training their graduate students or our biostatistics graduate students to analyze the data, depending on who is doing the analysis,
    • Helping them design studies to collect the most useful data possible,
    • Helping them write grants to fund their research.
  • Advising my doctoral students on their dissertation work. This can include editing their writing, listening to where they are going with their research, making suggestions on where they might go with their research, advising on employment. 
  • Doing (bio)statistical research. Most of this is done jointly with my students and with some friends. It involves having an idea or three, writing the idea down, working out the details, running examples, writing the paper, then submitting the paper and nursing the paper through the submission process to acceptance. 
  • Lately I've been working on my statistics blog. 
  • Administrative jobs. Every business needs to be managed and run, and academia is no exception. I chair the admissions committee for our department and do other jobs around the department.
  • A number of students from around the university have discovered my Bayes and longitudinal courses, and I get to talk to them about their cutting edge research in various disciplines. There's very little more fun than talking to a highly motivated young person about their research. 
  • Refereeing biostatistics papers, and acting as an associate editor for Biometrics. 

How do you apply statistics in your job?

  • Teaching statistics, analyzing data with my colleagues, advising doctoral students about statistics, designing studies, doing power calculations, and developing new statistical models and computational methods.

How did you get started in statistics?

That's a long story and I was lucky.

As a junior at the University of Minnesota, I was tired of being in school and wanted to graduate and start a career. Problem was, I needed a major, and my current major (physics) wasn't going to work. I opened my copy of the University catalog and read the requirements for every department, starting at the letter A and working my way forward alphabetically to M. At M, I realized I had always liked mathematics and, better yet, had been good at it. Better still, I could graduate in 12 months if everything (me and scheduling) worked exactly right. Later that day I called my roommate and told him I was going to major in math. His response was to recommend taking statistics so I could get a job. The idea was that being an actuary paid well, although it had a reputation for being rather dull. Graduation required that I take three year-long sequences, and I took mathematical statistics, probability theory and real analysis. Those choices turned out to be ideal preparation for graduate school in statistics. That fall I started mathematical statistics with Don Berry, a Bayesian statistician famous for his advocacy of adaptive Bayesian clinical trials. Don thought I showed promise, and he recruited me to graduate school in statistics at the U of M. 

After starting graduate school, I discovered that some of my past activities provided useful preparation for statistics. I was a game player; I'd play chess, backgammon and bridge every chance I got. From chess I got the ability to calculate, to look ahead and to predict. Backgammon and bridge teach probability, and all three games teach understanding of other people and their motivations. From bridge I learned Bayesian thinking. One situation in bridge is called a finesse, where the goal is to find a particular queen in either your left-hand or right-hand opponent's hand. The instructions given to me were: if you think your left-hand opponent has the queen, you play one way; if you think the right-hand opponent has the queen, you play a different way. At the time, as a novice bridge player, I did not know what to do with that instruction. Later on, I realized that type of thinking was Bayesian in nature.
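The finesse instruction becomes concrete once you phrase it as a Bayes' rule update on where the queen sits. The numbers below are invented purely for illustration:

```python
# Hypothetical numbers, just to show the Bayesian flavor of the finesse.
# Prior: with no other information, the queen is equally likely in either hand.
p_left = 0.5

# Evidence: suppose the left-hand opponent opened the bidding, and we judge
# an opening bidder twice as likely to hold this queen (likelihood ratio 2:1).
likelihood_left = 2.0
likelihood_right = 1.0

# Bayes' rule: posterior proportional to prior times likelihood.
posterior_left = (p_left * likelihood_left) / (
    p_left * likelihood_left + (1 - p_left) * likelihood_right
)
print(round(posterior_left, 3))  # 0.667: finesse through the left-hand opponent
```

The "play this way or that way" advice is just: act on whichever posterior probability is larger.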

From backgammon I learned Monte Carlo simulation. Backgammon is a gambling game combining skill and chance. In any backgammon position played for monetary stakes, the value of the position is the amount you should pay an opponent if the game is ended at that point without completion. Backgammon uses the roll of two six-sided dice to determine what moves can potentially be played at each turn; skill is used to pick the best among the allowed moves given the roll. My friends and I would come across a particular position and wonder what the right move was. In complex positions, the best (or correct) move may be unclear, and we would play the game repeatedly from the given position, first with one move and then with the other, using Monte Carlo simulation of both move choices to determine the value of the game following each move. The move with the higher value is the better move. 
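That procedure can be sketched in a few lines. Real backgammon playouts need full game logic, so this toy example substitutes a single dice roll with made-up payoffs, but the Monte Carlo logic (simulate each candidate move many times, average, compare) is the same:

```python
import random

random.seed(1)

def playout(aggressive: bool) -> float:
    """One simulated continuation of a toy dice game (a stand-in for
    playing out a backgammon position; payoffs here are invented)."""
    roll = random.randint(1, 6) + random.randint(1, 6)
    if aggressive:
        # The aggressive move wins big on high rolls, loses a little otherwise.
        return 2.0 if roll >= 9 else -0.5
    # The safe move wins small but often.
    return 0.5 if roll >= 6 else -0.5

def monte_carlo_value(aggressive: bool, n: int = 100_000) -> float:
    """Estimate the value of the position after a move by averaging playouts."""
    return sum(playout(aggressive) for _ in range(n)) / n

v_aggressive = monte_carlo_value(True)
v_safe = monte_carlo_value(False)
print(v_aggressive, v_safe)  # record the move with the higher estimated value
```

With these toy payoffs the exact values are 7/36 for the aggressive move and 2/9 for the safe one, so with enough playouts the simulation reliably picks the safe move.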

Over my undergraduate years I had worked in two different high energy physics labs, helping make detectors for high energy physics research. I also worked in a reading research psychology lab, and I worked for a geophysicist who studied the chemical compositions of meteorites to understand how the solar system came into being. The geophysics and reading research labs had expensive VAX computers running Unix for the sole purpose of data acquisition, management and analysis. Except on weekends, that is, when the VAX might be switched over to play Star Trek. The high energy physicists spent enormous sums (millions) on constructing equipment to collect data. Data was clearly very valuable, and I gained a healthy respect for data, even if I didn't know much about it at the time. 

Among other lessons, these experiences taught me that scientists had very strong opinions, and that those opinions might rule over the data on occasion. In psychology, I saw a well-respected senior researcher try to understand why an experiment came out wrong and how to get it to come out right. He eventually reran the experiment with different subjects, and it came out right. After I learned Bayesian statistics, I had a language and tools to think about this behavior. I also learned that scientists get it wrong sometimes. I remember the geophysics research group standing around talking (while I listened) about how another highly respected research group at a different major university had published a paper in a major journal and had gotten the conclusion dead wrong. Important takeaways from this exposure to science and scientists were that data was important and valuable, but data was not everything. Opinions mattered deeply, yet scientists can make bad mistakes. 

 

Soccer Intervention, A Story in Semi-Demi Hierarchical Models with a Different Number of Hierarchies in Treatment and Control

I was doing a data analysis and power calculation for a proposed group-randomized study and came across an interesting feature: the resulting model for the data will necessarily be different for treatment and control. Treatment will have 3 hierarchical levels; control will have 2 levels.

In this study, 24 neighborhoods are randomized to Soccer League + Vocational Training (SL-V), Soccer League (SL) or Control Condition (CC), 8 each. Within an SL-V or SL neighborhood, young men will be assigned to particular soccer teams run by a well-trained soccer coach. Neighborhoods are different, and teams will be quite different depending on the coach. Outcomes are to be measured longitudinally on young men. Thus any analysis of young men's outcomes needs to account for the nesting structure:

  1. Observations within young men, 
  2. Young men within teams,
  3. Teams within neighborhoods.

But this is for the SL-V and SL conditions. The CC condition has no assignment to soccer teams, and thus no nesting of young men within teams, only within neighborhoods. 

Two observations on a given young man will be more similar than observations from two different young men. Two young men from the same soccer team will be more similar than two young men from different teams. And two young men from the same neighborhood are more similar to each other than two young men from different neighborhoods. But the control condition has no soccer league, and no assignment to soccer teams. Men in the control group are thus nested within neighborhoods only, they have no soccer teams to be nested in. Therefore the correlation structure/nesting structure is different for the CC neighborhoods compared to SL-V and SL neighborhoods. 

Consider the simplest of random effects models for the resulting data, where we have random intercepts for young men (YM), teams (T) and neighborhoods (N). There will be variances sigma_YM, sigma_T and sigma_N. 

Soccer neighborhoods have sigma_T and sigma_N. Control CC neighborhoods have sigma_N only. Will sigma_N for CC neighborhoods be the same as sigma_N for soccer neighborhoods? That is, will the sigma_T random effects only add variance to the soccer neighborhoods, or could it supplant some of the sigma_N variance in those neighborhoods? Do I need to have separate sigma_N's for CC and SL-V/SL neighborhoods? This is a neighborhood intervention, neighborhoods are not that large, and the intervention conditions could have rather substantial feedback loops within the neighborhood.
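A quick way to see the two different nesting structures is to simulate from them. This is an illustrative sketch with made-up variance components, not the study's actual design parameters:

```python
import random

random.seed(42)

# Hypothetical standard deviations for the random intercepts (illustrative only).
sd_N, sd_T, sd_YM, sd_eps = 1.0, 0.8, 1.2, 0.5

def simulate_neighborhood(arm, n_teams=3, men_per_team=10, n_obs=4):
    """Simulate longitudinal outcomes for one neighborhood.

    Soccer arms (SL-V, SL) nest young men within teams within the
    neighborhood; the control arm (CC) has no teams, so the team-level
    random effect simply does not exist for those young men."""
    u_N = random.gauss(0, sd_N)                      # neighborhood intercept
    soccer = arm in ("SL-V", "SL")
    teams = range(n_teams) if soccer else [None]
    men_per_group = men_per_team if soccer else n_teams * men_per_team
    data = []
    for team in teams:
        u_T = random.gauss(0, sd_T) if team is not None else 0.0  # team intercept
        for man in range(men_per_group):
            u_YM = random.gauss(0, sd_YM)            # young-man intercept
            for t in range(n_obs):
                y = u_N + u_T + u_YM + random.gauss(0, sd_eps)
                data.append((arm, team, man, t, y))
    return data

cc = simulate_neighborhood("CC")
sl = simulate_neighborhood("SL")
print(len(cc), len(sl))  # same number of observations, different nesting
```

The two arms produce the same number of observations per neighborhood, but the soccer arm's outcomes carry an extra shared team component, which is exactly the asymmetry the analysis model has to respect.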

There will be 8 neighborhoods for each of the three conditions SL-V, SL and CC. That's not a lot of degrees of freedom for estimating the variances. And the SL-V and SL interventions could be different enough that they have different sigma_N and sigma_T variances from each other. And as for sigma_YM, those could well differ between conditions as well. If neighborhoods are different enough, I suppose sigma_YM could even differ by neighborhood or by condition. 

Could be a fun study to analyze. 

Bayesian methods to the rescue. Using maximum likelihood (ML) will be highly problematic with 24 neighborhoods and 2 to 3 teams per neighborhood. First, ML software doesn't let us build a model and then estimate it; the models are all canned, and I'm not sure that having no sigma_T random effect for the CC young men is even specifiable in standard ML software. There must be a way to trick SAS Proc Mixed or Glimmix into allowing this, but I can't think of one off the top of my head. Perhaps assigning every person in CC to the same team? Or every young man in a CC neighborhood to the same team? Those aren't quite right; what if sigma_T is larger than sigma_N? When ML software sets variances to zero, we have an even bigger problem: small variances and zero variances are very different things and give very different answers for confidence intervals, degrees of freedom and p-values. 

If I need separate variances sigma_N for CC, SL and SL-V, that's only 8 neighborhoods per, and again ML often has trouble estimating variances from small degrees of freedom (df).

Bayes software and methods are more flexible, perhaps because they're not as highly evolved as ML software. Proper priors for the variances are necessary, but sensible proper priors for variances are not that hard to specify in hierarchical models, particularly for binary outcomes, and even for count and continuous outcomes it's not that hard. I'm okay, for purposes of this conversation, with having a flat prior for the fixed effects parameters. Bayesian MCMC inference is straightforward, and the work becomes setting up the models and priors for the first time, then running a zillion copies of the analysis, as we have many, many outcomes to look at. One issue is how much tuning the variance priors will need across different outcomes; I think I can deal with that. In contrast, ML software doesn't acknowledge that this model exists, ML will frequently set variance estimates to zero, necessitating teeth-gnashing about how to proceed, and with complex random effects models and many small degrees of freedom, one wonders how sensible or accurate all those Welch/Satterthwaite approximations might be. Bayes methods will grab all the information available in the data, accommodate all the uncertainty, and not flinch at small degrees of freedom. Unless our prior explicitly allows for variance parameters equal to zero (as in a mixture of a continuous distribution and a lump at zero), the Bayesian model will keep all variances non-zero. Asymptotic approximations not required. Success, and we can worry about the science rather than about recalcitrant software and algorithms. 

Advice to a Prospective Biostatistician

This is advice to a prospective student wondering whether to go into public health/epi or biostatistics. I'm willing to blindly argue for biostatistics, but prospective students might find it more useful if I frame the issues so they can decide for themselves. 

Biostatistician or subject matter specialist? Do you want to work in one discipline or do you like to move around? Not sure what exact field you want to work in?

From the perspective of scientists, biostatisticians are natural generalists. Biostatistics is great for people who like science, or better, for people who like sciences. As a kid, when asked what I wanted to be when I grew up, I had a long list of 'ists' that I wanted to be. As a biostatistician, I can work with an environmental health scientist in the morning, then work on maternal and child health in South African townships in the afternoon. I teach classes to apprentice statisticians, who work with scientists from a variety of fields on a wonderfully varied set of research questions and data analyses. In some classes, I teach scientists: political scientists and epidemiologists, educators and emergency room docs, health policy wonks and psychologists, geneticists and nutritionists and the occasional urban planner. After they've taken my class, we can talk and communicate at a higher level than before. And we've written papers together, because they still don't know enough statistics to solve their data analysis problems on their own, but now they can handle the software and understand the issues if we talk things through together.

The world needs both specialists and generalists. In the discipline of biostatistics, biostatisticians do specialize. I specialize in Bayesian statistics, hierarchical modeling and longitudinal data. Other people specialize in psychiatric statistics or causal inference or survival analysis. 

Scientists ask biostatisticians how to analyze data, and to analyze their specific data. They ask us how to design studies, and to design their specific study. We write data analysis plans, we do power calculations, we analyze data, we report results. We teach statistics, and we figure out better ways to analyze data.

Statistics or biostatistics? Biostatistics is not much different from statistics. In biostat, we're usually concentrating on analyzing data from public health, medicine, biology and public policy. Statisticians can do that as well. 

Did you want to go to med school? But you preferred math? Biostat may be for you!

How do we analyze the data? How should we analyze the data? What can we do in the time available, with the budget allotted and with the skills we currently have? What's the answer? How do we even get to an answer? What does the answer mean? How good is this answer? Those are the questions biostatisticians discuss and work on and ponder and figure out how to answer. We do this in the context of particular data sets, working closely with scientists and advocates and doctors and teachers to help them understand the truth of their data and the truth of the world seen through their data. Biostatisticians also do this generically, at a higher level, thinking about how to analyze data sets like this data set. We develop statistical methods that will help analyze classes of data sets, not just a single data set. 

Public health trains scientists, advocates and educators. Epidemiologists, community health scientists, anthropologists and environmental health specialists are usually scientists; some are advocates, some are educators, and some are combinations of all three. Health policy produces managers and also scientists. Biostatisticians are scientists, but we are general scientists. We know how to do science and we know how to think about science. We have a general outline of science sitting in our heads; when we learn about a particular study or discipline, we fill in that outline with more details, but the outline never goes away. In grant writing, we know the endgame: we know what we will do with the data we're proposing to collect. The endgame tells us whether the experimental design will return the information the scientist needs, and it points us to better designs for collecting data. We know where we're going, so we can advise on how to get there.

In academia especially, biostatisticians get to do all the fun stuff: designing studies, analyzing data, talking to students and other statisticians and scientific colleagues about what it all means, and writing up results. We leave the difficult, annoying, time-consuming, mind-numbing stuff to our colleagues: figuring out what questions to ask our subjects and how to phrase them, actually collecting the data, keeping and feeding the mice, storing the data, though we dive in and consult as needed. 

What do you want to be when you grow up? I want to be an environmental scientist, a paleontologist, a nutritionist, and an epidemiologist. I want to talk to physicists and computer scientists, biologists and geneticists, engineers and business people, field workers and economists. I want to be a biostatistician. 

Intuitively, it is clear that it is obvious that any idiot can see

In technical writing, three terms/phrases not to be used:

  • Intuitively, ...
  • It is clear that ...
  • It is obvious that ...

You might just as well write

  • Any idiot can plainly see ...

These phrases may be true for you, the writer. However, the reader won't have your depth of understanding of the subject matter and especially of the current notation and material. When the reader does not find the truth as plain as the unvarnished nose on your face, using these phrases directly affronts dear reader. As figure 17 plainly demonstrates both to the unaided eye and to any imbecile, obviously you shouldn't use these phrases. Even my pet gerbil knows using these phrases will insult anyone with a kindergarten education and clearly get your paper rejected posthaste.

Acknowledgments: I first heard this advice indirectly from Don Berry via a student taking a course from him. Berry gave extensive feedback on a writing assignment. The 'even my pet gerbil knows' phrase is courtesy of Andy Unger, MN chess expert. 

Measurement and Measurement Error, Weight, Success and Failure

This blog currently weighs 200 pounds. It's inscribed in my database, so it must be true. 200 is the latest in a series of daily morning readings taken wearing the same clothing, at the same time of day. But how is that 200 measured? And is 200 good or bad? Can 200 be trusted? My last four daily morning readings were 201, 202, 201, and now 200. 

Today I stepped on the scale 3 times in a row. A needle swings around and points at a number that I can't read accurately because I'm too tall (or my eyes, with glasses, are not fully corrected for reading scale fine print at a distance of 5 feet 7.5 inches, not adjusting for head tilt). The scale read 200, 201 and 200, very clearly, in 3 consecutive procedures: step on scale, let scale (and me) settle, step off scale, let scale (and me) settle. At a guess, scales have a measurement error of roughly 1 pound. At least, that's my statistical conclusion about my scale, having been informally thinking about measurement error and scales (or this scale anyway) and weight for many years. As a statistician, I think of measurement error as something with a standard deviation (SD), but measurement error could be assessed in other ways, perhaps as a (minimum, maximum) pair, or as the range = maximum minus minimum, where the minimum and maximum are the farthest possible readings given a true value x, with minimum < x < maximum.
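With this morning's three repeated readings, the two summaries of measurement error look like this (a quick sketch using Python's statistics module):

```python
import statistics

# This morning's three repeated readings of the same true weight.
readings = [200, 201, 200]

mean = statistics.mean(readings)
sd = statistics.stdev(readings)          # sample SD: one summary of measurement error
spread = max(readings) - min(readings)   # range: a cruder alternative summary

print(round(mean, 2), round(sd, 2), spread)  # 200.33 0.58 1
```

Three readings give a very noisy SD estimate, of course, but even this little calculation is consistent with the roughly-1-pound guess above.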

Why did I step on the scale 3 times? (1) It's Saturday, I'm not in a rush, and I can take the time. (2) Actually recording 200 has important meaning as a milestone. (3) Because 200 is a change from the previous day, it is more important as a conclusion than if 200 were the same as the past 3 weights. (4) Because 200 is a change, it is less believable than if it were the same as the previous measurement. (5) I want to be careful not to get over-excited or discouraged, and not to become over-optimistic or under-pessimistic.

Let's unpack those reasons. Reason (1) is cost. I'm not rushing to beat traffic and can take the time to be careful in the measurement. Reasons (2) and (3) are about utility. Both note that a conclusion that I weigh 200 (and 200 is less than and different from yesterday's 201 reading) is more important to me personally than if the weight were 198 or 202. Reason (4) reflects a hazily-thought-of but simple model for weight that says today's weight should be similar to yesterday's weight. Even if I'm losing weight, that's still a sensible model in the short term. It can be improved, but it provides a reasonable guide to thinking about weight readings in the short term. Finally, reason (5) is thinking of the future. Emotion is the enemy of careful measurement. If I get excited and start looking forward to weighing a svelte 180 or something clearly ridiculous in the short term, then I'll quickly get discouraged when the scale never reads in the low 190s, much less 180. I will then fail at any reasonable short-term goals; I will stop recording my weight, and I will stop trying to eat healthily enough to continue to lose weight. In contrast, if my reading were the same as the last 3 readings, there would not be much need to be careful; my weight has plateaued, the reading is believable, and I'd step on, read the number, step off and go on my way. 

Yesterday blogwife was available to read the scale. She thought the reading was right in between 200 and 201. I stepped on the scale 3 times and decided to record 201. If this is a momentary blip down, I don't want to get discouraged if the next 7 daily weights are 201. And if tomorrow my weight is 199, then it's even more fun to have dropped 2 pounds instead of 1. Notice that today's reading involved a decision. Of those last 4 weights, each of the 201 measurements involved a decision not to record 200. Only the 202 was clearly not a 200. Today's 200, yesterday's 201 and Wednesday's 201 readings all involved a decision. Those negative decisions made it easier to decide to record 200 today; I'd been holding off on recording a 200, but finally decided to go for it. 

Notice that utility as much as accuracy went into the decision to record 201 yesterday and 200 today. Measurement often involves decisions, and considerations that go into these decisions are typically not recorded. My scale isn't perfectly accurate, but a bigger source of error is that for personal reasons, I may report some number not what I read off the scale.

People don't like to report bad results. Think about gambling wins and losses. When was the last time your friends went to Las Vegas and reported losing $100 at the slot machines? They're much more likely to tell you about the $5 gain on their previous trip than about the loss on this trip. Las Vegas is the beneficiary of much free advertising because of this! We all hear about the gains, often multiple times, but we rarely hear about the losses. 

To answer the last question: Is 200 good or bad? For research purposes, it's probably important that we not put value judgments on the data we collect from people in our studies. But for me, and for weight generally, whether 200 is 'good' or 'bad' can clearly be answered, and it depends on where we came from and where we're going. If I'm coming at 200 from above, then 200 is 'good'. If I'm coming at 200 from below, 200 is 'bad'. That's assuming I'm healthy: if I have a dread disease that causes weight loss, dropping weight is likely a bad thing, and gaining weight likely a good thing. Good and bad are relative, and depend on context. Today, in my context, 200 is good. When I hit 200 on the way up, 200 was bad. Once past 200, 200 became good again. How about that: cultural relativism writ small! 