<span>Why Be Bayesian? Let Me Count the Ways</span>
<div><p>In answer to an old friend's question. </p>
<ol><li> Bayesians have more fun. </li>
<ol><li> Our conferences are in better places too. </li>
</ol><li> It's the model, not the estimator. </li>
<li> Life's too short to be a frequentist: In an infinite number of replications ... </li>
<li> Software works better. </li>
<ol><li> Rather surprisingly, Bayesian software is a lot more general than frequentist software. </li>
</ol><li> Small sample inference comes standard with most Bayesian model fitting these days. </li>
<ol><li> But if you like your inference asymptotic, that's available, just not high on anyone's priority list. </li>
<li> We can handle the no-data problem, all the way up to very large problems. </li>
<li> Don't need a large enough sample to allow for a bootstrap. </li>
</ol><li> Hierarchical random effects models are better fit with Bayesian models and software. </li>
<ol><li> If a variance component is small, the natural Bayes model doesn't allow zero as an estimate, while the natural maximum likelihood algorithms do. If you get a zero variance estimate, you're going to get poor estimates of the standard errors of the fixed effects. [More discussion omitted.] </li>
<li> Can handle problems where there are more parameters than data. </li>
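<li> A minimal numerical sketch of the zero-variance problem, with toy made-up data and the classical method-of-moments (ANOVA) estimator; no particular software package is being imitated:

```python
# Hypothetical sketch: the classical (ANOVA / method-of-moments) estimate of a
# between-group variance component can come out negative and gets truncated to
# zero, which is the troublesome zero estimate described above. A Bayesian
# posterior for the variance stays positive by construction.
import numpy as np

groups = [np.array([1.0, 2.0]), np.array([1.9, 1.1])]  # toy data, equal group means
n = len(groups[0])   # observations per group
k = len(groups)      # number of groups
grand = np.mean(np.concatenate(groups))
msb = n * sum((g.mean() - grand) ** 2 for g in groups) / (k - 1)
msw = sum(((g - g.mean()) ** 2).sum() for g in groups) / (n * k - k)
raw = (msb - msw) / n    # moment estimate of the between-group variance
est = max(0.0, raw)      # truncation: the zero estimate the text warns about
print(raw, est)          # raw is negative here; est is exactly 0.0
```
</li>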
</ol><li> Logistic regression models fit better with Bayes. </li>
<ol><li> If there's perfect separation on a particular variable, the maximum likelihood estimate of the coefficient is plus or minus infinity, which isn't a good estimate. </li>
<li> Bayesian modeling offers the opportunity (it doesn't guarantee it; there's no insurance against stupidity) to do the estimation correctly. </li>
<li> The same goes if you're trying to estimate a very tiny (or very large) probability. Suppose you observe 20 successes out of 20 trials on something you know doesn't have a 100% success rate. </li>
<li> To rephrase a bit: in small samples or with rare events, Bayesian estimates shrink towards sensible point estimates (if your prior is sensible), thus avoiding the large variance of raw point estimates. </li>
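<li> A minimal numerical sketch of the 20-out-of-20 example, assuming a uniform Beta(1, 1) prior (my choice for illustration, not the only sensible one):

```python
# Hypothetical illustration: 20 successes in 20 trials.
# The MLE of the success probability is 20/20 = 1.0, which we know is wrong.
# With a Beta(a, b) prior, the posterior is Beta(a + successes, b + failures),
# and its mean is shrunk away from the boundary.
successes, trials = 20, 20
a, b = 1.0, 1.0   # uniform Beta(1, 1) prior -- an assumption for this sketch
mle = successes / trials
post_mean = (a + successes) / (a + b + trials)
print(mle)                  # 1.0
print(round(post_mean, 4))  # 21/22, i.e. 0.9545
```
</li>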
</ol><li> The bias-variance trade-off is working in your favor. </li>
<li> Frequentists keep reinventing Bayesian methods. </li>
<ol><li> Shrinkage estimates </li>
<li> Empirical Bayes </li>
<li> Lasso </li>
<li> Penalized likelihood </li>
<li> Ridge regression </li>
<li> James-Stein estimators </li>
<li> Regularization </li>
<li> Pitman estimation </li>
<li> Integrated likelihood </li>
<li> In other words, it's just not possible to analyze complex data structures without Bayesian ideas. </li>
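<li> One concrete instance of the reinvention: ridge regression is the posterior mean of a Bayesian linear model with a Normal(0, &tau;&sup2;I) prior on the coefficients, with the ridge penalty &lambda; = &sigma;&sup2;/&tau;&sup2;. A minimal sketch with made-up data:

```python
# Hypothetical sketch: the ridge estimate and the Bayesian posterior mean
# under a conjugate normal prior are the same quantity. Toy data is made up.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=50)

sigma2, tau2 = 1.0, 0.5   # noise and prior variances -- assumptions
lam = sigma2 / tau2       # the implied ridge penalty

ridge = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)
# Bayesian posterior mean under the Normal(0, tau2 * I) prior:
post_mean = np.linalg.solve(X.T @ X / sigma2 + np.eye(3) / tau2, X.T @ y / sigma2)
print(np.allclose(ridge, post_mean))  # True
```
</li>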
</ol><li> Your answers are admissible if you're Bayesian but usually not if you're a frequentist. </li>
<ol><li> Admissibility means never having to say you're sorry. </li>
<li> Alternatively, admissibility means that someone else can't prove that they can do a better job than you. </li>
<li> And if you're a frequentist, someone is clogging your journals with proofs that the latest idiocy is or isn't admissible. </li>
<li> Unless they are clogging it with yet more ways to estimate the smoothing parameter for a nonparametric estimator. </li>
</ol><li> Bayesian models are generalizations of classical models. That's what the prior buys you: more models. </li>
<li> Can handle discrete, categorical, ordered categorical, trees, densities, matrices, missing data and other odd parameter types. </li>
<li> Data and parameters are treated on an equal footing. </li>
<li> I would argue that cross-validation works because it approximates Bayesian model selection tools. </li>
<li> Bayesian Hypothesis Testing </li>
<ol><li> Treats the null and alternative hypotheses on equal terms. </li>
<li> Can handle two or more than two hypotheses. </li>
<li> Can handle hypotheses that are: </li>
<ol><li> Disjoint </li>
<li> Nested </li>
<li> Overlapping but neither disjoint nor nested </li>
</ol><li> Gives you the probability the alternative hypothesis is true. </li>
<li> Classical inference can only handle the nested null hypothesis problem. </li>
<li> We're all probably misusing p-values anyway. </li>
</ol><li> Provides a language for talking about modeling and uncertainty that is missing in classical statistics. </li>
<ol><li> And thus provides a language for developing new models for new data sets or scientific problems. </li>
<li> Provides a language for thinking about shrinkage estimators and why we want to use them and how to specify the shrinkage. </li>
<li> Bayesian statistics permits discussion of the sampling density of the data given the unknown parameters. </li>
<li> Unfortunately, this is all that frequentist statistics allows you to talk about. </li>
<li> Additionally: Bayesians can discuss the distribution of the data unconditional on the parameters. </li>
<li> Bayesian statistics also allows you to discuss the distribution of the parameters. </li>
<li> You may discuss the distribution of the parameters given the data. This is called the posterior, and is the conclusion of a Bayesian analysis. </li>
<li> You can talk about problems that classical statistics can't handle: The probability of nuclear war for example. </li>
</ol><li> Novel computing tools -- but you can often use your old tools as well. </li>
<li> Bayesian methods allow pooling of information from diverse data sources. </li>
<ol><li> Data can come from books, journal articles, older lab data, previous studies, people, experts, the horse's mouth, or a rat's a**, or it may have been collected in the traditional form of data. </li>
<li> It isn't automatic, but there is language to think about how to do this pooling. </li>
</ol><li> Less work. </li>
<ol><li> Bayesian inference is via laws of probability, not by some ad hoc procedure that you need to invent for every problem or validate every time you use it. </li>
<li> Don't need to figure out an estimator. </li>
<li> Once you have a model and data set, the conclusion is a computing problem, not a research problem. </li>
<li> Don't need to prove a theorem to show that your posterior is sensible. It is sensible if your assumptions are sensible. </li>
<li> Don't need to publish a bunch of papers to figure out sensible answers given a novel problem. </li>
<li> For example, estimating a series of means $\mu_1, \mu_2, \ldots$ that you know are ordered $\mu_j \le \mu_{j+1}$ is a computing problem in Bayesian inference, but was the source of numerous papers in the frequentist literature. Finding a (good) frequentist estimator and finding standard errors and confidence intervals took lots of papers to figure out. </li>
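<li> A minimal sketch of how the ordering constraint becomes a computing step: sample the unconstrained posterior and keep only the ordered draws (simple rejection sampling; all posterior means and standard deviations below are made up for illustration):

```python
# Hypothetical sketch: enforcing mu_1 <= mu_2 <= mu_3 by rejection sampling
# from an assumed unconstrained posterior. Numbers are invented.
import numpy as np

rng = np.random.default_rng(0)
post_means = np.array([0.2, 0.3, 0.9])  # unconstrained posterior means (assumed)
post_sds = np.array([0.2, 0.2, 0.2])    # unconstrained posterior sds (assumed)
draws = rng.normal(post_means, post_sds, size=(100_000, 3))
keep = (draws[:, 0] <= draws[:, 1]) & (draws[:, 1] <= draws[:, 2])
ordered = draws[keep]
constrained_means = ordered.mean(axis=0)  # posterior means under the ordering
print(constrained_means)
```
</li>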
</ol><li> Yes, you can still use SAS. </li>
<ol><li> Or R or Stata. </li>
</ol><li> Can incorporate utility functions, if you have one. </li>
<li> Odd bits of other information can be incorporated into the analysis, for example </li>
<ol><li> That a particular parameter, usually allowed to be positive or negative, must be positive. </li>
<li> That a particular parameter is probably positive, but not guaranteed to be positive. </li>
<li> That a given regression coefficient should be close to zero. </li>
<li> That group one's mean is larger than group two's mean. </li>
<li> That the data comes from a distribution that is not a Poisson, Binomial, Exponential, or Normal. For example, the data may be better modeled by a t or a gamma distribution. </li>
<li> That a collection of parameters comes from a distribution that is skewed or has long tails. </li>
<li> Bayesian nonparametrics can allow you to model an unknown density as a non-parametric mixture of normals (or other density). The uncertainty in estimating this distribution is incorporated in making inferences about group means and regression coefficients. </li>
</ol><li> Bayesian modeling is about the science. </li>
<ol><li> You can calculate the probability that your hypothesis is true. </li>
<li> Bayesian modeling asks whether this model describes the data (mother nature, the data-generating process) correctly, or at least sufficiently correctly. </li>
<li> Classical inference is all about the statistician and the algorithm, not the science. </li>
<li> In repeated samples, how often (or how accurately) does this algorithm/method/model/inference scheme give the right answer? </li>
<li> Classical inference is more about the robustness (in repeated sampling) of the procedure. In that way, it provides robustness results for Bayesian methods. </li>
</ol><li> Bayesian methods have had notable successes, to wit: </li>
<ol><li> Covariate selection in regression problems </li>
<li> Model selection </li>
<li> Model mixing </li>
<li> And mixture models </li>
<li> Missing data </li>
<li> Multi-level and hierarchical models </li>
<li> Phylogeny </li>
</ol></ol><p>The bottom line: <b>More tools</b>. <b>Faster progress</b>.</p>
</div>
<span><span lang="" about="https://robweiss.faculty.biostat.ucla.edu/user/10" typeof="schema:Person" property="schema:name" datatype="" xml:lang="">robweiss</span></span>
<span>Thu, 07/07/2016 - 12:18</span>
<div>
<div>Filed Under</div>
<div>
<div><a href="https://robweiss.faculty.biostat.ucla.edu/bayes" hreflang="en">Bayes</a></div>
<div><a href="https://robweiss.faculty.biostat.ucla.edu/inference" hreflang="en">inference</a></div>
<div><a href="https://robweiss.faculty.biostat.ucla.edu/comparative" hreflang="en">comparative</a></div>
<div><a href="https://robweiss.faculty.biostat.ucla.edu/frequentist" hreflang="en">frequentist</a></div>
<div><a href="https://robweiss.faculty.biostat.ucla.edu/classical" hreflang="en">classical</a></div>
<div><a href="https://robweiss.faculty.biostat.ucla.edu/why-be-bayesian" hreflang="en">Why be Bayesian?</a></div>
</div>
</div>
<section><h2>Comments</h2>
<article data-comment-user-id="55" id="comment-1" class="comment js-comment"><footer class="comment-wrap"><div class="author-details">
</div>
<div class="author-comments">
<p class="comment-submitted">Submitted by <span lang="" about="https://robweiss.faculty.biostat.ucla.edu/user/55" typeof="schema:Person" property="schema:name" datatype="" xml:lang="">biostat202Cuser</span> on Tue, 04/04/2017 - 09:12</p>
<div class="content">
<h3><a href="https://robweiss.faculty.biostat.ucla.edu/comment/1#comment-1" class="permalink" rel="bookmark" hreflang="und">Biostat 285</a></h3>
<div><p>Student name: Changhee Lee<br />
Department: Electrical engineering</p>
</div>
</div>
</div>
</footer></article></section>