In my opinion one of the more important, but tricky topics taught in first year statistics is the concept of Bayesian statistics. A lot of the basic analysis is done thinking about the null hypothesis, and trying to prove or disprove the null hypothesis. The basis is that given two sets of data, we have a null hypothesis that data A comes from the same set of data B. Some call this A/B testing, when you introduce something different, and want to see if there is any significant change.
But where this becomes limited is that sometimes you want to know more information, like what the probability is of getting a certain result from your test set. This is where Bayesian statistics comes into play. Taking the example further, we want to know the probability that if we picked from the two datasets, the probability difference between the two would be zero.
For me, this came up while doing work in analysing behaviour change. Normally, our observations are X, then we introduced a signal to do something different, and we get Y observations. Now this is time series data as well, so within certain time periods, what is the probability that the change was significant? This is after determining using a t-test that there was a difference in means between the two datasets.
This was all done in R, using the packages
BEST, which just happened to be what I found first.
My next set is to look into the package
stan which is a Bayesian statistics package, and very complicated.