Use the running pace calculator to find your necessary pace and splits based on your expected running time and distance

Photo by Bruno Nascimento


If you are a runner yourself, you are certainly aware of how important preparation is before a race. To prepare for my first marathon, I relied on a training plan.

This running plan was great, but an important piece of information was missing: the running pace. Most of the time, the distance and the time were given, but I needed to figure out the pace myself.

Although the computation is fairly easy, I felt like I was missing a quick way to compute my running pace based on the distance and expected time given by the training plan. So…
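As a minimal sketch of the computation mentioned above (not the calculator's actual code), pace is simply the target time divided by the distance. The function names `running_pace()` and `running_splits()` are hypothetical, and the units (kilometres and minutes) are an assumption.

```r
# Pace per kilometre, formatted as "m:ss".
# Assumptions: distance in kilometres, target time in minutes.
running_pace <- function(distance_km, time_min) {
  stopifnot(distance_km > 0, time_min > 0)
  pace_sec <- round(time_min * 60 / distance_km)  # seconds per km
  sprintf("%d:%02d", pace_sec %/% 60, pace_sec %% 60)
}

# Cumulative elapsed time (in minutes) at each full kilometre,
# assuming a perfectly even pace.
running_splits <- function(distance_km, time_min) {
  km <- seq_len(floor(distance_km))
  data.frame(km = km, elapsed_min = round(km * time_min / distance_km, 2))
}

running_pace(10, 50)       # 10 km in 50 minutes
running_splits(5, 25)      # splits for 5 km in 25 minutes
```

Rounding to whole seconds keeps the output readable; a real calculator might also support miles or hours as input.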

Learn the structure of a hypothesis test by hand, illustrated by 4 easy steps using the critical value, p-value and confidence interval methods

Photo by NeONBRAND

Descriptive versus inferential statistics

Remember that descriptive statistics is the branch of statistics aiming at describing and summarizing a set of data in the best possible manner, that is, by reducing it down to a few meaningful key measures and visualizations, with as little loss of information as possible. In other words, descriptive statistics helps to gain a better understanding and a clear picture of a set of observations thanks to summary statistics and graphics. …

Learn how to track the performance of your blog or website in R by analyzing page views, sessions, users and engagement with the {googleAnalyticsR} package

Photo by Arthur Osipyan on Unsplash


Stats and R was launched on December 16, 2019. Since the blog is officially one year old today and after having discussed the main benefits of maintaining a technical blog, I thought it would be a good time to share some numbers and thoughts about it.

In this article, I show how to analyze a blog and its blog posts with the {googleAnalyticsR} R package (see the package’s full documentation). After sharing some analytics about the blog, I will also discuss content creation/distribution and, to a lesser extent, future plans. …

Photo by Rey Seven

I am happy to announce that our paper entitled “Waiting period from diagnosis for mortgage insurance issued to cancer survivors” has been published in the European Actuarial Journal.

Here is a brief summary of it:

Massart’s (2018) testimonial illustrates the difficulties faced by patients who have survived cancer in accessing mortgage insurance securing a home loan. Data collected by national registries nevertheless suggest that excess mortality due to some types of cancer becomes moderate or even negligible after some waiting period.

In relation to the insurance laws passed in France and more recently in Belgium creating a right to be…

Learn how to perform an Analysis Of VAriance (ANOVA) in R to compare 3 groups or more. See also how to interpret the results and perform post-hoc tests

Photo by Battlecreek Coffee Roasters


ANOVA (ANalysis Of VAriance) is a statistical test to determine whether two or more population means are different. In other words, it is used to compare two or more groups to see if they are significantly different.

In practice, however, the:

  • Student t-test is used to compare 2 groups;
  • ANOVA generalizes the t-test beyond 2 groups, so it is used to compare 3 or more groups.

Note that there are several versions of the ANOVA (e.g., one-way ANOVA, two-way ANOVA, mixed ANOVA, repeated measures ANOVA, etc.). …
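To make the comparison of 3 groups concrete, here is a minimal sketch of a one-way ANOVA followed by a post-hoc test, using only base R. The choice of the built-in `PlantGrowth` dataset and of Tukey's HSD as the post-hoc test are assumptions made for illustration, not necessarily what the article uses.

```r
# One-way ANOVA on PlantGrowth, a dataset shipped with R:
# plant weight measured under 3 conditions (ctrl, trt1, trt2).
fit <- aov(weight ~ group, data = PlantGrowth)

# The ANOVA table; the p-value tests whether all group means are equal
anova_table <- summary(fit)[[1]]
p_value <- anova_table[["Pr(>F)"]][1]
print(anova_table)

# If the global test is significant, a post-hoc test indicates
# which pairs of groups differ (here: Tukey's HSD).
posthoc <- TukeyHSD(fit)
print(posthoc)
```

A significant ANOVA only says that at least one group differs from the others, which is why a post-hoc comparison is usually the next step. The assumptions of the test (normality of residuals, equal variances) should be checked before relying on the result.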

Learning by writing, getting feedback, contributing to the open source community and building professional relationships, among others

Photo by Patrick Fore

My blog was launched in December 2019. Although 9 months of writing is a very short period compared to others, I can already say that it’s been an incredible and very enriching adventure!

With 45 articles published (at the time of writing this article) and topics ranging from descriptive statistics, probability, inferential statistics to R Markdown and data visualization, I have seen many benefits of sharing my code through a technical blog.

In this article, I highlight 7 of them (in no particular order) with the hope that it will give ideas and incentives to some of you. …

Learn how to create professional graphics and plots in R with the ggplot2 package

Photo by Isaac Smith


R is known to be a really powerful programming language when it comes to graphics and visualizations (in addition to statistics and data science of course!).

To keep it short, graphics in R can be done in three ways, via the:

  1. {graphics} package (the base graphics in R, loaded by default)
  2. {lattice} package which adds more functionalities to the base package
  3. {ggplot2} package (which needs to be installed and loaded beforehand)

The {graphics} package comes with a large choice of plots and additional related features …

An R Shiny app to compute monthly loan or mortgage payments and to generate amortization tables

Photo by Tierra Mallorca


I recently moved out and bought my first apartment. Of course, I could not pay it entirely with my own savings, so I had to borrow money from the bank. I visited a couple of banks operating in my country and asked for a mortgage.

If you have already bought a house or an apartment in the past, you know how it goes: the bank analyzes your financial and personal situation and makes an offer based on your propensity to repay the bank. You then either accept the offer if you are satisfied with the rate and conditions, or visit another bank…
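For readers curious about what such a computation involves, here is a minimal sketch of the standard annuity formula for a fixed-rate loan with constant monthly payments. It is not the app's code: the function names are hypothetical, and it assumes a nominal annual rate divided by 12 to get the monthly rate (some banks convert the annual rate differently).

```r
# Constant monthly payment for a fixed-rate loan.
# Assumption: monthly rate = nominal annual rate / 12.
monthly_payment <- function(principal, annual_rate, years) {
  n <- years * 12                 # number of monthly payments
  r <- annual_rate / 12           # monthly interest rate
  if (r == 0) return(principal / n)
  principal * r / (1 - (1 + r)^(-n))
}

# Amortization table: split of each payment into interest and capital
amortization_table <- function(principal, annual_rate, years) {
  n <- years * 12
  r <- annual_rate / 12
  payment <- monthly_payment(principal, annual_rate, years)
  balance <- principal
  rows <- vector("list", n)
  for (m in seq_len(n)) {
    interest <- balance * r        # interest due on the remaining balance
    capital <- payment - interest  # part of the payment repaying the loan
    balance <- balance - capital
    rows[[m]] <- data.frame(month = m, payment = payment,
                            interest = interest, capital = capital,
                            balance = balance)
  }
  do.call(rbind, rows)
}

monthly_payment(200000, 0.06, 30)
head(amortization_table(200000, 0.06, 30), 3)
```

Early payments are mostly interest and late payments mostly capital, which is exactly what an amortization table makes visible.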

Learn how to detect outliers in R via descriptive statistics, the Hampel filter, the Grubbs, the Dixon and the Rosner tests for outliers

Photo by Will Myers


An outlier is a value or an observation that is distant from other observations, that is to say, a data point that differs significantly from other data points. Enderlein (1987) goes even further as the author considers outliers as values that deviate so much from other observations that one might suppose a different underlying sampling mechanism.

An observation must always be compared to other observations made on the same phenomenon before actually calling it an outlier. …
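As an illustration of one of the methods named above, the Hampel filter flags values lying outside the interval formed by the median plus or minus 3 median absolute deviations (MAD). The sketch below uses base R only; the function name `hampel_outliers()` and the toy data are hypothetical, and it relies on the default scaling constant of `mad()`, which may differ from the article's choice.

```r
# Hampel filter: flag values further than k MADs from the median.
# Assumption: mad() is used with its default scaling constant (1.4826).
hampel_outliers <- function(x, k = 3) {
  med <- median(x, na.rm = TRUE)
  spread <- mad(x, na.rm = TRUE)
  x[!is.na(x) & abs(x - med) > k * spread]
}

x <- c(10, 12, 11, 13, 12, 11, 50)  # toy data with one suspicious value
hampel_outliers(x)                  # returns 50
```

Because it is built on the median and the MAD rather than the mean and the standard deviation, the filter is not itself distorted by the extreme values it is trying to detect.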

Learn how to perform the non-parametric version of the Student’s t-test in R

Photo by Annie Spratt


In a previous article, we showed how to compare two groups under different scenarios using the Student’s t-test. The Student’s t-test requires that the distributions follow a normal distribution. In this article, we show how to compare two groups when the normality assumption is violated, using the Wilcoxon test.

The Wilcoxon test is a non-parametric test, meaning that it does not rely on data belonging to any particular parametric family of probability distributions. Non-parametric tests have the same objective as their parametric counterparts. However, they have an advantage over parametric tests: they do not require the assumption of normality of…
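A minimal sketch of the test on two independent samples, using `wilcox.test()` from base R. The data below are made up for illustration and do not come from the article.

```r
# Two independent samples (hypothetical scores, no ties)
x <- c(19, 18, 9, 17, 8, 7, 16, 12)
y <- c(20, 14, 15, 21, 22, 13, 23, 11)

# Wilcoxon rank-sum test (also known as the Mann-Whitney test):
# H0: the two groups come from the same distribution
res <- wilcox.test(x, y)
res$statistic   # W = number of (x, y) pairs in which x exceeds y
res$p.value
```

For paired samples (for instance the same subjects measured twice), the same function is called with `paired = TRUE`, which performs the Wilcoxon signed-rank test instead.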

Antoine Soetewey

PhD student and teaching assistant in statistics at UCLouvain (Belgium). Interested in statistics and R.
