If you are a runner yourself, you are certainly aware of how important preparation is before a race. To prepare for my first marathon, I relied on a training plan.

This running plan was great, but one important piece of information was missing: the running pace. Most of the time, the distance and the time were given, but I needed to figure out the pace myself.

Although the computation is fairly easy, I felt like I was missing a quick way to compute my running pace based on the distance and expected time given by the training plan. So…
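As a minimal sketch of that computation, assuming the distance is given in kilometres and the time in minutes, the pace is simply the time divided by the distance. The function name `running_pace` is hypothetical and only serves as an illustration:

```r
# Hypothetical helper: running pace in minutes per kilometre
running_pace <- function(distance_km, time_min) {
  stopifnot(distance_km > 0, time_min >= 0)
  pace <- time_min / distance_km            # decimal minutes per km
  minutes <- floor(pace)
  seconds <- round((pace - minutes) * 60)   # fractional part converted to seconds
  # Guard against the seconds being rounded up to 60
  if (seconds == 60) {
    minutes <- minutes + 1
    seconds <- 0
  }
  sprintf("%d:%02d min/km", as.integer(minutes), as.integer(seconds))
}

running_pace(10, 55)  # 10 km in 55 minutes
```

The function works on one distance and one time at a time; a vectorized version would need `ifelse()` instead of `if`.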

Remember that **descriptive statistics** is the branch of statistics aiming at **describing and summarizing a set of data** in the best possible manner, that is, by reducing it down to a few meaningful key measures and visualizations — with as little loss of information as possible. In other words, the branch of descriptive statistics helps to have a better understanding and a clear image about a set of observations thanks to summary statistics and graphics. …
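To illustrate the idea of reducing data to a few key measures, here is a minimal sketch using the built-in `iris` dataset (the choice of dataset and variable is an assumption made for this example):

```r
# Summarize 150 observations with a handful of key measures
x <- iris$Sepal.Length

stats <- c(
  min    = min(x),
  median = median(x),
  mean   = mean(x),
  max    = max(x),
  sd     = sd(x)
)
round(stats, 2)

# summary() gives a similar overview in one call
summary(x)
```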

Stats and R was launched on December 16, 2019. Since the blog is officially one year old today and after having discussed the main benefits of maintaining a technical blog, I thought it would be a good time to share some numbers and thoughts about it.

In this article, I show how to **analyze a blog and its blog posts** with the `{googleAnalyticsR}` R package (see package’s full documentation). After sharing some analytics about the blog, I will also discuss content creation/distribution and, to a lesser extent, the future plans. …

I am happy to announce that our paper entitled “Waiting period from diagnosis for mortgage insurance issued to cancer survivors” has been published in the European Actuarial Journal.

Here is a brief **summary** of it:

Massart (2018) testimonial illustrates the difficulties faced by patients having survived cancer to access mortgage insurance securing home loan. Data collected by national registries nevertheless suggest that excess mortality due to some types of cancer becomes moderate or even negligible after some waiting period.

In relation to the insurance laws passed in France and more recently in Belgium creating a right to be…

ANOVA (ANalysis Of VAriance) is a statistical test to determine whether two or more population means are different. In other words, it is used to **compare two or more groups** to see if they are significantly **different**.

In practice, however, the:

- **Student t-test** is used to compare **2 groups**;
- **ANOVA** generalizes the t-test beyond 2 groups, so it is used to compare **3 or more groups**.

Note that there are several versions of the ANOVA (e.g., one-way ANOVA, two-way ANOVA, mixed ANOVA, repeated measures ANOVA, etc.). …
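As a rough sketch of that distinction, the following compares 2 groups with a Student t-test and 3 groups with a one-way ANOVA. The built-in `iris` dataset and the variable `Sepal.Length` are assumptions chosen for the example, and equal variances are assumed for the t-test:

```r
# 2 groups: Student t-test (equal variances assumed)
two_groups <- subset(iris, Species != "setosa")
two_groups$Species <- droplevels(two_groups$Species)
t_res <- t.test(Sepal.Length ~ Species, data = two_groups, var.equal = TRUE)
t_res$p.value

# 3 groups: one-way ANOVA
aov_res <- aov(Sepal.Length ~ Species, data = iris)
summary(aov_res)
```

In both cases, a small p-value suggests that at least one group mean differs from the others. The assumptions behind each test still need to be checked before interpreting the results.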

My blog statsandr.com was launched in December 2019. Although 9 months of writing is a very short period compared to others, I can already say that it’s been an incredible and very enriching adventure!

With 45 articles published (at the time of writing this article) and topics ranging from descriptive statistics, probability, inferential statistics to R Markdown and data visualization, I have seen many benefits of sharing my code through a technical blog.

In this article, I highlight 7 of them (in no particular order) with the hope that it will give ideas and incentives to some of you. …

R is known to be a really powerful programming language when it comes to graphics and visualizations (in addition to statistics and data science of course!).

To keep it short, graphics in R can be done in three ways, via the:

- `{graphics}` package (the base graphics in R, loaded by default)
- `{lattice}` package which adds more functionalities to the base package
- `{ggplot2}` package (which needs to be installed and loaded beforehand)

The `{graphics}` package comes with a large choice of plots (such as `plot`, `hist`, `barplot`, `boxplot`, `pie`, `mosaicplot`, etc.) and additional related features (e.g., `abline`, `lines`, `legend`, `mtext`, `rect`…
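A minimal sketch of how a base plot and these related features combine, using only the `{graphics}` package and the built-in `iris` dataset (the dataset is an assumption made for this example):

```r
x <- iris$Sepal.Length

# High-level plot: draws the histogram and returns its bins invisibly
h <- hist(x, main = "Sepal length", xlab = "Length (cm)")

# Low-level additions drawn on top of the existing plot
abline(v = mean(x), col = "red", lwd = 2)
legend("topright", legend = "Mean", col = "red", lwd = 2)
```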

I recently moved out and bought my first apartment. Of course, I could not pay it entirely with my own savings, so I had to borrow money from the bank. I visited a couple of banks operating in my country and asked for a mortgage.

If you have already bought a house or an apartment in the past, you know how it goes: the bank analyzes your financial and personal situation and makes an offer based on your propensity to repay the bank. You then either accept the offer if you are satisfied with the rate and conditions, or visit another bank…

An **outlier** is a value or an **observation that is distant from other observations**, that is to say, a data point that differs significantly from other data points. Enderlein (1987) goes even further as the author considers outliers as values that deviate so much from other observations one might suppose a different underlying sampling mechanism.

An observation must always be compared to other observations made on the same phenomenon before actually calling it an outlier. …
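As an illustration of comparing an observation to the others, here is a minimal sketch based on the boxplot rule, which flags values lying more than 1.5 times the interquartile range beyond the hinges. This rule is only one common convention, chosen here as an assumption, and the data are made up for the example:

```r
# Made-up sample: six similar values and one distant value
x <- c(10, 12, 11, 13, 12, 11, 40)

# boxplot.stats() returns the points beyond the whiskers in $out
out <- boxplot.stats(x)$out
out
```

A value flagged this way is only a potential outlier; whether it should be kept, corrected or removed depends on the context.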

In a previous article, we showed how to compare two groups under different scenarios using the Student’s t-test. The Student’s t-test requires that the distributions follow a normal distribution. In this article, we show how to **compare two groups when the normality assumption is violated**, using the **Wilcoxon test**.

The Wilcoxon test is a **non-parametric test**, meaning that it does not rely on data belonging to any particular parametric family of probability distributions. Non-parametric tests have the same objective as their parametric counterparts. However, they have an advantage over parametric tests: they **do not require the assumption of normality** of…
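A minimal sketch of the test for two independent samples, using `wilcox.test()` from the `{stats}` package. The two small samples below are made up for the example:

```r
# Made-up, right-skewed samples for two independent groups
group_a <- c(1.1, 0.4, 2.3, 0.9, 1.7, 0.2, 3.1, 0.6)
group_b <- c(4.2, 5.9, 3.8, 7.4, 4.9, 12.5, 6.1, 5.3)

# Two-sided Wilcoxon rank-sum test (independent samples)
res <- wilcox.test(group_a, group_b)
res$statistic
res$p.value
```

For paired samples, the same function is used with the argument `paired = TRUE`.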

