Learn the structure of a hypothesis test by hand, illustrated by 4 easy steps using the critical value, p-value and confidence interval methods

Hypothesis test by hand
Photo by NeONBRAND

Descriptive versus inferential statistics

Remember that descriptive statistics is the branch of statistics that aims to describe and summarize a set of data in the best possible manner, that is, by reducing it to a few meaningful key measures and visualizations with as little loss of information as possible. In other words, descriptive statistics gives a better understanding and a clearer picture of a set of observations through summary statistics and graphics. …


Learn how to track the performance of your blog or website in R by analyzing page views, sessions, users and engagement with the {googleAnalyticsR} package

Track the performance of your blog or website in R based on Google Analytics data
Photo by Arthur Osipyan on Unsplash

Introduction

Stats and R was launched on December 16, 2019. Since the blog is officially one year old today and after having discussed the main benefits of maintaining a technical blog, I thought it would be a good time to share some numbers and thoughts about it.

In this article, I show how to analyze a blog and its blog posts with the {googleAnalyticsR} R package (see the package’s full documentation). After sharing some analytics about the blog, I will also discuss content creation/distribution and, to a lesser extent, the future plans. …


Waiting period from diagnosis for mortgage insurance issued to cancer survivors
Photo by Rey Seven

I am happy to announce that our paper entitled “Waiting period from diagnosis for mortgage insurance issued to cancer survivors” has been published in the European Actuarial Journal.

Here is a brief summary of it:

Massart (2018) testimonial illustrates the difficulties faced by patients having survived cancer to access mortgage insurance securing home loan. Data collected by national registries nevertheless suggest that excess mortality due to some types of cancer becomes moderate or even negligible after some waiting period.

In relation to the insurance laws passed in France and more recently in Belgium creating a right to be…


Learn how to perform an Analysis Of VAriance (ANOVA) in R to compare 3 groups or more. See also how to interpret the results and test the assumptions

ANOVA in R
Photo by Battlecreek Coffee Roasters

Introduction

ANOVA (ANalysis Of VAriance) is a statistical test to determine whether two or more population means are different. In other words, it is used to compare two or more groups to see if they are significantly different.

In practice, however, the:

  • Student t-test is used to compare 2 groups;
  • ANOVA generalizes the t-test beyond 2 groups, so it is used to compare 3 or more groups.
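To make the link between the two tests concrete, here is a minimal sketch. It assumes the built-in `PlantGrowth` dataset (plant weights under 3 conditions), which is my choice for illustration and not necessarily the data used in the article. With only 2 groups and equal variances assumed, the one-way ANOVA and the Student t-test give the same p-value, since the F statistic is the square of the t statistic.

```r
# Assumption: the built-in PlantGrowth data (groups: ctrl, trt1, trt2)
# stands in for any dataset with one numeric variable and one grouping factor
dat <- PlantGrowth

# Keep only 2 groups and drop the unused factor level
two_groups <- droplevels(subset(dat, group %in% c("ctrl", "trt1")))

# 2 groups: Student t-test (equal variances assumed)
t_res <- t.test(weight ~ group, data = two_groups, var.equal = TRUE)

# 2 groups: one-way ANOVA on the same data
aov_two <- aov(weight ~ group, data = two_groups)
p_aov_two <- summary(aov_two)[[1]][["Pr(>F)"]][1]

# 3 groups: one-way ANOVA, where a single t-test no longer applies
aov_three <- aov(weight ~ group, data = dat)
p_aov_three <- summary(aov_three)[[1]][["Pr(>F)"]][1]

# The first two p-values should coincide
c(t_test = t_res$p.value, anova_2_groups = p_aov_two, anova_3_groups = p_aov_three)
```
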

Note that there are several versions of the ANOVA (e.g., one-way ANOVA, two-way ANOVA, mixed ANOVA, repeated measures ANOVA, etc.). …


Learning by writing, getting feedback, contributing to the open source community and building professional relationships, among others

Why do I have a data science blog? 7 benefits of sharing your code
Photo by Patrick Fore

My blog statsandr.com was launched in December 2019. Although 9 months of writing is a very short period compared to others, I can already say that it’s been an incredible and very enriching adventure!

With 45 articles published (at the time of writing this article) and topics ranging from descriptive statistics, probability, inferential statistics to R Markdown and data visualization, I have seen many benefits of sharing my code through a technical blog.

In this article, I highlight 7 of them (in no particular order) with the hope that it will give ideas and incentives to some of you. …


Learn how to create professional graphics and plots in R with the ggplot2 package

Graphics in R with ggplot2
Photo by Isaac Smith

Introduction

R is known to be a really powerful programming language when it comes to graphics and visualizations (in addition to statistics and data science of course!).

To keep it short, graphics in R can be done in three ways, via the:

  1. {graphics} package (the base graphics in R, loaded by default)
  2. {lattice} package which adds more functionalities to the base package
  3. {ggplot2} package (which needs to be installed and loaded beforehand)

The {graphics} package comes with a large choice of plots (such as plot, hist, barplot, boxplot, pie, mosaicplot, etc.) and additional related features (e.g., abline, lines, legend, mtext, rect


An R Shiny app to compute monthly loan or mortgage payments and to generate amortization tables

Photo by Tierra Mallorca

Introduction

I recently moved out and bought my first apartment. Of course, I could not pay it entirely with my own savings, so I had to borrow money from the bank. I visited a couple of banks operating in my country and asked for a mortgage.

If you have already bought a house or an apartment in the past, you know how it goes: the bank analyzes your financial and personal situation and makes an offer based on your ability to repay the loan. You then either accept the offer if you are satisfied with the rate and conditions, or visit another bank…


Learn how to detect outliers in R via descriptive statistics, the Hampel filter, the Grubbs, the Dixon and the Rosner tests for outliers

Outliers detection in R
Photo by Will Myers

Introduction

An outlier is a value or an observation that is distant from other observations, that is to say, a data point that differs significantly from other data points. Enderlein (1987) goes even further, considering outliers as values that deviate so much from other observations that one might suppose a different underlying sampling mechanism.

An observation must always be compared to other observations made on the same phenomenon before actually calling it an outlier. …
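As a minimal sketch of what comparing an observation to the others can look like, the code below flags potential outliers in two ways: the interquartile range criterion used by boxplots (via `boxplot.stats()`), and the Hampel filter (median plus or minus 3 median absolute deviations). The vector `x` is made up for illustration, and the constant 3 is the usual convention rather than a fixed rule.

```r
# Hypothetical measurements (illustrative values only)
x <- c(12, 14, 15, 13, 16, 14, 15, 13, 14, 45)

# IQR criterion: values beyond 1.5 times the hinge spread from the hinges
out_iqr <- boxplot.stats(x)$out

# Hampel filter: values outside median +/- 3 * MAD
# (mad() scales by 1.4826 by default)
lower <- median(x) - 3 * mad(x)
upper <- median(x) + 3 * mad(x)
out_hampel <- x[x < lower | x > upper]

out_iqr
out_hampel
```

Both criteria only flag candidates: whether a flagged value is removed, corrected or kept remains a judgment call that depends on the context of the data.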


Learn how to perform the non-parametric version of the Student’s t-test in R

Wilcoxon test in R: how to compare 2 groups under the non-normality assumption
Photo by Annie Spratt

Introduction

In a previous article, we showed how to compare two groups under different scenarios using the Student’s t-test. The Student’s t-test requires that the distributions follow a normal distribution. In this article, we show how to compare two groups when the normality assumption is violated, using the Wilcoxon test.

The Wilcoxon test is a non-parametric test, meaning that it does not rely on data belonging to any particular parametric family of probability distributions. Non-parametric tests have the same objective as their parametric counterparts. However, they have an advantage over parametric tests: they do not require the assumption of normality of…
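For illustration, a minimal sketch of the test for two independent samples with base R's `wilcox.test()`. The two vectors are made-up values (not data from the article), chosen without ties so that R can compute an exact p-value.

```r
# Hypothetical scores for two independent groups (illustrative values only)
group_a <- c(12.1, 14.3, 9.8, 15.2, 11.7, 13.4)
group_b <- c(16.5, 18.2, 14.9, 19.1, 17.3, 15.8)

# Two-sided Wilcoxon rank-sum test (also called Mann-Whitney test)
# H0: the two groups come from the same distribution
w_res <- wilcox.test(group_a, group_b)

w_res$statistic  # W: number of pairs where a value of group_a exceeds one of group_b
w_res$p.value
```

For paired samples (for example, the same subjects measured twice), the same function is used with the argument `paired = TRUE`.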


See a step-by-step guide (with screenshots) on how to deploy and publish online a Shiny app using shinyapps.io

How to publish a Shiny app: example with shinyapps.io

Introduction

The COVID-19 virus led many people to create interactive apps and dashboards. A reader recently asked me how to publish a Shiny app she just created. Similarly to a previous article where I show how to upload R code on GitHub, I thought it would be useful to some people to see how I publish my Shiny apps so they could do the same.

Before going through the different steps required to deploy your Shiny app online, you can check the final result with my apps here.

Note 1: The screenshots have been taken on macOS and I have not…

Antoine Soetewey

PhD student and teaching assistant in statistics at UCLouvain (Belgium). Interested in statistics and R, author of statsandr.com
