# Statistics

### Statistics For Dummies: Indexing and Subsetting In R [Part 1 of 2]: Vectors And Matrices

Previously, we talked about the objects in R and how they are created. Now, we will discuss how information can be extracted from these objects. This process is also known as indexing or subsetting. The ability to access information is important, since there are times when we need only a specific data point or subset …

### What Is It To Be Bayesian? The (Pretty Simple) Math Modelling Behind A Big Data Buzzword

If you’ve ever tripped up over the term ‘Bayesian’ while reading up on data or tech, fear not. Strip away the jargon and notation, and even the mathematics-averse can make sense of the simple yet revolutionary concept at the core of both machine learning and behavioural economics. As this video from the YouTube channel 3Blue1Brown …

### A Short Primer On Generalized Linear Models (GLM)

Generalized Linear Models (GLM) are a large class of models that includes the familiar ordinary linear regression, also called ordinary least squares (OLS) regression, and the analysis of variance (ANOVA) models. A bag loaded with tricks (models, rather): Both OLS regression and ANOVA deal with continuous response variables. However, there are times when we …

### Statistics For Dummies: Type I And Type II Errors

You often hear about Type I and Type II errors in statistics classes. There is good reason for that: minimizing either of these two errors is pretty much the core of statistical theory. Preliminaries: Type I and Type II errors are related to the concept of hypothesis testing. In hypothesis testing, we have two hypotheses: …

### The Weird Ones: How To Handle Outliers In Your Data

Outliers in data are the weird ones in a set. Their values lie far from the rest of the values in the sample. They can really ruin your analysis, especially if you are using methods that are sensitive to the presence of outliers. Given this, many people are inclined to remove these observations. While this …

### How To: Perform Tests for Normality in R

Normality is one assumption that you will typically encounter in the statistical methods you employ. A large number of parametric tests carry the underlying assumption that your data is normally distributed. Ordinary least squares regression assumes it for its error terms, too. You can …

### Statistics for Dummies: Levels of Measurement

Arguably one of the most important skills for getting started with statistical methods is knowing the scale, or level of measurement, of your data. The appropriate method of analysis depends on the scale on which your data was measured. Here’s a quick rundown of the four levels …

### Statistics for Dummies: The Normal Distribution

The normal distribution is arguably the most widely used distribution in statistics. A lot of statistical methods rely on assuming that your data is normally distributed. What is so special about it? The infamous bell: The normal distribution is characterized by its trademark bell-shaped curve. The shape of the bell curve is dictated by two parameters. …

### PyCon 2019 | Bayesian Data Science by Simulation

Speakers: Eric Ma, Hugo Bowne-Anderson. This tutorial is an introduction to Bayesian data science through the lens of simulation, or hacker statistics. We will become familiar with many common probability distributions through (i) matching them to real-world stories and (ii) simulating them. We will work with …