Posts in tag

Statistics


In my previous story, Data Scientist — 12 Steps From Beginner to Pro, I described how to master the profession from scratch. In this article, I will focus on the key skills required to become a Data Scientist. Hard Skills: 1. Mathematical base. Knowledge of machine learning techniques is an integral part of the Data …

Artificial intelligence (AI) paychecks benefit from scarce talent and high demand. This follows the law of supply and demand, and currently everything related to AI is in very high demand. The salaries of AI professionals are reaching sky-high levels and beyond. It is also essential to note that one Japanese …

No one can argue with statistics because they are hard facts. Or are they really? By this time, a lot of us have done a great job of realising that not everything we see on the internet is true. Some of the information we receive is deliberately manipulated in order to influence our behavior or our …

Last time, we discussed how to index or subset vectors and matrices in R. Now, we will deal with indexing the other commonly used R objects: lists and data frames. Typically, the data we deal with will not be as simple as vectors and matrices. Most of the time, more structure with the …

Previously, we talked about the objects in R and how they are created. Now, we will discuss how information can be extracted from these objects. This process is also known as indexing or subsetting. The ability to access information is important since there are times when we only need a specific data point or subset …

If you’ve ever tripped up over the term ‘Bayesian’ while reading up on data or tech, fear not. Strip away the jargon and notation, and even the mathematics-averse can make sense of the simple yet revolutionary concept at the core of both machine learning and behavioural economics. As this video from the YouTube channel 3Blue1Brown …

Generalized Linear Models (GLMs) are a large class of models that includes the familiar ordinary linear regression, also known as ordinary least squares (OLS) regression, and the analysis of variance (ANOVA) models. A bag loaded with tricks (models, rather): Both OLS regression and ANOVA deal with continuous response variables. However, there are times that we …

You often hear about Type I and Type II errors in statistics classes. There is good reason for that: minimizing either of these two errors is pretty much the core of statistical theory. Preliminaries: Type I and Type II errors are related to the concept of hypothesis testing. In hypothesis testing, we have two hypotheses: …

Outliers in data are the weird ones in a set. Their values lie far from the rest of the values in the sample. They can really ruin your analysis, especially if you are using methods that are sensitive to the presence of outliers. Given this, many people are inclined to remove these observations. While this …

Normality is an assumption that you will typically encounter in the statistical methods you employ. A large number of parametric tests assume that your data is normally distributed. Ordinary least squares regression assumes it for its error terms, too. You can …