Michael Harris

Using R to Automate Markdown Code

I recently learned that Google requires affiliate links to be “nofollow” links. Unlike ordinary “follow” links, which are usually the default, “nofollow” links remove the endorsement that one website normally implies by linking to another.


Missing Data Imputation for Machine Learning

I go over methods of imputing missing data when training machine learning models. These techniques are inappropriate for hypothesis testing because they do not account for the uncertainty in the imputed values. However, if you are training a neural network on thousands of rows of data and some values are missing, these methods could be a good solution.


Visually Determining Normality in R

Much of what we do in statistics assumes that the data we are using are normally distributed. To check this common assumption, we can either visually inspect the data or use a hypothesis test. While hypothesis tests like the Shapiro-Wilk test offer a clear-cut decision, a simple visual inspection is sometimes preferred.
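
As a minimal sketch of the two approaches, the snippet below draws a histogram and a normal Q-Q plot and then runs the Shapiro-Wilk test. The data are simulated with `rnorm()` purely for illustration; the variable names are my own and are not taken from the tutorial.

```r
set.seed(7)

# Simulated, illustrative data (assumption: 200 draws from a normal distribution)
values <- rnorm(200)

# Visual inspection: a histogram should look roughly bell-shaped
hist(values)

# Visual inspection: points on a normal Q-Q plot should fall near the line
qqnorm(values)
qqline(values)

# Hypothesis test: a small p-value suggests a departure from normality
sw_result <- shapiro.test(values)
print(sw_result)
```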


Performing T-Tests in R

A guide to performing several types of t-tests: one-sample, two-sample assuming equal variance, two-sample assuming unequal variance, and two-sample dependent measures (paired).
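
For a quick sense of how these four tests map onto base R's `t.test()`, here is a rough sketch. The two groups are simulated and their names are hypothetical, so this illustrates the calls rather than the guide's own examples.

```r
set.seed(1)

# Simulated, illustrative measurements (assumption: 30 observations per group)
group_a <- rnorm(30, mean = 10, sd = 2)
group_b <- rnorm(30, mean = 11, sd = 2)

# One-sample: is the mean of group_a different from 10?
one_sample <- t.test(group_a, mu = 10)

# Two-sample assuming equal variance (pooled)
equal_var <- t.test(group_a, group_b, var.equal = TRUE)

# Two-sample assuming unequal variance (Welch, the default)
unequal_var <- t.test(group_a, group_b, var.equal = FALSE)

# Two-sample dependent measures (paired observations)
paired <- t.test(group_a, group_b, paired = TRUE)

print(unequal_var)
```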


Naive Bayes Classification in R

Naive Bayes is a computationally simple but remarkably effective method for classification. In this tutorial, I will show you how to run the model and determine its classification accuracy.


Regression by Sampling

We may encounter situations where the data set and the calculations required for a regression model do not fit in RAM. This article presents a technique for estimating regression coefficients by sampling from the data set and running smaller regression calculations.
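
One way the idea could look in practice is sketched below. I am assuming that the small regressions are combined by averaging their coefficients, which may differ from the article's exact method. The data set is simulated and held in memory only so the example runs; in a real case the sampled rows would be read from disk or a database.

```r
set.seed(42)

# Simulated stand-in for a large data set (hypothetical; true intercept 1, slope 2)
n <- 100000
big_data <- data.frame(x = rnorm(n))
big_data$y <- 1 + 2 * big_data$x + rnorm(n)

n_samples   <- 50    # number of small regressions to run
sample_size <- 1000  # rows used in each small regression

# Fit a small regression on each random subset and keep its coefficients
coefs <- replicate(n_samples, {
  rows <- sample(nrow(big_data), sample_size)
  coef(lm(y ~ x, data = big_data[rows, ]))
})

# Average the per-sample coefficients to estimate the full-data coefficients
estimate <- rowMeans(coefs)
print(estimate)
```

Larger values of `n_samples` or `sample_size` should bring the estimate closer to the full-data fit, at the cost of more computation.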


Scrambling the letters of a message with R

One of the most powerful aspects of R is its diverse set of random number generators. We can use these tools to obscure a message in what appear to be meaningless strings of text (cryptography).
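
A toy sketch of the idea, assuming the seed passed to `set.seed()` plays the role of a shared key. The variable names are hypothetical, and a simple shuffle like this is for illustration only, not secure encryption.

```r
set.seed(123)  # assumption: sender and receiver both know this seed

msg <- "meet me at noon"

# Split the message into individual characters
chars <- strsplit(msg, "")[[1]]

# Draw a random ordering of the character positions
ord <- sample(length(chars))

# Scramble the message by reordering its characters
scrambled <- paste(chars[ord], collapse = "")
print(scrambled)

# Undo the shuffle: order(ord) gives the inverse of the random ordering
scrambled_chars <- strsplit(scrambled, "")[[1]]
restored <- paste(scrambled_chars[order(ord)], collapse = "")
print(restored)
```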


Writing while loops in R

A while statement will run as long as the conditional statement it is given evaluates to true. The basic setup, then, is to write a condition that changes over time and eventually evaluates to false once the loop no longer needs to run.
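
A minimal sketch of that setup, using a hypothetical counter named `count`:

```r
# Counter that the condition depends on
count <- 1

# The loop runs as long as count <= 5 evaluates to TRUE
while (count <= 5) {
  print(count)
  # Changing count on each pass is what eventually makes the condition FALSE
  count <- count + 1
}
```

If the line that updates `count` were left out, the condition would never become false and the loop would run forever.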


Using For Loops in R

One of the most confusing aspects of R at first is using loops to repeat a chunk of code. I hope you will find this tutorial helpful and start to understand the power loops give to our code.
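
As a small illustration of repeating a chunk of code, the sketch below adds up the numbers 1 through 5. The variable names are my own and are not from the tutorial.

```r
# Running total, updated on each pass through the loop
total <- 0

# The loop body runs once for each value of i in 1, 2, 3, 4, 5
for (i in 1:5) {
  total <- total + i
  print(total)
}
```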


Creating Variables in R

To create a variable, we put the variable name we want on the left, add a left-pointing arrow (“<” followed by “-”), and then specify the value(s) we want to assign.
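
For example, a minimal sketch with two hypothetical variable names:

```r
# Assign a single number to a variable with the left-pointing arrow
x <- 5

# Assign several numbers at once by combining them with c()
scores <- c(90, 85, 77)

# Printing a variable shows what it holds
print(x)
print(scores)
```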
