
Statistics For Dummies: Indexing And Subsetting In R [Part 2 of 2] : Lists And Data Frames

Last time, we discussed how to index or subset vectors and matrices in R. Now, we will deal with indexing the other commonly used R objects: lists and data frames.

Typically, we will not be dealing with data as simple as vectors and matrices. Most of the time, the information we collect has more structure. As a result, most objects that you will encounter will come in the form of lists and data frames. Data frames are used especially heavily in statistical analysis.

If you are not yet familiar with R objects, you may check the introduction here.

Requirements

  • R. If you haven’t installed R yet, you may do so here. We also made a tutorial on how to install R in Ubuntu.
  • RStudio (Optional). This tutorial will use R’s IDE, RStudio. You can still follow this tutorial using only R.

Lists

To illustrate subsetting in lists, let’s first generate a list by running the following code:

a <- list(names = c("Mary", "Lucas", "June"),
          items = c("Pen", "Paper", "Stone", "Scissors"),
          numbers = c(1:6))

 

Calling the variable a shows the output below:

> a
$names
[1] "Mary"  "Lucas"  "June" 

$items
[1] "Pen"  "Paper"  "Stone"  "Scissors"

$numbers
[1] 1 2 3 4 5 6

 

There are two levels of subsetting in lists:

  • element-wise
  • within elements

Lists: Element-wise subsetting

The elements could be vectors, numbers, data frames, matrices, and others. Suppose you want to extract just the collection of items in the list, a.

The collection of items is the second element in the list. You can extract it using the bracket operator as you would in vectors:

> a[2]
$items
[1] "Pen"  "Paper"  "Stone"  "Scissors"

 


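You can also extract the element using the dollar ($) operator, now specifying the name of the list element. Unlike the single bracket, which returns a one-element list, the dollar operator returns the element itself, so the output has no $items header:

> a$items
[1] "Pen"  "Paper"  "Stone"  "Scissors"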
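
The difference between the single bracket, the double bracket, and the dollar operator is easy to miss in printed output, so here is a minimal sketch that checks it directly. It rebuilds the list a from above; the variable names sub_list, by_position, by_name, and by_dollar are only illustrative.

```r
a <- list(names = c("Mary", "Lucas", "June"),
          items = c("Pen", "Paper", "Stone", "Scissors"),
          numbers = c(1:6))

# Single brackets keep the list wrapper: the result is a list of length 1.
sub_list <- a[2]
class(sub_list)

# Double brackets and $ return the element itself (a character vector here).
by_position <- a[[2]]
by_name     <- a[["items"]]
by_dollar   <- a$items
class(by_position)

# All three ways of reaching the element give the same vector.
identical(by_position, by_name)
identical(by_position, by_dollar)

# Single brackets can also select several elements at once, by name or index.
two_elements <- a[c("names", "numbers")]
names(two_elements)
```

Use the single bracket when you want a smaller list back, and the double bracket or dollar operator when you want to work with the contents of one element.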

 

Lists: Subsetting within elements

If, instead, you want to extract the third item in the items element, you have to add another layer of subsetting. This can be done using either the bracket or the dollar operator as shown below:

> a[[2]][3]
[1] "Stone"

> a$items[3]
[1] "Stone"

 

Using the bracket operator, first select the element by enclosing its index in double brackets ( [[ ]] ), which returns the element itself rather than a one-element list. Then add a single bracket with the index of the sub-element you want, as you would with a regular vector.

Using the dollar operator, you instead have to specify the name of the element and layer it with a single bracket operator enclosing the index (or indices) of the sub-elements that you are trying to extract.
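
As a small sketch of the two-layer pattern described above, the inner single bracket accepts anything a vector index accepts: several positions, negative positions, or a logical condition. The list a is rebuilt so the example runs on its own, and the result names are only illustrative.

```r
a <- list(names = c("Mary", "Lucas", "June"),
          items = c("Pen", "Paper", "Stone", "Scissors"),
          numbers = c(1:6))

# Several sub-elements at once: the first and third items.
first_and_third <- a[[2]][c(1, 3)]

# Negative index: every item except the first.
all_but_first <- a$items[-1]

# Logical condition: only the numbers greater than 3.
large_numbers <- a$numbers[a$numbers > 3]

# Double brackets also accept the element name.
second_name <- a[["names"]][2]

# Common pitfall: a[2] is still a list of length 1, so a[2][3] asks that
# list for a third element that does not exist and yields a NULL entry.
wrong <- a[2][3]
is.null(wrong[[1]])
```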

Data Frames

Data frames are subset in much the same way as lists, though there are slight differences. Let’s create a data frame and see for ourselves:

b <- data.frame( A = 1:4, B = 5:8, row.names = c("Mary", "Lucas", "Mattie", "June"))

> b
       A B
Mary   1 5
Lucas  2 6
Mattie 3 7
June   4 8
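
The reason list-style subsetting carries over is that a data frame is stored as a list of equal-length columns. The following sketch, which rebuilds b so it runs on its own, checks that with a few base R functions.

```r
b <- data.frame(A = 1:4, B = 5:8,
                row.names = c("Mary", "Lucas", "Mattie", "June"))

# A data frame is a list whose elements are its columns.
is.list(b)
length(b)      # number of columns
names(b)       # column names

# It also has matrix-like dimensions and row names.
dim(b)         # rows, then columns
rownames(b)

# str() gives a compact summary of each column and its type.
str(b)
```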

Now, let’s look at how subsetting occurs in data frames.

Data Frames: Element-wise subsetting

You can use either the bracket operator or the dollar operator to get the column you want. For instance, to extract the second column of the data frame:

> b$B
[1] 5 6 7 8

#if you want to extract the column as a data frame, retaining the row names
> b[2]
       B
Mary   5
Lucas  6
Mattie 7
June   8

#if you want to extract the column as a vector
> b[[2]]
[1] 5 6 7 8
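
Here is a short sketch of the same column extractions by name rather than by position, together with the matrix-style form. It rebuilds b so it runs on its own; the result names are illustrative, and setNames() is just one way to attach the row names to a plain vector.

```r
b <- data.frame(A = 1:4, B = 5:8,
                row.names = c("Mary", "Lucas", "Mattie", "June"))

# Single bracket: a one-column data frame that keeps the row names.
col_df <- b["B"]

# Double bracket or $: a plain vector without the row names.
col_vec <- b[["B"]]

# Matrix-style indexing drops to a vector by default...
col_matrix_style <- b[, "B"]

# ...unless drop = FALSE is given, which keeps the data frame.
col_kept <- b[, "B", drop = FALSE]

# To get a vector that still carries the row names, attach them explicitly.
named_vec <- setNames(b$B, rownames(b))
named_vec["Mattie"]
```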

 


Data Frames: Subsetting within elements

If you want to get the fourth element in the second column, there are multiple ways to do so using either the bracket or dollar operator:

> b$B[4]
[1] 8

#treating the data as if it was a matrix:
> b[4,2]
[1] 8

#treating the data as if it was a vector
> b[[2]][4]
[1] 8

Since data frames are laid out in rows and columns like matrices, you can use matrix-style [row, column] indices to reach a single value. You can also extract a column as a vector first and then use a bracket operator to pick a particular element.
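
Matrix-style indices also accept row and column names and logical conditions, not only positions. The following is a minimal sketch under the same setup, rebuilding b so it runs on its own; the result names are illustrative.

```r
b <- data.frame(A = 1:4, B = 5:8,
                row.names = c("Mary", "Lucas", "Mattie", "June"))

# Row and column names can replace numeric positions.
june_b <- b["June", "B"]

# Leaving the column index empty returns whole rows as a data frame.
lucas_row <- b["Lucas", ]

# A logical condition on one column filters the rows.
a_above_two <- b[b$A > 2, ]

# Several rows of one column come back as a vector.
first_and_last_a <- b[c(1, 4), "A"]
```

Indexing by name tends to be more robust than indexing by position, because it keeps working if rows or columns are reordered.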

Conclusion

This wraps up our tutorial on how to subset or index commonly used R objects. You can explore R yourself so that you can identify what subsetting methods and what kind of objects you are most comfortable with.

