7 Datasets in R

R ships with a number of built-in data sets, which are typically used as practice data for trying out R functions. This lecture first shows how to list and load the built-in data sets, then describes the most popular demo data sets: mtcars, iris, ToothGrowth, PlantGrowth, and USArrests.

7.1 Pre-loaded data

To list the pre-loaded data sets available on your system, call the function data() with no arguments.
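As a minimal sketch of how that listing can also be used programmatically, the example below restricts data() to the datasets package (the package that provides the demo data sets discussed in this chapter) and looks at the returned table. The variable name ds is only illustrative.

```r
# data() with no arguments lists every data set in the attached packages.
# Restricting the call to one package returns the same kind of object.
ds <- data(package = "datasets")

# The listing is stored as a character matrix in ds$results;
# "Item" holds the data set name and "Title" a short description.
head(ds$results[, c("Item", "Title")])

# Check whether a particular data set is available
"mtcars" %in% ds$results[, "Item"]
```

Storing the result rather than printing it is handy when you want to search the titles, for example with grep().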

7.2 Loading a built-in R data set

Load the mtcars data set and print its first rows as follows:

# Loading
data(mtcars)
# Print the first 6 rows
head(mtcars, 6)
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
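Besides head(), a few other base R functions give a quick overview of a freshly loaded data set. The sketch below is one possible inspection routine, not a required workflow. Note that data sets from the datasets package can also be used by name without calling data() first, because that package is attached by default.

```r
# Load the data set explicitly (optional for data in the 'datasets' package)
data(mtcars)

# Compact overview: one line per column with its type and first values
str(mtcars)

# Dimensions as c(rows, columns)
mtcars_dim <- dim(mtcars)
mtcars_dim

# Summary statistics (min, quartiles, mean, max) for a single column
summary(mtcars$mpg)
```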

7.3 Most used data sets in R

7.3.1 mtcars

The mtcars data set was extracted from the 1974 Motor Trend US magazine. It comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973–74 models).

# 1. Loading 
data("mtcars")  

# 2. Print
head(mtcars)  
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
# Number of rows (observations)
nrow(mtcars)  
## [1] 32
# Number of columns (variables)
ncol(mtcars)
## [1] 11

To learn more about the mtcars data set, open its help page:

?mtcars
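To give a feel for how mtcars is typically used in practice, here is a small sketch that summarises fuel consumption by number of cylinders. The object names mpg_by_cyl and am_counts are illustrative only.

```r
data("mtcars")

# Mean miles per gallon for each number of cylinders
mpg_by_cyl <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)
mpg_by_cyl

# Number of cars with automatic (am = 0) and manual (am = 1) transmission
am_counts <- table(mtcars$am)
am_counts
```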

7.3.2 iris

The iris data set gives the measurements, in centimeters, of sepal length, sepal width, petal length, and petal width for 50 flowers from each of 3 species of iris. The species are Iris setosa, Iris versicolor, and Iris virginica.

data("iris")
head(iris)
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1          5.1         3.5          1.4         0.2  setosa
## 2          4.9         3.0          1.4         0.2  setosa
## 3          4.7         3.2          1.3         0.2  setosa
## 4          4.6         3.1          1.5         0.2  setosa
## 5          5.0         3.6          1.4         0.2  setosa
## 6          5.4         3.9          1.7         0.4  setosa

To learn more about the iris data set, open its help page:

?iris
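The structure described above (50 flowers for each of 3 species) can be checked directly. The following sketch counts the rows per species and computes the mean of each measurement by species; the object names are illustrative.

```r
data("iris")

# Number of flowers measured for each species
species_counts <- table(iris$Species)
species_counts

# Mean of every measurement column, computed separately for each species
species_means <- aggregate(. ~ Species, data = iris, FUN = mean)
species_means
```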

7.3.3 ToothGrowth

The ToothGrowth data set contains the results of an experiment examining the effect of vitamin C on tooth growth in 60 guinea pigs. Each animal received one of three dose levels of vitamin C (0.5, 1, and 2 mg/day) by one of two delivery methods: orange juice (coded as OJ) or ascorbic acid (a form of vitamin C, coded as VC).

data("ToothGrowth")
head(ToothGrowth)
##    len supp dose
## 1  4.2   VC  0.5
## 2 11.5   VC  0.5
## 3  7.3   VC  0.5
## 4  5.8   VC  0.5
## 5  6.4   VC  0.5
## 6 10.0   VC  0.5

To learn more about the ToothGrowth data set, open its help page:

?ToothGrowth
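Because head() only shows rows from one supplement and dose, a cross-tabulation gives a clearer picture of the experimental design. The sketch below counts the animals in each supplement and dose combination and computes the mean tooth length per combination; the object names are illustrative.

```r
data("ToothGrowth")

# Number of animals for each delivery method (rows) and dose (columns)
design <- table(ToothGrowth$supp, ToothGrowth$dose)
design

# Mean tooth length for each combination of delivery method and dose
mean_len <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = mean)
mean_len
```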

7.3.4 PlantGrowth

The PlantGrowth data set contains results from an experiment comparing yields (measured as the dried weight of plants) obtained under a control condition and two different treatment conditions.

data("PlantGrowth")
head(PlantGrowth)
##   weight group
## 1   4.17  ctrl
## 2   5.58  ctrl
## 3   5.18  ctrl
## 4   6.11  ctrl
## 5   4.50  ctrl
## 6   4.61  ctrl
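The first rows only show the control group, so it can help to list all group labels and compare the groups directly. This is a minimal sketch using base R; group_means is an illustrative name.

```r
data("PlantGrowth")

# The three conditions: one control and two treatments
levels(PlantGrowth$group)

# Mean dried weight of the plants in each group
group_means <- tapply(PlantGrowth$weight, PlantGrowth$group, mean)
group_means
```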

7.3.5 USArrests

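The USArrests data set contains statistics, in arrests per 100,000 residents, for assault, murder, and rape in each of the 50 US states in 1973. It also gives the percentage of the population living in urban areas (UrbanPop).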

data("USArrests")
head(USArrests)
##            Murder Assault UrbanPop Rape
## Alabama      13.2     236       58 21.2
## Alaska       10.0     263       48 44.5
## Arizona       8.1     294       80 31.0
## Arkansas      8.8     190       50 19.5
## California    9.0     276       91 40.6
## Colorado      7.9     204       78 38.7
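Since the state names are stored as row names rather than in a column, sorting the data frame is a convenient way to explore it. The sketch below, with illustrative object names, counts the states and lists the five with the highest murder arrest rates.

```r
data("USArrests")

# One row per state
n_states <- nrow(USArrests)
n_states

# Five states with the highest murder arrest rate;
# the state names are kept as row names after sorting
top_murder <- head(USArrests[order(USArrests$Murder, decreasing = TRUE), ], 5)
top_murder
```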