R is a popular programming language and environment for statistical computing and graphics. One of its most powerful tools is the scatterplot, which is used to visualize relationships between two continuous variables. In this article, we will explore various ways to create scatterplots in R.
Basic Scatterplot
A basic scatterplot can be created with the plot function.
plot(y ~ x)
This will create a scatterplot of y against x. You can customize the plot by adding labels, titles, and other features.
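As an illustration of these customizations, the following sketch uses the built-in mtcars dataset, which is an assumption made here because x and y are not defined at this point. The title, axis labels, and color are illustrative choices, and the dashed trend line is an optional extra rather than something plot requires.

```r
# Sketch: a customized base-R scatterplot using the built-in mtcars data.
# The dataset, labels, and colors are illustrative choices.
plot(mpg ~ wt, data = mtcars,
     main = "Fuel economy vs. weight",   # Plot title
     xlab = "Weight (1000 lbs)",         # X-axis label
     ylab = "Miles per gallon",          # Y-axis label
     pch = 19,                           # Solid circles
     col = "steelblue")                  # Point color

# Optional: overlay a least-squares line to summarize the trend
fit <- lm(mpg ~ wt, data = mtcars)
abline(fit, lty = 2)
```

The formula interface (mpg ~ wt, data = mtcars) keeps the variable names attached to the data frame, so the same call works for any pair of numeric columns.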
Customizing the Legend
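With the scatterplot function of the car package (load it first with library(car)), you can disable the legend of a grouped plot by setting the legend argument to FALSE. Other arguments follow the same pattern; the call below disables the marginal boxplots and the grid and draws data ellipses.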
scatterplot(x, y,
            boxplots = "",   # Disable boxplots
            grid = FALSE,    # Disable plot grid
            ellipse = TRUE)  # Draw ellipses
There are more arguments you can customize; type ?scatterplot for additional details.
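The call above does not use the legend argument itself. A minimal sketch of suppressing the legend, assuming car version 3.0 or later and using the built-in mtcars data grouped by number of cylinders (the dataset and the grouping variable are assumptions, not part of the original example):

```r
# install.packages("car")
library(car)

# Assumption: use mtcars and treat the number of cylinders as the grouping factor
cars_df <- transform(mtcars, cyl = factor(cyl))

scatterplot(mpg ~ wt | cyl, data = cars_df,
            legend = FALSE,   # Suppress the group legend
            boxplots = "",    # Disable marginal boxplots
            grid = FALSE,     # Disable plot grid
            smooth = FALSE)   # Omit the nonparametric smoother
```

The part of the formula after the vertical bar names the grouping variable; without groups there is no legend to disable.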
Scatterplot Matrix
When dealing with more than two variables, it is common to arrange the pairwise scatterplots in a matrix, so that each variable is plotted against every other and the correlations between them are visible at a glance. In R, you can create such a scatterplot matrix with the pairs function.
pairs(~disp + wt + mpg + hp, data = mtcars)
In addition, if your dataset contains a factor variable, you can pass it to the col argument as follows to plot the groups in different colors.
pairs(~disp + wt + mpg + hp, col = factor(mtcars$am), pch = 19, data = mtcars)
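Since the point of the matrix is to show how the variables correlate, it can help to print the coefficients next to the plots. The sketch below passes a custom function to the upper.panel argument of pairs. The helper name panel_cor is hypothetical, and Pearson correlation is an assumption about which measure is wanted.

```r
# Subset of mtcars used in the examples above
vars <- mtcars[, c("disp", "wt", "mpg", "hp")]

# Hypothetical helper: write the Pearson correlation in the middle of a panel
panel_cor <- function(x, y, ...) {
  old <- par(usr = c(0, 1, 0, 1))   # Switch the panel to unit coordinates
  on.exit(par(old))                 # Restore the previous coordinates on exit
  r <- cor(x, y)
  text(0.5, 0.5, format(r, digits = 2), cex = 1.5)
}

# Scatterplots below the diagonal, correlation coefficients above it
pairs(vars, upper.panel = panel_cor, pch = 19)
```

Each upper panel mirrors a lower panel, so replacing it with a number loses no information.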
An alternative is the scatterplotMatrix function of the car package, which adds kernel density estimates on the diagonal.
# install.packages("car")
library(car)
scatterplotMatrix(~ disp + wt + mpg + hp, data = mtcars)
You can customize the colors of the previous plot with the corresponding arguments:
scatterplotMatrix(~ disp + wt + mpg + hp, data = mtcars,
                  diagonal = FALSE,                       # Remove kernel density estimates
                  regLine = list(col = "green",           # Linear regression line color
                                 lwd = 3),                # Linear regression line width
                  smooth = list(col.smooth = "red",       # Non-parametric mean color
                                col.spread = "blue"))     # Non-parametric variance color
Scatterplot with ggplot2
With the ggplot2 package, you can create a scatterplot with the geom_point function. To color the points by group, map the grouping variable to the colour aesthetic inside aes.
# install.packages("ggplot2")
library(ggplot2)
# x, y and group must already exist as vectors of equal length
my_df <- data.frame(x = x, y = y, group = group)
ggplot(my_df, aes(x = x, y = y)) +
  geom_point(aes(colour = group)) +   # Points and color by group
  scale_color_discrete("Groups") +    # Change legend title
  xlab("Variable X") +                # X-axis label
  ylab("Variable Y") +                # Y-axis label
  theme(axis.line = element_line(colour = "black",   # Changes the default theme
                                 linewidth = 0.24))  # ggplot2 >= 3.4.0; use size in older versions
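The snippet above assumes that x, y and group already exist. A self-contained sketch with simulated data follows; the group labels, sample size, and the relationship between x and y are all hypothetical choices made only so that the example runs on its own.

```r
# install.packages("ggplot2")
library(ggplot2)

set.seed(1)                                            # Reproducible simulated data
n <- 150
group <- factor(rep(c("A", "B", "C"), each = n / 3))   # Three hypothetical groups
x <- rnorm(n)
y <- 2 * x + as.integer(group) + rnorm(n)              # Hypothetical shift per group
my_df <- data.frame(x = x, y = y, group = group)

p <- ggplot(my_df, aes(x = x, y = y)) +
  geom_point(aes(colour = group)) +   # Points and color by group
  scale_color_discrete("Groups") +    # Change legend title
  xlab("Variable X") +                # X-axis label
  ylab("Variable Y")                  # Y-axis label
print(p)
```

Storing the plot in p and printing it explicitly also works inside scripts and functions, where a bare ggplot call is not drawn automatically.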
3D Scatterplot
With the scatterplot3d and rgl packages, you can create 3D scatterplots in R. The scatterplot3d function creates a static 3D plot of three variables.
# install.packages("scatterplot3d")
library(scatterplot3d)
set.seed(2)
x <- rnorm(1000)
y <- rnorm(1000)
z <- rnorm(1000)
scatterplot3d(x, y, z, pch = 19, color = "blue")
An alternative is the plot3d function of the rgl package, which produces an interactive visualization: you can rotate the plot and zoom in and out.
# install.packages("rgl")
library(rgl)
plot3d(x, y, z,
       type = "s",           # Type of the plot
       radius = 0.1,         # Radius of the observations
       col = "lightblue",    # Color of the observations
       xlab = "X axis lab",  # Label of the X axis
       ylab = "Y axis lab",  # Label of the Y axis
       zlab = "Z axis lab")  # Label of the Z axis
R provides a variety of ways to create scatterplots, including basic scatterplots, scatterplot matrices, and 3D scatterplots. With the help of packages such as `ggplot2` and `rgl`, you can customize your scatterplots with colors, labels, and other features.