Reordering Bars in a ggplot Bar Plot: A Step-by-Step Guide

When we create a plot using ggplot() in R, categories stored as text are placed on the x-axis in alphabetical order by default. For clarity, we often want to order them by the values being shown instead. This tutorial walks through how to reorder bars in a bar plot using a hypothetical dataset of student scores.

Prepare the Dataset

Let’s start by loading the required library and creating our dataset. We’ll use the tidyverse package, which is useful for both data wrangling and plotting.

library(tidyverse)

student_scores <- data.frame(
  Student_ID = 1:11,
  English = c(85, 78, 92, 67, 88, 76, 95, 80, 72, 90, 100),
  Biology = c(95, 87, 90, 79, 94, 96, 93, 82, 89, 97, 105),
  Maths = c(90, 82, 58, 74, 89, 91, 88, 77, 84, 92, 100),
  Physics = c(78, 85, 89, 80, 90, 76, 83, 91, 87, 79, 100))

# Open the data frame in the data viewer
view(student_scores)

The data frame includes scores for 11 students across four subjects: English, Biology, Maths, and Physics.

Calculate Mean Scores for Each Course

Our goal is to create a bar plot that shows the average score for each course. To do this, we first need to reshape the dataset into a long format with two columns: one for the course name and one for the score.

# reshape data from wide to long format
scores_reshape <- student_scores %>%
  pivot_longer(cols = English:Physics,
               names_to = "courses",
               values_to = "scores")
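To make the reshaping step concrete, here is a minimal sketch on a cut-down version of the data (the first two students and two subjects only). The name toy is introduced here for illustration and is not part of the tutorial's dataset.

```r
library(tidyr)

# Cut-down illustration: first two students, two subjects (toy is a hypothetical name)
toy <- data.frame(Student_ID = 1:2,
                  English = c(85, 78),
                  Maths = c(90, 82))

toy_long <- pivot_longer(toy,
                         cols = English:Maths,
                         names_to = "courses",
                         values_to = "scores")

# 2 students x 2 subjects gives 4 rows, one per student-subject pair
nrow(toy_long)
names(toy_long)
```

Each original row is split into one row per subject, so the full dataset of 11 students and four subjects becomes 44 rows.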

Next, we calculate the mean score for each course using group_by() and summarize():

# get the mean score for each subject
# (the result is stored as `mean`; a call like mean(scores) still resolves to the base R function)
mean <- scores_reshape %>%
  group_by(courses) %>%
  summarize(ave = mean(scores))

Now we have a clean dataset with one row per course and its average score.
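As an optional sanity check, the same averages can be computed directly from the wide data with base R's colMeans(). This is only a cross-check sketch, not part of the tidyverse pipeline above; check_means is a name introduced here.

```r
# Same data as in the tutorial
student_scores <- data.frame(
  Student_ID = 1:11,
  English = c(85, 78, 92, 67, 88, 76, 95, 80, 72, 90, 100),
  Biology = c(95, 87, 90, 79, 94, 96, 93, 82, 89, 97, 105),
  Maths = c(90, 82, 58, 74, 89, 91, 88, 77, 84, 92, 100),
  Physics = c(78, 85, 89, 80, 90, 76, 83, 91, 87, 79, 100))

# Drop the ID column and average each subject column
check_means <- round(colMeans(student_scores[, -1]), 2)
check_means
# English 83.91, Biology 91.55, Maths 84.09, Physics 85.27
```

From lowest to highest, the courses are English, Maths, Physics, and Biology.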

Bar Plot with Default Order

Let’s plot the mean scores using a basic ggplot() bar plot:

ggplot(mean, aes(x = courses, y = ave, fill = courses)) +
  geom_bar(stat = "identity") + 
  theme_classic() + 
  labs(x = "Courses", y = "Mean Scores")

This plot uses the default alphabetical order of the course names (Biology, English, Maths, Physics), which does not reflect the order of the mean scores.
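The default order can be previewed without plotting. A text column on a discrete axis is ordered much like the levels of a factor built from it, so a quick sketch with factor() shows what to expect:

```r
# Course names in the order they appear in the original data
courses <- c("English", "Biology", "Maths", "Physics")

# factor() sorts its levels alphabetically unless told otherwise
default_levels <- levels(factor(courses))
default_levels
# "Biology" "English" "Maths" "Physics"
```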

What if we want to order the bars based on the order of the values?

Reordering Bars Based on Values

To reorder the bars so that they appear from the lowest to the highest mean score, we can sort the data and prepend row numbers to the course names. This forces ggplot() to follow our desired order:

mean_ordered <- mean %>% 
  arrange(ave) %>% # order the scores from lowest to highest
  mutate(courses = paste0(row_number(), "_", courses))

The courses column now contains 1_English, 2_Maths, 3_Physics, and 4_Biology.

Now we plot the reordered data:

ggplot(mean_ordered, aes(x = courses, y = ave, fill = courses)) +
  geom_bar(stat = "identity") +
  theme_classic() +
  labs(x = "Courses", y = "Mean Scores")

This shows the courses in the desired order, but the course names on the axis and in the legend are now prefixed with numbers.

Cleaning Up the Axis Labels

To display clean course names while keeping the desired order, we can use scale_x_discrete() to relabel the courses on the x-axis and scale_fill_discrete() to relabel them in the legend. Both scales must use the same mapping from prefixed name to clean name:

ggplot(mean_ordered, aes(x = courses, y = ave, fill = courses)) +
  geom_bar(stat = "identity") +
  # rename each course on the x-axis
  scale_x_discrete(
    labels = c(
      "1_English" = "English",
      "2_Maths" = "Maths",
      "3_Physics" = "Physics",
      "4_Biology" = "Biology")) +
  # use the same mapping for the legend
  scale_fill_discrete(
    labels = c(
      "1_English" = "English",
      "2_Maths" = "Maths",
      "3_Physics" = "Physics",
      "4_Biology" = "Biology")) +
  theme_classic() +
  labs(x = "Courses", y = "Mean Scores")
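Typing the labels by hand works for four courses but is easy to get out of sync. One possible alternative, sketched here under the assumption that every prefix has the form "digits followed by an underscore", is to pass a labelling function to both scales. The helper strip_prefix is a hypothetical name introduced for this example, and the means are rounded to two decimals.

```r
library(ggplot2)

# Prefixed course names as produced by the reordering step (means rounded)
mean_ordered <- data.frame(
  courses = c("1_English", "2_Maths", "3_Physics", "4_Biology"),
  ave = c(83.91, 84.09, 85.27, 91.55))

# Hypothetical helper: remove a leading "<digits>_" prefix from each label
strip_prefix <- function(x) sub("^[0-9]+_", "", x)

p <- ggplot(mean_ordered, aes(x = courses, y = ave, fill = courses)) +
  geom_bar(stat = "identity") +
  scale_x_discrete(labels = strip_prefix) +    # clean axis labels
  scale_fill_discrete(labels = strip_prefix) + # clean legend labels
  theme_classic() +
  labs(x = "Courses", y = "Mean Scores")

p
```

Because the same function feeds both scales, the axis and the legend cannot disagree.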

We now have a bar plot with the four courses ordered from the lowest to the highest average score, which makes the comparison between courses much easier to read.
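As a side note, the prefix-and-relabel approach is not the only way to get this result. A commonly used alternative, shown here as a sketch rather than as the method this tutorial follows, is to turn the course column into a factor whose levels are ordered by the mean score using base R's reorder(). The data frame name mean_scores is introduced for this example, and the means are rounded to two decimals.

```r
library(ggplot2)

# Mean scores in the default alphabetical order (rounded)
mean_scores <- data.frame(
  courses = c("Biology", "English", "Maths", "Physics"),
  ave = c(91.55, 83.91, 84.09, 85.27))

# reorder() returns a factor with levels sorted by ave, lowest first
mean_scores$courses <- reorder(mean_scores$courses, mean_scores$ave)
levels(mean_scores$courses)
# "English" "Maths" "Physics" "Biology"

p <- ggplot(mean_scores, aes(x = courses, y = ave, fill = courses)) +
  geom_bar(stat = "identity") +
  theme_classic() +
  labs(x = "Courses", y = "Mean Scores")

p
```

Since the order lives in the factor levels, the axis and legend keep the original course names and no relabelling is needed.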