EDS 240: Lecture 1.2

Data visualization: an intro


Week 1 | January 6th, 2024

What is data visualization?


“…the practice of designing and creating easy-to-communicate and easy-to-understand graphic or visual representations of a large amount of complex quantitative and qualitative data and information with the help of static, dynamic or interactive visual items.”

-from Wikipedia



Created using {ggplot2}

Created using {gganimate}

Created using {shiny}

What is data visualization?




“any graphical representation of information and data”


“part art and part science”

History of data visualization


Note: this is by no means a comprehensive history of the visual representation of information!

Images from BBC News, Wikipedia (a, b, c, d), Smithsonian, The Marginalian, Scientific American & Twitter

Why do we visualize data?

Spend the next few minutes discussing with your Learning Partners, and if possible, pull up some example visualizations that demonstrate your thoughts / discussion points

04:00

. . . to answer questions / derive insights


Fig Caption: Unusual climate anomalies in 2023 (the red line, which appears bold in print). Sea ice extent (a, b), temperatures (c–e), and area burned in Canada (f) are presently far outside their historical ranges. These anomalies may be due to both climate change and other factors. Sources and additional details about each variable are provided in supplemental file S1. Each line corresponds to a different year, with darker gray representing later years.


A nice Twitter thread on key takeaways from the above paper

. . . to explore & generate new questions


“Exploratory data analysis (EDA) is not a formal process with a strict set of rules. More than anything, EDA is a state of mind…you should feel free to investigate every idea that occurs to you. Some of these ideas will pan out, and some will be dead ends. As your exploration continues, you will home in on a few particularly productive insights that you’ll eventually write up and communicate to others.”

-Hadley Wickham, author of R for Data Science

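library(tidyverse) # loads {ggplot2} and {forcats} (for fct_reorder())

# distribution of diamond carat values
ggplot(diamonds, aes(x = carat)) +
  geom_histogram(binwidth = 0.5)

# highway mileage by vehicle class, ordered by median mileage
ggplot(mpg, aes(x = fct_reorder(class, hwy, median), y = hwy)) +
  geom_boxplot()

# density of diamond prices, one line per cut quality
ggplot(diamonds, aes(x = price, y = after_stat(density))) +
  geom_freqpoly(aes(color = cut), binwidth = 500, linewidth = 0.75)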

. . . to prompt discussion




gif created from Antti Lipponen’s Temperature Anomalies.

gif created from Mark SubbaRao’s Climate Spiral. For a similar visualization with accompanying {ggplot2} code, see Nicola Rennie’s TidyTuesday contribution!

. . . to create art / tell a story




Patchwork Kingdoms, by Nadieh Bremer, portraying the “digital divide” in schools across the world



Vertices of Visualization


Why R for data viz?






  • I’m most comfortable in R
  • great ecosystem of data wrangling & visualization packages (including a massive and growing collection of {ggplot2} extensions)
  • amazing online learning communities
  • data viz fundamentals apply no matter the language / tool
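As a rough illustration of how wrangling and visualization packages fit together, here is a minimal sketch (not code from the lecture). It assumes {dplyr} and {ggplot2} are installed and R is version 4.1 or later (for the native pipe); the names class_hwy and p are made up for this example, and the mpg data set ships with {ggplot2}.

```r
library(dplyr)
library(ggplot2)

# wrangle: median highway mileage per vehicle class, sorted ascending
class_hwy <- mpg |>
  group_by(class) |>
  summarize(median_hwy = median(hwy), .groups = "drop") |>
  arrange(median_hwy)

# visualize: horizontal bars, ordered by the summarized median
p <- ggplot(class_hwy, aes(x = median_hwy, y = reorder(class, median_hwy))) +
  geom_col() +
  labs(x = "Median highway mileage (mpg)", y = "Vehicle class")

p
```

The same summarize-then-plot pattern carries over to other tools; only the syntax changes.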



Take a Break

~ This is the end of Lesson 2 (of 3) ~

05:00