Intro to Data Visualization: Telling a Story with Data

What Question Are You Answering?

The major goal of data visualization is to create an image that communicates something to your audience. Before you can create a wonderful graph, figure, or chart, you need a clear idea of what you'd like your audience to learn.

Are you trying to show the audience where to find something?

Do you want to show people how the data you collected changes over time?

Would you like to show how two things are similar and how they're different?

Having an idea of what kind of information you want to share will help you decide what types of figures, graphs, and charts might best communicate the story you want to tell. It can also be helpful to look through the literature in your field to see what kinds of data visualization others use, which can give you great ideas for how to tell a story with your own data. One way to do that is to search your favorite database for the name of your field, or a key term related to your research, together with the phrase "data visualization."

You can find the list of UAB Libraries databases on this UAB Libraries webpage! If you aren't sure which database would be best for you, you can try narrowing the list down to the databases that best fit your subject by using the "All Subjects" drop-down menu on the left. If you're still not sure, you can contact the liaison reference librarian for your department and get their advice!

Who Is Your Audience?

The way you present your data story needs to reflect the audience you're hoping to reach. Asking yourself a few questions about your audience can help you make decisions about the figure, graph, or chart you want to make.

Are you speaking to other experts in your field?

If you're speaking to people who aren't experts in your field, are there things you need to spend more time explaining?

Something else to consider might be the way your audience will experience your data visualization.

Is this going to be a figure in a paper that your audience will have lots of time to interpret?

Are you going to present this during an oral presentation, which limits the time someone can analyze your work?

Will your data visualization be given to people who aren't familiar with your field, in a pamphlet or other format that is intended to stand alone without you nearby to explain more?

Will the viewer be able to view your figure up close or does it need to be observed from a distance, like a figure for a poster presentation?

Exploring Your Data

Before you can start creating a useful and appealing visualization of your data with a clear message, you need to get to know your data yourself! You can't tell the story to others if you don't know the story yourself. Making a quick chart can help you see trends in your data that can inspire and inform how you tell the story about that data.

One classic example of the use of this practice is Anscombe's Quartet. Take a look at the data in the following four data sets.

 

This table and the following figure are from Schwabish, J. (2021). Better Data Visualizations: A Guide for Scholars, Researchers, and Wonks. New York; Chichester, West Sussex: Columbia University Press. The UAB Libraries have digital access to this book!

Though the numbers for x and y vary across these four data sets, each set has nearly the same mean of x and of y, variance of x and of y, correlation (how well the data match a linear trendline), and regression line (the linear trendline's equation). With that many things in common, surely these must be very similar sets of data, right? To test that hypothesis, let's make a quick x,y scatter plot with a linear regression line for each of the four data sets and see what they look like.
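Before plotting, the shared statistics themselves can be double-checked with a short script. The following is a minimal Python sketch, not part of the original guide. It assumes the standard published values of Anscombe's Quartet (typed in here because the table on this page is an image), and the helper name `summarize` is made up purely for illustration.

```python
# Anscombe's Quartet: standard published values, typed in as an assumption
# because the table on this page is an image.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(x, y):
    """Hypothetical helper: summary statistics and least-squares line for one data set."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)  # sum of squared deviations in x
    syy = sum((b - mean_y) ** 2 for b in y)  # sum of squared deviations in y
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return {
        "mean_x": mean_x,
        "mean_y": mean_y,
        "var_x": sxx / (n - 1),  # sample variance
        "var_y": syy / (n - 1),
        "r": sxy / (sxx * syy) ** 0.5,  # Pearson correlation
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
    }


summaries = {name: summarize(x, y) for name, (x, y) in quartet.items()}
for name, stats in summaries.items():
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

If the values above are entered correctly, the four printed rows should agree to within about a hundredth, which is exactly why the numbers alone give no hint of how different the data sets look.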

 

Oh, these don't look similar at all! This is a classic demonstration of why it's important to visualize our data not just for our intended audience, but for our own interpretation of our data sets! It's also a reminder of the principle "just because you can doesn't mean you should!" If we look at the figure in the upper right corner of the above set of charts, we can see that the dotted line (representing the linear trendline) does not fit the shape of our data at all. The data look more like a parabola, which would be better described with a quadratic equation than with a linear one! Trendlines are only useful if they actually match the pattern of the data. Are there other figures here that aren't described well by the linear trendline on the chart?
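The mismatch in the upper right chart can also be seen in numbers. The sketch below is an illustration added here, not something from the book. It assumes the upper right chart is the second data set of the quartet (standard published values), fits the linear trendline by least squares, and lists the residuals (observed y minus the trendline's y) in order of increasing x.

```python
# Second data set of Anscombe's Quartet (standard published values; assumed
# here to be the one shown in the upper right chart).
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]

# Least-squares linear trendline: y = intercept + slope * x
n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
         / sum((a - mean_x) ** 2 for a in x))
intercept = mean_y - slope * mean_x

# Residual = observed y minus the y predicted by the trendline,
# listed in order of increasing x.
residuals = [(a, b - (intercept + slope * a)) for a, b in sorted(zip(x, y))]
for a, res in residuals:
    print(f"x={a:2d}  residual={res:+.2f}")
```

With these values the residuals are negative at both ends of the x range and positive in the middle. A systematic arch like that, rather than a random scatter around zero, is a sign that a straight line is the wrong shape for the data.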

Another useful bit of information we can get from a simple, quick plot of our data is whether any of our data points don't seem to match the rest and are, possibly, outliers. There are statistical tests that can be performed to determine whether a single data point is an outlier compared to the rest of your data, and being able to justify removing that data point can make the statistical analysis of the rest of your data more valuable. For example, in the bottom right chart above, most of the data points fall in a straight line, but the one data point that doesn't match this pattern skews the fit and creates a trendline that doesn't really match any of the data.
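To see how much a single point can drive a trendline, here is a small Python sketch. It is not an outlier test, only an illustration. It assumes the bottom right chart is the fourth data set of the quartet (standard published values) and compares the fitted slope with the slope of a line drawn from the average of the cluster at x = 8 to the lone point at x = 19. The variable names are made up for this example.

```python
# Fourth data set of Anscombe's Quartet (standard published values; assumed
# here to be the one shown in the bottom right chart).
x = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
y = [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]

# Slope of the least-squares trendline through all eleven points.
n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
fitted_slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
                / sum((a - mean_x) ** 2 for a in x))

# Slope of the line from the cluster's average y (at x = 8) to the lone point.
cluster_y = [b for a, b in zip(x, y) if a == 8]
cluster_mean = sum(cluster_y) / len(cluster_y)
two_point_slope = (12.50 - cluster_mean) / (19 - 8)

# Without the lone point every x is 8, so there is no spread in x to fit a slope to.
distinct_x_without_point = len({a for a in x if a != 19})

print(f"fitted slope:    {fitted_slope:.4f}")
print(f"two-point slope: {two_point_slope:.4f}")
print(f"distinct x values without the lone point: {distinct_x_without_point}")
```

The two slopes come out the same, which means the trendline is set entirely by where that one point sits. Take it away and there is no trend in x left to describe.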

These simple charts aren't the kinds of figures you would put in a paper, since they lack axis labels and other elements that make a great chart for your audience, but they can be very useful to you as you analyze the data you are working with.