Data Journalism Unit 7 – Data Visualization Principles and Best Practices

Data visualization transforms complex information into easily digestible visual representations, enabling readers to grasp trends and patterns quickly. It enhances storytelling, facilitates data-driven decision-making, and empowers journalists to uncover hidden stories and hold those in power accountable. Key concepts include understanding data types, encoding methods, and design principles. Tools range from spreadsheet software to programming languages and web-based platforms. Effective visualizations prioritize simplicity, visual hierarchy, and color theory while avoiding common pitfalls like overcomplication and misuse of chart types.

What's the Big Deal?

  • Data visualization transforms complex data into easily digestible visual representations (charts, graphs, maps)
  • Enables readers to quickly grasp trends, patterns, and outliers in large datasets
  • Enhances storytelling by making data more engaging and memorable
  • Facilitates data-driven decision-making by highlighting key insights
  • Bridges the gap between raw data and human understanding
    • Especially important in an era of big data and information overload
  • Empowers journalists to uncover hidden stories and hold those in power accountable
  • Helps audiences understand the significance and impact of data-driven stories

Key Concepts to Know

  • Data types: categorical (nominal), ordinal, and quantitative
    • Categorical (nominal): discrete categories with no inherent order (gender, race, product type)
    • Ordinal: categories with a natural order (low, medium, high)
    • Quantitative: numerical values (age, income, temperature)
  • Encoding: mapping data to visual properties (position, size, color, shape)
  • Scales: the rules that convert data values into visual values (pixel positions, sizes, colors)
    • Linear, logarithmic, and categorical scales
  • Gestalt principles: how humans perceive visual elements as unified wholes
    • Proximity, similarity, continuity, closure, and figure-ground
  • Preattentive attributes: visual properties the brain processes almost instantly, before conscious attention (color, size, orientation)
  • Interaction techniques: enabling users to explore data (zooming, filtering, highlighting)
  • Responsive design: ensuring visualizations adapt to different screen sizes and devices
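
The difference between linear and logarithmic scales is easier to see with numbers. The sketch below is illustrative only: the function names, the 1 to 10,000 domain, and the 400-pixel axis are assumptions made for this example, not part of any particular charting library.

```python
import math

def linear_scale(value, domain, pixel_range):
    """Map a value from the data domain to a pixel range proportionally."""
    d0, d1 = domain
    r0, r1 = pixel_range
    return r0 + (value - d0) / (d1 - d0) * (r1 - r0)

def log_scale(value, domain, pixel_range):
    """Map a positive value so that equal ratios get equal pixel distances."""
    d0, d1 = domain
    if value <= 0 or d0 <= 0 or d1 <= 0:
        raise ValueError("a log scale needs strictly positive values")
    log_domain = (math.log10(d0), math.log10(d1))
    return linear_scale(math.log10(value), log_domain, pixel_range)

# Hypothetical axis: values from 1 to 10,000 drawn on a 400-pixel-wide chart.
domain = (1, 10_000)
pixels = (0, 400)
values = (1, 10, 100, 1000, 10_000)

linear_positions = [linear_scale(v, domain, pixels) for v in values]
log_positions = [log_scale(v, domain, pixels) for v in values]

for v, lin, log in zip(values, linear_positions, log_positions):
    print(f"{v:>6}: linear {lin:7.2f} px, log {log:7.2f} px")
```

On the linear scale the first three values crowd into the leftmost few pixels, while the log scale spaces each tenfold step evenly. That is why log scales suit data spanning several orders of magnitude, and why they should be labeled clearly for readers who may not expect them.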

Tools of the Trade

  • Spreadsheet software (Microsoft Excel, Google Sheets)
    • Organizing, cleaning, and analyzing data
    • Creating basic charts and graphs
  • Programming languages (Python, R)
    • Powerful libraries for data manipulation and visualization (Matplotlib, ggplot2)
    • Customizable and reproducible visualizations
  • Business intelligence platforms (Tableau, Power BI)
    • Drag-and-drop interfaces for creating interactive dashboards
    • Connecting to various data sources and real-time updates
  • Web-based tools (D3.js, Plotly)
    • Creating interactive and animated visualizations for the web
    • Leveraging web standards (HTML, CSS, JavaScript)
  • Graphic design software (Adobe Illustrator, Sketch)
    • Refining and polishing visualizations for publication
    • Creating custom visual elements and layouts

Design Principles that Pop

  • Simplicity: focusing on essential information and minimizing clutter
    • Removing unnecessary elements (heavy gridlines, borders, redundant labels)
    • Using clear and concise labels and annotations
  • Visual hierarchy: guiding the reader's attention through strategic use of visual elements
    • Emphasizing key data points with size, color, or position
    • Grouping related elements and creating a logical flow
  • Color theory: using color effectively to convey meaning and enhance understanding
    • Choosing a color palette that aligns with the data and message
    • Considering color blindness and ensuring accessibility
  • Typography: selecting fonts that are legible and appropriate for the context
    • Using a limited number of font families and sizes
    • Ensuring proper contrast between text and background
  • Consistency: maintaining a cohesive visual style throughout the visualization
    • Using a consistent color palette, font, and layout
    • Aligning elements and maintaining proper spacing
  • Data-ink ratio: maximizing the share of a graphic's ink that displays data and minimizing non-data ink
    • Removing unnecessary backgrounds, borders, and decorations
    • Using a minimalist approach to highlight the data
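
The point about contrast between text and background can be checked numerically rather than by eye. The following is a minimal sketch that assumes the contrast-ratio formula published in the WCAG 2.x guidelines and the 4.5:1 threshold those guidelines give for normal-size text at level AA. The function names and the sample colors are chosen only for illustration.

```python
def relative_luminance(rgb):
    """Relative luminance of an sRGB color given as (r, g, b) in 0-255."""
    def channel(c):
        c = c / 255
        # Linearize the sRGB channel value (WCAG 2.x definition).
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (channel(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(foreground, background):
    """Contrast ratio between two colors, from 1 (identical) to 21 (black on white)."""
    l1 = relative_luminance(foreground)
    l2 = relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)

AA_NORMAL_TEXT = 4.5  # WCAG level AA minimum for normal-size text

white = (255, 255, 255)
# Illustrative label colors on a white chart background.
for name, color in [("black", (0, 0, 0)), ("mid gray", (118, 118, 118)), ("light gray", (153, 153, 153))]:
    ratio = contrast_ratio(color, white)
    verdict = "passes" if ratio >= AA_NORMAL_TEXT else "fails"
    print(f"{name}: {ratio:.2f}:1 {verdict} AA for normal text")
```

A check like this is most useful for axis labels and annotations, where light gray text on white is a common readability problem.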

Common Pitfalls to Avoid

  • Overcomplicating the visualization with too much information or visual elements
    • Overwhelming the reader and obscuring key insights
  • Using inappropriate chart types for the data and message
    • Distorting the data or creating misleading impressions (pie charts for values that are not parts of a whole)
  • Misusing color, leading to confusion or misinterpretation
    • Using too many colors or colors with conflicting meanings
    • Failing to consider color blindness or cultural associations
  • Ignoring the importance of context and annotations
    • Leaving the reader to interpret the data without guidance
    • Missing opportunities to highlight key findings or provide explanations
  • Neglecting the target audience and their level of data literacy
    • Creating visualizations that are too complex or technical for the intended readers
  • Sacrificing accuracy for aesthetics
    • Manipulating scales or truncating axes to exaggerate differences
    • Cherry-picking data to support a predetermined narrative
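
The effect of a truncated axis can be worked out with simple arithmetic. The sketch below uses made-up values (52 and 50 on an axis that tops out at 60) and hypothetical function names. It compares how tall two bars are drawn when the axis starts at zero with how tall they are drawn when it starts at 48.

```python
def drawn_height(value, baseline, axis_max, plot_height=100.0):
    """Height of a bar in plot units when the axis runs from baseline to axis_max."""
    return (value - baseline) / (axis_max - baseline) * plot_height

def apparent_ratio(a, b, baseline, axis_max):
    """How many times taller bar a is drawn than bar b."""
    return drawn_height(a, baseline, axis_max) / drawn_height(b, baseline, axis_max)

# Hypothetical values: 52 versus 50, a real difference of 4 percent.
honest = apparent_ratio(52, 50, baseline=0, axis_max=60)
truncated = apparent_ratio(52, 50, baseline=48, axis_max=60)

print(f"Axis from 0:  first bar drawn {honest:.2f}x as tall")
print(f"Axis from 48: first bar drawn {truncated:.2f}x as tall")
```

With the axis starting at zero, the bars differ by the same 4 percent as the data. Starting the axis at 48 makes the first bar twice as tall as the second, which is the kind of exaggeration this pitfall describes.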

Hands-On Techniques

  • Sketching: starting with pen and paper to explore ideas and layouts
    • Rapidly iterating on design concepts before moving to digital tools
  • Data preparation: cleaning, transforming, and aggregating data for visualization
    • Handling missing values, outliers, and inconsistencies
    • Reshaping data into a format suitable for visualization (long vs. wide format)
  • Exploratory data analysis (EDA): using visualizations to uncover patterns and insights
    • Creating multiple chart types to examine the data from different angles
    • Identifying correlations, trends, and outliers
  • Prototyping: creating quick and rough visualizations to test ideas and gather feedback
    • Using tools like Excel or Tableau to rapidly prototype designs
    • Iterating based on feedback from colleagues or target audience
  • Refinement: polishing the visualization for final publication
    • Adjusting colors, fonts, and layout for optimal readability and aesthetics
    • Adding annotations, labels, and legends to provide context and guidance
  • Interactivity: incorporating interactive elements to enable data exploration
    • Adding tooltips, filters, or hover effects to reveal additional details
    • Allowing users to zoom, pan, or select data points of interest
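
The data preparation step, in particular reshaping from wide to long format and handling missing values, can be sketched with the Python standard library alone. Everything in the example is hypothetical: the table, the column names, and the list of strings treated as missing are assumptions made for illustration. In practice a library such as pandas would usually do this work.

```python
# Hypothetical wide-format table: one row per city, one column per year.
wide_rows = [
    {"city": "City A", "2021": "4.1", "2022": "", "2023": "3.8"},
    {"city": "City B", "2021": "5.0", "2022": "5.3", "2023": "N/A"},
]

# Strings that this sketch treats as "no value recorded" (an assumption).
MISSING_MARKERS = {"", "N/A", "NA", "-"}

def to_long(rows, id_column, value_name):
    """Turn one row per entity into one row per entity-year observation."""
    long_rows = []
    for row in rows:
        for column, raw in row.items():
            if column == id_column:
                continue
            text = raw.strip()
            # Keep missing values explicit as None instead of silently dropping them.
            value = None if text in MISSING_MARKERS else float(text)
            long_rows.append({id_column: row[id_column], "year": int(column), value_name: value})
    return long_rows

long_rows = to_long(wide_rows, "city", "rate")
missing = [r for r in long_rows if r["rate"] is None]

for r in long_rows:
    print(r)
print(f"{len(missing)} of {len(long_rows)} observations are missing")
```

Long format, with one observation per row, is what most charting tools expect. Keeping missing values visible as None makes it a deliberate editorial choice whether to drop them, flag them, or show a gap in the chart.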

Real-World Examples

  • The New York Times' "The Rich Really Do Pay Lower Taxes Than You"
    • Using a line chart to compare tax rates across income groups over time
    • Highlighting the stark contrast between the ultra-wealthy and average taxpayers
  • Reuters' "The Rohingya Exodus"
    • Combining maps, charts, and photographs to tell the story of the Rohingya refugee crisis
    • Visualizing the scale and impact of the mass displacement
  • The Pudding's "The Largest Vocabulary in Hip Hop"
    • Interactive visualization allowing users to explore the vocabulary of famous rappers
    • Demonstrating the power of interactivity to engage audiences with data
  • The Washington Post's "2°C: Beyond the Limit"
    • Using maps and data visualizations to show the uneven impact of climate change
    • Contextualizing data with storytelling and human experiences
  • ProPublica's "Miseducation"
    • Interactive database enabling users to explore racial disparities in U.S. schools
    • Empowering readers to investigate educational inequities in their own communities

Ethical Considerations

  • Accuracy: ensuring data is properly collected, analyzed, and represented
    • Verifying data sources and methods
    • Providing transparency about data limitations and uncertainties
  • Integrity: maintaining objectivity and avoiding bias in data selection and presentation
    • Resisting the temptation to cherry-pick data to support a predetermined narrative
    • Presenting data in a fair and balanced manner
  • Privacy: protecting individual privacy and confidentiality
    • Anonymizing sensitive data and using aggregation to prevent identification
    • Obtaining informed consent when collecting personal data
  • Accessibility: ensuring visualizations are accessible to all users, including those with disabilities
    • Following web accessibility guidelines (WCAG)
    • Providing alternative text for images and using color-blind friendly palettes
  • Transparency: being open about data sources, methods, and limitations
    • Providing access to raw data and methodology
    • Acknowledging uncertainties and potential biases
  • Impact: considering the potential consequences of data visualizations on individuals and society
    • Being aware of how visualizations may influence public opinion or policy
    • Taking responsibility for the ethical implications of data-driven storytelling
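
Using aggregation to prevent identification often means publishing group counts and hiding any group small enough to point to individuals. The sketch below illustrates that idea under stated assumptions: the function name, the sample records, and the minimum count of 5 are all hypothetical, and the right threshold depends on the dataset and on any applicable rules.

```python
from collections import Counter

def aggregate_with_suppression(records, key, min_count=5):
    """Count records per group and hide any count below min_count."""
    counts = Counter(record[key] for record in records)
    # None marks a suppressed cell; the group is listed but its count is withheld.
    return {group: (n if n >= min_count else None) for group, n in counts.items()}

# Hypothetical individual-level records; only the grouping field is kept.
records = [{"district": "North"}] * 12 + [{"district": "South"}] * 3

published = aggregate_with_suppression(records, "district")
for district, count in published.items():
    shown = count if count is not None else "suppressed (fewer than 5)"
    print(f"{district}: {shown}")
```

Suppressing small cells is only one safeguard. Combinations of published tables can still reveal a hidden count, so the surrounding totals deserve the same scrutiny.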


© 2024 Fiveable Inc. All rights reserved.
AP® and SAT® are trademarks registered by the College Board, which is not affiliated with, and does not endorse this website.
