
12.2 Data journalism and analysis

3 min read, July 18, 2024

Data acquisition and preparation are crucial steps in investigative journalism. Reporters must identify reliable sources, evaluate data quality, and extract information using various methods. Cleaning and preprocessing prepare the data for analysis.

Statistical analysis and visualization techniques help uncover insights and patterns. Journalists integrate findings into narratives, develop data-driven angles, and create compelling visualizations. Ethical handling of sensitive data is essential to protect privacy and maintain accuracy.

Data Acquisition and Preparation

Dataset acquisition and preparation

  • Identify potential data sources
    • Government agencies (for example, the EPA)
    • Academic and research institutions (universities, research centers)
    • Non-profit organizations and think tanks
    • Industry reports and corporate filings (annual reports, SEC filings)
  • Evaluate data quality and reliability
    • Assess data provenance and collection methodology to ensure credibility
    • Check for completeness, accuracy, and consistency across datasets
    • Verify data integrity and identify potential biases that may skew results
  • Acquire and extract data
    • Submit Freedom of Information Act (FOIA) requests to obtain government records
    • Scrape data from websites using tools like Python or R to automate the process
    • Download data from online repositories
  • Clean and preprocess data
    • Handle missing values, outliers, and inconsistencies to ensure data quality
    • Standardize data formats and variable names for consistency across datasets
    • Merge and aggregate data from multiple sources to create a comprehensive dataset
    • Perform data type conversions and transformations to prepare data for analysis
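The cleaning steps above can be sketched with Python's standard library alone. This is a minimal illustration under stated assumptions, not a recommended pipeline: the CSV text, the column names, and the helper functions `clean_rows` and `impute_median` are all hypothetical, and filling a gap with the median is only one possible way to handle missing values.

```python
import csv
import io
from statistics import median

# Hypothetical raw export (invented for illustration): inconsistent headers,
# stray whitespace around values, and one missing income figure.
RAW_CSV = """County , Median Income,Year
 Adams,52000,2021
Baker,,2021
 Clark ,61000,2021
"""


def clean_rows(text):
    """Standardize column names, trim whitespace, and convert data types."""
    reader = csv.DictReader(io.StringIO(text))
    rows = []
    for raw in reader:
        row = {}
        for key, value in raw.items():
            # "Median Income" becomes "median_income", and so on.
            name = key.strip().lower().replace(" ", "_")
            row[name] = value.strip()
        row["year"] = int(row["year"])
        # Keep missing values explicit as None instead of silently using 0.
        income = row["median_income"]
        row["median_income"] = int(income) if income else None
        rows.append(row)
    return rows


def impute_median(rows, column):
    """Fill missing values with the column median and flag every filled row."""
    known = [r[column] for r in rows if r[column] is not None]
    fill = median(known)
    for r in rows:
        r[column + "_imputed"] = r[column] is None
        if r[column] is None:
            r[column] = fill
    return rows


rows = impute_median(clean_rows(RAW_CSV), "median_income")
for r in rows:
    print(r)
```

The `_imputed` flag keeps every filled-in value traceable, so a reporter can exclude or footnote estimated figures instead of presenting them as reported data.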

Statistical analysis for insights

  • Exploratory data analysis
    • Calculate descriptive statistics (mean, median, mode, standard deviation) to summarize data
    • Identify patterns, trends, and anomalies in the data to guide further analysis
    • Conduct correlation and regression analysis to uncover relationships between variables
    • Perform hypothesis and significance tests to validate findings
  • Data visualization techniques
    • Create charts, graphs, and maps to communicate findings effectively
      1. Bar charts and line graphs for comparing categories and trends over time
      2. Scatterplots and bubble charts for exploring relationships between variables
      3. Choropleth and heat maps for displaying geographic data and patterns
    • Use interactive visualization tools to engage readers
  • Advanced analytical methods
    • Apply machine learning algorithms for prediction and classification tasks
    • Conduct network analysis to uncover connections and relationships between entities
    • Perform text and sentiment analysis on unstructured data (social media, news articles)
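The descriptive statistics, correlation, and regression steps listed above can be illustrated with a short standard-library sketch. The inspection and violation figures are invented for this example, and `describe`, `pearson_r`, and `fit_line` are hypothetical helper names. A real analysis would more likely use a statistics package and would add a significance test before drawing any conclusion.

```python
from math import sqrt
from statistics import mean, median, mode, stdev

# Hypothetical figures, invented purely for illustration.
inspections = [2, 4, 4, 5, 7, 8]  # inspections per facility
violations = [1, 3, 2, 4, 6, 8]   # violations recorded per facility


def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "mean": mean(values),
        "median": median(values),
        "mode": mode(values),
        "stdev": stdev(values),  # sample standard deviation
    }


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mx) ** 2 for x in xs))
    spread_y = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


summary = describe(inspections)
r = pearson_r(inspections, violations)
slope, intercept = fit_line(inspections, violations)

print(summary)
print(f"r = {r:.2f}; violations ~ {slope:.2f} * inspections + {intercept:.2f}")
```

A strong correlation in a sample this small shows only that the two measures move together. It does not show that one causes the other, and it would not survive scrutiny without more data.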

Integration of data in narratives

  • Identify key takeaways and newsworthy insights
    • Highlight significant patterns, trends, and outliers discovered in the data
    • Contextualize findings within the broader story narrative to provide relevance
    • Provide evidence-based support for investigative claims using data
  • Develop data-driven story angles and leads
    • Use data to uncover hidden stories and unique perspectives not previously reported
    • Identify potential sources and interview subjects based on data insights
    • Generate new questions and avenues for further investigation based on data findings
  • Incorporate data visualizations into story presentation
    • Select appropriate visualizations to enhance reader understanding of complex topics
    • Integrate charts, graphs, and interactive elements into article layout for visual impact
    • Provide clear and concise explanations of data-driven findings for general audiences

Ethics of sensitive data handling

  • Protect individual privacy and confidentiality
    • Anonymize or aggregate data to prevent identification of individuals (for example, by redacting names)
    • Obtain informed consent when collecting personal information from sources
    • Securely store and transmit sensitive data using encryption and access controls
  • Ensure data accuracy and transparency
    • Verify data sources and collection methods to ensure reliability
    • Disclose limitations, biases, and potential errors in the data to maintain trust
    • Provide access to raw data and methodology for reproducibility and fact-checking
  • Avoid misrepresentation and misleading conclusions
    • Present data honestly and accurately, without cherry-picking or distortion of facts
    • Clearly distinguish between correlation and causation to avoid false inferences
    • Acknowledge alternative explanations and conflicting evidence in reporting
  • Adhere to legal and ethical standards
    • Comply with data protection laws and regulations (such as GDPR) when handling personal data
    • Respect intellectual property rights and data usage agreements from sources
    • Maintain journalistic integrity and avoid conflicts of interest in data-driven reporting
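
One way to apply the anonymize-or-aggregate guidance above is to publish group counts instead of individual records, and to suppress groups so small that a person could be singled out. The sketch below is illustrative only: the records are invented, `aggregate_counts` is a hypothetical helper, and the threshold of 3 is an assumption, not a legal or editorial standard.

```python
from collections import Counter

# Hypothetical individual-level records; every name and value is invented.
records = [
    {"name": "Resident 1", "zip_code": "10001", "benefit": "housing"},
    {"name": "Resident 2", "zip_code": "10001", "benefit": "housing"},
    {"name": "Resident 3", "zip_code": "10001", "benefit": "food"},
    {"name": "Resident 4", "zip_code": "10002", "benefit": "housing"},
]

MIN_CELL = 3  # assumed threshold: smaller groups are withheld


def aggregate_counts(rows, key, min_cell=MIN_CELL):
    """Count records per group, dropping names and suppressing small groups."""
    counts = Counter(row[key] for row in rows)
    table = {}
    for group, n in counts.items():
        # None marks a suppressed cell: the group exists but is too small to publish.
        table[group] = n if n >= min_cell else None
    return table


table = aggregate_counts(records, "zip_code")
print(table)
```

Suppression alone is not a guarantee of anonymity. If a published total lets readers subtract the visible cells, a suppressed count can still be recovered, so totals and related tables need the same review.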
© 2024 Fiveable Inc. All rights reserved.
AP® and SAT® are trademarks registered by the College Board, which is not affiliated with, and does not endorse this website.

