
Creating original datasets through surveys and crowdsourcing is a powerful way to gather unique insights for data journalism. This approach allows reporters to collect targeted information directly from sources, filling gaps in existing data and uncovering new stories.

Surveys and crowdsourcing projects require careful planning, from defining research questions to designing questionnaires and engaging participants. Proper data cleaning, analysis, and integration with other sources can transform raw responses into compelling narratives that inform and engage readers.

Survey Design for Journalism

Key Steps in the Survey Process

  • Define the research question that guides the design of the survey instrument and data analysis
    • Research questions should be specific, measurable, and relevant to the journalistic project
  • Identify the target population (the entire group the survey aims to understand) and the sampling frame (the list of members of the target population from which the sample will be selected)
  • Determine the sample size (large enough to represent the target population and support the planned analysis) and the sampling method (probability sampling methods, such as simple random sampling, help ensure representative samples)
  • Design the questionnaire by crafting clear, unbiased questions that elicit accurate responses
    • Question types include open-ended, closed-ended, Likert scale, and ranking
    • Pretest questions to identify potential issues with comprehension, flow, or response options
    • Include demographic questions to analyze differences among subgroups
  • Collect data through various modes, such as online surveys, phone interviews, or in-person interviews
    • Each mode has advantages and limitations related to cost, response rates, and data quality
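
The sample size and sampling steps above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: it assumes Cochran's formula with a finite population correction, a 95% confidence level (z = 1.96), a 5% margin of error, and a hypothetical sampling frame of 10,000 voter IDs. None of these specifics come from the guide itself.

```python
import math
import random

def required_sample_size(population, margin=0.05, z=1.96, p=0.5):
    """Cochran's formula with a finite population correction.

    margin, z, and p are assumed defaults (5% margin of error,
    95% confidence, maximum variability), not values from the guide.
    """
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct members from the sampling frame, each equally likely."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

# Hypothetical sampling frame: a list of every member of the target population.
sampling_frame = [f"voter-{i:05d}" for i in range(10_000)]

n = required_sample_size(len(sampling_frame))
sample = simple_random_sample(sampling_frame, n, seed=42)
print(f"Contact {n} people, for example: {sample[:3]}")
```

Fixing the seed makes the draw reproducible, which helps when documenting methodology. The real response count will usually be lower than the number of people contacted, so the invited sample often needs to be larger.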

Questionnaire Design and Data Collection

  • Craft clear, unbiased questions that elicit accurate responses
    • Use simple, jargon-free language that is easily understood by respondents
    • Avoid leading or loaded questions that suggest a particular answer
    • Provide clear instructions and definitions for key terms
  • Choose appropriate question types based on the information needed
    • Open-ended questions allow respondents to provide detailed, qualitative responses (e.g., "What do you think about the proposed policy?")
    • Closed-ended questions offer a fixed set of response options, making data analysis easier (e.g., "Do you support or oppose the proposed policy?")
    • Likert scale questions measure attitudes or opinions on a numeric scale (e.g., "On a scale of 1 to 5, how strongly do you agree with the statement?")
    • Ranking questions ask respondents to order a set of items by preference or importance
  • Pretest questions with a small group of respondents to identify potential issues
    • Check for comprehension, clarity, and ease of answering
    • Revise questions based on feedback to improve data quality
  • Collect data through the most appropriate mode for the target population and research question
    • Online surveys are cost-effective and can reach a large, geographically dispersed sample
    • Phone interviews allow for more in-depth questioning and can reach respondents without internet access
    • In-person interviews provide the richest data but are time-consuming and expensive
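
One way to keep the four question types straight is to model them as data and check each answer against its type before analysis. The sketch below is illustrative only: the `Question` class, the `is_valid_answer` helper, and the sample questions are hypothetical, and the 1 to 5 range simply follows the Likert example above.

```python
from dataclasses import dataclass, field

@dataclass
class Question:
    text: str
    kind: str  # "open", "closed", "likert", or "ranking"
    options: list = field(default_factory=list)

def is_valid_answer(question, answer):
    """Return True if the answer fits the question's type."""
    if question.kind == "open":
        # Any non-empty free-text response.
        return isinstance(answer, str) and answer.strip() != ""
    if question.kind == "closed":
        # Exactly one of the fixed options.
        return answer in question.options
    if question.kind == "likert":
        # Whole number on the assumed 1-5 scale.
        return isinstance(answer, int) and 1 <= answer <= 5
    if question.kind == "ranking":
        # Every option listed exactly once, in the respondent's order.
        return isinstance(answer, list) and sorted(answer) == sorted(question.options)
    raise ValueError(f"Unknown question kind: {question.kind}")

closed = Question("Do you support or oppose the proposed policy?",
                  "closed", ["Support", "Oppose"])
likert = Question("How strongly do you agree with the statement?", "likert")
ranking = Question("Rank these issues by importance.",
                   "ranking", ["Housing", "Transit", "Schools"])

print(is_valid_answer(closed, "Support"))
print(is_valid_answer(likert, 6))
print(is_valid_answer(ranking, ["Transit", "Schools", "Housing"]))
```

Running the same checks on pretest responses is a cheap way to spot questions whose response options confuse people.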

Crowdsourcing Data and Audiences

Leveraging Crowdsourcing Platforms

  • Crowdsourcing leverages the collective intelligence and resources of a large group of people, often through online platforms, to gather information, solve problems, or generate ideas
    • Crowdsourcing can help journalists collect data quickly and cost-effectively
  • Use general-purpose tools such as Google Forms for simple data collection tasks
    • These platforms are easy to use and offer basic data analysis features
    • Example: Collecting reader opinions on a local issue through a Google Form embedded in an article
  • Use specialized crowdsourcing platforms such as Ushahidi for more complex projects
    • These platforms are designed for specific use cases, such as crisis mapping or collaborative investigations
    • Example: Using Ushahidi to map reports of election irregularities submitted by citizens

Designing and Engaging with Crowdsourcing Projects

  • Define the problem or question clearly and identify the target audience
    • Provide clear instructions for participation, including what data to submit and how to submit it
    • Example: Asking readers to submit photos of potholes in their neighborhood, along with the location and date
  • Gather a variety of data types, such as personal experiences, observations, opinions, or documents
    • Projects may involve asking participants to submit photos, videos, or other multimedia content
    • Example: Collecting personal stories and documents related to a particular issue, such as experiences with the healthcare system
  • Verify submissions and ensure data quality
    • Implement processes to validate submissions, detect fraudulent or duplicate entries, and ensure data accuracy
    • Example: Cross-referencing submitted data with official records or conducting follow-up interviews with selected participants
  • Engage with participants throughout the project
    • Communicate regularly with contributors, provide feedback and updates, and acknowledge their contributions
    • Example: Sending personalized thank-you messages to participants and featuring selected submissions in the final story
  • Consider ethical issues, such as protecting participant privacy, obtaining informed consent, and ensuring transparency about how data will be used
    • Example: Providing clear privacy policies and allowing participants to opt out of having their submissions published
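
As a rough sketch of the duplicate-detection step, the code below flags repeat pothole submissions. It assumes, purely for illustration, that each submission is a dictionary with `email`, `location`, and `date` fields and that two submissions sharing all three (ignoring case and extra spaces) are duplicates. Real projects usually need fuzzier matching and human review.

```python
def normalize(text):
    """Lowercase and collapse whitespace so trivial variations still match."""
    return " ".join(text.lower().split())

def split_duplicates(submissions):
    """Separate first-time submissions from repeats of the same report."""
    seen = set()
    unique, duplicates = [], []
    for sub in submissions:
        key = (normalize(sub["email"]), normalize(sub["location"]), sub["date"])
        if key in seen:
            duplicates.append(sub)
        else:
            seen.add(key)
            unique.append(sub)
    return unique, duplicates

# Hypothetical reader submissions.
submissions = [
    {"email": "ana@example.org", "location": "5th St & Main", "date": "2024-03-01"},
    {"email": "Ana@Example.org", "location": "5th  St & main", "date": "2024-03-01"},
    {"email": "ben@example.org", "location": "Oak Ave", "date": "2024-03-02"},
]

unique, duplicates = split_duplicates(submissions)
print(len(unique), "unique,", len(duplicates), "flagged for review")
```

Flagged entries are set aside rather than deleted, so an editor can check whether a repeat is an honest resubmission or an attempt to inflate the count.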

Data Cleaning and Analysis

Preparing Data for Analysis

  • Clean data by identifying and correcting errors, inconsistencies, or missing values
    • Remove duplicates, standardize formats, and handle outliers
    • Example: Converting all date values to a consistent format (YYYY-MM-DD) or removing rows with missing key variables
  • Validate data by checking whether it meets predefined criteria or rules
    • Verify that responses fall within acceptable ranges, confirm that required fields are complete, or cross-reference data against external sources
    • Example: Checking that age values are positive integers within a reasonable range (e.g., 18-120)
  • Conduct exploratory data analysis (EDA) to understand patterns, relationships, and potential insights
    • Calculate summary statistics, create visualizations, and identify correlations or trends
    • Example: Creating a histogram of survey respondents' ages or a scatterplot of two variables to identify potential relationships
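
The cleaning and validation examples above (consistent YYYY-MM-DD dates, ages between 18 and 120, dropping rows with missing key variables) might look like this using only Python's standard library. The accepted input date formats and the sample rows are assumptions for illustration. Note that a value like 03/02/2024 is read here as month/day/year, which would be wrong for data collected in day/month/year form.

```python
from datetime import datetime
from statistics import mean, median

# Assumed input formats; extend the tuple to match the real data.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d %b %Y")

def to_iso_date(raw):
    """Return the date as YYYY-MM-DD, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None

def valid_age(raw):
    """Return the age as an int if it is a whole number from 18 to 120, else None."""
    try:
        age = int(raw)
    except (TypeError, ValueError):
        return None
    return age if 18 <= age <= 120 else None

def clean(rows):
    """Standardize dates, validate ages, and drop rows missing either one."""
    cleaned = []
    for row in rows:
        date = to_iso_date(row.get("date", ""))
        age = valid_age(row.get("age"))
        if date is None or age is None:
            continue
        cleaned.append({"date": date, "age": age})
    return cleaned

# Hypothetical raw survey rows.
raw_rows = [
    {"date": "2024-03-01", "age": "34"},
    {"date": "03/02/2024", "age": "51"},
    {"date": "5 Mar 2024", "age": "17"},   # age below the accepted range
    {"date": "not a date", "age": "40"},   # unparseable date
    {"date": "2024-03-04", "age": ""},     # missing age
]

rows = clean(raw_rows)
ages = [r["age"] for r in rows]
print(rows)
print("mean age:", mean(ages), "median age:", median(ages))
```

Logging how many rows each rule drops is worth adding in practice, since that count belongs in the published methodology.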

Analyzing and Interpreting Results

  • Use statistical analysis to test hypotheses, compare groups, or build predictive models
    • Apply methods such as t-tests, ANOVA, chi-square tests, and regression analysis
    • Example: Using a t-test to compare the mean satisfaction scores between two different customer groups
  • Weight data to adjust for sampling biases or ensure representativeness
    • Assign different importance to individual responses based on demographic or other characteristics
    • Example: Weighting survey responses by age and gender to match the population distribution
  • Use tools such as spreadsheet software (Excel), statistical packages (SPSS, R), and data visualization tools (Tableau, D3.js) for data cleaning, validation, and analysis
    • Example: Using R to perform a logistic regression analysis and create an interactive visualization of the results
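
To make the weighting step concrete, here is a small worked sketch of post-stratification on a single variable, where each group's weight is its population share divided by its sample share. The age groups, population shares, and scores are invented for illustration, and weighting on age and gender together, as in the example above, would use combined cells such as "women 18-34" in the same way.

```python
from collections import Counter

def poststratification_weights(respondents, population_shares, key):
    """Weight for each group = population share / sample share."""
    counts = Counter(r[key] for r in respondents)
    total = len(respondents)
    return {
        group: population_shares[group] / (count / total)
        for group, count in counts.items()
    }

def weighted_mean(respondents, weights, key, value):
    """Average of `value` where each respondent counts by their group's weight."""
    numerator = sum(weights[r[key]] * r[value] for r in respondents)
    denominator = sum(weights[r[key]] for r in respondents)
    return numerator / denominator

# Hypothetical sample: younger respondents are over-represented (80% vs. 40%).
respondents = (
    [{"age_group": "18-34", "score": 4}] * 8
    + [{"age_group": "35+", "score": 2}] * 2
)
population_shares = {"18-34": 0.4, "35+": 0.6}  # assumed census figures

weights = poststratification_weights(respondents, population_shares, "age_group")
unweighted = sum(r["score"] for r in respondents) / len(respondents)
weighted = weighted_mean(respondents, weights, "age_group", "score")
print(f"unweighted mean: {unweighted:.2f}, weighted mean: {weighted:.2f}")
```

In this toy data the unweighted mean is pulled toward the over-represented younger group, while the weighted mean reflects the assumed population mix. Large weights on very small groups make estimates unstable, so the group sizes deserve a look before publishing.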

Data Integration for Storytelling

Combining Original and External Datasets

  • Integrate original survey or crowdsourced data with other datasets (government databases, academic research, data from other media outlets) for a more comprehensive understanding
    • Merge datasets based on common variables or keys using techniques like joining tables, concatenating datasets, or aggregating data at different levels of granularity
    • Example: Combining original survey data on local business sentiment with government data on economic indicators
  • Integrate geospatial data (maps, satellite imagery) to explore geographic patterns or trends
    • Use GIS software or mapping libraries to visualize and analyze geospatial data
    • Example: Mapping crowdsourced reports of crime incidents and comparing them to official police data
  • Analyze text data (open-ended survey responses, social media posts) using natural language processing (NLP) techniques
    • Identify common themes, sentiment, or named entities in the text data
    • Example: Using NLP to analyze open-ended survey responses about experiences with a particular product or service
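
The merge described above, aggregating to a shared level of granularity and then joining on a common key, can be sketched as follows. The ZIP codes, sentiment scores, and unemployment rates are made-up placeholder values, and `mean_by_key` and `left_join` are hypothetical helper names.

```python
from collections import defaultdict
from statistics import mean

def mean_by_key(rows, key, value):
    """Aggregate individual responses to one average per key (e.g., per ZIP code)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    return {k: mean(values) for k, values in groups.items()}

def left_join(aggregated, lookup, key_name, value_name):
    """Attach external fields to each aggregated row, keeping unmatched rows."""
    joined = []
    for k, avg in sorted(aggregated.items()):
        joined.append({key_name: k, value_name: avg, **lookup.get(k, {})})
    return joined

# Hypothetical original survey: business sentiment on a 1-5 scale.
survey = [
    {"zip": "10001", "sentiment": 4},
    {"zip": "10001", "sentiment": 2},
    {"zip": "10002", "sentiment": 5},
    {"zip": "10003", "sentiment": 1},
]

# Hypothetical government indicators keyed by the same ZIP codes.
indicators = {
    "10001": {"unemployment_rate": 4.2},
    "10002": {"unemployment_rate": 3.1},
}

by_zip = mean_by_key(survey, "zip", "sentiment")
combined = left_join(by_zip, indicators, "zip", "mean_sentiment")
for row in combined:
    print(row)
```

A left join keeps ZIP code 10003 even though it has no government record, which makes coverage gaps visible instead of silently dropping responses.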

Communicating Insights through Data Visualization

  • Create clear, accurate data visualizations, tailored to the audience, to communicate insights from integrated datasets
    • Use charts, graphs, maps, and interactive dashboards to present findings
    • Example: Developing an interactive dashboard that allows readers to explore survey results by demographic group or geographic area
  • Be transparent about data sources, methods, and limitations
    • Provide access to raw data and methodology to enhance credibility and reproducibility
    • Example: Publishing a detailed methodology document that describes the data collection, cleaning, and analysis process
  • Consider ethical issues in data integration
    • Protect individual privacy, ensure data security, and avoid misleading or biased interpretations of the combined data
    • Example: Anonymizing sensitive data before publishing and providing context to help readers interpret the findings accurately
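
A minimal sketch of the anonymization example, assuming records are dictionaries: it drops direct identifiers and replaces exact ages with ten-year bands. The field names are hypothetical, and these two steps alone do not guarantee anonymity, because combinations of remaining fields (a rare job title in a small town, say) can still identify someone.

```python
# Assumed names of fields that identify a person directly.
DIRECT_IDENTIFIERS = {"name", "email", "phone"}

def age_band(age, width=10):
    """Generalize an exact age into a band such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def anonymize(record):
    """Drop direct identifiers and coarsen age before publication."""
    safe = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    if "age" in safe:
        safe["age_band"] = age_band(safe.pop("age"))
    return safe

# Hypothetical submission.
submission = {
    "name": "Ana Diaz",
    "email": "ana@example.org",
    "age": 37,
    "answer": "Waited six hours in the emergency room",
}
print(anonymize(submission))
```

Free-text answers can contain names or addresses too, so they still need a manual read before release.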
© 2024 Fiveable Inc. All rights reserved.
AP® and SAT® are trademarks registered by the College Board, which is not affiliated with, and does not endorse this website.

