📉Intro to Business Statistics Unit 6 – The Normal Distribution

The normal distribution is a fundamental concept in statistics, characterized by its symmetrical bell shape. It's defined by two parameters: the mean and standard deviation, which determine the center and spread of the distribution. This versatile model is widely used in various fields due to its well-defined properties. Key features of the normal distribution include its symmetry, unimodal shape, and the empirical rule describing data within standard deviations. The standard normal distribution, with a mean of 0 and standard deviation of 1, allows for standardization using z-scores. This enables probability calculations and comparisons across different distributions.

What's the Normal Distribution?

  • Continuous probability distribution that is symmetrical and bell-shaped
  • Defined by two parameters: mean ($\mu$) and standard deviation ($\sigma$)
  • Mean determines the center of the distribution while standard deviation controls the spread
  • Total area under the curve equals 1, representing all possible outcomes
  • Most common values cluster around the mean, with decreasing probability for values further away
    • For example, in a normal distribution of heights, most people will be close to the average height, with fewer people being very short or very tall
  • Widely used in statistics due to its well-defined properties and natural occurrence in many real-world phenomena (test scores, measurement errors)
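
For reference, the bell curve described above is usually written as the density below. This is the standard textbook formula rather than something stated in these notes, with $\mu$ and $\sigma$ as the mean and standard deviation; the second expression restates that the total area under the curve is 1.

```latex
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right),
\qquad
\int_{-\infty}^{\infty} f(x)\,dx = 1
```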

Key Features and Properties

  • Symmetry: the left and right halves of the distribution are mirror images of each other
  • Unimodal: only one peak, located at the mean
  • Mean, median, and mode are all equal and located at the center of the distribution
  • Inflection points (where the curve switches between concave down and concave up) are located at $\mu \pm \sigma$
  • Empirical rule (68-95-99.7 rule) describes the percentage of data within 1, 2, and 3 standard deviations of the mean
    • Approximately 68% of data falls within one standard deviation of the mean ($\mu \pm \sigma$)
    • Approximately 95% of data falls within two standard deviations of the mean ($\mu \pm 2\sigma$)
    • Approximately 99.7% of data falls within three standard deviations of the mean ($\mu \pm 3\sigma$)
  • Skewness and excess kurtosis are both 0 for a perfect normal distribution (the kurtosis itself is 3)
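
The empirical rule can be checked numerically. The sketch below uses `statistics.NormalDist` from Python's standard library; the helper name `share_within` is made up for this illustration.

```python
from statistics import NormalDist


def share_within(k: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Probability that a normal value lies within k standard deviations of the mean."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)


# Roughly 0.6827, 0.9545 and 0.9973: the 68-95-99.7 rule.
for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.4f}")
```

The result does not depend on the particular mean and standard deviation, which is why the rule can be stated once for every normal distribution.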

The Standard Normal Distribution

  • Special case of the normal distribution with a mean of 0 and a standard deviation of 1
  • Denoted as $Z \sim N(0, 1)$
  • Allows for standardization of any normal distribution using z-scores
  • Z-score represents the number of standard deviations a value is from the mean
    • Positive z-scores indicate values above the mean, while negative z-scores indicate values below the mean
  • Standard normal distribution tables provide probabilities for z-scores, eliminating the need for integration
  • Enables comparison of values from different normal distributions on a common scale
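
As a small illustration of comparing values on a common scale, the sketch below standardizes two scores from different tests. The test names, means, standard deviations, and scores are all invented for the example.

```python
def z_score(x: float, mu: float, sigma: float) -> float:
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma


# Hypothetical results: two tests with different scales.
z_a = z_score(80, mu=70, sigma=5)      # test A: score 80, mean 70, sd 5
z_b = z_score(650, mu=500, sigma=100)  # test B: score 650, mean 500, sd 100

print(f"test A z-score: {z_a}")  # 2.0
print(f"test B z-score: {z_b}")  # 1.5
```

Although 650 is the larger raw number, the score of 80 sits further above its own mean in standard deviation units.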

Z-Scores and Probability

  • Z-score formula: $z = \frac{x - \mu}{\sigma}$, where $x$ is the value of interest, $\mu$ is the mean, and $\sigma$ is the standard deviation
  • Z-scores allow for the calculation of probabilities using standard normal distribution tables or software
  • Probability of a value being less than, greater than, or between specific z-scores can be determined
    • For example, $P(Z < 1.5)$ represents the probability of a value being less than 1.5 standard deviations above the mean
  • Percentiles can be found using z-scores and the standard normal distribution
    • For instance, a z-score of approximately 1.28 corresponds to the 90th percentile, meaning 90% of the data falls below this value
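
The table lookups described above can also be done in software. The following sketch uses `statistics.NormalDist` from the Python standard library in place of a printed z-table; the variable names are chosen only for this example.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

p_below = Z.cdf(1.5)                    # P(Z < 1.5), about 0.9332
p_above = 1 - Z.cdf(1.5)                # P(Z > 1.5), about 0.0668
p_between = Z.cdf(1.5) - Z.cdf(-1.0)    # P(-1 < Z < 1.5), about 0.7745
z_90 = Z.inv_cdf(0.90)                  # z-score at the 90th percentile, about 1.28

print(f"P(Z < 1.5)      = {p_below:.4f}")
print(f"P(Z > 1.5)      = {p_above:.4f}")
print(f"P(-1 < Z < 1.5) = {p_between:.4f}")
print(f"90th percentile = {z_90:.2f}")
```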

Applications in Business

  • Quality control: identifying products that fall outside acceptable limits (usually $\mu \pm 3\sigma$)
  • Financial analysis: modeling stock returns, portfolio risk, and option pricing (Black-Scholes model)
  • Marketing research: analyzing customer satisfaction scores or product ratings
  • Human resources: setting performance benchmarks and evaluating employee performance
  • Forecasting: predicting demand, sales, or revenue using historical data and assuming normality
  • Operations management: determining optimal inventory levels and reorder points based on lead time and demand variability
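
To make the quality control case concrete, here is a minimal sketch that estimates how often a normally distributed process falls outside $\mu \pm 3\sigma$ limits. The process (a fill weight with mean 500 g and standard deviation 2 g) and the batch size are hypothetical, and the calculation assumes the measurements really are normal.

```python
from statistics import NormalDist

# Hypothetical process: fill weight in grams, assumed to be normally distributed.
process = NormalDist(mu=500.0, sigma=2.0)

lower = process.mean - 3 * process.stdev  # 494.0
upper = process.mean + 3 * process.stdev  # 506.0

# Probability that a unit falls below the lower limit or above the upper limit.
p_outside = process.cdf(lower) + (1 - process.cdf(upper))

batch_size = 10_000  # hypothetical batch
print(f"limits: {lower} to {upper} g")
print(f"share outside limits: {p_outside:.4%}")  # about 0.27%
print(f"expected rejects per {batch_size} units: {p_outside * batch_size:.0f}")
```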

Common Misconceptions

  • Not all data follows a normal distribution; it's essential to check assumptions before applying normal distribution techniques
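  • The normal distribution is continuous, not discrete; it is often used to approximate discrete distributions (such as the binomial) when the sample size is large
  • The empirical rule (68-95-99.7) gives rounded values (for an exactly normal distribution the figures are about 68.27%, 95.45%, and 99.73%), and real data that is only approximately normal will deviate from them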
  • Z-scores do not indicate the probability directly; they need to be converted using the standard normal distribution
  • The mean and standard deviation are sensitive to outliers; robust measures like the median and interquartile range may be more appropriate for skewed or heavy-tailed distributions
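
The last point can be illustrated with a small made-up data set. The sketch below compares the mean and standard deviation with the median and interquartile range before and after one extreme value is added; the numbers and the helper name `iqr` are assumptions for the example.

```python
import statistics


def iqr(values):
    """Interquartile range from the quartile cut points of statistics.quantiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1


clean = [47, 48, 49, 50, 50, 51, 52, 53]  # hypothetical order sizes
with_outlier = clean + [150]              # the same data plus one extreme order

for label, data in (("clean", clean), ("with outlier", with_outlier)):
    print(
        f"{label:>12}: mean={statistics.mean(data):.2f} "
        f"sd={statistics.stdev(data):.2f} "
        f"median={statistics.median(data):.2f} "
        f"IQR={iqr(data):.2f}"
    )
```

In this example a single extreme value moves the mean and inflates the standard deviation considerably, while the median stays at 50 and the interquartile range changes only slightly.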

Calculating with Normal Distributions

  • Finding probabilities: use z-scores and standard normal distribution tables or software (e.g., Excel's NORM.DIST or NORM.S.DIST functions)
    • Example: given $X \sim N(100, 15)$ (mean 100, standard deviation 15), find $P(X < 90)$ by calculating the z-score, $z = (90 - 100)/15 \approx -0.67$, and looking it up in the standard normal distribution, which gives a probability of about 0.25
  • Finding values: use inverse z-scores and standard normal distribution tables or software (e.g., Excel's NORM.INV or NORM.S.INV functions)
    • Example: find the value that corresponds to the 25th percentile in a distribution with $\mu = 50$ and $\sigma = 10$; the 25th percentile of the standard normal is $z \approx -0.674$, so $x = \mu + z\sigma \approx 50 - 6.74 = 43.26$
  • Linear transformations: if $X \sim N(\mu, \sigma)$, then $aX + b \sim N(a\mu + b, |a|\sigma)$
    • Example: if test scores follow $N(70, 5)$ and are scaled by a factor of 1.5, the new distribution is $N(105, 7.5)$
  • Central Limit Theorem: the sampling distribution of the mean approaches a normal distribution as the sample size increases, regardless of the shape of the population distribution (provided the population has a finite variance)
  • Confidence intervals: ranges of values that are likely to contain the true population parameter, based on the sample mean and standard error
  • Hypothesis testing: using the normal distribution to determine the likelihood of observing a sample statistic under the null hypothesis
  • Analysis of Variance (ANOVA): comparing means of multiple groups, assuming normality and equal variances
  • Regression analysis: modeling the relationship between variables, with residuals often assumed to follow a normal distribution
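
The three worked examples in this section (a probability, a percentile, and a linear transformation) can be reproduced with Python's standard library. This is a sketch that reads the second parameter of $N(\cdot, \cdot)$ as the standard deviation, as the linear transformation rule in this section does; the variable names are illustrative.

```python
from statistics import NormalDist

# Finding a probability: P(X < 90) for X ~ N(100, 15).
X = NormalDist(mu=100, sigma=15)
z = (90 - X.mean) / X.stdev      # about -0.67
p_below_90 = X.cdf(90)           # about 0.2525; Excel: NORM.DIST(90, 100, 15, TRUE)

# Finding a value: 25th percentile when mu = 50 and sigma = 10.
q25 = NormalDist(mu=50, sigma=10).inv_cdf(0.25)  # about 43.26; Excel: NORM.INV(0.25, 50, 10)

# Linear transformation: scores ~ N(70, 5) multiplied by 1.5.
scores = NormalDist(mu=70, sigma=5)
scaled = scores * 1.5            # NormalDist supports scaling by a constant

print(f"z = {z:.2f}, P(X < 90) = {p_below_90:.4f}")
print(f"25th percentile = {q25:.2f}")
print(f"scaled scores: mean = {scaled.mean}, sd = {scaled.stdev}")  # 105.0 and 7.5
```

Working through the z-score by hand and calling `cdf` on the original distribution give the same probability; the software simply performs the standardization internally.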


© 2024 Fiveable Inc. All rights reserved.
AP® and SAT® are trademarks registered by the College Board, which is not affiliated with, and does not endorse this website.
