8.2 Linkage disequilibrium

9 min read • August 20, 2024

Linkage disequilibrium (LD) is a key concept in genetics, describing how alleles at different loci are associated more often than expected by chance. It's crucial for understanding genetic variation and plays a vital role in mapping genes associated with diseases and traits.

LD is influenced by factors like genetic drift, population bottlenecks, and natural selection. Measuring LD helps researchers conduct genome-wide association studies, fine-map disease loci, and infer population history. Understanding LD patterns is essential for interpreting genetic data and designing effective genomic studies.

Definition of linkage disequilibrium

Linkage disequilibrium (LD) is the non-random association of alleles at different loci in a given population
Alleles that are in LD are found together on the same haplotype more often than would be expected by chance
LD is a crucial concept in population genetics and is widely used in genetic mapping and association studies

Causes of linkage disequilibrium

Genetic drift

Genetic drift is the random fluctuation of allele frequencies in a population over time
In small populations, random sampling of haplotypes from one generation to the next shifts haplotype frequencies by chance, creating associations between alleles and increasing LD
Genetic drift is more pronounced in smaller populations due to the greater impact of random sampling effects

Population bottlenecks

Population bottlenecks occur when a population undergoes a severe reduction in size, often due to environmental factors or demographic events
During a bottleneck, rare alleles may be lost, and the remaining alleles may become more frequent, leading to increased LD
Bottlenecks also intensify genetic drift among the few surviving haplotypes, further contributing to LD

Founder effects

Founder effects occur when a new population is established by a small number of individuals from a larger population
The limited genetic diversity in the founding population can lead to increased LD, as the alleles present in the founders become more frequent
Founder effects are often observed in isolated populations or those that have undergone rapid expansion from a small initial population

Admixture

Admixture occurs when two or more previously isolated populations interbreed
Admixture can create new combinations of alleles, leading to LD between loci that were previously unlinked in the parental populations
The extent of LD generated by admixture depends on factors such as the genetic distance between the parental populations and the time since admixture occurred
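
The dependence on the difference between the parental populations and on the time since admixture can be illustrated with a small calculation. This is a minimal sketch, assuming two source populations that are each in linkage equilibrium, random mating after mixing, and a constant recombination fraction c; the function names are hypothetical.

```python
def admixture_ld(m, p_a1, p_b1, p_a2, p_b2):
    """LD (D) between alleles A and B right after two populations are mixed.

    m is the fraction of the admixed pool drawn from population 1.
    Assumption: each source population is in linkage equilibrium, so its
    AB haplotype frequency is the product of its two allele frequencies.
    """
    # Haplotype and allele frequencies in the mixed pool
    p_ab = m * p_a1 * p_b1 + (1 - m) * p_a2 * p_b2
    p_a = m * p_a1 + (1 - m) * p_a2
    p_b = m * p_b1 + (1 - m) * p_b2
    return p_ab - p_a * p_b


def admixture_ld_after(d0, c, generations):
    """D remaining after some generations of random mating.

    c is the recombination fraction between the loci (0.5 if unlinked).
    """
    return d0 * (1 - c) ** generations


# Populations differ strongly at both loci, mixed in equal proportions
d0 = admixture_ld(0.5, p_a1=0.9, p_b1=0.8, p_a2=0.1, p_b2=0.2)
print(round(d0, 4))                               # 0.12
print(round(admixture_ld_after(d0, 0.5, 2), 4))   # 0.03
```

Under these assumptions the initial D equals m(1 - m) multiplied by the allele frequency difference at each locus, so populations with similar allele frequencies generate little admixture LD, and LD between unlinked loci halves every generation.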

Natural selection

Natural selection can create LD when it favors certain combinations of alleles at different loci
If two alleles at different loci confer a selective advantage when present together, they will tend to be inherited together, resulting in LD
Selective sweeps, where a beneficial allele rapidly increases in frequency and "sweeps" nearby linked alleles along with it, can also generate LD

Measures of linkage disequilibrium

D and D'

D is the basic measure of LD, calculated as the difference between the observed frequency of a haplotype and the expected frequency under random association
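D' is a normalized version of D, obtained by dividing D by its maximum possible value given the allele frequencies; it ranges from -1 to 1, with |D'| = 1 indicating complete LD and D' = 0 indicating no LD
D' is useful for comparing the strength of LD between different pairs of loci, as it is less dependent on allele frequencies than D, although it can be inflated when sample sizes are small or alleles are rare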
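
As a worked illustration of these definitions, the sketch below computes D and D' for two biallelic loci from one haplotype frequency and the two allele frequencies. The function name and argument names are hypothetical, and the inputs are assumed to be known population frequencies rather than estimates from a sample.

```python
def ld_d_and_dprime(p_ab, p_a, p_b):
    """Return (D, D') for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    """
    # Observed haplotype frequency minus the frequency expected
    # under random association of the two alleles
    d = p_ab - p_a * p_b

    # Largest magnitude D can reach given the allele frequencies
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))

    d_prime = d / d_max if d_max > 0 else 0.0
    return d, d_prime


d, d_prime = ld_d_and_dprime(p_ab=0.4, p_a=0.5, p_b=0.5)
print(round(d, 4), round(d_prime, 4))  # 0.15 0.6
```

D' keeps the sign of D here, which matches the -1 to 1 range described above; many tools report only |D'|.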

r and r²

r is the correlation coefficient between alleles at two loci, ranging from -1 to 1
r² is the square of r and represents the proportion of variation in the allelic state at one locus that can be explained by the allelic state at the other locus
r² is commonly used in association studies, as it directly relates to the power to detect associations between markers and traits
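
A short sketch of the calculation may help here: r is D divided by the square root of the product of the four allele frequencies. The function name is hypothetical and the frequencies are assumed to be known exactly.

```python
import math


def ld_r_and_r2(p_ab, p_a, p_b):
    """Return (r, r^2) for two biallelic loci from haplotype/allele frequencies."""
    d = p_ab - p_a * p_b
    denom = math.sqrt(p_a * (1 - p_a) * p_b * (1 - p_b))
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    r = d / denom
    return r, r * r


# Allele A (frequency 0.3) occurs only on haplotypes that carry B (frequency 0.5)
r, r2 = ld_r_and_r2(p_ab=0.3, p_a=0.3, p_b=0.5)
print(round(r, 3), round(r2, 3))  # 0.655 0.429
```

In this example A never appears without B, so |D'| would be 1, yet r² is only about 0.43 because the allele frequencies differ. This is why r², not D', is the quantity usually tied to association power.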

Factors affecting linkage disequilibrium

Recombination rates

Recombination breaks down LD by shuffling alleles between haplotypes
LD between two loci decays at a rate set by the recombination rate between them, so the level of LD is inversely related to that rate
Regions of the genome with high recombination rates (hotspots) tend to have lower levels of LD, while regions with low recombination rates (coldspots) tend to have higher levels of LD
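
The effect of recombination can be made concrete with the standard decay relation D_t = D_0 (1 - c)^t, where c is the recombination fraction between the loci and t is the number of generations. The sketch below assumes random mating in a large population with no selection or drift; the function names are hypothetical.

```python
import math


def ld_after_generations(d0, c, t):
    """D after t generations, starting from d0, with recombination fraction c."""
    return d0 * (1 - c) ** t


def generations_to_halve(c):
    """Number of generations for D to fall to half its starting value."""
    if not 0 < c <= 0.5:
        raise ValueError("c must be in (0, 0.5]")
    return math.log(0.5) / math.log(1 - c)


print(ld_after_generations(0.2, 0.5, 3))     # 0.025 (unlinked loci)
print(round(generations_to_halve(0.01), 1))  # 69.0  (tightly linked loci)
```

Unlinked loci lose half their LD every generation, while loci separated by a recombination fraction of 0.01 need roughly 69 generations, which is why LD persists mainly between closely linked sites.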

Mutation rates

A new mutation arises on a single haplotype, so it creates a new allele that is initially in complete LD with the alleles around it
The impact of mutation on LD depends on the mutation rate and the age of the mutation; recurrent mutation at the same site can place an allele on several haplotypes and reduce LD
Recent mutations will be in complete LD with nearby alleles, while older mutations will have had more time to recombine and break down LD

Population size

Larger populations tend to have lower levels of LD due to the increased effectiveness of recombination in breaking down haplotypes
In smaller populations, genetic drift is stronger and generates chance associations between alleles, increasing LD
Population size also affects the level at which LD settles: for the same recombination rate, larger populations maintain lower equilibrium LD because drift creates less new LD each generation
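
One widely used approximation for this relationship, often attributed to Sved, gives the expected r² between two loci as 1 / (1 + 4·Ne·c), where Ne is the effective population size and c is the recombination fraction. The sketch below simply evaluates that expression; it assumes a population at equilibrium between drift and recombination and ignores mutation and sampling error. The function name is hypothetical.

```python
def expected_r2(effective_size, c):
    """Approximate expected r^2 at drift-recombination equilibrium."""
    return 1.0 / (1.0 + 4.0 * effective_size * c)


# Same pair of loci (c = 0.001) in populations of different effective size
for n_e in (100, 1_000, 10_000):
    print(n_e, round(expected_r2(n_e, 0.001), 3))
# 100 0.714
# 1000 0.2
# 10000 0.024
```

The same expression can be rearranged to estimate effective population size from observed r² values at known recombination distances.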

Mating patterns

Non-random mating, such as inbreeding or assortative mating, can increase LD by favoring the transmission of certain haplotypes
Inbreeding leads to an increase in homozygosity and can maintain LD by reducing the effective recombination rate
Assortative mating, where individuals with similar phenotypes mate more frequently, can create LD between loci that influence the phenotype

Applications of linkage disequilibrium

Genome-wide association studies (GWAS)

GWAS utilize LD to identify genetic variants associated with traits or diseases
By genotyping a set of markers across the genome and testing for associations with the phenotype of interest, GWAS can identify loci that harbor causal variants
The power of GWAS to detect associations depends on the strength of LD between the causal variant and the genotyped markers
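
A commonly cited rule of thumb makes this dependence quantitative: testing a marker that has correlation r² with the causal variant is roughly equivalent to testing the causal variant directly in a sample that is r² times as large. The sketch below applies that approximation; the function names are hypothetical, and the rule is an approximation for single-marker tests rather than an exact power calculation.

```python
import math


def effective_sample_size(n, r2):
    """Approximate sample size 'seen' by an indirect test at a tagging marker."""
    return n * r2


def required_sample_size(n_direct, r2):
    """Sample size needed at the marker to match a direct test on n_direct samples."""
    if not 0 < r2 <= 1:
        raise ValueError("r2 must be in (0, 1]")
    return math.ceil(n_direct / r2)


print(effective_sample_size(2000, 0.25))  # 500.0
print(required_sample_size(1000, 0.5))    # 2000
```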

Fine-mapping of disease loci

Once a locus has been identified through GWAS, fine-mapping can be used to pinpoint the causal variant(s) responsible for the association
Fine-mapping involves genotyping additional markers in the region of interest and analyzing patterns of LD to identify the most likely causal variant(s)
The resolution of fine-mapping depends on the strength of LD in the region and the density of markers genotyped

Inferring population history

Patterns of LD can provide insights into a population's demographic history, such as population bottlenecks, expansions, and admixture events
The extent and distribution of LD across the genome can be used to estimate parameters such as effective population size and the timing of demographic events
Comparing patterns of LD between populations can also reveal differences in their demographic histories and help identify regions of the genome that have been subject to population-specific selection

Detecting natural selection

LD can be used to detect signatures of natural selection in the genome
Regions of the genome that have undergone recent positive selection will exhibit elevated levels of LD and reduced genetic diversity
Various statistical tests, such as the extended haplotype homozygosity (EHH) test and the integrated haplotype score (iHS), have been developed to identify such regions based on patterns of LD
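
To show the idea behind EHH, the sketch below computes a simplified version: among chromosomes that carry a chosen core allele, it asks how likely two randomly chosen chromosomes are to be identical across a window around the core site. It is an illustration under simplifying assumptions, not the full published procedure: haplotypes are phased strings of equal length, distance is measured in number of sites rather than physical or genetic distance, and the window is extended symmetrically. The function name and data are hypothetical.

```python
from collections import Counter
from math import comb


def ehh(haplotypes, core_index, core_allele, extent):
    """Simplified extended haplotype homozygosity around a core allele.

    haplotypes:  list of equal-length strings, one per phased chromosome
    core_index:  position of the core site within each string
    core_allele: allele at the core site that defines the carrier group
    extent:      number of sites to include on each side of the core
    """
    carriers = [h for h in haplotypes if h[core_index] == core_allele]
    n = len(carriers)
    if n < 2:
        return 0.0

    lo = max(0, core_index - extent)
    hi = min(len(carriers[0]), core_index + extent + 1)

    # Count how many carriers share each distinct extended haplotype
    counts = Counter(h[lo:hi] for h in carriers)

    # Probability that two carriers drawn without replacement are identical
    return sum(comb(k, 2) for k in counts.values()) / comb(n, 2)


haps = ["00100", "00100", "00101", "10100", "01000", "11010"]
for extent in (0, 1, 2):
    print(extent, round(ehh(haps, 2, "1", extent), 3))
# 0 1.0
# 1 1.0
# 2 0.167
```

EHH starts at 1 at the core and falls as the window grows. Around an allele that rose quickly under positive selection, the decline is expected to be slower than around the alternative allele, and statistics such as iHS are built on that contrast.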

Linkage disequilibrium vs linkage analysis

Linkage disequilibrium and linkage analysis are two distinct but related concepts in genetics
Linkage analysis is a family-based method that uses the co-segregation of markers and traits within pedigrees to identify regions of the genome that contain causal variants
In contrast, LD is a population-based measure of the non-random association of alleles at different loci
While linkage analysis relies on the direct observation of recombination events within families, LD reflects the cumulative effects of recombination, mutation, drift, and selection over many generations in a population

Patterns of linkage disequilibrium

Variation across the genome

The extent of LD varies widely across the genome, with some regions exhibiting strong LD and others showing little to no LD
Factors such as recombination rates, mutation rates, and the action of selection can all contribute to this variation
Recombination hotspots, which are regions of the genome with elevated recombination rates, tend to have lower levels of LD compared to surrounding regions

Differences between populations

The patterns of LD can differ substantially between populations due to differences in their demographic histories and the action of population-specific selective pressures
Populations that have undergone recent bottlenecks or founder events tend to have higher levels of LD compared to those with more stable demographic histories
Admixture between populations can also create distinct patterns of LD, with the extent of LD depending on the genetic distance between the parental populations and the time since admixture occurred

Limitations of linkage disequilibrium

Indirect association

LD-based methods, such as GWAS, rely on the indirect association between markers and causal variants
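As a result, the marker with the strongest signal may not be the causal variant; if the causal variant is not directly genotyped and is only in partial LD with the associated marker, the estimated effect is weakened and the source of the association can be misattributed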
Conversely, false negatives can occur if the causal variant is not in strong LD with any of the genotyped markers

Confounding factors

Various confounding factors can influence the patterns of LD observed in a population and lead to spurious associations
Population stratification, where subgroups within a population have different allele frequencies due to differences in ancestry, can create LD between unlinked loci and result in false positive associations
Cryptic relatedness, where individuals in a study are more closely related than expected by chance, can also inflate LD estimates and lead to false positives

Methods for estimating linkage disequilibrium

Pairwise LD measures

Pairwise LD measures, such as D, D', r, and r², are used to quantify the strength of association between alleles at two loci
These measures can be calculated from genotype data using various statistical software packages
Pairwise LD measures are often used to visualize patterns of LD across the genome and to identify regions of high or low LD
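
In practice these calculations are done with dedicated software, but the core computation is small. The sketch below builds a pairwise r² matrix in plain Python, assuming phased haplotypes coded as 0/1 at each site; real genotype data are often unphased and need haplotype frequency estimation or a genotype-correlation approach first. The function name and example data are hypothetical.

```python
def pairwise_r2(haplotypes):
    """Return a matrix of r^2 values for all pairs of sites.

    haplotypes: list of equal-length lists of 0/1 alleles (one per chromosome).
    Entries involving a monomorphic site are NaN because r^2 is undefined there.
    """
    n = len(haplotypes)
    m = len(haplotypes[0])
    # Frequency of the '1' allele at each site
    freqs = [sum(h[j] for h in haplotypes) / n for j in range(m)]

    matrix = [[float("nan")] * m for _ in range(m)]
    for i in range(m):
        for j in range(m):
            p_ab = sum(h[i] * h[j] for h in haplotypes) / n
            d = p_ab - freqs[i] * freqs[j]
            denom = freqs[i] * (1 - freqs[i]) * freqs[j] * (1 - freqs[j])
            if denom > 0:
                matrix[i][j] = d * d / denom
    return matrix


haps = [[0, 0, 1], [0, 0, 0], [1, 1, 1], [1, 1, 0]]
r2 = pairwise_r2(haps)
print(r2[0][1], r2[0][2])  # 1.0 0.0
```

Sites 0 and 1 always carry the same allele, so their r² is 1, while site 2 is distributed independently of site 0, giving an r² of 0.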

Haplotype-based methods

Haplotype-based methods consider the associations between alleles at multiple loci simultaneously
These methods can provide a more comprehensive view of LD patterns and can be more powerful for detecting associations than pairwise measures
Examples of haplotype-based methods include the estimation of haplotype frequencies, the calculation of haplotype diversity, and the identification of haplotype blocks
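
Two of these quantities are easy to compute once haplotypes are phased. The sketch below counts haplotype frequencies and evaluates haplotype diversity using the common estimator H = n/(n - 1) × (1 - Σ p_i²), where p_i is the frequency of haplotype i among n chromosomes. The function names and data are hypothetical, and phasing is assumed to have been done already.

```python
from collections import Counter


def haplotype_frequencies(haplotypes):
    """Map each distinct haplotype to its frequency in the sample."""
    n = len(haplotypes)
    return {h: k / n for h, k in Counter(haplotypes).items()}


def haplotype_diversity(haplotypes):
    """Probability that two chromosomes sampled at random carry different haplotypes."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two chromosomes")
    freqs = haplotype_frequencies(haplotypes)
    return n / (n - 1) * (1 - sum(f * f for f in freqs.values()))


sample = ["AB", "AB", "ab", "ab"]
print(haplotype_frequencies(sample))          # {'AB': 0.5, 'ab': 0.5}
print(round(haplotype_diversity(sample), 3))  # 0.667
```

Only two of the four possible two-locus haplotypes appear in this sample, which is the multi-locus signature of strong LD: alleles A and B always travel together.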

Visualization of linkage disequilibrium

LD plots

LD plots are used to visualize the strength of pairwise LD between markers in a region of the genome
These plots typically display the values of D' or r² for all pairs of markers, with the strength of LD indicated by the color or shading of the plot
LD plots can be used to identify regions of high LD, which may be indicative of functional importance or recent selection

Heatmaps

Heatmaps are another common method for visualizing LD patterns
In an LD heatmap, the strength of pairwise LD is represented by the color or intensity of each cell in the matrix, with darker colors indicating stronger LD
Heatmaps can be used to identify patterns of LD across larger regions of the genome and to compare LD patterns between different populations or subgroups

Impact of linkage disequilibrium on genomic studies

Study design considerations

The extent and distribution of LD in a population can have significant implications for the design of genetic studies
In populations with high levels of LD, fewer markers may be needed to capture the majority of the genetic variation, reducing the cost and complexity of the study
Conversely, in populations with lower levels of LD, a higher density of markers may be required to achieve adequate coverage of the genome
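
The link between LD and marker density can be sketched with a simple greedy tag-marker selection: repeatedly pick the marker that captures the most not-yet-covered markers at or above an r² threshold. This is an illustrative procedure under stated assumptions, not the method of any particular tool; the function name, the 0.8 threshold, and the example matrix are hypothetical.

```python
def greedy_tag_snps(r2, threshold=0.8):
    """Choose tag markers so every marker has r^2 >= threshold with some tag.

    r2: square matrix (list of lists) of pairwise r^2 values.
    Returns the indices of the chosen tag markers in selection order.
    """
    untagged = set(range(len(r2)))
    tags = []
    while untagged:
        # Marker covering the most untagged markers; ties go to the lowest index
        best = max(
            sorted(untagged),
            key=lambda i: sum(1 for j in untagged if r2[i][j] >= threshold),
        )
        tags.append(best)
        untagged -= {j for j in untagged if r2[best][j] >= threshold}
        untagged.discard(best)
    return tags


# Two LD blocks: markers 0-1 are strongly correlated, as are markers 2-3
r2 = [
    [1.00, 0.90, 0.10, 0.00],
    [0.90, 1.00, 0.20, 0.10],
    [0.10, 0.20, 1.00, 0.85],
    [0.00, 0.10, 0.85, 1.00],
]
print(greedy_tag_snps(r2))  # [0, 2]
```

Here two tags cover four markers. If every pair had low r², each marker would need to be genotyped itself, which is the situation described for low-LD populations.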

Interpretation of results

The presence of LD can complicate the interpretation of results from genetic studies
Associated markers identified through GWAS may not be the causal variants themselves, but rather may be in LD with the causal variant(s)
Fine-mapping and functional studies may be necessary to distinguish causal variants from those that are merely associated due to LD
The extent of LD in a region can also affect the resolution of genetic mapping, with higher levels of LD resulting in larger regions of association and reduced ability to pinpoint causal variants

AP® and SAT® are trademarks registered by the College Board, which is not affiliated with, and does not endorse this website.

© 2024 Fiveable Inc. All rights reserved.
