Linkage analysis of lung cancer

Linkage analysis is a statistical analysis of pedi-
gree data that looks for evidence of cosegregation
through the generations in human pedigrees of al-
leles at a genetic “susceptibility” locus and some
known genetic “marker” locus (usually a DNA poly-
morphism). Linkage analysis is a very powerful
method for detecting genetic loci that are highly
penetrant (after adjusting for environmental risk
factors). However, power decreases as the suscep-
tibility allele becomes more common and less pene-
trant. Since cigarette smoking is an extremely strong
risk factor for lung cancer (e.g., [4]), it is important
that one looks for a major gene after controlling for
at least personal smoking, as this will increase the
power to detect linkage.
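As a rough illustration of what a linkage statistic measures, the following minimal Python sketch computes a two-point LOD score for the simplest textbook case: phase-known, fully informative meioses in which recombinants can be counted directly. The function name and the counts are hypothetical and are not taken from the study; real pedigree analyses evaluate full pedigree likelihoods rather than simple counts.

```python
import math


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """Two-point LOD score, log10[L(theta) / L(0.5)], for phase-known meioses.

    Assumes every meiosis is fully informative, so the likelihood is
    theta**r * (1 - theta)**(n - r). This is a teaching simplification.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")

    nonrecombinants = meioses - recombinants
    likelihood_linked = theta**recombinants * (1.0 - theta) ** nonrecombinants
    likelihood_unlinked = 0.5**meioses  # free recombination, theta = 0.5

    if likelihood_linked == 0.0:
        # A recombinant was observed but theta = 0 forbids recombination.
        return float("-inf")
    return math.log10(likelihood_linked / likelihood_unlinked)


# Hypothetical example: 1 recombinant observed in 10 informative meioses,
# evaluated over a small grid of recombination fractions.
for theta in (0.01, 0.05, 0.1, 0.2, 0.3, 0.5):
    print(f"theta={theta:.2f}  LOD={lod_score(1, 10, theta):.3f}")
```

In this simplified setting the LOD score is exactly zero at a recombination fraction of 0.5 and peaks near the observed recombinant proportion, which is why cosegregation of marker and disease alleles shows up as a positive score.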
Bailey-Wilson et al. [99] published the first ev-
idence of linkage of a putative lung cancer sus-
ceptibility locus to a region of chromosome 6q.
Data were collected at eight recruitment sites of the
Genetic Epidemiology of Lung Cancer Consortium
(GELCC): the University of Cincinnati, University of
Colorado, Johns Hopkins School of Public Health,
Karmanos Cancer Institute, Saccomanno Research
Institute, Louisiana State University Health Sciences
Center, Mayo Clinic, and Medical College of Ohio.
Of the 26,108 lung cancer cases screened at GELCC
sites for this study, 13.7% had at least one first-
degree relative with lung cancer. Following the ini-
tial family history screening process, we collected
additional information from the 3541 families with
at least one first-degree relative with lung cancer.
We interviewed probands and/or their family rep-
resentatives to collect data regarding additional per-
sons affected with any cancers in the extended fam-
ily, vital status of affected individuals, availability of
archival tissue, and willingness of family members
to participate in the study. Further pedigree devel-
opment and biospecimen collection (blood, buccal
cells, or fixed tissue) were performed on 771 families
with three or more first-degree relatives affected
with lung cancer. Cancers were verified by medical
records, pathology reports, cancer registry records,
or death certificates for 69% of individuals affected
with either lung or throat cancer (LT), and by reports
of multiple family members for the other 31% of
family members affected with LT. Of these families,
only 52 had enough biospecimens available to make
them informative for linkage analyses. DNA isolated
from blood was genotyped at the Center for Inher-
ited Disease Research (CIDR, a National Institutes of
Health-supported core research facility), and DNA
from buccal cells and archival tissue and sputum
were genotyped at the University of Cincinnati, for
a panel of 392 microsatellite (short tandem repeat
polymorphism, STRP) marker loci. The data were
checked for errors and then analyzed using parametric
and nonparametric linkage methods. Marker
allele frequencies were calculated separately and
linkage analyses were performed separately for the
white American and African American families,
with the results combined in overall tests of linkage.
Our primary analytical approach assumed a
model with 10% penetrance in carriers and 1% penetrance
in the noncarriers. This analytical approach
weights information only from the affected subjects.
For this analysis we used FASTLINK for two-point
analysis and SIMWALK2 for multipoint analysis. We
chose this linkage model as our primary analytical
approach because of uncertainty about the strength
of relationship between smoking behavior and lung
cancer risk in the high-risk families we are study-
ing, and because the complex “gene + environ-
ment” models from the published segregation anal-
yses were not currently available in any multipoint
linkage analysis program. In addition, since about
90% of the affected family members in our studies
smoked, weighting only the affected individuals in
our simple dominant, low-penetrance model has the
effect of jointly allowing for smoking status, while
ignoring information from unaffected subjects. We
allowed for genetic heterogeneity (different families
having different genetic causation) in the analysis.
Secondary analyses used more complex models that
included age and pack-years of cigarette smoking to
modify the penetrances. Our standard for this anal-
ysis was LODLINK, which uses the genetic regres-
sive model, obtained from segregation analyses by
Sellers et al. [80] and Bailey-Wilson et al. [91]. The
current implementation of LODLINK permits only
two-point analysis when a covariate is included,
and it is well known that two-point linkage is less
powerful in general than multipoint linkage analy-
sis. Nonparametric analyses were also performed as
secondary analyses with variance components models
using SOLAR (binary trait option) and mixed effects
Cox regression models, in which time to onset
of disease is modeled as a quantitative trait.
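The primary model described above assumes 10% penetrance in carriers and 1% in noncarriers. One way to see why such a low-penetrance model draws its information mainly from affected subjects is to compare how strongly each phenotype discriminates carriers from noncarriers. The sketch below does only that arithmetic; the constant and function names are hypothetical, and it is an illustration of the penetrance values quoted in the text, not the likelihood computation performed by the linkage programs.

```python
import math

# Penetrances quoted in the text for the primary analysis model.
PEN_CARRIER = 0.10      # P(affected | carrier)
PEN_NONCARRIER = 0.01   # P(affected | noncarrier)


def phenotype_likelihood_ratio(affected: bool) -> float:
    """Return P(phenotype | carrier) / P(phenotype | noncarrier)."""
    if affected:
        return PEN_CARRIER / PEN_NONCARRIER
    return (1.0 - PEN_CARRIER) / (1.0 - PEN_NONCARRIER)


for label, affected in (("affected", True), ("unaffected", False)):
    ratio = phenotype_likelihood_ratio(affected)
    print(f"{label:10s} ratio={ratio:.3f}  log10={math.log10(ratio):+.3f}")
```

Under these penetrances an affected relative is ten times more likely to be observed among carriers than among noncarriers, whereas an unaffected relative is almost equally likely in both groups (a ratio of 0.90/0.99), so unaffected subjects contribute very little evidence about carrier status.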
Multipoint parametric linkage under the sim-
ple dominant low-penetrance affected-only model
(Figure 2.1) yielded a maximum heterogeneity LOD
score (HLOD) of 2.79 at 155 cM (marker D6S2436)
on chromosome 6q23–25 in the 52 families, with
67% of families estimated to be linked. Multipoint
analysis of a subset of 38 families with four affected
relatives gave an HLOD of 3.47 at this same loca-
tion, with 78% of families estimated to be linked,
whereas for the 23 highest-risk families (five or more
affected in two or more generations), the multipoint
HLOD score was 4.26, with 94% of these families
estimated to be linked to this region. Our non-
parametric analyses and the two-point parametric
analyses that used the Sellers et al. model [80,91]
all provided additional support for linkage to this
region.
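The heterogeneity LOD scores and the estimated proportions of linked families reported above can be understood through the standard admixture formulation, in which a fraction alpha of families is assumed to be linked and the rest unlinked. The following minimal Python sketch evaluates that formula at a single map position and maximizes it over alpha by grid search. The per-family LOD values are invented for illustration and are not the study's data, the function names are hypothetical, and the grid search is a stand-in for whatever optimization the linkage programs actually use.

```python
import math
from typing import Sequence, Tuple


def hlod(family_lods: Sequence[float], alpha: float) -> float:
    """Admixture HLOD at one position: sum_i log10(alpha * 10**LOD_i + 1 - alpha)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return sum(math.log10(alpha * 10.0**lod + (1.0 - alpha)) for lod in family_lods)


def max_hlod(family_lods: Sequence[float], steps: int = 1000) -> Tuple[float, float]:
    """Grid-search alpha in [0, 1]; return (maximum HLOD, alpha at the maximum)."""
    best_value, best_alpha = 0.0, 0.0  # alpha = 0 always gives HLOD = 0
    for i in range(steps + 1):
        alpha = i / steps
        value = hlod(family_lods, alpha)
        if value > best_value:
            best_value, best_alpha = value, alpha
    return best_value, best_alpha


# Hypothetical per-family LOD scores at one position (not the study's data).
example_lods = [0.8, 0.5, -0.6, 0.3, -0.4]
value, alpha = max_hlod(example_lods)
print(f"max HLOD = {value:.3f} at alpha = {alpha:.3f}")
print(f"homogeneity LOD (alpha = 1) = {hlod(example_lods, 1.0):.3f}")
```

Because setting alpha to 1 recovers the ordinary summed LOD score and setting it to 0 gives zero, the maximized HLOD can never fall below either value. In a full analysis the maximization is also carried out over map position, which this sketch holds fixed.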
Additional families are now being collected by the
GELCC to attempt to confirm this linkage result in
an independent sample and to narrow the critical
region that may contain a susceptibility gene. In ad-
dition, several other regions showed suggestive ev-
idence of linkage and these are being pursued.
