Hierarchical clustering of human papilloma virus genotype patterns in the ASCUS-LSIL triage study

Nicolas Wentzensen, Lauren E. Wilson, Cosette M. Wheeler, Joseph D. Carreon, Patti E. Gravitt, Mark Schiffman, Philip E. Castle

Research output: Contribution to journal › Article › peer-review

24 Scopus citations


Anogenital cancers are associated with ∼13 carcinogenic human papilloma virus (HPV) types in a broader group that cause cervical intraepithelial neoplasia (CIN). Multiple concurrent cervical HPV infections are common, which complicates the attribution of HPV types to different grades of CIN. Here we report the analysis of HPV genotype patterns in the atypical squamous cells of undetermined significance-low-grade squamous intraepithelial lesion triage study with the use of unsupervised hierarchical clustering. Women who underwent colposcopy at baseline (n = 2,780) were grouped into 20 disease categories based on histology and cytology. Disease groups and HPV genotypes were clustered with the use of complete linkage. Risk of 2-year cumulative CIN3+, viral load, colposcopic impression, and age were compared between disease groups and major clusters. Hierarchical clustering yielded four major disease clusters: cluster 1 included all CIN3 histology with abnormal cytology; cluster 2 included CIN3 histology with normal cytology and combinations with either CIN2 or high-grade squamous intraepithelial lesion cytology; cluster 3 included older women with normal or low-grade histology/cytology and low viral load; and cluster 4 included younger women with low-grade histology/cytology, multiple infections, and the highest viral load. Three major groups of HPV genotypes were identified: group 1 included only HPV16; group 2 included nine carcinogenic types, plus noncarcinogenic HPV53 and HPV66; and group 3 included noncarcinogenic types, plus carcinogenic HPV33 and HPV45. Clustering results suggested that colposcopy missed a prevalent precancer in many women with no biopsy/normal histology and high-grade squamous intraepithelial lesion. This result was confirmed by an elevated 2-year risk of CIN3+ in these groups. 
Our novel approach to study multiple genotype infections in cervical disease with the use of unsupervised hierarchical clustering can address complex genotype distributions on a population level.
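
To illustrate the complete-linkage clustering mentioned in the abstract, here is a minimal Python sketch. It is not the authors' implementation. The group names, the prevalence values, the choice of Euclidean distance, and the function name `complete_linkage` are all assumptions made for illustration; the abstract states only that complete linkage was used and does not specify a distance metric. Under complete linkage, the distance between two clusters is the largest distance between any pair of their members, and the two closest clusters are merged at each step.

```python
from itertools import combinations
from math import dist

# Toy genotype prevalence per disease group.
# These numbers are made up for illustration and are NOT study data.
profiles = {
    "group_A": (0.60, 0.10, 0.05),
    "group_B": (0.55, 0.15, 0.05),
    "group_C": (0.10, 0.40, 0.30),
    "group_D": (0.12, 0.38, 0.35),
}


def complete_linkage(profiles, n_clusters):
    """Merge clusters bottom-up until n_clusters remain.

    Cluster-to-cluster distance is the maximum pairwise Euclidean
    distance between members (Euclidean distance is an assumption).
    """
    clusters = [frozenset([name]) for name in profiles]

    def cluster_distance(pair):
        a, b = pair
        return max(dist(profiles[x], profiles[y]) for x in a for y in b)

    while len(clusters) > n_clusters:
        # Pick the pair of clusters with the smallest complete-linkage distance.
        a, b = min(combinations(clusters, 2), key=cluster_distance)
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return clusters


clusters = complete_linkage(profiles, n_clusters=2)
for members in clusters:
    print(sorted(members))
```

With these toy profiles, the two groups with similar prevalence patterns end up in the same cluster. A real analysis would typically rely on an established statistics library and would also cluster the genotypes, as the abstract describes.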

Original language: English (US)
Pages (from-to): 8578-8586
Number of pages: 9
Journal: Cancer Research
Issue number: 21
State: Published - Nov 1 2010
Externally published: Yes

ASJC Scopus subject areas

  • Oncology
  • Cancer Research

