Data Mining and Computationally Intensive Methods: Summary of Group 7 Contributions to Genetic Analysis Workshop 13

Tracy J. Costello; Catherine T. Falk; Kenny Q. Ye

doi:10.1002/gepi.10285

Research output: Contribution to journal › Article › peer-review

6 Scopus citations

Abstract

The Framingham Heart Study data, as well as a related simulated data set, were generously provided to the participants of the Genetic Analysis Workshop 13 in order that newly developed and emerging statistical methodologies could be tested on that well-characterized data set. The impetus driving the development of novel methods is to elucidate the contributions of genes, environment, and interactions between and among them, as well as to allow comparison between and validation of methods. The seven papers that comprise this group used data-mining methodologies (tree-based methods, neural networks, discriminant analysis, and Bayesian variable selection) in an attempt to identify the underlying genetics of cardiovascular disease and related traits in the presence of environmental and genetic covariates. Data-mining strategies are gaining popularity because they are extremely flexible and may have greater efficiency and potential in identifying the factors involved in complex disorders. While the methods grouped together here constitute a diverse collection, some papers asked similar questions with very different methods, while others used the same underlying methodology to ask very different questions. This paper briefly describes the data-mining methodologies applied to the Genetic Analysis Workshop 13 data sets and the results of those investigations.

Original language	English (US)
Pages (from-to)	S57-S63
Journal	Genetic Epidemiology
Volume	25
Issue number	SUPPL. 1
DOIs	https://doi.org/10.1002/gepi.10285
State	Published - 2003
Externally published	Yes

Keywords

Association test
Cardiovascular disease
Discriminant analysis
Framingham Heart Study
Genetic linkage
Glucose levels
Hypertension
Neural networks
Random forests
Stochastic search variable selection
Systolic blood pressure
Tree-based methods

ASJC Scopus subject areas

Epidemiology
Genetics (clinical)

Cite this

@article{3bcdab3492334cbeb9f6f1ea72ec4ba0,
title = "Data Mining and Computationally Intensive Methods: Summary of Group 7 Contributions to Genetic Analysis Workshop 13",
abstract = "The Framingham Heart Study data, as well as a related simulated data set, were generously provided to the participants of the Genetic Analysis Workshop 13 in order that newly developed and emerging statistical methodologies could be tested on that well-characterized data set. The impetus driving the development of novel methods is to elucidate the contributions of genes, environment, and interactions between and among them, as well as to allow comparison between and validation of methods. The seven papers that comprise this group used data-mining methodologies (tree-based methods, neural networks, discriminant analysis, and Bayesian variable selection) in an attempt to identify the underlying genetics of cardiovascular disease and related traits in the presence of environmental and genetic covariates. Data-mining strategies are gaining popularity because they are extremely flexible and may have greater efficiency and potential in identifying the factors involved in complex disorders. While the methods grouped together here constitute a diverse collection, some papers asked similar questions with very different methods, while others used the same underlying methodology to ask very different questions. This paper briefly describes the data-mining methodologies applied to the Genetic Analysis Workshop 13 data sets and the results of those investigations.",

keywords = "Association test, Cardiovascular disease, Discriminant analysis, Framingham Heart Study, Genetic linkage, Glucose levels, Hypertension, Neural networks, Random forests, Stochastic search variable selection, Systolic blood pressure, Tree-based methods",

author = "Costello, {Tracy J.} and Falk, {Catherine T.} and Ye, {Kenny Q.}",

year = "2003",

doi = "10.1002/gepi.10285",

language = "English (US)",

volume = "25",

pages = "S57--S63",

journal = "Genetic Epidemiology",

issn = "0741-0395",

publisher = "Wiley-Liss Inc.",

number = "SUPPL. 1",

}

TY - JOUR
T1 - Data Mining and Computationally Intensive Methods
T2 - Summary of Group 7 Contributions to Genetic Analysis Workshop 13
AU - Costello, Tracy J.
AU - Falk, Catherine T.
AU - Ye, Kenny Q.
PY - 2003
Y1 - 2003
AB - The Framingham Heart Study data, as well as a related simulated data set, were generously provided to the participants of the Genetic Analysis Workshop 13 in order that newly developed and emerging statistical methodologies could be tested on that well-characterized data set. The impetus driving the development of novel methods is to elucidate the contributions of genes, environment, and interactions between and among them, as well as to allow comparison between and validation of methods. The seven papers that comprise this group used data-mining methodologies (tree-based methods, neural networks, discriminant analysis, and Bayesian variable selection) in an attempt to identify the underlying genetics of cardiovascular disease and related traits in the presence of environmental and genetic covariates. Data-mining strategies are gaining popularity because they are extremely flexible and may have greater efficiency and potential in identifying the factors involved in complex disorders. While the methods grouped together here constitute a diverse collection, some papers asked similar questions with very different methods, while others used the same underlying methodology to ask very different questions. This paper briefly describes the data-mining methodologies applied to the Genetic Analysis Workshop 13 data sets and the results of those investigations.
KW - Association test
KW - Cardiovascular disease
KW - Discriminant analysis
KW - Framingham Heart Study
KW - Genetic linkage
KW - Glucose levels
KW - Hypertension
KW - Neural networks
KW - Random forests
KW - Stochastic search variable selection
KW - Systolic blood pressure
KW - Tree-based methods
UR - http://www.scopus.com/inward/record.url?scp=0344981437&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0344981437&partnerID=8YFLogxK
U2 - 10.1002/gepi.10285
DO - 10.1002/gepi.10285
M3 - Article
C2 - 14635170
AN - SCOPUS:0344981437
SN - 0741-0395
VL - 25
SP - S57
EP - S63
JO - Genetic Epidemiology
JF - Genetic Epidemiology
IS - SUPPL. 1
ER -
