TY - JOUR
T1 - Leveraging Latent Dirichlet Allocation in processing free-text personal goals among patients undergoing bladder cancer surgery
AU - Li, Yuelin
AU - Rapkin, Bruce
AU - Atkinson, Thomas M.
AU - Schofield, Elizabeth
AU - Bochner, Bernard H.
N1 - Funding Information:
This study was funded by (1) Patient-Centered Outcomes Research Institute Grant ME-1306-00781 (PI: Rapkin); (2) National Institutes of Health Grant P30 CA008748 to Memorial Sloan Kettering Cancer Center; and (3) the Sidney Kimmel Center for Prostate and Urological Cancers at Memorial Sloan Kettering Cancer Center, Pin Down Bladder Cancer, and the Michael A. and Zena Wiener Research and Therapeutics Program in Bladder Cancer (PI: Bochner).
Publisher Copyright:
© 2019, Springer Nature Switzerland AG.
PY - 2019/6/15
Y1 - 2019/6/15
AB - Purpose: As we begin to leverage Big Data in health care settings and particularly in assessing patient-reported outcomes, there is a need for novel analytics to address unique challenges. One such challenge is in coding transcribed interview data, typically free-text entries of statements made during a face-to-face interview. Latent Dirichlet Allocation (LDA) offers statistical rigor and consistency in automating the interpretation of patients’ expressed concerns and coping strategies. Methods: LDA was applied to interview data collected as part of a prospective, longitudinal study of QOL in N = 211 patients undergoing radical cystectomy and urinary diversion for bladder cancer. LDA analyzed personal goal statements to extract the latent topics and themes, stratified by time, and on things patients wanted to accomplish and prevent. Model comparison metrics determined the number of topics to extract. Results: LDA extracted seven latent topics. Prior to surgery, patients’ priorities were primarily in cancer surgery and recovery. Six months after the surgery, they were replaced by goals on regaining a sense of normalcy, to resume work, to enjoy life more fully, and to appreciate friends and family more. LDA model parameters showed changing priorities, e.g., immediate concerns on surgery and resuming employment decreased post-surgery and were replaced by concerns over cancer recurrence and a desire to remain healthy and strong. Conclusions: Novel Big Data analytics such as LDA offer the possibility of summarizing personal goals without the need for conventional fixed-length measures and resource-intensive qualitative data coding.
KW - Big Data analysis
KW - Bladder cancer
KW - Latent Dirichlet Allocation
KW - Qualitative data
KW - Text analysis
UR - http://www.scopus.com/inward/record.url?scp=85062023040&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85062023040&partnerID=8YFLogxK
U2 - 10.1007/s11136-019-02132-w
DO - 10.1007/s11136-019-02132-w
M3 - Article
C2 - 30798421
AN - SCOPUS:85062023040
SN - 0962-9343
VL - 28
SP - 1441
EP - 1455
JO - Quality of Life Research
JF - Quality of Life Research
IS - 6
ER -