Using natural language processing to determine predictors of healthy diet and physical activity behavior change in ovarian cancer survivors

Authors: Crane TE, Culnan J, Sharp R, Wright SJ, Franks G, Klimowski C, Merchant N, Bethard SJ

Category: Behavioral Science & Health Communication
Conference Year: 2020

Abstract Body:
Purpose of the study: To explore the use of speech technology and natural language processing in evaluating language and vocalics as predictors of behavior change in ovarian cancer survivors participating in a lifestyle intervention.
Methods: Recorded telephone coaching sessions from women participating in the Lifestyle Intervention for Ovarian cancer Enhanced Survival (LIVES) study were used for this analysis. LIVES is testing whether women randomly assigned to a lifestyle intervention promoting increased physical activity and a diet high in vegetables, fruit, and fiber and low in fat will have longer progression-free survival than women assigned to an attention control. Motivational interviewing, a directive, patient-centered approach, is used to elicit behavior change. A 10% random sample of call recordings was scored for protocol fidelity. Three automatic speech recognition programs were tested: Google Cloud Speech-to-Text, Amazon Transcribe, and IBM Watson Speech to Text. The text transcriptions were analyzed by a natural language processing expert for how well they retained the information necessary for evaluating fidelity of the motivational interview to the LIVES protocol. Using the openSMILE acoustic feature extraction library, the audio was analyzed by a speech technology expert for how well different aspects of the speech signal (e.g., pitch, spectral energy) reflect low vs. high participant achievement.
Results: The three automatic speech recognition programs correctly transcribed between 72% and 76% of the text, with Google Cloud Speech-to-Text performing best. Additionally, the text was correctly attributed to the speaker 68% of the time (a diarization error rate, DER, of 32%). Analysis of the transcriptions suggests that the Google output recovers the critical words and phrases for five of six different measures of fidelity to the LIVES protocol. Analysis of the audio suggests that high-achieving participants show more variable pitch than low-achieving participants.
Conclusions: Next steps will include analysis of the more than 33,000 recorded hours of LIVES calls for language and sentiment in relation to diet and physical activity behavior change. Speech technology and natural language processing hold considerable potential for identifying characteristics of language used in coaching calls.
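
Illustrative note: the abstract does not state how the 72 to 76% transcription accuracy was computed. One common way to score a speech recognition transcript against a human reference is word-level edit distance, and the sketch below assumes that approach. The function names `word_errors` and `word_accuracy` and the sample sentences are hypothetical and are not taken from the LIVES study.

```python
def word_errors(reference: str, hypothesis: str) -> tuple[int, int]:
    """Return (edit_distance, reference_length) at the word level.

    Edit distance counts substitutions, deletions, and insertions
    needed to turn the hypothesis into the reference.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()

    # prev[j] holds the distance between the first i-1 reference words
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1], len(ref)


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Word accuracy, defined here as 1 minus the word error rate."""
    errors, n_ref = word_errors(reference, hypothesis)
    if n_ref == 0:
        raise ValueError("reference transcript is empty")
    return 1.0 - errors / n_ref


if __name__ == "__main__":
    # Hypothetical reference (human) and hypothesis (ASR) transcripts.
    reference = "i ate five servings of vegetables today"
    hypothesis = "i ate five servings of vegetable today"
    print(f"word accuracy: {word_accuracy(reference, hypothesis):.3f}")
```

Under this definition, a transcript with many inserted words can score below zero, so reported accuracies are usually read alongside the raw error counts.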

Keywords: speech technology, natural language processing, ovarian cancer, sentiment