Prediction of age of onset of SCA3 and DRPLA by survival analysis using machine learning

Niigata, Japan - Using machine learning, the Department of Neurology at Niigata University has developed a model that predicts, from a carrier's current age and number of CAG repeats, the probability of remaining asymptomatic at each age in carriers of spinocerebellar degeneration.

Polyglutamine diseases such as dentatorubral-pallidoluysian atrophy (DRPLA) and spinocerebellar ataxia type 3 (SCA3) are caused by an expansion of CAG repeats in the causative gene, and the number of CAG repeats is known to be inversely related to age of onset. Parametric survival analysis has traditionally been used to predict age of onset, but a more accurate prediction method has been sought.

The team used two machine-learning survival analysis methods, Random Survival Forest and DeepSurv, to predict age of onset and compared their accuracy with that of six parametric survival analyses. Both machine-learning methods showed higher prediction accuracy than the parametric analyses. Random Survival Forest had the highest prediction accuracy and was used for the final prediction.

"This study is important for genetic counselling for career and life planning. In the future, we will continue the analysis with more cases at several centres, aiming for more accurate prediction of the probability of onset of the disease," explain Dr. Hatano and Dr. Ishihara.
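To illustrate what predicting the probability of remaining asymptomatic from the current age means, the minimal Python sketch below conditions a survival curve on the carrier being asymptomatic today: P(asymptomatic at a future age | asymptomatic now) = S(future age) / S(current age). It is not the study's model. In place of the Random Survival Forest it uses a simple Weibull curve, a common parametric choice that is not necessarily one of the six used in the study. The function names and every parameter value (shape, base_scale, ref_repeats, slope) are hypothetical and chosen only so that more CAG repeats imply earlier onset.

```python
import math


def weibull_survival(age, cag_repeats, shape=4.0, base_scale=60.0,
                     ref_repeats=65, slope=0.03):
    """Illustrative S(age): probability of being asymptomatic at `age`.

    All parameter values are made up for demonstration and are NOT
    estimates from the study. The scale shrinks as the repeat count
    grows, mimicking the inverse relation between repeats and onset age.
    """
    scale = base_scale * math.exp(-slope * (cag_repeats - ref_repeats))
    return math.exp(-((age / scale) ** shape))


def conditional_asymptomatic_probability(survival, current_age, future_age):
    """P(asymptomatic at future_age | asymptomatic at current_age)."""
    if future_age < current_age:
        raise ValueError("future_age must be >= current_age")
    s_now = survival(current_age)
    if s_now <= 0.0:
        return 0.0  # curve has already reached zero at the current age
    return survival(future_age) / s_now


def curve_for_carrier(cag_repeats, current_age, max_age=90, step=5):
    """Return (age, probability) pairs from the current age up to max_age."""
    def survival(age):
        return weibull_survival(age, cag_repeats)

    return [
        (age, conditional_asymptomatic_probability(survival, current_age, age))
        for age in range(current_age, max_age + 1, step)
    ]


if __name__ == "__main__":
    # Hypothetical carrier: 70 repeats, asymptomatic at age 30.
    for age, prob in curve_for_carrier(cag_repeats=70, current_age=30):
        print(f"age {age}: {prob:.3f}")
```

Any fitted survival model that returns a survival function for a given repeat count could be passed to conditional_asymptomatic_probability in place of the Weibull curve; the conditioning step is the same.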

Original Publication

"Machine Learning Approach for the Prediction of Age-Specific Probability of SCA3 and DRPLA by Survival Curve Analysis"

Hatano Y, Ishihara T, Hirokawa S, Onodera O.

Neurol Genet. 2023 May 4;9(3):e200075. doi: 10.1212/NXG.0000000000200075. eCollection 2023 Jun.
