Machine Learning joins our fight against cancer

>>> Let’s classify

Histology is the microscopic analysis of cancer tissue. It remains the core technique for classifying many rare tumors, which lack the molecular identifiers of more common tumor types; where such identifiers are abundant, technological advances make it possible to assess tumors without visual appraisal of cellular alterations.

The problem with histology

Because histology depends on visual observation, assessments can vary between observers, leading to different classifications of the same tumor and introducing bias. Beyond this human variation, there are biological challenges: many tumors with similar histology can still progress in different ways, and conversely, tumors with different microscopic characteristics can progress in the same way.

Previous studies (1, 2), for example, have reported this inter-observer variability in the histopathological diagnosis of Central Nervous System (CNS) tumors such as diffuse gliomas (brain tumors originating in the glial cells), ependymomas (brain tumors originating in the ependyma), and supratentorial primitive neuroectodermal tumors (which occur mostly in children and start in the cerebrum). To address this problem, some molecular groupings have been incorporated into the World Health Organization (WHO) classification, but only for selected tumors such as medulloblastoma.

This diagnostic variation and uncertainty pose a challenge to decision-making in clinical practice and can have a major effect on a cancer patient’s survival. Capper and colleagues therefore decided to train a machine learning algorithm not on complex visual assessments, but on the most studied epigenetic event in cancer: DNA methylation.

Histology vs DNA methylation

Epigenetic modifications do not change the DNA sequence that encodes how our cells function, but they do alter gene expression and the fate of the cell. In DNA methylation, a chemical group called a methyl group is bound to the DNA, and methylation patterns differ between specific cancers, which opens the door to innovative diagnostics for classifying them. Compared with histology, epigenome-wide analysis of DNA methylation allows for an unbiased diagnostic approach, so Capper et al. (2018) fed their cancer-classifying computer genome-wide methylation data from samples of almost all CNS tumor types recognized in the WHO classification.

Machine Learning + DNA methylation

Capper et al. (2018) used the Random Forest (RF) machine learning algorithm, which combines many weak classifiers (decision trees) to improve prediction accuracy. Through supervised learning, the algorithm was trained to recognize methylation patterns in samples that had already been classified histologically, and to assign new samples to naturally occurring methylation-based tumor classes. Capper and his colleagues then used the trained classifier on 1,104 test cases that had been diagnosed by pathologists using standard histological and molecular methods. An overview of their findings is shown in the figure below:

[Figure: overview of the random forest classification results]
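To make the random-forest idea concrete, here is a minimal sketch, not Capper et al.’s actual pipeline, that trains scikit-learn’s RandomForestClassifier on a hypothetical matrix of methylation beta-values (samples × CpG probes) with placeholder class labels:

```python
# Minimal sketch of random-forest classification on methylation-like data.
# The data here are random placeholders, not real tumor methylation profiles.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_samples, n_probes = 300, 1000                         # stand-ins for cohort/array sizes
X = rng.uniform(0.0, 1.0, size=(n_samples, n_probes))   # methylation beta-values lie in [0, 1]
y = rng.integers(0, 3, size=n_samples)                  # three placeholder tumor classes

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# A random forest grows many decision trees on bootstrapped samples and random
# probe subsets, then lets them vote; averaging the "weak" trees is what boosts accuracy.
clf = RandomForestClassifier(n_estimators=500, random_state=0)
clf.fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))
```

In the real study, the features would come from genome-wide methylation arrays and the labels from the reference histological and molecular classification.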

In 12.6% of the cases, the computer’s and the pathologist’s diagnoses did not match. After further laboratory testing using gene sequencing, which reveals changes at the level of the DNA sequence, 92.8% of these mismatched tumors were found to match the computer’s assessment rather than the pathologist’s. Furthermore, 71% of these were computationally assigned a different tumor grade, which affects how treatment is delivered.

The Future

Despite this machine learning innovation, histology today remains the indispensable method for accessible, universal tumor classification. However, the approach developed by Capper et al. (2018) complements and, in some cases such as rare tumor classification, outperforms microscopic examination. As the platform develops further in working laboratories, the future of cancer classification may become highly accurate and unbiased through the combination of visual inspection and molecular analysis.

 

You with Alzheimer’s 6 years from now?

>>> tic, toc, time’s up… \n

Alzheimer’s is the most common type of dementia, a group of brain disorders that result in the loss of brain function. To put some statistics on the problem: 1 in 3 UK citizens will develop dementia during their lifetime, with a 62% chance that it will be Alzheimer’s, and Alzheimer’s is the 6th leading cause of death in the USA.

The problem is that Alzheimer’s is a multi-factorial disease: many factors influence its development, e.g. reactive oxygen species, plaque aggregation, and protein malfunction. These are just the tip of the iceberg, because at the heart of the processes leading to Alzheimer’s lies a dysregulation (dyshomeostasis) of key biological transition metals such as Cu2+ and Zn2+, which are vital for maintaining normal brain function and preventing dementia. These factors help explain why there is no cure, so we are racing against the clock to diagnose the disease as early as possible and slow its progress.

[Figure: Alzheimer’s brain (left) versus normal brain (right). Source.]

Radiologists use Positron Emission Tomography (PET) scans to try to detect Alzheimer’s. PET monitors molecular events as the disease evolves by detecting positron emission from radioactive isotopes such as 18F. This isotope is attached to a glucose analogue (18F-FDG); because glucose is the primary energy source for brain cells, this allows their activity to be visualized. As brain cells become diseased, their glucose uptake decreases compared with normal brain cells. To aid in the war against time, Dr. Jae Ho Sohn combined machine learning with neuroimaging in the following article.

“One of the difficulties of Alzheimer’s disease is that by the time all the clinical symptoms manifest and we can make a definitive diagnosis, too many neurons have died, making it essentially irreversible.”

Jae Ho Sohn, MD, MS

 

Debriefing the Article “A Deep Learning Model to Predict a Diagnosis of Alzheimer’s Disease by Using 18F-FDG PET of the Brain” by Sohn et al.

Objective. To develop a deep learning algorithm that forecasts a final diagnosis of Alzheimer’s disease (AD), mild cognitive impairment (MCI), or neither (non-AD/MCI) in patients undergoing 18F-FDG PET brain imaging, and to compare its results with those of conventional radiologic readers.

Reasoning. Humans struggle to detect slow, global changes across images, while deep learning may help address the complexity of imaging data: it has already been applied to detecting breast cancer in mammography, pulmonary nodules in CT, and hip osteoarthritis in radiography.

Methodology. Sohn et al. trained a convolutional neural network with the Inception V3 architecture on 90% (1,921 imaging studies, 899 patients) of the total imaging studies from patients with AD, MCI, or neither enrolled in the Alzheimer’s Disease Neuroimaging Initiative (ADNI). The trained algorithm was then tested on the remaining 10% (188 imaging studies, 103 patients) of the ADNI images (the ADNI test set) and on an independent set from 40 patients not in ADNI. To further assess the proficiency of the method, the algorithm’s results were compared with those of radiology readers; a sketch of this kind of setup follows.
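To give a feel for what training such a network involves, here is a minimal sketch, assuming an ImageNet-pretrained Inception V3 backbone in Keras with a new three-class head. The input shape, preprocessing, and dataset objects are illustrative assumptions, not the authors’ exact pipeline:

```python
# Minimal sketch: reusing an Inception V3 backbone for a 3-class task
# (AD / MCI / non-AD-MCI). Inputs and training settings are illustrative only.
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 3  # AD, MCI, non-AD/MCI

# ImageNet-pretrained Inception V3 without its original classification head.
backbone = tf.keras.applications.InceptionV3(
    include_top=False, weights="imagenet", input_shape=(299, 299, 3)
)
backbone.trainable = False  # first train only the new head

model = models.Sequential([
    backbone,
    layers.GlobalAveragePooling2D(),
    layers.Dropout(0.4),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(1e-4),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# train_ds / val_ds would be tf.data.Dataset objects of (image, label) pairs built
# from preprocessed 18F-FDG PET slices -- hypothetical placeholders here.
# model.fit(train_ds, validation_data=val_ds, epochs=30)
```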

Results. The algorithm correctly predicted the final diagnosis for patients diagnosed with AD (92% in the ADNI test set and 98% in the independent test set), with MCI (63% and 52%, respectively), and with non-AD/MCI (73% and 84%, respectively). It outperformed three radiology readers in ROC space when forecasting the final AD diagnosis.

Limitations. The independent test set was small (n=40), was not drawn from a clinical trial, and excluded patients with non-AD neurodegenerative diseases and with disorders, such as stroke, that can affect memory function. The algorithm was trained solely on ADNI data and is therefore limited by the ADNI patient population, which also did not include patients with non-AD neurodegenerative diseases. The algorithm makes its predictions in ways distinct from human expert approaches, and the MCI and non-AD/MCI classifications were unstable compared with the AD diagnosis, with accuracy that depends on the follow-up time.

Conclusion. Using 18F-FDG PET images, the trained deep learning algorithm achieved 82% specificity at 100% sensitivity in predicting AD, an average of 75.8 months (~6 years) before the final diagnosis. It has the potential to flag Alzheimer’s roughly 6 years in advance in the clinic, but further validation and analysis are needed given the limitations above.
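For readers unfamiliar with these metrics, the following toy sketch (made-up labels and scores, not the paper’s data) shows how sensitivity, specificity, and an ROC curve are computed for a binary AD-versus-not task:

```python
# Toy illustration of sensitivity, specificity, and ROC for a binary AD classifier.
# The labels and scores below are invented for demonstration only.
import numpy as np
from sklearn.metrics import confusion_matrix, roc_curve, roc_auc_score

y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])    # 1 = final AD diagnosis
y_score = np.array([0.9, 0.8, 0.75, 0.6, 0.55, 0.4, 0.3, 0.2, 0.15, 0.05])  # model probabilities

# At a chosen threshold, sensitivity and specificity come from the confusion matrix.
y_pred = (y_score >= 0.5).astype(int)
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)   # fraction of true AD cases caught
specificity = tn / (tn + fp)   # fraction of non-AD cases correctly ruled out
print(f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}")

# The ROC curve sweeps the threshold; "outperforming readers in ROC space" means the
# model's (false-positive rate, sensitivity) points sit above those of the readers.
fpr, tpr, thresholds = roc_curve(y_true, y_score)
print("AUC:", roc_auc_score(y_true, y_score))
```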