Need a Sherlock Holmes to solve a protein's 3D structure? Ask AlphaFold

>>> Proteins, proteins everywhere…

Proteins are the employees of the cell, working to maintain its survival. Each protein's specific function is determined by its structural shape, which in turn derives from the amino acid (AA) sequence encoded in our genes. For example, antibody proteins are Y-shaped, and this resemblance to hooks allows them to latch onto pathogens (e.g. viruses and bacteria), detecting and tagging them for extermination.

To understand how these employees go from AA sequence to their energy-efficient 3D structure, the following video will be helpful. In summary, biochemists describe protein structure at four distinct levels: the primary structure, which is the AA sequence itself; the secondary structure, consisting of repeating local motifs (α-helices, β-sheets, and turns) held together by chemical bonds called hydrogen bonds; the tertiary structure, the overall shape of a single polypeptide chain (a long AA chain) produced by non-local chemical interactions; and, when a protein is made of more than one polypeptide chain, a quaternary structure.
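The four levels above can be written down as plain data. The following is a purely illustrative sketch with hypothetical values (a made-up sequence and truncated coordinates), not real protein data:

```python
# Hypothetical illustration of the four levels of protein structure.
protein = {
    # Primary: the amino acid sequence itself (one-letter codes)
    "primary": "MKTAYIAKQR",
    # Secondary: local motifs formed by hydrogen bonding,
    # as (motif, start residue, end residue)
    "secondary": [("helix", 1, 6), ("turn", 7, 8), ("sheet", 9, 10)],
    # Tertiary: overall 3D fold of one chain, e.g. (x, y, z) per residue
    "tertiary": [(0.0, 0.0, 0.0), (1.5, 0.2, -0.3)],  # truncated
    # Quaternary: assembly of more than one polypeptide chain
    "quaternary": ["chain_A", "chain_B"],
}

# One residue position per letter of the primary sequence
print(len(protein["primary"]))
```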

Elucidating the shape of a protein is an important scientific challenge because diseases such as diabetes, Alzheimer's, and cystic fibrosis arise from the misfolding of specific proteins. The protein folding problem is to find the correct protein structure amidst an astronomical number of structural possibilities. Knowledge of protein structure will allow us to combat deadly human diseases and, within biotechnology, to produce new proteins with functions such as plastic degradation.

Currently, the accurate experimental methods for determining protein shape rely on laborious, lengthy, and costly processes (Figure 1). Therefore, biologists are turning to AI to reduce these burdens and speed up scientific discoveries, with the potential of saving lives and bettering the environment.

protein_experimentals
Figure 1. Experimental Techniques to Determine Protein 3D Structure. (A) X-Ray Crystallography. Consists of shooting an X-ray beam through a protein crystal, obtained under specific chemical conditions, and using the resulting diffraction pattern to locate the electrons and decipher the protein model (image); (B) Cryo-Electron Microscopy (Cryo-EM). When biomolecules (e.g. proteins) refuse to crystallize, cryo-EM allows the visualization of small to large biomolecules and their specific functions, although at high cost (image); (C) Nuclear Magnetic Resonance (NMR). NMR allows analysis of structure and conformational changes but is limited to small, soluble proteins (image).

 

“The success of our first foray into protein folding is indicative of how machine learning systems can integrate diverse sources of information to help scientists come up with creative solutions to complex problems at speed”

These were the words of Google's DeepMind developers after AlphaFold, a project that uses machine learning to predict 3D protein structure solely from the amino acid sequence (i.e. from scratch), won the biennial global Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP) in 2018. CASP serves as a gold standard for assessing new methods of protein structure prediction, and AlphaFold showed "unprecedented progress" by accurately predicting 25 of the 43 test proteins (proteins whose 3D structures had been determined by conventional experimental means but not yet made public), compared with only 3 of 43 for the second-placed team.

Previous deep learning efforts in this area focused on secondary structure prediction using recurrent neural networks; because predicting tertiary structure from scratch is so complex, they did not attempt the tertiary and/or quaternary structures needed for the full 3D protein shape.

AlphaFold is composed of deep neural networks trained to 1) predict protein properties, namely the distances between pairs of AA residues and the angles of the chemical bonds connecting them, and 2) combine the predicted distance probabilities for every pair of residues into a score and use gradient descent, a mathematical method widely used in machine learning to make small incremental improvements, to arrive at the most accurate predicted structure (Figure 2).

deepmind
Figure 2. DeepMind AlphaFold Methodology (source).
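The second step above can be sketched in miniature. The toy example below, with entirely hypothetical one-dimensional "residue positions", uses gradient descent to adjust coordinates until their pairwise distances match a predicted distance map; the real AlphaFold works in 3D on learned distance distributions and bond/torsion angles, so this is only the shape of the idea:

```python
import numpy as np

# Hypothetical "true" 1D residue positions and the distance map a
# network might have predicted for them.
true_positions = np.array([0.0, 1.0, 2.5, 4.0])
target = np.abs(true_positions[:, None] - true_positions[None, :])

# Start from a noisy guess and descend on the squared distance error.
rng = np.random.default_rng(0)
x = true_positions + 0.1 * rng.normal(size=4)

lr = 0.01
for _ in range(5000):
    diff = x[:, None] - x[None, :]
    err = np.abs(diff) - target
    # gradient of sum((|x_i - x_j| - target_ij)^2) with respect to x
    grad = 4 * (err * np.sign(diff)).sum(axis=1)
    x -= lr * grad  # small incremental improvement, as in the text

recovered = np.abs(x[:, None] - x[None, :])
print(np.max(np.abs(recovered - target)))  # shrinks toward zero
```

Each iteration nudges every coordinate slightly in the direction that reduces the mismatch, which is exactly the "small incremental improvements" role gradient descent plays in AlphaFold's structure refinement.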

Even though much work remains before AI can precisely and accurately solve the protein folding problem and speed up solutions to some of our world's gravest problems, AlphaFold is undoubtedly a step in the right direction.

You with Alzheimer’s 6 years from now?

>>> tic, toc, time’s up…

Alzheimer’s is the most common type of dementia, a set of brain disorders that result in the loss of brain function. To throw out some statistics highlighting the problem we face: 1 in 3 UK citizens will develop dementia during their lifetime, with a 62% chance it will be Alzheimer’s, and it is the 6th leading cause of death in the USA.

The problem is that Alzheimer’s is a multi-factorial disease: many factors influence its development, e.g. reactive oxygen species, plaque aggregation, and protein malfunction. But these are just the tip of the iceberg, as at the heart of the processes leading to Alzheimer’s lies a dysregulation (dyshomeostasis) of key biological transition metals such as Cu2+ and Zn2+, which are vital to maintaining regular brain function and preventing dementia. These factors contribute to the fact that there is no cure, so we are racing against the clock to diagnose the disease as early as possible and slow its progress.

ad_brains
Alzheimer’s (left) versus normal brain (right). Source.

Radiologists use Positron Emission Tomography (PET) scans to try to detect Alzheimer’s. PET monitors molecular events as the disease evolves by detecting positron emission from radioactive isotopes such as 18F. This isotope is attached to a modified form of glucose (18F-FDG); since glucose is the primary source of energy for brain cells, the tracer allows their visualization. As brain cells become diseased, their glucose uptake decreases compared to that of normal brain cells. To aid in this war against time, Dr. Jae Ho Sohn combined machine learning with neuroimaging in the following article.

“One of the difficulties of Alzheimer’s disease is that by the time all the clinical symptoms manifest and we can make a definitive diagnosis, too many neurons have died, making it essentially irreversible.”

Jae Ho Sohn, MD, MS

 

Debriefing the Article “A Deep Learning Model to Predict a Diagnosis of Alzheimer’s Disease by Using 18F-FDG PET of the Brain” by Sohn et al.

Objective. To develop a deep learning algorithm that forecasts a diagnosis of Alzheimer’s disease (AD), mild cognitive impairment (MCI), or neither (non-AD/MCI) in patients undergoing 18F-FDG PET brain imaging, and to compare its results with those of conventional radiology readers.

Reasoning. Humans are poor at detecting slow, global changes in imaging, while deep learning has shown it can handle the complexity of imaging data: it has already been applied to detect breast cancer in mammography, pulmonary nodules in CT, and hip osteoarthritis in radiography.

Methodology. Sohn et al. trained a convolutional neural network with the Inception V3 architecture on 90% (1921 imaging studies, 899 patients) of the total imaging studies from patients enrolled in the Alzheimer’s Disease Neuroimaging Initiative (ADNI) who had AD, MCI, or neither. The trained algorithm was then tested on the remaining 10% (188 imaging studies, 103 patients) of the ADNI images (labelled the ADNI test set) and on an independent set from 40 patients not in ADNI. To further assess the proficiency of the method, the algorithm’s results were compared with those of radiology readers.

Results. The algorithm predicted with high accuracy those patients ultimately diagnosed with AD (92% in the ADNI test set and 98% in the independent test set), with MCI (63% and 52%, respectively), and with non-AD/MCI (73% and 84%, respectively). It also outperformed three radiology readers in ROC space in forecasting the final AD diagnosis.

Limitations. The independent test set was small (n = 40), was not from a clinical trial, and excluded patients with non-AD neurodegenerative conditions and disorders such as stroke that can affect memory function. The algorithm was trained solely on ADNI data and is therefore limited by the ADNI patient population, which did not include patients with non-AD neurodegenerative diseases. The algorithm also made its predictions in ways distinct from human expert approaches, and its MCI and non-AD/MCI predictions were unstable compared with the AD diagnosis, with accuracy depending on the follow-up time.

Conclusion. The deep learning algorithm trained on 18F-FDG PET images achieved 82% specificity at 100% sensitivity in predicting AD, an average of 75.8 months (~6 years) before the final diagnosis. It has the potential to diagnose Alzheimer’s 6 years in advance in the clinic, but further validation and analysis are needed given the limitations mentioned above.
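To unpack those two numbers: sensitivity is the fraction of true AD cases the model catches, and specificity is the fraction of non-AD cases it correctly rules out. The counts below are hypothetical, chosen only to reproduce the reported figures:

```python
# Hypothetical confusion-matrix counts for AD vs. non-AD.
tp, fn = 10, 0   # every true AD case is caught -> sensitivity 100%
tn, fp = 82, 18  # 82 of 100 non-AD cases correctly ruled out -> 82%

sensitivity = tp / (tp + fn)  # TP / (TP + FN)
specificity = tn / (tn + fp)  # TN / (TN + FP)

print(sensitivity, specificity)
```

A 100% sensitivity operating point means no future AD case in the test data was missed, at the cost of some false alarms among the non-AD cases.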

 

AI in healthcare: for better or for worse?

>>> Hello, World!

In this century of technological advancement, there has been much hype over the emerging field of artificial intelligence (AI), defined as intelligence exhibited by computational means rather than by the natural world, i.e. humans.

AI has gained popularity following innovative applications in fields such as the automotive, finance, military, and healthcare industries.

However, as with any emerging technology, ethical and controversial issues arise. Questions over whether artificial intelligence will “take over the world”, for example by replacing whole industry sectors with robotics, and over the uncontrolled use of AI for military purposes are current hot topics of debate.

The media, literature, and particularly the film industry, with movies such as “I, Robot” and “The Terminator”, have certainly expanded our imaginations as to the potential downsides of the field.

Adding fuel to the fire, recent comments from Tesla and SpaceX CEO Elon Musk stating that “A.I. is far more dangerous than nukes” and thus needs to be proactively regulated ignite reasonable worries over the use of AI applications.

In healthcare and medical research, however, far from robots replacing human physicians in the foreseeable future, AI devices have been helping physicians and scientists save lives and develop new medical treatments.

AI is going to lead to the full understanding of human biology and give us the means to fully address human disease.

–Thomas Chittenden, VP of Statistical Sciences at WuXi NextCODE

A shift in the use of AI in medical research occurred on 12 June 2007 with Adam, a scientific robot developed by researchers at the UK universities of Aberystwyth and Cambridge, which was able to generate hypotheses about which genes encode key enzymes that speed up (catalyse) reactions in the brewer’s yeast Saccharomyces cerevisiae, and then test them robotically. Researchers then individually tested Adam’s hypotheses about the role of 19 genes and discovered that 9 were new and accurate, while only 1 was incorrect.

Adam set the precedent for the team to develop a more advanced scientific robot called Eve, which helped identify triclosan, an ingredient found in toothpaste, as a potential anti-malarial drug against drug-resistant malaria parasites, which contribute to an estimated malaria mortality of 1.2 million annually.

Eve screened thousands of compounds against yeast strains whose essential growth genes had been replaced with equivalents from either malaria parasites or humans, looking for compounds that slowed or stopped the growth of strains dependent on the malaria genes but not of those dependent on the human genes (to avoid human toxicity). As a result, triclosan was found to halt the activity of the DHFR enzyme, which is necessary for malaria survival, even in pyrimethamine-resistant malaria strains.
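The selection logic of such a differential screen is simple to state in code. The compound names, growth values, and thresholds below are all hypothetical, chosen only to illustrate the filter Eve applies:

```python
# Hypothetical screen results: for each compound, the relative growth of
# the malaria-gene strain and of the human-gene strain (1.0 = unaffected).
screen = {
    "triclosan":  (0.05, 0.90),  # kills malaria-dependent strain only
    "compound_b": (0.80, 0.85),  # inactive against both
    "compound_c": (0.10, 0.15),  # toxic to both -> risk to humans
}

# Keep compounds that suppress the malaria-gene strain while leaving the
# human-gene strain growing (to avoid human toxicity).
hits = [name for name, (malaria, human) in screen.items()
        if malaria < 0.2 and human > 0.8]

print(hits)
```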

Eve_robot
The scientific robot “Eve” (source)

Without Eve, it is likely that the research would still be in progress at this stage and would have taken years to arrive at the published result, as usually happens in the drug discovery field.

On average, making a drug takes at least 10 years of arduous research and an estimated US $2.6 billion, with a high percentage of this money spent on drug therapies that fail. AI has the potential to reduce these time, money, and research inefficiencies.

In the clinic, AI tools can use algorithms to help physicians handle the high volume of patient data, provide up-to-date medical information, reduce therapeutic error, and use this information to provide clinical assistance and diagnoses with over 90% accuracy. The diagram below provides some insight into the structure of AI and examples of its applications in medicine, based on the detailed information published in Jiang et al.

AI_paint
Insight into AI structure and examples of medical applications

Alongside its advantages, applying AI in healthcare raises ethical issues and analytical concerns, which will be discussed in future posts.

However, far from being a robotic disaster, AI has proved valuable for the development of human medicine and health.

As Suchi Saria, a professor of computer science and director of the Machine Learning and Health Lab at Johns Hopkins University, explains in her TEDx talk, AI is already saving lives by detecting symptoms 12-24 hours before a doctor could.

AI in healthcare undoubtedly sets the precedent for a new future in medicine.