ML and NMR Spectroscopy Predict Location of Atoms in Powdered Solids

Many drugs today are produced as powdered solids. To fully understand how the active ingredients will behave once inside the body, scientists need to know their exact atomic-level structure. Researchers are therefore working to develop technologies that can easily identify the exact crystal structures of microcrystalline powders.

One research team, from Ecole Polytechnique Fédérale de Lausanne (EPFL), is using machine learning (ML) to quickly predict chemical shifts of molecular solids and their polymorphs to within density functional theory (DFT) accuracy.


Combined with experiments, the machine-learning model can determine, in record time, the location of atoms in powdered solids. Courtesy of Michele Ceriotti/EPFL.

The researchers used their ML program with nuclear magnetic resonance (NMR) spectroscopy to determine the exact location of atoms in complex organic compounds. Determining a full crystal structure from NMR data typically requires complicated, time-consuming quantum chemistry calculations.

The team demonstrated that the ML model was able to determine the structures of cocaine and the drug 4-[4-(2-adamantylcarbamoyl)-5-tert-butylpyrazol-1-yl]benzoic acid, based on the match between experimentally measured and ML-predicted shifts.
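The matching step can be pictured as a ranking problem: each candidate crystal structure has a set of predicted chemical shifts, and the candidate whose shifts sit closest to the measured ones is the most plausible. The sketch below is only an illustration of that idea, not the researchers' code. It assumes that each predicted shift has already been assigned to its measured counterpart, uses root-mean-square error as the measure of agreement, and runs on made-up numbers; the function names and candidate labels are hypothetical.

```python
import math


def rmse(predicted, measured):
    """Root-mean-square error between two equal-length lists of shifts (ppm)."""
    if len(predicted) != len(measured):
        raise ValueError("shift lists must have the same length")
    squared = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared) / len(squared))


def rank_candidates(candidates, measured):
    """Return (name, rmse) pairs sorted from best match to worst."""
    scored = [(name, rmse(shifts, measured)) for name, shifts in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Made-up illustrative values, NOT data from the study.
measured_shifts = [1.2, 3.4, 7.1, 7.9]
candidate_shifts = {
    "candidate_A": [1.3, 3.3, 7.0, 8.0],
    "candidate_B": [2.0, 4.1, 6.2, 8.8],
    "candidate_C": [1.0, 3.9, 7.5, 7.4],
}

ranking = rank_candidates(candidate_shifts, measured_shifts)
best_name, best_error = ranking[0]
print(best_name, round(best_error, 3))
```

In practice the expensive part is producing the predicted shifts for every candidate, which is where a fast ML predictor would replace the quantum chemistry calculation; the comparison itself is cheap.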

According to the researchers, their approach allows the calculation of chemical shifts for structures with ~100 atoms in less than 1 min, reducing the computational cost of chemical shift predictions in solids and relieving bottlenecks in the use of calculated chemical shifts for structure determination in solids.

“Even for relatively simple molecules, this model is almost 10,000 times faster than existing methods, and the advantage grows tremendously when considering more complex compounds,” said professor Michele Ceriotti. “To predict the NMR signature of a crystal with nearly 1600 atoms, our technique — ShiftML — requires about six minutes. The same feat would have taken 16 years with conventional techniques.”
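As a rough check on how the advantage grows with system size, the two figures quoted for the crystal with nearly 1600 atoms can be turned into a speedup factor. The snippet below is a back-of-the-envelope sketch that assumes an average year of 365.25 days and takes "16 years" and "about six minutes" at face value; because both figures are rounded, only the order of magnitude is meaningful.

```python
# Assumption: an average year of 365.25 days.
MINUTES_PER_YEAR = 365.25 * 24 * 60

conventional_minutes = 16 * MINUTES_PER_YEAR  # "16 years" from the quote
shiftml_minutes = 6                           # "about six minutes" from the quote

speedup = conventional_minutes / shiftml_minutes
print(f"{speedup:,.0f}")  # prints 1,402,560
```

That is on the order of a million times faster for the large crystal, compared with the factor of almost 10,000 quoted for relatively simple molecules.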

The new program, which is freely available online, could benefit pharmaceutical companies, which must monitor the structure of the molecules that are active ingredients in drugs to meet requirements for patient safety.

“The massive acceleration in computation times will allow us to cover much larger conformational spaces and correctly determine structures where it was just not previously possible,” said professor Lyndon Emsley. “This puts most of the complex contemporary drug molecules within reach.”

A web version based on the protocol described here is publicly available. “Anyone can upload a molecule and get its NMR signature in just a few minutes,” Ceriotti said.

The research was published in Nature Communications.

Published: October 2018
