Structure Seer – a machine learning model for chemical structure elucidation from node labelling of a molecular graph
Literature Information
Joseph C. Bear
The identification of a compound's chemical structure remains one of the most crucial everyday tasks in chemistry. Among the vast range of existing analytical techniques, NMR spectroscopy remains one of the most powerful tools. As a step towards structure prediction from experimental NMR spectra, this article introduces a novel machine-learning (ML) model, Structure Seer, designed to provide a quantitative probabilistic prediction of the connectivity of the atoms. The model takes as input the elemental composition of the molecule along with a list of atom-attributed isotropic shielding constants, obtained via quantum chemical methods based on a Hartree–Fock calculation. Using shielding constants instead of NMR chemical shifts helps overcome the challenges posed by the relatively limited size of datasets of reliably measured spectra. The approach also holds significant potential for scalability, as it can draw on the vast amount of information available on known chemical structures for the model's learning process. The model was trained on QM9 and on a custom dataset derived from the PubChem database, and then comprehensively evaluated. The trained model was shown to predict up to 100% of the bonds correctly for selected compounds from the QM9 dataset, with an average accuracy of 37.5% for predicted bonds in the test fold. The application of the model to NMR peak attribution, structure prediction and identification is discussed, along with prospective strategies for interpreting predictions, such as similarity searches and ranking of isomeric structures.
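To make the idea of a per-bond accuracy concrete, the following is a minimal sketch of how a probabilistic connectivity prediction could be scored against a known structure. It is not the authors' code or their exact metric. The function name `bond_accuracy`, the 0.5 threshold, and the choice to count the fraction of true bonds recovered are all assumptions made for illustration, and the toy matrices are invented values, not results from the article.

```python
import numpy as np


def bond_accuracy(pred_probs, true_adj, threshold=0.5):
    """Fraction of true bonds recovered by a predicted adjacency matrix.

    Hypothetical metric for illustration only: the article's exact
    definition of bond accuracy may differ.
    """
    pred_probs = np.asarray(pred_probs, dtype=float)
    true_adj = np.asarray(true_adj, dtype=int)

    # A molecular graph is undirected, so the adjacency matrix is symmetric
    # and each atom pair only needs to be counted once (upper triangle).
    upper = np.triu_indices(true_adj.shape[0], k=1)
    pred_bonds = pred_probs[upper] >= threshold
    true_bonds = true_adj[upper] == 1

    n_true = int(true_bonds.sum())
    if n_true == 0:
        # No bonds to recover: perfect only if none were predicted.
        return 0.0 if pred_bonds.any() else 1.0
    return float((pred_bonds & true_bonds).sum() / n_true)


# Toy three-atom chain (atoms 0-1 and 1-2 bonded); values are made up.
true_adj = [[0, 1, 0],
            [1, 0, 1],
            [0, 1, 0]]
pred_probs = [[0.0, 0.9, 0.2],
              [0.9, 0.0, 0.4],
              [0.2, 0.4, 0.0]]

print(bond_accuracy(pred_probs, true_adj))
```

In this toy case only the 0-1 bond clears the threshold, so one of the two true bonds is recovered. Averaging such a score over every molecule in a test fold would give a dataset-level figure of the kind quoted above, under the stated assumptions.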