Structure Seer – a machine learning model for chemical structure elucidation from node labelling of a molecular graph
Literature Information
Joseph C. Bear
Identifying a compound's chemical structure remains one of the most crucial everyday tasks in chemistry, and among the vast range of existing analytical techniques, NMR spectroscopy is one of the most powerful tools. As a step towards structure prediction from experimental NMR spectra, this article introduces Structure Seer, a novel machine-learning (ML) model designed to provide a quantitative probabilistic prediction of the connectivity of the atoms. Its input is the elemental composition of the molecule together with a list of atom-attributed isotropic shielding constants obtained via quantum chemical methods based on a Hartree–Fock calculation. Using shielding constants instead of NMR chemical shifts helps overcome the challenges linked to the relatively limited size of datasets of reliably measured spectra. The approach also holds significant potential for scalability, as it can harness vast amounts of information on known chemical structures for the model's learning process. The model was trained on the QM9 dataset and a custom dataset derived from the PubChem database, and was evaluated comprehensively. The trained model predicted up to 100% of the bonds correctly for selected compounds from the QM9 dataset, with an average accuracy of 37.5% for predicted bonds in the test fold. The application of the model to NMR peak attribution, structure prediction and identification is discussed, along with prospective strategies for interpreting predictions, such as similarity searches and ranking of isomeric structures.
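To make the input and output described in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It represents a molecule as a list of labelled atoms (element plus isotropic shielding constant) and scores a matrix of predicted bond probabilities against a reference adjacency matrix. The names `AtomLabel`, `predicted_bonds` and `fraction_of_bonds_recovered`, the 0.5 threshold, and all numeric values are assumptions made for illustration. The scoring rule is one plausible reading of "percentage of bonds predicted" and may differ from the metric used in the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AtomLabel:
    """Node label: element symbol plus isotropic shielding constant (ppm)."""
    element: str
    shielding: float


def bond_pairs(matrix, is_bond):
    """Collect index pairs (i, j), i < j, for which is_bond(matrix[i][j]) holds."""
    n = len(matrix)
    return {(i, j) for i in range(n) for j in range(i + 1, n) if is_bond(matrix[i][j])}


def predicted_bonds(prob, threshold=0.5):
    """Bonds whose predicted probability reaches the (assumed) threshold."""
    return bond_pairs(prob, lambda p: p >= threshold)


def fraction_of_bonds_recovered(prob, adjacency, threshold=0.5):
    """Share of reference bonds that also appear among the predicted bonds."""
    reference = bond_pairs(adjacency, lambda a: a > 0)
    if not reference:
        return 1.0  # nothing to recover
    return len(reference & predicted_bonds(prob, threshold)) / len(reference)


# Heavy atoms of ethanol (C-C-O). Shielding values are made-up placeholders,
# not results of a Hartree-Fock calculation.
atoms = [AtomLabel("C", 180.0), AtomLabel("C", 140.0), AtomLabel("O", 320.0)]

# Reference connectivity: C0-C1 and C1-O2.
adjacency = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]

# Hypothetical model output: symmetric matrix of bond probabilities.
prob = [
    [0.0, 0.9, 0.2],
    [0.9, 0.0, 0.4],
    [0.2, 0.4, 0.0],
]

score = fraction_of_bonds_recovered(prob, adjacency)
print(f"bonds recovered: {score:.0%}")
```

In this toy case only the C-C bond clears the threshold, so one of the two reference bonds is recovered. Lowering the threshold trades missed bonds for spurious ones, which is why a probabilistic output lends itself to ranking candidate isomers rather than committing to a single structure.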