Structure Seer – a machine learning model for chemical structure elucidation from node labelling of a molecular graph

Literature Information

Publication Date 2023-12-20
DOI 10.1039/D3DD00178D
Authors

Joseph C. Bear



Abstract

Identifying a compound's chemical structure remains one of the most important everyday tasks in chemistry, and among the wide range of existing analytical techniques, NMR spectroscopy remains one of the most powerful. As a step towards structure prediction from experimental NMR spectra, this article introduces Structure Seer, a novel machine-learning (ML) model designed to provide a quantitative probabilistic prediction of the connectivity of the atoms. Its inputs are the elemental composition of the molecule and a list of atom-attributed isotropic shielding constants, obtained via quantum chemical methods based on a Hartree–Fock calculation. Using shielding constants instead of NMR chemical shifts helps overcome the challenges linked to the relatively limited size of datasets of reliably measured spectra. The approach also holds significant potential for scalability, as it can draw on the vast amount of information on known chemical structures for training. The model was trained on the QM9 dataset and a custom dataset derived from the PubChem database, and then comprehensively evaluated. The trained model accurately predicted up to 100% of the bonds for selected compounds from the QM9 dataset, with an average accuracy of 37.5% for predicted bonds in the test fold. The application of the model to NMR peak attribution, structure prediction and structure identification is discussed, along with prospective strategies for interpreting its predictions, such as similarity searches and ranking of isomeric structures.
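To make the input and output described in the abstract more concrete, the following minimal sketch shows one way a node labelling (element plus shielding constant per atom) and a probabilistic connectivity prediction could be represented, and how a per-molecule bond accuracy might be scored. Everything here is an assumption for illustration: the function name `bond_accuracy`, the 0.5 threshold, the binary (bond / no bond) simplification, and all numeric values are hypothetical and are not taken from the paper, whose exact metric and data format are not given in this abstract.

```python
from typing import List, Tuple

# Hypothetical node labelling for a three-atom fragment C-C-O.
# The shielding constants are made-up numbers, not values from the paper.
nodes: List[Tuple[str, float]] = [("C", 160.2), ("C", 120.5), ("O", 290.1)]

# Reference connectivity (1 = bonded) as a symmetric adjacency matrix.
true_adj = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]

# Hypothetical model output: a probability of a bond for each atom pair.
pred_probs = [
    [0.0, 0.9, 0.1],
    [0.9, 0.0, 0.3],
    [0.1, 0.3, 0.0],
]


def bond_accuracy(pred, truth, threshold=0.5):
    """Fraction of reference bonds whose predicted probability >= threshold.

    Assumed metric for illustration only; the paper may define accuracy
    differently (for example, by also scoring bond orders).
    """
    n = len(truth)
    total = 0
    recovered = 0
    # Visit each unordered atom pair once (upper triangle of the matrix).
    for i in range(n):
        for j in range(i + 1, n):
            if truth[i][j]:
                total += 1
                if pred[i][j] >= threshold:
                    recovered += 1
    return recovered / total if total else 0.0


score = bond_accuracy(pred_probs, true_adj)
print(f"{len(nodes)} atoms, bond accuracy = {score:.2f}")
```

In this toy case the C-C bond is recovered and the C-O bond is missed, so half of the reference bonds are predicted correctly. Averaging such a per-molecule score over a test fold would give a dataset-level figure comparable in spirit to the one quoted above, under the stated assumptions.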

Source Journal

Digital Discovery

