Extracting structured seed-mediated gold nanorod growth procedures from scientific text with LLMs
Literature Information
Nicholas Walker, Anubhav Jain
Although gold nanorods have been the subject of much research, the pathways for controlling their shape and thereby their optical properties remain largely heuristically understood. Although it is apparent that the simultaneous presence of and interaction between various reagents during synthesis control these properties, computational and experimental approaches for exploring the synthesis space can be either intractable or too time-consuming in practice. This motivates an alternative approach leveraging the wealth of synthesis information already embedded in the body of scientific literature by developing tools to extract relevant structured data in an automated, high-throughput manner. To that end, we present an approach using the powerful GPT-3 language model to extract structured multi-step seed-mediated growth procedures and outcomes for gold nanorods from unstructured scientific text. GPT-3 prompt completions are fine-tuned to predict synthesis templates in the form of JSON documents from unstructured text input with an overall accuracy of 86% aggregated by entities and 76% aggregated by papers. The performance is notable, considering the model is performing simultaneous entity recognition and relation extraction. We present a dataset of 11 644 entities extracted from 1137 papers, resulting in 268 papers with at least one complete seed-mediated gold nanorod growth procedure and outcome for a total of 332 complete procedures.