Accelerating Scientific Workflows with LLM Agents
Large Language Model (LLM) agents are transforming the way we approach the Design, Make, Test, and Scale (DMTS) steps in chemistry research. These AI-driven collaborators can extract insights from literature, model reaction conditions, and analyze data with unprecedented speed.
Example Reaction and Corresponding Prompts

Acetylsalicylic acid is synthesized via the esterification of salicylic acid with acetic anhydride, catalyzed by sulfuric or phosphoric acid under reflux. The reaction follows a nucleophilic acyl substitution mechanism, yielding aspirin and acetic acid as a byproduct. Purification is typically achieved through recrystallization from water or ethanol to remove unreacted reagents and side products.
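The stoichiometry behind this reaction can be sketched in a few lines of Python. This is a minimal illustration, not part of the described agent system: the molar masses are standard values, while the function names and the default of two equivalents of acetic anhydride are assumptions made for the example (an excess of anhydride is common practice, but the exact figure depends on the protocol).

```python
# Molar masses in g/mol (standard values, hard-coded so the sketch has no dependencies).
MOLAR_MASS = {
    "salicylic_acid": 138.12,    # C7H6O3
    "acetic_anhydride": 102.09,  # C4H6O3
    "aspirin": 180.16,           # C9H8O4
}


def moles(mass_g: float, compound: str) -> float:
    """Convert a mass in grams to an amount in moles."""
    return mass_g / MOLAR_MASS[compound]


def plan_acetylation(salicylic_acid_g: float, anhydride_equiv: float = 2.0) -> dict:
    """Size the reagents for a 1:1 acetylation of salicylic acid.

    anhydride_equiv is an assumed excess of acetic anhydride, not a value
    taken from the article.
    """
    n_sa = moles(salicylic_acid_g, "salicylic_acid")
    return {
        "salicylic_acid_mol": n_sa,
        "acetic_anhydride_g": n_sa * anhydride_equiv * MOLAR_MASS["acetic_anhydride"],
        # One mole of salicylic acid gives at most one mole of aspirin.
        "theoretical_aspirin_g": n_sa * MOLAR_MASS["aspirin"],
    }


plan = plan_acetylation(5.0)
print({key: round(value, 4) for key, value in plan.items()})
```

For 5 g of salicylic acid this gives roughly 0.036 mol of starting material, about 7.4 g of acetic anhydride at two equivalents, and a theoretical yield of about 6.5 g of aspirin.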
35 Prompts to Guide the Process (DMTS)
Some prompts are conceptual or still under evaluation. Fine-tuning the overall system (the interaction of the agents) requires further work.
# | Prompt | Answer (Example) | Involved LLM Agent | Involved Tool |
---|---|---|---|---|
Discovery Phase | ||||
1 | Supervisor, initiate aspirin discovery workflow | Discovery workflow initialized | Supervisor | Paramus Routing |
2 | Retrieve molecules similar to aspirin | [Salicylic acid, methyl salicylate, benzocaine] | Data | Molecular Registry |
3 | Generate SMILES for acetylsalicylic acid | CC(=O)Oc1ccccc1C(=O)O | Open Cheminformatics (not LLM!) | RDKit |
4 | Compute physicochemical properties for aspirin | MW=180.16, logP=1.19 | Open Cheminformatics | RDKit |
5 | Check inventory for salicylic acid and acetic anhydride | Available: Salicylic Acid (100 g), Acetic Anhydride (250 mL) | Data | Inventory System |
6 | Review known aspirin reactions | Fischer esterification, acetylation (standard synthesis) | Literature | |
7 | Summarize known risks of aspirin synthesis | Hydrolysis risk, moisture sensitivity | Literature | PubChem, ACS |
Data Analysis Phase | ||||
8 | Design a fractional factorial DOE for aspirin synthesis | Generated DOE with factors: Temperature (60-80°C), Catalyst (H₂SO₄) | DataScience | scikit-learn |
9 | Calculate moles from 5 g salicylic acid | 5 g corresponds to 0.0362 mol | Calculator | Python Interpreter |
10 | Analyze experimental results to find optimal temperature | Optimal temperature: 70°C | DataScience | Random Forest (scikit-learn) |
11 | Evaluate impurity correlation with catalyst amount | Higher catalyst amount correlates with fewer impurities (r²=0.85) | DataScience | Statistical Model |
Experimentation Phase | ||||
12 | Compute mass of acetic anhydride needed for reaction | 7.39 g required (2 equivalents relative to 5 g salicylic acid) | Open Cheminformatics, Calculator | RDKit, Python Interpreter |
13 | Check availability of acid catalyst in the lab | Catalyst (Sulfuric acid, 50 mL) available | Data | Inventory Query |
14 | Predict reaction time for 95% yield | Reaction time ~45 min at 75°C | Literature, Cheminformatics, DataScience | Simple ML Model (Regression) |
15 | Monitor reaction temperature stability (hypothetical) | Temperature stable at 75±1°C throughout reaction | Data | IoT Sensor Integration (planned for 2026: Paramus Collector) |
16 | Confirm reaction completion using TLC data | TLC indicates no salicylic acid presence; reaction complete | DataScience | Image Analysis (OpenCV) |
Optimization Phase | ||||
17 | Suggest adjustments to reduce reaction impurities | Decrease temperature to 68°C and reduce catalyst by 10% | DataScience, Cheminformatics, Literature | DOE/Statistical analysis |
18 | Calculate yield improvement from temperature adjustment | Estimated yield increase: 4.5% | Calculator | Python Interpreter |
19 | List historical experiments related to yield optimization | Experiment IDs [THGR-201, THGR-434, AMBR-1001] | Data (from ELN!) | ELN |
20 | Verify inventory for scale-up materials | Materials sufficient for 10-fold scale-up | Data | Inventory |
21 | Compute cost impact of scale-up conditions | Cost reduction: 15% per gram product | Literature, Code | Python Interpreter |
Literature Review Phase | ||||
22 | Retrieve latest pharmacokinetics studies on aspirin | 3 recent studies found, summarized in file | Literature | PubMed API |
23 | Summarize current safety guidelines for aspirin use | FDA guidelines updated 2023, dosage max 325 mg/day | Literature | |
24 | Check regulatory compliance for aspirin synthesis method | Complies with current chemical manufacturing standards | Literature | |
25 | Obtain current patent landscape for aspirin synthesis | No conflicting patents found | Literature | Google Patents |
26 | Search adverse events databases for aspirin | Common adverse events: gastrointestinal irritation | Literature | |
Computational Validation Phase | ||||
27 | Optimize aspirin molecular geometry | Optimized structure converged at B3LYP/6-31G* | Computational Chemistry | PSI4 |
28 | Compute infrared spectrum of aspirin | Major IR peaks at 1750 cm⁻¹ and 1650 cm⁻¹ | Computational Chemistry | PSI4 |
29 | Predict aspirin solubility via QSAR modeling | Predicted solubility in water: 3 mg/mL | DataScience, Cheminformatics | sklearn, SVM and ensemble models |
30 | Calculate transition states of aspirin hydrolysis | Activation energy calculated: ΔG‡ = 23.4 kcal/mol | Computational Chemistry | PSI4, OPT_TYPE set to TS |
31 | Verify predicted IR spectra against literature values | IR peaks match literature within ±5 cm⁻¹ | Literature | PubChem |
Deployment Phase | ||||
32 | Store optimized reaction protocol | Protocol saved as THGR-ASP-2025-01 | Data | ELN (also stored in the Copilot's file system) |
33 | Update inventory based on experimental consumption | Inventory updated: salicylic acid reduced by 50 g | Data | Inventory UPDATE |
34 | Generate compliance report for QA | Report generated and ready for QA review | Literature | as .rd file in local file system |
35 | Prepare summary of synthesis and computational data for internal report | Summary report ready; includes experimental yields, computational validation, and literature references | Supervisor | as .rd file in local file system |
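Prompt 8 in the table asks for a fractional factorial design of experiments. As a rough illustration of what such a design looks like, the sketch below builds a two-level half-fraction (2^(3-1)) design in plain Python rather than with the scikit-learn tooling named in the table. Only the temperature range (60-80°C) comes from the table; the catalyst and time levels, the factor names, and the function name are hypothetical.

```python
from itertools import product


def half_fraction_design(factors: dict) -> list:
    """Build a 2^(3-1) design for exactly three two-level factors.

    The first two factors form a full factorial; the coded level of the
    third is set by the generator C = A * B, which halves the run count
    from 8 to 4 at the cost of confounding C with the A x B interaction.
    """
    names = list(factors)
    if len(names) != 3:
        raise ValueError("this sketch handles exactly three two-level factors")
    runs = []
    for a, b in product((-1, 1), repeat=2):
        coded_levels = (a, b, a * b)
        runs.append({
            # Map coded level -1 to the low setting and +1 to the high setting.
            name: factors[name][0] if level == -1 else factors[name][1]
            for name, level in zip(names, coded_levels)
        })
    return runs


FACTORS = {
    "temperature_C": (60.0, 80.0),  # range taken from the table
    "catalyst_mL": (0.2, 0.5),      # hypothetical levels
    "time_min": (30.0, 60.0),       # hypothetical levels
}

design = half_fraction_design(FACTORS)
for run in design:
    print(run)
```

Each factor appears at its low and high setting in exactly two of the four runs, which is what keeps the main-effect estimates balanced.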
Although not every agent in the example is fully productive yet, each one contributes targeted expertise that drives efficiency. By combining literature analysis, computational chemistry, and data insights, researchers gain a powerful toolset to streamline the development of an aspirin synthesis. The curated list of prompts above illustrates how LLM agents can pave the way for a more robust and scalable DMTS process.
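To make the division of labour between agents more concrete, the sketch below shows one simple way a supervisor could hand prompts to specialised agents. It is purely illustrative: the keyword matching, the `Agent` class, and the `route` function are assumptions made for this example and do not describe how Paramus Routing works internally, which may well rely on an LLM to classify prompts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    """A hypothetical agent description: a name plus trigger keywords."""
    name: str
    keywords: tuple


# Agent names follow the table; the keyword lists are invented for this sketch.
AGENTS = (
    Agent("Data", ("inventory", "retrieve molecules", "historical experiments")),
    Agent("Calculator", ("moles", "yield improvement")),
    Agent("Computational Chemistry", ("geometry", "infrared", "transition state")),
    Agent("Literature", ("literature", "patent", "guidelines", "studies", "risks")),
    Agent("DataScience", ("doe", "correlation", "analyze")),
)


def route(prompt: str, agents=AGENTS, default: str = "Supervisor") -> str:
    """Return the name of the first agent whose keyword occurs in the prompt.

    Prompts that match no keyword stay with the supervisor.
    """
    text = prompt.lower()
    for agent in agents:
        if any(keyword in text for keyword in agent.keywords):
            return agent.name
    return default


for example in (
    "Check inventory for salicylic acid and acetic anhydride",
    "Optimize aspirin molecular geometry",
    "Supervisor, initiate aspirin discovery workflow",
):
    print(f"{example!r} -> {route(example)}")
```

A first-match keyword router like this is easy to audit but brittle; it is shown only to picture the prompt-to-agent mapping in the table, and prompts that involve several agents would need a richer scheme.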