Chemical Space Coverage in LC-HRMS: Selectivity and Bias

Generated from prompt:

Create a presentation about my thesis

MSc Chemistry thesis investigating chemical space coverage of 236 LC-HRMS methods from RepoRT repository. Analyzes selectivity-measurability bias in NTA for exposome, using PCA/clustering to reveal RPLC dominance, limited polar/hydrophobic coverage,

April 18, 2026 | 32 slides

Slide 1 - Thesis Presentation

Investigating the Chemical Space Coverage of Liquid Chromatographic Techniques

MSc Chemistry Thesis - Analytical Sciences | Dec 2025


Slide 2 - Thesis Presentation: Chemical Space Coverage in LC-HRMS

Thesis Presentation: Chemical Space Coverage in LC-HRMS

Investigating the chemical space coverage of liquid chromatographic techniques: how selectivity drives measurability


Slide 3 - Literature Thesis Presentation

Literature Thesis Presentation

Investigating the chemical space coverage of liquid chromatographic techniques: how selectivity drives measurability


Slide 4 - Agenda

  • Introduction to Chemical Space and NTA
  • Methodology: RepoRT Repository
  • Analysis of Chemical Coverage
  • Trends in Reported Compounds
  • Conclusions & Future Perspectives


Slide 5 - Presentation Agenda

  • Introduction and Research Goal
  • Methodology and Data Acquisition
  • Analysis of Chemical Space Coverage
  • Key Trends and Findings
  • Conclusion and Future Perspectives


Slide 6 - Agenda

  • Introduction: Chemical Space and NTA
  • Methodology: RepoRT Analysis
  • Analysis of Chemical Coverage
  • Key Findings and Trends
  • Conclusions and Future Perspectives


Slide 7 - Section 1: Introduction

1

Introduction

Understanding the chemical space of the exposome and the role of LC-HRMS in non-targeted analysis (NTA).


Slide 8 - Introduction

1

Introduction & Research Goal

Understanding the Exposome and Selectivity-Measurability Bias


Slide 9 - Introduction: The Challenge of Exposome Analysis

  • The exposome encompasses the totality of chemical exposures over a lifetime.
  • Non-targeted analysis (NTA) with LC-HRMS is essential for identifying thousands of unknown compounds.
  • Selectivity-measurability bias: analytical methods detect only the compounds that interact with the system.
  • Objective: Investigate how different LC conditions capture various regions of chemical space using the RepoRT repository.

Slide 10 - Research Motivation & Goal

  • The exposome encompasses the totality of lifetime exposures.
  • Characterization requires analytical methods detecting thousands of diverse chemical entities.
  • Non-targeted analysis (NTA) with LC-HRMS is essential but constrained by methodological selectivity.
  • Selectivity-measurability bias: Only chemicals interacting with the system are measured, potentially missing classes of compounds.
  • Research Goal: Evaluate the chemical space coverage of 236 curated LC methods using the RepoRT repository.

Slide 11 - Key Concepts

  • The exposome consists of all chemical/non-chemical exposures over a lifetime.
  • Chemical space refers to all plausible organic structures relevant to human/environmental health.
  • Non-Targeted Analysis (NTA) + LC-HRMS is essential for identifying unknown exposures.
  • Selectivity-measurability bias: Only chemicals interacting with the analytical system are measured.
  • Small parameter changes in LC workflows can lead to significant differences in detected compounds.

Slide 12 - Methodology: Analyzing RepoRT Repository

Step | Action
Data Acquisition | Downloaded RepoRT (236 LC methods).
Metadata Processing | Extracted setup variables (USP codes, column, flow, etc.).
Chemical Descriptors | Retrieved mass, log-P, TPSA, and H-bond data from PubChem.
Statistical Analysis | PCA and K-means clustering to identify patterns and clusters.

Slide 13 - Methodology

2

Methodology

Data acquisition and RepoRT database setup


Slide 14 - Section 2: Methodology

2

Methodology

Data acquisition from the RepoRT repository and experimental setup.


Slide 15 - RepoRT Data Overview

  • 236: LC Methods
  • 17,083: Unique Compounds
  • 89%: RPLC Dominance
  • 11%: HILIC Representation

Slide 16 - Data Acquisition & Processing

  • RepoRT repository: 236 curated LC setups processed using Julia.
  • Metadata: Filtered for specific column parameters (name, USP code, length, etc.).
  • Compound data: 75,797 compounds retrieved with chemical descriptors (TPSA, Log-P, Mass).
  • USP classification: 89% RPLC (primarily L1), 11% HILIC (primarily L122).
  • Normalization: Retention times normalized on a 0-1 scale to allow comparison across different methods.
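
The normalization step can be sketched as follows. The slide only states that retention times were put on a 0-1 scale; this sketch assumes per-method min-max scaling, which is one common choice (dividing by the total run time would be another). The function name and the sample values are hypothetical, and Python is used for illustration only, since the thesis work itself is reported as done in Julia.

```python
def normalize_retention_times(rts):
    """Min-max scale one method's retention times to the range [0, 1].

    Assumption: scaling is done per method, using that method's own
    earliest and latest retention time.
    """
    lo, hi = min(rts), max(rts)
    if hi == lo:
        # Degenerate case: every compound has the same retention time.
        return [0.0 for _ in rts]
    return [(rt - lo) / (hi - lo) for rt in rts]


# Made-up retention times (minutes) for a single hypothetical method.
method_rts = [1.2, 3.4, 7.8, 12.0]
scaled = normalize_retention_times(method_rts)
print(scaled)
```

After scaling, the earliest-eluting compound sits at 0 and the latest at 1, so elution order can be compared across methods with different run lengths.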

Slide 17 - Methodology Highlights

  • Utilized 236 curated LC methods from the RepoRT repository.
  • Instrumental setup metadata filtered for consistency (column name, length, temperature, flow rate, etc.).
  • Data processed with Julia 1.11.4; descriptors retrieved via PubChemCrawler.jl.
  • Analytes characterized by exact mass, predicted log-P, H-bond properties, and TPSA.

Slide 18 - Key Findings: Chemical Coverage Constraints

  • Chemical space coverage is heavily biased towards RPLC conditions.
  • Measured compounds span a wide range overall (TPSA 0-852 Ų, log-P -11 to 26), but most fall within a much narrower region.
  • Most frequently measured region: TPSA 0-200, log-P -4 to 9 (~95% of compounds).
  • Orthogonal selectivity is missing, leaving highly polar and extremely hydrophobic regions underrepresented.
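
A coverage figure like the one above boils down to counting how many compounds fall inside a rectangular TPSA/log-P window. The sketch below shows that calculation under stated assumptions: the `Compound` class, the function name, and the four sample compounds are hypothetical, and only the window bounds (TPSA 0-200, log-P -4 to 9) come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compound:
    tpsa: float  # topological polar surface area, in square angstroms
    logp: float  # predicted log-P


def fraction_in_window(compounds, tpsa_range=(0.0, 200.0), logp_range=(-4.0, 9.0)):
    """Return the share of compounds inside the given TPSA/log-P window."""
    if not compounds:
        return 0.0
    inside = sum(
        1
        for c in compounds
        if tpsa_range[0] <= c.tpsa <= tpsa_range[1]
        and logp_range[0] <= c.logp <= logp_range[1]
    )
    return inside / len(compounds)


# Made-up descriptor values, not taken from RepoRT.
toy = [
    Compound(tpsa=45.0, logp=2.1),    # inside the window
    Compound(tpsa=120.5, logp=-1.3),  # inside the window
    Compound(tpsa=310.0, logp=-6.0),  # too polar
    Compound(tpsa=0.0, logp=14.2),    # too hydrophobic
]
print(fraction_in_window(toy))  # 0.5
```

Applied to the full descriptor table, the same count would give the share of measured compounds in the frequently covered region.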

Slide 19 - Analysis Results

3

Analysis of Chemical Coverage

PCA and K-Means clustering results


Slide 20 - LC Column Distribution

Selectivity Mode | USP Code Types | Occurrence
RPLC | L1, L7, L11, L43 | 89%
HILIC | L3, L68, L114, L122 | 11%
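
The occurrence figures in this table amount to mapping each method's USP column code to a selectivity mode and counting. A minimal sketch, assuming the code-to-mode grouping shown on this slide (it is not meant as a general USP reference); the function name and the ten sample codes are hypothetical.

```python
from collections import Counter

# Grouping copied from the slide's table; anything else is left unclassified.
USP_TO_MODE = {
    "L1": "RPLC", "L7": "RPLC", "L11": "RPLC", "L43": "RPLC",
    "L3": "HILIC", "L68": "HILIC", "L114": "HILIC", "L122": "HILIC",
}


def mode_shares(usp_codes):
    """Return the fraction of methods per selectivity mode."""
    modes = Counter(USP_TO_MODE.get(code, "unclassified") for code in usp_codes)
    total = sum(modes.values())
    if total == 0:
        return {}
    return {mode: count / total for mode, count in modes.items()}


# Made-up list of column codes, one entry per hypothetical method.
toy_codes = ["L1"] * 7 + ["L7", "L122", "L3"]
shares = mode_shares(toy_codes)
print(shares)  # {'RPLC': 0.8, 'HILIC': 0.2}
```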

Slide 21 - Conclusions & Future Perspectives

Complete Chemical Coverage Remains Challenging

Current LC workflows provide substantial but limited coverage; comprehensive characterization requires integrating additional analytical modalities (GC, SFC, IC).


Slide 22 - Key Insights on Chemical Space

  • PCA reveals high similarity in the majority of LC methods, dominated by RPLC setups.
  • K-Means clustering (k=5) separates methods primarily by selectivity (RPLC vs HILIC) and eluent modifiers.
  • Chemical space coverage: Methods span a wide theoretical domain (TPSA 0-852 Ų, Log-P -11 to 26).
  • Adequate coverage: Most methods (>94%) are restricted to TPSA 0-200 Ų and Log-P -4 to 9.
  • Comparison: The CompTox dataset (800k chemicals) highlights that extreme polar and hydrophobic regions remain largely inaccessible with current setups.

Slide 23 - Section 3: Analysis of Chemical Coverage

3

Chemical Coverage Analysis

Visualizing methodological similarity and chemical space coverage.


Slide 24 - Conclusions

Comprehensive chemical space coverage requires orthogonal selectivity and wider methodological scope.

Advancing Exposome Monitoring


Slide 25 - Methodological Findings

  • PCA revealed substantial methodological similarity across the 236 LC setups.
  • Primary variance driven by the dominant use of RPLC (89%) over HILIC (11%).
  • K-means clustering (k=5) confirmed limited diversity in current chromatographic methods.
  • Most methods are highly similar, differing mainly by particle size, temperature, and eluent modifiers.
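
To illustrate the clustering step, here is a minimal k-means sketch (Lloyd's algorithm) in plain Python. It is not the thesis implementation, which is reported as done in Julia, and it omits the PCA step. The two-dimensional feature vectors are made up and stand in for scaled method parameters; the toy example uses k=2 rather than the k=5 reported on the slide.

```python
import math
import random


def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's algorithm. Returns (labels, centroids)."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct input points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical, already-scaled feature vectors for six LC methods.
toy_methods = [
    (0.10, 0.20), (0.15, 0.25), (0.20, 0.15),
    (0.90, 0.80), (0.85, 0.90), (0.95, 0.85),
]
labels, centroids = kmeans(toy_methods, k=2)
print(labels)
```

On this toy input the first three and last three methods end up in separate clusters; which cluster receives label 0 depends on the random initialisation.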

Slide 26 - Chemical Space Statistics

  • 0-852 Ų: TPSA Range
  • -11 to 26.6: Log-P Range
  • ~95%: Adequate Coverage

Slide 27 - Remarks & Perspectives

  • LC-HRMS workflows are effective but fall short of comprehensive exposome coverage.
  • Future work: Focus on purely NTA-generated data and more diverse chromatographic modes (e.g., GC, SFC, IC).
  • Improved method reporting: Including pH, detailed gradients, and column parameters is crucial for future data-driven analyses.

Slide 28 - Key Takeaway

> Current LC workflows, although effective within a broad region, remain far from achieving comprehensive characterization of chemical space.

— Jens Heemskerk (Thesis Conclusions)


Slide 29 - Section 4: Conclusions

4

Conclusions & Future Perspectives

Final assessment and roadmap for future exposome research.


Slide 30 - End of Presentation

Questions?

Thank you for your attention


Slide 31 - Conclusions and Future Work

  • Current LC workflows are effective but limited in comprehensive coverage.
  • Lack of orthogonal selectivity (HILIC/IC) restricts measurements of extreme polar/hydrophobic regions.
  • Complete chemical space coverage is not realistic under current conditions.
  • Future: Focus on NTA-generated datasets and diversify chromatographic platforms (GC, SFC, IC).

Slide 32 - Thank You

Thesis Work Completed: Dec 2025

Thank you for your attention.
