LLM-Assisted Topic Modeling, Hierarchical Clustering, Key-Phrase Extraction, HDBSCAN, Clinical NLP, Psychiatric Notes, MIMIC-IV-Note, UMAP, c-TF-IDF, Topic Coherence, Interpretability, Cost-Effectiveness

Unlocking Clinical Insights: LLM-Assisted Hierarchical Topic Modeling in Psychiatric Notes

Morice Nouvertne

Distilling clinically actionable insights from psychiatric discharge notes is an important but difficult task in clinical informatics. Standard methods such as BERTopic often fall short on long, complex, jargon-heavy clinical text. To address this, my latest research introduces a hierarchical topic-modeling pipeline that uses large language models (LLMs) and is tailored to psychiatric clinical narratives.

The Challenge of Clinical Topic Modeling

Psychiatric notes, particularly those documenting emergency mental health episodes like suicide-related visits, contain intricate details about patient histories, triggers, and treatment responses. Traditional document-level clustering approaches frequently collapse such rich data into overly broad, ambiguous categories, missing clinically meaningful distinctions critical for patient care.

Our Innovative Approach

The new pipeline uses the Gemini 2.0 Flash-Lite language model to extract structured key phrases from each clinical note, with a focus on temporal and causal relationships. Unlike conventional topic modeling, which clusters entire documents, this approach:

  • Extracts concise, semantically rich phrases from notes using an LLM, improving the quality and clinical relevance of the inputs to the clustering algorithm.
  • Uses a two-stage hierarchical clustering strategy with HDBSCAN, first identifying broad topic groups ("main topics") and then refining them into clinically actionable sub-topics ("leaf topics").
  • Optionally applies Maximal Marginal Relevance (MMR) re-ranking to increase keyword diversity while preserving semantic coherence.

Figure 1
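
The MMR re-ranking step can be illustrated with a minimal Python sketch. It assumes that the topic and each candidate keyword already have embedding vectors; the function name `mmr_rerank`, the `lambda_` weight, and the toy two-dimensional vectors are illustrative assumptions, not the pipeline's actual code or settings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mmr_rerank(topic_vec, candidates, top_n=3, lambda_=0.7):
    """Greedy Maximal Marginal Relevance selection.

    candidates maps each keyword to its embedding (hypothetical input format).
    lambda_ = 1.0 ranks by relevance only; lower values penalise keywords
    that are similar to ones already selected.
    """
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < top_n:

        def score(word):
            relevance = cosine(candidates[word], topic_vec)
            redundancy = max(
                (cosine(candidates[word], candidates[s]) for s in selected),
                default=0.0,
            )
            return lambda_ * relevance - (1 - lambda_) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy embeddings (made up for illustration): two near-duplicate phrases
# and one phrase that is less relevant but different in meaning.
topic_vec = [1.0, 0.0]
candidates = {
    "suicidal ideation": [1.0, 0.1],
    "suicidal thoughts": [1.0, 0.12],
    "medication nonadherence": [0.6, 0.8],
}

print(mmr_rerank(topic_vec, candidates, top_n=2, lambda_=1.0))
print(mmr_rerank(topic_vec, candidates, top_n=2, lambda_=0.3))
```

With `lambda_=1.0` the ranking reduces to pure relevance, so both near-duplicate phrases are kept; with `lambda_=0.3` the second slot goes to the more distinct phrase instead.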

Key Findings and Clinical Impact

Our evaluation on a suicide-related subset of the MIMIC-IV-Note corpus and on the general-domain AG News dataset showed:

  • Higher semantic coherence and diversity than standard BERTopic and KeyBERT baselines, with an average gain of 0.12 in Normalized Pointwise Mutual Information (NPMI) coherence.
  • Better interpretability: hierarchical clustering surfaces clinically meaningful sub-topics, reducing the review burden for clinicians from roughly 219 notes per topic to about 65.
  • Low cost and high throughput: hundreds of notes are processed asynchronously within minutes at a total computational cost of less than $0.12.
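
For readers unfamiliar with the coherence metric: the NPMI of a word pair (a, b) is the pointwise mutual information log(P(a,b) / (P(a) P(b))) divided by -log P(a,b), which bounds it to the range [-1, 1]. The Python sketch below averages it over all pairs of a topic's top words. It estimates probabilities from document-level co-occurrence, which is a simplifying assumption made here; the reported scores may have been computed with a different co-occurrence window or library, and the function names are illustrative.

```python
import math
from itertools import combinations


def npmi(word_a, word_b, documents):
    """NPMI of two words; documents is a list of token sets."""
    n = len(documents)
    count_a = sum(1 for doc in documents if word_a in doc)
    count_b = sum(1 for doc in documents if word_b in doc)
    count_ab = sum(1 for doc in documents if word_a in doc and word_b in doc)
    if count_ab == 0:
        return -1.0  # the words never co-occur: lower bound of NPMI
    if count_ab == n:
        return 0.0  # degenerate 0/0 case; treated as 0 in this sketch
    p_a, p_b, p_ab = count_a / n, count_b / n, count_ab / n
    return math.log(p_ab / (p_a * p_b)) / -math.log(p_ab)


def topic_coherence(top_words, documents):
    """Mean NPMI over all pairs of a topic's top words."""
    pairs = list(combinations(top_words, 2))
    if not pairs:
        raise ValueError("need at least two top words")
    return sum(npmi(a, b, documents) for a, b in pairs) / len(pairs)


# Toy corpus (made up): "ideation" and "plan" always appear together,
# while "discharge" never appears with either of them.
documents = [
    {"ideation", "plan"},
    {"ideation", "plan"},
    {"discharge"},
    {"followup"},
]

print(npmi("ideation", "plan", documents))
print(topic_coherence(["ideation", "plan", "discharge"], documents))
```

A topic whose top words tend to appear in the same notes scores close to 1, while a topic that mixes unrelated words is pulled toward -1.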

Table 1


The hierarchical step converts broad, clinically vague groupings into actionable sub-topics that can be reviewed quickly and mapped directly to risk-assessment workflows, a level of granularity that single-layer BERTopic does not provide. The table below illustrates this with an example drawn from the fifth main topic, “Patient Denies Current Suicidal Ideation,” along with its associated sub-topics.


Table 2
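
The control flow of the two-stage clustering (main topics first, then leaf topics within each main topic) can be sketched in Python as follows. The clusterer is passed in as a function so that the example runs without extra dependencies; `gap_clusterer` is a deliberately simple stand-in that splits sorted 1-D points at large gaps and is not HDBSCAN. All names and the toy data are assumptions made for illustration.

```python
from collections import defaultdict


def two_stage_cluster(vectors, cluster_main, cluster_leaf):
    """Return one (main_label, leaf_label) pair per input vector.

    cluster_main and cluster_leaf take a list of vectors and return one
    integer label per vector, with -1 meaning noise (the convention that
    HDBSCAN implementations use).
    """
    main_labels = cluster_main(vectors)

    # Group item indices by main topic, skipping noise points.
    groups = defaultdict(list)
    for idx, label in enumerate(main_labels):
        if label != -1:
            groups[label].append(idx)

    # Noise points keep a leaf label of -1.
    result = [(label, -1) for label in main_labels]

    # Re-cluster each main topic on its own to obtain leaf topics.
    for label, indices in groups.items():
        leaf_labels = cluster_leaf([vectors[i] for i in indices])
        for i, leaf in zip(indices, leaf_labels):
            result[i] = (label, leaf)
    return result


def gap_clusterer(max_gap):
    """Stand-in clusterer for 1-D points (NOT HDBSCAN, never emits noise)."""

    def cluster(points):
        order = sorted(range(len(points)), key=lambda i: points[i])
        labels = [0] * len(points)
        current = 0
        for prev, nxt in zip(order, order[1:]):
            if points[nxt] - points[prev] > max_gap:
                current += 1  # a large gap starts a new cluster
            labels[nxt] = current
        return labels

    return cluster


# Toy 1-D "embeddings": two coarse groups, the first with two sub-groups.
points = [0.0, 0.1, 1.0, 1.1, 10.0, 10.1]
labels = two_stage_cluster(points, gap_clusterer(5.0), gap_clusterer(0.5))
print(labels)
```

In a setting closer to the one described here, a density-based clusterer such as scikit-learn's `HDBSCAN(min_cluster_size=...).fit_predict` could be passed for both stages, typically with a smaller minimum cluster size for the leaf stage; the specific settings used in the study are not given in this post.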

Practical Clinical and General-Domain Applications

This methodology is not limited to psychiatric applications. The approach achieved similarly strong results on the open-domain AG News dataset, which suggests that it generalizes beyond clinical text. Its low cost and computational efficiency make it practical to deploy in real-world clinical settings and potentially useful for broader NLP tasks.
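
The asynchronous processing behind the speed and cost figures reported earlier follows a common pattern: issue many extraction requests concurrently while capping how many are in flight. The Python sketch below shows that pattern with `asyncio`. The function `extract_key_phrases` is a hypothetical stand-in that only simulates latency and splits text on periods; it does not call Gemini or any real API, and the concurrency limit is an arbitrary illustrative value.

```python
import asyncio


async def extract_key_phrases(note):
    """Hypothetical stand-in for an LLM call (no real API is used)."""
    await asyncio.sleep(0.01)  # simulate network latency
    return [part.strip() for part in note.split(".") if part.strip()]


async def process_notes(notes, max_concurrency=8):
    """Extract phrases from all notes with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(note):
        async with semaphore:
            return await extract_key_phrases(note)

    # gather returns results in the same order as the input notes.
    return await asyncio.gather(*(worker(note) for note in notes))


# Made-up example notes, for illustration only.
notes = [
    "Denies current ideation. Reports poor sleep",
    "Stopped medication two weeks ago. Conflict with family",
]
phrases = asyncio.run(process_notes(notes, max_concurrency=2))
print(phrases)
```

Because the work is dominated by waiting on responses, total wall-clock time grows far more slowly than the number of notes, up to whatever rate limit the model provider enforces.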

Future Directions

While the intrinsic evaluation metrics indicate strong performance, ongoing and future research will involve direct clinician assessments and larger-scale user studies to validate clinical relevance comprehensively. The integration of human-in-the-loop evaluation methods will further strengthen the practical utility and real-world impact of this approach.

This LLM-assisted hierarchical clustering pipeline offers clinical NLP a scalable, cost-effective way to surface clinically useful insights that can support informed medical decision-making.

Citation

@inproceedings{nouvertne2025llm,
  title     = {LLM-Assisted Hierarchical Topic Modeling},
  author    = {Nouvertne, Morice and Molinari, Marc and Andritsch, Jarutas and Ahmad, Shakeel},
  booktitle = {Proc. IEEE Intl. Conf. on Data Science and Advanced Analytics},
  year      = {2025},
  doi       = {TBD}
}