Thao M. Dang

PhD Student in Computer Science

University of Texas at Arlington
Arlington, TX

Email: thaomaidang@gmail.com

[Google Scholar]    [Curriculum Vitae]

Biography

I am a third-year PhD student in Computer Science at the University of Texas at Arlington, advised by Prof. Junzhou Huang. I received my M.S. in Computer Engineering from Chonnam National University and my B.E. in Computer Engineering from the University of Science-HCMUS.


Seeking Opportunities

I am actively seeking internship opportunities for Summer 2026. My research focuses on multimodal foundation models, large language models for computational pathology, and AI-driven multi-omics alignment.


Research

- Multimodal Representation Learning and Cross-modal Alignment
- Robust Learning under Incomplete Data
- Multimodal Foundation Models for Computational Pathology
- Guideline-Driven Learning and Prompt Engineering


Publications

Selected publications in AAAI, MICCAI, ACM BCB, ISBI, and journals.

  1. W. Zhong, H. Li, T. M. Dang, F. Jiang, H. Ma, Y. Guo, J. Gao, J. Huang, “Learning from Guidelines: Structured Prompt Optimization for Expert Annotation Tasks,” AAAI, 2026.

  2. T. M. Dang, H. Li, Y. Guo, H. Ma, F. Jiang, Y. Miao, Q. Zhou, J. Gao, J. Huang, “HAGE: Hierarchical Alignment Gene-Enhanced Pathology Representation Learning with Spatial Transcriptomics,” MICCAI, 2025.

  3. H. Li, Y. Guo, F. Jiang, T. M. Dang, H. Ma, Q. Zhou, J. Gao, J. Huang, “Text-Guided Multi-Instance Learning for Scoliosis Screening via Gait Video Analysis,” MICCAI, 2025.

  4. T. M. Dang, Q. Zhou, Y. Guo, H. Ma, S. Na, T. B. Dang, J. Gao, J. Huang, “Abnormality-aware Multimodal Learning for WSI Classification,” Front. Med., 2025.

  5. Q. Zhou, T. M. Dang, Y. Guo, H. Ma, W. Zhong, S. Na, J. Gao, J. Huang, “Visual-Language Contrastive Learning for Computational Pathology with Visual-Language Models,” ISBI, 2025.

  6. T. M. Dang, Y. Guo, H. Ma, Q. Zhou, S. Na, J. Gao, J. Huang, “MFMF: Multiple foundation model fusion networks for whole slide image classification,” ACM BCB, 2024.

  7. T. M. Dang, T. D. Nguyen, T. Hoang, H. Kim, A. B. J. Teoh, D. Choi, “AVET: A Novel Transform Function To Improve Cancellable Biometrics Security,” IEEE Transactions on Information Forensics and Security, 2022.

  8. T. M. Dang, L. Tran, T. D. Nguyen, D. Choi, “FEHash: Full Entropy Hash for Face Template Protection,” CVPR Workshops, 2020.

Teaching Experience

  • CSE 5360 - Artificial Intelligence, Fall 2025
  • CSE 5360 - Artificial Intelligence, Summer 2025
  • CSE 5360 - Artificial Intelligence, Spring 2025
  • CSE 5360 - Artificial Intelligence, Fall 2024
  • CSE 5311 - Design and Analysis of Algorithms, Summer 2024
  • CSE 5311 - Design and Analysis of Algorithms, Spring 2024
  • CSE 1106 - Introduction to Computer Science and Engineering, Fall 2023

Patents

  • Personal information security system and method thereof ensuring irreversibility and similarity (US 2025)
  • Method and apparatus for applying absolute value equations transform function preserving similarity as well as irreversibility (KR 2024)
  • System and method for verifying user by security token combined with biometric data processing techniques (US/KR 2023)

Research Experience

(2023.8-Now) Multimodal Representation Learning with Missing Modalities

  • Designed self-supervised multimodal learning frameworks to align heterogeneous data with incomplete modality coverage. Proposed global alignment objectives and geometry-aware losses that enable learning unified embeddings from unimodal, bimodal, and partially observed data without requiring fully paired samples. Manuscript submitted to CVPR 2026.
  • Developed novel alignment methods beyond pairwise contrastive learning, including volume-based and hybrid geometric objectives that preserve higher-order semantic structure and improve zero-shot generalization across datasets and modality combinations. Manuscript submitted to CVPR 2026.
  • Proposed attention-based fusion and instance selection mechanisms to integrate multi-level representations from multiple pretrained foundation models, enabling scalable learning under sparse, noisy, and high-dimensional multimodal inputs. This work was published at MICCAI 2025, in Front. Med. (2025), and at ACM BCB 2024.

(2024.10-Now) Multimodal Large Language Model

  • Engineered a two-stage adaptation framework to address fundamental limitations in Multimodal Large Language Models (MLLMs) for composed cross-modal retrieval. Formalized the composed retrieval task and developed a comprehensive benchmark for evaluating compositional queries across modalities. Manuscript submitted to CVPR 2026.
  • Designed a vision-language model framework implementing single-model multimodal fusion. Replaced dual-encoder architectures with transformer-based deep fusion to capture cross-modal relationships. This work has been accepted to the ISBI 2025 Conference.

(2025.1-Now) Guideline-Driven Learning and Prompt Engineering

  • Developed a Guideline-Driven Prompt (GDP) optimization framework that shifts the learning paradigm from data-driven training to guideline-driven reasoning with minimal annotated examples. Designed a Retrieval Augmented Generation (RAG) system to extract and synthesize essential fragments from complex guideline documents into structured, executable prompts. This work has been accepted to the AAAI 2026 Conference.
  • Engineered RGCWM, a Rule-Grounded Causal World Model for explicit guideline execution and constraint satisfaction. Addressed limitations of implicit text-based reasoning by building an explicit state space directly from guideline text. Manuscript submitted to ACL ARR 2026.
  • Designed a text-guided multi-instance learning framework integrating external textual guidance with sequence modeling. Incorporated textual guidance from domain experts and large language models (LLMs) to enhance feature representation. Implemented Dynamic Time Warping (DTW)-based sequence segmentation to handle temporal misalignment in sequential data. This work has been accepted to the MICCAI 2025 Conference.

(2024.6-2024.7) AstraZeneca's Challenge

  • Led the project. Applied SAM (Segment Anything Model) and ensemble learning techniques to improve tumor segmentation accuracy. This work achieved 1st place in the first round and a top-3 finish in the second round of the CoSolve Sprints challenge on 3D MRI mouse cancer segmentation.


*Last updated in Feb 2026.