The 40th Annual AAAI Conference on Artificial Intelligence

January 20 – January 27, 2026 | Singapore

New Faculty Highlights

AAAI is continuing its invited speaker program by highlighting AI researchers who have just begun careers as new faculty members or the equivalent in industry.

New Faculty Highlight talks will be allotted 20 minutes each; the aim for these talks is to broadly survey the candidate’s research to date. Several talks will be released online and publicized each day of the AAAI conference, following which they will be available archivally as part of the conference program. Invited speakers will be further invited to contribute an article to a corresponding series in AI Magazine.

Click here for the New Faculty Highlights schedule

Program Overview

Towards Agents that Exhibit Human-Like Autonomy in Complex Environments

Rohan Chandra

Deploying intelligent, autonomous agents, e.g., autonomous vehicles and robots, in the real world has been a longstanding goal in robotics and artificial intelligence (AI). We have already begun to witness the emergence of vacuum robots in our homes, service robots in warehouses, and even self-driving cars on our way to work. These environments are often dense, constrained, and unstructured, with heterogeneous agents, each with their own unique behaviors and objectives. While agents today are designed to navigate these environments safely, their overly conservative nature often leads to slow and jerky motion (frequent stopping and freezing), lack of social compliance (not giving way to other people, blocking doorways and intersections), and poor adaptability across diverse complex environments (failure due to sudden accidents, e.g., liquid spills). In other words, these robots often fail to capture the essence of human-like autonomy, which involves the ability to take calculated risks, even in complex environments. In this talk, I will describe my vision for a paradigm shift in the way intelligent physical agents navigate highly dense, heterogeneous, constrained, and unstructured environments using human-like autonomy.

Rohan Chandra is an Assistant Professor in Computer Science at the University of Virginia and leads the CRΔL lab. From 2022 to 2024, he was a postdoctoral research fellow in Texas Robotics, advised by Dr. Joydeep Biswas and Dr. Peter Stone, at the University of Texas at Austin. His research focuses on algorithms and systems for enabling robots to navigate safely and efficiently among humans, like humans. Rohan obtained his M.S. and Ph.D. in 2018 and 2022 from the University of Maryland, College Park, advised by Dr. Dinesh Manocha, and completed his B.Tech at the Delhi Technological University, New Delhi in 2016. His doctoral thesis focused on autonomous driving in dense, heterogeneous, and unstructured traffic environments. Rohan is a 2023 Microsoft Future Leader in Robotics and AI, 2023 KAUST Rising Star in AI, 2022 RSS Pioneer, and 2021 UMD Future Faculty Fellow. He is a finalist for the 2022 Charles A. Caramello Distinguished Dissertation Award and the UMD Innovation of the Year Award for 2021 and 2022. He is a recipient of the 2023 Drones Young Investigator Award, the 2023 SNU PhD Award for Autonomous Navigation, and the UMD 2020 summer research fellowship. His published work appears regularly in top computer vision, AI, and robotics conferences (CVPR, ICRA, IROS), and he actively participates in workshops and program committees of leading conferences in robotics, computer vision, artificial intelligence, and machine learning. Rohan is serving on the editorial boards of several reputed journals including RA-L.

From Few-Shot Learning to Data-Efficient Intelligence

Yaqing Wang

Modern artificial intelligence performs impressively in data-rich settings but still struggles to learn and adapt from only a few examples, a capability central to human intelligence. My research seeks to understand and enable data-efficient generalization, unifying principles across few-shot learning, meta-learning, in-context learning in large language models (LLMs), and adaptive agent behavior. First, I revisit few-shot learning from a foundational perspective, showing why conventional supervised learning breaks down under sparse data and how prior knowledge enables reliable adaptation. I then discuss how these principles extend to real-world scenarios such as scientific discovery and cold-start recommendation, where data are scarce, costly, or dynamically evolving. Finally, I explore how LLMs perform in-context learning and how their adaptive behaviors connect to meta-learning mechanisms. Building on these insights, I develop data-efficient, preference-adaptive agents that quickly align to user needs with minimal interaction. This talk presents a cohesive view of data-efficient intelligence and outlines future directions toward more reliable, human-like learning systems.

Yaqing Wang is an Associate Researcher at the Beijing Institute of Mathematical Sciences and Applications (BIMSA). She received her Ph.D. in Computer Science and Engineering from the Hong Kong University of Science and Technology, advised by Prof. Lionel M. Ni and Prof. James T. Kwok. Her research focuses on machine learning and artificial intelligence, with an emphasis on data-efficient generalization, including few-shot learning, in-context learning, and adaptive agents. Dr. Wang has published more than 30 papers in leading venues such as NeurIPS, ICML, ICLR, KDD, TheWebConf, SIGIR, AAAI, IJCAI, EMNLP, TPAMI, JMLR, and TIP, with over 5,300 citations. She serves as an Associate Editor of Neural Networks and an Area Chair for ACL Rolling Review. Her techniques have been deployed in large-scale real-world systems at Baidu, Meituan, and other industry applications. She is a recipient of the Hong Kong PhD Fellowship, the Beijing Nova Program, and the AAAI New Faculty Highlight Program, and is listed among the World’s Top 2% Scientists in 2024 and 2025.

Toward Trustworthy AI for Decision Making in Population Health

Alexander Rodríguez

AI and public health are becoming increasingly intertwined, driven by the growing availability of multimodal data and rapid advances in deep learning. In this talk, I will present our efforts to harness these trends within a data-centric pipeline for public health. I will first discuss robust deep learning architectures for real-time outbreak response, highlighting how our frameworks capture uncertainty and dynamics across shifting distributions, multimodal signals, hierarchical structures, and relational dependencies. I will then introduce hybrid approaches that integrate machine learning with mechanistic epidemiological models, including physics-informed neural networks, expert-guided generative models for causal inference, and differentiable agent-based models. Together, these advances illustrate how combining data-driven AI with domain knowledge can enable more reliable, adaptive, and actionable public health solutions for decision making.

Alexander Rodríguez is an Assistant Professor of Computer Science at the University of Michigan, Ann Arbor. He received his PhD in Computer Science from the School of Computational Science and Engineering at the Georgia Institute of Technology. His research interests lie at the intersection of machine learning, scientific modeling, time series, and decision-making. His work has been recognized with the best paper award at ICML AI4ABM 2022 and was awarded the 1st place in the Facebook/CMU COVID-19 Challenge and the 2nd place in the C3.ai COVID-19 Grand Challenge. He was also named a ‘Rising Star in Data Science’ by the University of Chicago Data Science Institute in 2021 and a ‘Rising Star in ML & AI’ by the University of Southern California in 2022. His dissertation received the 2024 Outstanding Dissertation Award from the College of Computing at Georgia Tech and the 2024 ACM SIGKDD Dissertation Award Runner Up. His homepage is alrodri.engin.umich.edu.

Scaling Human-Centric Trustworthy Foundation Model via Advanced Reasoning and Agentic Frameworks

Yi R. (May) Fung

As foundation models grow in size and scope, crucial challenges remain in scaling their trustworthiness and adaptability to meet the diverse needs of individual users, as well as mitigating their risk of generating unhelpful, non-factual, or harmful content. To address this, we propose to reframe model reasoning through a unified paradigm of active knowledge grounding that coordinates different tools and modalities. First, to scale reasoning depth and creativity, we introduce the novel paradigm of Thinking with Images to encourage models to externalize intermediate structure and perform interleaved cross-modal advanced reasoning beyond text-centric cues. To further scale honesty and bridge knowledge gaps reliably, we develop one of the first vision-language deep research agents, WebWatcher, that actively gathers and verifies information from the web with enhanced fragmented reasoning capability. Ultimately, to scale effective and efficient human-AI collaboration, we propose AdaCtrl as a novel training mechanism for dynamically aligning model behavior with individual user preferences and difficulty awareness to adaptively allocate computational resources. Together, these three pillars of integrating advanced multimodal reasoning, autonomous discovery, and adaptive alignment form a foundational framework for advancing the frontier of next generation human-centric trustworthy AI systems.

Yi R. (May) Fung is an Assistant Professor at the Department of Computer Science and Engineering (CSE), Hong Kong University of Science and Technology (HKUST). She received her Ph.D. from the University of Illinois, after which she spent time visiting MIT as a postdoctoral researcher. May drives cutting-edge research in the domain of human-centric trustworthy AI/NLP model reasoning, with cognitively grounded scalable alignment principles and a focus on advancing multimodal knowledge robustness mechanisms. In particular, she has published nearly 50 papers at top-tier machine learning venues on the topics of MLLM agentic frameworks, retrieval-augmented generation, and multi-lingual cross-culture situation understanding for diverse real-world applications (e.g., software, healthcare, business, education, media communication). Her stellar research has received much recognition internationally, including the ACL’24 Outstanding Paper Award, NAACL’24 Outstanding Paper Award, and NAACL’21 Best Demo Paper Award. In addition, she serves on the Organizing Committee for ACL/IJCAI, and as Area Chair for NeurIPS/ICLR/ACL/EMNLP/IJCAI. She leads a young, energetic, and growing research lab, and her work has been reported by various mainstream news outlets.

Advancing Trust in Multimodal AI: From Explainability Benchmarks to Clinical Safety

Chirag Agarwal

Machine learning models have become ubiquitous in the last decade, and with their increasing use in critical applications (e.g., healthcare, financial systems, and crime forecasting), it is vital to ensure that ML developers and practitioners understand and trust their decisions. This problem has become paramount in the era of frontier models, which are developed by training billion-parameter models on broad, uncurated datasets and extensive computing. In this talk, we will first explore the (un)reliability of existing multimodal explainability techniques in large language and multimodal models and understand the robustness and safety implications of Mechanistic Interpretability tools. Next, we will delve into two complementary threads: i) domain-specific safety and related trustworthy evaluation that surfaces risks missed by generic red-teaming, focusing on multilingual and distribution-shifted settings; and ii) methods that explicitly train and assess reasoning in medical LLMs.

Chirag Agarwal is an Assistant Professor at the School of Data Science, where he leads the Aikyam lab focusing on developing trustworthy machine learning frameworks that go beyond training models for specific downstream tasks and satisfy trustworthy properties, such as explainability, safety, and alignment. He has developed the first-of-its-kind, large-scale, in-depth study to support systematic, reproducible, and efficient evaluations of post hoc explanation methods for (un)structured data to understand algorithmic decision-making on diverse tasks ranging from bail decisions to loan credit recommendations. Dr. Agarwal’s research has led to publications in top-tier machine learning and computer vision conferences and journals, and he has received Spotlight and Oral presentations at NeurIPS, ICML, CVPR, AAAI, and ICIP conferences, and industrial support from Adobe, Google, Cohere, Thinking Machines, and OpenAI to support his work on trustworthy machine learning.

Toward Controllable and Trustworthy LLM Reasoning: From Failure Mapping to Cognition-inspired Control and Real-world Impact

Ben Zhou

Large language models (LLMs) are increasingly deployed in decision-critical settings, yet still exhibit brittle, opaque reasoning, especially under abstraction, spurious cues, and long-horizon planning. In this talk, I will present my research on making LLM reasoning controllable and trustworthy. First, I will show how to map when and why LLM reasoning fails, revealing conceptual blind spots, deceptive semantic shortcuts, and brittle consistency that undermine trust. Second, I will introduce cognition-inspired methods that steer model reasoning via problem and factual decomposition, consistency-driven learning, and controlled self-reflection, as well as pre-training schemes that attach explicit “thoughts of words” and token-level reinforcement learning. Finally, I will demonstrate how trustworthy reasoning can be translated into real-world impact, closing the loop between failure analysis, controlled reasoning, and safe AI applications.

Ben Zhou is an Assistant Professor in the School of Computing and Augmented Intelligence at Arizona State University. Ben’s research aims to understand large language model behavior, establish theories of model learning and generalization, and use data and symbolic cognitive processes to improve model reasoning, controllability, and trustworthiness. He has more than 20 publications at top-tier conferences, several of which were oral presentations at ICLR, NAACL, and EMNLP. Ben obtained his Ph.D. degree from the University of Pennsylvania. He is a recipient of the ENIAC fellowship from the University of Pennsylvania and a finalist for the CRA Outstanding Undergraduate Researcher Award.

Towards faithful, interpretable and private foundation models

Grigorios Chrysos

As large language models (LLMs) scale, ensuring interpretability and privacy becomes critical. This talk addresses these interconnected challenges with novel approaches to model specialization and safety. First, we tackle the dense, distributed nature of LLM representations by casting Mixture-of-Experts as a tensor decomposition, enabling specialized experts in a factorized space. Second, we argue that current neuron-level sparsity methods create a severe accuracy-sparsity trade-off, and we propose a paradigm shift to layer-level sparsity with the Mixture of Decoders. We explain how MxD uses tensor factorization to expand dense layers into thousands of specialized, full-rank sublayers. Finally, we address privacy in open-weight models by proposing a scalable alternative to differential privacy that induces maximal uncertainty on protected instances, introducing a certifiable algorithm and proving tight bounds that characterize the resulting privacy-utility tradeoff.

Grigorios Chrysos is an Assistant Professor at the University of Wisconsin-Madison. Grigorios was awarded a rising star award by CPAL (2025). His research interests lie in trustworthy machine learning leveraging tensor decompositions. His most recent research focuses on interpretability in large language models, mitigating hallucinations, and ensuring the privacy of foundation models. He has co-organized workshops (CVPR, ICCV, AAAI, ICLR, NeurIPS) and tutorials at top-tier conferences (CVPR’22, AAAI’23, CVPR’23, ISIT’24, NeurIPS’25). Grigorios is an action editor for the machine learning journal TMLR and an area chair for machine learning conferences (NeurIPS, ICLR, ICML).

KOALA: Knowledge of Optimization and Learning Algorithms for Healthcare

Kai Wang

The Knowledge of Optimization And Learning Algorithms (KOALA) group studies how to integrate optimization, machine learning, and generative modeling to enable data-driven decision-making under uncertainty. We study decision-focused learning, embedding optimization as a differentiable layer to train models end-to-end for decision quality. We design scalable reinforcement learning algorithms for population and personalized healthcare, and develop efficient bilevel optimization methods for nested and multi-agent decision-making. These directions form a unified framework linking optimization and learning for impactful AI in healthcare. Through collaborations with hospitals and NGOs, our group designs and deploys algorithms for pediatric, diabetes, maternal, and mental health applications. Looking ahead, we aim to unite these foundations with generative AI to build theoretically grounded and socially responsible algorithms that advance trustworthy, real-world AI for health and beyond.

Kai Wang is an Assistant Professor in the School of Computational Science and Engineering at the Georgia Institute of Technology. He received his Ph.D. in Computer Science from Harvard University. His research develops the computational foundations of AI for healthcare, integrating optimization, machine learning, and generative modeling to enable reliable decision-making under uncertainty. His recent work spans decision-focused learning, reinforcement learning, and bilevel optimization, with applications in pediatric, diabetes, maternal, and mental health. In collaboration with hospitals and NGOs, his algorithms have been deployed in real-world health programs to improve care delivery and resource allocation. Kai’s work has been recognized with the Schmidt Science AI2050 Early Career Fellowship, the Siebel Scholarship, and the Best Paper Runner-Up Award at AAAI.

Diffusion-Based Data Augmentation for Bimanual Robot Manipulation

Daniel Seita

Learning robust bimanual manipulation policies requires demonstration data with broad coverage over robot poses, contacts, and scene contexts. However, collecting diverse real-world demonstrations is costly and time-consuming, creating a significant bottleneck for scaling imitation learning systems. This talk presents two complementary approaches that leverage diffusion models to address this challenge. First, D-CODA synthesizes novel viewpoint-consistent wrist camera images for eye-in-hand bimanual setups, and uses constrained optimization to generate action labels that satisfy coordination constraints during contact-rich manipulation. Second, ROPA extends this framework to third-person setups with RGB-D observations, introducing skeleton-based pose conditioning to generate diverse robot configurations while maintaining geometric consistency across both RGB and depth modalities. Experiments in simulation and real-world bimanual tasks demonstrate that these methods substantially outperform baselines, enabling more sample-efficient imitation learning for coordinated manipulation without requiring additional human demonstrations or simulator access.

Daniel Seita is an Assistant Professor in the Computer Science department at the University of Southern California and the director of the Sensing, Learning, and Understanding for Robotic Manipulation (SLURM) Lab. His research interests are in computer vision, machine learning, and foundation models for robot manipulation, focusing on improving performance in visually and geometrically challenging settings. Daniel was a postdoc at Carnegie Mellon University’s Robotics Institute and holds a PhD in computer science from the University of California, Berkeley. Daniel has been honored with the AAAI 2026 New Faculty Highlights program. He presents his work at premier robotics conferences such as ICRA, IROS, RSS, and CoRL.

Deep Model Reuse: Toward Flexible and Efficient Generative AI

Xingyi Yang

Humans easily apply learned skills to different situations, a flexibility that AI systems still struggle to achieve. Despite the explosion of large, specialized models, most are trained and deployed in isolation, resulting in redundant computation, brittle generalization, and limited adaptability. Deep Model Reuse offers a paradigm shift: instead of training from scratch, we treat the growing ecosystem of pre-trained models as a dynamic knowledge library. By strategically extracting, aligning, and re-composing their latent capabilities, we can rapidly construct new, versatile AI systems with minimal additional cost.
In this talk, I will present key techniques that enable deep model reuse, and demonstrate their transformative impact in generative and multimodal AI. From controllable image synthesis to 3D content generation, our approach bridges isolated model islands into a coherent, adaptable intelligence. Ultimately, deep model reuse moves us closer to AI that learns like humans: efficiently, flexibly, and cumulatively.

Xingyi Yang is a Tenure-Track Assistant Professor (Presidential Young Scholar) in the Department of Data Science and Artificial Intelligence at The Hong Kong Polytechnic University (PolyU). He received his Ph.D. from the National University of Singapore (NUS), advised by Prof. Xinchao Wang. He also had a wonderful time as a visiting Ph.D. student at the University of Oxford, working with Prof. Philip Torr. Prior to that, he completed his Master’s at UC San Diego (UCSD) and his Bachelor’s at Southeast University (SEU). His research interest lies at the intersection of generative models and multimodal learning, with a strong emphasis on improving their computational efficiency. The central mission of his research is to build intelligent systems that can robustly understand, generate, and interact with the physical world. His research has led to over 30 publications at premier venues such as CVPR, NeurIPS, ICCV, ICML, ECCV, and ICLR. His contributions have been recognized with several honors, including the NeurIPS 2022 Best Paper Honorable Mention, the 2023 Chinese Government Award for Outstanding Self-Financed Students Abroad, and a 2024 WAIC Young Outstanding Paper Nomination.

Teach AI What It Doesn’t Know

Sean Du

The remarkable capabilities of machine learning (ML) models, especially foundation models like GPT, have transformed numerous domains. However, these systems often falter in real-world settings, where they encounter unknown or out-of-distribution (OOD) inputs, and generate overconfident predictions or unreliable outputs. Ensuring their reliability is not only a technical challenge but also a fundamental requirement for their safe deployment.
In this talk, I will discuss my research on teaching ML models what they don’t know by developing foundational frameworks for reliable decision-making in the open world. This involves three core aspects: (1) designing novel algorithms for unknown-aware learning through adaptive outlier synthesis, enabling models to handle unfamiliar inputs without explicit knowledge of unknowns; (2) addressing reliability blind spots in foundation models, such as hallucinations and malicious prompts, through innovative mitigation strategies; and (3) rethinking the AI alignment process from algorithm-centric to data-centric research that enables learning better reward models from limited human preferences.
Through fundamental algorithmic development, theoretical insights, and practical applications, my research contributes to the responsible deployment of AI technologies. The talk will conclude with a forward-looking perspective on interdisciplinary collaborations and the roadmap for achieving robust, reliable AI systems that adapt to an ever-changing world.

Sean Du is an Assistant Professor at the College of Computing and Data Science (CCDS), Nanyang Technological University, Singapore. He obtained his Ph.D. in Computer Sciences at the University of Wisconsin-Madison, advised by Prof. Sharon Li. His research interest is in reliable machine learning and its applications to foundation models and AI safety. His first-author papers have been recognized with multiple oral and spotlight presentations at NeurIPS and CVPR. He is a recipient of the Jane Street Graduate Research Fellowship and the Rising Stars in Data Science award.

Trustworthy Planning with Large Language Models Generating Formal Languages

Li “Harry” Zhang

Despite the rapid advancement of AI, most systems in high-stakes applications remain primarily limited to rule-based interactions and cannot reliably plan or execute complex user tasks. Although recent efforts have used large language models (LLMs) to plan as agents, their hallucinations and lack of verifiability undermine executability and trust, preventing real-world deployment. This proposal advances an alternative paradigm: LLM-as-formalizer. Instead of relying on LLMs to generate plans directly, we use them as code generators to translate a user’s environment and goal into formal languages (such as PDDL) that can be deterministically solved by off-the-shelf solvers. This neuro-symbolic approach combines the flexibility of LLMs with the reliability of symbolic systems, offering a pathway toward trustworthy, generalizable planning. Based on our published prior work that has demonstrated the feasibility and challenges of the LLM-as-formalizer approach through comprehensive evaluation across multiple domains and models, I propose to significantly advance this paradigm by exploring ways to improve LLMs’ formalizing ability in various domains.

Li “Harry” Zhang is an assistant professor at Drexel University, focusing on Natural Language Processing (NLP) and artificial intelligence (AI). He obtained his PhD degree from the University of Pennsylvania in 2024, advised by Prof. Chris Callison-Burch and chaired by Prof. Dan Roth. He was a year-long intern in 2023 at the Allen Institute for Artificial Intelligence. He obtained his Bachelor’s degree from the University of Michigan in 2018, mentored by Prof. Rada Mihalcea and Prof. Dragomir Radev. His current work uses large language models (LLMs) to reason and plan in an executable and trustworthy manner via symbolic and structured representations. He has published more than 30 peer-reviewed papers in NLP and AI conferences, such as ACL, EMNLP, and NAACL, that have been cited more than 3,000 times. He also consistently serves as Area Chair, Session Chair, and reviewer in those venues. Outside academia, he is a sponsored musician, producer, and content creator with over 60,000 subscribers across streaming platforms.

Learning from Imperfect Data: Continual Learning, Few-shot Learning, and Generative Data Augmentation

Yaoyao Liu

In recent years, artificial intelligence (AI) has achieved great success in many fields. Although impressive advances have been made, AI algorithms still suffer from an important limitation: they rely on static and large-scale datasets. In contrast, human beings naturally possess the ability to learn novel knowledge from real-world imperfect data, such as a small number of samples or a non-static continual data stream. Attaining such an ability is particularly appealing and will push AI models one step further toward human-level intelligence. In this talk, I will present my work on addressing these challenges in the context of continual learning, few-shot learning, and generative models for data augmentation. First, I will discuss how to get better exemplars for continual learning based on optimization. I parameterize exemplars and optimize them in an end-to-end manner to obtain high-quality, memory-efficient exemplars. Next, I will introduce our work on generative models that leverage diffusion techniques to create diverse, 3D-annotated images, enabling large-scale 3D model training without human annotations. Finally, I will present my work on how to apply continual and few-shot learning techniques to more challenging and realistic scenarios, e.g., object detection and medical imaging.

Yaoyao Liu is an assistant professor in the School of Information Sciences and the Coordinated Science Laboratory at the University of Illinois Urbana-Champaign. He is also affiliated with the Siebel School of Computing and Data Science, the Department of Electrical & Computer Engineering, the National Center for Supercomputing Applications, and the Illinois Informatics Institute. Previously, he completed his PhD in computer science at the Max Planck Institute for Informatics and his BS in electronic information engineering at Tianjin University. His research lies at the intersection of computer vision and machine learning, with a special focus on building intelligent visual systems that are continual and data-efficient. His research interests include continual learning, few-shot learning, semi-supervised learning, generative models, 3D geometry models, and medical imaging. He is a recipient of the 2024 ECVA PhD Award.

Towards Inclusive AI: Advancing Multilingual Large Language Models

Dr. Wenxuan Zhang

Large language models (LLMs) have advanced rapidly, yet their development remains disproportionately focused on a few high-resource languages, leaving fundamental scientific and societal questions about multilingual capability, safety, and equity unresolved. This talk examines multilingual LLMs as a lens for understanding these challenges. I will first discuss observations from large-scale evaluations with real-world natural data, which reveal substantial performance gaps and highlight the need to treat multilingualism as a multidimensional construct. I then turn to safety, presenting work that uncovers multilingual jailbreak vulnerabilities and introduces frameworks for achieving more consistent cross-lingual alignment. Building on analyses of language-specific internal mechanisms, I will outline new strategies for enhancing multilingual systems and describe open-source efforts such as the SeaLLMs and Babel projects that aim to broaden linguistic and cultural inclusivity. Finally, I will discuss emerging directions beyond language, including recent findings on abstract thought in LLMs, which point toward the development of models that are not only multilingual but genuinely multicultural and contextually grounded.

Dr. Wenxuan Zhang is currently a tenure-track SUTD Assistant Professor (SAP) at the Information Systems Technology and Design (ISTD) Pillar, Singapore University of Technology and Design (SUTD). He received his PhD degree from the Chinese University of Hong Kong, and then joined Alibaba Singapore as a research scientist with the prestigious Ali Star award. His primary research areas are natural language processing (NLP) and large language models (LLMs). His research aims to advance NLP models that are both inclusive, supporting diverse languages and cultures through multilingual language models, and trustworthy, through improved understanding, safety, and robustness of the models. He (co-)led multiple influential open-source research projects, including SeaLLMs (Large Language Models specialized for Southeast Asian languages), Babel, and AutoArena. He regularly serves as an area chair and on program committees for multiple leading conferences and journals, including ACL, EMNLP, NeurIPS, ICLR, etc. He served on the organizing committee of SSNLP 2025 and organized tutorials at IJCAI 2023 and SIGIR 2025. He is also recognized among the World’s Top 2% Scientists (by Stanford and Elsevier) in 2025.

Reinforcement Learning Without Explicit Rewards: Theory and Practice

Weitong Zhang

The deployment of reinforcement learning in modern systems often relies on implicit feedback rather than a precisely specified reward. This creates two central challenges. First, exploration tied to task rewards can be inefficient when rewards vary or are misspecified across tasks. Second, preference or proxy rewards can invite reward hacking. My research addresses these issues by advancing reinforcement learning without explicit reward design, with theoretical understanding and practical algorithms. In this talk, I will first discuss unsupervised and reward-free exploration. The key idea is to acquire broad coverage of states and skills through curiosity, then fine-tune the agent in a way that tolerates misspecification. Concretely, we introduce a theoretical framework with intrinsic rewards. We then instantiate these ideas with practical algorithms for uncertainty-aware exploration and representation learning that show strong empirical performance. Next, I will present methods that leverage relative reward signals in flow matching. I will discuss exact energy guidance for flow matching, where the model can guarantee the exact posterior distribution of the generation process. I will also showcase interdisciplinary applications enabled by these foundations, including energy-guided ligand generation. Finally, I will highlight our work that pioneers machine learning for electrochemical analysis. The talk will close with open challenges and future directions, including theory for learning from implicit signals under distribution shift and reliable offline-to-online adaptation with safety and constraint guarantees.

Dr. Weitong Zhang is an Assistant Professor in the School of Data Science and Society at the University of North Carolina at Chapel Hill. He earned his Ph.D. in Computer Science from the University of California, Los Angeles. His research centers on reinforcement learning, with applications to foundation models such as diffusion models and large language models (LLMs). Broadly, his goal is to develop and deepen the theoretical understanding of reinforcement learning and other machine learning paradigms, even when full theoretical justification remains an ongoing pursuit. His work also extends to the design of AI agents and learnable models for scientific domains, including chemistry, physics, and molecular science.

Breaking the Resource Monopoly: LLM Post-Training and Serving with Modest Data and Compute

Jiaxin Huang

State-of-the-art large language models are increasingly powerful, but many of them are trained on vast proprietary data with intensive compute, raising barriers to exploration and improvement for academic labs and smaller institutions. In this talk, I will present a unified research agenda for breaking the resource monopoly in both post-training and serving. On the training side, I will describe label-free and even zero-data post-training pipelines that let models curate their own reasoning supervision and evaluate it with controllable, fine-grained benchmarks. On the serving side, I will show how cost-aware inference can make adaptive test-time scaling more efficient by allocating computing resources across queries of varying difficulty. I will also show how to make language model generation more reliable by calibrating post-trained models to be less overconfident. Together, these components form a practical blueprint for building capable, trustworthy, and efficient LLM systems using modest data and compute resources.

Jiaxin Huang is an Assistant Professor in the Computer Science & Engineering Department at Washington University in St. Louis. She received her Ph.D. degree from the Computer Science Department at UIUC. Prior to UIUC, she received her Bachelor’s degree from Tsinghua University. She works in the field of natural language processing and machine learning, with a research focus on building reliable and efficient Large Language Models. She has over 30 research publications in top conferences, including ICLR, NeurIPS, ICML, EMNLP, COLM, and KDD. Her earlier paper on large language model self-improvement reshaped how large language models learn and reason, and is the first work to show LLMs can bootstrap their own reasoning without labels. She was recognized as one of only 10 Microsoft Research PhD Fellows in North America in 2021. She has served as Area Chair and PC Member for top conferences such as ACL, EMNLP, NeurIPS, ICLR, and ICML for over 4 years.

Towards Continually-Evolving AI: Selective and Expandable Multimodal Memory

Jaehong Yoon

This talk presents a new paradigm for continually-evolving AI, where agents selectively expand their knowledge and build multimodal memories that support long-horizon reasoning. I will introduce how selective learning and expandable memory enable AI to refine skills, acquire new ones, and interpret complex visual experiences over time. First, I will discuss continual multimodal instruction tuning through the lens of training data selection. Visual instruction datasets often arrive asynchronously and contain substantial redundancy, making continual improvement inefficient. The proposed approach implicitly groups samples by their underlying skills and identifies the examples that most effectively strengthen the model’s current knowledge while still supporting steady refinement and the acquisition of new abilities. Next, I will introduce a multimodal memory framework for long-form video understanding. While video language models perform well on short clips, they struggle with hours-long content where important visual details disappear through abstraction and events unfold across diverse timescales. In contrast, we build complementary episodic, semantic, and visual memories and retrieve them adaptively, ensuring that both high-level concepts and fine-grained visual evidence remain accessible during reasoning.

Dr. Jaehong Yoon is an Assistant Professor at Nanyang Technological University (NTU), Singapore. Prior to joining NTU, he was a postdoctoral research associate at UNC Chapel Hill, working with Prof. Mohit Bansal. He completed his Ph.D. in the School of Computing at KAIST in 2023, advised by Prof. Sung Ju Hwang. His research aims to build AI systems that can live and learn in the real world, developing agents that perceive and reason through rich multimodal signals, interact with complex environments, and continually evolve to stay reliable, adaptive, and trustworthy over time. Dr. Yoon is the recipient of several honors, including the NSCC Young Investigator Seed Project (YISP) Award (2026), the CoLLAs Early-Career Spotlight Program (2025), the Google PaliGemma Academic Program Award (2024), and the KAIST Best Ph.D. Dissertation Award from both the College of Engineering and the School of Computing. He serves as the DEI Chair for CoLLAs 2026 and has been an area chair for EACL 2026, NeurIPS 2025, EMNLP 2024, and NAACL 2024.

Cracking the Code of Health Communication: Explanation, Evaluation, and Personalization

Yue Guo

Although scientific papers contain the information needed to support health-related decision making, this knowledge is often not accessible to the public. Medical jargon, dense writing styles, and missing background explanations make biomedical information opaque to non-experts. This creates an urgent need for scientific knowledge to be communicated in clear, actionable language. In this talk, I present my research on plain language summary (PLS) generation: automatically translating scientific texts into lay-friendly summaries. I will discuss four key challenges and my contributions toward addressing them: building large-scale training data, generating effective explanations, developing reliable evaluation methods, and personalizing summaries to readers’ backgrounds.

Yue Guo is an Assistant Professor in the School of Information Sciences at the University of Illinois at Urbana-Champaign (UIUC). She received her Ph.D. from the University of Washington. Trained as a physician and epidemiologist before transitioning to health informatics, she brings a multidisciplinary perspective to AI-driven solutions in medicine. Prior to joining UIUC, she interned at Microsoft Research, Google, and the Allen Institute for Artificial Intelligence. Her research broadly spans natural language processing and health informatics, with a current focus on improving the accessibility and actionability of biomedical knowledge, as well as evaluating and enhancing the reasoning, factuality, and trustworthiness of large-scale language models.

Safe Reinforcement Learning for Trustworthy AI: Theory, Algorithms, and Applications

Honghao Wei

Safe reinforcement learning (RL) has emerged as a key paradigm for deploying AI in high-stakes domains such as autonomous driving, robotics, healthcare, and recommender systems. By embedding constraints into the learning process, safe RL enables agents to optimize performance while satisfying critical requirements, including collision avoidance, resource limits, and system reliability. Such guarantees are indispensable for real-world AI, where failures can cause physical harm, economic loss, or loss of trust. At the same time, demand for trustworthy AI continues to grow as machine learning is increasingly deployed in human-centered applications. This makes it essential to design RL algorithms that are not only efficient but also reliable, robust, and aligned with societal needs.

Honghao Wei is currently an Assistant Professor in the School of Electrical Engineering and Computer Science (EECS) at Washington State University. He was a member of the AI EDGE Institute. Before joining WSU, he received his PhD in Electrical and Computer Engineering (ECE) from the University of Michigan, Ann Arbor. His research interests include Control and Decision Theory, Reinforcement Learning, Safe Reinforcement Learning, Machine Learning, Reinforcement Learning with Human Feedback, Optimization, Robotics, and their intersections. His work has been published in top-tier AI journals and conferences, with several papers selected for oral and spotlight presentations, and was recognized by the AAAI New Faculty Highlights program. He received an Honorable Mention for the Richard F. and Eleanor A. Towner Prize for Outstanding PhD Research. He serves on program committees for MobiHoc, INFOCOM, and major AI and machine learning conferences.

Reasoning LLMs for Science and the Science of Their Reasoning

Hao Peng

In this talk, I will first focus on our recent work using reasoning-capable LLMs to accelerate scientific discovery, highlighting new benchmarks and evaluations that investigate their effectiveness in realistic research workflows. I will then turn to the science of reasoning LLMs themselves, drawing on our recent studies that investigate the mechanisms underlying their reasoning abilities, how these can be elicited or enhanced, and what this reveals about their potential and limitations. Together, these perspectives offer a unified view of LLMs both as engines for advancing science and as subjects of scientific inquiry.

Hao Peng is an Assistant Professor at the Siebel School of Computing and Data Science at the University of Illinois Urbana-Champaign. He is broadly interested in large language models, with a current focus on advancing their reasoning capabilities, causal understanding, and potential to drive scientific discovery. His research has been recognized with Best Paper Honorable Mentions at ACL and NAACL, as well as awards and gifts from Amazon, Open Philanthropy, Apple, and the Allen Institute for AI.

All-Purpose Mean Estimation over ℝ

Jasper Lee

Given society’s increasing reliance on data, its collection and processing into useful information is a technical problem of growing focus, and perhaps paradoxically, a critical bottleneck in many data science and machine learning applications. Yet, even for the most basic statistical problems such as mean estimation, there is a theory-practice divide. Conventional methods like the sample mean, while supported by theoretical results under strong assumptions, are often brittle in the presence of extreme data. Practitioners thus often use ad-hoc and unprincipled “outlier removal” heuristics, which can lead to wrong conclusions (e.g. Millikan’s underestimation of the electron charge (Holton 1978)).
In this talk, I will describe my work that essentially resolves the fundamental 1-d mean estimation problem. I will show the construction of a statistically-optimal and computationally-efficient 1-dimensional mean estimator, whose estimation error is optimal even in the leading multiplicative constant, under bare minimum distributional assumptions (FOCS 2021). Furthermore, I will discuss its various robustness properties (ICML 2025 Oral), in particular highlighting robustness to adversarial sample corruption.

Jasper Lee is an assistant professor at the University of California, Davis, in the Department of Computer Science and the Graduate Group in Applied Mathematics. He completed his PhD at Brown University, advised by Paul Valiant, and was subsequently a postdoc mentored by Ilias Diakonikolas at UW Madison.
His research interests are broadly in the foundations of data science, aiming to design practical, data-efficient and computationally-efficient algorithms for a variety of statistical applications. His work has been published at top venues across artificial intelligence (AAAI), machine learning (NeurIPS, ICML) and theoretical computer science (FOCS, SODA, COLT), including oral presentations at AAAI and ICML.

Cross-Modal Knowledge Transfer in Time Series AI via Large Vision Models

Jingchao Ni

Time series analysis has progressed from traditional autoregressive models to deep learning, Transformers, and foundation models (FMs), including large language models (LLMs) and large vision models (LVMs). These advances have expanded model design possibilities and, notably, enabled time series problem-solving across multiple modalities, greatly improving downstream applications in domains such as climate, energy, and healthcare. This talk will provide an overview of recent developments in large FMs for time series, highlighting frameworks for transferring knowledge from other modalities to time series, and identifying the advantages of LVMs over LLMs in cross-modal knowledge transfer. I will then delve into our recent research on LVMs for time series, discussing (1) mainstream techniques for imaging time series; (2) key strengths and limitations of LVMs in time series modeling; and (3) multimodal frameworks that integrate LVMs for time series encoding. This talk will conclude with applications and future directions. The aim of the talk is to review state-of-the-art AI techniques for time series, highlight unique challenges, and share our recent findings in this promising area.

Dr. Jingchao Ni is an assistant professor of computer science at the University of Houston. Prior to joining UH, he was a researcher at NEC Labs and AWS AI Labs. He received his PhD from The Pennsylvania State University. His research focuses on machine learning, data mining, and artificial intelligence. He is particularly interested in time series analysis through cross-modal learning, multimodal integration, LLM reasoning, and their application to areas emphasizing AI for science (e.g., neuroscience, geoscience) and AI for social good (e.g., healthcare, cyber-physical systems, AIOps). He is an organizer of tutorials and workshops on time series AI at leading conferences (e.g., AAAI, KDD, ICDM). His research has been published in top-tier conferences (e.g., ICLR, ICML, NeurIPS, CVPR, ACL, AAAI, KDD, WWW) and journals (e.g., IEEE TKDE, ACM TKDD), and has contributed to products, with more than 20 patents filed or granted.

Augmenting Human Creativity with Machine Learning

Hao-Wen (Herman) Dong

In this talk, I will survey my work in three main research directions: 1) generative models for music creation, 2) AI-assisted music creation tools, and 3) multimodal generative models for content creation. In particular, I will discuss our recent work on AI-assisted video editing that explores novel machine learning models that can cut, select, and rearrange a long video into a short video. In the first project, TeaserGen, we proposed a narration-centered teaser generation system that can effectively compress >30-min documentaries into <3-min teasers by leveraging pretrained LLMs and language-vision models. In the second project, REGen, we proposed a retrieval-embedded generation framework that allows an LLM to quote multimodal resources while maintaining a coherent narrative. I will conclude by discussing our future work towards next-generation video editing interfaces using multimodal LLMs and retrieval-embedded generation. I will also discuss our future work towards playful human-AI music co-creation systems where the user can control a music generation system through hand gestures and body movements.

Hao-Wen (Herman) Dong is an Assistant Professor in the Department of Performing Arts Technology at the University of Michigan. Herman’s research aims to augment human creativity with machine learning. He develops human-centered generative AI technology that can be integrated into professional creative workflows, with a focus on music, audio, and video creation. His long-term goal is to make professional content creation accessible to everyone. Herman received his PhD degree in Computer Science from the University of California San Diego, where he worked with Julian McAuley and Taylor Berg-Kirkpatrick. His research has been recognized by the UCSD CSE Doctoral Award for Excellence in Research, KAUST Rising Stars in AI, UChicago and UCSD Rising Stars in Data Science, ICASSP Rising Stars in Signal Processing, and UCSD GPSA Interdisciplinary Research Award.

Towards Human-centered Proactive Conversational Agents

Deng Yang

Conversational AI agents are envisioned to provide social support or functional service to human users via natural language interactions. The popularity of conversational AI has grown at an unprecedented pace with the advent of ChatGPT, which showcases exceptional proficiency in context understanding and response generation with large language models (LLMs). However, typical conversational systems are built to follow instructions, which means that the conversation is led by the user, and the system simply follows the user’s instructions or intents. My research endows conversational AI with the capability to create or control the conversation to achieve conversational goals by taking the initiative and anticipating impacts on itself or human users (Intelligence), namely Proactive Conversational AI. I will also highlight the importance of building human-centered proactive conversational AI that emphasizes human needs and expectations (Adaptivity) and considers the ethical and social implications of these agents (Civility), rather than focusing solely on technological capabilities.

DENG Yang is an Assistant Professor at Singapore Management University (SMU). His research interests center on dialogue and interactive systems, language-model-powered agents, and trust and reliability in large language models. He received the Google Southeast Asia Research Awards in 2024 and the Lee Kong Chian Fellowship in 2025. He was recognized among the World’s Top 2% Scientists (Single Year) in 2025. He was selected for the New Faculty Highlights program at AAAI 2026. He regularly serves as an Area Chair and Senior Program Committee Member at top conferences, such as ACL, NeurIPS, AAAI, EMNLP, NAACL, and COLM. He also served as the Program Co-Chair of the Singapore Symposium on Natural Language Processing (SSNLP) in 2024 and 2026. He was invited as a keynote speaker for various international workshops, such as the EMNLP 2025 PALS and CIKM 2025 ProActLLM workshops. He actively organizes tutorials on advanced topics at top conferences, including ACL, SIGIR, and WWW.

Efficient Model Specialization via Training-time and Test-time Adaptation

Huanrui Yang

The scaling up of single-model size is stalling due to the shortage of high-quality pretraining data and diminishing returns in performance. Inspired by the trend of specialization and heterogeneity explored in the semiconductor industry in the context of Moore’s Law, we discuss effective methods to specialize a pretrained model into heterogeneous variants for both improved efficiency and performance. In this talk, we discuss efficient model specialization algorithms that adapt a pretrained model to downstream tasks while improving its efficiency, generalize efficiently to multiple tasks via dynamic architectures, and improve inference-time efficiency by exploiting the diversity of functionality across model blocks. These research directions serve as the foundation for co-designing models, tasks, systems, and hardware towards a future of reconfigurable, efficient intelligence.

Huanrui Yang is an Assistant Professor in the Department of Electrical and Computer Engineering (ECE) at the University of Arizona (UA). Before joining the UA in 2024, he was a Postdoctoral Scholar in the EECS department of UC Berkeley and Berkeley AI Research. He obtained his Ph.D. in ECE from Duke University in 2022 and his B.E. in Electronic Engineering from Tsinghua University in 2017. His primary research focuses on the efficiency and robustness of deep neural network models, where he aims to identify the core functionality of the deep learning model and develop the most efficient and robust algorithm to fulfill such functionality. Applications of his research span computer vision, generative models, and natural language processing.

Bridging AI with Clinical Decisions from a Data Centric Perspective

Jiaming Cui

Machine learning and clinical decision-making are intertwined. ML methods depend on high-quality clinical data and domain knowledge to produce accurate and robust predictions, while clinical decisions increasingly rely on ML outputs to guide optimal interventions. Dr. Cui’s research brings a data-centric perspective to bridge AI with clinical decision-making. His work addresses multiple challenges arising from effectively utilizing multimodal clinical datasets and issues stemming from the complexity of disease spread dynamics in healthcare facilities. This talk will cover methods developed to address these challenges: better-designed models for optimizing disease surveillance and control policies, and new techniques for end-to-end learning with mechanistic models. The talk will conclude by discussing emerging challenges and opportunities at the intersection of machine learning, scientific modeling, and clinical decision-making for computer scientists, epidemiologists, and computational biologists.

Dr. Jiaming Cui is an assistant professor in the Department of Computer Science at Virginia Tech. His research aims to bridge artificial intelligence (AI) with clinical decisions and focuses on machine learning (ML), data mining, scientific modeling, and public health. He has published in leading science journals and top CS venues such as PNAS, NPJ Digital Medicine, NeurIPS, ICML, AAAI, and SDM, and has organized workshops and tutorials at leading conferences like KDD and ICDM. He has closely collaborated with clinicians, and his work has been applied in multiple healthcare facilities. His work has also significantly contributed to pandemic prediction and prevention in the past several years, including helping decision-making in healthcare facilities and participating in the CDC’s healthcare-associated infections team. Jiaming earned his Ph.D. in computer science at the Georgia Institute of Technology. Prior to this, he completed undergraduate studies at Shanghai Jiao Tong University, where he received bachelor’s degrees in both information engineering and finance, graduating with honors.

Towards Aligned and Efficient Large Language Models

Yu Meng

Large language models (LLMs) have rapidly transformed the landscape of AI, demonstrating remarkable capabilities across reasoning, communication, and problem-solving. Yet, realizing their full potential requires addressing two critical challenges. First, their behavior must be steered and refined after training to ensure reliability, safety, and alignment with human values and intentions. Second, their large scale comes with substantial costs in training and deployment, necessitating research into more efficient methods.
My research centers on advancing both of these fronts—making LLMs both aligned and efficient. On one side, I investigate post-training techniques that allow models to better reflect human preferences, demonstrate strong reasoning capabilities, and mitigate hallucination. On the other side, I study methods for improving data efficiency in training and inference efficiency in deployment. Together, these thrusts highlight a broader vision of enabling LLMs that are not only powerful, but also trustworthy and accessible at scale.

Yu Meng is an Assistant Professor in the Department of Computer Science at the University of Virginia. He obtained his Ph.D. in Computer Science from the University of Illinois at Urbana-Champaign (UIUC). His research focuses on developing more capable, efficient, and aligned large language models (LLMs). His honors include the Google PhD Fellowship (2021), the OpenAI Superalignment Fast Grant (2024), the ACM SIGKDD Dissertation Award (2024), an Amazon Research Award (2025), and recognition on the Forbes 30 Under 30 Asia list (2025).

Robust Machine Learning for Biomedical Data: Efficiency, Reliability, and Generalizability

Chenyu You

In the rapidly growing area of machine learning, there is profound promise in crafting intelligent, data-driven methods for diverse real-world applications. Yet, in safety-critical domains like healthcare, some fundamental challenges remain: (1) The insufficiency of raw biomedical data emphasizes the need for data-efficient and robust learning approaches. (2) The imperative of safety and stability necessitates a cohesive framework that unifies learning with theoretical guarantees. (3) The inherent heterogeneity and distribution shifts in real-world clinical data call for robust and generalizable learning methods. To address these challenges, I have explored several major directions: (i) (Robust) Machine Learning for Imperfect Medical Data: The development of machine learning models, particularly in the context of label scarcity, increasingly necessitates the collection of substantial annotated medical data. Moreover, medical data often display a long-tailed class distribution, which results in notable imbalance issues. There is therefore growing interest in training machine learning models jointly across imbalanced class distributions and limited annotations. I have developed novel, efficient, statistically consistent algorithms to improve empirical performance for biomedical image analysis. (ii) Learning with Theoretical Guarantees: As machine learning methods have become ubiquitous in clinical decision-making, their reliability and interpretability have become important. This is particularly crucial in the field of biomedical image analysis, where decision outcomes can have profound implications. I have developed novel machine learning algorithms that enable provably accurate anatomical modeling with theoretical guarantees. (iii) Generalization across Diverse Biomedical Data: The development of medical foundation models often requires massive and diverse biomedical data. To this end, I have developed various foundation models for biomedical imaging data and explored novel applications of these models. I have also developed novel medical AI agents that enable scalable and accurate predictive modeling, particularly for distribution shift problems.

Chenyu You is an Assistant Professor in the Department of Applied Mathematics & Statistics and the Department of Computer Science at Stony Brook University. He is also affiliated with the CVLab, AI Institute, and Institute for Advanced Computational Science (IACS). He works on the principles and practice of trustworthy machine intelligence, often with a focus on generalization and on making machine learning more reliable. His applied research includes applications to healthcare, biomedical imaging, and cognitive neuroscience. He received his Ph.D. in 2024 from Yale University under the advisement of James S. Duncan, his M.S. in 2019 from Stanford University under the advisement of Daniel Rubin, and his B.S. in 2017 from Rensselaer Polytechnic Institute under the advisement of Ge Wang, all in electrical engineering. He has also spent time at Facebook AI Research (FAIR), as well as Google Research. He serves on the Medical Image Computing and Computer-Assisted Intervention Society (MICCAI) and the SUNY AI Symposium Planning Committee, and as an associate editor for IEEE Transactions on Medical Imaging, IEEE Transactions on Neural Networks and Learning Systems, and Pattern Recognition. He has been ranked among the World’s Top 2% most-cited scientists by Stanford University since 2024, is a member of the Sigma Xi scientific research society, and received the Excellence in Teaching Award for Spring 2025.

Graph-based Label-Efficient Learning: When Graph-Structured Data Meets Limited Labels

Zixing Song

The success of deep learning is highly dependent on large-scale labeled data. This dependency creates a challenging bottleneck in high-stakes scientific domains, such as molecular discovery, where data annotation is prohibitively expensive. Consequently, developing label-efficient learning methods to maximize model performance under limited annotation budgets has become increasingly critical. However, mainstream label-efficient strategies still suffer from a Euclidean bias. Standard active learning or semi-supervised methods primarily operate on grid-structured or sequential data. They fail to effectively capture the non-Euclidean, relational dependencies inherent in graph-structured data. In this talk, I present a systematic framework to bridge this gap, outlining a progressive path toward label-efficient graph machine learning. First, we address the adaptation of general label-efficient paradigms to explicit graph data. We demonstrate how to strategically select labeled nodes to maximize performance gain, how to effectively utilize unlabeled nodes to capture topological priors, and how to synergistically combine both labeled and unlabeled nodes to enforce prediction consistency. Second, we explore the utilization of implicit graph data to enhance general label-efficient learning. We pioneer methods to construct and leverage latent graph structures within unstructured data, propagating label information to significantly boost prediction performance.

Zixing Song is a Lecturer (equivalent to Assistant Professor) in the School of Engineering Mathematics and Technology at the University of Bristol. He was previously a postdoctoral research associate in the Computational and Biological Learning Lab at the University of Cambridge. He received his PhD in Computer Science from The Chinese University of Hong Kong. His research centers around Graph Machine Learning, Geometric Deep Learning, Data-Centric AI, and Large Language Models. His recent focus is on how to utilize graph data optimally with LLMs in an efficient and trustworthy manner. He received the Junior Research Fellowship at Wolfson College at Cambridge in 2025. His research has been awarded the 2024 Doctoral Dissertation Award from the International Neural Network Society.

Toward Causal Foundation World Models: From Representation to Decision-Making

Mengyue Yang

World models are becoming a central interface between perception, prediction, and control, yet most current systems are fundamentally correlational and assume fixed dynamics, limiting their robustness in the non-stationary, open-ended domains we care about (robots, science, society). In this survey, I will outline a research agenda for causal foundation world models—scalable, general-purpose models that embed causal structure to support explanation, counterfactual reasoning, and decision-making. I will briefly connect recent advances in causal representation learning, LLM-based reasoning and planning, and multi-agent decision-making with emerging ideas on curiosity-driven intervention and dynamic causal discovery, and argue that bringing these strands together is key to building trustworthy world models that can generalize, adapt, and provide meaningful guarantees in real-world applications.

Mengyue Yang is a Lecturer in Artificial Intelligence at the University of Bristol, where she has been based since October 2024. Her research interests span causality, reinforcement learning, multi-agent systems, and world modeling. She received her PhD from University College London (UCL), where she developed influential methods for causal representation learning. During her PhD, she was a visiting researcher at KAUST with Prof. Jürgen Schmidhuber and at MBZUAI with Prof. Kun Zhang. She has published more than 20 papers at, and has served as a reviewer for, top conferences such as NeurIPS, ICML, ICLR, and AAAI. She has also been a guest editor for a special issue of the journal Machine Learning. In 2024, she was recognized as a Rising Star in AI, and she has (co-)organised workshops at NeurIPS (2024–2025) and ICLR (2025–2026).

From Representation to Reasoning: Toward General-Purpose Visual Intelligence

Chen Wei

Visual perception is our primary interface with the world, yet today’s AI remains largely confined to static recognition. My research asks a more fundamental question: What is the structure of visual experience, and how can machines internalize it as a basis for understanding and action? By uncovering how visual signals map to concepts, language, and imagination, we build representational and generative foundations that allow models not only to perceive what is present, but to infer what is missing, and to hypothesize what could be. From this foundation, I explore how vision can become a vehicle for reasoning. PyVision enables models to construct and deploy their own tools for problem-solving, while ViGaL uses gameplay to incentivize cognitive skills that transfer beyond any single task. These efforts push AI from passive understanding toward agents that can explore, anticipate consequences, and engage with the world on their own terms. The goal is an AI that doesn’t merely see, but thinks through vision.

Chen Wei is an Assistant Professor in the Department of Computer Science at Rice University. Before joining Rice in Fall 2025, she was a Postdoctoral Researcher at Meta AI (FAIR). She earned her Ph.D. in Computer Science from Johns Hopkins University in 2024, advised by Bloomberg Distinguished Professor Alan L. Yuille, and received her B.Sc. with honors from Peking University in 2019. During her Ph.D., Chen was a research intern at FAIR at Meta AI and at Google DeepMind. Her research lies broadly in Artificial Intelligence, with a focus on Computer Vision and Multimodal Learning. Her work spans generative AI, representation learning, and visual reasoning. She was recognized as an EECS Rising Star in 2023.
