The 38th Annual AAAI Conference on Artificial Intelligence
February 20-27, 2024 | Vancouver, Canada
AAAI-24 Tutorial and Lab List
Sponsored by the Association for the Advancement of Artificial Intelligence
February 20-21, 2024 | Vancouver Convention Centre – West Building | Vancouver, BC, Canada
Half Day Tutorials
TH1: AI for emerging inverse problems in computational imaging
TH2: Beyond Human Creativity: A Tutorial on Advancements in AI Generated Content
TH3: Language Models Meet World Models
TH4: Learning under Requirements: Supervised and Reinforcement Learning with Constraints
TH5: User Simulation for Evaluating Interactive Intelligent Systems
TH6: Combinatorial Solving with Provably Correct Results
Cancelled – TH7: Knowledge Editing for Large Language Models
TH8: Learning with Multiple Objectives Beyond Bilevel Optimization – New Foundations and Applications
TH9: Model Reuse: Concepts, Algorithms, and Applications
TH10: Recent Advance in Physics-Informed Machine Learning
TH11: Zeroth-Order Machine Learning: Fundamental Principles and Emerging Applications in Foundation Models
TH12: Knowledge-enhanced Graph Learning
Cancelled – TH13: Large-Scale Graph Neural Networks: Navigating the Past and Pioneering New Horizons
TH14: Machine learning for discrete optimization: Theoretical guarantees and applied frontiers
Cancelled – TH15: Privacy-Preserving Techniques for Large Language Models
TH16: Probabilistic Concept Formation with Cobweb
TH17: Experiments in Computational Social Choice Using Maps of Elections
TH18: Formalizing Robustness in Neural Networks: Explainability, Uncertainty, and Intervenability
TH19: Foundations, Practical Applications, and Latest Developments in Causal Decision Making
TH20: On the role of Large Language Models in Planning
TH21: Scalability, Robustness, and Optimization of Learning in Large Stochastic Games
Quarter Day Tutorials
TQ1: Deep Learning Methods for Unsupervised Time Series Anomaly Detection
Cancelled – TQ2: Physics-Inspired Geometric Pretraining for Molecule Representation
Cancelled – TQ3: Towards Out-of-Distribution Generalization on Graphs
Cancelled – TQ4: Disentangled Representation Learning
TQ5: Distributed Stochastic Nested Optimization for Emerging Machine Learning Models
TQ7: Advances in Robust Time-Series ML: From Theory to Practice
TQ8: Continual Learning on Graphs: Challenges, Solutions, and Opportunities
Cancelled – TQ9: Curriculum Learning: Theories, Approaches, Applications and Tools
TQ10: Graphs Counterfactual Explainability: A Comprehensive Landscape
TQ11: Aligning Large Language Models to Low-Resource Languages
Half Day Labs
Quarter Day Labs
TH1: AI for emerging inverse problems in computational imaging
This tutorial, tailored for machine learning and applied math researchers and practitioners, focuses on emerging computational imaging applications, particularly addressing lesser-known inverse problems. Our objective is to shed light on crucial yet understudied areas such as snapshot compressive imaging and single-photon imaging, offering insights into their mathematical modeling, advancements, and the limitations of current solutions. In doing so, we also emphasize how AI/ML approaches are instrumental in addressing these challenges, enhancing traditional optimization-based methods to develop state-of-the-art algorithms in computational imaging.
Designed to cater to a diverse audience, this tutorial is ideal for individuals with a foundational understanding of basic linear algebra and convex optimization. While prior knowledge in deep learning and underdetermined linear inverse problems is advantageous, it is not mandatory. Our structured approach ensures that all attendees, regardless of their background, can effectively engage with the material and gain a comprehensive understanding of the subject matter.
The tutorial begins with a review of classic underdetermined linear inverse problems, highlighting optimization-based solutions and learning-based methods such as unrolled networks, solutions based on generative models, and autoencoders. We discuss the pros and cons of these algorithms and the challenges that remain unaddressed.
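As a concrete illustration of the optimization-based solutions covered in this opening part, the following is a minimal sketch (in Python/NumPy) of iterative soft-thresholding (ISTA) for the LASSO formulation of sparse recovery from y = Ax; it is a generic textbook example rather than code from the tutorial, and the problem sizes are illustrative.

    import numpy as np

    def ista(A, y, lam=0.01, n_iter=200):
        """Minimize 0.5*||A x - y||^2 + lam*||x||_1 by iterative soft-thresholding."""
        L = np.linalg.norm(A, 2) ** 2                # Lipschitz constant of the gradient
        x = np.zeros(A.shape[1])
        for _ in range(n_iter):
            z = x - A.T @ (A @ x - y) / L            # gradient step on the data-fit term
            x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-threshold (prox of L1)
        return x

    # Toy underdetermined problem: 50 measurements of a 200-dimensional, 5-sparse signal.
    rng = np.random.default_rng(0)
    A = rng.standard_normal((50, 200)) / np.sqrt(50)
    x_true = np.zeros(200)
    x_true[rng.choice(200, 5, replace=False)] = 1.0
    x_hat = ista(A, A @ x_true)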
We then transition to three specific classes of emerging inverse problems in computational imaging:
- Snapshot Compressive Imaging (SCI): This section delves into SCI systems' innovative approach to capturing a 3D data cube from a single 2D projection. We will discuss applications of such systems, SCI recovery algorithms, and how AI/ML contributes to enhancing their efficiency and accuracy.
- Single-Photon Imaging: This segment covers ultrafast imaging and non-line-of-sight imaging, showcasing how these techniques are revolutionizing our approach to capturing images.
- Compressive Coherence Imaging: We explore imaging in the presence of speckle noise, discussing the intricacies and approaches to overcome these challenges.
By presenting a fundamental overview of each problem, along with existing solutions, we aim to pique attendees’ interest in these areas and provide them with the knowledge to effectively employ these solutions in their work. Our approach fosters a deeper understanding of a wide spectrum of inverse problems, ultimately contributing to advancements in the field of computational imaging.
For more detailed information about the tutorial, please visit our supplemental website: https://sites.google.com/view/aiforinverseproblems
Shirin Jalali is an Assistant Professor at the ECE department at Rutgers University. Prior to joining Rutgers in 2022, she was a research scientist at Nokia Bell Labs. She obtained her M.Sc. in Statistics and Ph.D. in Electrical Engineering from Stanford University. Her research interests primarily lie in information theory, statistical signal processing, and machine learning. She applies these disciplines to tackle computational imaging inverse problems and explore the fundamental limits of structure learning.
David Lindell is an Assistant Professor in the Department of Computer Science at the University of Toronto. His research combines optics, emerging sensor platforms, machine learning, and physics-based algorithms to enable new capabilities in visual computing. Prof. Lindell’s research has a wide array of applications including autonomous navigation, virtual and augmented reality, and remote sensing. Prior to joining the University of Toronto, he received his Ph.D. from Stanford University. He is a recipient of the 2021 ACM SIGGRAPH Outstanding Dissertation Honorable Mention Award and the 2023 Marr Prize.
Dr. Xin Yuan is currently an Associate Professor at Westlake University. He was a video analysis and coding lead researcher at Bell Labs, Murray Hill, NJ, USA from 2015 to 2021. Prior to this, he was a Post-Doctoral Associate with the Department of Electrical and Computer Engineering, Duke University from 2012 to 2015, where he worked on compressive sensing and machine learning. He develops compressive sensing techniques for high-dimensional imaging with applications to video, hyperspectral, microscopy, and x-ray imaging. Before joining Duke, Dr. Yuan obtained his Ph.D. from the Hong Kong Polytechnic University in 2012. He has published more than 200 journal and conference papers and holds more than 20 international patents. He has served as an associate editor of Pattern Recognition and Chinese Optics Letters, and as the lead guest editor of the IEEE Journal of Selected Topics in Signal Processing special issue “Deep Learning for High Dimensional Sensing” (2021). He has delivered invited talks at many international conferences on the topics of computational imaging and machine learning.
TH2: Beyond Human Creativity: A Tutorial on Advancements in AI Generated Content
The field of AI-generated content has experienced notable advancements recently, thanks to large language models and diffusion models that are capable of generating text and images. These developments have broadened applications across various domains, including text, image, video, and 3D object generation. Considering the increasing attention garnered by powerful generative models like ChatGPT for text and diffusion models for image synthesis, it is necessary for the AAAI community to fully explore these developments. This tutorial seeks to foster a deeper understanding of the field among conference attendees. Our tutorial will provide a comprehensive overview of AI-generated content, covering its foundations, frontiers, applications, and societal implications. It will cover the basics of large language models and diffusion models, as well as recent research and applications in this area. We will also discuss the societal concerns surrounding AI-generated content, including AI ethics and safety. By the end of the tutorial, attendees will have a better understanding of the current state of the field and the opportunities and challenges it presents. Our tutorial will be useful for researchers and practitioners interested in the application of AI-generated content to various domains. Attendees will gain insights into the latest techniques and tools for generating high-quality content and learn about the potential benefits and risks associated with this technology.
Keywords: AI Generated Content, Large Language Models, Diffusion Models, ChatGPT, Generative Models.
Website: https://sites.google.com/view/aigc-tutorial
Bang Liu is an Assistant Professor in the Department of Computer Science and Operations Research (DIRO) at the University of Montreal (UdeM). He is a member of the RALI laboratory (Applied Research in Computer Linguistics) of DIRO, the Institut Courtois of UdeM, and Mila – Quebec Artificial Intelligence Institute, and holds a Canada CIFAR AI Chair. He received his B.Engr. degree in 2013 from the University of Science and Technology of China (USTC), and his M.S. and Ph.D. degrees from the University of Alberta in 2015 and 2020, respectively. His research interests primarily lie in the areas of natural language processing, multimodal and embodied learning, theory and techniques for AGI, and AI for science (e.g., health, material science). Bang is keen to understand the essence of intelligence and to develop intelligent techniques for accelerating scientific discovery. He has published 50+ papers in top-tier conferences and journals such as ACL, EMNLP, NAACL, NeurIPS, ICLR, AAAI, KDD, The Web Conference, ICDM, CIKM, CVPR, ACM Transactions on Knowledge Discovery from Data (TKDD), IEEE/ACM Transactions on Networking (TON), and ACM Transactions on the Web (TWEB). He received the Faculty of Arts and Science Medal for Research Excellence 2022 at the University of Montreal. He has served as an area chair for EACL, ACL, and EMNLP, and as a program committee member or reviewer for many conferences and journals, including KDD, ACL, The Web Conference, SIGIR, AAAI, NeurIPS, ICLR, TOIS, JAIR, TPAMI, TNNLS, and PR. He has given several tutorials at WWW 2022, AAAI 2022, IJCAI 2021, SIGIR 2021, and KDD 2021, and has co-organized workshops at ICLR 2022 and NAACL 2022.
Dr. Yu (Hugo) Chen is a distinguished engineer and scientist known for his remarkable contributions in Artificial Intelligence. As the Co-founder and Head of Machine Learning at Anytime.AI, he has pioneered generative AI solutions for the legal domain. As a proud alumnus of Rensselaer Polytechnic Institute, he earned his PhD in Computer Science and has since established himself as an authority in the realms of Machine Learning (Deep Learning) and Natural Language Processing. His groundbreaking research has garnered attention and acclaim, with publications in esteemed conferences and journals like NeurIPS, ICML, ICLR, AAAI, IJCAI, ACL, EMNLP, NAACL, KDD, WSDM, TheWebConf, ISWC, and TNNLS. In recognition of his exceptional work, he received the Best Student Paper Award of AAAI DLGMA’20. Further extending his knowledge, Dr. Chen contributed to the pivotal book, “Graph Neural Networks: Foundations, Frontiers, and Applications”. As a respected expert in his field, he has shared his expertise through DLG4NLP tutorials at renowned conferences such as NAACL’21, SIGIR’21, KDD’21, IJCAI’21, AAAI’22, and TheWebConf’22. Dr. Chen’s pioneering work has not only left a mark in the academic world but also in the technology and marketing spheres, with mentions in prominent publications including the World Economic Forum, TechXplore, TechCrunch, Ad Age, and Adweek. As a testament to his innovative spirit, he holds co-inventorship for 4 US patents.
Xiaojie Guo currently serves as a Research Staff Member at the IBM Thomas J. Watson Research Center. She received her Ph.D. degree from the Department of Information Science and Technology at George Mason University. Her research topics include data mining, artificial intelligence, and machine learning, with special interests in deep learning on graphs, graph transformation and generation, and interpretable representation learning. She has published over 30 papers in top-tier conferences and journals such as KDD, ICDM, ICLR, NeurIPS, AAAI, CIKM, Knowledge and Information Systems (KAIS), and IEEE Transactions on Neural Networks and Learning Systems (TNNLS). She won the Best Paper Award at ICDM 2019 and has one first-author paper recognized as an ESI Hot and Highly Cited Paper. She also won the AAAI/IAAI 2022 Award. Xiaojie has served as an independent peer reviewer for multiple top academic venues, such as IEEE Transactions on Neural Networks and Learning Systems (TNNLS), ACM Transactions on Knowledge Discovery from Data (TKDD), ICLR, and NeurIPS.
Dr. Lingfei Wu is Co-founder and CEO of Anytime.AI, a new generative AI startup that empowers lawyers with unparalleled effectiveness and efficiency while connecting them to top-tier clients. He earned his Ph.D. degree in computer science from the College of William and Mary in 2016. Previously, he was an engineering leader in Content Understanding at Pinterest, leading a content understanding team of applied scientists, software engineers, and product managers that leveraged Large Language Models (LLMs) and Generative AI technologies to build interest-based and engagement-based content understanding signals. Before that, he was a Principal Scientist at JD.COM Silicon Valley Research Center, leading a team of 30+ machine learning/natural language processing scientists and software engineers building next-generation LLM-powered e-commerce systems. He was also a research staff member at the IBM Thomas J. Watson Research Center, where he led a team of 10+ research scientists developing novel Graph Neural Network methods and systems, work that earned three Outstanding Technical Achievement Awards at IBM Research. He has published one book (on GNNs) and more than 100 top-ranked conference and journal papers, and is a co-inventor of more than 60 filed US patents. Because of the high commercial value of his patents, he received eight invention achievement awards and was appointed an IBM Master Inventor, class of 2020. He is the recipient of Best Paper and Best Student Paper Awards at several venues, including IEEE ICC’19, the AAAI workshop on DLGMA’20, and the KDD workshop on DLG’19. His research has been featured in numerous media outlets, including Nature News, Yahoo News, AP News, PR Newswire, The Time Weekly, VentureBeat, MIT News, IBM Research News, and SIAM News. He has served as Industry and Government Program Co-Chair of IEEE BigData’22, Sponsorship Co-Chair of KDD’22, and Associate Conference Co-Chair of AAAI’21, and is a founding co-chair of several workshops, such as Deep Learning on Graphs (at AAAI’20-22 and KDD’19-22). He has also served as Associate Editor for IEEE Transactions on Neural Networks and Learning Systems and ACM Transactions on Knowledge Discovery from Data.
TH3: Language Models Meet World Models
Large language models (LMs) have achieved remarkable success in many language tasks. Recent works have also shown that knowledge of the world can emerge from large LMs, enabling them to assist decision-making for embodied tasks. However, the world knowledge exhibited by current large LMs is often not robust and cannot be grounded in physical environments without additional models. This hinders their ability to perform complex reasoning and planning tasks reliably. For example, in creating action plans to move blocks to a target state, GPT-4 achieves a significantly lower success rate compared to humans.
On the other hand, humans perform deliberate reasoning and planning based on the mental model of the world, also known as a world model (WM), which enables us to simulate actions and their effects on the world’s state. WMs encoding knowledge of the physical world can drastically improve the data efficiency and robustness of intelligent agents.
However, WMs were typically studied in reinforcement learning and robotics, areas conceptually distinct from problems studied in language modeling. This gap indicates new opportunities for connecting WMs and LMs to enhance LM capabilities in reasoning and planning in both embodied and general settings, and address the aforementioned limitations. Emerging studies on the intersection of WMs and LMs have demonstrated promising results. This tutorial aims to summarize and present a unified view of connecting WMs and LMs, highlighting the various opportunities for improved machine reasoning and planning based on large LMs through world modeling. We will review recent works on learning WMs and using them to further learn and perform embodied tasks. We will show how LMs can utilize external WMs to compensate for their lack of grounded world knowledge and how LMs themselves can learn world models from embodied experiences beyond text data, and use these internal WMs to guide complex reasoning.
Zhiting Hu is an Assistant Professor in the Halicioglu Data Science Institute at UC San Diego. He received his Bachelor’s degree in Computer Science from Peking University in 2014, and his Ph.D. in Machine Learning from Carnegie Mellon University in 2020. His research interests lie in the broad area of machine learning, artificial intelligence, natural language processing, and ML systems. In particular, he is interested in principles, methodologies, and systems for building AI agents with all types of experience (data, symbolic knowledge, rewards, adversaries, lifelong interplay, etc.), and their applications in controllable text generation, healthcare, and other domains. His research was recognized with a best demo nomination at ACL 2019 and an outstanding paper award at ACL 2016.
Tianmin Shu is a Research Scientist at the Massachusetts Institute of Technology and an incoming Assistant Professor in the Department of Computer Science and the Department of Cognitive Science at Johns Hopkins University. His research goal is to advance human-centered AI by engineering human-level machine social intelligence, building socially intelligent systems that can understand, reason about, and interact with humans in real-world settings. His work received the 2017 Cognitive Science Society Computational Modeling Prize in Perception/Action and several best paper awards at NeurIPS and IROS workshops. His research has also been covered by multiple media outlets, such as New Scientist, Science News, and VentureBeat. He received his PhD degree from the University of California, Los Angeles, in 2019.
TH4: Learning under Requirements: Supervised and Reinforcement Learning with Constraints
This tutorial is geared towards researchers and practitioners interested in imposing requirements on machine learning (ML) systems, such as fairness, robustness, and safety. Typically, these statistical, data-driven constraints are enforced by combining the learning objective and requirement violation metrics into a single training loss. To guarantee that the solution satisfies the requirements, however, this approach requires careful tuning of hyperparameters (penalty coefficients) using cross-validation, a computationally intensive and time-consuming process. Constrained learning instead incorporates requirements as statistical constraints rather than by modifying the training objective. In this tutorial, we provide an overview of theoretical and algorithmic advances from the past 5 years that show when and how it is possible to learn under constraints and effectively impose them on ML systems, both during training and at test time. Specifically, we explore the role and impact of different types of requirements in supervised learning, robust learning, and reinforcement learning (RL). First, we introduce new non-convex duality results that yield generalization guarantees for constrained supervised learning. We also use these results to derive practical algorithms to tackle these problems, despite their non-convexity. We then leverage these advances to obtain algorithms for robust learning capable of achieving better trade-offs between nominal and adversarial accuracy. Finally, we develop a parallel theory for constrained RL, showing that it is strictly more expressive than its unconstrained counterpart. Throughout the tutorial, we illustrate the effectiveness and flexibility of constrained learning in a diverse set of applications, from robust image classification and federated learning to learning under invariance and safe RL. Ultimately, this tutorial provides a general tool that can be used to tackle a variety of problems in ML and sequential decision-making.
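To make the contrast with penalty tuning concrete, here is a minimal sketch of the primal-dual pattern that constrained learning algorithms of this kind typically follow, written in PyTorch for a hypothetical task loss and constraint; the function objective, the tolerance c, and the step sizes are illustrative assumptions, not the presenters' implementation.

    import torch

    def primal_dual_step(model, optimizer, batch, constraint_batch, lam,
                         objective, c=0.1, eta_dual=0.01):
        """One step of primal descent / dual ascent on the empirical Lagrangian.
        `objective` is a hypothetical loss function; the constraint requires the
        loss on `constraint_batch` (e.g., a protected subgroup) to stay below c."""
        loss = objective(model, batch)                       # task objective
        violation = objective(model, constraint_batch) - c   # constraint slack
        lagrangian = loss + lam * violation                  # lam >= 0 is the multiplier
        optimizer.zero_grad()
        lagrangian.backward()
        optimizer.step()                                     # primal (model) update
        lam = max(0.0, lam + eta_dual * violation.item())    # dual (multiplier) update
        return lam

The multiplier plays the role of a penalty coefficient, but it is adapted automatically by dual ascent instead of being tuned by cross-validation.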
Organizers
Miguel Calvo-Fullana received the B.Sc. degree in electrical engineering from the Universitat de les Illes Balears (UIB) in 2010 and the M.Sc. and Ph.D. degrees in electrical engineering from the Universitat Politècnica de Catalunya (UPC) in 2013 and 2017, respectively. He joined Universitat Pompeu Fabra (UPF) in 2023, where he is a Ramón y Cajal fellow. Prior to joining UPF, he held postdoctoral appointments at the University of Pennsylvania and the Massachusetts Institute of Technology, and during his Ph.D., he was a research assistant at the Centre Tecnològic de Telecomunicacions de Catalunya (CTTC). His research interests lie in the broad areas of learning and optimization for autonomous systems, with a particular emphasis on multi-agent systems, wireless communication and network connectivity. He is the recipient of best paper awards at ICC 2015, GlobalSIP 2015, and ICASSP 2020.
Luiz F. O. Chamon received the B.Sc. and M.Sc. degrees in electrical engineering from the University of São Paulo, São Paulo, Brazil, in 2011 and 2015 and the Ph.D. degree in electrical and systems engineering from the University of Pennsylvania (Penn), Philadelphia, in 2020. Until 2022, he was a postdoctoral fellow at the Simons Institute of the University of California, Berkeley. He is currently an independent research group leader at the University of Stuttgart, Germany. In 2009, he was an undergraduate exchange student of the Masters in Acoustics of the École Centrale de Lyon, Lyon, France, and worked as an Assistant Instructor and Consultant on nondestructive testing at INSACAST Formation Continue. From 2010 to 2014, he worked as a Signal Processing and Statistics Consultant on a research project with EMBRAER. He received both the best student paper and the best paper awards at IEEE ICASSP 2020 and was recognized by the IEEE Signal Processing Society for his distinguished work for the editorial board of the IEEE Transactions on Signal Processing in 2018. His research interests include optimization, signal processing, machine learning, statistics, and control.
Santiago Paternain received the B.Sc. degree in electrical engineering from the Universidad de la República Oriental del Uruguay, Montevideo, Uruguay, in 2012, the M.Sc. in Statistics from the Wharton School in 2018, and the Ph.D. in Electrical and Systems Engineering from the University of Pennsylvania in 2018. He is currently an Assistant Professor in the Department of Electrical, Computer, and Systems Engineering at Rensselaer Polytechnic Institute. Prior to joining Rensselaer, Dr. Paternain was a postdoctoral researcher at the University of Pennsylvania. His research interests lie at the intersection of machine learning and control of dynamical systems. Dr. Paternain was the recipient of the 2017 CDC Best Student Paper Award and the 2019 Joseph and Rosaline Wolfe Best Doctoral Dissertation Award from the Electrical and Systems Engineering Department at the University of Pennsylvania.
Alejandro Ribeiro received the B.Sc. degree in electrical engineering from the Universidad de la República Oriental del Uruguay in 1998 and the M.Sc. and Ph.D. degrees in electrical engineering from the Department of Electrical and Computer Engineering at the University of Minnesota in 2005 and 2007. He joined the University of Pennsylvania (Penn) in 2008 where he is currently Professor of Electrical and Systems Engineering. His research is in wireless autonomous networks, machine learning on network data and distributed collaborative learning. Papers coauthored by Dr. Ribeiro received the 2022 IEEE Signal Processing Society Best Paper Award, the 2022 IEEE Brain Initiative Student Paper Award, the 2021 Cambridge Ring Publication of the Year Award, the 2020 IEEE Signal Processing Society Young Author Best Paper Award, the 2014 O. Hugo Schuck best paper award, and paper awards at EUSIPCO 2021, ICASSP 2020, EUSIPCO 2019, CDC 2017, SSP Workshop 2016, SAM Workshop 2016, Asilomar SSC Conference 2015, ACC 2013, ICASSP 2006, and ICASSP 2005. His teaching has been recognized with the 2017 Lindback award for distinguished teaching and the 2012 S. Reid Warren, Jr. Award presented by Penn’s undergraduate student body for outstanding teaching. Dr. Ribeiro received an Outstanding Researcher Award from Intel University Research Programs in 2019. He is a Penn Fellow class of 2015 and a Fulbright scholar class of 2003.
TH5: User Simulation for Evaluating Interactive Intelligent Systems
As AI technologies are increasingly deployed in real-world applications, notably in the form of search engines, recommender systems, and conversational assistants, how to evaluate such technologies in the context of interactive intelligent system applications has emerged as an urgent challenge for practitioners who deploy AI products and for researchers alike.
Research communities have so far mostly relied on test collections to perform reproducible experiments, but such an evaluation methodology cannot be used to evaluate interactive intelligent systems, whose utility must be assessed by users via interactions with the system. To tackle this challenge, researchers have proposed and developed an evaluation methodology based on user simulation. The idea is to simulate a real user with an intelligent agent that mimics the user’s decisions when interacting with an AI system, and to evaluate the AI system by having it interact with such an artificial user while measuring the perceived utility and the cost/effort, from the user’s perspective, of finishing a task. The work on user simulation for evaluating intelligent systems has so far been done mostly in applied AI communities, notably Information Retrieval, Recommender Systems, and the World Wide Web. The goal of this tutorial is to provide a systematic review of this topic and to discuss many interesting novel AI-related research challenges for AAAI attendees, allowing them to learn about the major ideas, frameworks, models, and algorithms for building user simulators and for using simulators to evaluate an interactive system, as well as important future research directions.
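In schematic form, the evaluation protocol sketched above is a simple interaction loop; the code below is a generic illustration with hypothetical simulator and system interfaces (start, respond, gain, cost, is_satisfied, and next_action are all assumed names), not a framework presented in the tutorial.

    def evaluate_with_simulated_user(system, simulator, task, max_turns=20):
        """Run one simulated session; return the utility gained and effort spent."""
        utility, effort = 0.0, 0.0
        action = simulator.start(task)                 # e.g., an initial query
        for _ in range(max_turns):
            response = system.respond(action)          # system reacts to the simulated user
            utility += simulator.gain(response)        # perceived usefulness of the response
            effort += simulator.cost(action)           # user effort for this interaction
            if simulator.is_satisfied(task):           # stop once the task is finished
                break
            action = simulator.next_action(response)   # mimic the user's next decision
        return utility, effort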
This introductory tutorial primarily targets graduate students, academic researchers, and industry practitioners who are interested in learning about how AI techniques can be applied to build user simulators and/or how user simulation can be used to evaluate interactive AI systems. Since the question of how to accurately evaluate interactive intelligent systems is important to both practitioners who would like to assess the utility of their product systems and researchers who would like to know whether their new algorithms are truly more effective than the existing ones, we expect our tutorial to be broadly appealing to many participants of AAAI.
For more details see https://usersim.ai/aaai2024-tutorial/
Krisztian Balog is a full professor at the University of Stavanger and a staff research scientist at Google. His general research interests lie in the use and development of information retrieval, natural language processing, and machine learning techniques for intelligent information access tasks. More information can be found at https://krisztianbalog.com/.
ChengXiang Zhai is a Donald Biggar Willett Professor in Engineering in the Department of Computer Science at the University of Illinois at Urbana-Champaign. His research interests include intelligent information retrieval, text mining, natural language processing, machine learning, and their applications. More information can be found at https://czhai.cs.illinois.edu/.
TH6: Combinatorial Solving with Provably Correct Results
Combinatorial optimization problems are ubiquitous in our lives. They show up in the forms of matching problems (e.g., assigning junior doctors to hospitals), scheduling problems (e.g., radiation therapy), logistics problems (e.g., visiting nurses), and many more. In many cases, these problems are also notoriously hard (often NP-hard, or even worse). Still, thanks to tremendous progress over the last decades, we now have access to sophisticated algorithms that can solve these problems in practice.
Unfortunately, it turns out that, due to their immense complexity, these (solving) algorithms often contain subtle bugs. In this tutorial, we give an overview of the most successful approach to dealing with this issue, namely proof logging, meaning that solvers aren’t allowed to just claim an answer to a problem: they’re expected to also produce an independently verifiable proof that backs up this claim. In the field of Boolean Satisfiability, this has done wonders for solver reliability, and has also helped with social acceptability of computer-generated mathematical results. What if we could do the same for the more general field of constraint programming, producing an auditable solving process where people whose lives or livelihoods are affected by a computer’s decision would no longer have to resort to hoping that the system is free of bugs?
We will start this tutorial by explaining what a proof really is, and what it means for an algorithm to certify the correctness of its answers by using proof logging. We will give a brief overview of the (extended) resolution and DRAT proof systems used in SAT proof logging. Then we will look at what is needed to bring proof logging to a broader range of solving algorithms, starting with some subgraph-finding algorithms, and moving towards a full CP solver with multiple global constraints and even some reformulation. We will show, by example, what you need to do to use this technology in your own algorithms, and how this relates to the underlying proof methods.
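To give a flavor of what checking such a proof involves, below is a simplified sketch of the reverse-unit-propagation (RUP) check at the heart of DRAT-style SAT proof logging: a clause claimed by the solver is accepted if asserting its negation and unit-propagating over the formula produces a conflict. Clauses are lists of signed integers in the usual DIMACS convention; real checkers are far more engineered.

    def unit_propagate(clauses, assignment):
        """Propagate unit clauses; return None on conflict, else the final assignment."""
        changed = True
        while changed:
            changed = False
            for clause in clauses:
                remaining = [l for l in clause if -l not in assignment]
                if not remaining:
                    return None                                # all literals falsified: conflict
                if len(remaining) == 1 and remaining[0] not in assignment:
                    assignment = assignment | {remaining[0]}   # forced (unit) literal
                    changed = True
        return assignment

    def rup_check(clauses, claimed_clause):
        """The clause has the RUP property if negating all of its literals
        and propagating leads to a conflict."""
        return unit_propagate(clauses, frozenset(-l for l in claimed_clause)) is None

    # (x1 or x2) and (not x1 or x2) together imply the unit clause (x2).
    assert rup_check([[1, 2], [-1, 2]], [2])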
We’ll finish by discussing how proof logging can also support advanced solving techniques such as symmetry breaking. Surprisingly, all of this can be done in a simple proof format known as “cutting planes with strengthening” that does not know anything about non-binary variables, global constraints, graphs, groups, or symmetries.
Bart Bogaerts is an associate professor at the Artificial Intelligence Lab of the Vrije Universiteit Brussel. His research interests span different ways of using logic for problem solving.
This includes in particular knowledge representation languages (with a focus on algebraic frameworks for unifying them) and combinatorial optimization (with a focus on how to solve problems with correctness guarantees).
Ciaran McCreesh is a Royal Academy of Engineering Research Fellow at the University of Glasgow. His research looks at how we can efficiently solve hard combinatorial problems in practice, particularly in the areas of constraint programming and subgraph finding.
Jakob Nordström is a full professor at the University of Copenhagen and Lund University, where he leads the Mathematical Insights into Algorithms for Optimization (MIAO) group. He has done research in computational complexity theory, combinatorial optimization, and certifying algorithms using proof logging, with applications ranging all the way from hardware verification to finite model theory.
Cancelled – TH7: Knowledge Editing for Large Language Models
Large Language Models (LLMs) have showcased immense potential in generating human-like text, as demonstrated by numerous studies. Despite their remarkable capabilities, LLMs such as ChatGPT can occasionally err in maintaining factual accuracy or logical consistency. They might inadvertently generate content that is harmful or offensive, and they are unaware of events that occurred after their training phase. The challenge at hand, quite intuitively, is how to effectively update these LLMs or rectify their errors without resorting to complete retraining or continuous training, both of which can be significantly resource-intensive and time-consuming. To this end, the notion of knowledge editing for LLMs has been proposed, which provides an efficient way to modify the model’s behavior, particularly within a specified domain of interest, without compromising its performance on other inputs.
We will initiate this tutorial by defining the tasks involved in knowledge editing for LLMs and introducing evaluation metrics and benchmark datasets. Subsequently, we will provide an overview of various knowledge editing methodologies. Initially, our focus will be on methods that preserve the parameters of LLMs. These techniques adjust the model’s output for certain instances by integrating an auxiliary network alongside the original, untouched model. We will then transition to methods that modify the parameters of LLMs, which aim to alter the model parameters responsible for undesirable outputs. Throughout the tutorial, we aim to share insights gleaned from the diverse communities engaged in knowledge editing research and to introduce the open-source tool EasyEdit. Further, we will delve into potential issues as well as opportunities associated with knowledge editing for LLMs, with the goal of imparting valuable insights to the community. All tutorial slides will be available at https://github.com/zjunlp/KnowledgeEditingPapers.
Ningyu Zhang is an associate professor and doctoral supervisor at Zhejiang University, leading a group working on KG and NLP technologies. He has supervised the construction of an information extraction toolkit named DeepKE (2.3K+ stars on GitHub). His research interests include knowledge graphs and natural language processing. He has published many papers in top international venues such as Nature Machine Intelligence, Nature Communications, NeurIPS, ICLR, AAAI, IJCAI, ACL, EMNLP, and NAACL. He has served as Area Chair for ACL/EMNLP 2023, ARR Action Editor, Senior Program Committee member for IJCAI 2023, and Program Committee member for NeurIPS, ICLR, ICML, and AAAI.
Jiachen Gu will join the University of California, Los Angeles as a postdoctoral researcher. His research interests lie within machine learning for dialogue systems and retrieval-based LMs. He received his Ph.D. degree from the University of Science and Technology of China in June 2022. He has published many papers in top international academic conferences and journals such as ACL, EMNLP, SIGIR, CIKM, IJCAI, and IEEE/ACM Transactions on Audio, Speech, and Language Processing. He has served as a PC member or reviewer for ACL, EMNLP, NAACL, AAAI, IJCAI, TASLP, and TOIS. He received the ACL 2023 Best Paper Honorable Mention Award, the Best Paper Award of the ACL 2022 DialDoc Workshop, and an Outstanding Doctoral Dissertation Nomination Award from CIPSC. He has achieved top rankings in DSTC7-DSTC11 and has given multiple talks on dialogue systems.
Yunzhi Yao is a Ph.D. candidate at the School of Computer Science and Technology, Zhejiang University. His research interests focus on editing large language models and knowledge-enhanced natural language processing. He has been a research intern at Microsoft Research Asia, supervised by Shaohan Huang, and a research intern at Alibaba Group. He has published many papers in ACL, EMNLP, NAACL, and SIGIR. He is the first author of the EMNLP 2023 paper “Editing Large Language Models: Problems, Methods, and Opportunities” and one of the developers of the knowledge editing framework EasyEdit, which is related to this tutorial.
Zhen Bi is a Ph.D. candidate at the School of Software Engineering, Zhejiang University. His research interests focus on knowledge graphs, reasoning with large language models, and knowledge-enhanced natural language processing. He has published many papers in ICLR, ACL, and EMNLP. Moreover, he is one of the developers of the knowledge editing framework EasyEdit, which is related to this tutorial.
Shumin Deng is a research fellow at the Department of Computer Science, School of Computing (SoC), National University of Singapore. She obtained her Ph.D. degree at the School of Computer Science and Technology, Zhejiang University. Her research interests focus on natural language processing, knowledge graphs, information extraction, neuro-symbolic reasoning, and LLM reasoning. She was awarded 2022 Outstanding Graduate of Zhejiang Province, China, and 2020 Outstanding Intern in Academic Cooperation of Alibaba Group. She is a member of ACL and of the Youth Working Committee of the Chinese Information Processing Society of China. She served as a Research Session (Information Extraction) Chair for EMNLP 2022 and a Publication Chair for CoNLL 2023. She has been a journal reviewer for many high-quality journals, such as TPAMI, TASLP, TALLIP, WWWJ, ESWA, and KBS, and serves as a Program Committee member for NeurIPS, ICLR, ACL, EMNLP, EACL, AACL, WWW, AAAI, IJCAI, CIKM, and others. She has constructed a billion-scale Open Business Knowledge Graph (OpenBG) and released a leaderboard that has attracted thousands of teams and researchers.
TH8: Learning with Multiple Objectives Beyond Bilevel Optimization – New Foundations and Applications
Learning with multiple objectives emerges frequently as a new unified learning paradigm from recent machine learning problems such as learning under fairness and safety constraints; learning across multiple tasks, including multi-task learning and meta-learning; learning across multiple agents, including federated and multi-agent learning; and learning with hierarchical games, including incentive design, reward shaping, and Stackelberg games. This tutorial will introduce principled frameworks that cover bilevel optimization for learning under two objectives with preference, multi-objective optimization for learning with multiple objectives without preference, and their combinations. Efficient algorithms will be presented, and recent advances in optimization and generalization theory will be introduced. We will highlight how these algorithms and theories can be applied to multi-task learning and learning with Markov games. Upon completion, the audience is expected to have gained the necessary knowledge to effectively perform learning tasks with multiple objectives in and beyond the aforementioned applications.
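As one concrete instance of learning with multiple objectives without preference, the sketch below implements the classic two-objective min-norm (MGDA-style) update: the step direction is the minimum-norm convex combination of the two gradients, which decreases both objectives until a Pareto-stationary point is reached. This is a standard construction included for orientation, not material specific to the tutorial.

    import numpy as np

    def min_norm_direction(g1, g2):
        """Minimum-norm point in the convex hull {a*g1 + (1-a)*g2 : a in [0,1]}."""
        diff = g1 - g2
        denom = diff @ diff
        if denom == 0.0:
            return g1                                   # gradients agree
        a = np.clip((g2 - g1) @ g2 / denom, 0.0, 1.0)   # closed form for two objectives
        return a * g1 + (1.0 - a) * g2

    # Toy example: jointly descend f1(x) = ||x - a||^2 and f2(x) = ||x - b||^2.
    x = np.array([2.0, -1.0])
    for _ in range(200):
        g1 = 2 * (x - np.array([1.0, 0.0]))
        g2 = 2 * (x - np.array([0.0, 1.0]))
        x -= 0.1 * min_norm_direction(g1, g2)           # converges to a Pareto point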
External link to the website: https://sites.google.com/view/mol-tutorial/home
Tianyi Chen is an Assistant Professor in the Department of Electrical, Computer, and Systems Engineering at Rensselaer Polytechnic Institute (RPI), where he is jointly supported by the RPI – IBM Artificial Intelligence Research Partnership. Dr. Chen received his B. Eng. degree in Electrical Engineering from Fudan University in 2014, and the Ph.D. degree in Electrical and Computer Engineering from the University of Minnesota in 2019. Dr. Chen’s research focuses on theoretical and algorithmic foundations of optimization and statistical machine learning.
Dr. Chen is the inaugural recipient of the IEEE Signal Processing Society Best PhD Dissertation Award in 2020, a recipient of an NSF CAREER Award in 2021, and a recipient of an Amazon Research Award in 2022. He is also a co-author of several best paper award winners, including the Best Student Paper Award at the NeurIPS Federated Learning Workshop in 2020 and at the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) in 2021.
Lisha Chen is a Ph.D. candidate in the Department of Electrical, Computer, and Systems Engineering at Rensselaer Polytechnic Institute (RPI). She received her M.S. in Electrical Engineering from RPI in 2021 and her B.S. from Huazhong University of Science and Technology in 2017. Her research focuses on theoretical foundations of multi-objective learning and meta-learning, as well as their applications to computer vision and communication tasks. On these topics, she has co-authored a monograph in Foundations and Trends in Signal Processing.
Lisha Chen has received several awards, including the IEEE Signal Processing Society Scholarship and Rensselaer’s Founders Award of Excellence in 2023, Rensselaer’s Belsky Award for Computational Sciences and Engineering in 2021, and the IBM-AIRC PhD Fellowship, which has supported her research for four years since 2020. She is also a recipient of the National Scholarship from the Ministry of Education of China.
Zhuoran Yang is an Assistant Professor of Statistics and Data Science at Yale University. His research interests span the intersection of machine learning, statistics, game theory, and optimization. In particular, he mainly works on the foundations of reinforcement learning, representation learning, and deep learning.
Before coming to Yale, he worked as a postdoctoral researcher at the University of California, Berkeley, under the supervision of Michael I. Jordan. Prior to that, he obtained his Ph.D. from the Department of Operations Research and Financial Engineering at Princeton University, where he was co-advised by Jianqing Fan and Han Liu. He completed his bachelor’s degree in Mathematics at Tsinghua University in 2015.
TH9: Model Reuse: Concepts, Algorithms, and Applications
Machine learning techniques have made significant strides, leading to a wide array of general and specialized models that fulfill practical needs. As a result, model reuse has emerged as a crucial technique, empowering developers to harness the power of pre-trained models (PTMs), including small models and even large language models, to enhance the performance and efficiency of their target machine learning systems. These PTMs encapsulate valuable inductive biases that are helpful for target tasks, and well-designed reuse strategies are capable of extracting knowledge from a given model, pushing them beyond their original scope, and facilitating diverse machine learning tasks.
This tutorial connects the widely applied notions of model reuse across various domains and presents a modern taxonomy that covers all stages of leveraging PTMs, namely data preparation, architecture design, model training, and model inference. Besides providing a comprehensive overview of typical model reuse methods, the tutorial also explores extensions of basic model reuse notions, such as the active selection of PTMs from a model zoo, promising applications of model reuse, and the challenges, limitations, and trends in model reuse.
Dr. Han-Jia Ye is an Associate Professor in the School of Artificial Intelligence at Nanjing University. His primary research interest is in machine learning, including representation learning, model reuse, and meta-learning. He received his PhD degree in computer science from Nanjing University, China, in 2019. He has served as the tutorial co-chair of SDM’23 and doctoral forum co-chair of SDM’22. Additionally, he is a Senior Program Committee/Program Committee member for top-tier conferences including ICML, NeurIPS, IJCAI, ECML-PKDD, and others.
Dr. Yao-Xiang Ding is a tenure-track assistant professor in the State Key Lab of CAD & CG, Zhejiang University. His primary research area lies in machine learning, with a specific interest in understanding how prior inductive bias can help reduce sample complexity. He received his PhD degree in computer science from Nanjing University, China, in 2020. He is a regular program committee member of NeurIPS, ICML, ICLR, AAAI, and IJCAI. He also serves as a reviewer/meta-reviewer for top-tier conferences like UAI, AISTATS, ECML-PKDD, ECAI, CIKM, SDM, ACML, and PAKDD, as well as journals like IEEE TPAMI, IEEE TKDE, ACM TKDD, MLJ, and KAIS. He served as the publicity co-chair of SDM’23.
TH10: Recent Advance in Physics-Informed Machine Learning
Machine learning (ML) is spurring a transformation in the computational sciences by providing a new way to build flexible, universal, and efficient approximations for complex high-dimensional functions and functionals. One area in which the impact of these new tools is beginning to be understood is the physical sciences, where traditionally intractable high-dimensional partial differential equations are now within reach. This tutorial will explore how developments in ML complement computational problems in the physical sciences, with a particular focus on solving partial differential equations, where the challenges of high-dimensionality and data acquisition also arise.
The first important example this tutorial will cover is the use of deep learning methods for solving high-dimensional PDEs, which have wide application in variational rare-events calculations, many-body quantum systems, and stochastic control. Another challenge covered by this tutorial, which researchers often face, is the complexity or lack of specification of the models they are using when performing uncertainty quantification. Thus, another line of research aims to recover the underlying dynamics from observational data.
This tutorial will introduce the well-developed methods and theories for using machine learning in scientific computing. We will first discuss how to incorporate physical priors into machine learning models. Next, we will discuss how these methods can help to solve challenging physical and chemical problems. Finally, we will discuss the statistical and computational theory for scientific machine learning. In this tutorial, we will not focus on the technical details behind these theories, but on how they can help the audience to understand the challenges of using machine learning in differential equation applications and to develop new methods for addressing these challenges.
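As a concrete example of incorporating a physical prior, the sketch below assembles a PINN-style residual loss for the 1-D heat equation u_t = u_xx using automatic differentiation; the collocation sampling, network size, and omission of boundary/initial terms are illustrative assumptions rather than the presenters' methods (boundary and initial losses would be added in the same way).

    import torch

    net = torch.nn.Sequential(                           # u_theta(t, x): R^2 -> R
        torch.nn.Linear(2, 64), torch.nn.Tanh(),
        torch.nn.Linear(64, 64), torch.nn.Tanh(),
        torch.nn.Linear(64, 1))

    def residual_loss(n_pts=256):
        """Mean squared residual of u_t - u_xx = 0 at random collocation points."""
        tx = torch.rand(n_pts, 2, requires_grad=True)    # (t, x) sampled in [0, 1]^2
        u = net(tx)
        du = torch.autograd.grad(u.sum(), tx, create_graph=True)[0]
        u_t, u_x = du[:, 0], du[:, 1]
        u_xx = torch.autograd.grad(u_x.sum(), tx, create_graph=True)[0][:, 1]
        return ((u_t - u_xx) ** 2).mean()

    optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)
    for _ in range(100):                                 # training loop (truncated)
        optimizer.zero_grad()
        loss = residual_loss()
        loss.backward()
        optimizer.step()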
Grant M. Rotskoff is an Assistant Professor of Chemistry at Stanford. His work involves developing theoretical and computational tools that can probe and predict the properties of physical systems driven away from equilibrium. Recently, he has focused on characterizing and designing physically accurate machine learning techniques for sampling and physics-based simulation. Prior to his current position, Grant was a James S. McDonnell Fellow working at the Courant Institute of Mathematical Sciences at New York University.
Yiping Lu is a Courant Instructor at the Courant Institute of Mathematical Sciences at New York University and an incoming assistant professor in the Department of Industrial Engineering and Management Sciences at Northwestern University. He received his PhD in computational and mathematical engineering from Stanford University. His awards include the CPAL Rising Star Award (2024), Rising Star in Data Science (University of Chicago, 2022), a Stanford Interdisciplinary Graduate Fellowship (2021-2024), and a SenseTime Scholarship (2018-2019). He currently serves as an area chair at AISTATS. His current research interests include time series analysis, non-parametric statistics, and machine learning, often set in the context of physics-based systems governed by differential equations.
More Information to Come
TH11: Zeroth-Order Machine Learning: Fundamental Principles and Emerging Applications in Foundation Models
The overarching goal of this tutorial is twofold: The first aim is to conduct a comprehensive assessment of the latest advancements in the gradient-free learning paradigm, also referred to as zeroth-order machine learning (ZO-ML). This involves an exploration of the theoretical and methodological foundations that support ZO-ML. The second goal is to illustrate the effective integration of ZO-ML techniques with emerging ML/AI applications. This step aims to bridge the theoretical and practical aspects of ZO-ML, demonstrating its potential to overcome design limitations in current foundation model (FM)-oriented applications.
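For orientation, the canonical building block behind much of ZO-ML is the randomized two-point gradient estimator, which queries only function values; the sketch below is a standard textbook construction, not code from the tutorial.

    import numpy as np

    def zo_gradient(f, x, mu=1e-3, n_samples=50, seed=0):
        """Estimate grad f(x) from function values only:
        E[(f(x + mu*u) - f(x - mu*u)) / (2*mu) * u] ~ grad f(x) for Gaussian u."""
        rng = np.random.default_rng(seed)
        g = np.zeros_like(x)
        for _ in range(n_samples):
            u = rng.standard_normal(x.shape)
            g += (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u
        return g / n_samples

    # Sanity check: for f(x) = ||x||^2 the true gradient is 2x.
    f = lambda x: float(x @ x)
    x = np.array([1.0, -2.0, 0.5])
    g_hat = zo_gradient(f, x, n_samples=500)   # approximately [2.0, -4.0, 1.0]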
Format
The tutorial comprises five parts: an introduction to ZO-ML, including its mathematical foundations and a comparison with first-order methods; an exploration of the foundational algorithms, properties, and scaling techniques of ZO-ML; a focus on ZO-ML applications in various AI fields; a practical demonstration session showcasing ZO-ML tools, benchmarking, and live demos in large language models and adversarial defense; and a concluding segment with key takeaways, future perspectives, and additional resources for deepening knowledge in ZO-ML.
Sijia Liu is an Assistant Professor at the CSE department of Michigan State University, and an Affiliate Professor at MIT-IBM Watson AI Lab, IBM Research. His research focuses on trustworthy and scalable ML, and optimization for signal processing. He received the Best Paper Runner-Up Award at UAI (2022), and the Best Student Paper Award at ICASSP (2017). He has given numerous tutorials at top-tier ML and SP conferences (including NeurIPS, CVPR, AAAI, KDD, MLSP, and CISS), and co-chaired several workshops on Trustworthy AI and Optimization for ML at KDD’19-22 and ICML’22-23.
Tianlong Chen is an incoming Assistant Professor of Computer Science at The University of North Carolina at Chapel Hill in Fall 2024. Before that, he was a postdoctoral researcher at CSAIL@MIT and BMI@Harvard. Dr. Chen received his Ph.D. degree in Electrical and Computer Engineering at The University of Texas at Austin in 2023. He received the IBM Ph.D. Fellowship, Adobe Ph.D. Fellowship, Graduate Dean’s Prestigious Fellowship, AdvML Rising Star Award, and the Best Paper Award from the inaugural Learning on Graphs (LoG) Conference 2022. He has served as an area chair in ICIP’22, ICIP’23, and CPAL’23.
Zhangyang Wang is a tenured Associate Professor and holds the Temple Foundation Endowed Faculty Fellowship #7, in the Chandra Family Department of Electrical and Computer Engineering at The University of Texas at Austin. He has broad research interests spanning from the theory to the application aspects of machine learning (ML). He received many research awards, including an NSF CAREER Award, an ARO Young Investigator Award, an IEEE AI’s 10 To Watch Award, an INNS Aharon Katzir Young Investigator Award, a Google Research Scholar award, an IBM Faculty Research Award, a J. P. Morgan Faculty Research Award, an Amazon Research Award, an Adobe Data Science Research Award, a Meta Reality Labs Research Award, and two Google TensorFlow Model Garden Awards.
Pin-Yu Chen is a principal research scientist at the IBM Thomas J. Watson Research Center, Yorktown Heights, NY, USA. He is also the chief scientist of the RPI-IBM AI Research Collaboration and PI of ongoing MIT-IBM Watson AI Lab projects. Dr. Chen received his Ph.D. degree in electrical engineering and computer science from the University of Michigan, Ann Arbor, USA, in 2016. Dr. Chen’s recent research focuses on adversarial machine learning and the robustness of neural networks. His long-term research vision is building trustworthy machine learning systems. At IBM Research, he received the honor of IBM Master Inventor and several research accomplishment awards, including an IBM Corporate Technical Award in 2021. His research contributes to IBM open-source libraries including the Adversarial Robustness Toolbox (ART 360) and AI Explainability 360 (AIX 360). He has published more than 40 papers related to trustworthy machine learning at major AI and machine learning conferences, given tutorials at NeurIPS’22, AAAI’22, IJCAI’21, and CVPR(’20,’21,’23), and organized several workshops on adversarial machine learning.
Mingyi Hong is an Associate Professor at the Department of Electrical and Computer Engineering, University of Minnesota. His research has been focused on optimization theory, and its applications in machine learning and signal processing. His works have been recognized by a number of awards, including a Best Paper Award from IEEE Signal Processing Society (2021), and a Best Student Paper Award from Asilomar Conference (2018). He has been one of the three finalists for IEEE Signal Processing Society Early Career Research Award (2021), and Mathematical Optimization Society Young Researchers in Continuous Optimization (2013, 2016).
Wotao Yin is the director of the Decision Intelligence Lab at DAMO Academy. Before joining Alibaba US in 2019, he was a professor in the Department of Mathematics at the University of California, Los Angeles. His research interests include computational optimization and its applications in signal processing, machine learning, and other data science problems. He won the NSF CAREER award in 2008, an Alfred P. Sloan Research Fellowship in 2009, a Morningside Gold Medal in 2016, a DAMO Award in 2021, and an INFORMS Egon Balas Prize in 2021. Eight papers co-authored with his students and collaborators have received best-paper or similar awards. Since 2018, he has been among the top 1% of cited researchers by Clarivate Analytics.
Yihua Zhang is a Ph.D. student in the Department of Computer Science and Engineering at Michigan State University. His research focuses on optimization theory and the optimization foundations of various AI applications. In general, his research spans the areas of machine learning (ML)/deep learning (DL), computer vision, and security. He has published papers at major ML/AI conferences such as CVPR, ICCV, ICML, NeurIPS, and ICLR. He also received the Best Paper Runner-Up Award at the Conference on Uncertainty in Artificial Intelligence (UAI), 2022.
TH12: Knowledge-enhanced Graph Learning
Graph learning has emerged as a prominent area of interest in both academic and industrial circles, aiming to model complex graph-structured data. While existing methods primarily concentrate on leveraging node features and topological structures, there is untapped potential in enhancing graph learning through auxiliary knowledge. This auxiliary knowledge refers to valuable information that can be obtained, extracted, or learned from resources beyond the provided node features and structures. Exploring the realm of knowledge-enhanced graph learning (KEGL) is crucial for improving learning capabilities. In this tutorial, our focus is on showcasing state-of-the-art techniques in KEGL. We delve into the diverse sources of knowledge that can benefit graph learning, including internal perception, external sources, human expertise, and knowledgeable models. Additionally, we provide insights into real-world applications and engage in discussions about future research directions.
Yijun Tian
University of Notre Dame
yijun.tian@nd.edu
Yijun Tian is a Ph.D. candidate in Computer Science and Engineering at the University of Notre Dame. His research interests center around artificial intelligence, machine learning, and data science. His research aims to empower machines with knowledge to positively influence real-world applications, health, and sciences. His work has been recognized with multiple awards and honors from top-tier conferences such as ICLR and AAAI.
Shichao Pei
University of Massachusetts Boston
shichao.pei@umb.edu
Shichao Pei is an Assistant Professor in the Department of Computer Science at the University of Massachusetts Boston. He was a Postdoctoral Research Associate at the University of Notre Dame. He received his Ph.D. degree in Computer Science from King Abdullah University of Science and Technology in 2021. His research interests span the areas of artificial intelligence, machine learning, and data mining, in particular graph representation learning, knowledge representation learning, data-efficient machine learning, and recommendation systems. He has served as PC and SPC at NeurIPS, ICML, ICLR, KDD, AAAI, IJCAI, etc.
Xiangliang Zhang
University of Notre Dame
xzhang33@nd.edu
Xiangliang Zhang is an Associate Professor in the Department of Computer Science and Engineering, University of Notre Dame. She was previously an Associate Professor in Computer Science at the King Abdullah University of Science and Technology (KAUST), Saudi Arabia. She received her Ph.D. degree in computer science from INRIA-Université Paris-Sud, France, in 2010. Her main research interests and experience are in machine learning and data mining. She has published more than 200 refereed papers in leading international conferences and journals. She serves as associate editor of IEEE Transactions on Dependable and Secure Computing, Information Sciences, and the International Journal of Intelligent Systems, and regularly serves as area chair or on the (senior) program committee of IJCAI, SIGKDD, NeurIPS, AAAI, ICML, and WSDM.
Wei Wang
University of California, Los Angeles
weiwang@cs.ucla.edu
Wei Wang is the Leonard Kleinrock Chair Professor in Computer Science and Computational Medicine at University of California, Los Angeles and the director of the Scalable Analytics Institute (ScAi). She was a professor in Computer Science at the University of North Carolina at Chapel Hill from 2002 to 2012, and was a research staff member at the IBM T. J. Watson Research Center between 1999 and 2002. Her research interests include big data analytics, data mining, machine learning, natural language processing, bioinformatics and computational biology, and computational medicine. She has filed seven patents, and has published one monograph and 300+ research papers in international journals and major peer-reviewed conference proceedings, including multiple best paper awards. She chairs the Executive Committee of ACM SIGKDD. She is a Fellow of both ACM and IEEE.
Hanghang Tong
University of Illinois at Urbana-Champaign
htong@illinois.edu
Hanghang Tong is an associate professor of Computer Science at the University of Illinois at Urbana-Champaign. Before that, he worked at Arizona State University as an associate professor, at the City University of New York (City College) as an assistant professor, and at IBM T. J. Watson Research Center as a Research Staff Member. He received his Ph.D. from the Machine Learning Department of the School of Computer Science at Carnegie Mellon University in 2009. His major research interest lies in large-scale data mining for graphs and multimedia. He has published 300+ papers in these areas, and his research has received several awards, including the ICDM Tao Li Award (2019), the SDM/IBM Early Career Data Mining Research Award (2018), an NSF CAREER Award (2017), the ICDM 10-Year Highest Impact Paper Award (2015 & 2022), and several best paper awards (e.g., ICDM’06, SDM’08, CIKM’12). He is a fellow of IEEE (2022) and a distinguished member of ACM (2020).
Nitesh V. Chawla
University of Notre Dame
nchawla@nd.edu
Nitesh V. Chawla is the Frank M. Freimann Professor of Computer Science and Engineering at the University of Notre Dame. He is the Founding Director of the Lucy Family Institute for Data and Society. He is an expert in artificial intelligence, data science, and network science, and is motivated by the question of how technology can advance the common good through interdisciplinary research. As such, his research is not only at the frontier of fundamental methods and algorithms but is also making interdisciplinary and translational advances for societal impact. He is the recipient of 2015 IEEE CIS Outstanding Early Career Award; the IBM Watson Faculty Award; the IBM Big Data and Analytics Faculty Award; and the 1st Source Bank Technology Commercialization Award. He was recognized with the Rodney F. Ganey Award and Michiana 40 under 40 honor. He is a Fellow of both ACM and IEEE.
Cancelled – TH13: Large-Scale Graph Neural Networks: Navigating the Past and Pioneering New Horizons
Graph Neural Networks (GNNs) have gained significant attention in recent years due to their ability to model complex relationships between entities in graph-structured data such as social networks, protein structures, and knowledge graphs. However, due to the size of real-world industrial graphs and the special architecture of GNNs, deploying GNNs on large-scale graphs has been a long-standing challenge for engineers and researchers, which significantly limits their use in real-world applications. In this tutorial, we will cover the fundamental scalability challenges of GNNs, frontiers of large-scale GNNs including classic approaches and newly emerging techniques, the evaluation and comparison of scalable GNNs, and their large-scale real-world applications. Overall, this tutorial aims to provide a systematic and comprehensive understanding of the challenges and state-of-the-art techniques for scaling GNNs. The summary and discussion on future directions will inspire engineers and researchers to explore new ideas and developments in this rapidly evolving field. The website of this tutorial is available at https://sites.google.com/ncsu.edu/largescalegnn/home/aaai_2024.
Keywords
Graph Neural Networks; Large-scale Graphs; Scalability
Description
This tutorial offers a comprehensive overview of techniques designed for large-scale machine learning on graphs, encompassing both theoretical foundations and practical applications. It delves into past and recent research endeavors aimed at enhancing the scalability of Graph Neural Networks (GNNs) and explores their diverse potential use cases. Specifically, this tutorial will provide a thorough examination of the theory and algorithms for a wide array of methods, both existing and emerging, ensuring that attendees leave with a strong theoretical foundation and a profound grasp of these methods. Furthermore, it will highlight their applications in industry, bridging the gap between theory and real-world practice. This comprehensive coverage ensures that participants are not only well-versed in specific methods but also informed about the latest developments in the field and how to apply them in real-world scenarios. Attendees are expected to have a foundational understanding of machine learning and deep learning concepts. Familiarity with linear algebra, calculus, and probability theory is also expected, as these mathematical principles are fundamental to GNN algorithms. Moreover, a basic understanding of graph theory and graph data structures is preferred. Although we will offer an introductory overview of GNNs during the tutorial, participants with prior knowledge in these areas will be better prepared to engage with advanced GNN concepts. Furthermore, expertise in specific application domains where GNNs are frequently utilized, such as social networks, recommendation systems, or natural language processing, will prove beneficial for attendees interested in practical applications.
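To make the classic sampling-based approaches concrete, the following minimal sketch (in Python with NumPy; our own illustrative example, not the presenters' code) performs GraphSAGE-style neighborhood sampling, which bounds the cost of a mini-batch by fixed per-layer fanouts instead of the full graph size:

    import numpy as np

    def sample_neighborhood(adj_list, seeds, fanouts, rng):
        # adj_list: dict node -> list of neighbor ids; seeds: mini-batch nodes.
        # fanouts: e.g. [10, 5] samples at most 10 one-hop and 5 two-hop
        # neighbors per node, bounding memory independently of graph size.
        layers = [list(seeds)]
        frontier = list(seeds)
        for fanout in fanouts:
            sampled = set()
            for v in frontier:
                nbrs = adj_list.get(v, [])
                if nbrs:
                    k = min(fanout, len(nbrs))
                    sampled.update(rng.choice(nbrs, size=k, replace=False).tolist())
            frontier = list(sampled)
            layers.append(frontier)
        return layers  # node ids needed at each hop for this mini-batch

    rng = np.random.default_rng(0)
    adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1], 3: [0]}
    print(sample_neighborhood(adj, seeds=[0], fanouts=[2, 2], rng=rng))

The returned node sets are what a layer-wise GNN forward pass would need to load for this batch; bounding them per layer is the essential idea behind many of the scalable training methods the tutorial surveys.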
Rui Xue is a Ph.D. student in the Department of Electrical and Computer Engineering at North Carolina State University. His main research interests include machine learning on graphs, scalability of machine learning, and signal processing. He has published several papers at signal processing and machine learning conferences.
Haoyu Han is currently a second-year Ph.D. candidate in the Department of Computer Science and Engineering at Michigan State University. Haoyu’s primary research areas encompass graph data mining and large-scale machine learning. He has authored several publications in the field of graph data mining.
Tong Zhao is a Research Scientist in the Computational Social Science group at Snap Research. He earned a Ph.D. in Computer Science and Engineering at the University of Notre Dame in 2022. His research focuses on graph machine learning and its applications in real-world use cases. His work has resulted in 20+ conference and journal publications in top venues.
Neil Shah is a Lead Research Scientist and Manager at Snap Research, working on machine learning algorithms and applications on large-scale graph data. His work has resulted in 55+ conference and journal publications in top venues, including several best-paper awards. He earned a PhD in Computer Science in 2017 from Carnegie Mellon University’s Computer Science Department.
Jiliang Tang is an MSU Foundation Professor in the Department of Computer Science and Engineering at Michigan State University. His research interests include data mining, machine learning, and their applications in social media, biology, and education. He has published his research in highly ranked journals and top conference proceedings; his work has received more than 26,000 citations (h-index 77) and extensive media coverage.
Xiaorui Liu is an assistant professor in the Department of Computer Science at North Carolina State University. He received his Ph.D. degree in Computer Science from Michigan State University in 2022. His research interests include deep learning on graphs, large-scale machine learning, and trustworthy artificial intelligence. He has published innovative works in top-tier conferences. More information about him can be found at https://sites.google.com/ncsu.edu/xiaorui/.
TH14: Machine learning for discrete optimization: Theoretical guarantees and applied frontiers
Machine learning has become a powerful tool for discrete optimization. Discrete optimization algorithms have an incredible array of applications, including routing, manufacturing, planning, finance, and many others. The problems that, for example, a shipping company must solve to route its trucks will change daily, but not drastically: although demand and traffic will vary, the underlying road network will remain the same. This means that there is likely underlying structure that can be uncovered with the help of machine learning to optimize algorithm runtime on future problems. Since discrete optimization algorithms are used so widely, and since discrete optimization problems are so difficult to solve, this use of machine learning has the potential to make a significant impact in industry and scientific research.
This tutorial will cover how machine learning can be used within the discrete optimization pipeline from many perspectives, including how to design novel combinatorial algorithms with machine-learned modules and configure existing algorithms’ parameters to optimize performance. Topics will include both applied machinery (such as graph neural networks and reinforcement learning) as well as theoretical tools for providing provable guarantees.
I will only assume a background in machine learning. The tutorial will cover all other relevant background, such as graph neural networks and statistical learning theory tools.
Ellen Vitercik is an Assistant Professor at Stanford University with a joint appointment between the Management Science & Engineering department and the Computer Science department. Her research revolves around machine learning theory, discrete optimization, and the interface between economics and computation. Before joining Stanford, she spent a year as a Miller Fellow at UC Berkeley after receiving a PhD in Computer Science from Carnegie Mellon University. Her thesis won the SIGecom Doctoral Dissertation Award and the CMU School of Computer Science Distinguished Dissertation Award.
Cancelled – TH15: Privacy-Preserving Techniques for Large Language Models
This tutorial bridges two dynamic and evolving fields in AI: large language models (LLMs) and privacy-preserving AI. As we witness the proliferation of AI models, including LLMs, across various domains, the imperative to ensure privacy and security has never been more critical. We seek to unite these disciplines by introducing researchers and practitioners in the LLM space to cutting-edge privacy-preserving techniques that have the potential to revolutionize high-stake AI systems.
As the deployment of AI models in sensitive areas like healthcare, finance, and communication continues to grow, safeguarding privacy becomes paramount. Privacy-preserving techniques, particularly when applied to large language models, represent a relatively new and rapidly evolving field. This course introduces the LLM community to practical security approaches and novel privacy-preserving methodologies (such as federated learning, differential privacy, and secure multi-party computation) that are not yet widely explored within the LLM frontier research with its unique pre-training, fine-tuning and deployment requirements, but have immense potential for high-stake applications, such as healthcare, forensics, and justice.
With the increasing scrutiny on data privacy regulations and ethical AI practices, the tutorial addresses a topic that is at the forefront of discussions in both the LLM and AI security communities. Understanding and implementing privacy-preserving methods is essential for compliance with evolving regulations and maintaining public trust. Building on the author's book on the same topic, forthcoming in early 2024, the course provides actionable insights and tools that AI professionals can immediately apply to their work. Attendees will gain both a methodological understanding and hands-on coding experience with techniques that enable the responsible deployment of large language models in real-world high-stake applications.
Dr. Baihan Lin is a research scientist in Trustworthy AI at Columbia University and IBM Research, as well as an incoming tenure-track professor. With 10+ years of industry experience (Google, IBM, Microsoft, Amazon, BGI Genomics), 50+ publications and patents, a PhD from Columbia, and a strong reputation in high-stake AI solutions in health, Baihan has chaired international conferences, been a Bell Labs Prize and XPRIZE finalist, and pioneered research on learning health systems and on improving clinical outcomes with security and privacy; his work spans deep learning, reinforcement learning, and natural language processing (NLP). Baihan authored “Reinforcement Learning Methods in Speech and Language Technology” (Springer, 2023) and “Privacy and Security for Large Language Models” (O’Reilly, 2024). He has taught tutorials, chaired conferences, and contributed to prestigious journals, bringing a unique perspective to the responsible, secure, and accessible application of large language models.
TH16: Probabilistic Concept Formation with Cobweb
Humans possess unique capabilities for forming concepts from experience, such as acquiring and revising them in an incremental, unsupervised, and cumulative way. Creating machine learning approaches that can realize these, and other, human-like capabilities remains a tantalizing goal. This tutorial delves into this intriguing area, showcasing the Cobweb family of approaches, which provide a foundation for human-like machine learning. These approaches adopt a probabilistic formalism to support robust incremental and unsupervised concept learning. Moreover, Cobweb models support prediction along multiple dimensions simultaneously, making them more flexible than traditional supervised learning models.
This session will present Cobweb’s core methodology for learning and performance, discuss several variants and extensions, and demonstrate its unique capabilities across several real-world applications, such as tabular data clustering and prediction and language and vision modeling. Our interactive tutorial format will walk participants through several examples, offering hands-on guidance to obtain and run code for each. We aim to bridge the gap between theory and practice, offering insights into how Cobweb’s human-like learning methodology can lead to new machine learning advancements.
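To give a flavor of the probabilistic formalism, the sketch below (Python; a simplified illustration under the assumption of nominal attributes, not the tutorial's released code) computes category utility, the measure that Cobweb-style systems maximize when deciding how to place a new instance in the concept hierarchy:

    from collections import Counter

    def category_utility(partition):
        # partition: list of clusters, each a list of instances; an instance
        # is a dict mapping attribute name -> nominal value.
        instances = [x for cluster in partition for x in cluster]
        n = len(instances)

        def expected_score(items):
            # sum over attributes i and values j of P(A_i = V_ij)^2
            counts = {}
            for x in items:
                for attr, val in x.items():
                    counts.setdefault(attr, Counter())[val] += 1
            m = len(items)
            return sum((c / m) ** 2 for ctr in counts.values() for c in ctr.values())

        base = expected_score(instances)  # guessing accuracy without concepts
        gain = sum(len(c) / n * (expected_score(c) - base) for c in partition)
        return gain / len(partition)  # average gain per concept

    clusters = [
        [{"color": "red", "shape": "round"}, {"color": "red", "shape": "round"}],
        [{"color": "blue", "shape": "square"}],
    ]
    print(category_utility(clusters))

Intuitively, category utility rewards partitions whose concepts make attribute values more predictable than they are in the data as a whole, which is what drives Cobweb's incremental placement decisions.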
Our tutorial invites participants from diverse backgrounds and application areas who are keen to explore an innovative new concept learning paradigm. It is designed for participants with a basic background in computer science and AI/ML, though participants with deeper knowledge in these areas will find the tutorial beneficial too.
For more details see https://humanlikelearning.com/aaai24-tutorial/.
Dr. Pat Langley serves as Director of the Institute for the Study of Learning and Expertise. He was founding editor of two journals, Machine Learning and Advances in Cognitive Systems. Dr. Langley’s current research focuses on cognitive architectures for embodied agents, learning complex procedures from instructions, and inducing dynamic causal models from time series.
Dr. Douglas Fisher is an Associate Professor of Computer Science at Vanderbilt University. His expertise spans online learning, artificial intelligence, machine learning, computational creativity, and computational sustainability. Dr. Fisher is a former NSF Program Director overseeing funding in artificial intelligence and machine learning.
Dr. Christopher MacLellan is an Assistant Professor in the School of Interactive Computing at Georgia Tech. He specializes in AI, ML, HCI, and the Cognitive & Learning Sciences. He aims to understand how humans teach and learn and to develop artificial systems that can teach and learn like people.
TH17: Experiments in Computational Social Choice Using Maps of Elections
In this tutorial we will present the “map of elections” framework for conducting experiments and analyzing election data in computational social choice. The idea of the framework is to take a set of elections, compute distances between them, and present them as points on a plane, whose distances resemble the distances between the elections. First, we will present the main components of the framework, including different ways of computing distances between elections, sources of election data, and algorithms for embedding elections in 2D space. Second, we will show a number of use cases of the framework, including planning and analyzing experiments, as well as analyzing synthetic and real-life data. Finally, we will show various extensions of the framework.
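As a rough illustration of that pipeline (Python; a simplified sketch in which a plain L1 distance between matched position distributions stands in for the earth mover's distance used in the literature's positionwise metric), one can compute pairwise distances between elections and embed them in 2D with multidimensional scaling:

    import numpy as np
    from scipy.optimize import linear_sum_assignment
    from sklearn.manifold import MDS

    def position_matrix(votes, m):
        # votes: list of rankings, each a sequence of candidate ids 0..m-1,
        # best candidate first. P[c, p] = fraction of votes ranking c at p.
        P = np.zeros((m, m))
        for vote in votes:
            for pos, cand in enumerate(vote):
                P[cand, pos] += 1.0
        return P / len(votes)

    def election_distance(e1, e2, m):
        # Match candidates across the two elections so that the total L1
        # distance between their position distributions is minimized.
        P1, P2 = position_matrix(e1, m), position_matrix(e2, m)
        cost = np.abs(P1[:, None, :] - P2[None, :, :]).sum(axis=2)
        rows, cols = linear_sum_assignment(cost)
        return cost[rows, cols].sum()

    def map_of_elections(elections, m, seed=0):
        # Pairwise distances, then a 2D embedding whose Euclidean
        # distances approximate them (the "map").
        n = len(elections)
        D = np.zeros((n, n))
        for i in range(n):
            for j in range(i + 1, n):
                D[i, j] = D[j, i] = election_distance(elections[i], elections[j], m)
        return MDS(n_components=2, dissimilarity="precomputed",
                   random_state=seed).fit_transform(D)

The tutorial covers more faithful election distances and alternative embedding algorithms; this sketch only conveys the distance-matrix-then-embedding structure of the framework.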
The target audience of this tutorial are researchers working on all aspects of computational social choice, with a particular focus on researchers studying elections and on early-stage researchers. In principle, the attendees can come without prior knowledge (of computational social choice), as all relevant notions will be explained. However, to make the most out of the tutorial, it will be helpful for the attendees to have a basic understanding of ordinal elections and voting rules used to select winners (such as the Plurality rule, the Borda rule, the notion of a Condorcet winner etc.). In some parts of the tutorial, it will be helpful to have a minimal background regarding multiwinner and approval elections (including the notion of justified representation and cohesive groups).
https://home.agh.edu.pl/~pragma/tutorials/aaai24/
Niclas Boehmer is a postdoctoral research fellow at Harvard University, in the group of Prof. Milind Tambe. He obtained his PhD under the supervision of the late Prof. Rolf Niedermeier and Prof. Markus Brill. His work focuses on theoretical and experimental aspects of computational social choice.
Piotr Faliszewski is a professor of computer science at the AGH University of Science and Technology in Krakow, Poland. His research focuses on multiwinner elections, participatory budgeting, bribery problems, and the foundations of numerical experiments in computational social choice. He leads the ERC Consolidator project PRAGMA.
Stanisław Szufa is a PhD student at the Jagiellonian University, working in the groups of Prof. Piotr Faliszewski and Prof. Jerome Lang. He is primarily interested in experimental research in computational social choice. He is the main contributor to the MapEl package for maps of elections and the maintainer of the Pabulib library of participatory budgeting instances.
TH18: Formalizing Robustness in Neural Networks: Explainability, Uncertainty, and Intervenability
Neural-network-driven applications like ChatGPT suffer from hallucinations, confidently providing inaccurate information. A fundamental reason for this inaccuracy is the lack of robust measures applied to the underlying neural network predictions. In this tutorial, we identify and expound on three human-centric robustness measures, namely explainability, uncertainty, and intervenability, with which every decision made by a neural network must be equipped and evaluated. The explainability and uncertainty research fields are accompanied by a large body of literature that analyzes decisions. Intervenability, on the other hand, has gained recent prominence due to its inclusion in the GDPR regulations and a surge in prompting-based neural network architectures. In this tutorial, we connect all three fields using gradient-based techniques to create robust machine learning models.
The goal of the tutorial is threefold: 1) decompose modern large-scale neural network robustness into three manageable and human-centric measures, 2) probabilistically define post-hoc explainability, uncertainty, and intervenability measures, and 3) compute the three measures as a function of the input, network, and output, with a focus on real-life applications in the medical and seismic domains. The tutorial is composed of four major parts. Part 1 discusses some recent surprising results regarding training neural networks with out-of-distribution (OOD) data, the conclusion of which is that it is not always clear when and how to use OOD data during training. This motivates the need for formal and human-centric measures of robustness at inference. Part 2 introduces the basic mathematical framework for each of explainability, uncertainty, and intervenability. We consider the relationships between them and show that intervening on the variance decomposition of predictive uncertainty yields an evaluation measure for explainability. Part 3 illustrates the relationship between explainability and uncertainty in real-world applications of biomedical and geophysical image analyses. Part 4 discusses intervenability as a function of uncertainty for the case of prompting and non-causal interventions. The detailed subtopics, tutorial outline, and suggested reading are available at https://alregib.ece.gatech.edu/aaai-2024-tutorial/.
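As a small taste of the gradient-based machinery, the following sketch (PyTorch; an illustrative baseline, not the presenters' probabilistic formulation) computes a vanilla gradient saliency map, one of the simplest explainability measures in this family:

    import torch

    def gradient_saliency(model, x, target_class):
        # x: one input with a batch dimension, e.g. shape (1, C, H, W).
        model.eval()
        x = x.clone().detach().requires_grad_(True)
        logits = model(x)
        logits[0, target_class].backward()  # gradient of the chosen logit
        return x.grad.abs()  # large magnitudes mark influential features

The tutorial builds well beyond this baseline, defining explanation, uncertainty, and intervention measures probabilistically and connecting them through such gradients.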
This tutorial is intended for graduate students, researchers, engineers, and data scientists working on topics related to visual information processing, machine learning, robust machine learning, and explainable AI. The audience is expected to have a basic understanding of neural networks and robustness applications, including image recognition and detection.
Ghassan AlRegib is currently the John and McCarty Chair Professor in the School of Electrical and Computer Engineering at the Georgia Institute of Technology. His research group, the Omni Lab for Intelligent Visual Engineering and Science (OLIVES), works on research projects related to machine learning, image and video processing, seismic interpretation, machine learning for ophthalmology, and video analytics. In 2008, he received the ECE Outstanding Junior Faculty Member Award. In 2017, he received the 2017 Denning Faculty Award for Global Engagement. He and his students received the Best Paper Award at ICIP 2019. He is an IEEE Fellow.
Mohit Prabhushankar received his Ph.D. degree and is currently a Postdoctoral Fellow at the Georgia Institute of Technology. He works in the fields of machine learning, image processing, healthcare, and robust and explainable AI. He is the recipient of the Best Paper Award at ICIP 2019 and the Top Viewed Special Session Paper Award at ICIP 2020. He is also the recipient of the ECE Outstanding Graduate Teaching Award, the CSIP Research Award, and the Roger P. Webb ECE Graduate Research Excellence Award, all in 2022.
TH19: Foundations, Practical Applications, and Latest Developments in Causal Decision Making
To make effective decisions, it is important to have a thorough understanding of the causal connections among actions, environments, and outcomes. This tutorial surfaces three crucial aspects of decision making through a causal lens: 1) discovering causal relationships through causal structure learning, 2) understanding the impacts of these relationships through causal effect learning, and 3) applying the knowledge gained from the first two aspects to support decision making via causal policy learning. It offers a comprehensive methodology and practical implementation framework by consolidating various methods in this area into a Python-based collection, and it provides a unified framework for understanding areas including causal inference, causal discovery, randomized experiments, dynamic treatment regimes, bandits, reinforcement learning, and so on. The tutorial is based on an online book with an accompanying Python package (in progress; collaboration is welcomed).
Rui Song is a senior principal scientist at Amazon. She obtained her Ph.D. in Statistics from the University of Wisconsin in 2006 and has been a faculty member at North Carolina State University since 2012. Her research interests include reinforcement learning, causal inference, precision health, and generative AI. Her research has been supported as principal investigator by the National Science Foundation (NSF), including the NSF Faculty Early Career Development (CAREER) Award. She has served as an associate editor for several statistical journals. She is an elected Fellow of the American Statistical Association and the Institute of Mathematical Statistics.
Hengrui Cai is an Assistant Professor of Statistics at the University of California, Irvine. She received her PhD in Statistics from North Carolina State University in 2022. Her research interests revolve around methodology and theory in causal inference, graphical modeling, and reinforcement learning, to establish reliable, powerful, and interpretable solutions to a wide range of real-world problems. She received the ENAR Distinguished Student Paper Award and has published over 10 papers in conferences and journals such as ICLR, NeurIPS, ICML, IJCAI, JMLR, Stat, and SIM.
Runzhe Wan is currently an Applied Scientist at Core AI, Amazon. He earned his Ph.D. in Statistics from North Carolina State University, mentored by Dr. Rui Song. Runzhe’s primary research focus is optimal decision-making under uncertainty, encompassing areas such as causal inference, optimal personalized decision rules, multi-armed bandits, and reinforcement learning. Runzhe has received the Norman Breslow Young Investigator Award from the American Statistical Association and has published over 10 papers in conferences and journals such as NeurIPS, ICML, KDD, AISTATS, AoS, and JOE.
Lin Ge is a Ph.D. candidate in Statistics in her fifth year at North Carolina State University. Her research interests lie in developing statistical and machine learning methods to solve real-world problems in mobile health applications, as well as recommender systems in e-commerce, advertisement, video streaming, and other platforms. Her focus areas include causal inference, bandit algorithms, reinforcement learning, and hidden Markov processes, with the ultimate goal of facilitating optimal decision-making via offline data analysis or online interactions.
Yang Xu is a Ph.D. candidate in Statistics at North Carolina State University. Her research focuses on causal inference and reinforcement learning, with a particular emphasis on offline policy evaluation and its applications in advertising markets and clinical trials, among others. One of her recent papers received the ASA Best Student Paper Award in the SLDS session in 2023.
TH20: On the role of Large Language Models in Planning
Large Language Models (LLMs, or n-gram models on steroids), originally trained to generate text by repeatedly predicting the next word in the context of a window of previous words, have captured the attention of the AI community and the world. Part of the reason for this is their ability to produce meaningful completions for prompts relating to almost any area of human intellectual endeavor. This sheer versatility has also led to claims that these predictive text completion systems may be capable of abstract reasoning and planning. In this tutorial we take a critical look at the ability of LLMs to help in planning tasks, either in autonomous or assistive modes. We are particularly interested in characterizing these abilities, if any, in the context of problems and frameworks widely studied in the AI planning community. The tutorial will both point out the fundamental limitations of LLMs in generating plans that normally require resolving subgoal interactions with combinatorial search, and show constructive uses of LLMs as technologies complementary to the sound planners developed in the AI planning community. In addition to presenting our own work in this area, we provide a critical survey of many related efforts, including by researchers outside of the planning community.
Tutorial Website: http://rakaposhi.eas.asu.edu/llm-planning-tutorial.html
Subbarao Kambhampati is a professor of computer science at Arizona State University. Kambhampati studies fundamental problems in planning and decision making, motivated in particular by the challenges of human-aware AI systems. He is a fellow of the Association for the Advancement of Artificial Intelligence, the American Association for the Advancement of Science, and the Association for Computing Machinery, and was an NSF Young Investigator. He served as the president of the Association for the Advancement of Artificial Intelligence, a trustee of the International Joint Conference on Artificial Intelligence, the chair of AAAS Section T (Information, Communication and Computation), and a founding board member of the Partnership on AI. Kambhampati’s research as well as his views on the progress and societal impacts of AI have been featured in multiple national and international media outlets. He can be followed on Twitter @rao2z.
Karthik Valmeekam is a third-year Ph.D. student at Arizona State University working at the Yochan Lab under the guidance of Prof. Subbarao Kambhampati. His research primarily focuses on Large Language Models (LLMs) and reasoning, with a special emphasis on exploring the planning abilities of LLMs. This includes understanding the various roles that LLMs can play in planning and reasoning about actions and change. He has also made contributions in areas like Human Aware AI Planning and Preference Based Reinforcement Learning. His research has been recognized at major AI conferences such as NeurIPS, ICLR, and ICAPS.
Lin Guan is a fifth-year PhD student at Arizona State University under the supervision of Prof. Subbarao Kambhampati. His research primarily focuses on building intelligent decision-making agents through methods such as reinforcement learning from human feedback (i.e., RLHF) and plan generation with large language models (i.e., LLM-based AI agents). His research has been recognized at top-tier AI conferences such as NeurIPS, ICLR, and ICML.
TH21: Scalability, Robustness, and Optimization of Learning in Large Stochastic Games
In today’s interconnected world, systems with large populations of strategic agents are ubiquitous, from distributed robotics, advertisement and marketing, network routing, self-driving cars, to macroeconomics and finance. In the past decade, significant progress has been made, on both the conceptual and the computational aspects, unlocking the potential for more efficient and more robust techniques in analyzing and learning large stochastic games. In this tutorial, after introducing a number of motivating examples and related concepts, we will survey several approximation approaches, including mean field techniques and optimization algorithms, to study extremely large populations of rational agents, along with several applications and examples. We will then review recent advances on the computational side with a focus on state-of-the-art learning and optimization algorithms. Last, we will conclude with some challenges and future directions. Throughout the tutorial, we will present online open software packages and codes to illustrate relevant techniques.
The main modeling framework will be large games with stochastic dynamics, in which many strategic agents optimize a reward while interacting with each other. The core concepts are the notion of Nash equilibrium and Pareto optimum. In terms of methods, we will focus on learning and optimization-based algorithms: starting with simple methods, we will then show the state-of-the-art algorithms based on optimization, deep learning, and reinforcement learning.
The target audience is anyone interested in learning about optimization and games with large populations, their applications, and related solution approaches and learning algorithms. We expect some familiarity with Markov Decision Processes and basic optimization concepts, but we do not expect any specific expertise beyond the familiarity that would be expected of the usual AAAI attendee. The participants will learn how to phrase various problems in the framework of large stochastic games, and will gain knowledge in solution methods based on machine learning. They will also be able to apply library codes to solve their practical problems. The topic of this tutorial has gained exponential momentum in the machine learning community over the past few years, and it will also be of interest for the general audience of AAAI. Our aim is to make the topic more accessible to attendees so they will be able to contribute to this field and utilize these tools and algorithms to solve problems relevant to their fields.
Xin Guo holds the Coleman Fung Chair in Financial Modeling in the Department of Industrial Engineering and Operations Research at the University of California, Berkeley. Her lab studies risk analytics and data analysis. Research topics include stochastic control, stochastic differential games, and machine learning, with applications in finance, biological sciences, and healthcare.
Mathieu Lauriere is an Assistant Professor of Mathematics and Data Science at NYU Shanghai, after being a Postdoctoral Researcher at Princeton’s ORFE department and a Visiting Faculty Researcher at Google Brain. His recent work focuses on computational methods for mean field problems, including deep learning and reinforcement learning methods.
TH22: Trustworthy Machine Learning under Imperfect Data
“Knowledge should not be accessible only to those who can pay,” said Robert May, chair of UC’s faculty Academic Senate. Similarly, machine learning should not be accessible only to those who can pay; it should benefit the whole world, especially developing countries in Africa and Asia. As dataset sizes grow, it is laborious and expensive to obtain perfect data (e.g., clean, safe, and balanced data), especially for developing countries. As a result, the volume of imperfect data becomes enormous, e.g., web-scale image and speech data with noisy labels, images with specific noise, and long-tail-distributed data. However, standard machine learning assumes that the supervised information is fully correct and intact. Imperfect data therefore harms the performance of most standard learning algorithms, and sometimes even makes existing algorithms break down. In this tutorial, we focus on trustworthy learning in the face of three types of imperfect data: noisy data, adversarial data, and long-tailed data.
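To give one concrete example from the noisy-data portion of this literature, the sketch below (PyTorch; a generic illustration of the widely used “small-loss trick” behind co-teaching-style methods, not necessarily the exact algorithms presented here) selects likely-clean samples by their per-sample loss:

    import torch

    def small_loss_selection(per_sample_losses, forget_rate):
        # Keep the (1 - forget_rate) fraction of the batch with the smallest
        # loss; deep networks tend to fit clean labels before noisy ones,
        # so low-loss samples are more likely to be correctly labeled.
        num_keep = int((1.0 - forget_rate) * per_sample_losses.numel())
        return torch.argsort(per_sample_losses)[:num_keep]

    losses = torch.tensor([0.2, 2.9, 0.4, 1.7])
    print(small_loss_selection(losses, forget_rate=0.5))  # kept sample indices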
Prerequisites for Audiences:
Although we will introduce the basics of the required knowledge, it is better to have some background in linear algebra, probability, machine learning, and artificial intelligence. The emphasis will be on the intuition behind all the formal concepts, theories, and methodologies. The tutorial will be self-contained at a high level. Audiences who are already familiar with the basics of robustness under noise, adversarial perturbation, and imbalance will find a comprehensive survey of recent advances in these areas.
Website:
https://tmlr-group.github.io/tutorials/aaai2024.html
Bo Han is an Assistant Professor at Hong Kong Baptist University, where his research focuses on trustworthy machine learning. He has received an Outstanding Paper Award at NeurIPS and an Outstanding Area Chair recognition at ICLR, and has served as an area chair for many top-tier conferences and as an action editor for journals.
Feng Liu is an Assistant Professor at The University of Melbourne, where his research mainly focuses on hypothesis testing and trustworthy machine learning. He has served as Area Chair for ICLR, and program committees of several top-tier conferences. He has received the NeurIPS Outstanding Paper Award.
Jiangchao Yao is an assistant professor at Shanghai Jiao Tong University and a research scientist at Shanghai AI Laboratory. He has published more than 40 papers in journals and conferences and has served on the program committees of ICML, NeurIPS, ICLR, KDD, ECML, ACML, AAAI, and IJCAI.
Quarter Day Tutorials
TQ1: Deep Learning Methods for Unsupervised Time Series Anomaly Detection
Dr. Armanfard is the founder and principal investigator of the iSMART Lab. She is a Tenure-Track Assistant Professor in the Department of Electrical and Computer Engineering at McGill University, as well as at Mila – Quebec AI Institute. She is also affiliated with the McGill Centre for Intelligent Machines (CIM), the McGill Initiative in Computational Medicine (MiCM), and the McGill Institute for Aerospace Engineering (MIAE). She earned her Ph.D. at McMaster University and completed her postdoctoral training in artificial intelligence at the University of Toronto, both in Ontario, Canada. Her research focuses on developing innovative algorithms for domains such as time-series data analysis, computer vision, reinforcement learning, and representation learning, for tasks including data clustering, classification, and anomaly detection. Her contributions to AI have been recognized with numerous awards from institutions including the Natural Sciences and Engineering Research Council of Canada, AgeWell, Vanier-Banting, the Fonds de recherche du Québec, McMaster University, McGill University, the University of Toronto, the Canadian Institutes of Health Research, and Scale AI, among others. She has published over fifty AI-focused articles in distinguished venues, including AAAI, BMVC, ECML PKDD, TPAMI, TNNLS, TSMC, and TIFS.
Hadi received his Bachelor’s degree in Electrical Engineering from Sharif University of Technology, Iran. He started as an MSc student at the iSMART Lab and fast-tracked to the Ph.D. program. He has received the Graduate Excellence Fellowship Award (GEF), the McGill Engineering Doctoral Award (MEDA), the GREAT Award, and the AGE-WELL Award. He is currently researching multi-modal and multivariate anomaly detection in time-series data.
Thi Kieu Khanh Ho is currently a PhD candidate at the iSMART Lab in the Department of Electrical and Computer Engineering, McGill University, and Mila – Quebec AI Institute, Montreal, Quebec, Canada. Previously, Khanh received her Master’s degree at the Gwangju Institute of Science and Technology (GIST), and was a researcher at Seoul National University Hospital (SNUH) and at the Korea National University of Transportation (KNUT), South Korea. Her research focuses on time-series anomaly detection, graph anomaly detection, and self-supervised learning. In recognition of her research achievements, she has received several prestigious awards and scholarships to support her research and studies, such as the McGill Engineering Doctoral Award (MEDA), GREAT Awards, the Quebec Research Award in Engineering and Technology (FRQNT), and the Vanier Canada Graduate Scholarship (Vanier CGS).
Cancelled – TQ2: Physics-Inspired Geometric Pretraining for Molecule Representation
Molecular representation pretraining is critical in various applications for drug and material discovery. Along this research line, most existing work focuses on pretraining on 2D molecular graphs, while the power of pretraining on 3D geometric structures has only recently been explored. In this tutorial, I will begin with an introduction to geometric molecule representation methods (group-invariant and equivariant representations) and self-supervised learning for pretraining. I will then combine these two topics and comprehensively introduce geometric pretraining for molecule representation, discussing the most recent works (GraphMVP, GeoSSL, and MoleculeSDE) in detail.
Shengchao Liu is a postdoc at UC Berkeley and Caltech. He got his Ph.D. degree from Mila-UdeM in 2023. His research interests include geometric representation, transfer learning, foundation model, physics-informed machine learning, dynamics, and molecule discovery.
Cancelled – TQ3: Towards Out-of-Distribution Generalization on Graphs
Graph machine learning has been extensively studied in both academia and industry. Although the field is booming with a vast number of emerging methods and techniques, most of the literature is built on the in-distribution (I.D.) hypothesis, i.e., that testing and training graph data are sampled from an identical distribution. However, this I.D. hypothesis can hardly be satisfied in many real-world graph scenarios, where model performance substantially degrades when there are distribution shifts between testing and training graph data. To solve this critical problem, out-of-distribution (OOD) generalization on graphs, which goes beyond the I.D. hypothesis, has made great progress and attracted ever-increasing attention from the research community. This tutorial aims to disseminate and promote recent research achievements on out-of-distribution generalization on graphs, an exciting and fast-growing research direction in the general field of machine learning and data mining. We will advocate novel, high-quality research findings, as well as innovative solutions to the challenging problems in out-of-distribution generalization and its applications on graphs.
This tutorial will be highly accessible to the whole machine learning and data mining community, including researchers, students and practitioners who are interested in this topic. The tutorial will be self-contained and designed for introductory and intermediate audiences. No special prerequisite knowledge is required to attend this tutorial.
For more details, please refer to https://ood-generalization.com/aaai2024Tutorial.htm.
Xin Wang is currently an Assistant Professor at the Department of Computer Science and Technology, Tsinghua University. He received both his Ph.D. and B.E. degrees in Computer Science and Technology from Zhejiang University, China, and also holds a Ph.D. degree in Computing Science from Simon Fraser University, Canada. His research interests include multimedia intelligence, machine learning, and its applications in multimedia big data. He has published several high-quality research papers in top journals and conferences including IEEE TPAMI, IEEE TKDE, IEEE TMM, ICML, NeurIPS, ACM Multimedia, KDD, WWW, SIGIR, etc. He is a recipient of the 2017 China Postdoctoral Innovative Talents Supporting Program and received the ACM China Rising Star Award in 2020.
Haoyang Li received his Ph.D. from the Department of Computer Science and Technology of Tsinghua University in 2023. Before that, he received his B.E. from the Department of Computer Science and Technology of Tsinghua University in 2018. His research interests are mainly in machine learning on graphs and out-of-distribution generalization. He has published high-quality papers in prestigious journals and conferences, e.g., TKDE, KDD, NeurIPS, IJCAI, ICLR, ACM Multimedia, etc.
Wenwu Zhu is currently a Professor in the Department of Computer Science and Technology at Tsinghua University and the Vice Dean of the National Research Center for Information Science and Technology. Prior to his current post, he was a Senior Researcher and Research Manager at Microsoft Research Asia. He was the Chief Scientist and Director at Intel Research China from 2004 to 2008, and worked at Bell Labs New Jersey as a Member of Technical Staff during 1996-1999. He received his Ph.D. degree from New York University in 1996. His current research interests are in the areas of data-driven multimedia networking and cross-media big data computing. He has published over 350 refereed papers and is inventor or co-inventor of over 50 patents. He received eight Best Paper Awards, including at ACM Multimedia 2012 and in IEEE Transactions on Circuits and Systems for Video Technology in 2001 and 2019. He served as EiC of IEEE Transactions on Multimedia (2017-2019) and on the steering committees of IEEE Transactions on Multimedia (2015-2016) and IEEE Transactions on Mobile Computing (2007-2010). He served as General Co-Chair of ACM Multimedia 2018 and ACM CIKM 2019. He is an AAAS Fellow, IEEE Fellow, SPIE Fellow, and a member of the Academy of Europe (Academia Europaea).
Ziwei Zhang received his Ph.D. in 2021 from the Department of Computer Science and Technology at Tsinghua University. Currently, he is a postdoctoral researcher in the same department. His research primarily focuses on machine learning on graphs, including graph neural networks (GNNs), network embedding, and automated graph machine learning. He has published over 40 papers in esteemed conferences and journals, including KDD, ICML, NeurIPS, AAAI, IJCAI, and TKDE.
Cancelled – TQ4: Disentangled Representation Learning
Disentangled Representation Learning (DRL) aims to learn a model capable of identifying and disentangling the underlying factors hidden in the observable data in representation form.
The process of separating underlying factors of variation into variables with semantic meaning benefits the learning of explainable representations of data, imitating the meaningful understanding process of humans when observing an object or relation. As a general learning strategy, DRL has demonstrated its power in improving model explainability, controllability, robustness, and generalization capacity in a wide range of scenarios such as computer vision, natural language processing, and data mining. In this tutorial, we will disseminate and promote recent research achievements on disentangled representation learning as well as its applications, an exciting and fast-growing research direction in the general field of machine learning.
This tutorial consists of four parts. We will first give a brief introduction to DRL, followed by discussions of learning strategies and applications of disentangled representation learning, covering disentangled graph representation learning, disentangled representation learning for multiple modalities, and disentangled representation for recommendation. We will finally share some of our insights on trending and future directions for disentangled representation learning, such as its interplay with large pretrained models. This tutorial will be highly accessible to the general AI community, including researchers, students, and practitioners who are interested in representation learning for AI, especially for explainable and controllable AI. The tutorial will be self-contained and designed for introductory and intermediate audiences.
Xin Wang is currently an Assistant Professor at the Department of Computer Science and Technology, Tsinghua University. He received both his Ph.D. and B.E. degrees in Computer Science and Technology from Zhejiang University, China, and also holds a Ph.D. degree in Computing Science from Simon Fraser University, Canada. His research interests include multimedia intelligence, machine learning, and its applications in multimedia big data. He has published several high-quality research papers in top journals and conferences including IEEE TPAMI, IEEE TKDE, IEEE TMM, ICML, NeurIPS, ACM Multimedia, KDD, WWW, SIGIR, etc. He is a recipient of the 2017 China Postdoctoral Innovative Talents Supporting Program and received the ACM China Rising Star Award in 2020.
Hong Chen is a Ph.D. student in machine learning at Tsinghua University, China. He has published several papers in top-tier conferences and journals, including ICML, NeurIPS, TPAMI, ACM Multimedia, etc. His main research interests include multimodal generative models, transfer learning, and recommendation.
Wenwu Zhu is currently a Professor in the Department of Computer Science and Technology at Tsinghua University and the Vice Dean of the National Research Center for Information Science and Technology. Prior to his current post, he was a Senior Researcher and Research Manager at Microsoft Research Asia. He was the Chief Scientist and Director at Intel Research China from 2004 to 2008, and worked at Bell Labs New Jersey as a Member of Technical Staff during 1996-1999. He received his Ph.D. degree from New York University in 1996. His current research interests are in the areas of data-driven multimedia networking and cross-media big data computing. He has published over 350 refereed papers and is inventor or co-inventor of over 50 patents. He received eight Best Paper Awards, including at ACM Multimedia 2012 and in IEEE Transactions on Circuits and Systems for Video Technology in 2001 and 2019. He served as EiC of IEEE Transactions on Multimedia (2017-2019) and on the steering committees of IEEE Transactions on Multimedia (2015-2016) and IEEE Transactions on Mobile Computing (2007-2010). He served as General Co-Chair of ACM Multimedia 2018 and ACM CIKM 2019. He is an AAAS Fellow, IEEE Fellow, SPIE Fellow, and a member of the Academy of Europe (Academia Europaea).
TQ5: Distributed Stochastic Nested Optimization for Emerging Machine Learning Models
In recent years, new learning paradigms have been proposed to address various challenges in practical applications. Of particular interest in this tutorial is the learning paradigm that can be formulated as a stochastic nested optimization (SNO) problem, because it covers a wide range of emerging machine learning models, such as model-agnostic meta-learning, imbalanced data classification models, contrastive self-supervised learning models, neural architecture search, etc. More specifically, this tutorial will focus on two instances of SNO: stochastic compositional optimization (SCO) and stochastic bilevel optimization (SBO). The nested structure in these two classes of optimization problems brings unique challenges for their distributed optimization. A series of new techniques have been developed to address them in the past few years, but these recent advances have not been disseminated to broad audiences. Thus, this tutorial aims to introduce the unique challenges, recent advances, and practical applications of distributed SCO and distributed SBO, with a special focus on federated learning and decentralized learning settings. The audience will benefit from this tutorial by mastering an understanding of distributed SCO/SBO algorithms and applying them to real-world applications.
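For concreteness, the two instances take the following standard forms (in LaTeX notation); the nesting of an expectation or an inner optimization problem inside the outer objective is what makes unbiased stochastic gradient estimation, and hence distributed optimization, challenging:

    \min_{x} \; f\big(\mathbb{E}_{\xi}[g(x;\xi)]\big) \qquad \text{(SCO)}

    \min_{x} \; F\big(x, y^{*}(x)\big) \quad \text{s.t.} \quad y^{*}(x) \in \arg\min_{y} G(x, y) \qquad \text{(SBO)}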
Website: https://www.cst.temple.edu/~tuo14379/tutorial_aaai2024.html
Hongchang Gao is an assistant professor in the Department of Computer and Information Sciences at Temple University. His research interests include machine learning, optimization, and biomedical data science, with a special focus on distributed optimization and federated learning. His works have been published at top venues, including ICML, NeurIPS, AISTATS, KDD, AAAI, IJCAI, etc. Currently, he serves as the Associate Editor of the Journal of Combinatorial Optimization. He also has been serving multiple top machine learning and data mining conferences and journals as a program committee member and reviewer. Recently, he was selected for the AAAI 2023 New Faculty Highlights program.
TQ6: Out-of-Distribution Generalization in Time Series
This tutorial aims to bring together AI researchers, data scientists, and industry practitioners to explore out-of-distribution generalization challenges in time series. Time series data are prevalent in many domains, including retail, finance, healthcare, and environmental monitoring, serving as a basis for numerous applications. A major challenge in time series studies is the presence of dataset shifts, where the statistical properties of data may vary due to data collection from various sources, different locations, or conditions. Out-of-distribution generalization is the task where models should generalize to new, unseen scenarios/domains by learning from observed, seen scenarios. In this tutorial, we embark on a journey to explore the confluence of generalization and time series analysis, revealing insights into how out-of-distribution generalization techniques can be harnessed to tackle challenges in modeling time series data. We begin by laying a strong foundation of time series problems and out-of-distribution generalization. Then, we delve into the challenges, problems, methods, and evaluation in improving out-of-distribution generalization in the time series domain. Recent research findings in this emerging field and their implications will be unveiled, providing participants with the tools to advance their work. Finally, we turn toward the future, discussing emerging trends, open research questions, and the rich set of possibilities in this field.
Songgaojun Deng is a Postdoc at AIRLab, University of Amsterdam. She received her PhD degree in Computer Science from Stevens Institute of Technology. Her research interests are machine learning and data mining in social and health informatics and e-commerce. Her current research focuses on out-of-distribution generalization in time series.
Jindong Wang is a Senior Researcher at Microsoft Research Asia. He obtained his PhD from the Institute of Computing Technology, Chinese Academy of Sciences. His research interest includes robust machine learning, out-of-distribution/domain generalization, transfer learning, semi-supervised learning, federated learning, and related applications such as activity recognition and computer vision.
Maarten de Rijke is Distinguished University Professor of Artificial Intelligence and Information Retrieval at the University of Amsterdam and the director of the Innovation Center for Artificial Intelligence (ICAI). Together with PhD students and postdocs, he works on problems at the interface of information retrieval and machine learning. He has taught extensively at all levels, from public lectures to advanced tutorials.
TQ7: Advances in Robust Time-Series ML: From Theory to Practice
We are seeing significant growth in the Internet of Things (IoT) and mobile applications that are based on predictive analytics over time-series data collected from various types of sensors and wearable devices. Important applications include smart home automation, mobile health, smart grid management, and finance. Traditional machine learning and deep learning have shown great success in learning accurate predictive models from time-series data. However, safe and reliable deployment of such machine learning (ML) systems requires robustness to adversarial and natural perturbations of time series, as well as the ability to detect time-series data that does not follow the training distribution, i.e., out-of-distribution (OOD) detection.
Most prior work on adversarial robustness for deep models focuses on the image domain and, to a lesser extent, the natural language domain. On the other hand, business data such as stock prices and financial transactions, IoT data, and health data rely heavily on the time-series modality for AI-powered analysis and prediction. Transferring prior work on adversarial robustness to these domains has several caveats due to the different characteristics of the modalities. The time-series domain poses unique challenges (e.g., sparse peaks, fast oscillations) that are not encountered in the image and natural language processing domains. Therefore, prior approaches are not directly applicable, as they do not capture the true similarity between time-series instances.
This tutorial will cover recent advances in adversarial robustness and certification for time-series domain using appropriate distance measures (e.g., dynamic time warping); min-max optimization algorithms to train robust ML models for time-series domain; OOD detection methods using deep generative models with application to generation of synthetic time-series data; and threats of adversarial attack on multivariate time-series forecasting models and viable defense mechanisms. This tutorial will also cover the real-world applications that require reliable and robust time-series analytics such as classification in human activity monitoring for smart health, and regression/forecasting in financial data.
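To illustrate why such distance measures matter, the following minimal sketch (Python; illustrative only, not the presenters' code) implements the classic dynamic programming recursion for dynamic time warping, which matches series under the temporal shifts and warps that pointwise Lp distances penalize heavily:

    import numpy as np

    def dtw_distance(a, b):
        # Dynamic time warping between two 1-D series, O(len(a) * len(b)).
        n, m = len(a), len(b)
        D = np.full((n + 1, m + 1), np.inf)
        D[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                cost = abs(a[i - 1] - b[j - 1])
                # Extend the cheapest of the three admissible alignments.
                D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
        return D[n, m]

    # A small temporal shift yields a small DTW distance,
    # even though the pointwise (Euclidean) distance is large.
    print(dtw_distance([0, 1, 2, 3, 4], [0, 0, 1, 2, 3]))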
The target audience of this tutorial includes (1) general AI researchers and graduate students, who will learn about principles, algorithms, and outstanding challenges to explore the frontiers of robust time-series ML and its real-world applications; (2) time-series ML researchers, who will learn about the complete landscape of robustness and trustworthiness to gain breadth and learn about outstanding challenges at the frontiers; and (3) industrial AI researchers and practitioners, who will apply the learned knowledge to applications including mobile health, smart homes, and the smart grid.
Taha Belkhouja is a CS PhD candidate at Washington State University. His research focuses on robust and trustworthy time-series ML. He won outstanding teaching assistant and research assistant awards from the Voiland College of Engineering. He has published in top-tier AI venues including JAIR, AAAI, and IJCAI.
Yan Yan is an Assistant Professor at Washington State University (WSU). His research focuses on robust machine learning, uncertainty quantification, non-convex optimization, and time-series ML. He has published in top-tier AI venues including JMLR, AAAI, NeurIPS, ICML, IJCAI, and ICCV.
Nghia Hoang is an Assistant Professor at WSU. He received his Ph.D. in CS from National University of Singapore. His research spans deep generative modeling with applications to (personalized) federated learning, meta learning, black-box model reprogramming/reconfiguration. He has published in top-tier AI venues including ICML, NeurIPS, AAAI, UAI.
Ganapati Bhat is an Assistant Professor at WSU. He received his PhD from Arizona State University. He has won multiple Best Paper Awards and the ACM SIGDA PhD Dissertation Award (2022). His research focuses on designing robust ML algorithms for energy harvesting and data analytics for mobile health applications.
Jana Doppa is the Huie-Rogers Chair Distinguished Associate Professor at WSU. His research focuses on both foundations of AI and its applications to science, engineering, and industrial domains. He has won an NSF CAREER Award, an Outstanding Paper Award at AAAI (2013), and is an elected AAAI Senior Member.
TQ8: Continual Learning on Graphs: Challenges, Solutions, and Opportunities
Most real-world graphs constantly grow or evolve, with potential distribution shifts. Classical graph learning models, however, typically assume graphs to be static and suffer from catastrophic forgetting when new types of nodes and edges (or graphs) continuously emerge. Therefore, investigating how to continually adapt a graph learning model to new distributions and tasks on growing graphs without forgetting previously learned knowledge, i.e., Continual Graph Learning (CGL), is becoming increasingly important in various real-world applications, e.g., social science and biomedical research. Due to the presence of complex topological structures, CGL is essentially different from traditional continual learning on independent data without topological connections (e.g., images). Challenges in CGL include task configuration in different types of graphs, preservation of previously learned topology, and proper handling of the concept drift caused by topological connections. In this tutorial, we will introduce this newly emerging area. Specifically, we will (1) introduce different continual graph learning settings based on various application scenarios, (2) present the key challenges in CGL, (3) highlight existing CGL techniques and benchmarks, and (4) discuss potential future directions.
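For intuition, the sketch below shows the graph-agnostic skeleton of one standard continual-learning baseline, experience replay, in PyTorch; all names are ours. CGL methods build on this skeleton by storing subgraphs or node neighborhoods rather than independent examples, precisely so that the topology discussed above is preserved.

    import torch
    import torch.nn.functional as F

    def train_on_new_task(model, opt, x_new, y_new, memory, epochs=10, replay_weight=1.0):
        """Experience-replay skeleton for continual learning: mix the new task's
        loss with a loss on a small buffer of examples retained from old tasks."""
        for _ in range(epochs):
            opt.zero_grad()
            loss = F.cross_entropy(model(x_new), y_new)
            if memory:  # rehearse representative examples from earlier tasks
                x_old = torch.cat([x for x, _ in memory])
                y_old = torch.cat([y for _, y in memory])
                loss = loss + replay_weight * F.cross_entropy(model(x_old), y_old)
            loss.backward()
            opt.step()
        memory.append((x_new[:32], y_new[:32]))  # retain a few examples for future rehearsal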
Prerequisite:
Our tutorial targets the audience with a basic knowledge of machine learning and deep learning, e.g., how to construct and optimize neural networks. The knowledge of continual learning or CGL is not required.
Website:
https://queuq.github.io/CGL_AAAI2024/
Dacheng Tao (Fellow, IEEE) is a Professor of Computer Science and an ARC Laureate Fellow at The University of Sydney. He mainly applies statistics and mathematics to artificial intelligence and data science. His research is detailed in one monograph and over 200 publications in prestigious journals and prominent conferences.
Dongjin Song is an assistant professor at the University of Connecticut. His research interests include machine learning, deep learning, data mining, and applications for time series and graphs. Papers describing his research have been published at top-tier conferences, such as NeurIPS, ICML, ICLR, KDD, ICDM, SDM, AAAI, IJCAI, CVPR, ICCV, etc.
Xikun Zhang is a Ph.D. student at The University of Sydney. His research interests include applying deep learning to tackle graph-related problems. His work has been published in top conferences and journals, such as NeurIPS, TPAMI, ICDM, CVPR, ECCV, and TNNLS.
Cancelled – TQ9: Curriculum Learning: Theories, Approaches, Applications and Tools
This tutorial focuses on curriculum learning (CL), an important topic in machine learning that has gained increasing attention in the research community. CL is a learning paradigm that enables machines to learn from easy data to hard data, imitating the meaningful procedure of human learning with curricula. As an easy-to-use plug-in, CL has demonstrated its power in improving the generalization capacity and convergence rate of various models in a wide range of scenarios, such as computer vision, natural language processing, data mining, and reinforcement learning. It is therefore essential to introduce CL to more scholars and researchers in the machine learning community. However, there have been no tutorials on CL so far, motivating the organization of our tutorial on CL at AAAI 2024.
To give a comprehensive tutorial on CL, we plan to organize it from the following aspects: (1) theories, (2) approaches, (3) applications, (4) tools and (5) future directions. First, we introduce the motivations, theories and insights behind CL. Second, we advocate novel, high-quality approaches, as well as innovative solutions to the challenging problems in CL. Then we present the applications of CL in various scenarios, followed by some relevant tools. In the end, we discuss open questions and future directions of this field. We believe this topic is at the core of the scope of AAAI and is attractive to the audience interested in machine learning from both academia and industry.
Xin Wang is currently an Assistant Professor at the Department of Computer Science and Technology, Tsinghua University. He received both his Ph.D. and B.E. degrees in Computer Science and Technology from Zhejiang University, China. He also holds a Ph.D. degree in Computing Science from Simon Fraser University, Canada. His research interests include multimedia intelligence, machine learning, and their applications to multimedia big data. He has published several high-quality research papers in top journals and conferences, including IEEE TPAMI, IEEE TKDE, IEEE TMM, ICML, NeurIPS, ACM Multimedia, KDD, WWW, and SIGIR. He is a recipient of the 2017 China Postdoctoral Innovative Talents Supporting Program and received the ACM China Rising Star Award in 2020.
Yuwei Zhou is a Ph.D. student at the Department of Computer Science and Technology, Tsinghua University. He received his B.E. degree from the Department of Computer Science and Technology, Tsinghua University. His main research interests include curriculum learning and multimodal learning.
Hong Chen is a Ph.D. student in machine learning at Tsinghua University, China. He has published several papers in top tier conferences and journals, including ICML, NeurIPS, TPAMI, ACM Multimedia, etc. His main research interests include multimodal generative models, transfer learning, and recommendation.
Wenwu Zhu is currently a Professor in the Department of Computer Science and Technology at Tsinghua University and the Vice Dean of the National Research Center for Information Science and Technology. Prior to his current post, he was a Senior Researcher and Research Manager at Microsoft Research Asia, the Chief Scientist and Director at Intel Research China from 2004 to 2008, and a Member of Technical Staff at Bell Labs New Jersey during 1996-1999. He received his Ph.D. degree from New York University in 1996. His current research interests are in the areas of data-driven multimedia networking and cross-media big data computing. He has published over 350 refereed papers and is inventor or co-inventor of over 50 patents. He has received eight Best Paper Awards, including at ACM Multimedia 2012 and from IEEE Transactions on Circuits and Systems for Video Technology in 2001 and 2019. He served as Editor-in-Chief of IEEE Transactions on Multimedia (2017-2019) and on the steering committees of IEEE Transactions on Multimedia (2015-2016) and IEEE Transactions on Mobile Computing (2007-2010). He served as General Co-Chair of ACM Multimedia 2018 and ACM CIKM 2019. He is an AAAS Fellow, IEEE Fellow, SPIE Fellow, and a member of The Academy of Europe (Academia Europaea).
TQ10: Graphs Counterfactual Explainability: A Comprehensive Landscape
Graph Neural Networks (GNNs) have proven highly effective in graph-related tasks, including Traffic Modeling, Learning Physical Simulations, Protein Modeling, and Large-scale Recommender Systems. The focus has shifted towards models that deliver accurate results and provide understandable and actionable insights into their predictions. Counterfactual explanations have emerged as crucial tools to meet these challenges and empower users. For the reasons mentioned above, in this tutorial, we furnish the essential theoretical underpinnings for generating counterfactual explanations for GNNs, arming the audience with the resources to gain a deeper understanding of the outcomes produced by their systems and enabling users to interpret predictions effectively. First, we present an insight into GNNs, their underlying message-passing architecture, and the challenges of providing post-hoc explanations for their predictions across different domains. Then, we provide a formal definition of Graph Counterfactual Explainability (GCE) and its potential to provide recourse to users. Furthermore, we propose a taxonomy of existing GCE methods and give insights into every approach’s main idea, advantages, and limitations. Finally, we introduce the most frequently used benchmarking datasets, evaluation metrics, and protocols to analyze the future challenges in the field.
In the introduction (20 mins) part, we delve into the world of Graph Neural Networks (GNNs) and the growing demand for models that deliver accurate results while providing understandable and actionable insights. We start by discussing the widespread use of GNNs in diverse domains, highlighting their underlying message-passing architecture.
In the second part (30 mins), we emphasize the importance of Explainable AI (XAI) in graph-based models. Black-box models, while powerful, pose challenges in critical scenarios where interpretability is vital. We explore factual explanations, shedding light on why specific predictions are made. We also briefly revisit key methods in this category, including GNNExplainer and GraphLIME.
The core of this tutorial (55 mins) focuses on counterfactual explanations in graph-based models. We define Graph Counterfactual Explanation (GCE) and its significance, followed by a taxonomy of GCE methods. These methods include instance-level explainers (search-based, heuristic-based, learning-based) and model-level explainers. We discuss the advantages and limitations of each approach, ensuring a clear understanding of the options. Benchmarking datasets and evaluation metrics are also addressed, enabling participants to assess the performance of counterfactual explainers effectively.
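To ground the search-based family, here is a deliberately naive sketch (all names ours): enumerate small sets of edge toggles until a stub oracle, standing in for a trained GNN, changes its prediction. Real explainers replace the brute-force loop with heuristics or learned generators.

    import itertools
    import numpy as np

    def search_counterfactual(adj, oracle, max_edits=3):
        """Naive search-based graph counterfactual explainer: find a small set
        of edge toggles on `adj` that flips the oracle's prediction."""
        base = oracle(adj)
        pairs = list(itertools.combinations(range(adj.shape[0]), 2))
        for k in range(1, max_edits + 1):        # grow the edit budget gradually
            for edits in itertools.combinations(pairs, k):
                cand = adj.copy()
                for i, j in edits:               # toggle each chosen edge
                    cand[i, j] = cand[j, i] = 1 - cand[i, j]
                if oracle(cand) != base:
                    return cand, edits           # minimal-edit counterfactual
        return None, None

    # Stub oracle standing in for a trained GNN: classify by edge density.
    oracle = lambda a: int(a.sum() / (a.shape[0] * (a.shape[0] - 1)) > 0.5)
    rng = np.random.default_rng(0)
    upper = np.triu((rng.random((6, 6)) > 0.5).astype(int), 1)
    adj = upper + upper.T
    cf, edits = search_counterfactual(adj, oracle)
    print("edge toggles needed:", edits)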
Audience and Scope:
This tutorial strikes a balance between comprehensiveness and accessibility for participants with varying technical backgrounds. For those seeking a deeper dive into the technical aspects, an additional lab session is available. The tutorial caters to practitioners in academia and industry interested in applying machine learning to analyze graph data. While familiarity with basic machine learning concepts is beneficial, the content is designed to be accessible to a broad audience.
For those who want to master the field from a technical point of view, take a look at the intertwined laboratory: https://aiimlab.org/events/AAAI_2024_Digging_into_the_Landscape_of_Graphs_Counterfactual_Explainability.html
Mario Alfonso Prado-Romero
Gran Sasso Science Institute marioalfonso.prado@gssi.it
Mario Prado is a PhD Fellow in AI at the Gran Sasso Science Institute. His main research focus is at the intersection of GNNs and XAI. He is a key contributor to the GRETEL project and serves as a PC member for top-tier conferences and journals. He is currently the only Research Intern selected by NEC Laboratories Europe to work on XAI in the biomedical domain.
Dr. Bardh Prenkaj
Sapienza University of Rome prenkaj@di.uniroma1.it
Dr. Bardh Prenkaj is a postdoctoral researcher in Machine Learning at Sapienza University of Rome. He is part of the GRETEL project, specializing in generative counterfactual explainers. He serves as a program committee member for top-tier conferences and journals and collaborates with international research groups.
Prof. Giovanni Stilo
University of L’Aquila giovanni.stilo@univaq.it
Prof. Giovanni Stilo is an associate professor in Computer Science and Data Science at the University of L’Aquila. He leads the Data Science Master’s program, specializes in trustworthiness aspects, and manages the GRETEL project. He organizes international workshops, serves on journal editorial boards, and leads key projects in data science and education.
TQ11: Aligning Large Language Models to Low-Resource Languages
This tutorial offers an in-depth exploration into enhancing Natural Language Processing (NLP) for low-resource languages (LRLs), addressing the notable gap in current language models like ChatGPT. Focusing on languages such as Swahili, it presents strategies for data collection and model alignment to LRLs, crucial for a linguistically inclusive AI future.
Participants will gain insights into the current NLP landscape and the limitations of current state-of-the-art (SOTA) models on LRLs. The tutorial introduces methods to overcome these challenges, emphasizing the creation of high-quality datasets beyond what traditional machine translation can provide. This approach is vital for effectively training and aligning Large Language Models (LLMs) with LRLs.
The core of the tutorial is dedicated to demonstrating how to collect and utilize crowd-sourced data for fine-tuning LLMs for diverse linguistic needs. Attendees will learn practical guidelines and innovative techniques for aligning these models with LRLs, a crucial step in making AI technologies accessible in underrepresented regions.
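As an illustration of the fine-tuning step, here is a condensed supervised fine-tuning sketch using the Hugging Face transformers and datasets libraries; the checkpoint name and data file are placeholders, and a real run would add the usual scale, evaluation, and safety machinery.

    from datasets import load_dataset
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling, Trainer,
                              TrainingArguments)

    # Placeholders: a multilingual base checkpoint and crowd-sourced
    # instruction/response pairs for the target low-resource language.
    tok = AutoTokenizer.from_pretrained("base-multilingual-lm")
    model = AutoModelForCausalLM.from_pretrained("base-multilingual-lm")
    if tok.pad_token is None:
        tok.pad_token = tok.eos_token  # causal LMs often ship without a pad token

    data = load_dataset("json", data_files="swahili_pairs.jsonl")["train"]

    def fmt(ex):  # concatenate prompt and response into one training sequence
        return tok(ex["prompt"] + "\n" + ex["response"],
                   truncation=True, max_length=512)

    data = data.map(fmt, remove_columns=data.column_names)

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="lrl-sft", num_train_epochs=3,
                               per_device_train_batch_size=4, learning_rate=2e-5),
        train_dataset=data,
        data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
    )
    trainer.train()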
Designed for researchers and practitioners with a foundation in deep learning, this tutorial aims to support trends for a more inclusive and diverse AI ecosystem. Attendees will leave with a thorough understanding of the unique challenges and solutions in applying NLP technologies to LRLs, equipped to contribute to a more inclusive AI landscape.
Nazar Beknazarov is a research scientist at Toloka Research and is pursuing a PhD in Computer Science at the Higher School of Economics. His work covers many aspects of machine learning, particularly LLM technologies, in fields including NLP and bioinformatics. He has published papers in prestigious journals such as Nature and presented at ML conferences. He received his M.Sc. with excellence from the Higher School of Economics, majoring in Computer Science.
Marzieh Fadaee is a senior research scientist at Cohere For AI, the research lab of Cohere. Her work broadly covers many aspects of natural language understanding, particularly multilingual learning, data-conscious learning, robust and scalable models, and evaluation. She was previously the NLP/ML research lead at Zeta Alpha Vector, working on smarter ways to discover and organize knowledge. She did her PhD at the Language Technology Lab (originally part of the ILPS group), University of Amsterdam, working on developing models to understand and utilize interesting phenomena in the data for translation. She received her B.Sc. from Sharif University, majoring in Computer Engineering, and M.Sc. from the University of Tehran, majoring in Artificial Intelligence.
Ahmet Üstün is a research scientist at Cohere For AI. He earned his PhD from the University of Groningen. His research interests are multi-task, multilingual, and efficient natural language processing, with a particular focus on modular approaches and low-resource languages. He is a leading member of the Aya Open Science Initiative.
TQ12: Meta-Reinforcement Learning
Outline
A major drawback of deep reinforcement learning (RL) is its poor data efficiency. In this tutorial, we present meta-RL as an approach to create sample-efficient and general-purpose RL algorithms, via learning the RL algorithm itself. Meta-RL aims to learn a policy that is capable of adapting to any new task from a distribution over tasks with only limited data. We present the meta-RL problem statement, along with an overview of methods, applications, and open problems on the path to making meta-RL part of the standard toolbox for a deep RL practitioner.
Goal
The audience will come away from this tutorial with an understanding of meta-reinforcement learning. Reinforcement learning is notoriously sample-inefficient. The goal of this tutorial is to enable researchers to use meta-RL to overcome this inefficiency. Ultimately, meta-RL promises to deliver agents capable of fast and general adaptation. The tutorial should enable machine learning practitioners to get up to speed on meta-RL, as well as direct new research in meta-RL toward valuable open problems.
Prerequisite Knowledge
The tutorial should be accessible to anyone with some knowledge of RL; only minimal background is assumed. The goal is both to get people up to speed on training meta-RL agents and to point out the limitations of current systems, which will also be of use to current practitioners.
Content
In this tutorial, we introduce the field of meta-RL following our meta-RL survey. First, we motivate reinforcement learning and meta-reinforcement learning, using supervised meta-learning for comparison. Then meta-RL research is divided into categories considering the attributes of the problem setting. One key theme that differentiates meta-RL from other meta-learning is the need for efficient exploration. The theme of exploration is covered in detail. Next, key results for exploration and generalization are presented. We discuss application domains — including robotics, education, and multi-agent RL — to make meta-RL more concrete. Next, we shift our focus to meta-learning over long horizons. We survey related methods and demonstrate the degree of transfer that can be expected. For example, this includes training on simple low-dimensional grid-world environments and then transferring the learned RL algorithm to 8-bit Atari games for evaluation. Additionally, we note open problems — including optimization challenges, benchmark standardization, and offline data collection — to catalyze further research in the field. Finally, we conclude by discussing the aims of meta-RL in the near and long term.
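Since the tutorial motivates meta-RL via supervised meta-learning, here is a compact MAML-style meta-update in that supervised setting (PyTorch 2.x, for torch.func); the function and names are ours, a sketch rather than any method covered verbatim.

    import torch
    import torch.nn.functional as F

    def maml_step(model, tasks, meta_opt, inner_lr=0.01):
        """One MAML-style meta-update: adapt a copy of the parameters to each
        task with a single gradient step, then update the shared initialization
        so that one-step adaptation works well on average across tasks."""
        meta_opt.zero_grad()
        meta_loss = 0.0
        for (x_tr, y_tr), (x_val, y_val) in tasks:
            params = dict(model.named_parameters())
            inner = F.mse_loss(torch.func.functional_call(model, params, (x_tr,)), y_tr)
            grads = torch.autograd.grad(inner, list(params.values()), create_graph=True)
            adapted = {n: p - inner_lr * g
                       for (n, p), g in zip(params.items(), grads)}
            meta_loss = meta_loss + F.mse_loss(
                torch.func.functional_call(model, adapted, (x_val,)), y_val)
        meta_loss.backward()  # differentiates through the inner adaptation step
        meta_opt.step()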
Jacob Beck is a DPhil candidate at the University of Oxford, under the supervision of Shimon Whiteson and funded by the Oxford-Google DeepMind Doctoral Scholarship. His studies are focused on using reinforcement learning (RL) to make artificial agents capable of fast and general adaptation. His work has an emphasis on meta-RL, memory architecture, and hypernetworks. He authored a survey paper on meta-RL and was featured on the TalkRL podcast, alongside his co-author Risto Vuorio, to explain the field to a wider audience. Previously, Jacob completed a pre-doc at Microsoft Research, working on long-term memory in RL, and contributed to autonomous vehicle technology. He holds a BS and MS from Brown University, where he conducted research in RL and served as a teaching assistant for a graduate-level deep learning course.
Risto Vuorio is a DPhil (PhD) candidate at the University of Oxford supervised by Shimon Whiteson, funded by an EPSRC Doctoral Training Partnership Scholarship and a Department of Computer Science Scholarship. His research focuses on reinforcement learning, with special interests in meta-RL and learning from demonstrations. His work on meta-RL has been recognized with a spotlight at NeurIPS. Together with Jacob Beck, Risto appeared on the TalkRL podcast discussing the newest developments in meta-RL. Before Oxford, Risto worked as a research engineer in Satinder Singh's lab at the University of Michigan, and before that at SK T-Brain in Seoul.
TQ13: Recent Advances in Multi-Objective Search
Deterministic search and planning typically operate on graphs where each edge has a given cost, aiming to find a path of minimal cost. Heuristic search has developed tools for solving this path-finding problem, including the A* algorithm. However, the objective of search and planning is often much more complex in real life, for example because one needs to trade off different costs. Multi-objective search and planning operate on graphs where each edge has two or more given costs, aiming to find paths that trade off the different costs in the form of a Pareto frontier of paths.
For example, transporting hazardous material requires trading off different costs for each street, such as its length and the number of residents that would be exposed to the hazardous material in case of an accident. Other examples include route planning for vehicles and robots (which involves trading off energy consumption and travel time), planning power transmission lines (which involves trading off power-generation cost and power loss), and scheduling satellites and routing packets in computer networks (which involves trading off profit and fairness).
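The central object here is easy to state in code. Below is a sketch (ours) of Pareto dominance and of filtering candidate solutions down to the Pareto frontier, using the hazardous-material example's (length, residents exposed) costs.

    def dominates(c1, c2):
        """c1 dominates c2: no worse on every objective, strictly better on one."""
        return (all(a <= b for a, b in zip(c1, c2))
                and any(a < b for a, b in zip(c1, c2)))

    def pareto_frontier(costs):
        """Keep only the non-dominated cost vectors."""
        return [c for c in costs
                if not any(dominates(o, c) for o in costs if o != c)]

    # Candidate routes as (length in km, residents exposed) pairs.
    routes = [(10, 500), (12, 300), (15, 100), (11, 520), (16, 120)]
    print(pareto_frontier(routes))  # -> [(10, 500), (12, 300), (15, 100)]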
This tutorial will provide an overview of the multi-objective search problem and summarize recent progress in this fast-moving research area. We will cover theoretical foundations, practical algorithms, and challenges that commonly arise in practice in a way that is accessible to all AI researchers and students. Our target audience is anyone interested in search and planning who wants to learn about this fascinating emerging field.
More information on the tutorial and a detailed schedule can be found at https://sites.google.com/usc.edu/aaai24-mos-tutorial/home.
Ariel Felner is a professor of computer science at Ben-Gurion University, Israel. He is interested in all aspects of heuristic search, namely algorithms, heuristics, and applications. Multi-objective search has been the focus of his research for the past few years. Additional information on him can be found at https://felner.wixsite.com/home.
Oren Salzman is an assistant professor at The Henry and Marylin Taub Faculty of Computer Science at the Technion, Israel. His research focuses on revisiting classical computer science algorithms, tools, and paradigms to address the computational challenges that arise when planning motions for robots. Combining techniques from diverse domains such as computational geometry, graph theory, and machine learning, he strives to provide efficient algorithms with rigorous analyses for robot systems with many degrees of freedom moving in tight quarters. Additional information on him can be found at https://orensalzman.com/index.html.
Carlos Hernández is a full professor of computer science at Universidad San Sebastian, Chile. He is interested in all aspects of heuristic search and planning, namely algorithms, heuristics, and applications. Multi-objective search has been the focus of his research since he led an artificial intelligence group for freight transport applications in 2019. Additional information on him can be found in his Google Scholar profile at https://scholar.google.com/citations?user=dNsBXU8AAAAJ&hl=es.
Sven Koenig is a Dean’s professor of computer science at the University of Southern California, USA. Most of his research centers around techniques for decision-making that enable agents (such as robots and decision-support systems) and teams of agents to act intelligently in their environments and exhibit goal-directed behavior in real-time. Additional information about him can be found at idm-lab.org.
Half Day Labs
LH1: Fully Homomorphic Encryption for Privacy-Preserving Machine Learning Using the OpenFHE Library
This tutorial offers a comprehensive exploration of Fully Homomorphic Encryption (FHE) and its practical application to privacy-preserving machine learning (PPML). Aimed at data scientists, ML engineers, and students interested in FHE, the tutorial equips participants with a solid understanding of FHE concepts and hands-on experience using the OpenFHE Python library.
The tutorial covers essential FHE concepts related to ML such as noise accumulation, noise threshold levels, and noise refreshing methods. Participants will be introduced to various FHE flavors that enable ML, such as the lookup-table TFHE and Cheon-Kim-Kim-Song (CKKS) methods. This tutorial focuses on building applications with the CKKS scheme, paying special attention to various cryptographic parameters associated with it, contextualized in terms of how the parameters affect ML applications.
The second section focuses on the OpenFHE library, covering its design principles and motivation. Participants gain insights into ciphertext representation tailored for efficient batch processing in a SIMD (vectorized) fashion. Practical examples demonstrate cryptographic key generation, ciphertext encryption, decryption, and mathematical operations, emphasizing applications in machine learning. The section also provides a glimpse into the library’s development roadmap, including ports to Python and NodeJS, and collaboration with the Google Transpiler team.
The hands-on segment guides participants through training an encrypted logistic regression model using OpenFHE. We begin with a naive implementation of encrypted logistic regression training, gradually introducing optimizations, such as efficient data packing and Nesterov-accelerated gradient descent. As participants are guided through the process, we hope to build intuition and insights that will allow the attendees to learn best practices for building FHE-enabled ML applications and optimize their own programs.
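As a plaintext reference point for this hands-on segment, the sketch below (names ours) implements Nesterov-accelerated gradient descent for logistic regression in NumPy. Under CKKS, the same arithmetic would run over encrypted, packed vectors, with the sigmoid replaced by a low-degree polynomial approximation, since the scheme supports only additions and multiplications.

    import numpy as np

    def sigmoid(z):
        # under CKKS this becomes a low-degree polynomial approximation
        return 1.0 / (1.0 + np.exp(-z))

    def train_logreg_nesterov(X, y, lr=0.1, momentum=0.9, iters=200):
        """Nesterov-accelerated gradient descent for binary logistic regression."""
        w = np.zeros(X.shape[1])
        v = np.zeros_like(w)
        for _ in range(iters):
            lookahead = w + momentum * v  # gradient is taken at the lookahead point
            grad = X.T @ (sigmoid(X @ lookahead) - y) / len(y)
            v = momentum * v - lr * grad
            w = w + v
        return w

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))
    y = (X @ np.array([1.0, -2.0, 0.5]) > 0).astype(float)
    w = train_logreg_nesterov(X, y)
    print("train accuracy:", ((sigmoid(X @ w) > 0.5) == y).mean())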
The next section demonstrates how end users can transform pre-trained models into encrypted models for private inference using OpenFHE. Additionally, we benchmark the CKKS approximate number method against the lookup-table TFHE method to compare their inference performance.
The concluding section presents new ML challenges that will be posted as part of the FHERMA project (https://fherma.io/challenges). The participants will be encouraged to participate in the challenges to solidify their skills in FHE-enabled ML acquired as part of this tutorial. A brief discussion of hybrid PPML methods, where FHE is used in tandem with other privacy-enhancing technologies (PETs), such as secure multiparty computation and federated learning, is also included.
Prerequisite knowledge includes proficiency in Python programming, a grasp of logistic regression, familiarity with NumPy and Single Instruction, Multiple Data (SIMD), and a basic understanding of Support Vector Machines.
Website:
https://openfheorg.github.io/aaai-2024-lab-materials/
Yuriy Polyakov is the Vice President of Cryptography and a Principal Scientist at Duality Technologies. He is a project lead for the OpenFHE software library and serves on the Steering Committee of the HomomorphicEncryption.org standardization consortium. His research has been funded by DARPA, IARPA, NIH, the Simons Foundation, and the Sloan Foundation.
Sukanya Mandal, a Research Engineer and Tech Lead, specializes in driving innovation in Data and AI. She has led teams in developing Federated Learning across diverse domains, focusing on its application in Edge AI. Her work involves extensive research and development, creating prototypes, solutions, and architecting data systems for industrial domains. She actively contributes to OpenFHE.
Ian Quah is a Ph.D. student at the University of Washington’s Ahmed Lab, working on deep reinforcement learning and biologically plausible deep learning. He actively contributes to OpenFHE, focusing on the application of fully homomorphic encryption in machine learning, with an emphasis on outreach and education.
LH2: Introduction to MDP Modeling and Interaction via RDDL and pyRDDLGym
RDDL (pronounced “riddle”) stands for the Relational Dynamic Influence Diagram Language. It is the domain modeling language used in the Probabilistic Planning and Reinforcement Learning tracks of the International Planning Competitions held at the International Conference on Automated Planning and Scheduling (ICAPS) in 2011, 2014, 2018, and most recently 2023. RDDL was designed to efficiently represent real-world stochastic planning problems, specifically Markov Decision Processes (MDPs), with a focus on factored MDPs characterized by highly structured transition and reward functions. This tutorial aims to provide a basic understanding of RDDL, including recent language enhancements and features, through a practical example. We will introduce a problem based on a real-world scenario and incrementally build up its representation in RDDL, starting from the raw mathematical equations up to a full RDDL description. Furthermore, we will introduce “pyRDDLGym,” a new Python framework for the generation of Gym environments from RDDL descriptions. This facilitates interaction with Reinforcement Learning (RL) agents via the standard Gym interface and enables planning agents to work with the model. In a series of exercises, we will explore the capabilities of pyRDDLGym, including generation of Dynamic Bayesian Network (DBN) and eXtended Algebraic Decision Diagram (XADD)-based conditional probability functions, as well as both generic and custom visualization options. We will also generate a functional environment for the example problem. To close the loop from a mathematical representation to a fully operational policy, we will utilize the built-in model-based anytime backpropagation planner, known as “JaxPlan,” to obtain solutions for the example problem, effectively bridging the gap between a theoretical description and a practical working policy.
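To preview the pyRDDLGym workflow, here is a minimal interaction loop under the standard Gym interface. The file paths are placeholders, and the module path and step signature follow the project documentation at the time of writing, so treat them as assumptions that may differ across versions.

    from pyRDDLGym import RDDLEnv

    # Placeholder RDDL files; pyRDDLGym compiles them into a Gym environment.
    # (Exact module paths and return signatures may vary by version.)
    env = RDDLEnv.RDDLEnv(domain='domain.rddl', instance='instance0.rddl')

    state = env.reset()
    for _ in range(100):  # roll out a short episode with random actions
        action = env.action_space.sample()
        next_state, reward, done, info = env.step(action)
        if done:
            break
    env.close()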
Professor Scott Sanner University of Toronto ssanner@mie.utoronto.ca
Scott Sanner is an Associate Professor at the University of Toronto specializing in AI topics such as sequential decision-making, recommender systems, and machine/deep learning applications. He is an Associate Editor for three journals and a recipient of four paper awards and a Google Faculty Research Award.
Dr. Ayal Taitler
University of Toronto ataitler@gmail.com
Ayal Taitler is a Postdoctoral Fellow at the University of Toronto, working with Prof. Scott Sanner. His research interests lie at the intersection of reinforcement learning, automated planning, and control theory, and their application to robotics and intelligent transportation. He has over 10 years of experience in software engineering and AI research.
LH3: Measurement Layouts for Capability-oriented AI Evaluation
Recent years have witnessed an explosion in the general-purpose capabilities of AI systems, presenting unique challenges for their evaluation. Estimating capabilities, as opposed to performance, is essential for general-purpose, as opposed to task-specific, systems.
Understanding the capabilities of AI systems is crucial for anticipating their suitability in situations and occupations requiring particular cognitive skill levels to meet the expected demands. Techniques and methodologies from the cognitive sciences are more appropriate than task-oriented benchmarks for this evaluation of capability; however, they require a common language and toolkit to facilitate cross-disciplinary collaboration. In this lab, we present one such approach, the Measurement Layouts framework, which leverages large hierarchical Bayesian networks to infer the capabilities of AI systems. We aim to demonstrate the powerful evaluation inferences this framework makes possible for various AI systems (RL agents, language models, etc.), with the hope of building a diverse community of interdisciplinary researchers committed to improving AI evaluation.
The lab comprises multiple sessions introducing participants to the Measurement Layouts framework and capability-oriented AI evaluation. We will start with a discussion of the limitations of current AI evaluation practices and motivate capability-oriented evaluation, drawing on robust experimental practices from the cognitive sciences. Through hands-on practical experience, attendees will acquire the skills needed to implement the Measurement Layouts framework in their research and projects, adapting the material (examples, code, benchmarks, etc.) presented in this lab. Participants will actively engage in building measurement layouts in two scenarios: (i) assessing the capability of an agent in a navigation task in a 3D environment and (ii) evaluating the capabilities of large language models relative to those that might be required were such systems to take on human occupations. Finally, participants will learn to reuse and extend existing benchmarks with demand annotations to infer capabilities efficiently, as well as to create new benchmarks with a grid of demands that allows the inference process to triangulate more efficiently, achieving high levels of validity.
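The inference at the heart of a measurement layout can be previewed with a one-dimensional toy (ours, not the lab's codebase): each task instance has a known demand, success follows a logistic curve in capability minus demand, and a grid posterior over capability is recovered from pass/fail results.

    import numpy as np

    def capability_posterior(demands, successes, grid=np.linspace(0, 10, 501)):
        """Posterior over a latent capability given pass/fail results on task
        instances of known demand, under a logistic success model."""
        demands, successes = np.asarray(demands), np.asarray(successes)
        # P(success | capability c, demand d) = sigmoid(c - d)
        p = 1.0 / (1.0 + np.exp(-(grid[:, None] - demands[None, :])))
        lik = np.where(successes[None, :] == 1, p, 1.0 - p).prod(axis=1)
        return grid, lik / lik.sum()  # flat prior over the grid

    # An agent that passes everything up to demand ~6 and fails beyond it.
    demands   = [1, 2, 3, 4, 5, 6, 7, 8, 9]
    successes = [1, 1, 1, 1, 1, 1, 0, 0, 0]
    grid, post = capability_posterior(demands, successes)
    print("posterior mean capability:", (grid * post).sum())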
Our target audience includes all researchers interested in improving AI evaluation. By attending this lab, participants will gain a comprehensive understanding of capability-oriented evaluation, the measurement layouts framework, and its practical applications. Specifically, they will acquire insights into how capability-oriented evaluation can enhance AI assessment beyond traditional benchmarking. Participants will learn to effectively apply the measurement layout framework, enabling them to estimate AI capabilities, and they will also gain an understanding of the benefits and challenges associated with adopting this approach.
John Burden is a Senior Research Associate at the University of Cambridge. He holds a PhD in Computer Science, through which he explored the role of learning abstraction in Reinforcement Learning. John’s research focuses on evaluating general-purpose AI, particularly with reference to how we can better measure generality and capabilities.
Konstantinos Voudouris is a PhD candidate and incoming Research Associate at the University of Cambridge. He is interested in interdisciplinary research in Artificial Intelligence, applying insights from developmental and comparative psychology to identify the failure modes of contemporary AI systems, with a view to making them safer and more robust.
Marko Tešić is a Research Associate at the University of Cambridge. He focuses on evaluating AI systems and exploring translation pathways between AI capabilities and occupational task demands. Marko was a RAEng research fellow investigating how explanations of AI systems affect people’s beliefs. He holds a PhD in Psychology.
Lucy Cheke is an Associate Professor of experimental psychology at the University of Cambridge with expertise in cognitive assessment of children and nonhuman animals. She is a founding member of the Animal AI team and principal/co-principal investigator on multiple research projects investigating cognition in biological and artificial intelligence systems.
José Hernández-Orallo (https://josephorallo.webs.upv.es/) is Professor at TU Valencia and Senior Research Fellow at the Leverhulme Centre for the Future of Intelligence. He has worked on several areas of AI, machine learning and intelligence measurement, with a focus on the capabilities, generality, impact and risks of AI.
LH4: Causal Fairness Analysis
Decision-making systems based on AI and machine learning have been used throughout a wide range of real-world scenarios, including healthcare, law enforcement, education, and finance. It is no longer far-fetched to envision a future where autonomous systems will drive entire business decisions and, more broadly, support large-scale decision-making infrastructure to solve society’s most challenging problems. Issues of unfairness and discrimination are pervasive when decisions are being made by humans, and remain (or are potentially amplified) when decisions are made using machines with little transparency, accountability, and fairness. In this tutorial, we describe the framework of causal fairness analysis with the intent of filling in this gap, i.e., understanding, modeling, and possibly solving issues of fairness in decision-making settings.
The main insight of our approach will be to link the quantification of the disparities present in the observed data with the underlying, often unobserved, collection of causal mechanisms that generate the disparity in the first place, a challenge we call the Fundamental Problem of Causal Fairness Analysis (FPCFA). In order to solve the FPCFA, we study the problem of decomposing variations and empirical measures of fairness that attribute such variations to structural mechanisms and different units of the population.
Our effort culminates in the Fairness Map, the first systematic attempt to organize and explain the relationship between the various criteria found in the literature. Finally, we discuss which causal assumptions are minimally needed for performing causal fairness analysis and propose practical solutions for three key fairness tasks (in increasing order of complexity): (i) bias detection, in which disparities are analyzed and quantified; (ii) fair prediction, in which the task is to create predictions satisfying a desired notion of fairness; and (iii) fair decision-making, in which the task is to implement a fair policy and achieve lower disparities over time.
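As a minimal entry point to task (i), the sketch below (names ours) computes the observed total-variation disparity between groups, the aggregate quantity that the causal machinery then decomposes into direct, indirect, and spurious contributions.

    import numpy as np

    def tv_measure(x, y):
        """Observed disparity (total variation): P(Y=1 | X=1) - P(Y=1 | X=0),
        the starting point that causal fairness analysis decomposes into
        mechanism-specific contributions."""
        x, y = np.asarray(x), np.asarray(y)
        return y[x == 1].mean() - y[x == 0].mean()

    # Synthetic decisions for a protected attribute X and outcome Y.
    rng = np.random.default_rng(1)
    x = rng.integers(0, 2, 10_000)
    y = rng.binomial(1, np.where(x == 1, 0.55, 0.45))
    print("TV disparity:", tv_measure(x, y))  # approximately 0.10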
Prerequisite knowledge: Attendees are expected to be familiar with standard notions in probabilistic reasoning. Familiarity with causal inference and the framework of structural causal models is desirable, but not necessary since the tutorial is self-contained, covering the key concepts of causal inference, and also introducing causal fairness from first principles.
For the practical lab component, basic familiarity with running R-code will be required, although all the code will be provided and explained during the lab. The tutorial will be entirely self-contained.
Elias Bareinboim is an associate professor in the Department of Computer Science and the director of the Causal Artificial Intelligence (CausalAI) Laboratory at Columbia University. His research focuses on causal and counterfactual inference and their applications to data-driven fields in the health and social sciences as well as artificial intelligence and machine learning. His work was the first to propose a general solution to the problem of “data-fusion,” providing practical methods for combining datasets generated under different experimental conditions and plagued with various biases. More recently, Bareinboim has been exploring the intersection of causal inference with decision-making (including reinforcement learning) and explainability (including fairness analysis). Bareinboim received his Ph.D. from the University of California, Los Angeles, where he was advised by Judea Pearl. Bareinboim was named one of “AI’s 10 to Watch” by IEEE, and is a recipient of the NSF CAREER Award, the ONR Young Investigator Award, the Dan David Prize Scholarship, the 2014 AAAI Outstanding Paper Award, and the 2019 UAI Best Paper Award.
Drago Plecko is a postdoctoral research scholar in the Computer Science Department at Columbia University, working with Elias Bareinboim. Before joining Columbia, he completed his PhD in the Seminar for Statistics at ETH Zürich under the supervision of Nicolai Meinshausen. His research interests are in applying causal inference for trustworthy data science, including both statistical and computational aspects. Drago has previously worked on fair machine learning and explainability, and more recently on algorithmic recourse, and is also interested in medical applications investigating health equity.
Quarter Day Labs
LQ1: Enabling trustworthy AI with metadata tracking using Common Metadata Framework
The rapid development of large ML models has emphasized the need for trustworthiness in the field of AI. The societal impacts of these models are enormous, and their implications require careful ethical assessment. Various bodies, such as the European Commission's High-Level Expert Group on AI, have stressed the need to build systems that are transparent, ensure data privacy, and are auditable. This lab introduces metadata tracking for artificial intelligence and machine learning and its importance in ensuring the trustworthiness of ML models, using a Python tracking library called the Common Metadata Framework (CMF).
This tutorial aims to cover the following areas:
- Why metadata tracking is necessary for trustworthy AI.
- Practitioners' responsibilities and obligations as outlined in various AI acts.
- Landscape of metadata tracking tools.
- Introduction to Common Metadata Framework.
- Hands-on session on Common Metadata Framework to enable trustworthy AI.
The tutorial focuses on creating reproducible AI pipelines with end-to-end metadata, lineage, and provenance. It introduces integrated versioning of the code, artifacts, and hyperparameters associated with AI pipelines, and provides methods to share metadata between different teams or individuals in collaborative development. It also details the regulatory obligations outlined by various governance bodies and how metadata tracking enables compliance, gives a high-level overview of other metadata tracking tools on the market, and includes a hands-on session on CMF. The tutorial shows how to track metadata in complex scientific workflows, with examples from model training in the Fusion Data Platform and from computational steering of experiments to discover specific features in nanomaterials using electron microscopy. CMF plays a central role in the Fusion Data Platform to support the design and safe operation of a fusion pilot plant; development of the Fusion Machine Learning Data Science Platform (FDP) was selected for funding by the Department of Energy. More details about CMF can be found at https://github.com/HewlettPackard/cmf.
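For a flavor of the hands-on session, the sketch below logs one pipeline stage with cmflib. The method names follow the project README at the time of writing and should be treated as assumptions; consult the repository above for the current API.

    from cmflib import cmf

    # Metadata is written to a local MLMD store ("mlmd") under a named pipeline.
    # (Method names are assumptions based on the CMF README.)
    metawriter = cmf.Cmf(filename="mlmd", pipeline_name="demo-pipeline")
    _ = metawriter.create_context(pipeline_stage="train")
    _ = metawriter.create_execution(execution_type="train-model")

    metawriter.log_dataset("data/train.csv", "input")       # lineage: consumed artifact
    # ... train a model and save it to model.pkl ...
    metawriter.log_model(path="model.pkl", event="output")  # lineage: produced artifact
    metawriter.log_execution_metrics("metrics", {"accuracy": 0.93})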
Prerequisite knowledge:
Basic AI and ML awareness and a working knowledge of Python. AI practitioners and researchers from different backgrounds can benefit from the session. It can help early-career researchers, students, and seasoned practitioners understand the regulatory landscape and gain awareness of new tools and frameworks that can help them produce reproducible experiments.
Annmary Justine is a senior research engineer at Hewlett Packard Labs. She has over 16 years of industry experience spanning storage systems, data platforms, and analytical engines. Her current focus is metadata, lineage, and provenance tracking in distributed AI pipelines. She has a master's degree from the National Institute of Technology Surathkal.
Sergey Serebryakov is an engineer at Hewlett Packard Labs. He has worked on relation extraction from texts; benchmarking tools and performance analysis for deep learning workloads; and real-time anomaly detection for time series data. Currently, he focuses on accelerating machine learning pipelines by utilizing metadata of past pipeline runs.
Aalap Tripathy is a Principal Research Engineer at the AI Research Lab at Hewlett Packard Labs. His interests are in edge-to-exascale computing and in data and metadata analytics and management for trusted and responsible AI. He has a Ph.D. in Computer Engineering from Texas A&M and a bachelor's degree from BITS Pilani, India.
Suparna Bhattacharya is an HPE Fellow in the AI research lab at Hewlett Packard Labs, where she currently focuses on data-centric trustworthy AI. She is also an IEEE Fellow and a Fellow of the Indian National Academy of Engineering for lasting contributions to system software research and development.
Martin Foltin is a principal research engineer at Hewlett Packard Labs. He has been with Hewlett Packard Enterprise for over 23 years in various roles spanning silicon design automation, VLSI chip architecture, and artificial intelligence. Martin manages the Data Foundation for AI engineering team at HPE, focusing on the development of data-centric AI infrastructures that help optimize data and pipelines for responsible AI.
LQ2: Harnessing Large Language Models for Planning: A Lab on Strategies for Success and Mitigation of Pitfalls
In the rapidly evolving landscape of artificial intelligence, Large Language Models (LLMs) have emerged as potent tools with an impressive aptitude for understanding and generating human-like text. Their relevance in the domain of planning is particularly noteworthy, given the similarities between planning tasks and code-related tasks, a forte of LLMs. Planning, whose Planning Domain Definition Language (PDDL) resembles a script in the Lisp programming language, presents fertile ground for exploring the capabilities of LLMs in devising effective and efficient plans. This lab delves deep into the nuances of utilizing LLMs for planning, offering participants a comprehensive understanding of various techniques integral to the functioning of these models. Participants will be introduced to supervised fine-tuning and a range of prompting techniques, fostering a critical analysis of which approaches tend to enhance planning capabilities significantly. At the heart of this lab is a hands-on session where participants can work closely with “Plansformer,” our fine-tuned model developed explicitly for planning tasks. This session provides a comparative analysis of current state-of-the-art LLMs, including GPT-4, GPT-3.5, BARD, and Llama 2, offering insights into their respective strengths and weaknesses in planning. We will also briefly explain and show how neuro-symbolic approaches can complement incorrect generations from LLMs.
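To make the prompting side concrete, here is a minimal sketch in Python. The PDDL snippets are elided toys, and llm_generate is a hypothetical stand-in for whichever model API is used (GPT-4, Llama 2, or Plansformer).

    DOMAIN = "(define (domain blocks) ...)"      # toy PDDL domain, elided
    PROBLEM = "(define (problem stack3) ...)"    # toy PDDL problem, elided

    def llm_generate(prompt: str) -> str:
        """Hypothetical stand-in for a chat-model API call."""
        return "(unstack b c)\n(put-down b)\n(stack a b)"

    def plan_prompt(domain: str, problem: str) -> str:
        """Ask for a plan as one ground PDDL action per line."""
        return (f"Domain:\n{domain}\n\nProblem:\n{problem}\n\n"
                "Return only a valid plan, one ground action per line.")

    raw = llm_generate(plan_prompt(DOMAIN, PROBLEM))
    plan = [ln.strip() for ln in raw.splitlines() if ln.strip().startswith("(")]
    print(plan)
    # A symbolic validator (e.g., VAL) would then check the plan; invalid steps
    # can be repaired classically, as in the neuro-symbolic combination above.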
Vishal Pallagani is currently a Ph.D. student in Computer Science and Engineering at the University of South Carolina. Vishal's research interests lie at the intersection of Natural Language Processing and Automated Planning. He worked with an interdisciplinary team to create Plansformer, a large language model capable of generating valid and cost-optimal symbolic plans.
Keerthiram Murugesan is a Research Scientist at IBM Research, Yorktown Heights, NY. His current research interests are online learning, multi-armed bandits, (Multi-agent) reinforcement learning, hyperparameter optimization, etc., with applications to natural language understanding.
Biplav Srivastava is a professor of computer science at the AI Institute at the University of South Carolina. Dr. Srivastava is interested in enabling people to make rational decisions despite real-world complexities of poor data, changing goals, and limited resources by augmenting their cognitive limitations with technology. He is exploring new approaches for goal-oriented, ethical, human-machine collaboration via natural interfaces using domain and user models, learning, and planning.
Francesca Rossi is an IBM fellow and the IBM AI Ethics Global Leader. Her research interests focus on artificial intelligence, including constraint reasoning, preferences, multi-agent systems, computational social choice, and collective decision-making. She is also interested in ethical issues in the development and behavior of AI systems, particularly for decision support systems for group decision-making. She has published over 200 scientific articles in journals and conference proceedings, and as book chapters. She has co-authored a book and edited 17 volumes between conference proceedings, collections of contributions, special issues of journals, and a handbook.
Dr. Lior Horesh is a Principal Research Scientist and a Senior Manager of the Mathematics and Theoretical Computer Science group at the MIT-IBM Research Lab. His group’s mission is to approach some of AI’s big challenges from a principled mathematical angle. This involves conceiving and bringing in state-of-the-art mathematical theories, algorithms, and analysis tools to advance AI’s fundamental generalizability, scalability, and interpretability.
LQ3: Digging into the Landscape of Graphs Counterfactual Explainability
Unveiling the inner workings behind ML decisions, counterfactual explanations offer alternative scenarios for diverse outcomes. This powerful tool empowers end users with choices, making counterfactual explainability a game-changer, especially in the dynamic realm of graph-based ML, where the heterogeneous nature of the data makes it difficult to understand the underlying relationships between the input and the outcome.
In the first part of this lab (30 minutes), we will introduce the challenges of developing and evaluating graph counterfactual explanation (GCE) methods, including a lack of standardization in metrics and oracles and a shortage of benchmarking studies. Then, we will present the GRETEL framework for developing and evaluating GCE methods. Furthermore, we will provide hands-on examples of how to build an explanation pipeline using GRETEL.
In the second part (30 minutes), we will present how the different categories of explainers can be implemented into the framework, including search-based methods, heuristic-based ones, learning-based ones, and global-level explainers. Moreover, we will analyze an empirical comparison of some of these methods in different datasets.
In the third part of the lab (30 minutes), we will focus on a hands-on experience demonstrating how to extend the framework to fit users' needs. Industry practitioners will learn how to use the framework on their own datasets and to explain their own ML models, while attendees interested in XAI will learn how to develop and evaluate their explainers using the ready-to-use datasets, oracles, and evaluation metrics provided in GRETEL. Furthermore, we will show how to use the data analysis capabilities of the framework to get more insights into the explanations.
In the final section of the lab (15 minutes), we will discuss the challenges of providing graph counterfactual explanations and possible ways to tackle them. This part will allow attendees to participate actively in the discussion.
Audience and Scope:
The lab is aimed at practitioners in academia and industry interested in applying explainable machine learning techniques to analyze graph data. In particular, it is focused on the use of the GRETEL framework. Participants with a background in data mining will gain an understanding of the information provided by explanation methods to end users and of how to integrate their datasets into an explanation pipeline. Those with machine learning expertise will delve deeper into state-of-the-art counterfactual explainers, critically analyzing their strengths and weaknesses. They will also learn to integrate their own oracles and develop new explanation methods using the GRETEL framework. The lab will be conducted with the help of the SoBigData.it research infrastructure; see the website for the latest updates on the requirements.
For more information and updates, visit: https://aiimlab.org/events/AAAI_2024_Digging_into_the_Landscape_of_Graphs_Counterfactual_Explainability.html
For those also interested in a theoretical overview of counterfactual explanations on graphs, there is a dedicated tutorial session (TQ10: Graphs Counterfactual Explainability: A Comprehensive Landscape).
Mario Alfonso Prado-Romero Gran Sasso Science Institute marioalfonso.prado@gssi.it
Mario Prado is a PhD Fellow in AI at the Gran Sasso Science Institute. His main research focus is at the intersection of GNNs and XAI. He is a key contributor to the GRETEL project and serves as a PC member for top-tier conferences and journals. He is currently the only Research Intern selected by NEC Laboratories Europe to work on XAI in the biomedical domain.
Dr. Bardh Prenkaj
Sapienza University of Rome prenkaj@di.uniroma1.it
Dr. Bardh Prenkaj is a postdoctoral researcher in Machine Learning at Sapienza University of Rome. He is part of the GRETEL project, specializing in generative counterfactual explainers. He serves as a program committee member for top-tier conferences and journals and collaborates with international research groups.
Prof. Giovanni Stilo
University of L’Aquila giovanni.stilo@univaq.it
Prof. Giovanni Stilo is an associate professor in Computer Science and Data Science at the University of L’Aquila. He leads the Data Science Master’s program, specializes in trustworthiness aspects, and manages the GRETEL project. He organizes international workshops, serves on journal editorial boards, and leads key projects in data science and education.