Workshop on Embodied and Safe-Assured Robotic Systems
In conjunction with NeurIPS 2025; Date: November 30 | Room: Don Alberto 2, Hilton Mexico City
| Time | Event |
|---|---|
| 13:00 – 13:10 | Opening Remarks |
| 13:10 – 13:40 |
Dr. Sinan Hasirlioglu (Audi)
Sinan Hasirlioglu received his Ph.D. in Engineering Science from Johannes Kepler University Linz, Austria, in 2020. His doctoral research focused on simulation-based testing and validation of automotive surround sensors under adverse weather conditions. In the same year, he joined Robert Bosch GmbH, where he contributed to the Urban Automated Driving project with a focus on system validation and release. He later served as a Product Owner for SOTIF (Safety of the Intended Functionality) evaluation. Since 2023, he has been with AUDI AG as Release Manager for the Highway Pilot project. His research interests include scenario-based assessment and innovative release methods for highly automated driving systems.
The Advantages of Using Scenarios as Part of the Specification in Automated Driving
For the development and deployment of driving automation systems, manufacturers use scenarios to decompose the complex world into smaller, tractable pieces. Traditionally, scenarios have served primarily in verification and validation processes. This talk introduces the Scenario as Specification approach, which connects requirements to scenario-based descriptions of target behavior. By combining target behavior with its environmental context, scenarios become a foundation for safety engineering as well as for verification and validation. Practical examples illustrate the key concepts and the motivation behind this approach.
|
| 13:40 – 14:10 |
Prof. Andrea Stocco (TUM)
Andrea Stocco is an Assistant Professor at the Technical University of Munich, at the Chair of Software Engineering for Data-intensive Applications in the School of Computation, Information and Technology. He also heads the Automated Software Testing unit at fortiss. His research focuses on the interface between software engineering and deep learning, with the goal of improving the robustness, reliability, and dependability of data-intensive software systems. He has received several awards, including two Distinguished Paper Awards at ICST 2025, the Best Paper Award at QUATIC 2023, and the Best Student Paper Award at ICWE 2016. He serves on the program committees of major software engineering conferences (ICSE, FSE, ASE, ISSTA, ICST) and reviews for top journals (TSE, TOSEM, EMSE, JSS, IST).
Testing of Autonomous Driving Systems: From Simulated to Real-world Environments
Autonomous driving systems (ADS) require extensive testing to ensure safety, reliability, and robustness across a wide spectrum of operational conditions. This talk introduces the state of the art in testing and evaluation of ADS, with a particular focus on simulation-based techniques and the challenges of transferring results from simulation to real-world environments. We discuss approaches for scenario generation and performance assessment, as well as current limitations in realism, sensor modeling, and non-deterministic behavior. Finally, the talk explores emerging research directions, such as sim2real consistency analysis, uncertainty quantification, and AI-driven test generation.
|
| 14:10 – 14:40 |
Prof. Sergey Levine (UC Berkeley / Physical Intelligence)
Sergey Levine received a BS and MS in Computer Science from Stanford University in 2009 and a Ph.D. in Computer Science from Stanford University in 2014. He joined the faculty of the Department of Electrical Engineering and Computer Sciences at UC Berkeley in fall 2016. His work focuses on machine learning for decision making and control, with an emphasis on deep learning and reinforcement learning algorithms. Applications of his work include autonomous robots and vehicles, as well as other decision-making domains. His research includes developing algorithms for end-to-end training of deep neural network policies that combine perception and control, scalable algorithms for inverse reinforcement learning, deep reinforcement learning algorithms, and more.
Vision-Language-Action Models
Vision-language-action (VLA) models enable multimodal language models to address challenges in robotics and control. While the basic foundation of VLAs is relatively simple (integrating action outputs into vision-language backbones), these models open up a wide range of new research topics. VLAs can use complex reasoning to solve temporally extended problems, refine their behavior through in-context learning, and leverage reinforcement learning to improve from experience. These new capabilities present exciting opportunities as well as challenges, which I will discuss in this talk.
|
| 14:40 – 15:10 |
Prof. Jinqiu Yang (Concordia University)
Dr. Jinqiu Yang is an Associate Professor at Concordia University in Montreal, Canada, working at the intersection of Software Engineering and Artificial Intelligence (SE4AI and AI4SE). She leads the O-RISA lab, focusing on robust LLM-based code generation, the reliability of complex AI systems such as autonomous vehicles, and practical robustness frameworks for ADS perception. She has secured over $2M in research funding and has served on the program committees of ICSE, FSE, ASE, OOPSLA, and more. She is an Associate Editor for EMSE and TOSEM and a recipient of multiple awards, including the ACM Distinguished Paper Award, the IBM CAS Project of the Year, and the Gina Cody Innovation Fellowship.
Towards Robust and Efficient Autonomous Driving Systems from the Lens of Software Engineering
Autonomous driving systems operate in complex, unpredictable environments where accuracy alone is insufficient. Safe deployment requires robustness, efficiency, fault tolerance, and systematic testing. This talk discusses how software engineering perspectives help bridge the gap between ML model performance and real-world dependable autonomy. Dr. Yang introduces recent work on robustness frameworks for ADS perception, stress-testing perceptual modules, and detecting high-latency behaviors. These findings highlight the importance of engineering-driven analysis in building safer and more reliable autonomous systems.
|
| 15:10 – 15:40 |
Prof. Krzysztof Czarnecki (University of Waterloo)
Krzysztof Czarnecki is a Professor of Electrical and Computer Engineering and a University Research Chair at the University of Waterloo, where he leads the Waterloo Intelligent Systems Engineering (WISE) Laboratory. He also serves as an Associate Director of the Waterloo Centre for Automotive Research (WatCAR). His research focuses on assuring the safety of AI systems and driving behavior. In 2018, he co-led the development of the first autonomous vehicle tested on public roads in Canada. He has made significant contributions to automotive AI and software safety standards, including SAE J3164 and ISO 8800. Before joining the University of Waterloo, he worked at DaimlerChrysler Research in Germany (1995–2002), where he advanced software development practices and technologies for the enterprise, automotive, and aerospace sectors. His work has been recognized with numerous awards, including the Premier's Research Excellence Award (2004) and the British Computer Society's Upper Canada Award for Outstanding Contributions to the IT Industry (2008). He has also received twelve Best Paper Awards, two ACM Distinguished Paper Awards, and five Most Influential Paper Awards.
Systematizing the Unusual: A Taxonomy-Driven Dataset for Vision–Language Model Reasoning About Edge Cases in Traffic
One of the central challenges in developing robust vision–language models (VLMs) for real-world autonomy is their ability to recognize, interpret, and reason about rare and hazardous situations—so-called edge cases. Unlike routine traffic patterns, which are well-represented in large-scale datasets, these scenarios occur infrequently, are highly diverse, and often involve subtle contextual cues that challenge both object detection and semantic understanding.
EdgeScenes is a new dataset under development at the WISE Lab that directly targets this limitation. It systematically captures and organizes rare road situations using a structured, ontology-driven taxonomy of hazardous conditions spanning infrastructure anomalies, abnormal road-user behavior, foreign objects, environmental extremes, and complex interactions. The dataset is constructed from crowdsourced video footage and annotated with a rich multimodal schema covering over 300 fine-grained hazardous conditions, along with temporal extents and contextual tags. As such, it establishes a testbed specifically designed for evaluating VLMs on out-of-distribution, safety-critical driving scenarios.
This talk will present the motivation, design principles, and annotation methodology behind EdgeScenes, with particular emphasis on how taxonomy-based labeling enables systematic identification of edge cases and coverage gaps. I will discuss key insights gathered during dataset construction, including patterns in real-world hazard emergence and challenges in visual–semantic grounding. Finally, I will report early experimental results with frontier VLMs, showing that while current models can detect a broader range of hazards than closed-vocabulary vision systems, they still exhibit severe limitations, with both high false positive and high false negative rates in recognizing road hazards. These findings point to an urgent need for progress on the reliability of VLMs to support trustworthy multimodal perception in autonomous systems.
|
| 15:40 – 16:10 |
Dr. Mozhgan Nasr Azadani (University of Waterloo)
Dr. Mozhgan Nasr Azadani is a Postdoctoral Fellow at the WISE Lab, University of Waterloo, specializing in the intersection of autonomous driving and natural language processing. Her research focuses on developing trustworthy, interactive, and scalable multimodal models that integrate vision and language to enhance perception, understanding, reasoning, and decision-making in autonomous systems such as vehicles and robots. She completed her Ph.D. in Computer Science at the University of Ottawa, where her work centered on driving behavior analysis and prediction for safe autonomous vehicles. Her work has been recognized with the prestigious NSERC Postdoctoral Fellowship.
Toward Efficient and Reliable Vision–Language Models for Real-World Autonomous Systems
Vision–language models (VLMs) are becoming central components in autonomous systems, enabling capabilities such as scene understanding, instruction following, and high-level decision support. However, their deployment in real-world environments remains constrained by two fundamental challenges: the need for reliable visual perception and the need for computational efficiency under strict latency and resource budgets. In this talk, I will present a unified research trajectory that addresses these challenges through new designs for knowledge-efficient, token-efficient, and expert-efficient multimodal models. I will showcase three complementary advances: HAWAII, a hierarchical distillation framework that transfers the strengths of multiple vision experts into a single lightweight encoder for improved visual understanding; LEO-MINI, an efficient multimodal model that dramatically reduces visual token redundancy using conditional token reduction and multi-modal expert routing; and Leo, a principled mixture-of-vision-encoders design that reveals simple, effective fusion strategies for high-resolution perception. Together, these contributions show how to achieve strong visual reasoning under tight computational and latency constraints, offering practical, deployable VLM solutions for autonomous systems operating in complex, dynamic, and safety-critical environments.
|
| 16:10 – 16:30 | Break |
| 16:30 – 16:45 | Oral Paper 1 – MC-Risk: Multi-Component Risk Fields for Risk Identification and Motion Planning |
| 16:45 – 17:00 | Oral Paper 2 – Multi-Task Consistency-based Detection of Adversarial Attacks |
| 17:00 – 17:15 | Oral Paper 3 – Balance Equation-based Distributionally Robust Offline Imitation Learning |
| 17:15 – 17:30 | Oral Paper 4 – FAIR Voice Biomarker Data for Safe-Assured Embodied Health |
| 17:30 – 17:45 | Oral Paper 5 – VLM-Enhanced Adversarial Scene Generation for Safe Autonomous Driving |
| 17:45 – 17:50 | Poster 1 – SCSG: Real-World Report–Augmented Safety-Critical Scenario Generation |
| 17:50 – 17:55 | Poster 2 – The Horcrux: Mechanistically Interpretable Task Decomposition for Reward Hacking Detection |
| 17:55 – 18:00 | Poster 3 – Diversity-Guided Genetic Algorithm for Safety-Critical Scenario Generation |
| 18:00 – 18:05 | Poster 4 – GUARDIAN: Uncertainty-Aware Runtime Dual Invariants for Neural Signal-Controlled Robotics |
| 18:05 – 18:10 | Poster 5 – Probabilistic Formal Verification for Safe Neural Network Navigation |
| 18:10 – 18:15 | Poster 6 – VLM-guided Object-level Segmentation from Dynamic Scenes |
| 18:15 – 18:30 | Award Ceremony & Closing Remarks |
| 18:30 – 19:00 | Poster Discussion + Q/A |
| Total Time | 13:00 – 19:00 |
Fraunhofer FOKUS
Huawei Munich Research Center
The University of Tokyo
Technical University of Munich
Technical University of Munich
University of Texas at Austin
The Chinese University of Hong Kong
Universität Paderborn
Ruhr-Universität Bochum
Technical University of Munich
SenseAuto Research
Technische Universität Graz
AMD
University of California, Los Angeles
City University of Macau