Certified policy synthesis for adaptive CPS/IoT automotive




Cyber-Physical Systems (CPS) and Internet of Things (IoT) systems are complex engineering systems that combine analogue/physical quantities with digital/discrete controllers. Modern CPS and IoT systems use learning components that gather data from sensors, process it through internal functions, and output signals accordingly. Because these internal functions are opaque to the designer, this is known as black-box technology.

This project aimed to develop solutions for practical verification issues for complex CPS/IoT systems in the automotive domain, which include black-box learning/adaptive components. The presence of these components renders verification tasks particularly tricky: the certification of black-box components is a core, unmet challenge for safety-critical applications.

The focus of the project was on breaking this barrier and on transferring technology via industrial demonstrations. We have engaged with an existing industrial partner and sought a new industrial contact in the automotive domain.

Project aims

This project planned to transfer knowledge from previous research, namely new results on certified synthesis of CPS/IoT with adaptive components, to scenarios and demonstrations that are relevant to the cutting-edge automotive industry.

In collaboration with ZF, a tier-one automotive supplier, we have examined issues arising from use cases around advanced driver assistance systems (ADAS). The automotive industry has a clear need for new solutions to the problem of certified learning for autonomous systems in safety-critical environments. We focused in particular on adaptive cruise control (ACC) scenarios, encompassing various levels of autonomy (ZF is interested in targeting deployment of level-4 solutions within the next few years).

We have synthesised certified policies via reinforcement learning (RL) around specific ACC goals, such as platooning or autonomous lane changing/merging. We leveraged ZF simulation software for these scenarios, and later used a proprietary hardware-in-the-loop implementation. (The open software was eventually installed locally on servers at Oxford, which enables our work to continue beyond the project timeframe.) Deployment of the solution in demonstrations has included elements of perception and data fusion, for instance through specific sensors or computer vision components.

What was done?

During this project we have successfully engaged with two companies: ZF Friedrichshafen (DE) and McLaren (UK). The PI has engaged with members of his group, OXCAV, namely DPhil students, two MSc students and their projects, and the hired RA Dr Hasanbeig.

This Pitch-In project has greatly accelerated the engagement of OXCAV with automotive companies around the IoT area. The hired RA, Dr Hasanbeig, has been very supportive. As per our plan above, we have worked with ZF Friedrichshafen (DE), via our contact Dr Shashank Pathak.

Related to this research initiative in automotive, Dr Hasanbeig and I have also supervised a Y3 student in CS, Adam Nicholls, on his Y3 project titled ‘Application of reinforcement learning methods to problems of Formula One race strategy’. This project has been very exciting, and Adam is continuing on a Y4 project with me in HT-TT21; this has drawn the interest of McLaren. Dr Randy Singh is providing co-supervision, data, and hardware support for the new project. We are now fully engaged with engineers at McLaren on this project, and hope this can be continued in the future.

Pitch-In has supported the summer internship of Joar Skalse, a brilliant Y4 student who is now continuing his studies as a DPhil student under my supervision. We have worked on reinforcement learning and are now translating his results to the CARLA-based automotive simulations (see above), which will run on the new hardware.

Presently, we are aiming at publication outputs from this project, and a CS undergraduate is engaged in an internship that will conclude in June 2021.

Finally, Pitch-In has also enabled the acquisition of new hardware for the OXCAV group: this is being installed at CS, will allow us to run the CARLA software internally, and will support further related work beyond the ongoing project.


We have fully attained the planned goals to transfer recent technology from OXCAV to two different but related industrial scenarios in CPS-IoT.

The framework of logically-constrained RL has been successfully applied to the two broad case studies. Both companies have found this application to be relevant for them. One industrial collaboration (with ZF) has been strengthened, another (with McLaren) established. Despite the difficulties that the automotive industry has faced over the past year owing to the pandemic, we hope to secure support from our partners to continue our collaborations in the future.

Deliverables and other tangible outputs

We are planning to write a paper titled ‘Safe automotive perception via deep logically-constrained reinforcement learning’, to be submitted to the International Conference on Intelligent Robots and Systems (IROS) in Summer 2021.

We have also submitted a paper titled ‘Lexicographic multi-objective reinforcement learning’ to the ECRL conference, in April 2021.

Dr Hasanbeig has also published the following works on reinforcement learning, which have direct impact on our IoT applications in automotive:

Jeppu, N, Hasanbeig, M, Abate, A, Melham, T, and Kroening, D (2021) ‘DeepSynth: program synthesis for automatic task segmentation in deep reinforcement learning’, AAAI.

Hasanbeig, M, Kroening, D, and Abate, A (2020) ‘Deep reinforcement learning with temporal logics’, FORMATS 2020, LNCS 12288, pp. 1–22.

Hasanbeig, M, Abate, A, and Kroening, D (2020) ‘Cautious reinforcement learning with logical constraints’, AAMAS 2020, pp. 483–491.

Hasanbeig, M, and Abate, A (2021) ‘LCRL: the logically-constrained reinforcement learning software tool’, CAV, in print.

One article is under review:

Jeppu, N, Hasanbeig, M, Abate, A, Melham, T, and Kroening, D (2021) ‘Automata synthesis for automatic task segmentation in deep reinforcement learning’, JAIR, under review.



Dr Pathak (ZF) presented a related company project titled ‘Safe active monitoring’, with the University of Oxford as a collaborator, at the Tech.AD Europe Conference, where it won second place in the ‘Most innovative use of AI in autonomous driving’ category.

Next steps

We plan to complete the series of publications related to this project. We plan to further engage with the industrial partners, expanding the collaborations established within this project. We expect incoming support from these companies, or through research funding, for future collaborations.

Lessons learned

Collaborations went very well.

I must also emphasise that the project administration was always quickly responsive and fully supportive, allocating extra funds for new expenses as needed and helping to increase the project impact. A special mention to Andy Gilchrist for the great work together – thanks!

Evidently, had COVID not hit us, we could have met in person once or twice over the year, which might have helped further work. Nevertheless, I think we have managed working from home quite well.

What has Pitch-In done for you?

Pitch-In has been an extremely supportive and flexible initiative throughout. It has enabled and accelerated my engagement with relevant companies, has allowed my group to test our algorithms and transfer them to industrial scenarios, and has benefitted the named researchers and students in my group (two interns) by providing them with new skills. Thanks very much for the support indeed!

Project lead

Professor Alessandro Abate – University of Oxford
