Intelligent Transport Systems Group

The Intelligent Transport Systems (ITS) group is part of the Australian Centre for Field Robotics (ACFR). The primary objective of the ACFR is to undertake fundamental and applied research in field robotics and intelligent systems that encompass the development of new theories and methods, and the deployment of these into industrial, social and environmental applications. The ITS group has over 20 people including researchers, PhD students, technical support and visitors.

The ITS group has extensive support from industry at the national and international level. Current international project sponsors include the University of Michigan, Ford USA, Ibeo Germany and Renault France. The group is also part of the new Collaborative Research Centre iMOVE, and is currently working on projects with Transport for New South Wales, Cohda Wireless and IAG.

The main aim of the group is the development, deployment and demonstration of new mobility technology. The group has developed several Connected and Automated Vehicles (CAVs) and is working with a number of industry partners, government institutions and the University of Sydney to deploy and demonstrate innovative “smart city” concepts in various domains.

Research goals for the ITS group span various Autonomous Vehicle (AV) technologies, with a strong focus on robustness and providing safety guarantees for long-term autonomous operations working under a wide range of urban scenarios. The focus on robustness and system verification covers all research topics.

Fundamental and applied research areas include:

Related papers are listed here.

Related videos are available here.

Autonomous Vehicle Platforms

The group has built a number of platforms designed for research in vehicle automation and perception. Two autonomous electric vehicles were developed in a joint project with Applied EV (now AEV Robotics), with the ACFR developing a computer-controlled interface. Each vehicle contains two computers: an Intel NUC for drive control and an Nvidia Drive PX2 computing platform to handle the camera array. A Programmable Logic Controller (PLC) provides a reliable, safety-rated, low-level interface with the vehicle hardware.

The vehicles have been fitted with a variety of sensors, including a scanning 3D lidar, an array of six 1080p (30 Hz) cameras providing 360-degree vision, an Inertial Measurement Unit (IMU), a Global Navigation Satellite System (GNSS) receiver, wheel encoders and various others. These vehicles leverage the Robot Operating System (ROS) and have been used to compile a comprehensive dataset from driving around the campus at the University of Sydney on a weekly basis for more than a year. This dataset is in the process of being made available to the wider ITS community.


Electric vehicles retrofitted with automation technology and logging capabilities



A Volkswagen Passat has been fitted out as a data collection platform, with a sensor suite that collects trajectory data and identifies and classifies dynamic and static objects in proximity to the vehicle. This vehicle includes a Hardware Data Acquisition (HAD) system that is capable of feature fusion as well as object detection and tracking. The HAD system comprises six Ibeo LUX lidar scanners and is able to identify road users at a range of up to 200 m. This vehicle is used to collect naturalistic driving data, including tracked objects, which is used in many of the ITS group’s projects. It has been collecting data along city roads and motorways for over three years and has covered more than 40,000 km.



Urban vehicle retrofitted with 360-degree perception, high-accuracy localisation and logging capabilities


Tracking and classifying mobile objects across the Sydney Harbour Bridge


Multimodal sensory data analysis using data collected by the perception vehicle


Human-machine Interfaces: Living with Autonomous Mobility

Autonomous vehicles have the potential to revolutionise mobility in the smart cities of the future. Shared self-driving cars or pods could be used in shared spaces to enable last-mile transportation. The introduction of this technology depends on the ability of the vehicles to interact safely and efficiently with pedestrians. This will undoubtedly require an effective mechanism to communicate information in both directions between the autonomous car and pedestrians, replacing the often subtle non-verbal communication that humans naturally use with each other.

The collaboration between engineering (Australian Centre for Field Robotics) and architecture and design (Design Lab, Smart Urbanism Lab) provides a comprehensive set of research backgrounds to explore this problem. The ACFR has developed two AV platforms which are used to demonstrate the various autonomous vehicle functions in a shared environment. The Design Lab has extensive experience developing and validating human-machine interfaces using innovative approaches that include the use of virtual reality.

More information is available at


The group has a strong focus on perception, using multiple cameras and lidar to robustly perceive important features in the urban environment. The detection of objects is used in collision avoidance, object tracking, mapping and localisation. The group is also publishing a comprehensive dataset with weekly driving data taken over the course of an entire year. This work has included automatic calibration of the intrinsic and extrinsic parameters of associated sensors to estimate the geometric relationships between these sensors.

Vision-Based Semantic Deep Learning

One of the fundamental challenges in the design of perception systems for AVs is validating the performance of each algorithm under a comprehensive variety of operating conditions. In the case of vision-based semantic segmentation, there are known issues when encountering new scenarios that are sufficiently different to the training data. In addition, even small variations in environmental conditions such as illumination and precipitation can affect the classification performance of the segmentation model.


Given the reliance on visual information, these effects often translate into poor semantic pixel classification, which can potentially lead to catastrophic consequences when driving autonomously. This research area examines novel methods for analysing the robustness and reliability of semantic segmentation models when operating in diverse environments. The aim is to provide metrics for the evaluation of the classification performance over a variety of environmental conditions. Another aspect of this research is the automated generation of labels using different sensing modalities. This, combined with the campus dataset mentioned above, can generate more robust models that are capable of handling a wider variety of conditions without expensive hand-labelling.
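As a hedged illustration of the kind of metric such an evaluation could use, the sketch below computes per-class intersection-over-union (IoU) and its mean, grouped by environmental condition. The function names, class IDs and condition tags are assumptions made for this example. They are not the group's evaluation code, and the metrics the group actually uses are not stated here.

```python
from collections import defaultdict


def per_class_iou(pred, truth, num_classes):
    """IoU per class for two flat label sequences (None if a class never occurs)."""
    inter = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, truth):
        if p == t:
            inter[p] += 1
            union[p] += 1
        else:
            # A mismatch enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    return [inter[c] / union[c] if union[c] else None for c in range(num_classes)]


def mean_iou(pred, truth, num_classes):
    """Mean IoU over the classes that appear in the prediction or ground truth."""
    ious = [v for v in per_class_iou(pred, truth, num_classes) if v is not None]
    return sum(ious) / len(ious)


def miou_by_condition(samples, num_classes):
    """samples: iterable of (condition, predicted_labels, true_labels)."""
    grouped = defaultdict(lambda: ([], []))
    for condition, pred, truth in samples:
        grouped[condition][0].extend(pred)
        grouped[condition][1].extend(truth)
    return {c: mean_iou(p, t, num_classes) for c, (p, t) in grouped.items()}


if __name__ == "__main__":
    # Hypothetical labels: 0 = road, 1 = pedestrian.
    samples = [
        ("sunny", [0, 0, 1, 1], [0, 0, 1, 1]),
        ("rain", [0, 0, 1, 1], [0, 1, 1, 1]),
    ]
    print(miou_by_condition(samples, num_classes=2))
```

Comparing the per-condition scores side by side is one simple way to expose a model whose accuracy drops under, for example, rain or low light.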



Autonomous systems that use a multitude of sensors, including 3D lidar and cameras, are increasingly used for research and industrial applications. The main challenge in fusing these two sensor modalities is the requirement for a precise calibration of the camera’s intrinsic parameters and of the geometrical extrinsic parameters, which include the 3D transformation between the two sensors. This is a particularly challenging problem because the object features are obtained from different sensors with different modalities and noise patterns, and noisy features reduce the accuracy of calibration. Furthermore, not all lidars and cameras have similar behaviour and measurement errors, making it difficult to generalise an approach. These issues are addressed by selecting features that are less susceptible to noise from sensor measurements, and by using a robust optimisation strategy. The ITS group has demonstrated experimentally that the method is able to obtain consistent results, which improve as more samples are added to the optimiser. The figure to the right shows a multi-camera view and laser projection obtained with the extrinsic calibration process.
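To make the roles of the intrinsic and extrinsic parameters concrete, the following minimal sketch projects a lidar point into an image using an assumed pinhole camera model without lens distortion. The function names, the example rotation and the focal lengths are hypothetical. The features and the robust loss used in the group's calibration method are not reproduced here, and the reprojection error is shown only as the kind of residual a calibration optimiser might minimise.

```python
import math


def transform_point(rotation, translation, point):
    """Extrinsics: p_cam = R * p_lidar + t, with R given as a 3x3 nested list."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def project_to_pixel(point_cam, fx, fy, cx, cy):
    """Intrinsics: pinhole projection; returns None for points behind the camera."""
    x, y, z = point_cam
    if z <= 0.0:
        return None
    return (fx * x / z + cx, fy * y / z + cy)


def reprojection_error(pixel, observed_pixel):
    """Euclidean distance in pixels between a projected and an observed feature."""
    return math.hypot(pixel[0] - observed_pixel[0], pixel[1] - observed_pixel[1])


if __name__ == "__main__":
    # Assumed axis convention: lidar x forward, y left, z up;
    # camera x right, y down, z forward.
    R = [[0.0, -1.0, 0.0], [0.0, 0.0, -1.0], [1.0, 0.0, 0.0]]
    t = (0.0, 0.0, 0.0)
    p_cam = transform_point(R, t, (10.0, -1.0, 0.5))
    print(project_to_pixel(p_cam, fx=1000.0, fy=1000.0, cx=960.0, cy=540.0))
```

An error in either the rotation or the translation shifts where lidar points land in the image, which is why the quality of the fusion depends so strongly on this calibration.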

Fusing Semantic and Laser Information

Current autonomous driving applications require not only the occupancy information of the local environment, but also reactive maps to represent dynamic surroundings. There is also benefit from incorporating semantic classification into the map to assist path planning in changing scenarios.

This work utilises a Convolutional Neural Network (CNN) to provide the semantic context of the local environment and projects the classification into a 3D lidar point cloud. The resulting point cloud feeds into the octree map-building algorithm, which computes the corresponding probabilities (occupancy and classification) for every 3D voxel. Methods to incorporate the uncertainty of semantic labels based on the pixel distance to the label boundaries are also investigated as part of this work. The figure to the right shows the process of labelling each point in the laser point cloud with the class obtained in the vision domain.
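The per-voxel update described above can be sketched as follows, under several loud assumptions: a flat dictionary of voxels stands in for the octree, only lidar hits are handled (no free-space ray casting), and the confidence of a label grows linearly with its pixel distance to the nearest label boundary until it saturates. The class name, parameter names and numeric defaults are all hypothetical and are not taken from the group's implementation.

```python
import math


def voxel_key(point, resolution):
    """Integer voxel index of a 3D point for a given voxel edge length."""
    return tuple(math.floor(c / resolution) for c in point)


def label_likelihood(label, num_classes, boundary_dist_px, trust=0.9, saturate_px=10.0):
    """Class likelihood for one labelled point.

    Close to a label boundary (small distance) the likelihood flattens
    towards uniform, so uncertain labels change the voxel very little.
    """
    w = trust * min(1.0, boundary_dist_px / saturate_px)
    uniform = 1.0 / num_classes
    return [w * (1.0 if c == label else 0.0) + (1.0 - w) * uniform
            for c in range(num_classes)]


class SemanticVoxelMap:
    def __init__(self, resolution, num_classes, hit_log_odds=0.85):
        self.resolution = resolution
        self.num_classes = num_classes
        self.hit_log_odds = hit_log_odds
        self.voxels = {}  # key -> [occupancy log-odds, class probabilities]

    def insert(self, point, label, boundary_dist_px):
        key = voxel_key(point, self.resolution)
        prior = [1.0 / self.num_classes] * self.num_classes
        cell = self.voxels.setdefault(key, [0.0, prior])
        cell[0] += self.hit_log_odds  # occupancy update in log-odds form
        lik = label_likelihood(label, self.num_classes, boundary_dist_px)
        posterior = [p * l for p, l in zip(cell[1], lik)]  # Bayes rule
        total = sum(posterior)
        cell[1] = [p / total for p in posterior]
        return key

    def occupancy(self, key):
        return 1.0 - 1.0 / (1.0 + math.exp(self.voxels[key][0]))
```

Keeping occupancy in log-odds form makes each update a simple addition, while the class distribution is renormalised after every observation so that repeated consistent labels sharpen it gradually.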


Robustness and safety are crucial properties for the real-world application of AVs. One of the most critical components of any autonomous system is localisation. During the last 20 years there has been significant progress in this area with the introduction of very efficient algorithms for mapping, localisation and Simultaneous Localisation and Mapping (SLAM). Many of these algorithms present impressive demonstrations for a particular domain, but fail to operate reliably with changes to the operating environment. The aspect of robustness has not received enough attention and localisation systems for self-driving vehicle applications are seldom evaluated for their robustness.

The work of the ITS group in this area focuses on using multimodal sensor fusion to build feature and semantic maps of the environment. The features observed in the relative coordinate systems are registered to a global reference frame using graph optimisation techniques. The project aims to develop and demonstrate large-area localisation algorithms and to introduce novel metrics that effectively quantify localisation robustness with or without an accurate ground truth. The feature maps generated during localisation provide prior information for future localisation and path planning. The challenge in maintaining this map arises from changes over time within the urban environment. In the campus dataset, there are numerous examples of buildings that are constructed or demolished and of other features, such as trees and poles, that change over time. One important outcome of the work from the ITS group is the development of new techniques to maintain a global map over a long time frame, utilising a probabilistic approach that can be incorporated into the localisation task. The figure to the right shows the point cloud registered to a global map and aerial imagery.
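The idea of registering relative observations to a global frame by graph optimisation can be illustrated with a deliberately tiny example: a one-dimensional pose graph solved by Gauss-Seidel relaxation of the weighted least-squares problem. This is a teaching sketch only. Real systems work with 2D or 3D poses and dedicated solvers, and the function name, constraint format and weights below are assumptions.

```python
def relax_pose_graph(num_poses, priors, relatives, iterations=500):
    """Gauss-Seidel relaxation of a 1D pose graph.

    priors:    list of (index, value, weight), e.g. a global position fix
    relatives: list of (i, j, offset, weight), meaning x[j] - x[i] ~ offset
    Each sweep sets every pose to the weighted mean implied by its constraints.
    """
    x = [0.0] * num_poses
    for _ in range(iterations):
        for k in range(num_poses):
            num, den = 0.0, 0.0
            for idx, value, w in priors:
                if idx == k:
                    num += w * value
                    den += w
            for i, j, offset, w in relatives:
                if i == k:
                    num += w * (x[j] - offset)
                    den += w
                elif j == k:
                    num += w * (x[i] + offset)
                    den += w
            if den > 0.0:
                x[k] = num / den
    return x


if __name__ == "__main__":
    # Odometry claims two 1.0 m steps, but a global fix places the last pose at 2.3 m.
    priors = [(0, 0.0, 10.0), (2, 2.3, 1.0)]
    odometry = [(0, 1, 1.0, 1.0), (1, 2, 1.0, 1.0)]
    print(relax_pose_graph(3, priors, odometry))
```

The disagreement between odometry and the global fix is spread across the poses according to the constraint weights, which is the essential behaviour of graph-based registration.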

Path Planning and Provably Safe Trajectories

Planning safe trajectories for autonomous vehicles is challenging, particularly when operating around dynamic obstacles. The planning strategy of the electric vehicles developed by the ITS group is divided into a high-level path planner and a low-level, robust, collision avoidance algorithm referred to as a “virtual bumper”. The high-level planner is interchangeable depending on the location and complexity of the situation.

Using high-level features of the road network, including information about the lanes, curbstones and road rules, the high-level planner can generate detailed trajectories in line with the expected behaviour of the vehicle. The low-level collision avoidance component is designed primarily as a last resort in the event of an unexpected obstacle. These components are being developed to include the tracking and prediction of obstacles, both to reduce the frequency of false alarms and to operate in highly dynamic environments such as shared pedestrian areas. In addition, the ITS group is collaborating with the University of Michigan to incorporate a reachability-based trajectory planner that is provably safe.
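One simple way to picture a last-resort collision check of the "virtual bumper" kind is a speed-dependent rectangular zone in front of the vehicle. The sketch below is an assumption-laden illustration and not the group's algorithm: the reaction time, deceleration, safety margin and corridor half-width are hypothetical values, and obstacles are assumed to be given as (x, y) points in the vehicle frame with x pointing forward.

```python
def stopping_distance(speed, reaction_time=0.3, max_decel=3.0, margin=0.5):
    """Distance (m) needed to stop: reaction travel + braking distance + margin."""
    return speed * reaction_time + speed * speed / (2.0 * max_decel) + margin


def bumper_triggered(obstacles_xy, speed, half_width=1.0):
    """True if any obstacle point lies inside the corridor the vehicle needs to stop."""
    reach = stopping_distance(speed)
    return any(0.0 <= x <= reach and abs(y) <= half_width for x, y in obstacles_xy)


if __name__ == "__main__":
    lidar_points = [(6.0, 0.3), (2.5, 0.2)]  # hypothetical obstacle points
    if bumper_triggered(lidar_points, speed=3.0):
        print("emergency stop requested")
```

Because the zone grows with speed, the check stays quiet for distant obstacles at low speed, which is one way such a component can limit false alarms.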

The ITS group is also researching map structures to incorporate all the information required to operate safely in urban road environments. The figure to the right presents a lane-based map showing the high-level features of the road network.

Driver/Pedestrian Intention Estimation

Driving a vehicle is a highly skilled task that requires extensive understanding of the intentions of other road users. A limitation of current Advanced Driver Assistance Systems (ADAS) is the inability to perceive the entirety of the vehicle’s situational context and the intentions of the human participants. Available sensor systems cannot understand a vehicle’s surroundings anywhere near as accurately and comprehensively as a human driver. Humans interpret situations and anticipate likely events by combining past experiences, sensory input and a multitude of behavioural cues. While this may become second nature to an experienced human driver, properly understanding the intentions of other drivers and pedestrians is still an unsolved problem for ADAS and, by extension, AVs.

The ITS group’s work in this area addresses driver/pedestrian intention, and path prediction, using machine learning approaches including Recurrent Neural Networks (RNNs) and Deep Learning to improve the robustness of the estimation process. These data-driven approaches have incorporated naturalistic vehicle trajectories collected by the ITS group’s vehicle platforms. These large naturalistic datasets provide valuable information that is typical of the data available from an autonomous/smart vehicle. These datasets have been published by the ITS group for the wider ITS community. The figures below show the observed vehicle trajectories and the vehicle trajectory estimation using RNNs.

The estimation and prediction of pedestrian motion is of fundamental importance in ITS applications. Most existing solutions have utilised a particular type of sensor for perception, such as cameras (e.g. stereo, monocular, infrared), or other modalities such as laser range finders or radar. The ITS group is currently researching methods that extract dynamic and skeletal pedestrian information from visual data and combine it with semantic information to infer pedestrian intentions. The figures below show the same image being passed through a pose filter (left) and through the vision-based semantic segmentation model detailed above (right).

Cooperative Perception and Tracking

The next decade will see the deployment of Vehicle-to-Vehicle (V2V) and Vehicle-to-Infrastructure (V2I) communication. The availability of large-scale egocentric and external sensor information will enable the development of highly sophisticated cooperative safety solutions. The upcoming widespread deployment of Dedicated Short Range Communications (DSRC) technology will enable the sharing of a multitude of information that can potentially be used to build accurate representations of the local environment. This information will be provided by Intelligent Roadside Unit (IRSU) infrastructure and smart vehicles fitted with communications hardware and advanced perception capabilities.

Projects being worked on in this area by the ITS group focus on the development of a general framework for cooperative data fusion to integrate data coming from different sources, each with their own uncertainties. These algorithms can be used to propagate estimates of position, context and associated risk for all road users and vehicles in proximity. This information will be critical to extending the sensing capabilities of smart vehicles beyond the visual line of sight, which can be heavily restricted in complex traffic scenarios. The figures below provide examples of the benefit of cooperative perception integrating vehicles and IRSUs. More complex environments may consider incorporating information from smart vehicles, in addition to the information provided by IRSUs, in the same manner.

The IRSU detects pedestrians and broadcasts information through DSRC to a mobile vehicle. The mobile vehicle becomes aware of pedestrian locations as a result of receiving this information.
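The fusion of data from different sources, each with its own uncertainty, can be illustrated with inverse-variance weighting. This minimal sketch assumes that the reports are independent, that their errors are Gaussian and uncorrelated between axes, and that all positions are already expressed in a common frame. The function names and numbers are hypothetical, and this is not the group's framework. When the correlation between sources is unknown, a more conservative rule such as covariance intersection is a common alternative.

```python
def fuse_scalar(estimates):
    """Fuse independent (mean, variance) estimates of the same scalar quantity."""
    information = sum(1.0 / var for _, var in estimates)
    mean = sum(m / var for m, var in estimates) / information
    return mean, 1.0 / information


def fuse_positions(reports):
    """reports: list of ((x, y), (var_x, var_y)) for one road user."""
    xs = [(pos[0], var[0]) for pos, var in reports]
    ys = [(pos[1], var[1]) for pos, var in reports]
    (x, var_x), (y, var_y) = fuse_scalar(xs), fuse_scalar(ys)
    return (x, y), (var_x, var_y)


if __name__ == "__main__":
    # Hypothetical pedestrian position reported by an IRSU and by the vehicle itself.
    irsu_report = ((10.0, 4.0), (0.25, 0.25))
    vehicle_report = ((10.6, 4.4), (1.0, 1.0))
    print(fuse_positions([irsu_report, vehicle_report]))
```

The fused variance is never larger than the smallest input variance, which reflects how an extra source, even a noisy one, can only add information under the independence assumption.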

The group is researching tracking algorithms that estimate the position and velocity of pedestrians. Current work includes the fusion of multimodal sensing from a single platform and/or from multiple platforms in a cooperative perception framework.
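As a minimal sketch of estimating position and velocity from position-only detections, the following one-dimensional constant-velocity Kalman filter is written in plain Python. It is a textbook formulation chosen for illustration, not the group's tracker, and the class name and all noise parameters are assumed values.

```python
class ConstantVelocityTracker:
    """1D Kalman filter with state [position, velocity] and position measurements."""

    def __init__(self, position, velocity=0.0, pos_var=1.0, vel_var=4.0,
                 accel_var=1.0, meas_var=0.25):
        self.p, self.v = position, velocity
        self.P = [[pos_var, 0.0], [0.0, vel_var]]  # state covariance
        self.q = accel_var                         # process (acceleration) noise
        self.r = meas_var                          # measurement noise

    def predict(self, dt):
        self.p += self.v * dt
        (a, b), (c, d) = self.P
        # P <- F P F^T + Q with F = [[1, dt], [0, 1]] and white-acceleration Q.
        q = self.q
        self.P = [
            [a + dt * (b + c) + dt * dt * d + q * dt ** 4 / 4.0,
             b + dt * d + q * dt ** 3 / 2.0],
            [c + dt * d + q * dt ** 3 / 2.0,
             d + q * dt ** 2],
        ]

    def update(self, z):
        (a, b), (c, d) = self.P
        s = a + self.r               # innovation variance
        k0, k1 = a / s, c / s        # Kalman gain for position and velocity
        y = z - self.p               # innovation
        self.p += k0 * y
        self.v += k1 * y
        self.P = [[(1.0 - k0) * a, (1.0 - k0) * b],
                  [c - k1 * a, d - k1 * b]]


if __name__ == "__main__":
    tracker = ConstantVelocityTracker(position=0.0)
    for step in range(1, 51):
        tracker.predict(0.1)
        tracker.update(1.2 * 0.1 * step)  # pedestrian walking at 1.2 m/s
    print(tracker.p, tracker.v)
```

A two-dimensional tracker with heading follows the same pattern with a larger state vector, and measurements from several platforms can be applied as successive `update` calls.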

Tracking pedestrians (their position, heading and velocity) in highly-crowded environments


The wide-scale deployment of Autonomous Vehicles (AVs) appears to be imminent despite many safety challenges that are yet to be resolved. Modern autonomous vehicles will undoubtedly include machine learning and probabilistic techniques that add significant complexity to traditional verification and validation methods. Safety validation is the process of evaluating the risk of a system. The risk is assessed based on whether the System Under Test (SUT) maintains the desired behaviour. The desired behaviour is encoded in the safety requirements, and evaluation is based on whether the system meets those requirements.

In our work, we approach the generation of relevant scenarios as a black-box optimisation problem. The System Under Test (SUT) is treated as a black box, and optimisation algorithms are employed to search for the combination of input parameters that maximises the cost. This enables us to search effectively for scenarios that are relevant to validating the SUT.
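The black-box formulation can be sketched with a toy example. Everything below is hypothetical: the SUT is a stand-in car-following simulation, the cost is the negative of the minimum gap (so a higher cost means a more critical scenario), and plain random search is used only because it is the simplest black-box optimiser. The optimisation algorithms actually used in this work are not specified here.

```python
import random


def toy_sut_min_gap(initial_gap, ego_speed, lead_decel,
                    reaction_time=0.5, ego_decel=4.0, dt=0.05):
    """Stand-in SUT: the ego vehicle follows a lead vehicle that brakes to a stop.

    Returns the minimum gap (m) over the run; a negative value means a collision.
    Requires lead_decel > 0 so that the simulation terminates.
    """
    gap, ego_v, lead_v, t = initial_gap, ego_speed, ego_speed, 0.0
    min_gap = gap
    while ego_v > 0.0 or lead_v > 0.0:
        lead_v = max(0.0, lead_v - lead_decel * dt)
        if t >= reaction_time:  # the ego vehicle brakes after a fixed delay
            ego_v = max(0.0, ego_v - ego_decel * dt)
        gap += (lead_v - ego_v) * dt
        min_gap = min(min_gap, gap)
        t += dt
    return min_gap


def scenario_cost(initial_gap, ego_speed, lead_decel):
    """Higher cost = more critical scenario for the SUT."""
    return -toy_sut_min_gap(initial_gap, ego_speed, lead_decel)


def random_search(cost_fn, bounds, iterations=200, seed=0):
    """Sample scenario parameters uniformly and keep the highest-cost sample."""
    rng = random.Random(seed)
    best_params, best_cost = None, float("-inf")
    for _ in range(iterations):
        params = {name: rng.uniform(lo, hi) for name, (lo, hi) in bounds.items()}
        cost = cost_fn(**params)  # the optimiser only sees inputs and cost
        if cost > best_cost:
            best_params, best_cost = params, cost
    return best_params, best_cost


if __name__ == "__main__":
    bounds = {"initial_gap": (5.0, 40.0), "ego_speed": (5.0, 20.0), "lead_decel": (2.0, 8.0)}
    print(random_search(scenario_cost, bounds))
```

The optimiser never inspects the internals of the simulation, so the same search loop could wrap a full simulator or a more sample-efficient optimiser without changing its structure.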