Highly Robust Clustering of GPS Driver Data for Energy Efficient Driving Style Modelling


This paper presents a novel approach to distinguish driving styles with respect to their energy efficiency. A distinct property of our method is that it relies exclusively on Global Positioning System (GPS) logs of drivers. This setting is highly relevant in practice as these data can easily be acquired. Relying on positional data alone means that all derived features will be correlated, so we strive to find a single quantity that allows us to perform the driving style analysis. To this end we consider a robust variation of the so called jerk of a movement. We show that our feature choice outperforms other more commonly used jerk-based formulations and we discuss the handling of noisy, inconsistent, and incomplete data as this is a notorious problem when dealing with real-world GPS logs. Our solving strategy relies on an agglomerative hierarchical clustering combined with an L-term heuristic to determine the relevant number of clusters. It can easily be implemented and performs fast, even on very large, real-world data sets. Experiments show that our approach is robust against noise and able to discern different driving styles.