Time series describe dynamic processes. Driven by big data applications such as mapping land use from satellite observations over time, our award-winning research is revolutionising time series classification by developing technologies that can learn from, and accurately classify, time series collections orders of magnitude larger than the previous state of the art could handle.
ROCKET and its successors MiniROCKET, MultiROCKET and HYDRA use convolutional filters from deep learning to extract diverse time series features of types that were previously each addressed by specialised techniques. ROCKET generates many such filters and uses them to extract features from each series. From these features a simple linear classifier can learn models that are as accurate as the prior state of the art, but it does so in a fraction of the time and creates models that classify with blistering speed. Angus Dempster received the Computing Research and Education Association of Australasia Distinguished Dissertation Award for this research. An implementation can be downloaded here. The most recent paper can be found here. Angus' video explaining ROCKET and its successors can be found here.
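The pipeline is easy to sketch. Below is a simplified, illustrative take on the ROCKET recipe — random dilated kernels plus the "proportion of positive values" (PPV) feature. It is not the released implementation, which adds padding, further feature types, and heavy optimisation; function names and parameter ranges here are illustrative:

```python
import numpy as np

def random_kernels(n_kernels, rng):
    """Sample random convolutional kernels, ROCKET-style: random
    length, mean-centred Gaussian weights, bias, and dilation."""
    kernels = []
    for _ in range(n_kernels):
        length = rng.choice([7, 9, 11])
        weights = rng.normal(0.0, 1.0, length)
        weights -= weights.mean()               # mean-centre the weights
        bias = rng.uniform(-1.0, 1.0)
        dilation = int(2 ** rng.uniform(0, 5))  # exponentially distributed
        kernels.append((weights, bias, dilation))
    return kernels

def apply_kernel(x, weights, bias, dilation):
    """Dilated convolution of one series with one kernel, reduced to
    the proportion of positive values (PPV) in the output."""
    span = (len(weights) - 1) * dilation
    if len(x) <= span:                          # kernel too wide for series
        return 0.0
    out = [bias + sum(w * x[i + j * dilation]
                      for j, w in enumerate(weights))
           for i in range(len(x) - span)]
    return float(np.mean(np.asarray(out) > 0))

def transform(X, kernels):
    """Map each series to one PPV feature per kernel."""
    return np.array([[apply_kernel(x, *k) for k in kernels] for x in X])
```

The resulting feature matrix is then fed to a simple linear classifier such as ridge regression; all the heavy lifting happens in the training-free random transform.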
QUANT is a highly efficient interval-method time series classifier, assessed by a recent benchmarking paper as the best interval classifier and as "achieving high accuracy remarkably fast." An implementation can be downloaded here. The paper can be found here.
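The flavour of interval-quantile features can be conveyed in a toy sketch: split the series into fixed dyadic intervals and take quantiles from each interval as features. The function name and parameters below are illustrative, and the full method differs in important details (for example, it also computes features over transformed versions of the series and pairs them with an off-the-shelf classifier):

```python
import numpy as np

def quantile_features(x, depth=3, k=4):
    """Toy interval-quantile transform: at each depth d the series is
    split into 2**d equal intervals, and k evenly spaced quantiles are
    taken from every interval (any tail left over by integer division
    is ignored, for simplicity)."""
    feats = []
    for d in range(depth + 1):
        n_intervals = 2 ** d
        width = len(x) // n_intervals
        for i in range(n_intervals):
            seg = x[i * width:(i + 1) * width]
            feats.extend(np.quantile(seg, np.linspace(0, 1, k)))
    return np.asarray(feats)
```

With fixed intervals and a single feature type, the transform is trivially cheap, which is what makes the approach so fast.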
Proximity Forest provides a significant advance on the state of the art in time series classification. By coupling the efficiency of divide-and-conquer tree classifiers with the effectiveness of similarity measures specifically designed for time series, Proximity Forest achieves very high accuracy with modest computation. An implementation can be downloaded here. The most recent paper can be found here.
TS-Chief builds upon Proximity Forest, enhancing its proximity-based methods by integrating interval statistics and dictionary techniques. An implementation can be found here and the paper here.
InceptionTime brings the power of deep learning to time series classification. An implementation can be downloaded here. The paper can be downloaded here.
LB Webb and LB Enhanced are our novel lower bounds for Dynamic Time Warping that are both faster and tighter than the popular LB_Keogh. Implementations can be downloaded here and here. The papers can be found here and here.
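For context, the bound these improve upon works by building an envelope around one series and charging the other series only where it escapes that envelope. Below is a minimal sketch of classic LB_Keogh with squared-difference cost and window half-width w (not of LB Webb or LB Enhanced themselves):

```python
def lb_keogh(q, c, w):
    """LB_Keogh lower bound on windowed DTW: sum the squared amounts by
    which query q escapes the upper/lower envelope of candidate c.
    Any DTW alignment within the window must cost at least this much."""
    total = 0.0
    for i in range(len(q)):
        window = c[max(0, i - w):i + w + 1]   # points q[i] may align with
        lo, hi = min(window), max(window)
        if q[i] > hi:
            total += (q[i] - hi) ** 2
        elif q[i] < lo:
            total += (q[i] - lo) ** 2
    return total
```

In nearest-neighbour search, a candidate whose lower bound already exceeds the best distance found so far can be discarded without computing DTW at all; bounds that are both tighter and faster prune more candidates sooner.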
The following is a blog post on the use of Barycentric averaging in time series classification: http://www.kdnuggets.com/2014/12/averaging-improves-accuracy-speed-time-series-classification.html. The code can be downloaded here: http://francois-petitjean.com/Research/ICDM2014-DTW/index.php. The slides for the 10-year Highest Impact Paper Award winning ICDM 2014 paper can be downloaded here: http://francois-petitjean.com/Research/ICDM2014-DTW/Slides.pdf.
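The central step of DTW barycentric averaging (DBA) is easy to sketch: align every series to the current average under DTW, then replace each point of the average with the mean of all points aligned to it. The version below is a simplified single iteration (squared-difference cost, no tie-breaking refinements), not the released implementation:

```python
import numpy as np

def dtw_path(a, b):
    """DTW alignment path (squared-difference cost), via backtracking."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([D[i - 1, j - 1], D[i - 1, j], D[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1]

def dba_update(avg, X):
    """One DBA iteration: align every series to the current average and
    replace each average point by the mean of the points aligned to it."""
    buckets = [[] for _ in avg]
    for x in X:
        for i, j in dtw_path(avg, x):
            buckets[i].append(x[j])
    return np.array([np.mean(b) if b else avg[i]
                     for i, b in enumerate(buckets)])
```

Iterating this update until the average stabilises yields a consensus series; as the post above describes, averaging in this way can improve both the accuracy and speed of time series classification.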
The TSI software for the SDM 2017 paper on time series indexing can be downloaded here: https://github.com/ChangWeiTan/TSI. Slides for the SDM 2017 paper can be found here: http://francois-petitjean.com/Research/SDM17-slides.pdf.
The software for the Best Paper Award winning SDM 2018 paper on finding the best warping window can be downloaded here: https://github.com/ChangWeiTan/FastWWSearch (Matlab version).
Resources referred to in Deep Learning for Time Series Classification and Extrinsic Regression: A Current Survey can be found here: https://github.com/Navidfoumani/TSC_Survey.
Publications
Quant: a minimalist interval method for time series classification.
Dempster, A., Schmidt, D. F., & Webb, G. I.
Data Mining and Knowledge Discovery, in press.
[Bibtex] [Abstract] → Access on publisher site
@Article{Dempster2024,
author = {Dempster, Angus and Schmidt, Daniel F. and Webb, Geoffrey I.},
journal = {Data Mining and Knowledge Discovery},
title = {Quant: a minimalist interval method for time series classification},
year = {in press},
issn = {1573-756X},
abstract = {We show that it is possible to achieve the same accuracy, on average, as the most accurate existing interval methods for time series classification on a standard set of benchmark datasets using a single type of feature (quantiles), fixed intervals, and an ‘off the shelf’ classifier. This distillation of interval-based approaches represents a fast and accurate method for time series classification, achieving state-of-the-art accuracy on the expanded set of 142 datasets in the UCR archive with a total compute time (training and inference) of less than 15 min using a single CPU core.},
doi = {10.1007/s10618-024-01036-9},
keywords = {time series, efficient ml},
publisher = {Springer Science and Business Media LLC},
related = {scalable-time-series-classifiers},
url = {https://doi.org/10.1007/s10618-024-01036-9},
}
Deep Learning for Time Series Classification and Extrinsic Regression: A Current Survey.
Foumani, N. M., Miller, L., Tan, C. W., Webb, G. I., Forestier, G., & Salehi, M.
ACM Computing Surveys, 2024.
[Bibtex] [Abstract] → Access on publisher site
@Article{Foumani2024,
author = {Foumani, Navid Mohammadi and Miller, Lynn and Tan, Chang Wei and Webb, Geoffrey I. and Forestier, Germain and Salehi, Mahsa},
journal = {ACM Computing Surveys},
title = {Deep Learning for Time Series Classification and Extrinsic Regression: A Current Survey},
year = {2024},
issn = {0360-0300},
month = {feb},
note = {Just Accepted},
abstract = {Time Series Classification and Extrinsic Regression are important and challenging machine learning tasks. Deep learning has revolutionized natural language processing and computer vision and holds great promise in other fields such as time series analysis where the relevant features must often be abstracted from the raw data but are not known a priori. This paper surveys the current state of the art in the fast-moving field of deep learning for time series classification and extrinsic regression. We review different network architectures and training methods used for these tasks and discuss the challenges and opportunities when applying deep learning to time series data. We also summarize two critical applications of time series classification and extrinsic regression, human activity recognition and satellite earth observation.},
address = {New York, NY, USA},
creationdate = {2024-02-29T11:53:15},
doi = {10.1145/3649448},
keywords = {Deep Learning, time series, Classification, Extrinsic regression, Review},
publisher = {Association for Computing Machinery},
related = {scalable-time-series-classifiers},
url = {https://doi.org/10.1145/3649448},
}
Series2vec: similarity-based self-supervised representation learning for time series classification.
Foumani, N. M., Tan, C. W., Webb, G. I., Rezatofighi, H., & Salehi, M.
Data Mining and Knowledge Discovery, 2024.
[Bibtex] [Abstract] → Access on publisher site
@Article{Foumani2024a,
author = {Foumani, Navid Mohammadi and Tan, Chang Wei and Webb, Geoffrey I. and Rezatofighi, Hamid and Salehi, Mahsa},
journal = {Data Mining and Knowledge Discovery},
title = {Series2vec: similarity-based self-supervised representation learning for time series classification},
year = {2024},
issn = {1573-756X},
abstract = {We argue that time series analysis is fundamentally different in nature to either vision or natural language processing with respect to the forms of meaningful self-supervised learning tasks that can be defined. Motivated by this insight, we introduce a novel approach called Series2Vec for self-supervised representation learning. Unlike the state-of-the-art methods in time series which rely on hand-crafted data augmentation, Series2Vec is trained by predicting the similarity between two series in both temporal and spectral domains through a self-supervised task. By leveraging the similarity prediction task, which has inherent meaning for a wide range of time series analysis tasks, Series2Vec eliminates the need for hand-crafted data augmentation. To further enforce the network to learn similar representations for similar time series, we propose a novel approach that applies order-invariant attention to each representation within the batch during training. Our evaluation of Series2Vec on nine large real-world datasets, along with the UCR/UEA archive, shows enhanced performance compared to current state-of-the-art self-supervised techniques for time series. Additionally, our extensive experiments show that Series2Vec performs comparably with fully supervised training and offers high efficiency in datasets with limited-labeled data. Finally, we show that the fusion of Series2Vec with other representation learning models leads to enhanced performance for time series classification. Code and models are open-source at https://github.com/Navidfoumani/Series2Vec},
doi = {10.1007/s10618-024-01043-w},
keywords = {Deep Learning, time series, Classification, Extrinsic regression, Review},
refid = {Foumani2024},
related = {scalable-time-series-classifiers},
}
Improving position encoding of transformers for multivariate time series classification.
Foumani, N. M., Tan, C. W., Webb, G. I., & Salehi, M.
Data Mining and Knowledge Discovery, 38, 22-48, 2024.
[Bibtex] [Abstract] → Access on publisher site
@Article{Foumani2023,
author = {Navid Mohammadi Foumani and Chang Wei Tan and Geoffrey I. Webb and Mahsa Salehi},
journal = {Data Mining and Knowledge Discovery},
title = {Improving position encoding of transformers for multivariate time series classification},
year = {2024},
pages = {22-48},
volume = {38},
abstract = {Transformers have demonstrated outstanding performance in many applications of deep learning. When applied to time series data, transformers require effective position encoding to capture the ordering of the time series data. The efficacy of position encoding in time series analysis is not well-studied and remains controversial, e.g., whether it is better to inject absolute position encoding or relative position encoding, or a combination of them. In order to clarify this, we first review existing absolute and relative position encoding methods when applied in time series classification. We then proposed a new absolute position encoding method dedicated to time series data called time Absolute Position Encoding (tAPE). Our new method incorporates the series length and input embedding dimension in absolute position encoding. Additionally, we propose computationally Efficient implementation of Relative Position Encoding (eRPE) to improve generalisability for time series. We then propose a novel multivariate time series classification model combining tAPE/eRPE and convolution-based input encoding named ConvTran to improve the position and data embedding of time series data. The proposed absolute and relative position encoding methods are simple and efficient. They can be easily integrated into transformer blocks and used for downstream tasks such as forecasting, extrinsic regression, and anomaly detection. Extensive experiments on 32 multivariate time-series datasets show that our model is significantly more accurate than state-of-the-art convolution and transformer-based models. Code and models are open-sourced at https://github.com/Navidfoumani/ConvTran.},
creationdate = {2023-09-06T15:51:18},
doi = {10.1007/s10618-023-00948-2},
keywords = {time series},
publisher = {Springer Science and Business Media {LLC}},
related = {scalable-time-series-classifiers},
}
CARLA: Self-supervised contrastive representation learning for time series anomaly detection.
Darban, Z. Z., Webb, G. I., Pan, S., Aggarwal, C. C., & Salehi, M.
Pattern Recognition, 110874, 2024.
[Bibtex] [Abstract] → Access on publisher site
@Article{Darban2024,
author = {Zahra Zamanzadeh Darban and Geoffrey I. Webb and Shirui Pan and Charu C. Aggarwal and Mahsa Salehi},
journal = {Pattern Recognition},
title = {CARLA: Self-supervised contrastive representation learning for time series anomaly detection},
year = {2024},
issn = {0031-3203},
pages = {110874},
abstract = {One main challenge in time series anomaly detection (TSAD) is the lack of labelled data in many real-life scenarios. Most of the existing anomaly detection methods focus on learning the normal behaviour of unlabelled time series in an unsupervised manner. The normal boundary is often defined tightly, resulting in slight deviations being classified as anomalies, consequently leading to a high false positive rate and a limited ability to generalise normal patterns. To address this, we introduce a novel end-to-end self-supervised ContrAstive Representation Learning approach for time series Anomaly detection (CARLA). While existing contrastive learning methods assume that augmented time series windows are positive samples and temporally distant windows are negative samples, we argue that these assumptions are limited as augmentation of time series can transform them to negative samples, and a temporally distant window can represent a positive sample. Existing approaches to contrastive learning for time series have directly copied methods developed for image analysis. We argue that these methods do not transfer well. Instead, our contrastive approach leverages existing generic knowledge about time series anomalies and injects various types of anomalies as negative samples. Therefore, CARLA not only learns normal behaviour but also learns deviations indicating anomalies. It creates similar representations for temporally close windows and distinct ones for anomalies. Additionally, it leverages the information about representations’ neighbours through a self-supervised approach to classify windows based on their nearest/furthest neighbours to further enhance the performance of anomaly detection. In extensive tests on seven major real-world TSAD datasets, CARLA shows superior performance (F1 and AU-PR) over state-of-the-art self-supervised, semi-supervised, and unsupervised TSAD methods for univariate time series and multivariate time series. 
Our research highlights the immense potential of contrastive representation learning in advancing the TSAD field, thus paving the way for novel applications and in-depth exploration.},
doi = {10.1016/j.patcog.2024.110874},
keywords = {Anomaly detection, time series, Deep learning, Contrastive learning, Representation learning, Self-supervised learning},
related = {scalable-time-series-classifiers},
}
A Survey on Graph Neural Networks for Time Series: Forecasting, Classification, Imputation, and Anomaly Detection.
Jin, M., Koh, H. Y., Wen, Q., Zambon, D., Alippi, C., Webb, G. I., King, I., & Pan, S.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 1-20, 2024.
[Bibtex] → Access on publisher site
@Article{Jin2024,
author = {Jin, Ming and Koh, Huan Yee and Wen, Qingsong and Zambon, Daniele and Alippi, Cesare and Webb, Geoffrey I. and King, Irwin and Pan, Shirui},
journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
title = {A Survey on Graph Neural Networks for Time Series: Forecasting, Classification, Imputation, and Anomaly Detection},
year = {2024},
issn = {1939-3539},
pages = {1-20},
doi = {10.1109/tpami.2024.3443141},
keywords = {time series},
publisher = {Institute of Electrical and Electronics Engineers (IEEE)},
related = {scalable-time-series-classifiers},
}
Projecting live fuel moisture content via deep learning.
Miller, L., Zhu, L., Yebra, M., Rudiger, C., & Webb, G. I.
International Journal of Wildland Fire, 2023.
[Bibtex] [Abstract] → Access on publisher site
@Article{Miller2023Projecting,
author = {Miller, Lynn and Zhu, Liujun and Yebra, Marta and Rudiger, Christoph and Webb, Geoffrey I},
journal = {International Journal of Wildland Fire},
title = {Projecting live fuel moisture content via deep learning},
year = {2023},
abstract = {Background: Live fuel moisture content (LFMC) is a key environmental indicator used to monitor for high wildfire risk conditions. Many statistical models have been proposed to predict LFMC from remotely sensed data; however, almost all these estimate current LFMC (nowcasting models). Accurate modelling of LFMC in advance (projection models) would provide wildfire managers with more timely information for assessing and preparing for wildfire risk.
Aims: The aim of this study was to investigate the potential for deep learning models to predict LFMC across the continental United States 3 months in advance.
Method: Temporal convolutional networks were trained and evaluated using a large database of field measured samples, as well as year-long time series of MODerate resolution Imaging Spectroradiometer (MODIS) reflectance data and Parameter-elevation Relationships on Independent Slopes Model (PRISM) meteorological data.
Key results: The proposed 3-month projection model achieved an accuracy (root mean squared error (RMSE) 27.52%; R2 0.47) close to that of the nowcasting model (RMSE 26.52%; R2 0.51).
Conclusions: The study is the first to predict LFMC with a 3-month lead-time, demonstrating the potential for deep learning models to make reliable LFMC projections.
Implications: These findings are beneficial for wildfire management and risk assessment, showing proof-of-concept for providing advance information useful to help mitigate the effect of catastrophic wildfires.},
creationdate = {2023-03-21T11:37:58},
doi = {10.1071/WF22188},
keywords = {time series, earth observation analytics},
related = {earth-observation-analytics},
}
Amercing: An Intuitive and Effective Constraint for Dynamic Time Warping.
Herrmann, M., & Webb, G. I.
Pattern Recognition, 137, Art. no. 109333, 2023.
[Bibtex] [Abstract] → Download PDF → Access on publisher site
@Article{Herrmann2023a,
author = {Matthieu Herrmann and Geoffrey I. Webb},
journal = {Pattern Recognition},
title = {Amercing: An Intuitive and Effective Constraint for Dynamic Time Warping},
year = {2023},
issn = {0031-3203},
volume = {137},
abstract = {Dynamic Time Warping (DTW) is a time series distance measure that allows non-linear alignments between series. Constraints on the alignments in the form of windows and weights have been introduced because unconstrained DTW is too permissive in its alignments. However, windowing introduces a crude step function, allowing unconstrained flexibility within the window, and none beyond it. While not entailing a step function, a multiplicative weight is relative to the distances between aligned points along a warped path, rather than being a direct function of the amount of warping that is introduced. In this paper, we introduce Amerced Dynamic Time Warping (ADTW), a new, intuitive, DTW variant that penalizes the act of warping by a fixed additive cost. Like windowing and weighting, ADTW constrains the amount of warping. However, it avoids both abrupt discontinuities in the amount of warping allowed and the limitations of a multiplicative penalty. We formally introduce ADTW, prove some of its properties, and discuss its parameterization. We show on a simple example how it can be parameterized to achieve an intuitive outcome, and demonstrate its usefulness on a standard time series classification benchmark. We provide a demonstration application in C++ [1].},
articlenumber = {109333},
doi = {10.1016/j.patcog.2023.109333},
keywords = {time series},
related = {scalable-time-series-classifiers},
}
Parameterizing the cost function of dynamic time warping with application to time series classification.
Herrmann, M., Tan, C. W., & Webb, G. I.
Data Mining and Knowledge Discovery, 37, 2024-2045, 2023.
[Bibtex] [Abstract] → Access on publisher site
@Article{Herrmann2023Parameterizing,
author = {Herrmann, Matthieu and Tan, Chang Wei and Webb, Geoffrey I.},
journal = {Data Mining and Knowledge Discovery},
title = {Parameterizing the cost function of dynamic time warping with application to time series classification},
year = {2023},
pages = {2024-2045},
volume = {37},
abstract = {Dynamic time warping (DTW) is a popular time series distance measure that aligns the points in two series with one another. These alignments support warping of the time dimension to allow for processes that unfold at differing rates. The distance is the minimum sum of costs of the resulting alignments over any allowable warping of the time dimension. The cost of an alignment of two points is a function of the difference in the values of those points. The original cost function was the absolute value of this difference. Other cost functions have been proposed. A popular alternative is the square of the difference. However, to our knowledge, this is the first investigation of both the relative impacts of using different cost functions and the potential to tune cost functions to different time series classification tasks. We do so in this paper by using a tunable cost function λγ with parameter γ. We show that higher values of γ place greater weight on larger pairwise differences, while lower values place greater weight on smaller pairwise differences. We demonstrate that training γ significantly improves the accuracy of both the DTW nearest neighbor and Proximity Forest classifiers.},
creationdate = {2023-04-17T09:41:42},
doi = {10.1007/s10618-023-00926-8},
keywords = {time series},
related = {scalable-time-series-classifiers},
}
HYDRA: competing convolutional kernels for fast and accurate time series classification.
Dempster, A., Schmidt, D. F., & Webb, G. I.
Data Mining and Knowledge Discovery, 37(5), 1779-1805, 2023.
[Bibtex] → Access on publisher site
@Article{Dempster2023,
author = {Angus Dempster and Daniel F. Schmidt and Geoffrey I. Webb},
journal = {Data Mining and Knowledge Discovery},
title = {{HYDRA}: competing convolutional kernels for fast and accurate time series classification},
year = {2023},
number = {5},
pages = {1779-1805},
volume = {37},
creationdate = {2023-05-18T09:41:26},
doi = {10.1007/s10618-023-00939-3},
keywords = {time series, efficient ml},
publisher = {Springer Science and Business Media {LLC}},
related = {scalable-time-series-classifiers},
}
Ultra-fast meta-parameter optimization for time series similarity measures with application to nearest neighbour classification.
Tan, C. W., Herrmann, M., & Webb, G. I.
Knowledge and Information Systems, 2023.
[Bibtex] → Access on publisher site
@Article{Tan2023,
author = {Tan, Chang Wei and Herrmann, Matthieu and Webb, Geoffrey I.},
journal = {Knowledge and Information Systems},
title = {Ultra-fast meta-parameter optimization for time series similarity measures with application to nearest neighbour classification},
year = {2023},
doi = {10.1007/s10115-022-01827-w},
keywords = {time series, efficient ml},
publisher = {Springer Science and Business Media {LLC}},
related = {scalable-time-series-classifiers},
}
Time series adversarial attacks: an investigation of smooth perturbations and defense approaches.
Pialla, G., Fawaz, H. I., Devanne, M., Weber, J., Idoumghar, L., Muller, P., Bergmeir, C., Schmidt, D. F., Webb, G. I., & Forestier, G.
International Journal of Data Science and Analytics, 2023.
[Bibtex] → Access on publisher site
@Article{Pialla2023,
author = {Gautier Pialla and Hassan Ismail Fawaz and Maxime Devanne and Jonathan Weber and Lhassane Idoumghar and Pierre-Alain Muller and Christoph Bergmeir and Daniel F. Schmidt and Geoffrey I. Webb and Germain Forestier},
journal = {International Journal of Data Science and Analytics},
title = {Time series adversarial attacks: an investigation of smooth perturbations and defense approaches},
year = {2023},
doi = {10.1007/s41060-023-00438-0},
keywords = {time series},
publisher = {Springer Science and Business Media {LLC}},
related = {scalable-time-series-classifiers},
}
ShapeDBA: Generating Effective Time Series Prototypes Using ShapeDTW Barycenter Averaging.
Ismail-Fawaz, A., Ismail Fawaz, H., Petitjean, F., Devanne, M., Weber, J., Berretti, S., Webb, G. I., & Forestier, G.
Advanced Analytics and Learning on Temporal Data, Cham, pp. 127–142, 2023.
[Bibtex] [Abstract]
@InProceedings{IsmailFawaz2023,
author = {Ismail-Fawaz, Ali and Ismail Fawaz, Hassan and Petitjean, Fran{\c{c}}ois and Devanne, Maxime and Weber, Jonathan and Berretti, Stefano and Webb, Geoffrey I. and Forestier, Germain},
booktitle = {Advanced Analytics and Learning on Temporal Data},
title = {ShapeDBA: Generating Effective Time Series Prototypes Using ShapeDTW Barycenter Averaging},
year = {2023},
address = {Cham},
editor = {Ifrim, Georgiana and Tavenard, Romain and Bagnall, Anthony and Schaefer, Patrick and Malinowski, Simon and Guyet, Thomas and Lemaire, Vincent},
pages = {127--142},
publisher = {Springer Nature Switzerland},
abstract = {Time series data can be found in almost every domain, ranging from the medical field to manufacturing and wireless communication. Generating realistic and useful exemplars and prototypes is a fundamental data analysis task. In this paper, we investigate a novel approach to generating realistic and useful exemplars and prototypes for time series data. Our approach uses a new form of time series average, the ShapeDTW Barycentric Average. We therefore turn our attention to accurately generating time series prototypes with a novel approach. The existing time series prototyping approaches rely on the Dynamic Time Warping (DTW) similarity measure such as DTW Barycentering Average (DBA) and SoftDBA. These last approaches suffer from a common problem of generating out-of-distribution artifacts in their prototypes. This is mostly caused by the DTW variant used and its incapability of detecting neighborhood similarities, instead it detects absolute similarities. Our proposed method, ShapeDBA, uses the ShapeDTW variant of DTW, that overcomes this issue. We chose time series clustering, a popular form of time series analysis to evaluate the outcome of ShapeDBA compared to the other prototyping approaches. Coupled with the k-means clustering algorithm, and evaluated on a total of 123 datasets from the UCR archive, our proposed averaging approach is able to achieve new state-of-the-art results in terms of Adjusted Rand Index.},
isbn = {978-3-031-49896-1},
keywords = {time series},
related = {scalable-time-series-classifiers},
}
A Bayesian-inspired, deep learning-based, semi-supervised domain adaptation technique for land cover mapping.
Lucas, B., Pelletier, C., Schmidt, D., Webb, G. I., & Petitjean, F.
Machine Learning, 112, 1941-1973, 2023.
[Bibtex] [Abstract] → Access on publisher site
@Article{lucas2021bayesian,
author = {Lucas, Benjamin and Pelletier, Charlotte and Schmidt, Daniel and Webb, Geoffrey I and Petitjean, Fran{\c{c}}ois},
journal = {Machine Learning},
title = {A Bayesian-inspired, deep learning-based, semi-supervised domain adaptation technique for land cover mapping},
year = {2023},
pages = {1941-1973},
volume = {112},
abstract = {Land cover maps are a vital input variable to many types of environmental research and management. While they can be produced automatically by machine learning techniques, these techniques require substantial training data to achieve high levels of accuracy, which are not always available. One technique researchers use when labelled training data are scarce is domain adaptation (DA) - where data from an alternate region, known as the source domain, are used to train a classifier and this model is adapted to map the study region, or target domain. The scenario we address in this paper is known as semi-supervised DA, where some labelled samples are available in the target domain. In this paper we present Sourcerer, a Bayesian-inspired, deep learning-based, semi-supervised DA technique for producing land cover maps from satellite image time series (SITS) data. The technique takes a convolutional neural network trained on a source domain and then trains further on the available target domain with a novel regularizer applied to the model weights. The regularizer adjusts the degree to which the model is modified to fit the target data, limiting the degree of change when the target data are few in number and increasing it as target data quantity increases. Our experiments on Sentinel-2 time series images compare Sourcerer with two state-of-the-art semi-supervised domain adaptation techniques and four baseline models. We show that on two different source-target domain pairings Sourcerer outperforms all other methods for any quantity of labelled target data available. In fact, the results on the more difficult target domain show that the starting accuracy of Sourcerer (when no labelled target data are available), 74.2%, is greater than the next-best state-of-the-art method trained on 20,000 labelled target instances.},
doi = {10.1007/s10994-020-05942-z},
keywords = {time series, earth observation analytics},
publisher = {Springer US},
related = {earth-observation-analytics},
}
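The core idea of Sourcerer's regulariser — anchor the fine-tuned weights to the source-domain weights, with an anchoring strength that weakens as labelled target data grows — can be sketched as follows. This is an illustrative simplification, not the published method: both the penalty form and the `anchoring_strength` schedule are hypothetical stand-ins for the schedule derived in the paper.

```python
import numpy as np

def anchoring_strength(n_target, c=100.0):
    # Hypothetical schedule (illustrative only): anchor strongly to the
    # source model when few labelled target instances are available,
    # weakly when there are many.  The paper derives its own schedule.
    return c / (c + n_target)

def sourcerer_penalty(target_weights, source_weights, lam):
    # Penalise the squared distance between the fine-tuned weights and
    # the source-domain weights, scaled by the anchoring strength lam.
    # Adding this term to the target-domain training loss limits how far
    # fine-tuning can move the model away from the source solution.
    return lam * sum(
        float(np.sum((wt - ws) ** 2))
        for wt, ws in zip(target_weights, source_weights)
    )
```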
Elastic similarity and distance measures for multivariate time series.
Shifaz, A., Pelletier, C., Petitjean, F., & Webb, G. I.
Knowledge and Information Systems, 65, 2665-2698, 2023.
[Bibtex] → Access on publisher site
@Article{Shifaz2023,
author = {Ahmed Shifaz and Charlotte Pelletier and Fran{\c{c}}ois Petitjean and Geoffrey I. Webb},
journal = {Knowledge and Information Systems},
title = {Elastic similarity and distance measures for multivariate time series},
year = {2023},
pages = {2665-2698},
volume = {65},
doi = {10.1007/s10115-023-01835-4},
keywords = {time series},
publisher = {Springer Science and Business Media {LLC}},
related = {scalable-time-series-classifiers},
}
Smooth Perturbations for Time Series Adversarial Attacks.
Pialla, G., Fawaz, H. I., Devanne, M., Weber, J., Idoumghar, L., Muller, P., Bergmeir, C., Schmidt, D., Webb, G. I., & Forestier, G.
Proceedings of the 2022 Pacific-Asia Conference on Knowledge Discovery and Data Mining, Cham, pp. 485-496, 2022.
[Bibtex] [Abstract] → Access on publisher site
@InProceedings{10.1007/978-3-031-05933-9_38,
author = {Pialla, Gautier and Fawaz, Hassan Ismail and Devanne, Maxime and Weber, Jonathan and Idoumghar, Lhassane and Muller, Pierre-Alain and Bergmeir, Christoph and Schmidt, Daniel and Webb, Geoffrey I. and Forestier, Germain},
booktitle = {Proceedings of the 2022 Pacific-Asia Conference on Knowledge Discovery and Data Mining},
title = {Smooth Perturbations for Time Series Adversarial Attacks},
year = {2022},
address = {Cham},
editor = {Gama, Jo{\~a}o and Li, Tianrui and Yu, Yang and Chen, Enhong and Zheng, Yu and Teng, Fei},
pages = {485-496},
publisher = {Springer International Publishing},
abstract = {Adversarial attacks represent a threat to every deep neural network. They are particularly effective if they can perturb a given model while remaining undetectable. They have been initially introduced for image classifiers, and are well studied for this task. For time series, few attacks have yet been proposed. Most that have are adaptations of attacks previously proposed for image classifiers. Although these attacks are effective, they generate perturbations containing clearly discernible patterns such as sawtooth and spikes. Adversarial patterns are not perceptible on images, but the attacks proposed to date are readily perceptible in the case of time series. In order to generate stealthier adversarial attacks for time series, we propose a new attack that produces smoother perturbations. We find that smooth perturbations are harder to detect by the naked eye. We also show how adversarial training can improve model robustness against this attack, thus making models less vulnerable.},
doi = {10.1007/978-3-031-05933-9_38},
isbn = {978-3-031-05933-9},
keywords = {time series},
related = {scalable-time-series-classifiers},
}
MultiRocket: multiple pooling operators and transformations for fast and effective time series classification.
Tan, C. W., Dempster, A., Bergmeir, C., & Webb, G. I.
Data Mining and Knowledge Discovery, 36, 1623-1646, 2022.
[Bibtex] [Abstract] → Access on publisher site
@Article{Tan2022,
author = {Tan, Chang Wei and Dempster, Angus and Bergmeir, Christoph and Webb, Geoffrey I.},
journal = {Data Mining and Knowledge Discovery},
title = {MultiRocket: multiple pooling operators and transformations for fast and effective time series classification},
year = {2022},
issn = {1573-756X},
pages = {1623-1646},
volume = {36},
abstract = {We propose MultiRocket, a fast time series classification (TSC) algorithm that achieves state-of-the-art accuracy with a tiny fraction of the time and without the complex ensembling structure of many state-of-the-art methods. MultiRocket improves on MiniRocket, one of the fastest TSC algorithms to date, by adding multiple pooling operators and transformations to improve the diversity of the features generated. In addition to processing the raw input series, MultiRocket also applies first order differences to transform the original series. Convolutions are applied to both representations, and four pooling operators are applied to the convolution outputs. When benchmarked using the University of California Riverside TSC benchmark datasets, MultiRocket is significantly more accurate than MiniRocket, and competitive with the best ranked current method in terms of accuracy, HIVE-COTE 2.0, while being orders of magnitude faster.},
doi = {10.1007/s10618-022-00844-1},
keywords = {time series, efficient ml},
related = {scalable-time-series-classifiers},
}
Time series extrinsic regression.
Tan, C. W., Bergmeir, C., Petitjean, F., & Webb, G. I.
Data Mining and Knowledge Discovery, 35(3), 1032-1060, 2021.
[Bibtex] [Abstract] → Access on publisher site
@Article{tan2021regression,
author = {Tan, Chang Wei and Bergmeir, Christoph and Petitjean, Francois and Webb, Geoffrey I.},
journal = {Data Mining and Knowledge Discovery},
title = {Time series extrinsic regression},
year = {2021},
issn = {1573-756X},
number = {3},
pages = {1032-1060},
volume = {35},
abstract = {This paper studies time series extrinsic regression (TSER): a regression task of which the aim is to learn the relationship between a time series and a continuous scalar variable; a task closely related to time series classification (TSC), which aims to learn the relationship between a time series and a categorical class label. This task generalizes time series forecasting, relaxing the requirement that the value predicted be a future value of the input series or primarily depend on more recent values. In this paper, we motivate and study this task, and benchmark existing solutions and adaptations of TSC algorithms on a novel archive of 19 TSER datasets which we have assembled. Our results show that the state-of-the-art TSC algorithm Rocket, when adapted for regression, achieves the highest overall accuracy compared to adaptations of other TSC algorithms and state-of-the-art machine learning (ML) algorithms such as XGBoost, Random Forest and Support Vector Regression. More importantly, we show that much research is needed in this field to improve the accuracy of ML models. We also find evidence that further research has excellent prospects of improving upon these straightforward baselines.},
doi = {10.1007/s10618-021-00745-9},
keywords = {time series},
publisher = {Springer US},
related = {scalable-time-series-classifiers},
url = {https://rdcu.be/cgCAn},
}
Early abandoning and pruning for elastic distances including dynamic time warping.
Herrmann, M., & Webb, G. I.
Data Mining and Knowledge Discovery, 35(6), 2577-2601, 2021.
[Bibtex] → Access on publisher site
@Article{Herrmann_2021,
author = {Matthieu Herrmann and Geoffrey I. Webb},
journal = {Data Mining and Knowledge Discovery},
title = {Early abandoning and pruning for elastic distances including dynamic time warping},
year = {2021},
number = {6},
pages = {2577-2601},
volume = {35},
doi = {10.1007/s10618-021-00782-4},
keywords = {time series},
publisher = {Springer Science and Business Media {LLC}},
related = {scalable-time-series-classifiers},
url = {https://rdcu.be/cuoN0},
}
Tight lower bounds for Dynamic Time Warping.
Webb, G. I., & Petitjean, F.
Pattern Recognition, 115, Art. no. 107895, 2021.
[Bibtex] [Abstract] → Access on publisher site
@Article{WEBB2021107895,
author = {Geoffrey I. Webb and Fran\c{c}ois Petitjean},
journal = {Pattern Recognition},
title = {Tight lower bounds for Dynamic Time Warping},
year = {2021},
issn = {0031-3203},
volume = {115},
abstract = {Dynamic Time Warping (DTW) is a popular similarity measure for aligning and comparing time series. Due to DTW's high computation time, lower bounds are often employed to screen poor matches. Many alternative lower bounds have been proposed, providing a range of different trade-offs between tightness and computational efficiency. LB_KEOGH provides a useful trade-off in many applications. Two recent lower bounds, LB_IMPROVED and LB_ENHANCED, are substantially tighter than LB_KEOGH. All three have the same worst case computational complexity - linear with respect to series length and constant with respect to window size. We present four new DTW lower bounds in the same complexity class. LB_PETITJEAN is substantially tighter than LB_IMPROVED, with only modest additional computational overhead. LB_WEBB is more efficient than LB_IMPROVED, while often providing a tighter bound. LB_WEBB is always tighter than LB_KEOGH. The parameter free LB_WEBB is usually tighter than LB_ENHANCED. A parameterized variant, LB_Webb_Enhanced, is always tighter than LB_ENHANCED. A further variant, LB_WEBB*, is useful for some constrained distance functions. In extensive experiments, LB_WEBB proves to be very effective for nearest neighbor search.},
articlenumber = {107895},
doi = {10.1016/j.patcog.2021.107895},
keywords = {time series},
related = {scalable-time-series-classifiers},
}
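LB_KEOGH, the baseline bound this abstract compares against, illustrates how these envelope-based lower bounds work: build the upper and lower envelopes of one series under the warping window, then sum how far the other series strays outside them. A minimal sketch, assuming a squared-difference DTW cost and equal-length series (the function name `lb_keogh` is ours, not a published implementation):

```python
import numpy as np

def lb_keogh(query, candidate, window):
    # LB_Keogh-style lower bound on DTW(query, candidate) under a
    # Sakoe-Chiba warping window of the given radius: each query point
    # can only align with candidate points inside the window, so any
    # excursion beyond the candidate's envelope must be paid for.
    q = np.asarray(query, dtype=float)
    c = np.asarray(candidate, dtype=float)
    total = 0.0
    for i, x in enumerate(q):
        lo = max(0, i - window)
        hi = min(len(c), i + window + 1)
        upper = c[lo:hi].max()   # upper envelope at position i
        lower = c[lo:hi].min()   # lower envelope at position i
        if x > upper:
            total += (x - upper) ** 2
        elif x < lower:
            total += (lower - x) ** 2
    return total
```

In nearest-neighbour search, a candidate whose bound already exceeds the best DTW distance found so far can be discarded without computing the full DTW.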
Ultra fast warping window optimization for Dynamic Time Warping.
Tan, C. W., Herrmann, M., & Webb, G. I.
IEEE International Conference on Data Mining (ICDM-21), pp. 589-598, 2021.
[Bibtex] → Access on publisher site
@InProceedings{TanEtAlUltraFast2021,
author = {Tan, Chang Wei and Herrmann, Matthieu and Webb, Geoffrey I.},
booktitle = {IEEE International Conference on Data Mining (ICDM-21)},
title = {Ultra fast warping window optimization for Dynamic Time Warping},
year = {2021},
pages = {589-598},
doi = {10.1109/ICDM51629.2021.00070},
keywords = {time series, efficient ml},
related = {scalable-time-series-classifiers},
url = {https://changweitan.com/research/UltraFastWWSearch.pdf},
}
MINIROCKET: A Very Fast (Almost) Deterministic Transform for Time Series Classification.
Dempster, A., Schmidt, D. F., & Webb, G. I.
Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 248-257, 2021.
[Bibtex] [Abstract] → Access on publisher site
@InProceedings{dempsteretal21kdd,
author = {Angus Dempster and Daniel F. Schmidt and Geoffrey I. Webb},
booktitle = {Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining},
title = {{MINIROCKET:} {A} Very Fast (Almost) Deterministic Transform for Time Series Classification},
year = {2021},
pages = {248-257},
abstract = {Until recently, the most accurate methods for time series classification were limited by high computational complexity. ROCKET achieves state-of-the-art accuracy with a fraction of the computational expense of most existing methods by transforming input time series using random convolutional kernels, and using the transformed features to train a linear classifier. We reformulate ROCKET into a new method, MINIROCKET, making it up to 75 times faster on larger datasets, and making it almost deterministic (and optionally, with additional computational expense, fully deterministic), while maintaining essentially the same accuracy. Using this method, it is possible to train and test a classifier on all of 109 datasets from the UCR archive to state-of-the-art accuracy in less than 10 minutes. MINIROCKET is significantly faster than any other method of comparable accuracy (including ROCKET), and significantly more accurate than any other method of even roughly-similar computational expense. As such, we suggest that MINIROCKET should now be considered and used as the default variant of ROCKET.},
doi = {10.1145/3447548.3467231},
keywords = {time series, efficient ml},
related = {scalable-time-series-classifiers},
url = {https://arxiv.org/abs/2012.08791},
}
Live fuel moisture content estimation from MODIS: A deep learning approach.
Zhu, L., Webb, G. I., Yebra, M., Scortechini, G., Miller, L., & Petitjean, F.
ISPRS Journal of Photogrammetry and Remote Sensing, 179, 81-91, 2021.
[Bibtex] [Abstract] → Access on publisher site
@Article{ZHU202181,
author = {Liujun Zhu and Geoffrey I. Webb and Marta Yebra and Gianluca Scortechini and Lynn Miller and Francois Petitjean},
journal = {ISPRS Journal of Photogrammetry and Remote Sensing},
title = {Live fuel moisture content estimation from MODIS: A deep learning approach},
year = {2021},
issn = {0924-2716},
pages = {81-91},
volume = {179},
abstract = {Live fuel moisture content (LFMC) is an essential variable to model fire danger and behaviour. This paper presents the first application of deep learning to LFMC estimation based on the historical LFMC ground samples of the Globe-LFMC database, as a step towards operational daily LFMC mapping in the Contiguous United States (CONUS). One-year MODerate resolution Imaging Spectroradiometer (MODIS) time series preceding each LFMC sample were extracted as the primary data source for training. The proposed temporal convolutional neural network for LFMC (TempCNN-LFMC) comprises three 1-D convolutional layers that learn the multi-scale temporal dynamics (features) of one-year MODIS time series specific to LFMC estimation. The learned features, together with a few auxiliary variables (e.g., digital elevation model), are then passed to three fully connected layers to extract the non-linear relationships with LFMC. In the primary training and validation scenario, the neural network was trained using samples from 2002 to 2013 and then adopted to estimating the LFMC from 2014 to 2018, achieving an overall root mean square error (RMSE) of 25.57% and a correlation coefficient (R) of 0.74. Good consistency on spatial patterns and temporal trends of accuracy was observed. The trained model achieved a similar RMSE of 25.98%, 25.20% and 25.93% for forest, shrubland, and grassland, respectively, without requiring prior information on the vegetation type.},
doi = {10.1016/j.isprsjprs.2021.07.010},
keywords = {time series, Live fuel moisture content, earth observation analytics, MODIS, Convolutional neural network, Time series analysis, Fire risk, Fire danger},
related = {earth-observation-analytics},
}
ROCKET: Exceptionally fast and accurate time series classification using random convolutional kernels.
Dempster, A., Petitjean, F., & Webb, G. I.
Data Mining and Knowledge Discovery, 34, 1454-1495, 2020.
Second Most Highly Cited Paper Published in Data Mining and Knowledge Discovery in 2020; Clarivate Web of Science Highly Cited Paper 2024
[Bibtex] [Abstract] → Access on publisher site
@Article{dempster2020rocket,
author = {Angus Dempster and Francois Petitjean and Geoffrey I. Webb},
journal = {Data Mining and Knowledge Discovery},
title = {ROCKET: Exceptionally fast and accurate time series classification using random convolutional kernels},
year = {2020},
pages = {1454-1495},
volume = {34},
abstract = {Most methods for time series classification that attain state-of-the-art accuracy have high computational complexity, requiring significant training time even for smaller datasets, and are intractable for larger datasets. Additionally, many existing methods focus on a single type of feature such as shape or frequency. Building on the recent success of convolutional neural networks for time series classification, we show that simple linear classifiers using random convolutional kernels achieve state-of-the-art accuracy with a fraction of the computational expense of existing methods. Using this method, it is possible to train and test a classifier on all 85 'bake off' datasets in the UCR archive in <2h, and it is possible to train a classifier on a large dataset of more than one million time series in approximately 1 h.},
comment = {Second Most Highly Cited Paper Published in Data Mining and Knowledge Discovery in 2020; Clarivate Web of Science Highly Cited Paper 2024},
doi = {10.1007/s10618-020-00701-z},
issue = {5},
keywords = {time series, efficient ml},
related = {scalable-time-series-classifiers},
url = {https://rdcu.be/c1zg4},
}
ABSTRACT Most methods for time series classification that attain state-of-the-art accuracy have high computational complexity, requiring significant training time even for smaller datasets, and are intractable for larger datasets. Additionally, many existing methods focus on a single type of feature such as shape or frequency. Building on the recent success of convolutional neural networks for time series classification, we show that simple linear classifiers using random convolutional kernels achieve state-of-the-art accuracy with a fraction of the computational expense of existing methods. Using this method, it is possible to train and test a classifier on all 85 'bake off' datasets in the UCR archive in <2h, and it is possible to train a classifier on a large dataset of more than one million time series in approximately 1 h.
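The random-kernel mechanism described in the abstract is simple enough to sketch. The following is an illustrative rendition only: random centred kernels are convolved with each series, and each convolution output is summarised by two features, its maximum and the proportion of positive values (PPV). The real ROCKET implementation also samples kernel lengths, dilations, paddings and biases more carefully, so treat this as a sketch of the idea rather than the published method:

```python
# Sketch of the ROCKET idea: many random convolutional kernels, two summary
# features per kernel (max and PPV), producing a feature matrix that a
# simple linear classifier can then learn from.
import numpy as np

rng = np.random.default_rng(0)

def random_kernels(n_kernels, length=9):
    """Random centred weights plus a random bias per kernel (simplified)."""
    weights = rng.normal(size=(n_kernels, length))
    weights -= weights.mean(axis=1, keepdims=True)  # centre each kernel
    biases = rng.uniform(-1, 1, size=n_kernels)
    return weights, biases

def transform(X, weights, biases):
    """Return [max, PPV] features per kernel for each series in X."""
    n, _ = X.shape
    k, _ = weights.shape
    feats = np.empty((n, 2 * k))
    for i, x in enumerate(X):
        for j in range(k):
            conv = np.convolve(x, weights[j][::-1], mode="valid") + biases[j]
            feats[i, 2 * j] = conv.max()
            feats[i, 2 * j + 1] = (conv > 0).mean()  # proportion of positives
    return feats

# Tiny demo: ten synthetic series from two easily separated shapes.
X = np.vstack([np.sin(np.linspace(0, 4 * np.pi, 50)) + 0.1 * rng.normal(size=50)
               for _ in range(5)] +
              [np.linspace(-1, 1, 50) + 0.1 * rng.normal(size=50)
               for _ in range(5)])
W, b = random_kernels(20)
features = transform(X, W, b)
print(features.shape)  # (10, 40): 2 features per kernel
```

In the paper, the resulting feature matrix is fed to a ridge-regularised linear classifier; the linearity is what keeps both training and prediction fast.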
FastEE: Fast Ensembles of Elastic Distances for time series classification.
Tan, C. W., Petitjean, F., & Webb, G. I.
Data Mining and Knowledge Discovery, 34(1), 231-272, 2020.
[Bibtex] [Abstract] → Access on publisher site
@Article{Tan2019,
author = {Tan, Chang Wei and Petitjean, Fran{\c{c}}ois and Webb, Geoffrey I.},
journal = {Data Mining and Knowledge Discovery},
title = {FastEE: Fast Ensembles of Elastic Distances for time series classification},
year = {2020},
issn = {1573-756X},
number = {1},
pages = {231-272},
volume = {34},
abstract = {In recent years, many new ensemble-based time series classification (TSC) algorithms have been proposed. Each of them is significantly more accurate than their predecessors. The Hierarchical Vote Collective of Transformation-based Ensembles (HIVE-COTE) is currently the most accurate TSC algorithm when assessed on the UCR repository. It is a meta-ensemble of 5 state-of-the-art ensemble-based classifiers. The time complexity of HIVE-COTE---particularly for training---is prohibitive for most datasets. There is thus a critical need to speed up the classifiers that compose HIVE-COTE. This paper focuses on speeding up one of its components: Ensembles of Elastic Distances (EE), which is the classifier that leverages on the decades of research into the development of time-dedicated measures. Training EE can be prohibitive for many datasets. For example, it takes a month on the ElectricDevices dataset with 9000 instances. This is because EE needs to cross-validate the hyper-parameters used for the 11 similarity measures it encompasses. In this work, Fast Ensembles of Elastic Distances is proposed to train EE faster. There are two versions to it. The exact version makes it possible to train EE 10 times faster. The approximate version is 40 times faster than EE without significantly impacting the classification accuracy. This translates to being able to train EE on ElectricDevices in 13h.},
doi = {10.1007/s10618-019-00663-x},
keywords = {time series, efficient ml},
related = {scalable-time-series-classifiers},
url = {https://rdcu.be/c1y5a},
}
ABSTRACT In recent years, many new ensemble-based time series classification (TSC) algorithms have been proposed. Each of them is significantly more accurate than their predecessors. The Hierarchical Vote Collective of Transformation-based Ensembles (HIVE-COTE) is currently the most accurate TSC algorithm when assessed on the UCR repository. It is a meta-ensemble of 5 state-of-the-art ensemble-based classifiers. The time complexity of HIVE-COTE, particularly for training, is prohibitive for most datasets. There is thus a critical need to speed up the classifiers that compose HIVE-COTE. This paper focuses on speeding up one of its components: Ensembles of Elastic Distances (EE), which is the classifier that leverages on the decades of research into the development of time-dedicated measures. Training EE can be prohibitive for many datasets. For example, it takes a month on the ElectricDevices dataset with 9000 instances. This is because EE needs to cross-validate the hyper-parameters used for the 11 similarity measures it encompasses. In this work, Fast Ensembles of Elastic Distances is proposed to train EE faster. There are two versions to it. The exact version makes it possible to train EE 10 times faster. The approximate version is 40 times faster than EE without significantly impacting the classification accuracy. This translates to being able to train EE on ElectricDevices in 13h.
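The elastic measures EE ensembles are dynamic-programming alignments in the Dynamic Time Warping family. A textbook DTW with a warping-window parameter (the kind of hyper-parameter EE must cross-validate for each of its 11 measures) illustrates why every distance computation, and hence training, is costly. This is plain DTW for illustration, not FastEE's speed-up:

```python
# Textbook DTW with a Sakoe-Chiba warping window. Each call fills an
# O(n * window) band of a dynamic-programming table, which is why
# cross-validating window sizes over a whole training set is expensive.
import math

def dtw(a, b, window=None):
    n, m = len(a), len(b)
    w = max(window if window is not None else max(n, m), abs(n - m))
    INF = math.inf
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - w), min(m, i + w) + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j],      # expand a
                                 cost[i][j - 1],      # expand b
                                 cost[i - 1][j - 1])  # match
    return math.sqrt(cost[n][m])

print(dtw([0, 0, 1, 2], [0, 1, 2, 2], window=0))  # no warping: Euclidean, sqrt(2)
print(dtw([0, 0, 1, 2], [0, 1, 2, 2]))            # full warping aligns them: 0.0
```

The two calls show why the window matters: a window of zero reduces DTW to Euclidean distance, while a full window lets the alignment absorb the time shift entirely.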
Unsupervised Domain Adaptation Techniques for Classification of Satellite Image Time Series.
Lucas, B., Pelletier, C., Schmidt, D., Webb, G. I., & Petitjean, F.
IEEE International Geoscience and Remote Sensing Symposium, pp. 1074–1077, 2020.
[Bibtex] [Abstract] → Access on publisher site
@InProceedings{lucas2020unsupervised,
author = {Lucas, Benjamin and Pelletier, Charlotte and Schmidt, Daniel and Webb, Geoffrey I and Petitjean, Fran{\c{c}}ois},
booktitle = {IEEE International Geoscience and Remote Sensing Symposium},
title = {Unsupervised Domain Adaptation Techniques for Classification of Satellite Image Time Series},
year = {2020},
organization = {IEEE},
pages = {1074--1077},
abstract = {Land cover maps are vitally important to many elements of environmental management. However, the machine learning algorithms used to produce them require a substantive quantity of labelled training data to reach the best levels of accuracy. When researchers wish to map an area where no labelled training data are available, one potential solution is to use a classifier trained on another geographical area and adapt it to the target location; this is known as Unsupervised Domain Adaptation (DA). In this paper we undertake the first experiments using unsupervised DA methods for the classification of satellite image time series (SITS) data. Our experiments draw the interesting conclusion that existing methods provide no benefit when used on SITS data, and that this is likely due to the temporal nature of the data and the change in class distributions between the regions. This suggests that an unsupervised domain adaptation technique for SITS would be extremely beneficial for land cover mapping.},
doi = {10.1109/IGARSS39084.2020.9324339},
keywords = {time series, earth observation analytics},
related = {scalable-time-series-classifiers},
}
ABSTRACT Land cover maps are vitally important to many elements of environmental management. However, the machine learning algorithms used to produce them require a substantive quantity of labelled training data to reach the best levels of accuracy. When researchers wish to map an area where no labelled training data are available, one potential solution is to use a classifier trained on another geographical area and adapt it to the target location; this is known as Unsupervised Domain Adaptation (DA). In this paper we undertake the first experiments using unsupervised DA methods for the classification of satellite image time series (SITS) data. Our experiments draw the interesting conclusion that existing methods provide no benefit when used on SITS data, and that this is likely due to the temporal nature of the data and the change in class distributions between the regions. This suggests that an unsupervised domain adaptation technique for SITS would be extremely beneficial for land cover mapping.
TS-CHIEF: A Scalable and Accurate Forest Algorithm for Time Series Classification.
Shifaz, A., Pelletier, C., Petitjean, F., & Webb, G. I.
Data Mining and Knowledge Discovery, 34(3), 742-775, 2020.
Third Most Highly Cited Paper Published in Data Mining and Knowledge Discovery in 2020
[Bibtex] [Abstract] → Access on publisher site
@Article{shifazetal2019,
author = {Shifaz, Ahmed and Pelletier, Charlotte and Petitjean, Francois and Webb, Geoffrey I},
journal = {Data Mining and Knowledge Discovery},
title = {TS-CHIEF: A Scalable and Accurate Forest Algorithm for Time Series Classification},
year = {2020},
number = {3},
pages = {742-775},
volume = {34},
abstract = {Time Series Classification (TSC) has seen enormous progress over the last two decades. HIVE-COTE (Hierarchical Vote Collective of Transformation-based Ensembles) is the current state of the art in terms of classification accuracy. HIVE-COTE recognizes that time series data are a specific data type for which the traditional attribute-value representation, used predominantly in machine learning, fails to provide a relevant representation. HIVE-COTE combines multiple types of classifiers: each extracting information about a specific aspect of a time series, be it in the time domain, frequency domain or summarization of intervals within the series. However, HIVE-COTE (and its predecessor, FLAT-COTE) is often infeasible to run on even modest amounts of data. For instance, training HIVE-COTE on a dataset with only 1500 time series can require 8 days of CPU time. It has polynomial runtime with respect to the training set size, so this problem compounds as data quantity increases. We propose a novel TSC algorithm, TS-CHIEF (Time Series Combination of Heterogeneous and Integrated Embedding Forest), which rivals HIVE-COTE in accuracy but requires only a fraction of the runtime. TS-CHIEF constructs an ensemble classifier that integrates the most effective embeddings of time series that research has developed in the last decade. It uses tree-structured classifiers to do so efficiently. We assess TS-CHIEF on 85 datasets of the University of California Riverside (UCR) archive, where it achieves state-of-the-art accuracy with scalability and efficiency. We demonstrate that TS-CHIEF can be trained on 130 k time series in 2 days, a data quantity that is beyond the reach of any TSC algorithm with comparable accuracy.},
comment = {Third Most Highly Cited Paper Published in Data Mining and Knowledge Discovery in 2020},
doi = {10.1007/s10618-020-00679-8},
keywords = {time series, efficient ml},
related = {scalable-time-series-classifiers},
url = {https://rdcu.be/c1zg6},
}
ABSTRACT Time Series Classification (TSC) has seen enormous progress over the last two decades. HIVE-COTE (Hierarchical Vote Collective of Transformation-based Ensembles) is the current state of the art in terms of classification accuracy. HIVE-COTE recognizes that time series data are a specific data type for which the traditional attribute-value representation, used predominantly in machine learning, fails to provide a relevant representation. HIVE-COTE combines multiple types of classifiers: each extracting information about a specific aspect of a time series, be it in the time domain, frequency domain or summarization of intervals within the series. However, HIVE-COTE (and its predecessor, FLAT-COTE) is often infeasible to run on even modest amounts of data. For instance, training HIVE-COTE on a dataset with only 1500 time series can require 8 days of CPU time. It has polynomial runtime with respect to the training set size, so this problem compounds as data quantity increases. We propose a novel TSC algorithm, TS-CHIEF (Time Series Combination of Heterogeneous and Integrated Embedding Forest), which rivals HIVE-COTE in accuracy but requires only a fraction of the runtime. TS-CHIEF constructs an ensemble classifier that integrates the most effective embeddings of time series that research has developed in the last decade. It uses tree-structured classifiers to do so efficiently. We assess TS-CHIEF on 85 datasets of the University of California Riverside (UCR) archive, where it achieves state-of-the-art accuracy with scalability and efficiency. We demonstrate that TS-CHIEF can be trained on 130 k time series in 2 days, a data quantity that is beyond the reach of any TSC algorithm with comparable accuracy.
InceptionTime: Finding AlexNet for Time Series Classification.
Fawaz, H. I., Lucas, B., Forestier, G., Pelletier, C., Schmidt, D. F., Weber, J., Webb, G. I., Idoumghar, L., Muller, P., & Petitjean, F.
Data Mining and Knowledge Discovery, 34, 1936-1962, 2020.
Clarivate Web of Science Highly Cited Paper 2022 - 2024
Most Highly Cited Paper Published In Data Mining and Knowledge Discovery in 2020
[Bibtex] [Abstract] → Access on publisher site
@Article{fawaz2019inceptiontime,
author = {Hassan Ismail Fawaz and Benjamin Lucas and Germain Forestier and Charlotte Pelletier and Daniel F. Schmidt and Jonathan Weber and Geoffrey I. Webb and Lhassane Idoumghar and Pierre-Alain Muller and Francois Petitjean},
journal = {Data Mining and Knowledge Discovery},
title = {InceptionTime: Finding AlexNet for Time Series Classification},
year = {2020},
pages = {1936-1962},
volume = {34},
abstract = {This paper brings deep learning at the forefront of research into time series classification (TSC). TSC is the area of machine learning tasked with the categorization (or labelling) of time series. The last few decades of work in this area have led to significant progress in the accuracy of classifiers, with the state of the art now represented by the HIVE-COTE algorithm. While extremely accurate, HIVE-COTE cannot be applied to many real-world datasets because of its high training time complexity in O(N^2 . T^4) for a dataset with N time series of length T. For example, it takes HIVE-COTE more than 8 days to learn from a small dataset with N = 1500 time series of short length T = 46. Meanwhile deep learning has received enormous attention because of its high accuracy and scalability. Recent approaches to deep learning for TSC have been scalable, but less accurate than HIVE-COTE. We introduce InceptionTime - an ensemble of deep Convolutional Neural Network models, inspired by the Inception-v4 architecture. Our experiments show that InceptionTime is on par with HIVE-COTE in terms of accuracy while being much more scalable: not only can it learn from 1500 time series in one hour but it can also learn from 8M time series in 13 h, a quantity of data that is fully out of reach of HIVE-COTE.},
comment = {Clarivate Web of Science Highly Cited Paper 2022 - 2024},
comment2 = {Most Highly Cited Paper Published In Data Mining and Knowledge Discovery in 2020},
doi = {10.1007/s10618-020-00710-y},
issue = {6},
keywords = {time series},
related = {scalable-time-series-classifiers},
url = {https://rdcu.be/b6TXh},
}
ABSTRACT This paper brings deep learning at the forefront of research into time series classification (TSC). TSC is the area of machine learning tasked with the categorization (or labelling) of time series. The last few decades of work in this area have led to significant progress in the accuracy of classifiers, with the state of the art now represented by the HIVE-COTE algorithm. While extremely accurate, HIVE-COTE cannot be applied to many real-world datasets because of its high training time complexity in O(N^2 . T^4) for a dataset with N time series of length T. For example, it takes HIVE-COTE more than 8 days to learn from a small dataset with N = 1500 time series of short length T = 46. Meanwhile deep learning has received enormous attention because of its high accuracy and scalability. Recent approaches to deep learning for TSC have been scalable, but less accurate than HIVE-COTE. We introduce InceptionTime - an ensemble of deep Convolutional Neural Network models, inspired by the Inception-v4 architecture. Our experiments show that InceptionTime is on par with HIVE-COTE in terms of accuracy while being much more scalable: not only can it learn from 1500 time series in one hour but it can also learn from 8M time series in 13 h, a quantity of data that is fully out of reach of HIVE-COTE.
Deep Learning for the Classification of Sentinel-2 Image Series.
Pelletier, C., Webb, G. I., & Petitjean, F.
IEEE International Geoscience And Remote Sensing Symposium, 2019.
[Bibtex] [Abstract] → Access on publisher site
@InProceedings{PelletierEtAl19b,
author = {Pelletier, Charlotte and Webb, Geoffrey I. and Petitjean, Francois},
booktitle = {IEEE International Geoscience And Remote Sensing Symposium},
title = {Deep Learning for the Classification of Sentinel-2 Image Series},
year = {2019},
month = {Jul},
abstract = {Satellite image time series (SITS) have proven to be essential for accurate and up-to-date land cover mapping over large areas. Most works about SITS have focused on the use of traditional classification algorithms such as Random Forests (RFs). Deep learning algorithms have been very successful for supervised tasks, in particular for data that exhibit a structure between attributes, such as space or time. In this work, we compare for the first time RFs to the two leading deep learning algorithms for handling temporal data: Recurrent Neural Networks (RNNs) and temporal Convolutional Neural Networks (TempCNNs). We carry out a large experiment using Sentinel-2 time series. We compare both accuracy and computational times to classify 10,980 km² over Australia. The results highlight the good performance of TempCNNs, which obtain the highest accuracy. They also show that RNNs might be less suited for large scale study as they have higher runtime complexity.},
doi = {10.1109/IGARSS.2019.8900123},
keywords = {time series, earth observation analytics},
related = {earth-observation-analytics},
}
ABSTRACT Satellite image time series (SITS) have proven to be essential for accurate and up-to-date land cover mapping over large areas. Most works about SITS have focused on the use of traditional classification algorithms such as Random Forests (RFs). Deep learning algorithms have been very successful for supervised tasks, in particular for data that exhibit a structure between attributes, such as space or time. In this work, we compare for the first time RFs to the two leading deep learning algorithms for handling temporal data: Recurrent Neural Networks (RNNs) and temporal Convolutional Neural Networks (TempCNNs). We carry out a large experiment using Sentinel-2 time series. We compare both accuracy and computational times to classify 10,980 km² over Australia. The results highlight the good performance of TempCNNs, which obtain the highest accuracy. They also show that RNNs might be less suited for large scale study as they have higher runtime complexity.
Elastic bands across the path: A new framework and methods to lower bound DTW.
Tan, C. W., Petitjean, F., & Webb, G. I.
Proceedings of the 2019 SIAM International Conference on Data Mining, pp. 522-530, 2019.
[Bibtex] [Abstract] → Access on publisher site
@InProceedings{TanEtAl19,
Title = {Elastic bands across the path: A new framework and methods to lower bound DTW},
Author = {Tan, Chang Wei and Petitjean, Francois and Webb, Geoffrey I.},
Booktitle = {Proceedings of the 2019 SIAM International Conference on Data Mining},
Year = {2019},
Pages = {522-530},
Abstract = {There has been renewed recent interest in developing effective lower bounds for Dynamic Time Warping (DTW) distance between time series. These have many applications in time series indexing, clustering, forecasting, regression and classification. One of the key time series classification algorithms, the nearest neighbor algorithm with DTW distance (NN-DTW) is very expensive to compute, due to the quadratic complexity of DTW. Lower bound search can speed up NN-DTW substantially. An effective and tight lower bound quickly prunes off unpromising nearest neighbor candidates from the search space and minimises the number of the costly DTW computations. The speed up provided by lower bound search becomes increasingly critical as training set size increases. Different lower bounds provide different trade-offs between computation time and tightness. Most existing lower bounds interact with DTW warping window sizes. They are very tight and effective at smaller warping window sizes, but become looser as the warping window increases, thus reducing the pruning effectiveness for NN-DTW. In this work, we present a new class of lower bounds that are tighter than the popular Keogh lower bound, while requiring similar computation time. Our new lower bounds take advantage of the DTW boundary condition, monotonicity and continuity constraints to create a tighter lower bound. Of particular significance, they remain relatively tight even for large windows. A single parameter to these new lower bounds controls the speed-tightness trade-off. We demonstrate that these new lower bounds provide an exceptional balance between computation time and tightness for the NN-DTW time series classification task, resulting in greatly improved efficiency for NN-DTW lower bound search.},
Keywords = {time series},
Related = {scalable-time-series-classifiers},
Url = {https://arxiv.org/abs/1808.09617}
}
ABSTRACT There has been renewed recent interest in developing effective lower bounds for Dynamic Time Warping (DTW) distance between time series. These have many applications in time series indexing, clustering, forecasting, regression and classification. One of the key time series classification algorithms, the nearest neighbor algorithm with DTW distance (NN-DTW) is very expensive to compute, due to the quadratic complexity of DTW. Lower bound search can speed up NN-DTW substantially. An effective and tight lower bound quickly prunes off unpromising nearest neighbor candidates from the search space and minimises the number of the costly DTW computations. The speed up provided by lower bound search becomes increasingly critical as training set size increases. Different lower bounds provide different trade-offs between computation time and tightness. Most existing lower bounds interact with DTW warping window sizes. They are very tight and effective at smaller warping window sizes, but become looser as the warping window increases, thus reducing the pruning effectiveness for NN-DTW. In this work, we present a new class of lower bounds that are tighter than the popular Keogh lower bound, while requiring similar computation time. Our new lower bounds take advantage of the DTW boundary condition, monotonicity and continuity constraints to create a tighter lower bound. Of particular significance, they remain relatively tight even for large windows. A single parameter to these new lower bounds controls the speed-tightness trade-off. We demonstrate that these new lower bounds provide an exceptional balance between computation time and tightness for the NN-DTW time series classification task, resulting in greatly improved efficiency for NN-DTW lower bound search.
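The baseline Keogh bound that the abstract compares against can be sketched compactly: the query is enclosed in an envelope of running minima and maxima over the warping window, and only candidate points falling outside that envelope contribute to the bound. This is the classic LB_Keogh in its squared-distance form, not the paper's new tighter bounds:

```python
# LB_Keogh: a cheap lower bound on windowed DTW. Because the bound never
# exceeds the true DTW cost for the same window, any candidate whose bound
# already exceeds the best distance so far can be pruned from NN-DTW search
# without computing the full quadratic-time DTW.
def lb_keogh(query, candidate, window):
    n = len(query)
    total = 0.0
    for i, c in enumerate(candidate):
        band = query[max(0, i - window):min(n, i + window + 1)]
        lo, hi = min(band), max(band)
        if c > hi:
            total += (c - hi) ** 2   # point above the upper envelope
        elif c < lo:
            total += (c - lo) ** 2   # point below the lower envelope
    return total

print(lb_keogh([0, 1, 2, 1, 0], [0, 1, 4, 1, 0], window=1))  # -> 4.0
```

In the example, only the candidate value 4 escapes the envelope (whose upper bound at that position is 2), contributing (4 - 2)² = 4. As the abstract notes, the envelope widens with the window, which is why bounds of this family loosen for large windows.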
Using Sentinel-2 Image Time Series to map the State of Victoria, Australia.
Pelletier, C., Ji, Z., Hagolle, O., Morse-McNabb, E., Sheffield, K., Webb, G. I., & Petitjean, F.
Proceedings 10th International Workshop on the Analysis of Multitemporal Remote Sensing Images, MultiTemp 2019, 2019.
[Bibtex] [Abstract] → Access on publisher site
@InProceedings{PelletierEtAl19c,
author = {Pelletier, C. and Ji, Z. and Hagolle, O. and Morse-McNabb, E. and Sheffield, K. and Webb, G. I. and Petitjean, F.},
booktitle = {Proceedings 10th International Workshop on the Analysis of Multitemporal Remote Sensing Images, MultiTemp 2019},
title = {Using Sentinel-2 Image Time Series to map the State of Victoria, Australia},
year = {2019},
abstract = {Sentinel-2 satellites are now acquiring images of the entire Earth every five days from 10 to 60 m spatial resolution. The supervised classification of this new optical image time series allows the operational production of accurate land cover maps over large areas. In this paper, we investigate the use of one year of Sentinel-2 data to map the state of Victoria in Australia. In particular, we produce two land cover maps using the most established and advanced algorithms in time series classification: Random Forest (RF) and Temporal Convolutional Neural Network (TempCNN). To our knowledge, these are the first land cover maps at 10 m spatial resolution for an Australian state.},
doi = {10.1109/Multi-Temp.2019.8866921},
keywords = {time series, earth observation analytics, Sentinel-2 images, land cover map, temporal convolutional neural networks, random forests},
related = {earth-observation-analytics},
}
ABSTRACT Sentinel-2 satellites are now acquiring images of the entire Earth every five days from 10 to 60 m spatial resolution. The supervised classification of this new optical image time series allows the operational production of accurate land cover maps over large areas. In this paper, we investigate the use of one year of Sentinel-2 data to map the state of Victoria in Australia. In particular, we produce two land cover maps using the most established and advanced algorithms in time series classification: Random Forest (RF) and Temporal Convolutional Neural Network (TempCNN). To our knowledge, these are the first land cover maps at 10 m spatial resolution for an Australian state.
Proximity Forest: an effective and scalable distance-based classifier for time series.
Lucas, B., Shifaz, A., Pelletier, C., O'Neill, L., Zaidi, N., Goethals, B., Petitjean, F., & Webb, G. I.
Data Mining and Knowledge Discovery, 33, 607-635, 2019.
[Bibtex] [Abstract] → Access on publisher site
@Article{LucasEtAl2019,
author = {Lucas, Benjamin and Shifaz, Ahmed and Pelletier, Charlotte and O'Neill, Lachlan and Zaidi, Nayyar and Goethals, Bart and Petitjean, Francois and Webb, Geoffrey I.},
journal = {Data Mining and Knowledge Discovery},
title = {Proximity Forest: an effective and scalable distance-based classifier for time series},
year = {2019},
issn = {1573-756X},
pages = {607-635},
volume = {33},
abstract = {Research into the classification of time series has made enormous progress in the last decade. The UCR time series archive has played a significant role in challenging and guiding the development of new learners for time series classification. The largest dataset in the UCR archive holds 10,000 time series only; which may explain why the primary research focus has been on creating algorithms that have high accuracy on relatively small datasets. This paper introduces Proximity Forest, an algorithm that learns accurate models from datasets with millions of time series, and classifies a time series in milliseconds. The models are ensembles of highly randomized Proximity Trees. Whereas conventional decision trees branch on attribute values (and usually perform poorly on time series), Proximity Trees branch on the proximity of time series to one exemplar time series or another; allowing us to leverage the decades of work into developing relevant measures for time series. Proximity Forest gains both efficiency and accuracy by stochastic selection of both exemplars and similarity measures. Our work is motivated by recent time series applications that provide orders of magnitude more time series than the UCR benchmarks. Our experiments demonstrate that Proximity Forest is highly competitive on the UCR archive: it ranks among the most accurate classifiers while being significantly faster. We demonstrate on a 1M time series Earth observation dataset that Proximity Forest retains this accuracy on datasets that are many orders of magnitude greater than those in the UCR repository, while learning its models at least 100,000 times faster than current state-of-the-art models Elastic Ensemble and COTE.},
doi = {10.1007/s10618-019-00617-3},
keywords = {time series, efficient ml},
related = {scalable-time-series-classifiers},
url = {https://rdcu.be/blB8E},
}
ABSTRACT Research into the classification of time series has made enormous progress in the last decade. The UCR time series archive has played a significant role in challenging and guiding the development of new learners for time series classification. The largest dataset in the UCR archive holds 10,000 time series only; which may explain why the primary research focus has been on creating algorithms that have high accuracy on relatively small datasets. This paper introduces Proximity Forest, an algorithm that learns accurate models from datasets with millions of time series, and classifies a time series in milliseconds. The models are ensembles of highly randomized Proximity Trees. Whereas conventional decision trees branch on attribute values (and usually perform poorly on time series), Proximity Trees branch on the proximity of time series to one exemplar time series or another; allowing us to leverage the decades of work into developing relevant measures for time series. Proximity Forest gains both efficiency and accuracy by stochastic selection of both exemplars and similarity measures. Our work is motivated by recent time series applications that provide orders of magnitude more time series than the UCR benchmarks. Our experiments demonstrate that Proximity Forest is highly competitive on the UCR archive: it ranks among the most accurate classifiers while being significantly faster. We demonstrate on a 1M time series Earth observation dataset that Proximity Forest retains this accuracy on datasets that are many orders of magnitude greater than those in the UCR repository, while learning its models at least 100,000 times faster than current state-of-the-art models Elastic Ensemble and COTE.
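A Proximity Tree's split, as described above, can be sketched in a few lines: choose one exemplar series per branch and route each incoming series to the branch of its nearest exemplar. Squared Euclidean distance stands in here for the elastic measures Proximity Forest samples stochastically, so this is an illustrative toy rather than the published algorithm:

```python
# A single proximity split: series branch on which exemplar they are
# closest to under a chosen measure, rather than on an attribute value.
def sq_euclid(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def proximity_split(series, exemplars, measure=sq_euclid):
    """Route each series to the branch of its nearest exemplar."""
    branches = [[] for _ in exemplars]
    for s in series:
        nearest = min(range(len(exemplars)),
                      key=lambda k: measure(s, exemplars[k]))
        branches[nearest].append(s)
    return branches

data = [[0, 0, 1], [0, 1, 1], [5, 5, 6], [5, 6, 6]]
left, right = proximity_split(data, exemplars=[[0, 0, 1], [5, 5, 6]])
print(len(left), len(right))  # 2 2
```

Applied recursively with exemplars and measures chosen at random per node, splits of this kind yield the highly randomized Proximity Trees that the forest ensembles; classifying a query then costs only one distance computation per branch per level.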
Exploring Data Quantity Requirements for Domain Adaptation in the Classification of Satellite Image Time Series.
Lucas, B., Pelletier, C., Inglada, J., Schmidt, D., Webb, G. I., & Petitjean, F.
Proceedings 10th International Workshop on the Analysis of Multitemporal Remote Sensing Images, MultiTemp 2019, 2019.
[Bibtex] [Abstract] → Access on publisher site
@InProceedings{LucasEtAl2019b,
author = {Lucas, B. and Pelletier, C. and Inglada, J. and Schmidt, D. and Webb, G. I. and Petitjean, F},
booktitle = {Proceedings 10th International Workshop on the Analysis of Multitemporal Remote Sensing Images, MultiTemp 2019},
title = {Exploring Data Quantity Requirements for Domain Adaptation in the Classification of Satellite Image Time Series},
year = {2019},
publisher = {IEEE, Institute of Electrical and Electronics Engineers},
abstract = {Land cover maps are a vital input variable in all types of environmental research and management. However, the modern state-of-the-art machine learning techniques used to create them require substantial training data to produce optimal accuracy. Domain Adaptation is one technique researchers might use when labelled training data are unavailable or scarce. This paper looks at the result of training a convolutional neural network model on a region where data are available (source domain), and then adapting this model to another region (target domain) by retraining it on the available labelled data, and in particular how these results change with increasing data availability. Our experiments performing domain adaptation on satellite image time series draw three interesting conclusions: (1) a model trained only on data from the source domain delivers 73.0% test accuracy on the target domain; (2) when all of the weights are retrained on the target data, over 16,000 instances were required to improve upon the accuracy of the source-only model; and (3) even if sufficient data is available in the target domain, using a model pretrained on a source domain will result in better overall test accuracy compared to a model trained on target domain data only: 88.9% versus 84.7%.},
doi = {10.1109/Multi-Temp.2019.8866898},
keywords = {time series, earth observation analytics},
related = {earth-observation-analytics},
}
Temporal Convolutional Neural Network for the Classification of Satellite Image Time Series.
Pelletier, C., Webb, G. I., & Petitjean, F.
Remote Sensing, 11(5), Art. no. 523, 2019.
Clarivate Web of Science Highly Cited Paper 2021 - 2024
[Bibtex] [Abstract] → Access on publisher site
@Article{PelletierEtAl19,
author = {Pelletier, Charlotte and Webb, Geoffrey I. and Petitjean, Francois},
journal = {Remote Sensing},
title = {Temporal Convolutional Neural Network for the Classification of Satellite Image Time Series},
year = {2019},
issn = {2072-4292},
number = {5},
volume = {11},
abstract = {Latest remote sensing sensors are capable of acquiring high spatial and spectral Satellite Image Time Series (SITS) of the world. These image series are a key component of classification systems that aim at obtaining up-to-date and accurate land cover maps of the Earth’s surfaces. More specifically, current SITS combine high temporal, spectral and spatial resolutions, which makes it possible to closely monitor vegetation dynamics. Although traditional classification algorithms, such as Random Forest (RF), have been successfully applied to create land cover maps from SITS, these algorithms do not make the most of the temporal domain. This paper proposes a comprehensive study of Temporal Convolutional Neural Networks (TempCNNs), a deep learning approach which applies convolutions in the temporal dimension in order to automatically learn temporal (and spectral) features. The goal of this paper is to quantitatively and qualitatively evaluate the contribution of TempCNNs for SITS classification, as compared to RF and Recurrent Neural Networks (RNNs), a standard deep learning approach that is particularly suited to temporal data. We carry out experiments on a Formosat-2 scene with 46 images and one million labelled time series. The experimental results show that TempCNNs are more accurate than the current state of the art for SITS classification. We provide some general guidelines on the network architecture, common regularization mechanisms, and hyper-parameter values such as batch size; we also draw out some differences with standard results in computer vision (e.g., about pooling layers). Finally, we assess the visual quality of the land cover maps produced by TempCNNs.},
articlenumber = {523},
comment = {Clarivate Web of Science Highly Cited Paper 2021 - 2024},
doi = {10.3390/rs11050523},
keywords = {time series, earth observation analytics},
related = {earth-observation-analytics},
}
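The TempCNN work above centres on applying convolutions along the temporal dimension of a satellite image time series. As a rough, self-contained sketch of that core operation (not the paper's network; the function name and fixed kernel are invented for illustration), a valid-mode 1D convolution over a univariate series:

```python
# Sketch only: the single temporal convolution that TempCNN-style
# networks stack and learn end to end; here the kernel is fixed by hand.

def temporal_conv1d(series, kernel):
    """Valid-mode cross-correlation of `series` with `kernel` along time."""
    k = len(kernel)
    return [
        sum(series[t + i] * kernel[i] for i in range(k))
        for t in range(len(series) - k + 1)
    ]

# A length-3 moving-average kernel applied to a short series:
series = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0]
kernel = [1 / 3, 1 / 3, 1 / 3]
features = temporal_conv1d(series, kernel)  # five values, one per window
```

In the paper's setting the kernels are learned and stacked with pooling and dense layers; this only shows the sliding-window arithmetic that distinguishes temporal convolution from the pixel-wise features used by a Random Forest.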
Efficient search of the best warping window for Dynamic Time Warping.
Tan, C. W., Herrmann, M., Forestier, G., Webb, G. I., & Petitjean, F.
Proceedings of the 2018 SIAM International Conference on Data Mining, pp. 459-467, 2018.
Best Research Paper Award
[Bibtex] [Abstract] → Download PDF
@InProceedings{TanEtAl18,
author = {Tan, Chang Wei and Herrmann, Matthieu and Forestier, Germain and Webb, Geoffrey I. and Petitjean, Francois},
booktitle = {Proceedings of the 2018 {SIAM} International Conference on Data Mining},
title = {Efficient search of the best warping window for Dynamic Time Warping},
year = {2018},
pages = {459-467},
abstract = {Time series classification maps time series to labels. The nearest neighbour algorithm (NN) using the Dynamic Time Warping (DTW) similarity measure is a leading algorithm for this task and a component of the current best ensemble classifiers for time series. However, NN-DTW is only a winning combination when its meta-parameter - its warping window - is learned from the training data. The warping window (WW) intuitively controls the amount of distortion allowed when comparing a pair of time series. With a training database of N time series of lengths L, a naive approach to learning the WW requires Omega(N^2 . L^3) operations. This often translates into NN-DTW requiring days for training on datasets containing only a few thousand time series. In this paper, we introduce FastWWSearch: an efficient and exact method to learn WW. We show on 86 datasets that our method is always faster than the state of the art, by at least one order of magnitude and up to 1000x speed-up.},
comment = {Best Research Paper Award},
keywords = {time series, efficient ml},
related = {scalable-time-series-classifiers},
}
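The warping window that FastWWSearch learns constrains how far the DTW alignment may stray from the diagonal. A minimal sketch of window-constrained DTW (a standard Sakoe-Chiba band; this is the distance whose meta-parameter is being tuned, not the paper's search algorithm, and the function name is ours):

```python
import math

def dtw(a, b, w):
    """DTW distance between series a and b under warping window w
    (w = 0 forces the diagonal; larger w permits more distortion)."""
    n, m = len(a), len(b)
    w = max(w, abs(n - m))  # the band must at least cover the length gap
    INF = math.inf
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - w), min(m, i + w) + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j],      # insertion
                                 cost[i][j - 1],      # deletion
                                 cost[i - 1][j - 1])  # match
    return math.sqrt(cost[n][m])

# A one-step shift is absorbed with w = 1 but penalised with w = 0:
shifted = dtw([0.0, 0.0, 1.0], [0.0, 1.0, 1.0], 1)   # 0.0
diagonal = dtw([0.0, 0.0, 1.0], [0.0, 1.0, 1.0], 0)  # 1.0
```

The naive Omega(N^2 . L^3) cost in the abstract comes from re-running leave-one-out nearest-neighbour evaluation of this distance for every candidate window; FastWWSearch's contribution is avoiding that exhaustive sweep while returning the same answer.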
Generating synthetic time series to augment sparse datasets.
Forestier, G., Petitjean, F., Dau, H. A., Webb, G. I., & Keogh, E.
IEEE International Conference on Data Mining (ICDM-17), pp. 865-870, 2017.
[Bibtex] → Download PDF
@InProceedings{ForestierEtAl17,
title = {Generating synthetic time series to augment sparse datasets},
author = {Forestier, Germain and Petitjean, Francois and Dau, Hoang Anh and Webb, Geoffrey I. and Keogh, Eamonn},
booktitle = {IEEE International Conference on Data Mining (ICDM-17)},
year = {2017},
pages = {865-870},
keywords = {time series},
related = {scalable-time-series-classifiers},
}
Indexing and classifying gigabytes of time series under time warping.
Tan, C. W., Webb, G. I., & Petitjean, F.
Proceedings of the 2017 SIAM International Conference on Data Mining, pp. 282-290, 2017.
[Bibtex] → Download PDF → Access on publisher site
@InProceedings{TanEtAl17a,
title = {Indexing and classifying gigabytes of time series under time warping},
author = {Tan, Chang Wei and Webb, Geoffrey I. and Petitjean, Francois},
booktitle = {Proceedings of the 2017 SIAM International Conference on Data Mining},
year = {2017},
organization = {SIAM},
pages = {282-290},
doi = {10.1137/1.9781611974973.32},
keywords = {time series},
related = {scalable-time-series-classifiers},
}
Faster and more accurate classification of time series by exploiting a novel dynamic time warping averaging algorithm.
Petitjean, F., Forestier, G., Webb, G. I., Nicholson, A. E., Chen, Y., & Keogh, E.
Knowledge and Information Systems, 47(1), 1-26, 2016.
[Bibtex] [Abstract] → Download PDF → Access on publisher site
@Article{PetitjeanEtAl16a,
author = {Petitjean, F. and Forestier, G. and Webb, G. I. and Nicholson, A. E. and Chen, Y. and Keogh, E.},
journal = {Knowledge and Information Systems},
title = {Faster and more accurate classification of time series by exploiting a novel dynamic time warping averaging algorithm},
year = {2016},
number = {1},
pages = {1-26},
volume = {47},
abstract = {A concerted research effort over the past two decades has heralded significant improvements in both the efficiency and effectiveness of time series classification. The consensus that has emerged in the community is that the best solution is a surprisingly simple one. In virtually all domains, the most accurate classifier is the nearest neighbor algorithm with dynamic time warping as the distance measure. The time complexity of dynamic time warping means that successful deployments on resource-constrained devices remain elusive. Moreover, the recent explosion of interest in wearable computing devices, which typically have limited computational resources, has greatly increased the need for very efficient classification algorithms. A classic technique to obtain the benefits of the nearest neighbor algorithm, without inheriting its undesirable time and space complexity, is to use the nearest centroid algorithm. Unfortunately, the unique properties of (most) time series data mean that the centroid typically does not resemble any of the instances, an unintuitive and underappreciated fact. In this paper we demonstrate that we can exploit a recent result by Petitjean et al. to allow meaningful averaging of 'warped' time series, which then allows us to create super-efficient nearest 'centroid' classifiers that are at least as accurate as their more computationally challenged nearest neighbor relatives. We demonstrate empirically the utility of our approach by comparing it to all the appropriate strawmen algorithms on the ubiquitous UCR Benchmarks and with a case study in supporting insect classification on resource-constrained sensors.},
doi = {10.1007/s10115-015-0878-8},
keywords = {time series, efficient ml},
related = {scalable-time-series-classifiers},
}
Dynamic Time Warping Averaging of Time Series Allows Faster and More Accurate Classification.
Petitjean, F., Forestier, G., Webb, G. I., Nicholson, A., Chen, Y., & Keogh, E.
Proceedings of the 14th IEEE International Conference on Data Mining, pp. 470-479, 2014.
ICDM 2023 10-year Highest Impact Paper Award
One of nine papers invited to Knowledge and Information Systems journal ICDM-14 special issue
[Bibtex] [Abstract] → Download PDF → Access on publisher site
@InProceedings{PetitjeanEtAl14b,
author = {Petitjean, F. and Forestier, G. and Webb, G. I. and Nicholson, A. and Chen, Y. and Keogh, E.},
booktitle = {Proceedings of the 14th {IEEE} International Conference on Data Mining},
title = {Dynamic Time Warping Averaging of Time Series Allows Faster and More Accurate Classification},
year = {2014},
pages = {470-479},
abstract = {Recent years have seen significant progress in improving both the efficiency and effectiveness of time series classification. However, because the best solution is typically the Nearest Neighbor algorithm with the relatively expensive Dynamic Time Warping as the distance measure, successful deployments on resource-constrained devices remain elusive. Moreover, the recent explosion of interest in wearable devices, which typically have limited computational resources, has created a growing need for very efficient classification algorithms. A commonly used technique to glean the benefits of the Nearest Neighbor algorithm, without inheriting its undesirable time complexity, is to use the Nearest Centroid algorithm. However, because of the unique properties of (most) time series data, the centroid typically does not resemble any of the instances, an unintuitive and underappreciated fact. In this work we show that we can exploit a recent result to allow meaningful averaging of 'warped' time series, and that this result allows us to create ultra-efficient Nearest 'Centroid' classifiers that are at least as accurate as their more lethargic Nearest Neighbor cousins.},
comment = {ICDM 2023 10-year Highest Impact Paper Award},
comment2 = {One of nine papers invited to Knowledge and Information Systems journal ICDM-14 special issue},
doi = {10.1109/ICDM.2014.27},
keywords = {time series},
related = {scalable-time-series-classifiers},
}
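Both versions of this work rest on the same idea: average the training series of each class under DTW (the DBA result the abstracts refer to), then classify a query against those few averaged 'centroids' instead of against every training instance. A hedged, minimal sketch of one DBA update step and the resulting nearest-centroid rule (illustrative only; the function names are invented and the published method includes refinements omitted here):

```python
import math

def dtw_path(a, b):
    """DTW cost between a and b plus a backtracked alignment path."""
    n, m = len(a), len(b)
    INF = math.inf
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1],
                                 cost[i - 1][j - 1])
    # Backtrack the cheapest alignment from (n, m) to (1, 1).
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        moves = {(i - 1, j - 1): cost[i - 1][j - 1],
                 (i - 1, j): cost[i - 1][j],
                 (i, j - 1): cost[i][j - 1]}
        i, j = min(moves, key=moves.get)
    return cost[n][m], path

def dba_iteration(centroid, series_set):
    """One DBA update: each centroid point becomes the mean of every
    point aligned to it (under DTW) across the set."""
    buckets = [[] for _ in centroid]
    for s in series_set:
        _, path = dtw_path(centroid, s)
        for i, j in path:
            buckets[i].append(s[j])
    return [sum(b) / len(b) for b in buckets]

def classify(query, centroids):
    """Nearest-centroid rule: one DTW comparison per class label."""
    return min(centroids, key=lambda lab: dtw_path(query, centroids[lab])[0])
```

Classification cost then scales with the number of classes rather than the number of training series, which is what makes the approach viable on the resource-constrained sensors the papers target.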