Data Scientist


Scalable learning of time series classifiers

Time series classification is an important data analysis task. The largest dataset in the UCR repository, the standard collection of benchmark time series classification tasks, contains approximately 10,000 series. We are working with the French Space Agency on classifying land usage from satellite images. This task requires learning from many millions of time series and classifying many billions. The existing state of the art does not scale to these magnitudes. We are developing new time series classification technologies that will.

The following blog post describes the use of barycentric averaging in time series classification: http://www.kdnuggets.com/2014/12/averaging-improves-accuracy-speed-time-series-classification.html. The code can be downloaded here: http://francois-petitjean.com/Research/ICDM2014-DTW/index.php. The slides for the ICDM 2014 paper can be downloaded here: http://francois-petitjean.com/Research/ICDM2014-DTW/Slides.pdf.

The TSI software for the SDM 2017 paper on time series indexing can be downloaded here: https://github.com/ChangWeiTan/TSI. Slides for the SDM 2017 paper can be downloaded here: http://francois-petitjean.com/Research/SDM17-slides.pdf.

The software for the SDM 2018 paper on finding the best warping window, which won the best paper award, can be downloaded here: https://github.com/ChangWeiTan/FastWWSearch (Matlab version).
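
To make the warping window problem concrete, here is a minimal sketch of the naive baseline: dynamic time warping (DTW) restricted to a band of a given width, with the window chosen by exhaustive leave-one-out cross-validation of a 1-nearest-neighbour classifier. This is not the FastWWSearch algorithm, and the function names (dtw_distance, loocv_errors, best_window) are hypothetical names chosen for illustration. It only shows the quantity being searched for, under the assumption of univariate series and a squared-difference cost.

```python
import math


def dtw_distance(a, b, window):
    """DTW cost between sequences a and b, with squared differences as the
    pointwise cost and warping limited to a band of the given width."""
    n, m = len(a), len(b)
    # The band must be at least |n - m| wide for any warping path to exist.
    w = max(window, abs(n - m))
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        # Only cells with |i - j| <= w are inside the band.
        for j in range(max(1, i - w), min(m, i + w) + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            curr[j] = cost + min(prev[j], prev[j - 1], curr[j - 1])
        prev = curr
    return prev[m]


def loocv_errors(series, labels, window):
    """Count leave-one-out errors of 1-NN classification under DTW."""
    errors = 0
    for i, query in enumerate(series):
        best_dist, best_label = math.inf, None
        for j, candidate in enumerate(series):
            if i == j:
                continue  # leave the query itself out
            d = dtw_distance(query, candidate, window)
            if d < best_dist:
                best_dist, best_label = d, labels[j]
        errors += best_label != labels[i]
    return errors


def best_window(series, labels, max_window):
    """Exhaustively try every window from 0 to max_window and return the
    smallest one with the fewest leave-one-out errors."""
    return min(range(max_window + 1),
               key=lambda w: loocv_errors(series, labels, w))
```

Each candidate window requires a full pass of pairwise DTW computations over the training set, so the cost of this exhaustive search grows quickly with the number and length of the series. That cost is the motivation for faster search methods.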

Publications