Temporal Feature Importances

A lot of time has passed since our last post, but the activity level will increase as the temperatures do, hopefully.

Separating classes by their attributes is the basis of any classification task. Natural land-cover classes follow seasonal patterns, which may differ between classes. With a smoothed VI time-series based on MODIS Terra satellite data (MOD13Q1), we can determine the point in time with the best spectral separability between the classes used. To this end, the feature_importances_ measures of the RandomForestClassifier in scikit-learn were calculated on our training samples in County Longford for all 23 composite periods plus slope, elevation and aspect, separately for each of the years 2002 to 2010*. These values are based on the Gini impurity, which describes how mixed the classes in a data distribution are: the importance of a feature is the normalised total reduction in impurity achieved by the splits on that feature. The feature importances add up to a value of 1 for each year.

* 2001 and 2011 were left out due to border artifacts of the time-series smoothing.
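The calculation for a single year can be sketched roughly as follows. This is a minimal example on synthetic data, not our actual processing chain: the feature names, the sample size, the number of trees and the artificially informative seventh composite are all assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical feature layout: 23 NDVI composites followed by three terrain features.
FEATURE_NAMES = [f"ndvi_{i + 1:02d}" for i in range(23)] + ["slope", "elevation", "aspect"]


def yearly_importances(X, y, n_estimators=200, seed=0):
    """Fit a random forest on one year's samples and return its feature importances."""
    forest = RandomForestClassifier(n_estimators=n_estimators, random_state=seed)
    forest.fit(X, y)
    return forest.feature_importances_


# Synthetic stand-in for the training samples (NOT the Longford data).
rng = np.random.default_rng(42)
n_samples = 300
y = rng.integers(0, 2, size=n_samples)                # two made-up classes
X = rng.normal(size=(n_samples, len(FEATURE_NAMES)))  # pure noise features
X[:, 6] += 2.0 * y                                    # make composite 7 informative

importances = yearly_importances(X, y)
best = FEATURE_NAMES[int(np.argmax(importances))]
print(best, float(importances.sum()))
```

Repeating this for every year gives one importance vector per year, each of which sums to 1.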

Figure 1 shows the cumulative feature importances for all classes (Improved Grassland [GA], Semi-Improved Grassland [GS], Forest, Water, Settlement and Peatland) on a smoothed NDVI time-series. Two distinct peaks can be observed. The major peak occurs during winter (late Dec/Jan), while the minor peak is apparent in April. According to the feature importances, the summer months June to August seem to have very little impact on the classification results. The DEM values (elevation, slope, aspect) also have only a minor influence on the classification results.


Fig.1: Cumulative Feature Importances – All Classes
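One plausible way to arrive at cumulative values is to sum the yearly importance vectors feature by feature, and to translate a composite index back into a calendar date. The sketch below assumes exactly that; the helper names and the toy numbers are made up, and only the 16-day spacing of the MOD13Q1 composites (starting on day of year 1) is taken from the product itself.

```python
from datetime import date, timedelta


def composite_start(year, index):
    """Start date of the 0-based index-th 16-day composite of a year (day of year 1, 17, ...)."""
    return date(year, 1, 1) + timedelta(days=16 * index)


def cumulative_importances(per_year):
    """Sum the importance of each feature over all years in the mapping year -> values."""
    years = sorted(per_year)
    n_features = len(per_year[years[0]])
    totals = [0.0] * n_features
    for year in years:
        values = per_year[year]
        if len(values) != n_features:
            raise ValueError(f"year {year} has {len(values)} features, expected {n_features}")
        for i, value in enumerate(values):
            totals[i] += value
    return totals


# Toy values for three features and two years (illustrative only).
toy = {2002: [0.5, 0.3, 0.2], 2003: [0.1, 0.6, 0.3]}
totals = cumulative_importances(toy)
print(totals)
print(composite_start(2002, 22))  # start of the last composite of 2002
```

Because each year sums to 1, the cumulative values over n years sum to n.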

Figure 2 focuses on the distinction between the two grassland classes. Two temporal peaks can be found, one around early April and one during November. The first peak corresponds to the minor spring peak of Figure 1. In addition to the temporal peaks of the VI-based separability, elevation has a stronger influence on the classification result when only the grassland classes are distinguished.


Fig.2: Cumulative Feature Importances – Classes: GA vs. GS

Figure 3 shows the temporal distribution of separability for single years. Common characteristics, but also differences, can be observed. A spring peak in the separability of the two grassland types is common to every year, although it can vary strongly in intensity, length and timing. Years 2005 and 2007, for example, are characterised by a very early, long-lasting peak with a low intensity, whereas 2002 stands out with only one intensive and long-lasting spring peak. Most years feature a second peak around November, which is usually less pronounced and shorter in duration. However, 2002 shows a complete lack of high feature importances for this particular period. Years 2003 and 2010, in contrast, exhibit a strong separability during their late-season peak, which outweighs the typically stronger spring peak.


Fig.3: Single year Feature Importances – Classes: GA vs. GS
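The peaks described here were read off the plots by eye. If one wanted to locate them programmatically, a very simple local-maximum search over a year's importance values could look like the sketch below. The function name, the threshold and the example values are assumptions for illustration, not part of our analysis.

```python
def find_peaks(values, min_height):
    """Return the indices of local maxima that reach at least min_height.

    A plateau or shoulder is reported once, at its first index. The first and
    last elements are never reported, so a peak sitting on the year boundary
    would need the neighbouring year's values appended.
    """
    peaks = []
    for i in range(1, len(values) - 1):
        rises = values[i] > values[i - 1]
        does_not_rise_further = values[i] >= values[i + 1]
        if values[i] >= min_height and rises and does_not_rise_further:
            peaks.append(i)
    return peaks


# Made-up importance values for eleven consecutive composites.
example = [0.02, 0.03, 0.08, 0.12, 0.06, 0.03, 0.02, 0.02, 0.05, 0.09, 0.04]
peaks = find_peaks(example, min_height=0.05)
print(peaks)
```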

The timing of separability can be attributed to the different phenological cycles and growth intensities of these two groups. Improved grasslands have a slightly earlier green-up phase than semi-improved grasslands, which leads to an increased separability during this period (cf. Fig.4). The same mechanism applies to the second peak, where the senescence periods differ in time. Climatic variability may have an influence on the varying timing and intensity of these processes, which has to be further evaluated.


Fig.4: Mean NDVI time-series ± 1 std, GA vs. GS
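The class profiles behind such a plot, together with a simple per-composite separability score, could be computed as in the following sketch. Note that the score used here (absolute difference of the class means divided by the sum of the class standard deviations) is an assumption chosen for illustration; it is not the random forest importance used in the rest of this post, and the NDVI curves are synthetic.

```python
import numpy as np


def class_profile(ndvi):
    """Mean and sample standard deviation per composite; ndvi has shape (n_samples, n_composites)."""
    return ndvi.mean(axis=0), ndvi.std(axis=0, ddof=1)


def separability(ndvi_a, ndvi_b):
    """Absolute difference of the class means relative to the summed class spreads."""
    mean_a, std_a = class_profile(ndvi_a)
    mean_b, std_b = class_profile(ndvi_b)
    return np.abs(mean_a - mean_b) / (std_a + std_b)


# Synthetic NDVI curves: the second class follows the same curve one composite later.
rng = np.random.default_rng(0)
t = np.arange(23)
base = 0.6 + 0.2 * np.sin(2 * np.pi * (t - 5) / 23)
ga = base + rng.normal(scale=0.03, size=(50, 23))
gs = np.roll(base, 1) + rng.normal(scale=0.03, size=(40, 23))

scores = separability(ga, gs)
best_composite = int(np.argmax(scores)) + 1  # 1-based composite number
print(best_composite)
```

With a pure time shift between the two curves, the score tends to be highest where the curve is steepest, which mirrors the green-up and senescence argument above.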

All figures were created with the matplotlib library in Python.


