MCCCS application examples

The MCCCS includes several analysis scripts for image processing and file conversion. These scripts are arranged in custom pipelines for different analysis tasks; the pipelines are implemented in the GNU Bash shell, a powerful tool and command language. To illustrate the whole concept, we provide several examples based on freely available datasets.
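
As a minimal sketch of how such a pipeline can be composed in Bash (the script names below are illustrative placeholders, not actual MCCCS scripts):

    #!/bin/bash
    # Illustrative pipeline sketch - the script names are hypothetical.
    ./convert_input.sh data/     # file conversion step
    ./segment_images.sh data/    # image processing step
    ./export_results.sh data/    # collect the numeric results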

Data

  • Three image sets (A1, A2, A3) from the Leaf Segmentation Challenge (LSC) 2014
  • A hyper-spectral example from the Purdue Research Foundation.
  • Disease classification for detached barley leaves (in preparation, not yet published).

Segmentation example 1 - classification

This example shows an application for foreground/background segmentation of top-view plant images (Arabidopsis thaliana - A1, A2; tobacco - A3) using a supervised Random Forest classifier. Each of the three data sets is split into a training set and a prediction set. After processing, the segmentation results, named foreground.png, are stored in each sub-folder, e.g. plant_003.
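
After a successful run (see "Running examples" below), the per-plant masks can be listed like this (a sketch; run inside the experiment folder):

    # List all foreground masks produced by the segmentation; sub-folder
    # names such as plant_003 come from the data set.
    find . -name 'foreground.png'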

Hyper-spectral example 1 - classification

This example shows an application for multi-labeled segmentation of an airborne hyper-spectral image data set. Here, partly pre-classified ground-truth image masks are used to train a supervised Random Forest classifier. After processing, the segmentation result, named classified.png, is stored in the experiment sub-folder (stack_images → dc).
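
After processing, the result can be located as follows (a sketch; the path is taken from the description above):

    # The classification result is expected below stack_images/dc.
    find stack_images -name 'classified.png'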

Hyper-spectral example 2 - clustering

This example shows an application for multi-labeled segmentation of the same airborne hyper-spectral image data set as in the previous example. Instead of using pre-classified ground-truth data to train a supervised classifier, a clustering approach is performed here. After processing, the segmentation result, named clustered.png, is stored in the experiment sub-folder (stack_images → dc).
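
A quick way to inspect the result is to count its distinct labels (a sketch; it assumes ImageMagick is installed and uses the path from the description above):

    # Print the number of unique colors, i.e. cluster labels,
    # in the clustering result (requires ImageMagick).
    identify -format '%k\n' stack_images/dc/clustered.png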

Preparation

After downloading and installing the required software tools (see the installation instructions), the mcccs.zip container can be downloaded from the GitHub releases and extracted to a local file system.
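
For example (the release URL is a placeholder; take the actual link from the GitHub releases page):

    # Download and extract the MCCCS container; replace <release-url>
    # with the mcccs.zip link from the GitHub releases.
    wget <release-url>/mcccs.zip
    unzip mcccs.zip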

Download of application examples

The application examples can be downloaded and prepared by executing prepare_datasets.sh in a terminal. The example data and the required libraries are downloaded automatically and arranged into the common folder structure for processing with the given example scripts. Please make sure that there is sufficient free space on the device in use.
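
A possible session (assuming prepare_datasets.sh lies in the extracted mcccs folder):

    cd mcccs
    df -h .                  # check the free space on the device first
    ./prepare_datasets.sh    # downloads the example data and libraries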

Running examples

The analysis can be started by navigating into the corresponding experiment folder and executing the process_ ... .sh script in a terminal (e.g. for segmentation_example_1_classification, execute process_segmentation_example_1_classification.sh in the experiment folder). The results, including a labeled result image and the corresponding numeric data, named all_ ... .csv, are stored in the corresponding sub-folders.
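
For segmentation example 1 this amounts to:

    cd segmentation_example_1_classification
    ./process_segmentation_example_1_classification.sh
    # afterwards, collect the numeric results
    find . -name 'all_*.csv'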

Analysis statistics

Table 1 lists the number of images used for training and prediction in each data set (A1, A2 and A3) of the segmentation examples. Table 2 gives an overview of the individual runtimes in seconds of the segmentation examples with different numbers of processor units in use. Table 3 shows the runtimes for the hyper-spectral examples. Table 4 gives an overview of the RAM and disk space consumption. All tests were performed on a machine equipped with an Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz and 32 GB RAM (OS: Ubuntu 16.04).

  • Table 1: Overview of the number of images (#) in the used data sets.

    Dataset   # Training   # Prediction   Image resolution (pixel)
    A1        7            128            500 x 530
    A2        8            31             530 x 565
    A3        9            27             2448 x 2048

  • Table 2: Overview of the runtimes of the segmentation examples in seconds s and minutes m; single = 1 CPU, half = 4 CPUs, multi = 8 CPU units in use for parallel job processing (including virtual CPUs). Because the single commands are already heavily parallelized, the parallel execution of jobs shows its biggest effect during model training.

    Dataset     Training s (m)   Prediction s (m)
    A1_single   498 (8.3)        1428 (23.8)
    A1_half     361 (6.0)        1597 (26.6)
    A1_multi    213 (3.6)        1396 (23.3)
    A2_single   547 (9.1)        319 (5.3)
    A2_half     332 (5.5)        353 (5.9)
    A2_multi    216 (3.6)        318 (5.3)
    A3_single   7370 (122.8)     3587 (59.8)
    A3_half     4370 (72.8)      3737 (62.3)
    A3_multi    2774 (46.2)      3591 (59.9)

  • Table 3: Runtimes for the hyper-spectral analysis.

    Dataset          Time
    classification   687 s
    clustering       102 s

  • Table 4: Overview of the RAM and disk space consumption.

    Dataset                      RAM (max)   Disk space (max)
    segmentation A1 training     xx          yy
    segmentation A1 prediction   xx          yy
    segmentation A2 training     xx          yy
    segmentation A2 prediction   xx          yy
    segmentation A3 training     xx          yy
    segmentation A3 prediction   xx          yy
    hyper_classification         xx          yy
    hyper_clustering             xx          yy

Customization and usage hints

Parallel job execution

Executing more jobs in parallel increases the RAM consumption; for less powerful systems it is recommended to decrease the job number. The following options are possible (see the sketch after the list):

  • s - a single job is started
  • h - the job number is half the number of available CPU units (automatic detection)
  • m - the job number equals the number of available CPU units (automatic detection)
  • 1 .. n - the given number specifies the number of jobs started in parallel
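
How the option is passed depends on the individual script; a sketch assuming it is given as the first argument:

    # Assumption: the job-count option is the first script argument.
    ./process_segmentation_example_1_classification.sh s    # single job
    ./process_segmentation_example_1_classification.sh m    # all CPU units
    ./process_segmentation_example_1_classification.sh 4    # four parallel jobs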

To learn more about the pipeline details, please have a look at the tutorials section.