MCCCS application examples
The MCCCS includes several analysis scripts for image processing and file conversion. These scripts are arranged in custom pipelines for different analysis tasks and are implemented using the GNU Bash shell, a powerful command language. To illustrate the whole concept, we provide several examples based on freely available datasets.
Data
- Three image sets (A1, A2, A3) from the Leaf Segmentation Challenge (LSC) 2014
- A hyperspectral example from the Purdue Research Foundation.
- Disease classification for detached barley leaves (in preparation, not published yet).
Segmentation example 1 - classification
This example shows an application for foreground/background segmentation of top-view plant images (Arabidopsis thaliana - A1, A2 and tobacco - A3) using a supervised Random Forest classifier. The three data sets are each split into a training set and a prediction set. After processing, the segmentation results, named foreground.png, are stored in each sub-folder, e.g. plant_003.
Hyper-spectral example 1 - classification
This example shows an application for a multi-labeled segmentation of an airborne hyperspectral image data set. Here, partly pre-classified ground-truth image masks are used to train a supervised Random Forest classifier. After processing, the segmentation result, named classified.png, is stored in the experiment sub-folder (stack_images → dc).
Hyper-spectral example 2 - clustering
This example shows an application for a multi-labeled segmentation of the same airborne hyperspectral image data set as used in the example before. Instead of using pre-classified ground-truth data to train a supervised classifier, a clustering approach is performed. After processing, the segmentation result, named clustered.png, is stored in the experiment sub-folder (stack_images → dc).
Preparation
After downloading and installing the required software tools (see the installation instructions), the mcccs.zip container can be downloaded from the GitHub releases and extracted to a local file system.
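For illustration, the extraction step could look as follows (a minimal sketch, assuming the archive was saved as mcccs.zip in the current directory):

```bash
# Hypothetical sketch: extract the release archive and enter the folder.
unzip mcccs.zip -d mcccs
cd mcccs
```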
Download of application examples
The application examples can be downloaded and prepared by executing prepare_datasets.sh in a terminal. The example data and required libraries are automatically downloaded and arranged in the common folder structure for processing with the provided example scripts. Please make sure that there is sufficient space left on the used device.
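A minimal sketch of this preparation step (the script name is given above; running it from the extracted mcccs folder is an assumption):

```bash
# Downloads the example data and required libraries into the
# common folder structure for the example scripts.
bash prepare_datasets.sh
```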
Running examples
The analysis can be started by navigating into the corresponding experiment folder and executing the process_ ... .sh script in a terminal (e.g. for segmentation_example_1_classification, execute process_segmentation_example_1_classification.sh in the experiment folder). The results, including a labeled result image and the corresponding numeric data, named all_ ... .csv, are stored in the corresponding sub-folders.
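For example, segmentation example 1 could be started as follows (folder and script names as given above; starting the script via bash is an assumption):

```bash
# Enter the experiment folder and start the processing pipeline.
cd segmentation_example_1_classification
bash process_segmentation_example_1_classification.sh

# Afterwards, the labeled result images and the numeric data
# (all_ ... .csv) are stored in the corresponding sub-folders.
```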
Analysis statistics
Table 1 includes the number of images for training and prediction for each data set (A1, A2 and A3) of the segmentation examples. Table 2 gives an overview of the individual runtimes in seconds for the segmentation examples with different numbers of processor units in use. Table 3 shows the runtimes for the hyperspectral examples. Table 4 gives an overview of the RAM and disk space consumption. All tests were performed on a machine equipped with an Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz and 32 GB RAM (OS: Ubuntu 16.04).
- Table 1: Overview of the number of images (#) in the used data sets.
Dataset | # Training | # Prediction | Image resolution (pixel) |
---|---|---|---|
A1 | 7 | 128 | 500 x 530 |
A2 | 8 | 31 | 530 x 565 |
A3 | 9 | 27 | 2448 x 2048 |
- Table 2: Overview of the runtimes of the segmentation examples in seconds (s) and minutes (m); single = 1 CPU, half = 4 CPUs, multi = 8 CPU units in use for parallel job processing (including virtual CPUs). Because the single commands are already heavily parallelized, the execution of parallel jobs shows the biggest effect during model training.
Dataset | Training s (m) | Prediction s (m)
---|---|---
A1_single | 498 (8.3) | 1428 (23.8)
A1_half | 361 (6.0) | 1597 (26.6) |
A1_multi | 213 (3.6) | 1396 (23.3) |
A2_single | 547 (9.1) | 319 (5.3) |
A2_half | 332 (5.5) | 353 (5.9) |
A2_multi | 216 (3.6) | 318 (5.3) |
A3_single | 7370 (122.8) | 3587 (59.8) |
A3_half | 4370 (72.8) | 3737 (62.3) |
A3_multi | 2774 (46.2) | 3591 (59.9) |
- Table 3: Runtimes for the hyperspectral analysis.
Example | Time |
---|---|
classification | 687 s |
clustering | 102 s |
- Table 4: Overview of the RAM and disk space consumption.
Task | RAM (max) | Disk space (max) |
---|---|---|
segmentation A1 training | xx | yy |
segmentation A1 prediction | xx | yy |
segmentation A2 training | xx | yy |
segmentation A2 prediction | xx | yy |
segmentation A3 training | xx | yy |
segmentation A3 prediction | xx | yy |
hyper_classification | xx | yy |
hyper_clustering | xx | yy |
Customization and usage hints
Parallel job execution
By executing more jobs in parallel, the RAM consumption will increase; for less powerful systems it is recommended to decrease the job number. The following options are possible (a sketch of how they could be resolved follows the list):
- s - a single job is started
- h - the job number is half the number of available CPU units (automatic detection)
- m - the job number equals the number of available CPU units (automatic detection)
- 1 .. n - the given number specifies the number of parallel jobs started
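The following minimal Bash sketch illustrates how such an option could be resolved into a concrete job count (an illustration only, not the actual MCCCS implementation):

```bash
#!/usr/bin/env bash
# Resolve the job option (s, h, m, or a plain number) into a job count.
cpus=$(nproc)                       # available CPU units (including virtual CPUs)
case "$1" in
  s) jobs=1 ;;                      # a single job
  h) jobs=$(( (cpus + 1) / 2 )) ;;  # half of the available CPU units
  m) jobs=$cpus ;;                  # all available CPU units
  *) jobs=$1 ;;                     # explicit number of parallel jobs (1 .. n)
esac
echo "Starting $jobs parallel job(s)"
```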
To learn more about the pipeline details, please have a look at the tutorials section.