RESEARCH ARTICLE


Combined Cluster Analysis and Principal Component Analysis to Reduce Data Complexity for Exhaust Air Purification



Bastian Ebeling 1,*, Cristiam Vargas 2, Simone Hubo 3
1 Blohm + Voss Naval GmbH, Department AME, Hermann-Blohm-Strasse 3, D-20457 Hamburg, Germany
2 Novartis Pharma Productions GmbH, Oeflinger Strasse 44, D-79664 Wehr, Germany
3 Helmut-Schmidt-University/University of the Federal Armed Forces Hamburg, Institute of Thermodynamics, Holstenhofweg 85, D-22043 Hamburg, Germany


© 2013 Ebeling et al.

open-access license: This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 International Public License (CC-BY 4.0), a copy of which is available at: https://creativecommons.org/licenses/by/4.0/legalcode. This license permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

* Address correspondence to this author at the Blohm + Voss Naval GmbH, Department AME, Hermann-Blohm-Strasse 3, D-20457 Hamburg, Germany; E-mail: bastian.ebeling@ingenieur.de and Helmut-Schmidt-University/University of the Federal Armed Forces Hamburg, Institute of Thermodynamics, Holstenhofweg 85, D-22043 Hamburg, Germany; E-mail: simone.hubo@hsu-hh.de


Abstract

Anthropogenic and demographic processes degrade air quality worldwide, which has drawn attention to exhaust air purification as a countermeasure. Because exhaust air contains a large number of substances and purification involves many operational parameters, a large amount of often high-dimensional data has to be analyzed. The goal is to reduce the complexity of these data while retaining the information that reflects the substances' characteristics.

A Cluster Analysis (CA) of data on 30 exhaust air compounds, each described by 11 indices covering structural characteristics and physicochemical properties, yielded 7 clusters. A Principal Component Analysis (PCA) identified 6 Principal Components (PCs), reducing the dimensionality from the original 11 indices. After projecting the information of the original data set onto these 6 PCs alone, re-clustering restored the same cluster structure as the original CA based on the 11 indices. This is a first proof of principle for successful re-clustering after dimensional reduction with the proposed combined CA-PCA method, and hence a step towards the possible development of an adsorption method that selectively removes malodorous or toxic components from exhaust air.
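
The combined CA-PCA workflow summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the data are synthetic placeholders (30 rows, 11 columns, 7 planted groups), and the z-score scaling, the Ward linkage, and the pairwise agreement measure are assumptions chosen only for illustration. The sketch clusters the full index matrix, projects it onto 6 PCs, re-clusters the scores, and compares the two partitions.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(0)

# Synthetic placeholder data, NOT the article's measurements:
# 30 "compounds" x 11 "indices" generated from 6 latent factors and 7 groups.
n_compounds, n_indices, n_groups, n_pcs = 30, 11, 7, 6
group_of = np.arange(n_compounds) % n_groups
centers = 100.0 * np.vstack([np.zeros(n_pcs), np.eye(n_pcs)])  # 7 x 6
latent = centers[group_of] + rng.normal(scale=0.5, size=(n_compounds, n_pcs))
mixing = rng.normal(size=(n_pcs, n_indices))
raw = latent @ mixing + rng.normal(scale=0.1, size=(n_compounds, n_indices))


def standardize(x):
    """Z-score each column so all indices share one scale (assumed choice)."""
    return (x - x.mean(axis=0)) / x.std(axis=0)


def ward_labels(x, k):
    """Hierarchical clustering (Ward linkage, assumed) cut into at most k clusters."""
    return fcluster(linkage(x, method="ward"), t=k, criterion="maxclust")


def pca_scores(x, k):
    """Project x onto its first k principal components via SVD."""
    centered = x - x.mean(axis=0)
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = float((s[:k] ** 2).sum() / (s ** 2).sum())
    return centered @ vt[:k].T, explained


def pair_agreement(a, b):
    """Fraction of compound pairs on which both partitions agree
    (same cluster in both, or different clusters in both)."""
    same_a = a[:, None] == a[None, :]
    same_b = b[:, None] == b[None, :]
    return float((same_a == same_b).mean())


X = standardize(raw)
labels_full = ward_labels(X, n_groups)            # CA on all 11 indices
scores, explained = pca_scores(X, n_pcs)          # PCA down to 6 PCs
labels_reduced = ward_labels(scores, n_groups)    # re-clustering on the PCs
agreement = pair_agreement(labels_full, labels_reduced)

print(f"variance retained by {n_pcs} PCs: {explained:.3f}")
print(f"pairwise agreement of the two partitions: {agreement:.3f}")
```

Comparing co-membership of compound pairs rather than raw label values avoids any dependence on how the clusters happen to be numbered in each run. With real data, the number of PCs to retain and the linkage method would have to follow the criteria used in the study itself.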

Keywords: Cluster Analysis, Dimensional Reduction, Exhaust Air Purification, Odour Control, Principal Component Analysis.