Journal Published Online: 15 January 2019
Volume 47, Issue 6

An Enhanced Hybrid Clustering Approach for Privacy Preservation (ECPS) in Big Data using Apache Spark Framework

CODEN: JTEVAB

Abstract

In big data mining, clustering is the organization of objects into groups according to a similarity metric. Among clustering algorithms, k-means is considered the most popular so far. With the advancement of parallel processing and distributed computation, traditional techniques are no longer as efficient in computing centroids during clustering. This inefficiency in turn affects privacy preservation when individuals’ sensitive information is processed. This article therefore proposes an enhanced hybrid clustering approach for privacy preservation (ECPS), designed both for big data and for protecting individual privacy against background knowledge attacks. The first component combines the ant colony optimization and firefly techniques to choose better centroid positions for the data. The second adds a differential privacy algorithm that uses the Laplace mechanism to add noise to individuals’ data, making privacy preservation more robust. As trends and technologies evolve, the amount of data being generated is increasing at an overwhelming rate, so the proposed approaches are designed to adapt to the changing needs of big data. Compared with existing clustering algorithms, the proposed algorithms are efficient and provide better performance while guaranteeing privacy. The proposed work is implemented on the Apache Spark big data framework.
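
As an illustration of the Laplace mechanism named in the abstract, the following is a minimal Python sketch and not the authors' implementation. The abstract does not state the sensitivity, the privacy budget, or how noise is applied inside Spark, so the function names `laplace_noise` and `privatize` and the parameters `sensitivity` and `epsilon` are assumptions made for this example. It relies only on the standard fact that the Laplace mechanism draws noise from a Laplace distribution with scale sensitivity / epsilon.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale).

    The difference of two independent exponential variables with
    rate 1/scale follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize(values, sensitivity, epsilon, rng=None):
    """Return a noisy copy of `values` (hypothetical helper).

    `sensitivity` and `epsilon` are assumed inputs; the abstract
    does not specify the values used in the article.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()
    scale = sensitivity / epsilon  # smaller epsilon -> more noise
    return [v + laplace_noise(scale, rng) for v in values]


if __name__ == "__main__":
    ages = [23.0, 35.0, 41.0, 58.0]  # made-up example values
    print(privatize(ages, sensitivity=1.0, epsilon=0.5))
```

A smaller `epsilon` gives a larger noise scale and therefore stronger privacy at the cost of accuracy, which is the trade-off any clustering step run on the perturbed data would have to absorb.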

Author Information

Swaminathan, Revathy
School of Computer Science and Engineering, VIT, Tamilnadu, India
Thangavelu, Arunkumar
School of Computer Science and Engineering, VIT, Tamilnadu, India
Pages: 13
Details
Stock #: JTE20180414
ISSN: 0090-3973
DOI: 10.1520/JTE20180414