A Mondrian-based Utility Optimization Model for Anonymization


CANBAY Y., SAĞIROĞLU Ş., VURAL Y.

2019 4th International Conference on Computer Science and Engineering (UBMK), Samsun, Turkey, 11 - 15 September 2019

  • Publication Type: Conference Paper / Full Text
  • Doi Number: 10.1109/ubmk.2019.8907117
  • City: Samsun, Turkey
  • Country: Turkey
  • Keywords: Anonymization, Mondrian, outliers, utility optimization
  • Gazi University Affiliated: Yes

Abstract

Anonymization is a privacy-preserving approach that protects the identities of data subjects. Besides privacy, anonymization must retain data utility, which is essential for data analytics. The accuracy of an analytics model built on anonymized data depends on the utility that anonymization provides. Several factors directly affect data utility, such as the privacy level, the anonymization operators, and the anonymization algorithm; recent studies have shown that outliers are another such factor. In the anonymization domain, outliers are regarded as "a group of data that decreases total data utility". In this paper, in order to maximize data utility, a new data utility model is proposed and applied to Mondrian for the first time. A general outlier-based utility evaluation function is also introduced, for the first time, to measure this utility. Experimental results show that the proposed model improves data utility and achieves higher utility than Mondrian. The proposed function may also serve as a reliable tool for outlier-based utility-aware anonymization models.
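For readers unfamiliar with the baseline, the following is a minimal sketch of plain Mondrian-style k-anonymous partitioning over numeric quasi-identifiers, paired with a common information-loss measure (normalized certainty penalty, NCP). It is not the utility optimization model or the outlier-based evaluation function proposed in the paper, which the abstract does not specify. The function names `mondrian` and `ncp`, the median-split rule, and the sample records are illustrative assumptions.

```python
from statistics import median


def global_spans(records):
    """Range of each attribute over the whole table (1 if the attribute is constant)."""
    dims = range(len(records[0]))
    return [max(r[d] for r in records) - min(r[d] for r in records) or 1 for d in dims]


def mondrian(records, k):
    """Recursively split numeric records so that every partition keeps >= k records."""
    dims = range(len(records[0]))
    spans = global_spans(records)

    def width(part, d):
        vals = [r[d] for r in part]
        return (max(vals) - min(vals)) / spans[d]

    def split(part):
        # Try attributes from the widest to the narrowest normalized range.
        for d in sorted(dims, key=lambda d: width(part, d), reverse=True):
            m = median(r[d] for r in part)
            left = [r for r in part if r[d] <= m]
            right = [r for r in part if r[d] > m]
            if len(left) >= k and len(right) >= k:
                return split(left) + split(right)
        return [part]  # no allowable cut: this partition becomes one equivalence class

    return split(list(records))


def ncp(partitions, records):
    """Average normalized certainty penalty: 0 means no generalization, 1 means full."""
    spans = global_spans(records)
    n_dims = len(spans)
    total = 0.0
    for part in partitions:
        for d in range(n_dims):
            vals = [r[d] for r in part]
            total += len(part) * (max(vals) - min(vals)) / spans[d]
    return total / (len(records) * n_dims)


# Hypothetical (age, income) records, used only for illustration.
records = [(25, 50000), (27, 52000), (30, 60000), (32, 61000),
           (45, 90000), (47, 95000), (50, 100000), (52, 98000)]
parts = mondrian(records, k=2)
loss = ncp(parts, records)
```

A lower NCP means the published ranges are tighter, so more utility is retained. Each partition would then be released with its attribute ranges in place of the exact values.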