T-closeness
t-closeness is a further refinement of l-diversity group based anonymization that is used to preserve privacy in data sets by reducing the granularity of a data representation. This reduction is a trade off that results in some loss of effectiveness of data management or data mining algorithms in order to gain some privacy. The t-closeness model extends the l-diversity model by treating the values of an attribute distinctly by taking into account the distribution of data values for that attribute.
Formal definition
Given the existence of data breaches where sensitive attributes may be inferred based upon the distribution of values for l-diverse data, the t-closeness method was created to further l-diversity by additionally maintaining the distribution of sensitive fields. The original paper by Ninghui Li, Tiancheng Li, and Suresh Venkatasubramanian defines t-closeness as:Charu Aggarwal and Philip S. Yu further state in their book on privacy-preserving data mining that with this definition, threshold t gives an upper bound on the difference between the distribution of the sensitive attribute values within an anonymized group as compared to the global distribution of values. They also state that for numeric attributes, using t-closeness anonymization is more effective than many other privacy-preserving data mining methods.