[PDF] k-Anonymity: A Model for Protecting Privacy | Semantic Scholar (2024)

@article{Sweeney2002kAnonymityAM,
  title   = {k-Anonymity: A Model for Protecting Privacy},
  author  = {Latanya Sweeney},
  journal = {Int. J. Uncertain. Fuzziness Knowl. Based Syst.},
  year    = {2002},
  volume  = {10},
  pages   = {557-570},
  url     = {https://api.semanticscholar.org/CorpusID:361794}
}
  • L. Sweeney
  • Published in Int. J. Uncertain. Fuzziness… 1 October 2002
  • Computer Science
  • Int. J. Uncertain. Fuzziness Knowl. Based Syst.

The solution provided in this paper includes a formal protection model named k-anonymity and a set of accompanying policies for deployment, and examines re-identification attacks that can be realized on releases that adhere to k-anonymity unless the accompanying policies are respected.

8,242 Citations

  • Highly Influential Citations: 612
  • Background Citations: 4,083
  • Methods Citations: 1,712
  • Results Citations: 33

Figures from this paper

  • figure 1
  • figure 2
  • figure 3
  • figure 4
  • figure 5

Topics

k-Anonymity, Data Holder, Re-identification Attacks, Field Structured Data, Re-identified, Datafly, Privacy Protection, Protecting Privacy, µ-Argus, Scientific Guarantees

8,242 Citations

Achieving k-Anonymity Privacy Protection Using Generalization and Suppression
    L. Sweeney

    Computer Science

    Int. J. Uncertain. Fuzziness Knowl. Based Syst.

  • 2002

This paper provides a formal presentation of combining generalization and suppression to achieve k-anonymity, and shows that Datafly can over-distort data and that µ-Argus can additionally fail to provide adequate protection.

Weak k-Anonymity: A Low-Distortion Model for Protecting Privacy
    Maurizio Atzori

    Computer Science

    ISC

  • 2006

This paper gives a weaker definition of k-anonymity, allowing lower distortion on the anonymized data, and shows that, under the hypothesis in which the adversary is not sure a priori about the presence of a person in the table, the privacy properties are respected also in the weak k-Anonymity framework.

  • 20
Anonymity: Formalisation of Privacy – k-anonymity
    Janosch Maier (advisor: Ralph Holz)

    Computer Science

  • 2013

It is shown how l-diversity and t-closeness provide a stronger level of anonymity than k-anonymity, and a value generalization hierarchy based on the attributes model, device, version, and network is provided.

  • 1
  • Highly Influenced
  • PDF
Approximation Algorithms for k-Anonymity
    Gagan Aggarwal, T. Feder, An Zhu

    Computer Science, Mathematics

  • 2005

It is shown that the k-anonymity problem is NP-hard even when the attribute values are ternary, and an O(k)-approximation algorithm for the problem is provided.

  • 282
Privacy-Preserving Distributed k-Anonymity
    Wei Jiang, Chris Clifton

    Computer Science

    DBSec

  • 2005

A key contribution is a proof that the protocol preserves k-anonymity between the sites, a fundamentally different distributed privacy definition than that of Secure Multiparty Computation, and it provides a better match with both ethical and legal views of privacy.

  • 94
  • PDF
k-anonymity: Risks and the Reality
    A. Basu, Toru Nakamura, Seira Hidano, S. Kiyomoto

    Computer Science

    2015 IEEE Trustcom/BigDataSE/ISPA

  • 2015

This work quantifies risk as the probability of re-identification and proposes a mechanism to compute the empirical risk with respect to the cost of acquiring knowledge about quasi-identifiers, using a real-world dataset released with a k-anonymity guarantee.

  • 15
k-Anonymous data collection
    Sheng Zhong, Zhiqiang Yang, Tingting Chen

    Computer Science

    Inf. Sci.

  • 2009
  • 5
  • PDF
k-Anonymity in Context of Digitally Signed CDA Documents
    Daniel Slamanig, Christian Stingl

    Computer Science, Medicine

    HEALTHINF

  • 2010

A novel approach based on generalized redactable signatures is proposed that realizes k-anonymity for sets of digitally signed records and allows any party to verify the original digital signatures for medical data, even though these data are modified in the process of achieving k-anonymity.

Extended K-Anonymity Model for Privacy Preserving on Micro Data
    Masoud Rahimi, M. Bateni, Hosein Mohammadinejad

    Computer Science, Mathematics

  • 2015

An algorithm is proposed that fully protects the propagated micro data against identity and attribute disclosure and significantly reduces the distortion ratio during the anonymity process.

  • 15
  • Highly Influenced
  • PDF
Privacy Issues for K-anonymity Model
    N. Maheshwarkar, K. Pathak, V. Chourey

    Computer Science

  • 2011

Some privacy issues of the k-anonymity model are discussed, and its integrity is examined under several approaches.

  • 14
  • PDF

28 References

Guaranteeing anonymity when sharing medical data, the Datafly System
    L. Sweeney

    Computer Science, Medicine

    AMIA

  • 1997

We present a computer program named Datafly that maintains anonymity in medical data by automatically generalizing, substituting, and removing information as appropriate without losing many of the

  • 298
  • PDF
Enhancing Access to Microdata while Protecting Confidentiality: Prospects for the Future
    G. Duncan, R. Pearson

    Computer Science, Political Science

  • 1991

This article presents a scenario for the future of research access to federally collected microdata, as it relates to improvements in database techniques, computer and analytical methodologies, and legal and administrative arrangements for access to and protection of federal statistics.

  • 179
  • PDF
Cryptography and Data Security
    D. Denning

    Computer Science, Mathematics

  • 1982

The goal of this book is to introduce the mathematical principles of data security and to show how these principles apply to operating systems, database systems, and computer networks.

  • 1,951
Towards the optimal suppression of details when disclosing medical data, the use of sub-combination analysis
    L. Sweeney

    Medicine, Computer Science

  • 1998

This work presents a new computational technique based on stepwise consideration of all sub-combinations of sensitive fields that can be used within the Datafly or µ-Argus architectures to help achieve optimal disclosure, and shows that doing so provides more specific data than Datafly would normally release and improves the confidentiality of results from µ-Argus.

  • 12
  • PDF
The tracker: a threat to statistical database security
    D. Denning, P. Denning, M. Schwartz

    Computer Science

    TODS

  • 1979

It is shown that the compromise of small query sets can in fact almost always be accomplished with the help of characteristic formulas called trackers, and security is not guaranteed by the lack of a general tracker.

  • 217
  • PDF
Microdata disclosure limitation in statistical databases: query size and random sample query control
    G. Duncan, Sumitra Mukherjee

    Computer Science

    Proceedings. 1991 IEEE Computer Society Symposium…

  • 1991

A probabilistic framework is used to assess the strengths and weaknesses of two existing disclosure control mechanisms and an alternative scheme combining query set size restriction and random sample query control results in a significant decrease in the risk of disclosure.

  • 32
On the Question of Statistical Confidentiality
    I. Fellegi

    Computer Science

  • 1972

Abstract In Section 1 the nature of statistical confidentiality is explored, i.e., its essential role in the collection of data by statistical offices, its relationship to privacy and the need for

  • 210
Detection and elimination of inference channels in multilevel relational database systems
    Xiaolei Qian, M. Stickel, P. Karp, T. Lunt, T. Garvey

    Computer Science

    Proceedings 1993 IEEE Computer Society Symposium…

  • 1993

A global optimization approach to upgrading is suggested to block a set of inference channels; it allows upgrade costs to be considered and supports security categories as well as levels.

  • 69
Aggregation and inference: facts and fallacies
    T. Lunt

    Computer Science

    Proceedings. 1989 IEEE Symposium on Security and…

  • 1989

It is shown that sensitive associations among entities of different types are best treated by representing the sensitive association separately and classifying the individual entities low and the relationship high, and the suggested approaches allow the mandatory reference monitor to protect the sensitive associations.

  • 91
  • PDF
A Multilevel Relational Data Model
    D. Denning, T. Lunt, R. Schell, M. Heckman, W. Shockley

    Computer Science

    1987 IEEE Symposium on Security and Privacy

  • 1987

The model is defined in terms of the standard relational model, but lends itself to a design and implementation that offers a high level of assurance for mandatory security.

  • 201

    FAQs

    What is the k-anonymity approach?

    To use k-anonymity to process a dataset so that it can be released with privacy protection, a data scientist must first examine the dataset and decide whether each attribute (column) is an identifier (identifying), a non-identifier (not-identifying), or a quasi-identifier (somewhat identifying).

    What are the k-anonymity standards?

    A dataset is considered k-anonymous when, for every combination of identifying attributes in the dataset, there are at least k − 1 other people with the same attributes. In other words, the data is not unique to a certain individual and therefore can't be used to identify them.
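    As a minimal sketch of that definition (a toy list-of-dicts table with illustrative, made-up column names; a real release would first need the attribute classification described above), the k value is the size of the smallest equivalence class over the quasi-identifiers:

```python
from collections import Counter

def k_anonymity(rows, quasi_ids):
    """Return the k of `rows`: the size of the smallest
    equivalence class over the quasi-identifier columns."""
    classes = Counter(tuple(row[q] for q in quasi_ids) for row in rows)
    return min(classes.values())

table = [
    {"zip": "021*", "age": "30-39", "condition": "flu"},
    {"zip": "021*", "age": "30-39", "condition": "cold"},
    {"zip": "021*", "age": "30-39", "condition": "flu"},
    {"zip": "148*", "age": "20-29", "condition": "asthma"},
    {"zip": "148*", "age": "20-29", "condition": "flu"},
]

print(k_anonymity(table, ["zip", "age"]))  # → 2
```

    Every ("zip", "age") combination here is shared by at least two records, so the release is 2-anonymous with respect to those quasi-identifiers.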

    How does k-anonymity help to protect privacy in micro data sets?

    k-anonymity ensures that no single person's information can be distinguished from that of at least k − 1 other people in the same dataset. In other words, for any given record, there are at least k − 1 other records in the dataset with identical values for all quasi-identifying attributes.

    What is the difference between k-anonymity and differential privacy?

    In the literature, k-anonymity and differential privacy have been viewed as very different privacy guarantees: k-anonymity is syntactic and weak, while differential privacy is algorithmic and provides semantic privacy guarantees.

    What are the three types of anonymity?

    In an online context, we must consider three types of anonymity: sender anonymity, recipient anonymity, and unlinkability of sender and recipient. The GDPR defines anonymous data as data that “does not relate to an identified or identifiable natural person”.

    What are the advantages of k-anonymity?

    Therefore, by enforcing that data sets of a sensitive nature (such as medical or financial data sets) achieve k-anonymity with a high value of k, one can minimize the risk that an adversary will be able to uncover the identity of the person whose data is represented by a particular row.

    How to find k-anonymity?

    A dataset is k-anonymous if quasi-identifiers for each person in the dataset are identical to at least k – 1 other people also in the dataset. You can compute the k-anonymity value based on one or more columns, or fields, of a dataset.
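    To illustrate computing k over one or more columns (a sketch using pandas with made-up data), note that adding columns to the quasi-identifier set can only lower k:

```python
import pandas as pd

df = pd.DataFrame({
    "zip": ["02138", "02138", "02139", "02139"],
    "age": [33, 35, 33, 35],
    "sex": ["F", "F", "M", "F"],
})

def k_of(df, cols):
    # k is the smallest group size over the chosen columns
    return int(df.groupby(cols).size().min())

print(k_of(df, ["zip"]))         # → 2
print(k_of(df, ["zip", "age"]))  # → 1
```

    With "zip" alone the data is 2-anonymous, but the ("zip", "age") pairs are all unique, so k drops to 1.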

    What is the difference between k-anonymity and l-diversity?

    l-diversity addresses inherent deficiencies of the k-anonymity model – notably the possibility of a homogeneity attack or a background-knowledge attack – by introducing further entropy (or diversity) into a dataset. The result is a significantly reduced risk of re-identification of anonymized data.
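    The homogeneity attack can be shown with a small sketch (hypothetical data; "distinct l-diversity", the simplest of the l-diversity variants, counts distinct sensitive values per equivalence class):

```python
from collections import defaultdict

def l_diversity(rows, quasi_ids, sensitive):
    """Distinct l-diversity: the minimum number of distinct sensitive
    values within any quasi-identifier equivalence class."""
    groups = defaultdict(set)
    for row in rows:
        groups[tuple(row[q] for q in quasi_ids)].add(row[sensitive])
    return min(len(vals) for vals in groups.values())

# 2-anonymous, but the first class is homogeneous: knowing that
# someone falls in it reveals their condition outright.
table = [
    {"zip": "130*", "condition": "cancer"},
    {"zip": "130*", "condition": "cancer"},
    {"zip": "148*", "condition": "flu"},
    {"zip": "148*", "condition": "cancer"},
]

print(l_diversity(table, ["zip"], "condition"))  # → 1
```

    Requiring l ≥ 2 would force every equivalence class to contain at least two distinct conditions, ruling out this attack.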

    What are the ethical principles of anonymity?

    Anonymity portrays the researcher as a steward of knowledge and thus elevates the researcher to a position of power, with no collaborative partnership or sharing of control. The participants are disenfranchised. It is paternalistic: it negates participants' autonomy and their right to make choices.

    Is anonymity the best solution to privacy?

    Online anonymity is often less helpful on a day-to-day basis and should be prioritized on a case-by-case basis. It's best to be anonymous anytime you're doing something you maybe shouldn't be doing or something you wouldn't want to be traced back to you or your profiles.

    What problem does data anonymization protect against?

    Data anonymization primarily protects against data breaches and re-identification: by removing or hiding sensitive and/or easily identifying details of personal information, such as names, addresses, and social security numbers, it significantly reduces the risks associated with a breach.

    What is an example of anonymization?

    One example of anonymized data is a dataset that has been stripped of any personally identifiable information such as names, addresses, and phone numbers. This type of data can be used to analyze trends and patterns without the risk of exposing any individual's personal information.

    What is the problem with k-anonymity?

    Information loss: In order to make individuals indistinguishable from one another, k-anonymity often requires the suppression or generalization of data. This can result in a significant loss of information and may make the data set less useful for certain types of analysis.
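    A toy example of this trade-off (hypothetical ZIP codes and a simple suffix-masking generalization hierarchy):

```python
from collections import Counter

def generalize_zip(z, level):
    # Replace the last `level` digits with '*': 02139 -> 0213* -> 021** ...
    return z if level == 0 else z[:-level] + "*" * level

zips = ["02138", "02139", "02141", "94305", "94306"]

for level in range(4):
    gen = [generalize_zip(z, level) for z in zips]
    k = min(Counter(gen).values())
    print(f"level {level}: k={k}")  # k stays 1 until two digits are masked, then 2
```

    k rises only once enough digits are suppressed, and every extra masked digit discards geographic precision, which is exactly the information loss described above.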

    What is the biggest difference between anonymity and confidentiality?

    Anonymity means you don't know who the participants are, while confidentiality means you know who they are but remove identifying information from your research report. Both are important ethical considerations.

    What is anonymity privacy example?

    Anonymity – Keeping your identity private, but not your actions. For example, using a pseudonym to post messages to a social media platform.

    What is k-level anonymity?

    K-anonymity is a property of a dataset that indicates the re-identifiability of its records. A dataset is k-anonymous if quasi-identifiers for each person in the dataset are identical to at least k – 1 other people also in the dataset.

    What is k-degree anonymity?

    k-degree anonymity ensures that each node in a graph has the same degree as at least k − 1 other nodes, which limits re-identification by node degree. A typical algorithm first generates a k-degree-anonymous degree sequence, then finds a graph close to the original that realizes the new degree sequence.
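    Under that definition (every degree value shared by at least k nodes), a degree sequence can be checked directly; a small sketch with a made-up sequence:

```python
from collections import Counter

def is_k_degree_anonymous(degrees, k):
    """True if every degree value in the sequence is shared by at least k nodes."""
    return all(count >= k for count in Counter(degrees).values())

# degree sequence of a hypothetical 7-node graph
degrees = [3, 3, 2, 2, 2, 1, 1]

print(is_k_degree_anonymous(degrees, 2))  # → True
print(is_k_degree_anonymous(degrees, 3))  # → False (degree 3 occurs only twice)
```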

    What is the k-means clustering technique?

    k-means clustering is a method of vector quantization, originally from signal processing, that aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean (cluster centers or cluster centroid), serving as a prototype of the cluster.
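    A minimal 1-D sketch of Lloyd's algorithm, the classic iterative method behind k-means (illustrative points; real use would rely on a library implementation such as scikit-learn):

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Lloyd's algorithm on 1-D points: alternate assignment and mean update."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # assign each point to its nearest center
            clusters[min(range(k), key=lambda i: abs(p - centers[i]))].append(p)
        # move each center to the mean of its cluster (keep it if the cluster is empty)
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return sorted(centers)

points = [1.0, 1.2, 0.8, 10.0, 10.5, 9.5]
print(kmeans(points, 2))  # → approximately [1.0, 10.0]
```

    Note that the k here counts clusters and is unrelated to the k of k-anonymity.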
