What Is PII Data and How Can You Protect It?

Gil Dabah

CEO & Co-founder

August 9, 2022

Data has become an inseparable part of day-to-day functioning. Organizations across industries collect users' personal information for many reasons: to provide services, for security and verification, to sell it, or to understand how their consumers behave online and improve customer experience. Many refer to data as the new oil, and PII is the ultimate jackpot, which is why data breaches have become increasingly common. According to Security Magazine, data breaches rose by 10% in 2021 alone, and the three most targeted forms of data were names, social security numbers, and addresses - all forms of PII.

Understanding PII Terminology

PII stands for Personally Identifiable Information. To understand this term, we need to focus on the word “identifiable” and understand how data can identify a person. For the sake of simplicity, let’s focus on two ways it can be done:

  1. The first way is a direct (also called linked) reference, in which the data provides an unmistakable link to an individual’s identity—for example, full address, SSN, passport information, photo ID, and biometric records.
  2. The second way is an indirect (also called linkable) reference, in which a single piece of information by itself doesn’t provide a link to an identity, but several pieces of information combined will provide an unmistakable link to an individual identity. Example: A study has found that date of birth, gender, and zip code can be combined to identify over 60% of US citizens.
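To make the linkable case concrete, here is a minimal sketch that measures how many records in a dataset become unique once several indirect identifiers are combined. The records, field names, and the `unique_fraction` helper are all hypothetical and exist only for illustration; they are not taken from the study mentioned above.

```python
from collections import Counter

# Hypothetical toy records: no field here is a direct (linked) identifier.
records = [
    {"dob": "1985-03-12", "gender": "F", "zip": "94110"},
    {"dob": "1985-03-12", "gender": "F", "zip": "94110"},
    {"dob": "1990-07-01", "gender": "M", "zip": "94110"},
    {"dob": "1990-07-01", "gender": "F", "zip": "10001"},
    {"dob": "1972-11-30", "gender": "M", "zip": "10001"},
]


def unique_fraction(rows, fields):
    """Return the fraction of rows whose combination of `fields` appears exactly once."""
    if not rows:
        return 0.0
    keys = [tuple(row[f] for f in fields) for row in rows]
    counts = Counter(keys)
    return sum(1 for key in keys if counts[key] == 1) / len(rows)


print(unique_fraction(records, ["gender"]))                # 0.0
print(unique_fraction(records, ["zip"]))                   # 0.0
print(unique_fraction(records, ["dob", "gender", "zip"]))  # 0.6
```

In this toy dataset no single field singles anyone out, yet the three fields together make three of the five records unique, which is the effect that makes linkable data personal data.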

As privacy is a domain that evolves rapidly, so does the definition of PII. As such, it is important to keep track of definition changes and court rulings that provide further guidance. PII is a broad domain that can be broken into many subdomains, each with its specific regulations. For example, protected health information (PHI) is the demographic information, medical histories, tests and laboratory results, mental health conditions, insurance information, and other data that a healthcare professional collects to identify an individual and determine appropriate care.

The Health Insurance Portability and Accountability Act (HIPAA) of 1996 is the primary law that oversees the use of, access to, and disclosure of PHI in the United States. Diving further into the PII subdomains can help understand the complexity of this domain, the number of regulations, and the knowledge that is required to answer what seems to be a simple question: “What is PII?” Each of the sensitive data types mentioned above has a lot of monetary value to criminals, as it can be resold or exploited in many ways, such as for insurance fraud, identity theft, and impersonation, making it an attractive target for hackers. In this article, we’ll focus on the broader definition of PII.

Why Collecting Data Is Necessary – And Dangerous

Data is often referred to as the new oil or the new gold. We believe that data is power - the power to better understand your customers’ needs and habits, the power to build better products and offer better personalization, or the power to build more accurate ML/AI models. Regardless of how we perceive data, we can all agree that it has changed the world and will continue to do so for years to come. From businesses to government bodies, many organizations must collect and store user data for various reasons.

While the increase in data collection has added convenience to our day-to-day lives and how we run our businesses, it has also created a new valuable commodity that motivates cybercrime. Over the past decade, cyberattacks have nearly doubled. Today, there is an industry consensus that data breaches are inevitable. As such, companies must prepare accordingly and protect their most vulnerable and sensitive assets. Verizon's 2022 Data Breach Investigations Report (DBIR) reveals that 71% of breaches in large organizations are motivated by financial or personal gain, and this statistic rises to 96% for organizations of all sizes.

Personally Identifiable Information is an especially valuable commodity; a single piece of linked PII can be enough to steal a user's identity, commit financial fraud, or blackmail a user with personal information. Despite its sensitivity, PII is also among the data that organizations most need to collect. Details such as SSNs are used to verify users' identities when they open bank accounts, apply for a loan, or undergo background checks.

Privacy Regulations and PII

The rising concerns over the necessity of storing private data while facing growing threats have led to the development of regulations and regulatory guidelines designed to protect consumer privacy. Over 130 countries have legislation in place to regulate consumers' privacy and safeguard their PII. The best known of these regulations is Europe's GDPR (General Data Protection Regulation), which is widely considered the gold standard. While there are many commonalities in how PII is defined and enforced across privacy regulations, there are also differences. As such, understanding which privacy regulations apply to you is crucial. Examples of differences:

  • The GDPR definition applies to an “identifiable natural person”, whereas the California Consumer Privacy Act (CCPA) definition extends to information associated “..with a particular consumer or household.”
  • PIPL (the Chinese Personal Information Protection Law) refers to the impact on an individual’s dignity as part of the PII definition.
  • In India, the long-awaited report for recommendations to the Personal Data Protection Bill, 2019 argues that it is impossible to distinguish between personal and non-personal information in mass data collection or transport. Because of this, the recommendations also have clauses applicable to non-personal information.

GDPR and Personally Identifiable Information

The European Union adopted the GDPR in 2016, and it has been enforceable since May 2018. It is still considered one of the gold standards for regulatory compliance. The regulation places strict rules and requirements on how companies that operate in the EU or handle the data of people in the EU need to manage their PII. This includes delineating the precautions companies must take to ensure their users' data remains safe from hackers. In addition, the regulation includes the right to be forgotten (RTBF), which requires companies to delete an individual's data upon request, and the data subject access request (DSAR), which gives individuals the right to know what personal information the organization is collecting and storing, and how it’s being used. The GDPR protects a broad spectrum of data, from basic PII such as name, address, and ID numbers (like SSN) to web data such as IP addresses, cookie data, and user location. Failure to comply with the GDPR can result in harsh fines and legal action.

NIST PII Standards

Although the US doesn't have an all-encompassing standard like the GDPR (though there are state-specific regulations like the CCPA in California), NIST (the National Institute of Standards and Technology) has published a Guide to Protecting the Confidentiality of PII (SP 800-122) that can serve as a guideline for PII security. Many federal agencies use the guide, and its definition of PII includes:

  • Names (full name, maiden name, mother's maiden name, or alias)
  • Address information, both physical and online (such as email address)
  • Personally identifiable numbers, including SSN, passport number, or credit card number
  • Physical characteristics, including images (particularly of the face or other identifiable features), fingerprints, and other biometric data
  • Individual information that can be used with other information to identify an individual (such as race, religion, weight, D.O.B., place of birth, zip code, or the like)

While NIST guidance is not legally binding in the way the GDPR is, failing to follow it can result in reputational damage and legal action from parties injured by that failure.

Which PII Should I Protect?

Personally Identifiable Information can be divided into two categories:

  • Linked PII - Information that can be directly linked to an individual, such as:
      • Social security number
      • National ID
      • Full name
      • Phone number
      • Driver's license number
      • Passport information
      • Email address
      • Medical records or financial information
      • Bank account numbers and payment information

This is far from the complete list, and many companies require at least some if not all of the above information.

  • Linkable PII - Includes indirect identifiers that alone cannot identify a person, but a combination can. Some examples include:
      • Zip code
      • Race
      • Gender
      • D.O.B.
      • Place of birth
      • Religion

While it's important to know the difference between the two types of identifiers (linked and linkable), both represent personal data. Privacy regulations apply to them equally, so both must be protected. With the vast amounts of data passing through our systems today, it's essential to discover and distinguish the most sensitive data, the data with the most potential to cause damage if intercepted. If your organization wants to improve its protection methods for the PII it collects, you’ll need to prioritize your work and start with linked PII before linkable PII, regardless of the solution you use.
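One way to approach that prioritization is to tag each field in a schema with its category and sort the work list accordingly. The sketch below is illustrative only: the field names, the two sets (which simply mirror the example lists above and are not a complete taxonomy), and the choice to handle direct identifiers first are all assumptions.

```python
# Hypothetical field names that mirror the example lists in this article.
# These sets are illustrative, not a complete or authoritative taxonomy.
LINKED = {
    "ssn", "national_id", "full_name", "phone", "drivers_license",
    "passport", "email", "medical_record", "bank_account",
}
LINKABLE = {
    "zip_code", "race", "gender", "date_of_birth", "place_of_birth", "religion",
}


def classify(field):
    """Return 'linked', 'linkable', or 'other' for a field name."""
    if field in LINKED:
        return "linked"
    if field in LINKABLE:
        return "linkable"
    return "other"


def protection_order(fields):
    """Sort fields so direct identifiers come first, then indirect ones, then the rest."""
    rank = {"linked": 0, "linkable": 1, "other": 2}
    return sorted(fields, key=lambda f: (rank[classify(f)], f))


schema = ["zip_code", "email", "favorite_color", "ssn", "date_of_birth"]
print(protection_order(schema))
# ['email', 'ssn', 'date_of_birth', 'zip_code', 'favorite_color']
```

A real inventory would come from data discovery rather than a hard-coded set, but even a rough tagging like this gives a starting order for the work.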

How To Protect PII

While adhering to regulatory guidelines is a pivotal aspect of safeguarding personally identifiable information (PII), it alone may not suffice for comprehensive data protection. To fortify your defense against data breaches and privacy violations, consider implementing additional PII protection methods that go beyond mere compliance. These strategies ensure that your customers' sensitive data remains shielded from potential exposure.

  • Data Minimization: The principle of data minimization centers on collecting and storing the bare minimum of PII necessary for your business processes. By only retaining essential information, you reduce the exposure risk. This practice aligns with privacy regulations that encourage minimal data handling.
  • Tokenization: Tokenization replaces sensitive PII with random tokens that cannot be mathematically reversed to recover the original value. These tokens serve as placeholders for the actual data, so even if the tokenized information is compromised, the original PII remains protected because it is stored elsewhere. Tokenization is an effective method to prevent sensitive data from falling into the wrong hands.
  • Pseudonymization: Pseudonymization replaces identifying elements of PII with pseudonyms, using a transformation that can be reversed only with additional information, such as a key or mapping, that is kept separately. This allows for safe data analysis while preserving privacy. By pseudonymizing PII, you can maintain the utility of the data for legitimate purposes without exposing individuals' identities.
  • Encryption: Data encryption is a fundamental aspect of data security. Encrypt all stored and transmitted PII to ensure that even if unauthorized access occurs, the information remains indecipherable without the appropriate encryption keys. Strong encryption algorithms add an extra layer of protection, safeguarding data from prying eyes.
  • Data Masking: Data masking conceals specific portions of PII, such as social security numbers, credit card details, or email addresses, while preserving data integrity for non-sensitive fields. This method is especially valuable for maintaining data utility in test environments and other scenarios where real PII isn't required. It also helps when end users already know the full value, such as their own national ID, and only need to see the last few digits to confirm it.
  • PII Vault: A PII vault is designed to securely store and manage sensitive information. These solutions offer enhanced security features, access controls, and auditing capabilities, as well as the protection methods mentioned above. A PII vault provides the first line of defense for data protection and is particularly beneficial for organizations handling sensitive data in the financial and healthcare industries, and it is gaining momentum as a best practice for any industry.
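Three of the methods above (data minimization, masking, and tokenization) can be sketched in a few lines. This is a toy illustration under stated assumptions: `minimize`, `mask`, and `ToyTokenVault` are hypothetical names, the vault is just an in-memory dictionary, and none of this represents the API of Piiano Vault or any other product. A real vault would also encrypt, persist, control access to, and audit the stored values.

```python
import secrets


def minimize(record, needed):
    """Data minimization: keep only the fields a process actually needs."""
    return {key: value for key, value in record.items() if key in needed}


def mask(value, visible=4, fill="*"):
    """Data masking: hide everything except the last `visible` characters."""
    hidden = min(max(len(value) - visible, 0), len(value))
    return fill * hidden + value[hidden:]


class ToyTokenVault:
    """In-memory stand-in for a vault (hypothetical, for illustration only)."""

    def __init__(self):
        self._token_to_value = {}

    def tokenize(self, value):
        # The token is random, so it cannot be computed back from the value.
        token = "tok_" + secrets.token_urlsafe(16)
        self._token_to_value[token] = value
        return token

    def detokenize(self, token):
        # Only a caller with access to the vault can recover the original.
        return self._token_to_value[token]


user = {"name": "Jane Doe", "ssn": "123-45-6789", "favorite_color": "green"}
vault = ToyTokenVault()

stored = minimize(user, {"name", "ssn"})       # drop what is not needed
stored["ssn"] = vault.tokenize(stored["ssn"])  # store a token, not the SSN
print(stored)

print(mask(vault.detokenize(stored["ssn"])))   # *******6789
```

The application database holds only the token, so a leak of `stored` exposes no SSN; the masked form is what an end user would see when confirming their own number.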

By integrating these comprehensive PII protection methods alongside regulatory compliance, you create a robust defense mechanism that significantly reduces the risks associated with data breaches and privacy violations. These measures not only enhance data security but also inspire trust and confidence in your customers, demonstrating your commitment to safeguarding their sensitive information. And above all, they actually make an impact.

PII Protection Remains a Priority

Data protection, in general, is critical, but investing an equal amount of security resources into each piece of data wastes both money and time. Data security should be prioritized by the most at-risk information, and that’s PII. As data breaches have become nearly inevitable and the black market for stolen sensitive data continues to grow, protecting PII will likely remain a hot-button issue for the foreseeable future. Focusing your protection measures on PII allows you to keep up with growing cyber threats and keep your users' data private.

To Summarize

Protecting PII can be a tedious job, especially if you have to implement all the security mechanisms yourself to make sure the data is safe. Still, investing in protecting PII at the field and record level pays off, because it significantly reduces the impact of the next breach. That holds at least under the threat model that is appropriate for hardening systems and taking security seriously: assume breach. With Piiano Vault you can start right away, consuming its APIs either through our cloud-hosted solution or deployed in your application backend. You can jumpstart your journey by creating a new Vault account and working with it right away.

About the author

Gil Dabah

CEO & Co-founder

Gil is a software ninja who loves both building software (companies too) and breaking code. He is renowned for his prowess in security research, including notable exploits of the Microsoft Windows kernel that have earned him unusually high bounty awards. He has written a couple of very successful open source libraries, and he likes to speak publicly at conferences.
