Gil Dabah
December 4, 2022
The short answer is - because they can’t. But let’s examine what’s going on and why that’s the case. We’re actually seeing more and more of a shift left (more on this below) toward developers. That is, privacy and security business requirements come all the way down to the developers of the products that companies build.
The bottom line is that security organizations are too small and limited in R&D power. And given that software should (must, considering today’s standards and threats) be built with security and privacy by design, following best practices - developers have to take part in it.
Shifting left means doing something from the very beginning of a process. For example, when you develop software, you’d normally start by writing a product requirements document and then designing the architecture. In that process, you want to take security and privacy requirements into consideration as early as possible. That’s also called ‘by design.’
Basically, if you think of a Gantt chart, the left side is the beginning of the project on the timeline axis. Shifting left also means, in our case, that developers are part of the security and privacy efforts and take responsibility for them too.
For homegrown applications in production environments, there is nobody who can take care of the security and privacy requirements but the developers. If companies wish to protect sensitive data and mitigate breaches, only the engineers building the product can embed security and privacy measures. Security organizations may have engineers too, but normally they aren’t the ones building the product.
It really depends on the business use case, and it varies between companies and industries. Generally, when we focus on security and privacy, PII is a central part of sensitive data. Payment and health information is usually important as well. Sometimes, the mere link between a person and an organization can be sensitive in itself.
For example, the fact that a person has signed up for a diabetes newsletter should be classified as sensitive or confidential, and the reason for a business trip can be confidential information and thus classified as sensitive too.
Data requirements from GDPR might be relevant as well, like retention policies that define when to delete such sensitive data, and honoring a user’s consent preferences (or lack thereof) regarding processing or sharing their data.
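To make the retention part concrete, here is a minimal sketch of what a scheduled purge job could look like. The table and column names (user_events, created_at, consent_given) are hypothetical, and a real job would run against your actual datastore:

```python
from datetime import datetime, timedelta, timezone
import sqlite3

# Hypothetical retention job - table and column names are made up for
# illustration. A real job would run on a schedule in production.
RETENTION = timedelta(days=90)

def purge_expired(conn: sqlite3.Connection) -> None:
    cutoff = datetime.now(timezone.utc) - RETENTION
    # Delete rows past the retention window, and rows whose owner
    # withdrew consent for processing.
    conn.execute(
        "DELETE FROM user_events WHERE created_at < ? OR consent_given = 0",
        (cutoff.isoformat(),),
    )
    conn.commit()
```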
We wrote an entire article about this, with the example of working with a particularly sensitive piece of data - social security numbers - check it out here. But just to name a few techniques: granular access controls, native data masking (at the datastore level), field-level encryption, auditing of accesses, omitting PII from data warehouses and from HTTP GET params (which might get logged where you don’t want it), making sure you don’t log PII in plaintext inside your application, and more.
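As one illustration of that last point, here is a minimal Python sketch of a logging filter that masks SSN-shaped strings before they reach any log handler. The pattern and mask are assumptions - extend them to whatever identifiers your application actually handles:

```python
import logging
import re

# Assumed SSN-style pattern; adapt to your own identifiers.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

class PiiRedactingFilter(logging.Filter):
    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()  # folds in any % args
        record.msg = SSN_PATTERN.sub("***-**-****", message)
        record.args = None             # args were already applied above
        return True                    # keep the (now masked) record

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("app")
logger.addFilter(PiiRedactingFilter())
logger.info("payment failed for ssn 123-45-6789")  # SSN is masked in output
```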
Tokenizing sensitive data can be critical in the right situations and be a real business enabler. But first, let’s understand the value of doing it. It allows you to work on data without revealing the person behind it, while still being able, from time to time, to identify a specific person. Suppose you need to manually analyze data for fraud detection without revealing who the data belongs to - so no person can be identified by examining the data alone, thereby preserving the individual’s privacy rights.
However, sometimes you would still need to contact a specific person from that (tokenized) data. Tokenizing the identifiers before they go to the data warehouse is a great way to make sure data scientists and analysts, or anybody else, can’t see to whom the data belongs, while still being able to reverse it in specific situations - like finding out who a person is in order to contact them.
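Here’s a minimal, in-memory sketch of that idea, purely for illustration. In a real system the mapping would live in a hardened, access-controlled vault, and detokenization would be a privileged, audited operation:

```python
import secrets

class TokenVault:
    def __init__(self) -> None:
        self._token_to_value: dict[str, str] = {}
        self._value_to_token: dict[str, str] = {}

    def tokenize(self, value: str) -> str:
        # Random tokens carry no information about the original value,
        # so unlike hashes they can't be brute-forced offline.
        if value not in self._value_to_token:
            token = "tok_" + secrets.token_hex(16)
            self._value_to_token[value] = token
            self._token_to_value[token] = value
        return self._value_to_token[value]

    def detokenize(self, token: str) -> str:
        # Reversal should be a restricted, audited code path.
        return self._token_to_value[token]

vault = TokenVault()
token = vault.tokenize("alice@example.com")  # safe to ship to the warehouse
print(vault.detokenize(token))               # restricted path: contact the user
```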
We’re talking about pseudonymizing data, so once it’s done, the data on its own can’t identify the data subjects (the persons behind it). For example, sometimes you want to share a big chunk of banking or health data with another company.
But this data surely (in this scenario) contains PII, and exposing the persons behind it might constitute a privacy violation (depending on the privacy policy, their consent for how the data may be used, etc.). Given that you want to share it anyway - and want it to stay safe even if it goes out of control (employees copying it over Slack or email, for example) - removing these identifiers (PII) effectively anonymizes it, as long as nothing can link back to the person the data represents or belongs to.
Therefore, tokenizing or omitting some data can become a business enabler. For example, many companies allow data scientists to access data warehouses that also contain sensitive information such as PII, so the people behind the data can potentially be recognized - effectively, their privacy isn’t preserved.
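Putting the pieces together, here is a sketch of pseudonymizing a record before it leaves your control, reusing the TokenVault from the sketch above. The field names are made up: drop what you’ll never need, tokenize what you might need to reverse:

```python
# Hypothetical field classification for an outgoing record.
PII_FIELDS_TO_DROP = {"full_name", "ssn"}
PII_FIELDS_TO_TOKENIZE = {"email", "phone"}

def pseudonymize(record: dict, vault: "TokenVault") -> dict:
    out = {}
    for field, value in record.items():
        if field in PII_FIELDS_TO_DROP:
            continue                            # removed for good
        elif field in PII_FIELDS_TO_TOKENIZE:
            out[field] = vault.tokenize(value)  # reversible, via the vault
        else:
            out[field] = value                  # non-identifying payload
    return out
```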
If you want to learn more about why developers are the new data protectors, please watch the webinar with our friends from Permit.io.
CEO & Co-founder
Gil is a software ninja who loves both building software (and companies) and breaking code. He is renowned for his security research, including notable exploits of the Microsoft Windows kernel that have earned him unusually high bounty awards. He has written a couple of very successful open source libraries, and he likes to speak at conferences.
Challenge: Increased complexity as the number of keys and systems grows.
Solution: Adopt a centralized key management solution such as a Hardware Security Module (HSM) or cloud-based KMS to securely manage and control cryptographic keys at scale.

Challenge: Ensuring secure and timely key distribution and synchronization at scale.
Solution: Automate key rotation processes to maintain synchronization, reduce human intervention, and minimize errors as the system grows (see the sketch below).
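As an illustration of automated rotation, here is a sketch using the `cryptography` package’s MultiFernet, which encrypts with the first key in the list and can decrypt with any of them. The sample plaintext is made up, and in production the keys would be fetched from an HSM or cloud KMS rather than generated in memory:

```python
from cryptography.fernet import Fernet, MultiFernet

# In production these keys come from an HSM or cloud KMS.
old_key = Fernet(Fernet.generate_key())
new_key = Fernet(Fernet.generate_key())

ciphertext = old_key.encrypt(b"4111 1111 1111 1111")

# MultiFernet encrypts with the first key and decrypts with any listed
# key, so old ciphertexts remain readable during the rotation window.
rotator = MultiFernet([new_key, old_key])
rotated = rotator.rotate(ciphertext)  # now encrypted under new_key

assert rotator.decrypt(rotated) == b"4111 1111 1111 1111"
```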