Exploring approaches to field-level encryption in Python for Django applications

Imri Goldberg

Tech Lead Advisor

August 1, 2023

Following our previous post on column-level encryption, this post explores several implementation approaches and discusses their advantages and disadvantages.

To start, we introduce a simple example. We then look at how to implement manual encryption, add automation and encapsulation with a property, use an encryption library, and explore alternatives such as using a proxy and database-level encryption. We then review the security implications of these approaches before, finally, showing how the Piiano Django ORM integration offers a simple and robust solution.

Example application

To aid this discussion, let’s create a simple Django application and a trivial model for it: Person, with two fields: name and ssn. Let’s say we want to encrypt the ssn field. Here is a very naive and simplified implementation without encryption.

models.py:

class Person(models.Model):
   name = models.CharField(max_length=100)
   ssn = models.CharField(max_length=10)

views.py:

def persons(request):
  if request.method == "POST":
    name = request.POST.get("name")
    ssn = request.POST.get("ssn")
    Person.objects.create(name=name, ssn=ssn)
  persons = Person.objects.all()
  return render(request, "persons.html", {"persons": persons})


def single_person(request, id):
  person = Person.objects.get(id=id)
  if request.method == "POST":
    person.name = request.POST.get("name")
    person.ssn = request.POST.get("ssn")
    person.save()
  return render(request, "person.html", {"person": person})

Note: A proper Django implementation is likely to use forms and class-based or generic model views. For brevity, we avoid these for now.

Approach 1 – Manual encryption and decryption

For this approach, we use Python’s cryptography.fernet as the encryption library. 

This approach is the most straightforward one to implement but the hardest to maintain. We include it here for completeness, as it’s usually not a practical approach.

Whenever Django views access the ssn field, they encrypt before storing and decrypt after retrieving. 

We also need to consider how the encryption key is acquired and used. One approach is to store the key in an environment variable and read it when the process loads. A better approach is to use a secrets management system, such as AWS Secrets Manager, and read the key from there. The key can be read from the secrets manager on load or every time it’s used. Preferably, use a cache to reduce the number of calls to the secrets manager.
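The caching idea can be sketched roughly as follows. This is only an illustration: `CachedKey`, `fetch_key_from_secrets_manager`, and the five-minute lifetime are hypothetical names and values, not part of any library, and the fetch function is a placeholder for a real secrets manager call.

```python
import time
from typing import Callable, Optional


class CachedKey:
    """Caches the key returned by `fetch` and refreshes it after `ttl_seconds`."""

    def __init__(
        self,
        fetch: Callable[[], bytes],
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._key: Optional[bytes] = None
        self._expires_at = 0.0

    def get(self) -> bytes:
        now = self._clock()
        # Call the secrets manager only on first use or once the cached copy expires.
        if self._key is None or now >= self._expires_at:
            self._key = self._fetch()
            self._expires_at = now + self._ttl
        return self._key


def fetch_key_from_secrets_manager() -> bytes:
    # Hypothetical placeholder: a real implementation would call the
    # secrets manager's client here and return the key material.
    return b"placeholder-key-material"


encryption_key = CachedKey(fetch_key_from_secrets_manager, ttl_seconds=300)
```

Code that needs the key would call `encryption_key.get()`. A shorter lifetime picks up rotated keys sooner, at the cost of more calls.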

However, there is more to key management. If we require key rotation, and we usually would, then MultiFernet should be used to encrypt the field. When decrypting the field, we must provide the previous keys and rotate them as necessary. 
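As a minimal sketch of how MultiFernet handles rotation, assuming the cryptography package is installed: the keys below are generated inline purely so the snippet runs on its own, whereas a real application would load them from its key store.

```python
from cryptography.fernet import Fernet, MultiFernet

# Illustration only: real keys come from a secrets manager, not generate_key().
old_key = Fernet.generate_key()
new_key = Fernet.generate_key()

# A value that was stored before the rotation, encrypted with the old key.
old_token = Fernet(old_key).encrypt(b"123-45-6789")

# The first key in the list encrypts new data; all listed keys are tried on decrypt.
multi = MultiFernet([Fernet(new_key), Fernet(old_key)])

# Old tokens remain readable while the old key is still listed.
cleartext = multi.decrypt(old_token)

# rotate() re-encrypts a token under the first (newest) key.
rotated_token = multi.rotate(old_token)
```

Once every stored token has been rotated, the old key can be dropped from the list.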

The main disadvantage of this approach is that all access to the ssn column needs to account for encryption and, if relevant, key rotation. This leaves more room for errors and leads to repetitive code that violates the don’t-repeat-yourself (DRY) principle.

This example assumes that the encryption key is read from an environment variable and stored in the Django settings file.

from cryptography import fernet
from django.conf import settings


def encrypt(cleartext: str) -> str:
  # Fernet expects a URL-safe base64-encoded 32-byte key.
  key = settings.ENCRYPT_KEY
  f = fernet.Fernet(key)
  return f.encrypt(cleartext.encode()).decode()


def decrypt(ciphertext: str) -> str:
  key = settings.ENCRYPT_KEY
  f = fernet.Fernet(key)
  return f.decrypt(ciphertext.encode()).decode()


def persons(request):
  if request.method == "POST":
    name = request.POST.get("name")
    ssn = request.POST.get("ssn")
    Person.objects.create(name=name, ssn=encrypt(ssn))
  persons = list(Person.objects.all())
  for p in persons:
    p.decrypted_ssn = decrypt(p.ssn)
  return render(request, "persons.html", {"persons": persons})


def single_person(request, id):
  person = Person.objects.get(id=id)
  if request.method == "POST":
    person.name = request.POST.get("name")
    person.ssn = encrypt(request.POST.get("ssn"))
    person.save()
  person.decrypted_ssn = decrypt(person.ssn)
  return render(request, "person.html", {"person": person})

Approach 2 – Automation and encapsulation with a property

To improve on the previous approach, we create a property called ssn, with the functions set_ssn and get_ssn, that stores and reads data to and from the encrypted_ssn column.

To prevent frequent decryptions of the same data, we can also store an additional member called decrypted_ssn as a cache for the ssn column (this is not shown in the example code).

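class Person(models.Model):
  name = models.CharField(max_length=100)
  encrypted_ssn = models.CharField(max_length=100)

  def get_ssn(self):
    # Decrypt the stored ciphertext on read.
    return decrypt(self.encrypted_ssn)

  def set_ssn(self, ssn):
    # Encrypt before the value reaches the database column.
    self.encrypted_ssn = encrypt(ssn)

  # Defined at class level so that person.ssn goes through the functions above.
  ssn = property(get_ssn, set_ssn)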
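The decrypted_ssn cache mentioned earlier might look something like the sketch below. To keep it runnable without Django or a key, Person is a plain class here and encrypt() and decrypt() are base64 stand-ins, which are not encryption. In the real model they would be the Fernet-based functions from Approach 1.

```python
import base64


def encrypt(cleartext: str) -> str:
    # Stand-in only (NOT encryption): use the Fernet-based encrypt() in real code.
    return base64.b64encode(cleartext.encode()).decode()


def decrypt(ciphertext: str) -> str:
    # Stand-in only (NOT decryption): use the Fernet-based decrypt() in real code.
    return base64.b64decode(ciphertext.encode()).decode()


class Person:
    # Plain class for illustration; in the article this is a models.Model
    # and encrypted_ssn is a CharField.
    def __init__(self, name: str, ssn: str):
        self.name = name
        self._decrypted_ssn = None
        self.ssn = ssn  # goes through set_ssn

    def get_ssn(self) -> str:
        # Decrypt at most once per instance, then serve the cached value.
        if self._decrypted_ssn is None:
            self._decrypted_ssn = decrypt(self.encrypted_ssn)
        return self._decrypted_ssn

    def set_ssn(self, ssn: str) -> None:
        self.encrypted_ssn = encrypt(ssn)
        self._decrypted_ssn = ssn  # keep the cache in step with the stored value

    ssn = property(get_ssn, set_ssn)
```

One caveat with this design is that the cache goes stale if encrypted_ssn is assigned directly, so all writes should go through the property.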

This approach improves significantly over the previous one. The code accessing the ssn column doesn’t need to consider encryption, making it transparent. We can encapsulate key rotation inside the Person class or, even better, inside the encrypt() and decrypt() functions. That encapsulation means that the code in views.py is identical to the original without encryption and doesn’t need to change.

However, this approach is far from ideal. Encrypting another field repeats boilerplate code, and we also miss some features such as batching or passing additional parameters to decrypt() (e.g. to only get a masked version of the SSN). Supporting that would require a context manager and storing the desired transformation in a context variable. Here is an example use of such a context manager:

with mask(Person.ssn):
  persons = list(Person.objects.all())

Approach 3 – Using an encryption library

To prevent code duplication, when specifying a property as encrypted, the work we did previously should apply automatically to the new encrypted field. 
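To illustrate the idea in plain Python, a descriptor can package the getter and setter pair once and reuse it for any attribute. This is a sketch under assumptions: EncryptedAttribute and passport_number are hypothetical names, Person is not a Django model, and the base64 functions are stand-ins that do not encrypt anything.

```python
import base64


def encrypt(cleartext: str) -> str:
    # Stand-in only (NOT encryption): substitute a real encrypt() in practice.
    return base64.b64encode(cleartext.encode()).decode()


def decrypt(ciphertext: str) -> str:
    # Stand-in only (NOT decryption): substitute a real decrypt() in practice.
    return base64.b64decode(ciphertext.encode()).decode()


class EncryptedAttribute:
    """Descriptor that keeps its value encrypted in `encrypted_<name>`."""

    def __set_name__(self, owner, name):
        self._storage_name = f"encrypted_{name}"

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class, not an instance
        return decrypt(getattr(instance, self._storage_name))

    def __set__(self, instance, value):
        setattr(instance, self._storage_name, encrypt(value))


class Person:
    ssn = EncryptedAttribute()
    # A second encrypted field takes one line instead of a new getter/setter pair.
    passport_number = EncryptedAttribute()

    def __init__(self, ssn: str, passport_number: str):
        self.ssn = ssn
        self.passport_number = passport_number
```

Libraries for Django take this a step further by providing model field classes, so the encrypted column and its accessor are declared together.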

Several libraries achieve this for many languages and platforms. For Django, one example is django-encrypted-model-fields.

from django.db import models
from encrypted_model_fields.fields import EncryptedCharField


class Person(models.Model):
  name = models.CharField(max_length=100)
  ssn = EncryptedCharField(max_length=10)

The advantage of using a library is that there’s little we need to implement apart from specifying which fields to encrypt. This approach also delivers a high level of automation and encapsulation.

Some libraries can fetch the keys themselves. With others, we must add code to fetch the keys from the secrets manager.

Approach 4 – Using a proxy

Moving away from code changes, a network proxy is an alternative approach to field-level encryption. Notable examples include Evervault, Satori, and Fortanix. Some proxies, such as Evervault, are deployed between the browser and the backend of an application, guaranteeing that the backend only sees an encrypted version of the data. Other proxies, such as Satori, are deployed between the backend and database, making all encryption transparent to the backend. 

There are advantages to using a proxy, especially regarding centralization and simplification of an app. There are also disadvantages: routing application traffic through a proxy can add latency, a single point of failure, and a scalability bottleneck. In addition, a proxy is normally maintained by another team (not the app developers), and that team may not always be in sync with the engineering effort or aware of upcoming changes.

Also, a proxy dependency makes local testing and testing in the CI environment harder, as it’s another component to test. If it’s not tested, another difference between production and testing is introduced.

Approach 5 – Using a database plug-in

We can use cryptographic functions in a database using, for example, PostgreSQL's pgcrypto module. A plug-in like this can help implement field-level encryption. 

To use pgcrypto, we first need to install the extension into PostgreSQL. This might be as straightforward as running a SQL command to create the extension. However, it may be more complex when using a managed database service such as Amazon RDS or when setting up the database as part of a CI/CD workflow.

Once installed, we could use pgcrypto for field-level encryption in SQL queries, although that would be cumbersome, especially when using an ORM. Continuing our example, we use a library, such as django-pgcrypto-fields, to encrypt and decrypt values as needed. This approach is very similar to using a library. However, instead of the backend doing the encryption, the database does the work, and the library takes care of the differences in SQL queries.

from pgcrypto.fields import CharPGPPublicKeyField

class Person(models.Model):
  name = models.CharField(max_length=100)
  ssn = CharPGPPublicKeyField(max_length=10)

This code is very similar to the previous approach of using a library to encrypt fields in the backend. If we instead implement this approach with raw SQL queries, the application code needs significant changes to support field-level encryption and key management. Either way, we must also make the infrastructure changes that ensure the database has pgcrypto installed and enabled.

Finally, the additional security gained from this approach is limited, as it doesn’t protect against someone gaining access to the database.

Security implications

So far, we’ve not discussed the challenges of working with cryptography libraries in applications. Aside from complicating the code, there are some issues to consider:

  1. Key Access – The application needs to access a KMS to fetch the right key.
  2. Key Distribution – If an application is broken down into microservices, each service needs to access the key, and the key must be fetched securely.
  3. Key Rotation – Today's security standards require key rotation, which complicates code, backups, anti-tampering, and queries.
  4. Key Compromise – Each microservice can be compromised and leak the encryption key.
  5. Searchability – When we encrypt data in a safe manner (without leaking information through indexing), there’s no good way to search it.
  6. App Level Attacks – The SQL code can be susceptible to SQL-injection and IDOR attacks. See the post OWASP Top 10 Vulnerabilities – A Guide for Pen-Testers & Bug Bounty Hunters for more information.
  7. Logs – Databases in production rarely record access to the data. If there’s a breach, we can be in a situation where we don’t know what happened.
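The searchability issue can be seen in a few lines with Fernet, which generates a fresh random IV for every encryption. This is a small demonstration, assuming the cryptography package is installed. The key is generated inline and the "table" is just a list, both for illustration only.

```python
from cryptography.fernet import Fernet

f = Fernet(Fernet.generate_key())  # illustration only: real keys come from a key store

ssn = "123-45-6789"
# Stand-in for an encrypted column holding three rows.
stored = [f.encrypt(s.encode()) for s in ("111-11-1111", ssn, "222-22-2222")]

# Encrypting the search term again produces a different token,
# so an equality lookup on the ciphertext finds nothing.
needle = f.encrypt(ssn.encode())
matches_by_ciphertext = [i for i, token in enumerate(stored) if token == needle]

# The remaining option is to decrypt every row and compare the clear text.
matches_by_scan = [i for i, token in enumerate(stored) if f.decrypt(token).decode() == ssn]
```

Here the ciphertext lookup comes back empty while the full scan finds the row, which is why searching safely encrypted data does not scale without extra machinery.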

The better approach – Using Piiano Vault’s Django ORM Integration

Piiano recently released a Django ORM integration (along with similar integrations for Hibernate for Java and TypeORM for TypeScript). The Django ORM integration encrypts and decrypts values in a transparent way, similar to using a library. In fact, the resulting code is almost identical to the basic approach used to illustrate this post.

from django_encryption.fields import EncryptedCharField, EncryptionType


class Person(models.Model):
  name = models.CharField(max_length=100)
  ssn = EncryptedCharField(
    encryption_type=EncryptionType.randomized,
    data_type_name="SSN",
    null=True,
    blank=True,
  )

With Piiano Vault, you don’t need to worry about key management, your app being compromised, or adding code complexity to deal with the encryption intricacies. You use the ORM and annotate your fields.

Piiano Vault, an infrastructure for the protection of sensitive customer data, is built to make your life easy. It mitigates all the security implications we've discussed and keeps encryption logic out of your code, so the code stays readable and focused on data operations.

The biggest advantage of using the Piiano integration is that, unlike pgcrypto for example, it gives your organization a centralized way to control sensitive data access. It also records all data access to logs, so if you ever want to do forensics, you have all the information needed.

There are several advantages to using Vault’s ORM integration:

  • It requires very few code changes, making it almost platform independent.
  • It supports several encryption types, such as deterministic encryption, where identical values always yield the same ciphertext, and randomized encryption, where two identical values can yield different ciphertexts.
  • It enables you to specify the semantic data type for object properties, for example, SSN, address, or phone number, rather than just specifying a property as a string or needing to comply with a required format. Once specified, the data type unlocks more powerful features.
  • It can mask based on the data type and use. For example, masking SSNs in some cases and not others.
  • You can search encrypted data without breaking the security of the encryption (for deterministic encryption and only for exact matches).
  • It enables you to manage permissions with a broad range of granularity, from very detailed to highly generalized control levels:
    - You can set a data type to always be protected. For example, when SSN is returned as part of an analytics query, it can always be masked.
    - You can take context into account, such as the reason for the query: is it for analytics, marketing, or app functionality?
    For example, you can set your system up so addresses for people are masked unless strictly necessary, while business addresses are never blocked.
  • It matches the use patterns of your application, such that batches can be decrypted using an API call instead of an API call per row.

Finally, remember that achieving privacy and security requires careful analysis and planning. Our hope is that with this post, you are better prepared to implement both in your project.

Create your account today and get started for free!

About the author

Imri is a lifelong software engineer who loves Python and has taken on CTO roles as an entrepreneur. He has been writing code professionally for his entire life, and he is passionate about building SaaS companies.
