Data Protection Basics

Recently the PHP Roundtable podcast had an episode about privacy and GDPR compliance. As a Kiwi developer who mostly works on projects intended for NZ and Australian audiences, GDPR isn’t something I’ve had to worry about thus far in my career, but there were some great takeaways from the podcast to consider when thinking about privacy and security:

  • GDPR may not affect you … yet – if you don’t “target” European customers, the law may not apply to you. Examples of targeting would include owning a European domain name, having an office in Europe, selling things in euros, or translating your site into European languages. One comment on the podcast suggested that just making a statement like “we ship all around the world” on your website is enough to bring you within its scope. Even if GDPR doesn’t affect you yet, though, it’s always better to design with data protection in mind rather than having to retrofit it.
  • We need to think about data differently – “The biggest impact GDPR has, is that the data that businesses have is not the data of the business – it’s the data of the individual who provided the data”. This is going to require a major culture shift in a lot of businesses.
  • Take only what you need – the less data you collect, the less there is to misuse, or to be exposed if you are hacked. You may have a battle to fight here against your BDMs and marketing folk, but the flip side is that having fewer fields to fill in will improve the user experience too.
  • Plan for the long term – do you need to keep every single piece of data you’ve collected for the lifespan of the business? For example, do you need a record of every single order and every single item that a customer added to the cart, or can you just keep the last couple of years’ worth? Again, the less data you have, the less data about the customer you can leak.
  • Think outside the box – where does the data that you collect go? Do you manage the server that your database sits on, or is it in the cloud? What APIs do you integrate with? Do you send any emails that contain personal data? Seek written assurances (an explicit contract or part of their Ts and Cs) from each provider that they won’t be using your users’ data. Think very hard about how much data you need to send to APIs, and remember to disclose these use cases in your privacy policy.
  • Be careful with backups – a backup of your production database needs as much security as the database itself. You also need to ensure that recent requests for data removal or modification are honoured when you restore from the backup, so you need a separate, extremely stable mechanism for recording those requests, one that survives the restore and lets you reapply them afterwards.
  • Don’t use prod data in your test and dev environments – your developers absolutely shouldn’t have a copy of the production database on their laptops. These environments typically don’t have the same level of protection that production servers do, so they should not host real data. Your developers shouldn’t be seeing data about your real clients anyway, especially developers who don’t have access to that data on the production system.
  • Don’t use personal data like IRD numbers as primary or foreign keys – this was something we covered in the privacy and copyright law paper at university. It’s against the Privacy Act in NZ, and it increases the risk to your user if the data is breached. Primary keys should always be fields that are unique to your system.
  • Consider using hashed foreign keys – if you have data that is linked to a user and only ever needs to be looked up in one direction (e.g. a log of their interactions with your system), consider hashing the key that you use to associate the records back to your user. This makes it harder for anyone who compromises your database to extract information about your users.
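
To make the hashed foreign key idea from the last point a bit more concrete, here is a minimal sketch in Python using SQLite. It is only an illustration under my own assumptions: the table names (users, activity_log), the helper functions and the LOG_KEY_SECRET value are all hypothetical. It also uses a keyed hash (HMAC-SHA256) rather than a plain hash, which goes slightly beyond what the podcast described. The reasoning is that a plain hash of a small integer ID could be reversed simply by hashing every possible ID, whereas a keyed hash is only as guessable as the secret, which would need to live outside the database.

```python
import hashlib
import hmac
import sqlite3

# Hypothetical secret for illustration only. In a real system it would be
# kept outside the database (e.g. an environment variable or secrets manager).
LOG_KEY_SECRET = b"replace-with-a-long-random-secret"


def hashed_user_key(user_id: int, secret: bytes = LOG_KEY_SECRET) -> str:
    """Return a keyed hash (HMAC-SHA256) of the user id as a hex string."""
    message = str(user_id).encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def create_schema(conn: sqlite3.Connection) -> None:
    """Create a users table and a log table that stores only the hashed key."""
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
    # No foreign key constraint here: the log never stores the raw user id.
    conn.execute(
        "CREATE TABLE activity_log (user_key TEXT NOT NULL, action TEXT NOT NULL)"
    )
    conn.execute("CREATE INDEX idx_activity_user_key ON activity_log (user_key)")


def log_action(conn: sqlite3.Connection, user_id: int, action: str) -> None:
    """Record an action against the hashed key for this user."""
    conn.execute(
        "INSERT INTO activity_log (user_key, action) VALUES (?, ?)",
        (hashed_user_key(user_id), action),
    )


def actions_for_user(conn: sqlite3.Connection, user_id: int) -> list:
    """Look up a known user's log entries by recomputing their hashed key."""
    rows = conn.execute(
        "SELECT action FROM activity_log WHERE user_key = ? ORDER BY rowid",
        (hashed_user_key(user_id),),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    conn.execute("INSERT INTO users (id, email) VALUES (1, 'alice@example.com')")
    log_action(conn, 1, "login")
    log_action(conn, 1, "viewed_order")
    print(actions_for_user(conn, 1))
```

The lookup only works in one direction: starting from a user you can recompute the key and find their log entries, but someone holding only a dump of activity_log cannot easily work out which user a row belongs to without the secret. The trade-off is that the database can no longer enforce the relationship with a foreign key constraint, and changing the secret would orphan the existing log rows.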

Next time I’m working on my side project I’m going to cast a very critical eye over the data that I’m collecting and see what I can trim…