Data protection: Privileged data processing methods
The new, harmonised EU General Data Protection Regulation came into force on May 24, 2016 and applies from May 25, 2018 without any further transition period. All companies operating in the EU, regardless of where they are headquartered, are subject to it; the legislative implementation in Switzerland is already under way. Reason enough for us to take a closer look at the General Data Protection Regulation in a series of articles.
Privileged procedures
The documentation of technical and organizational measures (“TOMs”) required for data processing describes which means are used to achieve adequate protection. The General Data Protection Regulation names a series of procedures that are legally privileged, i.e. that should be used as part of the technical and organizational measures wherever possible. They are:
- the use of anonymized data,
- the use of pseudonymous data,
- the use of statistical data,
- the use of encrypted data.
This ranking cannot be read as strictly linear, since combinations are of course possible. In the following, we look at the listed procedures from a data protection perspective. All of them are generally much harder to implement correctly and securely than it might appear at first glance. In an article on data protection, however, these technical aspects would lead too far, which is why we simply assume a professional and error-free technical implementation here. (If only it were that easy in practice.)
Anonymization
Anonymization is the process of removing from a database those characteristics that make it possible to attribute the managed data to specific persons. It is an excellent procedure wherever it can be applied, because anonymized data is no longer related to a person and is therefore no longer subject to the provisions of data protection law. As we saw in part 1 of this series, only data relating to an identified or identifiable natural person is subject to data protection; everything else is not. Let us assume that we compile crime statistics. As a basis, we have a list of all criminal cases over the last 10 years. Anonymizing the perpetrator in order to protect him means systematically removing from this list of cases the information that makes him identifiable. The first, obvious measure is to delete those attributes from the database that allow direct attribution to a person, for example the name and first name of the perpetrator. However, the database is not yet truly anonymized, because it remains to be checked whether indirect identification is still possible: data only counts as anonymous if the person cannot be identified even with considerable effort, for example by drawing on other sources of information. Perhaps there were newspaper articles about the crime that make attribution possible again. Or further, at first glance perhaps inconspicuous, details about the perpetrator could be stored whose combination narrows the group of people in question back down to a single person. Accordingly, irreversible anonymization is usually not easy to implement, or is even impossible given the requirements.
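To make the idea concrete, here is a minimal Python sketch of stripping direct identifiers from case records. The field names are purely illustrative, and the sketch deliberately shows that quasi-identifiers such as birth date or home town remain in the data and need a separate review.

```python
# Minimal anonymization sketch: strip direct identifiers from case records.
# Field names ("name", "first_name", "birth_date", ...) are illustrative only.

DIRECT_IDENTIFIERS = {"name", "first_name"}

def anonymize(record: dict) -> dict:
    """Remove direct identifiers; quasi-identifiers still need a separate review."""
    return {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}

cases = [
    {"name": "Muster", "first_name": "Hans", "birth_date": "1980-03-01",
     "home_town": "Kleinstadt", "offence": "burglary", "year": 2015},
]

anonymized = [anonymize(c) for c in cases]
print(anonymized)
# Note: dropping name and first name alone is not enough; combinations of the
# remaining quasi-identifiers (birth date, home town, ...) may still single out
# a person, e.g. together with newspaper reports.
```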
Pseudonymization
Pseudonymization is the process of removing person-identifying features from data and storing them separately. It is therefore a kind of partial anonymization. The key for the conversion is kept in a separate location. Let us illustrate this using the example from the anonymization section. Pseudonymization would mean that, as with anonymization, the name and first name of the perpetrator are deleted from the database and replaced by a unique sequence number. A translation table is then kept in a second data store, listing the name and first name for each sequence number. This key table must be stored separately from the data and specially protected. Pseudonymization is therefore not as effective as anonymization, but it increases the effort required to identify the persons concerned. For this reason, the procedure is legally privileged where full anonymization is not possible.
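A minimal sketch of this scheme, using the same illustrative field names as above: identifying fields are replaced by a sequence number, and the mapping is kept in a separate key table that would have to be stored and protected separately.

```python
# Pseudonymization sketch: replace identifying fields with a sequence number
# and keep the mapping in a separate, specially protected key table.
from itertools import count

_sequence = count(1)
key_table = {}  # sequence number -> identifying attributes; store separately!

def pseudonymize(record: dict) -> dict:
    """Replace name/first name with a sequence number; keep the mapping in key_table."""
    seq = next(_sequence)
    key_table[seq] = {"name": record["name"], "first_name": record["first_name"]}
    pseudonymized = {k: v for k, v in record.items() if k not in ("name", "first_name")}
    pseudonymized["person_id"] = seq
    return pseudonymized

case = {"name": "Muster", "first_name": "Hans", "offence": "burglary", "year": 2015}
print(pseudonymize(case))  # {'offence': 'burglary', 'year': 2015, 'person_id': 1}
print(key_table)           # {1: {'name': 'Muster', 'first_name': 'Hans'}}
```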
Statistical data
Statistical data means aggregations derived from individual case data. To stay with the example of the crime statistics above, this would mean that although each update still starts from an individual case, the data is only stored as a sum of incidents, e.g. the number of burglaries per month. Such aggregation also results in a partial anonymization with the same data protection advantages. Note, however, that with low case numbers the sum over a single case is still an individual case in the statistics, which means that the person can potentially be identified again.
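A small sketch of such aggregation with fictitious case data: only monthly sums are kept, and the cell with a count of 1 illustrates the residual risk just mentioned.

```python
# Aggregation sketch: keep only counts per month instead of individual cases.
from collections import Counter

cases = [
    {"offence": "burglary", "month": "2016-01"},
    {"offence": "burglary", "month": "2016-01"},
    {"offence": "burglary", "month": "2016-02"},
]

burglaries_per_month = Counter(c["month"] for c in cases if c["offence"] == "burglary")
print(dict(burglaries_per_month))  # {'2016-01': 2, '2016-02': 1}

# A cell with a count of 1 still corresponds to exactly one case and thus one
# person; such small cells may need to be suppressed or merged before publication.
```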
Example: Insufficient protection of voting secrecy in statistics
I experienced an illustrative example of this problem during the introduction of an e-voting system for Swiss citizens abroad; it touched not only data protection legislation but also the laws on political rights. In a vote or election, voting secrecy must be maintained, i.e. it must not be traceable who voted for which candidate or who approved or rejected which bill.
On voting Sunday, the local authorities were to record the e-voting results of “their” Swiss abroad separately, so that the public could be transparently informed about the use of this voting channel, not least because of the many concerns about possible manipulation. In practice, it unfortunately turned out that some small municipalities have so few registered Swiss abroad that regularly only single votes were cast via this channel. Accordingly, it was sometimes possible to read directly from the statistics how the Swiss abroad in question had voted. The problem was finally solved by no longer crediting these votes to the municipalities, but to a separate, intercommunal constituency for Swiss abroad.
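As a rough illustration of the small-count problem and the pooling idea: the counts and the publication threshold below are invented for this sketch; the actual solution simply credited all such votes to the intercommunal constituency.

```python
# Sketch: per-municipality e-voting counts below a (hypothetical) threshold are
# not published individually but pooled into a combined constituency.
MIN_COUNT_FOR_PUBLICATION = 5  # hypothetical threshold, for illustration only

evoting_votes = {"Kleindorf": 1, "Bergweiler": 2, "Grossstadt": 240}  # fictitious counts

published = {}
pooled = 0
for municipality, votes in evoting_votes.items():
    if votes < MIN_COUNT_FOR_PUBLICATION:
        pooled += votes  # single votes per municipality would be traceable
    else:
        published[municipality] = votes
published["Swiss abroad (intercommunal constituency)"] = pooled

print(published)  # {'Grossstadt': 240, 'Swiss abroad (intercommunal constituency)': 3}
```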
Encryption
Encrypting data is a method of increasing its protection. As a first step, data is usually protected against third-party access via firewalls, authorization concepts, etc. Encryption allows this access to be restricted even further, to just a few specific people. In particular, encryption can also prevent access by internal system administrators and other IT personnel, or at least restrict it to a small circle. At the same time, it introduces an additional security hurdle for attackers in case the other protective measures are circumvented. However, any encryption method is only as secure as the key(s) used and their storage, and truly effective protection is therefore often difficult to achieve. “Pragmatic” implementations frequently result primarily in security by obscurity, i.e. a nice calming pill with the effectiveness of a placebo: you feel better, but not really because of the ingredients. Encrypting data also has significant disadvantages. On the one hand, there is usually a loss of performance, i.e. all processing necessarily takes longer and the system becomes slower. On the other hand, recovery scenarios for restoring a service after a failure can also become more difficult and slower. From a data protection perspective, encrypting as much personal data as possible is of course always welcome. From the point of view of cost efficiency, however, i.e. the additional protection actually achieved per capital invested, the calculation often does not add up. Accordingly, encryption is usually used very selectively, where it can effectively counter significant risks.
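As a minimal sketch of field-level encryption, here using symmetric Fernet encryption from the Python “cryptography” package: the record layout is invented, and key management, which is the hard part in practice, is not shown.

```python
# Field-level encryption sketch with the "cryptography" package (pip install cryptography).
from cryptography.fernet import Fernet

key = Fernet.generate_key()  # the key itself must be stored and protected separately
fernet = Fernet(key)

record = {"person_id": 17, "note": "confidential free-text entry"}

# Encrypt the sensitive field before it is written to the database.
record["note"] = fernet.encrypt(record["note"].encode("utf-8"))

# Only holders of the key can restore the plaintext.
print(fernet.decrypt(record["note"]).decode("utf-8"))  # confidential free-text entry
```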
So which method should be applied where?
The procedures presented here, which are privileged under data protection law, represent a toolbox for increasing the protection of the persons concerned. All of them are desirable; complete anonymization is the ideal solution. Which procedures are used is decided on the basis of the risk assessment in the data protection impact assessment.
Our series of articles on the subject
- In the lead-in article, we drew attention to the need for action.
- In the first part of the series, we introduced the various actors and set out the framework.
- In part 2, we examined the principles of data protection, which rest on four pillars.
- Part 3 explained the specific requirements for processing special categories of personal data and for profiling, which is considered particularly critical.
- This part 4 examined the legally privileged, desirable processing methods.
- Part 5 of the series concludes with a framework for the pragmatic and appropriate implementation of data protection in your IT project.
About the author
Stefan Haller is an IT expert specializing in risk management, information security and data protection at linkyard. He supports companies and authorities with risk analysis in projects, with the design and implementation of compliance requirements in software solutions, and with the preparation of IT security and authorization concepts. He is certified in risk management and, over 10 years as an internal auditor, has carried out numerous security audits based on the ISO 27001 standard. Do you have any questions about implementation in your company? stefan.haller@linkyard.ch | +41 78 746 51 16