
Understanding What, in Fact, Is Personal Data

Ribald, an Ipswich boy from the 10th century, turned a piece of bark into a carrier of personal data by combining a drawing of his father and his name.

Imagine that you found three people who were born on the same day as you. You share the same date of birth and the same sex, and you can (with some effort) all change your names to match. Formally, the result is four identical people.

Is a set such as name + date of birth + sex personal data? The answer is yes.

In this case, personal data is understood as a set of information that, one way or another, makes it possible to identify an individual (the carrier of the personal data): a set that unambiguously points to a specific person.

In some jurisdictions, like European countries, personal data is a broad term and may include various quasi-identifiers that may lead to the identification of a particular person.

So, let's look at some examples and find out what pieces of information can be called personal data.

Simple cases

To start with, there is a category of "raw" data that allows you to identify a particular person. Examples are a passport / ID number, or the combination of name, gender, and date of birth.

Examples of personal data:

  • Passport / ID number
  • Name + gender + date of birth
  • Fingerprint


At the same time, there is a second category of "raw" data, which by itself is unlikely to help you identify anybody. For example:

  • Favorite dish
  • Place of work
  • Personal qualities (character traits)
  • Number of children


From the point of view of current legislation, such information on its own cannot be called personal data.

Rule number one: if you combine information that is personal data by itself with information that is not, the result is a database of personal data.

For example:

  • Passport / ID number + place of work = personal data
  • Medical diagnosis + favorite food + photo = personal data
  • Name + sex + date of birth + place of work + fingerprint = personal data
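
Rule number one can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names are hypothetical, the groupings mirror the examples in this article rather than any statute, and the photo is assumed to be the identifying part of the second example.

```python
# Hypothetical field names; the groupings follow the article's examples only.
DIRECT_IDENTIFIERS = {"passport_number", "fingerprint", "photo"}
IDENTIFYING_COMBINATIONS = [frozenset({"name", "sex", "date_of_birth"})]


def contains_personal_data(fields):
    """Return True if the field set includes something identifying on its own."""
    fields = set(fields)
    # A single direct identifier is enough to make the whole set personal data.
    if fields & DIRECT_IDENTIFIERS:
        return True
    # Otherwise, look for a known identifying combination inside the set.
    return any(combo <= fields for combo in IDENTIFYING_COMBINATIONS)


examples = [
    {"passport_number", "place_of_work"},
    {"medical_diagnosis", "favorite_food", "photo"},
    {"name", "sex", "date_of_birth", "place_of_work", "fingerprint"},
    {"favorite_food", "place_of_work"},
]
results = [contains_personal_data(e) for e in examples]
```

The first three sets match the examples above and are flagged; the last one contains only non-identifying "raw" data and is not. A fixed lookup like this covers only the simple cases.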

Refinement regarding the "bare" types of personal data

Some pieces of personal data do not allow a random person to identify you but do allow, for example, law enforcement to do so. An individual's mobile phone number is often tied to their name and ID. It is "clean" personal data. Combining the phone number with any other information about its owner yields personal data. The same applies to credit card numbers, insurance numbers, and so on.

But if a particular phone number is tied to a legal entity, it is not considered personal data by itself: it does not make it possible to identify the specific employee of the company who uses that number.

Some complex cases

A dataset does not always include a special type of information that automatically makes the whole set personal data. As we mentioned earlier, if there is some information AND a passport number, the set can definitely be called personal data. But sometimes no part of the dataset is personal data in isolation, yet all the pieces together allow you to accurately identify a person.

For example, a medical diagnosis, as a rule, is not personal data in isolation from the name. But the result of a DNA test is personal genomic data. Race is not personal data by itself, and neither is the place of work.

However, it may turn out that the workplace + race + diagnosis = personal data when, for example, only one disabled Chinese woman works at a gas station.

Interestingly, if two disabled Chinese employees initially worked at the gas station and one of them then retired, the dataset was not personal data at first but became personal data later.

The same goes for our example of four people. For one person, a set such as name + sex + date of birth can be called personal data, but once the four friends changed their names to match, the set, in theory, ceased to be personal data. In practice, this is often not the case.
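
One way to reason about such combinations is to count how many records share the same values; this is similar in spirit to a k-anonymity check. The sketch below is a minimal illustration, not a legal test. The column names and staff records are invented for the example.

```python
from collections import Counter

# Hypothetical column names for the gas-station example.
QUASI_IDENTIFIERS = ("workplace", "race", "diagnosis")


def singled_out(records, keys=QUASI_IDENTIFIERS):
    """Return the records whose combination of values is unique in the dataset."""
    def signature(record):
        return tuple(record[k] for k in keys)

    counts = Counter(signature(r) for r in records)
    return [r for r in records if counts[signature(r)] == 1]


staff_before = [
    {"workplace": "gas station", "race": "Chinese", "diagnosis": "disability"},
    {"workplace": "gas station", "race": "Chinese", "diagnosis": "disability"},
    {"workplace": "gas station", "race": "other", "diagnosis": "none"},
    {"workplace": "gas station", "race": "other", "diagnosis": "none"},
]
staff_after = staff_before[1:]  # one of the two employees retires

before = singled_out(staff_before)  # nobody is unique yet
after = singled_out(staff_after)    # the remaining employee is now unique
```

With the made-up data above, no record is unique before the retirement, and exactly one is unique afterwards. The same columns therefore change status without any change to the schema, which is why this check has to be repeated as the data changes.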

To understand whether your dataset can be called personal data, you can examine legal precedents, that is, rulings established in previous cases. If a court has decided that such a set is personal data, then your set very probably is as well.

In many countries, it is almost impossible to get an officially signed expert opinion that determines whether data is personal or not. You may contact the government authorities; however, you will most likely get an answer like this: "If it is possible to identify a person then this is personal data."

The final decision will always be a court decision, but as a rule, it is better to think ahead and not let things go that far. Do your best to weigh all the facts logically and determine whether a person can be identified.

Naturally, in day-to-day situations, most datasets have long been described. It is clear how to classify them. Nevertheless, there are several interesting situations with biometrics, photos and special categories of personal data.

Interesting cases

A photocopy of a passport is personal data because it includes a headshot, name, date of birth, and other data. Things become controversial when you consider a passport photo on its own, or a picture or video from CCTV cameras.

This is a matter of dispute because it is not always possible to tell unequivocally whether a particular photo allows a person to be identified. Where is the boundary in terms of photo quality?

If it is a clear 3000x3000-pixel image of the passport photo, then it is obviously personal data. But it could be the same picture, blurry and just 32x32 pixels. Or it could be not a passport photo at all, but a photo of a crowd on the street.

Side note: security systems and cameras at stadiums use automatic detectors that can identify a person (from a blacklist) using images as small as 250x250 pixels.

But in practice, there is no clear definition here. Most often it works like this: when you pass through border control, the officer looks at your face, then at the photo in your passport or visa, and decides whether the two match. If, from the officer's point of view, the photo and the real face are similar, the officer gives a positive “expert decision” and you are free to cross the border. Roughly the same happens when photos are identified in court: the judge invites an expert, and the expert decides whether somebody can be identified from the specific photo.

In most jurisdictions, a citizen can request that the processing of their personal data be stopped. In theory, you can fish out every crowd photo you appear in and insist that it is an act of data storage and processing without your consent. Exceptions to this situation:

  • Your image is used in state or public interests.
  • The image was taken in places open to free access or at public events (meetings, conferences, concerts, performances, sports events, etc.)
  • A citizen posed for money.

The image of a person is personal data. Usually, this means photos rather than portraits. Nevertheless, a person can be identified by the portrait below, so it is unclear how to store and process it.

[Image: a portrait by which the person can be identified]

Another controversial case is the email address. It is clear that in isolation (without a name, for example) it is not personal data: it may belong to anybody, including a robot.

But what if it is … or …? Most likely, such addresses are not personal data. To make them personal data, more pieces of data have to be added.

In addition, by analogy with the phone number, it all depends on whom the email address is registered to: a legal entity or a citizen. And in most cases, you do not tie your email address to a passport number.

Storing IP addresses is tricky too. Even if your ISP has assigned you a specific IP address, you can use a VPN service and connect to websites from multiple other IP addresses.

Biometric data, such as the individual shape of the skull and ears, is unambiguously personal data, like a fingerprint. This imposes serious limitations on face recognition systems: approval must be obtained even for storing a hash of biometric measurements.

What does this all mean?

When you build an IT infrastructure, you need to understand whether your data is personal or not. If it is, you need to classify it and understand what kind of data you have, what threats are possible, and how many personal data records you hold. Then the necessary level of security should be assigned, and each level should include appropriate protection measures in accordance with the requirements of local legislation.

If you follow the spirit of the law and common law enforcement practice, it is possible in almost all situations to determine whether a dataset is personal or not. In some cases, however, lawyers are brought in to perform an assessment and make inquiries to the regulatory authorities.

