Practically every day we hear news about some tens or hundreds of millions of personal records stolen from yet another online platform and posted on the Darknet or Internet. All of these instances are preventable by technological means.
The solution floats right on the surface. It’s not rocket science. All of the elements of modern software technology necessary for the protection of PII already exist, in all major languages and frameworks and on all modern data platforms. Nothing needs to be researched or developed from scratch; only well-known, tried, tested, and proven approaches need to be applied. Why are they not?
Recently, I was asked to look into hardening a custom CRM, with a view to preventing PII theft. The design was textbook, with all that implies. To take advantage of that design, all a data thief had to do was steal the database backup or file, thus obtaining a shopping list for identity theft, blackmail, or other illegal activities.
I devised and implemented a plan to turn that textbook design into sets of data that cannot be linked back to the same person without access to both the application’s code and some metadata stored outside the database and the web application. As a result, anyone who steals the database receives meaningless fragments from which a person’s complete profile cannot be reassembled or deduced.
From the perspective of unauthorized actors, the PII is irreversibly split into its constituent parts. From the data owner’s perspective, it remains integral, and the parent-child relationships between a person’s ID and their contact information remain available to search functions. From the user’s perspective, nothing really changes.
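The article does not disclose the exact scheme, so the following is only a minimal sketch of one way such a split could work, assuming keyed HMAC-SHA256 markers. The function names, the field names, and the secret key are all hypothetical; the key stands in for the metadata that lives outside both the database and the web application.

```python
import hashlib
import hmac


def marker(secret: bytes, person_id: int, field: str) -> str:
    """Derive a keyed marker for one field of one person.

    Without the secret, a marker cannot be tied back to the person ID,
    and markers of different fields cannot be tied to each other.
    """
    message = f"{person_id}:{field}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def split_record(secret: bytes, person_id: int, record: dict) -> dict:
    """Turn one profile into independent (marker, value) rows, one per field."""
    return {
        field: (marker(secret, person_id, field), value)
        for field, value in record.items()
    }


# Hypothetical key; in this sketch it would be kept outside the database.
SECRET = b"example-key-kept-outside-the-database"

rows = split_record(
    SECRET,
    42,
    {"name": "Ada Example", "email": "ada@example.com", "phone": "555-0100"},
)

for field, (field_marker, value) in rows.items():
    # Each row would go into its own table; no row stores the person ID.
    print(field, field_marker[:12], value)
```

Each row carries a different marker and none of them contains the person ID, so a stolen copy of the tables alone offers no column on which to join them.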
Of course, this will not protect the system from being stolen in its entirety (the code, the externally stored metadata, and the database), but that is much more complicated than simply copying a single file over the Internet or onto a thumb drive. Stealing everything would require access to not one but at least three separate locations, and the perpetrator or perpetrators can be unmasked and identified much more easily if appropriate authorization controls are in place.
From opponents of this approach, I sometimes hear that it comes with development, performance, and storage costs. Today, in 2025, this objection is laughable, and here’s why.
Indeed, once the protection measures are in place, developers can no longer use a single SELECT statement to retrieve all personal data at once. Some code is necessary to re-assemble the PII from its constituent parts. Fortunately, this code only needs to be put in place once, and going forward, the developers will leverage a small set of functions or procedures.
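As an illustration of how small that set of functions can be, here is a hedged sketch using an in-memory SQLite database. It assumes one table per PII field keyed by an HMAC-SHA256 marker; the table layout, the names `save_person` and `load_person`, and the secret key are hypothetical and are not taken from the CRM described above.

```python
import hashlib
import hmac
import sqlite3

# Hypothetical field list; each field lives in its own table.
FIELDS = ("name", "email", "phone")


def marker(secret: bytes, person_id: int, field: str) -> str:
    """Keyed marker linking one field to a person ID."""
    message = f"{person_id}:{field}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def create_schema(conn: sqlite3.Connection) -> None:
    """Create one (marker, value) table per field; no table holds a person ID."""
    for field in FIELDS:
        conn.execute(
            f"CREATE TABLE pii_{field} (marker TEXT PRIMARY KEY, value TEXT NOT NULL)"
        )


def save_person(conn, secret: bytes, person_id: int, record: dict) -> None:
    """Write each field of a profile to its own table under its own marker."""
    for field in FIELDS:
        if field in record:
            conn.execute(
                f"INSERT OR REPLACE INTO pii_{field} (marker, value) VALUES (?, ?)",
                (marker(secret, person_id, field), record[field]),
            )


def load_person(conn, secret: bytes, person_id: int) -> dict:
    """Reassemble a profile by recomputing each marker and looking it up."""
    profile = {}
    for field in FIELDS:
        row = conn.execute(
            f"SELECT value FROM pii_{field} WHERE marker = ?",
            (marker(secret, person_id, field),),
        ).fetchone()
        if row is not None:
            profile[field] = row[0]
    return profile


# Usage with a hypothetical key that would be stored outside the database.
SECRET = b"example-key-kept-outside-the-database"
conn = sqlite3.connect(":memory:")
create_schema(conn)
save_person(conn, SECRET, 42, {"name": "Ada Example", "email": "ada@example.com", "phone": "555-0100"})
profile = load_person(conn, SECRET, 42)
print(profile)
```

In this sketch the single SELECT is replaced by one keyed lookup per field, and the rest of the application only ever calls `save_person` and `load_person`. A caller without the correct key gets an empty profile back.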
Cryptographic functions and database access have become computationally cheap. Compared with cloud and network latency, computing the markers that link elements of personal information back to the person’s ID takes a tiny fraction of the time that users spend waiting for routing and packet transfer. Also, changes to personal information make up only a small fraction of the workload of many online systems. And finally, storage has become, and is still becoming, cheaper by the day.
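The cost claim is easy to check locally. The sketch below times a marker computation, assuming the marker is an HMAC-SHA256 over the person ID and field name; that construction and the key are assumptions made for illustration, and the measured figure will vary by machine.

```python
import hashlib
import hmac
import timeit

# Hypothetical key, standing in for externally stored metadata.
SECRET = b"example-key-kept-outside-the-database"


def marker(secret: bytes, person_id: int, field: str) -> str:
    """Keyed marker linking one field to a person ID."""
    message = f"{person_id}:{field}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def seconds_per_marker(calls: int = 100_000) -> float:
    """Average wall-clock time of one marker computation."""
    total = timeit.timeit(lambda: marker(SECRET, 42, "email"), number=calls)
    return total / calls


per_call = seconds_per_marker()
print(f"{per_call * 1e6:.2f} microseconds per marker")
```

Comparing the printed figure with the round-trip time of a typical request to the same system puts the overhead in perspective.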
Today, there is really no excuse for allowing data thieves to steal PII. Any entity that allows it to happen is simply lazy, and there is no other way to put it. Reach out for assistance in securing your data, and together we will make it prohibitively hard for hackers to damage your business or reputation.