Beyond Deterministic Thinking: Embracing PII 2.0 for a Nuanced Approach to Data Classification

Classifying personally identifiable information (PII) requires a clear understanding of what constitutes PII within an operational context. That understanding comes from a comprehensive data inventory and mapping exercise.

One of the capabilities of the Pretectum CMDM is the ability to designate certain stored data as PII. This capability exists so that visible access to sensitive data can be secured more adequately.

In a call-centre use case, for example, an agent might identify the customer record from the dialling-in number, via calling line identification (CLI) integration and record matching. A second verification step might then require the customer to give their last name or postal code, to confirm that they are who they are inferred to be. Under such circumstances, determining exactly which data attributes are PII may be challenging.

In classifying customer data that is held, any given organisation should:

Identify all data they collect, store, and process. This includes understanding the source of the data, its purpose, and how it’s used in the execution of organisational activity.
Categorise the data based on its sensitivity. This involves determining whether the data directly identifies individuals or can be used in combination with other data to identify individuals. For example, names, addresses, government IDs are clearly PII, while data like purchase history, browsing data, or device IDs might become PII when combined with other data.
Establish clear thresholds for classification. Using a deterministic approach, organisations might decide, for example, that any data point that can identify an individual with, say, 70% certainty is classified as PII.
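
As a minimal sketch of such a threshold rule, the Python below classifies attributes against the 70% figure used above. The attribute names and certainty values are invented for illustration, and `Attribute` and `is_pii` are hypothetical helpers, not part of any product API.

```python
from dataclasses import dataclass

# The 70% figure comes from the example in the text; it is a policy choice,
# not a regulatory standard.
PII_THRESHOLD = 0.70


@dataclass(frozen=True)
class Attribute:
    name: str
    # Estimated probability (0.0 to 1.0) that this attribute alone
    # identifies an individual. Values below are illustrative only.
    identification_certainty: float


def is_pii(attribute: Attribute, threshold: float = PII_THRESHOLD) -> bool:
    """Deterministic rule: PII if the certainty meets or exceeds the threshold."""
    return attribute.identification_certainty >= threshold


attributes = [
    Attribute("government_id", 0.99),
    Attribute("email", 0.90),
    Attribute("postcode", 0.05),
]

classified = {a.name: is_pii(a) for a in attributes}
print(classified)
```

The rule is easy to audit, but it depends entirely on how the certainty values are estimated, and those estimates are exactly what is hard to pin down.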

Keep in mind, though, that identifiability is not always quantifiable. While some data points, like a unique government ID, have a near 100% certainty of identification, other data points exist on a spectrum of identifiability depending on context and available information. For instance, a date of birth alone might not be considered PII, but combined with a postcode, it can significantly narrow down an individual's identity.
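
One rough way to see the effect of combining attributes is to measure how many records become unique once fields are taken together. The sketch below does this on a tiny, made-up dataset; `unique_fraction` is a hypothetical helper and the records are not real data.

```python
from collections import Counter


def unique_fraction(records, keys):
    """Return the fraction of records whose combination of `keys` occurs exactly once."""
    if not records:
        return 0.0
    combos = Counter(tuple(r[k] for k in keys) for r in records)
    unique = sum(1 for r in records if combos[tuple(r[k] for k in keys)] == 1)
    return unique / len(records)


# Made-up records for illustration only.
records = [
    {"dob": "1980-01-01", "postcode": "AB1"},
    {"dob": "1980-01-01", "postcode": "CD2"},
    {"dob": "1975-06-15", "postcode": "AB1"},
    {"dob": "1975-06-15", "postcode": "AB1"},
]

print(unique_fraction(records, ["dob"]))              # no record is unique on date of birth alone
print(unique_fraction(records, ["postcode"]))         # one record in four is unique on postcode
print(unique_fraction(records, ["dob", "postcode"]))  # two records in four are unique on the pair
```

In this toy dataset neither field singles out many people on its own, yet the pair isolates half of the records. Real customer datasets would need the same measurement run against the actual population.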

Technological advancements and the increasing availability of data can change the identifiability of data points too. What might have been considered non-PII a few years ago could become PII with new analytical techniques.

Once the data is classified, you should focus on securing the PII in proportion to its sensitivity and the potential risk of harm associated with its compromise.

Approaches to this might include:

Implementing access controls: Limiting access to PII data only to authorized personnel whose job roles require it. This involves role-based access control (RBAC) to enforce granular permissions, something the Pretectum CMDM inherently supports.
Encryption: Protecting PII data both at rest and in transit using robust encryption algorithms. Encrypting databases and storage devices ensures that even if unauthorized access occurs, the data remains unintelligible, again, an inherent trait of the Pretectum CMDM.
Data masking: Replacing sensitive PII data with non-sensitive, fictional data for non-production environments or when sharing data with third parties for analytics or testing purposes. This ensures that the original PII data is not exposed. Pretectum CMDM presents PII as masked data represented by asterisks.
Ethical walls: Implementing mechanisms to prevent specific departments or individuals within the organisation from accessing PII that’s not relevant to their work or that could create a conflict of interest. Here again, we expect RBAC to be used but the federated approach to customer MDM also allows for the compartmentalization of customer datasets.
Regular security audits and monitoring: Continuously monitoring access logs, system activity, and user behavior for suspicious patterns that might suggest a security breach. Regularly conducting security audits helps to identify vulnerabilities and assess the effectiveness of existing security controls. This is another control within the application that allows you to manage the risks associated with potentially sensitive customer data.
Data loss prevention (DLP): Implementing systems that track the movement of sensitive PII data within and outside the organisation, flagging unusual data transfer patterns. RBAC within Pretectum CMDM can prevent bulk extraction actions.
Secure disposal of PII data: Implementing secure data deletion or destruction policies when PII data is no longer required. This might involve data wiping, overwriting, or physical destruction of storage devices.
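
To make the combination of role-based access and asterisk masking concrete, here is a minimal sketch. It does not use or describe the Pretectum CMDM API; the field designations, role names, and the `view_record` helper are all assumptions made for illustration.

```python
# Hypothetical designation of which fields count as PII.
PII_FIELDS = {"last_name", "postal_code"}

# Hypothetical role permissions: may this role see PII in clear text?
ROLE_CAN_VIEW_PII = {
    "data_steward": True,
    "call_centre_agent": False,
}


def mask(value: str) -> str:
    """Replace every character with an asterisk."""
    return "*" * len(value)


def view_record(record: dict, role: str) -> dict:
    """Return a copy of the record with PII fields masked unless the role may view them."""
    can_view = ROLE_CAN_VIEW_PII.get(role, False)  # unknown roles default to masked
    return {
        field: value if can_view or field not in PII_FIELDS else mask(str(value))
        for field, value in record.items()
    }


record = {"customer_id": "C-1001", "last_name": "Smith", "postal_code": "90210"}

print(view_record(record, "call_centre_agent"))
print(view_record(record, "data_steward"))
```

Defaulting unknown roles to the masked view is the safer failure mode. A length-preserving mask still reveals how long the value is, so a fixed-width mask may be preferable where that matters.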

While the above deterministic approach to PII classification and security is helpful, it has limitations. The strict categorisation of data as either PII or non-PII fails to acknowledge the evolving nature of identifiability in the age of big data and advanced analytics. What might be considered non-PII today could become PII tomorrow with the availability of new data sources or advanced correlation techniques.

There is also the concept of PII 2.0, proposed by Daniel J. Solove of George Washington University Law School and Paul M. Schwartz, which offers a more nuanced perspective. PII 2.0 moves away from a binary classification of data and instead conceptualizes identifiability as a spectrum of risk.

Instead of rigidly categorizing data, organisations might evaluate their data assets by considering the likelihood of identification based on a number of factors:

Data itself: The nature of the data, its sensitivity, and whether it directly or indirectly identifies an individual.
Data in Context: How the data is used, combined with other data, and the potential harm associated with its disclosure.
Available technology: The current and emerging technologies that can be used to link and identify individuals.
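
The three factors above could be combined into a single risk score and mapped to tiers, as in the sketch below. The scoring formula, the tier boundaries, and the factor values are assumptions chosen for illustration; they are not taken from Solove and Schwartz, who describe the idea in legal rather than numerical terms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskFactors:
    # Each factor is an estimated contribution to identification risk, 0.0 to 1.0.
    data_itself: float           # nature and sensitivity of the data
    data_in_context: float       # how it is used and combined with other data
    available_technology: float  # linking techniques currently available


def identification_risk(f: RiskFactors) -> float:
    """Combine the factors so that any single high factor raises the overall risk.

    This is a 'noisy-OR' combination, chosen here purely for illustration.
    """
    return 1 - (1 - f.data_itself) * (1 - f.data_in_context) * (1 - f.available_technology)


def risk_tier(score: float) -> str:
    """Map a score to a tier; the 0.3 and 0.7 boundaries are arbitrary assumptions."""
    if score >= 0.7:
        return "high"
    if score >= 0.3:
        return "moderate"
    return "low"


purchase_history = RiskFactors(data_itself=0.2, data_in_context=0.3, available_technology=0.1)
score = identification_risk(purchase_history)
print(round(score, 3), risk_tier(score))
```

Because the technology factor is an input rather than a constant, the same dataset can be re-scored as new linking techniques emerge, which is the point a fixed PII or non-PII label misses.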

PII 2.0 encourages organisations to adopt a risk-based approach to PII classification and security. Instead of treating all data as either PII or non-PII, organisations should focus on evaluating the risk of identification associated with different data sets. They should also implement data minimization strategies, that is, collect only the data that's necessary and explore ways to anonymize or de-identify data. In addition, they should develop robust data governance frameworks that include clear policies and procedures for data classification, access controls, data retention, and data disposal.

By moving away from the traditional deterministic thinking of PII and embracing the nuanced framework of PII 2.0, organisations could better navigate the complex landscape of data privacy in a way that is both responsible and pragmatic.

Such an approach enables organisations to protect individual privacy while still harnessing the value of data for innovation and growth.
