October 17, 2024

Safeguarding Personal Privacy In The AI Era

Data privacy is an issue of growing concern in the wake of explosive advances in commercially available AI systems and applications. Trained on ever larger datasets, AI algorithms could help resolve critical economic and societal problems. Yet they could also introduce significant challenges related to the intended or unintended misuse of personal digital information. Is there a way to protect society from intrusive data collection, or is online privacy a thing of the past?

The Nature Of Privacy Risks

Navigating privacy in the AI era was the focus of a recent panel discussion I participated in at a forum put on by La French Tech Toronto.

Historically, privacy was often defined in terms of what was not shared or made public. As digitalization spreads to most aspects of our lives, privacy now focuses on securing consent and maintaining control over the flow of data related to our physical selves and our social, economic and political activities, as well as managing data spillover from individuals who aren’t the intended target. Consent to the access, storage and use of this personal information is one of the cornerstones of many regulatory and policy frameworks.

Nonetheless, some data-hungry machine learning algorithms are being trained with mined data obtained without our explicit consent or even awareness, which can have a variety of negative impacts. One risk of the unsanctioned collection and use of personal information is that repressive regimes can leverage the data to create detailed records of movements and activities in order to target people. When such records are used to impede the participation of certain people in public life, the practice is known as model-enforced repression.

Another risk associated with the misuse of personal data is identity theft and targeted fraud. For example, deepfakes of a chief financial officer and other staff members at a Hong Kong-based multinational company were used to create an AI-generated videoconference. These re-creations persuaded a key staff member to transfer $25 million to a bank account set up by the fraudsters.

The Benefits Of Personal Data Sharing

Do the risks associated with the misuse of personal information mean we should curtail all use of private data? Absolutely not. While there are definable risks, sharing data has enormous benefits at all levels of society.

For example, access to data about driving behavior de-risks the operation of vehicles and lowers insurance premiums. This is the approach behind Tesla drivers’ Safety Scores, used to manage eligibility and determine Tesla Insurance rates. By incentivizing better driving, personalized data collection has also reduced long-term accident rates.

The sharing of data offers commercial benefits for customers and businesses. It is a growing necessity for online commercial activity, as demonstrated clearly when Alibaba deactivated its personalization features. This was shown to drive down click-through rates and product browsing, resulting in an 81% drop in buying rates. On the consumer side, sharing more granular data offers more tailored shopping experiences, which can ultimately improve competition and pricing.

Data sharing also saves lives. With some of the most stringent data protection laws in the world, Europe is seeing an estimated 50,000 avoidable deaths each year related solely to constraints on patient information sharing, which limit monitoring that could directly reduce neonatal mortality rates or better identify and treat HIV patients. That doesn’t even account for the exponential improvement in outcomes if data were used to train evidence-based AI medical models.

Regulatory approaches are tackling some of the issues related to data flows in an effort to improve privacy and control over personal data security. The EU’s 2018 General Data Protection Regulation requires explicit permission for the collection, storage and processing of personal data. The California Privacy Protection Agency has proposed regulations requiring data transparency by companies and allowing consumers to opt out of data collection.

While most frameworks offer certain protections, experts are nonetheless concerned about the downsides of stringent regulatory frameworks. They advocate interventions such as proportional taxation of the data collected by companies.

AI-Generated Solutions To Improving Personal Data Protection

Some protections against privacy breaches have already been generated by AI itself. Google used a common technique called enforced filters to blur faces and license plates in Google Street View, and ChatGPT relies on the same technique. The problem with enforced filters is that workarounds can still secure access to forbidden information from models.
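
To make the weakness of enforced filters concrete, here is a minimal sketch of a text-based filter and a simple workaround. It is an illustration under stated assumptions, not how Google or ChatGPT implement filtering: the two patterns and the function name enforce_filter are hypothetical, and a production filter would cover far more cases.

```python
import re

# Hypothetical patterns for illustration only; real filters are far broader.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")


def enforce_filter(text: str) -> str:
    """Replace matches of the known sensitive patterns with a placeholder."""
    text = EMAIL.sub("[REDACTED]", text)
    return PHONE.sub("[REDACTED]", text)


# A direct disclosure is caught by the filter.
direct = enforce_filter("Reach Ana at ana@example.com or 555-123-4567.")

# A trivially obfuscated disclosure passes through unchanged.
evasive = enforce_filter("Reach Ana at ana (at) example (dot) com.")

print(direct)
print(evasive)
```

The second call shows the workaround problem in miniature: the filter only blocks what its patterns anticipate, so rephrasing the same information is enough to get past it.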

Consequently, the tech sector is turning to more innovative AI-generated approaches. One of them is synthetic data generation, a technique being explored in Spain, where patient data is not allowed to be transferred outside of hospitals.

Here, my company is exploring ways to train AI on the country’s entire inventory of original personal data so that the model captures the statistical properties of that data without the threat of divulging identifiable personal information. This requires that we build a dataset of realistic patient data where none of the patients actually exist.
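
The idea of synthetic data can be sketched in a few lines. The example below is a deliberately naive illustration, not the approach described above: it assumes a tiny made-up table (real_patients), summarizes each column independently, and samples new records from those summaries. The function names fit_marginals and sample_patient are hypothetical. Real generators model the relationships between columns and add formal privacy safeguards, which this sketch does not.

```python
import random
import statistics
from collections import Counter

# Hypothetical toy records, invented for this example.
real_patients = [
    {"age": 34, "systolic_bp": 118, "diagnosis": "asthma"},
    {"age": 61, "systolic_bp": 142, "diagnosis": "diabetes"},
    {"age": 47, "systolic_bp": 130, "diagnosis": "asthma"},
    {"age": 72, "systolic_bp": 151, "diagnosis": "hypertension"},
    {"age": 55, "systolic_bp": 137, "diagnosis": "diabetes"},
]


def fit_marginals(records):
    """Summarize each column: mean/stdev for numbers, frequencies for labels."""
    model = {}
    for column in records[0]:
        values = [r[column] for r in records]
        if isinstance(values[0], (int, float)):
            model[column] = ("numeric", statistics.mean(values), statistics.stdev(values))
        else:
            counts = Counter(values)
            model[column] = ("categorical", list(counts), list(counts.values()))
    return model


def sample_patient(model, rng):
    """Draw one synthetic record using only the fitted summaries."""
    patient = {}
    for column, spec in model.items():
        if spec[0] == "numeric":
            _, mean, stdev = spec
            patient[column] = round(rng.gauss(mean, stdev))
        else:
            _, labels, weights = spec
            patient[column] = rng.choices(labels, weights=weights, k=1)[0]
    return patient


rng = random.Random(0)  # fixed seed so the run is repeatable
model = fit_marginals(real_patients)
synthetic_patients = [sample_patient(model, rng) for _ in range(1000)]
```

Only the column summaries are used when sampling, so no original row is copied. A sampled record can still coincide with a real one by chance, which is one reason practical systems need stronger guarantees than this sketch provides.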

Another AI technique designed to secure personal data is to selectively delete information that could compromise privacy, a process akin to what happens in the film Eternal Sunshine of the Spotless Mind. Currently, this technique poses some risk of harming training. It also needs to be certifiable so that it can hold up in court.

So we are devising techniques that avoid the need to completely retrain models. The process involves tensorizing fully trained large language models and then removing the connections associated with certain information in a way that does not affect other useful information.
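
The article does not give the details of this tensor-based procedure, so the sketch below swaps in a much simpler stand-in to show the underlying goal: editing trained weights so that one input is forgotten while unrelated inputs behave as before. It applies a rank-one update to a toy linear layer. The matrix W, the vectors forget_key and keep_key, and the function forget_direction are all hypothetical, and the guarantee shown holds only for inputs orthogonal to the forgotten one.

```python
def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def forget_direction(W, k):
    """Return W - (W k) k^T / (k . k), which maps k to the zero vector."""
    Wk = matvec(W, k)
    kk = sum(ki * ki for ki in k)
    return [
        [w - Wk[i] * kj / kk for w, kj in zip(row, k)]
        for i, row in enumerate(W)
    ]


# Toy "trained" layer with 3 inputs and 2 outputs.
W = [
    [2.0, 1.0, 0.0],
    [0.0, 3.0, 1.0],
]
forget_key = [1.0, 0.0, 0.0]  # input whose association should be erased
keep_key = [0.0, 1.0, 0.0]    # orthogonal input that must be unaffected

W_edited = forget_direction(W, forget_key)

print(matvec(W_edited, forget_key))  # response to the forgotten input
print(matvec(W_edited, keep_key))    # response to the retained input
print(matvec(W, keep_key))           # original response, for comparison
```

No retraining takes place: the weights are edited directly, and the response to the retained input is unchanged. In a real large language model, information is not stored in cleanly separable directions like this, which is part of what makes certifiable deletion hard.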

AI-Generated Solutions To Privacy Concerns

Machine learning is starting to drive some incredible benefits to society. It is critical to maintain the flow of data-driven innovations, which risk being stymied by overly cumbersome and unnecessarily restrictive regulatory frameworks. Yet there is a clear need to address privacy concerns.

Let us use the same drive and human ingenuity propelling advances in AI to create more of the tools that can safeguard against the misuse of data and protect the informational privacy rights of everyone.

Check the article here: Safeguarding Personal Privacy In The AI Era (forbes.com)