April 26, 2022

What are privacy enhancing technologies? The 5 best PETs for the modern tech stack

Privacy enhancing technologies protect data privacy in new ways. Legacy data anonymization techniques can no longer fully protect privacy. In their effort to mask or obfuscate data, they destroy data utility. As a result, these old technologies should not be considered privacy enhancing technologies, or PETs.

Examples of privacy enhancing technologies

There are five major emerging privacy enhancing technologies that can be considered true PETs: homomorphic encryption, AI-generated synthetic data, secure multi-party computation, federated learning, and differential privacy. These new-generation privacy enhancing technologies are crucial for using personal data safely.

Organizations handling sensitive customer data, like banks, are already using PETs to accelerate AI and machine learning development and to share data outside and across the organization. Most companies will end up using a combination of different PETs to cover all of their data use cases. Let's see how the five most promising privacy enhancing technologies work and when they come in handy!

1. Homomorphic encryption

Homomorphic encryption is one of the best-known privacy enhancing technologies. It allows third parties to process and manipulate data in its encrypted form. In simple terms: whoever performs the analysis never gets to see the original data. But that is also one of the technology's severe limitations. It is not helpful when the analyst has no prior knowledge of the dataset, because data exploration is virtually impossible on encrypted data.

Another limitation of homomorphic encryption is that it is incredibly compute-intensive and has restricted functionality. As a result, some queries are not possible on encrypted data. It is one of the least mature PETs, but a promising one when it comes to anti-money laundering and the detection of double fraud.
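To make "processing data in its encrypted form" concrete, here is a minimal sketch of an additively homomorphic scheme (Paillier). The article does not name a specific scheme, so this choice is an assumption, and the tiny primes are for readability only. A real system would use a vetted library and much larger keys.

```python
import math
import secrets

# Toy key material: two small primes, insecure and for illustration only.
P, Q = 1009, 1013
N = P * Q
N_SQUARED = N * N
G = N + 1  # a common, simple choice of generator
LAMBDA = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P - 1, Q - 1)


def _l(x: int) -> int:
    """The Paillier L function: L(x) = (x - 1) / N."""
    return (x - 1) // N


MU = pow(_l(pow(G, LAMBDA, N_SQUARED)), -1, N)  # modular inverse (Python 3.8+)


def encrypt(message: int) -> int:
    """Encrypt an integer 0 <= message < N with the public key (N, G)."""
    while True:
        r = secrets.randbelow(N - 1) + 1  # random r in [1, N - 1]
        if math.gcd(r, N) == 1:
            break
    return (pow(G, message, N_SQUARED) * pow(r, N, N_SQUARED)) % N_SQUARED


def decrypt(ciphertext: int) -> int:
    """Decrypt with the private key (LAMBDA, MU)."""
    return (_l(pow(ciphertext, LAMBDA, N_SQUARED)) * MU) % N


def add_encrypted(c1: int, c2: int) -> int:
    """Multiplying two ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % N_SQUARED


# The third party only ever handles ciphertexts.
encrypted_total = add_encrypted(encrypt(1200), encrypt(345))
print(decrypt(encrypted_total))  # prints 1545
```

The sketch also shows the restricted functionality mentioned above: this scheme supports addition on ciphertexts, but not, for example, arbitrary filtering or exploration of the encrypted values.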

2. AI-generated synthetic data

AI-generated synthetic data is one of the most versatile privacy enhancing technologies. AI-powered synthetic data generators are trained on real data. After training, the generator can create datasets of any size that are statistically near-identical to the original. Since none of the synthetic data points match the original data points, re-identification is impossible.
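The train-then-sample workflow can be sketched with a deliberately simple stand-in for an AI generator: a bivariate Gaussian fitted to two numeric columns. This is not how production generators work, since they use deep generative models and add privacy checks. The records and column meanings below are hypothetical.

```python
import math
import random
import statistics

# Hypothetical "real" records: (age, yearly income in thousands).
REAL_ROWS = [(23, 31.0), (35, 48.5), (41, 52.0), (29, 39.0),
             (52, 75.5), (47, 61.0), (38, 50.0), (60, 82.0)]


def fit(rows):
    """'Train' the toy generator: estimate the means and the 2x2 covariance."""
    xs = [r[0] for r in rows]
    ys = [r[1] for r in rows]
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    n = len(rows)
    var_x = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
    var_y = sum((y - mean_y) ** 2 for y in ys) / (n - 1)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in rows) / (n - 1)
    return mean_x, mean_y, var_x, var_y, cov_xy


def sample(model, size, seed=None):
    """Draw `size` brand-new rows from the fitted distribution."""
    mean_x, mean_y, var_x, var_y, cov_xy = model
    rng = random.Random(seed)
    # Cholesky factor of [[var_x, cov_xy], [cov_xy, var_y]] keeps the correlation.
    l11 = math.sqrt(var_x)
    l21 = cov_xy / l11
    l22 = math.sqrt(max(var_y - l21 ** 2, 0.0))
    rows = []
    for _ in range(size):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        rows.append((mean_x + l11 * z1, mean_y + l21 * z1 + l22 * z2))
    return rows


model = fit(REAL_ROWS)
small_batch = sample(model, 3, seed=1)      # e.g. a subset for software testing
large_batch = sample(model, 5000, seed=1)   # or far more rows than the original
```

The synthetic rows are drawn from the learned distribution rather than copied, so the output size is independent of the size of the training data.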

The most popular synthetic data use cases include data anonymization, advanced analytics, AI and machine learning. The synthesization process also allows for different kinds of data augmentation. Upsampling rare categories in a dataset can make AI algorithms more efficient. Subsetting large datasets into smaller but representative batches is useful for software testing. Advanced synthetic data platforms offer statistically representative data imputation and rebalancing features. Since synthetic datasets do not maintain a 1:1 relationship with the original data, subjects are impossible to re-identify. As a result, synthetic data is not suitable for use cases where re-identification is necessary.

3. Secure multi-party computation

Secure multi-party computation (SMPC) is a cryptographic methodology. It allows multiple parties to collaborate on encrypted data. As with homomorphic encryption, the goal is to keep data private from the participants in the computational process. Key management, distributed signatures, and fraud detection are some of the possible use cases. The limitation of secure multi-party computation is the resource overhead. Pulling off an SMPC computation successfully is tricky: everything has to be timed right, and processing has to happen synchronously.
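One common building block for this kind of collaboration is additive secret sharing. The minimal sketch below assumes three parties that want a joint total without revealing their own figures. The party inputs and function names are hypothetical, and real protocols add authenticated channels and protection against dishonest participants.

```python
import secrets
from typing import List

MODULUS = 2**61 - 1  # public modulus; all arithmetic is done modulo this value


def share(secret: int, parties: int) -> List[int]:
    """Split `secret` into random shares that sum to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(parties - 1)]
    shares.append((secret - sum(shares)) % MODULUS)
    return shares


def secure_sum(private_inputs: List[int]) -> int:
    """Compute the sum of all inputs without any party seeing another's input."""
    n = len(private_inputs)
    # Step 1: every party splits its input and sends share j to party j.
    distributed = [share(value, n) for value in private_inputs]
    # Step 2: party j adds up the shares it received (column j) and publishes that.
    partial_sums = [
        sum(distributed[i][j] for i in range(n)) % MODULUS for j in range(n)
    ]
    # Step 3: the published partial sums reveal only the total.
    return sum(partial_sums) % MODULUS


# Three hypothetical banks pool a count of suspicious transactions.
print(secure_sum([120, 75, 310]))  # prints 505
```

Each individual share is a uniformly random number, so it reveals nothing on its own. The total can only be reconstructed once every party has contributed its partial sum, which is where the synchronization requirement comes from.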

4. Federated learning

Federated learning is a specific form of machine learning. Instead of feeding the data into a central model, the data stays on the device, and multiple model versions are trained and operated locally. The results of this local training are model updates, which are fed back into the central model and improve it. This decentralized form of machine learning is especially prevalent in IoT applications.
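The loop of local training and central aggregation can be sketched as follows. This is a minimal illustration in the style of federated averaging, assuming a one-parameter model y = weight * x. The device datasets and function names are hypothetical.

```python
from typing import List, Tuple

Dataset = List[Tuple[float, float]]  # (x, y) pairs that never leave a device


def local_update(weight: float, data: Dataset, lr: float = 0.01, steps: int = 5) -> float:
    """Run a few gradient-descent steps on one device, starting from the global weight."""
    for _ in range(steps):
        grad = sum(2 * x * (weight * x - y) for x, y in data) / len(data)
        weight -= lr * grad
    return weight


def federated_round(global_weight: float, devices: List[Dataset]) -> float:
    """One round: devices train locally, the server averages the returned weights."""
    total_rows = sum(len(data) for data in devices)
    updates = [(local_update(global_weight, data), len(data)) for data in devices]
    # Weight each device's update by how much data it trained on.
    return sum(w * n for w, n in updates) / total_rows


def train(devices: List[Dataset], rounds: int = 20) -> float:
    weight = 0.0
    for _ in range(rounds):
        weight = federated_round(weight, devices)
    return weight


# Three hypothetical devices whose local data all follow y = 3x.
DEVICES = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(3.0, 9.0), (4.0, 12.0), (5.0, 15.0)],
    [(6.0, 18.0)],
]
final_weight = train(DEVICES)
print(round(final_weight, 3))
```

Only the locally trained weights travel to the server. The raw (x, y) pairs stay where they were collected.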

The training takes place on edge devices, such as mobile phones. Federated learning on its own doesn’t actually protect privacy; it only eliminates the need for data sharing in the model training process. The fact that data isn’t shared doesn’t mean privacy is safe. The model updates in transit from the edge devices could also be intercepted and leak private information. To prevent this, federated learning is often combined with another PET, like differential privacy.

5. Differential privacy

Differential privacy is not so much a privacy enhancing technology in itself as a mathematical definition of privacy. It quantifies the privacy leakage that occurs when analyzing a differentially private database. This measure is called the epsilon value. In an ideal world, or with an epsilon value of 0, the result of the analysis would not differ whether or not a given individual is present in the database.
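For reference, the standard formal statement of this definition is given below. The symbols are not from the article: M denotes a randomized analysis, D and D' are any two databases that differ in one individual's record, and S is any set of possible results.

```latex
\Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon} \cdot \Pr[\mathcal{M}(D') \in S]
```

With an epsilon of 0, the factor e^ε equals 1, so both probabilities must be equal: the presence or absence of one individual has no effect on the result.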

The higher the epsilon, the more privacy leakage can occur. In academia, epsilon values below 1 are recommended to achieve strong anonymization. In practice, it’s still a challenge to determine a suitable epsilon value. This is important to keep in mind, as differential privacy does not automatically guarantee adequate privacy protection. It simply offers a mathematical guarantee for the upper bound of potential privacy leakage. So setting the epsilon value right is of utmost importance. It needs to be low enough to protect privacy, but not so low that the noise needed to achieve it destroys data utility.
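The trade-off between epsilon and utility can be illustrated with the Laplace mechanism, one well-known way to achieve differential privacy for numeric queries. The article does not prescribe a mechanism, so this choice is an assumption. The count and the helper names are hypothetical, and epsilon must be greater than 0 here.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Release a count with noise calibrated to epsilon."""
    sensitivity = 1.0  # adding or removing one person changes a count by at most 1
    return true_count + laplace_noise(sensitivity / epsilon, rng)


def mean_abs_error(true_count: int, epsilon: float, trials: int = 2000, seed: int = 0) -> float:
    """Average distance between the released count and the true count."""
    rng = random.Random(seed)
    errors = (abs(private_count(true_count, epsilon, rng) - true_count) for _ in range(trials))
    return sum(errors) / trials


for eps in (0.1, 0.5, 1.0, 5.0):
    print(f"epsilon={eps}: mean absolute error ~ {mean_abs_error(1000, eps):.2f}")
```

The noise scale is sensitivity divided by epsilon, so the expected error grows as epsilon shrinks. An epsilon of 0.1 gives roughly ten times the error of an epsilon of 1.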

More often than not, privacy practitioners use differential privacy in combination with another PET, such as federated learning.

Which privacy enhancing technology to use when?

The benefits and limitations of the different privacy enhancing technologies need to be weighed carefully. Some of them are more use case agnostic than others, but most organizations will have to invest in more than one PET to cover all use cases. Some legacy anonymization techniques might also have a place in the data tech stack as additional measures, but their use should be limited.

Which privacy enhancing technology to choose when? Image courtesy of Mobey Forum
