HubSpot Inc.

11/15/2024 | News release | Distributed by Public on 11/15/2024 18:05

AI Data Protection: How To Mitigate Risk While Using AI [Expert Advice]

Published: November 15, 2024

Automation. Productivity. Turbo-charging your ops by "delving" into "the ever-evolving landscape" of AI. Sounds great, right?

It's no wonder, then, that Informatica reports nearly half of data leaders (45%, to be exact) have already implemented generative AI in their processes. An additional 53% say they plan to, with 36% of those predicting implementation within the next two years.

However, as we rush toward AI adoption, we can't forget that without appropriate safeguards, it can pose a significant data protection and privacy risk.

Don't just take my word for it, though. Informatica's research reports that data privacy and protection remain top concerns with the implementation of AI. Many of the experts I interviewed for this article share concerns regarding AI's impact on data protection, too. Keep reading to learn why data leaders are prioritizing AI data protection.

Does Using AI Pose a Privacy Risk?

According to Immuta's AI Security & Governance Report, which surveyed 700+ data experts from around the globe, 80% of respondents said AI is making data security more challenging.

More specifically, 52% reference the possibility of AI attacks via threat actors as a significant risk. Meanwhile, 57% have witnessed increased AI-driven attacks in the last year. These threat actors may attempt to access and leak your company's data. And that may include your customers' data, too. Yikes!

I spoke to Rob Stevenson, the founder of BackupVault, a company specializing in secure cloud backup solutions. Stevenson, who has years of experience in data protection and building GDPR-compliant businesses, shares the concerns of Immuta's report.

"Yes, AI can certainly pose a privacy risk if it isn't properly regulated or designed with data protection in mind," says Stevenson. "One big concern is that AI often requires large amounts of data, which can sometimes include personal or sensitive information. If this data isn't anonymized or securely handled, it opens the door to misuse, whether intentional or through hacking."

He adds, "AI systems that involve personal profiling or behavior prediction, such as those used for targeted ads, can lead to invasive tracking. The risk comes when data is collected without transparency or when individuals don't have control over how their information is used."

Frank Milia, Chief Operations Officer at ITAMG, an IT asset disposal (ITAD) and recycling service, highlights a different privacy risk to consider regarding AI.

"It's important to have policies, tools, and training in place that impede users from inappropriately sharing protected data or intellectual property with AI tools," says Milia. He explains that the issue is in part due to the decentralized nature of how data is stored in AI models. This makes "locating, protecting, managing, and deleting protected data more challenging than in traditional storage architecture."

"In other words, once a user overshares sensitive information, there is a significant risk of that data being distributed across multiple systems and being used to train the AI model," warns Milia. "Even if one could remove the sensitive input, the challenge of confirming data sanitization of the neural network remains."

So the short answer is yes, using AI does pose a privacy risk. But that's not the end of the story. Read on to learn about the pros and cons of AI and data privacy.

AI and Data Protection

I've spoken to a bunch of data experts and folks with first-hand experience implementing AI with data protection top of mind. Together, we break down the tech's strengths and weaknesses so you can better mitigate risk.

Strength: AI can simulate cyber attacks and attackers' behavior.

The first benefit I learned about is that AI can simulate cyber attacks so you can test your software and strengthen it against real attacks. Martin Fix, the Co-Lead of the AI Innovation Hub and Technical Director at Star, shares his thoughts on AI data protection.

"I haven't heard of AI directly 'protecting' data unless it's used to identify non-typical behavior or usage parameters (Microsoft has offered this for a couple of years already like other providers such as Cisco and Sophos)," says Fix. However, "AI can help to increase data privacy and data protection by simulating attacks and attackers' behavior. Basically, AI could be used as a white hacker tool or virtual white hacker."

Alex Bekker, Head of the Data Analytics Department at software development and consulting company ScienceSoft, shares an example of this strength in action.

Example: In one of their recent projects, they developed an AI model to improve their client's penetration testing tool. "Before the AI enhancement, the tool was used to scan networks and devices for common vulnerabilities and exposures (CVEs). The rest of the pen-testing process was fully manual - the cybersecurity team chose optimal attack vectors, simulated them, and prepared further attacks based on the results," says Bekker.

Bekker explains that now, with the integrated AI model, this process is fully automated. This allows their client to provide pen-testing services to multiple customers simultaneously. He adds, "Since the model was trained on vast datasets and fine-tuned with real-world attack simulation results, it can self-learn to provide accurate results in unseen scenarios."

Weakness: AI can be overly sensitive.

While AI can act as what Fix calls a "virtual white hacker," Rhett Crites, founder of Theme Park Brochures, has found that it can also be overly sensitive and end up blocking good-faith users.

"It tends to misidentify risks, leading to false positives that sometimes create more work than necessary," says Crites. "In fact, I've seen cases where AI flagged normal user activity as a risk, which can frustrate both the team and users." Crites shares a first-hand example of this below.

Example: "We once integrated an AI security tool to monitor our user accounts. The system was set up to detect unusual login behavior, and during a busy holiday period, the AI flagged a surge of logins from a particular IP range as suspicious. It automatically locked out multiple accounts, assuming there was a potential breach."

Crites explains that, in reality, they had launched a marketing campaign in that region, and the traffic increase was entirely normal. The AI mishap led to numerous user complaints, with people unable to access their accounts during the campaign.

He adds, "We had to manually review each case and adjust the system's sensitivity to prevent this from happening again. The experience taught us that while AI is helpful for security, it needs to be constantly refined to avoid these frustrating disruptions."
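
To picture what "adjusting the system's sensitivity" could look like, here is a minimal sketch of a tiered response: weak signals are allowed, moderate ones go to a human, and only extreme spikes trigger an automatic lockout. Everything in it is an assumption for illustration (the function name, the hourly login counts, the thresholds, and the `expected_campaign` flag); it is not the tool Crites's team used.

```python
from statistics import mean, stdev

def login_spike_action(history, current, review_z=3.0, lock_z=6.0,
                       expected_campaign=False):
    """Return 'allow', 'review', or 'lock' for the current hourly login count.

    history: past hourly login counts for one IP range (hypothetical input).
    review_z / lock_z: illustrative sensitivity thresholds, tuned per deployment.
    expected_campaign: True when marketing has announced traffic for this region.
    """
    if len(history) < 2:
        return "review"  # not enough baseline data to judge automatically
    mu, sigma = mean(history), stdev(history)
    if sigma > 0:
        z = (current - mu) / sigma
    else:
        z = 0.0 if current == mu else float("inf")
    if z < review_z:
        return "allow"
    # A known campaign downgrades an automatic lockout to a human review.
    if z >= lock_z and not expected_campaign:
        return "lock"
    return "review"

history = [100, 110, 95, 105, 90]
print(login_spike_action(history, 400))                          # lock
print(login_spike_action(history, 400, expected_campaign=True))  # review
```

The design choice worth noting is the middle tier: routing borderline cases to a person costs some review time, but it avoids locking out real users during a planned traffic surge.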

Strength: AI can identify vulnerabilities human testers may overlook.

While sharing his example of using AI to simulate security attacks, Alex Bekker mentioned that AI helps ScienceSoft's client identify vulnerabilities that human testers may overlook. This seems valuable to me, since it enhances a human task and fills a gap. Kevin Shahnazari, the founder of credit card recommendation platform FinlyWealth, mirrors this experience.

Example: Shahnazari first explained that FinlyWealth handles sensitive financial data daily, and its AI models process millions of credit card transactions monthly.

Regarding the system's strengths, he then shared, "The system even detects some unusual patterns that human analysts miss. For instance, it flagged a series of small transactions later discovered to be a test run of actual credit card theft. The system protected over 50 customers from fraud in this area before the criminals could charge more significantly."

Weakness: AI can leak sensitive data.

Looking at the data, around 55% of experts say the accidental leaking of sensitive information by large language models (LLMs) is their primary concern. Further, over 50% of data experts worry user prompts could expose sensitive information via LLMs. This raises red flags for me.

More worrying still? This isn't just an intellectual war-game exercise or a theoretical possibility. The leaking of sensitive data by AI is already happening. Let's hear from FinlyWealth's founder and CEO once more about how.

Example: "We learned that AI privacy risks are real," says Shahnazari. "Last year, our recommendation engine spat out credit card numbers in its debugging logs. We caught it during an audit, but in this case, it demonstrated how AI can leak sensitive data without intent. Think of AI as a chatty employee - sometimes it shares more than it should."
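
One common guard against the kind of leak Shahnazari describes is to scrub card-like numbers before a log line is ever written. The sketch below uses Python's standard `logging.Filter` hook; the pattern, the `redact_cards` helper, and the `CardRedactionFilter` class are my own illustrative names, not anything from FinlyWealth's stack, and a real deployment would likely cover more data types than card numbers.

```python
import logging
import re

# Candidate card numbers: 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")

def luhn_valid(digits: str) -> bool:
    """Luhn checksum, used so that arbitrary long numbers are not redacted."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def redact_cards(text: str) -> str:
    """Replace Luhn-valid card-like numbers with a placeholder."""
    def _replace(match):
        digits = re.sub(r"[ -]", "", match.group())
        return "[REDACTED CARD]" if luhn_valid(digits) else match.group()
    return CARD_PATTERN.sub(_replace, text)

class CardRedactionFilter(logging.Filter):
    """Rewrites each log record so card-like numbers never reach a handler."""
    def filter(self, record: logging.LogRecord) -> bool:
        record.msg = redact_cards(record.getMessage())
        record.args = ()  # the message is already fully formatted
        return True

# 4111 1111 1111 1111 is a widely used test card number, not a real account.
print(redact_cards("charge failed for card=4111 1111 1111 1111"))
# charge failed for card=[REDACTED CARD]
```

Attaching the filter to a handler (for example with `handler.addFilter(CardRedactionFilter())`) would apply it to everything that handler emits, including debug output.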

Strength: AI can analyze and respond to threats in real time.

According to BackupVault founder Rob Stevenson, AI can help identify potential security risks in real time. I think this is incredibly valuable. "It can monitor massive amounts of traffic on a network and quickly pick up unusual patterns that could signal a breach or malware," says Stevenson. "It can also enhance encryption techniques by automating and improving their efficiency."

Tharindu Fernando, a tech expert and full-stack developer at Net Speed Canada, has experienced this first-hand.

Example: "AI can be a powerful ally in bolstering data security," says Fernando. "At Net Speed Canada, we've deployed AI-powered intrusion detection systems that continuously monitor network traffic, identifying and responding to potential threats in real-time. This provides an additional layer of security that far surpasses traditional methods in both speed and accuracy."

Mira Nathalea, Chief Marketing Officer at SoftwareHow, shares Fernando's experience. "AI can be a powerful tool for safeguarding data through its role in threat detection and encryption," says Nathalea.

"It flags suspicious activity before we even notice something's off. On top of that, AI encryption tools kick in automatically, so if there's a breach, the data is scrambled and less likely to be misused. That kind of automation makes my team sleep better at night."

Weakness: AI models can be easily fooled.

Tidio's research shows that nearly all internet users (96%) know of AI hallucinations, while around 86% have personally experienced them. Further, 46% of respondents frequently encounter AI hallucinations, and 35% do so occasionally. Meanwhile, 75% of AI users have been misled at least once.

But here's where I think it gets even more interesting. AI doesn't just have the potential to unintentionally fool users in a desperate scramble to provide the requisite information… Shahnazari states, "AI models can be easily fooled," too.

Example: "For example, people gaming our system through gradual changes in spending habits to receive better card recommendations," says Shahnazari. "Users gradually increased their reported income by $500/month until they qualified for premium cards. The AI did not flag it as suspicious because the changes were minor."
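
The gap in that story is that each change was judged only against the previous month. A simple countermeasure is to also compare the latest value against the original baseline. Here is a minimal sketch under my own assumptions: the function name, the $1,000 step limit, and the 25% total-rise limit are all hypothetical, and Shahnazari doesn't say how FinlyWealth actually closed the hole.

```python
def flag_gradual_drift(reported_incomes, step_limit=1000, total_limit_ratio=0.25):
    """Flag a history of self-reported monthly incomes.

    Returns 'step' if any single change exceeds step_limit, 'drift' if every
    step is small but the total rise since the first report exceeds
    total_limit_ratio, and None otherwise. Both limits are illustrative.
    """
    if len(reported_incomes) < 2:
        return None
    for prev, curr in zip(reported_incomes, reported_incomes[1:]):
        if abs(curr - prev) > step_limit:
            return "step"
    baseline = reported_incomes[0]
    if baseline > 0 and (reported_incomes[-1] - baseline) / baseline > total_limit_ratio:
        return "drift"
    return None

# Seven reports, each $500 higher than the last: no single step looks odd,
# but the total rise from 4000 to 7000 is 75% of the baseline.
incomes = [4000 + 500 * i for i in range(7)]
print(flag_gradual_drift(incomes))  # drift
```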

Strength: AI can process large amounts of data faster than humans.

It's no secret: AI can process data at incredible speeds, even in microseconds, depending on the tech you're working with. Tomasz Borys, Senior VP of Marketing & Sales at Deep Sentinel, a company that combines the power of smart security camera technology and live guards, agrees. "AI has significant strengths in data protection," says Borys. "It can analyze vast amounts of data in real-time, detecting anomalies and potential breaches far faster than human analysts."

Example: Borys shares a use case: "One way we use AI to protect data is through advanced pattern recognition in network traffic. Our AI models can detect unusual data access patterns that might indicate a breach attempt, allowing us to respond proactively. Our AI-powered security system can identify and respond to threats in milliseconds, significantly enhancing our clients' data protection."

Need help wrangling your data as an existing HubSpot user? Try these Data Management Apps for HubSpot. And if you haven't tried HubSpot AI yet, you can road-test it for free.

Weakness: AI can falter on data volume and quality.

Returning to Informatica's research, data leaders see the increasing volume and variety of data as a significant AI roadblock. Further, 42% cite data quality as another major hurdle for folks integrating AI into their business.

Tharindu Fernando, a full-stack developer, also cites data volume as a potential threat to data protection. "The sheer volume of data processed by AI systems can pose significant privacy risks if not properly managed. To mitigate these risks, I've found that implementing robust encryption protocols and adhering strictly to data protection regulations is crucial."

Meanwhile, Rob Stevenson shares insight on data quality. "A major weakness is the reliance on the quality of the data being fed into the system," says Stevenson. "If the data is compromised or biased, the AI's outputs can be flawed, leading to poor decision-making. AI systems can also become attractive targets for cybercriminals, especially if the models hold sensitive information."

Tomasz Borys from Deep Sentinel shares an example below.

Example: "We've found that AI can be both a powerful tool for data protection and a potential risk if not managed properly," says Borys. "Here's our perspective based on our experience: AI does pose privacy risks if not implemented correctly. For instance, AI systems often require large datasets for training, which could potentially expose sensitive information if not properly anonymized."

He adds, "We once encountered this issue when developing an AI model to detect suspicious behavior in security footage. We had to redesign our data preprocessing to ensure all identifiable information was removed before feeding it into the AI system."

Pro tip: Keep your data clean, clear, and under control with data quality software that does the heavy lifting for you.

How to Protect Data While Using AI

You know it, I know it: Life isn't always black and white. More often than not, it's shades of gray. Such is the case with how AI interacts with data protection and privacy. The kicker? AI can be both the cause of and the fix for data protection issues.

As Irina Maltseva puts it in their article about AI's Role in Protecting Consumer Data: "The best way to tackle bad AI is with good AI." That's why our experts return to show us how to protect data while integrating AI into your business.

1. Take a multi-faceted approach.

"I always advocate for a multi-faceted approach," says tech expert Tharindu Fernando. He explains that this includes employing strong encryption algorithms, implementing granular access controls, and regularly conducting security audits of their AI systems.

Additionally, "In healthcare projects, we take it a step further by anonymizing sensitive data before it's processed by our AI models, ensuring HIPAA compliance while still benefiting from AI's analytical capabilities."

2. Implement a strong data governance framework.

"In any data protection program, it is critical to have a strong data governance posture and ensure encryption and other strong access controls are in place," recommends COO Frank Milia. "Taking a proactive approach to data management will help ensure protected data in an AI environment is as secure as possible."

Tomasz Borys agrees with Milia, stating they follow a strict protocol to ensure data protection while using AI. Borys maps out this protocol below:

  1. We use federated learning techniques where possible, allowing AI models to learn from data without directly accessing it.
  2. We implement differential privacy, adding controlled noise to datasets to prevent individual data points from being identified.
  3. We regularly audit our AI systems for bias and potential privacy leaks.
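
To make the second item more concrete, here is a minimal sketch of one classic differential privacy technique, the Laplace mechanism, applied to a counting query. Note the assumptions: Borys describes adding noise to datasets, while this example adds noise to a query result, which is the textbook form. The function names and the epsilon value are mine, and Python's `random` module is used only to keep the sketch dependency-free; a production system would more likely rely on a vetted privacy library.

```python
import random

def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Noisy count of the records matching predicate.

    A counting query has sensitivity 1 (adding or removing one person changes
    the count by at most 1), so the Laplace scale is 1 / epsilon.
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)

rng = random.Random(42)          # seeded only so the sketch is repeatable
ages = [23, 35, 41, 29, 52, 38]  # made-up data
noisy = private_count(ages, lambda age: age > 30, epsilon=0.5, rng=rng)
print(round(noisy, 2))           # close to the true count of 4, but not exact
```

A smaller epsilon means more noise and stronger privacy; a larger one means more accurate answers and weaker privacy. Choosing that trade-off is a policy decision, not something the code can settle.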

According to Borys, these measures have helped Deep Sentinel maintain a perfect data protection record while leveraging AI's power.

He adds, "For me, AI is a tool, and like any tool, its impact depends on how it's used. The key takeaway is that by implementing strong data governance practices and privacy-preserving AI techniques, you can harness the power of AI for data protection while minimizing risks." This balanced approach has allowed them to innovate in security technology while maintaining the highest standards of data privacy.

3. Invest in the right tools and training.

"IT, risk, and cyber security professionals need to be up-to-date on the challenges of increasingly unstructured data storage and how it can affect even computer hardware management practices," Milia warns.

"For instance, the largest desktop manufacturers are marketing 'AI desktops' as the next big catalyst for enterprise end-use computing upgrades. These desktops may store data in logs, temporary caches, data lakes, or memory dumps that make isolating or deleting data more difficult."

Aside from keeping your team up to date, Milia recommends that organizations implement tools to track, manage, and audit data flow across AI systems to address these issues. This includes a need to develop tools and methods for purging sensitive data from AI models and performing data erasure on emerging AI hardware.

4. Properly encrypt and anonymize data.

"To make sure data is protected when using AI, start by ensuring that any data you collect is properly encrypted and anonymized before it's used by AI systems," says Stevenson. "This reduces the chance of exposing sensitive information if a breach does occur."

Pro tip: Stevenson also advises following the principle of minimal data collection - don't gather more information than you need to accomplish the task.
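
Here is a small sketch of how minimal data collection and identifier replacement might look in practice before a record reaches an AI system. The field names, the `prepare_record` helper, and the key handling are hypothetical. One caveat I'm confident about: a keyed hash like this is pseudonymization rather than full anonymization, since whoever holds the key can re-link records, so it complements encryption and access controls rather than replacing them.

```python
import hashlib
import hmac

def pseudonymize(value: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed hash (HMAC-SHA256)."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()

def prepare_record(record: dict, allowed_fields: set, id_field: str,
                   secret_key: bytes) -> dict:
    """Keep only the allowed fields and swap the identifier for a pseudonym."""
    cleaned = {k: v for k, v in record.items() if k in allowed_fields}
    cleaned[id_field] = pseudonymize(str(record[id_field]), secret_key)
    return cleaned

# Hypothetical record; in practice the key would come from a secrets manager.
key = b"replace-with-a-real-secret"
raw = {"email": "jane@example.com", "phone": "555-0100", "plan": "pro", "tickets": 3}
safe = prepare_record(raw, allowed_fields={"plan", "tickets"},
                      id_field="email", secret_key=key)
print(sorted(safe))  # ['email', 'plan', 'tickets']
```

The phone number never leaves the source system because the task doesn't need it, and the email is replaced by a stable token, so records for the same customer can still be grouped.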

5. Complement automation with human oversight.

"AI can be a valuable tool for improving data protection by providing more sophisticated defense mechanisms," says Stevenson. He explains that AI-driven systems can predict and stop threats in real time by detecting abnormal behaviors or vulnerabilities in a network that humans might miss. Furthermore, AI can assist with data classification, ensuring that sensitive information is correctly identified and treated with extra security layers.

"Automating these processes with AI can greatly reduce human error, which is a common cause of data breaches," he adds. "However, it's also important to complement AI with human oversight, regularly reviewing the models to ensure they are updated and functioning as intended."

Pro tip: Stevenson also recommends that you have regular audits and updates of the AI models, ensuring they comply with the latest security protocols.

6. Monitor data access patterns using AI.

"For data protection, we are monitoring data access patterns using AI," says Kevin Shahnazari. According to Shahnazari, their system raised a flag at 3 AM when an employee downloaded an unusual amount of customer data. It turned out to be authorized maintenance, but that kind of monitoring gives the team peace of mind.

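As a rough illustration of flagging an unusually large download, the sketch below compares the current download against a user's own history with a robust z-score (median and median absolute deviation). This is a simple statistical stand-in, not FinlyWealth's AI system; the function name, the threshold of 5, and the minimum history length are assumptions.

```python
from statistics import median

def is_unusual_download(past_row_counts, current_rows, threshold=5.0):
    """Flag a download whose size is far above a user's usual volume.

    Uses a robust z-score built from the median and the median absolute
    deviation (MAD), so one earlier outlier does not distort the baseline.
    """
    if len(past_row_counts) < 5:
        return True  # too little history: send to a human rather than guess
    med = median(past_row_counts)
    mad = median(abs(x - med) for x in past_row_counts)
    if mad == 0:
        return current_rows > med
    # 0.6745 rescales MAD so the score is comparable to a standard z-score.
    robust_z = 0.6745 * (current_rows - med) / mad
    return robust_z > threshold

usual = [120, 100, 130, 110, 90, 105]      # rows downloaded on past days
print(is_unusual_download(usual, 125))     # False
print(is_unusual_download(usual, 50000))   # True
```

As in Shahnazari's 3 AM example, a flag is a prompt for a human to check, not proof of wrongdoing.
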
Further, to protect data while using AI, the team at FinlyWealth follows three rules:

  1. Minimize data exposure - their models see only what they need.
  2. Use synthetic data for testing - they created 100,000+ fake user profiles for development.
  3. Regular model audits - monthly checks of AI outputs for potential leaks.
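
The second rule is easy to sketch. Below is a minimal generator for fake user profiles that draws every field from fixed lists and random ranges, so nothing is derived from real customers. The field names, name lists, and value ranges are invented for illustration and are not FinlyWealth's schema.

```python
import random

FIRST_NAMES = ["Alex", "Sam", "Jordan", "Taylor", "Casey"]
LAST_NAMES = ["Rivera", "Chen", "Okafor", "Novak", "Haddad"]

def synthetic_profiles(n: int, seed: int = 0):
    """Generate n fake user profiles; no field is derived from real customers."""
    rng = random.Random(seed)  # seeded so test fixtures are reproducible
    profiles = []
    for i in range(n):
        first, last = rng.choice(FIRST_NAMES), rng.choice(LAST_NAMES)
        profiles.append({
            "user_id": f"test-{i:06d}",
            "name": f"{first} {last}",
            # example.com is reserved for documentation and testing
            "email": f"{first.lower()}.{last.lower()}.{i}@example.com",
            "monthly_income": rng.randrange(2000, 15001, 100),
            "credit_score": rng.randint(300, 850),
        })
    return profiles

profiles = synthetic_profiles(100_000)
print(len(profiles))  # 100000
```

Purely random profiles like these are fine for exercising code paths, but they won't reproduce the statistical patterns of real customers, so they suit development and testing rather than model evaluation.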

7. Keep systems updated and use masking techniques.

"To protect data while using AI, you must keep the systems updated and use masking techniques," says Mira Nathalea.

"We make sure to regularly update the AI algorithms we use so they stay sharp against new threats. We also mask sensitive data during processing, so even if there's a problem, the most critical information stays protected."

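Masking can be as simple as hiding all but the last few characters of a sensitive field before it is processed or displayed. A minimal sketch, with hypothetical function and field names rather than anything from Nathalea's team:

```python
def mask_value(value: str, visible: int = 4, mask_char: str = "*") -> str:
    """Hide all but the last `visible` characters of a sensitive string."""
    if len(value) <= visible:
        return mask_char * len(value)
    return mask_char * (len(value) - visible) + value[-visible:]

def mask_record(record: dict, sensitive_fields: set) -> dict:
    """Return a copy of record with the named fields masked."""
    return {
        key: mask_value(str(val)) if key in sensitive_fields else val
        for key, val in record.items()
    }

# 4111111111111111 is a widely used test card number, not a real account.
print(mask_record({"card": "4111111111111111", "plan": "gold"}, {"card"}))
# {'card': '************1111', 'plan': 'gold'}
```
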
8. Bake privacy considerations into the AI systems from the ground up.

Finally, we hear from Parker Gilbert, the CEO and Co-founder of Numeric, an AI accounting automation company. Since his company uses AI extensively, Gilbert has given a lot of thought to the potential risks and benefits of AI when it comes to protecting sensitive financial data.

"Like any powerful technology, AI does have the potential to be misused in ways that could compromise privacy," Gilbert says. "However, I believe that with proper safeguards and responsible development, AI can actually enhance data protection. The key is to bake privacy considerations into the AI systems from the ground up, rather than trying to bolt them on after the fact."

To help achieve this, the team at Numeric has identified a series of questions to consider asking any software vendor that has embedded AI. "How did we come up with them? We didn't - they are many of the same questions that prospects have posed to us," says Gilbert.

Here's a rundown of the questions that Gilbert recommends asking:

  • Where are you using generative AI, or where have you integrated it?
  • Who are your current LLM providers? (E.g., OpenAI, Anthropic, etc.)
  • How is data used? Will data be used to train models in any way?
  • Are there any vector databases being used with your current AI implementation?
  • Is the data you provide to your model being sent anonymized or sanitized in any way, or is it just the raw data?
  • What type of fields are you sanitizing?
  • How are you ensuring that actual customer data isn't being leaked?
  • Are you SOC 2 certified? What measures are you taking to keep up with AI best practices as our knowledge develops?
  • When you transmit data, do you keep a history of all the queries that you are making?
  • What do you have in place to help make sure the model provides accurate data?
  • Are there access controls to turn off AI usage in the product?
  • What are your future plans for your AI implementation?

The Best Way to Tackle Bad AI is With Good AI

Love it, hate it, utterly indifferent to it: AI isn't going away anytime soon. In fact, AI implementation is on the rise. But with the increasing adoption of the technology comes the ever-increasing risk of data protection and privacy breaches.

If artificial intelligence is right for your business, none of this means you should put the brakes on. It does, however, require that you put thought into how you're implementing the tech. For example, are you vulnerable to external threats from hackers? Or internal threats like your favorite AI tool being an "overly chatty employee" and leaking sensitive data?

In either case, and somewhat ironically, I've learned that the best way to mitigate data protection issues caused by AI is with AI. That said, as I always emphasize, keep your human team in the driver's seat.
