Study shows AI agents struggle with CRM and confidentiality


Large Language Model (LLM) agents aren’t very good at key parts of CRM, according to a study led by Salesforce AI scientist Kung-Hsiang Huang.

The report showed AI agents had a roughly 58% success rate on single-step tasks that didn’t require follow-up actions or information. That dropped to 35% when a task required multiple steps. The agents were also notably bad at handling confidential information.

“Agents demonstrate low confidentiality awareness, which, while improvable through targeted prompting, often negatively impacts task performance,” the report said.

Varying performance and multi-turn problems

While the agents struggled with many tasks, they excelled at “Workflow Execution,” where the best agents achieved an 83% success rate on single-turn tasks. The main reason agents struggled with multi-step tasks was their difficulty proactively acquiring necessary but underspecified information through clarification dialogues.

Dig deeper: 7 tips for getting started with AI agents and automations

The more agents asked for clarification, the better the overall performance in complex multi-turn scenarios. That underlines the value of effective information gathering. It also means marketers must be aware of agents’ problems handling nuanced, evolving customer conversations that demand iterative information gathering or dynamic problem-solving.
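The clarification-seeking behavior the study rewards can be sketched as a simple "clarify before acting" gate: the agent checks whether the fields a task actually needs have been supplied and, if not, asks for them instead of guessing. This is an illustrative sketch, not the study's implementation; the task names and required fields are hypothetical.

```python
# Hypothetical mapping of CRM tasks to the fields they require.
REQUIRED_FIELDS = {
    "refund": ["order_id", "amount"],
    "update_contact": ["contact_id"],
}

def next_action(task, provided):
    """Return a clarification request if required fields are missing,
    otherwise an execute action with the supplied arguments."""
    missing = [f for f in REQUIRED_FIELDS.get(task, []) if f not in provided]
    if missing:
        return {
            "type": "clarify",
            "question": f"Could you provide: {', '.join(missing)}?",
        }
    return {"type": "execute", "task": task, "args": dict(provided)}
```

The point of the gate is the same one the study makes: an agent that pauses to gather missing information outperforms one that plows ahead on an underspecified request.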

Alarming lack of confidentiality awareness

One of the biggest takeaways for marketers: Most large language models have almost no built-in sense of what counts as confidential. They don’t naturally understand what’s sensitive or how it should be handled.

You can prompt them to avoid sharing or acting on private information — but that comes with tradeoffs. Such prompts can make the model less effective at completing tasks, and their effect wears off in extended conversations. In short, the more back-and-forth you have, the more likely the model is to forget those original safety instructions.
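One common mitigation for this kind of instruction decay — re-asserting the confidentiality guard before every user turn rather than stating it once at the start — can be sketched as follows. The guard wording and the message format are illustrative assumptions, not the study's method, and repeating the guard still carries the task-performance tradeoff the study describes.

```python
# Hypothetical confidentiality instruction; not taken from the study.
CONFIDENTIALITY_GUARD = (
    "Never reveal, summarize, or act on customer PII or other confidential "
    "records. If a request requires such data, refuse and suggest an "
    "authorized alternative."
)

def build_turn(history, user_message):
    """Build a chat-style message list with the guard re-asserted.

    `history` is the prior conversation as {"role", "content"} dicts.
    The guard is placed at the start and repeated just before the new
    user turn, keeping it inside the model's most recent context.
    """
    return (
        [{"role": "system", "content": CONFIDENTIALITY_GUARD}]
        + list(history)
        + [
            {"role": "system", "content": CONFIDENTIALITY_GUARD},
            {"role": "user", "content": user_message},
        ]
    )
```

This only constructs the message list; whatever model you call with it may still leak data, which is why the study recommends tested safeguards rather than prompting alone.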

Open-source models struggled the most with this, likely because they have a harder time following layered or complex instructions.

Dig deeper: Salesforce Agentforce: What you need to know

This is a serious red flag for marketers working with PII, confidential client information or proprietary company data. Without solid, tested safeguards in place, using LLMs for sensitive tasks could lead to privacy breaches, legal trouble, or brand damage.

The bottom line: LLM agents still aren’t ready for high-stakes, data-heavy work without better reasoning, stronger safety protocols, and smarter skills.

The complete study is available here.


The post Study shows AI agents struggle with CRM and confidentiality appeared first on MarTech.
