OPERATIONS · 2026-05-25

12 red flags when evaluating an AI agents services company

Specific signals from the sales process and pitch deck that consistently predict a bad managed AI agents services engagement. If you see three or more, walk.

This is the companion to our pillar guide on how to choose an AI agents services company. The guide listed ten red flags briefly; this article unpacks twelve with the underlying reasoning for each. The flags are ordered roughly by severity, most severe first.

1. Demo on their data, refuses paid pilot on yours

The single loudest signal. Any competent vendor can show a polished agent running on a curated dataset they built themselves, because the agent is tuned to those examples. The only meaningful evaluation is the agent run on a slice of your real data, with the failures visible. A 1–3 week paid pilot is the standard ask. Vendors who refuse, stall, or insist on a 12-week discovery phase before any working artefact are shifting onto you a risk that should be theirs.
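
One way to keep the failures visible during a pilot is to score the agent against a small labelled slice of your own data and keep every miss. The sketch below is illustrative only: `PilotCase`, `run_pilot`, and the stand-in `toy_agent` are hypothetical names, and it assumes the expected result can be checked as an exact string, which many real workflows cannot.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class PilotCase:
    """One labelled example drawn from your own data (hypothetical shape)."""
    case_id: str
    input_text: str
    expected: str


def run_pilot(agent: Callable[[str], str], cases: List[PilotCase]) -> Tuple[float, List[dict]]:
    """Run the agent on every case; return the pass rate and the full failure list."""
    failures = []
    for case in cases:
        try:
            actual = agent(case.input_text)
        except Exception as exc:  # crashes count as failures, not as skipped cases
            failures.append({"case_id": case.case_id, "error": repr(exc)})
            continue
        if actual.strip() != case.expected.strip():
            failures.append(
                {"case_id": case.case_id, "expected": case.expected, "actual": actual}
            )
    pass_rate = (len(cases) - len(failures)) / len(cases) if cases else 0.0
    return pass_rate, failures


# Stand-in agent for illustration; a real pilot would call the vendor's agent here.
def toy_agent(text: str) -> str:
    if not text:
        raise ValueError("empty input")
    return text.upper()


cases = [
    PilotCase("a", "refund", "REFUND"),
    PilotCase("b", "cancel", "Cancel"),
    PilotCase("c", "", "UNKNOWN"),
]
pass_rate, failures = run_pilot(toy_agent, cases)
```

The useful output is the failure list, not the headline rate: each entry is a concrete case you can put in front of the vendor and ask about.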

2. Cannot name a single specific failure mode of their agent

Every production agent has failure modes. Hallucinated entities. Edge cases on input format. Tool calls that time out. Rate-limit cascades when the upstream API has a slow day. Vendors who claim their agent does not have failure modes have not run it long enough to find out — or they have, and they are hiding what they learned. The honest answer is a specific list and a specific story about each.

3. "Fully autonomous, no human needed" pitch in 2026

For anything customer-facing or money-moving, this is either marketing fluff or a vendor who is about to ship a damaging action on your behalf. The 2026 default for production workflows is human-in-the-loop on the last mile. Vendors pitching pure autonomy across all workflows are either selling a future they have not built or volunteering to learn on your dime. See the autonomy framework in the pillar guide.
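
Human-in-the-loop on the last mile can be as simple as a gate in front of any risky action. The following is a minimal sketch, assuming a hypothetical `Action` record and an `approve` callback that stands in for the human reviewer; it is not taken from the pillar guide's autonomy framework.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Action:
    """A step the agent proposes to take (hypothetical shape)."""
    description: str
    customer_facing: bool = False
    moves_money: bool = False


def needs_human(action: Action) -> bool:
    """Customer-facing or money-moving actions go to a person first."""
    return action.customer_facing or action.moves_money


def execute(action: Action, approve: Callable[[Action], bool]) -> str:
    """Run low-risk actions directly; hold risky ones until a human approves."""
    if needs_human(action) and not approve(action):
        return "held"
    return "executed"


# Example reviewer that rejects everything, to show the gate holding.
def reject_all(action: Action) -> bool:
    return False


internal = execute(Action("tag ticket as billing"), reject_all)
refund = execute(Action("issue refund", moves_money=True), reject_all)
```

A question worth asking the vendor is where this gate sits in their system and who is on the other end of `approve`.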

4. Insists on owning the prompts, workflow, or evaluation suite you paid them to build

The 2026 equivalent of a marketing agency owning your ad account. The argument the vendor will make: "the IP is intermixed with our platform, we cannot cleanly separate it." That is technically true and operationally a choice. Boutique vendors with mature platforms can and do assign IP in the work product to clients. If your vendor will not, you are leasing your own process from them. Walk.

5. No named operator, "the team will handle it"

This usually means a junior operator, often shared across many accounts, who will rotate off in three months without notice to you. The whole point of a managed service is that a specific human is accountable for the workflow's outcome. If there is no name, there is no accountability. The vendor's answer should be a person with a calendar: not a pool, not a Slack channel, not "we have a strong team."

6. Pricing hidden until a sales call

If the model is reasonable, the vendor will publish a range. Hiding pricing almost always means one of three things: they price-discriminate based on how desperate you sound, the number is embarrassingly high for the deliverable, or they have no consistent pricing because every deal is bespoke (which is itself a red flag for an "ongoing service"). The discovery call exists to qualify fit and customise scope, not to discover whether you can afford the headline number.

7. Cannot answer EU data residency questions in plain language

For EU buyers, this is disqualifying. A vendor selling to EU clients in 2026 should answer "where is our data stored, where is it processed by the model, who are your sub-processors" in 60 seconds with named regions and named sub-processors. Hesitation here means either they have not thought about it, or they are about to wing it on your contract. Either is bad. See AI agents data residency in the EU.

8. White-label or subcontracts the actual work without disclosure

Some vendors front a sales motion that subcontracts the engineering or the operations to a third party — often an offshore shop you would not have hired directly. Ask in writing: "Is anyone outside your company touching our workflow, our data, or our infrastructure?" If they hedge, walk. Disclosed subcontracting is fine; undisclosed is fraud-adjacent.

9. Contract length over 12 months on a fixed term with no clear build deliverable

Pure lock-in. The vendor's incentive shifts from delivery to retention the moment the contract closes. The legitimate reasons for a long fixed term are: a meaningful one-time build that needs amortisation, or a regulated environment with a known long ramp. Outside those, a fixed lock of more than 12 months is the vendor protecting margin against a relationship they cannot win on merit.

10. Sales process led by a closer who cannot answer technical questions

If the account executive defers every specific question to "the engineering team" or "the AI lead" you have not met, you are about to sign a contract with people you do not know. Ask for the operator and the technical lead on the first or second call. Vendors who will not put them in front of you in pre-sales will not put them in front of you in delivery either.

11. Guarantees specific outcomes before seeing your data

"We guarantee 30% reduction in support handling time." "We guarantee 50 qualified leads per month." A specific guarantee made before the vendor has seen your data is either dishonest or based on a definition of "guarantee" that will not survive the first slow month. Real performance commitments are calibrated to a baseline established during the pilot — not pulled from a slide deck.

12. References that all date from the last 6 months

The vendor has been running this service for two years but every reference they offer started 4–6 months ago. Two possibilities: the older clients have churned and the vendor is hiding it, or the older clients are unhappy and unwilling to take the call. Ask explicitly for a reference who has been on the service for 12+ months and has renewed at least once. The pause before the answer tells you what you need.

The yellow flags worth probing

Beyond the twelve hard reds, a handful of yellow flags should trigger follow-up questions rather than an immediate walk.

The vendor has changed their name or rebranded in the last 12 months. Sometimes a normal evolution. Sometimes a way to escape a bad reputation. Ask the founder directly and check the LinkedIn history.

Heavy reliance on a single LLM provider with no migration plan. If they cannot tell you how they would handle Anthropic or OpenAI changing pricing or deprecating a model overnight, you are inheriting the concentration risk.

The case studies are anonymous. Sometimes legitimate (NDA, regulated client). Often a sign that no actual client agreed to be named. Ask for one named reference even if the case study itself is anonymised.

The "platform" looks like a Notion page plus a Slack channel. Could mean a young vendor with a real future or a thin shop without a real platform. Probe what the operator portal actually contains.

Pricing is "starting at €X" with no upper bound discussed. The starting price is rarely the real price. Push for the typical landing range for a workflow at your scale.

How to use this list

You do not need a clean sweep. One red flag is a yellow flag worth probing. Two means a serious conversation about whether to continue. Three or more, walk — the vendor has structural issues that will be your problem at month six. The relationship will not improve once you sign.
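
The rule of thumb above reduces to a small lookup. This is a sketch with a hypothetical function name, `recommend`; the thresholds for one, two, and three or more flags come from the paragraph above, while the "proceed" result for zero flags is an assumption added for completeness.

```python
def recommend(red_flags: int) -> str:
    """Map a count of observed red flags to the suggested response."""
    if red_flags < 0:
        raise ValueError("red_flags cannot be negative")
    if red_flags == 0:
        return "proceed"  # assumption: the article does not name this case
    if red_flags == 1:
        return "probe"
    if red_flags == 2:
        return "serious conversation"
    return "walk"
```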

The complementary list is the positive checklist in the pillar guide. A vendor who passes 11 of 13 checks on the positive list and exhibits zero of these red flags is rare and worth keeping in the running. A vendor who passes the positive list but exhibits two of these flags is the most dangerous category — they look good in the sales process and they will deteriorate predictably.

For the questions to ask that will surface these flags, see 20 questions to ask before hiring an AI agents services company. For when not to even start the process, see When NOT to hire an AI agents services company.

Where Logitelia fits

Logitelia publishes pricing ranges, runs paid pilots on real client data, assigns IP in work product to clients, names the senior operator before contract, hosts in the EU, and has zero undisclosed subcontracting. If you want to test the fit against any of the flags above, book an intro call and we will answer specifically — including the failure modes our agents do have.

For the full framework, read the pillar How to choose an AI agents services company in 2026.

Evaluating an AI agents services vendor and want a second opinion on the red flags? We will spend 30 minutes on a no-strings sanity check.
