How do you audit an AI vendor for security?
Request the vendor's SOC 2 Type II report, ask whether they'll sign a BAA if you're in a regulated industry, and get written confirmation of their data retention and training-use policies. If they can't produce those three things, stop the conversation there.
Why most AI vendor audits fail before they start
Most SMBs pick AI vendors the same way they pick SaaS tools: they read a marketing page, see logos of recognizable companies, and assume security is handled. It usually isn't, at least not in ways that protect your data specifically.
The problem is that AI vendors have introduced new risk categories that standard IT vendor reviews weren't designed to catch. A vendor can be SOC 2 certified and still use your inputs to retrain their models. A vendor can sign a BAA and still route your data through subprocessors who haven't. The audit checklist for AI vendors needs to be more specific than the one you use for, say, a CRM.
What to actually check when auditing an AI vendor
Start with documentation you can verify independently. Ask for the SOC 2 Type II report itself, not just a claim of compliance. SOC 2 Type II covers controls over a period of time (typically 6-12 months) and is audited by a third party. SOC 2 Type I covers only a single point in time and is far easier to pass with well-rehearsed process theater. If the vendor only has Type I, note that as a gap.
Next, get their data handling policy in writing, not the marketing summary. You need answers to four specific questions: Does your data get used to train or fine-tune their models? Who are their subprocessors, and are those subprocessors also under compliant agreements? Where is data stored geographically, and does that matter for your regulatory context? How long is data retained, and can you request deletion? If the vendor's privacy policy is vague on any of these, that vagueness is the answer.
For HIPAA-regulated work, BAA signing is non-negotiable. OpenAI doesn't sign BAAs for standard ChatGPT accounts. Anthropic doesn't sign BAAs for Claude.ai consumer accounts. Google offers BAA coverage for Gemini in Workspace under specific enterprise tiers only. Microsoft Copilot can operate under a BAA in certain M365 configurations, but you need to verify which services are actually in scope. If PHI will touch the system at any point, get the BAA signed before you test anything, not after.
Finally, ask about the model architecture. Vendors building wrappers on public model APIs (OpenAI, Anthropic, Google's Gemini) inherit those platforms' data policies unless they've negotiated enterprise agreements that explicitly exclude training use. Vendors running private or self-hosted models, such as Llama 3.1 deployed in your own cloud environment, give you direct control over data flow because nothing leaves your infrastructure.
When the standard checklist isn't enough
If you're in healthcare, finance, or any sector with specific regulatory obligations, layer in regulation-specific questions on top of the base audit. For HIPAA, that means verifying breach notification timelines in the BAA. For GDPR, that means confirming the vendor has signed Standard Contractual Clauses if data crosses EU borders. For financial services, it means asking whether the vendor's model outputs can be explained and audited under fair lending or anti-discrimination requirements.
Multi-agent systems add another layer. If the vendor is building a workflow where one AI agent calls another, you need to audit every model and integration in the chain, not just the front-facing one. Twilio for voice, third-party vector databases, retrieval systems that pull from external sources: each one is a potential data exposure point.
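One way to keep a multi-agent audit honest is to model the vendor's stack as a flat list of components and flag every link that touches customer data but hasn't been reviewed. A minimal sketch, with hypothetical component names standing in for a real integration inventory:

```python
from dataclasses import dataclass

# Illustrative only: each entry is one link in the vendor's chain.
# The names below are example placeholders, not a real stack.

@dataclass
class Component:
    name: str
    handles_customer_data: bool  # does customer data ever pass through it?
    audited: bool                # has it gone through the same checklist?

stack = [
    Component("front-facing agent", handles_customer_data=True, audited=True),
    Component("voice gateway", handles_customer_data=True, audited=False),
    Component("vector database", handles_customer_data=True, audited=False),
    Component("external retrieval source", handles_customer_data=False, audited=False),
]

# Every unaudited component that touches customer data is a
# potential exposure point, regardless of how deep in the chain it sits.
exposure_points = [c.name for c in stack
                   if c.handles_customer_data and not c.audited]
print(exposure_points)  # ['voice gateway', 'vector database']
```

The retrieval source drops out of the list only because, in this example, no customer data flows through it; the front-facing agent drops out because it was already audited. Everything else stays on the to-do list until it has been through the same checklist as the front door.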
How we handle vendor audits at Usmart
We build private LLM deployments, which means we don't route client data through public APIs as a default. For healthcare and finance clients, we sign BAAs before any scoping work touches real data, and we document the full model and integration stack so clients have a clear audit trail. We treat the audit checklist as a deliverable, not an afterthought.
When a client comes to us after a bad vendor experience, the gap is almost always in the data-use-for-training clause or in subprocessor coverage. Both are fixable, but they're much easier to catch before you've integrated than after. If you're evaluating vendors right now and want a second opinion on what you've been handed, that's a conversation we're happy to have.
Ready to see it working for your business?
Book a free 30-minute strategy call. We will scope your use case and give you honest numbers on timeline, cost, and ROI.