Evaluating 'Incognito' Claims in AI Assistants: A Technical Vendor Audit Checklist

Daniel Mercer
2026-05-30

A technical vendor audit checklist to test AI 'incognito' privacy claims, retention, encryption, logging, and fine-tuning behaviors.

When an AI vendor says a chat is “incognito,” “private,” or “not used for training,” that statement may be meaningful—or it may be marketing shorthand that disappears under legal scrutiny. The current wave of complaints around Perplexity’s incognito chats is a useful reminder that privacy claims in conversational AI must be tested the same way you would test any other security control: with evidence, not trust. If you are responsible for procurement, compliance, or architecture, the right question is not whether the vendor says it is private, but whether its vendor audit package can prove how data is collected, retained, processed, and deleted. For teams building secure workflows, the stakes are the same as in compliant hosting architectures: vague assurances do not pass a control review.

This guide gives you a practical, technical checklist for evaluating AI privacy, data retention, encryption, logging, and model fine-tuning pipelines. It is designed for IT admins, security engineers, privacy counsel, and product teams that need to decide whether an assistant belongs in regulated workflows. You will also see how to pressure-test “incognito mode” claims with a structured evidence request, similar to how teams validate support systems or other enterprise platforms that touch sensitive data. The goal is simple: reduce blind trust, increase verifiability, and make your vendor decisions defensible.

1) Why “Incognito” in AI Is a Risky Word

1.1 Consumer UX language vs. compliance-grade privacy

“Incognito” is a UX label, not a legal control. In browsers, it usually means local history is not retained on the device; it does not necessarily mean the network, CDN, or service operator never sees the data. In AI, the term is even more ambiguous because prompts can be used for safety monitoring, abuse detection, quality improvement, and model training. If the vendor cannot define “incognito” in operational terms, assume it is a product experience, not a compliance guarantee.

This is where teams often over-interpret the interface and under-review the backend. A privacy badge on a chat window does not answer critical questions like whether prompts are stored in application logs, whether attachments are indexed, or whether human reviewers can access transcripts during incident investigation. If you are already familiar with how procurement teams assess product claims in areas like new or unproven storefronts, the mindset is the same: look for red flags in the wording, the exclusions, and the missing specifics.

1.2 Why AI assistants are unlike normal SaaS apps

Traditional SaaS systems usually have a bounded set of data flows: user input, storage, API calls, logs, maybe analytics. Conversational AI adds hidden layers: prompt preprocessing, safety filters, retrieval augmentation, model routing, fine-tuning queues, telemetry, and vendor-managed context retention. Each layer can create a separate copy of the same content, which means “deleted” may only mean removed from the chat UI. This complexity is why governance teams need a much deeper review than they would for standard webmail access.

In practice, the most sensitive data often enters through the front door and then fans out into multiple subsystems. A single prompt can become a log entry, a moderation record, a support artifact, a usage metric, and a training sample candidate. If your vendor cannot explain each copy and its retention period, the risk surface is larger than most business users realize.

1.3 The business cost of getting this wrong

Misjudging “incognito” claims can create regulatory, contractual, and reputational fallout. A prompt containing customer PII, protected health information, source code, legal strategy, or incident details can trigger disclosure obligations if it is retained longer than expected or accessed by unauthorized personnel. Even if no breach occurs, you may still violate internal policy or customer commitments by placing regulated content into an AI system that is not approved for that class of data. In regulated environments, the penalty is often not the incident itself but the inability to prove control.

That is why the audit should be evidence-driven and repeatable. Treat the AI vendor like any other managed service in a compliance stack, whether you are evaluating multi-cloud compliance patterns, a security-conscious data workflow, or a vendor with self-serve privacy claims. The standard is not “sounds private”; it is “can we document and verify the control?”

2) Build the Vendor Audit Scope Before You Ask Questions

2.1 Define what data classes are in scope

Before you ask a vendor about retention or encryption, define exactly what content the assistant may receive. Separate prompts into categories: public data, internal business data, confidential documents, customer records, source code, secrets, regulated data, and incident-response artifacts. The same tool may be acceptable for public brainstorming but prohibited for SOC 2 evidence, legal memos, or clinical notes. If your policy is unclear, the vendor audit will produce nice answers but no actionable decision.

A practical way to start is to map use cases to data classes and control requirements, then align those to the vendor’s processing model. For example, a team using AI for support triage may allow redacted ticket summaries, but not raw customer attachments. For technical teams, a policy matrix like the one used in automation workflows can help translate operational intent into enforceable rules.
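The mapping from use cases to data classes can be sketched as a small deny-by-default policy matrix. The use-case names, the `POLICY_MATRIX` contents, and the `is_permitted` helper are illustrative assumptions, not a standard or any vendor's API; the data classes follow the categories listed in 2.1.

```python
from enum import Enum


class DataClass(Enum):
    PUBLIC = "public"
    INTERNAL = "internal"
    CONFIDENTIAL = "confidential"
    CUSTOMER_RECORDS = "customer_records"
    SOURCE_CODE = "source_code"
    SECRETS = "secrets"
    REGULATED = "regulated"
    INCIDENT_ARTIFACTS = "incident_artifacts"


# Hypothetical policy matrix: use case -> data classes approved for it.
# Anything not listed is denied by default.
POLICY_MATRIX = {
    "public_brainstorming": {DataClass.PUBLIC},
    # Assumption: redacted ticket summaries are classified as INTERNAL,
    # while raw customer attachments are CUSTOMER_RECORDS.
    "support_triage": {DataClass.PUBLIC, DataClass.INTERNAL},
}


def is_permitted(use_case: str, data_class: DataClass) -> bool:
    """Return True only if the use case explicitly allows the data class."""
    return data_class in POLICY_MATRIX.get(use_case, set())
```

With this sketch, `is_permitted("support_triage", DataClass.CUSTOMER_RECORDS)` returns `False`, and so does any lookup for a use case that was never reviewed.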

2.2 Identify the systems that touch the data

You need a data-flow inventory, not just a product brochure. Identify the browser client, mobile app, API, gateway, moderation layer, prompt store, analytics layer, search layer, and model provider chain. Ask where the vendor terminates TLS, where it decrypts data, where it stores session state, and which subprocessors can access each artifact. The point is to understand whether “incognito” applies to the end-to-end path or only the visible conversation history.

This is especially important when the vendor uses multiple model backends or routing logic. The same query may touch a first-party model, a third-party inference provider, and a separate observability platform. Procurement teams that have dealt with complex infrastructure decisions, such as vertical integration tradeoffs, will recognize the risk: every added layer creates another trust boundary.
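A data-flow inventory can be kept as structured records rather than a diagram alone. The sketch below assumes a hypothetical `Hop` record per component and a `plaintext_exposure` helper; the component and subprocessor names in the usage note are placeholders, not real products.

```python
from dataclasses import dataclass, field


@dataclass
class Hop:
    """One component on the path a prompt travels."""
    name: str
    sees_plaintext: bool  # does this component handle decrypted content?
    subprocessors: list[str] = field(default_factory=list)


def plaintext_exposure(path: list[Hop]) -> list[str]:
    """Names of components (and their subprocessors) that can read prompt content."""
    exposed = []
    for hop in path:
        if hop.sees_plaintext:
            exposed.append(hop.name)
            # A subprocessor behind a plaintext hop is its own trust boundary.
            exposed.extend(f"{hop.name}/{sp}" for sp in hop.subprocessors)
    return exposed
```

For example, a path of `Hop("api_gateway", True)`, `Hop("moderation_layer", True, ["ExampleModerationCo"])`, and `Hop("analytics_layer", False)` yields three exposure points: the gateway, the moderation layer, and the moderation subprocessor. Each of those is a place where "incognito" needs a written answer.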

2.3 Set approval criteria before the demo

Do not let the demo drive the policy. Establish a clear approval rubric that includes encryption, deletion timelines, logging scope, training usage, admin controls, and evidence requirements. If the vendor cannot meet a requirement, document whether there is a compensating control, such as strict data redaction, enterprise tenant isolation, or API-only usage with disabled retention. Without this prework, teams end up approving a product because the UX felt safe rather than because the risk was managed.

Pro tip: If the vendor’s answer uses phrases like “generally,” “typically,” “may be retained,” or “used to improve services,” force a follow-up that asks for the exact system, exact retention period, and exact customer setting that changes behavior.
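If you collect vendor answers in writing, a simple scan can flag the hedge phrases named in the tip so that each one gets a follow-up question. This is a minimal sketch; the `find_hedges` helper is a hypothetical name, and the phrase list is only the four examples from the tip.

```python
import re

# Hedge phrases taken from the tip above; extend the list as needed.
HEDGE_PHRASES = [
    "generally",
    "typically",
    "may be retained",
    "used to improve services",
]


def find_hedges(answer: str) -> list[str]:
    """Return the hedge phrases that appear in a vendor answer (case-insensitive)."""
    found = []
    for phrase in HEDGE_PHRASES:
        # Word boundaries avoid matching inside longer words.
        if re.search(rf"\b{re.escape(phrase)}\b", answer, flags=re.IGNORECASE):
            found.append(phrase)
    return found
```

A match does not mean the answer is wrong. It only marks where to ask for the exact system, retention period, and customer setting.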

3) Data Retention: The First Question You Should Always Ask

3.1 Retention in the UI is not the same as retention in the backend

One of the most common failures in privacy reviews is equating chat deletion with full data deletion. A deleted conversation may still exist in logs, caches, moderation queues, backup snapshots, and legal hold systems. In some architectures, deletion only removes the user-facing thread while preserving back-end copies for a fixed retention period. The vendor must specify which artifacts are deleted, when deletion occurs, and whether backups are included in the deletion process.

The most useful audit question is: “After a user deletes an incognito chat, what exact data objects remain, where are they stored, and for how long?” If the response does not name the object types—prompt text, attachments, metadata, embeddings, logs, traces—treat the answer as incomplete. Teams that already know how to analyze lifecycle controls in other domains, like post-merger data integration, understand the importance of defining every stage of the lifecycle.

3.2 Ask for the retention matrix, not a policy summary

Vendors often publish a privacy policy that is too broad for operational use. You want a retention matrix with columns for data type, purpose, default retention, customer-configurable retention, deletion SLA, backup retention, and legal hold exceptions. A serious enterprise vendor should be able to answer with precision, not slogans. If they cannot, they may not have a mature records-management process.

Use the matrix to test edge cases: free-tier user, enterprise tenant, API user, support ticket, abuse-report investigation, and security incident. Different answers are acceptable, but different answers must be explicit. This is similar to the way a technical team evaluates whether a platform can support diverse workloads without data sprawl, as in hybrid compliance architectures.
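The retention matrix can be captured as one record per data type, which makes unanswered columns easy to spot. The field names below mirror the columns listed in this section; the `RetentionRow` type and `missing_columns` helper are assumptions for illustration.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class RetentionRow:
    """One row of a vendor retention matrix; None means the vendor left it blank."""
    data_type: str
    purpose: Optional[str] = None
    default_retention_days: Optional[int] = None
    customer_configurable: Optional[bool] = None
    deletion_sla_days: Optional[int] = None
    backup_retention_days: Optional[int] = None
    legal_hold_exceptions: Optional[str] = None


def missing_columns(row: RetentionRow) -> list[str]:
    """List the columns the vendor did not answer for this data type."""
    return [f.name for f in fields(row) if getattr(row, f.name) is None]
```

Fill in one row per data type for each edge case (free tier, enterprise tenant, API user, and so on), then send the vendor the list of missing columns instead of a general request for "more detail."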

The hardest part of retention is not active storage; it is the residue left behind in backups and forensic logs. Ask how often backups age out, whether deleted records are tombstoned or physically erased, and whether any data persists after customer termination. Also ask whether legal hold can override deletion requests and how users are notified when that occurs. For regulated teams, these details determine whether the vendor is usable at all.

If the vendor claims “incognito means not stored,” make them define “stored.” Does that exclude transient memory, queue messages, encrypted blobs, or crash dumps? In privacy reviews, ambiguity almost always benefits the vendor and disadvantages the buyer. That is why a clear access and retention checklist should be mandatory before rollout.

4) Encryption, Key Management, and the Real Meaning of “Protected”

4.1 Encryption in transit: necessary, not sufficient

Every AI vendor should use modern TLS for data in transit, ideally with current best practices and strong certificate management. But TLS only protects traffic between endpoints; it does not tell you what happens after termination. Ask whether TLS terminates at a front proxy, an API gateway, or a regional edge, and whether any internal service-to-service traffic is also encrypted. For highly sensitive workloads, you want clear evidence that internal east-west traffic does not traverse in the clear.

To validate this, request the vendor’s network architecture overview and ask which components can inspect prompt content. Security teams that have performed deep infrastructure procurement reviews, such as admin hardware selection or platform hardening, know that security claims often stop at the marketing boundary and do not cover internal hops.

4.2 Encryption at rest: define the storage object

“Encrypted at rest” can mean a lot of things. It may refer to disk-level encryption on a storage cluster, application-level field encryption, or envelope encryption per tenant. For AI assistants, the relevant question is not simply whether the storage is encrypted, but whether the sensitive objects are separately protected from broader platform operators. If prompt histories, embeddings, logs, and attachments are all stored differently, each should have its own control description.

Ask whether the vendor supports customer-managed keys, hardware security modules, key rotation, and cryptographic erasure. If the vendor cannot explain how keys are segregated per tenant or per region, assume the encryption is protective against disk theft but not against overbroad platform access. For teams that care about compliance-grade architectures, this is as important as the controls discussed in compliant EHR hosting.
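One way to apply the "each object has its own control description" rule is to record what the vendor states for each storage object and flag the gaps. The object list comes from this section; the `EncryptionControl` fields and the `encryption_gaps` rule (disk-level only, with no per-tenant keys, counts as weak) are assumptions you should adapt to your own standard.

```python
from dataclasses import dataclass

# Storage objects named in the text; each should have its own control description.
STORAGE_OBJECTS = ["prompt_history", "embeddings", "logs", "attachments"]


@dataclass
class EncryptionControl:
    layer: str                   # e.g. "disk", "application", "envelope"
    per_tenant_keys: bool        # are keys segregated per tenant?
    customer_managed_keys: bool  # can the customer hold or revoke the keys?


def encryption_gaps(controls: dict[str, EncryptionControl]) -> dict[str, str]:
    """Return a reason for each storage object whose description is missing or weak."""
    gaps = {}
    for obj in STORAGE_OBJECTS:
        control = controls.get(obj)
        if control is None:
            gaps[obj] = "no control description provided"
        elif control.layer == "disk" and not control.per_tenant_keys:
            gaps[obj] = "disk-level only, no per-tenant key segregation"
    return gaps
```

The output is a list of follow-up questions, not a verdict: an object with no description may be well protected, but the vendor has not yet shown it.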

4.3 Zero-knowledge vs. vendor-managed decryption

The phrase “zero-knowledge” should be reserved for systems where the provider cannot decrypt customer content in usable form. Many AI vendors cannot legitimately make that claim because the service itself must read prompts to generate outputs. In those cases, the more accurate standard is “customer-controlled encryption” or “restricted operator access.” Do not let the vendor blur these distinctions.

A strong audit asks whether the service ever processes plaintext in memory, whether privileged support staff can access decrypted content, and whether there are any admin override paths. If the assistant is marketed alongside privacy-first services like secure support automation, the same logic applies: operational visibility must be balanced against the minimum-access principle.

5) Logging, Telemetry, and the Hidden Copy Problem

5.1 Log every event, or log nothing sensitive?

Logging is where many “incognito” promises quietly fail. Vendors often log prompt metadata, full text, error payloads, moderation verdicts, and trace identifiers for debugging and abuse prevention. Even if the UI claims a private session, the observability layer may still keep detailed records long enough for internal teams to reconstruct user interactions. Ask whether logs are structured, whether they contain payloads, and whether redaction is automatic or best-effort.

You want to know the boundary between operational telemetry and content retention. If the vendor says logs are “minimized,” request a sample log schema with sensitive fields masked, retention durations by log class, and role-based access details. This is a familiar exercise for teams that have built transparent system reporting, similar to traceability dashboards where each event must be explainable.
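Once you have a sample log schema, a first-pass review can be automated. The sketch below assumes the schema arrives as a mapping of field name to a small spec dictionary with an optional `redacted` flag; that shape, the `CONTENT_HINTS` list, and the `content_bearing_fields` helper are all illustrative assumptions, since vendors document schemas differently.

```python
# Field-name fragments that suggest a log field carries conversation content.
# This list is an assumption for illustration; tune it to the vendor's schema.
CONTENT_HINTS = ("prompt", "completion", "message", "attachment", "payload", "transcript")


def content_bearing_fields(schema: dict[str, dict]) -> list[str]:
    """Return schema fields that look content-bearing and are not marked redacted."""
    flagged = []
    for name, spec in schema.items():
        looks_like_content = any(hint in name.lower() for hint in CONTENT_HINTS)
        if looks_like_content and not spec.get("redacted", False):
            flagged.append(name)
    return flagged
```

Name matching is only a heuristic. A field called `extra` or `context` can carry full prompts, so use the flagged list to start the conversation and still read the whole schema.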

5.2 Support access and human review paths

Many incidents happen not because the system was hacked, but because support or safety staff had broad access to conversations. Ask whether humans can access user prompts, under what conditions, and whether access is authenticated, approved, and recorded. You should also ask whether transcripts are used for quality review, abuse investigations, or escalated support tickets. If the vendor cannot provide audit logs for human access, your compliance team should treat that as a serious gap.

For regulated organizations, this is especially important if employees may enter patient data, employee HR records, or privileged legal material. If human review is part of the process, you need a documented business justification and clear access governance, much like the discipline applied in meeting transformation programs where visibility must not become overexposure.

5.3 Telemetry, analytics, and product improvement signals

Some vendors use analytics events to improve product performance and reliability. That can be acceptable if the data is truly de-identified and cannot be re-associated with the user or conversation. But many systems still emit rich event streams that include prompt fragments, completion snippets, or URLs to associated artifacts. Request the telemetry event catalog and inspect whether any content-bearing fields are transmitted.

When a vendor says “we only collect usage data,” ask for examples. Usage data can mean button clicks, session duration, and feature flags—or it can mean complete prompt-response pairs. The difference is enormous, and the burden is on the vendor to prove the lighter version. Treat this part of the audit like evaluating the data fidelity in a clean-data pipeline: bad inputs produce misleading outputs.

6) Fine-Tuning, Training Use, and Model Improvement Pipelines

6.1 Opt-out is not enough if the defaults are opaque

One of the most consequential questions in AI privacy is whether user content can be used for model improvement. Some vendors let customers opt out of training, but the default may still be “on” for consumer accounts or certain workspace tiers. Others may exclude training but still use content for safety model tuning, moderation classifier improvement, or reinforcement learning workflows. The audit should separate direct model training, fine-tuning, evaluation datasets, and safety analysis.

Ask for the data-flow diagram from ingestion to training. Specifically, what content is sampled, who approves sampling, how long it stays in the training queue, and whether users can request deletion from training corpora. If the vendor cannot answer in operational language, then the phrase “not used for training” may be incomplete rather than false. Teams doing product and AI due diligence should use the same rigor they would bring to AI project scoping.

6.2 Fine-tuning pipelines create a special risk

Fine-tuning often requires curated datasets that may contain real user interactions. That raises a difficult question: are prompts anonymized, tokenized, truncated, or transformed before they enter the pipeline? Ask whether the vendor uses customer content to fine-tune shared models, tenant-specific models, or only narrow safety classifiers. Also ask whether the fine-tuned artifact can memorize sensitive strings and whether the vendor has a process for mitigating model inversion or prompt extraction risks.

If your organization handles secrets or regulated data, the safest assumption is that any content sent to a shared fine-tuning pipeline is high risk unless the vendor proves otherwise. This is the AI equivalent of avoiding unnecessary exposure in any system that relies on careful data handling, from complex technical platforms to mission-critical enterprise software. The smaller the necessary data footprint, the better.

6.3 Training exclusions, retention exclusions, and deletion rights

Do not confuse “we do not train on your data” with “we delete your data promptly.” Those are separate promises. A vendor may retain prompt content for abuse monitoring while excluding it from training, or retain it in pseudonymized form for analytics even if it is excluded from model updates. Your contract should distinguish among training usage, retention duration, and deletion semantics. If a vendor cannot commit to all three, then the privacy story is incomplete.

For teams evaluating whether AI belongs in operational workflows, the same lesson appears in automation governance: the system can be useful and still be disqualified by data handling. Productivity does not override policy.

7) The Vendor Audit Checklist You Can Use Today

7.1 Core evidence request list

Below is the evidence package to request from any conversational AI vendor before approving “incognito” usage. Ask for policy documents, architecture diagrams, retention schedules, subprocessors, DPA terms, security whitepapers, and access-control descriptions. Where possible, request screenshots of customer controls, not just prose. You are trying to verify real product behavior, not passively collect assurances.

| Audit Area | What to Verify | Evidence to Request | Pass/Fail Signal |
| --- | --- | --- | --- |
| Retention | What data persists after chat deletion | Retention matrix, deletion SLA, backup policy | Clear object-level retention and deletion terms |
| Encryption in transit | TLS scope and termination points | Network architecture, TLS configuration summary | Strong TLS everywhere user content travels |
| Encryption at rest | Storage encryption and key management | KMS/HSM overview, key rotation policy | Cryptographic controls documented per data class |
| Logging | Whether prompts or excerpts appear in logs | Log schema, redaction policy, log retention | No content-bearing logs without explicit justification |
| Training/Fine-tuning | Whether customer content enters model pipelines | Training governance doc, opt-out details | Explicit exclusion or tightly controlled consent model |
| Human access | Support or safety review access | RBAC model, access logs, reviewer policy | Least-privilege access with recorded approvals |
| Deletion rights | Erasure from active systems and backups | DSR process, deletion workflow, confirmation SLA | Defined deletion path including downstream copies |
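To keep the review repeatable, the checklist can be tracked as data. This is a sketch under the assumption that approval requires evidence and a pass in every audit area from the table; the `AuditItem` type and `audit_outcome` helper are hypothetical, and your own rubric may allow compensating controls instead.

```python
from dataclasses import dataclass


@dataclass
class AuditItem:
    area: str
    evidence_received: bool
    passed: bool


# Audit areas from the checklist table above.
AREAS = [
    "Retention",
    "Encryption in transit",
    "Encryption at rest",
    "Logging",
    "Training/Fine-tuning",
    "Human access",
    "Deletion rights",
]


def audit_outcome(items: list[AuditItem]) -> str:
    """Approve only when every area has evidence and passes; otherwise name the gaps."""
    by_area = {item.area: item for item in items}
    gaps = [
        area
        for area in AREAS
        if area not in by_area
        or not by_area[area].evidence_received
        or not by_area[area].passed
    ]
    return "approve" if not gaps else "escalate: " + ", ".join(gaps)
```

An area that was never reviewed is treated the same as a failed one, which matches the principle that missing evidence is not a pass.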

7.2 Questions that force specificity

Use direct, non-leading questions so the vendor cannot answer with generic marketing text. Ask: “Are prompts stored in application logs?” “What fields are redacted by default?” “Can support personnel access incognito chats?” “Is customer content excluded from all training and fine-tuning pipelines?” “What is the maximum retention period for deleted chats, backups included?” These questions are hard to dodge because they require operational detail.

You can also ask for a data path walkthrough: from prompt submission to response generation to logging to deletion. Strong vendors can explain this clearly, just as strong technical teams can explain hardware and lifecycle decisions in admin procurement reviews. If the explanation falls apart when you ask about object-level handling, that is useful signal.

7.3 Red flags that should trigger escalation

Escalate if the vendor refuses to answer where user data is stored, cannot identify subprocessors, uses “incognito” as a blanket privacy guarantee, or says deletion is “best effort.” Also escalate if logs are not documented, if training policy varies by plan without clear contract language, or if human review is described as “rare” but not governed. Any of these can become a compliance problem once the tool is widely adopted.

For practical procurement discipline, watch for the same warning signs used in high-risk purchasing reviews: vague ownership, unclear warranty terms, and inconsistent promises. The domain is different, but the discipline is the same.

8.1 Match risk tier to use case tier

Not every AI use case deserves the same controls. A public-marketing brainstorming assistant may only need basic protections, while a developer assistant handling proprietary code may need stricter retention limits and no-training guarantees. An HR or legal use case should almost always require a higher bar, including explicit approval, logging review, and contractual restrictions. The vendor may be capable, but the risk tier should drive the permitted workflow.

When teams fail to do this, they either over-restrict low-risk use cases or under-protect high-risk ones. A tiered policy avoids both failures by matching data sensitivity to control depth. This logic mirrors how organizations allocate resources in other technical decisions, such as technical platform selection or enterprise integration planning.
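A tiered policy can be expressed as two lookups: use case to tier, and tier to required controls. The tier names, control labels, and `missing_controls` helper below are illustrative assumptions that loosely follow the three examples in this section.

```python
# Hypothetical tiers and required controls, loosely following the examples above.
REQUIRED_CONTROLS = {
    "low": {"tls_in_transit"},
    "medium": {"tls_in_transit", "retention_limit", "no_training"},
    "high": {
        "tls_in_transit",
        "retention_limit",
        "no_training",
        "explicit_approval",
        "logging_review",
        "contractual_restrictions",
    },
}

USE_CASE_TIER = {
    "marketing_brainstorming": "low",
    "developer_assistant": "medium",
    "hr_or_legal": "high",
}


def missing_controls(use_case: str, vendor_controls: set[str]) -> set[str]:
    """Return controls the use case requires but the vendor has not evidenced."""
    # Unknown use cases default to the strictest tier.
    tier = USE_CASE_TIER.get(use_case, "high")
    return REQUIRED_CONTROLS[tier] - vendor_controls
```

Defaulting unlisted use cases to the highest tier is a design choice: it forces a review before a new workflow inherits a lenient rule.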

8.2 Require a contractual privacy schedule

Do not rely only on a web privacy policy that can change unilaterally. Your agreement should include a privacy schedule or DPA annex that specifies what data is processed, for what purposes, for how long, with what encryption, with what logging constraints, and under what training restrictions. If the vendor serves enterprise customers, it should be prepared for this conversation. If it is not, the service may be too immature for regulated use.

Also consider contractual language on breach notification, subprocessor change notice, and audit cooperation. A good contract turns a verbal claim into an enforceable obligation. That is the difference between a consumer app and an enterprise service.

8.3 Make privacy a recurring review, not a one-time sign-off

Privacy posture changes. Vendors launch new features, switch model providers, adjust logging, and revise defaults. Your approval should therefore include a review cadence, especially after major product updates or policy changes. Re-run the audit when the vendor adds voice, file upload, agentic workflows, or third-party connectors, because those features often expand the data footprint dramatically.

This is the same operational discipline used in mature platform governance: controls are not static. Teams that keep a living checklist, like those using traceability dashboards or recurring compliance reviews, are far better positioned to spot drift before it becomes an incident.

9) How to Write a Defensible Internal Recommendation

9.1 Summarize the control environment, not the marketing claim

Your internal memo should not say “Vendor X has incognito mode.” It should say what the mode actually does, what data persists, what the retention period is, whether content is excluded from training, and whether support staff can access transcripts. Include exceptions and unresolved questions. If there are gaps, call them out plainly. That honesty makes the approval process more credible, not less.

Use plain language for executives, but keep the technical detail for the control owners. A strong recommendation documents the evidence reviewed and the residual risk accepted. That style of writing is familiar in any serious governance workflow, including the kind of analysis found in AI product diligence.

9.2 Include a remediation path if the vendor is close but not ready

Sometimes a vendor is useful but not yet suitable for sensitive content. In those cases, recommend conditional approval with guardrails: no PII, no regulated data, no secrets, prompt redaction, API-only access, disabled training, or separate tenant configuration. A good recommendation does not force a binary yes/no when a safer middle ground exists. It names the exact steps required to move from “possible” to “approved.”

If the vendor wants enterprise trust, it should welcome this process. A mature privacy program should be able to answer the same kinds of questions asked of regulated hosting environments, because the burden of proof is similar.

9.3 What to do if the answers stay vague

If the vendor remains evasive after repeated requests, treat that as a decision, not an inconvenience. Vague answers usually indicate one of three things: immature controls, unwillingness to disclose, or a product design that cannot support your needs. In all three cases, you should limit adoption or reject the tool for sensitive workflows. The cost of waiting is often lower than the cost of a privacy incident.

Pro tip: In vendor audits, silence is data. If a vendor won’t provide retention, logging, and training details in writing, assume the answer is not favorable until proven otherwise.

10) Final Takeaway: “Incognito” Should Mean Verifiable, Not Vague

Evaluating AI privacy claims is no longer optional. As conversational systems move deeper into support, development, operations, and knowledge work, the difference between an actual control and a marketing label becomes material. A credible vendor audit must test data retention, encryption at rest and in transit, logging behavior, human access, and training/fine-tuning pipelines. If the vendor can answer those questions precisely, you may have something worth approving. If not, the term “incognito” is doing more work than the product design.

For teams building a formal policy, pair this checklist with your broader governance program and review it alongside other platform decisions such as hardware procurement standards, support tooling controls, and compliance architecture patterns. The same principle applies across all of them: trust the evidence, not the label.

FAQ: Evaluating Incognito Claims in AI Assistants

1. Does “incognito” mean the vendor does not store my prompts?

Not necessarily. In many systems, “incognito” only means chats are hidden from the user’s normal history or excluded from one specific retention bucket. Prompts may still be stored in logs, moderation systems, backups, or security tools. Always ask for object-level retention details in writing.

2. Is “not used for training” the same as “not retained”?

No. A vendor can exclude content from training while still retaining it for abuse detection, support troubleshooting, analytics, or legal obligations. Training exclusion and retention policy are separate controls and must be reviewed separately.

3. What encryption claims should I verify first?

Start with encryption in transit and encryption at rest, then move to key management. Ask who controls the keys, whether they rotate, where they are stored, and whether the vendor can decrypt content in plaintext to serve the model. For high-sensitivity use cases, this distinction matters as much as the encryption algorithm itself.

4. How do I audit logging without getting lost in technical details?

Request the log schema, data classification for each field, retention periods, and access controls. You are looking for evidence of content-bearing fields, prompt excerpts, file names, or user identifiers that can reconstruct a conversation. If logs are not documented, assume there is hidden risk.

5. What should I do if the vendor refuses to answer my questions?

Escalate internally and treat the refusal as a risk indicator. If the vendor will not provide retention, logging, or training details, you should restrict use to low-risk content or reject the platform for sensitive workflows. Lack of transparency is itself a control failure.

Daniel Mercer

Senior Cybersecurity Editor

Senior editor and content strategist. Writing about technology, design, and the future of digital media. Follow along for deep dives into the industry's moving parts.
