Key takeaways

  • The enterprise buying question has shifted from "what can the agent do" to "where does the call data go." Capability is assumed, so the deal now turns on data flow and control.
  • Voice is the heaviest data you can collect. A voiceprint is biometric, it cannot be changed once leaked, and vendor promises like thirty-day deletion and zero retention keep getting overridden by courts, subprocessors, and regulation.
  • The fix already lives in your cloud account. AWS, Azure, and GCP all sell ready-made isolation tiers, and Dograh can run as an open, fully self-hostable orchestration layer inside whichever tier you already trust, so the audio never leaves your boundary.

The buyer question has changed

Two years ago the enterprise demo was about capability. What can the agent do, how human does it sound, how fast does it respond. Now the first hard question in the room is about data flow. I see it on almost every call we run. The buyer wants to know where the audio goes, who can read the transcript, what gets stored, and whether any of it touches a model that learns from it.

Capability is now assumed. Everyone has good voice. The thing being sold is trust, and that shift is not cosmetic. A voice agent that books appointments and handles refunds is impressive, but if the security team cannot draw the data flow on a whiteboard and feel comfortable, the deal stalls in procurement. The question moved from "what can it do" to "where does the call data go," and the second one is much harder to answer with a slide.

Voice is the most sensitive data you collect

A phone call is dense with personal information. Names, account numbers, health details, payment data, and the voice itself. That last part is the one most teams underrate. A voiceprint is a biometric identifier sitting in the same legal category as a fingerprint, and once it leaks the person cannot change it.

This is not abstract. Illinois saw more than a hundred new biometric privacy lawsuits in 2025, with settlements running into the millions, and courts are now treating speaker recognition and diarization as biometric collection that needs written consent. Transcription tools like Fireflies and Otter got hit directly. Voice data carries a kind of liability that text never had, which is why InfoSec teams treat voice pipelines with more caution than any other channel and keep pushing the audio away from third parties. This is the gap Dograh is built to close, by keeping the entire voice stack inside the buyer's own environment.

Nobody trusts the privacy promise anymore

The standard vendor answer used to close the conversation. We delete after thirty days. We do not train on your data. We are SOC 2 certified. Buyers have stopped accepting these as guarantees, and they are right to.

The OpenAI case made the reason obvious. A court ordered OpenAI to preserve every ChatGPT log, including chats users had deleted and sessions meant to be temporary. The thirty-day deletion policy protected no one. A privacy promise is only as strong as the next subpoena. "Zero data retention" has the same weakness. Most of those claims are marketing language, and even when a vendor means it, the data passes through subprocessors across regions, each one another place it can be kept or exposed. On top of that, several SaaS vendors have admitted customer data was used to train third-party models without clear consent.

And often the choice is made for you. In several geographies the central bank requires payment data to be stored only inside the country. India's RBI rule is strict enough that it forced Visa and Mastercard to build local data centers, SEBI adds residency rules for regulated entities on cloud, IRDAI keeps insurance data local, and the DPDP Act carves out sensitive personal data for tighter handling. In Europe, GDPR fines have crossed several billion euros and the EU AI Act raises the ceiling to seven percent of global turnover. For a regulated buyer, air-gapped or in-country deployment stops being an IT preference. It is the only configuration their compliance team will sign.

Your cloud provider already sells the isolation you need

The isolation is already sitting in your cloud account. Every major cloud provider offers a logical perimeter strategy, and most enterprises in healthtech and fintech already use one. On GCP that is VPC Service Controls, and for most buyers it is the most practical option. Every large enterprise already runs on AWS, Azure, or GCP, and already has a signed data agreement, a private network, and key management in place. The three big providers each sell a full ladder of data isolation, from a physical air-gapped box down to logical network walls, private routing, and external key control. You do not have to invent any of this.

Amazon Web Services:

Strategy | Service | Isolation | Complexity | Best for
Physical air-gap | Top Secret/Secret Regions, Outposts, Snowball Edge | Maximum (physical and air-gapped) | Very high | National security, defense, tactical edge
Logical perimeter | VPC Endpoints, IAM data perimeters | High (logical network) | High | Finance and healthcare enforcing exfiltration boundaries
Private routing | PrivateLink | Medium (private backbone) | Medium | Workloads calling internal AI like Bedrock
External keys | KMS External Key Store (XKS) | Medium (cryptographic) | Medium | Sovereign mandates where keys stay outside the cloud

Microsoft Azure:

Strategy | Service | Isolation | Complexity | Best for
Physical air-gap | Azure Government Secret, Azure Stack Hub (Disconnected) | Maximum (physical and air-gapped) | Very high | Defense, intelligence, critical infrastructure
Logical perimeter | Virtual Network (VNet) Service Endpoints | High (logical network) | High | Regulated firms enforcing tight boundaries over internal SaaS
Private routing | Private Link / Private Endpoints | Medium (private backbone) | Medium | Internal Azure OpenAI workloads off the public internet
External keys | Key Vault Managed HSM with external HSM | Medium (cryptographic) | Medium | Multi-nationals needing full sovereignty and key ownership

Google Cloud:

Strategy | Service | Isolation | Complexity | Best for
Physical air-gap | GDC Air-Gapped Appliance | Maximum (physical) | Very high | Defense, national security, regulated utilities
Logical perimeter | VPC Service Controls | High (logical network) | High | Finance, healthcare, PII data protection
Private routing | Private Service Connect | Medium (private routing) | Medium | Corporate networks needing cloud AI speeds
External keys | External Key Management | Medium (cryptographic) | Medium | Strict digital sovereignty and compliance mandates

The pattern is identical across all three. The top rung is fully air-gapped, but it is expensive and overkill for most buyers. Google's GDC Air-Gapped Appliance is a ruggedized box you put in your own data center or at a tactical edge, with zero physical or wireless link to the internet, running a disconnected version of Vertex AI inside it. The middle rung is logical air-gapping, which is the pattern most enterprises actually use. With VPC Service Controls you draw a software perimeter around your workloads, your storage, and your data warehouse, and the boundary blocks egress and ingress across it unless you explicitly allow an exception. Even an employee who misconfigures an IAM permission cannot push data across the wall. The hardware is still shared inside the public cloud, but the network layer is sealed.
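
The logic of a logical perimeter can be modelled in a few lines. The sketch below is a conceptual illustration only, not the API of VPC Service Controls or any other provider: the CIDR ranges and the names PERIMETER_CIDRS, inside_perimeter, and allow_transfer are all hypothetical, and real perimeters are defined in each provider's own terms rather than as a list of IP ranges. What it shows is the one property that matters, which is that an identity permission alone is not enough to move data across the boundary.

```python
import ipaddress

# Hypothetical private ranges that make up the perimeter (assumed values).
PERIMETER_CIDRS = [
    ipaddress.ip_network("10.20.0.0/16"),  # voice workloads
    ipaddress.ip_network("10.30.0.0/16"),  # storage and data warehouse
]


def inside_perimeter(address: str) -> bool:
    """Return True if the address falls in any perimeter range."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in PERIMETER_CIDRS)


def allow_transfer(source: str, destination: str, iam_allows: bool) -> bool:
    """Allow a transfer only if IAM permits it AND both ends sit inside
    the perimeter. A permissive IAM grant cannot override the boundary."""
    return (
        iam_allows
        and inside_perimeter(source)
        and inside_perimeter(destination)
    )


if __name__ == "__main__":
    # Workload to warehouse, both inside the perimeter.
    print(allow_transfer("10.20.4.7", "10.30.1.9", iam_allows=True))
    # Misconfigured IAM grant pointing at an outside address.
    print(allow_transfer("10.20.4.7", "203.0.113.50", iam_allows=True))
```

The first call is allowed and the second is refused even though iam_allows is True, which is the behavior the security team is looking for when they ask where the call data can go.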

So the compliance answer is mostly solved before voice AI even enters the picture. Pick the tier your regulator demands and your provider hands you the perimeter.
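
For a team that wants this ladder in a form it can drop into a runbook, the tables above reduce to a simple lookup. The sketch below only restates the service names already listed for each provider. The structure and the names ISOLATION_LADDER and service_for are hypothetical, and choosing the strategy itself remains a decision for your compliance team, not for code.

```python
# Strategy -> provider -> service, copied from the tables above.
ISOLATION_LADDER = {
    "physical air-gap": {
        "aws": "Top Secret/Secret Regions, Outposts, Snowball Edge",
        "azure": "Azure Government Secret, Azure Stack Hub (Disconnected)",
        "gcp": "GDC Air-Gapped Appliance",
    },
    "logical perimeter": {
        "aws": "VPC Endpoints, IAM data perimeters",
        "azure": "Virtual Network (VNet) Service Endpoints",
        "gcp": "VPC Service Controls",
    },
    "private routing": {
        "aws": "PrivateLink",
        "azure": "Private Link / Private Endpoints",
        "gcp": "Private Service Connect",
    },
    "external keys": {
        "aws": "KMS External Key Store (XKS)",
        "azure": "Key Vault Managed HSM with external HSM",
        "gcp": "External Key Management",
    },
}


def service_for(strategy: str, provider: str) -> str:
    """Look up the service a provider sells for a given isolation strategy.

    Raises KeyError if the strategy or provider is not in the table.
    """
    return ISOLATION_LADDER[strategy.lower()][provider.lower()]


if __name__ == "__main__":
    # Example: the regulator demands a logical perimeter and you run on GCP.
    print(service_for("Logical perimeter", "GCP"))
```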

Dograh is the orchestration layer you actually own

The cloud gives you the room. Something still has to run the voice agent inside it, and this is exactly where the closed platforms break the model. A hosted voice platform has to route your call audio through its own infrastructure to function. Even the ones that advertise self-hosting keep proprietary pieces on their servers or call home for orchestration. Your audio leaves your perimeter, and you are back to trusting a promise instead of a boundary.

Dograh is built the other way around. It is open source and fully self-hostable, with no closed-source component anywhere in the stack. You deploy it inside the cloud account and the isolation tier you already chose. Drop it inside your VPC Service Controls perimeter on GCP, behind your VPC Endpoints on AWS, or inside your VNet on Azure. If you run defense-grade workloads, put it on a GDC Air-Gapped Appliance with no internet at all. In every case the call audio, the transcript, and the voiceprint stay inside the boundary you already pay for and already trust.

That is the structural advantage of open source for voice. The model quality and the speech quality have commoditized, so the real moat is whether the buyer can own the entire stack. Dograh can be owned end to end. A closed, hosted platform cannot, by its own design.

For the security team this is the moment the whiteboard finally makes sense. Audio comes in, audio goes out, nothing crosses the perimeter. The answer to "where does the call data go" becomes "nowhere you do not already control." You are not adding a new vendor to trust. You are running open software on infrastructure you have already cleared.

Talk to us about on-prem

If your compliance team has already flagged data residency or air-gapping for a voice project, that is the conversation we have every week. We run fully managed on-prem and self-hosted Dograh deployments inside your cloud of choice. We scope the isolation tier with your security team, deploy Dograh inside your own boundary, and hand you a voice stack you fully own with no closed-source software in it.

Reach out to Dograh to scope an on-prem deployment: Exploring Dograh on prem