Most privacy claims collapse at the moment they matter. A system is "private" until someone asks for the data in plaintext. Then the difference between policy and architecture becomes obvious.
The standard that matters is narrower and harder: under compulsion, can the company actually produce sensitive data? If the answer is yes, the privacy model is weaker than it sounds.
That was the design constraint here. We wanted a secure evidence system where the backend could not decrypt stored evidence, and could not hand over a durable plaintext contact list for trusted-contact alerts. That forced a more disciplined architecture than the usual "we encrypt data at rest" story. It required two separate zero-knowledge boundaries: one for evidence, and one for the SMS send list.
The important point is not that we promise restraint. It is that the system is designed so restraint is not the thing you have to trust.
The real scope of zero-knowledge
Teams often treat zero-knowledge as a label for storage encryption. That is too shallow. The real question is what the server can ever decrypt, what it can ever persist in plaintext, and what it only sees transiently because the product cannot function otherwise.
In this system, there were two sensitive surfaces.
The first was evidence. Recordings had to remain ciphertext from capture through storage and playback. The server could issue URLs, coordinate uploads, and manage metadata, but it could not be part of the decryption path.
The second was the contact graph. Trusted-contact alerting sounds operationally simple until you apply the same privacy standard. To send an SMS, the system needs a phone number. If you store phone numbers in plaintext, you have created a durable, discoverable dataset. That may be convenient for delivery, but it fails the architectural test.
So the design goal became precise: the server should never hold long-lived decryption keys for evidence, and it should never persist trusted-contact phone numbers in durable plaintext storage.
Evidence needs two encryption layers, not one
Evidence has two different threat models.
One is local. A device can be lost, seized, or casually inspected. That means recordings need protection before they ever leave the phone.
The other is remote. Once evidence is backed up to a vault, the backend must not gain the ability to decrypt it. Otherwise "encrypted storage" turns into a policy decision about who gets access to keys.
Those are different problems, so they need different keys.
Layer one: on-device encryption
On first use, the app generates a 256-bit random master key and stores it in platform secure storage using the iOS Keychain or Android Keystore. That key is not uploaded to the server.
Each recording gets its own encryption key derived from that master key using HKDF. The salt is the recording ID, the info string is "anchor-recording-key", and the output is 32 bytes. The recording itself is encrypted with AES-256-GCM using a per-file 12-byte IV and a 16-byte authentication tag. The stored format is straightforward: IV, ciphertext, then tag. Plaintext is deleted after encryption, and the file on disk is a .enc artifact.
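A minimal sketch of this layer in Python, using the `cryptography` package. The HKDF parameters and the IV-ciphertext-tag layout come from the description above; the function names and the sample master key are illustrative, not the app's actual API:

```python
import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.kdf.hkdf import HKDF
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def derive_recording_key(master_key: bytes, recording_id: str) -> bytes:
    # HKDF-SHA256: salt = recording ID, info = "anchor-recording-key", 32-byte output
    return HKDF(
        algorithm=hashes.SHA256(),
        length=32,
        salt=recording_id.encode(),
        info=b"anchor-recording-key",
    ).derive(master_key)

def encrypt_recording(master_key: bytes, recording_id: str, plaintext: bytes) -> bytes:
    key = derive_recording_key(master_key, recording_id)
    iv = os.urandom(12)                                    # per-file 12-byte IV
    ct_and_tag = AESGCM(key).encrypt(iv, plaintext, None)  # ciphertext || 16-byte tag
    return iv + ct_and_tag                                 # stored .enc layout: IV, ciphertext, tag

def decrypt_recording(master_key: bytes, recording_id: str, blob: bytes) -> bytes:
    key = derive_recording_key(master_key, recording_id)
    iv, ct_and_tag = blob[:12], blob[12:]
    return AESGCM(key).decrypt(iv, ct_and_tag, None)
```

Because each recording key is derived, losing one key never exposes a second recording, and the master key itself never has to leave platform secure storage.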
This layer protects local evidence. It ensures that even before upload, the system is not relying on filesystem obscurity or device goodwill.
Layer two: vault encryption
Local encryption is not enough once evidence leaves the device. The upload path needs a second key with a different trust boundary.
For vault backup, the client derives a vault key from a 6-digit access code plus the incident ID using PBKDF2-HMAC-SHA256 with 10,000 iterations and a 32-byte output. The password material is effectively accessCode:incidentId, and the salt is the incident ID. Chunks are encrypted with that key using AES-256-GCM on the client before upload, so object storage only ever receives ciphertext.
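The derivation itself fits in a few lines of standard-library Python. The parameters match the text above; the function name and sample identifiers are illustrative:

```python
import hashlib

def derive_vault_key(access_code: str, incident_id: str) -> bytes:
    # PBKDF2-HMAC-SHA256, 10,000 iterations, 32-byte output.
    # Password material is "accessCode:incidentId"; the salt is the incident ID.
    password = f"{access_code}:{incident_id}".encode()
    return hashlib.pbkdf2_hmac("sha256", password, incident_id.encode(), 10_000, dklen=32)
```

Note that everything needed to derive this key exists only on the client: the access code is never sent to the backend, so the backend can never perform this derivation.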
That changes the backend's role. It stores ciphertext only. It can generate signed PUT and GET URLs. It can manage retention and manifests. But it never has the vault key required to decrypt the uploaded content.
Playback follows the same rule. The viewer fetches ciphertext via signed URLs, derives the key locally from the access code and incident ID, and decrypts in memory on the client. The server is involved in access control and URL issuance, not decryption.
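A sketch of the viewer's side of that flow, assuming each fetched chunk uses an IV-ciphertext-tag layout (an assumption here; the article specifies AES-256-GCM per chunk but not the exact byte layout):

```python
import hashlib
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def decrypt_chunk(access_code: str, incident_id: str, chunk: bytes) -> bytes:
    # Derive the vault key locally, exactly as at upload time. The server
    # that issued the signed GET URL never sees access_code, so it cannot
    # perform this step.
    password = f"{access_code}:{incident_id}".encode()
    key = hashlib.pbkdf2_hmac("sha256", password, incident_id.encode(), 10_000, dklen=32)
    iv, ct_and_tag = chunk[:12], chunk[12:]   # assumed layout: IV || ciphertext || tag
    return AESGCM(key).decrypt(iv, ct_and_tag, None)
```

The GCM tag check also means a tampered or corrupted chunk fails loudly on the client instead of decrypting to garbage.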
That is the line that matters. The backend can hand over encrypted chunks and signed URLs. It cannot hand over plaintext evidence because it never had the key.
The harder problem was the SMS send list
Evidence is conceptually clean because the server never needs to inspect the bytes. SMS alerting is messier because delivery requires a destination number.
A conventional implementation would store trusted-contact numbers in plaintext E.164 format. That would make alerting easy and privacy weak. If asked to produce the contact list, the company could export it directly from storage.
We wanted a different answer.
Encrypting contact numbers to the inviter's public key
Each inviter's device generates an RSA-2048 key pair. The private key stays on the device in secure storage. The public key is registered with the backend.
When a trusted contact opts in by SMS, the server receives the phone number from the webhook, encrypts it to the inviter's public key using RSA-OAEP with SHA-256, and stores only the ciphertext. The backend has the public key and the ciphertext. It does not have the private key, so it cannot decrypt the number later.
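The asymmetry is the whole point, and it is easy to see in code. A sketch with the `cryptography` package; the phone number is a placeholder, and in the real system the encrypt and decrypt steps run on different machines (the decrypt side never runs on the server):

```python
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

OAEP_SHA256 = padding.OAEP(
    mgf=padding.MGF1(algorithm=hashes.SHA256()),
    algorithm=hashes.SHA256(),
    label=None,
)

# Inviter's device: the private key never leaves secure storage.
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()   # this is what gets registered with the backend

# Server side (webhook handler): encrypt the opted-in number, store only the blob.
ciphertext = public_key.encrypt(b"+15551234567", OAEP_SHA256)

# Only the inviter's device can perform this step.
plaintext = private_key.decrypt(ciphertext, OAEP_SHA256)
```

The server ends the transaction holding a 256-byte opaque blob per contact. Even a full dump of its storage yields nothing a third party can dial.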
That changes what "contact storage" means. The system is not storing a reusable phone directory. It is storing opaque encrypted blobs that are only meaningful to the inviter's device.
Delivery without durable plaintext
The hard part comes at send time. Twilio needs actual E.164 numbers. Someone, somewhere, has to resolve the ciphertext into destinations.
The design choice was to do that resolution on the client, not the server.
When the user arms alerts, the app requests the stored ciphertexts for the relevant trusted contacts. The backend returns ciphertexts only. The app decrypts them locally using the inviter's private key, validates the E.164 results, and sends the resolved list as sendListE164 in the arm-alerts request.
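A sketch of that client-side resolution step, assuming the `cryptography` package. The E.164 check here is a simple illustrative regex, not the app's actual validator:

```python
import re
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

E164 = re.compile(r"^\+[1-9]\d{1,14}$")   # E.164: '+', then up to 15 digits
OAEP = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)

def resolve_send_list(private_key, ciphertexts):
    # Runs on the inviter's device at arm time: decrypt each stored blob,
    # keep only valid E.164 results, and return the list that will be sent
    # as sendListE164 in the arm-alerts request.
    numbers = [private_key.decrypt(ct, OAEP).decode() for ct in ciphertexts]
    return [n for n in numbers if E164.match(n)]
```

Plaintext numbers exist here only in device memory, for the duration of building the request.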
From there, the worker passes that list into a per-incident Durable Object, which uses it for SMS delivery and retries. The list exists only for that delivery window. It is not written into the relational database. It is not persisted into the main key-value stores. It is wiped during terminal cleanup after the retry window ends.
That is an important distinction. The server does transient operational work with a send list because the product has to send messages. But it does not turn that operational necessity into a durable plaintext database of who can be contacted.
At scale, this is where privacy systems usually fail. Not in the primary design, but in the convenience layer built around it.
Storage discipline matters as much as cryptography
A zero-knowledge claim is only as strong as the side channels around the main flow.
You can encrypt evidence correctly and still leak sensitive identifiers through install blobs, retry caches, logs, or relational tables. Most privacy architectures do not fail because AES-GCM is wrong. They fail because a helper table or debugging shortcut quietly reintroduces plaintext.
That is why the rest of the storage model matters.
Install blobs do not store trusted-contact phone numbers in plaintext. Delivery-cache records use identifiers like { contactKey, incidentId, token }, not raw E.164 destinations. Opt-out state is keyed by an HMAC of the phone number, so the system can answer "is this number opted out?" when a webhook supplies a number, without being able to enumerate a contact list from storage. Relational tables store token hashes, code hashes, and encrypted incident identifiers where needed, but not plaintext phone numbers.
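The opt-out pattern is worth spelling out, since it shows how a lookup can work without a reversible index. A standard-library sketch; the key name and function are illustrative:

```python
import hmac
import hashlib

def opt_out_key(server_hmac_key: bytes, phone_e164: str) -> str:
    # Deterministic keyed hash: given a number (e.g. from an inbound webhook),
    # the server can look up opt-out state by recomputing this key. But the
    # stored keys cannot be reversed into a contact list, and without the
    # server's HMAC key they cannot even be brute-forced offline.
    return hmac.new(server_hmac_key, phone_e164.encode(), hashlib.sha256).hexdigest()
```

The HMAC key matters: a plain unsalted hash of a phone number is trivially brute-forceable over the small space of valid numbers, while a keyed hash is only computable by the server itself.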
This is not glamorous work, but it is the work that makes the privacy model real. The system has to be designed so that the obvious operational shortcuts are structurally unavailable.
Reliability shaped the design too
Security design gets more interesting when it has to survive actual product conditions.
The core product constraint here was simple: the device may be lost at any time. That means recording has to work with no network, encryption has to happen locally, and upload has to begin as soon as segments are ready. If only one chunk survives before the device disappears, that chunk should still make it to remote storage.
That is why the recording pipeline is segmented and why fragmented MP4 matters. With fMP4, the file can be finalized in fragments and uploaded incrementally while recording continues. Chunk 0 is prioritized because the earliest evidence is often the most valuable if the session is cut short. The backend still stores ciphertext only, but the client gets a realistic chance to move evidence off-device under hostile conditions.
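The ordering rule is simple enough to state as a toy sketch. This is an illustration of the priority policy described above, not the actual upload scheduler:

```python
def upload_order(pending_chunk_indices: list[int]) -> list[int]:
    # Chunk 0 jumps the queue; remaining chunks go oldest-first, so the
    # earliest evidence has the best chance of reaching remote storage
    # before the device disappears.
    return sorted(pending_chunk_indices, key=lambda idx: (idx != 0, idx))
```

In practice the same policy applies under retry: if uploads fail and resume, chunk 0 is always the first candidate again.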
This is also why hot-path crypto uses native acceleration where available. Camera, encoding, packaging, and encryption are not support functions. They are the product. Once that is true, performance and reliability stop being implementation details and start becoming architecture decisions.
What we can claim honestly
This is not "perfect zero-knowledge" across every part of the system, and pretending otherwise would make the design less credible.
There are accepted tradeoffs.
The vault access code is six digits because usability matters. Push tokens still exist on the server because alerts have to be delivered. The send list exists transiently in server memory during the delivery window because an SMS provider cannot send to ciphertext. Those are real constraints.
What matters is how narrowly those constraints are scoped.
For the two most sensitive data classes, evidence contents and trusted-contact phone numbers, the long-lived server-side system does not hold the decryption keys and does not persist plaintext in durable storage. Under compulsion, the company can produce ciphertext, public keys, hashes, and short-lived operational records. It cannot produce a plaintext contact database from storage, and it cannot decrypt vault evidence on the server.
That is the standard worth defending. Privacy is strongest when it is enforced by architecture rather than intention. Once the server is outside the decryption path and outside long-lived plaintext storage, the trust model changes. You are no longer asking users to believe you will do the right thing later. You are showing them what the system is incapable of doing now.