Why We Built the Evidence Engine in Native Code

Most teams treat the media pipeline as an implementation detail. That is a mistake.

In a secure evidence product, recording is not a support feature. It is the product. If the app silently drops frames, mishandles an audio interruption, fails to finalize a segment, or produces a file that cannot be encrypted or played back, the product has already failed. The user does not care that the UI was cross-platform. They care that the evidence exists, is intact, and can be recovered.

That is why we made a hard architectural choice early: camera, recording, segmentation, encryption, fMP4 packaging, and playback live in native code — Swift on iOS, Kotlin on Android. Flutter is the shell. It handles UI, app state, and orchestration. It does not own the hot path.

This was not a stylistic preference. It was a reliability decision.

Reliability lives where the operating system surfaces failure

The common mistake in cross-platform mobile apps is to optimize for code sharing before defining the actual failure modes. That works for forms, onboarding, and settings screens. It fails in production for media systems because the operating system exposes the important events natively.

On iOS, the camera and audio stack tells you what is happening through AVFoundation and AVAudioSession. That is where encoding errors appear. That is where interruptions from calls, Siri, alarms, and route changes appear. That is where the lifecycle constraints are enforced.

On Android, the same principle applies through Camera2, AudioManager focus events, foreground service requirements, and the native media stack. Long-running recording is governed by operating system rules, not framework abstractions. If you want predictable behavior under stress, you need to own those rules directly.
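As a concrete illustration of owning those rules on Android, here is a hedged sketch of reacting to audio focus events directly. `EvidenceRecorder` and its methods are illustrative names, not the real engine API:

```kotlin
import android.content.Context
import android.media.AudioAttributes
import android.media.AudioFocusRequest
import android.media.AudioManager

// Illustrative interface standing in for the native recording engine.
interface EvidenceRecorder {
    fun finalizeSegment(interrupted: Boolean)
    fun resume()
}

class FocusAwareRecorder(context: Context, private val recorder: EvidenceRecorder) {
    private val audioManager =
        context.getSystemService(Context.AUDIO_SERVICE) as AudioManager

    private val focusListener = AudioManager.OnAudioFocusChangeListener { change ->
        when (change) {
            // A call, assistant, or alarm took the audio route:
            // finalize the current segment instead of losing it.
            AudioManager.AUDIOFOCUS_LOSS,
            AudioManager.AUDIOFOCUS_LOSS_TRANSIENT ->
                recorder.finalizeSegment(interrupted = true)
            // Focus returned: resume into a fresh segment.
            AudioManager.AUDIOFOCUS_GAIN -> recorder.resume()
        }
    }

    private val focusRequest = AudioFocusRequest.Builder(AudioManager.AUDIOFOCUS_GAIN)
        .setAudioAttributes(
            AudioAttributes.Builder()
                .setUsage(AudioAttributes.USAGE_MEDIA)
                .setContentType(AudioAttributes.CONTENT_TYPE_SPEECH)
                .build()
        )
        .setOnAudioFocusChangeListener(focusListener)
        .build()

    fun start(): Boolean =
        audioManager.requestAudioFocus(focusRequest) ==
            AudioManager.AUDIOFOCUS_REQUEST_GRANTED
}
```

A plugin might surface this as a single "interrupted" flag; owning the listener means the engine decides, per focus transition, whether a segment finalizes, pauses, or resumes.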

A plugin layer is useful until it is not. In practice, it usually hides exactly the details that matter most when something goes wrong: partial encoder failures, backgrounding edge cases, recorder state transitions, device-specific quirks, and timing interactions between camera, audio, and file finalization.

For a product where the evidence pipeline is the trust boundary, opaque errors are not acceptable.

Flutter is the shell, not the engine

This does not mean Flutter was the wrong choice. It means Flutter was the right choice for the right layer.

We used Flutter for UI, state management, workflow orchestration, and product iteration speed. That bought us a fast interface layer and a shared experience across platforms. But we kept a strict boundary: Flutter talks to the Evidence Engine through a thin Dart facade, not by implementing the media pipeline itself.

The integration model is straightforward. The Dart layer exposes an EvidenceEngineService. That service communicates with native code through MethodChannel for command and control, and EventChannel for long-running event streams such as recorder state transitions and segment-ready notifications. The boundary is intentionally narrow. Flutter can start, stop, arm, subscribe, and react. It does not encode frames, package fragments, or perform live file I/O on the critical path.
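The Android half of that boundary looks roughly like the following sketch. The channel names, method names, and `EngineController` here are illustrative, not the shipped contract:

```kotlin
import io.flutter.embedding.engine.plugins.FlutterPlugin
import io.flutter.plugin.common.EventChannel
import io.flutter.plugin.common.MethodChannel

// Illustrative stand-in for the native recorder/packager.
class EngineController {
    var onSegmentReady: ((String) -> Unit)? = null
    fun start() { /* camera + encoder spin-up lives here */ }
    fun stop() { /* finalize segment, tear down session */ }
}

class EvidenceEnginePlugin : FlutterPlugin {
    private lateinit var methods: MethodChannel
    private lateinit var events: EventChannel
    private val engine = EngineController()

    override fun onAttachedToEngine(binding: FlutterPlugin.FlutterPluginBinding) {
        // Command and control travels over a MethodChannel.
        methods = MethodChannel(binding.binaryMessenger, "evidence_engine/methods")
        methods.setMethodCallHandler { call, result ->
            when (call.method) {
                "start" -> { engine.start(); result.success(null) }
                "stop" -> { engine.stop(); result.success(null) }
                else -> result.notImplemented()
            }
        }
        // Long-running events (recorder state, segment-ready) stream over an EventChannel.
        events = EventChannel(binding.binaryMessenger, "evidence_engine/events")
        events.setStreamHandler(object : EventChannel.StreamHandler {
            override fun onListen(args: Any?, sink: EventChannel.EventSink) {
                engine.onSegmentReady = { path ->
                    sink.success(mapOf("event" to "segmentReady", "path" to path))
                }
            }
            override fun onCancel(args: Any?) { engine.onSegmentReady = null }
        })
    }

    override fun onDetachedFromEngine(binding: FlutterPlugin.FlutterPluginBinding) {
        methods.setMethodCallHandler(null)
        events.setStreamHandler(null)
    }
}
```

Notice that frames never cross this boundary; only commands go down and small event payloads come back up.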

That separation matters more than it sounds.

Once you let the UI layer grow into the engine layer, you get two bad outcomes at once. First, performance degrades because you introduce bridge overhead where throughput and timing matter. Second, debugging gets harder because the ownership model becomes blurry. When a segment fails to finalize, is that a native media issue, a Dart orchestration issue, or a plugin abstraction bug? Teams lose time not because the bug is impossible, but because the architecture made causality harder to see.

A clean boundary keeps failure domains legible.

Contract discipline is part of the architecture

What teams usually underestimate is that native-first architecture is not just about language choice. It is also about API discipline.

Every time a feature touched the native evidence engine, we required the Native API Contract to be defined before implementation. Method names, parameters, payload shapes, event names, error codes, and lifecycle expectations all had to be written down and acknowledged before anyone started coding.
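One way that discipline shows up in code, sketched here with invented names and codes, is a single constants object that every platform must match character for character:

```kotlin
// Illustrative only: pinning down a Native API Contract in code after it is
// written down. The names and error codes here are invented for the example.
object EvidenceContract {
    // Methods -- must match the iOS and Dart sides exactly.
    const val METHOD_START = "startRecording"
    const val METHOD_STOP = "stopRecording"

    // Events emitted on the event stream.
    const val EVENT_STATE = "recorderState"
    const val EVENT_SEGMENT_READY = "segmentReady"

    // Error codes every platform must use; no ad-hoc strings in handlers.
    const val ERR_NOT_ARMED = "E_NOT_ARMED"
    const val ERR_ENCODER_FAILED = "E_ENCODER_FAILED"
    const val ERR_STORAGE_FULL = "E_STORAGE_FULL"
}
```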

That sounds procedural. It is actually architectural.

Cross-platform apps break down when the boundary between the shared layer and the native layer is negotiated informally in code review. One engineer adds a method on iOS, Android names it differently, Dart handles one error shape but not another, and the product team discovers late that each platform interpreted the flow slightly differently. None of those bugs are glamorous, but they create the integration drag that makes cross-platform work feel fragile.

We treated the native boundary like an external API, even though we owned both sides. That reduced rework, stabilized the interface, and made platform behavior more consistent over time.

In practice, this is one of the highest leverage habits on a team building reliability-critical mobile systems.

fMP4 is not an implementation detail

A lot of the architecture followed from one decision: we packaged video as fragmented MP4 (fMP4), not a single-shot MP4 file.

That choice solved a real product problem. We wanted playback to start quickly, and we wanted segments to become uploadable as soon as they were finalized. If the device is lost or taken mid-session, already-completed fragments can still survive remotely. That is materially different from a monolithic file format that only becomes useful at the end.

Once you commit to fMP4, the stack choice becomes clearer. Packaging fragments correctly is not a Dart-side convenience problem. On iOS, that means AVAssetWriter and AVAssetWriterInput. On Android, that means MediaRecorder or MediaMuxer, plus the platform media stack around them. The container behavior, timing, finalization, and buffering are native concerns.
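To make the segmentation idea concrete, here is a deliberately simplified Android sketch where each finalized segment becomes an independently usable file. Real fMP4 fragment packaging is more involved than this; `SegmentWriter` and `onSegmentFinalized` are illustrative names:

```kotlin
import android.media.MediaCodec
import android.media.MediaFormat
import android.media.MediaMuxer
import java.nio.ByteBuffer

class SegmentWriter(private val onSegmentFinalized: (String) -> Unit) {
    private var muxer: MediaMuxer? = null
    private var track = -1
    private var index = 0
    private var currentPath = ""

    // Open a fresh container for the next segment.
    fun openSegment(dir: String, format: MediaFormat) {
        currentPath = "$dir/segment_${index++}.mp4"
        muxer = MediaMuxer(currentPath, MediaMuxer.OutputFormat.MUXER_OUTPUT_MPEG_4).also {
            track = it.addTrack(format)
            it.start()
        }
    }

    fun writeSample(buffer: ByteBuffer, info: MediaCodec.BufferInfo) {
        muxer?.writeSampleData(track, buffer, info)
    }

    // Finalize eagerly so the segment is uploadable the moment it closes.
    fun finalizeSegment() {
        muxer?.apply { stop(); release() }
        muxer = null
        onSegmentFinalized(currentPath)
    }
}
```

The timing of `stop()` and `release()` relative to the encoder drain is exactly the kind of native concern a plugin layer tends to hide.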

This also forced one of the more important guardrails in the system: we do not rely on MP4 rotation metadata for fMP4. We rotate pixel buffers before encoding so the encoded frames are already in the correct orientation. Then we assert in debug paths that we never fall back to appending unrotated sample buffers.

That kind of rule matters because media bugs tend to survive basic testing. A file can be technically valid and still fail the product requirement. It might upload, but play rotated on one client. It might play locally, but break downstream when streamed or segmented. The right response is not "we'll fix it later in playback." The right response is to make the encoded output correct at the source.
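The "correct at the source" rule can be enforced mechanically. A hedged sketch, with `Frame` as an illustrative type: reject any buffer that still needs rotation before it reaches the encoder, rather than patching orientation in playback.

```kotlin
// Illustrative frame metadata; in the real pipeline this would accompany a pixel buffer.
data class Frame(val width: Int, val height: Int, val appliedRotationDegrees: Int)

class OrientationGuard(private val expectedRotation: Int) {
    // Debug-path assertion: we never append a buffer that still needs rotation.
    fun checkBeforeEncode(frame: Frame): Frame {
        check(frame.appliedRotationDegrees == expectedRotation) {
            "Frame not pre-rotated: has ${frame.appliedRotationDegrees}, " +
                "expected $expectedRotation"
        }
        return frame
    }
}
```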

Native control is also a performance decision

There is a second reason to keep the hot path native: cost.

Every frame that crosses an unnecessary abstraction boundary adds overhead. Every large file operation done in the wrong runtime adds latency. Every crypto operation handled in a slow fallback path turns into battery, thermal, and time cost on the device.

For encryption, we used platform crypto on the hot path whenever available. On iOS that means native crypto APIs. On Android that means Conscrypt and platform-backed key storage. AES-GCM belongs close to the platform, especially when it is part of a live media workflow.

The Dart layer still has a role. It can coordinate and fall back when needed. But fallback is exactly that, fallback. For local file encryption and decryption, the system can call native-accelerated paths such as encryptFileNative and decryptFileNative, with an isolate-based Dart implementation available when native acceleration is not present. That is a pragmatic design: one architecture, two execution paths, with the fast path where it belongs.
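For orientation, here is a minimal JVM sketch of the AES-GCM shape involved. On a device the hot path would run through Conscrypt and Keystore-backed keys, and segment files would be streamed rather than read whole; the `encryptFile`/`decryptFile` names here are illustrative, not the engine's API:

```kotlin
import java.io.File
import java.security.SecureRandom
import javax.crypto.Cipher
import javax.crypto.SecretKey
import javax.crypto.spec.GCMParameterSpec

private const val GCM_TAG_BITS = 128
private const val IV_BYTES = 12

fun encryptFile(input: File, output: File, key: SecretKey) {
    val iv = ByteArray(IV_BYTES).also { SecureRandom().nextBytes(it) }
    val cipher = Cipher.getInstance("AES/GCM/NoPadding")
    cipher.init(Cipher.ENCRYPT_MODE, key, GCMParameterSpec(GCM_TAG_BITS, iv))
    // Prefix the IV so decryption is self-describing; the auth tag rides at the end.
    output.writeBytes(iv + cipher.doFinal(input.readBytes()))
}

fun decryptFile(input: File, output: File, key: SecretKey) {
    val blob = input.readBytes()
    val cipher = Cipher.getInstance("AES/GCM/NoPadding")
    cipher.init(
        Cipher.DECRYPT_MODE, key,
        GCMParameterSpec(GCM_TAG_BITS, blob.copyOfRange(0, IV_BYTES))
    )
    output.writeBytes(cipher.doFinal(blob.copyOfRange(IV_BYTES, blob.size)))
}
```

The algorithm is identical in the Dart fallback; what the native path buys is hardware-backed keys and accelerated primitives on the live media path.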

The principle is simple. Shared code is valuable. Shared bottlenecks are not.

Lifecycle handling is where abstractions usually break

Media systems do not fail during happy-path demos. They fail during interrupts.

A phone call arrives. Siri grabs the audio session. The app backgrounds. Android enforces foreground service constraints. A recorder is still writing when the user changes state. These are not edge cases for an evidence app. They are normal operating conditions.

So we designed for them explicitly.

On iOS, the recording flow owns the AVAudioSession lifecycle. It activates before start, deactivates on stop, and treats interruptions as recoverable state transitions, not mysterious failures. If an interruption occurs, the system finalizes the current segment, marks the recording as interrupted rather than failed, and resumes when the session allows.

On Android, long-running recording lives behind a foreground service with the notification behavior the OS requires. That is not optional. It is how the platform guarantees the work can continue. We also kept the notification copy neutral, because the operating system surface is part of the product surface.
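The shape of that foreground service, sketched with an illustrative channel id and placeholder notification copy, looks roughly like this:

```kotlin
import android.app.Notification
import android.app.NotificationChannel
import android.app.NotificationManager
import android.app.Service
import android.content.Intent
import android.os.Build
import android.os.IBinder

class RecordingService : Service() {
    override fun onStartCommand(intent: Intent?, flags: Int, startId: Int): Int {
        // Promote to foreground before any long-running capture work begins.
        startForeground(NOTIFICATION_ID, buildNotification())
        return START_STICKY
    }

    private fun buildNotification(): Notification {
        if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.O) {
            val channel = NotificationChannel(
                CHANNEL_ID, "Recording", NotificationManager.IMPORTANCE_LOW
            )
            getSystemService(NotificationManager::class.java)
                .createNotificationChannel(channel)
        }
        // Neutral copy: the OS notification is part of the product surface.
        return Notification.Builder(this, CHANNEL_ID)
            .setContentTitle("Session active")
            .setSmallIcon(android.R.drawable.ic_media_play) // placeholder icon
            .build()
    }

    override fun onBind(intent: Intent?): IBinder? = null

    companion object {
        private const val CHANNEL_ID = "recording"
        private const val NOTIFICATION_ID = 1
    }
}
```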

This is where native code earns its keep. You are not fighting the platform. You are using the platform on its own terms.

The larger lesson is about where to spend abstraction

There is a broader product lesson here.

Cross-platform is not a religion. It is a budgeting exercise. You should spend abstraction where the work is similar and preserve native control where the failure modes are platform-specific and expensive.

For us, the right split was clear. Product surfaces, stateful UI, and orchestration could be shared. The evidence engine could not. It needed direct access to camera APIs, encoder behavior, audio session events, foreground service constraints, native crypto, and platform playback. It needed one implementation per platform, not one compromise shared by both.

That does create more native code. It also creates a system you can reason about.

And in a product like this, that tradeoff is worth making. The common mistake is to ask, "How much can we keep in Dart?" The better question is, "Which parts of this system are too important to abstract away?"

Our answer was the evidence pipeline.

I would make the same decision again.
