Architecting a Telehealth Platform: What We Learned Building One

Telehealth sounds simple on paper. Two people join a video call, the doctor writes a prescription, done. In practice, you are building a real-time communication system on top of a regulated healthcare workflow, and the architecture decisions you make in week one will haunt you (or save you) for years.

Here is what we learned building telehealth infrastructure, and what we would do differently.

Choosing Your Video Infrastructure

You have three serious options: Twilio Video, Daily.co, and LiveKit. Each comes with real trade-offs.

Twilio Video is the safe enterprise pick. It has the best documentation, solid reliability, and your compliance team will feel comfortable with its BAA (business associate agreement) process. The downside is cost. At scale, Twilio's per-minute pricing adds up fast. If you are running 15-minute appointments, the math works. If you are doing 60-minute therapy sessions, start budgeting carefully. Twilio also gives you the least control over the underlying WebRTC infrastructure.
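To make the sensitivity to visit length concrete, here is a back-of-the-envelope sketch. It assumes billing per participant-minute, and the default rate is a placeholder we made up for illustration, not a quoted price. Substitute your vendor's current numbers.

```python
def monthly_video_cost(visits_per_month: int,
                       minutes_per_visit: int,
                       participants: int = 2,
                       rate_per_participant_minute: float = 0.004) -> float:
    """Estimate monthly video spend under per-participant-minute billing.

    The default rate is a hypothetical placeholder, not a real price.
    """
    participant_minutes = visits_per_month * minutes_per_visit * participants
    return participant_minutes * rate_per_participant_minute


# Same visit volume, different visit lengths.
short_visits = monthly_video_cost(10_000, 15)
therapy_visits = monthly_video_cost(10_000, 60)
```

Whatever the real rate is, cost scales linearly with minutes, so the 60-minute practice pays four times what the 15-minute practice pays for the same number of visits.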

Daily.co hits a sweet spot for most startups. Their API is cleaner than Twilio's, pricing is more predictable, and they handle a lot of the edge cases (browser compatibility, network recovery) that you would otherwise build yourself. The trade-off is that they are a smaller company. If your compliance team needs a vendor with a long track record, that conversation gets harder.

LiveKit is the open-source option, and it is genuinely good. You can self-host, which gives you full control over where PHI flows. The catch: you are now responsible for scaling WebRTC infrastructure, handling TURN servers, managing certificate rotation, and debugging browser-specific WebRTC quirks at 2am. LiveKit is the right choice if you have a DevOps team that can own it. If you are a small team, go with Daily and revisit later.

One thing worth noting across all three: test on real hospital networks. Hospital WiFi is notoriously unreliable, and aggressive firewalls often block the UDP traffic and non-standard ports that WebRTC prefers. Whatever provider you choose, verify that it works behind restrictive corporate firewalls, including when media has to be relayed through TURN over TLS on port 443. TURN server support is not optional.

Also consider your fallback strategy. What happens when video fails mid-appointment? The answer should not be "the appointment is lost." Build in phone call fallback, chat fallback, or at minimum a reconnection flow that preserves the session state. Patients on spotty connections need a graceful degradation path, not an error screen.
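One way to think about graceful degradation is as a small state machine attached to the appointment rather than to the video call. The sketch below is illustrative only: the channel order, the retry limit, and the `VisitSession` name are assumptions, not part of any provider's SDK.

```python
from dataclasses import dataclass, field
from enum import Enum


class Channel(Enum):
    VIDEO = "video"
    PHONE = "phone"
    CHAT = "chat"


# Assumed degradation order: try video first, then phone, then chat.
FALLBACK_ORDER = [Channel.VIDEO, Channel.PHONE, Channel.CHAT]


@dataclass
class VisitSession:
    appointment_id: str
    channel: Channel = Channel.VIDEO
    reconnect_attempts: int = 0
    max_reconnects: int = 3
    history: list = field(default_factory=list)

    def on_disconnect(self) -> Channel:
        """Retry the current channel a few times, then degrade to the next one.

        The appointment itself is never dropped; only the channel changes.
        """
        self.history.append(self.channel)
        if self.reconnect_attempts < self.max_reconnects:
            self.reconnect_attempts += 1
            return self.channel
        idx = FALLBACK_ORDER.index(self.channel)
        if idx + 1 < len(FALLBACK_ORDER):
            self.channel = FALLBACK_ORDER[idx + 1]
            self.reconnect_attempts = 0
        return self.channel
```

Because the session object outlives any single connection, intake answers, notes in progress, and the visit timer can stay attached to it while the channel changes underneath.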

Scheduling System Design

Do not build your own scheduling engine from scratch. That said, do not just drop in Calendly either.

Healthcare scheduling has constraints that generic tools ignore: provider licensure by state, insurance-based appointment types, required buffer time between certain visit types, and the need to block time for charting. Your data model needs to account for all of this.

We settled on a design with three layers:

Availability Layer: Providers set recurring availability windows. These get stored as RRULE patterns (the same format iCal uses). This handles "I work Mondays 9-5 except holidays" without creating thousands of individual slot records.
Slot Generation Layer: A background job materializes available slots for the next 30 days. This is where business rules live: minimum appointment duration, buffer times, state licensure filtering. Generated slots go into a Redis-backed cache for fast reads.
Booking Layer: When a patient selects a slot, we use optimistic locking in Postgres to prevent double-booking. The slot gets reserved for 10 minutes while the patient completes intake forms. If they abandon, the slot releases back automatically.
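The booking rule above can be sketched as a version check plus an expiring hold. This is an in-memory stand-in for the database logic, with hypothetical names (`Slot`, `try_hold`); in Postgres the same check would be a single conditional UPDATE, as the docstring notes.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

HOLD_DURATION = timedelta(minutes=10)


@dataclass
class Slot:
    slot_id: str
    version: int = 0
    held_by: Optional[str] = None
    hold_expires_at: Optional[datetime] = None


def try_hold(slot: Slot, patient_id: str, expected_version: int, now: datetime) -> bool:
    """Reserve the slot only if nobody changed it since the caller read it.

    In Postgres this would be one conditional statement along the lines of
    UPDATE slots SET held_by = ..., version = version + 1
    WHERE id = ... AND version = <expected_version>,
    where zero affected rows means another booking won the race.
    """
    hold_active = slot.held_by is not None and slot.hold_expires_at > now
    if slot.version != expected_version or hold_active:
        return False
    slot.held_by = patient_id
    slot.hold_expires_at = now + HOLD_DURATION
    slot.version += 1
    return True
```

An expired hold needs no cleanup job in this sketch: the next caller who reads the current version after the expiry time simply takes the slot over.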

One thing we got wrong initially: we tried to compute available slots on the fly. It was technically correct but painfully slow once we had 50+ providers. Pre-materializing slots was the right call.
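For illustration, the core of a materialization job can be very small. The sketch below covers only visit duration and buffer time for a single availability window; licensure filtering, RRULE expansion, and the cache write are left out, and the function name is our own.

```python
from datetime import datetime, timedelta, timezone


def materialize_slots(window_start: datetime, window_end: datetime,
                      visit_minutes: int, buffer_minutes: int) -> list:
    """Return (start, end) pairs that fit inside the window.

    Each visit is followed by a buffer, so consecutive starts are
    visit_minutes + buffer_minutes apart.
    """
    visit = timedelta(minutes=visit_minutes)
    step = visit + timedelta(minutes=buffer_minutes)
    slots = []
    start = window_start
    while start + visit <= window_end:
        slots.append((start, start + visit))
        start += step
    return slots


morning = materialize_slots(
    datetime(2026, 1, 5, 9, 0, tzinfo=timezone.utc),
    datetime(2026, 1, 5, 12, 0, tzinfo=timezone.utc),
    visit_minutes=30,
    buffer_minutes=10,
)
```

Running this once per provider per day in a background job, and reading the results at request time, is what replaces the slow on-the-fly computation.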

Time zones are another hidden complexity. A provider licensed in California and New York might see patients in both time zones. Display all times in the patient's local zone, store everything in UTC, and make sure your slot generation handles DST transitions correctly. We found a bug during the spring-forward transition that created overlapping slots. Not fun.
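A minimal sketch of one way to avoid that class of bug: convert the edges of the provider's local window to UTC once, then step in UTC. This is not the exact fix we shipped, and the function name and parameters are hypothetical. It uses Python's standard zoneinfo module, which needs a time zone database on the host.

```python
from datetime import date, datetime, timedelta, timezone
from zoneinfo import ZoneInfo


def utc_slots_for_local_window(day: date, start_hour: int, end_hour: int,
                               tz_name: str, slot_minutes: int) -> list:
    """Turn a provider's local availability window into UTC slot start times."""
    tz = ZoneInfo(tz_name)
    # Convert the window edges to UTC once, then step in UTC. Stepping in
    # local wall-clock time can yield nonexistent or overlapping slots
    # around a DST transition.
    start = datetime(day.year, day.month, day.day, start_hour, tzinfo=tz).astimezone(timezone.utc)
    end = datetime(day.year, day.month, day.day, end_hour, tzinfo=tz).astimezone(timezone.utc)
    step = timedelta(minutes=slot_minutes)
    slots = []
    while start + step <= end:
        slots.append(start)
        start += step
    return slots


# 8 March 2026 is the US spring-forward date: 02:00 to 03:00 local does not exist.
slots = utc_slots_for_local_window(date(2026, 3, 8), 0, 4, "America/New_York", 30)
```

On that date the midnight-to-4am local window is only three real hours long, so the function returns six distinct 30-minute slots instead of eight, and none of them overlap.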

Cancellation and rescheduling logic also needs thought upfront. Healthcare no-show rates run 15 to 30 percent. Build waitlist functionality so cancelled slots get offered to patients who want earlier availability. This is a revenue problem as much as a technical one.

Prescription Workflows

E-prescribing is where telehealth gets legally interesting. You cannot just let a doctor type "take 2 pills daily" into a text field. You need to integrate with a certified e-prescribing network, typically Surescripts.

The integration itself is not technically hard. It is a SOAP API (yes, in 2026). The hard part is the certification process. Surescripts requires your application to pass a battery of tests, and the turnaround time is measured in months. Start this process early.

For controlled substances (EPCS), the requirements jump significantly. You need identity proofing for each prescriber, two-factor authentication on every prescription signing, and an auditable log of every action. Most teams integrate with a vendor like DrFirst or DoseSpot rather than building EPCS in-house. That is the right call unless prescribing is your core product.

Your prescription data model should track the full lifecycle: created, signed, transmitted, received by pharmacy, dispensed, and any error states along the way. Patients will call asking "where is my prescription" and you need to answer that question quickly.
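As a sketch, that lifecycle maps naturally onto an explicit state machine with an append-only history. The status names mirror the list above; the transition table and the single catch-all ERROR state are simplifying assumptions, and a real integration would likely distinguish several error types.

```python
from enum import Enum


class RxStatus(Enum):
    CREATED = "created"
    SIGNED = "signed"
    TRANSMITTED = "transmitted"
    RECEIVED_BY_PHARMACY = "received_by_pharmacy"
    DISPENSED = "dispensed"
    ERROR = "error"


# Assumed transitions: each step can only move forward or fail.
ALLOWED_TRANSITIONS = {
    RxStatus.CREATED: {RxStatus.SIGNED, RxStatus.ERROR},
    RxStatus.SIGNED: {RxStatus.TRANSMITTED, RxStatus.ERROR},
    RxStatus.TRANSMITTED: {RxStatus.RECEIVED_BY_PHARMACY, RxStatus.ERROR},
    RxStatus.RECEIVED_BY_PHARMACY: {RxStatus.DISPENSED, RxStatus.ERROR},
    RxStatus.DISPENSED: set(),
    RxStatus.ERROR: set(),
}


def advance(history: list, new_status: RxStatus) -> list:
    """Return a new history with new_status appended, or raise if the move is illegal."""
    current = history[-1]
    if new_status not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current.name} to {new_status.name}")
    return history + [new_status]
```

Keeping the full history rather than a single status column is what lets support staff see where a prescription stalled, not just that it did.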

PHI Handling Across Layers

Protected Health Information touches every layer of your stack, and each layer needs its own controls.

Client Layer: Never cache PHI in localStorage or sessionStorage. Use in-memory state only. Set appropriate cache-control headers so the browser does not store responses to disk. If you are building a mobile app, encrypt the local database (SQLCipher for SQLite).
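On the server side, the cache-control point usually comes down to sending the `no-store` directive on every response that carries PHI. A minimal, framework-agnostic sketch follows; the helper name is hypothetical.

```python
# "no-store" tells browsers and intermediaries not to keep a copy of the response.
# "Pragma: no-cache" is included only for legacy HTTP/1.0 caches.
PHI_RESPONSE_HEADERS = {
    "Cache-Control": "no-store",
    "Pragma": "no-cache",
}


def with_phi_headers(headers: dict) -> dict:
    """Return a copy of the response headers with caching disabled."""
    return {**headers, **PHI_RESPONSE_HEADERS}
```

In most frameworks this would live in a middleware applied to every PHI-bearing route, so that no individual handler can forget it.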

API Layer: Log request metadata but never log request or response bodies that contain PHI. Use structured logging with explicit field allowlists. Your API should authenticate every request (JWTs with short expiration), authorize at the resource level (can this user see this patient's data?), and rate-limit aggressively.
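A field allowlist is easy to sketch: anything not explicitly named never reaches the log line. The field names below are illustrative assumptions. Note the use of a route template rather than the raw path, since raw paths often embed patient identifiers.

```python
import json

# Only these keys may appear in a log record; everything else is dropped.
ALLOWED_LOG_FIELDS = {
    "request_id",
    "method",
    "path_template",
    "status",
    "duration_ms",
    "user_id",
}


def safe_log_record(event: dict) -> str:
    """Serialize only allowlisted fields, silently discarding the rest."""
    record = {k: v for k, v in event.items() if k in ALLOWED_LOG_FIELDS}
    return json.dumps(record, sort_keys=True)


line = safe_log_record({
    "request_id": "req-123",
    "method": "POST",
    "path_template": "/patients/{id}/notes",
    "status": 201,
    "body": {"note": "patient reports chest pain"},
    "patient_name": "Jane Doe",
})
```

The design choice here is default-deny: a new field added to the event by another engineer stays out of the logs until someone deliberately allowlists it.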

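Database Layer: Encrypt at rest using cloud provider encryption (RDS, for example) or disk-level encryption. Community PostgreSQL does not ship transparent data encryption (TDE); it is available only through certain distributions and extensions. Consider column-level encryption for the most sensitive fields: SSNs, diagnosis codes, psychotherapy notes.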

Infrastructure Layer: VPC isolation, private subnets for databases, no public IPs on anything that touches PHI. Use a bastion host or VPN for admin access. All internal traffic over TLS. Audit your setup regularly, because defaults drift.

Session Recording Compliance

Recording telehealth sessions is a minefield. Consent law varies by state: some states require all-party consent, others only one-party consent, and some have specific rules for medical recordings that differ from general recording laws.

From a technical perspective, if you do record, there are several things to get right. Store recordings in a separate, access-controlled bucket (not the same S3 bucket as your profile photos). Encrypt them with a separate key from your general PHI encryption key. Implement automatic retention policies, typically 7 years to match medical record retention requirements, though you should check your state's rules.

Never process recordings through third-party transcription services without a BAA in place. That includes sending audio to OpenAI's Whisper API. And give patients access to their own recordings, which is both good practice and increasingly a legal requirement under information blocking rules.

The safest architectural decision is to make recording opt-in at the organization level, require explicit consent capture at the session level, and store the consent record alongside the recording metadata.
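That gating logic can be sketched in a few lines. The version below takes the strictest reading (every participant must consent), which is an assumption on our part and not legal advice. The type and function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class ConsentRecord:
    session_id: str
    participant_id: str
    consented: bool
    captured_at: datetime


def may_record(org_recording_enabled: bool, session_id: str,
               participants: List[str], consents: List[ConsentRecord]) -> bool:
    """Allow recording only if the org opted in and every participant consented."""
    if not org_recording_enabled:
        return False
    consenting = {
        c.participant_id
        for c in consents
        if c.session_id == session_id and c.consented
    }
    return all(p in consenting for p in participants)
```

The consent records passed in here are the same ones that get stored alongside the recording metadata, so an auditor can later see who agreed to what and when.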

Sync vs. Async: Choose Your Architecture Pattern Early

Many platforms start with live video only, then realize half their use cases (dermatology, medication management, follow-ups) work better as async visits where the patient submits photos and a questionnaire and the provider responds within hours.

If you think async is in your future, design your data model around "encounters" rather than "video sessions" from day one. An encounter can contain a video session, a message thread, uploaded media, or all three. Retrofitting this abstraction later means migrating every table that references a session ID. We learned this the hard way.
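A minimal sketch of what an encounter-centric model might look like. All class and field names are illustrative assumptions, not our production schema.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class VideoSession:
    room_id: str
    started_at: datetime


@dataclass
class Message:
    sender_id: str
    body: str
    sent_at: datetime


@dataclass
class MediaUpload:
    storage_key: str
    content_type: str


@dataclass
class Encounter:
    """The unit of care. Video is one optional component, not the root record."""
    encounter_id: str
    patient_id: str
    provider_id: str
    video_sessions: List[VideoSession] = field(default_factory=list)
    messages: List[Message] = field(default_factory=list)
    media: List[MediaUpload] = field(default_factory=list)

    def is_async(self) -> bool:
        """An encounter with no video session is an async visit."""
        return not self.video_sessions
```

Billing, notes, and prescriptions then reference the encounter ID, so adding async visits later does not require touching those tables.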

Final Thoughts

Telehealth architecture is not fundamentally different from building any real-time application. The difference is that every shortcut you take has regulatory consequences. Build for compliance from day one, because retrofitting it is three times harder and ten times more expensive.

Start with a managed video provider, integrate with certified e-prescribing networks rather than building your own, and treat PHI handling as a cross-cutting concern that touches every architectural decision. The teams that get this right early ship faster in the long run, because they are not constantly patching compliance gaps.
