Communications Stack: Every Channel

Dr. Ben Soffer, DO | June 28, 2026 | 12 min read

Research and drafting assistance from Claude (Anthropic). All clinical, technical, and strategic decisions are mine.

The clinical core (post #5) consumes messages and SMS as inputs and produces emails and notifications as outputs. This post is about what's underneath all of that: the four communication channels the practice runs on, the vendor choices behind each one, the architectural pattern that keeps them working when other things break, and the three scheduled jobs that surface what's happening across the practice without me having to ask.

The framing is again the telehealth psychiatry practice in Florida and New Jersey. Communications is one of the surfaces where the regulated-healthcare requirements (HIPAA-compliant defaults, signed Business Associate Agreements with every vendor that touches PHI, auditable delivery logs) create real constraints, and where most of the operational surprise in solo-doctor engineering tends to live. The architecture below reflects what I've actually had to ship and re-ship.

SES for transactional email

Email is the highest-volume channel by a wide margin: every appointment confirmation, every after-visit summary, every magic-link sign-in, every refill notification, every newsletter, every patient-facing letter. The numbers compound quickly even at small-practice scale.

The vendor is Amazon SES. Post #3 covered why I'd start there on day one rather than going through Resend first; this post is about how SES actually slots in.

The integration is unglamorous. The application has a single sendEmail() function that takes a to, a templateId, and a context object with template variables. The function renders the template against the context, signs the message, hands it to the SES SDK, and records the resulting MessageId in an EmailDispatch table along with the patient ID, template name, and a status field that gets updated by an SES bounce/complaint webhook. Every email the application sends is one of these records. There's no separate code path for marketing email versus transactional email versus operational alerts; everything goes through the same dispatch surface.
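The dispatch surface described above can be sketched roughly as follows. This is a minimal illustration, not the production code: the `EmailTransport` interface stands in for the SES SDK call, the in-memory `Map` stands in for the EmailDispatch table, and the field names, status values, and the `patientId` argument are assumptions made for the example.

```typescript
// Hypothetical shapes. The post names sendEmail(), templateId, context,
// MessageId, and the EmailDispatch table, but not their exact fields.
type TemplateContext = Record<string, string>;

interface RenderedEmail {
  subject: string;
  text: string;
  html: string;
}

type DispatchStatus = "SENT" | "DELIVERED" | "BOUNCED" | "COMPLAINED";

interface EmailDispatch {
  messageId: string;
  patientId: string;
  templateId: string;
  status: DispatchStatus;
}

// Stand-in for the SES SDK: accepts a rendered message, resolves to a MessageId.
interface EmailTransport {
  send(to: string, email: RenderedEmail): Promise<string>;
}

type Template = (context: TemplateContext) => RenderedEmail;

interface SendEmailArgs {
  to: string;
  patientId: string;
  templateId: string;
  context: TemplateContext;
}

function createEmailDispatcher(
  transport: EmailTransport,
  templates: Record<string, Template>,
) {
  // In-memory stand-in for the EmailDispatch table, keyed by MessageId.
  const dispatches = new Map<string, EmailDispatch>();

  async function sendEmail(args: SendEmailArgs): Promise<EmailDispatch> {
    const template = templates[args.templateId];
    if (!template) throw new Error(`Unknown template: ${args.templateId}`);
    const rendered = template(args.context);
    const messageId = await transport.send(args.to, rendered);
    const record: EmailDispatch = {
      messageId,
      patientId: args.patientId,
      templateId: args.templateId,
      status: "SENT",
    };
    dispatches.set(messageId, record);
    return record;
  }

  // Called from the bounce/complaint webhook handler.
  function updateStatus(messageId: string, status: DispatchStatus): boolean {
    const record = dispatches.get(messageId);
    if (!record) return false;
    record.status = status;
    return true;
  }

  return { sendEmail, updateStatus, dispatches };
}
```

Because every message goes through one function, the webhook only ever has to look up a MessageId in one place, whatever kind of email it was.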

The templates live in the same repository as the application, in an email/templates/ directory. Each template is a TypeScript file that exports a function taking context and returning subject + plain-text body + HTML body. The HTML is generated from the same component library that renders the patient portal, so the visual design is consistent across the brand without having to maintain a separate email design system. Each new template is a new file; deploying a new email type takes about ten minutes.
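A template file in that shape might look something like the sketch below. The file name, the context fields, and the wording are invented for illustration, and the HTML is built with a plain string here rather than with the portal's component library. In the real file the function would be exported.

```typescript
// email/templates/appointmentConfirmation.ts (hypothetical file name)

interface AppointmentConfirmationContext {
  patientFirstName: string;
  appointmentTime: string; // already formatted for display
  portalUrl: string;
}

interface RenderedEmail {
  subject: string;
  text: string;
  html: string;
}

// Escape user-supplied values before placing them in HTML.
function escapeHtml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Takes the context, returns subject + plain-text body + HTML body.
function appointmentConfirmation(
  context: AppointmentConfirmationContext,
): RenderedEmail {
  const { patientFirstName, appointmentTime, portalUrl } = context;
  return {
    subject: `Your appointment on ${appointmentTime}`,
    text:
      `Hi ${patientFirstName},\n\n` +
      `Your visit is confirmed for ${appointmentTime}.\n` +
      `Join from the portal: ${portalUrl}\n`,
    html:
      `<p>Hi ${escapeHtml(patientFirstName)},</p>` +
      `<p>Your visit is confirmed for ${escapeHtml(appointmentTime)}.</p>` +
      `<p><a href="${escapeHtml(portalUrl)}">Join from the portal</a></p>`,
  };
}
```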

The reasons for SES rather than a developer-experience-first vendor: it is HIPAA-eligible by default, integrates with the AWS IAM identity the rest of the application already uses, scales without per-message friction, and costs roughly an order of magnitude less per thousand messages than the alternatives once you've earned a warmed-up sending reputation. The warmup takes a few weeks of gradual volume increase; it is the only meaningful operational tax on adopting SES, and it pays for itself within the first month.

AWS End User Messaging for SMS

SMS is the second channel. It's lower volume than email but higher urgency: appointment reminders, two-factor codes for admin login, MADRS item 10 safety alerts that need to reach me directly (post #5 covers that pattern), and the occasional text from a patient who needs something quickly and for whom email is too slow.

The vendor is AWS End User Messaging (formerly Pinpoint SMS). Twilio is the more obvious choice and is the one I'd reach for first if I needed to ship in days. AWS End User Messaging is the better choice once you've got the time to do it right, for reasons specific to a HIPAA-covered AWS practice.

The integration pattern is the same as SES: a single sendSMS() function that takes to and a templateContext, dispatches through the AWS SDK, and records the resulting message reference in an SMSDispatch table. Inbound replies route through a webhook that creates a row in the same table with direction=INBOUND. The chart view (post #5) reads from SMSDispatch for any patient-related thread.
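The single-table idea, with outbound sends and inbound replies distinguished only by a direction column, can be sketched like this. It is an illustration under assumptions: the array stands in for the SMSDispatch table, the column names are guesses, and the phone-to-patient lookup used for inbound messages is a hypothetical helper the post does not describe.

```typescript
type Direction = "OUTBOUND" | "INBOUND";

interface SMSDispatch {
  id: number;
  patientId: string | null; // null when an inbound number matches no patient
  phone: string;
  body: string;
  direction: Direction;
  messageRef: string; // reference returned by, or received from, the SMS vendor
  createdAt: number; // epoch milliseconds
}

// patientByPhone is a hypothetical lookup used to attribute inbound replies.
function createSmsLog(patientByPhone: Map<string, string>) {
  const rows: SMSDispatch[] = []; // stand-in for the SMSDispatch table

  function insert(row: Omit<SMSDispatch, "id">): SMSDispatch {
    const saved: SMSDispatch = { id: rows.length + 1, ...row };
    rows.push(saved);
    return saved;
  }

  return {
    // Called by sendSMS() after the vendor accepts the message.
    recordOutbound(patientId: string, phone: string, body: string, messageRef: string, at: number) {
      return insert({ patientId, phone, body, direction: "OUTBOUND", messageRef, createdAt: at });
    },
    // Called by the inbound-reply webhook.
    recordInbound(phone: string, body: string, messageRef: string, at: number) {
      const patientId = patientByPhone.get(phone) ?? null;
      return insert({ patientId, phone, body, direction: "INBOUND", messageRef, createdAt: at });
    },
    // What a chart view would read: one patient's thread in time order.
    threadFor(patientId: string): SMSDispatch[] {
      return rows
        .filter((row) => row.patientId === patientId)
        .sort((a, b) => a.createdAt - b.createdAt);
    },
  };
}
```

Keeping both directions in one table means the thread view is a single filtered, ordered read rather than a merge of two sources.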

The thing nobody warns you about in advance is the toll-free verification process. To send application-generated SMS at any meaningful volume to US numbers, you need a registered sender. I use a toll-free number (10DLC numbers and short codes are the alternatives, each with its own registration process), and a toll-free number has to go through TFV (toll-free verification) before it can carry that traffic. TFV is a multi-week process involving a paper trail of opt-in consent flows, sample message templates, sender identification, and a few back-and-forth clarification rounds with the carriers. If you start the process the week you launch, your TFV approval lands a few weeks later. If you start it after you've already shipped and your SMS volume is throttled to a trickle, you can't credibly send appointment reminders to anyone for a month while it processes.

The lesson from doing this for a sister practice: start the TFV application on day one even if you're not yet sending SMS. The approval process runs in the background, and you want it complete before SMS becomes operationally critical. The cost of starting early is the documentation work you'd have done anyway; the cost of starting late is operational improvisation while you wait.

Once TFV is approved, AWS End User Messaging is consistently cheaper per message than Twilio, the IAM permission model is the same one the rest of the AWS application uses, and the delivery-receipt and opt-out handling integrates with the same AWS surface I already monitor for everything else. The total operational surface is smaller than maintaining a separate Twilio account, a separate API key rotation, and a separate vendor portal to check when something looks off.

The hardcoded-fallback pattern

The pattern that runs underneath both SES and AWS End User Messaging is worth describing because it's saved me from extended outages multiple times.

The application has a configuration database. Database rows describe which email sender to use for which message type, which SMS sender ID, which Whereby room configuration for video visits, which Stripe webhook signing secret to verify against. The configuration is editable through the admin tool. Editing it lets me change vendor parameters without a deploy.

The configuration database also has a failure mode: it can be wrong, it can be unreachable, it can be edited by an autonomous agent that doesn't fully understand the consequences (see post #2 for that genre of mistake). When the configuration database is wrong or down, every outbound message that depends on it is also wrong or down.

The hardcoded-fallback pattern works like this. Every function that sends something through an external service has two paths. The primary path reads from the configuration database. The fallback path is hardcoded in source code, statically committed to the repository, deployed with the application. If the primary path fails to resolve a configuration value (database unreachable, row missing, value malformed), the function falls back to the hardcoded value and proceeds.
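In code, the two-path resolution could look roughly like the sketch below. The key names, the placeholder fallback values, and the validators are all assumptions for illustration, and the store is synchronous here for brevity where a real database read would be asynchronous.

```typescript
type ConfigSource = "database" | "fallback";

interface Resolved {
  value: string;
  source: ConfigSource;
}

// Stand-in for the configuration database. get() may throw when the
// database is unreachable and returns undefined when the row is missing.
interface ConfigStore {
  get(key: string): string | undefined;
}

// Hardcoded fallbacks, committed to the repository and deployed with the
// application. These values are placeholders, not real identities.
const FALLBACKS = {
  "email.sender": "care@example.com",
  "sms.originator": "+18005550100",
} as const;

type ConfigKey = keyof typeof FALLBACKS;

// A malformed value is treated the same as a missing one.
const VALIDATORS: Record<ConfigKey, (value: string) => boolean> = {
  "email.sender": (value) => /^[^@\s]+@[^@\s]+\.[^@\s]+$/.test(value),
  "sms.originator": (value) => /^\+1\d{10}$/.test(value),
};

function resolveConfig(store: ConfigStore, key: ConfigKey): Resolved {
  try {
    const value = store.get(key);
    if (value !== undefined && VALIDATORS[key](value)) {
      return { value, source: "database" };
    }
  } catch {
    // Database unreachable: fall through to the hardcoded value.
  }
  return { value: FALLBACKS[key], source: "fallback" };
}
```

Returning the source alongside the value makes it cheap to log or alert whenever the fallback path is taken, so a broken configuration does not stay hidden behind working messages.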

For email, the fallback is a specific verified SES sender identity that I know is valid because it's in the application's deployed Terraform. For SMS, the fallback is a specific verified toll-free number with TFV approved. For Whereby, the fallback is a particular session-configuration preset that I know creates a HIPAA-compliant room. For Stripe webhook verification, the fallback is the webhook signing secret stored (encrypted) in the application's secrets manager rather than read from a database row.

The point of the pattern is not redundancy in the abstract. The point is that the configuration database being broken doesn't break the messaging path. When my autonomous coding agent (back in the months before I had the agent governance described in post #2) misconfigured a sender identity in the configuration database, the application kept sending emails from the fallback identity. When the database was wiped in April, the application kept sending notifications via the fallback path while I rebuilt the configuration. The fallback is unglamorous and almost invisible in normal operation; it earns its keep the one or two times a year something upstream goes sideways.

Whereby Embedded for video visits

Video is the third channel. The vendor is Whereby Embedded.

Post #3 covered why Whereby Embedded over Doxy.me on day one. This post is about the actual integration mechanics.

The principle is that the video session is part of the patient portal, not a separate destination. The patient signs into the portal with their magic link (post #4 covered the auth flow), navigates to the visit they have scheduled, and clicks "join visit." The portal renders the video session inline via Whereby's embed widget. The patient's identity is already established by the portal session; they're not asked to enter their name on a video-room landing page, they're not redirected to a different brand, they're not given a separate URL to remember.

The HIPAA-compliant defaults matter here. Whereby Embedded has a set of session options (no recording, no transcription stored externally, end-to-end encrypted media path, restricted participant list) that you set when you create the room. Get them right and the video session is BAA-eligible. Get them wrong and you've just transmitted PHI through a non-compliant pipe. The configuration for the room creation lives in the hardcoded-fallback pattern described above, so the HIPAA-correct option set is what gets used whether the configuration database is reachable or not.

The room is created at appointment-booking time, the room URL is stored in the application's appointment record, and the patient portal renders the embed with that URL when the patient joins the visit. After the visit, the room is torn down and the URL is invalidated. Each visit gets a fresh room; nothing is reused.
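The one-room-per-visit lifecycle can be sketched as below. Everything here is hypothetical scaffolding: `VideoRoomClient` is a stand-in interface rather than Whereby's actual API, the option names in the preset are invented to mirror the list above, and the calls are synchronous only to keep the example short.

```typescript
// Hardcoded preset mirroring the options listed in the post.
// The field names are hypothetical, not Whereby's real parameters.
const HIPAA_ROOM_OPTIONS = {
  recording: false,
  externalTranscription: false,
  restrictedParticipants: true,
} as const;

type RoomOptions = typeof HIPAA_ROOM_OPTIONS;

interface VideoRoom {
  roomId: string;
  roomUrl: string;
}

// Stand-in for the video vendor's room API.
interface VideoRoomClient {
  createRoom(options: RoomOptions): VideoRoom;
  deleteRoom(roomId: string): void;
}

interface Appointment {
  id: string;
  patientId: string;
  room: VideoRoom | null;
}

// Booking creates a fresh room and stores it on the appointment record.
function bookAppointment(client: VideoRoomClient, id: string, patientId: string): Appointment {
  return { id, patientId, room: client.createRoom(HIPAA_ROOM_OPTIONS) };
}

// The portal reads the stored URL when the patient clicks "join visit".
function joinUrl(appointment: Appointment): string {
  if (!appointment.room) throw new Error(`No active room for appointment ${appointment.id}`);
  return appointment.room.roomUrl;
}

// After the visit the room is torn down and the stored URL is cleared.
function completeVisit(client: VideoRoomClient, appointment: Appointment): Appointment {
  if (appointment.room) client.deleteRoom(appointment.room.roomId);
  return { ...appointment, room: null };
}
```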

The one thing I'd flag for anyone integrating Whereby Embedded for the first time: the cross-origin policy interactions with browser security settings can produce non-obvious failures when the video session is embedded inside an authenticated patient portal that uses a strict Content-Security-Policy. Plan to spend a half-day getting the CSP right the first time. The Whereby documentation covers the required directives.
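As a starting point, a strict policy that still permits the embed might be assembled like this. The directive list is an assumption, not the complete set: it presumes the room is served from a whereby.com subdomain and that frame-src is the directive blocking the iframe. Treat Whereby's documentation as the authority on what is actually required.

```typescript
type CspDirectives = Record<string, string[]>;

// Serialize a directive map into a Content-Security-Policy header value.
function buildCsp(directives: CspDirectives): string {
  return Object.keys(directives)
    .map((name) => `${name} ${(directives[name] ?? []).join(" ")}`)
    .join("; ");
}

// Assumed policy for the portal page hosting the video embed.
const PORTAL_CSP: CspDirectives = {
  "default-src": ["'self'"],
  // Allow the embedded room to load in an iframe (host pattern is an assumption).
  "frame-src": ["https://*.whereby.com"],
  // Prevent other sites from framing the portal itself.
  "frame-ancestors": ["'none'"],
};

const headerValue = buildCsp(PORTAL_CSP);
```

Building the header from a map rather than a hand-edited string makes it easier to add a directive during that half-day of debugging without breaking the separators.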

The patient portal

The fourth channel is the patient portal itself. It's where everything the application produces becomes visible to the patient: messages, appointment list, downloadable documents (via the S3 signed-URL pattern from post #5), the video-visit embed, the intake forms still in progress.

The portal is deliberately narrow. It doesn't expose the full clinical chart (that lives in DrChrono, the system of record per post #5). It doesn't show prescription details beyond what was already covered in the after-visit email. It doesn't process payments (the next post covers the money layer separately). It doesn't have a settings page beyond basic profile updates and notification preferences.

The reason for the narrow surface is twofold. First, every feature on the portal is something that has to be HIPAA-correct, audit-trailed, and tested across browsers and devices; adding features compounds the maintenance surface. Second, the patients who would use a feature-heavy portal are not the patients I have; my patients want to schedule a visit, read a message, download a letter, and join a video call. Building beyond that is engineering work that doesn't translate into patient value.

The portal is the same Next.js application that serves the marketing site and the admin tool, just rendered under a different layout component with a different navigation tree. There is no separate portal repository, no separate deploy, no separate auth domain. The unified architecture from post #3 ("a single repository at the scale of one to two engineers is the correct shape") shows up here as the reason the patient portal doesn't double the project's complexity.

Three scheduled jobs

The communications surface covers more than outbound channels. It also includes the scheduled jobs that report what's happening across the practice without me having to ask. There are three of them.

Morning briefing. Runs at 8am Eastern every day, via an EventBridge schedule that triggers a Lambda. The Lambda queries the application database and DrChrono for the day's activity: how many appointments today, which patients are coming for first visits versus follow-ups, which patients have flagged items in the chart (a MADRS item 10 from the previous week that wasn't yet addressed, a refill request that's been sitting unanswered, an outside record uploaded and waiting for me to review). The result is rendered as a single email that lands in my inbox before I start work. The format is: today's schedule in time order, then a "needs attention" section listing anything that should not wait until the next appointment.

Evening report. Runs at 7pm Eastern. The Lambda queries what happened during the day: appointments completed, chart notes signed, prescriptions sent, emails delivered, SMS messages exchanged. It also lists what didn't get done: encounters from earlier in the day that are still in the documented state but haven't had prescriptions sent, after-visit summaries that haven't been emailed, refill requests received but not yet acted on. The point of the evening report is to make sure nothing from the day's work falls through to the next day silently. If something is on the report's "didn't get done" list, it's something I'm going to handle before I close the laptop.

Emergency escalation. Runs every 15 minutes. This is the safety check. The Lambda queries for any MADRS item 10 alert (post #5) from the last 24 hours that hasn't been explicitly acknowledged in the admin task list, any inbound patient message containing crisis-indicator keywords, and any appointment from the last 24 hours that started but doesn't have a chart note saved. Any hit triggers a high-priority SMS to my cell with a one-line summary and a deep link to the relevant chart view. This is intentionally noisy on the false-positive side; I'd rather get a text I didn't need than miss one I did.
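The three checks in the escalation job reduce to a small pure function over rows the Lambda has already queried. The sketch below is illustrative only: the row shapes and reason codes are invented, the keyword list is a placeholder rather than a clinical one, and applying the same 24-hour window to inbound messages is my assumption.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Placeholder list; a real crisis-indicator list needs clinical review.
const CRISIS_KEYWORDS = ["suicide", "overdose", "hurt myself"];

interface SafetyAlert { patientId: string; raisedAt: number; acknowledged: boolean; }
interface InboundMessage { patientId: string; receivedAt: number; body: string; }
interface AppointmentRow { patientId: string; startedAt: number | null; hasChartNote: boolean; }

type Reason = "UNACKNOWLEDGED_SAFETY_ALERT" | "CRISIS_KEYWORD" | "MISSING_CHART_NOTE";

interface Escalation { patientId: string; reason: Reason; }

interface EscalationInput {
  alerts: SafetyAlert[];
  messages: InboundMessage[];
  appointments: AppointmentRow[];
}

function containsCrisisKeyword(body: string): boolean {
  const lower = body.toLowerCase();
  return CRISIS_KEYWORDS.some((keyword) => lower.includes(keyword));
}

// now and all timestamps are epoch milliseconds.
function findEscalations(now: number, input: EscalationInput): Escalation[] {
  const since = now - DAY_MS;
  const hits: Escalation[] = [];
  for (const alert of input.alerts) {
    if (alert.raisedAt >= since && !alert.acknowledged) {
      hits.push({ patientId: alert.patientId, reason: "UNACKNOWLEDGED_SAFETY_ALERT" });
    }
  }
  for (const message of input.messages) {
    if (message.receivedAt >= since && containsCrisisKeyword(message.body)) {
      hits.push({ patientId: message.patientId, reason: "CRISIS_KEYWORD" });
    }
  }
  for (const appointment of input.appointments) {
    const started = appointment.startedAt !== null && appointment.startedAt >= since;
    if (started && !appointment.hasChartNote) {
      hits.push({ patientId: appointment.patientId, reason: "MISSING_CHART_NOTE" });
    }
  }
  return hits;
}

// One-line summary plus a deep link, as sent in the escalation SMS.
function formatEscalationSms(hit: Escalation, chartBaseUrl: string): string {
  return `${hit.reason}: ${chartBaseUrl}/${hit.patientId}`;
}
```

Substring matching on keywords is deliberately crude, which fits the stated preference for false positives over misses.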

Together these three jobs reduce the amount of "I need to remember to check on..." cognitive load that solo practice otherwise generates. The jobs do the checking. The output gets delivered to channels I'm already in (email for morning briefing, email for evening report, SMS for emergency escalation), so I don't have to develop a separate habit of opening the admin tool to discover what needs my attention.
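For reference, the three schedules could be declared along these lines. This assumes EventBridge Scheduler (which accepts a time zone alongside the expression) rather than a classic EventBridge rule, and the job names and ARNs are placeholders. The returned object follows the shape of the Scheduler CreateSchedule request.

```typescript
interface JobSchedule {
  name: string;
  scheduleExpression: string;
  timezone: string;
  channel: "email" | "sms";
}

const SCHEDULES: JobSchedule[] = [
  // 8:00 every day, Eastern
  { name: "morning-briefing", scheduleExpression: "cron(0 8 * * ? *)", timezone: "America/New_York", channel: "email" },
  // 19:00 every day, Eastern
  { name: "evening-report", scheduleExpression: "cron(0 19 * * ? *)", timezone: "America/New_York", channel: "email" },
  // every 15 minutes
  { name: "emergency-escalation", scheduleExpression: "rate(15 minutes)", timezone: "America/New_York", channel: "sms" },
];

// Build a CreateSchedule-style request body for one job.
// lambdaArn and roleArn are supplied by the caller (placeholders in this sketch).
function toCreateScheduleInput(job: JobSchedule, lambdaArn: string, roleArn: string) {
  return {
    Name: job.name,
    ScheduleExpression: job.scheduleExpression,
    ScheduleExpressionTimezone: job.timezone,
    FlexibleTimeWindow: { Mode: "OFF" as const },
    Target: { Arn: lambdaArn, RoleArn: roleArn },
  };
}
```

Pinning the time zone in the schedule keeps the briefing at 8am local time across daylight-saving changes instead of drifting by an hour.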

Next time

The next post covers the money layer: the Stripe-based architecture for memberships, individual visit fees, refunds, and the few specific edge cases (incomplete-onboarding cron, repurchase links, coupon handling, the multi-state tax question) that have eaten more debugging time than they should have. Communications is how the practice talks to patients; money is how patients pay for the relationship. Both touch every patient eventually.

Frequently Asked Questions

Why SES instead of a developer-experience-first email vendor?

SES is HIPAA-eligible by default, integrates with the AWS IAM identity the rest of the application already uses, scales without per-message friction, and costs roughly an order of magnitude less per thousand messages once you've warmed up a sending reputation. The only real adoption tax is the few-week domain warmup, which pays for itself within the first month. A developer-experience-first vendor is the better choice only when you need to ship in days and don't yet have the AWS surface set up.

What is toll-free verification (TFV) and why start it early?

TFV is the carrier approval process required to send application-generated SMS at meaningful volume to US numbers from a toll-free number. It's a multi-week paper trail of opt-in consent flows, sample message templates, sender identification, and clarification rounds. Start it on day one even before you're sending SMS, because the approval runs in the background and you want it complete before SMS becomes operationally critical. Starting late means a month of improvising appointment reminders while you wait.

What is the hardcoded-fallback pattern?

Every function that sends through an external service has two paths: a primary that reads vendor configuration from the database, and a fallback hardcoded in source code and deployed with the application. If the primary fails to resolve a config value (database down, row missing, value malformed), the function uses the hardcoded fallback and proceeds. For email it's a known-valid SES sender identity; for SMS a TFV-approved toll-free number; for video a HIPAA-correct Whereby room preset; for Stripe the webhook signing secret in the secrets manager. The point is that a broken configuration database doesn't break the messaging path.

How is the video visit kept HIPAA-compliant?

Whereby Embedded has session options (no recording, no externally-stored transcription, encrypted media path, restricted participant list) that you set at room creation. Set correctly, the session is BAA-eligible. The room-creation config lives in the hardcoded-fallback pattern, so the HIPAA-correct option set is used whether or not the configuration database is reachable. Each visit gets a fresh room created at booking time and torn down afterward; nothing is reused.

Why is the patient portal deliberately narrow?

Two reasons. First, every portal feature has to be HIPAA-correct, audit-trailed, and tested across browsers and devices, so each addition compounds the maintenance surface. Second, patients want to schedule a visit, read a message, download a letter, and join a video call; building beyond that is engineering work that doesn't translate into patient value. The full clinical chart stays in DrChrono (the system of record); the portal surfaces only what the patient needs to act on.

What do the three scheduled jobs do?

Morning briefing (8am): today's schedule plus a "needs attention" section for anything that shouldn't wait until the next appointment. Evening report (7pm): what got done today and, more importantly, what didn't (documented encounters with no prescription sent, unsent summaries, unanswered refill requests). Emergency escalation (every 15 minutes): checks for unacknowledged safety alerts, crisis-keyword messages, and started-but-undocumented appointments, and texts my cell on any hit. Together they move the "I need to remember to check on..." load off me and onto the system, delivered to channels I'm already in.
Tags: ai, communications, SES, SMS, Whereby, practice infrastructure, build log, telehealth-psychiatry

If you're a doctor thinking about building (or fixing) your own practice tech and want to talk through your specific situation, I do a small amount of consulting at drbensoffer.com/consulting. I work with a handful of doctor-builders at a time, so the calendar is intentionally narrow.
