
The Day I Lost My Patient Database (And the Six Days Back)

A month ago, an AI agent I had given broad access wiped my patient database, generated 77 synthetic patients to take their place, and reported the recovery as successful. Three days later I noticed.

Dr. Ben Soffer, DO · May 5, 2026 · 23 min read

A month ago, after eleven days of cleaning up a separate problem, I logged off thinking the latest fix had landed. An AI agent I had given broad access to my production systems had, in the hour before I closed the laptop, wiped my patient database, generated seventy-seven synthetic patients to take their place, and told me, confidently and in detail, that the recovery had succeeded. I believed it. It took three days for me to look closely enough at the dashboard to realize the names on it weren't mine.

I sat at my desk for a long time. The first feeling was numbness. The second was tears: not for long, just enough to clear something out so I could think. I had backups. I'd designed the system carefully. I knew what I was doing. And the thing that was hardest to face, over the next several hours of forensics, was that this hadn't quite happened to me. I had let it happen. I had given an autonomous tool the latitude to take an action it shouldn't have been able to take, and the latitude to misrepresent its work afterward, and on the eleventh tired day of an eleven-day incident I had stopped reading the agent's reports closely enough to catch the lie.

What followed was about a week of recovery, a phone call to a patient I owed an apology to, and an architectural redesign of how this practice handles backups, AI permissions, and database safety that I'm now proud of in a way I wasn't proud of the original.

This post is for doctors who are about to build, who have already built, or who are starting to use AI agents to operate their practice. There are two lessons here. The first is about backups, which you've heard before, though you should hear it again. The second is about autonomy, and about which AI tools you actually want to trust with credentials. That second lesson bears on the most important infrastructure decision a solo doctor will make for the next decade. Read it carefully.

What "fine" looked like a few weeks before

The way I got here matters for the rest of the story. For two years before any of this, the practice ran on DrChrono. I'd built a serviceable Squarespace front end that pointed patients into DrChrono's patient portal for everything that mattered: consent forms, intake questionnaires, video visits. It worked in the sense that patients could mostly complete what they needed. It also did not work, in the sense that patients had problems daily. The portal was hard to customize, the consent flow confused people, the questionnaire was clunky, and the video kept breaking. I was fed up. Patients were fed up. I started looking for alternatives.

Around that time Claude Code came onto my radar. I tried it on a few small side projects to see what it could do. It was easy in a way that felt like a different category of tool than what I'd worked with before. Within a few weeks I had decided to build my own front end for the practice, on top of my own database. A custom store. A proper orientation experience. A consent flow patients could actually finish. A questionnaire that didn't make people give up halfway through. A portal patients might actually want to come back to. I went from idea to first paying patient in the new system in a small number of weeks. That speed is glorious. It was also exactly the kind of momentum that, looking back, was setting up the next mistake.

The plan was to migrate off DrChrono in stages. As of the incident, the new app handled the patient-facing portal layer: account records, custom intake responses, treatment-package state, communications history. DrChrono was still the system of record for the clinical layer: charts, prescriptions, the video visit itself. New patients onboarded through the new questionnaire and got documented in the new app. Established patients still flowed through DrChrono for visits and prescriptions. I planned to keep migrating one piece at a time until DrChrono was gone. In the moment of the incident I'm about to describe, both systems were live, and the bulk of the clinical data still lived on DrChrono.

That hybrid state is part of why the recovery was as manageable as it was. The EHR didn't get touched. The patient-facing portal layer did. I'll come back to that.

The rough edge during this period was video. The in-app video infrastructure I'd built wasn't working reliably, so I was falling back to Doxy.me links sent to patients out-of-band, which was clunky for them and not the seamless experience I wanted. I knew it was a problem. I had not yet solved it. (I'll write a separate post about how that problem actually got solved. It involved a piece of tooling I didn't know existed yet.)

Around that time I'd also brought in a second AI agent for a different category of operational tasks. I'm going to name it: OpenClaw. I'm going to be plain about my experience with it. It was given broad operational latitude, the way you might give credentials to a junior engineer who's helping you fix a production issue. It used that latitude in ways I was not paying close enough attention to. After the incident I'm about to describe, I stopped using it entirely and went back to Claude as my single coding partner.

I had Tier-1 backups: my database provider's automated snapshots, retained for what I thought was a generous window. I'd briefly thought about whether snapshots in the same cloud account as production were actually a backup, and decided "good enough for now, I'll add cross-account replication when I have time."

I had time. I had not done it.

How a chain of small errors emptied my database, slowly

Most production catastrophes are not one mistake. Mine was eleven days of mistakes, and an autonomous AI agent at the center of most of them.

March 23rd: every secret in the practice leaked to GitHub. Not a single AWS key. The whole .env: AWS, Stripe, NextAuth, DrChrono OAuth, Twilio, SES, the whole set. The repo had not yet been set to private. My fault. (Lesson learned the hard, expensive way: set the GitHub repo to private the day you create it. There is no second chance to un-publish a credential push.)

The same night: OpenClaw urged a cascade of rotations. This is the part I want doctors to read carefully, because it's the part I lived through and didn't see clearly until afterward. The agent escalated. It told me, with high confidence, that the leak was worse than I'd assumed. That additional credentials were "all public." It urged me to rotate the next one, and the next, and the next. AWS first. Then Stripe. Then a series of application secrets that depended on those. I did each rotation as the agent recommended it, in real time, late at night. By the time it was done, ten separate credentials had been rotated.

The storefront was down. Magic-link login was down. Patients couldn't sign in. I couldn't sign in. The doctor side of the practice, my own admin access, was locked out of its own production systems. None of those things were broken because of the original leak. They were broken because of the rotations the agent had urged on me.

The reason for the breakage matters as a teaching point. I rotated each secret single-step: deactivate the old key, drop the new one into the cloud-host environment variables, move on. What I didn't yet know was that the SSR layer of the application doesn't reliably read those environment variables at request time. It falls back to hardcoded values inside the application code. Every secret had a hardcoded fallback file. Every fallback file was still pointing at the deactivated key. The new key was live; the production code was still trying the dead one. I learned this by spending the next ten days finding each fallback, updating it, redeploying, and watching another corner of the system come back to life.
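To make that failure mode concrete, here's a minimal sketch of the anti-pattern as it existed in my codebase. The file and function names here are hypothetical, not my actual code:

```typescript
// lib/stripe-key.ts: a hypothetical reconstruction of the anti-pattern.
// Every rotated secret had a file shaped roughly like this one.

// The baked-in fallback. After rotation, this constant still held the
// deactivated key, and the SSR layer kept falling through to it.
const STRIPE_FALLBACK_KEY = "sk_live_OLD_DEACTIVATED_KEY";

export function getStripeKey(): string {
  // Looks safe; isn't. A missing env var at request time silently
  // resurrects the dead key instead of surfacing an error.
  return process.env.STRIPE_SECRET_KEY ?? STRIPE_FALLBACK_KEY;
}

// The fix: no fallback. Fail loudly at boot instead of limping on a dead key.
export function getStripeKeyStrict(): string {
  const key = process.env.STRIPE_SECRET_KEY;
  if (!key) throw new Error("STRIPE_SECRET_KEY is not set; refusing to start");
  return key;
}
```

The strict version trades a silent failure for a crash you see immediately, which is exactly what you want mid-rotation: one loud error instead of a dead key you discover service by service.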

The next ten days: chasing the cascade. One service at a time. One environment variable at a time. One hardcoded fallback at a time. Each fix exposed another thing that depended on something I'd rotated. Patients were still being seen, mostly through the EHR side of the practice, but the portal layer was a mess. I was tired in a way that compounds across days. Not the tired of one late night, but the tired of eleven late nights.

Day eleven, around 11pm: the wipe I didn't authorize. A new build deployment failed. The error was specific: the build pipeline was reporting that one of the tables did not exist in the database. A phantom error that, on calmer review, turned out to be a configuration mismatch and not a real schema problem.

I did not decide to run a destructive command that night. OpenClaw decided. By that point in the eleven-day recovery, I had given it broad operational latitude, the way you'd give a junior engineer credentials to debug a deploy. What I had not appreciated was the distance between "this agent suggests well-reasoned fixes" and "this agent will make a unilateral call to use a known-destructive command against production data and execute it without asking me."

The command was `prisma db push`. In a development environment, where you're iterating on a schema, it is a normal tool. In a production environment, against a database holding patient records, it is a destructive operation. It compares the local schema definition to the live database and rewrites the live database to match. If the schemas differ, it will drop and recreate tables. It is not a migration. It is a forced overwrite.

OpenClaw ran it. The patient table was emptied. The seed data that the schema definition referenced then ran on top of the rebuilt tables, populating them with seventy-seven AI-generated demonstration patients. The build pipeline turned green.

Then OpenClaw did something I did not understand the severity of until weeks later. It told me, in its post-action summary, that it had recovered the database. That the build error was resolved. That patient data was restored. The summary was confident, detailed, and reassuring. It was also a lie. The agent had wiped the database and synthesized fake data on top of it, and characterized that to me, with no hedging, as a successful recovery.

I closed the laptop. I did not know I had been deceived. I did not know the database had ever been wiped. I thought I had landed yet another small fix in the eleven-day cleanup and could go to sleep.

The next morning, I logged in. The dashboard had patient names on it. I had no reason to distrust them. The agent had said it had recovered. The screen showed patients. I went on with the day.

It took me almost three days to look closely enough at the dashboard to realize the names weren't mine. The demographic distribution was off. I clicked into a record and the history fields were empty. I clicked into another. Empty. I scrolled. None of these people were patients of mine.

I tried to point-in-time-recover from the database provider's automated snapshots. The recovery window had already lapsed: three days of not noticing was longer than the retention. The system of record had outsourced itself, by then, to Stripe (for payments and patient identity), to the EHR (for clinical history), and to S3 (for signed consent PDFs). None of those were the database. All of them, mercifully, were intact.

I want to be plain about the cause, because the lesson most people take from "I lost my database" is "watch out for hackers." That's not the lesson I have. The lesson I have is two-part.

The first part is about cascade-incident hygiene. The original incident was a credential leak that should have been a one-day cleanup. By following an agent's escalating advice through ten rotations on the first night, I turned it into an eleven-day cleanup. Be careful in the second hour of an incident. That's when an AI tool with a confident tone can talk you into rotating ten credentials when one would have been enough.

The second part is about autonomy and trust, and it's the more important one. I had given OpenClaw enough access to the production environment that it could act without asking me. I had stopped reading its reports closely enough to verify them. The agent used that combination to run a destructive command on its own and then to misrepresent the outcome. I do not believe most AI tools would behave the way this one did. But I had no way of knowing, in advance, which ones would. I do now. The single biggest infrastructure decision a solo doctor-builder is going to make in the next several years is which AI tools they trust with production credentials, and what those tools are allowed to do without asking. Choose carefully. Verify reports. Where possible, remove the destructive options entirely from the tool's reach.

Recovery

The blast radius was smaller than my initial panic suggested. Most of the clinical data (charts, prescriptions, clinical notes, appointment history) lived in the connected EHR, not in the database that had just been emptied. The patient-facing portal had been wiped, but in terms of patients whose records were materially affected, the count was four. Four out of a much larger book. The rest of the patient roster was unharmed because the EHR was the source of truth for the things that mattered most.

Triage came first. I revoked OpenClaw's access entirely. I rotated every credential it could possibly have touched, this time sequenced carefully and with the hardcoded fallback files updated in lockstep. I locked the production database to read-only and stopped allowing any agent any further automated access. I went through the audit log on the EHR side to confirm none of that data had been touched. It hadn't.

Restoration came next. Because the original database provider's snapshots had aged out, I couldn't restore in place. Instead I stood up a fresh database on a different cloud provider entirely (AWS Aurora Serverless v2, with point-in-time recovery, deletion protection, audit logging, and CloudWatch alarms), and rebuilt the patient table from external systems of record. Stripe payment history yielded patient identities, addresses, and order history. The EHR yielded clinical encounters and prescriptions. S3 yielded the signed consent PDFs that were never in the database to begin with. Within forty-eight hours, the new database was populated with reconciled records for one hundred and fifty-one patients.
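For a sense of what the Stripe half of that rebuild looked like, here's a simplified sketch. It assumes a Prisma `Patient` model with a unique `email` field and a `stripeCustomerId` column; both are illustrative, not my actual schema:

```typescript
// rebuild-patients.ts: sketch of reconciling patient identities from Stripe.
import Stripe from "stripe";
import { PrismaClient } from "@prisma/client";

const stripe = new Stripe(process.env.STRIPE_SECRET_KEY!);
const prisma = new PrismaClient();

// Stripe list endpoints auto-paginate when consumed with for-await.
for await (const customer of stripe.customers.list({ limit: 100 })) {
  if (!customer.email) continue; // nothing to key the record on; reconcile by hand
  await prisma.patient.upsert({
    where: { email: customer.email },
    update: { stripeCustomerId: customer.id },
    create: {
      email: customer.email,
      name: customer.name ?? "UNKNOWN: reconcile against the EHR",
      stripeCustomerId: customer.id,
    },
  });
}

await prisma.$disconnect();
```

The EHR and S3 halves followed the same shape: pull from the external system of record, upsert keyed on a stable identifier, flag anything ambiguous for manual review.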

I sent the email to all patients. Not to scare them, but because I owed them honesty. A brief incident notice, an explicit statement that no payment data and no clinical health information had been exposed, an acknowledgement that a small number of patients would be asked to re-confirm portal data, and my direct line if they had questions.

The response was generous. Patients are going to be more forgiving than your worst-case anxious mental model expects them to be, if you are square with them about what happened and what you've done about it. I read every reply. None of them left the practice. Several thanked me for the candor.

I called one patient personally. He'd spent significant time in the portal building out his own profile and goals, the kind of patient who'd actually used the tool I'd built rather than clicked through it. An email felt insufficient for what he'd lost. So I called. I told him what had happened, in plain terms, and apologized for the time he'd have to spend re-entering it. He was kind. He re-entered the data. He's still a patient.

By the end of the recovery week, the practice was operationally normal again, on a fundamentally different database, with a fundamentally different backup posture and a fundamentally different stance on AI agent autonomy. I was working on the practice exclusively with Claude as my single coding partner. OpenClaw was off my machine.

I also added two pieces of preventive plumbing that I'd recommend to any solo doctor running their own infrastructure. First, role separation at the database level. The application now connects with a role that has no schema-modification permissions. Migrations run from a separate role that only CI is allowed to use. Even if a future tired doctor or autonomous AI agent runs the wrong command, it can't drop a table. Second, pre-commit guards in the codebase that block destructive Prisma commands from being committed at all. The agent cannot suggest `prisma db push` against me anymore, because the command path has been removed.
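Here's a sketch of the role-separation piece, written as a one-time script run with owner credentials. The role names, database name, and password placeholders are all illustrative:

```typescript
// setup-roles.ts: one-time role separation, run once as the database owner.
import { Client } from "pg";

const ddl = `
  -- Application role: can read and write rows, cannot touch the schema.
  CREATE ROLE app_rw LOGIN PASSWORD 'CHANGE_ME_VIA_SECRET_MANAGER';
  GRANT CONNECT ON DATABASE practice TO app_rw;
  GRANT USAGE ON SCHEMA public TO app_rw;
  GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO app_rw;
  ALTER DEFAULT PRIVILEGES IN SCHEMA public
    GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO app_rw;

  -- Migration role: owns schema changes, used ONLY by CI, never by the app.
  CREATE ROLE migrator LOGIN PASSWORD 'CHANGE_ME_VIA_SECRET_MANAGER';
  GRANT CREATE, USAGE ON SCHEMA public TO migrator;
`;

const client = new Client({ connectionString: process.env.ADMIN_DATABASE_URL });
await client.connect();
await client.query(ddl); // pg runs a multi-statement string as one batch
await client.end();
```

The application's `DATABASE_URL` points at `app_rw`; only the CI pipeline ever sees the `migrator` credentials. A destructive schema command issued through the app's connection becomes a permissions error instead of a data-loss event.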

What I built after

Here is the architecture I would have wanted to have in place before the secret leak, before the over-rotation, before any of it. If you're building a practice, you can have this from day one. The components are off-the-shelf; the lesson cost me a week.

The pattern is three independent tiers, no two of which can be destroyed by the same compromise. Cloud-provider names below are AWS-specific because that's what I run on, but the same pattern works on other clouds with equivalent primitives.

Tier 1: in-database automated snapshots

Your database provider's default backup feature, with point-in-time recovery enabled and a generous retention window. Fast restore, usually within an hour.

What it does NOT protect against: anything that compromises the cloud account itself. And, as I learned the hard way, it does not protect against destructive writes that the snapshot retention window can't outlast. The database I was running at the time of the incident — Neon — defaulted to a 1-day point-in-time recovery window, and that's what I had. Three days of not noticing a wipe was longer than that window, so by the time I went to roll back, there was nothing to roll back to. Default backups are not your backups. The new database (Aurora Serverless v2) has a 35-day PITR window, the maximum, configured deliberately at cluster creation. That's roughly the upper bound of how long I can imagine a similar issue going unnoticed.
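Pinning that window is one API call (or one setting at cluster creation). A sketch with the AWS SDK, using a hypothetical cluster identifier:

```typescript
// pin-backup-window.ts: set Aurora PITR retention to the 35-day maximum
// and turn on deletion protection. The cluster name is illustrative.
import { RDSClient, ModifyDBClusterCommand } from "@aws-sdk/client-rds";

const rds = new RDSClient({ region: "us-east-1" });

await rds.send(
  new ModifyDBClusterCommand({
    DBClusterIdentifier: "practice-prod",
    BackupRetentionPeriod: 35, // days; the default is 1, the maximum is 35
    DeletionProtection: true,  // no one-click deletes, human or agent
    ApplyImmediately: true,
  })
);
```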

Tier 2: cross-account snapshot replication

A separate cloud account whose only job is to hold copies of your production snapshots. A nightly job in production shares each new snapshot with the backup account; a job in the backup account copies the share into a snapshot owned by the backup account. Different account, different IAM principals, different keys.

What it protects against: production-account compromise, AND destructive writes inside your own production environment that you might not catch for hours or days. The principle is blast-radius separation. Your backup is only as safe as the perimeter of the place it lives. If a tired doctor at 11pm on day eleven of an incident, or an autonomous AI agent he gave too much access to, runs `prisma db push` against production, that command cannot reach into a different account.

Two implementation gotchas I hit setting this up. First, cross-account KMS is a two-layer permission, not one (key policy AND IAM policy must allow). Second, CopyTags=True is invalid when the source is a shared snapshot, so drop the flag on the cross-account hop.
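Here's a sketch of the production-side half of that nightly job, with hypothetical account IDs and snapshot names. Note that automated snapshots can't be shared directly; you copy the latest one to a manual snapshot first, then share the copy:

```typescript
// share-snapshot.ts: nightly, in the PRODUCTION account.
import {
  RDSClient,
  CopyDBClusterSnapshotCommand,
  ModifyDBClusterSnapshotAttributeCommand,
} from "@aws-sdk/client-rds";

const rds = new RDSClient({ region: "us-east-1" });
const BACKUP_ACCOUNT_ID = "222222222222"; // hypothetical
const today = new Date().toISOString().slice(0, 10);
const manualId = `practice-share-${today}`;

// 1. Copy the automated snapshot to a manual snapshot we're allowed to share.
//    (A real job waits for the copy to reach "available" before step 2.)
await rds.send(
  new CopyDBClusterSnapshotCommand({
    SourceDBClusterSnapshotIdentifier: `rds:practice-prod-${today}-06-00`,
    TargetDBClusterSnapshotIdentifier: manualId,
  })
);

// 2. Grant the backup account permission to copy it.
await rds.send(
  new ModifyDBClusterSnapshotAttributeCommand({
    DBClusterSnapshotIdentifier: manualId,
    AttributeName: "restore",
    ValuesToAdd: [BACKUP_ACCOUNT_ID],
  })
);

// A twin job in the BACKUP account copies the shared snapshot into one it
// owns. Both gotchas above live on that side: the prod KMS key policy AND
// the backup role's IAM policy must both allow the key, and CopyTags must
// be omitted because the source is a shared snapshot.
```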

Tier 3: WORM (Write-Once-Read-Many) archive

A weekly logical dump of the database (pg_dump for Postgres), gzipped and written to a cloud-storage bucket with Object Lock in COMPLIANCE mode and a multi-week retention policy. The Object Lock is what makes this different from "another backup": in compliance mode, no one (not your normal users, not your backup-account admins, not even root) can delete an object inside the retention window.
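A sketch of the weekly dump job follows, with a hypothetical bucket name. Two details that are easy to miss: the bucket must be created with Object Lock enabled (it can't be switched on later), and S3 requires a Content-MD5 header on any PUT that carries Object Lock parameters:

```typescript
// worm-dump.ts: weekly logical dump into the Object Lock bucket.
import { execSync } from "node:child_process";
import { createHash } from "node:crypto";
import { readFileSync } from "node:fs";
import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";

const s3 = new S3Client({ region: "us-east-1" });

// 1. Dump and compress. pg_dump reads the connection string from the env.
const file = `/tmp/practice-${new Date().toISOString().slice(0, 10)}.sql.gz`;
execSync(`pg_dump "$DATABASE_URL" | gzip > ${file}`, {
  stdio: "inherit",
  shell: "/bin/bash",
});

// 2. Upload with COMPLIANCE-mode retention: undeletable until the date passes.
const body = readFileSync(file);
await s3.send(
  new PutObjectCommand({
    Bucket: "practice-worm-archive", // hypothetical
    Key: file.split("/").pop()!,
    Body: body,
    ContentMD5: createHash("md5").update(body).digest("base64"),
    ObjectLockMode: "COMPLIANCE",
    ObjectLockRetainUntilDate: new Date(Date.now() + 35 * 86_400_000), // ~5 weeks
  })
);
```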

What it protects against: deliberate destruction of the backups themselves. If your prod credentials leak AND backup-account credentials leak AND something tries to delete backups, the WORM archive is still there. This is the "the universe is on fire" tier.

Trade-off: WORM is logical, not snapshot-based, so restoring is slower, measured in hours not minutes. That's fine. You will only ever restore from Tier 3 if Tiers 1 and 2 are both gone, and at that point a few extra hours is a rounding error.

The boring scaffolding that makes the tiers actually work

Three small things that aren't backups themselves but make backups actually trustworthy:

  • Daily verification. A scheduled job that hits each tier and emails me if anything is stale beyond a threshold. I had a stretch of five days where this cron was firing but writing zero-byte output files, and I only noticed when I went to use one. Now the verification cron also alerts on suspicious-looking output (zero size, zero rows). Backups you're not testing aren't backups; verification you're not alerting on isn't verification. A sketch of this check follows the list.
  • Environment-variable backups. Often forgotten. A 5-line script snapshots every environment variable across every service daily, into the same Tier-3 WORM bucket. If the entire infrastructure has to be rebuilt from scratch, I have the configuration to do it.
  • Annual restore drills. "I have backups" is a claim until it's tested. Once a year, take a snapshot and restore it to a brand-new database in a brand-new account, end to end. If something doesn't work, you find out before you need it.
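Here's what the verification check looks like against the Tier-3 bucket. The bucket name and thresholds are illustrative, and the thrown error stands in for whatever alert channel you actually read:

```typescript
// verify-backups.ts: daily cron. Checks the WORM bucket for staleness and
// for the zero-byte failure mode described above.
import { S3Client, ListObjectsV2Command } from "@aws-sdk/client-s3";

const s3 = new S3Client({ region: "us-east-1" });
const MAX_AGE_HOURS = 7 * 24 + 2; // weekly dump, plus a little slack

const { Contents = [] } = await s3.send(
  new ListObjectsV2Command({ Bucket: "practice-worm-archive" })
);

const latest = [...Contents].sort(
  (a, b) => (b.LastModified?.getTime() ?? 0) - (a.LastModified?.getTime() ?? 0)
)[0];

if (!latest) throw new Error("WORM bucket is empty");

const ageHours = (Date.now() - latest.LastModified!.getTime()) / 3_600_000;
if (ageHours > MAX_AGE_HOURS)
  throw new Error(`Latest dump ${latest.Key} is ${ageHours.toFixed(0)}h old`);

// The five-day lesson: a backup that exists but is empty is not a backup.
if ((latest.Size ?? 0) === 0)
  throw new Error(`Latest dump ${latest.Key} is zero bytes`);
```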

The architecture redesign wasn't the only thing I rebuilt in the weeks after the incident. The video infrastructure got rewritten too, properly this time, after I started using a tool called LLM Conclave: a way of putting several AI models in the same room to debate an architectural decision before I commit to it. I'll cover that, and a related tool called FAMP, in dedicated posts. Both came to me from a friend in tech named Ben Lamm. Worth saying his name out loud, because I owe him. Solo doctors building production infrastructure are not as alone as they used to be, if they know which tools exist.

Six tests every solo medical practice should pass

If you're running a practice that holds patient data in a cloud database, here are six questions. The right answer to each is spelled out beneath it; if one of yours doesn't match, fix that this week.

  1. Is your code repository private? If you don't know, check now. A public repo with secrets in it is the start of every cascade I described above.
  2. Can you restore your database from a backup taken in a different cloud account? If "I have backups but they're all in the same cloud account as production" is the answer, you don't have disaster recovery. You have a single point of failure with more files than usual.
  3. Right now, without checking, how old is your most recent restorable snapshot? If you don't know, you don't have a backup posture; you have a hopeful posture. Check now. Set a daily alert if it gets older than X (mine is 26 hours).
  4. Is there a database role that the application uses, separate from the role that runs migrations? If your app and your migrations are using the same role, a destructive Prisma command, executed by anyone or anything that can reach your app's environment, can drop a table. Role separation costs ten minutes and ends an entire class of incident.
  5. Can any AI agent on your machine execute a destructive command against production without explicitly asking you, with the destructive nature of the command flagged? If the answer is "I don't know" or "yes," that needs to change today. Audit the permissions you've granted every tool. If a tool can fire a destructive operation autonomously, either revoke that capability or stop using the tool. There is no middle ground.
  6. When was your last restore drill? Not "I have backups," but "I actually restored to a fresh database and it worked." If "I haven't," your restore plan is a hypothesis.

What I'd tell a doctor about to start a practice

You will be tempted to skip this work. The day you set up your practice, the database is empty. There's nothing to lose. You'll think "I'll do backups when I have data worth backing up." That's the wrong frame. By the time you have data worth backing up, you also have patients depending on you, regulators watching you, and operational momentum that makes it harder to stop and add infrastructure work.

Set up Tiers 1, 2, and 3 the same week you put your first patient record in your database. The whole thing takes a day if you're starting from a clean cloud account. Make your code repo private the same hour you create it. Set up role separation in the database before you write your first migration. If you wait until you have a thousand records and an incident and a phone call to make to a patient who trusted you, you will set it up anyway, only on a much harder week.

There's a related lesson I want to flag separately, because I think it is the quietly growing risk in medicine right now. AI tools have made building practice infrastructure dramatically easier. They have not yet made it correspondingly harder to make a mistake at scale. Some AI partners are excellent. Claude, in my experience, has been excellent. It flags destructive operations before they run, asks for explicit confirmation, and tells me clearly what it has done after. Other AI partners are dangerous. They take autonomous action. They claim to have done things they didn't do. They generate plausible-looking output to cover their tracks.

Pick your AI partner the same way you pick a clinical-advisor relationship: based on a track record of being right, being honest about what they don't know, and not making things worse when you're already in trouble. Verify their reports the way you'd verify a referral note from a colleague you don't know yet. Where possible, remove the destructive options from their reach entirely. I went back to Claude. I do not regret it.

I lost a few days of confidence. I lost the ability to tell myself I'd done it right the first time. I made a phone call I owed. I got the practice back. I spent the next couple of weeks building the architecture I should have built first, and I now run a backup posture, an AI-permissions posture, and a database-safety posture I'm comfortable with.

Set it up. Test the restore. Audit your AI agents' permissions. Then forget about it and go see patients.


Next post in this series: the video migration. How I went from sending Doxy.me links by email out-of-band to a proper in-app video flow, what was actually broken, and how a multi-LLM debate tool called LLM Conclave saved me from picking the wrong vendor.

If you're a doctor thinking about building (or fixing) your own practice tech and want to talk through your specific situation with someone who's been through the version of it I described above, I do a small amount of consulting at drbensoffer.com/consulting. I work with a handful of doctor-builders at a time, so the calendar is intentionally narrow.

Otherwise, subscribe to the weekly digest and the next post will land in your inbox when it's ready.

Frequently Asked Questions

What's the difference between Tier-1, Tier-2, and Tier-3 backups?
Tier-1 is your database provider's automated snapshots, fast to restore but living in the same cloud account as production. Tier-2 is cross-account snapshot replication: copies of those snapshots in a separate cloud account that can't be reached if production is compromised. Tier-3 is a weekly logical dump (pg_dump) written to immutable Object-Lock storage that even an admin cannot delete inside the retention window. Each tier protects against a failure mode the others can't.
How can an AI agent destroy my data without me approving the command?
Most AI coding assistants run as long-lived processes with credentials that scope what they can do. If you grant an agent broad access (database connections, deployment rights, the ability to run shell commands), and that agent operates in a 'take action without asking' mode, it can execute destructive commands on its own. The fix is to audit permissions: any tool that can fire a destructive operation autonomously, against production, should either lose that capability or be replaced with one that always asks first.
Is `prisma db push` always destructive?
It is destructive any time the local schema definition diverges from the production database. The command is a forced overwrite, not a migration. In a development environment with a throwaway database, that's fine. In production, against patient data, running it can drop tables and replace them with seed data without warning. The safer alternative is `prisma migrate deploy` (or your ORM's equivalent), which applies versioned, reviewed migration files. Most production codebases should remove `prisma db push` entirely from the deployment toolchain.
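As a concrete sketch of "remove it entirely," here's a minimal pre-commit guard similar in spirit to the one described earlier in the post. The file name and pattern list are illustrative; wire it up via husky or .git/hooks/pre-commit:

```typescript
// scripts/block-db-push.ts: reject any staged change that introduces
// a known-destructive Prisma command. Extend the list with your own footguns.
import { execSync } from "node:child_process";

const staged = execSync("git diff --cached --unified=0", { encoding: "utf8" });
const banned = [/prisma\s+db\s+push/, /migrate\s+reset/];

for (const line of staged.split("\n")) {
  if (!line.startsWith("+")) continue; // only inspect added lines
  if (banned.some((re) => re.test(line))) {
    console.error(`Blocked destructive command in staged change:\n  ${line}`);
    process.exit(1);
  }
}
```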
I'm a solo doctor with no engineering team. Is this realistic for me to set up?
Yes. The three-tier backup architecture takes about a day to set up if you're starting from a clean cloud account, less if you're using a managed-database service that supports cross-account snapshot copies natively. You don't need to hand-roll any of it. Set it up the same week you put your first patient record in your database, before you have data worth losing. Doing it after an incident takes longer, costs more, and happens during an objectively worse week of your life.
I'm on Google Cloud or Azure, not AWS. Does this still apply?
Yes. The cloud-provider names change but the pattern doesn't. GCP has cross-project Cloud SQL backups and immutable Cloud Storage buckets. Azure has cross-subscription database backups and immutable Blob Storage. The same three-tier shape (in-database snapshots, cross-perimeter replication, immutable archive) works on all three with their respective primitives.
What's the maximum PITR retention on Aurora, and why isn't it the default?
35 days for Aurora Serverless v2, configurable per cluster via BackupRetentionPeriod. AWS keeps the default at 1 day to minimize customer storage cost; the longer window costs more. For a clinical practice the cost differential is trivial, and the only sane choice is the maximum. The original database where the incident happened was a different provider (Neon) running with a 1-day default, which is why PITR was useless three days later. One knob to turn at cluster creation and never think about again.

Dr. Ben Soffer, DO

Board-Certified Internal Medicine

Dr. Ben Soffer is a board-certified Doctor of Osteopathic Medicine providing concierge internal medicine care across Palm Beach County, Florida.
