The app doesn't just log your workout, it decides every set for you, and it gets the call right by learning who you are. Here it is two ways: the plain version, and the one with the formulas.
Most gym apps are spreadsheets you type into. This one does the thinking: it tells you exactly what to lift, set by set. A generic program assumes you're the average lifter. This coach builds a picture of you instead, how strong you are on each movement, how fast you recover, how you respond to a hard week. Every set you log sharpens that picture.
It watches your honest hard sets and keeps a running, smoothed estimate of your real strength on each lift. Recent sets count for more than old ones, so the estimate moves with you as you get stronger. When it hands you a weight, it's reading that estimate, not pulling a number out of the air.
Say “left shoulder felt tweaky” after a set. The coach files it away. Next time you train that movement, it pulls the memory back up and works around it. It's not just tracking numbers, it remembers the words too, and brings back the ones that matter.
After each workout a learning engine recomputes everything: are you progressing or stalling on this movement? Recovered enough to push, or still cooked from last time? It even grades its own past advice, if it keeps telling you a number of reps that you keep beating, it notices and corrects itself. And the bar it holds you to only ever ratchets up as you get stronger. It never quietly lowers the target.
…and that runs on every single set.
If the coach can't get a confident answer, it says so and asks, it never invents a number just to fill the silence. Every line it gives you is tied to something real from your training. No hype, no “you crushed it!”, just the honest read from someone who watched the session.
Three tiers. The iOS app runs the live session and calls the model; Supabase (Postgres + row-level security + Deno Edge Functions) is the source of truth and the learning engine; Claude is stateless coaching reasoning. All the learned state lives in Postgres, the model holds nothing between calls.
The phone calls Claude directly for generation; the heavy learning math runs server-side, never on the client.
Three layers get stitched into one JSON context object on every set:
text-embedding-3-small, 1536-dim) and run as a pgvector cosine search over your past workout notes, top 3, similarity ≥ 0.60.The heavy reasoning (plateau, transfer, fatigue) is computed deterministically server-side and distilled into that digest. The LLM is a constrained final-mile reasoner reading pre-chewed context, not a black box doing the math.
The digest is a request-time projection of the full server model, narrowed hard for token economy: per-pattern plateau verdicts, the coach's own prescription-accuracy bias cells, cross-exercise transfer coefficients (only those past R² ≥ 0.4 with ≥ 5 paired observations), cross-pattern fatigue interactions (only past confidence ≥ 0.7), active injury limitations, and weekly-fatigue / deload flags. The note store is pgvector with an HNSW cosine index; the session log is the model's primary working memory inside a single workout.
e1RM = weight × (1 + reps/30), counted on top sets only (3–10 reps), so a high-rep burnout never inflates it.α = 0.333 over the last 5 top sets, recent work weighted heavier. After a layoff, deload, or phase change it switches to “transition mode”: a 3-session mean that re-anchors fast instead of dragging stale numbers along.Every completed session runs a deterministic, idempotent server-side pipeline:
readiness(t) = clamp(0, 1, 0.3 + 0.7·(1 − e^(−t/τ))) with a 0.3 residual floor, τ = 30 h for neuromuscular work and 12 h for metabolic. A set at RPE ≥ 9 is bumped to count as both, since under-counting fatigue is the one error it won't make silently.(max−min)/mean ≤ 2.5% over a frequency-scaled window, RPE-gated) and a weekly volume-load track (≤ 5% over the trailing 4 weeks). Aggregation is declining-wins.ln(toE1RM) = coef·ln(fromE1RM) + b per exercise pair. A Spearman rank check at ≥ 10 observations catches monotonic-but-non-linear pairs and widens the standard error additively by residual-stddev × 1.0. coef > 1 means the second lift improves faster than the first.The intricacy isn't the happy path, it's what breaks. A few failures that shaped the design:
training_day_id plus a single ActiveSessionCoordinator that is the only owner of live/paused state, read from one atomic snapshot.(user, session) primary key with ON CONFLICT DO NOTHING makes every replay converge to exactly one apply: idempotency in the schema, not in hopeful code.(user, session) primary key.auth.uid() and Edge Functions re-check the JWT owner.Want the messier version, with the bugs and the dead ends? That's the whole log.