17 things got done: The new look's building blocks now exist in code (PR #367).
The redesign has been fully drawn on paper for a while. This is the first piece of actually building it. I put the design's basic ingredients into the app's code: the exact colours (cream "paper", the deep ultramarine ink-blue, and a handful of others), the two fonts (Space Grotesk for big headline numbers, Inter for normal text), plus the spacing, corner-rounding, motion, and buzz-feedback rules. None of the real screens use these yet; this is the shared paint-set and toolbox every new screen will reach for next.
A few things worth calling out simply:
I also wrote automatic checks that prove the colours match the design exactly, the dark look really differs, the blue text stays readable, and the fonts actually load. All twelve pass.
Nothing an everyday user sees has changed: the old screens are all still in place and untouched. This just lays the foundation the next steps build on (the new 3-tab shell is next).
The last item from the #318 flow audit. The pre-workout screen shows little notice cards: "welcome back after your break", "you've leveled up", "your targets are ready". Each has an × to dismiss it. But the app only remembered that × in short-term memory: leave the screen and come back, and the same card you just closed was back again. Worse, the app didn't remember which news you dismissed, so it couldn't tell "same old card again" apart from "actually, something new happened". Separately, on app launch two pop-ups (crash recovery and a one-time programme notice) could try to appear at the same moment; the system quietly drops one, and the dropped notice was marked as "already shown", so you'd never see it at all.
Dismissing a card now writes down exactly which event you dismissed, keyed to solid facts like "the session date that caused the break" or "the session count when you leveled up", never to wobbly numbers. The card stays gone across screens and app restarts, and only returns when the underlying event genuinely changes (a new break, a new level-up, a re-calibration). The × is still just a local "not now"; saving from the review screens remains the real acknowledgement. And at launch, crash recovery now goes first: the programme notice politely waits for the next launch instead of being silently swallowed. The bigger banner-queue redesign stays parked as issue #327.
Six new unit tests cover the dismissal memory directly (shows when nothing is dismissed, stays hidden for the same event, re-arms for a new one, each banner independent), all against a throwaway settings store so real settings are never touched. Full build passed and the whole test suite ran green with zero failures (the live-API test stays gated off).
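The dismissal memory described above can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: the `Banner` cases, the `BannerDismissalStore` type, the key prefix, and the event-ID strings are all hypothetical names chosen for the example. The idea it shows is that the store remembers which event was dismissed, so the same event stays hidden while a new one re-arms the card.

```swift
import Foundation

/// Hypothetical banner kinds; the names are illustrative only.
enum Banner: String {
    case welcomeBack, levelUp, targetsReady
}

/// Remembers, per banner, the identity of the event the user dismissed.
struct BannerDismissalStore {
    let defaults: UserDefaults

    private func key(for banner: Banner) -> String {
        "dismissedEvent.\(banner.rawValue)"   // assumed key format
    }

    /// Record the stable identity of the dismissed event
    /// (for example a session date or a session count).
    func dismiss(_ banner: Banner, eventID: String) {
        defaults.set(eventID, forKey: key(for: banner))
    }

    /// Show the banner unless this exact event was already dismissed.
    func shouldShow(_ banner: Banner, eventID: String) -> Bool {
        defaults.string(forKey: key(for: banner)) != eventID
    }
}
```

Passing a `UserDefaults(suiteName:)` instance with a unique suite name gives a throwaway store for tests, so the standard settings are never written to.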
Four related problems with the weights the AI coach hands you, all from the #318 audit. First, the coach could prescribe a weight your gym simply doesn't have, like 47.5 kg dumbbells when the rack jumps from 45 to 50. The app even had a little note ready to display ("adjusted to nearest available") but nothing ever produced it. Second, nothing stopped a runaway prescription: if the AI hallucinated 200 kg after you benched 80 last week, the app would show 200 kg with a straight face. Third, the workout card never told you what you actually did last time, useful context the app already stores but never showed. And fourth, when the AI was offline and you tapped "Continue with last weights" on your very first set, it had no "last weights" to use, so it showed 0 kg, rendered as "BW" (bodyweight), on a barbell exercise.
Every weight the AI prescribes now goes through one shared checkpoint before you see it. It first snaps the weight to something your gym actually has, honoring every "my gym doesn't have this" report you've made; it rounds down, the safe default, unless the target sits clearly closer to the next weight up. Then, for your heavy working sets only, it caps the weight at 15% above what you lifted last session; warm-ups and first-ever sessions stay free, and a low prescription is never pushed up. If anything changed, the card says so with the "adjusted" note. The card also gained a quiet "Last time: 80kg × 8/8/7" line so you can sanity-check the coach yourself. And "Continue with last weights" now seeds from your last session's history when this session has none, only shows up when it can offer something real, and works on the first set of a session too instead of only between sets.
Twenty-five new tests: the snapping rule (including the gym-exclusion case, the 5 kg machine steps, and both ends of the rack), the cap (with history, without history, and per set type), the "adjusted" note surviving the round trip from the coach's text to the screen, the history seeding (including keeping honest 0 kg for genuinely bodyweight movements), and the new card line's wording. Full build passed and the whole suite ran green.
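The snap-then-cap checkpoint described above might look something like the sketch below. All names (`snap`, `checkpoint`, `CheckedWeight`, `SetKind`) are hypothetical, and the `roundUpFraction` threshold of 0.75 is an assumption: the text only says the weight rounds down unless the target sits "clearly closer" to the next weight up. The list of available weights is assumed to already exclude anything the user reported as missing.

```swift
import Foundation

enum SetKind { case warmUp, working }

struct CheckedWeight {
    let kg: Double
    let adjusted: Bool   // would drive the "adjusted" note on the card
}

/// Snap to a weight the gym has. Rounds down unless the target lies at
/// least `roundUpFraction` of the way to the next weight up (assumed rule).
func snap(_ target: Double, to available: [Double], roundUpFraction: Double = 0.75) -> Double {
    let rack = available.sorted()
    guard let lower = rack.last(where: { $0 <= target }) else {
        return rack.first ?? target      // below the lightest weight, or empty rack
    }
    guard let upper = rack.first(where: { $0 > target }) else {
        return lower                     // at or above the heaviest weight
    }
    let position = (target - lower) / (upper - lower)
    return position >= roundUpFraction ? upper : lower
}

/// Snap first, then cap working sets at 15% above last session.
/// Warm-ups and sessions without history are not capped, and the cap
/// only ever lowers a weight.
func checkpoint(prescribed: Double, kind: SetKind, lastSessionKg: Double?,
                available: [Double], maxIncrease: Double = 0.15) -> CheckedWeight {
    var kg = snap(prescribed, to: available)
    if kind == .working, let last = lastSessionKg {
        kg = min(kg, last * (1 + maxIncrease))
    }
    return CheckedWeight(kg: kg, adjusted: kg != prescribed)
}
```

With a rack of 40, 45 and 50 kg, a 47.5 kg prescription snaps down to 45 kg. One thing this sketch leaves open is that a capped value may itself fall between rack weights; the text does not say how the app handles that.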
All five redesigned screens are done on paper. This step turned that pile of design documents into an actual to-do list a developer can pick up, about 22 small, self-contained build tickets, ordered so each one can be finished and checked on its own. They're all written down as issues on the tracker now, so the plan can't get lost when a session ends.
Three of those tickets needed real engineering judgment calls before anyone can start: how to build the colour-and-font system in code, how to take automatic "did the drawing change?" screenshots in tests, and how to swap the new screens in one at a time without breaking the old app while another developer is editing the same files. Instead of just deciding these myself, I had several AI "advisors" each argue their own recommendation, then a separate AI reviewer weigh the arguments and make the call, then a final reviewer check that the three decisions fit together. Each decision is now written up as a permanent record (an "ADR").
The biggest catches. The AIs actually read the real code first, so the advice was grounded: they found 236 places where colours are hard-coded, that no custom fonts are installed yet, and an old chart drawn the dishonest "smooth curve" way the new design bans.
Four small lies and rough edges in the live workout loop, all found by the big #318 audit. The gym streak was always computed for a fake placeholder user, so once real sign-ins existed, your streak could be someone else's, or nobody's. The rest screen promised a "rest is over" alert even when you had turned notifications off, and said nothing about it. Its skip button was so faint and small it was easy to miss and easy to fumble. The countdown showed raw seconds ("150") instead of the "2:30" a human expects. And the end-of-workout summary had a personal-records section that the screen knew how to draw, but the data side always sent an empty list, so it never, ever appeared.
The streak now uses the real signed-in user. The rest screen quietly tells you when rest alerts can't fire because notifications are off (one muted line, no nagging, no buttons). The skip button is brighter and has a proper finger-sized tap area, and the countdown now reads minutes:seconds. And the summary now computes real personal records: for each exercise it compares your best estimated one-rep max from today's top sets (3–10 reps only) against your history under the same rule, using the same formula the rest of the app already uses. One honest detail: if you've never done an exercise before, there's no baseline, so no record is claimed. First time doing something is not a "new record"; it's just a first time. The history lookup runs alongside the existing end-of-workout save, so finishing a workout is not a second slower; and if the lookup fails, the summary simply shows no records rather than blocking.
The record-computing piece is a pure function, so four tests pin it down: a genuine record, no record when you didn't beat your best, no record when there's no history, and reps outside 3–10 being ignored on both sides. Full build passed and the whole suite ran green, 757 tests, 0 failures.
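A pure record check along these lines could be sketched as below. The type and function names are hypothetical, and the Epley formula is an assumption: the text only says the app reuses "the same formula the rest of the app already uses" without naming it.

```swift
import Foundation

struct LoggedSet {
    let kg: Double
    let reps: Int
}

/// Estimated one-rep max using the Epley formula (an assumption here).
func estimatedOneRepMax(_ set: LoggedSet) -> Double {
    set.kg * (1 + Double(set.reps) / 30)
}

/// Best estimate across sets of 3 to 10 reps; nil when no set qualifies.
func bestEstimate(_ sets: [LoggedSet]) -> Double? {
    sets.filter { (3...10).contains($0.reps) }
        .map(estimatedOneRepMax)
        .max()
}

/// A record needs a baseline: with no qualifying history, none is claimed.
func isPersonalRecord(today: [LoggedSet], history: [LoggedSet]) -> Bool {
    guard let todayBest = bestEstimate(today),
          let previousBest = bestEstimate(history) else { return false }
    return todayBest > previousBest
}
```

Because the function takes plain values and returns a `Bool`, the four cases named above (genuine record, no improvement, no history, reps out of range) can each be checked with a one-line call.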
The set-logging sheet quietly made things up. The "how hard was it?" picker came pre-set to "On Target", so if you never touched it the app recorded an effort rating you never gave. The rep counter refused to go below 1, so a failed lift, zero reps, real and useful information, could not be recorded truthfully. And there was no way to skip a set at all: if you weren't going to do it, your only options were lying about it or ending the workout.
Three things. The effort picker now starts empty and is clearly marked optional: pick one if you want, skip it if you don't, and nothing gets invented either way. The rep counter now goes down to zero, and a zero-rep set can no longer be celebrated as a personal record. And the menu on the active set screen has a new "Skip Set" item: it moves you straight to the next set (no rest timer, nothing written down for the skipped one), and if you skip everything, the session is thrown away instead of being saved empty.
Fixing skip surfaced a sneaky bug: the app decided "was that the last set?" by counting the sets it had written down. A skipped set writes nothing down, so after a skip the count lagged and the coach would prescribe a phantom extra set. Both the skip path and the normal path now ask "what set number are we on?" instead of counting receipts.
The phantom-set test was written first and watched to fail against the old counting logic, then the fix turned it green. Full build passed and the whole suite ran green, 743 tests, 0 failures. Seven new tests: the empty-by-default effort picker, the database row leaving the effort field properly blank, skip advancing without writing anything or resting, skip-then-complete advancing to the next exercise with no phantom set, and an all-skipped session being discarded.
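The difference between "counting receipts" and "asking what set number we are on" can be shown with a small sketch. `ExerciseProgress` and its members are hypothetical names, and the model is simplified to a single exercise; it keeps the old counting rule alongside the new one purely for comparison.

```swift
import Foundation

struct ExerciseProgress {
    let plannedSets: Int
    private(set) var currentSetNumber = 1          // 1-based position in the plan
    private(set) var loggedSetNumbers: [Int] = []  // sets actually written down

    init(plannedSets: Int) { self.plannedSets = plannedSets }

    /// New rule: finished once the position moves past the plan.
    var isFinished: Bool { currentSetNumber > plannedSets }

    /// Old rule, kept for comparison: it lags after a skip.
    var finishedByCountingLogs: Bool { loggedSetNumbers.count >= plannedSets }

    /// Nothing was logged at all, so there is nothing worth saving.
    var shouldDiscard: Bool { isFinished && loggedSetNumbers.isEmpty }

    mutating func completeSet() {
        guard !isFinished else { return }
        loggedSetNumbers.append(currentSetNumber)
        currentSetNumber += 1
    }

    /// Skipping advances the position but writes nothing down.
    mutating func skipSet() {
        guard !isFinished else { return }
        currentSetNumber += 1
    }
}
```

After "complete, skip, complete" on a three-set exercise, `isFinished` is true while `finishedByCountingLogs` is still false, which is exactly the phantom extra set.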
During onboarding the app asks about your experience, your goal, your bodyweight and your age, and then, embarrassingly, the part that builds each workout never read those answers. It looked in storage spots nothing ever wrote to, shrugged, and planned every session for a made-up "intermediate lifter chasing muscle size". Three smaller things too: if you denied camera access, the "enter equipment manually" button just looped you back to the camera; if the app got killed mid-onboarding, your finished gym scan was forgotten and a paid AI call could quietly run twice; and there was no way to ask for a fresh version of a session you hadn't started yet.
All three program-building paths now read your real saved answers from one shared, tested place. Your goal is picked smartly: the coach's current understanding of your goal wins when it exists, then your onboarding answer, then the old default as a last resort. The manual-equipment button now actually opens manual entry (and phone users get that option up front, not just simulator users). Onboarding now remembers a finished scan after an app kill and won't pay for a second program generation it already has. And any planned-but-untouched session gets a "Regenerate this session" button with a confirmation; days that are completed, paused, or mid-workout are protected and can't be reset.
Full build passed and the whole test suite ran green. Twelve new tests: eight lock down exactly how the profile and goal get assembled (every fallback branch covered), and four prove regenerate refuses completed, paused, and in-progress days while correctly resetting an eligible one.
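The goal fallback order and the regenerate guard can be expressed in a few lines. This is only a sketch: the `Goal` and `DayStatus` cases and both function names are hypothetical, and the default of `.muscleSize` mirrors the "chasing muscle size" placeholder mentioned above.

```swift
import Foundation

enum Goal { case strength, muscleSize, fatLoss }

/// Coach's current understanding first, then the onboarding answer,
/// then the old default as a last resort.
func resolveGoal(coachGoal: Goal?, onboardingGoal: Goal?, fallback: Goal = .muscleSize) -> Goal {
    coachGoal ?? onboardingGoal ?? fallback
}

enum DayStatus { case planned, inProgress, paused, completed }

/// Only a planned, untouched day may be regenerated.
func canRegenerate(_ status: DayStatus) -> Bool {
    status == .planned
}
```

Keeping both rules as small pure functions is what makes "every fallback branch covered" cheap to test.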
The Train tab got its full design plan, the last of the five screens. Train is two things in one: the plan (your weeks ahead, what's coming) and the dictionary of every exercise (how to do it, and your own history with it). The same four experts reviewed it.
The big idea the experts pushed hardest on: the app should never pretend it knows more about the future than it does. The coach builds your program a little at a time: it plans next week's exact weights close to the day, not a month out. So the calendar draws a line: days it has really planned show in full ink with real numbers; days it hasn't yet show in faint pencil, as just a shape ("lower body, squat focus") with no fake numbers. The further out you look, the fainter it gets, because that's honestly how much the coach knows.
The biggest catches. The first draft let the app pretend about the future, so the experts made the "how much do we actually know" line into a real drawn thing on the page, not just a colour change.
All five screens are now fully designed. The next step (not started, waiting for the go-ahead) is breaking these designs into actual build tickets.
A flow audit found a bunch of places where the app's words didn't match reality. The onboarding screen promised "session reminders" that don't exist. The final onboarding screen said "your program is loaded" even when generation had failed or the gym scan was skipped. An error message blamed your "API key"
Ten small fixes. The copy now tells the truth: the final onboarding screen has three honest versions depending on what actually happened; the fake reminder promise is gone; the wait estimate matches the real timeout ("up to a couple of minutes"). Dead ends became buttons: the Scanner-tab text is now a button that takes you to Settings, and the programme-complete screen got a real "Start Your Next Programme" button with a confirmation step (disabled with an explanation if your gym isn't set up). The welcome-back-from-a-break banner now only claims the session "accounts for the break" when that's true; if the session was planned before your break, it says so instead.
Full build passed, the whole test suite ran, and the welcome-back banner logic got three new unit tests (short break, long break, pre-planned session).
The Progress tab, where you check if you're actually getting stronger, got its full design plan, reviewed by the same four experts. The main idea: other apps chart your raw gym numbers, so a planned light day looks like you got weaker. We chart what the coach believes about you instead, a band per lift, and your floor through time becomes a staircase that can only ever go up, because it only moves when you prove it.
Four independent expert reviews against 17 reference screenshots from Tonal, Hevy, Bevel, Gymshark and Peloton; every accepted finding folded into the locked spec with the disagreements and who-won recorded.
The biggest catches. The brand expert ruled that no line on the chart may ever slope: the coach's belief updates on training days and holds in between, so every line is made of flat steps and right angles. A slope would claim knowledge nobody has. The experience expert found the screen had no forward pull (it only looked backward), so every lift now shows how close the next floor-raise is ("2 of 3 sessions above 102"). And the animation expert banned the two prettiest possible animations for honest reasons: re-scaling the chart would show the floor moving, and morphing the small band into the big chart would rotate an axis mid-flight.
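The "flat steps, never a slope" rule amounts to holding the last known value until the next training day rather than interpolating between points. The sketch below illustrates that for the floor line only. The names are hypothetical, days are plain integer offsets, and treating the floor as a running maximum (ignoring any lower update) is an assumption drawn from the "staircase that can only ever go up" description.

```swift
import Foundation

struct BeliefUpdate {
    let day: Int         // day offset of a training session
    let floorKg: Double  // the coach's floor after that session
}

/// One value per day from day 0 through `lastDay`. Each value is held flat
/// until the next update; days before the first update are nil (unknown).
func steppedFloor(updates: [BeliefUpdate], throughDay lastDay: Int) -> [Double?] {
    guard lastDay >= 0 else { return [] }
    let sorted = updates.sorted { $0.day < $1.day }
    var result: [Double?] = []
    var current: Double? = nil
    var next = 0
    for day in 0...lastDay {
        while next < sorted.count, sorted[next].day <= day {
            // Assumed rule: the floor never moves down.
            current = max(current ?? sorted[next].floorKg, sorted[next].floorKg)
            next += 1
        }
        result.append(current)
    }
    return result
}
```

Drawing these values as horizontal segments joined by vertical risers gives the staircase; a straight line between two updates would imply knowledge about the days in between.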
The app builds each day's workout only when you ask for it; until then the day is empty, a blank slate. But the big button on the Workout tab always said "Start Workout," even on a day that had nothing in it yet. Tapping it didn't build the workout; it just threw an error. Worse, by the time it hit that error it had already scribbled two notes to itself: a "you have an unfinished workout" flag and a half-started session record in the database. So the next time you opened the app it nagged you about a workout you never actually did, showed a warning dot on the tab, and could even mark a day you never trained as "paused."
On a not-yet-built day, the button now reads "Generate Session" and actually builds the workout right there, no more dead-end error. If you haven't set up your gym yet, the button politely greys out and points you to Settings instead of pretending to work. And the empty-day check now runs first, before the app writes anything down, so a failed start leaves zero mess behind: no false "unfinished workout" flag, no orphaned record, no wrong-day "paused" mark. If building the session fails (say your connection drops), it now says so plainly instead of failing silently.
Two new automated tests pin the fixes: one proves a failed start leaves no flag and no stray database record, the other proves resuming a truly empty day quietly stops instead of faking a finished workout. The full test suite, 261 tests, passed on an iPhone 17 Pro simulator.
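The "check first, write afterwards" ordering is the core of this fix, and it can be sketched as below. Every name here (`WorkoutStarter`, `PlannedDay`, `StartError`, `primaryButtonTitle`) is hypothetical, and an in-memory array stands in for the real database.

```swift
import Foundation

enum StartError: Error { case sessionNotGenerated }

struct PlannedDay {
    var exercises: [String]   // empty until the session has been generated
}

/// Button label depends on whether the day has been built yet.
func primaryButtonTitle(for day: PlannedDay) -> String {
    day.exercises.isEmpty ? "Generate Session" : "Start Workout"
}

final class WorkoutStarter {
    private(set) var hasUnfinishedWorkoutFlag = false
    private(set) var sessionRecords: [String] = []   // stand-in for database rows

    func start(day: PlannedDay, sessionID: String) throws {
        // Validate before any side effect, so a failed start leaves nothing behind.
        guard !day.exercises.isEmpty else { throw StartError.sessionNotGenerated }
        hasUnfinishedWorkoutFlag = true
        sessionRecords.append(sessionID)
    }
}
```

Because the guard runs before either write, a thrown error cannot leave a stray flag or an orphaned record.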
The screen you see right after finishing a workout got its full design plan. It's the moment the coach speaks: one short headline naming what the session proved about you, the proof drawn underneath as your strength band with today's result placed on it like a dot of ink, and the full session record below that. The same four experts reviewed it: looks, experience, brand, and animation.
The biggest catch. My draft said that once you tap Finish, the record is locked forever. The experts voted that down 4 to 0. The scary example: you log the wrong weight by accident, and the app celebrates a strength milestone you never earned, and you can't fix it. Now real facts (weight and reps) can be corrected from the history page for up to two days, with the old number visibly crossed out, not erased. Feelings stay locked: you can't rewrite how a set felt after the fact.
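One way to model the correction rule is to keep every superseded value next to the current one and to leave the effort rating immutable. The sketch below is a design illustration only, since this screen exists as a plan rather than shipped code. The names are hypothetical, and reading "up to two days" as exactly 48 hours from the moment the session finished is an assumption.

```swift
import Foundation

/// A superseded value, kept so the UI can show it crossed out.
struct Correction {
    let oldKg: Double
    let oldReps: Int
    let correctedAt: Date
}

struct SetRecord {
    private(set) var kg: Double
    private(set) var reps: Int
    let effort: String?      // feelings stay locked: no setter at all
    let finishedAt: Date
    private(set) var corrections: [Correction] = []

    init(kg: Double, reps: Int, effort: String?, finishedAt: Date) {
        self.kg = kg
        self.reps = reps
        self.effort = effort
        self.finishedAt = finishedAt
    }

    /// Returns false, changing nothing, once the window has closed.
    @discardableResult
    mutating func correct(kg newKg: Double, reps newReps: Int, now: Date) -> Bool {
        let window: TimeInterval = 2 * 24 * 60 * 60   // assumed: 48 hours
        guard now.timeIntervalSince(finishedAt) <= window else { return false }
        corrections.append(Correction(oldKg: kg, oldReps: reps, correctedAt: now))
        kg = newKg
        reps = newReps
        return true
    }
}
```

Appending to `corrections` instead of overwriting is what lets the history page show the old number crossed out rather than erased.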
Other good catches. All four experts noticed my headline text literally couldn't fit on the screen (too many words at too big a size); it's now a short claim plus a smaller proof line. The animation expert choreographed the screen's signature moment: today's dot settling onto your band like a pen touching paper, and the "floor moved up" click that only ever plays when you've actually seen it, never behind your back, never twice. And celebrations stay honest: a flat day gets respect and a quiet "one more session above 102 and the floor moves," not confetti.
What's next. The Progress screen, then the Train screen, same four-expert panel.
A while back, three files for the workout screens ended up in the wrong place, sitting at the very top of the project folder instead of inside the app's real code folder. The app never used them; it always built from the proper copies. But two of the three strays were old and out of date: one was missing a crash fix, another was missing the "Continue with last weights" button. The danger wasn't a broken app; it was that anyone, a person or an AI helper, who opened the wrong copy could read outdated code and get confused, or even "fix" the file nobody actually runs.
The three stray files are gone, along with their now-empty folders and the leftover bookkeeping lines in the project's index that still pointed at them. The real, up-to-date copies inside the app were not touched at all.
Before deleting, we re-confirmed that the app's build recipe never compiled these files (it didn't), so removing them couldn't break anything. After deleting, we searched the project's index for any leftover mention (none), confirmed the project still opens, and built and ran the full test suite on a simulated iPhone: all 261 of the everyday tests passed. One live-internet test failed, but it fails the exact same way on a clean copy of the project too, so it's an old flaky test, not something this change caused.
The screen you actually lift with (one set at a time, a big Done button, a rest timer) got its full design plan. Four expert agents reviewed it this time: looks, experience, brand, and a new one who only judges animation and timing.
The biggest catch. There was no way to fix a logged set. Tap Done by accident, or rack the bar a rep early, and the app would remember a lift that never happened, feeding the coach's picture of you with fiction. Now every logged set can be corrected until you finish the session: fewer reps, more reps, a different weight, or "that one hurt."
Other good catches. The app now shows what you lifted last time right under today's target, so you can check the coach's homework. Weights snap to plates your gym actually has before they're ever shown. The animation expert timed every transition to the millisecond, ruled that nothing on screen may move while you rest (which also saves battery), and found two spots where animation quietly caused data bugs, like the final set of every session losing its "how did that feel?" question because the screen changed too early.
One look-and-feel decision worth noting. The Done button is no longer a button: it's a solid band of blue ink across the bottom of the screen, all session long, that turns into "Finish" at the end. And work numbers are always dark ink while timer numbers are light pencil-grey, so your eye always knows what's a lift and what's a clock.
Two more screens got their full design plans: the moment the app opens, and the home screen ("Today") that answers one question: what does my coach want from me right now? This round three expert agents reviewed the draft instead of two: the usual looks-and-experience pair, plus a new brand-design specialist.
The big design decision. The brand specialist said the draft logo would look like anyone's app, plain lettering on a blue screen. The fix: the word APEX gets one custom-drawn letter. The hole inside the letter A becomes a tiny camera shutter, the same shutter that shows your daily readiness on the home screen. One drawing ties together the logo, the app icon, and the app's signature gauge. (Two agents preferred lowercase lettering; the brand specialist's capital-A argument won because the shutter needs the A's triangle shape.)
Other important catches. The home screen was missing half its real-life situations, like the day the coach says "don't train hard today" (the screen now leads with that advice instead of a big blue Start button), or when your workout numbers aren't ready yet (they now get prepared in the background, so you never stand on the gym floor watching a spinner). Also, the readiness gauge must admit when it doesn't know yet: a new-user gauge shows a dash, not a made-up number. And one technical save: iPhone launch screens can't use custom fonts, so the logo ships as a pre-made image, caught on paper instead of in testing.
Yesterday we drafted the design for a new user's first three minutes in the app. Today two expert agents (one for looks, one for experience) studied the 16 real app screenshots that design was based on, from apps like Fitbod and Yazio, and tore into them: what those apps get right, where they cheat or annoy people, and what our version must do differently.
The biggest finds. The most important screen, where the app draws its first picture of you, had no good example anywhere, so it now has the most detailed plan instead of the thinnest. A real logic hole got caught: the old draft would ask someone with no barbell about their barbell lifts; now the equipment answer shapes which questions get asked. And a stack of smaller rules landed: progress bars that never lie, questions that answer themselves with one tap, never pre-filling a number (people just accept suggestions, which poisons the data), and no sign-up screens, pop-ups, or paywalls between the last question and the first workout.
How it landed. All of it is folded into the design documents. By coincidence this rode into main inside pull request #316 (another work stream branched off these changes and its merge carried them in); the content landed exactly as written.
Three reviewer agents each took a different pair of glasses and went through the app's real code, screen by screen: one looked at the overall journey (like a ride-app expert), one at what it's like to actually use mid-workout at the gym, and one at the very first minutes a new person spends in the app. They wrote three reports with about forty findings, each one pointing at the exact line of code.
The headline problems they agree on. The Workout tab can't actually start a workout on most days (and can even invent a fake "unfinished workout" afterwards), several messages in the app say things that aren't true, the answers people give during signup get partly thrown away, a lazy set log quietly invents an effort score, and the streak counter is counting for the wrong user so it always shows zero.
What this is and isn't. These are reports, not fixes. The fix campaign starts next: a plan, a critic to challenge the plan, then small focused pull requests.