For muscle nerds

The approach.
The sources. The caveats.

Most fitness apps treat data like a stage prop: a number on a chart, a vague "AI insight," a percentage bar that aggregates something they won't name. flexRep treats data like a working tool — which means being honest about the model behind every metric, the literature it leans on, and the bits that are still squishy.

This page is the long version. It exists because the same kind of person who reads Sports Medicine in bed wants to know whether their bench is being scored against a Schoenfeld meta-analysis or against vibes.

It's against the meta-analysis. Mostly.

01 — The data model

Sets are first-class citizens. Definitions live elsewhere.

The shape of your training data shouldn't change because we renamed a button. The model is built so that every set you've ever logged survives every iteration of the app that comes after.

The relationships, in plain English.

                  Exercise (the library)
                      ↑
                      │ referenced by many; never cascades
                      │
Workout  ──▶  ExercisePerformance  ──▶  Set
(session)     (this workout's bench)     (weight × reps × effort)
    │
    └─▶  Gym (optional)

Delete a workout, and its performances and sets go with it — that's intentional. Delete an exercise from the library or archive a gym, and your historical logs stay untouched. The shape of the graph is the shape of how training data actually behaves.
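
A minimal sketch of those delete rules, assuming hypothetical Python dataclasses (`Exercise`, `Workout`, `ExercisePerformance`, `SetEntry`) and plain in-memory lists standing in for whatever store the app really uses. None of these names come from flexRep's schema.

```python
from dataclasses import dataclass, field
from uuid import uuid4

# Hypothetical names throughout; the real schema is not published on this page.

@dataclass
class Exercise:
    name: str
    id: str = field(default_factory=lambda: str(uuid4()))
    archived: bool = False

@dataclass
class SetEntry:
    weight_kg: float
    reps: int
    rpe: float

@dataclass
class ExercisePerformance:
    exercise_id: str  # a reference to the library, not ownership of it
    sets: list[SetEntry] = field(default_factory=list)

@dataclass
class Workout:
    performances: list[ExercisePerformance] = field(default_factory=list)

def delete_workout(workouts: list[Workout], workout: Workout) -> None:
    """Removing a workout drops its performances and sets with it."""
    workouts.remove(workout)

def retire_exercise(exercise: Exercise) -> None:
    """Retiring a library entry archives it; logs keep their exercise_id."""
    exercise.archived = True

bench = Exercise("Bench Press")
session = Workout([ExercisePerformance(bench.id, [SetEntry(102.06, 5, 8.0)])])
workouts = [session]
retire_exercise(bench)  # the logged performance still points at bench.id
```

Ownership runs downward (workout to performance to set), while the library is only ever referenced by id, which is why retiring an exercise cannot reach into history.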

Principles we don't bend.

Library metadata lives in one place. Renaming an exercise touches one row, not a thousand log entries.
Sets are their own things. Each one has its own identifier, its own effort, its own notes — never encoded into a string on a parent row.
Every record gets a globally-unique identifier and timestamps from day one. Cross-device sync depends on it; debugging benefits from it.
Canonical units stored, converted only at the edges. Switching display units never mutates your data.
Categorical data is human-readable. Set type and movement pattern are stored as words, not magic integers.
Every entity has escape hatches. Tags and an opaque metadata field absorb feature requests without forcing a migration.
Versioning is a v1 problem. The schema is versioned from the start, so v2's first migration isn't an emergency.

02 — Units & precision

The number under your finger is not the number on disk.

Weight is stored in kilograms at full floating-point precision, regardless of what your display shows. Not because we believe in microgram accuracy. Because we believe in not losing data in conversion.

WEIGHT

102.0583

Canonical: kilograms, at full precision.

225 lb entered → 102.0583 kg stored, shown as 102.06 kg. On reload, kg → display unit, rounding only at the view layer. A 225 lb user who switches to kg sees 102.06 kg. A 102.06 kg user who switches to lb sees 225 lb. The math is reversible because the storage is canonical.
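
The round trip can be sketched in a few lines. The conversion factor is the exact international definition of the pound; the function names and the display rounding (two decimals for kg, whole pounds for lb) are assumptions made for this example.

```python
LB_TO_KG = 0.45359237  # exact, by the international definition of the pound

def lb_to_kg(lb: float) -> float:
    return lb * LB_TO_KG

def kg_to_lb(kg: float) -> float:
    return kg / LB_TO_KG

def display(kg: float, unit: str) -> str:
    """Round only here, at the view layer (rounding rules are illustrative)."""
    if unit == "kg":
        return f"{kg:.2f} kg"
    return f"{kg_to_lb(kg):.0f} lb"

stored = lb_to_kg(225)        # full-precision kilograms, as they would sit on disk
print(display(stored, "kg"))  # 102.06 kg
print(display(stored, "lb"))  # 225 lb
```

Switching the display unit only changes which branch of `display` runs; `stored` is never rewritten.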

RPE / RIR

7.5

Half-step scale. RIR derived from RPE.

Half-steps are coarse enough to be honest. Tenths would imply discrimination most lifters can't actually make. We store both RPE and RIR so the user can think in whichever framing they prefer.
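
A small sketch of the half-step scale, assuming a 1 to 10 RPE range and the common convention RIR = 10 − RPE. The function names are hypothetical, and the page does not spell out the exact mapping flexRep uses.

```python
def snap_to_half_step(rpe: float) -> float:
    """Snap an RPE to the nearest 0.5 and clamp it to the 1-10 scale."""
    snapped = round(rpe * 2) / 2
    return min(10.0, max(1.0, snapped))

def rir_from_rpe(rpe: float) -> float:
    """RIR as the arithmetic dual: RPE 10 leaves 0 reps, RPE 7.5 leaves 2.5."""
    return 10.0 - snap_to_half_step(rpe)

print(rir_from_rpe(7.5))  # 2.5
```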

DURATION

14,820

Always stored as seconds.

Aggregating "13 min + 1:42:18" across a year of sessions is the kind of thing that produces off-by-three-hours bugs forever. Seconds compose.
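
To make "seconds compose" concrete, here is a sketch that normalizes the two formats from the example above before adding them. The parser and the formats it accepts are assumptions, not flexRep's import code.

```python
def to_seconds(text: str) -> int:
    """Parse 'H:MM:SS', 'MM:SS', or 'N min' into whole seconds."""
    text = text.strip()
    if text.endswith("min"):
        return int(text[:-3].strip()) * 60
    seconds = 0
    for part in text.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds

total = to_seconds("13 min") + to_seconds("1:42:18")
print(total)  # 6918
```

Once everything is an integer count of seconds, a year of sessions is a plain sum; formatting back to hours and minutes is a display concern.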

TIMESTAMPS

UTC · ISO 8601

Storage: UTC. Display: user's locale at the moment of display.

Train in Tokyo, fly to LA, train again — the gap between sessions is unambiguous because both timestamps are UTC. The display layer figures out which one is "yesterday."
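
The Tokyo-to-LA case, sketched with Python's standard `datetime` module. The dates are made up, and fixed UTC offsets stand in for real time-zone rules to keep the example dependency-free.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets keep the sketch self-contained; a real app would use full
# time-zone rules (daylight saving and so on).
TOKYO = timezone(timedelta(hours=9))
LOS_ANGELES = timezone(timedelta(hours=-8))

def to_storage(local: datetime) -> str:
    """Normalize an aware local time to a UTC ISO 8601 string."""
    return local.astimezone(timezone.utc).isoformat()

tokyo_session = to_storage(datetime(2024, 1, 10, 19, 0, tzinfo=TOKYO))
la_session = to_storage(datetime(2024, 1, 11, 18, 0, tzinfo=LOS_ANGELES))

gap = datetime.fromisoformat(la_session) - datetime.fromisoformat(tokyo_session)
print(tokyo_session)  # 2024-01-10T10:00:00+00:00
print(gap)            # 1 day, 16:00:00
```

Read off the wall clocks, those two sessions look 23 hours apart. The stored UTC values show the real gap of 40 hours.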

03 — Data integrity

What happens to your data when you're not looking.

Most apps say "your data is safe with us" and then either lose it or hold it hostage. This is what flexRep actually does, in the order it actually happens.

Unique IDs, timestamps, and soft delete on every entity.

Every row carries a globally-unique identifier, creation and update timestamps, and a soft-delete flag. Hard deletion happens only as a background cleanup task, never as a direct result of user action.

WHY Cross-device sync requires identifiers that can't collide. Soft delete preserves undo, restore-from-export, and reconciliation. The cost is negligible. We pay it.

A grace period before deleted data is actually gone.

When you delete a workout or a set, it goes into a soft-deleted state. After a reasonable grace period, a background process purges it for good.

WHY Long enough that an accidental delete is recoverable. Short enough that your phone isn't a forever-graveyard of things you wanted gone.
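
A sketch of soft delete plus a background purge. The 30-day grace period is a placeholder (this page only says "reasonable"), and a `deleted_at` timestamp stands in for the soft-delete flag so the purge knows how old a deletion is. All names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

GRACE_PERIOD = timedelta(days=30)  # hypothetical value

@dataclass
class Row:
    id: str
    deleted_at: Optional[datetime] = None  # None means the row is live

def soft_delete(row: Row, now: datetime) -> None:
    """User-facing delete: mark the row, keep the data."""
    row.deleted_at = now

def restore(row: Row) -> None:
    """Undo, available for as long as the row has not been purged."""
    row.deleted_at = None

def purge(rows: list[Row], now: datetime) -> list[Row]:
    """Background cleanup: keep live rows and rows still inside the grace period."""
    return [r for r in rows
            if r.deleted_at is None or now - r.deleted_at < GRACE_PERIOD]
```

Only `purge` ever drops data, and nothing in the user-facing path calls it.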

Abandoned-workout detection.

A workout left open for an extended period without activity — or with a clear "left the gym" signal — is auto-ended. The session is annotated so analytics can include it, exclude it, or treat it differently.

WHY Real-life sessions get interrupted. Forgetting to tap "End Workout" should not silently inflate your weekly volume. The annotation lets queries make the right call without losing the data.
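
One way this could look, as a sketch: the three-hour idle limit, the field names, and the `"auto_abandoned"` label are all assumptions, and the "left the gym" signal is left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

IDLE_LIMIT = timedelta(hours=3)  # hypothetical threshold

@dataclass
class Session:
    started_at: datetime
    last_activity_at: datetime
    ended_at: Optional[datetime] = None
    end_reason: Optional[str] = None  # e.g. "user" or "auto_abandoned"

def auto_end_if_abandoned(s: Session, now: datetime) -> bool:
    """End an idle open session at its last activity and annotate it."""
    if s.ended_at is None and now - s.last_activity_at >= IDLE_LIMIT:
        s.ended_at = s.last_activity_at  # idle time is not counted as training
        s.end_reason = "auto_abandoned"
        return True
    return False
```

Because the reason is stored on the session, a weekly-volume query can filter on `end_reason` instead of guessing from durations.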

Reversible imports.

Every CSV import — from Strong, from Hevy, from a generic export — tags every row it creates as part of a single batch. Reverting the batch removes only those rows; everything you logged natively is left alone.

WHY Imports are a data-poisoning risk. The batch boundary lets a user say "actually, never mind" without nuking a year of native logs.
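
The batch boundary can be sketched as a single tag on every imported row. The names (`LoggedSet`, `import_batch_id`) are hypothetical, and a plain list stands in for the database.

```python
from dataclasses import dataclass
from typing import Optional
from uuid import uuid4

@dataclass
class LoggedSet:
    weight_kg: float
    reps: int
    import_batch_id: Optional[str] = None  # None means logged natively

def import_rows(log: list[LoggedSet], rows: list[tuple[float, int]]) -> str:
    """Append imported rows, all tagged with one fresh batch id."""
    batch_id = str(uuid4())
    log.extend(LoggedSet(w, r, batch_id) for w, r in rows)
    return batch_id

def revert_batch(log: list[LoggedSet], batch_id: str) -> None:
    """Remove only the rows created by that batch."""
    log[:] = [s for s in log if s.import_batch_id != batch_id]
```

Native rows carry no batch id, so no revert can ever match them.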

No silent save failures.

Every persistence write goes through a wrapper that catches errors and surfaces them as a global alert. We never quietly swallow a save error.

WHY A failed save the user doesn't see is a corrupted log they'll discover six weeks later. The alert is annoying. The bug it prevents is worse.

Scoped queries everywhere.

No view loads an unbounded history. Queries are scoped to a time window or a fetch limit; aggregations across deep history happen off the main thread and cache their results.

WHY A multi-year training history is a lot of rows. Loading all of them at every redraw is how an app gets returned to the store.

Canonical units, converted at the edges.

Weight is stored in kilograms. Distance in meters. Duration in seconds. Conversion to your display preference happens only at the view layer.

WHY Switching between lb and kg shouldn't mutate a single row of your data. Aggregations across mixed-unit rows are a class of bug we refuse to allow.

Categorical data stored as strings, not magic numbers.

Set type, muscle group, equipment, movement pattern — all stored as human-readable string values, not integer codes.

WHY Adding a new category in a future version doesn't break old records. CSV exports stay readable to humans. The "savings" of integer enums never pay back the migration pain.

04 — Cleanliness

A clean library. One bench press.

Or: how we keep one canonical "Bench Press" across hundreds of curated exercises, four typing styles, and one user who insists on logging "Benchpress (Flat)" with no space.

Flat rows, grouped by attribute.

Each fully-qualified exercise is its own row. "Incline Dumbbell Bench Press" is not a child of "Bench Press" — it's its own entry, tagged with shared attributes so analytics still roll up correctly.

Users search and log by fully-qualified name. Nobody types "Bench Press" and navigates "incline → dumbbell." Grouping and substitution happen at the query layer, not the schema.

Normalized aliases.

Every exercise carries a list of aliases. They get lowercased, diacritics stripped, and normalized into a single canonical form. Search matches against the normalized version, so spelling drift doesn't fracture the library.

"Bench Press"	→	one record
"BB Bench"	→	same record
"Flat Bench"	→	same record
"Benchpress"	→	same record
"flat barbell press"	→	same record

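The alias matching above can be sketched with Python's standard `unicodedata` module. Dropping spaces and punctuation is an assumption about what "a single canonical form" means here, and the record id and alias table are hypothetical. This sketch also discards non-Latin letters, which a real library would need to handle.

```python
import re
import unicodedata

def normalize(name: str) -> str:
    """Lowercase, strip diacritics, and keep only ASCII letters and digits."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_marks = "".join(c for c in decomposed if not unicodedata.combining(c))
    return re.sub(r"[^a-z0-9]", "", no_marks.lower())

# Hypothetical library entry: one record id, several aliases.
ALIASES = {
    "bench-press": ["Bench Press", "BB Bench", "Flat Bench", "flat barbell press"],
}
INDEX = {normalize(alias): record_id
         for record_id, aliases in ALIASES.items() for alias in aliases}

def lookup(query: str):
    """Return the record id for a query, or None if nothing matches."""
    return INDEX.get(normalize(query))

print(lookup("Benchpress"))  # bench-press
```

"Benchpress" matches without being listed because it normalizes to the same key as "Bench Press". "BB Bench" only matches because it is in the alias list.
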
Provenance, preserved.

Every exercise in the library carries its origin and the seed version it shipped in. On updates, we match by stable identifier — never by name — and preserve any user edits.

New entries insert cleanly. Removed entries archive instead of deleting. Your logs never lose their target.

Movement-pattern inference.

Curated exercises carry hand-tagged movement patterns. User-created and imported exercises get a pattern inferred from name and primary muscle.

Inference is highly accurate, not perfect. Anywhere it gets a call wrong, you can correct it in the exercise detail screen, and your correction is sticky across seed updates.

05 — What we track

Nine signals. All disclosed.

Each of these is computable from the data you've already logged. None requires an external sensor, a subscription, or a marketing partnership we can't tell you about.

RPE (Rate of Perceived Exertion)

half-step scale

The cleanest available proxy for effort. Half-steps are granular enough to be informative and coarse enough not to fake precision.

RIR (Reps in Reserve)

derived from RPE

The arithmetic dual of RPE. We track both so the user can think in whichever framing they prefer.

Estimated 1RM

computed per set

A back-calculated single-rep max. Lets you compare 225×5 to 245×3 honestly. We use a formula with strong literature support across the working-rep range.
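
For the comparison above, a sketch using the plain Epley formula, e1RM = weight × (1 + reps / 30). flexRep's actual variant and tuning are not disclosed on this page, so treat this as the textbook version only. Returning the bar weight for a single is a common convention, assumed here.

```python
def epley_e1rm(weight: float, reps: int) -> float:
    """Textbook Epley estimate; a single rep is taken as its own max."""
    if reps < 1:
        raise ValueError("reps must be at least 1")
    if reps == 1:
        return weight
    return weight * (30 + reps) / 30  # same as weight * (1 + reps / 30)

print(epley_e1rm(225, 5))  # 262.5
print(epley_e1rm(245, 3))  # 269.5
```

On this formula, 245×3 edges out 225×5 by about seven pounds of estimated max.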

Fractional set volume

multi-axis muscle volume

A bench press contributes to chest, triceps, and front delts — but not in equal measure. We weight primary movers more than secondary movers. Counting everything as a full set implies your triceps grew at the rate of your chest; we trust your triceps to disagree.
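
A sketch of the weighting idea. The 1.0 and 0.5 weights and the muscle map are placeholders chosen for the example; flexRep's real weights are not published.

```python
from collections import defaultdict

PRIMARY_WEIGHT = 1.0    # hypothetical weights, not flexRep's tuning
SECONDARY_WEIGHT = 0.5

# Hypothetical muscle map for two exercises.
MUSCLES = {
    "Bench Press": {"primary": ["chest"], "secondary": ["triceps", "front delts"]},
    "Triceps Pushdown": {"primary": ["triceps"], "secondary": []},
}

def fractional_volume(sets_by_exercise: dict[str, int]) -> dict[str, float]:
    """Sum weekly sets per muscle, crediting secondary movers at a fraction."""
    volume: dict[str, float] = defaultdict(float)
    for exercise, sets in sets_by_exercise.items():
        for muscle in MUSCLES[exercise]["primary"]:
            volume[muscle] += sets * PRIMARY_WEIGHT
        for muscle in MUSCLES[exercise]["secondary"]:
            volume[muscle] += sets * SECONDARY_WEIGHT
    return dict(volume)

week = fractional_volume({"Bench Press": 6, "Triceps Pushdown": 4})
print(week)  # {'chest': 6.0, 'triceps': 7.0, 'front delts': 3.0}
```

Six sets of bench credit the triceps with three sets, not six; the four direct sets bring the weekly total to seven.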

Effective reps

reps performed close to failure

Hypertrophy lives near failure. A set of 20 at RPE 6 is mostly cardio. We don't care about cardio sets when we're asking "did this grow tissue?"
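
One common way to model this, sketched under assumptions: count only the reps that fall within five of failure, with RIR taken as 10 − RPE. The five-rep window is a placeholder, not flexRep's threshold.

```python
import math

EFFECTIVE_WINDOW = 5  # hypothetical: reps within five of failure count

def effective_reps(reps: int, rpe: float) -> int:
    """Reps inside the window before failure, given RIR = 10 - RPE."""
    rir = 10.0 - rpe
    return max(0, min(reps, math.floor(EFFECTIVE_WINDOW - rir)))

print(effective_reps(20, 6.0))  # 1
print(effective_reps(8, 9.0))   # 4
```

Under these assumptions the set of 20 at RPE 6 earns a single effective rep, while a hard set of 8 earns four.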

Movement patterns

a small fixed taxonomy

Push, pull, squat, hinge, carry, lunge, core, rotation. Coarse, durable, and meaningful — patterns survive any individual exercise being renamed or retired.

Tempo (optional)

per-phase seconds

Captured per set for users who care about tempo work. Most lifters never touch it; the ones who do want it logged precisely.

Rest interval

per set, seconds

Captured from the rest timer; editable. Drives the workout rhythm waveform and post-hoc fatigue analyses.

Body weight + bodyfat %

HealthKit-synced

Two-way sync with Apple Health. Body weight matters for relative-strength comparisons. Bodyfat % matters for body-composition trends (and almost nothing else).

06 — What we deliberately don't track

Honesty is also a feature.

The shortest list in the app world. Most workout trackers add metrics that look impressive in screenshots; we'd rather ship nine that mean something than nineteen that don't.

Velocity-based training (VBT)

NOT TRACKED

We have no business between your bar and your barbell collar. Velocity tracking belongs to the hardware that can actually measure it.

Percentage of 1RM

NOT TRACKED

You can compute it yourself from e1RM at any time. Storing it separately means it goes stale the moment you PR — and stale data is worse than no data.

HRV-based "recovery scores"

NOT TRACKED

Heart-rate variability is mostly noise plus your phone's barometer plus how much coffee you had. We could pretend it was a signal. We don't.

Perceived recovery scales (1–10)

NOT TRACKED

A user-reported number with no calibration and no anchor. We'd ship a feature that looks impressive in screenshots and conveys precisely nothing.

Streak length (gamified)

NOT TRACKED

We track consecutive training days as a stat. We refuse to gamify it. No "DON'T BREAK THE CHAIN" notifications. No streak-shame.

Social leaderboards

NOT TRACKED

flexRep is not a social network. Our brand stance on this is firm and not up for debate.

07 — The receipts

What each number leans on.

We won't reprint our exact tunings — those are ours. But we will tell you, for every metric in the app, which body of research it's grounded in and which parts are in-house judgment.

Metric	Approach	Provenance
e1RM	Back-calculated single-rep max from a working set	Epley family · 1985
RIR	Arithmetic dual of RPE	Helms et al., 2018
Stall flag	Moving-window e1RM stability check with an RPE-stability gate	In-house · literature-informed
Strength velocity	Trend over a rolling window of recent e1RMs	In-house · standard time-series shape
Fractional muscle volume	Weighted sum where primary movers count more than secondaries	Schoenfeld 2017 / 2019, adapted
Effective reps	Reps performed close to failure, computed from RPE/RIR	Helms / Israetel framing
Recovery snapshot	Recency × accumulated load, normalized	In-house · ambient indicator only
Gym hue rotation	A deterministic per-gym hue offset	In-house · purely cosmetic
Glyph color	Warmth scales with training frequency	In-house · purely cosmetic

We cite the literature we lean on. The specific tuning choices — windows, weights, thresholds — are part of what flexRep is. If you'd score things differently, your own numbers are one CSV export away.

08 — The deep cuts

Three charts, opened.

The same methodology-card pattern that ships inside the app. Tap to see how to read each chart, what it captures, and what it doesn't.

Fractional muscle volume: how it's calculated

WHAT IT MEASURES

A weighted sum of weekly sets per muscle group, where primary movers count more than secondary movers.

HOW TO READ IT

A radar chart across multiple muscle axes. A balanced lifter trains across the polygon, not at three points of it. The center-of-mass should sit on center.

WHY IT MATTERS

Hypertrophy research consistently shows that secondary mover work contributes — just not at the rate of primary mover work. Counting everything as a full set over-credits compound lifts; ignoring secondaries under-credits them. Weighting is our practical answer to that finding.

Stall detection: how it's calculated

WHAT IT MEASURES

A flag raised when an exercise's estimated 1RM stops progressing over several weeks despite stable effort.

HOW TO READ IT

An orange dot appears on the exercise card. Tap to see the contributing weeks and the effort window the flag is based on.

WHY IT MATTERS

Noticing a plateau weeks late is one of the most common programming errors in self-coaching. We aim to catch it sooner than eyeballing a chart would, but not at any cost: we tolerate a few missed flags in exchange for not crying wolf.
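
A sketch of what a conservative stall check could look like. The window length, minimum gain, and allowed RPE spread are invented placeholders; flexRep's real windows and thresholds are deliberately not published.

```python
def is_stalled(weekly_e1rm: list[float], weekly_rpe: list[float],
               window: int = 4, min_gain: float = 0.01,
               max_rpe_spread: float = 1.0) -> bool:
    """Flag when e1RM is flat over the window while effort stayed stable.

    All three thresholds are hypothetical placeholders.
    """
    if len(weekly_e1rm) < window + 1 or len(weekly_rpe) < window:
        return False  # not enough history: stay quiet
    baseline = weekly_e1rm[-window - 1]
    recent_best = max(weekly_e1rm[-window:])
    flat = recent_best < baseline * (1 + min_gain)
    recent_rpe = weekly_rpe[-window:]
    effort_stable = max(recent_rpe) - min(recent_rpe) <= max_rpe_spread
    return flat and effort_stable
```

Both gates must pass before a flag is raised. Flat numbers during a week of deliberately easy training fail the effort gate and stay unflagged.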

Effective reps: how it's calculated

WHAT IT MEASURES

Reps performed close to failure, where the bulk of the hypertrophy stimulus lives.

HOW TO READ IT

A weekly count of effective reps per muscle group. Compare against accepted hypertrophy landmarks (see the literature; we won't reprint them).

WHY IT MATTERS

Total sets is a blunt instrument — twenty reps at low effort are mostly cardio. Effective reps focus on the work that grows tissue, which is the work most lifters actually care about.

09 — The caveats

What we're not pretending to know.

Every analytic in flexRep has at least one assumption embedded in it. Below: the assumptions we're aware of, severity-rated, with the reasoning we used when we made the call.

MINOR

e1RM is a model, and models are wrong.

Single-rep-max estimation formulas fit the working-rep range well and degrade outside it. We constrain the rep range we'll compute e1RM from. A 1RM extrapolated from a set of fifteen is a fiction we don't want to participate in.

MINOR

RPE is subjective.

It always will be. We could replace it with average bar velocity, but average bar velocity requires hardware most lifters don't own. RPE is the honest middle: subjective, but disclosed.

MEDIUM

Fractional sets are a heuristic, not a measurement.

The literature is clear that secondary mover work contributes, just not at the rate of primary mover work. The exact weighting we use is a practical choice informed by how serious coaches credit accessory work. If you disagree with our weighting, you're disagreeing with an in-house judgment call, not with the research, and the CSV export lets you apply your own.

MEDIUM

Movement-pattern inference is heuristic.

For curated exercises in our library, movement pattern is hand-tagged. For user-created and imported exercises, we infer it from the name and primary muscle. Inference is highly accurate, not perfect, and any correction you make sticks across seed updates.

MINOR

Stall detection is conservative.

We flag a stall only when progress flattens against stable effort over several weeks. We don't flag stalls on missed reps alone. We'd rather miss two stalls than fire a false-positive that nudges you to deload during your best block of the year.

MEDIUM

Your data is one lifter's data.

Every analytic in flexRep computes on the lifter using flexRep. We don't cross-reference your numbers against a normative population. There are good reasons (privacy, statistical validity at small N, generalizability across populations). There are also philosophical ones: your numbers should be measured against your numbers.

MINOR

The strength glyph is decorative.

It is generative art driven by your data. It is not a diagnostic instrument. Do not show your glyph to your doctor.

MEDIUM

On-device AI is conservative.

It has less context than a coach who has watched you train for six months. The insights skew toward observation ("bench e1RM unchanged for several weeks") and away from prescription ("you should deload"). When in doubt, we err on the side of saying less.

MINOR

Imported data is annotated, not laundered.

Sets imported from other apps carry provenance tags. Analytics treat them the same as native data, but the source is preserved, so you can always tell which logs came from where.

10 — Don't trust us, run the numbers

Bring your own spreadsheet.

If you disagree with how we've scored any of this — the e1RM family we chose, our fractional-set weighting, the stall window, the recovery shape — you don't have to take our word for it. CSV export ships every set, every RPE, every timestamp.

Roll your own formulas. Score sets the way you'd score them. Build a longer or shorter stall window. Compare runs in your own notebook. We'd rather you run the math yourself than trust ours blindly.
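
As a starting point for that notebook, here is a sketch that reads a CSV and computes the best textbook Epley e1RM per exercise. The column names and sample rows are invented for the example; check the real export schema before reusing it.

```python
import csv
import io
from collections import defaultdict

# Hypothetical export columns and sample rows, for illustration only.
SAMPLE = """exercise,performed_at,weight_kg,reps,rpe
Bench Press,2024-01-08T18:00:00+00:00,100,5,8
Bench Press,2024-01-08T18:05:00+00:00,105,3,9
Squat,2024-01-09T18:00:00+00:00,140,5,8.5
"""

def best_e1rm(csv_text: str) -> dict[str, float]:
    """Best Epley e1RM per exercise; swap in any formula you prefer."""
    best: dict[str, float] = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        weight, reps = float(row["weight_kg"]), int(row["reps"])
        estimate = weight * (30 + reps) / 30
        best[row["exercise"]] = max(best[row["exercise"]], estimate)
    return dict(best)

print(best_e1rm(SAMPLE))
```

Replacing the `estimate` line is all it takes to score the same sets with a different formula and compare the two runs.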

See the export schema
The full methodology library →

That's the lot.

If you read this far, we built flexRep for you specifically. The brand identity playbook calls you "the muscle-nerd / spreadsheet-refugee audience." We mean it warmly.

Download for iPhone
Or feel the difference first →

Now go log a set.
