Quantified Self · 2016-2026
In February 2016, I opened an app called SleepCycle for the first time. Since then, every night, my phone has lain quietly by my bedside, recording everything about my sleep. Ten years later, these numbers have accumulated into a surprisingly rich life archive. With AI's help, I was finally able to spread them all out and examine them for the first time.
01 / Full Dataset
Let's start with the raw dataset itself. From February 2016 to March 2026, the app recorded a total of 3,656 valid nights.
The value of this chart isn't in providing immediate answers, but in building an overall intuition first: high-quality and low-quality nights aren't randomly scattered, yet they're not as neat as a textbook would suggest. This tells us that sleep isn't a single-variable problem—and it's all waiting to be explored.
Plotting the SleepCycle sleep scores across these ten years reveals a clear arc: hovering around 80% when recording began in 2016, then slowly climbing to a peak in 2020 (annual average 94.4%), followed by a noticeable decline. From 2022 onward, scores remained persistently low, oscillating between 79% and 83% through 2023-2025.
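The yearly arc comes down to a group-by-year mean. A minimal sketch, assuming each night is a (date, quality) pair; the sample rows and the `yearly_mean` helper are illustrative and do not reflect the app's export format.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical sample rows: (night date, quality score in percent).
nights = [
    (date(2016, 3, 1), 78.0),
    (date(2016, 9, 12), 82.0),
    (date(2020, 5, 4), 95.0),
    (date(2020, 11, 20), 93.0),
    (date(2024, 2, 7), 81.0),
]

def yearly_mean(rows):
    """Return {year: mean quality} for (date, quality) rows."""
    by_year = defaultdict(list)
    for night, quality in rows:
        by_year[night.year].append(quality)
    return {year: mean(scores) for year, scores in sorted(by_year.items())}

annual = yearly_mean(nights)
peak_year = max(annual, key=annual.get)  # year with the highest annual average
```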
02 / Duration & Time
First, sleep duration—which doesn't always move in the same direction as sleep quality. 2020's high quality came with the longest duration (annual average of 7.8h), while by 2024-2026, duration had dropped to around 6.5h—near the decade's lowest point. Note: the sleep duration here is the actual sleep time identified by SleepCycle (sleep_h), excluding nighttime wake periods, not the total span from getting into bed to waking up.
The bedtime data tells another story: the "slept after midnight" tag appears 2,305 times, accounting for 63% of valid records, meaning that for nearly two-thirds of the nights in this decade, I technically went to bed on "the next day."
Grouping all nights by bedtime window and examining the average quality for each group yields a remarkably clear pattern: for each hour later, sleep quality drops by approximately 4–6 percentage points. Falling asleep before 23:00 scores 4% above baseline; after 02:00, it's 11% below. This isn't a matter of occasional late nights; it's a systematic circadian shift.
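One practical detail in this grouping is that bedtimes wrap around midnight, so 00:30 has to sort after 23:30. A minimal sketch, assuming bedtimes are "HH:MM" strings; the one-hour window labels, the sample rows, and the helper names are illustrative choices, not the exact bins used for the chart.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: (bedtime "HH:MM", quality in percent).
nights = [
    ("22:30", 88.0), ("23:40", 85.0), ("00:20", 82.0),
    ("01:10", 78.0), ("02:30", 72.0), ("00:45", 80.0),
]

def hours_after_noon(hhmm):
    """Map a clock time to hours since noon so 00:30 sorts after 23:30."""
    h, m = map(int, hhmm.split(":"))
    return ((h - 12) % 24) + m / 60

def bedtime_window(hhmm):
    t = hours_after_noon(hhmm)
    if t < 11:      # before 23:00
        return "before 23:00"
    if t >= 14:     # 02:00 or later
        return "after 02:00"
    start = (int(t) + 12) % 24
    return f"{start:02d}:00-{(start + 1) % 24:02d}:00"

baseline = mean(q for _, q in nights)
groups = defaultdict(list)
for bedtime, quality in nights:
    groups[bedtime_window(bedtime)].append(quality)

# Delta of each window's mean quality against the all-nights baseline.
delta = {w: mean(qs) - baseline for w, qs in groups.items()}
```

Measuring from noon rather than midnight keeps every plausible bedtime on one continuous axis, which is what makes the "per hour later" comparison meaningful.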
The SleepCycle App also calculates a "Regularity" score for each night—how closely your bedtime and wake time align with your own historical rhythm. Its trajectory tells a clear story: holding steady at 90–91% through 2019–2020, then gradually declining as life rhythms became more disrupted, stabilizing at 82–83% after 2023, never returning above 90%.
The sleep duration above is the actual sleep time estimated by the app; but there's an even more objective metric: total Time in Bed—the entire window from getting into bed to getting up. This window sets a ceiling on sleep quality: if the time in bed is itself too short, the body doesn't get a fair chance to sleep well.
One of the most common misconceptions about sleep is: "If I lose two hours tonight, I can make it up by sleeping two hours extra tomorrow." But when we decompose ten years of continuous time-series data to observe how the previous night's sleep deprivation affects the next day's behavior, my body's answer is entirely different.
First, my body doesn't repay debt by "sleeping in," but by "going to bed early." Data shows that if the previous night suffered severe sleep deprivation (less than 6 hours), the next night's absolute Time in Bed remains firmly anchored at the ~8-hour baseline. My body doesn't choose to force 10 hours of sleep; instead, its strategy is: go to bed 2.5 hours (148 minutes) earlier than usual.
However, going to bed much earlier brings an unexpected "Sleep Onset Paradox." Common sense suggests that extreme fatigue should make you fall asleep instantly. In reality, after a night of severe sleep deprivation (<6h), sleep onset latency the next night stretches from the normal 9.9 minutes to 14.5 minutes (about 46% longer). This is likely because extreme sleep deprivation triggers hyperarousal of the nervous system (compensatory cortisol/adrenaline spikes), combined with forcing yourself to lie down over two hours earlier than usual to catch up, which results in tossing and turning instead.
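Both the "going to bed early" finding and the onset-latency finding are lagged comparisons: split nights by whether the previous night was deprived, then compare what happens next. A minimal sketch, assuming the rows are strictly consecutive nights (real data would need a date-gap check); the sample values and column layout are made up for illustration.

```python
from statistics import mean

# Hypothetical consecutive nights, oldest first:
# (sleep hours, time in bed hours, bedtime in hours after noon, onset latency min)
nights = [
    (7.5, 8.0, 13.0, 10.0),
    (5.0, 5.5, 15.0, 9.0),    # deprived night (< 6 h)
    (7.6, 8.1, 10.5, 15.0),   # the night after deprivation
    (7.4, 8.0, 13.0, 10.0),
    (5.5, 6.0, 14.5, 11.0),   # deprived night
    (7.5, 7.9, 10.6, 14.0),   # the night after deprivation
    (7.3, 8.0, 12.8, 9.0),
]

DEPRIVED_BELOW_H = 6.0  # threshold from the text

after_deprived, after_normal = [], []
for prev, nxt in zip(nights, nights[1:]):
    bucket = after_deprived if prev[0] < DEPRIVED_BELOW_H else after_normal
    bucket.append(nxt)

def column_mean(rows, idx):
    return mean(row[idx] for row in rows)

# Negative shift means going to bed earlier after a deprived night.
bedtime_shift_min = 60 * (column_mean(after_deprived, 2) - column_mean(after_normal, 2))
latency_change = column_mean(after_deprived, 3) - column_mean(after_normal, 3)
tib_after_deprived = column_mean(after_deprived, 1)
```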
Furthermore, the data suggests that sleep exhibits powerful momentum. If we examine how two consecutive days' sleep quality affects the third day, we find that "sleep debt" can almost never be perfectly repaid in a single day. Two consecutive nights of poor sleep keep the third day's quality suppressed at 76.7%; conversely, two consecutive nights of high-quality sleep surprisingly propel the third day to peaks above 90%.
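The momentum check can be sketched as a sliding window over three consecutive nights, bucketing the third night by the labels of the first two. The cut-offs for "poor" and "good" below are illustrative assumptions (the text does not state them), as are the sample scores.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical consecutive nightly quality scores (percent), oldest first.
quality = [70, 72, 75, 88, 92, 93, 91, 68, 71, 77, 90, 91, 94]

POOR_BELOW, GOOD_FROM = 75, 90  # illustrative cut-offs, not from the text

def label(q):
    if q < POOR_BELOW:
        return "poor"
    if q >= GOOD_FROM:
        return "good"
    return "mid"

# Key: labels of nights 1 and 2; value: quality scores of night 3.
third_night = defaultdict(list)
for a, b, c in zip(quality, quality[1:], quality[2:]):
    third_night[(label(a), label(b))].append(c)

after_two_poor = mean(third_night[("poor", "poor")])
after_two_good = mean(third_night[("good", "good")])
```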
These findings may reveal a cold biological reality: the sleep system is a homeostatic system that intensely dislikes variance. Attempting to repay weekday sleep debt by "sleeping 12 hours on weekends" doesn't work physiologically. The only effective way to repay debt isn't extending single sleep sessions, but honestly returning to baseline—maintaining stable bedtimes across several consecutive days, waiting for the nervous system's hyperarousal to gradually subside.
03 / Behavior & Sleep Quality
I added a set of custom tags in SleepCycle to note what happened before bed and after waking. Over ten years, these tags have built up a behavioral database that makes more interesting analysis possible.
The chart below shows the quality delta for 23 behavior tags relative to the overall baseline (all nights average 83.3%). Note again: this is correlation, not causation.
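A per-tag delta of this kind is the mean quality of nights carrying the tag minus the all-nights baseline. A minimal sketch with made-up rows; the `min_nights` cut-off is an added assumption to keep rarely used tags from producing noisy deltas.

```python
from statistics import mean

# Hypothetical rows: (quality in percent, set of tags logged that night).
nights = [
    (90.0, {"worked out"}),
    (70.0, {"travel", "tired"}),
    (85.0, set()),
    (76.0, {"travel"}),
    (88.0, {"worked out", "tired"}),
    (83.0, {"tired"}),
]

baseline = mean(q for q, _ in nights)

def tag_deltas(rows, min_nights=2):
    """Mean quality of tagged nights minus the all-nights baseline."""
    tags = set().union(*(t for _, t in rows))
    out = {}
    for tag in tags:
        scores = [q for q, t in rows if tag in t]
        if len(scores) >= min_nights:   # skip tags with too few nights
            out[tag] = mean(scores) - baseline
    # Most negative tags first, like a sorted bar chart.
    return dict(sorted(out.items(), key=lambda kv: kv[1]))

deltas = tag_deltas(nights)
```

Because tags overlap on the same nights (a travel night can also be a tired night), these deltas describe co-occurrence with worse or better sleep rather than the isolated effect of any one tag.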
04 / Event Analysis
The previous section was a broad comparison: which events overall hurt sleep more, which were relatively friendlier. This section pulls out a few events worth examining individually— looking at how their frequency changes over time, and in what periods they became significant.
Travel nights account for just 3.7% (135 nights) of all records, but the damage is remarkably consistent: average quality 73.3%, a full 10.4 points below the everyday baseline. Among all high-frequency behavior tags, travel is the most predictable negative factor: almost every trip brings a quality drop, underscoring how much a stable sleep environment matters. But the data reveals something counterintuitive: the real culprit isn't travel itself, it's the behavior patterns that come with it.
Recovery after travel isn't gradual — it's a sharp homecoming rebound: quality hits 87.5 on night 2 back home, above the everyday baseline (83.7). On trip length: the relationship is simply linear — longer trips mean worse sleep every night (1-night: 73.5 → 4–7 nights: 70.4). There's no adaptation sweet spot; each extra night costs something. One more finding: pre-travel nights show no anxiety signal — quality is only 1.7 points below baseline, which suggests travel itself isn't anxiety-inducing.
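Trip length and the homecoming rebound both fall out of grouping consecutive nights into runs of travel and home nights. A minimal sketch using `itertools.groupby`, assuming consecutive nights with a boolean travel flag; the sample values are invented.

```python
from itertools import groupby
from statistics import mean

# Hypothetical consecutive nights, oldest first: (is_travel, quality in percent).
nights = [
    (False, 84), (True, 74), (False, 82), (False, 88),
    (True, 72), (True, 70), (True, 69), (True, 71),
    (False, 80), (False, 87), (False, 84),
]

trips = []          # (trip length in nights, mean quality during the trip)
home_after = {}     # nights since return -> list of quality scores
pos = 0
for is_travel, run in groupby(nights, key=lambda n: n[0]):
    scores = [q for _, q in run]
    if is_travel:
        trips.append((len(scores), mean(scores)))
    elif pos > 0:   # a home run that follows a trip
        for offset, q in enumerate(scores, start=1):
            home_after.setdefault(offset, []).append(q)
    pos += len(scores)

# Mean quality on night 1, 2, 3... after returning home.
rebound = {k: mean(v) for k, v in sorted(home_after.items())}
```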
The break tag I logged in SleepCycle doesn't refer to ordinary brief night awakenings, but rather nights where I woke up from sleep and then fell back asleep.
It accounts for only a minority of all records, but when it appears, it usually means something went wrong with that night's sleep: lower average quality, longer awake time, and worse sleep efficiency.
What's more interesting is that break isn't just a "bad night label". Some break nights are just ordinary but interrupted poor nights; others feel like experience continuing to overflow into the night after the day has ended— especially during long trips, holidays, writing sessions, or when emotions are still echoing.
The chart above is the reason I pulled break out into its own section.
During exploratory analysis, AI helped me pick out a handful of unusual nights with this tag—and the most interesting among them were two post-hike return nights.
The same exhaustion, the same "falling back into bed", yet the body gave two entirely different answers.
On the night I came back from Haba Snow Mountain, my body returned, but my mind hadn't. Emotions, writing, and rewatching the day's footage stretched the awake time further and further (see My Haba Snow Mountain Journey). On the night I came back from Yubeng—another echo after a long trek—sleep was barely disturbed at all. The same kind of "post-return echo night" can end in completely different ways— what really hurts sleep is never the hiking itself, but whether the mind is willing to let the day end.
05 / Fatigue & Exercise
Almost every piece of sleep hygiene advice repeats the same line: "Move more during the day, tire yourself out, and you'll sleep well at night." But when you put ten years of step counts, workout tags, and sleep metrics side by side, the result runs the other way—the more I walked, the worse I slept.
On ordinary days without any specific workout tag, days with more than 12,000 steps didn't push sleep quality up; if anything, they pulled it down to 81.4% (vs. 84.6% on days under 4,000 steps). The clearer change shows up in the breathing metrics: on high-step days, average snoring stretched to 35 minutes, and the breathing disruption index (BD) rose to 8.6 events/hour.
The other side of the coin, though, is just as clear.
When you break the high-step days (>10,000) apart, fatigue turns out to come from three distinct sources: intense hiking (Worked out), travel-related rushing around (Travel), and purposeless high steps—shopping, running errands, daily commuting.
All three hurt sleep, and travel rushing is the worst of the three (quality as low as 73.3%, BD around 8.6).
But once you switch to very low-step days (<5,000) combined with Worked out—"targeted exercise" like swimming or pure strength training—the direction reverses entirely.
To rule out a statistical artifact, I ran a stricter comparison: exclude every night tagged with sickness, alcohol, caffeine, or late bedtime, then compare each exercise type's sleep performance against the "local baseline" of where it actually happened. The conclusion didn't water down—on this cleaner baseline, targeted exercise still lifted sleep quality by 2.7%, while hiking and travel rushing took off 2.9% and 11.2% respectively. Put another way: unstructured physical depletion really does damage sleep architecture, and what actually raises the sleep floor is the kind of movement that mostly asks the heart, lungs, and muscles to work without grinding down the joints.
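The stricter comparison can be sketched in two steps: drop confounded nights, then measure each exercise night against a baseline from the same context. This sketch assumes "local baseline" means the mean of clean, non-exercise nights in the same location phase, which is one reading of the text; the phase labels and rows are hypothetical.

```python
from statistics import mean

CONFOUNDERS = {"sick", "alcohol", "caffeine", "late bedtime"}

# Hypothetical rows: (location phase, exercise type or None, quality, tags).
nights = [
    ("A", None,       84.0, set()),
    ("A", None,       82.0, set()),
    ("A", "targeted", 86.0, set()),
    ("A", "targeted", 70.0, {"alcohol"}),   # dropped: confounded night
    ("B", None,       78.0, set()),
    ("B", None,       80.0, set()),
    ("B", "hiking",   76.0, set()),
]

# Step 1: exclude every night carrying a confounding tag.
clean = [n for n in nights if not (n[3] & CONFOUNDERS)]

# Step 2: local baseline = mean quality of clean, non-exercise nights per phase.
local = {}
for phase in {n[0] for n in clean}:
    local[phase] = mean(q for p, ex, q, _ in clean if p == phase and ex is None)

def lift(exercise):
    """Mean of (quality - local baseline) over clean nights with this exercise."""
    return mean(q - local[p] for p, ex, q, _ in clean if ex == exercise)
```

Comparing against a local baseline matters because exercise types are not spread evenly across phases: if hiking mostly happened during a period with a lower sleep floor, a global baseline would overstate its cost.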
06 / Long-term Health Battles
Health issues are the biggest killer of sleep quality.
SleepCycle records nighttime sound and breathing-related signals. Looking at these lines together tells more than staring at sleep quality scores alone.
From 2016-2019, snoring barely existed; after the severe flu B during Spring Festival in late January 2020, things changed. Snoring started appearing, and never truly went back: first sporadically, then most nights, finally almost every night. By 2025, the nightly average approached 70 minutes.
After late 2022, SleepCycle started providing breathing disruption data; after 2023, the cough metric also gradually became available.
Viewed this way, later respiratory problems weren't a single metric deteriorating, but a whole set of burdens rising in sync.
Especially in August-October 2024, cough, the sick tag, snoring, and breathing disruptions together formed a very clear respiratory episode.
This suggests the later decline wasn't just snoring or breathing disruptions worsening in isolation, but rather a deeper, system-wide increase in burden across the entire respiratory system.
No sooner had one wave subsided than another rose. The digestive and respiratory systems took turns becoming the main battlefield in my body.
From 2016 to 2018, one in four nights carried a gastrointestinal discomfort tag. In late 2018, I underwent quadruple therapy (esomeprazole + clarithromycin + amoxicillin + colloidal pectin bismuth), which completely eradicated the H. pylori (HP) infection. After a recovery period, by 2019, gut issues had largely subsided. Meanwhile, snoring incidence quietly started rising from under 1%.
After the January 2020 flu B infection, the two curves crossed: gastrointestinal discomfort nearly hit zero, while snoring kept climbing.
On April 15, 2025, after snoring and breathing disruptions kept worsening—peaking at 108 minutes/night, 12.84 events/hour— I began self-treating rhinitis: Flixonase (fluticasone propionate) nasal spray, later switched to budesonide, combined with saline nasal irrigation. On July 25, 2025, an MRI confirmed sinusitis and cysts—the first clear structural diagnosis after years of respiratory decline. The upper airway was partially blocked by nasal inflammation; breathing was already labored, and even more so during sleep.
The effects weren't obvious at first, but by December 2025, airflow sensation started noticeably improving— not because instruments told me so, but because my body said so.
The data confirmed this perception: in April 2025, average snoring peaked at 108 minutes/night, with breathing disruptions at 12.84 events/hour; after treatment stabilized (December 2025–March 2026), snoring dropped to around 55 minutes, and breathing disruptions fell to around 6.9 events/hour (four-month average). Compared to the spring 2025 peak, both metrics nearly halved.
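The "nearly halved" claim is a simple relative-drop calculation on the figures quoted above, which works out to roughly 49% for snoring and 46% for breathing disruptions. The `pct_drop` helper name is illustrative.

```python
def pct_drop(before, after):
    """Relative drop from `before` to `after`, in percent."""
    return 100 * (before - after) / before

snoring_drop = pct_drop(108, 55)      # minutes per night, peak vs. post-treatment
bd_drop = pct_drop(12.84, 6.9)        # events per hour, peak vs. post-treatment
```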
SleepCycle's quality score—as always—barely moved.
07 / Environment & Triggers
SleepCycle occasionally records the night's weather and room temperature. Over ten years, 755 nights have valid room temperature readings, about 21% of the dataset. Lining these nights up against respiratory burden reveals a very physical pattern: cold air is a direct amplifier for a vulnerable airway.
The lower the temperature, the longer the snoring and the more frequent the breathing disruptions (BD). When room temperature sat in the 10–18°C range (on the cool side), average snoring hit 38.4 minutes and BD approached 7.7 events/hour; once temperature rose above 24°C, the two numbers eased back to 21.8 minutes and 5.4 events/hour.
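The temperature comparison is a binned average over the nights that have a valid reading. A minimal sketch: the 10–18°C and above-24°C bins follow the text, while the middle bin, the 40°C upper cap, and the sample rows are assumptions added to make the example complete.

```python
from statistics import mean

# Hypothetical rows: (room temperature in °C, snoring minutes, BD events/hour).
nights = [
    (15.0, 40.0, 8.0), (17.5, 36.0, 7.4),
    (21.0, 30.0, 6.5), (23.0, 28.0, 6.1),
    (25.0, 22.0, 5.5), (27.0, 21.0, 5.3),
]

# (lower bound inclusive, upper bound exclusive, label)
BINS = [(10, 18, "10-18°C"), (18, 24, "18-24°C"), (24, 40, "above 24°C")]

def temp_bin(t):
    for lo, hi, name in BINS:
        if lo <= t < hi:
            return name
    return None  # outside every bin

# label -> (mean snoring minutes, mean BD events/hour)
summary = {}
for _, _, name in BINS:
    rows = [n for n in nights if temp_bin(n[0]) == name]
    if rows:
        summary[name] = (mean(r[1] for r in rows), mean(r[2] for r in rows))
```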
If cold air is the punishment, another kind of weather feels closer to shelter: rainy nights.
Across the 170 nights logged as "rainy" or "showers," snoring fell noticeably—from 32 minutes on sunny/cloudy nights down to 19.4 minutes. The subjective side echoed the same shape: the share of mornings waking up feeling "Bad" dropped from 46% to 38.2%.
Stretching the idea of "environment" from weather to place of residence (Location Phase), these ten years form a small-scale sleep history of living spaces in Hong Kong: different physical spaces shape different sleep baselines.
As I moved from Mianyang through various Hong Kong districts, average sleep quality and actual sleep duration both show a clear structural decline. The Pok Fu Lam period stands out—subjectively logged noise disturbance (Noisy) approached 40%, the peak of the decade. By the Wan Chai and Mid-Levels² phases, subjective noise appears to have receded, but that's a partial illusion: during both phases, I drastically increased earplug usage (dashed line), pushing the subjective noise reading down by brute physical means.
So how did I actually fight back against bad acoustic environments? In the last three years, SleepCycle introduced objective "Ambient Noise" decibel monitoring. Crossing those physical decibel readings with "wore earplugs" behavior reveals a clear positive correlation: earplugs weren't a random bedtime ritual, but a direct response to a genuinely bad physical environment.
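One way to quantify that relationship is a Pearson correlation between the decibel reading and a 0/1 earplug flag, which is the point-biserial correlation. This is a sketch of one possible method, not necessarily the one behind the chart; the rows are invented and `pearson` is written out by hand to stay within the standard library.

```python
from math import sqrt
from statistics import mean

# Hypothetical rows: (ambient noise in dB, wore earplugs that night).
nights = [(32.0, False), (35.0, False), (41.0, True),
          (38.0, False), (45.0, True), (47.0, True)]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))

db = [d for d, _ in nights]
plugs = [1.0 if p else 0.0 for _, p in nights]
r = pearson(db, plugs)   # with a 0/1 variable this is the point-biserial r
```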
08 / Mood Records
SleepCycle asks me to report my state upon waking: Good / OK / Bad / Not set. Over these ten years, Bad accounts for 38% of all valid records: on nearly four out of ten mornings, the first feeling upon waking is "not good." Good is only 3%, extremely rare. This isn't just because there are many bad mornings, but also because "good" here is inherently a high-threshold judgment: in the ten calendar years from 2016 to 2025, Good never exceeded 4.1%, and so far in 2026 it's down to just 1.3%. If snoring and breathing disruptions are the body's nighttime signals, then mood tags are the direct testimony after waking.
But SleepCycle's mood recording only happens after waking, so it's more like an evaluation of that night's sleep quality than a record of the actual emotional state before falling asleep. Two other tags in the data fill this gap: pre-sleep negative mood (sad) and Stressful day. Behavior tag analysis shows that "pre-sleep negative mood" drags sleep quality down by 4.6%, one of the strongest negative signals among all tags, deeper than both "sick" (−2.0%) and "Stressful day" (−1.8%). These two dimensions, emotional state before sleep and subjective evaluation after waking, together form an emotional facet that the algorithm cannot see at all.
From 2016 to 2019, the Bad rate was still fluctuating between 25% and 35%. In 2021 it suddenly jumped to 48%, and it has never returned below 40% since: 48% in 2024 and 54% so far in 2026, the highest in ten years. This jump happened even a year before COVID, suggesting that certain cracks appeared in subjective feelings before physical indicators completely deteriorated.
This is also why the "adjusted score" later puts mood at the core: it's not a footnote; it's actually the most direct feedback on how the night went. Sensors record sleep structure, while mood tags record the feeling of being alive; the latter is more subjective, but often more honest.
If sleep quality scores could truly predict how you feel upon waking, then high-quality nights should correspond to very few Bad moods. The actual situation in the data: on nights scoring 90 or above, 21% of mornings are still Bad, while Good is only 6%. The higher the score, the lower the Bad proportion, but it never drops to 0. This chart shows the persistent gap between quality scores and subjective feelings.
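The gap can be measured as the share of Bad mornings within each quality band. A minimal sketch; the band edges other than 90+ and the sample rows are illustrative assumptions.

```python
from collections import Counter, defaultdict

# Hypothetical rows: (quality in percent, wake-up mood).
nights = [
    (95, "OK"), (92, "Bad"), (91, "Good"), (93, "OK"),
    (85, "Bad"), (82, "OK"), (88, "OK"),
    (74, "Bad"), (71, "Bad"), (78, "OK"),
]

def band(q):
    if q >= 90:
        return "90+"
    if q >= 80:
        return "80-89"
    return "<80"

# band -> Counter of moods
counts = defaultdict(Counter)
for q, mood in nights:
    counts[band(q)][mood] += 1

def share(band_name, mood):
    """Fraction of nights in a band that ended with the given mood."""
    total = sum(counts[band_name].values())
    return counts[band_name][mood] / total

bad_by_band = {b: share(b, "Bad") for b in ("<80", "80-89", "90+")}
```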
09 / Two Measurements
SleepCycle's quality score is an opaque black box—it estimates sleep stages from the phone's microphone and accelerometer, and spits out a single 0–100% number. But what, really, is that number measuring?
To probe it, I built two correction curves on top of the raw score. The first is the mood-adjusted score: still anchored to SleepCycle's raw score, but directly folding wake-up mood (Good / OK / Bad) into the adjustment. The second is the full-adjusted score: on top of the mood adjustment, it further corrects downward using respiratory-related burden (snoring, breathing disruptions).
Because respiratory-burden data only became gradually available in later app versions, the full-adjusted score is only computed from December 2022 onward. The three lines in the chart correspond to three different layers of understanding: the algorithm itself, the subjective experience, and the reality after respiratory issues are accounted for. Notably, this gap didn't open all at once: it first shows up as a mood divergence, and is then stretched further by respiratory burden. What the three lines really answer isn't "which score looks prettier," but: under different rulers, how differently can the same night look?
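The text does not give the actual adjustment formulas, so the following is only a sketch of how such layered corrections could be structured. Every constant here (the mood offsets and the respiratory penalty weights) is a made-up placeholder, and the function names are hypothetical.

```python
# All constants below are illustrative placeholders, not the author's values.
MOOD_OFFSET = {"Good": 5.0, "OK": 0.0, "Bad": -10.0}
SNORE_PENALTY_PER_MIN = 0.05     # points per minute of snoring
BD_PENALTY_PER_EVENT = 0.5       # points per breathing disruption per hour

def clamp(x, lo=0.0, hi=100.0):
    return max(lo, min(hi, x))

def mood_adjusted(raw, mood):
    """Raw score shifted by wake-up mood; unset mood leaves it unchanged."""
    return clamp(raw + MOOD_OFFSET.get(mood, 0.0))

def full_adjusted(raw, mood, snore_min, bd_per_hour):
    """Mood-adjusted score with a further downward respiratory correction."""
    penalty = SNORE_PENALTY_PER_MIN * snore_min + BD_PENALTY_PER_EVENT * bd_per_hour
    return clamp(mood_adjusted(raw, mood) - penalty)

# One night seen through three rulers: raw, mood-adjusted, full-adjusted.
raw = 85.0
layers = (raw, mood_adjusted(raw, "Bad"), full_adjusted(raw, "Bad", 60.0, 8.0))
```

Keeping the full-adjusted score as a strict add-on to the mood-adjusted one mirrors the layering described above: the respiratory correction can only pull the score down from wherever mood has already placed it.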
10 / What the Machine Learned
Everything so far has looked at sleep one piece at a time: bedtime here, a tag there, a rough night over there. In this section I let those pieces compete at once. I trained a model on my more recent nights and asked a more honest question: when bedtime, tags, recent history, and context are all in the room together, which ones still carry weight?¹
The answer is both reassuring and a little annoying. Bedtime still runs the show. Night interruptions and late caffeine are real costs. And several tags that looked terrifying in the earlier charts — especially tired and after midnight — lose much of their force once bedtime itself is in the model. The machine didn't uncover some hidden law of sleep. It did something more useful: it helped me separate the cause from the traces it leaves behind.
[Chart panels: Sleep quality · Wake-up mood]
Technical notes
1 This section mainly uses the more recent years and only variables that are known before sleep. The point is not to maximize fit, but to see which factors still retain explanatory weight when the question stays actionable.
2 The relative weights here are not simple one-variable correlations. They reflect how much signal a factor still carries after it has to share space with the others in the same model.
3 The counterfactual charts are not causal proof. They are model-based one-variable simulations: keep the surrounding context as intact as possible, change one factor, and see which way the prediction moves and by roughly how much.
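The one-variable simulation described in note 3 can be sketched as: copy a night's features, change a single factor, and compare predictions. The trained model is not described in the text, so `predict` below is a stand-in linear function with placeholder coefficients, and the feature names are hypothetical.

```python
# Stand-in model: any callable mapping a feature dict to a predicted score.
# The linear form and coefficients are placeholders, not the trained model.
def predict(night):
    return (
        95.0
        - 5.0 * night["bedtime_h_after_22"]
        - 3.0 * night["late_caffeine"]
        - 1.0 * night["tired"]
    )

def counterfactual(model, night, factor, new_value):
    """Change one factor, keep the rest, return (new prediction, delta)."""
    altered = {**night, factor: new_value}   # copy, so the original is untouched
    base, moved = model(night), model(altered)
    return moved, moved - base

# Same night, but with bedtime moved from 01:00 to 23:00.
night = {"bedtime_h_after_22": 3.0, "late_caffeine": 1, "tired": 1}
pred, delta = counterfactual(predict, night, "bedtime_h_after_22", 1.0)
```

The delta shows the direction and rough size of the model's response, which is all the counterfactual charts claim; it says nothing about what would happen if the factor were changed in real life.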
11 / Afterword
This is an extremely personal sleep analysis, but the questions underneath it probably aren't unique: sleep is multidimensional, the environment matters more than most people think, and the score an algorithm gives you and the morning you actually walk into are not always the same thing.
The model in the last section did not reveal some hidden law of sleep. It did something more honest: it ranked what still survives when everything is tested at once. Bedtime came first. Not exercise, not tags, not the little rituals we like to build stories around. When you fall asleep still outweighs almost everything else, and its damage lands harder on mood than on the score the app shows you.
A few things became harder to ignore after laying all of this out.
Exercise helps, but not all exercise is equal. The kind that grinds down joints and muscles often backfires; the kind that genuinely works the heart and lungs without that wear and tear is where the numbers actually move.
“I'll sleep more tomorrow” does not really hold up. The body seems to repay debt by going to bed earlier, not by stretching a single night. Restoring rhythm matters more than adding volume.
If you have chronic respiratory issues, cold air and dryness are real amplifiers. Humid nights are not just more pleasant; they are often materially easier on a fragile airway.
And if you often feel the score looks fine but something is off, trust the feeling. The algorithm measures movement and sound. It does not measure the morning you have to live through.
The data's real purpose was never to confirm that everything is okay. It was to turn a vague unease into something you can point at, examine, and maybe begin to change.