The appeal of fitness-tracking smartwatches is that they have all the answers. They turn our squishy bodies’ inscrutable secrets into hard numbers we can plainly read and analyze. But we would be fooling ourselves if we believed that our smartwatches always tell the truth. According to a new scientific analysis, not only do wearables often get things wrong, it may not be possible to ever really know how accurate they are.
This isn’t going to be shocking news to longtime Lifehacker readers. We’ve discussed the fact that some smartwatch metrics are more reliable than others, and that calorie burn is one of the less accurate ones. On the other hand, heart rate variability shows different raw numbers from one device to another, but the major recovery-focused devices all manage to capture the same rough trend—if you trust my homebrew study with a sample size of one.
So what do we know about the accuracy of the smartwatches on the market, and why is it so hard to answer that question? That’s what the recent analysis, from a group of sports scientists and data scientists in Ireland, set out to determine. It’s an umbrella review, a study of studies of studies, that aimed to collect all the relevant published data on consumer wearables. Here’s some of what they learned.
Studies are out of date as soon as they’re published
You would think somebody at Apple or Garmin or Fitbit would do extensive studies of their technology before releasing it to the public. And they probably do, internally, but their goal is launching and selling a product—not validating the accuracy of their product relative to others.
So the studies we have are generally done by scientists, and they begin after the wearables hit the market. It usually takes at least two years to conduct a study on a brand-new smartwatch and get it published. By then, that smartwatch isn’t so brand-new anymore.
This new analysis, published in July of 2024, used the most current meta-analyses available, which in turn used the most recent studies they had available. And which models of fitness watches did those include? I looked through the supplemental tables for the newest models of each major brand. They included:
- Fitbit’s Charge 4 (the Charge 6 launched last year)
- Apple Watch Series 6 (the most recent is Series 9, again launched last year alongside the Ultra 2)
- Garmin’s Fenix 5 (the Fenix 8 just came out)
- Garmin’s Forerunner 245 (still popular, to be fair, but the 255 and 265 have been launched since then; the 265 is a year and a half old already)
- Oura’s generation 2 ring (it’s up to gen3 now)
- Whoop 3.0 (the current model is 4.0)
So if you want to know how the Apple Watch Ultra 2 compares to the Charge 6, or the Forerunner 265, or Whoop 4.0, you’ll have to wait a few more years—and by that point, everything will have gone up another version number or two.
Accuracy studies aren’t done in consistent ways
The studies are also so varied that it’s hard to compare them to each other, even if you’re interested in learning about older models of devices. The umbrella review found that in most of the studies it covered, devices underestimated heart rate and overestimated sleep time, for example, but the authors concluded that they can’t really say wearables in general underestimate or overestimate these things. The studies were too different from each other, each evaluating a handful of devices and rarely using the same gold-standard measurements to compare them against.
“This umbrella review reveals the intricate variability across devices, outcomes, user contexts and reference standards,” the authors wrote, “making a definitive assessment of wearables’ accuracy challenging.” In other words: we don’t have the data to answer the questions you ask when you go shopping for a new device.
Which metrics fared the best and worst?
Still, if we can take the results with some huge grains of salt, I think it’s worth looking at what the umbrella review found. These are some commonalities, although we definitely can’t say they’re universally true:
- Heart rate was usually correct to within +/- 3% of the true value. That’s not bad, but still, a window of 6% is kind of a lot when you may be trying to keep your heart rate within a 10-point zone.
- Heart rate variability was “very good to excellent” when readings were taken at rest, but accuracy dropped when readings were taken in motion.
- Energy expenditure (calorie burn) wasn’t great, which we already knew. Sometimes devices underestimated by 21%, sometimes they overestimated by 14%.
- Step counts were also pretty variable, ranging from 9% less than the actual number, to 12% more.
- Sleep duration was usually overestimated, and sleep latency (how long it takes you to fall asleep) was usually underestimated.
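To get a feel for what those percentages mean in everyday units, here is a minimal Python sketch. The error percentages come from the review figures above; the "true" values (150 bpm, 500 calories, 10,000 steps) are hypothetical numbers picked only for illustration, and `reported_range` is a made-up helper name, not anything from the study.

```python
def reported_range(true_value, under_pct, over_pct):
    """Return the (low, high) readings a device might show for a true value.

    under_pct and over_pct are the worst-case underestimate and
    overestimate, expressed as percentages of the true value.
    """
    low = true_value * (1 - under_pct / 100)
    high = true_value * (1 + over_pct / 100)
    return low, high


# Error percentages are from the umbrella review as summarized above.
# The true values are hypothetical, chosen only to make the math concrete.
examples = {
    "heart rate (bpm)": (150, 3, 3),
    "calorie burn (kcal)": (500, 21, 14),
    "steps": (10_000, 9, 12),
}

for metric, (true_value, under_pct, over_pct) in examples.items():
    low, high = reported_range(true_value, under_pct, over_pct)
    print(f"{metric}: true {true_value}, device may show {low:.1f} to {high:.1f}")
```

At a true heart rate of 150, a 3% error in either direction spans roughly 145.5 to 154.5, a 9-beat spread that nearly fills a 10-beat zone on its own.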
It’s more important to ask if something is useful than if it’s accurate
I honestly don’t judge wearables on whether they’re accurate, only on whether they’re useful. You may remember from my comparison of Whoop, Garmin, and Oura that each of the three devices reported different raw numbers for resting heart rate and heart rate variability, but they were all able to track the same trend, giving me arguably useful information about when my body was well-rested and well-recovered, versus when it wasn’t.
That eye toward usefulness is why I try to steer people away from paying attention to their calorie burn. If you truly want to know how many calories to eat to maintain your weight, you’re best off tracking how many calories you eat alongside tracking your weight. Similarly, instead of blindly following a watch’s estimate of whether you’re in zone 2 when you exercise, you can use other cues like your breathing and your internal monologue (“oh god when will this be over?”) to tell how hard you’re working.
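If you want to try the calories-plus-weight approach, the arithmetic could look something like this rough Python sketch. It leans on an assumption that isn’t from the review or from this article: the common rule of thumb that a pound of body weight corresponds to about 3,500 calories. The function name and the sample numbers are hypothetical, and real weight swings from water and food will make any short-term estimate noisy.

```python
# Assumption: a widely used rule of thumb, not a figure from the review.
KCAL_PER_POUND = 3500


def estimate_maintenance(daily_calories, start_weight_lb, end_weight_lb):
    """Estimate daily maintenance calories from logged intake and weight change.

    daily_calories: list of calories eaten, one entry per day.
    start_weight_lb / end_weight_lb: body weight at the start and end
    of the logged period, in pounds.
    """
    days = len(daily_calories)
    if days == 0:
        raise ValueError("need at least one day of intake data")

    average_intake = sum(daily_calories) / days
    weight_change = end_weight_lb - start_weight_lb

    # Gaining weight implies a daily surplus; losing implies a deficit.
    daily_surplus = weight_change * KCAL_PER_POUND / days
    return average_intake - daily_surplus


# Hypothetical log: 2,200 calories a day for four weeks, down one pound.
print(estimate_maintenance([2200] * 28, 160, 159))
```

The longer you log, the less a single salty dinner or missed weigh-in throws off the result, which is why a few weeks of data is more useful than a few days.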
Even though we can’t verify the accuracy of every tracker, I do know that accuracy is important to most people who go shopping for smartwatches and fitness trackers, so I’ll continue covering it where appropriate. A GPS-enabled watch should show you the street you’re actually running on, and a heart rate sensor shouldn’t confuse your running cadence with your heart rate. But the most important question to ask about a wearable is not whether its metrics are accurate, but whether they are useful even knowing that they may be inaccurate.