How Wirecutter Evaluates Sound in Our Headphone and Speaker Reviews

People can usually tell in a few seconds if they like a glass of wine, an article of clothing, or the sound of a speaker. But understanding exactly why people like or dislike these things can take years of study. For Wirecutter’s audio writers, researching the “why” is part of the job. Some readers may just want to know what the best pair of speakers or headphones is, and as a result they may be reluctant to take deep dives into topics like decibels, dynamics, and dispersion. But if you’re as interested in the “why” as we are, keep reading.

This is the first in a two-part series we’ve been wanting to do for a long time, in which we discuss how we evaluate sound in our speaker and headphone reviews. Here, in part one, we’ll define good sound and explain the different sonic elements we listen for. In part two, we’ll explain how we put sound quality to the test—and how you can, too.

Audio evaluation is complicated because it’s not just about the sound picked up by human ears. It’s about how the brain processes and interprets those sounds. Although there’s more than a century of science behind today’s audio devices, people still judge them mostly by listening, which is influenced by their hearing, their surroundings, and even their moods. And people often judge the quality of audio devices using music, which is itself the product of emotion rather than science.

That said, there are still specific (and measurable) traits that audio reviewers use and discuss when evaluating the performance of a speaker or a pair of headphones. Knowing what those traits are, and what your own preferences may be for each one, will help you pick the best audio device for you.

What is good sound?

From the simplest standpoint, good sound is whatever pleases the listener, just as any food that tastes good is, on some level, good food. But to audio professionals, good sound means accurate, natural reproduction of instruments and voices. Some listeners may wish to alter the sound to their liking—for example, boosting the bass to add more excitement—but if you don’t start out with a speaker or other audio device that can deliver reasonably accurate reproduction, you have no way to know what the artists and engineers intended when they made the recording.

To that end, an audio system should change the sound as little as possible from the original recording. It shouldn’t boost or muffle certain parts of the audio range relative to others. If, for example, Ariana Grande’s voice suddenly takes on a little of the bassy, resonant tone of Johnny Cash, something’s wrong. Likewise, if you play a tune by Lil Baby, and the bass doesn’t get your car seat vibrating, that’s wrong, too. The same is true if Ed Sheeran’s acoustic guitar sounds dull, as if it were stuffed with wet rags. You get the idea.

Another hindrance to good sound is when an audio system produces distortion, which is usually defined as adding extra tones that aren’t in the original recording. Distortion can make even a smooth-voiced singer like Karen Carpenter or Solange sound rough or cause a Stradivarius violin to sound like a screechy old fiddle. Distortion can also reduce the clarity of a recording. Metal bands deliberately add distortion to instruments and vocals to create a greater sense of tension and power in the music—and that kind of thing is great for Avenged Sevenfold but might not set the right mood for a Joni Mitchell tune.
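If you're curious what "extra tones" looks like in numbers, here is a minimal sketch in Python. It is only an illustration, not how we test gear: the sample rate, the 1 kHz test tone, the clipping limit, and the helper names (sine, hard_clip, level_at) are all assumptions chosen for the example. It builds a pure tone, flattens its peaks the way an overdriven amplifier might, and then checks how much energy shows up at 3 kHz, a frequency the original tone never contained.

```python
import math

SAMPLE_RATE = 48_000   # samples per second (assumed for this sketch)
TONE_HZ = 1_000        # test-tone frequency (assumed)
N = 4_800              # 0.1 second of audio, a whole number of tone cycles


def sine(freq_hz, amplitude=1.0):
    """Return N samples of a pure sine tone."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE)
            for n in range(N)]


def hard_clip(samples, limit):
    """Flatten every peak beyond +/-limit, like an overdriven amplifier."""
    return [max(-limit, min(limit, s)) for s in samples]


def level_at(samples, freq_hz):
    """Estimate the amplitude of the component at freq_hz (one Fourier bin)."""
    re = sum(s * math.cos(2 * math.pi * freq_hz * n / SAMPLE_RATE)
             for n, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE)
             for n, s in enumerate(samples))
    return 2 * math.hypot(re, im) / len(samples)


clean = sine(TONE_HZ)
clipped = hard_clip(clean, 0.5)   # clip at half the original peak

for label, signal in (("clean", clean), ("clipped", clipped)):
    fundamental = level_at(signal, TONE_HZ)
    third = level_at(signal, 3 * TONE_HZ)
    print(f"{label}: 1 kHz level = {fundamental:.3f}, 3 kHz level = {third:.3f}")
```

The clean tone should show essentially nothing at 3 kHz, while the clipped version shows a clearly measurable component there. That added tone is the kind of thing that makes a smooth voice sound rough.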

An audio system that changes the sound in the ways described above distracts the listener from the music, movie, or even podcast that they’re hearing. The ideal is that you forget you’re hearing an audio system and just get wrapped up in the substance of what you’re listening to.

The voice is (almost) everything

Whether a person is listening to music or movies, the element of sound that’s hardest to reproduce is the one that people are most familiar with: the human voice. Most folks have conversations with other people every day, so they can tell right away when an audio system is getting voice reproduction wrong. In movies and music with vocals, the voice is the most important part of the program.

The errors that audio systems make when reproducing voices are therefore easy to identify. Audio devices that create too much bass—a problem common in headphones—tend to make voices sound chesty and boomy. Audio devices that issue too little bass and too much treble—a common flaw in Bluetooth speakers—make voices sound thin, shrill, and sibilant (or hissy). Systems with high distortion, or just generally low quality, make voices hard to understand. Many speakers that try to pump high frequencies through large drivers (for example, a model that mates a large, 7- or 8-inch woofer with a small tweeter and doesn’t include a separate driver for the midrange frequencies) can make voices sound as if a performer were singing or speaking with their hands cupped around their mouth.

Bringing the bass

A Rogersound Speedwoofer speaker in front of a bass drum, which represents the bass sounds the speaker can reproduce well.
A good subwoofer, such as the Rogersound Speedwoofer 10S MKII, lets the listener feel the drums and bass as well as hear them. Photo: Brent Butterworth

Beyond vocals, it’s bass that really catches the ear—and every other part of the body, too. You not only hear it, you also feel it. Bass carries most of the rhythm and groove of music. Bass is critical to many movie soundtracks—explosions, earthquakes, and car crashes seem a lot more realistic when you can sense the vibrations in your body as well as hear them. Unfortunately, bass can also be annoying because some people believe that lots of bass equals good bass. But in this case, louder usually isn’t better.

With systems that offer high-quality bass reproduction, you can identify the pitch of the bass notes, and probably even hum along with the bass line. Poor-quality bass reproduction tends to sound boomy, often to the point where you can’t tell one note from the next; audio enthusiasts often deride this as “one-note bass.”

It’s also important to have the right amount of bass. Having too much bass might be fun on some level, but it also makes the rest of the audio sound dull. On the other extreme, if there’s not enough bass, the music sounds thin and loses some of its groove and rhythm.

Reaching for the highs

A Polk Signature Elite speaker in front of a cymbal, which represents the high-pitched sounds the speaker can reproduce well.
A good tweeter—the gold-colored dome near the top of the Polk Signature Elite ES15 speaker here—can reproduce the sizzle and shimmer of cymbals and other high-pitched instruments. Photo: Brent Butterworth

Treble, or high frequencies, is what gives a recording its sense of clarity, definition, liveliness, and spaciousness. Think of the twang of acoustic guitars or the shimmer of cymbals. If the treble is deficient, the music or movie soundtrack sounds dull and uninvolving. It also sounds small and dead, as if you were listening in a room where the walls are covered in carpet. That’s because most of the effects of reverb and echo are audible in the high frequencies. The ambient effects in surround-sound movie soundtracks and the reverberation in music recordings are critically dependent on good treble reproduction.

Good treble sounds balanced. You can easily distinguish different instruments, as you hear every pluck of a guitar string and every tap on a ride cymbal clearly. But it doesn’t sound excessive—the twang of Bob Dylan’s guitar strings, for example, doesn’t distract you from his lyrics or tire out your ears after a few minutes. It also doesn’t sound distorted, harsh, or piercing, an effect you’re likely to experience if you crank the volume way up on your laptop speakers.

It might get loud

A row of 5 different portable speakers, arranged from smallest to biggest.
Most portable Bluetooth speakers are too small to reproduce music at realistic levels, but the better models, such as the JBL Xtreme 3 (far right), can come close. Photo: Brent Butterworth

Running your audio gear at consistently high volumes can be dangerous. You probably don’t want realistic volume levels when you’re listening to a live album from your favorite rock band, and you definitely don’t want realistic levels when you’re watching a battle-intensive movie like Midway. But you probably do want to listen loud enough that you don’t have to strain to understand quiet dialogue or hear the flute in The Firebird.

A good audio system should exhibit what audio engineers call dynamics. It should clearly reveal the softest parts of music and movie soundtracks but not struggle to reach the loudest peaks. If a wireless speaker sounds natural and effortless when you play the guitar solo in Led Zeppelin’s “Heartbreaker”—which, in just a few bars, swings from being so quiet you can hear the noise of the guitar amp to being so loud it sounds as if the band were beating the instruments to the breaking point—that’s pretty good dynamics.
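To put a rough number on that kind of swing, the gap between a quiet passage and a loud peak is usually expressed in decibels, using the standard formula of 20 times the base-10 logarithm of the amplitude ratio. The sketch below is purely illustrative: the function name level_difference_db and the two amplitude values are made up for the example, not measurements of any recording or speaker.

```python
import math


def level_difference_db(loud_amplitude, quiet_amplitude):
    """Gap between two signal amplitudes, in decibels (20 * log10 of the ratio)."""
    return 20 * math.log10(loud_amplitude / quiet_amplitude)


# Hypothetical amplitudes on an arbitrary linear scale, for illustration only.
quiet_passage = 0.001   # e.g., faint amp noise between notes
loud_peak = 1.0         # e.g., the loudest hit in the solo

swing = level_difference_db(loud_peak, quiet_passage)
print(f"Swing from quietest to loudest: {swing:.0f} dB")
```

In this made-up case the loud peak is 1,000 times the amplitude of the quiet passage, which works out to a 60-decibel swing. A system with good dynamics has to handle both ends of a gap like that without burying the quiet end in noise or distorting at the loud end.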

Many small speakers, especially all-in-one wireless models, distort badly when played loud. At best, distortion reduces the clarity of instruments and vocals; at worst, it can result in a sound similar to that of paper being torn. This effect annoys even the least attentive listener, and in our Bluetooth speaker tests, it instantly disqualifies a speaker from being a recommendation. On the flip side, a few audio devices—perhaps most notably, some noise-cancelling headphones—produce so much noise or hiss that they obscure the quieter parts of music.

Space is the place

Although modern audio systems can produce sound that’s impressively close to the real thing, the magic lies in their ability to make it sound as if you’re in a different acoustical space: a jungle instead of your living room, say, or Carnegie Hall instead of an airplane seat. The ability to do that comes from two separate characteristics: imaging and soundstaging.

Imaging is the ability of an audio system to produce the convincing sonic illusion that voices and instruments are coming from a specific place in front of you, behind you, or to the sides. A good stereo speaker system can “float” the sonic image of a singer right in your living room, as if the singer were standing 10 feet in front of you, and provide a good sense of placement for each instrument, as opposed to a single blob of sound. The more realistic the image seems, the better—but some speaker systems struggle to focus stereo images in front of the listener, and headphones can’t do that at all unless they’re aided by sound processing such as Dolby Atmos.

Soundstaging re-creates the sense of acoustical space in a particular environment. For example, when you listen to an orchestral recording through a good stereo system, you should be able to get a sense of the size of the hall in which it was recorded. With a really good recording, you hear sound reverberating from the walls and ceiling of the hall. With a hip-hop recording that has lots of electronic sound effects with heavy reverb, the effects may seem to come from 50 feet away, even if you’re wearing headphones.

That wraps up our primer on what we (and most audio experts) think is important when it comes to sound reproduction. We’d also like to learn what’s most important to you when it comes to sound, so let us know in the comments section below. And stay tuned for part two, where we’ll discuss how we evaluate each of these sonic elements and offer tips on how you can create your own collection of helpful test tracks.

This article was edited by Adrienne Maxwell and Grant Clauser.
