In mid-February, SoundStage! editor-in-chief Jeff Fritz e-mailed to ask if I’d be willing to put together a “virtual system” as part of a series he was writing for SoundStage! Ultra: “If you had an unlimited budget and wanted the best performance money could buy on the desktop, what would you pick?”

That system, which I described in Jeff’s March 2021 column, “Six Stereo Systems I’d Like to Hear, Part Three: The Final Two,” comprised Focal’s Shape 65 active nearfield monitors ($1998/pair, all prices USD) and exaSound’s e62 DAC ($2799). I reviewed the Shape 65 for Simplifi in June 2020, and it subsequently won a SoundStage! Network Product of the Year award for Outstanding Value. Including cables and accessories, the system’s total price is about $5000—as I said in Jeff’s column, “pretty steep for desktop audio, but a lot less than the typical Ultra system.”

I’m not second-guessing myself here—this is a great system. As I concluded in my formal review of the Shape 65s, they could “generate excitement and draw me into the music,” and were “also capable of beauty and refinement.” They worked wonderfully for both nearfield and living-room stereo listening.

But had Jeff asked me to choose a “cost-no-object desktop audio system” only a few weeks later, my answer might have been different. I might well have chosen instead the subject of this review: Sony’s SA-Z1 active nearfield monitor ($7999/pair).

Inside and out

In terms of build quality, the SA-Z1 can definitely hold its own with the cost-no-object components reviewed on SoundStage! Ultra. Its all-aluminum, black-satin anodized cabinet is built like a tank, and inside is a whole raft of very sophisticated technology.

Each speaker weighs 23.4 pounds and measures 7.88″W x 8.25″H x 12.88″D, including projecting parts and controls. The SA-Z1 is irregularly shaped—its front section, which houses the drivers, protrudes snout-like from the rear section, which houses the electronics.

To minimize noise and resonance, Sony says, the two housings are made from different alloys of aluminum. The driver enclosure is assembled from six aluminum plates, a construction that Sony says suppresses resonances better than a monocoque enclosure. The electronics are isolated from driver vibrations by a solid aluminum wall between the front and rear sections, and by a bridge structure that joins and decouples the two sections.


On the front baffle is an aluminum assembly (another sort of bridge) containing a three-driver tweeter array: a 0.75″ (19mm) main tweeter and two 0.55″ (14mm) Assist tweeters, one above and one below the main tweeter. All three tweeters have soft domes sputtered with titanium, which Sony says reduces resonances and improves transient response. These domes feature a balanced design in which the voice coil drives the diaphragm at a point at which its mass matches the air load; Sony says this suppresses breakup modes.

Behind the tweeter array is a 4″ woofer, and inside the cabinet is a second Assist driver, this one a 4″ cone that fires to the rear. As Sony notes, the opposed motions of the two woofers cancel out unwanted vibrations. The Assist woofer radiates through two narrow slots, one on each side of the enclosure and running almost its full height. Per Sony, this results in a wider, higher soundstage. The woofers have neodymium magnets to maintain their linearity, zinc die-cast baskets that suppress vibrations from cone motion, and copper sleeves to reduce distortion.

Each woofer is powered by a dedicated amplifier specified to output 35W into 6 ohms at 100Hz with 10% THD. Each tweeter is powered by an amplifier specified to output 18W into 6 ohms at 5kHz with 10% THD. Those specs are provided in the manual and on Sony’s website. The manual also provides more conservative specs for the main woofer amplifier—24W RMS into 6 ohms, 40Hz-20kHz at <1% THD—but none for the Assist woofer or tweeter amplifiers.

The class-D amplifiers powering each driver have gallium-nitride (GaN) output transistors with a very high switching rate. As a result, Sony says, ringing is significantly lower than from MOSFET output stages.

The plus side of the class-D amp feeds the driver, while the minus side feeds an analog amplifier that’s used for feed-forward error correction. The analog amp compares the input signal with the output of the class-D amp, and generates an error-correction signal that’s then added to the output of the class-D amp; this cancels out switching errors in the main amplifier.
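The feed-forward idea described above can be illustrated with a few lines of arithmetic. This is only a toy numerical sketch, not Sony's circuit: the gain, the soft-clipping model of the main amplifier, and the 99% accuracy of the correction stage are all assumptions chosen for illustration, and every name in the code is hypothetical.

```python
import math

GAIN = 10.0                  # hypothetical voltage gain (assumption)
CORRECTION_ACCURACY = 0.99   # assumed: the correction stage is itself 1% off


def main_amp(x):
    """Toy stand-in for an imperfect main amplifier (assumed soft clipping)."""
    return GAIN * math.tanh(x)


def feed_forward(x):
    """Compare the ideal and actual outputs, then add the difference back in."""
    actual = main_amp(x)
    error = GAIN * x - actual                    # what the comparison measures
    return actual + CORRECTION_ACCURACY * error  # correction summed with output


samples = [0.1 * n for n in range(-8, 9)]
worst_uncorrected = max(abs(GAIN * x - main_amp(x)) for x in samples)
worst_corrected = max(abs(GAIN * x - feed_forward(x)) for x in samples)
print(f"worst error without correction: {worst_uncorrected:.4f}")
print(f"worst error with correction:    {worst_corrected:.4f}")
```

The point of the sketch is that the correction stage only has to handle the small error signal, so its own imperfections shrink the residual to a fraction of the original error rather than adding to it.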


The primary speaker, or Speaker A, contains all the input connectors. On the rear are pairs of unbalanced (RCA) and balanced (XLR) line-level inputs, a two-prong IEC power inlet, and a connector for a 7′ Digital Sync interconnect (supplied) that tethers Speakers A and B. A switch on the back of Speaker A allows it to be designated for left- or right-channel operation. The only connectors on Speaker B are an IEC power inlet and a jack for the Digital Sync wire, both on the back.

From top to bottom on the left side of Speaker A are a USB input, a proprietary input for a Sony Xperia smartphone or Walkman digital audio player, an S/PDIF input (TosLink), and a 3.5mm stereo analog jack. The maximum resolution via USB is 32-bit/768kHz PCM, native DSD to DSD512, and DoP (DSD over PCM) to DSD256. Via the Walkman port, the SA-Z1’s maximum resolution is 32/768 PCM and DSD256 (native or DoP). Via TosLink, the maximum resolution is 24/96 PCM.
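The gap between the native-DSD and DoP limits follows from how DoP works: it carries 16 one-bit DSD samples inside each 24-bit PCM frame, so the PCM carrier must run at one-sixteenth of the DSD bit rate. The sketch below works through that arithmetic. It assumes the 768kHz PCM ceiling quoted for the USB input is what caps DoP, which Sony's documentation as cited here does not state explicitly; the function and constant names are hypothetical.

```python
BASE_RATE_HZ = 44_100          # CD sample rate; DSD64 runs at 64x this
DSD_BITS_PER_DOP_FRAME = 16    # DoP packs 16 DSD bits into each 24-bit PCM frame
MAX_PCM_RATE_HZ = 768_000      # USB PCM ceiling quoted above (assumed to limit DoP)


def dop_pcm_rate(dsd_multiple):
    """PCM frame rate in Hz needed to carry DSD<multiple> as DoP."""
    dsd_rate = BASE_RATE_HZ * dsd_multiple
    return dsd_rate // DSD_BITS_PER_DOP_FRAME


for multiple in (64, 128, 256, 512):
    rate = dop_pcm_rate(multiple)
    fits = rate <= MAX_PCM_RATE_HZ
    print(f"DSD{multiple}: {rate / 1000:.1f}kHz PCM carrier, fits: {fits}")
```

Under these assumptions DSD256 needs a 705.6kHz carrier, which fits under 768kHz, while DSD512 would need 1411.2kHz and so has to be sent natively.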

Instead of an off-the-shelf DAC chip and digital processor, Sony uses a field-programmable gate array (FPGA) running custom code. Among the FPGA’s functions are aligning the drivers’ outputs in time so that they produce a coherent, unified wavefront at the listening seat, and implementing several adjustments that the user can select with controls on Speakers A and B.

On the left of the angled top panel of Speaker A’s electronics section is an on/off switch, and next to that an input button for cycling through sources. On the right is a volume knob, and next to it two buttons for engaging Digital Sound Enhancement Engine HX (DSEE HX) and DSD Remastering (DSD RE) processing. By default, the SA-Z1 upsamples compressed and PCM streams to 32/352.8 or 32/384. DSEE HX uses 40-bit floating-point calculations to do this, and adjusts the processing based on program content. DSD RE resamples data to DSD256.


In the middle of Speaker A’s front control panel is a 1.5″W x 0.75″H alphanumeric display that can be set to show the selected input and volume setting, or the input and output resolution.

There are four other digital signal-processing (DSP) options, selected with knobs on the angled top panel of Speaker B’s electronics section:

  • D.A. Hybrid Amplifier Analog Assist (D.A. Assist) changes the function of the analog amplifiers. In the Standard setting, the analog amplifier is used only for feed-forward error correction, as described above. In the Blended setting, the analog amp is used for amplification as well as error correction, so “you can enjoy soft sounds, just like analog audio,” per the manual.
  • Assist Woofer Motion (A.WF Motion) has two settings: Active, in which the main and Assist woofers operate in phase; and Fixed, in which the Assist woofer doesn’t output sound. The Active setting creates a wider soundfield, Sony says, the Fixed setting “a clear bass.”
  • Assist Woofer Frequency Range (A.WF Freq Range) has three settings—Narrow, Standard, Wide—that adjust the range of frequencies produced by the Assist woofer.
  • Assist Tweeter Time Alignment (A.TW Time Ali) lets the user control the time alignment of the main and Assist tweeters. The Sync setting synchronizes their outputs. The Delay setting delays the Assist tweeters’ output so that the SA-Z1 better “reproduces soft tones.” The Advance setting “emphasizes the contours of clear sound by advancing the output of the sound from the Assist tweeters.”

Included with the SA-Z1 is a metal remote control with an on/off switch at the top. Below that, on the left, are two small buttons for activating and deactivating DSEE HX and DSD RE; on the right are three small buttons for cycling through inputs, changing display settings, and dimming the display. In the middle are larger + and – buttons for adjusting volume, and below these a small Mute button. At bottom are four buttons for navigating menus.


Setup and settings

Sony’s reviewers’ guide recommends that the SA-Z1s be pointed straight ahead, not toed in, with the main tweeter at ear level, the rear panels 5.9″ (15cm) from the wall behind them, and their inner edges 28.75″ (73cm) apart. The listener and two speakers should describe an equilateral triangle.

By happy coincidence, the secretary desk in my second-floor office made it easy to follow these guidelines. With the SA-Z1s placed atop IsoAcoustics ISO-L8R155 desktop stands ($109.99/pair) at either end of the top of my secretary, their inner panels were exactly 28.75″ apart. And after I’d moved my desk 7″ away from the wall, the speakers’ rear panels were precisely 5.9″ from that wall. When I slightly raised my office chair, I could sit with my ears exactly level with the main tweeters—as long as I didn’t slouch. When I leaned back a little to relax and listen, the speakers and my head described an equilateral triangle. Easy!
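For anyone laying out a desk to these guidelines, the equilateral-triangle rule fixes the listening distance once the spacing is set. The sketch below estimates it, assuming the triangle is measured between cabinet centers and that the tweeters sit on each cabinet's centerline. Neither detail is stated in Sony's guide, so treat the result as approximate; the names are hypothetical.

```python
import math

INNER_EDGE_GAP_CM = 73.0         # Sony's recommended gap between inner edges
CABINET_WIDTH_CM = 7.88 * 2.54   # cabinet width from the specs above, in cm


def listening_geometry(gap_cm, width_cm):
    """Return (triangle side, ear distance from the line joining the speakers)."""
    side = gap_cm + width_cm             # center-to-center spacing (assumption)
    depth = side * math.sqrt(3) / 2      # height of an equilateral triangle
    return side, depth


side_cm, depth_cm = listening_geometry(INNER_EDGE_GAP_CM, CABINET_WIDTH_CM)
print(f"center-to-center spacing: {side_cm:.1f}cm")
print(f"ears sit about {depth_cm:.1f}cm back from the speaker plane")
```

On these assumptions the speakers end up roughly 93cm apart center to center, with the listener's ears about 80cm back, which is a comfortable reach for a seated desk position.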


For listening from a computer, the SA-Z1’s user manual recommends installing Sony’s Hi-Res Audio Player (HRAP) app, available in macOS and Windows versions. The app, a bare-bones affair with no support for album art, allows playback of high-resolution files at native resolution; it supports WAV and AIFF to 32/768, ALAC and FLAC to 32/384, and DSD64 to DSD512. You choose music from the File menu, navigate to the folder with the music you want, then highlight the files you want to play. If you have a full-fledged music-player app—say, Audirvana, JRiver Media Center, or Roon—you won’t need HRAP. And if you use a streaming service like Qobuz or Tidal, you can use its desktop app to play music through the SA-Z1s via USB.

I connected my early-2015 Apple MacBook Pro laptop to the SA-Z1s’ USB input, and played music using Sony’s HRAP, as well as Audirvana 3.5.44 and Qobuz’s and Tidal’s desktop apps. All worked fine. But almost all of my listening was via Roon Core 1.8, running on a modified mid-2011 Apple Mac Mini connected to the SA-Z1s’ USB port with a 2m AudioQuest Cinnamon USB link.

I experimented briefly with the SA-Z1s’ DSP options. Some were very subtle in their effects, and comparisons were made more difficult by the one-second delay before any change took effect.

DSEE HX and DSD RE made the sound a little smoother, a little less glary; I left them on for my listening. D.A. Assist’s Blended option softened the sound slightly; I preferred the greater incisiveness of the Standard setting. With A.TW Time Ali, I found the sound with the Advance setting a bit too edgy; Delay took the edge away, but I found that Sync provided the best balance of coherence and smoothness.

A.WF Motion’s Fixed setting, which turns off the SA-Z1s’ rear woofers, made the bass snappier but less robust; I much preferred the Active setting. A.WF Freq Range’s Wide setting indeed broadened the soundstage, but made aural images a bit less precise. The Narrow setting made images very precise, but confined the soundstage to the area between the speakers. The Standard setting provided the best tradeoff of soundstage width and image specificity.


For Simplifi, I usually compare the component being reviewed with a similar product. That wasn’t possible with the Sony SA-Z1. Not only is it a unique product, I had no other nearfield minimonitors on hand. To get perspective on what I was hearing through the Sonys, I frequently listened to the same music through my living-room system of an NAD C 658 streaming DAC-preamp ($1649) and Elac Navis ARF-51 active floorstanding speakers ($4599.96/pair), as well as my headphone rig: an iFi Audio iDSD Micro BL headphone amp-DAC ($599.99) feeding HiFiMan Edition-X V2 headphones ($1299).

During my listening, several things about the Sony SA-Z1s stood out:

  • Dead silent. With the volume set as loud as I’d want in my home office, when I paused play and put an ear right next to a tweeter assembly, I heard absolutely nothing—no hiss, or any other kind of noise. Nada.
  • Big sound. The SA-Z1s threw a wide, deep soundstage, with laser-like positioning of aural images. The nearfield setup delivered a degree of clarity I’ve never experienced with living-room stereo, but with the scale I expect from a living-room system.
  • Explosive dynamics. These desktop minimonitors scaled up and down effortlessly.
  • Mercilessly revealing. The SA-Z1s could at times sound analytical. If a recording had a harsh edge, there was no escaping it. I sometimes found myself wishing for a bit more warmth.
  • Exquisite microdetail. I marveled at how the SA-Z1s conveyed subtle nuances in the sounds of instruments and voices, and in the interplay of musicians. Time and again, the SA-Z1s revealed new aspects of favorite recordings.

About that last item: A track I often play to get a handle on a new component is the Chick Corea Trio’s performance of the Thelonious Monk classic “Blue Monk,” from their live album Trilogy (24-bit/96kHz FLAC, Concord Jazz). I’ve heard this recording a gazillion times—it’s music I know and love, beautifully recorded and played, and through the SA-Z1s it sounded fresh and new.


The track begins with a descending riff played by Christian McBride on the lower strings of his double bass. The SA-Z1s nailed it. Low notes were impressively big and robust, but also crisp and articulate, with no bloat or overhang. I could hear every element: the initial pizzicato attacks, the vibrations of the strings against the fingerboard, the woody resonance of the soundbox. The bass output seemed to flag a tad on the lowest notes, but considering that they were being reproduced by a pair of desktop speakers, each containing only two 4″ woofers, and that those notes extend down to E1 (41Hz), that sort of low-end performance was truly impressive. (Sony specifies the SA-Z1’s bass response as -10dB at 51Hz.)
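The 41Hz figure for E1 comes from the standard equal-temperament formula, in which each semitone multiplies frequency by the twelfth root of two. The sketch below works it out for the four open strings of a double bass in standard tuning, assuming A4 = 440Hz and the common MIDI numbering in which E1 is note 28. The actual tuning on the recording is not known, and the names are hypothetical.

```python
A4_HZ = 440.0      # assumed concert pitch
A4_MIDI = 69       # MIDI note number of A4
SONY_MINUS_10DB_HZ = 51.0   # Sony's quoted -10dB point


def note_frequency(midi_note):
    """Equal-temperament frequency in Hz for a MIDI note number."""
    return A4_HZ * 2 ** ((midi_note - A4_MIDI) / 12)


# Open strings of a double bass in standard tuning (MIDI numbers)
open_strings = {"E1": 28, "A1": 33, "D2": 38, "G2": 43}

for name, midi in open_strings.items():
    freq = note_frequency(midi)
    relation = "below" if freq < SONY_MINUS_10DB_HZ else "above"
    print(f"{name}: {freq:.1f}Hz ({relation} the -10dB point)")
```

Only the lowest open string falls under Sony's quoted -10dB point, which fits with the bass seeming to flag only on the very lowest notes.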

In the middle of the track, when McBride plays a long solo that begins in the lower strings, I had the same enthusiastic reaction. Later in his solo, he plays an impossibly fast sequence on the upper strings. I admired how the SA-Z1s articulated McBride’s varied attacks: crisp staccato notes, then long passages in which note blended into note, each pluck clearly defined but part of a complete phrase.

The SA-Z1s’ reproduction of Corea’s concert grand was just as impressive. Piano tone was consistent from top to bottom, with a lovely bell-like clarity in the upper octaves. The Sonys effortlessly tracked Corea’s hard-hitting staccato chords and phrases—but just as impressive was the amount of expression in Corea’s legato asides these speakers could convey.

I was blown away by the SA-Z1s’ reproduction of Brian Blade’s drumming. His quiet taps on the hi-hat and ride cymbals had a wonderfully delicate metallic sheen, with crisp attacks followed by shimmering decays that seemed to go on forever. However, when he hit these metal surfaces hard, the sound became a wee bit splashy. His beats on floor tom and kick drum had fantastic body and impact, with no overhang. All of these sounds were arrayed across the back of the soundstage, each surface of Blade’s kit having a clearly defined space.

Rim shots had explosive impact, but even more striking was how the SA-Z1s differentiated the initial impact of drumstick on drum rim from the short-lived resonance that followed. Snare rolls were amazing—I can’t recall ever having heard the individual bounces of sticks against the drum’s top head so clearly articulated. I was simultaneously aware of the rattling of the snares against the bottom head. Those sounds were clearly delineated from each other yet presented as an integrated acoustic event. That said, the rattling sound of the snares was sometimes a bit too pronounced.


I responded much the same way to another favorite track, “My Right Eye,” from Laurie Anderson’s Homeland (16/44.1 ALAC, Nonesuch). Through the SA-Z1s, I noticed expressive nuances in her singing that had previously escaped my attention, such as the descending whispered inflection at the end of each phrase of the opening verse: “Hold your breath, hold your breath, close your eyes.” It was gorgeous. But I found sibilants too hot—sh and ch sounds could be almost grating. Anderson’s voice in this recording is heavily processed, and sometimes has a bit of a hard edge; the SA-Z1s accentuated this quality.

The electronic drumbeats that punctuate “My Right Eye,” sounding like heartbeats, had wonderful depth and definition. While synth chords sounded a bit too edgy, Eyvind Kang’s viola was perfect: rich, woody, rosiny. As with Anderson’s singing, the SA-Z1s drew my attention to Kang’s expressive bowing and phrasing.

“Somebody That I Used to Know,” the 2013 Grammy Record of the Year by the Australian singer-songwriter-instrumentalist Gotye (Wally De Backer) joined by the New Zealand singer Kimbra, from Gotye’s Making Mirrors (16/44.1 FLAC, Samples & Seconds/Qobuz), showed off the SA-Z1’s impressive dynamics. Lucas Taranto’s bass guitar, and Gotye’s percussion and sampled sounds, fairly jumped out of the speakers. Sampled effects and percussion appeared all over the soundstage almost as if by magic, each in a clearly defined location: xylophone on the left, brushed snare on the right. In restrained verses, the SA-Z1s’ reproduction of Gotye’s tenor and Kimbra’s breathy soprano seemed ideal, with no hint of coloration. But when Gotye belted out the choruses, some hardness crept in. Also, his sibilants were a tad hot (Kimbra’s less so).

A new recording of Rachmaninoff orchestral works by the Tatarstan National Symphony Orchestra conducted by Alexander Sladkovsky (16/44.1 FLAC, Sony Music/Qobuz) demonstrated the SA-Z1s’ ability to produce surprisingly large-scaled sound. If I leaned back and closed my eyes while listening to Symphonic Dances, Rachmaninoff’s last major composition, I could almost imagine I was hearing a big living-room stereo, not a pair of nearfield monitors plopped atop my desk.

In the energetic, slightly grotesque passages at the beginning and end of the first dance, Non allegro (Not fast), the SA-Z1s produced a wide and very deep soundstage that seemed to extend far beyond the wall behind the speakers. String tone was a tad steely for my taste, but otherwise orchestral tone was commendably transparent. Layering was superb—strings were arrayed across the front of the soundstage with brass and woodwinds behind them and percussion at the rear, each group of instruments occupying a clearly defined area.


Dynamics were astounding. Big timpani beats emerged effortlessly from the left rear of the stage, with no bloat or boom. Emerging from the center rear, dramatic trumpet phrases had delicious metallic bite and wonderful golden tone.

In the dreamy middle section of this dance is a gorgeous theme played on alto saxophone, supported by clarinet, flute, oboe, and English horn. I admired the way the SA-Z1s rendered the distinct timbres of these diverse instruments, and the precision with which each was positioned. That point was brought home later in this section when the violins pick up that lush, romantic theme—again, the strings were clearly in front of the wind instruments.


I’ve picked a couple of nits about the Sony SA-Z1s’ sound: occasional hardness, some hot sibilants, a slightly analytical character. But overall, I loved what I heard from these minimonitors. Throughout my listening, I admired their bold, effortless dynamics; their wide, deep soundstages; their precise imaging; and their wonderful microdetail.

While the SA-Z1s were in residence in my home office, I never felt the urge to go downstairs to my living room and listen to music on my main rig. Quite the contrary: in those weeks, the SA-Z1s became my primary music system.


It would have been interesting to compare the SA-Z1s with the desktop system I chose for Jeff Fritz’s column on SoundStage! Ultra. Alas, the Focal Shape 65s have long since been returned to their rightful owner, so such a comparison was impossible.

But there are some differences worth noting. As I mentioned in that column, my “Knowledge Worker’s system” can be used for both desktop and living-room applications. But the Sony SA-Z1 is a one-trick pony—it’s strictly for nearfield use. On the other hand, the SA-Z1 is a completely self-contained system requiring no external DAC—an attractive feature for desktop use. And Sony’s build quality is outstanding.

This I can say for sure, right now: If I could have any system for listening to music in my home office, cost no object, I’d go for the Sony SA-Z1s.

. . . Gordon Brockhouse

Associated Equipment

  • Source: Modified Apple Mac Mini computer (mid-2011) running Roon Core 1.8
  • Desktop speaker stands: IsoAcoustics ISO-L8R155
  • USB link: AudioQuest Cinnamon (2m)
  • Control devices: Apple MacBook Pro computer (early 2015), Google Pixel 4a 5G smartphone

Sony SA-Z1 Active Nearfield Loudspeakers
Price: $7999 USD per pair.
Warranty: Two years parts and labor.

Sony Corporation of America
1 Sony Drive
Park Ridge, NJ 07656