I find lots of things in other blogs that are in varying degrees of error; normally I pass them by. Sometimes I just cannot help it. I must admit that there are many, many people who have forgotten more about audio than I have ever known; all the same, sometimes I might actually know a few things about the particular subject and would like to make myself useful. There are people who have very kindly done the same for me, offering genuine, heartfelt constructive criticism. I would like to return that favor, and do it in similar fashion.

Then there are others who, well, are what I call assumers. And you know what they say about people who 'ass'ume things. I have at times had these 'ass'umers try to correct me, and I am astonished by how wrong they are, in addition to how little they actually seem to know about the subject in question. Genuine, heartfelt constructive criticism? No, I do not have any of THAT for them.

In this blog we will get a taste of both sides of me: a couple of critiques that are in good faith, and one that, well, might be a bit more spicy.

THE MYSTERY OF THE VANISHING AUDIO

The first critique will be short and sweet, and honestly, I do not remember the poor fellow's name, nor the site on which he posted this. We are talking about Vinyl playback versus digital, and why Vinyl sounds 'better' in his mind. Disclosure: I LOVE good vinyl playback myself. I do at times think it sounds 'better' in many ways than digital. It also obviously has some weaknesses that are pretty much inarguably large compared to digital.

And that brings me to what I consider one of the more fundamental issues in our hobby. We always talk about why something is 'better', or 'sounds better', or just flat out 'IS BETTER.' NO, NO, NO. I think what we are really talking about is that things sound DIFFERENT. And there are times when different people in different situations will have different PREFERENCES for which DIFFERENT they prefer.
It is a purely subjective thing in many cases. And that is the case, I personally believe, with Vinyl. In this case though, said person in goodwill stated that Vinyl (and analog in general) is better because it reproduces 'all of the audio in continuous fashion, while digital sampling leaves part of the music missing due to sampling gaps'.

FACEPALM. Go ahead, do it with me. Let's facepalm together. Nothing could be more wrong. Digital audio in no way 'leaves part of the music missing'. Per the sampling theorem, a sampled system can accurately reproduce the ENTIRE waveform, within certain boundaries of frequency and amplitude accuracy. Considering that digital has an SNR much higher than vinyl, we can go ahead and throw out amplitude accuracy as any kind of advantage vinyl may have. So we turn to sampling rate. Yes, it is true that the sampling rate will limit the high frequency extension of a digital recording. It is also true that vinyl has high frequency extension limits as well, and not only that, much, much higher distortion at the highest of frequencies.

But back to the idea of what I am going to characterize as 'holes' in the music. Don't you also feel that is what this person is getting at? Sampling leaves 'holes' or 'gaps' in the waveform? Again, this could not be any farther from the truth, because of the reconstruction filter. When the signal is bandwidth limited to below half the sampling rate (what we call the 'Nyquist' limit), a proper filter will allow only ONE way for that waveform to be reconstructed from the samples: EXACTLY as it was before it was sampled. Again, the ONLY errors that should exist below that Nyquist limit are the amplitude quantization errors, and we have already established that at 16 bit and higher, they are smaller than any amplitude distortions present in vinyl playback.

So no, good sir, there are no gaps in digital audio. This is a persistent myth that just will not go away for some reason.

THE MASTER SWITCH NOT BEING SO MASTERFUL

I like 'The Master Switch' audio website.
I enjoy their reviews. Pretty darn good stuff. But I was reading their explanations of audio formats, and then I got to DSD. They were doing an okay job, until I got to this part. I hope it's okay to use this small excerpt as fair use:

"Imagine a ruler with 44,100 lines on it. In other words, you can measure something in 44,100 increments. If the bit depth is sixteen, you’ll then be able to gather sixteen bits of information from the segment you’ve just measured. But if you have a ruler with 2,822,400 lines on it, then obviously you’ll be able to take much finer measurements. When you’re taking measurements that fine and that accurate, you simply don’t need sixteen bits of information. You only need one. That’s because the segment you’ve measured won’t be all that different from the ones to the left and right of it. Having sixteen bits of information won’t be any more beneficial than one bit, in this case. When the sample rate is that high, there’s no benefit to having a higher bit depth."

(The rest can be read at The Master Switch.)

Although the first part is extremely basic and sort of on the right track, their explanation essentially sounds like the way Delta modulation works. We are totally missing the Sigma, it seems. And the last couple of sentences, stating that 16 bits of info is no better than 1 bit at these kinds of sample rates, made me sit up and wonder if they have ever heard of multi-bit Delta-Sigma. (Well, first they need a primer on what Delta modulation is, but I will leave that for someone else.) Virtually every DAC chip in current use has a multi-bit Delta-Sigma modulator (and reminder, DSD is nothing more than 1-bit Delta-Sigma modulation stored in a bitstream file format), so OBVIOUSLY there is a major benefit to having more than 1 bit per sample at rates over 2.8 MHz. Actually, the latest, greatest chips are using more like 6 to 8 binary bits at rates that at times exceed 10 MHz!
It is a way to minimize the pulse quantization error from the beginning, meaning much, much easier noise shaping requirements and much less strain on analog output stages, not to mention massively higher resolution, both actual (from the basic principle of pulse averaging) and perceived (from the magic of noise shaping).

Furthermore, it's not nearly as MASSIVE an increase in pulse resolution as they make it out to be. Take that 44.1 kHz sample period they are talking about, truncate it from 16 bits to one, and consider a single sample period. That is going from 65,536 levels of data in that single sample period of around 22 microseconds, to 64 individual one-bit pulses, or 65 levels of data, in the same 22 microseconds: roughly 6 bits when averaged. (Yes, yes, I know DSD doesn't use time periods like this to calculate its resolution, and the actual resolution changes with the frequency being sampled vs. the time period chosen in this thought experiment. But it IS a time-slicing, AVERAGING pulse format, and BEFORE noise shaping comes in to save the day, increasing the apparent resolution not by getting rid of the error but by shifting it into clumps of noise at high frequencies we cannot hear, well, tough: this is accurate as to how it works.) Expanding our horizons beyond our limited view of a single, approximately 22 microsecond sample at 44.1 kHz, we will find a much, much greater increase in actual and perceived resolution across the entire audible range.

FINALLY, LET'S GET JUICY ABOUT DITHER....

I made a simple post on the science section of a popular headphone enthusiast site the other day.
We won't talk about the real scandal 'there' that has me steamed, and that is how they treated a major vendor and massive contributor. But between their actions involving him and the attacks I have received there myself (I was actually threatened by a stalker there a few years ago via PM, who hurled all manner of insults about my lack of intelligence, then proceeded to threaten to 'get' me at work, after which I actually dealt with massive amounts of A/V sabotage, in addition to stolen equipment from our normally secure audio/visual booth), a random guy in the audio 'science' section the other day (not a new stalker as far as I know, so no worries lol) who seemed to assume I was a village idiot was just the cherry on top.. for THIS week, that is. Who knows what else will go down over there.

This dude actually tried to tell me that DSD noise isn't quantization noise. Rather, that it is dither noise. (As an aside, never use Gemini AI for any accurate info. When I entered the query about the nature of DSD noise and dither, Gemini gave me his statement verbatim. Then I looked at what source Gemini had used to come up with this info. I can't make this crap up. The source? It was the very thread from the Science section of this website where this guy made the statement. I am still laughing about the ridiculousness of this.)

Anyway, NO, 1-bit DSD noise is NOT dither noise. Yes, it is noise, but it is almost entirely QUANTIZATION noise. In fact, because it is a 1-bit system, it CANNOT be fully dithered. Which means YES, the ultrasonic noise, which is noise-shaped quantization noise from the 1-bit samples, is correlated to the signal in the audible range. Dither is random noise that de-correlates quantization noise from the signal in multi-bit PCM systems. It isn't something that can be accomplished, at least not fully, in 1-bit systems. That is the other thing the dude told me, that the DSD dither noise is not correlated at all to any harmonics in the audible band.
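For anyone who wants to poke at this themselves, here is a minimal Python sketch of a first-order 1-bit delta-sigma modulator. Take it as an illustration under my own assumptions, not as how any real DSD encoder is built: real encoders use much higher-order loops, and the names here (`delta_sigma_1bit`, `moving_average`, the 64-sample window, the test tone) are all made up for the example. The thing to notice is that there is no random number generator anywhere in it. Every bit of "noise" in the output is the quantizer's own error, pushed toward high frequencies by the feedback loop.

```python
import math


def delta_sigma_1bit(samples):
    """First-order 1-bit delta-sigma modulator. No dither is added anywhere.

    Assumes every input sample lies strictly between -1.0 and +1.0.
    """
    integrator = 0.0
    previous_output = 0.0  # nothing has been fed back before the first sample
    bits = []
    for x in samples:
        # Accumulate the difference between the input and the last 1-bit output.
        integrator += x - previous_output
        # 1-bit quantizer: only two possible output levels.
        previous_output = 1.0 if integrator >= 0.0 else -1.0
        bits.append(previous_output)
    return bits


def moving_average(values, window):
    """Plain moving average, used here as the simplest possible low-pass filter."""
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]
        if i >= window - 1:
            out.append(running / window)
    return out


def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))


WINDOW = 64          # hypothetical averaging window (64 one-bit pulses)
n = WINDOW * 200     # length of the test signal in 1-bit samples
# Hypothetical test tone: 5 slow cycles at half of full scale.
tone = [0.5 * math.sin(2 * math.pi * 5 * i / n) for i in range(n)]

bits = delta_sigma_1bit(tone)
error = [b - x for b, x in zip(bits, tone)]       # raw quantization error
in_band_error = moving_average(error, WINDOW)     # what survives a low-pass

print("raw error RMS:     ", rms(error))
print("filtered error RMS:", rms(in_band_error))
print("same input, same bits:", bits == delta_sigma_1bit(tone))
```

The raw error is huge, because each output is only ever +1 or -1, yet very little of it survives the low-pass. For this simple loop the averaged error works out to never exceed 2/64, since the error at each step is just the difference between two consecutive quantizer errors. And running the modulator twice on the same input gives exactly the same bits, so the error is completely determined by the signal. That is what I mean by quantization noise as opposed to dither noise.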
I don't know where people go so wrong on something so very, very basic. (I warned you I would not be very tactful about this experience. Sorry if you are offended, but you don't have to read lol.)

Then he asked me if I knew that most DSD was actually edited in PCM. Again, these 'ASS'umers. Of course I know that. Of course I also know there is a 'PURE' DSD industry, fairly large for a niche market, that uses minimal DXD punch-ins/punch-outs, crossfades etc., while the majority of the material is made to stay in DSD. Also, there was this thing called DSD-wide, which is a totally different story for another day, but it also allowed the same kind of minimal editing. You didn't have to convert everything in its entirety to multi-bit. And even if the signal is converted to multi-bit, it isn't exactly a bad thing. DSD's advantages, if it has any, are defined not so much by its bit depth as by the sample rate and the filtering. (Which is why the original DSD should have been at least a few levels, rather than just 1-bit.)

Even most 'Pure DSD' DACs convert 1-bit DSD into multiple copies of that 1-bit signal, each offset in time by a single clock sample, to filter it. This can be done in a totally digital form, with taps that multiply every stream (anywhere from 4 to 32 stacked streams are what I have found) by 1, meaning the same comes out as went in, and all the filtering is done in the 'delay', actually making this FIR filter as much a CIC filter as anything, with no decimation stage. Or it can be done almost exactly the same way, except that the filter is implemented at the output stage itself, with the resistor/switch being the TAP, filtering the multiple streams of DSD AND converting them to analog at the exact same time. Pretty efficient and ingenious.

Anyway, no! DSD noise is not dither noise. I think people get this idea from the most basic of explanations that use black and white pictures.
If you have a 1-bit pixelated black and white video system, and try to draw an image, you will get completely black shapes, with maybe a recognizable outline, against an all-white background. If you add random noise instead, sending some white pixels into the black, and some black pixels into the white, all of a sudden the eyes can see a more detailed image, albeit with a 'haze' of noise uniformly across it. I have seen this used to describe how DSD works. But it actually is nothing like how DSD works. This is indeed a good description of dither. And maybe on some very simple conceptual level it is helpful in beginning to understand DSD or 1-bit systems. But again, this is ultimately wrong when it comes to audio, quantization noise, DSD and Noise Shaping.

Finally, this 'educator' attempted to put down any notions of psychoacoustics playing a role in the sound of various formats like DSD. Of course, he brought no references. Or perhaps he works like Gemini AI and uses inaccurate forum threads (Gemini used more than just the one I posted on; almost all its references are from user-run audio 'science' forums).

Let's finish this up with exactly what I was talking about before he rudely 'ass'umed I was a village idiot. (I'm not the village idiot. I am more like the guy who is smart enough to count out the dinari at the market and make sure no one is stealing. So no, I am not the smartest guy by any means, but I am not the dumb one either.)

Psychoacoustic research into why some listeners perceive DSD (Direct Stream Digital) as sounding better than other digital audio formats, such as PCM (Pulse Code Modulation), involves exploring how humans perceive sound and how different audio encoding techniques interact with our auditory system.
Here are several factors that contribute to the perceived superiority of DSD:

Key Factors in Psychoacoustics and DSD Perception

High Sampling Rate
DSD Sampling Rate: DSD uses a very high sampling rate of 2.8224 MHz (64 times the CD standard of 44.1 kHz). This high sampling rate can capture more of the audio spectrum, leading to a perception of more natural and dynamic sound.
Psychoacoustic Impact: Humans are sensitive to transients with high-frequency content. The high sampling rate of DSD may capture these faster transients better, and may avoid PCM-type filtering artifacts, depending on filter parameters that take advantage of DSD's benefits, enhancing the perception of realism and presence in the audio.

Noise Shaping
Quantization Noise: DSD uses noise shaping to push quantization noise to higher frequencies, well beyond the range of human hearing (20 Hz to 20 kHz). This means the audible band is relatively free of quantization noise.
Psychoacoustic Impact: A lower noise floor in the audible range can lead to a cleaner and more transparent sound. Listeners might perceive the audio as having more depth and clarity. It is true that very high bit depth PCM also has low quantization noise; however, all of the quantization noise power, even if low in level, stays in a much narrower range, much of it in the audible range, and almost all of it in the audible range if the sample rate is 44.1 kHz. For PCM, the uniform distribution of quantization noise could still affect the subtle nuances of the audio. By shifting noise to the ultrasonic range, DSD may preserve more of the delicate details and spatial cues within the music, enhancing the perceived realism and depth of the audio.

One-Bit Signal Processing
Simplicity: DSD uses a 1-bit signal, which some argue leads to less complex processing and potentially fewer artifacts compared to multi-bit PCM. This is especially so the less DSP is required, and the fewer modulations there are before conversion.
Psychoacoustic Impact: The simplicity of the 1-bit signal may result in a more coherent and phase-accurate reproduction, which can enhance the perception of spatial accuracy and instrument separation.

Subjective Preference and Listening Environment
Individual Differences: People have different auditory sensitivities and preferences. Some listeners might be more attuned to the qualities that DSD enhances, such as high-frequency detail and low noise.
Listening Environment: High-quality playback equipment and acoustically treated listening environments can make the differences between DSD and other formats more noticeable.

Research and Studies
Several studies and research papers have explored the subjective perception of audio quality between DSD and PCM. Some key findings include:
Listener Preference: Controlled listening tests have shown that some listeners prefer DSD over PCM, citing smoother and more natural sound.
Critical Listening: Trained listeners and audio professionals often report differences more accurately, suggesting that experience and familiarity with high-quality sound influence the perception of DSD.

Psychoacoustic Advantages of Ultrasonic Harmonic Noise in DSD

In DSD, ultrasonic noise is typically harmonically related to the audio signal due to the nature of delta-sigma modulation. This harmonic structure can extend well beyond the human hearing range (20 Hz to 20 kHz).

Perceived Sound Quality
Subharmonic Effects: Although the ultrasonic frequencies are above the audible range, their harmonic relationships can influence subharmonic frequencies within the audible range through intermodulation distortion, which can enhance the perception of a richer and more complex sound, even sometimes at the expense of measured performance.
Inaudible Frequencies: These frequencies might interact with the auditory system in ways that affect the perception of lower frequencies, potentially adding to the sense of depth and spatiality in the audio.
Localization Cues: Ultrasonic frequencies can influence spatial localization cues, potentially enhancing the perception of the soundstage. The brain processes these cues to determine the location of sound sources.
Ambience and Air: The presence of ultrasonic harmonics can contribute to the perception of ambience and airiness in recordings, leading to a more lifelike and immersive listening experience.

Influence on Lower Frequencies
Nonlinearities in Hearing: The human auditory system exhibits nonlinearities, meaning that interactions between ultrasonic frequencies and audible frequencies can generate audible artifacts or enhance existing tones.
Masking Effects: Ultrasonic content can create masking effects, altering how lower frequencies are perceived. This can lead to a cleaner and more detailed perception of the mid and low frequencies.

Subjective Preference for All High-Resolution Formats
Listener Preference: Many listeners subjectively prefer audio with rich harmonic content, including ultrasonic harmonics, as they may contribute to a perception of higher fidelity and naturalness.
High-Resolution Audio: Audiophiles often report that high-resolution audio formats (like DSD) that include ultrasonic content sound more realistic and engaging compared to standard-resolution formats.

Conclusion

The perceived superiority of DSD to some listeners can be attributed to its high sampling rate, effective noise shaping, and the psychoacoustic impacts of these factors. The subjective nature of audio perception means that individual preferences and sensitivities play a significant role in how DSD is experienced compared to other digital audio formats.
References and Studies: (the most important part)

Psychoacoustics: Facts and Models, by Hugo Fastl and Eberhard Zwicker: Comprehensive coverage of how the human auditory system processes complex sounds, including the effects of ultrasonic frequencies.

The Influence of High-Frequency Audio Content on the Perception of High-Resolution Audio: This AES convention paper investigates how high-frequency content influences the perceived quality of high-resolution audio.

Intermodulation Distortion in Digital Audio Converters: Discusses how ultrasonic frequencies can create intermodulation products that fall within the audible range, potentially enhancing the richness of the sound.

The Effect of Ultrasonic Components on the Perception of Music: A study examining how ultrasonic components in music recordings affect listener preferences and perceived audio quality.

Perceptual Audio Coders: What To Listen For, by James D. Johnston: Offers insights into how various audio coding techniques and their handling of ultrasonic content can affect perceived audio quality.

The Perception of High-Frequency Content in Music: This paper discusses how high-frequency content affects perceived audio quality.

AES Journal Articles: The Journal of the Audio Engineering Society has published numerous articles on the psychoacoustics of digital audio formats, including DSD and PCM comparisons.
2 Comments
Jagan Seshadri
8/6/2024 11:43:05 pm
Interesting and informative article. I always thought that Floyd-Steinberg dithering was the closest visual analogy to how DSD audio works. Despite “dithering” being in the name, it feeds back quantization error to impact neighboring pixels, resulting in diffusion of error. Not a perfect analogy to DSD, but along a similar direction of reasoning.
Gavyn
9/12/2024 08:58:37 pm
What a fantastic article. I love that you didn't put down DSD or PCM, didn't really raise either up, cited your sources, and really did your best to come to a fair conclusion on DSD. I think a lot of it is up for debate, but I'm in the camp that likes DSD, personally.