Cardioid MEMS Microphone Array

This is a collection of Mastodon posts that I've written about a MEMS microphone array with a cardioid polar pattern that I've been designing since April 2019.

Knowles SPH0641 MEMS digital microphone. That's a very good frequency response for the human hearing range. You could use this as a calibration microphone. I reckon the bump at 25 kHz is the resonance frequency of the diaphragm. That's probably a compromise: too high, and the diaphragm would be too stiff; too low, and it would be audible. It's also a decent compromise for the ultrasound range, since it leaves the 36 - 70 kHz range flat to within +/- 1 dB.

That peak at 25 kHz kind of explains why smartphone and tablet microphones sound so crisp compared to more traditional microphones. It doesn't look too hard to neutralise the response with some digital EQ filters. It wouldn't surprise me if some mobile devices already do that, since digital filters are basically free.
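For illustration, a minimal sketch of what that neutralising EQ could look like: a peaking-cut biquad using the Audio EQ Cookbook (RBJ) formulas. The sample rate, gain and Q below are assumptions picked to illustrate the idea, not measured values for any particular capsule.

```javascript
// Peaking-EQ biquad (RBJ Audio EQ Cookbook) set up as a cut at an assumed
// 25 kHz resonance. All parameter values here are illustrative guesses.
const fs = 96000;        // assumed sample rate, high enough to reach 25 kHz
const f0 = 25000;        // assumed resonance centre frequency
const gainDb = -15;      // cut to counter an assumed +15 dB bump
const Q = 3;             // assumed sharpness of the bump

const A = Math.pow(10, gainDb / 40);
const w0 = (2 * Math.PI * f0) / fs;
const alpha = Math.sin(w0) / (2 * Q);

// Biquad coefficients, normalised so a[0] = 1.
const a0 = 1 + alpha / A;
const b = [(1 + alpha * A) / a0, (-2 * Math.cos(w0)) / a0, (1 - alpha * A) / a0];
const a = [1, (-2 * Math.cos(w0)) / a0, (1 - alpha / A) / a0];

// Magnitude response |H(e^{jw})| at frequency f, for checking the cut depth.
function magnitudeAt(f) {
  const w = (2 * Math.PI * f) / fs;
  const evalPoly = (c) => {
    // |c0 + c1·e^{-jw} + c2·e^{-2jw}|
    const re = c[0] + c[1] * Math.cos(w) + c[2] * Math.cos(2 * w);
    const im = -(c[1] * Math.sin(w) + c[2] * Math.sin(2 * w));
    return Math.hypot(re, im);
  };
  return evalPoly(b) / evalPoly(a);
}

// Direct-form-I processing of one sample.
let x1 = 0, x2 = 0, y1 = 0, y2 = 0;
function processSample(x) {
  const y = b[0] * x + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2;
  x2 = x1; x1 = x; y2 = y1; y1 = y;
  return y;
}
```

At the centre frequency this hits the requested -15 dB exactly, while leaving the audible band nearly untouched.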

— April 13, 2019, 9:05 PM

Now that I've taken the pick-and-place machine course at the hacker space, I should think up a project that uses it. What it excels at is placing large arrays of SMDs on a PCB.

I wonder what I could do with a large array of MEMS microphones and a high performance ARM microcontroller.

One loose knot I really need to tie up is connecting things to a computer. I want to make USB devices, preferably high speed ones.

— April 13, 2019, 9:18 PM

When I look for devices that use MEMS microphone arrays, I mostly come across research papers, or press releases about products you can't buy yet.

I'm kind of wondering what you could do with a MEMS array in terms of pro audio. We're talking about microphones the size of a grain of rice, costing a little over a dollar each, with built-in preamps and digital outputs. That's gotta be useful for something.

— April 15, 2019, 1:16 AM

All MEMS microphones have an upward bend on the high end of the spectrum. I found the explanation for this in a technical paper: It's not an issue with diaphragm resonance as I originally thought. It's the microphone cavity that forms a Helmholtz resonator. When you blow across the top of a bottle, that's a Helmholtz resonator. Speakers with bass ports are also Helmholtz resonators. It creates a peak at a particular frequency, determined by the shape of the opening and the volume of the cavity.
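The classic formula is f = (c / 2π) · √(S / (V · L)), with S the port cross-section area, V the cavity volume and L the effective port length. A quick sketch with made-up dimensions in the rough ballpark of a MEMS package (none of these numbers come from a datasheet):

```javascript
// Helmholtz resonance of a port + cavity. All dimensions are illustrative
// guesses for a tiny MEMS-style package, purely to exercise the formula.
const c = 343;                  // speed of sound in air, m/s
const portDiameter = 0.25e-3;   // assumed port diameter, m
const portLength = 0.2e-3;      // assumed port length, m
const cavityVolume = 1.0e-9;    // assumed front-cavity volume, m^3 (1 mm^3)

const S = Math.PI * (portDiameter / 2) ** 2;
// Crude end correction: an open port behaves acoustically longer than its
// physical length (~0.85 × radius per flanged end, two ends).
const Leff = portLength + 1.7 * (portDiameter / 2);

const fHelmholtz = (c / (2 * Math.PI)) * Math.sqrt(S / (cavityVolume * Leff));
```

With these guessed dimensions the resonance lands in the 10-20 kHz area, which fits what the datasheets show.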

I'm still fascinated with MEMS microphones. I wish I knew more about DSP algorithms for beamforming.

— April 15, 2019, 11:14 PM

Interesting diagram from STMicroelectronics' tutorial for MEMS microphones. You get some pretty good MEMS microphones with a ~70 dB SNR (compare with ~80 dB for a Neumann U87) and a near-flat frequency response. Their problem is the Helmholtz resonance. It varies considerably from capsule to capsule, so if you wanted to flatten that with a filter, you'd need to calibrate each capsule. Why don't they reduce the cavity volume to move that resonance into the ultrasonic range?

Upon further investigation, it seems that the Helmholtz resonance of MEMS microphones is affected by the diameter, length and material of the opening to that chamber.

Paper on acoustical design for MEMS microphones:

https://www.edn.com/Pdf/ViewPdf?contentItemId=4429422

It looks to me like almost no matter what you do to the opening to the front chamber, you get a resonance somewhere in the 10-20 kHz area.

— April 16, 2019, 2:17 PM

They say that the main reason that large-diaphragm condenser microphones sound warmer than small-diaphragm microphones is that they become more omnidirectional at lower frequencies.

I hadn't noticed before, but cigar mics are less sensitive at low frequencies. I guess they have to be, to achieve good directivity. That explains their cold sound.

I'm pretty sure you could emulate these properties with an array of MEMS microphones and some DSP.

— April 16, 2019, 9:37 PM

Trying to figure out how professional audio interfaces expect ADAT Lightpipe data to be aligned relative to the BNC word clock output. Your typical ADC takes the word clock as a trigger to begin outputting the next PCM sample. Since you can't begin to output the bits of a PCM sample before you've demodulated the PDM signal from the delta-sigma modulator, this implies a one-sample delay.

I'm guessing that what your typical ADC does is integrate the PDM signal until the next word clock edge comes along, at which point it places the sum in a shift register and begins to clock it out.

For an ADAT slave device with ADCs that output a PDM signal, I suppose what you need to do is similar. You'd use a PLL to derive your oversampling clock from the word clock, and integrate PDM bits until a word clock edge comes along.
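As a minimal sketch of that integrate-until-the-word-clock idea, here's a boxcar decimator that sums PDM bits (taken as ±1) over one word-clock period. Real PDM converters use proper cascaded CIC/FIR decimation filters; this only illustrates the conceptual first stage, and the oversampling ratio is an assumption.

```javascript
// Boxcar decimation of a PDM bitstream: integrate bits over one word-clock
// period, then emit the sum as one PCM sample. Conceptual sketch only.
const oversampling = 64;   // assumed PDM bits per PCM sample (e.g. 3.072 MHz / 48 kHz)

function decimatePdm(bits) {
  // bits: array of 0/1 PDM values; returns PCM samples in [-1, 1].
  const samples = [];
  for (let i = 0; i + oversampling <= bits.length; i += oversampling) {
    let sum = 0;
    for (let k = 0; k < oversampling; k++) {
      sum += bits[i + k] ? 1 : -1;   // map bit to ±1 and integrate
    }
    samples.push(sum / oversampling);
  }
  return samples;
}
```

An all-ones stream decimates to full scale; a 50 % duty-cycle stream decimates to zero, as you'd expect from a delta-sigma modulator idling at mid-scale.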

I guess the source of my confusion was the notion of instantaneous samples. A device attempting to play back a serial PCM signal can't actually output the sample instantaneously at the word clock, because the clock edge merely indicates the beginning of the transmission of a sample, not the beginning of reproduction. With resistor-ladder ADCs and DACs and a parallel bus, you could reproduce the sample instantaneously, and there would be no delay.

The ultimate implication of this is that an ADAT Lightpipe recording device can never have a delay of less than two or three samples. The ADC induces a one-sample delay, and a further one-sample delay is induced because time is needed to assemble and transmit an ADAT Lightpipe packet. With a playback device on the other end, a further two-sample delay is induced because the packet must be decoded, both by the device and by the playback DAC itself.

A further implication of this is that you shouldn't ever let two microphones in the same acoustic environment (say, a pair of overhead mics) be on each side of an ADAT Lightpipe, because there will be a constant delay there, and you will get a skewed stereo image and comb filtering between the channels.

— April 17, 2019, 10:49 AM

ADAT Lightpipe would seem to be a useful interface for doing a bit of experimentation with MEMS microphones. Getting the signals into a DAW seems to offer the simplest way of experimenting with these microphones. There is a DAW named REAPER where you can write your own DSP algorithms in a built-in scripting language without needing to implement a full VST/AU plugin.

— April 17, 2019, 11:24 AM

One of those questions that Google gives you 1000 irrelevant answers for: Given N summed sine waves, all of frequency X, with N different phase shifts and amplitudes, what is the resulting amplitude? Integrating them in discrete steps over time and finding the maximum seems like a clumsy way of doing it.

I guess it would help to reword it to "find the new phase and amplitude given two summed sine waves" and then iterate over that.

Okay, so... Transforming each sine wave into a phasor (complex number representation) would let you sum each phasor to produce a new phasor, but that still only gives you an answer for a single instant in time.

My gut feeling tells me that I might need some calculus to solve this.

The problem I'm trying to solve essentially looks like this animation:

https://images.app.goo.gl/iNf1XwxHFLRrZkLu6

Given two sine waves with the same frequency but different phases and amplitudes (here represented as phasors), what is the peak amplitude of their sum across a 360 degree cycle?

If my understanding of phasors is right, summing the two phasors on the complex plane at a given instant should produce a phasor that represents the sum of the actual waveforms.

Now, if that's true, a rotation of the resulting phasor should behave as if the two phasors are rotating.

Now, if that is true, the magnitude of this phasor should represent the maximum real magnitude of the summed waveforms, since it will hit that magnitude on the real plane whenever it intersects with the horizontal axis.

So, the answer to "What is the amplitude of two summed sine waves of equal periods but different amplitudes and phases?" would seem to boil down to something like:

v₁ = A₁ cos(ϕ₁) + A₂ cos(ϕ₂),
v₂ = A₁ sin(ϕ₁) + A₂ sin(ϕ₂)

M₁₂ = √(v₁² + v₂²)

Essentially tacking one phase vector of a sine wave onto the end of another and finding the distance from the origin.
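The same calculation as code, generalised to N components (the function name is mine): sum the phasors component-wise, then convert back to polar form.

```javascript
// Sum N sinusoids of equal frequency, each given as an amplitude and a phase
// (radians), by adding their phasors and converting back to polar form.
function sumSinusoids(components) {
  let re = 0, im = 0;
  for (const { amplitude, phase } of components) {
    re += amplitude * Math.cos(phase);  // the v1 sum
    im += amplitude * Math.sin(phase);  // the v2 sum
  }
  return {
    amplitude: Math.hypot(re, im),      // M = sqrt(v1² + v2²)
    phase: Math.atan2(im, re),
  };
}
```

Sanity checks: two equal waves in phase give twice the amplitude, and in antiphase they cancel completely.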

The purpose of this exercise is that I'm writing a simple microphone array simulator with the goal of producing polar patterns for whole arrays. I want to see what happens if you move microphones around or sum/subtract them in different ways, so I'm using ray-casting to each element from an orbiting source and naturally I need to know what happens to the amplitude as that's going on in order to render a polar pattern.

Phasors are pretty awesome. FFT coefficients are also phasors. The output of an FFT is an array of phasors representing the amplitude and phase of sine waves at evenly spaced frequencies between DC and the Nyquist frequency. If you use a big FFT window and manipulate the phase angles while leaving the amplitudes intact, you can produce some really weird and wonderful audio smearing effects.

— April 17, 2019, 11:02 PM

I wrote a little microphone array simulator in JavaScript.

I wanted to create an array with a cardioid response. One text I found said I should get that if I combine omni and figure-eight microphones, but that didn't work.

Another text said you needed to make a figure-eight microphone with a delayed rear port.

This is the 20 - 20000 Hz polar pattern of a two-microphone array, with the elements spaced 2 cm apart, with one element inverted and time-delayed by the travel time of sound across 2 cm.
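That delay-and-subtract arrangement can be sketched as a small plane-wave model (this is my own sketch, not code from the simulator; c = 343 m/s is assumed):

```javascript
// Two-element delay-and-subtract array: one omni element, plus a second
// element d behind it that is inverted and delayed by the propagation time
// d/c. Returns the magnitude response vs frequency and angle (0° = on-axis).
const c = 343;       // speed of sound, m/s
const d = 0.02;      // element spacing, m
const T = d / c;     // electronic delay on the rear element, s

function response(freqHz, angleDeg) {
  const w = 2 * Math.PI * freqHz;
  const theta = (angleDeg * Math.PI) / 180;
  // Rear element: extra acoustic path d·cosθ, plus the electronic delay T,
  // and a polarity inversion. Phasor sum of the two elements:
  const delay = (d * Math.cos(theta)) / c + T;
  const re = 1 - Math.cos(w * delay);
  const im = Math.sin(w * delay);
  return Math.hypot(re, im);   // equals 2·|sin(ω·delay/2)|
}
```

At low frequencies the magnitude is proportional to (1 + cos θ), which is exactly the cardioid shape, and the rear (180°) stays a null at every frequency.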

— April 18, 2019, 11:35 AM

Mom thinks cardioid microphone polar patterns look more like butts than hearts. Maybe we should've called them butt patterns.

— April 18, 2019, 1:12 PM

I'm trying to understand how to apply a H(s) transfer function to phasors in order to find the filtered phasors.

I have understood that s is the complex frequency, and can be replaced with jω, where j is the imaginary unit and ω is the angular frequency, but when confronted with a function like...

H(s) = 1 + (G f) / (s + f)

...do you just plug in jω like...

H(jω) = 1 + (G f) / (jω + f)

...and perform a complex multiplication with the phasor, and ta-da, you have the response?
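For steady-state sinusoids that is indeed how it works: evaluate H(jω) with complex arithmetic, then one complex multiplication filters the phasor. A tiny check using the H(s) above, with arbitrary values for G and f and minimal complex helpers of my own:

```javascript
// Minimal complex helpers: numbers as [re, im] pairs.
const add = (a, b) => [a[0] + b[0], a[1] + b[1]];
const mul = (a, b) => [a[0] * b[0] - a[1] * b[1], a[0] * b[1] + a[1] * b[0]];
const div = (a, b) => {
  const d = b[0] * b[0] + b[1] * b[1];
  return [(a[0] * b[0] + a[1] * b[1]) / d, (a[1] * b[0] - a[0] * b[1]) / d];
};

const G = 2;      // arbitrary gain parameter
const f = 1000;   // arbitrary corner parameter (same units as ω)

// H(jω) = 1 + (G·f) / (jω + f), evaluated with complex arithmetic.
function H(w) {
  return add([1, 0], div([G * f, 0], [f, w]));
}

// Filtering a phasor is then a single complex multiplication.
function filterPhasor(phasor, w) {
  return mul(phasor, H(w));
}
```

At ω = 0 this H is purely real and equals 1 + G, and at very high ω it approaches 1, which matches what you'd expect from the formula by inspection.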

— April 18, 2019, 7:36 PM

Polar patterns at various frequencies for an array of 2 omnidirectional microphones, 2 cm apart, subtracted and time-delayed to behave like a cardioid pattern microphone.

Cardioids are typically bass-boosted to give them a flat frequency response. This explains why they rumble so much when handled! I haven't done that in this case, so the response doesn't start flattening out until about 4.5 kHz.

Screenshots are from my JavaScript microphone array simulator.
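The rising low end can be read straight off the on-axis magnitude of the delay-and-subtract pair; a minimal sketch (my own plane-wave model, with c and d assumed):

```javascript
// On-axis magnitude of a delay-and-subtract pair with spacing d and an
// electronic delay of d/c: |1 - e^{-jω·2d/c}| = 2·|sin(ω·d/c)|.
// This rises at 6 dB/octave at low frequencies (hence the bass boost a
// cardioid needs) and first peaks near f = c / (4d), ≈ 4.3 kHz for d = 2 cm.
const c = 343;
const d = 0.02;

function onAxisMagnitude(freqHz) {
  return 2 * Math.abs(Math.sin((2 * Math.PI * freqHz * d) / c));
}
```

Doubling the frequency at the low end doubles the output, i.e. the +6 dB/octave slope the EQ has to undo.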


— April 18, 2019, 11:25 PM

Okay, so I now think I know how to use transfer functions, i.e. those previously so mysterious "H(s) = foo / bar" things you always find on Wikipedia instead of actual explanations when you're looking for information about audio equalisation filters.

I knew that s was the frequency, but not much else.

Solution:

s can be substituted with jω where j is the imaginary unit and ω is angular frequency.

The output format is a phasor, from which you can derive an amplitude and a phase.

More specifically, any constant or coefficient x in a transfer function H(s) that isn't imaginary is taken as a complex number x + 0j. In other words, you turn everything into complex numbers. Any arithmetic is taken to be complex arithmetic.

ω is in radians per second, so ω = 2πf, where f is the frequency in hertz.

Output x + jy can be taken as a phasor/vector where √(x² + y²) encodes the amplitude and atan2(y, x) encodes the phase in radians.

— April 19, 2019, 12:06 AM

"On the other hand, to be directional the capsule must have rear sound ports with acoustical phase shift networks, which for any practical design have inherent non-idealities."

Mwahahah! But not the microphone array design I'm working on, because it uses a digital delay line!

I think what I'm going to do is give it an SPDIF output. My Focusrite interface has an SPDIF input, and many consumer sound cards have one too. Should make for a cheap way of equipping your home computer with a good mic.

— April 19, 2019, 3:25 AM

Working on improving the plots in my microphone array simulator.

— April 19, 2019, 1:26 PM

From what I'm told, the warmness of a large diaphragm microphone is due to the gradual transition to an omnidirectional pattern at lower frequencies. Using two omni microphones in a figure-eight pattern, with a phase inversion + delay on one mic to turn that into a cardioid microphone, and some high-pass and peak filtering, I managed to produce a microphone with a characteristic like that. The bend down at 20 kHz is the tail of the comb filter created by the phase delay.

I should probably give this simulator a better GUI. 99% of the editing is happening by changing the code at the moment.

I'm getting the hang of using transfer functions. I should probably learn how to write transfer functions and how to transform them into digital filters. The only thing I could pull off with my current knowledge is to use an inverse discrete Fourier transform to turn them into filter kernels. Since these are IIR filters, they'd have to be windowed and the kernels would only approximate them.
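That inverse-DFT route looks roughly like this frequency-sampling sketch: sample the transfer function's magnitude on a grid, take a (naive, O(N²)) inverse DFT to get an impulse response, centre it, and window it into a linear-phase FIR kernel. The H here is the 1 + G·f/(jω + f) example from earlier with arbitrary G and f; taking magnitude only (zero phase) is my simplification, and fs and N are assumptions.

```javascript
// Frequency-sampling FIR design: sample |H(jω)| on the DFT grid, inverse
// DFT, centre, window. Naive O(N²) IDFT for clarity.
const fs = 48000;        // assumed sample rate
const N = 64;            // kernel length
const G = 2, fc = 1000;  // arbitrary transfer-function parameters

function magH(freqHz) {
  const w = 2 * Math.PI * freqHz;
  // |1 + G·fc/(jω + fc)|, from the real and imaginary parts directly.
  const denom = fc * fc + w * w;
  const re = 1 + (G * fc * fc) / denom;
  const im = -(G * fc * w) / denom;
  return Math.hypot(re, im);
}

// Real, conjugate-symmetric magnitude samples, then inverse DFT (real part).
const kernel = new Array(N).fill(0);
for (let n = 0; n < N; n++) {
  for (let k = 0; k < N; k++) {
    const freq = (k <= N / 2 ? k : k - N) * (fs / N);   // signed bin frequency
    kernel[n] += (magH(Math.abs(freq)) * Math.cos((2 * Math.PI * k * n) / N)) / N;
  }
}
// Rotate so the peak sits in the middle, then apply a Hann window.
const centred = kernel.map((_, n) => kernel[(n + N / 2) % N]);
const windowed = centred.map(
  (v, n) => v * 0.5 * (1 - Math.cos((2 * Math.PI * n) / N))
);
```

The sum of the unwindowed taps equals the DC gain of the sampled response, and the centred kernel is symmetric (linear phase), which is what the zero-phase sampling buys you.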

— April 19, 2019, 7:17 PM

Example frequency response of a MEMS microphone — from the STMicroelectronics tutorial for MEMS microphones.

The Helmholtz resonance of a MEMS microphone looks inconvenient at first glance, but in the microphone array design I'm working on, I'm using two MEMS capsules to create a cardioid pattern, and if you space them apart just right, and induce a phase delay equal to the wave propagation time between them, the first notch in the resulting comb filter will flatten the peak.

You can't actually see the center frequency of the Helmholtz resonator in most MEMS microphone datasheets, but if you look at the datasheet for an ultrasonic one, you can see it.

The approximate parameters look to be:

f = 25 kHz
g = 15 dB
Q = f / 8 kHz = 3.125

It seems to vary a lot, but it wouldn't surprise me if 25 kHz is what they generally aim at as a compromise. A bigger diaphragm and displacement mean better sensitivity and max SPL, but also a larger cavity and thus a lower Helmholtz resonance.
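In the simple plane-wave model of the delay-and-subtract pair, the first on-axis comb notch sits at f = c / (2d), so the spacing that drops that notch onto an assumed 25 kHz Helmholtz peak is a one-liner:

```javascript
// Spacing that puts the first on-axis comb notch (f = c / 2d, from the
// 2·|sin(ω·d/c)| comb of a delay-and-subtract pair) onto an assumed peak.
const c = 343;                       // speed of sound, m/s
const fPeak = 25000;                 // assumed Helmholtz resonance, Hz
const spacing = c / (2 * fPeak);     // ≈ 6.9 mm
```

So capsules a little under 7 mm apart would put the notch right on the peak, under these assumptions.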

— April 20, 2019, 12:04 AM

With only 2 elements in the microphone array, I was getting a polar plot resembling a sideways figure-eight pattern close to 20 kHz. I didn't like that, so I tried an experiment where I doubled the array to 4 microphones, arranged in a square. This improved the pattern — it now has a, uh, dentoid (tooth-like) pattern close to 20 kHz:

The sideways Bode plot is very interesting — it cuts the high end by 6 dB:

— April 20, 2019, 3:40 PM

Note: The physical size of this microphone array would only be about 9x9mm (⅓x⅓"), with a pitch of 7.15mm between each element. It doesn't need to be any bigger than that, and due to frequency tuning, it shouldn't be any bigger either.

— April 20, 2019, 3:51 PM

Somehow, I'm not convinced that this is right...

— April 21, 2019, 1:26 AM

I found a somewhat readable way of expressing complex algebra in code. These are transfer functions for equaliser filters.
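Something in that spirit, as a sketch: a tiny chainable complex-number class, used to evaluate the standard analogue peaking-EQ prototype H(s) = (s² + s·A/Q + 1) / (s² + s/(A·Q) + 1) (RBJ cookbook form, normalised to the centre frequency). The class and names are mine, not necessarily the style I actually used.

```javascript
// Chainable complex numbers, enough for evaluating transfer functions.
class Complex {
  constructor(re, im = 0) { this.re = re; this.im = im; }
  add(o) { return new Complex(this.re + o.re, this.im + o.im); }
  mul(o) {
    return new Complex(this.re * o.re - this.im * o.im,
                       this.re * o.im + this.im * o.re);
  }
  div(o) {
    const d = o.re * o.re + o.im * o.im;
    return new Complex((this.re * o.re + this.im * o.im) / d,
                       (this.im * o.re - this.re * o.im) / d);
  }
  get abs() { return Math.hypot(this.re, this.im); }
}

// Analogue peaking EQ, with wNorm = ω/ω0 (frequency relative to centre).
function peakingEq(wNorm, gainDb, Q) {
  const A = Math.pow(10, gainDb / 40);
  const s = new Complex(0, wNorm);
  const s2 = s.mul(s);
  const one = new Complex(1);
  const num = s2.add(s.mul(new Complex(A / Q))).add(one);
  const den = s2.add(s.mul(new Complex(1 / (A * Q)))).add(one);
  return num.div(den);
}
```

At the centre frequency the magnitude comes out as exactly A² = 10^(gainDb/20), and far away it tends to unity.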

— April 21, 2019, 8:20 AM

I'm stashing transfer functions for EQ filters. As it turns out, if you have a transfer function, that's all you really need to know to implement a filter digitally.

I still maintain that you can get pretty far with middle school math and a dash of self study.

— April 21, 2019, 9:48 AM

This is supposed to be the frequency response of a Neumann U 87. I don't buy it. Real frequency response curves don't look that neat, not even when averaged. This is a drawing of a frequency response curve. Neumann doesn't publish the actual measurements. It surprises me that no one's tested the living shit out of a U 87 in an anechoic chamber and published their findings, since it's such an iconic microphone.

— April 21, 2019, 12:15 PM

I found this plot of a U87Ai from Townsend Labs, belonging to an article I can't read because it's password protected. This is more like it. Actual measurements. This, averaged together for a large number of microphones, is what Neumann should be publishing as the frequency response for that microphone.

— April 21, 2019, 12:19 PM

Web Microphone Array Simulator:

https://thj.no/public/mas/

You can't see or edit the microphones themselves yet, but you can explore the characteristics of the included hardcoded array of 4 omnidirectional microphones and their associated EQ filters.

— April 21, 2019, 8:17 PM

1 meter on-axis, 1 meter off-axis and 20 centimeter on-axis response for the microphone array I'm working on. My simulator now displays the array itself (see the bottom right graph).

I think I've found a clever way of using 3 microphones, delays and crossover filters to get a near-optimal cardioid polar pattern at all frequencies without getting a comb filter at high frequencies or needing insane EQ gain levels to flatten the frequency response.

The technique is basically to keep the front microphone untouched, using a high-pass filter on the microphone right behind it, and a low-pass filter tuned to the same frequency on the rear microphone. I'm effectively combining two different cardioid microphones — where one is tuned for low frequencies and the other is tuned for high frequencies — and then gluing them together with a crossover filter. The result isn't flat, but it needs a lot less parametric EQ gain than my previous design.
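A guess at how that three-element scheme could be modelled with plane waves: front element untouched; the close element behind it inverted, delayed and high-passed; the far rear element inverted, delayed and low-passed; with a complementary first-order crossover so the two branches sum to unity. All spacings and the crossover frequency below are my assumptions, not the actual design values.

```javascript
// Three-element crossover cardioid, plane-wave model. 0° = on-axis front.
const c = 343;
const d1 = 0.007;     // assumed front-to-mid spacing, m (HF cardioid pair)
const d2 = 0.028;     // assumed front-to-rear spacing, m (LF cardioid pair)
const fXover = 3000;  // assumed crossover frequency, Hz

function response(freqHz, angleDeg) {
  const w = 2 * Math.PI * freqHz;
  const cosT = Math.cos((angleDeg * Math.PI) / 180);
  // First-order complementary crossover: LP + HP = 1 at every frequency.
  const wr = freqHz / fXover;
  const den = 1 + wr * wr;
  const lp = [1 / den, -wr / den];        // 1 / (1 + jω/ωc)
  const hp = [(wr * wr) / den, wr / den]; // (jω/ωc) / (1 + jω/ωc)
  // Each rear branch: acoustic delay d·cosθ/c plus electronic delay d/c,
  // inverted — i.e. subtract e^{-jω·d·(1+cosθ)/c} scaled by its filter.
  const branch = (d, filt) => {
    const ph = (w * d * (1 + cosT)) / c;
    const e = [Math.cos(ph), -Math.sin(ph)];
    return [filt[0] * e[0] - filt[1] * e[1], filt[0] * e[1] + filt[1] * e[0]];
  };
  const m = branch(d1, hp);
  const r = branch(d2, lp);
  return Math.hypot(1 - m[0] - r[0], -(m[1] + r[1]));
}
```

The nice property of the complementary crossover is that the rear null survives at every frequency: at 180° both branch delays collapse to zero and the branches sum to exactly the front signal, which then cancels.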

— April 22, 2019, 9:41 PM
