## Frequency Estimation Techniques

Posted June 20, 2017 by Shawn
Categories: dsp, TKEO, tone detection

I have had an interest in frequency estimation ever since I worked with telecommunications algorithms at HotHaus Technologies and Broadcom.  One of the techniques that I often used for detecting the frequency of telephony tones was a method using the multiplication of a signal with its complex conjugate delayed by one sample.  Recently when reading an article on the DSP Related website, I ran across an article by Rick Lyons about frequency estimation (see https://www.dsprelated.com/showarticle/1045.php).  It turns out the method is called the Lank-Reed-Pollon algorithm, and Rick has some interesting things to say about it.  He does not mention the TKEO (Teager Kaiser Energy Operator) algorithm (see my page here).  According to Rick, the Lank-Reed-Pollon algorithm  has some issues at high and low frequencies, which reminds me of the short-comings of TKEO at high and low frequencies.

I found the Lank-Reed-Pollon algorithm a very cycle efficient method to estimate frequencies.  One difficulty is finding a quick way to calculate the inverse tangent function.  If you look online, you can find some ways to approximate it.  In fact, I believe this is shown in one of Rick Lyon’s books.  One thing I like about the TKEO, is that the square root and inverse sine calculations can be skipped if you are looking for a particular frequency (by taking the sine and squaring the frequency of interest instead).  The TKEO also does not require a conversion to the complex number domain, if you are dealing with real numbers.  But still, I found that in general I had worse cycle performance with the TKEO, due to restrictions on the minimum sampling frequency.

I highly recommend the http://www.dsprelated.com website for information on DSP in general.  And I also highly recommend the writings of Rick Lyons.  He has a good way of explaining complex things in a simple and easy to read way.

## Second part of tone detection using TKEO

Posted October 20, 2012 by Shawn
Categories: dsp, TKEO, tone detection

Tags: , , ,

It’s been almost a year since I put up my last tutorial about tone detection using the Teager Kaiser Energy Operator.  I have finally written something up for part two here.  Life with a young family has been quite busy for me.  If there is enough interest in the TKEO stuff, I may write a part 3 showing a fixed point version (another year from now?).

Let me know if there are any problems with formatting in the code (something that always seems to be a problem). Copying the code is easy.  If you put your mouse pointer over any of the code listings, a number of symbols will be shown at the top right.  One of these will allow the code to be copied to the clipboard.

## New Post on Tone Detection

Posted November 8, 2011 by Shawn
Categories: dsp, tone detection

Tags: ,

I’ve added a new page introducing the use of the Teager Kaiser Energy Operator (TKEO) for determining the frequency of a tone (sinusoid). The page is here.  I’m planning a couple of more pages to show how to improve conditions in noise, how to reduce execution time, and how to make a fixed point version.

## New Arrangement of Tutorials

Posted October 24, 2011 by Shawn
Categories: Uncategorized

I haven’t been posting for a long, long time, but today I rearranged my old posts and put them up as separate pages.  There is now an index to all of the tutorials here.  A blog isn’t really the best format for publishing tutorials, so I plan on making new pages for any new tutorials, and using blog posts for any subjects I feel like generally commenting on.

Using pages instead of regular posts, parts one and two of the FIR filter tutorials now use a better control for displaying the C code, which prevents lines from being chopped off, and makes it easy to copy and paste the code.

I have lots of ideas for new tutorials, but I haven’t had a lot of time to work on them.  I have some interesting code for frequency detection that I will hopefully put up soon.

## Fixed Point Division of two Q15 Numbers

Posted September 20, 2010 by Shawn
Categories: dsp, fixed point

Tags: , , , , ,

Fixed point division is quite a bit more complicated than fixed point multiplication, and usually takes a lot more cycles than performing a multiplication. When dividing by a known value (a constant), it is usually better to multiply by the reciprocal than to do a division. And when dividing a fixed point number by an integer that is a power of two, a right shift can be used to implement a division. For example, to divide by 16, just shift your dividend right by 4 bits.

But there are cases where it is necessary to do a division by a calculated value. The easiest way to picture how the division should proceed is to think of the inverse of multiplying two Q15 numbers. The multiplication of two Q15 numbers produces a Q30 product. It then makes sense that a Q30 number divided by a Q15 number produces a Q15 result.

Let $a = \hat{a}/{2}^{30}, b = \hat{b}/{2}^{15}$

then $a/b = \frac{\hat{a}/{2}^{30}}{\hat{b}/{2}^{15}}=\left(\hat{a}/\hat{b} \right)/{2}^{15}$

So one procedure for finding a/b has the following steps:
1.convert the dividend a from Q15 to Q30 by shifting left by 15
2.divide the Q30 format number a by the Q15 format number b to get result in Q15

Let’s try an example. Let a = 0.03125 and b = 0.25, then c = a/b = 0.125. The Q15 numbers as hexadecimal integers will be a = 0x0400 and b = 0x2000. In step 1, a becomes 0x02000000 in Q30. In step 2, divide 0x02000000 by 0x2000 to get c = 0x1000 which is 4096 in decimal. As a check, find 4096/32768 = 0.125, the expected result.

In C language code, fixed point Q15 division can be coded as follows:

int16 a;
int16 b;
int16 c;

if ( abs(b) > abs(a) ) {
c = (int16)(((int32)a << 15) / ((int32)b));
}


The casting is very ugly, but this works. Note that I have restricted the result of the division to be less than one. Removing the restriction that the magnitude of the divisor is larger than that of the dividend has an effect on the number of bits required for the result. To see this, try dividing the largest positive Q15 number by the smallest positive Q15 number, which results in a large number with 15 digits in front of the fractional point:

(0x7FFF/0x8000) / (1/0x8000) = (0x7FFF * 0x8000 ) / 0x8000 = 0x3FFF8000 / 0x8000

The result (0x3FFF8000) requires 30 bits, and it will have 15 bits to the left of the fractional point and 15 bits to the right. That is, the most significant bit has a weighting of ${2}^{14}$ and the least significant bit has a weight of ${2}^{-15}$. In my work, I have almost always used Q15 division where the magnitude of the divisor is smaller than that of the dividend.

Along with looking ugly, the C code above for division is often inefficient. The C compiler will likely implement this as a division between two 32 bit numbers. When implementing division on a fixed point DSP chip, I have usually used assembly language coding and made use of a special purpose division instruction.

For example, the Texas Instruments TMS320C55x processor has the “subc” instruction or “conditional subtract.” To perform the type of division I have just described do the following:

1.make the dividend and divisor both positive and note original sign of each
2.load the dividend shifted left by 15 into an accumulator register
3.execute the conditional subtract of the divisor 16 times
4.store the result (in the lower 16 bits of the accumulator )
5.determine the correct sign for the result, and negate it if necessary

Note that a short cut is to load the dividend shifted left by 16 in the first step, and then execute the subc instruction 15 times. This works because it is known that the result will be positive.

Fixed point division is not difficult, but it can take a lot of cycles, and one needs to recognize the need to consider the range of the resulting output.

The C code and output below shows a couple of examples from this tutorial.

Example code:

#include <stdio.h>

typedef short int16;
typedef int int32;

int main( void )
{
int16 a;
int16 b;
int16 c;
int32 d;

// example 1: magnitude of divisor is greater than magnitude of dividend
printf("example 1: magnitude of divisor is greater than magnitude of dividend\n");

a = 0x0400;
b = 0x2000;

if ( abs(b) > abs(a) ) {
c = (int16)(((int32)a << 15) / ((int32)b));
} else {
printf("division error\n");
}

printf("a = %d, b = %d, c = %d\n",a,b,c);
printf("a = 0x%x, b = 0x%x, c = 0x%x\n",a,b,c);

// example 2: no restrictions on divisor other than not 0
printf("\nexample 2: no restrictions on divisor other than not 0\n");

a = 0x7fff;
b = 0x0001;

if ( b != 0 ) {
d = ((int32)a << 15) / ((int32)b);
} else {
printf("division by zero error\n");
}

printf("a = %d, b = %d, d = %d\n",a,b,d);
printf("a = 0x%x, b = 0x%x, d = 0x%x\n",a,b,d);

return 0;
}


Code output:

example 1: magnitude of divisor is greater than magnitude of dividend
a = 1024, b = 8192, c = 4096
a = 0x400, b = 0x2000, c = 0x1000

example 2: no restrictions on divisor other than not 0
a = 32767, b = 1, d = 1073709056
a = 0x7fff, b = 0x1, d = 0x3fff8000

## Implementation of FIR Filtering in C (Part 3)

Posted January 4, 2010 by Shawn
Categories: FIR filters, fixed point

Part 2 showed an example of a FIR filter in C using fixed point. This tutorial on FIR filtering shows how to apply several different FIR filters to the same input data. The examples for this part are also in fixed point. The example is a single C file with the FIR filter code at the top, and a small test program at the bottom. In an actual implementation, you would likely want to split the code into several files.

Code Example PDF

The code example is shown below:

#include <stdio.h>
#include <stdint.h>

//////////////////////////////////////////////////////////////
//  Filter Code Definitions
//////////////////////////////////////////////////////////////

// maximum number of inputs that can be handled
// in one function call
#define MAX_INPUT_LEN   80
// maximum length of filter than can be handled
#define MAX_FLT_LEN     63
// buffer to hold all of the input samples
#define BUFFER_LEN      (MAX_FLT_LEN - 1 + MAX_INPUT_LEN)

// array to hold input samples
int16_t insamp[ BUFFER_LEN ];

// FIR init
void firFixedInit( void )
{
memset( insamp, 0, sizeof( insamp ) );
}

// store new input samples
int16_t *firStoreNewSamples( int16_t *inp, int length )
{
// put the new samples at the high end of the buffer
memcpy( &insamp[MAX_FLT_LEN - 1], inp,
length * sizeof(int16_t) );
// return the location at which to apply the filtering
return &insamp[MAX_FLT_LEN - 1];
}

// move processed samples
void firMoveProcSamples( int length )
{
// shift input samples back in time for next time
memmove( &insamp[0], &insamp[length],
(MAX_FLT_LEN - 1) * sizeof(int16_t) );
}

// the FIR filter function
void firFixed( int16_t *coeffs, int16_t *input, int16_t *output,
int length, int filterLength )
{
int32_t acc;     // accumulator for MACs
int16_t *coeffp; // pointer to coefficients
int16_t *inputp; // pointer to input samples
int n;
int k;

// apply the filter to each input sample
for ( n = 0; n < length; n++ ) {
// calculate output n
coeffp = coeffs;
inputp = &input[n];
acc = 1 << 14;
// perform the multiply-accumulate
for ( k = 0; k < filterLength; k++ ) {
acc += (int32_t)(*coeffp++) * (int32_t)(*inputp--);
}
// saturate the result
if ( acc > 0x3fffffff ) {
acc = 0x3fffffff;
} else if ( acc < -0x40000000 ) {
acc = -0x40000000;
}
// convert from Q30 to Q15
output[n] = (int16_t)(acc >> 15);
}
}

//////////////////////////////////////////////////////////////
//  Test program
//////////////////////////////////////////////////////////////

// bandpass filter centred around 1000 Hz
// sampling rate = 8000 Hz
// gain at 1000 Hz is about 1.13

#define FILTER_LEN  63
int16_t coeffs[ FILTER_LEN ] =
{
-1468, 1058,   594,   287,    186,  284,   485,   613,
495,   90,  -435,  -762,   -615,   21,   821,  1269,
982,    9, -1132, -1721,  -1296,    1,  1445,  2136,
1570,    0, -1666, -2413,  -1735,   -2,  1770,  2512,
1770,   -2, -1735, -2413,  -1666,    0,  1570,  2136,
1445,    1, -1296, -1721,  -1132,    9,   982,  1269,
821,   21,  -615,  -762,   -435,   90,   495,   613,
485,  284,   186,   287,    594, 1058, -1468
};

// Moving average (lowpass) filter of length 8
// There is a null in the spectrum at 1000 Hz

#define FILTER_LEN_MA   8
int16_t coeffsMa[ FILTER_LEN_MA ] =
{
32768/8, 32768/8, 32768/8, 32768/8,
32768/8, 32768/8, 32768/8, 32768/8
};

// number of samples to read per loop
#define SAMPLES   80

int main( void )
{
int size;
int16_t input[SAMPLES];
int16_t output[SAMPLES];
int16_t *inp;
FILE   *in_fid;
FILE   *out_fid;
FILE   *out_fid2;

// open the input waveform file
in_fid = fopen( "input.pcm", "rb" );
if ( in_fid == 0 ) {
printf("couldn't open input.pcm");
return;
}

// open the output waveform files
out_fid = fopen( "outputFixed.pcm", "wb" );
if ( out_fid == 0 ) {
printf("couldn't open outputFixed.pcm");
return;
}
out_fid2 = fopen( "outputFixedMa.pcm", "wb" );
if ( out_fid == 0 ) {
printf("couldn't open outputFixedMa.pcm");
return;
}

// initialize the filter
firFixedInit();

// process all of the samples
do {
size = fread( input, sizeof(int16_t), SAMPLES, in_fid );
// store new samples in working array
inp = firStoreNewSamples( input, size );

// apply each filter
firFixed( coeffs, inp, output, size, FILTER_LEN );
fwrite( output, sizeof(int16_t), size, out_fid );

firFixed( coeffsMa, inp, output, size, FILTER_LEN_MA );
fwrite( output, sizeof(int16_t), size, out_fid2 );

// move processed samples
firMoveProcSamples( size );
} while ( size != 0 );

fclose( in_fid );
fclose( out_fid );
fclose( out_fid2 );

return 0;
}



There are a few differences from the code example of Part 2. First, I have created a function to store the input samples to the input sample array (firStoreNewSamples). This function is called once for every block of input samples that are processed. The calling function passes in a pointer to the new input samples, and the number of new samples to copy. The function returns the address at which to apply the FIR filter.

Second, I have added a function to move the samples after processing a block of samples (firMoveProcSamples). Again, this function is called once per block of samples, not once per FIR filter applied.

The FIR filtering function (firFixed) has the same argument list as in the Part 2 example, but the “input” argument is a bit different in this case. The input pointer passed in should be the address returned from the firStoreNewSamples function, rather than a pointer to the input sample buffer.

The test program shows an example where two different FIR filters are applied to the same output data. First one input file is opened (for input samples) and two output files are opened (one for each filter). In the sample processing loop, a block of up to 80 samples is read and stored into the working array for the filters. Next the 63 tap bandpass filter is applied by calling firFixed, and the block of output samples is written to file. Afterwards, the 8 tap moving average filter is applied, and the output samples are written to a different file. Finally, the sample buffer is shifted to prepare for the next block of input samples.

The code I have shown works for however many filters that you want to implement. Remember to keep track of the maximum filter tap length, and input sample block size, and change the #define statements appropriately. That concludes my tutorial on basic FIR filters.

## Implementation of FIR Filtering in C (Part 2)

Posted October 8, 2009 by Shawn
Categories: FIR filters, fixed point

Tags: , , , , , , , ,

In Part 1 I showed how to code a FIR filter in C using floating point. In this lesson I will show how to do the same thing using fixed point operations. The code example below will demonstrate the application of fixed point multiplication, rounding and saturation. The code has definitions for the FIR filtering function, followed by an example test program.

firFixed

And here is the code example:


#include <stdio.h>
#include <stdint.h>

//////////////////////////////////////////////////////////////
//  Filter Code Definitions
//////////////////////////////////////////////////////////////

// maximum number of inputs that can be handled
// in one function call
#define MAX_INPUT_LEN   80
// maximum length of filter than can be handled
#define MAX_FLT_LEN     63
// buffer to hold all of the input samples
#define BUFFER_LEN      (MAX_FLT_LEN - 1 + MAX_INPUT_LEN)

// array to hold input samples
int16_t insamp[ BUFFER_LEN ];

// FIR init
void firFixedInit( void )
{
memset( insamp, 0, sizeof( insamp ) );
}

// the FIR filter function
void firFixed( int16_t *coeffs, int16_t *input, int16_t *output,
int length, int filterLength )
{
int32_t acc;     // accumulator for MACs
int16_t *coeffp; // pointer to coefficients
int16_t *inputp; // pointer to input samples
int n;
int k;

// put the new samples at the high end of the buffer
memcpy( &insamp[filterLength - 1], input,
length * sizeof(int16_t) );

// apply the filter to each input sample
for ( n = 0; n < length; n++ ) {
// calculate output n
coeffp = coeffs;
inputp = &insamp[filterLength - 1 + n];
acc = 1 << 14;
// perform the multiply-accumulate
for ( k = 0; k < filterLength; k++ ) {
acc += (int32_t)(*coeffp++) * (int32_t)(*inputp--);
}
// saturate the result
if ( acc > 0x3fffffff ) {
acc = 0x3fffffff;
} else if ( acc < -0x40000000 ) {
acc = -0x40000000;
}
// convert from Q30 to Q15
output[n] = (int16_t)(acc >> 15);
}

// shift input samples back in time for next time
memmove( &insamp[0], &insamp[length],
(filterLength - 1) * sizeof(int16_t) );

}

//////////////////////////////////////////////////////////////
//  Test program
//////////////////////////////////////////////////////////////

// bandpass filter centred around 1000 Hz
// sampling rate = 8000 Hz
// gain at 1000 Hz is about 1.13

#define FILTER_LEN  63
int16_t coeffs[ FILTER_LEN ] =
{
-1468, 1058,   594,   287,    186,  284,   485,   613,
495,   90,  -435,  -762,   -615,   21,   821,  1269,
982,    9, -1132, -1721,  -1296,    1,  1445,  2136,
1570,    0, -1666, -2413,  -1735,   -2,  1770,  2512,
1770,   -2, -1735, -2413,  -1666,    0,  1570,  2136,
1445,    1, -1296, -1721,  -1132,    9,   982,  1269,
821,   21,  -615,  -762,   -435,   90,   495,   613,
485,  284,   186,   287,    594, 1058, -1468
};

// number of samples to read per loop
#define SAMPLES   80

int main( void )
{
int size;
int16_t input[SAMPLES];
int16_t output[SAMPLES];
FILE   *in_fid;
FILE   *out_fid;

// open the input waveform file
in_fid = fopen( "input.pcm", "rb" );
if ( in_fid == 0 ) {
printf("couldn't open input.pcm");
return;
}

// open the output waveform file
out_fid = fopen( "outputFixed.pcm", "wb" );
if ( out_fid == 0 ) {
printf("couldn't open outputFixed.pcm");
return;
}

// initialize the filter
firFixedInit();

// process all of the samples
do {
size = fread( input, sizeof(int16_t), SAMPLES, in_fid );
// perform the filtering
firFixed( coeffs, input, output, size, FILTER_LEN );
// write samples to file
fwrite( output, sizeof(int16_t), size, out_fid );
} while ( size != 0 );

fclose( in_fid );
fclose( out_fid );

return 0;
}



The first thing to notice is that the definitions for the input sample storage and handling are nearly the same as for the code in Part 1. The only difference is that the storage type is a 16 bit integer instead of double precision floating point.

The next difference is the inclusion of rounding in the calculation of each output. Rounding is used when converting the calculated result from a Q30 format number to Q15. Notice that I have loaded the rounding constant into the accumulator value (acc) at the beginning of the loop rather than adding it at the end. This is a small optimization commonly seen in code for FIR filters. If you are coding in assembly language, and your chip has a rounding instruction, it may be better to do the rounding at the end (depending on what the instruction actually does).

The multiplication itself is now an integer multiplication of two 16 bit values, each of which is a Q15 number. The accumulator variable is 32 bits, and holds a Q1.30 format number. There is one bit for the sign, one integer bit, and thirty fractional bits. Notice that I have cast each multiplier to a 32 bit value. Failure to do so should result in a 16 bit product and produce wrong results.

Next comes the overflow handling for converting from Q1.30 to Q30. The code checks for values beyond the limits of the largest/smallest Q30 number (no integer bits), and saturates if necessary.

Finally, a right shift by 15 is used to convert the MAC result from Q30 to Q15 and the result is stored to the output array.

The test program is simpler than the one in Part 1 because there is no need to convert the input and output samples to or from floating point. The most important thing to note is the change in the filter coefficient array. To generate these coefficients, I took the floating point coefficients from Part 1, multiplied by 32768, and then rounded to the nearest integer. The coefficients are in Q15 format, and note that none of the original floating point coefficients are close to one. Multiplying by 32768 would cause a problem for any coefficients larger than 32767/32768 or less than -1.

As in Part 1, the test input file should be 16 bit samples at a sampling rate of 8000 Hz. Try using a 1000 Hz tone with noise added to it. The bandpass filter should remove a large portion of the noise.

The filter has greater than unity gain at 1000 Hz and the gain is about 1.13. Admittedly it is not a great filter design, but it tests the saturation code if a full scale 1000 Hz sine wave is used as an input. Try a full scale 1000 Hz tone input with and without the saturation check and see what the difference is. There should be a small amount of distortion with the saturation code present (due to slight flattening of tone) and a large amount of distortion without it (due to overflow).

A fixed point FIR filter is fairly easy to implement. Personally I find the sample storage and movement to be the most difficult part. In Part 3 of this lesson I will demonstrate how to run multiple FIR filters on the same input.