## Implementation of FIR Filtering in C (Part 2)

In Part 1 I showed how to code a FIR filter in C using floating point. In this lesson I will show how to do the same thing using fixed point operations. The code example below will demonstrate the application of fixed point multiplication, rounding and saturation. The code has definitions for the FIR filtering function, followed by an example test program.

firFixed

And here is the code example:

```
#include <stdio.h>
#include <stdint.h>

//////////////////////////////////////////////////////////////
//  Filter Code Definitions
//////////////////////////////////////////////////////////////

// maximum number of inputs that can be handled
// in one function call
#define MAX_INPUT_LEN   80
// maximum length of filter than can be handled
#define MAX_FLT_LEN     63
// buffer to hold all of the input samples
#define BUFFER_LEN      (MAX_FLT_LEN - 1 + MAX_INPUT_LEN)

// array to hold input samples
int16_t insamp[ BUFFER_LEN ];

// FIR init
void firFixedInit( void )
{
memset( insamp, 0, sizeof( insamp ) );
}

// the FIR filter function
void firFixed( int16_t *coeffs, int16_t *input, int16_t *output,
int length, int filterLength )
{
int32_t acc;     // accumulator for MACs
int16_t *coeffp; // pointer to coefficients
int16_t *inputp; // pointer to input samples
int n;
int k;

// put the new samples at the high end of the buffer
memcpy( &insamp[filterLength - 1], input,
length * sizeof(int16_t) );

// apply the filter to each input sample
for ( n = 0; n < length; n++ ) {
// calculate output n
coeffp = coeffs;
inputp = &insamp[filterLength - 1 + n];
acc = 1 << 14;
// perform the multiply-accumulate
for ( k = 0; k < filterLength; k++ ) {
acc += (int32_t)(*coeffp++) * (int32_t)(*inputp--);
}
// saturate the result
if ( acc > 0x3fffffff ) {
acc = 0x3fffffff;
} else if ( acc < -0x40000000 ) {
acc = -0x40000000;
}
// convert from Q30 to Q15
output[n] = (int16_t)(acc >> 15);
}

// shift input samples back in time for next time
memmove( &insamp[0], &insamp[length],
(filterLength - 1) * sizeof(int16_t) );

}

//////////////////////////////////////////////////////////////
//  Test program
//////////////////////////////////////////////////////////////

// bandpass filter centred around 1000 Hz
// sampling rate = 8000 Hz
// gain at 1000 Hz is about 1.13

#define FILTER_LEN  63
int16_t coeffs[ FILTER_LEN ] =
{
-1468, 1058,   594,   287,    186,  284,   485,   613,
495,   90,  -435,  -762,   -615,   21,   821,  1269,
982,    9, -1132, -1721,  -1296,    1,  1445,  2136,
1570,    0, -1666, -2413,  -1735,   -2,  1770,  2512,
1770,   -2, -1735, -2413,  -1666,    0,  1570,  2136,
1445,    1, -1296, -1721,  -1132,    9,   982,  1269,
821,   21,  -615,  -762,   -435,   90,   495,   613,
485,  284,   186,   287,    594, 1058, -1468
};

// number of samples to read per loop
#define SAMPLES   80

int main( void )
{
int size;
int16_t input[SAMPLES];
int16_t output[SAMPLES];
FILE   *in_fid;
FILE   *out_fid;

// open the input waveform file
in_fid = fopen( "input.pcm", "rb" );
if ( in_fid == 0 ) {
printf("couldn't open input.pcm");
return;
}

// open the output waveform file
out_fid = fopen( "outputFixed.pcm", "wb" );
if ( out_fid == 0 ) {
printf("couldn't open outputFixed.pcm");
return;
}

// initialize the filter
firFixedInit();

// process all of the samples
do {
size = fread( input, sizeof(int16_t), SAMPLES, in_fid );
// perform the filtering
firFixed( coeffs, input, output, size, FILTER_LEN );
// write samples to file
fwrite( output, sizeof(int16_t), size, out_fid );
} while ( size != 0 );

fclose( in_fid );
fclose( out_fid );

return 0;
}

```

The first thing to notice is that the definitions for the input sample storage and handling are nearly the same as for the code in Part 1. The only difference is that the storage type is a 16 bit integer instead of double precision floating point.

The next difference is the inclusion of rounding in the calculation of each output. Rounding is used when converting the calculated result from a Q30 format number to Q15. Notice that I have loaded the rounding constant into the accumulator value (acc) at the beginning of the loop rather than adding it at the end. This is a small optimization commonly seen in code for FIR filters. If you are coding in assembly language, and your chip has a rounding instruction, it may be better to do the rounding at the end (depending on what the instruction actually does).

The multiplication itself is now an integer multiplication of two 16 bit values, each of which is a Q15 number. The accumulator variable is 32 bits, and holds a Q1.30 format number. There is one bit for the sign, one integer bit, and thirty fractional bits. Notice that I have cast each multiplier to a 32 bit value. Failure to do so should result in a 16 bit product and produce wrong results.

Next comes the overflow handling for converting from Q1.30 to Q30. The code checks for values beyond the limits of the largest/smallest Q30 number (no integer bits), and saturates if necessary.

Finally, a right shift by 15 is used to convert the MAC result from Q30 to Q15 and the result is stored to the output array.

The test program is simpler than the one in Part 1 because there is no need to convert the input and output samples to or from floating point. The most important thing to note is the change in the filter coefficient array. To generate these coefficients, I took the floating point coefficients from Part 1, multiplied by 32768, and then rounded to the nearest integer. The coefficients are in Q15 format, and note that none of the original floating point coefficients are close to one. Multiplying by 32768 would cause a problem for any coefficients larger than 32767/32768 or less than -1.

As in Part 1, the test input file should be 16 bit samples at a sampling rate of 8000 Hz. Try using a 1000 Hz tone with noise added to it. The bandpass filter should remove a large portion of the noise.

The filter has greater than unity gain at 1000 Hz and the gain is about 1.13. Admittedly it is not a great filter design, but it tests the saturation code if a full scale 1000 Hz sine wave is used as an input. Try a full scale 1000 Hz tone input with and without the saturation check and see what the difference is. There should be a small amount of distortion with the saturation code present (due to slight flattening of tone) and a large amount of distortion without it (due to overflow).

A fixed point FIR filter is fairly easy to implement. Personally I find the sample storage and movement to be the most difficult part. In Part 3 of this lesson I will demonstrate how to run multiple FIR filters on the same input.

Explore posts in the same categories: FIR filters, fixed point

Tags: , , , , , , , ,

You can comment below, or link to this permanent URL from your own site.

### 6 Comments on “Implementation of FIR Filtering in C (Part 2)”

1. Ivan Marsh Says:

Your tutorial is very informative, but I’m getting a little confused trying to implement it. I used matlab to generate the coefficients for my filter and they are 32 bit. I am a little confused on what this does to the fixed point math, and how I need to change the code to account for this. With my filter length of 71 and 32 bit coefficients. Does the accumulator need to be larger than 32 bits? Also since I am now multiplying a 16 bit input by a 32 bit coefficient the resulting size will be Q48?

• Shawn Says:

Is there any reason why you coefficient need to be 32 bits? Often you don’t need that much precision for coefficients. I have sometimes done the opposite (32 bit data and 16 bit coefficients). Are you generating floating point coefficients and then converting to fixed point?

If you have 32 bit coefficients in Q31 and 16 bit data in Q15, then the result will be 48 bits Q46. That’s assuming you can do 16 by 32 bit multiplies. What kind of CPU are you using? It is also possible to multiply a 32 bit value by a 16 bit value and only calculate the upper 32 bits (giving a Q30 result).

• Ivan Marsh Says:

Now that I think about it I don’t think I actually need the larger resolution. Matlab just outputs the coefficients as 32bit signed integers. So would you suggest just shifting the coefficients to the right by 16 and then treating them as 16bit numbers?

This is what the output from matlab looks like:
const int BL = 71;
const int32_T B[71] = {
28494258,-16838180,-33592961,-30060256,-5989893, 10325924,332057,-17559991,-13213756,10426188,19210577, -2044071,-23341697,-10882659,20358173,23692593,-10098766, -32938934,-6948508,34223235,27502942,-24483003,-46514692, 2482492,57576134,30531551,-53638101,-70673281,27233446, 112149015,32476265,-148255167,-158647872,172864619, 659307423,892157330,659307423,172864619,-158647872, -148255167,32476265,112149015,27233446,-70673281, -53638101,30531551,57576134, 2482492, -46514692, -24483003, 27502942, 34223235, -6948508, -32938934, -10098766, 23692593, 20358173, -10882659, -23341697, -2044071,
19210577, 10426188, -13213756, -17559991, 332057, 10325924,
-5989893, -30060256, -33592961, -16838180, 28494258
};

It is for a low pass filter and my input is only 16bits. I’m using the nios processor on a DE2 FPGA board from altera.

2. Shawn Says:

Yes I would think you can just shift the values right by 16. Or you could round by adding (1<<15) and then shifting right by 16.

I've heard of the Nios processor. We are using an Altera FPGA board at my work, but with an NXP CoolFlux DSP on it.

3. Ivan Marsh Says:

It turns out that the cyclone II FPGA that I’m using only has a 32×16 multiplier so I can’t use the 32bit coefs that I was using before. So I think I have all of my number converting issues sorted out however what I hear when I run the filter is the police siren sound/a lot of aliasing. Any thoughts on where I might have gone wrong?

• Shawn Says:

I don’t know where you might have gone wrong, but I have some debugging ideas for you to try. Try using an impulse as an input. For example a series of zeroes, a value of say 0.25, and then more zeroes. Make the input number of samples a few times longer than your filter coefficients. When you run this through the filter, you should get as an output the filter coefficients scaled by 0.25 (preceded and followed by some zeroes). Another thing you could try is to set all of your coefficients to zero except for one of them, which you set to 0.25. Then confirm that the output is 0.25 times the input.