## Implementation of FIR Filtering in C (Part 1)

In this lesson I will show how to code a finite impulse response (FIR) digital filter in the C programming language using floating point operations. Understanding the basic FIR filter is a stepping stone to understanding more complicated filters such as polyphase FIR filters and adaptive FIR filters.

A FIR filter is one in which the output depends only on the inputs (and not on previous outputs). The process is a discrete-time convolution between the input signal and the impulse response of the filter to apply. I will call this impulse response “the filter coefficients.” The following equivalent equations describe the convolution process:

equation 1: $y(n)=\sum_{k=0}^{N-1}h(k)x(n-k)$

equation 2: $y(n)=\sum_{k=n-(N-1)}^{n}h(n-k)x(k)$

where h is the impulse response, x is the input, y[n] is the output at time n and N is the length of the impulse response (the number of filter coefficients). For each output calculated, there are N multiply-accumulate operations (MACs). There is a choice of going through the filter coefficients forward and the input backwards (equation 1), or through the filter coefficients backwards and the input forwards (equation 2). In the example that follows, I will use the form in equation 1.

In my code example, samples will be stored in a working array, with lower indices being further back in time (at the “low” end). The input samples will be arranged in an array long enough to apply all of the filter coefficients. The length of the array required to process one input sample is N (the number of coefficients in the filter). The length required for M input samples is N – 1 + M.

A function will be defined that filters several input samples at a time. For example, a realtime system for processing telephony quality speech might process eighty samples at a time. This is done to improve the efficiency of the processing, since the overhead of calling the function is eighty times smaller than if it were called for every single sample. The drawback is that additional delay is introduced in the output.

The basic steps for applying a FIR filter are the following:

1. Arrange the new samples at the high end of the input sample buffer.
2. Loop through an outer loop that produces each output sample.
3. Loop through an inner loop that multiplies each filter coefficient by an input sample and adds to a running sum.
4. Shift the previous input samples back in time by the number of new samples that were just processed.

The code example defines an initialization function and a filtering function, and includes a test program. Download the PDF from the following link to see the code, or view it in-line below:

firFloat

#include <stdio.h>
#include <stdint.h>

//////////////////////////////////////////////////////////////
//  Filter Code Definitions
//////////////////////////////////////////////////////////////

// maximum number of inputs that can be handled
// in one function call
#define MAX_INPUT_LEN   80
// maximum length of filter than can be handled
#define MAX_FLT_LEN     63
// buffer to hold all of the input samples
#define BUFFER_LEN      (MAX_FLT_LEN - 1 + MAX_INPUT_LEN)

// array to hold input samples
double insamp[ BUFFER_LEN ];

// FIR init
void firFloatInit( void )
{
memset( insamp, 0, sizeof( insamp ) );
}

// the FIR filter function
void firFloat( double *coeffs, double *input, double *output,
int length, int filterLength )
{
double acc;     // accumulator for MACs
double *coeffp; // pointer to coefficients
double *inputp; // pointer to input samples
int n;
int k;

// put the new samples at the high end of the buffer
memcpy( &insamp[filterLength - 1], input,
length * sizeof(double) );

// apply the filter to each input sample
for ( n = 0; n < length; n++ ) {
// calculate output n
coeffp = coeffs;
inputp = &insamp[filterLength - 1 + n];
acc = 0;
for ( k = 0; k < filterLength; k++ ) {
acc += (*coeffp++) * (*inputp--);
}
output[n] = acc;
}
// shift input samples back in time for next time
memmove( &insamp[0], &insamp[length],
(filterLength - 1) * sizeof(double) );

}

//////////////////////////////////////////////////////////////
//  Test program
//////////////////////////////////////////////////////////////

// bandpass filter centred around 1000 Hz
// sampling rate = 8000 Hz

#define FILTER_LEN  63
double coeffs[ FILTER_LEN ] =
{
-0.0448093,  0.0322875,   0.0181163,   0.0087615,   0.0056797,
0.0086685,  0.0148049,   0.0187190,   0.0151019,   0.0027594,
-0.0132676, -0.0232561,  -0.0187804,   0.0006382,   0.0250536,
0.0387214,  0.0299817,   0.0002609,  -0.0345546,  -0.0525282,
-0.0395620,  0.0000246,   0.0440998,   0.0651867,   0.0479110,
0.0000135, -0.0508558,  -0.0736313,  -0.0529380,  -0.0000709,
0.0540186,  0.0766746,   0.0540186,  -0.0000709,  -0.0529380,
-0.0736313, -0.0508558,   0.0000135,   0.0479110,   0.0651867,
0.0440998,  0.0000246,  -0.0395620,  -0.0525282,  -0.0345546,
0.0002609,  0.0299817,   0.0387214,   0.0250536,   0.0006382,
-0.0187804, -0.0232561,  -0.0132676,   0.0027594,   0.0151019,
0.0187190,  0.0148049,   0.0086685,   0.0056797,   0.0087615,
0.0181163,  0.0322875,  -0.0448093
};

void intToFloat( int16_t *input, double *output, int length )
{
int i;

for ( i = 0; i < length; i++ ) {
output[i] = (double)input[i];
}
}

void floatToInt( double *input, int16_t *output, int length )
{
int i;

for ( i = 0; i < length; i++ ) {
if ( input[i] > 32767.0 ) {
input[i] = 32767.0;
} else if ( input[i] < -32768.0 ) {
input[i] = -32768.0;
}
// convert
output[i] = (int16_t)input[i];
}
}

// number of samples to read per loop
#define SAMPLES   80

int main( void )
{
int size;
int16_t input[SAMPLES];
int16_t output[SAMPLES];
double floatInput[SAMPLES];
double floatOutput[SAMPLES];
FILE   *in_fid;
FILE   *out_fid;

// open the input waveform file
in_fid = fopen( "input.pcm", "rb" );
if ( in_fid == 0 ) {
printf("couldn't open input.pcm");
return;
}

// open the output waveform file
out_fid = fopen( "outputFloat.pcm", "wb" );
if ( out_fid == 0 ) {
printf("couldn't open outputFloat.pcm");
return;
}

// initialize the filter
firFloatInit();

// process all of the samples
do {
size = fread( input, sizeof(int16_t), SAMPLES, in_fid );
// convert to doubles
intToFloat( input, floatInput, size );
// perform the filtering
firFloat( coeffs, floatInput, floatOutput, size,
FILTER_LEN );
// convert to ints
floatToInt( floatOutput, output, size );
// write samples to file
fwrite( output, sizeof(int16_t), size, out_fid );
} while ( size != 0 );

fclose( in_fid );
fclose( out_fid );

return 0;
}



First is the definition of the working array to hold the input samples. I have defined the array length to handle up to 80 input samples per function call, and a filter length up to 63 coefficients. In other words, the filter function that uses this working array will support between 1 to 80 input samples, and a filter length between 1 and 63.

Next is the initialization function, which simply zeroes the entire input sample array. This insures that all of the inputs that come before the first actual input are set to zero.

Then comes the FIR filtering function itself. The function takes pointers to the filter coefficients, the new input samples, and the output sample array. The “length” argument specifies the number of input samples to process (must be between 0 and 80 inclusive). The “filterLength” argument specifies the number of coefficients in the filter (must be between 1 and 63 inclusive). Note that the same filter coefficients and filterLength should be passed in each time the firFloat function is called, until the entire input is processed.

The first step in the function is the storage of the new samples. Note closely where the samples are placed in the array.

Next comes the outer loop that processes each input sample to produce one output sample. The outer loop sets up pointers to the coefficients and the input samples, and zeroes an accumulator value for the multiply-accumulate operation in the inner loop. It also stores each calculated output. The inner loop simply performs the multiply-accumulate operation.

Finally comes the step to move the input samples back in time. This is probably the most difficult step in the FIR filter and the part that always takes me the longest to work out. Pay close attention to the position of the samples to move, and the number of samples moved. The amount of time shift is equal to the number of input samples just processed. The number of samples to move is one less than the number of coefficients in the filter. Also note the use of the memmove function rather than the memcpy function, since there is a potential for moving overlapping data (whether or not the data overlaps depends on the relative lengths of the input samples and number of coefficients).

That concludes the code example for the FIR filter. In Part 2, I will show how to do the same thing using fixed point operations.

### 33 Comments on “Implementation of FIR Filtering in C (Part 1)”

1. Hi,
thanks for the tutorial. just a quick question about the moving the input back by 63 samples. why not move it back 80 samples and discard the first 80-63 of the block. are you using overlap add or overlap save method ?

2. ok i see now. you move from the 80th, 63 samples back to begining of buffer, makes sense now 🙂

3. Klevis Says:

Hi,
Thanks it helped me a lot

4. valentin Says:

There is some bullshit in your example:

for ( i = 0; i 32767.0 ) {
095 input[i] = 32767.0;
096 } else if ( input[i] < -32768.0 ) {
097 input[i] = -32768.0;
098 }

I do not understand the loop and if. Might be it is not necessary an all in this example

5. Deepak Bhatia Says:

How do we generate the input file of sine wav with integer values ?

6. Alina Says:

How did you find the coefficients?

7. nig Says:

In case of .wav files .. how can I get wav file header information before starting the filtering?

8. […] wav file using C programming. Till now I managed to tackle the filter part following this link: Implementation of FIR Filtering in C (Part 1) | Shawn's DSP Tutorials (I have my own filter co-efficients which have been produced using FDAtool on MATLAB and some other […]

9. Andy Says:

i have input value in txt file, in this program i got output file in txt but its not readable ……

• Shawn Says:

The program does not read nor write text files, but uses binary files instead. The format is raw PCM, 16 bits per sample, 8000 Hz sampling rate. Take a look at some of the previous comments on this page. I if ever get around to it, I’ll add some code to read and write .wav files, which would be a bit easier to use than raw PCM.

10. hamid Says:

hi dear
my samplig rate is 1Mhz
lenght of sampling is 256 sample
i want to detect 145khz with FIR.
can i use ur code?

Hi Shawn,

Great explanation of a FIR filter.

I was looking at the implementation and shouldn’t the following line:

inputp = &insamp[filterLength – 1 + n];

be changed to:

inputp = &insamp[filterLength – 1 + n – 1];

?
Otherwise inputp is pointing to one element past the input array since its length is (filterLength – 1 + n) and an array’s last element is (array-length – 1).

Furthermore, I realized that once the function firFloat is called it is not possible to change the length of the array of input samples. It is valid to have a requirement for a fixed-size array of input samples, but better make this explicit (at least in the comment).

• Shawn Says:

Thanks for checking out my site.

The inputp will not point past the end of the array as long as length is less than or equal to MAX_INPUT_LEN, because in the for loop, the largest value of n will be (length – 1) and the index will be (filterLength – 1 + length – 1).

Although I have not tried it, I don’t see any reason why the length of the input samples couldn’t be changed from one call to the next. However, if the filter coefficients or the number of coefficients is changed, then the firFloatInit function should be called again before calling the firFloat function.

Shawn

Hi Shawn,

Yes you are right about the inputp: the code is correct, I overlooked the fact that the maximum value for n is (length – 1), thanks for making it clear.

About the length of the input samples: what if, for instance, the length in the first call to firFloat is 16 and in the second call 32, then the first memmove will move data 16 samples and the second memcpy will copy 32 samples thereby overwriting 16 samples. I think this could be easily solved by placing the memmove just before the memcpy. What do you think?

• Shawn Says:

Take a closer look at the memcpy and the memmove lines of code. The memcpy copies in the number of new input samples to be processed; the number of samples is “length”. The memmove keeps enough of the previous input samples to be multiplied by the filter coefficients; the number of samples is “filterLength -1”. The two lines of code serve different purposes and move different numbers of samples. The number of samples moved in the memmove is independent of the number of samples processed.

Shawn

Hi Shawn,

The memmove moves “filterLength – 1” samples, but the space it creates for new samples does not depend on the number of samples that are moved but on the distance they are moved. And the distance is determined by the difference between the destination and the source pointer, which is “length” samples. For fixing it the memmove has to placed before memcpy and also some of the parameters for memmove and memcpy need to be different.

But, as already said, it is not a problem if the “length” stays the same between successive calls to firFloat, so there is no need to change the code we just need to be aware of that.

14. Tenz Says:

Hi Shawn ,Thanks a lot!
I have learned DSP theory,but start implementing in code lately
I have 3 questions:

First,it’s new to me using the pointer rather than applying directly the order of the arrays
Like for(i=0; i<k; i++){
for(j=max(0,i+1-n); j<=min(i,m-1); j++){
w[i] += u[j]*v[i-j];
}

Second,What's the meaning saving the inputsamples which have been processed at Line memmove?

Third,I think the convolution Part is based on the fact that there is enough 0 after valid samples.otherwise the conv is incomplete
For example ,assuming the length of coffes is N,the input M,then n must shift from 0 to more than N+M-1,otherwise we lose M datas.
I think Line output[n] = acc is correctly processed only when there is more than M 0s after the end of the sample.

Don't know whether I understand wrong.
Thx

Tenz

15. Tenz Says:

Hi Shawn, thanks a lot,it really helps cuz I have learned DSP theory but just start implementing in code.
I have 3 questions :
First,I got confused with the convolution part ,Line output[n] = acc;
I think this part must be based on the fact that there are enough 0s padded after the valid input of the last sample group of the original signal.For example ,assuming coffes with length of N,and input M,time n must shift from 0 to more than N+M-1,but in code we can see it stop at M,leaving M samples unprocessed,it must have at least N 0s padded to make this convolution complete.Do I understand wrongly?
Second,What’s the meaning saving processed sample at Line memmove(); when we just process new samples at insamp[filterLength – 1]?
Third,it’s new to me using pointer to change which one in an array to process rather that directly using the order like for(k = 0;k<n;k++)acc += h[k]*x[n-k];what’s the advantage?
Tenz

• Shawn Says:

Hi Tenz,

Sorry for the late reply, I’ve been away on vacation.

1. Using a pointer instead of accessing array directly: I think for most compilers this will make no difference at all in terms of speed of the code. For me, using the pointers is similar to how I used to write code in assembly language on old DSP chips. It’s really up to you what code you prefer to use.

2. Input samples need to be saved when the number of filter coefficients is greater than the number of new input samples. Otherwise there would not be enough samples with which to do the convolution.

3. When processing is first started, zeroes are used for past input samples that are unknown. That is a common thing to do. You might use something other than zero in some cases (for example filtering frequency bins of an FFT, which is not a time series).

I hope this helps.

-Shawn

16. […] website do not list education, so I don’t think he has his Doctorate uses a simplified FIR Filter per iteration. I find if you apply the filter after you’ve built up the terrain, you will […]

17. Nagananda Shet Says:

Hi Shawn,

In FIR filter, does the output remains for different frame sizes ?
Kindly clarify

• Shawn Says:

Hi Nagananda,

The output is the same no matter what frame size is used. Keep in mind that the number of samples in the output will be the same as the number of samples in the input. So if the block size is 100 samples, the output is 100 samples. If the block size is 1 sample, then the output is 1 sample. If you call the function 100 times in the second case, the 100 outputs will be the same as the one buffer of 100 outputs in the first case.

I hope that clarifies it for you.

-Shawn

• Nagananda Shet Says:

Hi Shawn,
Thank you for the quick reply.

Suppose

For ex: framesize = 32
As per the convolution equation = y(n) = Summation (h(n) * x(n-k))
K = 0 to 63
Assume that n = 0 ; y(0) = h(0) * x(0 –(0 to 63))
Assume that n = 31 ; y(31) = h(31) * x(31 –(0 to 63))

For the next frame with framesize = 32, n will be 0 that corresponds to 32nd sample for frame size 64 and 64th sample for 96 frame size

framesize = 64
Assume that n = 32 ; y(32) = h(32) * x(32 –(0 to 63)) =====è which corresponds to the 0th sample for framesize 32, which differs in the output.

is my understanding correct ?

– Nagananda

18. Shawn Says:

Take a closer look at the equations. The summation is over k for the filter coefficients (or n-k if you look at equation 2). Your examples have h[n]. It should be something like this:

y[32] = sum over k = 0 to 63 (h[k] * x[32 – k])
y[0] = sum over k = 0 to 63(h[k] * x[0 – k])

The “history” samples are set up such that x[0-k] in the 32 sample framesize corresponds to the same samples in the 64 sample frame. That is x[-1] for 32 is equal to x[31] for 64.

-Shawn

• Nagananda Says:

HI Shawn,

Thank you so much for the clarification.

I understood the scenario.

If we dont consider the history(means not considering ‘the negetive index) then outputs will be different for different frame sizes.

-Nagananda