## Implementation of FIR Filtering in C (Part 1)

In this lesson I will show how to code a finite impulse response (FIR) digital filter in the C programming language using floating point operations. Understanding the basic FIR filter is a stepping stone to understanding more complicated filters such as polyphase FIR filters and adaptive FIR filters.

A FIR filter is one in which the output depends only on the inputs (and not on previous outputs). The process is a discrete-time convolution between the input signal and the impulse response of the filter to apply. I will call this impulse response “the filter coefficients.” The following equivalent equations describe the convolution process:

equation 1:

equation 2:

where h is the impulse response, x is the input, y[n] is the output at time n and N is the length of the impulse response (the number of filter coefficients). For each output calculated, there are N multiply-accumulate operations (MACs). There is a choice of going through the filter coefficients forward and the input backwards (equation 1), or through the filter coefficients backwards and the input forwards (equation 2). In the example that follows, I will use the form in equation 1.

In my code example, samples will be stored in a working array, with lower indices being further back in time (at the “low” end). The input samples will be arranged in an array long enough to apply all of the filter coefficients. The length of the array required to process one input sample is N (the number of coefficients in the filter). The length required for M input samples is N – 1 + M.

A function will be defined that filters several input samples at a time. For example, a realtime system for processing telephony quality speech might process eighty samples at a time. This is done to improve the efficiency of the processing, since the overhead of calling the function is eighty times smaller than if it were called for every single sample. The drawback is that additional delay is introduced in the output.

The basic steps for applying a FIR filter are the following:

- Arrange the new samples at the high end of the input sample buffer.
- Loop through an outer loop that produces each output sample.
- Loop through an inner loop that multiplies each filter coefficient by an input sample and adds to a running sum.
- Shift the previous input samples back in time by the number of new samples that were just processed.

The code example defines an initialization function and a filtering function, and includes a test program. Download the PDF from the following link to see the code, or view it in-line below:

#include <stdio.h> #include <stdint.h> ////////////////////////////////////////////////////////////// // Filter Code Definitions ////////////////////////////////////////////////////////////// // maximum number of inputs that can be handled // in one function call #define MAX_INPUT_LEN 80 // maximum length of filter than can be handled #define MAX_FLT_LEN 63 // buffer to hold all of the input samples #define BUFFER_LEN (MAX_FLT_LEN - 1 + MAX_INPUT_LEN) // array to hold input samples double insamp[ BUFFER_LEN ]; // FIR init void firFloatInit( void ) { memset( insamp, 0, sizeof( insamp ) ); } // the FIR filter function void firFloat( double *coeffs, double *input, double *output, int length, int filterLength ) { double acc; // accumulator for MACs double *coeffp; // pointer to coefficients double *inputp; // pointer to input samples int n; int k; // put the new samples at the high end of the buffer memcpy( &insamp[filterLength - 1], input, length * sizeof(double) ); // apply the filter to each input sample for ( n = 0; n < length; n++ ) { // calculate output n coeffp = coeffs; inputp = &insamp[filterLength - 1 + n]; acc = 0; for ( k = 0; k < filterLength; k++ ) { acc += (*coeffp++) * (*inputp--); } output[n] = acc; } // shift input samples back in time for next time memmove( &insamp[0], &insamp[length], (filterLength - 1) * sizeof(double) ); } ////////////////////////////////////////////////////////////// // Test program ////////////////////////////////////////////////////////////// // bandpass filter centred around 1000 Hz // sampling rate = 8000 Hz #define FILTER_LEN 63 double coeffs[ FILTER_LEN ] = { -0.0448093, 0.0322875, 0.0181163, 0.0087615, 0.0056797, 0.0086685, 0.0148049, 0.0187190, 0.0151019, 0.0027594, -0.0132676, -0.0232561, -0.0187804, 0.0006382, 0.0250536, 0.0387214, 0.0299817, 0.0002609, -0.0345546, -0.0525282, -0.0395620, 0.0000246, 0.0440998, 0.0651867, 0.0479110, 0.0000135, -0.0508558, -0.0736313, -0.0529380, -0.0000709, 0.0540186, 0.0766746, 0.0540186, -0.0000709, -0.0529380, -0.0736313, -0.0508558, 0.0000135, 0.0479110, 0.0651867, 0.0440998, 0.0000246, -0.0395620, -0.0525282, -0.0345546, 0.0002609, 0.0299817, 0.0387214, 0.0250536, 0.0006382, -0.0187804, -0.0232561, -0.0132676, 0.0027594, 0.0151019, 0.0187190, 0.0148049, 0.0086685, 0.0056797, 0.0087615, 0.0181163, 0.0322875, -0.0448093 }; void intToFloat( int16_t *input, double *output, int length ) { int i; for ( i = 0; i < length; i++ ) { output[i] = (double)input[i]; } } void floatToInt( double *input, int16_t *output, int length ) { int i; for ( i = 0; i < length; i++ ) { if ( input[i] > 32767.0 ) { input[i] = 32767.0; } else if ( input[i] < -32768.0 ) { input[i] = -32768.0; } // convert output[i] = (int16_t)input[i]; } } // number of samples to read per loop #define SAMPLES 80 int main( void ) { int size; int16_t input[SAMPLES]; int16_t output[SAMPLES]; double floatInput[SAMPLES]; double floatOutput[SAMPLES]; FILE *in_fid; FILE *out_fid; // open the input waveform file in_fid = fopen( "input.pcm", "rb" ); if ( in_fid == 0 ) { printf("couldn't open input.pcm"); return; } // open the output waveform file out_fid = fopen( "outputFloat.pcm", "wb" ); if ( out_fid == 0 ) { printf("couldn't open outputFloat.pcm"); return; } // initialize the filter firFloatInit(); // process all of the samples do { // read samples from file size = fread( input, sizeof(int16_t), SAMPLES, in_fid ); // convert to doubles intToFloat( input, floatInput, size ); // perform the filtering firFloat( coeffs, floatInput, floatOutput, size, FILTER_LEN ); // convert to ints floatToInt( floatOutput, output, size ); // write samples to file fwrite( output, sizeof(int16_t), size, out_fid ); } while ( size != 0 ); fclose( in_fid ); fclose( out_fid ); return 0; }

First is the definition of the working array to hold the input samples. I have defined the array length to handle up to 80 input samples per function call, and a filter length up to 63 coefficients. In other words, the filter function that uses this working array will support between 1 to 80 input samples, and a filter length between 1 and 63.

Next is the initialization function, which simply zeroes the entire input sample array. This insures that all of the inputs that come before the first actual input are set to zero.

Then comes the FIR filtering function itself. The function takes pointers to the filter coefficients, the new input samples, and the output sample array. The “length” argument specifies the number of input samples to process (must be between 0 and 80 inclusive). The “filterLength” argument specifies the number of coefficients in the filter (must be between 1 and 63 inclusive). Note that the same filter coefficients and filterLength should be passed in each time the firFloat function is called, until the entire input is processed.

The first step in the function is the storage of the new samples. Note closely where the samples are placed in the array.

Next comes the outer loop that processes each input sample to produce one output sample. The outer loop sets up pointers to the coefficients and the input samples, and zeroes an accumulator value for the multiply-accumulate operation in the inner loop. It also stores each calculated output. The inner loop simply performs the multiply-accumulate operation.

Finally comes the step to move the input samples back in time. This is probably the most difficult step in the FIR filter and the part that always takes me the longest to work out. Pay close attention to the position of the samples to move, and the number of samples moved. The amount of time shift is equal to the number of input samples just processed. The number of samples to move is one less than the number of coefficients in the filter. Also note the use of the memmove function rather than the memcpy function, since there is a potential for moving overlapping data (whether or not the data overlaps depends on the relative lengths of the input samples and number of coefficients).

That concludes the code example for the FIR filter. In Part 2, I will show how to do the same thing using fixed point operations.

June 19, 2012 at 10:27 am

Hi,

thanks for the tutorial. just a quick question about the moving the input back by 63 samples. why not move it back 80 samples and discard the first 80-63 of the block. are you using overlap add or overlap save method ?

June 20, 2012 at 2:33 am

ok i see now. you move from the 80th, 63 samples back to begining of buffer, makes sense now 🙂

October 10, 2012 at 9:13 am

Hi,

Thanks it helped me a lot

December 1, 2012 at 1:19 pm

There is some bullshit in your example:

for ( i = 0; i 32767.0 ) {

095 input[i] = 32767.0;

096 } else if ( input[i] < -32768.0 ) {

097 input[i] = -32768.0;

098 }

I do not understand the loop and if. Might be it is not necessary an all in this example

December 2, 2012 at 12:06 am

Some of the code got deleted somehow. It should make more sense now.

August 14, 2013 at 12:54 am

How do we generate the input file of sine wav with integer values ?

September 13, 2013 at 7:25 pm

Sorry for the late reply. There are some comments about generating the input on this page:

https://sestevenson.wordpress.com/2010/01/04/implementation-of-fir-filtering-in-c-part-3/

You can use programs such as Adobe Audition or Audacity to generate sine waves with the correct sampling rate, format, etc.

September 13, 2013 at 7:14 am

How did you find the coefficients?

September 13, 2013 at 7:30 pm

You should look at the comments on the following page (June 18, 2010):

https://sestevenson.wordpress.com/2010/01/04/implementation-of-fir-filtering-in-c-part-3/

I used a program called Scilab to generate the coefficients. The Scilab code snippet in the comment shows how to generate fixed point coefficients, so if you want floating point, then leave out the multiplication by 32768 and rounding.

September 14, 2013 at 4:57 am

That was really helpful 🙂 Thanks a lot!

December 19, 2013 at 5:14 am

In case of .wav files .. how can I get wav file header information before starting the filtering?

December 21, 2013 at 9:57 pm

Hi. Take a look at the comments on this page starting at March 13, 2012:

https://sestevenson.wordpress.com/2010/01/04/implementation-of-fir-filtering-in-c-part-3/

The summary is, write code to read the header information (details of which are here:http://www.sonicspot.com/guide/wavefiles.html) or try using libsndfile, which handles reading and writing of audio files.

I hope that helps.

December 22, 2013 at 5:56 am

Thanks a lot Shawn,

A quick look at the links provided seem to be very helpful.

Will be tackling this problem soon, hoping for the best 🙂

Keep up the good work and thanks again 🙂

December 29, 2013 at 12:09 pm

[…] wav file using C programming. Till now I managed to tackle the filter part following this link: Implementation of FIR Filtering in C (Part 1) | Shawn's DSP Tutorials (I have my own filter co-efficients which have been produced using FDAtool on MATLAB and some other […]

April 15, 2014 at 6:38 pm

i have input value in txt file, in this program i got output file in txt but its not readable ……

help please

April 15, 2014 at 9:30 pm

The program does not read nor write text files, but uses binary files instead. The format is raw PCM, 16 bits per sample, 8000 Hz sampling rate. Take a look at some of the previous comments on this page. I if ever get around to it, I’ll add some code to read and write .wav files, which would be a bit easier to use than raw PCM.

December 20, 2015 at 5:09 am

hi dear

my samplig rate is 1Mhz

lenght of sampling is 256 sample

i want to detect 145khz with FIR.

can i use ur code?

tanks in advance

December 21, 2015 at 8:53 pm

Hamid,

The code on this page cannot be used to detect a 145 kHz tone. But take a look at my pages about the TKEO starting at:

https://sestevenson.wordpress.com/tone-detection-using-the-teager-kaiser-energy-operator-part-1/

I have a couple of pages that describe how to do tone detection using the “energy operator”, which should be preceded by a bandpass filtering of the input signal (the code on this page). You would need to calculate your own filter coefficients with the 145 kHz centre frequency and the 1 MHz sampling frequency.

-Shawn

June 15, 2016 at 3:57 am

Hi Shawn,

Great explanation of a FIR filter.

I was looking at the implementation and shouldn’t the following line:

inputp = &insamp[filterLength – 1 + n];

be changed to:

inputp = &insamp[filterLength – 1 + n – 1];

?

Otherwise inputp is pointing to one element past the input array since its length is (filterLength – 1 + n) and an array’s last element is (array-length – 1).

Furthermore, I realized that once the function firFloat is called it is not possible to change the length of the array of input samples. It is valid to have a requirement for a fixed-size array of input samples, but better make this explicit (at least in the comment).

Ad

June 15, 2016 at 4:28 pm

Hi Ad,

Thanks for checking out my site.

The inputp will not point past the end of the array as long as length is less than or equal to MAX_INPUT_LEN, because in the for loop, the largest value of n will be (length – 1) and the index will be (filterLength – 1 + length – 1).

Although I have not tried it, I don’t see any reason why the length of the input samples couldn’t be changed from one call to the next. However, if the filter coefficients or the number of coefficients is changed, then the firFloatInit function should be called again before calling the firFloat function.

Shawn

June 16, 2016 at 12:47 am

Hi Shawn,

Yes you are right about the inputp: the code is correct, I overlooked the fact that the maximum value for n is (length – 1), thanks for making it clear.

About the length of the input samples: what if, for instance, the length in the first call to firFloat is 16 and in the second call 32, then the first memmove will move data 16 samples and the second memcpy will copy 32 samples thereby overwriting 16 samples. I think this could be easily solved by placing the memmove just before the memcpy. What do you think?

Ad

June 16, 2016 at 8:39 am

Take a closer look at the memcpy and the memmove lines of code. The memcpy copies in the number of new input samples to be processed; the number of samples is “length”. The memmove keeps enough of the previous input samples to be multiplied by the filter coefficients; the number of samples is “filterLength -1”. The two lines of code serve different purposes and move different numbers of samples. The number of samples moved in the memmove is independent of the number of samples processed.

Shawn

June 16, 2016 at 10:42 pm

Hi Shawn,

The memmove moves “filterLength – 1” samples, but the space it creates for new samples does not depend on the number of samples that are moved but on the distance they are moved. And the distance is determined by the difference between the destination and the source pointer, which is “length” samples. For fixing it the memmove has to placed before memcpy and also some of the parameters for memmove and memcpy need to be different.

But, as already said, it is not a problem if the “length” stays the same between successive calls to firFloat, so there is no need to change the code we just need to be aware of that.

Ad

August 22, 2016 at 8:57 pm

Hi Shawn ,Thanks a lot!

I have learned DSP theory,but start implementing in code lately

I have 3 questions:

First,it’s new to me using the pointer rather than applying directly the order of the arrays

Like for(i=0; i<k; i++){

for(j=max(0,i+1-n); j<=min(i,m-1); j++){

w[i] += u[j]*v[i-j];

}

What 's the advantages?

Second,What's the meaning saving the inputsamples which have been processed at Line memmove?

Third,I think the convolution Part is based on the fact that there is enough 0 after valid samples.otherwise the conv is incomplete

For example ,assuming the length of coffes is N,the input M,then n must shift from 0 to more than N+M-1,otherwise we lose M datas.

I think Line output[n] = acc is correctly processed only when there is more than M 0s after the end of the sample.

Don't know whether I understand wrong.

Thx

Tenz

August 23, 2016 at 12:45 am

Hi Shawn, thanks a lot,it really helps cuz I have learned DSP theory but just start implementing in code.

I have 3 questions :

First,I got confused with the convolution part ,Line output[n] = acc;

I think this part must be based on the fact that there are enough 0s padded after the valid input of the last sample group of the original signal.For example ,assuming coffes with length of N,and input M,time n must shift from 0 to more than N+M-1,but in code we can see it stop at M,leaving M samples unprocessed,it must have at least N 0s padded to make this convolution complete.Do I understand wrongly?

Second,What’s the meaning saving processed sample at Line memmove(); when we just process new samples at insamp[filterLength – 1]?

Third,it’s new to me using pointer to change which one in an array to process rather that directly using the order like for(k = 0;k<n;k++)acc += h[k]*x[n-k];what’s the advantage?

Waiting for your help sincerely.Thx.

Tenz

August 29, 2016 at 8:27 am

Hi Tenz,

Sorry for the late reply, I’ve been away on vacation.

1. Using a pointer instead of accessing array directly: I think for most compilers this will make no difference at all in terms of speed of the code. For me, using the pointers is similar to how I used to write code in assembly language on old DSP chips. It’s really up to you what code you prefer to use.

2. Input samples need to be saved when the number of filter coefficients is greater than the number of new input samples. Otherwise there would not be enough samples with which to do the convolution.

3. When processing is first started, zeroes are used for past input samples that are unknown. That is a common thing to do. You might use something other than zero in some cases (for example filtering frequency bins of an FFT, which is not a time series).

I hope this helps.

-Shawn

August 29, 2016 at 8:52 am

All clear!

It really helps,Thanks a lot!!

^0^

November 2, 2016 at 4:18 pm

[…] website do not list education, so I don’t think he has his Doctorate uses a simplified FIR Filter per iteration. I find if you apply the filter after you’ve built up the terrain, you will […]

November 18, 2016 at 9:32 pm

Hi Shawn,

In FIR filter, does the output remains for different frame sizes ?

Kindly clarify

November 18, 2016 at 9:57 pm

Hi Nagananda,

The output is the same no matter what frame size is used. Keep in mind that the number of samples in the output will be the same as the number of samples in the input. So if the block size is 100 samples, the output is 100 samples. If the block size is 1 sample, then the output is 1 sample. If you call the function 100 times in the second case, the 100 outputs will be the same as the one buffer of 100 outputs in the first case.

I hope that clarifies it for you.

-Shawn

November 19, 2016 at 12:39 am

Hi Shawn,

Thank you for the quick reply.

Suppose

For ex: framesize = 32

As per the convolution equation = y(n) = Summation (h(n) * x(n-k))

K = 0 to 63

Assume that n = 0 ; y(0) = h(0) * x(0 –(0 to 63))

Assume that n = 31 ; y(31) = h(31) * x(31 –(0 to 63))

For the next frame with framesize = 32, n will be 0 that corresponds to 32nd sample for frame size 64 and 64th sample for 96 frame size

framesize = 64

Assume that n = 32 ; y(32) = h(32) * x(32 –(0 to 63)) =====è which corresponds to the 0th sample for framesize 32, which differs in the output.

is my understanding correct ?

– Nagananda

November 20, 2016 at 2:15 pm

Naganada,

Take a closer look at the equations. The summation is over k for the filter coefficients (or n-k if you look at equation 2). Your examples have h[n]. It should be something like this:

y[32] = sum over k = 0 to 63 (h[k] * x[32 – k])

y[0] = sum over k = 0 to 63(h[k] * x[0 – k])

The “history” samples are set up such that x[0-k] in the 32 sample framesize corresponds to the same samples in the 64 sample frame. That is x[-1] for 32 is equal to x[31] for 64.

-Shawn

November 20, 2016 at 11:47 pm

HI Shawn,

Thank you so much for the clarification.

I understood the scenario.

If we dont consider the history(means not considering ‘the negetive index) then outputs will be different for different frame sizes.

-Nagananda

May 7, 2017 at 1:29 am

Hi

Can I use the same filter to filter different signals from different sources the signals are at the same frequency ??

or i have to multiple the insamp buffer for each signal ??

May 7, 2017 at 9:03 pm

Hi Basel,

That sounds like a good question, but I don’t know exactly what you are asking. To apply the filter to more than one input buffer, you would need to modify the code. Even if you were to use the same input, with different coefficients, the code would need to be modified. The reason, as I think you can see, is the the samples in the insamp buffer are shifted around each time the filtering function is called.

One way to handle multiple inputs is to define a struct (or you could use a class in C++) that has the insamp buffer in it. And then, when initializing the filter, pass in a pointer to the structure, and initialize that particular “object”. You need to have a different instance of the structure for each filter that you would run. However you program it, you need to have an “insamp” buffer for each filter that you are going to use.

I hope that answers your question.

Thanks,

Shawn

May 8, 2017 at 2:37 am

Hi Shawn

Thank you for your response 🙂

Yes you answered my question !

you answered exactly what i need that for each signal i have to have a insamp buffer to make the shifting in time done

I have another question may you can help me with it i want to minimize my code and use less bytes as much i can , can I minimize the insamp buffer ??

or i can each time just take my N samples and padding with zeroes to the length of the filter and in that way i don’t have to use static buffers to handle the previous samples ??

I will be very glad for any advises !!

Best regards

Basel

May 9, 2017 at 4:03 pm

Basel,

You can minimize the memory used by the insamp buffer by storing only the filter history for each filter. That is, the “number of coefficients – 1” samples that are always copied to the beginning of insamp. Then you need some temporary memory to do the actual filtering. And you need to copy the history to the temporary memory before filter processing, and copy new samples to history after filter processing.

Or…if you are running many filters on the same input signal, then you only need one insamp buffer. But you need to change the code to update the buffer only once for every set of filters that are run. This is what I have often done myself, sometimes using the same buffer for all kinds of different algorithms other than the just FIR filters as well. If you have many different lengths of FIR filters, you may want to align your input pointer to equalize the delays. If you have a symmetric FIR filter with an odd number of taps, N, then the delay is (N+1)/2. The shortest filter would have the input pointer advanced in time compared to the longest filter. You can think of this as the filter being applied to the input starting at the centre coefficient.

It is quite hard to describe these details without showing some code, but sorry I don’t have time to write the code for you. Let me know if you have more questions based on what I said.

Thanks,

Shawn

May 10, 2017 at 5:28 am

Hi Shawn

Thank you for helping

what im doing that i use the snae filter for diffrent signals

and i’m trying do minimize my memory ,I did what youwrote keeping only the filter history and minimize my insamp buffer by the numbers of samples i get , my question is there any more things i can do to minmize my code

?

May 11, 2017 at 9:15 pm

Basel,

If you have symmetric FIR filter coefficients, you could save memory by not storing the duplicated coefficients. You would need different code for even and odd lengths. And you could reduce the number of multiplications by adding the two data values that are multiplied by the same coefficient value, before doing the multiplication.

What kind of minimization are you trying to do? Instruction memory, constant data memory, RAM, or just everything?

-Shawn

May 14, 2017 at 12:37 am

Hi Shawn

Im trying to minimize every thing