Implementation of FIR Filtering in C (Part 1)

In this lesson I will show how to code a finite impulse response (FIR) digital filter in the C programming language using floating point operations. Understanding the basic FIR filter is a stepping stone to understanding more complicated filters such as polyphase FIR filters and adaptive FIR filters.

A FIR filter is one in which the output depends only on the inputs (and not on previous outputs). The process is a discrete-time convolution between the input signal and the impulse response of the filter to apply. I will call this impulse response “the filter coefficients.” The following equivalent equations describe the convolution process:

equation 1: y(n)=\sum_{k=0}^{N-1}h(k)x(n-k)

equation 2: y(n)=\sum_{k=n-(N-1)}^{n}h(n-k)x(k)

where h is the impulse response, x is the input, y[n] is the output at time n and N is the length of the impulse response (the number of filter coefficients). For each output calculated, there are N multiply-accumulate operations (MACs). There is a choice of going through the filter coefficients forward and the input backwards (equation 1), or through the filter coefficients backwards and the input forwards (equation 2). In the example that follows, I will use the form in equation 1.

In my code example, samples will be stored in a working array, with lower indices being further back in time (at the “low” end). The input samples will be arranged in an array long enough to apply all of the filter coefficients. The length of the array required to process one input sample is N (the number of coefficients in the filter). The length required for M input samples is N – 1 + M.

A function will be defined that filters several input samples at a time. For example, a realtime system for processing telephony quality speech might process eighty samples at a time. This is done to improve the efficiency of the processing, since the overhead of calling the function is eighty times smaller than if it were called for every single sample. The drawback is that additional delay is introduced in the output.

The basic steps for applying a FIR filter are the following:

  1. Arrange the new samples at the high end of the input sample buffer.
  2. Loop through an outer loop that produces each output sample.
  3. Loop through an inner loop that multiplies each filter coefficient by an input sample and adds to a running sum.
  4. Shift the previous input samples back in time by the number of new samples that were just processed.

The code example defines an initialization function and a filtering function, and includes a test program. Download the PDF from the following link to see the code, or view it in-line below:


#include <stdio.h>
#include <stdint.h>

//  Filter Code Definitions

// maximum number of inputs that can be handled
// in one function call
#define MAX_INPUT_LEN   80
// maximum length of filter than can be handled
#define MAX_FLT_LEN     63
// buffer to hold all of the input samples

// array to hold input samples
double insamp[ BUFFER_LEN ];

// FIR init
void firFloatInit( void )
    memset( insamp, 0, sizeof( insamp ) );

// the FIR filter function
void firFloat( double *coeffs, double *input, double *output,
       int length, int filterLength )
    double acc;     // accumulator for MACs
    double *coeffp; // pointer to coefficients
    double *inputp; // pointer to input samples
    int n;
    int k;

    // put the new samples at the high end of the buffer
    memcpy( &insamp[filterLength - 1], input,
            length * sizeof(double) );

    // apply the filter to each input sample
    for ( n = 0; n < length; n++ ) {
        // calculate output n
        coeffp = coeffs;
        inputp = &insamp[filterLength - 1 + n];
        acc = 0;
        for ( k = 0; k < filterLength; k++ ) {
            acc += (*coeffp++) * (*inputp--);
        output[n] = acc;
    // shift input samples back in time for next time
    memmove( &insamp[0], &insamp[length],
            (filterLength - 1) * sizeof(double) );


//  Test program

// bandpass filter centred around 1000 Hz
// sampling rate = 8000 Hz

#define FILTER_LEN  63
double coeffs[ FILTER_LEN ] =
  -0.0448093,  0.0322875,   0.0181163,   0.0087615,   0.0056797,
   0.0086685,  0.0148049,   0.0187190,   0.0151019,   0.0027594,
  -0.0132676, -0.0232561,  -0.0187804,   0.0006382,   0.0250536,
   0.0387214,  0.0299817,   0.0002609,  -0.0345546,  -0.0525282,
  -0.0395620,  0.0000246,   0.0440998,   0.0651867,   0.0479110,
   0.0000135, -0.0508558,  -0.0736313,  -0.0529380,  -0.0000709,
   0.0540186,  0.0766746,   0.0540186,  -0.0000709,  -0.0529380,
  -0.0736313, -0.0508558,   0.0000135,   0.0479110,   0.0651867,
   0.0440998,  0.0000246,  -0.0395620,  -0.0525282,  -0.0345546,
   0.0002609,  0.0299817,   0.0387214,   0.0250536,   0.0006382,
  -0.0187804, -0.0232561,  -0.0132676,   0.0027594,   0.0151019,
   0.0187190,  0.0148049,   0.0086685,   0.0056797,   0.0087615,
   0.0181163,  0.0322875,  -0.0448093

void intToFloat( int16_t *input, double *output, int length )
    int i;

    for ( i = 0; i < length; i++ ) {
        output[i] = (double)input[i];

void floatToInt( double *input, int16_t *output, int length )
    int i;

    for ( i = 0; i < length; i++ ) {
        if ( input[i] > 32767.0 ) {
            input[i] = 32767.0;
        } else if ( input[i] < -32768.0 ) {
            input[i] = -32768.0;
        // convert
        output[i] = (int16_t)input[i];

// number of samples to read per loop
#define SAMPLES   80

int main( void )
    int size;
    int16_t input[SAMPLES];
    int16_t output[SAMPLES];
    double floatInput[SAMPLES];
    double floatOutput[SAMPLES];
    FILE   *in_fid;
    FILE   *out_fid;

    // open the input waveform file
    in_fid = fopen( "input.pcm", "rb" );
    if ( in_fid == 0 ) {
        printf("couldn't open input.pcm");
        return -1;

    // open the output waveform file
    out_fid = fopen( "outputFloat.pcm", "wb" );
    if ( out_fid == 0 ) {
        printf("couldn't open outputFloat.pcm");
        return -1;

    // initialize the filter

    // process all of the samples
    do {
        // read samples from file
        size = fread( input, sizeof(int16_t), SAMPLES, in_fid );
        // convert to doubles
        intToFloat( input, floatInput, size );
        // perform the filtering
        firFloat( coeffs, floatInput, floatOutput, size,
               FILTER_LEN );
        // convert to ints
        floatToInt( floatOutput, output, size );
        // write samples to file
        fwrite( output, sizeof(int16_t), size, out_fid );
    } while ( size != 0 );

    fclose( in_fid );
    fclose( out_fid );

    return 0;

First is the definition of the working array to hold the input samples. I have defined the array length to handle up to 80 input samples per function call, and a filter length up to 63 coefficients. In other words, the filter function that uses this working array will support between 1 to 80 input samples, and a filter length between 1 and 63.

Next is the initialization function, which simply zeroes the entire input sample array. This insures that all of the inputs that come before the first actual input are set to zero.

Then comes the FIR filtering function itself. The function takes pointers to the filter coefficients, the new input samples, and the output sample array. The “length” argument specifies the number of input samples to process (must be between 0 and 80 inclusive). The “filterLength” argument specifies the number of coefficients in the filter (must be between 1 and 63 inclusive). Note that the same filter coefficients and filterLength should be passed in each time the firFloat function is called, until the entire input is processed.

The first step in the function is the storage of the new samples. Note closely where the samples are placed in the array.

Next comes the outer loop that processes each input sample to produce one output sample. The outer loop sets up pointers to the coefficients and the input samples, and zeroes an accumulator value for the multiply-accumulate operation in the inner loop. It also stores each calculated output. The inner loop simply performs the multiply-accumulate operation.

Finally comes the step to move the input samples back in time. This is probably the most difficult step in the FIR filter and the part that always takes me the longest to work out. Pay close attention to the position of the samples to move, and the number of samples moved. The amount of time shift is equal to the number of input samples just processed. The number of samples to move is one less than the number of coefficients in the filter. Also note the use of the memmove function rather than the memcpy function, since there is a potential for moving overlapping data (whether or not the data overlaps depends on the relative lengths of the input samples and number of coefficients).

That concludes the code example for the FIR filter. In Part 2, I will show how to do the same thing using fixed point operations.

49 Comments on “Implementation of FIR Filtering in C (Part 1)”

  1. Hi,
    thanks for the tutorial. just a quick question about the moving the input back by 63 samples. why not move it back 80 samples and discard the first 80-63 of the block. are you using overlap add or overlap save method ?

  2. ok i see now. you move from the 80th, 63 samples back to begining of buffer, makes sense now 🙂

  3. Klevis Says:

    Thanks it helped me a lot

  4. valentin Says:

    There is some bullshit in your example:

    for ( i = 0; i 32767.0 ) {
    095 input[i] = 32767.0;
    096 } else if ( input[i] < -32768.0 ) {
    097 input[i] = -32768.0;
    098 }

    I do not understand the loop and if. Might be it is not necessary an all in this example

  5. Deepak Bhatia Says:

    How do we generate the input file of sine wav with integer values ?

  6. Alina Says:

    How did you find the coefficients?

  7. nig Says:

    In case of .wav files .. how can I get wav file header information before starting the filtering?

  8. […] wav file using C programming. Till now I managed to tackle the filter part following this link: Implementation of FIR Filtering in C (Part 1) | Shawn's DSP Tutorials (I have my own filter co-efficients which have been produced using FDAtool on MATLAB and some other […]

  9. Andy Says:

    i have input value in txt file, in this program i got output file in txt but its not readable ……
    help please

    • Shawn Says:

      The program does not read nor write text files, but uses binary files instead. The format is raw PCM, 16 bits per sample, 8000 Hz sampling rate. Take a look at some of the previous comments on this page. I if ever get around to it, I’ll add some code to read and write .wav files, which would be a bit easier to use than raw PCM.

  10. hamid Says:

    hi dear
    my samplig rate is 1Mhz
    lenght of sampling is 256 sample
    i want to detect 145khz with FIR.
    can i use ur code?
    tanks in advance

  11. Ad Says:

    Hi Shawn,

    Great explanation of a FIR filter.

    I was looking at the implementation and shouldn’t the following line:

    inputp = &insamp[filterLength – 1 + n];

    be changed to:

    inputp = &insamp[filterLength – 1 + n – 1];

    Otherwise inputp is pointing to one element past the input array since its length is (filterLength – 1 + n) and an array’s last element is (array-length – 1).

    Furthermore, I realized that once the function firFloat is called it is not possible to change the length of the array of input samples. It is valid to have a requirement for a fixed-size array of input samples, but better make this explicit (at least in the comment).


    • Shawn Says:

      Hi Ad,

      Thanks for checking out my site.

      The inputp will not point past the end of the array as long as length is less than or equal to MAX_INPUT_LEN, because in the for loop, the largest value of n will be (length – 1) and the index will be (filterLength – 1 + length – 1).

      Although I have not tried it, I don’t see any reason why the length of the input samples couldn’t be changed from one call to the next. However, if the filter coefficients or the number of coefficients is changed, then the firFloatInit function should be called again before calling the firFloat function.


  12. Ad Says:

    Hi Shawn,

    Yes you are right about the inputp: the code is correct, I overlooked the fact that the maximum value for n is (length – 1), thanks for making it clear.

    About the length of the input samples: what if, for instance, the length in the first call to firFloat is 16 and in the second call 32, then the first memmove will move data 16 samples and the second memcpy will copy 32 samples thereby overwriting 16 samples. I think this could be easily solved by placing the memmove just before the memcpy. What do you think?


    • Shawn Says:

      Take a closer look at the memcpy and the memmove lines of code. The memcpy copies in the number of new input samples to be processed; the number of samples is “length”. The memmove keeps enough of the previous input samples to be multiplied by the filter coefficients; the number of samples is “filterLength -1”. The two lines of code serve different purposes and move different numbers of samples. The number of samples moved in the memmove is independent of the number of samples processed.


  13. Ad Says:

    Hi Shawn,

    The memmove moves “filterLength – 1” samples, but the space it creates for new samples does not depend on the number of samples that are moved but on the distance they are moved. And the distance is determined by the difference between the destination and the source pointer, which is “length” samples. For fixing it the memmove has to placed before memcpy and also some of the parameters for memmove and memcpy need to be different.

    But, as already said, it is not a problem if the “length” stays the same between successive calls to firFloat, so there is no need to change the code we just need to be aware of that.


  14. Tenz Says:

    Hi Shawn ,Thanks a lot!
    I have learned DSP theory,but start implementing in code lately
    I have 3 questions:

    First,it’s new to me using the pointer rather than applying directly the order of the arrays
    Like for(i=0; i<k; i++){
    for(j=max(0,i+1-n); j<=min(i,m-1); j++){
    w[i] += u[j]*v[i-j];
    What 's the advantages?

    Second,What's the meaning saving the inputsamples which have been processed at Line memmove?

    Third,I think the convolution Part is based on the fact that there is enough 0 after valid samples.otherwise the conv is incomplete
    For example ,assuming the length of coffes is N,the input M,then n must shift from 0 to more than N+M-1,otherwise we lose M datas.
    I think Line output[n] = acc is correctly processed only when there is more than M 0s after the end of the sample.

    Don't know whether I understand wrong.


  15. Tenz Says:

    Hi Shawn, thanks a lot,it really helps cuz I have learned DSP theory but just start implementing in code.
    I have 3 questions :
    First,I got confused with the convolution part ,Line output[n] = acc;
    I think this part must be based on the fact that there are enough 0s padded after the valid input of the last sample group of the original signal.For example ,assuming coffes with length of N,and input M,time n must shift from 0 to more than N+M-1,but in code we can see it stop at M,leaving M samples unprocessed,it must have at least N 0s padded to make this convolution complete.Do I understand wrongly?
    Second,What’s the meaning saving processed sample at Line memmove(); when we just process new samples at insamp[filterLength – 1]?
    Third,it’s new to me using pointer to change which one in an array to process rather that directly using the order like for(k = 0;k<n;k++)acc += h[k]*x[n-k];what’s the advantage?
    Waiting for your help sincerely.Thx.

    • Shawn Says:

      Hi Tenz,

      Sorry for the late reply, I’ve been away on vacation.

      1. Using a pointer instead of accessing array directly: I think for most compilers this will make no difference at all in terms of speed of the code. For me, using the pointers is similar to how I used to write code in assembly language on old DSP chips. It’s really up to you what code you prefer to use.

      2. Input samples need to be saved when the number of filter coefficients is greater than the number of new input samples. Otherwise there would not be enough samples with which to do the convolution.

      3. When processing is first started, zeroes are used for past input samples that are unknown. That is a common thing to do. You might use something other than zero in some cases (for example filtering frequency bins of an FFT, which is not a time series).

      I hope this helps.


  16. […] website do not list education, so I don’t think he has his Doctorate uses a simplified FIR Filter per iteration. I find if you apply the filter after you’ve built up the terrain, you will […]

  17. Nagananda Shet Says:

    Hi Shawn,

    In FIR filter, does the output remains for different frame sizes ?
    Kindly clarify

    • Shawn Says:

      Hi Nagananda,

      The output is the same no matter what frame size is used. Keep in mind that the number of samples in the output will be the same as the number of samples in the input. So if the block size is 100 samples, the output is 100 samples. If the block size is 1 sample, then the output is 1 sample. If you call the function 100 times in the second case, the 100 outputs will be the same as the one buffer of 100 outputs in the first case.

      I hope that clarifies it for you.


      • Nagananda Shet Says:

        Hi Shawn,
        Thank you for the quick reply.


        For ex: framesize = 32
        As per the convolution equation = y(n) = Summation (h(n) * x(n-k))
        K = 0 to 63
        Assume that n = 0 ; y(0) = h(0) * x(0 –(0 to 63))
        Assume that n = 31 ; y(31) = h(31) * x(31 –(0 to 63))

        For the next frame with framesize = 32, n will be 0 that corresponds to 32nd sample for frame size 64 and 64th sample for 96 frame size

        framesize = 64
        Assume that n = 32 ; y(32) = h(32) * x(32 –(0 to 63)) =====è which corresponds to the 0th sample for framesize 32, which differs in the output.

        is my understanding correct ?

        – Nagananda

  18. Shawn Says:


    Take a closer look at the equations. The summation is over k for the filter coefficients (or n-k if you look at equation 2). Your examples have h[n]. It should be something like this:

    y[32] = sum over k = 0 to 63 (h[k] * x[32 – k])
    y[0] = sum over k = 0 to 63(h[k] * x[0 – k])

    The “history” samples are set up such that x[0-k] in the 32 sample framesize corresponds to the same samples in the 64 sample frame. That is x[-1] for 32 is equal to x[31] for 64.


    • Nagananda Says:

      HI Shawn,

      Thank you so much for the clarification.

      I understood the scenario.

      If we dont consider the history(means not considering ‘the negetive index) then outputs will be different for different frame sizes.


  19. Basel Hussein Says:

    Can I use the same filter to filter different signals from different sources the signals are at the same frequency ??
    or i have to multiple the insamp buffer for each signal ??

    • Shawn Says:

      Hi Basel,

      That sounds like a good question, but I don’t know exactly what you are asking. To apply the filter to more than one input buffer, you would need to modify the code. Even if you were to use the same input, with different coefficients, the code would need to be modified. The reason, as I think you can see, is the the samples in the insamp buffer are shifted around each time the filtering function is called.

      One way to handle multiple inputs is to define a struct (or you could use a class in C++) that has the insamp buffer in it. And then, when initializing the filter, pass in a pointer to the structure, and initialize that particular “object”. You need to have a different instance of the structure for each filter that you would run. However you program it, you need to have an “insamp” buffer for each filter that you are going to use.

      I hope that answers your question.


      • Basel Hussein Says:

        Hi Shawn
        Thank you for your response 🙂
        Yes you answered my question !
        you answered exactly what i need that for each signal i have to have a insamp buffer to make the shifting in time done
        I have another question may you can help me with it i want to minimize my code and use less bytes as much i can , can I minimize the insamp buffer ??
        or i can each time just take my N samples and padding with zeroes to the length of the filter and in that way i don’t have to use static buffers to handle the previous samples ??
        I will be very glad for any advises !!
        Best regards

      • Shawn Says:


        You can minimize the memory used by the insamp buffer by storing only the filter history for each filter. That is, the “number of coefficients – 1” samples that are always copied to the beginning of insamp. Then you need some temporary memory to do the actual filtering. And you need to copy the history to the temporary memory before filter processing, and copy new samples to history after filter processing.

        Or…if you are running many filters on the same input signal, then you only need one insamp buffer. But you need to change the code to update the buffer only once for every set of filters that are run. This is what I have often done myself, sometimes using the same buffer for all kinds of different algorithms other than the just FIR filters as well. If you have many different lengths of FIR filters, you may want to align your input pointer to equalize the delays. If you have a symmetric FIR filter with an odd number of taps, N, then the delay is (N+1)/2. The shortest filter would have the input pointer advanced in time compared to the longest filter. You can think of this as the filter being applied to the input starting at the centre coefficient.

        It is quite hard to describe these details without showing some code, but sorry I don’t have time to write the code for you. Let me know if you have more questions based on what I said.


      • Basel Hussein Says:

        Hi Shawn
        Thank you for helping
        what im doing that i use the snae filter for diffrent signals
        and i’m trying do minimize my memory ,I did what youwrote keeping only the filter history and minimize my insamp buffer by the numbers of samples i get , my question is there any more things i can do to minmize my code


  20. Shawn Says:


    If you have symmetric FIR filter coefficients, you could save memory by not storing the duplicated coefficients. You would need different code for even and odd lengths. And you could reduce the number of multiplications by adding the two data values that are multiplied by the same coefficient value, before doing the multiplication.

    What kind of minimization are you trying to do? Instruction memory, constant data memory, RAM, or just everything?


  21. […] is my attempt to understand the intricacies involved in filter design. Shawn Stevenson’s blog proved to be a great guide in getting me started in this domain. However, for a beginner, a great […]

  22. luke Says:

    Hello, I’m trying to apply a fir filter to a wav file I read. First I tried with a SinusWave.wav file and I was getting some output… Then I tried with other audios such as songs or voices but the output is always ‘0’, there is no audio at all… Any idea of what could be happening?

    • Shawn Says:

      Hi Luke,

      I suspect that you are not reading the file correctly. What type of file is it? What kind of filter are you using? (lowpass, highpass, bandpass, …).


  23. Jin Says:

    how can i get integer output?

  24. Jin Says:

    Hi Shawn, I am trying with a small sample instead of reading a .pcm file. my input is {1,2,3,4,5}. and filter coefficients are{1,1,1,1}. But the output obtained is 8241 16483 24726 -32562 16483 . but the expected output of filtering this sample is is {1,3,6,9 14}. Any idea why i am getting like that?

    • Shawn Says:

      Hi Jin,

      When defining constant values that are doubles, it is a good idea to include a period so that the compiler knows that they are not integers. For example {1.0, 2.0, 3.0, 4.0, 5.0} for the input and {1.0, 1.0, 1.0, 1.0, 1.0} for the filter coefficients. When using “float” constants, put an “f” at the end, e.g. 1.0f. Are you modifying the code on this particular page?


  25. Matthew Says:

    Hi Shawn,
    EE student here, could this code be applied to a real time filter? For example using a raspberry Pi? Could you help me understand what that may change and why? I assume the window length would change.
    Anyways thank you, very helpful code example.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: