Posted tagged ‘multiplication’

Implementation of FIR Filtering in C (Part 2)

October 8, 2009

In Part 1 I showed how to code a FIR filter in C using floating point. In this lesson I will show how to do the same thing using fixed point operations. The code example below will demonstrate the application of fixed point multiplication, rounding and saturation. The code has definitions for the FIR filtering function, followed by an example test program.

firFixed

And here is the code example:


#include <stdio.h>
#include <stdint.h>

//////////////////////////////////////////////////////////////
//  Filter Code Definitions
//////////////////////////////////////////////////////////////

// maximum number of inputs that can be handled
// in one function call
#define MAX_INPUT_LEN   80
// maximum length of filter than can be handled
#define MAX_FLT_LEN     63
// buffer to hold all of the input samples
#define BUFFER_LEN      (MAX_FLT_LEN - 1 + MAX_INPUT_LEN)

// array to hold input samples
int16_t insamp[ BUFFER_LEN ];

// FIR init
void firFixedInit( void )
{
memset( insamp, 0, sizeof( insamp ) );
}

// the FIR filter function
void firFixed( int16_t *coeffs, int16_t *input, int16_t *output,
int length, int filterLength )
{
int32_t acc;     // accumulator for MACs
int16_t *coeffp; // pointer to coefficients
int16_t *inputp; // pointer to input samples
int n;
int k;

// put the new samples at the high end of the buffer
memcpy( &insamp[filterLength - 1], input,
length * sizeof(int16_t) );

// apply the filter to each input sample
for ( n = 0; n < length; n++ ) {
// calculate output n
coeffp = coeffs;
inputp = &insamp[filterLength - 1 + n];
acc = 1 << 14;
// perform the multiply-accumulate
for ( k = 0; k < filterLength; k++ ) {
acc += (int32_t)(*coeffp++) * (int32_t)(*inputp--);
}
// saturate the result
if ( acc > 0x3fffffff ) {
acc = 0x3fffffff;
} else if ( acc < -0x40000000 ) {
acc = -0x40000000;
}
// convert from Q30 to Q15
output[n] = (int16_t)(acc >> 15);
}

// shift input samples back in time for next time
memmove( &insamp[0], &insamp[length],
(filterLength - 1) * sizeof(int16_t) );

}

//////////////////////////////////////////////////////////////
//  Test program
//////////////////////////////////////////////////////////////

// bandpass filter centred around 1000 Hz
// sampling rate = 8000 Hz
// gain at 1000 Hz is about 1.13

#define FILTER_LEN  63
int16_t coeffs[ FILTER_LEN ] =
{
-1468, 1058,   594,   287,    186,  284,   485,   613,
495,   90,  -435,  -762,   -615,   21,   821,  1269,
982,    9, -1132, -1721,  -1296,    1,  1445,  2136,
1570,    0, -1666, -2413,  -1735,   -2,  1770,  2512,
1770,   -2, -1735, -2413,  -1666,    0,  1570,  2136,
1445,    1, -1296, -1721,  -1132,    9,   982,  1269,
821,   21,  -615,  -762,   -435,   90,   495,   613,
485,  284,   186,   287,    594, 1058, -1468
};

// number of samples to read per loop
#define SAMPLES   80

int main( void )
{
int size;
int16_t input[SAMPLES];
int16_t output[SAMPLES];
FILE   *in_fid;
FILE   *out_fid;

// open the input waveform file
in_fid = fopen( "input.pcm", "rb" );
if ( in_fid == 0 ) {
printf("couldn't open input.pcm");
return;
}

// open the output waveform file
out_fid = fopen( "outputFixed.pcm", "wb" );
if ( out_fid == 0 ) {
printf("couldn't open outputFixed.pcm");
return;
}

// initialize the filter
firFixedInit();

// process all of the samples
do {
size = fread( input, sizeof(int16_t), SAMPLES, in_fid );
// perform the filtering
firFixed( coeffs, input, output, size, FILTER_LEN );
// write samples to file
fwrite( output, sizeof(int16_t), size, out_fid );
} while ( size != 0 );

fclose( in_fid );
fclose( out_fid );

return 0;
}



The first thing to notice is that the definitions for the input sample storage and handling are nearly the same as for the code in Part 1. The only difference is that the storage type is a 16 bit integer instead of double precision floating point.

The next difference is the inclusion of rounding in the calculation of each output. Rounding is used when converting the calculated result from a Q30 format number to Q15. Notice that I have loaded the rounding constant into the accumulator value (acc) at the beginning of the loop rather than adding it at the end. This is a small optimization commonly seen in code for FIR filters. If you are coding in assembly language, and your chip has a rounding instruction, it may be better to do the rounding at the end (depending on what the instruction actually does).

The multiplication itself is now an integer multiplication of two 16 bit values, each of which is a Q15 number. The accumulator variable is 32 bits, and holds a Q1.30 format number. There is one bit for the sign, one integer bit, and thirty fractional bits. Notice that I have cast each multiplier to a 32 bit value. Failure to do so should result in a 16 bit product and produce wrong results.

Next comes the overflow handling for converting from Q1.30 to Q30. The code checks for values beyond the limits of the largest/smallest Q30 number (no integer bits), and saturates if necessary.

Finally, a right shift by 15 is used to convert the MAC result from Q30 to Q15 and the result is stored to the output array.

The test program is simpler than the one in Part 1 because there is no need to convert the input and output samples to or from floating point. The most important thing to note is the change in the filter coefficient array. To generate these coefficients, I took the floating point coefficients from Part 1, multiplied by 32768, and then rounded to the nearest integer. The coefficients are in Q15 format, and note that none of the original floating point coefficients are close to one. Multiplying by 32768 would cause a problem for any coefficients larger than 32767/32768 or less than -1.

As in Part 1, the test input file should be 16 bit samples at a sampling rate of 8000 Hz. Try using a 1000 Hz tone with noise added to it. The bandpass filter should remove a large portion of the noise.

The filter has greater than unity gain at 1000 Hz and the gain is about 1.13. Admittedly it is not a great filter design, but it tests the saturation code if a full scale 1000 Hz sine wave is used as an input. Try a full scale 1000 Hz tone input with and without the saturation check and see what the difference is. There should be a small amount of distortion with the saturation code present (due to slight flattening of tone) and a large amount of distortion without it (due to overflow).

A fixed point FIR filter is fairly easy to implement. Personally I find the sample storage and movement to be the most difficult part. In Part 3 of this lesson I will demonstrate how to run multiple FIR filters on the same input.

Fixed Point Multiplication

July 14, 2009

Fixed point addition and subtraction are straightforward. Additions and subtractions are performed using integer operations. For example, if two 16 bit Q15 format numbers are added, the result is a Q15 number. But what about fixed about multiplication? What happens if two Q15 numbers are multiplied?

Let’s try an example. Take 0.5 multiplied by 0.25. In Q15 the number 0.5 is represented (in hexadecimal) as 0x8000 times 0.5 or 0x4000. Similarly, 0.25 is 0x2000. When we multiply these together, the product is 0x08000000. Obviously the result is not a Q15 number since the number of bits required is more than 16. The expected product, 0.125, is 0x1000 in Q15.

To see what is going on, define the following two Q15 numbers a and b:

$a = \frac{\hat{a}}{{2}^{15}}$

$b = \frac{\hat{b}}{{2}^{15}}$

where $\hat{a}$ and $\hat{b}$ are the integer representations of our numbers (0x4000 and 0x2000 in our example). The product of a and b is:

$c = ab=\frac{\hat{a}\hat{b}}{{2}^{30}}$

From the above, it can be seen that the product is a Q30 number. Going back to our example, 0x4000 times 0x2000 is 0x08000000, which is 0.125 times ${2}^{30}$.

A general rule when multiplying a Qm format number by a Qn format number, is that the product will be a Q(m+n) number. The number of bits required to represent the product is at least (n+m) for unsigned multiplication and (n+m+1) for signed (twos complement) multiplication.

For the more general case of a Qa.b number times a Qc.d number, the product is Q(a+c).(b+d).  The number of bits needed for the result is (a + b + c + d + 1) for signed numbers (and one less for unsigned numbers).

Consider the example of a Q16 unsigned multiplication between the two largest unsigned numbers that can be represented. The largest Q16 number is 65535/65536 = 0.9999847412109375. The product is 0xffff times 0xffff or 0xfffe0001. The result is a Q32 number requiring at least 32 bits. If we divide by ${2}^{32}$ then we get 0.99996948265470564365386962890625, the expected result.

There are a number of things that are done with the product of a multiplication, depending on the application. Some of the commonly seen options are:

1. Convert the product to a different Q format.

2. Use the product in the resulting Q format.

3. Add the product to a running sum in an accumulator register.

4. Convert the product to a different Q format, then add to a running sum.

Let’s look at some of these options for the case of signed multiplication using Q15 format numbers. For case 1, assume we want to multiply two Q15 numbers and get a Q15 result. The required operation is to take the Q30 product, and shift it right by 15 bits. The result can then be stored in 16 bits. There is also the option of rounding the product before shifting out the lower 15 bits (I may discuss rounding in a future post). Some CPU architectures are better set up to shift the product left by 1, and then store the upper 16 bits. This is almost exactly the same as shifting right by 15 bits and keeping the lower 16 bits.

Multiply-accumulate (MAC) operations are used a lot in many DSP algorithms. Many processors have one or more dedicated accumulator registers for this purpose (often with 32 or 40 bits). For the case of Q15 multiplies, each Q30 product can be summed to the accumulator.

I have seen a lot of code that shifts each product left by 1 when performing the MAC operations. Some DSP chips can do the left shift in hardware using a special mode of the ALU. In this case, the value in the accumulator is in Q31 format. Although very common, this method has a greater chance of overflow problems since each product is effectively two times bigger. I think this method became popular because certain older DSP chip architectures required the storing of the high 16 bits of the accumulator or product register, rather than having a single cycle instruction allowing a shift by 15 bits.

In summary, because multiplication operations are often a chief component of signal processing implementations, it is important to understand how they work. This is especially true for fixed point operations, where one must know the effect of multiplication on the format of the numbers themselves.