// The two operators below are not virtual. If you cast a pointer
// to D3DXMATRIX*, do not delete the object through that pointer.
void operator delete(void* p);
void operator delete[](void* p);
When used by Direct3D extensions (D3DX) math functions, a 16-byte aligned matrix is optimized for improved performance on Intel Pentium 4 processors. Matrices are aligned regardless of where they are created: on the program stack, on the heap, or in global scope. Alignment is accomplished using __declspec(align(16)), which works with Microsoft® Visual C++® .NET, and with Visual C++ 6.0 only when the processor pack is installed. Unfortunately, there is no way to detect the processor pack, so 16-byte alignment is turned on by default only with Visual C++ .NET.
Vectors and quaternions are not 16-byte aligned in D3DX. When using vectors and quaternions with D3DX math functions, use __declspec(align(16)) to generate 16-byte aligned vectors and quaternions, because they will perform significantly better. The definition of the _ALIGN_16 macro, which wraps this declaration, is shown here.
#define _ALIGN_16 __declspec(align(16))
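As a minimal sketch of how such a macro might be applied, the following declares one global and one stack vector on 16-byte boundaries and checks their addresses. The names EXAMPLE_ALIGN_16, Vec3, IsAligned16, g_position, and StackVectorIsAligned are hypothetical and not part of D3DX. Vec3 stands in for a vector type such as D3DXVECTOR3 so that the sketch compiles without the D3DX headers, and the alignas fallback is an assumption made only so that the sketch also builds on compilers other than Visual C++.

```cpp
#include <cstdint>

// EXAMPLE_ALIGN_16 is a hypothetical counterpart of _ALIGN_16.
// The alignas branch is an assumption for non-Microsoft compilers.
#if defined(_MSC_VER)
#define EXAMPLE_ALIGN_16 __declspec(align(16))
#else
#define EXAMPLE_ALIGN_16 alignas(16)
#endif

// Hypothetical stand-in for a three-float vector type.
// On its own it only requires the alignment of a float.
struct Vec3
{
    float x, y, z;
};

// Returns true when p lies on a 16-byte boundary.
inline bool IsAligned16(const void* p)
{
    return (reinterpret_cast<std::uintptr_t>(p) & 15u) == 0;
}

// A vector in global scope, aligned through the macro.
EXAMPLE_ALIGN_16 Vec3 g_position = { 1.0f, 2.0f, 3.0f };

// A vector on the program stack, aligned through the macro.
inline bool StackVectorIsAligned()
{
    EXAMPLE_ALIGN_16 Vec3 direction = { 0.0f, 0.0f, 1.0f };
    return IsAligned16(&direction);
}
```

The macro is applied to each variable rather than to the type, so every declaration that is passed to the math functions has to carry it.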
Other compilers interpret D3DXMATRIXA16 as D3DXMATRIX. Using this structure on a compiler that does not actually align the matrix can be problematic, because code that ignores alignment appears to work and the bug stays hidden. For example, if a D3DXMATRIXA16 object is inside a structure or class, a memcpy might be done with tight packing (ignoring 16-byte boundaries). Such code would break if the compiler were later to add matrix alignment.
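The layout difference behind this problem can be sketched as follows. All names here (EXAMPLE_ALIGN_16, Matrix4, Matrix4A16, NodeUnaligned, NodeAligned, CopyNode) are hypothetical stand-ins rather than D3DX types, and the sizes in the comments assume a 4-byte float and a 4-byte std::uint32_t. The sketch shows that an aligned matrix member moves to a 16-byte offset and enlarges the enclosing structure, so a copy that assumes tight packing would use the wrong offsets and size.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical counterpart of _ALIGN_16; the alignas branch is an
// assumption for compilers other than Visual C++.
#if defined(_MSC_VER)
#define EXAMPLE_ALIGN_16 __declspec(align(16))
#else
#define EXAMPLE_ALIGN_16 alignas(16)
#endif

// Hypothetical stand-ins for an unaligned and a 16-byte aligned 4x4 matrix.
struct Matrix4
{
    float m[4][4];
};

struct EXAMPLE_ALIGN_16 Matrix4A16
{
    float m[4][4];
};

// Tightly laid out: the matrix starts right after the 4-byte id
// (offset 4, total size 68 under the stated assumptions).
struct NodeUnaligned
{
    std::uint32_t id;
    Matrix4 world;
};

// With the aligned matrix, 12 bytes of padding follow the id
// (offset 16, total size 80 under the stated assumptions).
struct NodeAligned
{
    std::uint32_t id;
    Matrix4A16 world;
};

// Compile-time checks that fail the build if the layout is not what
// the code expects, instead of letting a bad copy go unnoticed.
static_assert(offsetof(NodeAligned, world) % 16 == 0,
              "world must start on a 16-byte boundary");
static_assert(sizeof(NodeAligned) % 16 == 0,
              "NodeAligned must be a multiple of 16 bytes");

// Copies a whole node using sizeof rather than a hand-computed byte count,
// so the copy stays correct whether or not the compiler adds padding.
inline void CopyNode(NodeAligned& dst, const NodeAligned& src)
{
    std::memcpy(&dst, &src, sizeof(NodeAligned));
}
```

Relying on sizeof and offsetof instead of hard-coded byte counts is one way to keep such code correct on both aligning and non-aligning compilers.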