We are striving to develop a revolutionary development tool that can really put MMX/SSE programming into the mainstream. We would greatly appreciate any feedback on Quexal: every comment, feature request, and bug report helps us design a product that better fulfills your MMX/SSE programming needs.

Frequently Asked Questions

Would MMX/SSE instructions accelerate my application?

MMX/SSE instructions can greatly enhance the performance of the following kinds of applications: multimedia (audio/video), communications, DSP kernels, 2D and 3D graphics, image processing, and speech recognition. Nowadays, all of these applications need to exploit MMX/SSE instructions to offer the level of performance that users expect. And Quexal is the only tool that can get the job done quickly and easily.

What are the differences between the Demo and the Registered versions of Quexal?

The Demo version is identical to the Registered version except that:
- the Save File and Save as Macro functions are disabled;
- the compiler will stop after the first solution;
- several compiler settings are fixed;
- the Export to Intrinsics function is disabled;
- it can be used for evaluation only.
Don't overlook difference #2: the first solution is rarely optimal, and medium-size routines typically enjoy tens of improvements over the first solution when compiled with the Registered version.

Why don't you add SSE floating point instructions to Quexal?

The current focus is on integer SIMD programming. Extending the Quexal environment to floating-point SIMD may be a natural upgrade path, but there are several reasons why it may not be worthwhile:
- it is easier to write FP vector compilers than integer vector compilers. In FP code you have only one operand size and a fixed level of parallelism (4 with SSE, 2 with 3DNow!). In addition, some MMX/SSE integer instructions do not map well onto C language syntax: the PSADBW instruction, for example, replaces a whole block of C code, but the coding style of the C programmer may not make the match obvious. It is also quite difficult to match a C multiplication to one of the MMX multiplies (high or low part?), as this requires a deep analysis of the dynamic ranges of the operands and of the instructions that follow;
- the Intel Pentium 4 processor has fast SSE but a slow x87 FPU, which means that mainstream compiler vendors will have to implement FP vectorizers to fully harness the P4's power. Intel offers C and Fortran SSE vectorizing compilers today, and others are likely to follow.

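To make the PSADBW and multiplication points above more concrete, here is a minimal sketch in C. It is only an illustration and is not code produced by Quexal. It assumes an x86 compiler that provides the SSE2 intrinsics in emmintrin.h, and it uses the 128-bit SSE2 form of PSADBW rather than the 64-bit MMX form. The function names sad16_scalar, sad16_sse2 and mul16_full are hypothetical and chosen for this example.

```c
#include <emmintrin.h>  /* SSE2 intrinsics (assumed available on the target) */
#include <stdint.h>
#include <stdlib.h>

/* Scalar reference: sum of absolute differences over 16 bytes.
   This is the kind of C loop that a single PSADBW can replace. */
static unsigned sad16_scalar(const uint8_t *a, const uint8_t *b)
{
    unsigned sum = 0;
    for (int i = 0; i < 16; i++)
        sum += (unsigned)abs((int)a[i] - (int)b[i]);
    return sum;
}

/* Same result with one PSADBW (SSE2 form) plus a final add.
   PSADBW yields two partial sums, one in each 64-bit half. */
static unsigned sad16_sse2(const uint8_t *a, const uint8_t *b)
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    __m128i s  = _mm_sad_epu8(va, vb);
    /* Low half sum sits in bits 0..15, high half sum in 16-bit word 4. */
    return (unsigned)_mm_cvtsi128_si32(s) + (unsigned)_mm_extract_epi16(s, 4);
}

/* A signed 16x16-bit multiply produces a 32-bit product, but PMULLW and
   PMULHW each return only one 16-bit half of it. A compiler has to decide
   which half (or both) the surrounding C code really needs. */
static int32_t mul16_full(int16_t x, int16_t y)
{
    __m128i vx = _mm_set1_epi16(x);
    __m128i vy = _mm_set1_epi16(y);
    uint16_t lo = (uint16_t)_mm_extract_epi16(_mm_mullo_epi16(vx, vy), 0); /* PMULLW */
    uint16_t hi = (uint16_t)_mm_extract_epi16(_mm_mulhi_epi16(vx, vy), 0); /* PMULHW */
    return (int32_t)(((uint32_t)hi << 16) | lo);
}
```

In sad16_scalar nothing in the C source names the operation, so a compiler must recognize the whole loop-and-abs pattern before it can emit PSADBW. Likewise, mul16_full needs both multiplies only because it keeps the full product; code that is known to stay within 16 bits would need PMULLW alone.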