
The following table summarizes the latencies, in clock cycles, of MMX/iSSE instructions on the Intel Pentium III and Pentium 4 processors and on the AMD Athlon processor:

| Instruction | Pentium III | Pentium 4 | AMD Athlon |
| --- | --- | --- | --- |
| MOVD mm,r32 | 1 | 2 | 3 |
| MOVD r32,mm | 1 | 5 | 5 |
| MOVQ mm,mm | 1 | 6 | 2 |
| PACKSSWB / PACKSSDW / PACKUSWB mm,mm | 1 | 2 | 2 |
| PADDB / PADDW / PADDD mm,mm | 1 | 2 | 2 |
| PADDSB / PADDSW / PADDUSB / PADDUSW mm,mm | 1 | 2 | 2 |
| PAND / PANDN / POR / PXOR mm,mm | 1 | 2 | 2 |
| PCMPEQB / PCMPEQW / PCMPEQD mm,mm | 1 | 2 | 2 |
| PCMPGTB / PCMPGTW / PCMPGTD mm,mm | 1 | 2 | 2 |
| PMADDWD mm,mm | 3 | 8 | 3 |
| PMULHW / PMULLW / PMULHUW mm,mm | 3 | 8 | 3 |
| PSLLW / PSLLD / PSLLQ mm,mm/imm8 | 1 | 2 | 2 |
| PSRAW / PSRAD mm,mm/imm8 | 1 | 2 | 2 |
| PSUBB / PSUBW / PSUBD mm,mm | 1 | 2 | 2 |
| PSUBSB / PSUBSW / PSUBUSB / PSUBUSW mm,mm | 1 | 2 | 2 |
| PUNPCKHBW / PUNPCKHWD / PUNPCKHDQ mm,mm | 1 | 2 | 2 |
| PUNPCKLBW / PUNPCKLWD / PUNPCKLDQ mm,mm | 1 | 2 | 2 |
| EMMS | 6 | 12 | 2 |
| PAVGB / PAVGW mm,mm | 1 | 2 | 2 |
| PEXTRW r32,mm,imm8 | 2 | 7 | 7 |
| PINSRW mm,r32,imm8 | 4 | 4 | 5 |
| PMAX / PMIN mm,mm | 1 | 2 | 2 |
| PMOVMSKB r32,mm | 1 | 7 | 6 |
| PSADBW mm,mm | 5 | 4 | 3 |
| PSHUFW mm,mm,imm8 | 1 | 2 | 2 |

Latency: the number of clock cycles required to complete the execution of all of the µops that form an instruction.

Throughput: the number of clock cycles required to wait before the issue ports are free to accept the same instruction again.

Execution Unit: the names of the execution units in the execution core that are utilized to execute the µops for each instruction.
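
As a rough illustration of how the latency figures can be used, the sketch below sums the table's latencies along a strictly dependent chain, where each instruction consumes the result of the previous one. The chain itself, the `LATENCY` dictionary, and the `chain_latency` helper are hypothetical names chosen for this example. The result is only a back-of-the-envelope estimate: it assumes no overlap between the instructions and ignores throughput, decoding, and memory effects.

```python
# Latencies in clock cycles, copied from the table above.
# Tuple order: (Pentium III, Pentium 4, AMD Athlon)
LATENCY = {
    "MOVD mm,r32": (1, 2, 3),
    "PUNPCKLBW mm,mm": (1, 2, 2),
    "PMADDWD mm,mm": (3, 8, 3),
    "PADDD mm,mm": (1, 2, 2),
    "MOVD r32,mm": (1, 5, 5),
}
CPUS = ("Pentium III", "Pentium 4", "AMD Athlon")


def chain_latency(instructions):
    """Sum the latencies of a strictly dependent instruction chain, per CPU."""
    totals = [0] * len(CPUS)
    for name in instructions:
        for i, cycles in enumerate(LATENCY[name]):
            totals[i] += cycles
    return dict(zip(CPUS, totals))


# Hypothetical chain: every instruction depends on the one before it.
chain = [
    "MOVD mm,r32",
    "PUNPCKLBW mm,mm",
    "PMADDWD mm,mm",
    "PADDD mm,mm",
    "MOVD r32,mm",
]

print(chain_latency(chain))
```

For this chain the estimate is 7 cycles on the Pentium III, 19 on the Pentium 4, and 15 on the AMD Athlon. Most of the gap comes from the Pentium 4's longer multiply latency and from the higher cost of moving data from MMX registers to integer registers on both the Pentium 4 and the Athlon.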