Add support for VEX-encoded SHA512-NI instructions.
Signed-off-by: Tomasz Kantecki <tomasz.kantecki@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Kind of embarrassing... I had not implemented the FRED instruction,
despite personally being one of the architects of FRED ;)
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
"LOCK XCHG reg,mem" would issue a warning for being unlockable, which
is incorrect. In this case the RM encoding is simply an alias for the
MR encoding. Add a "LOCK1" bit to deal with that.
However, XCHG is *always* locked, so create a new warning to
explicitly flag a user-specified LOCK XCHG; default off.
Consider optimizing that prefix away in the future, but for now, let's
stick to the user-requested code sequence.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
For VEX instructions created *after* the corresponding EVEX
instructions, we need the user to either explicitly declare them {vex}
or specifying "cpu latevex".
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Add vector instructions from the Intel Instruction Set Extensions
document, version 046, September 2022.
Still need to check for missing instructions that have already passed
through the ISE into the SDM.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Additional nonvector instructions from the Intel Instruction Set
Extensions document 319433-046 September 2022.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Add support for AVX512-FP16 instructions and the associated
handling. Allow "mapN" syntax as well as "mN" syntax to match the
documentation.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The VCVTNEPS2BF16 instruction was incorrectly specified as
VCVTNE2S2BF16. Fortunately, the correct opcode for the latter was
specified first, so it would emit the correct result when that
instruction was specified.
Fixes: https://bugzilla.nasm.us/show_bug.cgi?id=3392821
Reported-by: Agner <agner@agner.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The instruction supports two forms with [f2] and [f3].
I guess we might add aliases as VMGEXIT2 and VMGEXIT3.
For now simly leave a second form for ndisasm sake.
https://bugzilla.nasm.us/show_bug.cgi?id=3392755
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
BNDMK, BNDLDX, and BNDSTX are split-SIB (MIB) instructions, but do
*not* require a SIB encoding. However, TILELOAD* and TILESTORE* *do*
require a SIB in all cases. Split the MIB flag into MIB (split
address) and SIB (SIB required) flags.
This fixes travis test mpx.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
The VPCMP instructions are controlled by an immediate byte, but there
is also a set of SSE-derived legacy opcodes for VPCMPEQ and
VPCMPGT. For the specific cases of VPCMPEQ and VPCMPGT, prefer those
opcodes since they are one byte shorter.
Reported-by: ig <glucksmann@avast.com>
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Break the instruction processing if there are impossible combinations
of Sx flags and operand sizes. If the intent is to always require
explicit sizes, use the SX flag.
The INSERTPS instruction pattern was explicitly wrong, the rest of
these are nuisance fixes.
TODO: fix the disassembler to be able to exclude patterns where these
bits don't matter.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
MOVHPD takes a mem64, but was incorrectly tagged SO - an impossible
combination.
The Sx tags really are a problem and should be removed in the future
whereever possible, presumably in the master branch.
Reported-by: Lukas Hönig <lukashoenig@icloud.com>
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Most SSE instructions were missing memory operand sizes, resulting in
error if a memory operand was specified with explicit size.
Reported-by: <nemeth.marton@hotmail.com>
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
The X64 marker for "X86_64,LONG" has turned out to be a problem in
that it is easy to mistake for "long mode" when adding new
instructions, which results in duplicate CPU flags. Kill it off; it
isn't like we will legitimately have new instructions with this
pattern ever again.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Add instructions for Intel Control Flow Enforcement Technology (CET).
Signed-off-by: Henrik Gramner <henrik@gramner.com>
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
The a64 instruction patterns would incorrectly force REX to zero at a
point where REX prefixes have already been assigned. This is not only
incorrect in case of instructions which can use high registers, but it
causes an assertion failure. It happened to work for J*CXZ and LOOP*.
Reported-by: Philip Lantz <philip.lantz@intel.com>
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
When using VCMP and VPCMP operations with the condition in the opcode,
we should not have an immediate operand!
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
CMPXCHG8b/16b should be legitimate with an explicit operand size.
Reported-by: Xusheng Li <xushengli@protonmail.com>
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
We need the instruction table to contain the correct information for
both the reg and the rm field in the various modes.
Reported-by: <fasdfqwer@mail.com>
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
The two VPSH{R|L}DV* instructions had the wrong opcode.
Reported-by: Henrik Gramner <herik@gramner.com>
Link: https://bugzilla.nasm.us/show_bug.cgi?id=3392607
Signed-off-by: Henrik Gramner <henrik@gramner.com>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
The memory operand size of LEA doesn't matter in any way as it isn't
"real memory". Add an ANYSIZE option to ignore sizes entirely.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Distinguish instructions which have once been valid (OBSOLETE) from
those that never saw the light of day (NEVER). Futhermore, flag
instructions which devolve to an architectural noop from those with
undefined behavior and possibly recycled opcodes.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
This allows the K instructions to be specified without a size suffix
as long as the operands are sized; this matches the way most other x86
instructions work. As this is not the syntax specified in the SDM,
don't use it for disassembly.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
AMD documents this instruction with an rax operand. The error behavior
implies this is an address-size-sensitive instruction. Add support for
specifying the explicit operand, but consistent with normal ndisasm
behavior, don't disassemble the implicit operand.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Support the +n syntax for multiple contiguous registers, and emit it
in the output from ndisasm as well.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
New instructions which do four full iterations of a data-reduction
operation (FMA, dot product.)
Bug report: https://bugzilla.nasm.us/show_bug.cgi?id=3392492
Reported-by: ff_ff <qqqqqqqqqfffffffff@gmail.com>
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>