Clarify libgomp nvptx 'omp_low_lat_mem_space' documentation

PTX '%dynamic_smem_size' was "Introduced in PTX ISA version 4.1", and
"Requires 'sm_20' or higher".  Given that GCC/nvptx generally supports
'sm_20', only the PTX ISA version matters here, and that's all fine if
just using GCC's defaults.  Follow-up to
commit e9a19ead49
"openmp, nvptx: low-lat memory access traits".

	libgomp/
	* libgomp.texi: Clarify nvptx 'omp_low_lat_mem_space'
	documentation.
Thomas Schwinge 2024-11-12 09:54:35 +01:00
parent ab5bd6ac68
commit c80ecfa092


@@ -6972,8 +6972,10 @@ The implementation remark:
 memory-copy functions of the CUDA library. Higher dimensions will
 call those functions in a loop and are therefore supported.
 @item Low-latency memory (@code{omp_low_lat_mem_space}) is supported when the
-the @code{access} trait is set to @code{cgroup}, the ISA is at least
-@code{sm_53}, and the PTX version is at least 4.1. The default pool size
+the @code{access} trait is set to @code{cgroup}, and libgomp has
+been built for PTX ISA version 4.1 or higher (such as in GCC's
+default configuration). @c -mptx=4.1
+The default pool size
 is 8 kiB per team, but may be adjusted at runtime by setting environment
 variable @code{GOMP_NVPTX_LOWLAT_POOL=@var{bytes}}. The maximum value is
 limited by the available hardware, and care should be taken that the