Clarify libgomp nvptx 'omp_low_lat_mem_space' documentation

PTX '%dynamic_smem_size' was "Introduced in PTX ISA version 4.1", and
"Requires 'sm_20' or higher".  Given that GCC/nvptx generally supports
'sm_20', only the PTX ISA version matters here, and that's all fine if just
using GCC's defaults.

Follow-up to commit e9a19ead49 "openmp, nvptx: low-lat memory access traits".

	libgomp/
	* libgomp.texi: Clarify nvptx 'omp_low_lat_mem_space' documentation.
2024-11-12 09:54:35 +01:00
commit c80ecfa092
parent ab5bd6ac68
1 changed file with 4 additions and 2 deletions
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -6972,8 +6972,10 @@ The implementation remark:
      memory-copy functions of the CUDA library.  Higher dimensions will
      call those functions in a loop and are therefore supported.
@item Low-latency memory (@code{omp_low_lat_mem_space}) is supported when the
-      the @code{access} trait is set to @code{cgroup}, the ISA is at least
-      @code{sm_53}, and the PTX version is at least 4.1.  The default pool size
+      the @code{access} trait is set to @code{cgroup}, and libgomp has
+      been built for PTX ISA version 4.1 or higher (such as in GCC's
+      default configuration).  @c -mptx=4.1
+      The default pool size
      is 8 kiB per team, but may be adjusted at runtime by setting environment
      variable @code{GOMP_NVPTX_LOWLAT_POOL=@var{bytes}}.  The maximum value is
      limited by the available hardware, and care should be taken that the
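
For illustration only, the documented item can be exercised with a sketch like the following. It assumes the standard OpenMP allocator routines from `omp.h`; the helper name `sum_with_lowlat_scratch` is hypothetical and not part of libgomp. The sketch requests `omp_low_lat_mem_space` with the `access` trait set to `cgroup` inside a target region, and falls back to the default allocator if the request is rejected. When built without OpenMP, it simply uses `malloc`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper (not part of libgomp): sums 0 .. n-1 through a
   scratch buffer that is requested from low-latency memory.
   Returns -1 if no scratch buffer could be allocated.  */
long
sum_with_lowlat_scratch (int n)
{
  assert (n > 0);
  long sum = -1;

  /* Runs on the offload device if one is available, else on the host.  */
#pragma omp target map(tofrom: sum)
  {
#ifdef _OPENMP
    /* Ask for team-local access, the trait named in the documentation.  */
    omp_alloctrait_t traits[] = { { omp_atk_access, omp_atv_cgroup } };
    omp_allocator_handle_t lowlat
      = omp_init_allocator (omp_low_lat_mem_space, 1, traits);
    /* Assumption of this sketch: fall back to the default allocator
       when the low-latency request is rejected.  */
    omp_allocator_handle_t used
      = lowlat != omp_null_allocator ? lowlat : omp_default_mem_alloc;
    int *buf = (int *) omp_alloc ((size_t) n * sizeof (int), used);
#else
    int *buf = (int *) malloc ((size_t) n * sizeof (int));
#endif

    if (buf != NULL)
      {
        sum = 0;
        for (int i = 0; i < n; i++)
          buf[i] = i;
        for (int i = 0; i < n; i++)
          sum += buf[i];
      }

#ifdef _OPENMP
    omp_free (buf, used);
    if (lowlat != omp_null_allocator)
      omp_destroy_allocator (lowlat);
#else
    free (buf);
#endif
  }

  return sum;
}
```

Keeping the buffer small matters here, since the documented default pool is 8 kiB per team; a larger pool could be requested at run time via `GOMP_NVPTX_LOWLAT_POOL=@var{bytes}` as described above. Whether the allocation actually lands in low-latency memory depends on the offload configuration and cannot be observed from this sketch.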