Clarify libgomp nvptx 'omp_low_lat_mem_space' documentation

PTX '%dynamic_smem_size' was "Introduced in PTX ISA version 4.1", and
"Requires 'sm_20' or higher". Given that GCC/nvptx generally supports
'sm_20', only the PTX ISA version matters here, and that's all fine if
just using GCC's defaults. Follow-up to
commit e9a19ead49
"openmp, nvptx: low-lat memory access traits".

	libgomp/
	* libgomp.texi: Clarify nvptx 'omp_low_lat_mem_space'
	documentation.
parent ab5bd6ac68
commit c80ecfa092
1 changed file with 4 additions and 2 deletions
@@ -6972,8 +6972,10 @@ The implementation remark:
 memory-copy functions of the CUDA library. Higher dimensions will
 call those functions in a loop and are therefore supported.
 @item Low-latency memory (@code{omp_low_lat_mem_space}) is supported when the
-the @code{access} trait is set to @code{cgroup}, the ISA is at least
-@code{sm_53}, and the PTX version is at least 4.1. The default pool size
+the @code{access} trait is set to @code{cgroup}, and libgomp has
+been built for PTX ISA version 4.1 or higher (such as in GCC's
+default configuration). @c -mptx=4.1
+The default pool size
 is 8 kiB per team, but may be adjusted at runtime by setting environment
 variable @code{GOMP_NVPTX_LOWLAT_POOL=@var{bytes}}. The maximum value is
 limited by the available hardware, and care should be taken that the
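For illustration only, here is a minimal sketch of how user code might request the low-latency memory space that the documentation hunk describes, by setting the `access` trait to `cgroup`. The function name `try_low_lat_alloc` is hypothetical and not part of this commit; the sketch uses only the standard OpenMP allocator API and is written so that it still compiles (as a stub) when OpenMP is not enabled. Note that a successful allocation does not prove the memory came from the low-latency pool, since the allocator's default fallback trait may satisfy the request from another memory space.

```c
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper (not from the commit): try to allocate BYTES
   through an allocator for omp_low_lat_mem_space with access=cgroup.
   Returns 1 if an allocation through that allocator succeeded, 0 if it
   did not, and -1 if the file was compiled without OpenMP support.  */
int
try_low_lat_alloc (size_t bytes)
{
#ifdef _OPENMP
  int ok = 0;

  #pragma omp target map(tofrom: ok) map(to: bytes)
  {
    /* The documentation requires the 'access' trait to be 'cgroup'.  */
    omp_alloctrait_t traits[] = { { omp_atk_access, omp_atv_cgroup } };
    omp_allocator_handle_t al
      = omp_init_allocator (omp_low_lat_mem_space, 1, traits);

    if (al != omp_null_allocator)
      {
        char *p = (char *) omp_alloc (bytes, al);
        if (p != NULL)
          {
            p[0] = 1;            /* Touch the memory once.  */
            ok = (p[0] == 1);
            omp_free (p, al);
          }
        omp_destroy_allocator (al);
      }
  }
  return ok;
#else
  (void) bytes;
  return -1;
#endif
}
```

Assuming a GCC build with nvptx offloading, such code would typically be compiled with `-fopenmp`, and the per-team pool size could then be adjusted at run time as the documentation states, for example by running the program with `GOMP_NVPTX_LOWLAT_POOL=16384` in the environment (the value 16384 is an arbitrary example).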