Find a file
Tom de Vries 052aaaceed [nvptx] Don't allow vector_length 64 with num_workers 16
When using a compiler build with:
...
+#define PTX_DEFAULT_VECTOR_LENGTH PTX_CTA_SIZE
...

consider a test-case:
...
int
main (void)
{
  #pragma acc parallel vector_length (64)
    #pragma acc loop worker
    for (unsigned int i = 0; i < 32; i++)
      #pragma acc loop vector
      for (unsigned int j = 0; j < 64; j++)
        ;

  return 0;
}
...

If num_workers is 16, either because:
- we add a "num_workers (16)" clause on the parallel directive, or
- we set "GOMP_OPENACC_DIM=:16:", or
- the libgomp plugin chooses 16 num_workers
we run into an illegal instruction at runtime, because a bar.sync instruction
tries to use a barrier 16.  The instruction is illegal, because ptx supports
only 16 barriers per CTA, and the valid range is 0..15.

The problem is that with a warp-multiple vector length, we use a code generation
scheme with a per-worker barrier.  And because barrier zero is reserved for
per-cta barrier, only the remaining 15 barriers can be used as per-worker
barrier, and consequently we can't use num_workers larger than 15.

This problem occurs only for vector_length 64.  For vector_length 32, we use a
different code generation scheme, and for vector_length >= 96, the maximum
num_workers is not big enough not to trigger this problem.

Also, this problem only occurs for num_workers 16.  As explained above,
num_workers 15 is safe to use, and 16 is already the maximum num_workers for
vector_length 64.

This patch fixes the problem in both the compiler (handling "num_workers (16)")
and in the libgomp nvptx plugin (with and without "GOMP_OPENACC_DIM=:16:").

2019-01-11  Tom de Vries  <tdevries@suse.de>

	* config/nvptx/nvptx.c (PTX_CTA_NUM_BARRIERS, PTX_PER_CTA_BARRIER)
	(PTX_NUM_PER_CTA_BARRIER, PTX_FIRST_PER_WORKER_BARRIER)
	(PTX_NUM_PER_WORKER_BARRIERS): Define.
	(nvptx_apply_dim_limits): Prevent vector_length 64 and
	num_workers 16.

	* plugin/plugin-nvptx.c (nvptx_exec): Prevent vector_length 64 and
	num_workers 16.

From-SVN: r267838
2019-01-11 11:46:43 +00:00
config iconv.m4 (AM_ICONV_LINK): Don't overwrite CPPFLAGS. 2018-11-07 15:41:21 -07:00
contrib PR other/16615 [1/5] 2019-01-09 16:37:45 -05:00
fixincludes Update GCC to autoconf 2.69, automake 1.15.1 (PR bootstrap/82856). 2018-10-31 17:03:16 +00:00
gcc [nvptx] Don't allow vector_length 64 with num_workers 16 2019-01-11 11:46:43 +00:00
gnattools PR81878: fix --disable-bootstrap --enable-languages=ada 2018-11-20 00:07:47 +00:00
gotools Update GCC to autoconf 2.69, automake 1.15.1 (PR bootstrap/82856). 2018-10-31 17:03:16 +00:00
include PR other/16615 [2/5] 2019-01-09 16:39:49 -05:00
INSTALL
intl iconv.m4 (AM_ICONV_LINK): Don't overwrite CPPFLAGS. 2018-11-07 15:41:21 -07:00
libada Update copyright years. 2019-01-01 13:31:55 +01:00
libatomic Update copyright years. 2019-01-01 13:31:55 +01:00
libbacktrace PR other/16615 [1/5] 2019-01-09 16:37:45 -05:00
libcc1 Update copyright years. 2019-01-01 13:31:55 +01:00
libcpp Update copyright years. 2019-01-01 13:31:55 +01:00
libdecnumber Update copyright years. 2019-01-01 13:31:55 +01:00
libffi Update GCC to autoconf 2.69, automake 1.15.1 (PR bootstrap/82856). 2018-10-31 17:03:16 +00:00
libgcc PR other/16615 [1/5] 2019-01-09 16:37:45 -05:00
libgfortran PR other/16615 [1/5] 2019-01-09 16:37:45 -05:00
libgo runtime: in doscanstackswitch, set gp->m before gogo 2019-01-07 22:07:26 +00:00
libgomp [nvptx] Don't allow vector_length 64 with num_workers 16 2019-01-11 11:46:43 +00:00
libhsail-rt Update copyright years. 2019-01-01 13:31:55 +01:00
libiberty PR other/16615 [2/5] 2019-01-09 16:39:49 -05:00
libitm Update copyright years. 2019-01-01 13:31:55 +01:00
libobjc PR other/16615 [4/5] 2019-01-09 16:44:56 -05:00
liboffloadmic PR other/16615 [1/5] 2019-01-09 16:37:45 -05:00
libphobos libphobos: Merge phobos upstream b022e552a 2019-01-09 23:04:20 +00:00
libquadmath Update copyright years. 2019-01-01 13:31:55 +01:00
libsanitizer Cherry pick libsanitizer patch (https://reviews.llvm.org/D54856). 2018-12-27 09:47:20 +00:00
libssp Update copyright years. 2019-01-01 13:31:55 +01:00
libstdc++-v3 PR libstdc++/88125 remove duplicate entry in linker script 2019-01-11 11:39:45 +00:00
libvtv Update copyright years. 2019-01-01 13:31:55 +01:00
lto-plugin Update copyright years. 2019-01-01 13:31:55 +01:00
maintainer-scripts Add new maintainer script for PRs that can be closed. 2018-11-22 14:05:54 +00:00
zlib Update GCC to autoconf 2.69, automake 1.15.1 (PR bootstrap/82856). 2018-10-31 17:03:16 +00:00
.dir-locals.el
.gitattributes
.gitignore
ABOUT-NLS
ar-lib Update GCC to autoconf 2.69, automake 1.15.1 (PR bootstrap/82856). 2018-10-31 17:03:16 +00:00
ChangeLog Update config.guess, config.sub (PR target/88535) 2019-01-03 11:28:27 +00:00
ChangeLog.jit
ChangeLog.tree-ssa
compile
config-ml.in Add D front-end, libphobos library, and D2 testsuite. 2018-10-28 19:51:47 +00:00
config.guess Update config.guess, config.sub (PR target/88535) 2019-01-03 11:28:27 +00:00
config.rpath
config.sub Update config.guess, config.sub (PR target/88535) 2019-01-03 11:28:27 +00:00
configure darwin - add configuration support for 'otool' 2018-12-05 21:57:00 +00:00
configure.ac darwin - add configuration support for 'otool' 2018-12-05 21:57:00 +00:00
COPYING
COPYING.LIB
COPYING.RUNTIME
COPYING3
COPYING3.LIB
depcomp
install-sh
libtool-ldflags
libtool.m4 Update GCC to autoconf 2.69, automake 1.15.1 (PR bootstrap/82856). 2018-10-31 17:03:16 +00:00
ltgcc.m4
ltmain.sh libtool.m4: Sort output of 'find' to enable deterministic builds. 2018-07-05 13:13:45 -06:00
ltoptions.m4
ltsugar.m4
ltversion.m4
lt~obsolete.m4
MAINTAINERS Update maintainer email address 2018-12-21 17:53:03 +00:00
Makefile.def Add D front-end, libphobos library, and D2 testsuite. 2018-10-28 19:51:47 +00:00
Makefile.in darwin - add configuration support for 'otool' 2018-12-05 21:57:00 +00:00
Makefile.tpl darwin - add configuration support for 'otool' 2018-12-05 21:57:00 +00:00
missing
mkdep
mkinstalldirs
move-if-change
multilib.am Update GCC to autoconf 2.69, automake 1.15.1 (PR bootstrap/82856). 2018-10-31 17:03:16 +00:00
README
symlink-tree
test-driver Update GCC to autoconf 2.69, automake 1.15.1 (PR bootstrap/82856). 2018-10-31 17:03:16 +00:00
ylwrap

This directory contains the GNU Compiler Collection (GCC).

The GNU Compiler Collection is free software.  See the files whose
names start with COPYING for copying permission.  The manuals, and
some of the runtime libraries, are under different terms; see the
individual source files for details.

The directory INSTALL contains copies of the installation information
as HTML and plain text.  The source of this information is
gcc/doc/install.texi.  The installation information includes details
of what is included in the GCC sources and what files GCC installs.

See the file gcc/doc/gcc.texi (together with other files that it
includes) for usage and porting information.  An online readable
version of the manual is in the files gcc/doc/gcc.info*.

See http://gcc.gnu.org/bugs/ for how to report bugs usefully.

Copyright years on GCC source files may be listed using range
notation, e.g., 1987-2012, indicating that every year in the range,
inclusive, is a copyrightable year that could otherwise be listed
individually.