Enqueue multiple times if the group size is not uniform: at most 2 times for 1D, 4 times for 2D, and 8 times for 3D. Use the workdim offset of the walker in the batch buffer to keep the work groups in series.
TODO: handle events for the flush between multiple enqueues
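
The splitting described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the type `sub_range_t` and the function `split_nd_range` are hypothetical names, and using the remainder itself as the local size of an edge region is an assumption. Each dimension is cut into a part that is a whole multiple of the local size and a remainder part, which yields at most 2, 4, or 8 uniform sub-ranges for 1D, 2D, or 3D. The sketch does not perform the enqueue or the flush between enqueues.

```c
#include <stddef.h>

/* Hypothetical type: one uniform sub-range of the original ND-range. */
typedef struct {
    size_t offset[3]; /* start of the sub-range in each dimension */
    size_t global[3]; /* number of work items in each dimension   */
    size_t local[3];  /* work-group size used for this sub-range  */
} sub_range_t;

/*
 * Split an ND-range whose global size is not a multiple of the local
 * size into uniform sub-ranges. Bit d of `mask` selects, for dimension
 * d, either the evenly divisible part (bit clear) or the remainder
 * part (bit set). Empty combinations are skipped, so a fully uniform
 * range produces exactly one sub-range.
 * Returns the number of sub-ranges written to `out` (at most 1 << dim).
 */
static int split_nd_range(unsigned dim, const size_t *global,
                          const size_t *local, sub_range_t out[8])
{
    int count = 0;

    for (unsigned mask = 0; mask < (1u << dim); mask++) {
        sub_range_t r;
        int empty = 0;

        for (unsigned d = 0; d < 3; d++) {
            if (d >= dim) {
                /* Unused dimensions collapse to a single work item. */
                r.offset[d] = 0;
                r.global[d] = 1;
                r.local[d] = 1;
                continue;
            }

            size_t div = global[d] / local[d] * local[d];
            size_t rem = global[d] % local[d];

            if (mask & (1u << d)) {
                /* Remainder part (assumption: local size = remainder). */
                r.offset[d] = div;
                r.global[d] = rem;
                r.local[d] = rem;
            } else {
                /* Evenly divisible part keeps the requested local size. */
                r.offset[d] = 0;
                r.global[d] = div;
                r.local[d] = local[d];
            }

            if (r.global[d] == 0)
                empty = 1;
        }

        if (!empty)
            out[count++] = r;
    }
    return count;
}
```

For example, a 2D range with global size 10x7 and local size 4x4 splits into four sub-ranges, while a 1D range with global size 8 and local size 4 stays a single enqueue.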
b8e07f6 Runtime: Add support for non uniform group size
src/cl_api_kernel.c | 13 ---------
src/cl_command_queue.c | 64 +++++++++++++++++++++++++++++++++++++++++--
src/cl_command_queue_gen7.c | 19 +++++++------
src/cl_driver.h | 1 +
src/intel/intel_gpgpu.c | 14 ++++++----
5 files changed, 81 insertions(+), 30 deletions(-)
Upstream: cgit.freedesktop.org