OpenCL local work group size

Nov 20, 2011 23:17


Hi, first of all, let me say a big thank you to MacResearch for the tutorial. It really helped me to get started with OpenCL.

I am trying out the matrix multiplication code from here. It has an optimised algorithm that uses vector multiplication of size 16 (with tiles of size 16x16), which, if I understood everything correctly, should be executed with a local work group size of 16x16. I could never get this to run. The only things that did run were:

1) keep the vector multiplication and float16 but reduce the local work group size to 1x1 - this runs but, naturally, produces a rubbish result

2) reduce everything to 8; that is, use float8, tile size 8x8 and local work group size 8x8 - this gives the correct result but is not as fast.
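
One possible explanation, offered only as a guess, is that a 16x16 local size needs 256 work-items per group, and clEnqueueNDRangeKernel rejects a local size whose product exceeds the limit reported for the kernel (CL_KERNEL_WORK_GROUP_SIZE from clGetKernelWorkGroupInfo) or whose per-dimension sizes exceed CL_DEVICE_MAX_WORK_ITEM_SIZES. The sketch below is a hypothetical helper, not part of OpenCL or the tutorial: given those limits as plain numbers and a square matrix dimension n, it returns the largest power-of-two tile edge that fits and divides n. The function name and parameters are assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical helper (not an OpenCL API call).
 * max_wg_size : limit on work-items per group, e.g. the value one would
 *               read via CL_KERNEL_WORK_GROUP_SIZE (assumed input)
 * max_dim0/1  : per-dimension limits, e.g. from
 *               CL_DEVICE_MAX_WORK_ITEM_SIZES (assumed input)
 * n           : global size per dimension (square matrix edge)
 * Returns the largest power-of-two edge t with t*t <= max_wg_size,
 * t <= max_dim0, t <= max_dim1 and n divisible by t; returns 1 if
 * nothing larger fits. */
size_t largest_square_tile(size_t max_wg_size, size_t max_dim0,
                           size_t max_dim1, size_t n)
{
    size_t best = 1;
    /* t <= max_wg_size / t is the overflow-safe form of t*t <= max_wg_size */
    for (size_t t = 1; t <= n && t <= max_wg_size / t; t *= 2) {
        if (t > max_dim0 || t > max_dim1)
            break;              /* edge exceeds a per-dimension limit */
        if (n % t == 0)
            best = t;           /* tile edge evenly divides the global size */
    }
    return best;
}
```

With a limit of 256 this returns 16 for n = 1024, while a limit of 128 gives 8, which would be consistent with 8x8 running where 16x16 does not. Whether that is what happens here depends on the values the device and kernel actually report.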
http://www.macresearch.org/opencl-local-work-group-size

opencl, ktfr
