1.

How Do Memory Operations in GPUs Differ from Those in CPUs?

Answer

GPUs have significantly smaller caches per thread, making the average latency of memory operations much higher. This requires many concurrent threads to hide the latency. Also, shared memory can be used as an explicitly managed cache under the direct control of the programmer, making it possible to utilize the cache better in some situations. Further, because of SIMD warp instructions, multiple memory accesses are made per instruction. These accesses can be coalesced into a smaller number of real transactions if the address set is contiguous for global memory, or conflict-free across banks (e.g., unit-stride) for shared memory.
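As a rough sketch (kernel names and sizes are illustrative, not from the answer above), the two access patterns described can be contrasted in CUDA:

```cuda
#include <cuda_runtime.h>

// Consecutive threads in a warp read consecutive addresses, so the
// warp's 32 loads coalesce into a few wide memory transactions.
__global__ void copy_coalesced(const float *in, float *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i];
}

// Shared memory used as an explicitly managed cache: each block stages
// a tile of the input once, then all threads in the block reuse it
// without further global-memory traffic. Shared memory is banked, so a
// conflict-free (e.g., unit-stride) pattern across a warp is fastest.
__global__ void tile_sum(const float *in, float *out, int n) {
    __shared__ float tile[256];                 // one tile per block
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    tile[threadIdx.x] = (i < n) ? in[i] : 0.0f; // coalesced staging load
    __syncthreads();                            // tile fully populated

    // Block-wide tree reduction entirely in shared memory.
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (threadIdx.x < s)
            tile[threadIdx.x] += tile[threadIdx.x + s];
        __syncthreads();
    }
    if (threadIdx.x == 0) out[blockIdx.x] = tile[0];
}
```

Launching many blocks of, say, 256 threads gives the hardware enough concurrent warps to overlap memory latency with useful work, which is the latency-hiding strategy the answer describes.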


