GK110 was designed and marketed with computational performance in mind. It also attempts to maximise energy efficiency by executing as many tasks as possible in parallel, according to the capabilities of its streaming processors. Compared with previous models, GK110 increases both memory space and bandwidth for the register file and the L2 cache. At the SMX level, the register file has grown to 256 KB, composed of 64K 32-bit registers, compared with Fermi's 32K 32-bit registers totaling 128 KB. The L2 cache has grown to 1.5 MB, twice the size of GF110's. Bandwidth for both the L2 cache and the register file has also doubled. With more registers available to each thread, performance in register-starved scenarios improves as well. This goes hand in hand with an increase in the number of registers each thread can address, from 63 per thread to 255 with GK110. Nvidia also reworked the GPU texture cache so that it can be used for compute. At 48 KB, the texture cache becomes a read-only cache in compute, specializing in unaligned memory access workloads. Furthermore, error detection capabilities have been added to make it safer for use with workloads that rely on ECC.
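The register file figures can be checked with a little arithmetic. The sketch below is illustrative only: the function names are made up for this example, and the thread counts are a simple upper bound that ignores register allocation granularity and any other hardware limits on resident threads.

```python
BYTES_PER_REGISTER = 4  # each register is 32 bits wide


def register_file_kb(num_registers: int) -> int:
    """Register file size in KB for a given count of 32-bit registers."""
    return num_registers * BYTES_PER_REGISTER // 1024


def threads_at_register_cap(num_registers: int, regs_per_thread: int) -> int:
    """Upper bound on threads that fit if each one uses regs_per_thread registers.

    Assumption: registers are handed out one at a time, with no rounding.
    """
    return num_registers // regs_per_thread


GK110_REGISTERS = 64 * 1024  # per SMX, from the figures quoted in this post
FERMI_REGISTERS = 32 * 1024  # Fermi, from the figures quoted in this post

print(register_file_kb(GK110_REGISTERS))              # 256
print(register_file_kb(FERMI_REGISTERS))              # 128
print(threads_at_register_cap(GK110_REGISTERS, 255))  # 257
print(threads_at_register_cap(FERMI_REGISTERS, 63))   # 520
```

In other words, a kernel that uses all 255 addressable registers per thread leaves room for only a few hundred threads in the register file, which is why the larger file and the higher per-thread limit go together.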