Parallel reduction is a common building block for many parallel algorithms. A 2007 presentation by Mark Harris provided a detailed strategy for implementing parallel reductions on GPUs, but that six-year-old document bears updating. In this post I will show you two features of the Kepler GPU architecture that make reductions even faster: the shuffle (SHFL) instruction and fast device memory atomic operations.
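To preview the first of these features, here is a minimal sketch of a warp-level sum reduction built on the shuffle instruction. Details (the `warpReduceSum` name, the loop structure) are illustrative, not a definitive implementation; `__shfl_down` lets each lane read a register value from a lane a fixed offset higher in the same warp, with no shared memory or synchronization required.

```cuda
// Sketch: sum a value across the 32 lanes of a warp using SHFL.
// After the loop, lane 0 holds the total; other lanes hold partial sums.
// Uses the Kepler-era __shfl_down intrinsic (later CUDA versions
// replace it with __shfl_down_sync, which takes a lane mask).
__inline__ __device__ int warpReduceSum(int val) {
  for (int offset = warpSize / 2; offset > 0; offset /= 2)
    val += __shfl_down(val, offset);  // add the value from lane (laneId + offset)
  return val;
}
```

Each iteration halves the number of lanes holding distinct partial sums (16, 8, 4, 2, 1), so the whole warp reduces in five steps without touching shared memory.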