Category Archives: Development

GPU Shared Memory Performance Optimization

This post is Topic #3 (part 2) in our series Parallel Code: Maximizing your Performance Potential. In my previous post, I provided an introduction to the various types of memory available for use in a CUDA application. Now that you’re … Continue reading

GPU Memory Types – Performance Comparison

This post is Topic #3 (part 1) in our series Parallel Code: Maximizing your Performance Potential. CUDA devices have several different memory spaces: global, local, texture, constant, shared, and register memory. Each type of memory on the device has its … Continue reading

Optimize CUDA Host/Device Transfers

This post is Topic #2 (part 2) in our series Parallel Code: Maximizing your Performance Potential. In my previous post, CUDA Host/Device Transfers and Data Movement, I provided an introduction to the bottlenecks associated with host/device transfers and data movement. … Continue reading

CUDA Host/Device Transfers and Data Movement

This post is Topic #2 (part 1) in our series Parallel Code: Maximizing your Performance Potential. In post #1, I discussed a few ways to optimize the performance of your application via controlling your threads and provided some insight as … Continue reading

CUDA Parallel Thread Management

This post is Topic #1 in our series Parallel Code: Maximizing your Performance Potential. Regardless of the environment or architecture you are using, one thing is certain: you must properly manage the threads running in your application to optimize performance. This … Continue reading

Parallel Code: Maximizing your Performance Potential

No matter what the purpose of your application is, one thing is certain: you want to get the most bang for your buck. You see research papers being published and presented that claim tremendous speed increases from running algorithms … Continue reading

5 Easy First Steps on GPUs – Accelerating Your Applications

This week NVIDIA provided a tutorial outlining first steps for GPU acceleration using OpenACC and CUDA. This was offered as part of the “GPUs Accelerating Research” week at Northeastern University and Boston University. After attending, it seemed appropriate to review … Continue reading

Software Support for Intel® Xeon Phi™ Coprocessors

With Intel’s release of the Xeon Phi coprocessor cards, HPC users must ask themselves how much performance they need and how they plan to achieve it. Will resources be devoted to fast new hardware, rewriting/optimizing software, or some balance of … Continue reading

GPU Performance without GPU Coding

I think everyone in the HPC arena has heard plenty about GPUs. GPUs lack the sophistication of CPUs, but they offer raw performance to those who know how to use them. The question for those who have large computational workloads has … Continue reading