Evaluating the speed of GeForce RTX 40-Series GPUs using NVIDIA’s TensorRT-LLM tool for benchmarking GPU inference performance.
We have a new collection of GPU-accelerated Molecular Dynamics benchmark packages put together for GROMACS, NAMD 2, and NAMD 3-alpha10. (The benchmark packages will be available to the public soon.) In this post we present results for:
– 3 applications: GROMACS, NAMD 2, and NAMD 3-alpha10,
– 8 MD simulations,
– 12 different NVIDIA GPUs,
– 96 total results.
In this post I will be compiling NAMD from source for good performance on modern GPU-accelerated Workstation hardware. Doing a custom NAMD build from source code gives a moderate but significant boost in performance, which can matter when large simulations running over many time-steps take days or weeks to complete. I wanted to do some custom NAMD builds to ensure that modern Workstation hardware was being well utilized. I include results for the STMV benchmark showing the performance boost from the custom build, along with results using NVIDIA 1080Ti and Titan V GPUs and an “experimental” build using an Ubuntu 18.04 base.
One of the questions I get asked frequently is “how much difference does PCIe X16 vs PCIe X8 really make?” Well, I got some testing done using 4 Titan V GPUs in a machine that will do 4 X16 cards. I ran several jobs with TensorFlow with the GPUs at both X16 and X8. Read on to see how it went.
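As a rough, hypothetical illustration (not something from the post): what the PCIe link width limits is effective host-to-device bandwidth, and a quick TensorFlow timing like the sketch below can show whether a card is running closer to X16 or X8 speeds. The ~1 GiB array size and the tf.identity approach are my own choices for the example, and it assumes a visible GPU.

```python
# Hypothetical sketch: estimate effective host-to-GPU transfer bandwidth.
# A GPU on x8 lanes should show roughly half the bandwidth of the same GPU on x16.
import time
import numpy as np
import tensorflow as tf

# ~1 GiB of float32 data held in host memory
data = np.random.rand(256, 1024, 1024).astype(np.float32)
with tf.device('/CPU:0'):
    host_tensor = tf.constant(data)

# Warm-up copy so CUDA context creation is not counted in the timing
with tf.device('/GPU:0'):
    _ = tf.identity(host_tensor)

start = time.perf_counter()
with tf.device('/GPU:0'):
    gpu_tensor = tf.identity(host_tensor)   # forces a host-to-device copy
    checksum = tf.reduce_sum(gpu_tensor)    # tiny GPU-side reduction
_ = checksum.numpy()                        # fetching the scalar syncs the GPU
elapsed = time.perf_counter() - start

gib = data.nbytes / 2**30
print(f"copied {gib:.2f} GiB in {elapsed:.3f} s (~{gib / elapsed:.1f} GiB/s effective)")
```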
I attended the Microsoft Build 2018 developers conference this week and really enjoyed it. I wanted to share my “big picture” feelings about it and some of the things that stood out to me. I’m not going to give you a “reporter’s” view or repeat press-release items. This is just my personal impression of the conference.
I have been qualifying a 4-GPU workstation for Machine Learning and HPC use. The last confirmation testing I wanted to do was running it with TensorFlow benchmarks on 4 NVIDIA Titan V GPUs. I have that system up and running and the multi-GPU scaling looks very good.
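If you want to get a rough feel for multi-GPU scaling yourself, here is a minimal sketch using tf.distribute.MirroredStrategy with a synthetic ResNet-50 workload. This is a simplified stand-in, not the benchmark setup from the post; the model choice, per-GPU batch size of 64, and random data are my own assumptions for illustration.

```python
# Minimal multi-GPU scaling sketch with tf.distribute.MirroredStrategy.
import time
import numpy as np
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()   # uses all visible GPUs
print("Replicas in sync:", strategy.num_replicas_in_sync)

with strategy.scope():
    model = tf.keras.applications.ResNet50(weights=None,
                                           input_shape=(224, 224, 3),
                                           classes=1000)
    model.compile(optimizer='sgd', loss='sparse_categorical_crossentropy')

# Synthetic data; the global batch is split evenly across the GPUs
global_batch = 64 * strategy.num_replicas_in_sync
images = np.random.rand(global_batch * 10, 224, 224, 3).astype(np.float32)
labels = np.random.randint(0, 1000, size=(global_batch * 10,))

start = time.perf_counter()
model.fit(images, labels, batch_size=global_batch, epochs=1, verbose=0)
elapsed = time.perf_counter() - start

# Rough number only: the first epoch includes graph tracing and startup cost,
# so a real benchmark would run longer and discard the warm-up.
print(f"~{images.shape[0] / elapsed:.1f} images/sec on "
      f"{strategy.num_replicas_in_sync} GPU(s)")
```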
Batch size is an important hyper-parameter for Deep Learning model training. When using GPU-accelerated frameworks for your models, the amount of memory available on the GPU is a limiting factor. In this post I look at the effect of setting the batch size for a few CNNs running with TensorFlow on the 1080Ti and Titan V with 12GB of memory, and the GV100 with 32GB of memory.
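A hedged sketch of one way to probe this limit yourself, assuming a Keras model and TensorFlow's out-of-memory error: try a training step at increasing batch sizes and catch the failure. This is not the method used in the post; the ResNet-50 model and the batch sizes are illustrative choices.

```python
# Illustrative sketch: find roughly how large a batch fits in GPU memory
# for a given model by catching TensorFlow's out-of-memory error.
import numpy as np
import tensorflow as tf

def fits_in_memory(batch_size):
    """Try one training step at this batch size; return False on GPU OOM."""
    try:
        model = tf.keras.applications.ResNet50(weights=None,
                                               input_shape=(224, 224, 3),
                                               classes=1000)
        model.compile(optimizer='sgd', loss='sparse_categorical_crossentropy')
        x = np.random.rand(batch_size, 224, 224, 3).astype(np.float32)
        y = np.random.randint(0, 1000, size=(batch_size,))
        model.train_on_batch(x, y)
        return True
    except tf.errors.ResourceExhaustedError:
        return False

# Note: TensorFlow holds on to GPU memory once allocated, so in practice each
# probe is best run in its own fresh process.
for bs in (16, 32, 64, 128, 256):
    print(f"batch size {bs}: {'ok' if fits_in_memory(bs) else 'out of memory'}")
```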
Tensor-cores are one of the compelling new features of the NVIDIA Volta architecture. In this post I discuss some thoughts on mixed precision and FP16 as they relate to Tensor-cores. I have performance results for large convolutional neural network training that make a good argument for trying to use them. Performance looks very good.
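As a hedged example of what “trying to use them” can look like, the sketch below enables mixed precision in tf.keras so compute runs in FP16 (which Volta Tensor Cores can accelerate) while the weights stay in FP32. This uses a newer Keras mixed-precision API than the framework versions discussed in the post; the tiny model is only there to show the dtype behavior.

```python
# Sketch: enable mixed precision (FP16 compute, FP32 master weights) in tf.keras.
import tensorflow as tf
from tensorflow.keras import layers, mixed_precision

mixed_precision.set_global_policy('mixed_float16')

model = tf.keras.Sequential([
    layers.Conv2D(64, 3, activation='relu', input_shape=(224, 224, 3)),
    layers.GlobalAveragePooling2D(),
    # Keep the final softmax in float32 for numerical stability
    layers.Dense(1000, activation='softmax', dtype='float32'),
])

# Under the mixed_float16 policy Keras applies loss scaling automatically,
# which guards against FP16 gradient underflow.
model.compile(optimizer='sgd', loss='sparse_categorical_crossentropy')

print("Compute dtype:", model.layers[0].compute_dtype)    # float16
print("Variable dtype:", model.layers[0].variable_dtype)  # float32
```

NVIDIA's general guidance is that Tensor Cores are used most effectively when layer dimensions (channels, batch size) are multiples of 8.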
NVIDIA’s GPU Technology Conference (GTC) is probably my all-time favorite conference. It’s an interesting blend of “scientific research meeting” and trade show. It’s put on by a hardware vendor but still feels like a scientific meeting. It’s not just a “Kool-Aid” fest! In this post I present some of my thoughts about this year’s conference.
This post will look at the molecular dynamics program, NAMD. NAMD has good GPU acceleration but is heavily dependent on CPU performance as well. It achieves best performance when there is a proper balance between CPU and GPU. The system under test has 2 Xeon 8180 28-core CPUs. That’s the current top-of-the-line Intel processor. We’ll see how many GPUs we can add to those Xeon 8180 CPUs to get optimal CPU/GPU compute balance with NAMD.