Here is a concise guide to cleaning up entries associated with a specific user across multiple Slurm-related XDMod databases. 1. Set the Active Database and Remove Entries: Start by setting the active database to mod_hpcdb and removing job entries for the user. 2. Switch to Shredder Modules: Next, address the mod_shredder database […]
Managing Slurm database size
Here are some advanced approaches for managing the size of the Slurm database and manipulating it directly. The Slurm database can grow large quickly in some HPC setups. This is a problem if the space dedicated to Slurm is limited for your workload or if the same applies to […]
Disabling GPU ECC Memory and Persistence Mode
This technical post suggests a way to increase utilization and performance on NVIDIA GPUs, focusing on disabling ECC memory and enabling persistence mode. ECC Memory: ECC introduces memory scrubbing, error detection, and correction cycles, which add latency to memory operations. Disabling ECC can sometimes yield performance benefits, especially when the […]
gdc-client and MAF-LIB python modules
This short tutorial shows end users how to install the gdc-client and MAF-LIB Python modules in a Conda environment. The gdc-client is a command-line tool used to interact with the Genomic Data Commons (GDC) API, a repository for cancer genomic data managed by the National Cancer Institute (NCI). This tool allows users to […]
SPPARKS with stitch
Here, I provide a guide for installing SPPARKS with Stitch in an HPC cluster environment. Due to the complexity of the software, it seems that either the developers have not provided comprehensive documentation, or I may have overlooked it. This document is designed to help you complete the installation process smoothly and without issues. Our […]
Fix for VASP installation error with spack
VASP and Spack are both popular tools among HPC administrators. Lately, due to changes at VASP, Spack has been failing to compile it. This article explains the issue and then provides a workaround for installing VASP with Spack. VASP (Vienna Ab initio Simulation Package) is a commercial software […]
‘HPC Monitoring Tool’ Installation Tutorial
This guide explains how to install the HPC Performance Monitoring tool that I developed and shared at https://github.com/serdar-acir/HPC_Monitor. It is a real-time HPC performance monitoring tool with automatic node detection and basic benchmarking.
My custom Load Generator code for GPU (c & cuda)
Below is a piece of CUDA code that I wrote which does a simple vector add on a GPU. It is particularly useful for comparing the single-thread, multi-thread, and multi-grid performance of the GPU. By changing the size of the array, this code can also be used for GPU stress testing. Let’s […]
My CPU stressing code (c language)
Here is a small piece of C code that I wrote for stress testing CPUs on an HPC cluster. This kind of testing is important for various performance optimization studies. The code maximizes CPU usage on the node it runs on by creating separate threads. cpu_8core_c.c (code to utilize all 8 cores): to […]
A quick way to update HP ILO3
Below are the steps to safely upgrade HP ILO3 on a Red Hat server.