18. Optional instructions for creating a CycleCloud Cluster using /shared and /lustre
Warning - these instructions are kept to preserve the timings, but the method used here to create the cluster is outdated.
Use portal.azure.com to create the CycleCloud Cluster instead.
- 18.1. Advanced Tutorial
- 18.1.1. Modify Cyclecloud CMAQ Cluster
- 18.1.2. Install CMAQ and prerequisite libraries on Alinux
- Login to updated cluster
- Change shell to use tcsh
- Log out and then log back in to activate the tcsh shell
- Optional Step to allow multiple users to run on the CycleCloud Cluster
- Check to see if the group is added to your user ID
- Make the /shared/build directory
- Change ownership to your username
- Make the /shared/cyclecloud-cmaq directory
- Change ownership to your username
- Install git
- Install the cluster-cmaq git repo to the /shared directory
- Optional - Recursively change the group to cmaq for the /shared/build directory
- Check what modules are available on the cluster
- Load the openmpi module
- Verify the gcc compiler version is greater than 8.0
- Change directories to install and build the libraries and CMAQ
- Build netcdf C and netcdf F libraries - these scripts work for the gcc 8+ compiler
- A .cshrc script with LD_LIBRARY_PATH was copied to your home directory; re-enter the shell and check the environment variables that were set
- If the .cshrc wasn't created, use the following command to create it
- Execute the shell to activate it
- Verify that you see the following setting
- Build I/O API library
- Build CMAQ
- 18.1.3. Install CMAQ and prerequisite libraries on SUSE Linux
- Login to updated cluster
- Change shell to use tcsh
- Log out and then log back in to activate the tcsh shell
- Optional Step to allow multiple users to run on the CycleCloud Cluster
- Check to see if the group is added to your user ID
- Make the /shared/build directory
- Change ownership to your username
- Make the /shared/cyclecloud-cmaq directory
- Change ownership to your username
- Install git
- Install the cluster-cmaq git repo to the /shared directory
- Optional - Recursively change the group to cmaq for the /shared/build directory
- Check what modules are available on the cluster
- Install openmpi
- Install gnu-compilers
- Load the gcc compiler - note that it may have been automatically loaded by the openmpi module
- Verify the gcc compiler version is greater than 8.0
- Change directories to install and build the libraries and CMAQ
- Build netcdf C and netcdf F libraries - these scripts work for the gcc 8+ compiler
- A .cshrc script with LD_LIBRARY_PATH was copied to your home directory; re-enter the shell and check the environment variables that were set
- If the .cshrc wasn't created, use the following command to create it
- Execute the shell to activate it
- Verify that you see the following setting
- Build I/O API library
- Build CMAQ
- 18.1.4. Configuring selected storage and obtaining input data
- 18.1.5. Copy the run scripts from the CycleCloud repo to run on HBv120
- 18.1.6. Edit the run script to run on 192 pes
- 18.1.7. Check the status in the queue
- 18.1.8. Check the timings while the job is still running using the following command
- 18.1.9. When the job has completed, use tail to view the timing from the log file.
- 18.1.10. Check whether the scheduler thinks there are cpus or vcpus
- 18.1.11. Edit the run script to run on 96 pes
- 18.1.12. Check the timing after run completed
- 18.1.13. Copy the run scripts from the CycleCloud repo
- 18.1.14. Run the CONUS Domain on 176 pes
- 18.1.15. Check the status in the queue
- 18.1.16. Check the timings while the job is still running using the following command
- 18.1.17. When the job has completed, use tail to view the timing from the log file.
- 18.1.18. Check whether the scheduler thinks there are cpus or vcpus
- 18.2. CMAQv5.4+ Benchmark on HB176_v4 compute nodes and /shared
- 18.2.1. Use CycleCloud pre-installed with CMAQv5.4+ software and 12US1 Benchmark data.
- 18.2.2. Log into the new cluster
- 18.2.3. Download the input data from the AWS Open Data CMAS Data Warehouse.
- 18.2.4. Verify Input Data
- 18.2.5. Examine CMAQ Run Scripts
- 18.2.6. Submit Job to Slurm Queue to run CMAQ on 2 nodes
- 18.2.7. Submit a run script to run on the shared volume
- 18.3. CMAQv5.4+ Benchmark on HBv3_120 compute nodes and /lustre
- 18.3.1. Use CycleCloud pre-installed with CMAQv5.4+ software and 12US1 Benchmark data.
- 18.3.2. This method relies on obtaining the code and data from blob storage.
- 18.3.3. Update Cycle Cloud
- 18.3.4. Log into the new cluster
- 18.3.5. Verify Software
- 18.3.6. Download the input data from the AWS Open Data CMAS Data Warehouse.
- 18.3.7. Verify Input Data
- 18.3.8. Examine CMAQ Run Scripts
- 18.3.9. To run on the shared volume, a code modification is required.
- 18.3.10. Build the code by running the makefile
- 18.3.11. Submit Job to Slurm Queue to run CMAQ on Lustre
- 18.3.12. Submit a run script to run on the shared volume
- 18.3.13. Submit a minimum of 2 benchmark runs
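Both installation outlines above include a step to verify that the gcc compiler version is greater than 8.0 before running the netCDF build scripts. A minimal sketch of that check in shell, with the version string hardcoded for illustration (on the cluster you would capture it from `gcc -dumpversion` after loading the compiler module):

```shell
# Sketch of the "verify the gcc compiler version is greater than 8.0" step.
# The version string is hardcoded here for illustration; on the cluster you
# would obtain it with: version=$(gcc -dumpversion)
version="9.2.0"
major=${version%%.*}   # keep only the leading major-version number
if [ "$major" -ge 8 ]; then
  echo "gcc $version is new enough for the netCDF build scripts"
else
  echo "gcc $version is too old; load a newer compiler module"
fi
```

The same parameter-expansion trick works for any dotted version string, so it can be reused when checking other tools in the build chain.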