azure-cmaq documentation


13. Logout and Delete CycleCloud

Log out and delete your CycleCloud resources when you are done so that they stop incurring costs.

  • 13.1. Link to Azure Instructions on how to logout and delete cyclecloud
Copyright © 2022, CMAS Center
Last updated on 2024-10-04 17:55:49 +0000