Cray XT4 (jade)
User's Guide

Preface

  1. Introduction
  2. System Resources
  3. Program Development
  4. Parallel Programming Models
  5. Job Submission

The U.S. Army Engineer Research and Development Center (ERDC), headquartered in Vicksburg, MS, is the premier research and development laboratory complex of the Corps of Engineers. The Army Supercomputer Center at ERDC was established in 1989. In 1993, ERDC began operations as the first of the Department of Defense (DoD) High Performance Computing (HPC) Major Shared Resource Centers (MSRCs). The ERDC MSRC was formed under the auspices of the DoD HPC Modernization Program and is located in the ERDC Information Technology Laboratory (ITL) in Vicksburg, MS. The name was changed to the ERDC DoD Supercomputing Resource Center (DSRC) in 2009. The ERDC DSRC mission is to deliver HPC leadership, service, education, and technical expertise to achieve research and engineering objectives vital to the Nation. Access to the ERDC DSRC systems is available through multiple, nationwide high‑speed data communications networks. In addition, training and consultation are provided to local and remote DoD users of these systems.

Questions, comments, and suggestions about this guide are welcome. Comments may be sent in the following ways:

Toll‑free long distance: 800-500-HPCC (4722)

Local ERDC telephone: 601-634-4400, Option 1

E-mail: dsrchelp@erdc.hpc.mil

Facsimile: ERDC HPC Service Center, 601-634-3808

U.S. Postal Mail:
U.S. Army Engineer Research and Development Center
ATTN: CEERD‑IH (HPC Service Center)
3909 Halls Ferry Road
Vicksburg, MS 39180‑6199

Web: http://www.erdc.hpc.mil

Users requiring printed copies of Center documentation may contact our Service Center.

  1. Scope
    This document provides an overview and introduction to the use of the Cray XT4 (jade) located at the ERDC DSRC and a description of the specific computing environment on jade. The intent of this guide is to provide information that will enable the average user to perform computational tasks on the system. To receive the most benefit from the information provided here, you should be proficient in the following areas:

    • Use of the UNIX operating system
    • Use of an editor (e.g., vi or emacs)
    • Remote usage of computer systems via network or modem access
    • A selected programming language and its related tools and libraries

    Font Conventions
    The following font conventions will be used in this manual:

    Style                  Meaning
    Boldface               Indicates acronyms, abbreviations, or terms that will be used later in the text (e.g., Portable Batch System (PBS)).
    Italic                 Indicates items that are especially important or noteworthy (e.g., Kerberized ftp is required...).
    Constant Width         Indicates environment variables, file names, and command‑line output (e.g., /etc/motd).
    Constant Width Italic  Indicates items that you should replace with your own actual values (e.g., login: my_user_name).

    System Access
    Before you can access our systems, you must have Kerberos version 5 installed on your PC or workstation. This software must be used to acquire a Kerberos ticket before a connection to the ERDC DSRC systems is allowed. More information on Kerberos can be found below. After acquiring a Kerberos ticket, jade can be accessed via Kerberized ssh as follows:

    ssh login_node.erdc.hpc.mil

    where login_node is one of jade01 through jade06.

    If a Kerberized version of rlogin or telnet is available on your local machine, connection is also available via those commands.

    Six login nodes, jade01 through jade06, provide login access to the XT4. These nodes provide the look and feel of a Linux-based environment with full access to the standard Linux utilities, commands, and shells that make program development easy and portable. Subsequent sequential processes, such as system commands or sequential user programs, run on the same node as your login shell. All jobs, both parallel and serial, are to be submitted to the batch queuing system. See the Job Submission section for more information on batch processing. Production jobs found running on the login nodes will be unilaterally terminated because of the negative impact those jobs have on the response time of the login nodes.

    Service Center
    The Consolidated Customer Assistance Center (CCAC) is available to help users with any problems, questions, or training requirements for our HPC systems. Analysts are on duty Monday - Friday, 7:00 a.m. to 10:00 p.m. Central Time. After‑hours support is provided by ERDC ITL operations staff.

    You can contact us in any of the following ways:

    Phone:
    System issues during working hours: 1-877-222-2039
    Emergency after-hours: 1-800-500-4722 or
    (601) 634-4448
    ERDC HPC Accounts/Allocations: 1-800-500-4722 or
    (601) 634-4400, option 1

    E-Mail:
    CCAC Help Desk: help@ccac.hpc.mil
    CCAC Accounts Center: accounts@ccac.hpc.mil
    ERDC DSRC HPC Accounts: hpc-accounts@erdc.hpc.mil
    Fax:
    CCAC: (937) 656-9538
    ERDC HPC Accounts: (601) 634-2622

    U.S. Mail: U.S. Army Engineer Research and Development Center
    ATTN: CEERD-IH HPC Service Center
    3909 Halls Ferry Road
    Vicksburg, MS 39180-6199

    The ERDC DSRC provides application support for the following Computational Technology Areas (CTAs):

    • Computational Structural Mechanics (CSM)
    • Computational Fluid Dynamics (CFD)
    • Environmental Quality Modeling and Simulation (EQM)
    • Climate/Weather/Ocean Modeling and Simulation (CWO)
    • Forces Modeling and Simulation / C4I (FMS)

    Obtaining an Account
    Authorized DoD and contractor personnel may obtain computer accounts on our systems through their site's Service/Agency Approval Authority (S/AAA). Please see the instructions for obtaining accounts prior to contacting your S/AAA.

    Security
    We have implemented a three‑phase transition to a secure computing environment. The overlapping transition began with the implementation of Secure Shell and has been enhanced by Kerberos version 5. A hardware preauthentication step is performed using Security Dynamics' SecurID card. This card implements a one‑time password mechanism and requires you to enter a personal identification number (PIN) in order to generate a passcode.

    Using Kerberos
    To use our computer systems, you must have Kerberos version 5 installed on your PC or workstation. Kerberos client kits and documentation are available from the Kerberos & SecurID Information Center.

    From that page, you can download client kits by clicking the "Software" link and then selecting from the end-user clients listed under the Binary section.

    If you need help installing, configuring, or using the Kerberos Client Kit or your SecurID card, click "Documentation" and select the document that you need.

    Other information, such as the HPCMP Kerberos Ticket Lifetimes and Required Minimum Versions, is available only via Kerberized login. To log in, click here.

    If you still need help, contact our Service Center.

    For information on required port configurations for Kerberized services, please see the Kerberos Port Configuration Document.

    For those unfamiliar with Kerberos, a few commonly used commands are shown below.

    To obtain a Kerberos ticket on a UNIX‑based system

    %kinit

    Password for user@WES.HPC.MIL: enter Kerberos password
    Passcode: enter your PIN number into the SecurID card, press
    the diamond, and then enter the six‑digit passcode.

    Note: Remember to press the "P" to delete the passcode from the SecurID card.

    To verify that you have received a Kerberos ticket

    %klist

     Ticket cache: PIPE:1023
     Default principal: user@WES.HPC.MIL
    
     Valid starting    Expires           Service principal
     10/17/05 16:04:16 10/18/05 02:04:16 krbtgt/WES.HPC.MIL@WES.HPC.MIL

    For security reasons, passwords are "aged" on all of our systems. As a result, your password will eventually expire, and you will have to change it. To change your password, use the kpasswd command. NOTE: kpasswd is not the same as the typical UNIX passwd command. The UNIX passwd command will not change your Kerberos password.

    To change your Kerberos password

    %kpasswd

    Password for user@WES.HPC.MIL: enter your password
    SAM Authentication
    Challenge for Security Dynamics mechanism
    SecurID Passcode: enter passcode from SecurID card
    Enter new password: enter new password
    Enter it again: re‑enter new password
    Password changed.

    To establish a login session using the ssh command

    ssh login_node.erdc.hpc.mil

    where login_node is one of jade01 through jade06.

    Login sessions may also be established using the rlogin and telnet commands provided in the Kerberos kits.
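
    The ticket check and the login step described above can be combined in a small helper. The following is a minimal sketch, not a Center-supplied script. It assumes that the klist in your Kerberos kit accepts the -s option (silent mode, where the exit status reports whether a valid ticket exists); the LOGIN_NODE variable and the have_ticket function are hypothetical names used only for illustration. The sketch only prints the ssh command instead of running it.

```shell
#!/bin/sh
# Sketch: decide whether a Kerberized ssh to jade can be attempted.
# Assumption: the local klist supports -s (silent, status-only check).

LOGIN_NODE="${LOGIN_NODE:-jade01}"      # any of jade01 through jade06
TARGET="${LOGIN_NODE}.erdc.hpc.mil"

# Returns 0 only if klist is installed and reports a valid ticket.
have_ticket() {
    command -v klist >/dev/null 2>&1 && klist -s
}

if have_ticket; then
    echo "Valid Kerberos ticket found; connect with: ssh ${TARGET}"
else
    echo "No valid Kerberos ticket; run kinit first, then: ssh ${TARGET}"
fi
```

    If your klist does not support -s, run klist by hand and inspect the "Expires" column as shown earlier.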

    For more complete instructions on using Kerberos, contact our Service Center.

    Services and Information
    Users of our systems are provided with information through the toll‑free Service Center hotline, workshops and seminars, the Web site, and on‑line documentation. A brief discussion of some on‑line services follows:

    • An informative message of the day (motd) is displayed upon login to our systems. The motd contains important information about imminent events that will affect the immediate usage of the system. The UNIX more command is used with motd to prevent longer messages from scrolling off the monitor. The message is located in the file /etc/motd and may be viewed at any time by issuing the command "more /etc/motd" at the system prompt. Please read this information carefully.

    • An on‑line bulletin system is available on each system and can be used to obtain important information about the system. The bulletins are usually brief and contain information on a variety of topics. To display the list of available bulletins, use the bull command. A menu with a list of available bulletins will be displayed on the screen. Enter the number of the bulletin you wish to display. The bulletin will be displayed at your terminal one screen at a time. Press <spacebar> to display each additional screen until you have reached the end of the list of bulletins. Press "q <return>" to exit the bulletin utility from the main menu. For more information about the bull command, type "man bull".

    Training
    The ERDC DSRC also supports an extensive training schedule through our User Productivity Enhancement and Technology Transfer (PET) function. Most training is conducted in the PET Training Facility in Room 1205 in the ERDC ITL. Training at remote facilities and specialized training courses will be considered upon request. Please contact our Service Center for more information or to submit training requests. Also, for users migrating their code to jade, the Computational Science and Engineering (CS&E) group is available for consultation. The training schedule is updated regularly on the Online Knowledge Center. You can also contact our Service Center for additional information.

  2. System Configuration
    Jade is a Massively Parallel Processor (MPP) supercomputer that is an update to the Cray XT3. Jade contains 2,152 compute nodes, each containing one 2.1‑GHz AMD Opteron 64‑bit quad‑core processor and 8 GBytes of memory. Jade also contains 76 service nodes. Each service node contains one 2.8‑GHz dual‑core Opteron 64‑bit processor and varying amounts of memory. The login nodes (a subset of the 76 service nodes) contain 16 GBytes of DDR memory. The compute nodes, which perform computation only, run a microkernel OS called Compute Node Linux (CNL) that has limited UNIX capabilities. The service nodes run SUSE Linux and perform support functions for application and system services. There are four types of predefined service nodes on jade: login, IO, boot, and database. All nodes are connected to each other in a three‑dimensional torus using a HyperTransport link to a dedicated Cray SeaStar2 communications engine. Jade is rated at 72.3 peak TFLOPS and currently has 379 TBytes of Fibre Channel RAID disk space.
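
    The peak rating quoted above can be cross-checked from the node count and clock speed. The short calculation below is a sketch under one assumption that this guide does not state: each quad‑core Opteron core is taken to complete 4 floating-point operations per clock cycle. The variable names are illustrative.

```shell
#!/bin/sh
# Sketch: reproduce jade's peak rating from the figures in this section.
NODES=2152             # compute nodes (from this guide)
CORES_PER_NODE=4       # one quad-core Opteron per node (from this guide)
CLOCK_MHZ=2100         # 2.1 GHz (from this guide)
FLOPS_PER_CYCLE=4      # ASSUMPTION: not stated in this guide

TOTAL_CORES=$((NODES * CORES_PER_NODE))
# MHz * flops/cycle gives MFLOPS per core; divide by 1000 for GFLOPS.
PEAK_GFLOPS=$((TOTAL_CORES * CLOCK_MHZ * FLOPS_PER_CYCLE / 1000))

echo "compute cores: ${TOTAL_CORES}"
echo "peak rate:     ${PEAK_GFLOPS} GFLOPS"
```

    Under that assumption, the result is 8,608 compute cores and about 72,307 GFLOPS, which agrees with the 72.3 peak TFLOPS rating.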

    Users are allocated 1 GByte of permanent disk space for their home directories on jade. You can reference this area with the $HOME environment variable.

    Users are also assigned a temporary work area on the /work file system. This directory is seen by all the processors on the system and may be referenced by the environment variable $WORKDIR. Please review the Temporary File Storage and Managing Temporary File Storage sections of this document for more information about using $WORKDIR.

    Operating System
    Jade's operating system is UNICOS/lc, which consists of a Cray‑modified Linux kernel and a Compute Node Linux (CNL) kernel. Each node has its own kernel, which is either a CNL microkernel for the compute nodes or a full‑featured SUSE Linux kernel for the service nodes (including the login nodes). Its heritage is partially Linux, but it resembles the distributed UNICOS/mk operating system of the Cray T3E. The Cray XT4 CNL microkernel running on the compute nodes interacts with an application process in a very limited way by managing virtual memory addressing, providing memory protection, and performing basic scheduling. The microkernel architecture ensures reproducible run times for MPP jobs, supports fine‑grained synchronization at scale, and ensures high‑performance, high‑bandwidth MPI and SHMEM communication. Service nodes run a full SUSE Linux distribution with specific Cray XT4 modifications.

    Login Node Abuse Policy
    The login nodes, jade01 - jade06, provide login access for jade and support such activities as compiling, editing, and general interactive use by all users. Consequently, memory or CPU intensive programs running on the login nodes can significantly affect all users of the system. Therefore, only small applications requiring less than 10 minutes of runtime and less than 2 GBytes of memory are allowed on the login nodes. Any job running on the login nodes that exceeds these limits may be unilaterally terminated.

    The preferred method to run interactive jobs is to use the Interactive Batch Environment. Jobs submitted to the batch queuing system from the Interactive Batch Environment will be submitted to compute nodes for execution.

    Data Storage
    Home Directory Storage

    Each user is allocated a home directory (the current working directory immediately after login) with an initial disk quota of 1 GByte of permanent nonmigrated storage. Your home directory can be referenced locally with the $HOME environment variable from all nodes in the system.

    Requests to increase disk space quotas may be submitted by contacting our Service Center. You must supply the following information for evaluation of the request by the system administrators and the ERDC DSRC management:

    • Amount of system resource requested
    • Length of time requested for the increase
    • Special deadlines for the project
    • Explanation of the attempts to work within limits

    Temporary File Storage

    Jade has one large file system (/work) for the temporary storage of data files needed for executing programs. You may access your personal working directory under /work by using the $WORKDIR environment variable, which is set for you upon login. Your $WORKDIR directory has no disk quotas, and files stored there do not affect your permanent file quota usage. Because there are no disk quotas for $WORKDIRs, each user's interactive and batch jobs may consume large amounts of disk space. This fact, compounded by the large number of jade users, predisposes this file system to space shortages and necessitates regular purges of unaccessed files. Remember that $WORKDIR is a "scratch" file system: it is not backed up, and files may be deleted at any time. To keep your data safe, always back up working files to the DMS upon completion of your jobs and, if possible, while they are running.

    It is important to note that /work is a parallel, striped file system. This means that as files are written, they are automatically divided into chunks and written across multiple disk sets, or "OSTs," simultaneously. This process, called "striping," plays a vital role in running very large jobs because it significantly improves file I/O speed, thereby reducing the time required to read or write a file. Without parallel striping, large jobs, many of which require hundreds of gigabytes of disk space, would spend much of their time just reading from and writing to disk.

    The default stripe size for /work is 1 MByte, and the default stripe count is two stripes. Increasing the stripe count is advisable when creating files on /work that are larger than 20 GBytes. Click here for an explanation of how to do this.
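
    As a rough illustration of why a higher stripe count helps, the sketch below works out how much of a 20‑GByte file lands on each OST and then sets a higher stripe count on an output directory. It assumes that /work is a Lustre file system and that the Lustre lfs utility is on your path; neither is stated in this guide. The stripe count of 8 and the directory name big_output are hypothetical, and the lfs option syntax differs between Lustre versions, so check "man lfs" before relying on it.

```shell
#!/bin/sh
# Sketch: per-OST share of a large file, then a higher stripe count for a directory.
# ASSUMPTIONS: /work is Lustre, lfs is installed, and -c sets the stripe count.
DEFAULT_STRIPES=2                         # default stripe count (from this guide)
STRIPE_COUNT=8                            # illustrative value only
FILE_MB=20480                             # a 20-GByte file, in MBytes
BIG_DIR="${WORKDIR:-/tmp}/big_output"     # hypothetical directory name

PER_OST_DEFAULT=$((FILE_MB / DEFAULT_STRIPES))
PER_OST_WIDE=$((FILE_MB / STRIPE_COUNT))
echo "2 stripes: ${PER_OST_DEFAULT} MBytes per OST"
echo "${STRIPE_COUNT} stripes: ${PER_OST_WIDE} MBytes per OST"

mkdir -p "$BIG_DIR"
if command -v lfs >/dev/null 2>&1; then
    # New files created in BIG_DIR inherit this stripe count.
    lfs setstripe -c "$STRIPE_COUNT" "$BIG_DIR"
else
    echo "lfs not found; stripe settings for ${BIG_DIR} were not changed"
fi
```

    Setting the stripe count on a directory, rather than on individual files, means that every file a job later creates there picks up the wider striping.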

    Please note that all of your jobs should execute from your $WORKDIR directory. Jobs that are run from $HOME are subject to disk space quotas and have a greater chance of failing if problems occur with that resource. Jobs that are run entirely from your $WORKDIR directory are more likely to complete, even if all other resources are temporarily unavailable.

    If you use $WORKDIR in your batch scripts, you must be careful to avoid having one job accidentally contaminate the files of another job. If two different batch jobs use the same names for temporary files, unusual errors can arise if the two jobs happen to run at the same time. By having each job create and use its own subdirectory underneath your $WORKDIR, this problem can be avoided.
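
    One way to give each job its own subdirectory is to name it after the job identifier. The following is a minimal sketch: PBS sets the PBS_JOBID variable inside a batch job, and the fallback name used outside of PBS (manual.<process id>) and the JOBDIR variable are illustrative choices, not Center conventions.

```shell
#!/bin/sh
# Sketch: create and enter a private scratch directory for this job.
# PBS_JOBID is set by PBS inside a batch job; fall back to the shell PID otherwise.
JOB_TAG="${PBS_JOBID:-manual.$$}"
WORK_ROOT="${WORKDIR:-/tmp}"          # /tmp is only a fallback for testing off jade
JOBDIR="${WORK_ROOT}/${JOB_TAG}"

mkdir -p "$JOBDIR" || exit 1
cd "$JOBDIR" || exit 1
echo "Temporary files for this job go in: ${JOBDIR}"
```

    Because two jobs never share a job identifier, temporary files with identical names can no longer collide.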

    Managing Temporary File Storage

    Close management of your temporary storage is a very high priority because, when disk space becomes too low, the system halts processing, and manual intervention is required to restart it. Users are responsible for managing their own files in their $WORKDIR by transferring needed files to the DMS and deleting unneeded files when their processes end. If available space in $WORKDIR becomes critically low, a manual purge may be run, and all files in $WORKDIR are eligible for deletion.

    Archival File Storage

    All of our systems share an on‑line DMS that currently includes more than 32 TBytes of high‑speed disk cache, 18.2 TBytes of Tier 1 archival storage, and 3 PBytes of Tier 2 high‑speed archival storage utilizing a robotic tape library. The DMS should be used for all long‑term storage (more than 90 days).

    Every user is given an account and an archival directory on one of the two partitions (gold and silver) of the DMS system - a Sun Enterprise 15000. The command getarchost can be used to determine your host DMS partition. Kerberized login and ftp are allowed into the DMS. Locally developed utilities may be used to transfer files to and from the DMS as well as to create and delete directories, rename files, and list directory contents. For convenience, the environment variable $ARCHIVE_HOME can be used to reference your DMS archive directory when using DMS commands. The command getarchome can be used to display the value of $ARCHIVE_HOME for any user.

    The ERDC DSRC provides the user with the option to place files in a subdirectory specifically designated for the subproject under which the files were created. For additional details on these enhancements, click here.

    Archival Command Synopsis

    A synopsis of the archival utilities is listed below. For additional information, read the on‑line man pages that are available on each system.

    • Change file and directory permissions on the DMS
      msfchmod [-d] -m mode file1 [file2 ...]

    • Copy one or more files from the DMS
      archive get [-C path ] [-s] file1 [file2 ...]

    • List files and directory contents on the DMS
      archive ls [lsopts] [file/dir ...]

    • Create directories on the DMS
      archive mkdir [-C path] [-m mode] [-p] [-s] dir1 [dir2 ...]

    • Rename a file on the DMS
      msfmv old-filename new-filename

    • Copy one or more files to the DMS
      archive put [-C path ] [-D] [-s] file1 [file2 ...]

    • Delete files on the DMS
      msfrm [-i] file1 [file2 ...]

    • Delete directories on the DMS
      msfrmdir dir1 [dir2 ...]

    • Check the status and availability of the DMS
      archive stat [-s]

    For a sample batch script using these archival commands, see below.
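
    As a quick illustration of how the commands in the synopsis fit together, the sketch below saves one result file to the DMS. It is not the Center's sample script. The directory and file names are hypothetical, and the sketch assumes that -C names the DMS directory to operate in; confirm this in the man pages. By default the sketch runs in dry-run mode and only prints each command; set DRY_RUN=0 on jade to execute them.

```shell
#!/bin/sh
# Sketch: check the DMS, create an archive directory, and store one file.
# DRY_RUN=1 (the default) prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

RUN_DIR="run_001"           # hypothetical directory name on the DMS
RESULT_FILE="results.tar"   # hypothetical file in the current directory
# ASSUMPTION: -C selects the DMS directory; ARCHIVE_HOME is set on jade.
DMS_DIR="${ARCHIVE_HOME:-ARCHIVE_HOME_NOT_SET}/${RUN_DIR}"

run archive stat -s                               # is the DMS available?
run archive mkdir -p -C "${ARCHIVE_HOME:-ARCHIVE_HOME_NOT_SET}" "$RUN_DIR"
run archive put -C "$DMS_DIR" "$RESULT_FILE"      # copy the file to the DMS
run archive ls "$DMS_DIR"                         # confirm that it arrived
```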

    Network Connectivity
    The ERDC DSRC is a critical node of the Defense Research and Engineering Network (DREN) and has direct, redundant connectivity to the DREN. The internal DSRC networks are built on redundant Gigabit Ethernet technology. Jade can be accessed via Kerberized ssh as follows:

    ssh login_node.erdc.hpc.mil

    where login_node is one of jade01 through jade06.

    You may also connect using Kerberized telnet or rlogin. For security purposes, you must have a current Kerberos ticket on your computer before attempting to connect. Use of the hostname, jade, is preferred since Internet Protocol (IP) addresses are subject to change.

    Application Support Software
    All of our systems run derivatives of the UNIX System V operating system with vendor‑specific enhancements. A large variety of compiler environments, numerical libraries, graphics libraries, and third‑party analysis applications is available on the systems. Additional applications can be added to accommodate the diverse needs of the user communities that we serve. Please contact our Service Center for more information.

    A list of third‑party software licensed for jade is available at http://www.erdc.hpc.mil/hardSoft/Software/XT4.

    Application/Utilities File Systems
    In addition to the software mentioned previously, applications and utilities on jade include Cray‑proprietary and supported programs, commercial packages, and privately written and supported programs. All non‑Cray applications software, libraries, utilities, and documentation are stored in one of the following important subdirectories of the /usr/local/ directory structure:

    Subdirectory  Description of contents
    applic        Third‑party applications.
    bin           Third‑party software executable files and system shell scripts.
    info          Bulletin files for user access.
    man           Locally developed ERDC DSRC man pages.
    usp           Contents of these subdirectories are not supported by our systems personnel, but rather by the owner/user. Subdirectories named bin, lib, and applic contain the programs.

  3. Overview of Compilers and Development Tools
    Jade provides a full complement of programming development tools. These tools include assemblers, compilers, parallelizing compilers, and programming utilities. The following sections describe these elements of the Cray XT4 programming environment.

    Jade has three programming environments for compiling: Portland Group (PGI), PathScale, and GNU. The PGI Programming Environment is the default programming environment on jade. To switch from the PGI default programming environment, use one of the following commands:

    module swap PrgEnv-pgi PrgEnv-pathscale     # To switch to PathScale
    module swap PrgEnv-pgi PrgEnv-gnu           # To switch to GNU

    Optimization flags are different for each programming environment and can be found in the man pages for each: for PGI, man pgf90, pgf77, pgcc, or pgCC; for PathScale, man pathf90, pathcc, or pathCC; for GNU, man gfortran, g77, gcc, or g++.

    To compile your code to run on the compute nodes using any of the three programming environments, use the compiler commands listed in the following table, which invoke the compilers of your loaded programming environment. Note that you will still need to issue an aprun command in a batch job to run the compiled code on the compute nodes.

    Compiler Description
    ftn Fortran 90/95
    f77 FORTRAN 77
    cc C
    CC C++
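
    Putting the two steps together, the sketch below switches from the default PGI environment to GNU and then builds a Fortran program for the compute nodes with the ftn command. The source file name my_code.f90 is hypothetical. The sketch defaults to a dry-run mode that only prints each command; set DRY_RUN=0 in a login shell on jade, where the module command is defined, to execute them.

```shell
#!/bin/sh
# Sketch: change programming environment, then compile for the compute nodes.
# DRY_RUN=1 (the default) prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

SRC="my_code.f90"      # hypothetical source file
EXE="my_code"          # hypothetical executable name

run module swap PrgEnv-pgi PrgEnv-gnu    # PGI is the default environment
run ftn -o "$EXE" "$SRC"                 # ftn now invokes the GNU compiler
```

    The same ftn command line works under all three environments; only the "module swap" line changes.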

    You may run small applications on jade's login nodes if they stay within the limits of the Login Node Abuse Policy (less than 10 minutes of runtime and less than 2 GBytes of memory). Alternatively, you may use PBS to schedule a batch job on a batch interactive node. The batch interactive nodes run full SUSE Linux and can run serial or threaded applications. Each batch interactive node contains a single dual‑core 2.8‑GHz Opteron processor with about 14 GBytes of usable memory. Any of the three programming environments may be used, but you must issue the same "module swap" commands listed above if you want to compile with PathScale or GNU. To schedule a single batch interactive node, use the PBS option "-l ncpus=0".

    To compile a serial or threaded code to run on a login node or on a batch interactive node, use the compilers listed in the following table.

    Compiler Description
    pgf90 PGI Fortran 90/95
    pgf77 PGI FORTRAN 77
    pgcc PGI C
    pgCC PGI C++
    pathf90 PathScale Fortran 77/90/95
    pathcc PathScale C
    pathCC PathScale C++
    gfortran GNU Fortran 90/95
    g77 GNU FORTRAN 77
    gcc GNU C
    g++ GNU C++

    Some useful compiler options on the Cray XT4 are presented in the following table. For additional information on these options, see the compiler man pages or the PGI User's Guide.

    Useful PGI Compiler Options
    Option      Languages        Purpose
    -ON         Fortran & C/C++  Specifies a level of optimization, N, from 0 to 3. As N increases, compilation time generally increases and execution time generally decreases. N=3 may generate results that differ from those obtained at lower levels.
    -fastsse    Fortran & C/C++  An aggregate option that includes a number of individual PGI compiler options. The actual included options depend on the compilation target.
    -Mipa=fast  Fortran & C/C++  Invokes interprocedural analysis (IPA), including several IPA suboptions.
    -dryrun     Fortran & C/C++  Causes the command‑line inputs to be printed to stdout but not actually performed.
    -Minfo=all  Fortran & C/C++  Causes the PGI compilers to issue informational messages to stdout as compilation proceeds. From these messages, you can determine which loops are optimized using unrolling, SSE/SSE2 instructions, vectorization, parallelization, interprocedural analysis, and various other optimizations.
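
    As an illustration of how the options in the table above are combined on a command line, the sketch below builds a C program with the PGI environment loaded. The file names are hypothetical, and the particular mix of options is an example rather than a recommended setting. The sketch defaults to a dry-run mode that only prints each command; set DRY_RUN=0 on jade to execute them.

```shell
#!/bin/sh
# Sketch: combine PGI options from the table on one compile line.
# DRY_RUN=1 (the default) prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

SRC="solver.c"       # hypothetical source file
EXE="solver"         # hypothetical executable name

# Preview what the compiler driver would do without compiling anything.
run cc -dryrun -fastsse -o "$EXE" "$SRC"

# Optimized build that reports which loops were unrolled, vectorized, etc.
run cc -fastsse -Mipa=fast -Minfo=all -o "$EXE" "$SRC"
```

    Because -O3 may change numerical results, compare the output of an optimized build against a lower optimization level before using it for production runs.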

    AMD Core Math Library (ACML) and Cray LibSci
    In addition to the AMD Core Math Library (ACML), jade provides Cray's LibSci library as part of the Cray Programming Environment. This library is a collection of single‑processor and parallel numerical routines that have been tuned for optimal performance on Cray XT systems. The LibSci library is loaded by default and contains optimized versions of many of the BLAS math routines. In general, LibSci contains most of the ACML routines, but users can opt to load the ACML module to allow ACML routines to take precedence over LibSci routines. Users should call these routines, instead of public‑domain or user‑written versions, to optimize application performance on jade.

    The ACML includes the following:

    • Basic Linear Algebra Subroutines (BLAS) - Levels 1, 2, and 3
    • Linear Algebra Package (LAPACK)
    • Fast Fourier Transform (FFT) routines for single-precision, double-precision, single-precision complex, and double-precision complex data types
    • Random Number Generator
    • Fast Math and Fast Vector Library

    Cray LibSci includes the following:

    • Goto Basic Linear Algebra Subroutines (BLAS) - Levels 1, 2, and 3
    • Linear Algebra Package (LAPACK)
    • Scalable LAPACK (ScaLAPACK) (distributed-memory parallel set of LAPACK routines)
    • Basic Linear Algebra Communication Subprograms (BLACS)
    • Iterative Refinement Toolkit (IRT)
    • SuperLU (for large, sparse nonsymmetrical systems of linear equations)

    C and C++ Compilers
    Jade provides both C and C++ compilers. The C compiler conforms to the ANSI C standard as well as "traditional C," the dialect of C defined by Kernighan and Ritchie in "The C Programming Language." Compiler options allow compilation of programs written in "traditional C" or pure ANSI C.

    By default, with the PGI programming environment loaded, the cc and CC commands invoke the Portland Group (PGI) C and C++ compilers. The C and C++ command‑line syntax is as follows:

    cc [option(s)] filename[...]
    CC [option(s)] filename[...]

    where option(s) are one or more command‑line options (see the table of useful compiler options above), and filename is the name of the source file, assembly‑language file, object file, or library to be processed by the compilation system. More than one filename may be specified.

    The following table lists the C and C++ file extensions that are supported on jade:

    Filename                                       Assumed Type
    file.s                                         Assembly language source file
    file.o                                         Object file
    file.a                                         Library input file
    file.lst                                       Listing file
    a.out                                          C or C++ executable output file
    file.c                                         C source file
    file.C, file.c++, file.cc, file.cxx, file.cpp  C++ source files

    After compilation, a ".o" extension will be added to each object program produced. For further information and a list of options, use "man cc" or "man CC".

    Fortran Compilers
    The full ANSI standard capabilities of FORTRAN 77, Fortran 90, and Fortran 95 are available on jade, along with a comprehensive set of Fortran extensions.

    The FORTRAN 77, Fortran 90, and Fortran 95 command‑line syntax is as follows:

    f77 [option(s)] filename[...]
    ftn [option(s)] filename[...]

    where option(s) are one or more command‑line options (see the table of useful compiler options above), and filename is the name of the source file, assembly‑language file, object file, or library to be processed by the compilation system. More than one filename may be specified.

    The following table lists the Fortran file extensions that are supported on jade.

    Filename Assumed Type
    file.a Library file to be searched for external references.
    file.f, file.F Input Fortran source file in fixed source form. If the filename extension is .F, the Fortran preprocessor is invoked.
    file.f90, file.F90, file.f95, file.F95 Input Fortran source file in free source form. If the filename extension is .F90 or .F95, the Fortran preprocessor is invoked.
    file.i Preprocessor output file
    file.lst Listing file.
    file.o Object file.
    file.s Assembly language file
    a.out Default name for a binary (executable) file

    After compilation, a ".o" extension will be added to each object program produced. For further information and a list of options, use "man f77" or "man ftn".

  4. Parallel Programming Models
    Jade supports three programming models: Message Passing Interface (MPI), SHared‑MEMory (SHMEM), and Open Multi‑Processing (OpenMP). MPI and SHMEM are examples of the message‑ or data‑passing models, while OpenMP uses only shared memory on a node by spawning threads.

    Message Passing Interface (MPI)
    The MPI package on jade is derived from MPICH‑2 and implements the MPI‑2 standard except for spawn support. It also implements the MPI 1.2 standard, as documented by the MPI Forum in the spring 1997 release of MPI: A Message Passing Interface Standard.

    For more information on included MPI‑2 features, see the Cray XT Series Programming Environment User's Guide, available on‑line from Cray.

    On UNICOS/lc systems, MPI is a component of the Cray Message Passing Toolkit (MPT), which is a software package that supports parallel programming across a network of computer systems through a technique known as message passing. MPI establishes a practical, portable, efficient, and flexible standard for message passing that makes use of the most attractive features of a number of existing message‑passing systems, rather than selecting one of them and adopting it as the standard. See "man intro_mpi" for additional information.

    When creating an MPI program on jade, ensure that the following actions are taken:

    • Make sure the Message Passing Toolkit is loaded.
      Run "module list" and check that xt-mpt is listed. If it is not, run "module load xt-mpt".
    • Make sure the source code includes the line
      INCLUDE "mpif.h"      (if written in Fortran), or
      #include <mpi.h>      (if written in C)

    To compile an MPI program, use the following examples:

    cc -o mpi_program mpi_program.c
    ftn -o mpi_program mpi_program.f

    To run an MPI program within a batch script, use the following command:

    aprun -n N mpi_program [user_arguments]

    where N is the number of processes to start. The aprun command launches executables across a set of CNL compute nodes. File operations performed by the compute node processes (if not directed to a parallel I/O facility) are transparently forwarded to aprun, which executes the operations and returns the results to the application. When each member of the parallel application has exited, aprun exits. For more information about aprun, see the aprun man page.
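
    As an illustration of what the N processes started by aprun typically do, each process usually determines its own rank and the total process count (in MPI, through MPI_Comm_rank and MPI_Comm_size) and then works on its own share of the problem. The sketch below shows only that arithmetic. The names block_t and block_range are hypothetical and are not part of MPI or the Message Passing Toolkit; the routine is plain C so that it can be checked without launching a job.

```c
#include <assert.h>

/* Half-open range [begin, end) of item indices owned by one process. */
typedef struct {
    long begin;
    long end;
} block_t;

/*
 * Hypothetical helper (not an MPI routine): divide n items as evenly as
 * possible among nprocs processes. The first (n % nprocs) ranks receive
 * one extra item. In an MPI program, rank and nprocs would be the values
 * returned by MPI_Comm_rank and MPI_Comm_size.
 */
block_t block_range(int rank, int nprocs, long n)
{
    long base;
    long extra;
    block_t b;

    assert(nprocs > 0);
    assert(rank >= 0 && rank < nprocs);
    assert(n >= 0);

    base = n / nprocs;    /* items every rank receives        */
    extra = n % nprocs;   /* leftover items, one per low rank */

    b.begin = rank * base + (rank < extra ? rank : extra);
    b.end = b.begin + base + (rank < extra ? 1 : 0);
    return b;
}
```

    For a job started with "aprun -n 4" and 10 items of work, this split would give ranks 0 and 1 three items each and ranks 2 and 3 two items each.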

    SHared MEMory (SHMEM)
    The logically shared, distributed-memory access routines provide high‑performance, high‑bandwidth communication for use in highly parallelized scalable programs. The SHMEM data‑passing library routines are similar to the MPI library routines: they pass data between cooperating parallel processes. The SHMEM data‑passing routines can be used in programs that perform computations in separate address spaces and that explicitly pass data to and from different processes in the program.

    The SHMEM routines minimize the overhead associated with data‑passing requests, maximize bandwidth, and minimize data latency. Data latency is the length of time between a process initiating a transfer of data and that data becoming available for use at its destination.

    SHMEM routines support remote data transfer through put operations that transfer data to a different process and get operations that transfer data from a different process. Other supported operations are work‑shared broadcast and reduction, barrier synchronization, and atomic memory updates. An atomic memory operation is an atomic read and update operation, such as a fetch and increment, on a remote or local data object. The value read is guaranteed to be the value of the data object just prior to the update. See "man intro_shmem" for details on the SHMEM library.

    When creating a SHMEM program on jade, ensure that the following actions are taken:

    • Make sure the Message Passing Toolkit is loaded.
      Run "module list" and check that xt-mpt is listed. If it is not, run "module load xt-mpt".
    • Make sure the source code includes the line
      INCLUDE 'mpp/shmem.fh'       (if written in Fortran), or
      #include <mpp/shmem.h>       (if written in C).
    • Make sure the compile command includes an option to reference the SHMEM library.

    To compile a SHMEM program, use the following examples:

    cc -lsma -o shmem_program shmem_program.c    or
    ftn -lsma -o shmem_program shmem_program.f90

    Before running a SHMEM program, you may want to set the following environment variables:

    setenv XT_LINUX_SHMEM_STACK_SIZE 24m
    setenv XT_LINUX_SHMEM_HEAP_SIZE 120m
    setenv XT_SYMMETRIC_HEAP_SIZE 20m

    The program can then be launched using the aprun command as follows:

    aprun -n N shmem_program [user_arguments]

    where N is the number of processes to start. The aprun command behaves as described above for MPI programs. For more information about aprun, see the aprun man page.

    For more information on the performance and use of SHMEM calls, see the Cray XT Series Programming Environment User's Guide, available on‑line from Cray.

    Open Multi-Processing (OpenMP)
    OpenMP is a shared‑memory parallel programming model that consists of a set of compiler directives (Fortran directives, C and C++ pragmas), library routines, and environment variables.

    When creating an OpenMP program on jade, ensure that the following actions are taken:

    • Make sure the Message Passing Toolkit is loaded.
      Run "module list" and check that xt-mpt is listed. If it is not, run "module load xt-mpt".
    • If using OpenMP functions (for example, omp_get_wtime), make sure the source code includes one of the following lines:
      USE omp_lib             (if written in Fortran), or
      INCLUDE 'omp_lib.h'     (if written in Fortran), or
      #include <omp.h>        (if written in C).
    • Make sure the compile command includes the option that enables OpenMP. The PGI, PathScale, and GNU compilers all support OpenMP, but the option differs from compiler to compiler.

    To compile an OpenMP program, use the following examples:

    # For C codes:
    cc -o OpenMP_program -mp=nonuma OpenMP_program.c  # PGI compiler
    cc -o OpenMP_program -mp OpenMP_program.c         # PathScale compiler
    cc -o OpenMP_program -fopenmp OpenMP_program.c    # GNU compiler

    # For Fortran codes:
    ftn -o OpenMP_program -mp=nonuma OpenMP_program.f # PGI compiler
    ftn -o OpenMP_program -mp OpenMP_program.f        # PathScale compiler
    ftn -o OpenMP_program -fopenmp OpenMP_program.f   # GNU compiler

    To run an OpenMP program within a batch script, you also need to set the $OMP_NUM_THREADS environment variable to the number of threads in the team. For example:

    setenv OMP_NUM_THREADS 4
    aprun -n 1 -d 4 OpenMP_program [user_arguments]

    In the example above, aprun starts a single instance of OpenMP_program on one node and reserves four cores for it (-d 4). The program then spawns three additional threads, for a team of four.
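
    As a minimal sketch of the kind of code such a thread team executes, the C routines below share a summation loop among the threads and report the team size. The names team_size and parallel_sum are hypothetical and chosen only for illustration. The _OPENMP macro is defined by the compiler when an OpenMP option is given, so the guard lets the same file compile serially when the option is omitted.

```c
#include <assert.h>
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>   /* needed only because omp_get_max_threads() is called */
#endif

/*
 * Number of threads a parallel region would use by default. This follows
 * OMP_NUM_THREADS when that variable is set. Returns 1 when the file is
 * compiled without an OpenMP option.
 */
int team_size(void)
{
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}

/*
 * Sum the first n elements of x. The loop iterations are divided among
 * the threads of the team, and the reduction clause combines the
 * per-thread partial sums into a single result.
 */
double parallel_sum(const double *x, int n)
{
    double sum = 0.0;
    int i;

    assert(n >= 0);
    assert(x != NULL || n == 0);

#pragma omp parallel for reduction(+:sum)
    for (i = 0; i < n; i++) {
        sum += x[i];
    }
    return sum;
}
```

    When these routines are part of a program built with one of the compile commands above and run with OMP_NUM_THREADS set to 4, team_size would be expected to return 4, and the iterations of the loop in parallel_sum would be spread across those four threads.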

    An application built with the hybrid model of parallel programming can run on jade using both OpenMP and MPI. In OpenMP/MPI applications, MPI calls can be made from the serial (non‑threaded) regions of each MPI process but not from inside the threaded (OpenMP parallel) regions.

  5. Job Submission
    The Portable Batch System (PBS) is currently running on jade. It schedules jobs and manages resources and job queues, and can be accessed through the interactive batch environment or by submitting a batch request. PBS is able to manage both single‑processor and multiprocessor jobs.

    Available Queues
    Currently, the XT4 batch environment consists of seven queues.

    1. urgent queue. This queue is reserved for time‑critical processing. Users must request access to this queue and provide justification for such access.
    2. debug queue. This queue should be used for small, short duration jobs and for debugging code. Anyone with an account on the system and remaining allocation can use this queue.
    3. high queue. This queue is for DoD high‑priority projects. Users must have a high‑priority allocation to use this queue.
    4. challenge queue. Users must be part of a DoD Challenge project to have access to this queue.
    5. standard queue. Anyone with an account on the system and remaining allocation can submit jobs to this queue.
    6. special queue. This queue is reserved for jobs requiring wall time greater than 48 hours. Users must request access to this queue and provide justification for such access.
    7. background queue. Anyone with an account on the system can submit jobs to this queue. Users who have exceeded their allocation are restricted to this queue.

    For a complete description of job queue limits for jade, see the Cray XT4 Queue Limits Summary.

    Interactive Environment
    When you log in to jade, you will be running in an interactive shell on a login node. These nodes are for compiling, editing, and general interactive use by all users. You may run small applications on these nodes only if they complete in less than 10 minutes and use less than 2 GBytes of memory. In the interest of all users, any job running on the login nodes that exceeds these limits may be unilaterally terminated. The preferred method to run interactive jobs is to use the Interactive Batch Environment. Jobs submitted to the batch queuing system from the Interactive Batch Environment are run on the compute nodes.

    Interactive Batch Environment
    In order to use the interactive batch environment, you must first acquire an interactive batch shell. This is done by executing a qsub command with the "-I" option from within the interactive environment. For example,

    qsub -l ncpus=# -A project_name -q queue_name -l walltime=wall_time -I

    Your batch shell request will be placed in the desired queue and scheduled for execution. This may take a few minutes, depending on the system load. Once your shell starts, you will be logged in to one of the PBS host nodes. At this point, you can run or debug interactive applications, execute job scripts, launch programs on the compute nodes with the aprun command, postprocess data, and so on.

    Batch Request Submission
    An alternative to using the interactive batch environment is to submit batch requests directly to PBS from within the interactive environment. This is done by using the qsub command to hand off a job script to the PBS scheduler. The scheduler will determine when the job is eligible for execution based on job resource requirements and available system resources.

    Creating a Batch Script
    While it is possible to give all PBS directives on the qsub command line, the preferred method is to embed the PBS directives within the batch request script using "#PBS". Such a script might look like the following:

    # This is a sample PBS batch script.

    # Declare the project under which this job run will be charged.
    # (required)
    # Users can find eligible projects by typing "show_usage" on the command line.
    #PBS -A project_name

    # Request 1 hour of wallclock time for execution (required).
    #PBS -l walltime=01:00:00

    # Request 4 cores (required).
    #PBS -l ncpus=4

    # Submit job to debug queue (required).
    #PBS -q debug

    # Declare a jobname.
    #PBS -N myjob

    # Send standard output (stdout) and error (stderr) to the same file.
    #PBS -j oe

    # Make a new subdirectory in working storage space.
    mkdir $WORKDIR/projA-7

    # Change to the new directory.
    cd $WORKDIR/projA-7

    # Check DMS availability. If not available, then wait.
    archive stat -s

    # Retrieve executable program from the DMS.
    archive get -C $ARCHIVE_HOME/project_name program.exe

    # Retrieve input data file from the DMS.
    archive get -C $ARCHIVE_HOME/project_name/input data.in

    # Execute the parallel program retrieved above on 4 cores.
    aprun -n 4 ./program.exe < data.in > projA-7.out

    # Check DMS availability. If not available, then wait.
    archive stat -s

    # Create a new subdirectory on the DMS.
    archive mkdir -C $ARCHIVE_HOME/project_name output7

    # Transfer output file back to the DMS.
    archive put -C $ARCHIVE_HOME/project_name/output7 projA-7.out

    # Clean up unneeded files from working storage.
    cd $WORKDIR
    rm -r projA-7

    Submitting a Batch Script
    To submit the batch request script, use the following command:

    qsub scriptname

    When the script (above) begins execution, it first copies the executable program and input files from your $ARCHIVE_HOME directory to your working directory, $WORKDIR. It then runs the executable on four cores and returns the results to your $ARCHIVE_HOME.

    Option "-j oe" creates a file named myjob.ojobid that contains both stdout and stderr from the job. This file name is a combination of the name that you supplied with the "-N" option and the numeric job ID assigned by PBS.

    You can monitor batch jobs by using the qstat, qview, or qhist commands. You can delete a job by using "qdel job_ID". The job_ID can be obtained from the output of the qstat command.

    For more information on qsub and other PBS commands, see their respective man pages or the PBS qsub Quick Reference Guide.

    Single-System View Commands
    Regular Unix commands operate only on the node to which you are logged in. The following commands operate on, and report information about, the entire system. Further information and command syntax can be found using the system's man page utility.

    Command       Description
    xtshowmesh    Shows information about compute and service partition processors and the jobs running in each partition.
    xtshowcabs    Shows information about compute and service nodes organized by chassis and cabinet.
    xt_ps         Provides process information for all login nodes of the system.
    xthostname    Displays or sets the xthostname value.
    xt_who        Shows users logged onto the Cray XT4 system.
    xt_free       Shows free and used physical memory for all login nodes.

Last update: January 04, 2010
