Q. How is jade configured?
A. Jade is a Cray XT4, a massively parallel processor (MPP) supercomputer that is the
successor to the Cray XT3. Jade contains 2,228 nodes, partitioned into a compute
partition of 2,152 nodes and a service partition of 76 nodes. Each compute node contains
a single 2.1-GHz quad-core AMD Opteron processor and runs a Linux microkernel called
Compute Node Linux (CNL). Each service node contains a single 2.8-GHz dual-core AMD
Opteron processor, runs SUSE Linux, and performs support functions for application and
system services. All nodes are connected in a three-dimensional torus; each node uses a
HyperTransport link to a dedicated Cray SeaStar2 communications engine. Jade has a peak
performance of 72.3 TFLOPS and contains 339 TBytes of Fibre Channel RAID disk space.
Q. What operating system is on jade?
A. The Cray XT4 operating system, UNICOS/lc, consists of two primary components: a microkernel for compute
nodes and a full‑featured operating system for the service nodes. The XT4 CNL
microkernel running on the compute nodes interacts with an application process in a very
limited way by managing virtual memory addressing, providing memory protection, and performing
basic scheduling. This proven microkernel architecture ensures reproducible run times for MPP
jobs, supports fine‑grained synchronization at scale, and ensures high‑performance,
high‑bandwidth MPI and SHMEM communication. Service nodes run a full SUSE Linux
distribution with specific Cray XT4 modifications.
Q. How many nodes and cores are available for user
jobs?
A. On jade, there are 8,608 compute cores on 2,152 nodes. The actual number of nodes available
to a user is controlled by the Portable Batch System (PBS) batch
queue structure. See the Cray XT4 Queue Limits Summary
table for a complete queue summary.
Q. When I log in to jade, where am I running?
A. When you log in to jade, you will be running in an interactive shell on a login node.
Subsequent sequential processes, such as system commands or sequential user programs, will run
on the same node as your login shell.
Q. Where do my compiles execute?
A. All UNIX commands, including compile commands, execute on the login nodes.
Q. Where do my parallel programs execute?
A. Parallel programs execute on their dedicated subset of the 8,608 compute cores.
Q. Can I check the current usage of all the nodes?
A. Yes, an interactive display of node usage is available through the xtshowmesh
command. See "man xtshowmesh" for details.
Q. How much memory can I use on each compute node?
A. Each compute node on jade contains 8 GBytes of memory. Approximately 200 MBytes
are used by system processes. The four cores on a compute node share the remaining
7.6 GBytes of memory.
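As a rough illustration of how the shared memory divides among processes on one node, the sketch below estimates the memory available to each process. The helper name mem_per_process_mb is hypothetical (it is not a jade command), and treating 7.6 GBytes as 7600 MBytes is an assumption made to keep the arithmetic in whole numbers.

```shell
#!/bin/bash
# Usable memory per compute node, taken from the answer above (7.6 GBytes).
# Assumption: 1 GByte is treated as 1000 MBytes for simple integer arithmetic.
NODE_MEM_MB=7600

# mem_per_process_mb is a hypothetical helper, not a jade command.
# $1 = number of processes placed on one node (1 to 4)
mem_per_process_mb() {
    local procs_per_node=$1
    echo $(( NODE_MEM_MB / procs_per_node ))
}

# Print the estimate for one, two, and four processes per node.
for n in 1 2 4; do
    echo "$n process(es) per node: about $(mem_per_process_mb "$n") MBytes each"
done
```

With four processes per node, each process gets a little under 2 GBytes, so a code that needs more memory per process must place fewer processes on each node.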
Q. What application software is available?
A. For a description of supported software, see the Cray XT4
Software Version Matrix. Unsupported software can be found in the directory
/usr/local/usp/ on jade.
Q. What compilers are available?
A. The Portland Group's (PGI) Programming Environment is the default programming environment
on jade. PathScale and GNU compilers are also available. To switch from the PGI default
programming environment, use one of the following commands:
module swap PrgEnv-pgi PrgEnv-pathscale    # To switch to PathScale
module swap PrgEnv-pgi PrgEnv-gnu          # To switch to GNU
Optimization flags are different for each programming environment and can be found in the man pages for each: for PGI, man pgf90, pgf77, pgcc, or pgCC; for PathScale, man pathf90, pathcc, or pathCC; for GNU, man gfortran, g77, gcc, or g++.
To compile your code to run on the compute nodes using any of the three programming environments, use the compiler commands listed in the following table. Note that you will still need to issue an aprun command in a batch job to run the compiled code on the compute nodes.
| Compiler | Description |
|---|---|
| ftn | Fortran 90/95 |
| f77 | FORTRAN 77 |
| cc | C |
| CC | C++ |
You may run small applications on jade's login nodes if they do not run for more than a few minutes. Alternatively, you may use PBS to schedule a batch job on a single batch interactive node. The OS on the batch interactive nodes is full SUSE Linux that can run serial or threaded applications. The batch interactive nodes contain a single dual‑core 2.8‑GHz Opteron processor with about 14 GBytes of usable memory. Any of the three programming environments may be used, but you must issue the same "module swap" commands listed above if you want to compile with PathScale or GNU. To schedule a single batch interactive node, use the PBS option "-l ncpus=0".
To compile a serial or threaded code to run on a login node or on a batch interactive node, use the compilers listed in the following table.
| Compiler | Description |
|---|---|
| pgf90 | PGI Fortran 90/95 |
| pgf77 | PGI FORTRAN 77 |
| pgcc | PGI C |
| pgCC | PGI C++ |
| pathf90 | PathScale Fortran 77/90/95 |
| pathcc | PathScale C |
| pathCC | PathScale C++ |
| gfortran | GNU Fortran 90/95 |
| g77 | GNU FORTRAN 77 |
| gcc | GNU C |
| g++ | GNU C++ |
Q. What do I need to know about modules?
A. The modules package is a convenient way for you to modify your programming environment
without directly modifying the $PATH, $MANPATH, and other environment
variables. The "module list" command can be used to determine what modules are
currently loaded. After a major OS upgrade, the default modules are changed to the latest
tested and proven OS-dependent libraries. You will be notified if you need to relink and/or
recompile your codes to execute under the upgraded OS.
Use "module avail" to determine what modules (and what versions of those modules) are available. Newer compiler versions are often installed for testing purposes and may become the default version at some point in the future.
To use an alternate module, the "module swap" command can be used as shown in the following example that replaces the currently loaded PrgEnv with PrgEnv.3600:
jade$ module swap PrgEnv PrgEnv.3600
Note that libraries listed by the "module avail" command must be loaded before they can be used in compiling. Only libraries listed by the "module list" command are searched automatically during compile and link operations. The command "module load library_name" loads any module library that is shown to be available. A list of all module keywords is given in response to "module help".
Q. What batch‑queuing system is used?
A. The Portable Batch System (PBS) is currently running on jade.
The syntax of the command to run a parallel job is similar to other batch systems.
Q. What batch commands are available?
A. The following table lists some of the available batch commands. For more information, see
the man page for each command.
| Command | Description |
|---|---|
| qsub | Submits a batch job. |
| qstat | Displays information about jobs and queues. |
| qview | Displays information about jobs. |
| qhist | Displays information about a user's jobs. |
| qlim | Displays information about batch queues. |
| qdel | Deletes a job. |
| qsig | Sends a signal to a job. |
| qhold | Places a hold status on a queued job. |
| qrls | Releases a hold status on a job. |
Q. How do I submit a batch job?
A. The preferred method is to embed the PBS directives within the batch request script using
#PBS, as follows:
#PBS -l ncpus=4
#PBS -l walltime=4:00:00
#PBS -A project_name
#PBS -q standard
Then to submit the batch job script with the embedded PBS directives, use the following command:
qsub scriptname
This script must contain your eight‑character project name. Note: the show_usage command will generate a project list. For more information on qsub, see the qsub man page.
Q. What might a batch script for jade look like?
A. A sample batch script appears below, requesting eight compute cores for 1 hour.
#PBS -A project_name
#PBS -l walltime=01:00:00
#PBS -l ncpus=8
#PBS -q standard
#PBS -N myjobname
#PBS -j oe
cd $PBS_O_WORKDIR
# Make a new subdirectory in working storage space.
mkdir $WORKDIR/projA-7
# Change to the new directory.
cd $WORKDIR/projA-7
# Check DMS availability. If not available, then wait.
archive stat -s
# Retrieve executable program from the DMS.
archive get -C $ARCHIVE_HOME/project_name program.exe
# Retrieve input data file from the DMS.
archive get -C $ARCHIVE_HOME/project_name/input data.in
# Execute a parallel program.
aprun -n 8 ./program.exe < data.in > projA-7.out
# Check DMS availability. If not available, then wait.
archive stat -s
# Create a new subdirectory on the DMS.
archive mkdir -C $ARCHIVE_HOME/project_name output7
# Transfer output file back to the DMS.
archive put -C $ARCHIVE_HOME/project_name/output7 projA-7.out
# Clean up unneeded files from working storage.
cd $WORKDIR
rm -r projA-7
When the script executes, it first changes directory to where the job was submitted (cd $PBS_O_WORKDIR). It then makes a run directory under $WORKDIR, changes to the newly created run directory, and gets both the executable file and input file from DMS. The script runs the executable on eight cores. Once the execution is done, the script archives the results back to DMS and cleans up the run directory.
Q. What is the aprun command?
A. The aprun utility loads and executes a program on one or more compute cores. The
aprun utility reads the executable, obtains compute cores for it to run on, sends the
application to the compute cores, and launches the application.
Q. After a batch job completes, where does the output go?
A. Standard error and standard out are written to the files specified by the "-e" and
"-o" options, respectively. By default, these files are created in the directory from
which the job was submitted. Specify the file names with full paths in order for them to be
created in a different location. The default standard error and standard output file names are
jobname.ejobID and jobname.ojobID, respectively.
For more information about standard error and standard out file naming, see the qsub man page.
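The default naming pattern described above can be sketched as follows. The helper name default_output_names is hypothetical (it is not a PBS command), and the job name and job ID used here are made-up values; the sketch also assumes that the job ID portion of the file name is a plain number.

```shell
#!/bin/bash
# default_output_names is a hypothetical helper, not a PBS command.
# $1 = job name (for example, the value given to "#PBS -N")
# $2 = job ID (assumed here to be a plain number)
default_output_names() {
    local jobname=$1 jobid=$2
    echo "${jobname}.o${jobid}"   # default standard output file
    echo "${jobname}.e${jobid}"   # default standard error file
}

# Made-up job name and job ID, for illustration only.
default_output_names myjobname 12345
```

Knowing the pattern in advance makes it easier to locate or clean up output files from the directory in which the job was submitted.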
Q. Is there a way to change directories automatically at
job start?
A. No, PBS always begins execution in $HOME. Insert the command "cd
$PBS_O_WORKDIR" immediately after the last PBS directive to change the current working
directory to the location from which the job was submitted.
Q. How do I merge stderr and stdout in PBS?
A. The qsub option "-j" will cause the stderr and stdout streams to be merged into a
single file. Specify "#PBS -j oe" to send the merged output to the stdout file;
specify "#PBS -j eo" to send the merged output to the stderr file. This is
detailed in the qsub man page.
Q. What queues are available?
A. Jade's batch queue structure is essentially identical to that of the other ERDC DSRC
systems. See the Cray XT4 Queue Limits Summary
table for a complete queue summary.
Q. How do I determine the status of the queues?
A. Use the qlim, "qstat -Q", or "qstat -q" commands to see the available
queues.
Q. How do I monitor batch jobs?
A. Use qview or qstat to list the status of all current batch jobs. For more
information on a particular batch job, use "qstat -f job_id" where
job_id is found in the output from the qview command.
Q. How do I cancel a batch job?
A. To cancel a job, use the "qdel job_id" command where job_id
is found in the output from the qview command.
Q. How much space is in my $HOME directory?
A. Jade users are typically allocated 1 GByte of disk space in their home directory.
This space can be accessed using the $HOME environment variable. Requests for more
home directory space will be considered on a case-by-case basis. A much greater amount of disk
space is available for temporary use in your $WORKDIR directory.
Q. What is $WORKDIR?
A. $WORKDIR is a directory in the /work file system where you can
temporarily store large amounts of data for job execution. It is automatically created for you
upon login if it does not already exist.
Q. Is $WORKDIR local to each node?
A. No, all the nodes on jade see the same file systems. No disks are local to any node.
Therefore, the /work file system is shared.
Q. What is /tmp and how is it used?
A. /tmp is a very small memory‑resident nonpermanent file system used during
program execution. You should not use /tmp. It should be reserved for system
processes.
Q. Which file system should be used for running
I/O‑intensive jobs?
A. The /work file system has the highest file transfer rates.
Q. How do I check my disk usage?
A. You can check your disk usage by using the command "show_storage". This returns the size of
your home, archive, and work directories in megabytes.
Q. Can I use the Data Management System
(DMS) from jade?
A. Yes, you can confirm availability of the DMS, and get and put files, using the archive
command. See the man page for details.
Q. What is the XT4 programming environment?
A. The Cray XT4 programming environment includes tools designed to facilitate the
development of scalable applications. The Opteron processor's native support for 32-bit
and 64-bit applications and its full x86-64 compatibility make the XT4 system
compatible with many existing compilers and libraries, including optimized C, C++, and
Fortran 90/95 compilers and high-performance numerical libraries such as optimized
versions of BLAS, FFTs, LAPACK, ScaLAPACK, and SuperLU.
Communication libraries include MPI and SHMEM. The MPI implementation is compliant with the MPI 2.0 standard and is optimized to take advantage of the scalable interconnect in the XT4 system. The SHMEM library is compatible with previous Cray systems.
Q. What are the sizes of the standard C, C++, and Fortran
data types?
A. Data type precision on jade is IEEE-compliant. The following table summarizes the sizes
of the C/C++ and Fortran data types:
| C/C++ Type | Fortran | XT4 Size in Bits |
|---|---|---|
| char | | 8 |
| | Integer*1 | 8 |
| short | | 16 |
| | Integer*2 | 16 |
| int | Integer*4 | 32 |
| long | Integer*8 | 64 |
| long long | | 64 |
| pointer | Integer*8 | 64 |
| float | Real*4 | 32 |
| double | Real*8 | 64 |
| long double | Real*16 | 128 |
| float complex | Complex*8 | 32 x 2 |
| double complex | Complex*16 | 64 x 2 |
| long double complex | Complex*32 | 128 x 2 |
Q. What parallel programming models are available?
A. Jade supports the Message Passing Interface, version 2 (MPI),
SHared MEMory (SHMEM) and OpenMP shared memory on a node.
Q. What numerical libraries are available?
A. The XT4 provides a 64‑bit AMD Core Math Library (ACML)
and LibSci. ACML provides Level 1, 2, and 3 BLAS; a full suite of Linear Algebra
(LAPACK) routines and a suite of Fast Fourier Transform
(FFT) routines for single‑precision, double‑precision,
single‑precision complex, and double‑precision complex data types. LibSci provides
ScaLAPACK, BLACS, and SuperLU routines, which are not included in the ACML library.
For additional information, you may refer to the AMD Core Math Library and the Cray XT Series Programming Environment User's Guide, available on-line from Cray.
Q. Are special actions required to access the
double‑precision numerical libraries?
A. No, the compilers should automatically resolve any precision issues.
Q. What is MPI, and how do I use it?
A. Message Passing Interface (MPI) is the de facto standard
library for portable message-passing programming. MPICH2 is the implementation on
jade. It is MPI 2.0-compliant except for dynamic process creation. In order to
use MPI routines within your program, you must add a line in the code to reference the MPI
header file, as shown in the following examples:
In C, add the following line to the program and compile:
#include <mpi.h>
cc mpi_program.c
In Fortran, add the following line to the program and compile:
INCLUDE "mpif.h"
ftn mpi_program.f
Execute MPI programs as you would other parallel programs. For more information on MPI, see the MPI home page, available on-line from Argonne National Laboratory. For more information on the Cray implementation of MPI, see the Cray XT Series Programming Environment User's Guide, available on-line from Cray.
Q. What is OpenMP, and how do I use it?
A. OpenMP is a shared‑memory parallel‑programming interface. OpenMP uses
directives inserted into your code to define areas of your code to be executed in parallel by
use of threads. The directives are typically placed at the start and end of large do-loops
that have enough iteration independence to be performed in parallel by separate threads. The
OpenMP directives can also define which variables can be shared in memory or must be private
in memory between the threads. Each compute node on jade has only one quad-core Opteron
processor whose cores share the node's memory, so it makes sense to run with at most four
threads per node. To compile an OpenMP code with the default PGI compiler, add the
"-mp=nonuma" option; with a GNU compiler, add "-fopenmp"; and with the PathScale
compiler, add "-mp".
An example OpenMP code:
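program omptest
  implicit none
  real*8 x(20000)
  integer i, n, nt
  integer OMP_GET_THREAD_NUM, OMP_GET_NUM_THREADS
! Each thread gets private copies of i, n, and nt; the array x is shared.
!$omp parallel private(i,n,nt) shared(x)
  n = OMP_GET_THREAD_NUM()
  nt = OMP_GET_NUM_THREADS()
  print *, "Thread number", n, " of", nt
! Divide the loop iterations among the threads.
!$omp do
  do i = 1, 20000
    x(i) = dble(i) * 3.14d0
  enddo
!$omp end do
!$omp end parallel
end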
The following example shows how to run an OpenMP parallel batch program on a compute node:
# Request one node
#PBS -l ncpus=4
# Set the number of threads to 4
export OMP_NUM_THREADS=4
aprun -n 1 -d 4 ./my_openmp.exe
Q. What is SHMEM, and how do I use it?
A. SHMEM is a library supporting one‑sided communication, available on systems from SGI
and Cray. The SHMEM library is the most efficient means of communication between
processing elements (PEs) available on
jade. In order to use SHMEM routines within your programs, you must add a line in the code
to reference the SHMEM header file and add an option to the compile command to reference the
shared‑memory library, as shown in the following examples:
In C, add the following line to the program and compile:
#include <mpp/shmem.h>
cc -l sma shmem_program.c
In Fortran, add the following line to the program and compile:
INCLUDE "mpp/shmem.fh"
ftn -l sma shmem_program.f
Run SHMEM programs as you would other parallel programs. See "man intro_shmem" for details on the SHMEM library. For more information on the performance and use of SHMEM calls, see the Cray XT Series Programming Environment User's Guide, available on-line from Cray.
Q. What Fortran standards are supported?
A. The Portland Group Fortran 90/95 compiler used on jade fully supports the ANSI
FORTRAN 77, Fortran 90, and Fortran 95 language standards and provides a
comprehensive set of Fortran extensions.
Q. How do I access the GNU version of make?
A. GNU make is the default make command on jade. It is automatically included in
your $PATH.
Q. What programming and performance analysis tools are
available?
A. The TotalView and GNU (gdb) debuggers and the CrayPat, PAPI, and Apprentice2™
performance analysis tools are available on jade. See "man totalview" for details on
TotalView; additional information can be found through the Documentation tab on the
TotalView Technologies website. For additional details about CrayPat, see the pat_build
and pat_report man pages. For details about gdb, see the gdb man pages.
Q. How do I debug parallel programs?
A. Jade provides the TotalView debugger. See the man page for more information.
Q. Where do I run TotalView?
A. If TotalView is run from the login nodes, you can debug serial codes compiled using the
pgf90, pgcc, etc. compilers (with the -g option). TotalView is an X11
application and requires an Xterm session. To run a TotalView (serial) job on a login node,
use the following command:
totalview ./a.out
If you need to debug a parallel code, compile it for the compute nodes (using the -g option) and then follow the steps outlined here.
Q. How do I analyze the performance of parallel
programs?
A. The programming and performance analysis tools described above all
support parallel programs. In addition, they are all "post mortem"; execution of a program
produces an analysis file, and this file is used by the tool well after the program finishes.
Therefore, the performance‑analysis tools are effective within a batch environment.
Simply run each parallel program as a batch job, remembering to copy the necessary analysis
files to permanent storage within the batch script. After a particular job completes, use the
desired tool to interpret the analysis files.
Q. How do I run on a single core per node?
A. If more than 2 GBytes of memory per MPI process is required, then running on a single
core per node can provide up to 7.6 GBytes of memory per MPI process. Because nodes cannot
be shared with other users, this configuration keeps one core per node active
and leaves the other three cores idle. The active core has access to all the memory on the
node. For example, to run 32 MPI processes on 32 nodes, the required PBS directive and
aprun command are:
#PBS -l ncpus=128
aprun -n 32 -N 1 ./a.out
The PBS directive will allocate 128 cores on 32 nodes, but the aprun command option "-N" forces only one core per node to be active.
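The arithmetic behind the ncpus request can be sketched as below. The helper name ncpus_for is hypothetical (it is not a PBS or Cray command); the sketch assumes, as described above, that every core of each allocated node must be requested even when only some of them are active.

```shell
#!/bin/bash
CORES_PER_NODE=4   # each jade compute node has one quad-core processor

# ncpus_for is a hypothetical helper, not a PBS or Cray command.
# $1 = total number of MPI processes (the aprun "-n" value)
# $2 = MPI processes per node (the aprun "-N" value)
ncpus_for() {
    local nprocs=$1 per_node=$2
    # Round up to whole nodes, then request every core on those nodes.
    local nodes=$(( (nprocs + per_node - 1) / per_node ))
    echo $(( nodes * CORES_PER_NODE ))
}

# Reproduce the example above: 32 processes, one per node.
ncpus=$(ncpus_for 32 1)
echo "#PBS -l ncpus=$ncpus"
echo "aprun -n 32 -N 1 ./a.out"
```

For 32 processes at one per node, the helper yields 128, matching the directive shown above. Placing two processes per node would halve the request to 64 cores while giving each process roughly half of the node's memory.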
Q. What login shells are available?
A. The following shells are available on all of our systems: bash, csh, ksh, tcsh, and sh. If
you don't request a specific shell on your account application, you are assigned the csh shell
by default.
Q. How can I change my login shell?
A. You can contact our Service Center via e‑mail,
phone, or walk‑in to have your default shell changed.
Q. What commands are available to provide
information on the entire system?
A. Regular Unix commands operate only on the login node to which you are logged in. The
following commands operate on and provide information about the entire system. Further
information and command syntax can be found in each command's man page.
| Command | Description |
|---|---|
| xtshowmesh | Shows information about compute and service partition nodes and the jobs running in each partition. |
| xtshowcabs | Shows information about compute and service nodes organized by chassis and cabinet. |
| xthostname | Displays or sets the xthostname value. |
Last update: July 10, 2009