
Debugging Tools
TotalView
The TotalView debugger is a tool that lets you debug, analyze, and tune the performance of complex serial, multiprocessor, and multithreaded programs.
Abnormal Termination Processing (ATP)
ATP monitors user applications on Cray XT systems. When an application terminates abnormally, ATP builds a merged stack backtrace tree that gives a concise yet comprehensive view of what the application was doing at the time of its termination.
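To make the idea of a merged stack backtrace tree concrete, here is a minimal sketch in C. It is not ATP's implementation or API: the names `bt_node`, `bt_new`, `bt_merge`, `bt_count`, and `bt_free`, and the `BT_MAX_CHILDREN` limit, are hypothetical. The sketch assumes each process contributes one backtrace listed outermost frame first; identical call-path prefixes share nodes, and each node counts how many processes passed through it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BT_MAX_CHILDREN 8   /* illustrative limit, not an ATP constant */

/* One frame in the merged tree. 'count' is the number of processes whose
   backtrace passes through this frame at this position in the call path. */
typedef struct bt_node {
    const char *func;
    int count;
    int nchildren;
    struct bt_node *children[BT_MAX_CHILDREN];
} bt_node;

/* Allocate a zeroed node for 'func' (the string itself is not copied). */
bt_node *bt_new(const char *func) {
    bt_node *n = calloc(1, sizeof *n);
    if (n) n->func = func;
    return n;
}

/* Find the child of 'n' named 'func', or NULL if there is none. */
static bt_node *bt_child(const bt_node *n, const char *func) {
    for (int c = 0; c < n->nchildren; c++)
        if (strcmp(n->children[c]->func, func) == 0)
            return n->children[c];
    return NULL;
}

/* Merge one backtrace (frames[0] is the outermost frame) into the tree.
   Returns 0 on success, or -1 if allocation fails or a node is full. */
int bt_merge(bt_node *root, const char *const *frames, int depth) {
    assert(root != NULL && frames != NULL);
    bt_node *cur = root;
    cur->count++;
    for (int i = 0; i < depth; i++) {
        bt_node *next = bt_child(cur, frames[i]);
        if (!next) {
            if (cur->nchildren == BT_MAX_CHILDREN) return -1;
            next = bt_new(frames[i]);
            if (!next) return -1;
            cur->children[cur->nchildren++] = next;
        }
        next->count++;
        cur = next;
    }
    return 0;
}

/* Number of processes whose backtrace starts with the given call path. */
int bt_count(const bt_node *root, const char *const *frames, int depth) {
    assert(root != NULL && frames != NULL);
    const bt_node *cur = root;
    for (int i = 0; i < depth; i++) {
        cur = bt_child(cur, frames[i]);
        if (!cur) return 0;
    }
    return cur->count;
}

/* Release the whole tree. */
void bt_free(bt_node *n) {
    if (!n) return;
    for (int c = 0; c < n->nchildren; c++)
        bt_free(n->children[c]);
    free(n);
}
```

Merging the call paths main, solve, MPI_Allreduce from two processes and main, io_write from a third yields a tree in which main has a count of 3 and solve a count of 2, so one compact structure shows where each group of processes was when the job ended.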
lgdb
lgdb launches an application and then prints directions on how to run gdb and attach to the application's processes. Always use the gdb that ships with lgdb in preference to any other version.

Optimization
Software Optimization Guide for AMD64 Processors
Guidelines for serial optimizations specific to the AMD Opteron are available on the AMD site in the Software Optimization Guide for AMD64 Processors.
PGI Compilers & Tools PGI Premier Support at ORNL – Brent Lebeck
Cray XT Optimization Basics – Jeff Larkin
Monitoring Performance with Profiling Tools
PAPI
The Performance API (PAPI) project lets users monitor hardware performance counter events and relate them to the code running on the underlying architecture. This correlation has a variety of uses in performance analysis, including hand tuning, compiler optimization, debugging, benchmarking, monitoring, and performance modeling.
Cray PAT
Cray PAT is the Cray performance analysis tool for instrumenting and tracing code. It may be used to selectively trace specific functions.
Cray Apprentice2
Cray Apprentice2 is a post-processing performance data visualization tool. Use Cray Apprentice2 with Cray PAT to explore the experiment data and generate a variety of interactive graphical reports. It includes an online help system which is accessible whenever Cray Apprentice2 is running.
Cray Apprentice2 for 32 Bit Desktops
FPMPI
FPMPI and FPMPI_papi are lightweight profiling libraries that use the PMPI profiling hooks specified in the MPI standard. Applications linked against one of these libraries gather statistics about MPI use. With FPMPI_papi, PAPI counter data is also collected.
TAU
Tuning and Analysis Utilities (TAU) is a performance analysis tool available from the University of Oregon. It offers several procedures for instrumenting, tracing, and profiling code.
SCALASCA
Scalasca supports an incremental performance-analysis procedure that integrates runtime summaries with in-depth studies of concurrent behavior via event tracing, adopting a strategy of successively refined measurement configurations. It can identify wait states that occur, for example, as a result of unevenly distributed workloads. Such wait states become a particular concern when scaling communication-intensive applications to large processor counts.
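As a rough illustration of how unevenly distributed workloads create wait states, the sketch below assumes each process computes for a known time and then enters a barrier, so every process waits until the slowest one arrives. This is not Scalasca's analysis or API: `barrier_wait_times` is a hypothetical helper, and the barrier model is a simplifying assumption.

```c
#include <assert.h>
#include <stddef.h>

/* For each of n processes, compute the time spent waiting at a barrier:
   the slowest process's compute time minus the process's own compute time.
   Per-process results are written to wait[]; the total is returned. */
double barrier_wait_times(const double *compute, double *wait, size_t n) {
    assert(compute != NULL && wait != NULL);
    if (n == 0) return 0.0;

    /* The barrier releases only when the slowest process arrives. */
    double slowest = compute[0];
    for (size_t i = 1; i < n; i++)
        if (compute[i] > slowest) slowest = compute[i];

    double total = 0.0;
    for (size_t i = 0; i < n; i++) {
        wait[i] = slowest - compute[i];
        total += wait[i];
    }
    return total;
}
```

With compute times of 4, 1, 3, and 4 seconds, the per-process waits are 0, 3, 1, and 0 seconds, for a total of 4 seconds lost to load imbalance at a single barrier.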
MMPI
The Memory Monitor Programming Interface (MMPI) is an API for monitoring the memory usage of a process. MMPI currently reports VmSize (the virtual memory size allocated to a process) and VmRSS (the resident memory actually in use by a process). The library is implemented in C and can be used from C/C++ and Fortran. This page provides examples that demonstrate how to use the library in different programming language environments, as well as with multithreading and MPI.
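For readers unfamiliar with the two values, the sketch below shows one way to read VmSize and VmRSS for the current process. It does not use the MMPI library or reproduce its API: `read_vm_kb` is a hypothetical helper, and the code assumes a Linux system that exposes these fields in `/proc/self/status`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return the value, in kB, of a field such as "VmSize" or "VmRSS" from
   /proc/self/status, or -1 if the file or the field is unavailable.
   Assumes the Linux /proc layout; not part of MMPI. */
long read_vm_kb(const char *field) {
    assert(field != NULL);

    FILE *fp = fopen("/proc/self/status", "r");
    if (!fp) return -1;

    char line[256];
    size_t flen = strlen(field);
    long kb = -1;

    while (fgets(line, sizeof line, fp)) {
        /* Matching lines look like "VmRSS:      1234 kB". */
        if (strncmp(line, field, flen) == 0 && line[flen] == ':') {
            if (sscanf(line + flen + 1, "%ld", &kb) != 1)
                kb = -1;
            break;
        }
    }

    fclose(fp);
    return kb;
}
```

Calling `read_vm_kb("VmSize")` and `read_vm_kb("VmRSS")` before and after a large allocation is a simple way to see how the virtual size and the resident size diverge, since pages usually become resident only once they are touched.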

