Modeling and Simulation Group Meeting Old

High performance computing at NYU, and debugging/profiling compiled codes

Speakers: Shafer Smith and Aleksandar Donev

Location: Warren Weaver Hall 1302

Date: Thursday, March 23, 2023, 12:45 p.m.

Synopsis:

Shafer Smith has prepared a Google document summarizing what he presented regarding NYU HPC resources. You must be logged into your NYU Google account to access it.

From Aleks Donev regarding his tutorial on debugging/profiling:

Valgrind and its profiler:

https://valgrind.org/docs/manual/QuickStart.html

https://developer.mantidproject.org/ProfilingWithValgrind.html

Valgrind comes with many tools:

https://valgrind.org/info/tools.html

There you can also find some tools for multithreaded codes, which someone asked about today. For debugging MPI programs, which is harder, see:

https://valgrind.org/docs/manual/mc-manual.html#mc-manual.mpiwrap

https://www.open-mpi.org/faq/?category=debugging#valgrind_clean

Regarding Fortran, I already sent information in an earlier email, but here it is again:

Contrary to many misconceptions/misrepresentations (coming primarily from people who have not written Fortran code since 1990 and have never studied the language), Fortran is a modern programming language. It was used more than half a century ago to launch NASA rockets into space, and its modern versions (the latest being Fortran 2018, with Fortran 2023 about to be ratified as a standard this summer) are still actively used in scientific codes in AOS (see actual code for halo/ghost region sync'ing, which I welcome you to compare in terms of complexity to a related MPI code!). At the end of my MD lecture notes I added some comments (IMO) about features of Fortran that many don't know about. Full disclosure: I am certainly biased in my opinions since I was on the Fortran language committee for many years...

I am quite certain that Fortran is the best compiled language to teach to newcomers/students in scientific computing. I myself have had ~5 summer undergrad students who learned it mostly independently in 1-2 weeks, and then wrote rather sophisticated algorithms in it over a couple of months (e.g., this particle reaction-diffusion code). In case you want to try this yourself, this book by Drew McCormack is pedagogical and a great first book to learn programming from, but it only covers Fortran 95. There are newer books that cover up to Fortran 2008 (now fully implemented in gfortran, though parallel coarrays are not yet efficient in gfortran) and even up to Fortran 2018 (not yet fully implemented by any vendor AFAIK), but I have not read them yet. The Fortran bible (written by people who lead the language committee) is detailed but a bit dry and not good for learning, IMO.

Modern Fortran is much more debuggable (both through strong compile-time checking and runtime checking with compiler debugging flags) and simpler than C++, and has almost all features that a scientific programmer may want (templates/containers being the notable exception). Fortran is certainly not without faults. For example, its committee is dominated by vendors that are resistant to change, it is case insensitive, it lacks a standardized library of commonly used functions (e.g., plotting or stdlib-type stuff) [part of the reason for this is to ensure portability over essentially all Turing machines], and it lacks generic programming (templates, which are expected to be the major new feature in Fortran 2027), etc. But it certainly deserves a little more attention than it receives in today's scientific computing circles, especially at universities.