DOE National Lab Workshops: Learn about Large Datasets
Wednesday, September 18, 2019 — 1:00PM - 5:00PM
Learn about large datasets from three National Laboratories! In this 4-hour tutorial series, get to know the types of research performed by scientists at Los Alamos National Laboratory (LANL), Lawrence Livermore National Laboratory (LLNL), and Oak Ridge National Laboratory (ORNL). As applications become more complex and interdisciplinary, so does the data they accumulate. To work with these large datasets, you will need tools to organize, analyze, and visualize them.
Using the background of research done at LANL, LLNL, and ORNL, the tutorial leaders from each Lab will discuss the following topics and provide an interactive experience in using these tools:
The Grand Unified File Indexing (GUFI) system is a hybrid indexing capability designed to help storage administrators and users manage their data. GUFI won a 2018 Research & Development Top 100 Award, part of a prestigious innovation awards program that for the past 56 years has honored R&D pioneers and their revolutionary ideas in science and technology. GUFI takes a new, hierarchical approach to indexing file metadata, allowing rapid parallel searches across many internal databases. Queries that previously would have taken hours or days can now be run in seconds.
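The idea of a hierarchical index searched in parallel can be illustrated with a toy sketch. This is not GUFI's actual code, schema, or command-line interface; it only assumes, for illustration, one small SQLite database per directory and a thread pool that runs the same query against each database. The names `index.db`, `entries`, `build_index`, and `find_large_files` are hypothetical.

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor

DB_NAME = "index.db"  # hypothetical file name, not GUFI's real layout


def build_index(source_root, index_root):
    """Mirror the source tree and write one small database per directory."""
    for dirpath, _dirnames, filenames in os.walk(source_root):
        rel = os.path.relpath(dirpath, source_root)
        index_dir = os.path.normpath(os.path.join(index_root, rel))
        os.makedirs(index_dir, exist_ok=True)
        con = sqlite3.connect(os.path.join(index_dir, DB_NAME))
        con.execute("CREATE TABLE entries (name TEXT, size INTEGER)")
        con.executemany(
            "INSERT INTO entries VALUES (?, ?)",
            [(f, os.path.getsize(os.path.join(dirpath, f))) for f in filenames],
        )
        con.commit()
        con.close()


def query_one(db_path, index_root, min_size):
    """Run the query against a single directory's database."""
    con = sqlite3.connect(db_path)
    try:
        rows = con.execute(
            "SELECT name, size FROM entries WHERE size >= ?", (min_size,)
        ).fetchall()
    finally:
        con.close()
    rel_dir = os.path.relpath(os.path.dirname(db_path), index_root)
    return [
        (os.path.normpath(os.path.join(rel_dir, name)).replace(os.sep, "/"), size)
        for name, size in rows
    ]


def find_large_files(index_root, min_size, workers=4):
    """Query every per-directory database in parallel and merge the hits."""
    db_paths = [
        os.path.join(dirpath, DB_NAME)
        for dirpath, _dirs, files in os.walk(index_root)
        if DB_NAME in files
    ]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_dir = pool.map(lambda p: query_one(p, index_root, min_size), db_paths)
        return sorted(hit for hits in per_dir for hit in hits)


def demo():
    """Build a tiny tree, index it, and search for files of at least 1000 bytes."""
    with tempfile.TemporaryDirectory() as source, tempfile.TemporaryDirectory() as index:
        os.makedirs(os.path.join(source, "run1", "out"))
        files = [("a.dat", 10), ("run1/b.dat", 5000), ("run1/out/c.dat", 9000)]
        for rel, size in files:
            with open(os.path.join(source, *rel.split("/")), "wb") as fh:
                fh.write(b"\0" * size)
        build_index(source, index)
        return find_large_files(index, min_size=1000)


if __name__ == "__main__":
    print(demo())
```

Because each directory has its own database, a query touches only small, independent files, so the work divides naturally across workers. A real system would also need to handle permissions, index updates, and much larger trees, none of which this sketch attempts.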
SIGHT (https://www.olcf.ornl.gov/olcf-resources/rd-project/sight/) is an exploratory visualization tool for large-scale datasets that supports ray tracing, remote and interactive scientific visualization, parallel I/O, and large-scale displays. SIGHT is a custom solution for easy exploration of particle-based datasets.
OpenACC is a programming model for developing portable High Performance Computing applications using hardware accelerators, such as General Purpose Graphics Processing Units (GPGPUs). OpenACC provides a relatively easy way to transform a serial program into a parallel one using the power of accelerator technology. This mini-tutorial will provide an introduction to GPUs, an overview of accelerator programming models, and the specifics of OpenACC. In addition, attendees will be given access to the Oak Ridge Leadership Computing Facility’s Ascent system for a hands-on tutorial. Ascent is a 16-node IBM Power9 system, with 2 Power9 CPUs and 6 NVIDIA V100 GPUs per node.
The workshop format is as follows:
- Part 1: Each tutorial leader will introduce their Lab and discuss the types of research performed there that use these tools. This is followed by a demonstration of GUFI and a discussion of how this data management and organization tool can benefit you prior to data analysis. (1 hour 10 minutes)
- Part 2: Tutorial on SIGHT (55 minutes)
- Part 3: Tutorial on OpenACC (1 hour 40 minutes)
The workshop is limited to the first 150 participants, and the tutorial series is open to participants at all levels: undergraduate students, graduate students, and professionals. Parts 2 and 3 of the workshop build upon the knowledge gained in Part 1. For this reason, attendees must attend Part 1 to participate in Part 2 and/or Part 3.