Competing in HPC, “Big Data” and Visualization

One of my challenges in coming to Missouri S&T has been to make the most effective use of our meager High Performance Computing (HPC) capabilities to stimulate learning and non-funded research. It has been an ideal opportunity for me to evaluate the rapidly evolving HPC landscape with no predetermined assumptions. An early observation was that we did not have adequate supercomputing resources, but it was also apparent that those with plenty of resources did not necessarily produce proportional results. What we did have was a clear understanding of what we would do if we had more resources. If I focused on HPC alone, I would find myself in a resource battle, trying to gain recognition in the research community based on cores and compute capability. But we were also interested in visualization, and then along came interest in “Big Data”. What I saw was an opportunity.

The one thing I did have was the foundation of an effective research support team, including skill in adapting HPC techniques to fit differences in data and workflow requirements. I also had talented student employees who thought well outside the box and exposed many new options for us. So we started to see that we could compete in processing by adapting our HPC resources to the jobs being requested. It also became increasingly apparent that we were dealing with data that benefited from some form of visualization to help identify what we should be looking for. For example, we have gotten good at presenting large data sets graphically over time, with flexible selection of data attributes, when we are simply looking for anomalies; a sketch of the idea follows below. Now that we are also exploring “Big Data”, I could not help but ask why the concept of large in-memory processing for Hadoop-based data could not be married with traditional HPC and supported by our flexible visualization.
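To make that concrete, here is a minimal sketch of the kind of thing I mean, not our actual tooling: load a large time-stamped data set, let the analyst choose which attribute to plot, and flag points that stray far from the rolling norm so the eye is drawn to possible anomalies. The file path and column names are assumptions for illustration only.

```python
import pandas as pd
import matplotlib.pyplot as plt


def plot_attribute(csv_path: str, attribute: str, window: int = 500, sigmas: float = 3.0):
    """Plot one selectable attribute over time and highlight possible anomalies."""
    # Hypothetical layout: a "timestamp" column plus one column per attribute.
    df = pd.read_csv(csv_path, parse_dates=["timestamp"]).set_index("timestamp").sort_index()

    series = df[attribute]
    rolling_mean = series.rolling(window, min_periods=1).mean()
    rolling_std = series.rolling(window, min_periods=1).std().fillna(0)

    # Points more than `sigmas` standard deviations from the rolling mean are
    # marked so the analyst can decide whether they merit a closer look.
    anomalies = series[(series - rolling_mean).abs() > sigmas * rolling_std]

    fig, ax = plt.subplots(figsize=(12, 4))
    ax.plot(series.index, series.values, linewidth=0.5, label=attribute)
    ax.scatter(anomalies.index, anomalies.values, color="red", s=10, label="possible anomaly")
    ax.set_xlabel("time")
    ax.set_ylabel(attribute)
    ax.legend()
    fig.tight_layout()
    plt.show()


# Example (hypothetical file and column):
# plot_attribute("sensor_log.csv", "temperature")
```

The point is the flexibility: the attribute, window, and threshold are all selectable, because often we do not know in advance what the anomaly will look like.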

It now appears that my first year of exploration is starting to take shape. I have strengthened my human resources and have discovered that the human element is the scarcest resource of all, or at least a flexible human resource team such as ours is. Now I have some financial resources to invest, and this understanding of how these research tools interrelate is helping to stretch what I hope to accomplish. Most of our HPC cluster is devoted to students, so we need a base HPC investment devoted to non-funded research; for us that goal is probably 1,000 cores. But our success is not going to come from those 1,000 cores. It will come from the collaborations we have developed with neighboring university computing centers, who realize that we have more to share than just HPC. We can help them optimize their thousands of cores for the specific computations desired. A good example is computational chemistry.

I mentioned exploring “Big Data”, which has become the darling of big-iron computer sales. In simplest terms, “Big Data” is about managing large, diverse data sets and processing them with large amounts of memory. The real driver of “Big Data” is the need to analyze the massive amounts of real-time data flowing in about customer buying habits. Of course, we have been led to believe that all of our analytical investigations should use “Big Data”. That is not true for analyzing student data, but it can be true for analyzing some forms of scientific data. And guess what: “Big Data” really means data too big to visualize with traditional spreadsheet-type tools. So I am thinking, why can't we blend HPC and “Big Data” with our new nimble visualization techniques? We have all the ingredients, and the most important turns out to be the human factor. So now I am throwing some DBAs into the equation, along with scientific software engineers, with plans to expand the visualization resources; a rough sketch of the blend follows below. We should be able to meet most of our processing needs locally or via sharing with regional partners. Add in efficient on-ramps to XSEDE and the Open Science Grid, and we can compete with anyone.
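Here is a rough sketch of that blend, assuming PySpark is available on the cluster: do the heavy in-memory reduction over Hadoop-resident data where the memory lives, then hand back only a small summary to the kind of flexible local visualization sketched earlier. The HDFS path and column names are placeholders, not our actual data.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (SparkSession.builder
         .appName("summarize-for-visualization")
         .getOrCreate())

# Large, diverse data set living in HDFS (placeholder path).
events = spark.read.parquet("hdfs:///data/instrument_events")

# Reduce billions of rows to an hourly summary per instrument, in memory on
# the cluster; this is the part that benefits from "Big Data" machinery.
summary = (events
           .withColumn("hour", F.date_trunc("hour", F.col("event_time")))
           .groupBy("instrument_id", "hour")
           .agg(F.count("*").alias("events"),
                F.avg("reading").alias("mean_reading")))

# The summary is small enough to hand off to local, spreadsheet-free plotting,
# e.g. the attribute plot sketched above.
summary.toPandas().to_csv("hourly_summary.csv", index=False)

spark.stop()
```

The design choice is simply to keep the data-parallel reduction and the human-driven visualization as separate, loosely coupled steps, which is what lets a small team serve many different workflows.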

About ghsmith76

Greg Smith is currently the Interim CIO at Western Washington University. Prior to WWU, Greg was the CIO at Missouri S&T, and before that the CIO for George Fox University in Newberg, OR. Greg came to the Northwest from the Purdue School of Engineering and Technology in Indianapolis, IN, where he served as Director of IT for 8 years. Prior to his IT career in academia, Greg was a Systems Consultant with Hewlett-Packard, primarily with the Analytical Group, working out of San Francisco, Cincinnati, and Indianapolis. Greg's passion as a CIO in Higher Education comes from his belief that technology can benefit teaching and learning.

Posted on February 16, 2014, in academic, Big Data, Collaboration, Higher Education, HPC.
