
High performance computing in the cloud

Is your business ready to reap the benefits?

The democratisation of data

Supercomputing is inching downmarket. Although large government and research institutions were once the main users of large-scale processing, it is now within reach of businesses and organisations looking to create value and insights from vast data volumes.

The cloud is a major driver of the supercomputing market. According to industry research, the on-premises high-performance computing (HPC) market is growing at a compound annual growth rate (CAGR) of 6.9% through 2026, while the cloud supercomputing market—currently representing only 27% of total spend—is growing at a 17.6% CAGR.¹ The cloud model offers nearly limitless resources for processing and storage across a network of cloud centres, making cloud supercomputing attractive and achievable for businesses that are unable or unwilling to build and maintain the immense facilities required for on-premises HPC.

However, it takes more than processing and storage to support a supercomputing application in the cloud. The cloud must securely transport massive amounts of data between cloud centres (and sometimes back to the source), so organisations planning to run supercomputing applications must incorporate a supernetworking component—a network that is as high-bandwidth and massively scalable as the application it supports.

In this brief, we review HPC definitions and data requirements and explore the role of the network in a successful cloud supercomputing initiative.

Defining HPC

HPC refers to the aggregation of processing capability to support complex analytics and simulations on massive amounts of data, measured in petabytes (PB) or even exabytes. Supercomputing is a type of HPC that involves hundreds or thousands of individual processors working together on a task. The term supercomputer generally refers to a massively scalable hardware cluster designed to perform parallel processing. Supercomputing requires specialised hardware, hosted in the cloud or on-premises, to run complex algorithms on massive datasets quickly and in parallel.
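
To make the parallel-processing idea concrete, the toy Python sketch below splits a numerical task across local CPU cores using the standard multiprocessing module. It is only an illustration of the divide-and-conquer pattern that supercomputers apply at the scale of thousands of nodes, not an HPC implementation; the workload and chunk sizes are arbitrary.

```python
# Toy illustration of parallel processing: split a large task into chunks
# and process them concurrently. Supercomputers apply the same pattern
# across thousands of nodes instead of a handful of local CPU cores.
from multiprocessing import Pool

def analyse_chunk(chunk):
    """Stand-in for a compute-heavy kernel (simulation step, model training, etc.)."""
    return sum(x * x for x in chunk)

if __name__ == "__main__":
    data = list(range(1_000_000))
    chunk_size = 100_000
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

    with Pool() as pool:                          # one worker per available core
        partial_results = pool.map(analyse_chunk, chunks)

    print("combined result:", sum(partial_results))
```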

To support on-premises installations, commercial system manufacturers design hardware to handle massive data growth, scaling to hundreds of thousands of nodes while ensuring consistent performance. This approach can be costly and incur considerable capital expenditure as the application scales. Operating expenses can also skyrocket from hiring or engaging additional skilled resources to support growing infrastructure installations.

For cloud installations, public cloud providers offer specialised instance types designed to support compute-intensive or machine learning (ML)/neural networking workloads. In addition, providers may offer functionality to build, deploy, and manage HPC clusters. The public cloud option relieves organisations of the capital and operating expense burdens associated with on-premises installations, as the cloud service provider takes on responsibility for acquiring, installing, and maintaining the infrastructure.
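
As a purely hypothetical sketch of what a managed HPC cluster definition typically captures, the Python snippet below assembles a provider-agnostic specification. The field names and values are invented for illustration and do not correspond to any specific provider's API or instance catalogue.

```python
# Hypothetical, provider-agnostic description of a cloud HPC cluster request.
# Field names and values are invented for illustration; a real deployment would
# use the chosen provider's own cluster-management tooling and instance names.
hpc_cluster_spec = {
    "name": "genomics-hpc-demo",
    "head_node": {"instance_type": "compute-optimised-large", "count": 1},
    "compute_nodes": {
        "instance_type": "gpu-accelerated-xlarge",   # ML/neural-network workloads
        "min_count": 16,
        "max_count": 512,                            # scale out as the job demands
    },
    "interconnect": "low-latency-rdma",              # fast node-to-node traffic
    "shared_storage_tb": 500,                        # parallel filesystem for input data
    "scheduler": "batch-queue",                      # jobs queued across the cluster
}

def estimated_peak_nodes(spec: dict) -> int:
    """Convenience helper: head node plus the maximum compute fleet size."""
    return spec["head_node"]["count"] + spec["compute_nodes"]["max_count"]

print(f"{hpc_cluster_spec['name']}: up to {estimated_peak_nodes(hpc_cluster_spec)} nodes")
```

The point of the sketch is that the organisation mainly describes the shape of the cluster it needs; procuring, racking, and maintaining the underlying hardware sits with the provider.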

Artificial intelligence and ML drive HPC use cases

If the public cloud is democratising HPC infrastructure, cloud-based AI and ML functionality is democratising HPC use cases. Commercial and public organisations of all sizes are turning to data to create value—for their use, on behalf of clients, or on behalf of society—and they are increasingly leveraging easily accessible and affordable cloud-based AI and ML tools to do so.

The diverse and creative ways organisations are using AI and ML include:

  • Modernising financial services, e.g. credit card fraud detection, real-time stock tracking, trading automation
  • Disrupting industries, e.g. self-driving cars, chatbot-based services, self-guided technical support
  • Processing environmental data, e.g. seismic data processing and deep learning statistical analysis for oil and gas exploration
  • Diagnosing disease and prescribing treatment, e.g. gene therapy
  • Exploring contingencies, e.g. flight simulators, behaviour predictions
  • Building simulations for manufacturing designs, e.g. automotive and airline industries, photorealistic 3D rendering

In each case, the AI/ML model requires massive amounts of data for ongoing training. As input data quantities and variables grow and the outcomes become more complex, the AI/ML model requires more processing power. When the volume, complexity, and required processing speed exceed normal computing capacity, it becomes HPC.

Disparate data types and needs shape HPC workloads

HPC workloads ingest massive volumes of data, and depending on the application or use cases, data may vary in format, pace, urgency, and security requirements.

  1. Data may arrive in multiple formats, including bandwidth-hungry formats such as images or streaming media. A single HPC workload may use multiple types of data. For example, a diagnostics research organisation may ingest images and tabular lab results to derive precision medicine solutions.
  2. Data may arrive at a steady and predictable cadence (e.g. earth images used in geolocation apps) or come in spiky patterns (e.g. weather pattern alerts).
  3. Data may require immediate processing (e.g. data derived from autonomous vehicles) or may be stored for future use (e.g. voluntary DNA databases).
  4. Data may be subject to compliance regulations (e.g. private health or financial information) and/or may be proprietary intellectual property that is sensitive to breach (e.g. media and entertainment files).

The volume and diversity of data, and the needs of each application, place a great burden on the networks transporting data to and within the cloud/data centres that perform the processing.

HPC applications require a high-performance network

A data-intensive HPC workload may ingest upwards of 10 PB per day, process and return instructions instantaneously, and then forward the data for storage and additional processing in a remote cloud centre. Depending on the specific HPC application, data ingest and transfer scenarios may include:

• High volumes of data collected at the edge (e.g., via sensor-equipped Internet of Things [IoT] or Industrial IoT devices or endpoints) must shift to a cloud facility for processing, with results returned quickly.

• PB-scale data from multiple sources (e.g., smart infrastructure, weather stations) aggregate at a facility for cleansing and verification before shifting to the HPC cloud.

• HPC workloads in private data centres may be configured for cloud bursting (moving to a cloud HPC centre) when data volumes spike beyond the local centre's capacity.

• An HPC workload may require multi-step processing of massive data volumes, with data exchanged among intermediaries based on processing outcomes.

Each scenario requires massive data volumes to move securely, reliably, and quickly across network connections.
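
To put those ingest volumes in perspective, the back-of-the-envelope Python calculation below converts the 10 PB-per-day figure cited above into the sustained bandwidth a network would need just to keep pace. The figure is taken from the text as an illustrative assumption, not a measured benchmark.

```python
# Back-of-the-envelope: sustained bandwidth needed for a 10 PB/day ingest rate.
PETABYTE_BITS = 10**15 * 8          # 1 PB expressed in bits (decimal petabyte)
SECONDS_PER_DAY = 24 * 60 * 60

ingest_pb_per_day = 10
sustained_bps = ingest_pb_per_day * PETABYTE_BITS / SECONDS_PER_DAY

print(f"Sustained rate: {sustained_bps / 1e9:.0f} Gbps")   # ~926 Gbps
# Even before re-transfers between cloud centres, a single 100 Gbps link
# would fall roughly an order of magnitude short of this ingest rate.
```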

Just as a standard enterprise server or public cloud cannot support the intense needs of an HPC workload, neither can traditional network services. Unfortunately, cloud professionals in most organisations do not have the expertise or experience to manage cloud HPC, despite an ability to manage standard cloud workloads successfully. They may not be aware that as data volumes and AI-enabled services (e.g., simulation software) increase, the underlying infrastructure needs to be reassessed. Their first indication of a problem may be poor application performance or lost data.

To maximise the value and performance of HPC workloads, you must ensure you have dedicated network links capable of supporting these unique requirements.

Your high-performance network should deliver:

  1. Very high bandwidth to rapidly move massive-scale ingest data into and out of HPC facilities
  2. Guaranteed, consistent performance with end-to-end low latency and minimal packet loss, from every data source and for every data format transported for HPC purposes
  3. Dedicated, direct connections to HPC centres, edge-to-cloud and cloud-to-cloud, enabling full visibility and management for crucial workloads
  4. Highest level of security to protect vital data end-to-end
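
The second requirement is easy to underestimate. As a rough illustration of why latency and packet loss matter at this scale, the sketch below applies the well-known Mathis approximation for a single standard TCP flow (throughput ≈ MSS / (RTT × √loss)). The round-trip times and loss rates are assumptions chosen purely for illustration; real HPC transfers typically use many parallel, heavily tuned streams.

```python
# Rough illustration (Mathis approximation) of how round-trip time and packet
# loss cap the throughput of a single standard TCP flow. Values are assumptions.
from math import sqrt

MSS_BITS = 1460 * 8           # typical maximum segment size, in bits

def tcp_throughput_bps(rtt_s: float, loss_rate: float) -> float:
    """Approximate single-flow TCP throughput: MSS / (RTT * sqrt(loss))."""
    return MSS_BITS / (rtt_s * sqrt(loss_rate))

for rtt_ms, loss in [(5, 1e-5), (50, 1e-4), (50, 1e-3)]:
    gbps = tcp_throughput_bps(rtt_ms / 1000, loss) / 1e9
    print(f"RTT {rtt_ms:3d} ms, loss {loss:.4%}: ~{gbps:.3f} Gbps per flow")
```

Even under the optimistic assumptions above, a single untuned flow sits orders of magnitude below the 100 Gbps-class capacity HPC ingest demands, which is why dedicated, low-latency, low-loss connections are listed as requirements.
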
The last word

Supercomputing workloads are no longer confined to stodgy research or educational institutions. Thanks to the cloud, any company with visionary insight can leverage massive-scale data and cloud-based tools—such as AI and ML—to transform industries and societies.

However, just as HPC cannot run on standard computing infrastructure, the workloads cannot utilise standard network infrastructure. Cloud HPC requires fast, secure, and reliable ingest of massive amounts of data into the cloud facility. Some workloads also require the same volumes to be transferred among cloud centres and back to the origin sites (edge or private data centre). For essential circuits, organisations must rely on a network services partner that can offer a high-performance network—very high bandwidth, highly scalable, highly reliable, and far-reaching.

Do not let your network hold back the potential of your HPC workloads. With the right network partner, you will be prepared to derive maximum value and insight from your data-intensive HPC workloads.
