Intelligently Running AlphaFold 3 on Fovus with an AI-Optimized HPC Strategy, Saving up to 67% Cost and 45% Time
A Case Study
AlphaFold 3 is a powerful AI-driven protein structure prediction tool developed by DeepMind. Building upon the success of its predecessors, AlphaFold 3 extends beyond protein folding to incorporate broader molecular modeling capabilities. It is widely used in computational biology for predicting protein-protein interactions, accelerating drug discovery, and supporting large-scale proteomics research.
AlphaFold 3 is particularly valuable in computational biology, enabling researchers to model complex biomolecular structures accurately. A key challenge, however, arises in large-batch use cases where one must process hundreds or thousands of proteins or molecules simultaneously. These scenarios include:
- Large-scale protein structure predictions for drug discovery.
- Genomic studies requiring the folding of entire proteomes.
- Molecular dynamics simulations needing pre-folded structures.
In such cases, cost and scalability become critical concerns, as running AlphaFold 3 on cloud GPUs can be resource-intensive and expensive.
Pain Points of Running AlphaFold 3 in the Cloud
Running AlphaFold 3 at scale presents several challenges:
- Cost Challenges: Cloud GPUs can be expensive, especially when running large-batch protein folding operations.
- Scalability of GPUs: GPUs are among the most in-demand resources in the cloud and thus often have limited availability, making them difficult to procure and challenging to scale for large-scale tasks.
- Resource Utilization: The data pipeline step of AlphaFold 3 runs on CPUs and does not benefit from GPU acceleration. Yet, cloud GPU instances often lack powerful CPUs, making running the entire AlphaFold 3 pipeline on GPU instances inefficient.
- Strategy Challenges: The sheer number of instance choices, with different CPUs/GPUs and flexible system configurations, makes it challenging to find the optimal HPC strategy.
Introduction to Fovus
Fovus is an AI-powered, serverless high-performance computing (HPC) platform that delivers intelligent, scalable, and cost-efficient supercomputing power for complex HPC workloads like AlphaFold 3. Fovus uses AI to optimize HPC strategies and orchestrate cloud logistics, making cloud HPC a no-brainer and ensuring sustained time-cost optimality for digital innovation amid quickly evolving cloud infrastructure.
Benefits of Fovus
- Free Automated Benchmarking: Evaluates how different instance choices and system configurations (CPUs, GPUs, memory, and more) quantitatively impact the runtime and cost of different computation stages in your AlphaFold runs.
- AI-driven Strategy Optimization: Automatically determines the optimal instance choice and system configurations (CPUs, GPUs, memory, and more) for each computation stage, which may differ between stages, and executes each stage separately, ensuring efficient and cost-effective computation.
- Dynamic Multi-Cloud-Region Auto-Scaling: Dynamically allocates optimal GPUs to distribute your AlphaFold runs across multiple cloud regions and availability zones according to their availability dynamics, efficiently scaling up your GPU cluster and computation parallelism for large-batch protein folding simulations.
- Intelligent Spot Instance Utilization: Intelligently utilizes spot instances according to their availability and pricing dynamics with spot-to-spot failover capability to further minimize the computation cost of large-batch protein folding simulations.
- Continuous Improvement: Auto-updates benchmarking data and auto-refines your HPC strategy as cloud infrastructure evolves.
- Serverless HPC Model: AI-driven automation eliminates manual setup, allowing single-command or few-click deployment. Users pay only for runtime.
Optimizing AlphaFold on Fovus
Two-Step Strategy for Running AlphaFold 3
AlphaFold 3’s execution can be broken into two distinct stages:
- Data Pipeline Step (CPU-Optimized): Processes multiple sequence alignments (MSA) and other preparatory computations. This stage is CPU-intensive and does not benefit significantly from GPU acceleration.
- Model Inference Step (GPU-Optimized): Runs deep learning-based structure prediction using neural networks, which requires GPU acceleration.
Running both steps on GPU instances is costly and inefficient. A typical breakdown of AlphaFold 3 execution time shows:
- Data Pipeline: 60–80% of total runtime, CPU-intensive.
- Inference: 20–40% of total runtime, GPU-intensive.
Most cloud GPU instances have relatively weak CPUs, making them suboptimal for the data pipeline stage. Fovus mitigates this by running each step on the hardware configuration best suited to it.
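A back-of-the-envelope calculation shows why the split pays off. The hourly prices and the 70/30 runtime split below are illustrative assumptions for a hypothetical job, not Fovus benchmarks:

```shell
# Illustrative only: assumed prices and runtime split, not measured data.
TOTAL_HOURS=10          # assumed total runtime of one folding job
PIPELINE_FRAC=0.7       # data pipeline share of runtime (60-80% per the text)
GPU_PRICE=4.00          # assumed $/hr for a GPU instance
CPU_PRICE=0.60          # assumed $/hr for a CPU instance

# Strategy 1: run everything on the GPU instance.
gpu_only=$(awk -v h="$TOTAL_HOURS" -v g="$GPU_PRICE" \
  'BEGIN { printf "%.2f", h * g }')

# Strategy 2: pipeline hours on the CPU instance, inference hours on the GPU.
two_step=$(awk -v h="$TOTAL_HOURS" -v f="$PIPELINE_FRAC" \
  -v c="$CPU_PRICE" -v g="$GPU_PRICE" \
  'BEGIN { printf "%.2f", h * f * c + h * (1 - f) * g }')

echo "GPU-only cost: \$$gpu_only"   # $40.00
echo "Two-step cost: \$$two_step"   # $16.20
awk -v a="$gpu_only" -v b="$two_step" \
  'BEGIN { printf "Savings: %.0f%%\n", (a - b) / a * 100 }'
```

Under these assumed numbers the two-step strategy cuts cost by roughly 60%, even before spot pricing or per-stage instance tuning is applied.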
How Fovus Optimizes AlphaFold 3 Execution
Fovus benchmarks and optimizes AlphaFold 3 execution using two separate benchmarking profiles:
- Data Pipeline Benchmarking Profile: This profile helps determine the optimal CPU choice, memory configuration, and parallel computing settings (number of parallel processes) for executing the data pipeline stage.
- Inference Benchmarking Profile: This profile helps identify the optimal GPU choice and system configurations for the inference stage execution.
By dynamically optimizing cloud operations and resource utilization in both stages, Fovus minimizes both cost and runtime.
Case Study Methods
We evaluated AlphaFold 3’s performance on Fovus using four protein systems of varying sizes:
- System 1: 7U8C. Crystal structure of Mesothelin C-terminal peptide-MORAb 15B6 FAB complex. 3,633 atoms and 444 residues.
- System 2: 7BZB. Crystal structure of plant sesterterpene synthase AtTPS18. 4,676 atoms and 554 residues.
- System 3: 7URD. Human PORCN in complex with LGK974 and WNT3A peptide. 5,534 atoms and 970 residues.
- System 4: 7AU2. Cryo-EM structure of human exostosin-like 3 (EXTL3). 11,236 atoms and 1,780 residues.
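Each run is described by an AlphaFold 3 input JSON (the fold_input.json mounted into the container in the commands that follow). Below is a minimal sketch of such a file for a single-chain job; the job name and the truncated sequence are placeholders for illustration, not one of the four study systems, and the authoritative schema is the one shipped with the AlphaFold 3 repository:

```json
{
  "name": "example_job",
  "modelSeeds": [1],
  "sequences": [
    {
      "protein": {
        "id": "A",
        "sequence": "MVLSPADKTNVKAAW"
      }
    }
  ],
  "dialect": "alphafold3",
  "version": 1
}
```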
Each system was simulated with AlphaFold 3 using the two-stage execution strategy with the following commands:
Data Pipeline Step:
docker run -i \
--volume=$PWD/input:/container_workspace/input \
--volume=$PWD/output:/container_workspace/output \
--volume=$PWD/databases:/container_workspace/databases \
--ipc=host \
fovus/alphafold3:latest \
python run_alphafold.py \
--json_path=/container_workspace/input/fold_input.json \
--db_dir=/container_workspace/databases \
--output_dir=/container_workspace/output \
--run_inference=false \
--jackhmmer_n_cpu=$FovusOptVcpu \
--nhmmer_n_cpu=$FovusOptVcpu
As part of the AI-optimized HPC strategy, Fovus automatically determines the value of $FovusOptVcpu, the optimal number of CPU cores to use at runtime.
Inference Step:
docker run -i \
--volume=$PWD/input:/container_workspace/input \
--volume=$PWD/output:/container_workspace/output \
--volume=$PWD/models:/container_workspace/models \
--ipc=host \
--gpus all \
fovus/alphafold3:latest \
python run_alphafold.py \
--json_path=/container_workspace/input/fold_input.json \
--model_dir=/container_workspace/models \
--output_dir=/container_workspace/output \
--run_data_pipeline=false
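For large-batch use cases, the two commands above can be wrapped in a simple driver loop, one input JSON per protein system. The sketch below echoes the per-system docker invocations instead of executing them, so the batching logic can be followed without an AlphaFold 3 setup; the batch_inputs directory name and the placeholder JSON contents are assumptions for illustration:

```shell
# Sketch of a batch driver: one fold_input JSON per protein system.
# Docker commands are echoed, not executed, for illustration only.
mkdir -p batch_inputs
for pdb in 7U8C 7BZB 7URD 7AU2; do
  # Placeholder inputs; real runs need full AlphaFold 3 input JSONs.
  echo '{"name": "'"$pdb"'"}' > "batch_inputs/${pdb}.json"
done

for json in batch_inputs/*.json; do
  name=$(basename "$json" .json)
  echo "[pipeline]  docker run ... --json_path=/container_workspace/input/${name}.json --run_inference=false"
  echo "[inference] docker run ... --json_path=/container_workspace/input/${name}.json --run_data_pipeline=false"
done
```

On Fovus, this kind of fan-out is handled by the platform itself, which distributes the per-protein jobs across regions and availability zones rather than running them serially on one machine.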
For each protein system, the study was conducted three times on Fovus, each with a different objective specified for HPC strategy optimization:
- Minimizing Cost: Prioritizing the most cost-efficient HPC strategy.
- Minimizing Cost and Time: Optimizing for both cost-efficiency and performance.
- Minimizing Time: Prioritizing the fastest execution.
Key performance metrics analyzed included:
- Data Pipeline runtime and cost
- Inference runtime and cost
- Total runtime and cost per folding
- Cost savings percentage of the two-step strategy, as opposed to running the entire workload on GPUs
- Time savings percentage of the two-step strategy, as opposed to running the entire workload on GPUs
Case Study Results
Below are the performance and cost-efficiency results achieved on Fovus under each objective:
The benchmarking results demonstrate that Fovus significantly optimizes AlphaFold 3 execution by:
- Minimizing runtime and cost by optimizing the HPC strategy for each computation stage, making large-scale protein folding simulations affordable.
- Providing flexibility for researchers to choose between cost-efficient, balanced, or time-optimized execution strategies.
By strategically separating AlphaFold 3’s execution into two distinct stages, running the data pipeline step on the optimal spot CPU instance and the model inference step on the optimal spot GPU instance, significant cost and time savings can be achieved. Unlike running the entire workflow on expensive GPU instances, this two-step optimization approach ensures that each computation stage runs on the most cost-effective and performance-efficient hardware. As a result, our study found that this method, combined with an AI-optimized HPC strategy, can reduce cost by up to 67% per folding and time by up to 45% per folding, making large-scale protein structure prediction far more accessible and efficient.
Conclusion
Fovus enables efficient and scalable execution of AlphaFold 3 for large-batch protein structure prediction use cases, overcoming the challenges of GPU cost and resource allocation in the cloud. By optimizing the two-step execution pipeline, Fovus makes it possible to run large-scale protein folding simulations faster and more cost-effectively. This advancement democratizes access to powerful AI-based molecular modeling tools and to scalable, affordable supercomputing power for researchers working on cutting-edge biological problems.
As demand for large-scale protein structure prediction grows, Fovus will be instrumental in accelerating research while keeping costs manageable. The ability to fine-tune execution strategies based on cost, time, or a balance of both gives researchers the flexibility to optimize for their specific needs.
Visit Fovus to explore how AI-optimized HPC can streamline your AlphaFold 3 workflows.