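Article Abstract
丁武, 杨芳, 王卫光, 王汉岗, 何用, 蔺崇哲. Research on multi-GPU distributed data parallel computing methods for 2D hydrodynamic models [J]. 水利学报 (Journal of Hydraulic Engineering), 2025, 56(12): 1647-1658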
Research on multi-GPU distributed data parallel computing methods for 2D hydrodynamic models
Submitted: 2025-05-07  Revised: 2025-12-23
DOI:10.13243/j.cnki.slxb.20250262
Keywords: 2D hydrodynamic model  finite volume method  computational domain partitioning  NCCL communication  heterogeneous parallel computing
Funding: National Key Research and Development Program of China (2024YFC3212000)
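Author  Affiliation
丁武  College of Hydrology and Water Resources, Hohai University, Nanjing, Jiangsu 210024
Pearl River Water Resources Research Institute, Guangzhou, Guangdong 510611
杨芳  Pearl River Water Resources Research Institute, Guangzhou, Guangdong 510611
王卫光  College of Hydrology and Water Resources, Hohai University, Nanjing, Jiangsu 210024
王汉岗  Pearl River Water Resources Research Institute, Guangzhou, Guangdong 510611
何用  Pearl River Water Resources Research Institute, Guangzhou, Guangdong 510611
蔺崇哲  Laoshan Reservoir Management Office, Qingdao Hairun Water Supply Group Co., Ltd., Qingdao, Shandong 266114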
Abstract:
      To address the computing bottleneck of 2D hydrodynamic models in flood simulation for complex watersheds, this study develops a multi-GPU heterogeneous parallel architecture that combines physical-topology-based domain partitioning with distributed data parallelism and NCCL asynchronous communication, enabling supercomputing-level acceleration of hydrodynamic models on unstructured triangular meshes. A physical-topology-preserving domain partitioning algorithm achieves dynamic load balancing across multiple GPUs while keeping the adjacency relationships of the unstructured triangular mesh intact. A distributed data parallel solver framework based on the Godunov finite volume method is combined with an NCCL asynchronous communication strategy to form a heterogeneous acceleration architecture in which computation and communication work in concert. Experimental validation shows that, in a 2D idealized dam-break test case, the model accurately captures the propagation of the dam-break shock wave and agrees closely with the theoretical solution. In the Laoshan Reservoir dam-break flood case, a 2 h flood-routing simulation was completed in 15.70 s on 8 GPUs, a speedup of 308.43 times, and the data communication efficiency between subdomains improved by 8.67% over traditional MPI. The architecture supports elastic scaling from a single node to cross-node GPU clusters, can provide digital twin watersheds with a supercomputing-level hydrodynamic engine with second-level response, and offers core technical support for moving flood forecasting and dispatching from static plans toward dynamic rehearsal.
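      The reported timings can be put in perspective with a little arithmetic. The Python sketch below is only an illustration: the abstract does not say which baseline the 308.43 speedup refers to, so the code assumes the common definition (baseline wall-clock time divided by parallel wall-clock time). The function and variable names are hypothetical and do not come from the paper.

```python
# Illustrative arithmetic on the figures quoted in the abstract.
# Assumption: speedup = baseline wall time / parallel wall time.
# The baseline configuration itself is not stated in the abstract.

def real_time_ratio(simulated_seconds: float, wall_seconds: float) -> float:
    """How many seconds of flood evolution are simulated per wall-clock second."""
    return simulated_seconds / wall_seconds


def implied_baseline_seconds(speedup: float, parallel_wall_seconds: float) -> float:
    """Baseline run time implied by a speedup, under the assumed definition."""
    return speedup * parallel_wall_seconds


simulated = 2 * 3600.0   # 2 h of dam-break flood evolution, in seconds
wall = 15.70             # wall-clock time on 8 GPUs, in seconds
speedup = 308.43         # speedup reported in the abstract

ratio = real_time_ratio(simulated, wall)
baseline = implied_baseline_seconds(speedup, wall)

print(f"simulation runs about {ratio:.1f}x faster than real time")
print(f"implied baseline run time: {baseline:.1f} s ({baseline / 60:.1f} min)")
```

      Under that assumption, the 8-GPU run advances the flood roughly 459 times faster than real time, and the unstated baseline would have needed about 81 minutes for the same 2 h event.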