关键词:递归;并行算法;FRPA
摘 要:Recursion continues to play an important role in high-performance computing. However, parallelizing recursive algorithms while achieving high performance is nontrivial and can result in complex, hard to maintain code. In particular, assigning processors to subproblems is complicated by recent observations that communication costs often dominate computation costs. Previous work [1]–[3] demonstrates that carefully choosing which divide-and-conquer steps to execute in parallel (breadth-first steps) and which to execute sequentially (depth-first steps) can result in significant performance gains over na?ve scheduling.
内 容: