Content Summary | Recently, in [2,3], a new unified approach was introduced for analyzing linear and integrable
nonlinear PDEs in two dimensions. Central to this approach is a generalized Dirichlet-Neumann
map, characterized through the solution of the so-called global relation: an equation, valid for
all values of a complex parameter k, that couples the known and unknown boundary values of the solution and
its derivatives.
For a large class of boundary value problems, the global relation can be solved analytically, and hence
the generalized Dirichlet-Neumann map can be constructed in closed form. However, for general boundary
value problems, the global relation must be solved numerically. To this end, a well-conditioned
and rapidly convergent collocation-type numerical method was developed and studied in [4] for computing
the generalized Dirichlet-Neumann map associated with the generic model problem of Laplace’s
equation on an arbitrary convex polygonal domain. For the case of regular polygonal domains, with the
same type of boundary conditions on all sides, we have rigorously studied (cf. [5]) the properties of the
associated collocation coefficient matrix, revealing its block circulant structure. Since block circulant
matrices are block-diagonalized by the discrete Fourier transform (cf. [1]), the resulting linear
system can be solved efficiently using FFTs (cf. [6]). The development of a parallel algorithm for this
computational task is the main problem addressed in this work. The parallel algorithm
we present is implemented with MPI on two parallel systems: (a) a shared-distributed
memory computer with 8 processors and (b) a cluster of 4 nodes with 2 processors each. The cluster
nodes are interconnected by local Ethernet at either 100 Mbps or 1 Gbps.
Our implementation is further studied through extensive numerical experiments, accompanied by
computation/communication and speedup measurements (see, for example, Figure 1 below). From
this study we draw conclusions about the performance of our implementation and
evaluate and compare the different parallel architectures used. | en |
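The FFT-based solution of a block circulant system mentioned above can be illustrated with a minimal serial sketch. It is not the parallel MPI algorithm of this work. It assumes NumPy, an n-by-n block circulant matrix whose (i, j) block is A_{(i-j) mod n} with m-by-m blocks, and hypothetical names (`solve_block_circulant`, `assemble_dense`) chosen for illustration only. An FFT along the block index decouples the system into n independent m-by-m systems, one per Fourier mode.

```python
import numpy as np


def solve_block_circulant(blocks, rhs):
    """Solve C x = rhs for a block circulant C (illustrative sketch).

    blocks: array of shape (n, m, m); blocks[j] is A_j, the j-th block of
            the first block column, so block (i, j) of C is A_{(i-j) mod n}.
    rhs:    array of shape (n, m), the right-hand side split into n blocks.
    Returns x with shape (n, m) (complex dtype).
    """
    blocks = np.asarray(blocks, dtype=complex)
    rhs = np.asarray(rhs, dtype=complex)

    # FFT along the block index: Lambda_k = sum_j A_j * exp(-2*pi*i*j*k/n).
    lam = np.fft.fft(blocks, axis=0)   # shape (n, m, m)
    b_hat = np.fft.fft(rhs, axis=0)    # shape (n, m)

    # n independent m-by-m solves, batched over the leading axis.
    x_hat = np.linalg.solve(lam, b_hat[..., None])[..., 0]

    # Inverse FFT maps the per-mode solutions back to block form.
    return np.fft.ifft(x_hat, axis=0)


def assemble_dense(blocks):
    """Assemble the dense block circulant matrix (for checking only)."""
    blocks = np.asarray(blocks)
    n, m, _ = blocks.shape
    dense = np.zeros((n * m, n * m), dtype=blocks.dtype)
    for i in range(n):
        for j in range(n):
            dense[i * m:(i + 1) * m, j * m:(j + 1) * m] = blocks[(i - j) % n]
    return dense


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, m = 6, 3                      # arbitrary sizes for the demo
    blocks = rng.standard_normal((n, m, m))
    blocks[0] += 10.0 * np.eye(m)    # keep the demo system well conditioned
    rhs = rng.standard_normal((n, m))

    x = solve_block_circulant(blocks, rhs)
    residual = assemble_dense(blocks) @ x.reshape(-1) - rhs.reshape(-1)
    print("max residual:", np.abs(residual).max())
```

The n per-mode solves are mutually independent, which makes them a natural unit of work to distribute across processes. How the MPI implementation described here actually partitions the computation is not specified in this summary.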