Implementation Issues
Implementation is greatly simplified if we can treat certain subsets of processes, here each row and each column of the process grid, as a “communication universe”.
Broadcast A[i,k_bar] across processor row i;
- Broadcast is contained within a row (collective communication)
Send B[k_bar,j] to destination; Recv B[(k_bar+1) mod q, j] from source;
- Send/recv is contained within a column (point-to-point communication)
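The two steps above can be sketched as a serial simulation. This is not MPI code: it only mimics the data movement on a q x q process grid, under the simplifying assumption that each process (i, j) holds a single matrix entry rather than a block. The names `fox_multiply`, `local_b`, and `a_bcast` are hypothetical, chosen for illustration.

```python
def fox_multiply(A, B):
    """Serially simulate Fox's algorithm on a q x q process grid.

    Assumption: one matrix entry per process, so q equals the matrix order.
    """
    q = len(A)
    # local_b[i][j] is the B entry currently held by process (i, j).
    local_b = [row[:] for row in B]
    C = [[0] * q for _ in range(q)]

    for stage in range(q):
        # Row step: process (i, k_bar) broadcasts A[i][k_bar] across row i.
        # Only processes in row i take part (collective communication).
        for i in range(q):
            k_bar = (i + stage) % q
            a_bcast = A[i][k_bar]
            for j in range(q):
                # At this stage local_b[i][j] equals B[k_bar][j].
                C[i][j] += a_bcast * local_b[i][j]

        # Column step: process (i, j) sends its B entry to the process above,
        # ((i - 1) mod q, j), and receives B[(k_bar + 1) mod q][j] from the
        # process below, ((i + 1) mod q, j). Only column j is involved
        # (point-to-point communication).
        local_b = [[local_b[(i + 1) % q][j] for j in range(q)]
                   for i in range(q)]

    return C


if __name__ == "__main__":
    A = [[1, 2], [3, 4]]
    B = [[5, 6], [7, 8]]
    print(fox_multiply(A, B))
```

In the simulation the inner loop over j never leaves row i and the shift never leaves column j, which is why treating each row and each column as a separate communication universe keeps the real implementation simple.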