Simple Parallel Implementation
Perform a row (or column) data distribution
for each column of B {
    allgather(column);
    compute dot product of each of my rows with column;
}
This is not going to work well: every column of B costs a separate allgather, so communication overhead is likely to dominate ...
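The loop above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the matrices are square (n x n), the number of processes p divides n, and the "processes" are simulated serially in one program rather than run under a real message-passing library. The names simulated_allgather and row_distributed_matmul are hypothetical and do not come from the slides.

```python
def simulated_allgather(pieces):
    """Stand-in for a real allgather (assumption: no message passing here).

    pieces[r] is the part contributed by rank r; every rank receives
    its own copy of the concatenation of all parts.
    """
    full = [x for piece in pieces for x in piece]
    return [list(full) for _ in pieces]


def row_distributed_matmul(A, B, p):
    """Compute C = A * B with A and B distributed by rows over p ranks."""
    n = len(A)
    if n % p != 0:
        raise ValueError("this sketch assumes p divides n")
    rows = n // p

    # Row distribution: rank r owns rows r*rows .. (r+1)*rows - 1 of A and B.
    local_A = [A[r * rows:(r + 1) * rows] for r in range(p)]
    local_B = [B[r * rows:(r + 1) * rows] for r in range(p)]
    local_C = [[[0] * n for _ in range(rows)] for _ in range(p)]

    for j in range(n):  # for each column of B
        # Each rank holds only its own slice of column j.
        pieces = [[row[j] for row in local_B[r]] for r in range(p)]
        # One allgather per column: every rank gets the whole column.
        columns = simulated_allgather(pieces)
        for r in range(p):
            # Dot product of each locally owned row of A with the column.
            for i, a_row in enumerate(local_A[r]):
                local_C[r][i][j] = sum(a * b for a, b in zip(a_row, columns[r]))

    # Stack the per-rank row blocks to view the full result.
    return [row for r in range(p) for row in local_C[r]]
```

For example, row_distributed_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]], 2) returns [[19, 22], [43, 50]]. Note that the sketch issues n separate allgather calls, one for each column of B.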