This interview in Communications of the ACM with developers of Photoshop highlights a basic problem with extending parallelism beyond a certain number of processors: memory bandwidth limitations.
For the engineers on the Photoshop development team, the scaling limitations imposed by Amdahl's Law have become all too familiar over the past few years. Although the application's current parallelism scheme has scaled well over two- and four-processor systems, experiments with systems featuring eight or more processors indicate performance improvements that are far less encouraging. That's partly because as the number of cores increases, the image chunks being processed, called tiles, end up getting sliced into a greater number of smaller pieces, resulting in increased synchronization overhead. Another issue is that in between each of the steps that process the image data in parallelizable chunks, there are sequential bookkeeping steps. Because of all this, Amdahl's Law quickly transforms into Amdahl's wall. Photoshop's engineers tried to mitigate these effects by increasing the tile size, which in turn made each of the sub-pieces larger. This helped to reduce the synchronization overhead, but it presented the developers with yet another parallel-computing bugaboo: memory-bandwidth limitations.
The article also mentions that these developers have not seriously considered MPI (Message Passing Interface), both because of their existing difficulties moving from four to eight or 16 cores and because adopting it would require extensive re-architecting of the existing code.
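
To make the Amdahl's Law point from the quoted passage concrete, here is a minimal Python sketch of the ideal speedup formula, 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel and n is the number of processors. The 90% parallel fraction is purely an illustrative assumption, not a figure from the interview, and the function name `amdahl_speedup` is my own. The formula also ignores the synchronization overhead and memory-bandwidth limits described above, so real results would likely be worse than these numbers.

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Ideal speedup predicted by Amdahl's Law.

    parallel_fraction: share of the work that can run in parallel (0.0 to 1.0).
    processors: number of processors applied to the parallel share.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial share takes the same time no matter how many processors
    # exist; only the parallel share is divided among them.
    return 1.0 / (serial_fraction + parallel_fraction / processors)


if __name__ == "__main__":
    p = 0.90  # ASSUMPTION: illustrative value, not taken from the article
    for n in (2, 4, 8, 16):
        print(f"{n:2d} processors: {amdahl_speedup(p, n):.2f}x speedup")
```

Under that assumed 90% figure, going from four to 16 processors only about doubles the ideal speedup (roughly 3.1x to 6.4x), and no processor count can push it past 10x, because the sequential bookkeeping steps always remain.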