The MMC-type chips are designed for single-tasking camera/media use. They hide the flash/NAND behind a dumb controller that is optimized to be cheap and simple for large-file use. It doesn't handle random updates of 4k blocks here and there very well. Its native block size is something like 256k, so a random 4k write gets translated into a 256k read-modify-write cycle.
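A factor of 10 difference! nilfs2 is, by accident of its nature, faster on flash: as a log-structured filesystem it appends new data sequentially instead of updating blocks in place.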
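
To put a rough number on the read-modify-write cost described above, here is a back-of-the-envelope sketch. It assumes the 256k native block size guessed at above (not a measured value) and writes aligned to a block boundary. The function name `write_amplification` is just made up for illustration.

```python
KIB = 1024

def write_amplification(write_size, native_block_size):
    """Bytes the controller physically rewrites per byte the host asked to write.

    Assumes the write starts on a native block boundary (a simplification).
    """
    # Ceiling division: how many native blocks the write touches.
    blocks_touched = -(-write_size // native_block_size)
    return blocks_touched * native_block_size / write_size

# A lone random 4k update forces a full 256k block to be rewritten.
random_4k = write_amplification(4 * KIB, 256 * KIB)

# A 256k sequential chunk fills exactly one native block, no wasted work.
sequential_256k = write_amplification(256 * KIB, 256 * KIB)

print(random_4k)        # 64.0
print(sequential_256k)  # 1.0
```

This is only the worst case for isolated 4k writes. Real controllers may cache or merge some updates, so the slowdown you actually measure can be a lot smaller than 64x.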