I noticed that your version is much more CPU-intensive than the original. I imagine the loop where you wait for the flag could be the culprit: a loop that polls a flag without sleeping or blocking keeps a core busy the whole time it waits.
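In case it helps, here is a rough sketch of what I mean, in Python since I don't know what your actual code looks like. I'm assuming the flag is set by another thread; `busy_wait`, `blocking_wait`, and `demo` are made-up names for illustration, not anything from your code.

```python
import threading
from typing import Optional


def busy_wait(flag: dict) -> None:
    # Spin-wait: re-checks the flag as fast as possible,
    # so one core stays busy until the flag flips.
    while not flag["ready"]:
        pass


def blocking_wait(event: threading.Event, timeout: Optional[float] = None) -> bool:
    # Blocking wait: the thread sleeps until another thread calls
    # event.set(), or until the timeout expires.
    # Returns True if the event was set, False on timeout.
    return event.wait(timeout)


def demo() -> bool:
    event = threading.Event()
    # Hypothetical producer: sets the flag after a short delay.
    timer = threading.Timer(0.05, event.set)
    timer.start()
    ok = blocking_wait(event, timeout=2.0)
    timer.join()
    return ok
```

If switching to an event or condition variable is not practical, even a short sleep inside the polling loop should bring the CPU usage down, at the cost of reacting to the flag a little later.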