The execution time with the old optimizer is approximately 5 cycles in the best case (this assumes out-of-order execution and perfect branch prediction) and at least 10 cycles in the worst case. With the new optimizer, execution time is always 2 cycles. Obviously, there are also important savings in code size. Very interesting results can be achieved by combining multiple smaller transformations.