As can be seen, UTF-16 takes about 50% more space than UTF-8 on real data, it only saves 20% for dense Asian text, and hardly competes with general purpose compression algorithms. The Chinese translation of this manifesto takes 58.8 KiB in UTF-16, and only 51.7 KiB in UTF-8. Text operations on encoded strings The popular text-based data formats (e.g. CSV, XML, HTML, JSON, RTF and source codes of c