CLEANEVAL: home page

CLEANEVAL is a shared task and competitive evaluation on the topic of cleaning arbitrary web pages, with the goal of preparing web data for use as a corpus for linguistic and language technology research and development. The first CLEANEVAL, covering Chinese and English, took place over the summer of 2007, with a workshop in Belgium in September (3rd Web as Corpus workshop (WAC3),