Discussions of coroutines and user-mode threads, like Project Loom’s virtual threads or Go’s goroutines, frequently turn to the subject of performance. The question I’ll try to answer here is: how do user-mode threads offer better application performance than OS threads? One common assumption is that this has to do with task-switching costs, and that the performance benefit of user-mode threads