Reliably tracking multiple people with ordinary cameras is challenging, mostly because of the severe occlusions that occur when many people are present. We tackle this problem by using several cameras that observe the scene from different viewpoints. To exploit the resulting images, we developed a people detection algorithm called POM and a deep-learning-based improvement, both of which use a generative model