Interactive media streaming is a communication paradigm in which an observer, while viewing transmitted subsets of media in real time, periodically requests new subsets from the streaming sender, and the sender responds by sending the media data corresponding to each received request. This contrasts with non-interactive media streaming such as TV broadcast, where the entire media set is compressed and delivered to the observer before the observer interacts with the data (for example, by switching TV channels). Examples of interactive streaming abound across media modalities, including interactive browsing of JPEG2000 images and interactive light field or multiview video streaming. The obvious advantage of interactive media streaming is bandwidth efficiency: only the media subsets corresponding to the observer's requests are transmitted. This matters when an observer views only a small subset of a very large media data set during a typical streaming session. The technical challenge is how to structure the media data so that good compression efficiency can be achieved using compression tools such as differential coding, while still giving the observer sufficient flexibility to navigate the media data set freely in any desired order. In this introductory paper to the special session on "immersive interaction for networked multiview video systems", we give an overview of proposals in the literature that simultaneously achieve the conflicting objectives of compression efficiency and decoding flexibility.