Malware is an ever-present problem, and although detecting malware with AI has grown into a new field of exploration, current methods are not yet mature enough in speed or performance for widespread adoption. Current methods largely treat malicious assembly as an image for detection, which requires a large amount of preprocessing and makes network architectures inflexible. Preprocessing malware images to a single size adds prediction time and makes the prediction task more difficult. We explore a novel method that transforms executable bytecode into a video rather than an image for classification with deep, time-distributed neural networks, achieving up to 98.74% testing accuracy on 9 classes of malware and up to 99.36% testing accuracy on a balanced set of malicious vs. benign files. The network could also classify all malware in our dataset at a false positive rate of 13%, was successful when classifying only parts of an input, and showed initial success in a 0-day scenario. The network uses only the executable code, and no additional information, to make predictions. We then explore methods for pruning and quantizing the network to make it more feasible for widespread deployment, including a novel pruning method we call Node-Distance pruning. Our model is found to be competitive with current works while remaining fast, lean, and flexible.
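As a rough illustration of the bytecode-to-video idea, the sketch below splits a raw byte string into a sequence of fixed-size grayscale frames, so that files of different lengths yield videos of different lengths instead of being resized to one image. The function name `bytes_to_frames`, the 32x32 frame size, the row-major byte order, and the zero-padding of the last frame are all assumptions made for this example; they are not taken from the method described here.

```python
from typing import List

# One frame is a frame_h x frame_w grid of byte values in 0..255.
Frame = List[List[int]]


def bytes_to_frames(data: bytes, frame_h: int = 32, frame_w: int = 32) -> List[Frame]:
    """Split raw bytes into consecutive frames (hypothetical helper).

    Assumptions for illustration only: row-major layout, zero-padding
    of the final frame, and at least one frame even for empty input.
    """
    frame_size = frame_h * frame_w
    # Ceiling division: number of frames needed to hold every byte.
    n_frames = max(1, -(-len(data) // frame_size))
    # Pad with zero bytes so the last frame is complete.
    padded = data.ljust(n_frames * frame_size, b"\x00")

    frames: List[Frame] = []
    for f in range(n_frames):
        chunk = padded[f * frame_size:(f + 1) * frame_size]
        # Cut the chunk into rows of frame_w byte values each.
        rows = [list(chunk[r * frame_w:(r + 1) * frame_w]) for r in range(frame_h)]
        frames.append(rows)
    return frames


if __name__ == "__main__":
    sample = bytes(range(256)) * 10  # stand-in for executable bytes
    video = bytes_to_frames(sample)
    print(len(video), len(video[0]), len(video[0][0]))
```

A frame sequence of this kind could then be fed, one frame per time step, to a time-distributed network; a longer file simply produces more frames, which is one way to avoid resizing every input to a single image size.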