In this paper, we introduce a novel audio dataset, the Gastrointestinal (GI) Sound Set, which contains six kinds of body sounds: Bowel sound, Speech, Snore, Cough, Groan, and Rub. We perform sound event detection (SED) on it and accurately detect all six types of sound events. First, the GI Sound Set is collected with wearable auscultation devices. To improve generalization, patients from five different hospital departments are recruited for data collection, along with a group of healthy subjects. The GI Sound Set follows the data format of Google AudioSet but differs in audio length and sampling rate. Second, we extract Mel-filter features from the recordings and investigate the performance of different activation functions and neural network architectures for detecting sound events. We use data augmentation and class balancing to address the imbalance in the number of samples per class in the dataset. We apply multiple instance learning (MIL) to produce not only bag-level results but also frame-level results. The GI Sound Set is the largest body sound dataset to date, and our approach shows state-of-the-art performance with an average F1 score of 81.06% on the test set. Because it relies on a simple network and conventional processing methods, our convolutional recurrent neural network (CRNN) system generalizes well and can be applied to other audio datasets, such as respiratory sounds and heart sounds.

INDEX TERMS Gastrointestinal (GI) Sound Set, sound event detection (SED), convolutional recurrent neural network (CRNN), multiple instance learning (MIL).
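
As an illustration of how MIL can yield both frame-level and bag-level results, the following minimal sketch pools per-frame class probabilities into one clip-level (bag-level) probability per class. The pooling functions shown (max, mean, linear softmax) are common choices in the SED literature and are assumptions here, as are the function name `mil_pool`, the 0.5 threshold, and the toy probabilities; the abstract does not state which pooling the system actually uses.

```python
import numpy as np

# The six classes named in the abstract.
CLASSES = ["Bowel sound", "Speech", "Snore", "Cough", "Groan", "Rub"]

def mil_pool(frame_probs, method="max"):
    """Aggregate frame-level probabilities (frames x classes) into
    bag-level probabilities (one value per class).

    Hypothetical helper for illustration only.
    """
    frame_probs = np.asarray(frame_probs, dtype=float)
    if method == "max":
        # The bag is positive if its most confident frame is positive.
        return frame_probs.max(axis=0)
    if method == "mean":
        # Every frame contributes equally.
        return frame_probs.mean(axis=0)
    if method == "linear_softmax":
        # Each frame is weighted by its own probability: sum(p^2) / sum(p).
        eps = 1e-8  # avoids division by zero for all-zero columns
        return (frame_probs ** 2).sum(axis=0) / (frame_probs.sum(axis=0) + eps)
    raise ValueError(f"unknown pooling method: {method}")

# Toy frame-level output: 4 frames x 6 classes (made-up numbers).
frame_probs = np.array([
    [0.9, 0.1, 0.0, 0.0, 0.1, 0.2],
    [0.8, 0.2, 0.0, 0.0, 0.1, 0.1],
    [0.1, 0.7, 0.0, 0.0, 0.0, 0.1],
    [0.1, 0.6, 0.0, 0.1, 0.0, 0.1],
])

# Bag-level result: which events occur anywhere in the clip.
bag_probs = mil_pool(frame_probs, method="max")
detected = [c for c, p in zip(CLASSES, bag_probs) if p >= 0.5]

# Frame-level result: which frames contain each detected event.
frame_hits = {c: np.flatnonzero(frame_probs[:, CLASSES.index(c)] >= 0.5).tolist()
              for c in detected}
```

In this toy case the bag-level output flags Bowel sound and Speech, while the frame-level output localizes them to frames 0-1 and 2-3 respectively. Max pooling is the simplest option; linear softmax pooling is often preferred when gradients should reach more than one frame per clip.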