We aim to contribute to deep-learning-based smart agriculture through semantic segmentation of crop images captured in real field environments. The key objective is the precise detection of diseases to facilitate the automation of agricultural management. The main difficulty is that the disease regions, which serve as Regions of Interest (RoIs), are small, which makes accurate prediction challenging. To address this issue, we propose a new framework, the RoI-Attention Network (RA-Net), which additionally utilizes an RoI-attentive image that retains only the regions of the input image predicted as disease, together with their surroundings. Using the RoI-attentive image, RA-Net enhances the representation power for disease regions by guiding the network to re-focus on RoI-associated context based on the initial prediction from the input. The coarse predictions of disease regions in crop images can be further refined by appending additional sequential RoI-Attention and fusion stages. Experiments demonstrate the effectiveness of the proposed RA-Net in predicting small disease regions.
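
As an illustration of the RoI-attentive image described above, the following is a minimal NumPy sketch rather than the RA-Net implementation. It assumes that the "surroundings" of a predicted disease region can be approximated by growing the predicted mask by a fixed pixel margin, and that all remaining pixels are set to zero; the abstract specifies neither choice. The names `dilate_mask`, `roi_attentive_image`, and `margin` are hypothetical.

```python
import numpy as np


def dilate_mask(mask: np.ndarray, margin: int) -> np.ndarray:
    """Grow a 2-D boolean mask by `margin` pixels using a square neighbourhood.

    Assumption: a fixed square margin stands in for the "surroundings"
    of the predicted disease regions.
    """
    mask = mask.astype(bool)
    if margin <= 0:
        return mask.copy()
    h, w = mask.shape
    # Pad with False so that shifted windows never wrap around the border.
    padded = np.pad(mask, margin, mode="constant", constant_values=False)
    grown = np.zeros_like(mask)
    # OR together every shifted copy of the mask within the margin.
    for dy in range(2 * margin + 1):
        for dx in range(2 * margin + 1):
            grown |= padded[dy:dy + h, dx:dx + w]
    return grown


def roi_attentive_image(image: np.ndarray, disease_mask: np.ndarray,
                        margin: int = 1) -> np.ndarray:
    """Keep only pixels predicted as disease plus a margin; zero out the rest.

    image:        (H, W, C) array, the input crop image.
    disease_mask: (H, W) boolean array, the initial (coarse) prediction.
    Assumption: suppressed pixels are filled with zeros.
    """
    keep = dilate_mask(disease_mask, margin)
    # Broadcast the (H, W) mask over the channel axis.
    return image * keep[..., None].astype(image.dtype)


# Toy usage: one predicted disease pixel in a 6x6 image.
image = np.ones((6, 6, 3), dtype=np.float32)
mask = np.zeros((6, 6), dtype=bool)
mask[2, 3] = True
attentive = roi_attentive_image(image, mask, margin=1)
print(int(attentive[..., 0].sum()))  # 9 pixels kept: the 3x3 block around (2, 3)
```

In a pipeline of this kind, the masked image would be passed to the network a second time so that it re-examines only the RoI-associated context. How that second prediction is fused with the initial one is not specified here, so the fusion step is left out of the sketch.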