The rapid growth in the number of devices in the industrial Internet of Things (IIoT) has greatly increased the amount of data involved. To alleviate the computing load on cloud servers and reduce data-processing delay, edge-cloud cooperative computing has been introduced to the IIoT. General programmable logic controllers (PLCs), which have long played important roles in industrial control systems, are starting to gain the ability to process large amounts of industrial data and share the workload of cloud servers, which transforms them into edge-PLCs. However, the continuous influx of multiple types of concurrent production data streams, set against the limited capacity of the built-in memory in PLCs, poses a major challenge. The ability to allocate memory resources in edge-PLCs reasonably, so as to ensure data utilization and real-time processing, has therefore become one of the core means of improving the efficiency of industrial processes. In this paper, to cope with the time-varying data arrival rate at each edge-PLC, we propose to optimize memory allocation in a distributed manner using Q-learning. Simulation experiments verify that the method effectively reduces the data loss probability while improving system performance.
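As a rough illustration of the idea, the following is a minimal tabular Q-learning sketch for a single edge-PLC. It is not the algorithm evaluated in this paper: the two-stream setting, the memory size `MEMORY_BLOCKS`, the arrival-rate levels, the reward (negative amount of dropped data), and all hyperparameters are assumptions chosen only to keep the example small.

```python
import random

# All constants below are hypothetical values chosen for illustration.
MEMORY_BLOCKS = 4                     # total memory blocks in one edge-PLC
RATE_LEVELS = (1, 3)                  # low/high arrivals per step for each stream
ACTIONS = range(MEMORY_BLOCKS + 1)    # blocks given to stream 0; the rest go to stream 1


def lost_data(rates, blocks_for_stream0):
    """Data units dropped in one step when arrivals exceed the allocated blocks."""
    alloc = (blocks_for_stream0, MEMORY_BLOCKS - blocks_for_stream0)
    return sum(max(0, r - a) for r, a in zip(rates, alloc))


def next_rates(rates, rng, switch_prob=0.1):
    """Each stream independently jumps to another rate level with switch_prob."""
    return tuple(
        rng.choice([lvl for lvl in RATE_LEVELS if lvl != r])
        if rng.random() < switch_prob else r
        for r in rates
    )


def train(steps=50000, alpha=0.05, gamma=0.5, epsilon=0.1, seed=0):
    """Learn Q(state, action), where the state is the pair of current arrival rates."""
    rng = random.Random(seed)
    q = {(a, b): [0.0] * (MEMORY_BLOCKS + 1)
         for a in RATE_LEVELS for b in RATE_LEVELS}
    state = (RATE_LEVELS[0], RATE_LEVELS[0])
    for _ in range(steps):
        # Epsilon-greedy action selection.
        if rng.random() < epsilon:
            action = rng.randrange(MEMORY_BLOCKS + 1)
        else:
            action = max(ACTIONS, key=lambda a: q[state][a])
        reward = -lost_data(state, action)      # penalize dropped data
        new_state = next_rates(state, rng)
        # Standard Q-learning update.
        td_target = reward + gamma * max(q[new_state])
        q[state][action] += alpha * (td_target - q[state][action])
        state = new_state
    return q


def greedy_policy(q):
    """Best-known allocation for stream 0 in each observed rate state."""
    return {s: max(ACTIONS, key=lambda a: q[s][a]) for s in q}


if __name__ == "__main__":
    for state, blocks in sorted(greedy_policy(train()).items()):
        print(f"rates={state} -> blocks for stream 0: {blocks}")
```

In this toy model the arrival process does not depend on the allocation, so the learned policy simply minimizes the per-step loss; a more realistic model, in which unprocessed data remains in memory across steps, would make the discount factor matter. In a distributed setting, each edge-PLC would run its own learner of this kind on its local observations.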