In-sensor multi-task learning is not only a key merit of biological vision but also a primary goal of artificial general intelligence. However, traditional silicon vision chips suffer from large time and energy overheads, and training conventional deep-learning models is neither scalable nor affordable on edge devices. Here, a material-algorithm co-design is proposed to emulate the human retina and realize an affordable learning paradigm. Relying on a bottlebrush-shaped semiconducting polymer, p-NDI, with efficient exciton dissociation and through-space charge-transport characteristics, a wearable transistor-based dynamic in-sensor reservoir computing (RC) system is developed that exhibits excellent separability, fading memory, and the echo state property across different tasks. Paired with a 'readout function' built on memristive organic diodes, the RC system recognizes handwritten letters and numbers and classifies diverse items of clothing with accuracies of 98.04%, 88.18%, and 91.76%, respectively, higher than any reported for organic semiconductors. Beyond 2D images, the spatiotemporal dynamics of the RC system naturally extract features from event-based videos, classifying three types of hand gestures with an accuracy of 98.62%. Moreover, the computing cost is significantly lower than that of conventional artificial neural networks. This work provides a promising material-algorithm co-design for affordable and highly efficient photonic neuromorphic systems.
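The training economy highlighted above follows from the reservoir computing paradigm itself: a fixed nonlinear dynamical system (in this work, the photonic sensor array) projects input sequences into a high-dimensional state space with fading memory, and only a lightweight linear readout is trained. The sketch below is a minimal software echo state network illustrating that idea; the reservoir sizes, the toy two-class task, and all hyperparameters are illustrative assumptions, not values taken from the reported device.

```python
# Minimal echo state network (ESN) sketch of the reservoir computing training
# paradigm: the reservoir weights are fixed and random, and only the linear
# readout is fitted (here with closed-form ridge regression). All sizes and
# the toy task are assumptions for illustration, not the paper's parameters.
import numpy as np

rng = np.random.default_rng(0)

N_IN, N_RES = 4, 100            # input channels, reservoir neurons (assumed)
SPECTRAL_RADIUS = 0.9           # < 1 helps satisfy the echo state property
LEAK = 0.3                      # leaky integration provides fading memory

# Fixed random weights: never trained.
W_in = rng.uniform(-0.5, 0.5, (N_RES, N_IN))
W = rng.uniform(-0.5, 0.5, (N_RES, N_RES))
W *= SPECTRAL_RADIUS / max(abs(np.linalg.eigvals(W)))

def reservoir_state(u_seq):
    """Drive the reservoir with a sequence u_seq of shape (T, N_IN); return the final state."""
    x = np.zeros(N_RES)
    for u in u_seq:
        x = (1 - LEAK) * x + LEAK * np.tanh(W_in @ u + W @ x)
    return x

# Toy two-class task (assumption): distinguish rising from falling input ramps.
def make_example(label):
    ramp = np.linspace(0, 1, 20) if label == 1 else np.linspace(1, 0, 20)
    seq = np.tile(ramp[:, None], (1, N_IN)) + 0.05 * rng.standard_normal((20, N_IN))
    return seq, label

train = [make_example(l) for l in rng.integers(0, 2, 200)]
X = np.stack([reservoir_state(s) for s, _ in train])   # reservoir features
y = np.array([l for _, l in train])

# Train ONLY the linear readout via ridge regression (one linear solve, cheap).
ridge = 1e-2
W_out = np.linalg.solve(X.T @ X + ridge * np.eye(N_RES), X.T @ y)

test = [make_example(l) for l in rng.integers(0, 2, 50)]
pred = np.array([reservoir_state(s) @ W_out > 0.5 for s, _ in test])
print(f"toy readout accuracy: {(pred == [l for _, l in test]).mean():.2f}")
```

Because only W_out is fitted, training reduces to a single closed-form linear solve rather than iterative backpropagation through the whole network, which is the source of the cost advantage claimed over conventional artificial neural networks.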