Vision-based robots are widely used for pick-and-place operations because of their ability to estimate object poses. As they are extended to handle a variety of objects in cluttered scenes, more flexible and lightweight approaches have been proposed. This paper presents an autonomous robotic bin-picking platform that combines human demonstration with a collaborative robot, for flexibility across objects, and the YOLOv5 neural network model, for faster object localization, without requiring prior CAD models or a pre-existing training dataset. After a simple human demonstration of which target object to pick and where to place it, the raw color and depth images were refined, and the object on top of the bin was used to create synthetic images and annotations for the YOLOv5 model. To pick up the target object, a point cloud was lifted from the depth data corresponding to the detection result of the trained YOLOv5 model, and the object pose was estimated by point cloud matching with the Iterative Closest Point (ICP) algorithm. After picking up the target object, the robot placed it at the location the user defined in the preceding human demonstration stage. In experiments with four types of objects and four human demonstrations, recognizing the target object and estimating its pose took a total of 0.5 seconds. The success rate of object detection was 95.6%, and the pick-and-place motion succeeded for all detected objects.
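The two geometric steps named above, lifting a point cloud from the depth pixels inside a detection box and aligning point clouds with ICP, can be illustrated with a minimal sketch. It is not the implementation used in this work. It assumes a pinhole camera with intrinsics `fx`, `fy`, `cx`, `cy`, a depth image in metres in which zero marks missing data, a box given as `(x0, y0, x1, y1)` in pixels, and a basic point-to-point ICP with brute-force nearest neighbours. All function and parameter names are hypothetical.

```python
import numpy as np


def lift_bbox_to_points(depth_m, bbox, fx, fy, cx, cy):
    """Back-project depth pixels inside a detection box to camera-frame 3D points.

    Assumed conventions: bbox = (x0, y0, x1, y1) in pixels with x1/y1 exclusive,
    depth in metres, zero depth meaning "no measurement".
    """
    x0, y0, x1, y1 = bbox
    vs, us = np.mgrid[y0:y1, x0:x1]          # pixel row (v) and column (u) grids
    z = depth_m[y0:y1, x0:x1]
    valid = z > 0                            # drop missing depth readings
    z = z[valid]
    x = (us[valid] - cx) * z / fx            # pinhole model: X = (u - cx) * Z / fx
    y = (vs[valid] - cy) * z / fy
    return np.column_stack([x, y, z])


def best_rigid_transform(src, dst):
    """Least-squares R, t such that R @ src[i] + t ~= dst[i] (SVD-based Kabsch)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)      # 3x3 cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T  # d guards against a reflection
    t = dst_c - R @ src_c
    return R, t


def icp(source, target, iterations=30):
    """Basic point-to-point ICP; returns R, t mapping source onto target."""
    R_total, t_total = np.eye(3), np.zeros(3)
    moved = source.copy()
    for _ in range(iterations):
        # Brute-force nearest neighbour in target for every moved source point.
        d2 = ((moved[:, None, :] - target[None, :, :]) ** 2).sum(axis=2)
        matched = target[d2.argmin(axis=1)]
        R, t = best_rigid_transform(moved, matched)
        moved = moved @ R.T + t
        # Compose the incremental update with the accumulated transform.
        R_total, t_total = R @ R_total, R @ t_total + t
    return R_total, t_total
```

In such a sketch, the points returned by `lift_bbox_to_points` for the detected box would serve as one of the two clouds passed to `icp`, and the resulting rotation and translation would give the estimated object pose relative to the other cloud. Like any ICP variant, it converges only from a reasonably close initial alignment, and a practical system would typically replace the brute-force search with a k-d tree.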