This paper addresses the lack of suitable Learning from Demonstration (LfD) architectures for sign language-based human–robot interaction, with the aim of making such interactions more extensible. It proposes and implements an LfD architecture for teaching new Iranian Sign Language signs to a teacher-assistant social robot, RASA. The architecture employs one-shot learning techniques and a convolutional neural network to recognize and imitate a sign after observing its demonstration (captured with a data glove) only once. Despite using a small, low-diversity dataset (~500 signs in 16 categories), the recognition module reached a promising 4-way accuracy of 70% on the test data and showed good potential for extending the sign vocabulary in sign language-based human–robot interactions. The extensibility and promising results of the one-shot LfD approach in this study are its main contributions toward applying such machine learning techniques in social human–robot interaction.
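To illustrate how a 4-way one-shot accuracy of this kind is typically computed, the following is a minimal sketch, not the authors' implementation: it assumes an embedding function over data-glove sequences and nearest-neighbour matching against a one-example-per-class support set. The embedding, episode construction, and data shapes here are hypothetical placeholders.

```python
# Sketch of N-way one-shot evaluation by episodes (hypothetical, not the paper's code).
# Each episode holds one labelled support example per class and one query sample;
# the query is classified by the nearest support example in an embedding space.
import numpy as np

def one_shot_accuracy(embed, episodes):
    """embed: callable mapping a sample to a 1-D feature vector.
    episodes: iterable of (support_samples, support_labels, query_sample, query_label)."""
    correct, total = 0, 0
    for support_x, support_y, query_x, query_y in episodes:
        support_emb = np.stack([embed(x) for x in support_x])   # (n_way, d)
        query_emb = embed(query_x)                               # (d,)
        dists = np.linalg.norm(support_emb - query_emb, axis=1)  # Euclidean distances
        pred = support_y[int(np.argmin(dists))]                  # nearest support decides class
        correct += int(pred == query_y)
        total += 1
    return correct / total

# Toy usage: a stand-in embedding and synthetic glove sequences (30 frames x 10 sensors).
rng = np.random.default_rng(0)

def dummy_embed(x):
    return x.mean(axis=0)  # placeholder for a trained CNN encoder

episodes = []
for _ in range(100):
    support_x = [rng.normal(size=(30, 10)) for _ in range(4)]    # 4-way support set
    support_y = list(range(4))
    query_class = int(rng.integers(4))
    query_x = support_x[query_class] + 0.1 * rng.normal(size=(30, 10))
    episodes.append((support_x, support_y, query_x, query_class))

print(f"4-way one-shot accuracy: {one_shot_accuracy(dummy_embed, episodes):.2f}")
```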