To explore the impact of autonomous vehicles (AVs) on human-driven vehicles (HDVs), this article provides a solution that lets AVs coexist harmoniously with HDVs during car following while AVs are at a low market penetration rate (MPR). An extended car-following framework with two possible soft optimization targets is proposed to improve the experience of HDV followers with different following strategies, using the deep deterministic policy gradient (DDPG) algorithm. The preprocessed Next Generation Simulation (NGSIM) dataset was used for the experiments. A total of 1027 redefined car-following events were extracted from it, of which 600 were used for training and 427 for testing. The different driving strategies, obtained from classical car-following models, were embedded into a virtual environment built with OpenAI Gym. A reward function combining safety, efficiency, jerk, and stability was defined, and the DDPG agent was trained to maximize it. The final results show that, under one soft optimization target, the disturbance of HDV followers decreases by 2.362% (strategy a), 8.184% (strategy b), and 13.904% (strategy c); under the other, it decreases by 14.961% (strategy a), 12.020% (strategy b), and 13.425% (strategy c). HDV followers with different strategies experience less jerk under both soft optimizations. AV passengers incur a loss in jerk and efficiency, but their safety is enhanced. In addition, AV car following performs better than HDV car following under both soft and brutal optimizations. Two possible solutions for the harmonious coexistence of HDVs and AVs at low AV MPR are thus proposed.
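As an illustration of how a reward combining safety, efficiency, jerk, and stability might be assembled, the following is a minimal sketch and not the formulation used in this article. The term definitions (a time-to-collision penalty, a time-headway target, a squared-jerk penalty, and the follower's acceleration as a proxy for disturbance), all thresholds, the equal weights, and the names `StepState` and `composite_reward` are assumptions made only for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class StepState:
    """One simulation step, seen from the AV (all fields are illustrative)."""
    gap: float             # distance from AV to its leader (m)
    speed: float           # AV speed (m/s)
    rel_speed: float       # leader speed minus AV speed (m/s)
    accel: float           # AV acceleration at this step (m/s^2)
    prev_accel: float      # AV acceleration at the previous step (m/s^2)
    follower_accel: float  # acceleration of the HDV follower (m/s^2)


def composite_reward(s, dt=0.1, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of four terms; every constant below is an assumed value."""
    w_safe, w_eff, w_jerk, w_stab = weights

    # Safety: penalize a short time-to-collision (TTC) while closing in.
    ttc_threshold = 4.0  # s, assumed
    safety = 0.0
    if s.rel_speed < 0.0:
        ttc = s.gap / -s.rel_speed
        safety = -max(0.0, 1.0 - ttc / ttc_threshold)

    # Efficiency: reward a time headway close to an assumed target.
    target_headway, sigma = 1.5, 0.5  # s, assumed
    headway = s.gap / max(s.speed, 0.1)
    efficiency = math.exp(-((headway - target_headway) ** 2) / (2 * sigma ** 2))

    # Jerk: penalize rapid changes in AV acceleration, clipped to [-1, 0].
    jerk_max = 6.0  # m/s^3, assumed
    jerk = (s.accel - s.prev_accel) / dt
    comfort = -min(1.0, (jerk / jerk_max) ** 2)

    # Stability: penalize the follower's acceleration magnitude, used here
    # as an assumed proxy for the disturbance passed on to the HDV follower.
    accel_max = 3.0  # m/s^2, assumed
    stability = -min(1.0, (s.follower_accel / accel_max) ** 2)

    return (w_safe * safety + w_eff * efficiency
            + w_jerk * comfort + w_stab * stability)


steady = StepState(gap=30.0, speed=20.0, rel_speed=0.0,
                   accel=0.0, prev_accel=0.0, follower_accel=0.0)
closing = StepState(gap=4.0, speed=20.0, rel_speed=-4.0,
                    accel=0.0, prev_accel=0.0, follower_accel=0.0)
print(composite_reward(steady), composite_reward(closing))
```

In a DDPG setup, a function of this kind would be evaluated once per environment step and returned to the agent as the scalar reward. Changing `weights` shifts the trade-off between AV passenger comfort and follower disturbance.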