In recent years FPGA has become popular in CNN acceleration, and many CNN-to-FPGA toolchains are proposed to fast deploy CNN on FPGA. However, for these toolchains, updating CNN network means regeneration of RTL code and re-implementation which is timeconsuming and may suffer timing-closure problems. So, we propose HB-DCA: a toolchain and corresponding accelerator. The CNN on HBDCA is defined by the content of BRAM. The toolchain integrates UpdateMEM utility of Xilinx, which updates content of BRAM without re-synthesis and re-implementation process. The toolchain also integrates TensorFlow Lite which provides high-accuracy quantization. HBDCA supports 8-bits perchannel quantization of weights and 8-bits per-layer quantization of activations. Upgrading CNN on accelerator means the kernel size of CNN may change. Flexible structure of HBDCA supports kernel-level parallelism with three different sizes (3 × 3, 5 × 5, 7 × 7). HBDCA implements four types of parallelism in convolution layer and two types of parallelism in fully-connected layer. In order to reduce access number to memory, both spatial and temporal data-reuse techniques were applied on convolution layer and fully-connect layer. Especially, temporal reuse is adopted at both row and column level of an Input Feature Map of convolution layer. Data can be just read once from BRAM and reused for the following clock. Experiments show by updating BRAM content with single Up-dateMEM command, three CNNs with different kernel size (3 × 3, 5 × 5, 7×7) are implemented on HBDCA. Compared with traditional design flow, UpdateMEM reduces development time by 7.6X-9.1X for different synthesis or implementation strategy. For similar CNN which is created by toolchain, HBDCA has smaller latency (9.97µs-50.73µs), and eliminates re-implementation when update CNN. For similar CNN which is created by dedicated design, HBDCA also has the smallest latency 9.97µs, the highest accuracy 99.14% and the lowest power 1.391W. For different CNN which is created by similar toolchain which eliminate re-implementation process, HBDCA achieves higher speedup 120.28X.