Over the last several years, the particle and nuclear physics
communities have shown considerable interest, and made corresponding
progress, in implementing machine learning (ML) models in hardware.
A major driver has been the release of the Python package hls4ml,
which enables porting models specified and trained using Python ML
libraries to register-transfer level (RTL) code. So far, the primary
end targets have been commercial field-programmable gate arrays
(FPGAs) or synthesized custom blocks on application-specific
integrated circuits (ASICs). However, recent
developments in open-source embedded FPGA (eFPGA) frameworks now
provide an alternative, more flexible pathway for implementing ML
models in hardware. These customized eFPGA fabrics can be integrated
as part of an overall chip design. In general, the decision between
a fully custom, eFPGA, or commercial FPGA ML implementation will
depend on the details of the end-use application. In this work, we
explored the parameter space for eFPGA implementations of
fully-connected neural network (fcNN) and boosted decision tree
(BDT) models using the task of neutron/gamma classification with a
specific focus on resource efficiency. Training and test data were
collected using an AmBe sealed source incident on stilbene, which
was optically coupled to an OnSemi J-series silicon photomultiplier
(SiPM). We investigated
relevant input features and the effects of bit-resolution and
sampling rate as well as trade-offs in hyperparameters for both ML
architectures while tracking total resource usage. Model performance
was tracked using the calculated neutron efficiency at a gamma
leakage of 10⁻³. The results of the study
will be used to aid the specification of an eFPGA fabric, which will
be integrated as part of a test chip.
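
As an illustration of the performance metric named above, the following is a minimal sketch of how a neutron efficiency at a fixed gamma leakage could be computed. It assumes each model emits one scalar score per pulse, with larger values being more neutron-like. The function name, the score convention, and the synthetic scores in the demonstration are assumptions made for illustration and are not taken from the study.

```python
import numpy as np


def neutron_efficiency_at_leakage(neutron_scores, gamma_scores, leakage=1e-3):
    """Return (efficiency, threshold) for a cut on a classifier score.

    Assumed convention: larger score means more neutron-like. The threshold
    is chosen so that at most `leakage` of the gamma events score strictly
    above it; the efficiency is the fraction of neutron events that do.
    """
    neutron_scores = np.asarray(neutron_scores, dtype=float)
    gamma_sorted = np.sort(np.asarray(gamma_scores, dtype=float))
    n_gamma = gamma_sorted.size
    if n_gamma == 0 or neutron_scores.size == 0:
        raise ValueError("both score arrays must be non-empty")

    # Number of gamma events allowed to pass the cut (rounded down; the small
    # epsilon guards against floating-point products such as 2.9999999).
    n_allowed = int(leakage * n_gamma + 1e-9)
    n_allowed = min(n_allowed, n_gamma - 1)

    # The cut sits at the (n_allowed + 1)-th highest gamma score, so at most
    # n_allowed gammas lie strictly above it, even when scores are tied.
    threshold = gamma_sorted[n_gamma - n_allowed - 1]
    efficiency = float(np.mean(neutron_scores > threshold))
    return efficiency, float(threshold)


if __name__ == "__main__":
    # Synthetic placeholder scores, NOT data from the study.
    rng = np.random.default_rng(0)
    gammas = rng.normal(loc=0.0, scale=1.0, size=100_000)
    neutrons = rng.normal(loc=4.0, scale=1.0, size=100_000)
    eff, thr = neutron_efficiency_at_leakage(neutrons, gammas, leakage=1e-3)
    print(f"threshold = {thr:.3f}, neutron efficiency = {eff:.3f}")
```

Resolving a leakage of 10⁻³ requires well over a thousand gamma events in the test set; with fewer, the sketch allows no gamma events above the cut and the resulting efficiency is correspondingly conservative.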