In the industrial Internet of Things domain, applications are moving from the Cloud into the edge, closer to the devices producing and consuming data. This means applications move from the scalable and homogeneous cloud environment into a constrained heterogeneous edge network. Making edge applications reliable enough to fulfill Industrie 4.0 use cases is still an open research challenge. Maintaining operation of an edge system requires advanced management techniques to mitigate the failure of devices. This paper tackles this challenge with a twofold approach: (1) a policy-enabled failure detector that enables adaptable failure detection and (2) an allocation component for the efficient selection of failure mitigation actions. We evaluate the parameters and performance of our failure detection approach and the performance of an energy-efficient allocation technique, and present a vision for a complete system as well as an example use case.