Cybersecurity exercises (CSXs) help organizations raise awareness, test capabilities, identify strengths and weaknesses, and gain hands-on practice in building resilience against attacks. A typical CSX is designed as a competition or challenge with gamification features to increase participant engagement. It also requires significant human resources to ensure up-to-date attack simulation and proper feedback. The usual concerns related to CSXs are how many points a team or participant received and the reason behind a particular evaluation. Properly balanced scoring can provide valuable feedback and keep CSX participants engaged. An inadequate scoring system might have the opposite effect: spreading disorder, causing discontent, decreasing motivation, and distracting participants from the event's primary goal. Combining both technical and soft aspects in a CSX makes it increasingly complex and challenging to ensure balanced scoring. This paper defines scoring challenges and trends based on a case study of one of the largest international live-fire cyber defense exercises, Locked Shields (LS). It reviews the CSX scoring categories of recent LS executions and presents the most common participant concerns related to scoring. The feedback shows that the clarity and transparency of the scoring system, together with feedback on and justification of the scores, are among the top concerns. The design choices of the scoring system are explored to demonstrate the subtle variations of balanced category scoring and to provide a basis for future discussions. The chosen compare-and-contrast approach made it possible to distinguish four parameters for design decision categories: complexity, transparency, level of competition, and automation. The research results demonstrate that learning facilitation requires system simplification and decisions regarding the trends of the scoring curve.
Even though transparency is a critical issue, concealing some scoring logic details can ensure more flexibility during the event to stimulate participants, support learning experiences, and cope with unexpected situations. Time as a central dimension enables the implementation of complex scoring curves for automated assessment. Our study contributes to the community of higher education institutions and all organizers of cybersecurity challenges for skill development and assessment.