Purpose - Designing efficient XML schemas is essential for XML applications that manage semi-structured data. XML schema generation has two opposing goals: avoiding redundancy and providing connected structures that yield good query performance. In general, highly connected XML structures admit data redundancy, whereas redundancy-free schemas produce disconnected XML structures. The purpose of this paper is to describe and experimentally evaluate an approach that balances this trade-off through workload analysis. It also aims to identify the most frequently accessed data based on the workload and to suggest indexes that improve access performance.
Design/methodology/approach - The paper applies and evaluates a workload-aware methodology that provides indexing and highly connected structures for data that are intensively accessed through the paths traversed by the workload.
Findings - The paper presents benchmarking results for a set of XML schema design approaches and demonstrates that the schemas generated by the proposed approach provide high query performance at a low cost in data redundancy, thereby balancing the trade-off in XML schema design.
Research limitations/implications - Although an XML benchmark is used in these experiments, further experiments in a real-world application are still needed.
Practical implications - The proposed approach may be applied in real-world processes for designing new XML databases, as well as in reverse engineering processes that improve XML schemas from legacy databases.
Originality/value - Unlike related work, the reported approach integrates the two opposing goals of XML schema design and generates suitable schemas according to a workload. An experimental evaluation shows that the proposed methodology is promising.