This work explores the placement and routing of machine learning applications' dataflow graphs on different heterogeneous coarse-grained reconfigurable architectures (CGRAs). We analyze three types of processing element (PE) heterogeneity: the first concerns the interconnection pattern, the second the kinds of operations a single PE can execute, and the last the PE buffer resources. This analysis aims to achieve a fair reduction in overall cost compared with a homogeneous CGRA architecture. We compare our results with the homogeneous case and with one of the state-of-the-art tools for placement and routing (P&R). Our algorithm executed, on average, 52% faster than VPR 8.1 (Versatile Place and Route), an open-source academic tool designed for the FPGA placement and routing phases, producing better mappings in 66% of the cases and equivalent results in 26% of the cases. Furthermore, when multiplier heterogeneity is considered, a heterogeneous architecture reduces cost without losing performance in 76% of the cases. We propose a novel heterogeneous buffer architecture that reduces buffer resources by 56.3% for K-means dataflow patterns. We also show that a heterogeneous border chess architecture outperforms a homogeneous one. In addition, our mapping reaches optimal instances of single-tree dataflows compared to the classical Lee/Choi and H-tree approaches.