DualIso: An Algorithm for Subgraph Pattern Matching on Very Large Labeled Graphs

Saltz, Matthew; Jain, Ayushi; Kothari, Abhishek; Fard, Arash; Miller, John A.; Ramaswamy, Lakshmish

doi:10.1109/bigdata.congress.2014.79

Cited by 16 publications

(10 citation statements)

References 16 publications


“…is built on a subgraph-isomorphism algorithm from Saltz et al. [210], named dual-simulation, which proved that if the program dependence graphs of two programs are isomorphic then the programs are "strongly" semantically equivalent [106]. Our dependence graph representations G F N and G F N , which we check for isomorphism, only include the data dependences, and not the control dependences.…”

Section: Transformer and Matcher (mentioning)

confidence: 99%

Scalable validation of binary lifters

Dasgupta, Dinesh, Venkatesh, et al. 2020

Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation


The ability to directly reason about binary machine code is desirable, not only because it allows analyzing binaries even when the source code is not available (e.g., legacy code, closed-source software, or malware), but also because it avoids the need to trust the correctness of compilers. Binary analysis is generally performed by existing decompiler projects by (1) converting raw bytes from the binary into a stream of assembly instructions through disassembly, (2) translating machine code to an intermediate representation (IR) using a binary lifter, and (3) performing various analyses and transformations on the IR pertaining to the specific goals of the decompiler. Many binary analysis frameworks, published in academia or as open-source code, use such a lifter as the first step in their pipeline. Validating the correctness of binary lifters is pivotal to gaining trust in binary analysis, especially when used in scenarios where correctness is essential. Unfortunately, existing approaches focus on validating the correctness of lifting a single instruction and do not scale to full programs. I believe an effort in this direction would enable both the developers of binary translators to validate their implementations and the clients of those translators to gain trust in their analysis results. The overall goal of my work is to develop formal and informal techniques to achieve high confidence in the correctness of binary lifting by leveraging the semantics of the languages involved (e.g., Intel's x86-64 and LLVM IR). Towards that goal, I made two broad contributions. First, I defined the most complete and thoroughly tested formal semantics of x86-64 to date. The semantics faithfully formalizes all the non-deprecated, sequential user-level instructions of the x86-64 Haswell instruction set architecture. The formal specification covers 3155 instruction variants, corresponding to 774 mnemonics. The semantics is fully executable and has been tested against the GCC C-torture test suite. Moreover, each instruction is individually tested against more than 7,000 instruction-level test cases. This extensive testing paid off, revealing bugs in both the x86-64 reference manual and other existing semantics, all of which have been acknowledged and some of which have been fixed. I also illustrated potential applications of the semantics in different formal analyses, and discussed how it can be useful for processor verification. Second, I show that formal translation validation of single instructions for a complex ISA like x86-64 is not only practical but can be used as a building block for scalable full-program validation. My work is the first to do translation validation of single instructions on an ...



“…The match between a graph defined in the META part of an MCMT and a subgraph of one of the models in the multilevel stack is done by means of graph homomorphisms, plus some restrictions. The algorithm for graph matching is a modification of the Ullmann algorithm [67], as proposed in [68]. Basically, we take into account some modelling aspects in order to adapt the process from pure graphs to modelling.…”

Section: Graph Matching (mentioning)

confidence: 99%

Multilevel coupled model transformations for precise and reusable definition of model behaviour

Macías, Wolter, Rutle, et al. 2019

Journal of Logical and Algebraic Methods in Programming


The use of Domain-Specific Languages (DSLs) is a promising field for the development of tools tailored to specific problem spaces, effectively diminishing the complexity of hand-made software. With the goal of making models as precise, simple and reusable as possible, we augment DSLs with concepts from multilevel modelling, where the number of abstraction levels is not limited. This is particularly useful for DSL definitions with behaviour, whose concepts inherently belong to different levels of abstraction. Here, models can represent the state of the modelled system and evolve using model transformations. These transformations can benefit from a multilevel setting, becoming a precise and reusable definition of the semantics for behavioural modelling languages. We present in this paper the concept of Multilevel Coupled Model Transformations, together with examples, formal definitions and tools to assess their conceptual soundness and practical value.


“…Let us now describe the refine procedure outlined in Algorithm 4. Our refine procedure is based on the dual graph simulation technique [20] that was shown in [21] to outperform the commonly used VF2 algorithm [6]. The refine procedure checks for each node p and its candidate node u whether the neighborhood of p ∈ V P is sub-isomorphic to that of u in the graph.…”
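The refinement idea in the statement above can be illustrated with a minimal sketch of plain dual simulation. This is not the refine procedure of the citing paper (its Algorithm 4 is not reproduced here) and not the DualIso implementation; the function name `dual_simulation` and the edge-list and label-dictionary inputs are assumptions made for this illustration. A data node `u` stays a candidate for pattern node `p` only if every child of `p` has a candidate among the children of `u`, and every parent of `p` has a candidate among the parents of `u`.

```python
from collections import defaultdict


def dual_simulation(pattern_edges, pattern_labels, graph_edges, graph_labels):
    """Hypothetical helper: prune candidates by dual simulation.

    Returns {pattern node: set of data-graph nodes that survive pruning}.
    Edges are (source, target) pairs; labels map node -> label.
    """
    # Build child/parent adjacency for the pattern and the data graph.
    p_children, p_parents = defaultdict(set), defaultdict(set)
    for a, b in pattern_edges:
        p_children[a].add(b)
        p_parents[b].add(a)
    g_children, g_parents = defaultdict(set), defaultdict(set)
    for a, b in graph_edges:
        g_children[a].add(b)
        g_parents[b].add(a)

    # Initial candidates: data nodes whose label equals the pattern node's label.
    cand = {
        p: {u for u, lab in graph_labels.items() if lab == p_lab}
        for p, p_lab in pattern_labels.items()
    }

    # Repeat until no candidate is removed (a fixed point).
    changed = True
    while changed:
        changed = False
        for p in pattern_labels:
            for u in list(cand[p]):
                children_ok = all(g_children[u] & cand[c] for c in p_children[p])
                parents_ok = all(g_parents[u] & cand[q] for q in p_parents[p])
                if not (children_ok and parents_ok):
                    cand[p].discard(u)
                    changed = True
        # If any pattern node has no candidate left, there is no match at all.
        if any(not nodes for nodes in cand.values()):
            return {p: set() for p in cand}
    return cand


# Toy example: pattern a(A) -> b(B); data graph 1(A) -> 2(B) plus isolated 3(A), 4(B).
result = dual_simulation(
    pattern_edges=[("a", "b")],
    pattern_labels={"a": "A", "b": "B"},
    graph_edges=[(1, 2)],
    graph_labels={1: "A", 2: "B", 3: "A", 4: "B"},
)
print(result)
```

Dual simulation is only a filter: the surviving candidate sets contain every node that takes part in some subgraph-isomorphic match, but a separate search step is still needed to enumerate the actual matches.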

Section: Refine Candidates (mentioning)

confidence: 99%

Top-k Durable Graph Pattern Queries on Temporal Graphs

Semertzidis, Pitoura 2019

IEEE Trans. Knowl. Data Eng.


Graphs offer a natural model for the relationships and interactions among entities, such as those occurring among users in social and cooperation networks, and proteins in biological networks. Since most such networks are dynamic, to capture their evolution over time, we assume a sequence of graph snapshots where each graph snapshot represents the state of the network at a different time instance. Given this sequence, we seek to find the top-k most durable matches of an input graph pattern query, that is, the matches that exist for the longest period of time. The straightforward way to address this problem is to apply a state-of-the-art graph pattern matching algorithm at each snapshot and then aggregate the results. However, for large networks and long sequences, this approach is computationally expensive, since all matches have to be generated at each snapshot, including those appearing only once. We propose a new approach that uses a compact representation of the sequence of graph snapshots, appropriate time indexes to prune the search space and strategies to determine the duration of the sought matches. Finally, we present experiments with real datasets that illustrate the efficiency and effectiveness of our approach.
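The "straightforward" baseline mentioned in this abstract (match at every snapshot, then aggregate) can be sketched as follows. This is an illustration only, not the approach the paper proposes. The function `top_k_durable_baseline`, the `find_matches` callback, and the toy single-edge matcher `edge_matches` are hypothetical names, and durability is assumed here to mean the number of snapshots in which a match appears, which may differ from the paper's exact definition.

```python
import heapq
from collections import Counter


def top_k_durable_baseline(snapshots, find_matches, k):
    """Hypothetical baseline: run a matcher on every snapshot, then aggregate.

    snapshots    -- sequence of graph snapshots (any representation)
    find_matches -- callable returning hashable, orderable matches for one snapshot
    k            -- number of most durable matches to return
    """
    durations = Counter()
    for snapshot in snapshots:
        # Every match is generated at every snapshot, which is the cost
        # the abstract points out for large networks and long sequences.
        for match in set(find_matches(snapshot)):
            durations[match] += 1
    # Longest duration first; ties are broken by the match itself.
    return heapq.nsmallest(k, durations.items(), key=lambda kv: (-kv[1], kv[0]))


def edge_matches(snapshot):
    """Toy matcher (assumption): the pattern is a single edge, so every edge matches."""
    return sorted(snapshot)


# Three snapshots, each given as a set of directed edges.
snapshots = [{(1, 2), (2, 3)}, {(1, 2)}, {(1, 2), (2, 3), (3, 4)}]
print(top_k_durable_baseline(snapshots, edge_matches, k=2))
```

Because the counter holds every match ever seen, including those that appear only once, memory and time grow with the total number of matches across all snapshots.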

