Our project tackles a known difficult problem: dynamically calculating causal relationships between threaded events, specifically between thread reads and writes. Using Intel Pin, we instrument a target and insert our analysis functions, which lets us track and control the execution of an arbitrary target, even without the cooperation of its source code producer. Our analysis records every instruction's operation, capturing the data needed to make possible the arduous offline evaluation that could detect further program faults (our true final goal). We have implemented a dynamic vector clock, taken directly from a research paper that addresses the shortfall most vector clocks suffer from: the number of concurrent processes must be known and fixed from the start. With the dynamic vector clock in use, our tool adapts as thread spawn and death events occur, without loss of data accuracy, and so captures the true nature of execution. Our Pin tool is meant to track data reads and writes in a novel way such that, when combined with the vector clock and a periodic intra-thread "window-view" snapshot, it enables a scalable algorithm to detect cases where OS scheduler sequences may or may not resolve in a way that leaves inconsistent data in memory. Because the scheme is novel, we do not reveal it in this textual submission (but will discuss it briefly with the judges in order to substantiate it). This first step is critical in finding and exploiting or ameliorating a most difficult class of surreptitious bugs.
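To make the idea of a vector clock that grows and shrinks with the thread population more concrete, here is a minimal Python sketch. It is a generic illustration, not the specific algorithm from the research paper we implemented, and it is not our Pin tool code: the class name `DynamicVectorClock`, the methods `fork` and `merge`, and the helpers `happens_before` and `concurrent` are all names assumed for this example. The key point is that the clock is keyed by thread id rather than being a fixed-length array, so a thread that appears later simply adds an entry.

```python
from __future__ import annotations


class DynamicVectorClock:
    """Illustrative vector clock keyed by thread id (hypothetical names).

    Entries are created on demand, so the number of threads does not
    need to be known in advance. A missing entry is treated as 0.
    """

    def __init__(self, owner: int) -> None:
        self.owner = owner
        self.ticks: dict[int, int] = {owner: 0}

    def tick(self) -> None:
        """Advance this thread's own component for a local event."""
        self.ticks[self.owner] += 1

    def fork(self, child: int) -> "DynamicVectorClock":
        """Spawn event: the child inherits everything the parent has seen."""
        self.tick()
        child_clock = DynamicVectorClock(child)
        child_clock.ticks.update(self.ticks)
        child_clock.tick()
        return child_clock

    def merge(self, other: "DynamicVectorClock") -> None:
        """Join or synchronization event: take the component-wise maximum."""
        for tid, count in other.ticks.items():
            self.ticks[tid] = max(self.ticks.get(tid, 0), count)
        self.tick()

    def snapshot(self) -> dict[int, int]:
        """Copy of the current clock, suitable for stamping a read or write."""
        return dict(self.ticks)


def _leq(a: dict[int, int], b: dict[int, int]) -> bool:
    """True if every component of a is <= the matching component of b."""
    return all(a.get(k, 0) <= b.get(k, 0) for k in a.keys() | b.keys())


def happens_before(a: dict[int, int], b: dict[int, int]) -> bool:
    """a causally precedes b: a <= b component-wise, and they differ."""
    return _leq(a, b) and not _leq(b, a)


def concurrent(a: dict[int, int], b: dict[int, int]) -> bool:
    """Neither stamp precedes the other, so the events may race."""
    return not _leq(a, b) and not _leq(b, a)
```

In this sketch, a write stamped in the parent before `fork` happens before any access in the child, while a parent write and a child write made after the fork with no intervening `merge` are reported as concurrent. Those concurrent read/write pairs are the kind of candidates an offline analysis would examine further.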

Although the dynamic vector clock and the Pin tool are currently separate in source, we have devised a plan for their integration and are confident that it can be accomplished quickly. Further, we believe the gathered data can be used both in automated deep analysis and in contexts involving more human interaction. For example, matrix clocks can be employed in place of vector clocks to enable deeper causal reasoning about inter-thread relationships. While this idea is not novel, our method of employing it is, and it leaves ample room for further research.

Our vision is to place execution records (which are output as simple XML files) in NoSQL containers, thereby capturing the relational nature of execution: data-data, instruction-data, data-function, and inter-thread-data relationships are all worth deep questioning and advanced analysis. Because these NoSQL containers scale well for transitive relationship exploration and network traversal queries, we believe we are poised to unlock a trove of valuable security artifacts in off-the-shelf software by producing, analyzing, and querying these datasets.
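As a rough illustration of how such records could become a traversable relationship graph, the Python sketch below parses a small XML trace and builds an adjacency map. Everything specific here is an assumption made for the example: the `<trace>`/`<access>` element names, the `thread`, `kind`, `addr`, and `func` attributes, and the function names `build_graph` and `shared_addresses` are hypothetical and do not reflect our actual output schema. An in-memory dictionary stands in for the NoSQL container.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical trace format, invented for this example only.
SAMPLE = """<trace>
  <access thread="1" kind="write" addr="0x1000" func="producer"/>
  <access thread="2" kind="read" addr="0x1000" func="consumer"/>
  <access thread="2" kind="write" addr="0x2000" func="consumer"/>
</trace>"""


def build_graph(xml_text):
    """Turn access records into an undirected graph of typed nodes.

    Nodes are (kind, value) tuples such as ("thread", "1") or
    ("data", "0x1000"); edges link thread-data, function-data,
    and thread-function pairs seen in the same record.
    """
    graph = defaultdict(set)
    for access in ET.fromstring(xml_text).iter("access"):
        thread = ("thread", access.get("thread"))
        func = ("func", access.get("func"))
        data = ("data", access.get("addr"))
        for a, b in ((thread, data), (func, data), (thread, func)):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def shared_addresses(graph):
    """Return {address: sorted thread ids} for data touched by 2+ threads."""
    shared = {}
    for node, neighbours in graph.items():
        if node[0] != "data":
            continue
        threads = sorted(value for kind, value in neighbours if kind == "thread")
        if len(threads) > 1:
            shared[node[1]] = threads
    return shared


if __name__ == "__main__":
    print(shared_addresses(build_graph(SAMPLE)))
```

A query like `shared_addresses` is a one-hop traversal. In a graph-oriented store the same question, and deeper transitive ones such as which functions are linked through shared data, would be expressed as traversal queries rather than hand-written loops.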

Further, by writing some rather simple integration scripts between these containers and existing 3D data visualization libraries, and by creating our own layout capabilities, we believe these relationships can be viewed quickly and in an immediately useful and obvious manner.
