How do we account for exception throws in static code analysis?

user3911119 :

I wrote a utility that builds a CFG (Control Flow Graph) for a Java method; its nodes are basic blocks rather than individual instructions.

I have not been able to treat exception throws as edges in the CFG, for two reasons:

  1. Every instruction in a try block can potentially throw an exception or error, which can be handled by any of the enclosing try-catch blocks. If we treat exception throws as edges, the number of paths to process increases drastically, and so does the number of nodes in the CFG.
  2. We need to know the inheritance hierarchy for exceptions before we can decide what jumps are possible.

How do static code analyzers solve this problem?

I am stuck at this point. If I have to proceed, how should I go about it?

Edit: in my case, I can limit support to those use cases which can specify where and which exceptions are thrown. This solved my second problem. I would still like to know how generic static code analysers manage this.

Antimony :

Here's how I dealt with the issue in the Krakatau decompiler:

"We need to know the inheritance hierarchy for exceptions before we can decide what jumps are possible."

Krakatau requires the class definitions of all referenced classes to be available, so it knows the inheritance hierarchy. However, if I were doing it again, I wouldn't do this. Requiring class definitions makes the decompiler difficult for users to operate, because finding and adding the dependencies is a huge pain. You don't actually need the hierarchy if you're OK with the analysis being a little less precise. You can instead just assume that any exception can reach every handler that covers the throwing instruction. In practice, I expect that it would lead to nearly the same results.
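To make the "all exceptions can reach all handlers" idea concrete, here is a minimal Java sketch. It is not Krakatau's code; the `ConservativeHandlers` and `Handler` names are made up for illustration. It assumes the method's exception table has already been parsed into rows with an inclusive start offset, an exclusive end offset, a handler offset and a catch type, and it simply ignores the catch type.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not taken from Krakatau.
final class ConservativeHandlers {
    /** One exception-table row; endPc is exclusive, as in the class file format. */
    static final class Handler {
        final int startPc, endPc, handlerPc;
        final String catchType; // null means catch-all; deliberately unused below

        Handler(int startPc, int endPc, int handlerPc, String catchType) {
            this.startPc = startPc;
            this.endPc = endPc;
            this.handlerPc = handlerPc;
            this.catchType = catchType;
        }
    }

    /**
     * Returns the handler offsets that get an exception edge from the
     * instruction at pc. Every handler whose range covers pc is included,
     * whatever its catch type, so no class hierarchy is needed.
     */
    static List<Integer> handlerTargets(List<Handler> table, int pc) {
        List<Integer> targets = new ArrayList<>();
        for (Handler h : table) {
            boolean covers = pc >= h.startPc && pc < h.endPc;
            if (covers && !targets.contains(h.handlerPc)) {
                targets.add(h.handlerPc);
            }
        }
        return targets;
    }
}
```

The cost of this shortcut is a few spurious edges, for example from an instruction to an outer handler that an inner catch-all would always intercept.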

"Every instruction in a try block can potentially throw an exception or error, which can be handled by any of the enclosing try-catch blocks. If we treat exception throws as edges, the number of paths to process increases drastically, and so does the number of nodes in the CFG."

Krakatau does include exceptions as edges in the CFG, which leads to the problems you identified. To reduce the number of edges, I pretended that only certain instructions can throw (method calls, array accesses, division, etc.). This isn't technically correct, but it does the right thing for real-world code. I've never seen anything that actually cares about exceptions thrown from a linkage error, Thread.stop() or the like. I did later add an option to disable this behavior, though.
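As a rough sketch of that filter (again, not Krakatau's actual code), the check below only gives exception edges to opcodes on a whitelist. The `ThrowingOpcodes` class, the `strict` flag and the exact mnemonic list are assumptions for illustration; the list is deliberately incomplete and would need extending for a real tool.

```java
import java.util.Set;

// Hypothetical filter for illustration; the opcode list is an assumption
// and is not exhaustive.
final class ThrowingOpcodes {
    private static final Set<String> MAY_THROW = Set.of(
        // method calls
        "invokevirtual", "invokespecial", "invokestatic",
        "invokeinterface", "invokedynamic",
        // array accesses
        "iaload", "iastore", "aaload", "aastore", "arraylength",
        // integer division and remainder
        "idiv", "irem", "ldiv", "lrem",
        // field accesses on an object, explicit throws, casts
        "getfield", "putfield", "athrow", "checkcast");

    /**
     * Decides whether an instruction gets exception edges.
     * With strict == true, every instruction is assumed able to throw,
     * which is the technically correct but expensive behavior.
     */
    static boolean mayThrow(String mnemonic, boolean strict) {
        return strict || MAY_THROW.contains(mnemonic);
    }
}
```

With `strict` set to false, `mayThrow("iadd", false)` is false, so an `iadd` inside a try block contributes no exception edge.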

Anyway, this worked well enough for most code, but it sometimes caused performance problems. In particular, really large methods with lots of field accesses or method calls would result in huge CFGs that made decompilation really slow. I tried some tricks to optimize this, but ultimately, the solution was to move from basic blocks to extended basic blocks.

Extended basic blocks (EBBs) are similar to basic blocks, except that exception edges are represented semi-implicitly, resulting in a much smaller CFG. An EBB consists of straight-line code with no entry points or exit points in the middle apart from exception edges, and where every instruction in the block is covered by the same set of exception handlers. That way, instead of having an exception edge per instruction, you have one per block, making things much more efficient.

Even Java methods with thousands of method calls will typically have only a few try/catches and hence only a few EBBs.
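Here is a simplified Java sketch of the handler-set part of that partitioning; it is not Krakatau's implementation, and `HandlerRegions` is a made-up name. It assumes you already have, for each instruction in order, the set of handler offsets covering it, and it starts a new region whenever that set changes. A full EBB construction would also have to split at branch targets and after jumps, which this sketch leaves out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; it only splits on handler-set
// changes and ignores ordinary control flow.
final class HandlerRegions {
    /**
     * Returns [start, end) instruction-index pairs. Each region is a
     * maximal run of instructions covered by the same set of handlers,
     * so one exception edge per region and handler is enough.
     */
    static List<int[]> split(List<Set<Integer>> handlersPerInsn) {
        List<int[]> regions = new ArrayList<>();
        int start = 0;
        for (int i = 1; i <= handlersPerInsn.size(); i++) {
            boolean atEnd = i == handlersPerInsn.size();
            // Close the current region at the end of the method or when
            // the covering handler set differs from the region's set.
            if (atEnd || !handlersPerInsn.get(i).equals(handlersPerInsn.get(start))) {
                regions.add(new int[] {start, i});
                start = i;
            }
        }
        return regions;
    }
}
```

For a method with a single try/catch in the middle, this yields three regions (before, inside and after the try), no matter how many calls each one contains.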
