Git Diff at method level

anon-sAI :

I am looking to get information on all the methods/functions added, deleted and modified between any two commits.

Notes -

  1. The code base is in Java and hosted on GitHub

  2. Ultimate goal - I must be able to get all the deleted, modified (both source code modification and renaming of methods) and newly added methods between any two commits, spanning sub-packages and classes

  3. Even better if the full method signature is returned along with the fully qualified method name

Things I Tried

  1. git diff - but the diff output is huge and I'm really only interested in which methods were added, deleted or modified (i.e. for Java, the diff lists the class but not the function)

  2. git log -L :function:path/to/file - prints the change history of that one function, which is not what I intend: it watches a specific function rather than the whole Git repo. Another limitation is getting the diff between two commits.

Desired Results

Diff between any two commits should return

Methods Added -> 
        myMethod12 - path/to/class
        myMethod34 - path/to/class

Methods Deleted -> 
        myMethod3 - path/to/class
        myMethod11 - path/to/class

Methods Renamed ->
        (Previous Name)  (Revised Name)  (Path)
        myMethod6        yourMethod32    path/to/class

Methods Modified (source code modifs) ->
        myMethod44 - path/to/class

or ideally the fully qualified method name

i.e.

Methods Added ->
       com.example.subp.subp2.nestedpack.addMessages(Message[] msgs)
...
torek :

Git is a general tool. It does not understand your source language (in this case Java, but what if your source language were instead Swift or Python or C++ or TypeScript or, well, whatever else you can think of?). It just understands "lines of text" and has simple (or sometimes, not-very-simple) regular expressions to recognize function / method / class / other such definitions, to annotate diffs.[1]

To get the kind of output you want, you need a tool that does understand the language in question.

Given such a tool, you will give it:

  • an older version (a commit or a file from that commit), and
  • a newer version (another commit, or "the same" file from that commit).

It should then read those two commits' files, figure out what methods you have, and produce whatever analysis you like.
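
As a rough illustration of that last step, here is a minimal Python sketch of the comparison only. It assumes that some Java-aware parser (not shown, and not part of Git) has already turned each version into a mapping from fully qualified method signature to normalized method body. The names `MethodDiff` and `diff_methods` are made up for this example, and treating "deleted and added with an identical body" as a rename is just one possible heuristic, not something Git or any particular tool defines.

```python
from typing import Dict, List, NamedTuple, Tuple


class MethodDiff(NamedTuple):
    added: List[str]
    deleted: List[str]
    renamed: List[Tuple[str, str]]  # (previous signature, revised signature)
    modified: List[str]


def diff_methods(old: Dict[str, str], new: Dict[str, str]) -> MethodDiff:
    """Classify methods given {signature: body} maps for two versions.

    The maps are assumed to come from a language-aware parser; this
    function only compares them.
    """
    added = set(new) - set(old)
    deleted = set(old) - set(new)
    # Same signature on both sides but a different body: modified.
    modified = sorted(s for s in set(old) & set(new) if old[s] != new[s])

    # Heuristic (an assumption of this sketch): a deleted method whose body
    # reappears unchanged under a new signature is reported as a rename.
    added_by_body: Dict[str, List[str]] = {}
    for sig in sorted(added):
        added_by_body.setdefault(new[sig], []).append(sig)

    renamed: List[Tuple[str, str]] = []
    for sig in sorted(deleted):
        candidates = added_by_body.get(old[sig])
        if candidates:
            renamed.append((sig, candidates.pop(0)))

    for previous, revised in renamed:
        deleted.discard(previous)
        added.discard(revised)

    return MethodDiff(sorted(added), sorted(deleted), renamed, modified)


# Hypothetical input, reusing the method names from the question.
old_methods = {
    "com.example.A.myMethod3()": "return 1;",
    "com.example.A.myMethod6()": "return 6;",
    "com.example.A.myMethod44()": "return 44;",
}
new_methods = {
    "com.example.A.myMethod12()": "return 12;",
    "com.example.A.yourMethod32()": "return 6;",
    "com.example.A.myMethod44()": "return 45;",
}
result = diff_methods(old_methods, new_methods)
```

With this input, `myMethod12` ends up in `added`, `myMethod3` in `deleted`, `myMethod6` / `yourMethod32` in `renamed`, and `myMethod44` in `modified`. The hard part, producing those two mappings from Java source, is exactly what needs the language-aware tool.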

What this tool needs from Git is two versions. When and whether it can handle just getting two files, or needs two entire snapshots, depends on that tool.

The git difftool command may, or may not, be helpful for invoking this other tool. What git difftool does is compare two entire commits, then, for each differing file, feed the old and new versions of those files to another tool. You choose that second tool, from any tool you have on your computer, anywhere. Git merely invokes that tool, on the pair of files extracted from the pair of commits. If this does what you need, you're now done. If not, you may need some more steps: for instance, you might want to run git diff --raw <commit1> <commit2> and parse its output, or just git checkout each of the two commits into some temporary locations (using a temporary index for each) and work from there.
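
If you go the git diff --raw route, the parsing step might look roughly like this. It is a minimal Python sketch rather than a finished tool: the names `RawChange`, `parse_raw_diff`, `changed_java_files` and `file_at` are invented for the example, and it assumes path names that Git does not need to quote (no tabs or newlines in them; use -z if you have such paths). It only finds which files differ and fetches their two versions; working out the methods is still up to the language-aware tool.

```python
import subprocess
from typing import List, NamedTuple


class RawChange(NamedTuple):
    status: str    # one letter: A(dded), D(eleted), M(odified), R(enamed), ...
    old_path: str
    new_path: str  # same as old_path unless the file was renamed or copied


def parse_raw_diff(output: str) -> List[RawChange]:
    """Parse the text printed by `git diff --raw <commit1> <commit2>`."""
    changes = []
    for line in output.splitlines():
        if not line.startswith(":"):
            continue
        # ":<old mode> <new mode> <old hash> <new hash> <status>\t<path>[\t<path>]"
        meta, *paths = line.split("\t")
        status = meta.split()[4][0]  # drop the similarity score, e.g. R86 -> R
        old_path = paths[0]
        new_path = paths[1] if len(paths) > 1 else paths[0]
        changes.append(RawChange(status, old_path, new_path))
    return changes


def changed_java_files(repo: str, old: str, new: str) -> List[RawChange]:
    """Run git diff --raw between two commits and keep only .java files."""
    out = subprocess.run(
        ["git", "-C", repo, "diff", "--raw", "-M", old, new],
        check=True, capture_output=True, text=True,
    ).stdout
    return [c for c in parse_raw_diff(out)
            if c.old_path.endswith(".java") or c.new_path.endswith(".java")]


def file_at(repo: str, commit: str, path: str) -> str:
    """Return the content of `path` as stored in `commit`."""
    return subprocess.run(
        ["git", "-C", repo, "show", f"{commit}:{path}"],
        check=True, capture_output=True, text=True,
    ).stdout
```

For each `RawChange`, you would call `file_at` once with the old commit and `old_path`, and once with the new commit and `new_path` (skipping the side that does not exist for added or deleted files), then hand both texts to your parser.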


[1] Note that regular expressions are not capable of proper parsing; most real languages require a grammar. See, e.g., Regular Expression Vs. String Parsing. A proper CS-theoretic discussion will get into Finite State Automata but is generally off topic on StackOverflow.
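
To make the footnote a bit more concrete, here is a small Python sketch of one way a simple pattern falls over: finding where a method body ends. Python's `re` module has no recursive patterns, so a pattern like the one below cannot keep track of nested braces, whereas even a trivial depth counter can. The helper name `matching_brace` is made up for this example, and the counter is itself only a toy: it ignores braces inside string literals and comments, which a real parser would handle.

```python
import re

source = "void f() { if (x) { a(); } b(); }"

# A non-greedy pattern stops at the first closing brace it sees,
# which belongs to the inner block, not to the method body.
naive = re.search(r"\{.*?\}", source).group(0)
# naive == "{ if (x) { a(); }"


def matching_brace(text: str, open_index: int) -> int:
    """Return the index of the brace that closes the one at open_index."""
    depth = 0
    for i in range(open_index, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                return i
    raise ValueError("unbalanced braces")


start = source.index("{")
end = matching_brace(source, start)
body = source[start:end + 1]
# body == "{ if (x) { a(); } b(); }"
```

The counter works because it remembers how deep it is, and that kind of memory is what a grammar-based parser provides and a plain regular expression lacks.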
