I wanted to write a tool to help developers write better code, something about using the Roslyn compiler and its Diagnostic API to analyze code and provide rule-based feedback and correction.
But that proved too difficult :(
So I made something much less ambitious, slightly more fun, and infinitely more useless.
(Also I just really wanted to use the Closure Compiler)
How it works
As always, step 1 is parsing. I use Closure Compiler to parse a JS file.
An aside: you might be wondering, why JS? Because Closure Compiler parses JS, and I want to use it, that's why.
I wrote a simple compilation pass that will print to stdout every time the compiler meets a variable name node.
e.g. when parsing
var x = 1, the compiler will print something like
VARNAME NAME x 1
My language of choice for analytics is Python, which has the fantastic pandas library. So the strategy here is to use Python to start a subprocess that runs the compiler (a jar file) with the right arguments, and capture all the stdout of the subprocess. We can then easily parse this output into a list of strings.
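A minimal sketch of the subprocess step, assuming Python 3.7+. The helper name `run_and_capture` is made up for illustration, and the jar name and flags in the comment are hypothetical, since the real command depends on how the custom pass is built. A stand-in command is used so the sketch runs without the jar.

```python
import subprocess
import sys


def run_and_capture(cmd):
    """Run cmd (a list of arguments) and return its stdout as a list of lines."""
    result = subprocess.run(
        cmd,
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode bytes to str
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.splitlines()


# Stand-in command so the sketch runs without the jar. The real call would be
# something like ["java", "-jar", "compiler.jar", "--js", "input.js"]
# (hypothetical jar name and arguments).
demo_cmd = [sys.executable, "-c", "print('VARNAME NAME x 1')"]
demo_lines = run_and_capture(demo_cmd)
```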
We transform whatever output the compiler emits into a list of variable names, i.e. from
VARNAME NAME x 1
VARNAME NAME y 1
to
x, y
So at the end of this step all we have left is a list of strings, and these strings are all the variable names used in the file.
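The parsing step could look roughly like this. It assumes every interesting line has the shape `VARNAME NAME <name> <number>` as in the example above, and that anything else the compiler prints can be ignored. The function name `extract_names` is hypothetical.

```python
def extract_names(lines):
    """Return the variable name from each 'VARNAME NAME <name> <n>' line."""
    names = []
    for line in lines:
        parts = line.split()
        # Assumed layout: marker, node type, variable name, trailing number.
        if len(parts) >= 3 and parts[0] == "VARNAME" and parts[1] == "NAME":
            names.append(parts[2])
    return names


sample = ["VARNAME NAME x 1", "VARNAME NAME y 1", "some other compiler output"]
names = extract_names(sample)
```

Lines that don't match the assumed shape are silently skipped, so warnings or other compiler chatter on stdout won't end up in the list.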
We can now do some magic on all these variables. Well, not really. I just wrote some simple string processing functions, which I call Analysis Engines, that try to capture interesting facts about the variable names. For example, length of the variable names, number of times a variable name was used, most popular prefix, most popular subsequence. From these we can ascribe characteristics to the programmer.
These characteristics are:
- concise / verbose depending on the mean variable name length
- camel case or underscore case depending on the style of the variable name
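To give a feel for what an Analysis Engine might look like, here is a rough sketch. It uses only the standard library rather than pandas so it runs anywhere. All function names, the prefix size of 2, and the length threshold of 8 are assumptions made for illustration, not the values used in the project.

```python
from collections import Counter
from statistics import mean


def mean_length(names):
    """Average number of characters per variable name."""
    return mean(len(n) for n in names)


def usage_counts(names):
    """How many times each variable name appears."""
    return Counter(names)


def most_common_prefix(names, size=2):
    """Most frequent leading `size` characters, or None if no name is long enough."""
    prefixes = Counter(n[:size] for n in names if len(n) >= size)
    return prefixes.most_common(1)[0][0] if prefixes else None


def verbosity(names, threshold=8):
    """'verbose' if the mean name length exceeds the (arbitrary) threshold."""
    return "verbose" if mean_length(names) > threshold else "concise"


def naming_style(names):
    """Guess the dominant style by counting names that look like each one."""
    # Underscore case: an underscore somewhere other than the very ends.
    underscore = sum(1 for n in names if "_" in n.strip("_"))
    # Camel case: no underscores and an uppercase letter after the first character.
    camel = sum(
        1 for n in names if "_" not in n and any(c.isupper() for c in n[1:])
    )
    if camel > underscore:
        return "camel case"
    if underscore > camel:
        return "underscore case"
    return "mixed"


example = ["userName", "userId", "i", "userName", "totalCount"]
report = {
    "verbosity": verbosity(example),
    "style": naming_style(example),
    "top_prefix": most_common_prefix(example),
    "most_used": usage_counts(example).most_common(1)[0][0],
}
```

Single-letter names like `i` count toward the mean length but toward neither style, which is why a "mixed" result is possible.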
Challenges I ran into
Closure Compiler. I'm glad I had dug slightly into the source of the Closure Compiler before, and written a blog post about it too. That is a complicated piece of software. It took me about an hour or so to get it to print the things I wanted.
Another challenge was using pandas. It is another complex piece of software, very very powerful, and I didn't have that much experience with it. So I had to spend some time digging into the documentation, wondering how to get the results I wanted.
The front end interaction took some time as well, I lack design skills but I didn't want things to be unusable. So I went with a minimalist look, just like my blog - black and white, some typography, font weights, font sizes.
The biggest challenge was probably thinking of what kind of metric about variable names would be interesting to us programmers, and how to present it in a fun way. I went with the sarcastic comments, hopefully that's funny...
Accomplishments that I'm proud of
Twilio integration to SMS you when the analysis is done. I did that and it worked in like one try. Two reasons for that: I had used their API before, and the Twilio API and documentation are just plain awesome.
UX of the app. There's not much UX, but I did try my hand at it and made it such that buttons and text disappear/appear as necessary so users wouldn't be confused by it, and by users I mean just myself.
What I learned
More about Closure Compiler. Recently I've been really interested in compilers and programming languages, so I have been finding excuses to work on something related to them. If I couldn't be hacking on the compiler or the language itself, at least I can use it to do something productive and useful, right? Right?
And I met so many people from all walks of life. Can't wait for the showcase!
What's next for accgtimnotyz
Probably a nap. A long nap.