The Power of CQLinq for Developers
Editorial Note: I originally wrote this post for the NDepend blog. Check out the original here, at their site. While you’re there, have a look around at some of the other posts and subscribe to the RSS feed if you’d like a weekly post about static analysis.
I can still remember my reaction to Linq when I was first exposed to it. And I mean my very first reaction. You’d think, as a connoisseur of the programming profession, it would have been, “wow, groundbreaking!” But, really, it was, “wait, what? Why?!” I couldn’t fathom why we’d want to merge SQL queries with application languages.
Up until that point, a little after .NET 3.5 shipped, I’d done most of my programming in PHP, C++ and Java (and, if I’m being totally honest, a good bit of VB6 and VBA that I could never seem to escape). I was new to C#, and, at that time, it didn’t seem much different than Java. And, in all of these languages, there was a nice, established pattern. Application languages were where you wrote loops and business logic and such, and parameterized SQL strings were where you defined how you’d query the database. I’d just gotten to the point where ORMs were second nature. And now, here was something weird.
But, I would quickly realize, here was something powerful.
The object-oriented languages that I mentioned (and whatever PHP is) are imperative languages. This means that you’re giving the compiler/interpreter a step-by-step series of instructions on how to do something. “For an integer i, start at zero, increment by one, continue if less than 10, and for each integer…” SQL, on the other hand, is a declarative language. You describe what you want, and let something else (e.g. the RDBMS server) sort out the details. “I want all of the customer records where the customer’s city is ‘Chicago’ and the customer is less than 40 years old. You figure out how to do that and just give me the results.”
And now, all of a sudden, an object-oriented language could be declarative. I didn’t have to write loop boilerplate anymore!
Rise of the Linq Providers
Well, okay, that might not be strictly true, but a guy can dream, right? And still, the point remained. In my code, I could write detailed, declarative queries to examine collections of objects in memory. Bits of code that previously needed hundreds of lines now required maybe dozens. And the code, once you were used to it, became a lot more readable.
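To make the contrast concrete, here is a minimal sketch of the Chicago example from earlier, written both ways. The Customer type and both method names are invented for this illustration; nothing here comes from a real code base.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type, invented for this illustration.
public record Customer(string Name, string City, int Age);

public static class CustomerQueries
{
    // Imperative: spell out the loop, the test, and the accumulation.
    public static List<Customer> YoungChicagoansLoop(IReadOnlyList<Customer> customers)
    {
        var results = new List<Customer>();
        for (int i = 0; i < customers.Count; i++)
        {
            if (customers[i].City == "Chicago" && customers[i].Age < 40)
            {
                results.Add(customers[i]);
            }
        }
        return results;
    }

    // Declarative: describe the result and let Linq do the iterating.
    public static List<Customer> YoungChicagoansLinq(IEnumerable<Customer> customers) =>
        customers.Where(c => c.City == "Chicago" && c.Age < 40).ToList();
}
```

Both methods return the same customers. The second one simply says what it wants and leaves the looping to the framework.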
Another interesting development followed as developers got used to and internalized the Linq revolution: the Linq to X pattern began to emerge. Microsoft itself had an erstwhile ORM called “Linq to SQL,” but you also saw “Linq to objects” and “Linq to XML” and the like. This briefly became ubiquitous enough that some recruiters probably knew these terms back in 2010 or so. You could Linq to all the things!
Eventually, however, this gave way to a less sensational state of affairs in which Linq was essentially the abstraction it was intended to be, and “to XML” and “to objects” were single implementations in a growing ecosystem of them. People began to write Linq providers, where the idea was to create a queryable Linq interface on top of some source of data.
Maybe you wanted to see what was up on Twitter. The idea was that a client of such a provider should be able to do something like this:
var gitHubTweets = tweets.Where(t => t.Text.ToLower().Contains("github"));
What’s happening here is that the Linq provider (both the code and the author of that code) is allowing declarative queries against its data. So, someone writing a Twitter Linq provider would give users a way to say, “I don’t care how you go about it, but get me my GitHub tweets!”
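Here is a rough sketch of that hand-off, assuming an in-memory list as a stand-in for a real provider. The Tweet type and the TweetDemo class are hypothetical. AsQueryable is the standard .NET method that wraps a collection in an IQueryable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical tweet type; a real provider would define its own.
public record Tweet(string Author, string Text);

public static class TweetDemo
{
    // Stand-in for a provider: an in-memory collection exposed as IQueryable.
    public static IQueryable<Tweet> Tweets(IEnumerable<Tweet> source) =>
        source.AsQueryable();

    // The caller only states what it wants; how to fetch it is the source's job.
    public static IQueryable<Tweet> GitHubTweets(IQueryable<Tweet> tweets) =>
        tweets.Where(t => t.Text.ToLower().Contains("github"));
}
```

Because the query is built against IQueryable, it travels as a description of the request (available through the query's Expression property) rather than as a loop. A real provider would read that description and decide for itself how to fetch matching tweets; this stand-in just filters the list in memory.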
As the dust settled with Linq and all of this backstory, something awesome happened. At least, something awesome for me and for my programming chops.
NDepend introduced a very specialized, very powerful Linq provider: CQLinq. With CQLinq, you didn’t query XML files, databases, Twitter, or any other more traditional sources of data. Rather, you queried your own source code.
Now, consider the impact this had on me, given what you may know about me. You know that I have it in my head that code can be data. You know that I ran a whole series of posts that were experimentally oriented around code metrics. So there I am with a conceptual data canvas and with a mind that likes scientific inquiry, and all of a sudden, I discover a way to form, test, and confirm hypotheses about source code with an unprecedentedly short feedback loop.
I could suddenly answer questions that would historically have been insanely laborious. “I wonder what percentage of methods in this code base have a cyclomatic complexity higher than 5 and reference at least 3 class fields?” The old way to answer this question? Go through literally every method in the codebase by hand, keeping score. The new way? Spend 30 seconds writing a query and then execute it.
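To give a feel for the shape of such a query, here is a sketch in plain Linq over a made-up MethodStats type. This is not CQLinq itself: NDepend's code model has its own types and property names, and everything named below is an assumption for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for one row of per-method metrics.
// NDepend's real code model has its own types and property names.
public record MethodStats(string Name, int CyclomaticComplexity, int FieldsReferenced);

public static class MetricQueries
{
    // Percentage of methods with complexity above 5 that also reference at least 3 fields.
    public static double PercentComplexAndFieldHeavy(IReadOnlyCollection<MethodStats> methods)
    {
        // Avoid dividing by zero on an empty code base.
        if (methods.Count == 0) return 0.0;

        int matches = methods.Count(m => m.CyclomaticComplexity > 5 && m.FieldsReferenced >= 3);
        return 100.0 * matches / methods.Count;
    }
}
```

The whole question fits in a single predicate, which is why answering it takes seconds instead of an afternoon with a spreadsheet.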
The old way was prohibitive, frankly, so there was no old way. Going through every method in the codebase and tabulating something like that would never have been worth the labor investment to nibble at hunches and patterns. So I just didn’t do it.
But, with the new way, the barriers to entry were utterly removed, and so I did. And I’ve been doing so ever since.
This has allowed me to learn a lot not just about code or best practices, but about patterns and approaches and about what people do. I’ve learned about code properties that correlate with different qualitative opinions on code, and I’ve learned about properties that tend to make code easy or hard to maintain. I’ve learned enough about this to build it into part of a consulting practice.
CQLinq is extremely powerful, and it will make you a student of code in a way you’ve never been before. It frees you up to let your curiosity be satisfied in very little time, and to truly start understanding what makes application source code tick.