Search for Code

Today, TechCrunch posted a review of AllTheCode, a new code search site. So now we have at least five code search engines:

  • AllTheCode: only Java results right now, no filtering available
  • Krugle: can specify language and project. Can filter results by where the search term appears (e.g. comments, method name, etc.). Can also search “Tech Pages” (a selective web site search) and projects.
  • Koders: can specify language and license
  • Google Code Search: can specify language, license, package, and file. Can search using regular expressions.
  • Codase: can specify language (Java, C, or C++ only), plus advanced things like searching within method names, class names, etc.

Why Use Code Search Tools?

Code search tools are handy because, as a developer, your productivity and the quality of your work depend heavily on how you come up with the code that goes into your applications. Sure, you could write everything from scratch, but that’s too slow. Reuse is a good thing. Ideally you should be able to find existing samples, snippets, libraries, and frameworks that you can adjust and glue together.

In the old days you’d have to store away your favorite code snippets on your PC, or thumb through archived issues of Dr. Dobb’s Journal. Later, you’d buy CD-ROMs filled with sample code, and tap your fingers while your drive whirred away. Eventually those code libraries could be installed on your PC. Recently, you’d use Google and other search engines (well, mostly Google) to search for code. Searching on “VB MD5 Algorithm” might get you some relevant results, but also a lot of junk.

So now we have code-specific search engines, allowing us to (hopefully) enter in our search terms, our programming language, maybe our license preference, and get back a set of results sorted by popularity, maybe with a Digg-esque rating and comment system so we know which code snippets are good and which suck. Right now if I pulled up an MD5 algorithm, how would I know if it even worked correctly?
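One way to answer that last question, short of community ratings, is to run whatever MD5 code you found against the published test vectors from RFC 1321. Here is a rough sketch of that idea. The class and method names are made up for illustration, and the JDK’s built-in MessageDigest stands in for the snippet you downloaded; you would swap your found implementation into md5Hex.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: checks an MD5 routine against RFC 1321 test vectors.
final class Md5VectorCheck {

    // Stand-in for the implementation under test. Here it delegates to the
    // JDK; replace the body with a call into the code you found.
    static String md5Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b)); // two lowercase hex chars per byte
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Returns true only if every known input hashes to its published digest.
    static boolean passesKnownVectors() {
        String[][] vectors = {
            {"", "d41d8cd98f00b204e9800998ecf8427e"},
            {"abc", "900150983cd24fb0d6963f7d28e17f72"},
        };
        for (String[] v : vectors) {
            if (!md5Hex(v[0]).equals(v[1])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("Passes RFC 1321 vectors: " + passesKnownVectors());
    }
}
```

Passing two vectors doesn’t prove the code is right, but failing one is a quick way to rule a search result out.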

Testing the Sites

To test the code search engines out, I did a search for “luhn” in Java on all five sites. FYI the Luhn Algorithm is used to test for invalid credit card numbers — it’s nice because a lot of invalid numbers can be eliminated without having to connect to and pass the number to a credit card gateway. All five sites returned matches (Codase with some extra work), and all had an integrated source viewer so I could review the match.
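For reference, here is roughly what I was hoping a good match would look like. This is my own minimal sketch of the Luhn check, not code taken from any of the search results, and the class and method names are just illustrative. It assumes the input is a plain string of digits with no spaces or dashes.

```java
// Minimal Luhn checksum sketch (names are illustrative, not from any search result).
final class LuhnCheck {

    // Returns true if the digit string passes the Luhn checksum.
    // Rejects null, empty, or non-digit input.
    static boolean isValid(String number) {
        if (number == null || number.isEmpty()) {
            return false;
        }
        int sum = 0;
        boolean doubleIt = false; // the rightmost digit is not doubled
        for (int i = number.length() - 1; i >= 0; i--) {
            char c = number.charAt(i);
            if (c < '0' || c > '9') {
                return false;
            }
            int d = c - '0';
            if (doubleIt) {
                d *= 2;
                if (d > 9) {
                    d -= 9; // same as adding the two digits of the product
                }
            }
            sum += d;
            doubleIt = !doubleIt;
        }
        return sum % 10 == 0;
    }

    public static void main(String[] args) {
        System.out.println(isValid("4111111111111111")); // common test card number
        System.out.println(isValid("4111111111111112")); // last digit changed
    }
}
```

Passing the check only means the number is well formed; it says nothing about whether the card actually exists, so you’d still need the gateway for that.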

So here’s what I found:

AllTheCode

AllTheCode returned 29 matches, but I didn’t see any search hit highlighting or anything. A lot of them were duplicates, too.

The source code viewer was ug-ly, bad color choices. And interestingly, my term “luhn” wasn’t in the document at all. I tried some other searches & it seems AllTheCode is having some sort of issue, because most or all of the matches were totally irrelevant to my search. Chalk it up to Alpha status code, maybe. Hmm…maybe code search engines should let us filter against alpha, beta, or release code? There was a link to download the code file, but no links for the project home page or project zip file. No links to view other files in the project, either.

Google Code Search

Google Code Search found about 100 matches and highlighted the search term. It also showed the code license and a link to the zip file containing the code file. It seemed to do a good job at suppressing duplicate files (e.g. the same code file in multiple projects), something none of the other search engines did. Not a huge deal, but nice.

The code viewer was spartan (as with many things Google), but it had links to other files in the project, and it automatically scrolled me to the spot in the code where “luhn” was found. A very nice touch, especially for huge files. There were handy links to download either just the code file, or the whole project in a zip file. No link to visit the project’s home page (e.g. on SourceForge) though, which might make it annoying if you wanted to check for later versions or supporting documentation.

Koders

Koders found 23 matches and showed a bit more information than Google did, like LOC, copyright and/or license info, links to SourceForge project pages, etc. Interestingly, Koders runs on ASP.NET, a rarity in the Web 2.0 world.

The code viewer was a bit nicer than Google’s, and the search term was highlighted. On the left side were links to other files in the project plus anchor links to the various methods in the code file, which was cool. I still had to scroll or use my browser’s search to find out exactly where in the file it said “luhn,” but at least it was highlighted. I think they should have provided anchor links to exactly where in the file the term was found. I could download the code file, or browse to a “project home page” of sorts (which had links to SourceForge and some stats), but I didn’t see an easy way to download the entire project right from Koders. Koders also offers a plugin for IDEs like Visual Studio and Eclipse.

Krugle

Krugle found 21 matches. Like Koders, it had some links to the SourceForge project pages and showed the license. It also has a neat feature allowing you to filter results based on where the search term appears (e.g. only show results where “luhn” is part of the function definition), but this feature didn’t work very reliably.

I must say that Krugle has the prettiest code viewer. I liked the highlighting (though using italics for comments is probably a bad idea for readability), it had a cool treeview of the code repository, and it opened the code files in separate tabs so you could browse through multiple results. Yes I know you can also middle-click a link to open a new browser window tab, but still. I could download the code file and link to the project home page, but I couldn’t download the whole project from Krugle.

One bad thing is that Krugle heavily uses AJAX, and thus your browser only shows the www.krugle.com link, no matter where you are. Makes it hard to bookmark stuff. You can click the “Create Link” button to get a greybox popup with a unique tinyurl-ish link to your search results, but that’s one extra step I’d rather not take. Another bad thing is it didn’t highlight my search term in the code, and my browser’s search feature didn’t work on the AJAX-y page. So I had to manually scan the whole code file to find out where “luhn” was. Not good for huge code files. 🙂

Codase

Edit: I liked the idea of offering different types of searches, but got inconsistent results on Codase. A smart query (the default type of query) for “luhn” in Java returned no matches, while a free text query for “luhn” in Java returned 1. A free text query for “luhn” in all languages returned 11 matches, but Codase said they were all Java files. So why those 10 extra matches didn’t show up before isn’t quite clear to me. FYI, searching for “validate credit card” returned more matches, but it still trailed the other engines. The presentation is nice, with syntax coloring in the search results for better readability.

The code browser is pretty nice looking, too. Nice highlighting, with links to download the code. No project links there, though.

Conclusion

Right now I’d put Google at the top in terms of number of results, duplicate file suppression, and barebones ease of use. Koders is a close second with a better code viewer, handier project links, and some neat tools like IDE plugins. I could see Koders and Google switching places depending on your preferences.

Krugle was in third place with an attractive UI but average features. Krugle offers some interesting stuff that the other engines don’t (MyKrugle for attaching notes and saving documents, code position searches, tech page searches), but IMO these are features that not a lot of people would use.

Currently, AllTheCode is just too alpha/buggy to consider, plus the fact that it only searches Java limits it greatly. Codase could be interesting with its deep filtering ability and syntax coloring, but I think most people would just use the “smart search,” and the low number of results and limited languages hurt Codase.

Other Thoughts

FYI, to simulate a developer who might need to validate credit cards but wouldn’t know the name of the algorithm (i.e. “luhn”), I also did a general search for “validate credit card” in Java code and got a lot of results (100+) in Krugle, Koders, and Google Code Search. The results returned seemed applicable, and interestingly, the top few matches were different for all three engines.

Also, I’d like to see some community-based value added in, with comments or diggs to help me decide which algorithm to pick or avoid.
