The problem
Ever struggle to untangle a Big Ball of Mud? I have too, more times than I can count. I’m excited to open source SourceCrawler, a tool born out of abject frustration and significant time spent reverse-engineering code, to help us all work through C# monoliths.
By the time I’d started my third position as a senior software engineer specializing in C# at an enterprise software company, I’d pretty much had it with digging through thousands of C# files for the code behind a given behavior and then tracing each file back to its project, solution, and assembly. I’d ask people where the solutions and projects lived and which assemblies a particular file or bit of functionality contributed to, but even when the person who wrote a given section was still around, they (understandably) rarely remembered anything about the code in question.
Every company was once a start-up, where throwing in unreviewed code to douse the fire du jour was pervasive and naming conventions, from variables through source and project files to assemblies, were barely existent or inconsistent. Requirements (read: customer and market demands) changed dramatically during development, and reliable refactoring tools were either unavailable, insufficient, or simply not used in the interest of getting code to production quickly.
I realized that newbies needed a tool to quickly locate functionality in C# monoliths: not just where it might be in the source, but where it sits in the hierarchy of solutions and projects and, ultimately, which assembly that file affects.
A utility is born
My current company allows employees to switch departments with some regularity. I found myself between departments and started researching, designing, and writing SourceCrawler during an afternoon with no particular deliverable for the department from which I would soon depart.
Where SourceCrawler adds value over directory ‘grep’ tools
I’ve used many grep tools (my current favorite is Agent Ransack). They have several things in common: they crawl an entire directory structure recursively, searching for text, or a pattern of text, in files matching a given pattern. Each search starts the crawl again from the root. Searches after the first do run faster, thanks to improvements in Windows file caching over the years, but each “hit” typically gives you little context beyond the file’s location in the directory structure.
SourceCrawler is different in that it searches only *.cs files and shows you which projects, solutions, and assemblies each file contributes to. It also ingests the directory structure and all source-file data into memory, so searches typically take under a second (the usual factors, such as the amount of source and memory performance, can push search times up). Once the directory tree is ingested, it can actually be deleted and you’ll still be able to search over the source.
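To make the idea concrete, here is a minimal sketch in Python of the general technique described above: crawl once, hold every *.cs file in memory, and attach project and assembly context to each file. It is not SourceCrawler’s actual implementation. The names `SourceFile`, `crawl`, `search`, `nearest_project`, and `assembly_name` are hypothetical, and the sketch assumes that a file belongs to the nearest .csproj found in its own or a parent directory and that the assembly name is the project’s `AssemblyName` element, falling back to the project file name. Solution (.sln) lookup is left out for brevity.

```python
import os
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SourceFile:
    path: str                # path as it was on disk at crawl time
    text: str                # full file contents, held in memory
    project: Optional[str]   # nearest enclosing .csproj, if any
    assembly: Optional[str]  # assembly name assumed for that project


def assembly_name(csproj_path: str) -> str:
    """Read <AssemblyName> from a project file; fall back to the project
    file name, which is what MSBuild uses when the element is absent."""
    try:
        for elem in ET.parse(csproj_path).iter():
            # Strip any XML namespace (old-style projects declare one).
            if elem.tag.rsplit("}", 1)[-1] == "AssemblyName" and elem.text:
                return elem.text.strip()
    except ET.ParseError:
        pass
    return os.path.splitext(os.path.basename(csproj_path))[0]


def nearest_project(path: str, projects: Dict[str, str], root: str) -> Optional[str]:
    """Walk up from the file's directory to the crawl root, returning the
    first .csproj found (an assumption about how files map to projects)."""
    directory = os.path.dirname(path)
    while True:
        if directory in projects:
            return projects[directory]
        parent = os.path.dirname(directory)
        if directory == root or parent == directory:
            return None
        directory = parent


def crawl(root: str) -> List[SourceFile]:
    """Ingest every *.cs file under root into memory, with its context."""
    root = os.path.abspath(root)
    projects: Dict[str, str] = {}  # directory -> .csproj path
    cs_paths: List[str] = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            if name.lower().endswith(".csproj"):
                projects[dirpath] = full
            elif name.lower().endswith(".cs"):
                cs_paths.append(full)

    # Resolve assembly names while the project files still exist on disk.
    assemblies = {proj: assembly_name(proj) for proj in projects.values()}

    index: List[SourceFile] = []
    for path in cs_paths:
        with open(path, encoding="utf-8-sig", errors="replace") as handle:
            text = handle.read()
        project = nearest_project(path, projects, root)
        index.append(SourceFile(path, text, project, assemblies.get(project)))
    return index


def search(index: List[SourceFile], needle: str) -> List[SourceFile]:
    """Case-insensitive substring search over the in-memory copies only."""
    needle = needle.lower()
    return [f for f in index if needle in f.text.lower()]
```

Because `search` only touches the in-memory index, it never goes back to the disk, which is why the source tree can be deleted after the crawl and also why edits on disk stay invisible until the next crawl.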
This is a double-edged sword, of course: when you change source on disk, you’ll need to “recrawl” to get the latest code into SourceCrawler. A command-line option exists so you can run scheduled re-crawls via Windows Task Scheduler, or kick one off after a source pull or sync.
SourceCrawler in practice
I’ve been using this tool to help me locate and start solutions since the day I had a walking skeleton of it. Not a day goes by when I don’t use it to answer some typical developer question. A few others in my organization have used it also.
Being able to start solutions and open a command prompt or Explorer at a given location has proven invaluable. On an almost hourly basis, it saves me from navigating a maze of directories, which can also be redundantly or just badly named.
As a developer, you know that saving a few seconds once a day is not a big deal, but multiply that by 75 or 100 times per day and the time savings add up. Not to mention the lower levels of frustration that come from not having to navigate old, rotted, yet functional code by hand.
I look forward to your feedback and PRs.
Download at: SourceCrawler