Jun 26, 2009 07:36
One of our projects is a minor enhancement to a very large app built by an outside development firm down in Argentina. We're trying to reuse existing stored procedures whenever possible because we don't have any documentation on the app itself and only spotty documentation on the business logic. But the new requirement is just another report that combines data from two existing reports, so maximum reuse is the order of the day.
So one of my devs calls me over yesterday, complaining that one of the existing sprocs is running really slow, taking several minutes to return about 45,000 rows of data.
I took a look at it and saw that it's really a very straightforward SELECT statement that just joins three tables. But there's a DISTINCT in there. Two of the tables are just dimensions (read: relatively static metadata) with a few thousand rows each, but the third is a fact table (read: transactional data) with... 12 million rows.
So the SQL joins all three tables together and then performs a DISTINCT across 12 million rows of data. This is on SQL Server 2000, and there don't appear to be any indexes, either.
I just performed the DISTINCT against the fact table first, on the two fields we're using to join, wrapped that up in a subquery, and then joined the result to the dimension tables. That brought the return time down to a consistent 41 seconds, and I just know we can add some indexes to drop that dramatically further.
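For the curious, the rewrite looked roughly like this. The table and column names are invented, since I can't share the real schema, but the shape is the point: push the DISTINCT down into a subquery over just the two join keys, so the deduplication runs over two narrow columns instead of the full joined rowset. (This assumes the dimension keys uniquely determine the columns you're selecting, which they did in our case.)

```sql
-- Before: join everything, then DISTINCT over 12 million joined rows
SELECT DISTINCT d1.ProductName, d2.RegionName
FROM FactSales f
JOIN DimProduct d1 ON d1.ProductID = f.ProductID
JOIN DimRegion  d2 ON d2.RegionID  = f.RegionID

-- After: DISTINCT the join keys first, then join the much smaller result
SELECT d1.ProductName, d2.RegionName
FROM (
    SELECT DISTINCT ProductID, RegionID
    FROM FactSales
) f
JOIN DimProduct d1 ON d1.ProductID = f.ProductID
JOIN DimRegion  d2 ON d2.RegionID  = f.RegionID
```

The first form forces the engine to sort or hash 12 million wide rows after the joins; the second deduplicates two integer columns and then joins a result set of at most a few thousand key pairs against the small dimension tables.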
The moral of the story, though, is that now my dev thinks I'm some sort of database god, which just blows my mind. Do they not teach database fundamentals anymore? Can people (at least two developers, demonstrably: the original author and my current dev) no longer recognize a blatantly bad query?
This is why technologies like LINQ, while they save you loads of time, ultimately lead to a degradation in the overall skillset of developers, something I have railed against before.