DevStorm: Ordinals vs Column Names

Since the first post turned into a discussion of readability versus performance, maintainability versus speed and other very worthwhile topics, I decided to put this entire process to a road test. I created unit tests that called a DAL services basically retrieving an entire table in each of the three presented methodologies. To get the output, I enabled our handy-dandy unit test timer to trap the performance of the DAL list functions. In order for this to be good science, I restarted the database 3 times over the course of the testing and ran the tests in different orders each time. For retrieving just under 17000 rows consisting of 38 columns, here are the results and averages for the test run (values in Milliseconds).

Run	String	Dictionary	Ordinal	String Order	Dictionary Order	Ordinal Order	Difference
1	2705	2469	2505	1	3	2	236
2	2648	2575	2506	2	3	1	142
3	2669	2520	2501	2	1	3	168
4	2570	2648	2445	3	1	2	203
5	2577	2566	2397	3	1	2	180
6	2450	2646	2575	3	1	2	196
7	2555	2606	2479	2	1	3	127
8	2843	2621	2446	1	2	3	397
9	2628	2485	2388	1	2	3	240
10	2580	2525	2400	3	1	2	180
Total	2622.50	2566.10	2464.20				0.21

So, at the end of the day, here is what I learned. It really doesn't matter enough to worry about it. Sure, the ordinals are consistently faster than the dictionary and the dictionary is faster than string, but the variance is hardly worth worrying about. When retrieving 17000 rows, the average difference from the best performance to the worst performance was a matter of 2/10 of one second. I think I got wrapped up in a common distraction know as micro-optimization, where we spend a ton of mental energy and rewrite code to gain a millisecond on a routine. Yes, you should care about performance, and yes, faster code should always be your goal, but there comes a point in time where it stops being worth the development time. Yes, we should avoid the blatantly poor performing code mistakes that everyone knows about. But after that, we should be equally worried about the scalability, portability, maintainability and readability of our code. We should ask if saving 2/10 second while retrieving 17000 rows is the matter to discuss, or should we be discussing why we would ever be retrieving 17000 often enough in our application to have 2/10 second be an issue. Yes, I started the discussion, thinking it was a creative way to gain performance. Honestly, I enjoy trying to work through things like this too, which fed the distraction. I believe now that these mental exercises should always be framed with realistic improvement potential weighed against the time and effort of the pursuit. At that point, only pursue until it starts becoming a net loss of productivity. On this particular subject, after doing this research, count me firmly in the "It really doesn't matter, aim at code readability and object design" camp.

Pages

Friday, July 27, 2012

Ordinals vs Column Names - Part Deux

No comments:

Post a Comment