Comments | iluvsheep: A lesson on overthinking

iluvsheep

A lesson on overthinking

Nov 07, 2007 14:58

I recently came across a random text file that I had written earlier in the year. After a little inspection, it appears to contain two things: 1) The pseudo-code for two very different ways of determining which 3 numbers in a list (confusingly referred to as "string") sum to 0, and 2) A poignant lesson on the dangers of unnecessary complication.

( Read more... )

Leave a comment

Back to all threads

arisrabkin November 7 2007, 21:10:33 UTC

I think there's an O(N) /O(N^2) algorithm:

Make a hashtable h, sized for n elements.
Hash each number and stick it in the bucket.
For each pair of numbers, add them, and and stick them in the corresponding bucket.

Scan through the table
for each bucket:
Let x be the label of the bucket
If there's a number or pair in the bucket, look at the bucket for -x: if there's a pair or number [respectively]
then they sum to zero, return true.

If no matches, fail.

iluvsheep November 8 2007, 00:29:10 UTC

I think this was sort of what I was originally trying with the first crazy bit of code above. Looking at it again, perhaps I was on the right track and just didn't get it quite right.

I'm not sure if your solution is quite correct yet, say I gave you a list of positive numbers and two zeroes. You would hash the pair of both zeroes as summing to 0, and each 0 itself would hash into the single 0 value. Then you would find that there is a singleton 0 that matched a pair summing to 0, specifically (0,0), even though there aren't, in fact, three separate entries that sum to 0.

Also, for what it's worth, I think it takes more than O(N) memory. Say I gave you a list of [0,1,2,4,8,16]. Those 6 numbers would result in pairs summing to 15 different values. I think the memory usage is actually O(N^2).

arisrabkin November 8 2007, 00:41:13 UTC

*thinks*.

I see that I have N^2 elements, and probably need an O(N^2) size hash table. that's probably okay; also, I may be able to pick a better data structure. Something treeish, for instance.

And yes, I need to do something to detect duplicates. The things that go in the buckets are probably not the numbers themselves, but number+index in input string. That lets me detect duplicates.

jibb November 8 2007, 08:35:14 UTC

Hmm... But you don't really need to put the pair-sums in the hash table. If you have a hash table keyed by number containing the index(es) of the number in the list, then go through all pairs of i and j, checking if hashtable[-(list[i] + list[j])] contains an index other than i and j, you should be able to get an answer in O(n^2) time and O(n) space. And now no trouble with duplicates.

def triplets(nums):
indexes = {}
for index, num in enumerate(nums):
if num not in indexes:
indexes[num] = []
indexes[num].append(index)
for index_a, a in enumerate(nums):
for offset_b, b in enumerate(nums[index_a+1:]):
index_b = index_a + 1 + offset_b
c = -(a+b)
if c not in indexes:
continue
for index_c in indexes[c]:
if index_c not in (index_a, index_b):
return ((a, b, c), (index_a, index_b, index_c))
return None

A bit of lazy testing seems to indicate it works.

iluvsheep November 8 2007, 18:06:15 UTC

Well, done. It does seem to work from my testing as well.

As for the run time, maybe it's the 3 nested for loops that are confusing me, but it really looks like that algorithm is O(N^3), not O(N^2). Care to explain?

jibb November 8 2007, 22:00:37 UTC

Hmm... A slightly different variation, making sure we only iterate through each of the lists in indexes once...

def triplets(nums):
indexes = {}
for index, num in enumerate(nums):
if num not in indexes:
indexes[num] = []
indexes[num].append(index)
for index_a, a in enumerate(nums):
for offset_b, b in enumerate(nums[index_a+1:]):
index_b = index_a + 1 + offset_b
c = -(a+b)
if c not in indexes:
continue
for index_c in indexes[c]:
if index_c not in (index_a, index_b):
return ((a, b, c), (index_a, index_b, index_c))
del indexes[c] # added line
return None

Now the innermost for loop can only iterate through as many item as were inserted into the lists in indexes, which is n, so it should now be O(n^2)

iluvsheep November 15 2007, 05:16:27 UTC

Yup, I do think that works. Well done. Want a job? (I'm only sorta kidding).

Back to all threads