I recently came across a random text file that I had written earlier in the year. After a little inspection, it appears to contain two things: 1) The pseudo-code for two very different ways of determining which 3 numbers in a list (confusingly referred to as "string") sum to 0, and 2) A poignant lesson on the dangers of unnecessary complication.
(
Read more... )
Make a hashtable h, sized for n elements.
Hash each number and stick it in the bucket.
For each pair of numbers, add them, and and stick them in the corresponding bucket.
Scan through the table
for each bucket:
Let x be the label of the bucket
If there's a number or pair in the bucket, look at the bucket for -x: if there's a pair or number [respectively]
then they sum to zero, return true.
If no matches, fail.
Reply
I'm not sure if your solution is quite correct yet, say I gave you a list of positive numbers and two zeroes. You would hash the pair of both zeroes as summing to 0, and each 0 itself would hash into the single 0 value. Then you would find that there is a singleton 0 that matched a pair summing to 0, specifically (0,0), even though there aren't, in fact, three separate entries that sum to 0.
Also, for what it's worth, I think it takes more than O(N) memory. Say I gave you a list of [0,1,2,4,8,16]. Those 6 numbers would result in pairs summing to 15 different values. I think the memory usage is actually O(N^2).
Reply
I see that I have N^2 elements, and probably need an O(N^2) size hash table. that's probably okay; also, I may be able to pick a better data structure. Something treeish, for instance.
And yes, I need to do something to detect duplicates. The things that go in the buckets are probably not the numbers themselves, but number+index in input string. That lets me detect duplicates.
Reply
Hmm... But you don't really need to put the pair-sums in the hash table. If you have a hash table keyed by number containing the index(es) of the number in the list, then go through all pairs of i and j, checking if hashtable[-(list[i] + list[j])] contains an index other than i and j, you should be able to get an answer in O(n^2) time and O(n) space. And now no trouble with duplicates.
def triplets(nums):
indexes = {}
for index, num in enumerate(nums):
if num not in indexes:
indexes[num] = []
indexes[num].append(index)
for index_a, a in enumerate(nums):
for offset_b, b in enumerate(nums[index_a+1:]):
index_b = index_a + 1 + offset_b
c = -(a+b)
if c not in indexes:
continue
for index_c in indexes[c]:
if index_c not in (index_a, index_b):
return ((a, b, c), (index_a, index_b, index_c))
return None
A bit of lazy testing seems to indicate it works.
Reply
As for the run time, maybe it's the 3 nested for loops that are confusing me, but it really looks like that algorithm is O(N^3), not O(N^2). Care to explain?
Reply
Hmm... A slightly different variation, making sure we only iterate through each of the lists in indexes once...
def triplets(nums):
indexes = {}
for index, num in enumerate(nums):
if num not in indexes:
indexes[num] = []
indexes[num].append(index)
for index_a, a in enumerate(nums):
for offset_b, b in enumerate(nums[index_a+1:]):
index_b = index_a + 1 + offset_b
c = -(a+b)
if c not in indexes:
continue
for index_c in indexes[c]:
if index_c not in (index_a, index_b):
return ((a, b, c), (index_a, index_b, index_c))
del indexes[c] # added line
return None
Now the innermost for loop can only iterate through as many item as were inserted into the lists in indexes, which is n, so it should now be O(n^2)
Reply
Reply
Leave a comment