Simple is Beautiful

Mar 05, 2017 19:05

Given two sequences of characters (short one and long one) find how many sub-sequences of the long sequence are equal to the short sequence.

python, algorithm

Leave a comment

Comments 14

ex_juan_gan March 6 2017, 02:10:07 UTC
I do it by building a suffix tree (actually there's a better way, but I'm not there yet). It's all linear.

What's the simple solution?

Reply

aklepatc March 6 2017, 02:51:14 UTC
It's about 20 lines code, if I remember correctly. Also linear but much simpler than a suffix tree.

We need 2 data structures (mappings):
- for each characters of the short sequence the set of its indexes in the short sequence;
- for each index of the short sequence the number of the ways to "get there" using different sub-sequences of the long sequence.

The 2nd data structure is populated by means of walking the long sequence (just once)...

Reply

ex_juan_gan March 6 2017, 23:50:33 UTC
By index of the short sequence you mean a position in the short sequence? Or?

Reply

aklepatc March 7 2017, 00:35:50 UTC
Yes, "index" == "position". Choosing zero-based or 1-based or whatever does not matter, of course. However, it is important to be able to determine the previous position for the current one.

Maybe I should just post those 20 lines of code, if that would be useful?

Reply


Leave a comment

Up