You're absolutely right, my friend

Posted by: catcherintherye, 2009-01-20 22:46:54 (1195 bytes)
The keywords are already sorted when they're provided. Initially they're just gonna be sorted alphabetically. Later I may add the "most popular" parameter like you said.

There are the generic types SortedDictionary<> and SortedList<> in .NET. However, I'm not sure how useful these would be in this case, as it's not practical to build an index that covers all the possible character combinations. If we only match the first two letters, theoretically that's an index size of 26x26 = 676. It grows exponentially as the number of matched letters increases.
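To put rough numbers on that growth, here is a small sketch. It assumes only the lowercase letters a-z count, and prefix_count is a made-up helper name, not anything from .NET.

```python
# Number of possible prefixes of length k over a fixed alphabet.
# Assumption: only the 26 lowercase letters a-z are allowed.
def prefix_count(k: int, alphabet_size: int = 26) -> int:
    return alphabet_size ** k

for k in range(1, 5):
    # Prints the prefix length and how many distinct prefixes exist.
    print(k, prefix_count(k))
```

Worth keeping in mind: with 50,000 keywords, at most 50,000 of those prefixes can actually occur for any given length, so only the prefixes that really show up would need an entry.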

So I'm thinking that instead of building an elegant solution, the next best thing might be to simply break the 50,000 keywords down into numerous smaller lists, with the first two letters as the key:

aa list1
ab list2
ac list3
...

Of course some of these lists would be far bigger than others (because a lot of words would start with, say, "st", whereas few words would start with, say, "xx").
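A minimal sketch of that split, written in Python just to keep it short (in .NET the same shape would be something like a Dictionary<string, List<string>>). The name build_buckets and the sample words are made up for illustration.

```python
from collections import defaultdict

def build_buckets(keywords, prefix_len=2):
    """Group keywords by their first `prefix_len` letters (lowercased).

    Words shorter than `prefix_len` end up under their whole text as the key.
    """
    buckets = defaultdict(list)
    for word in keywords:
        buckets[word[:prefix_len].lower()].append(word)
    return buckets

# Hypothetical sample. The input is assumed to be sorted already,
# so every bucket stays sorted as well.
keywords = ["stack", "star", "start", "stop", "xx", "zebra"]
buckets = build_buckets(keywords)

# Bucket sizes make the uneven split visible: "st" is crowded, "xx" is not.
sizes = {prefix: len(words) for prefix, words in buckets.items()}
print(sizes)
```

Only the prefixes that actually occur get a list, so the number of buckets is bounded by the number of keywords rather than by 26x26.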

Then, starting from the third character, a brute-force match would be applied (like the FindAll() method). What do you guys think?
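Here is a rough sketch of the whole lookup, again in Python for brevity. The list comprehension at the end plays the role that List<T>.FindAll() would play in .NET. The names build_buckets and suggest, the limit parameter, and the sample words are all assumptions for illustration.

```python
def build_buckets(keywords, prefix_len=2):
    """Group keywords by their first `prefix_len` letters (lowercased)."""
    buckets = {}
    for word in keywords:
        buckets.setdefault(word[:prefix_len].lower(), []).append(word)
    return buckets

def suggest(buckets, typed, prefix_len=2, limit=10):
    """Return up to `limit` keywords that start with `typed`."""
    typed = typed.lower()
    if len(typed) < prefix_len:
        # Too short to pick a single bucket: gather every bucket whose key
        # starts with what was typed so far.
        candidates = [w for key in sorted(buckets) if key.startswith(typed)
                      for w in buckets[key]]
    else:
        # Jump straight to the one bucket for the first two letters.
        candidates = buckets.get(typed[:prefix_len], [])
    # Brute-force scan of the (hopefully small) candidate list.
    return [w for w in candidates if w.lower().startswith(typed)][:limit]

# Hypothetical sample keywords, assumed sorted already.
buckets = build_buckets(["stack", "star", "start", "stop", "xx", "zebra"])
print(suggest(buckets, "sta"))
```

The scan only ever touches one bucket once two letters have been typed, so the cost depends on the size of the biggest bucket (the "st" kind) rather than on all 50,000 keywords.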

If we figure out a perfect solution to this, guys, let's start our own Qooqle.

All follow-ups:

gotta be careful about the size of collection - 澳洲老土 (208 bytes) 01/21/2009 00:03:46
