Raised This Month: $32 Target: $400
 8% 

The advantages of a trie over an array for strings storage


Post New Thread Reply   
 
Thread Tools Display Modes
Author Message
joaquimandrade
Veteran Member
Join Date: Dec 2008
Location: Portugal
Old 03-24-2009 , 08:48   The advantages of a trie over an array for strings storage
Reply With Quote #1

Those who are unaware of what a trie is and what's its purpose, when needing a data structure to store strings, use an array. Since you can also use tries for store strings there are best cases of use for each one.



What they are:

An array is a group of sequential bytes in memory subdivided in groups of the same length, being each one of the subdivided a individual place to store data, that you can access by its position in the array (usually referred as index).

A trie is a collection of data organized in the form of a hierarchy. The level zero is a start point. From there on, each level is a group of strings of length equal to the level number and different from each other. Each string can be associated to a value and to strings in the level below that share its characters. If one is associated to a value, it is stored in the trie.

From wikipedia:



In the tree above there are stored the strings:

A , i , to , in , tea , ted , ten , inn

(since they are those who have associated values).


Which one should be used where you can use both:

An array should be used when you know the index of the string that you want to posteriorly access or if you need to iterate over its values.

A trie should be used when you want to posteriorly search for a string.

To show the differences of its usage, in a case where you should use a trie, i created the following situation:

I want to know if in a list of steam Ids there is a given steam Id. I have a list of 69 steam Ids and the given steam Id is not on the list.

When comparing algorithms it must be tested what's referred to as "worst case". The worst case is the case on which you force the algorithm to test every possibility. That's why the steam Id is not on the list.

The code and the profiling results go as attachments.

A note:

While the superiority of tries over arrays for this case should be obvious i think that the results for ArrayGetString are not accurate. So or:
  • They are accurate and i am wrong
  • I made something wrong
  • There's something wrong with ArrayGetString
  • There's something wrong with the profiling.
Attached Files
File Type: sma Get Plugin or Get Source (cellArray.sma - 2050 views - 885 Bytes)
File Type: sma Get Plugin or Get Source (cellTrie.sma - 2273 views - 724 Bytes)
File Type: txt cellArrayResults.txt (2.7 KB, 867 views)
File Type: txt cellTrieResults.txt (2.4 KB, 896 views)
__________________

Last edited by joaquimandrade; 03-25-2009 at 04:32.
joaquimandrade is offline
Arkshine
AMX Mod X Plugin Approver
Join Date: Oct 2005
Old 03-24-2009 , 09:10   Re: The advantages of a trie over an array for strings storage
Reply With Quote #2

Interesting ! Thanks joaquimandrade.
Arkshine is offline
joaquimandrade
Veteran Member
Join Date: Dec 2008
Location: Portugal
Old 03-24-2009 , 09:19   Re: The advantages of a trie over an array for strings storage
Reply With Quote #3

Quote:
Originally Posted by arkshine View Post
Interesting ! Thanks joaquimandrade.
Thank you for having the patience to read it.
__________________
joaquimandrade is offline
Arkshine
AMX Mod X Plugin Approver
Join Date: Oct 2005
Old 03-24-2009 , 09:56   Re: The advantages of a trie over an array for strings storage
Reply With Quote #4

Hope to see more tutos from you !
Arkshine is offline
stupok
Veteran Member
Join Date: Feb 2006
Old 03-24-2009 , 11:42   Re: The advantages of a trie over an array for strings storage
Reply With Quote #5

Thanks for taking the time to write this. It's very concise and well-written.

The results for cellArray are quite strange...
__________________

Last edited by stupok; 03-24-2009 at 19:07.
stupok is offline
joaquimandrade
Veteran Member
Join Date: Dec 2008
Location: Portugal
Old 03-24-2009 , 12:16   Re: The advantages of a trie over an array for strings storage
Reply With Quote #6

Quote:
Originally Posted by stupok View Post
Thanks for taking the time to write this. It's very concise and well-written.

The results for cellArray are quite strange...
Thanks.

About the results of the time spent by ArrayGetString:

I think that they should be far lower. Since they are so closer to those of strcmp my guess is that the problem is with the profiling.
__________________
joaquimandrade is offline
stupok
Veteran Member
Join Date: Feb 2006
Old 03-24-2009 , 14:55   Re: The advantages of a trie over an array for strings storage
Reply With Quote #7

It's not the profiler, it's your code.

Edit: nevermind... I am wrong.
__________________

Last edited by stupok; 03-24-2009 at 19:00.
stupok is offline
Old 03-24-2009, 16:19
joaquimandrade
This message has been deleted by joaquimandrade.
joaquimandrade
Veteran Member
Join Date: Dec 2008
Location: Portugal
Old 03-24-2009 , 16:27   Re: The advantages of a trie over an array for strings storage
Reply With Quote #8

Stupok,

In a trie you associate a value to a string. So first we associated the boolean true to the steamIDs in the list. The we just do
Code:
TrieGetCell(steamIDs,steamID,value)
__________________

Last edited by joaquimandrade; 03-25-2009 at 04:40.
joaquimandrade is offline
stupok
Veteran Member
Join Date: Feb 2006
Old 03-24-2009 , 18:34   Re: The advantages of a trie over an array for strings storage
Reply With Quote #9

Well, now I get it. Oops.
__________________
stupok is offline
joaquimandrade
Veteran Member
Join Date: Dec 2008
Location: Portugal
Old 03-24-2009 , 18:41   Re: The advantages of a trie over an array for strings storage
Reply With Quote #10

Quote:
Originally Posted by stupok View Post
Well, now I get it. Oops.
In fact it is also my bad. I didn't explained how the search then works in a trie, just its structure
__________________

Last edited by joaquimandrade; 03-24-2009 at 18:44.
joaquimandrade is offline
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off

Forum Jump


All times are GMT -4. The time now is 09:51.


Powered by vBulletin®
Copyright ©2000 - 2024, vBulletin Solutions, Inc.
Theme made by Freecode