AlliedModders

AlliedModders (https://forums.alliedmods.net/index.php)
-   Code Snippets/Tutorials (https://forums.alliedmods.net/forumdisplay.php?f=83)
-   -   The advantages of a trie over an array for strings storage (https://forums.alliedmods.net/showthread.php?t=88396)

joaquimandrade 03-24-2009 08:48

The advantages of a trie over an array for strings storage
 
4 Attachment(s)
Those who are unaware of what a trie is and what's its purpose, when needing a data structure to store strings, use an array. Since you can also use tries for store strings there are best cases of use for each one.



What they are:

An array is a group of sequential bytes in memory subdivided in groups of the same length, being each one of the subdivided a individual place to store data, that you can access by its position in the array (usually referred as index).

A trie is a collection of data organized in the form of a hierarchy. The level zero is a start point. From there on, each level is a group of strings of length equal to the level number and different from each other. Each string can be associated to a value and to strings in the level below that share its characters. If one is associated to a value, it is stored in the trie.

From wikipedia:

http://upload.wikimedia.org/wikipedi...xample.svg.png

In the tree above there are stored the strings:

A , i , to , in , tea , ted , ten , inn

(since they are those who have associated values).


Which one should be used where you can use both:

An array should be used when you know the index of the string that you want to posteriorly access or if you need to iterate over its values.

A trie should be used when you want to posteriorly search for a string.

To show the differences of its usage, in a case where you should use a trie, i created the following situation:

I want to know if in a list of steam Ids there is a given steam Id. I have a list of 69 steam Ids and the given steam Id is not on the list.

When comparing algorithms it must be tested what's referred to as "worst case". The worst case is the case on which you force the algorithm to test every possibility. That's why the steam Id is not on the list.

The code and the profiling results go as attachments.

A note:

While the superiority of tries over arrays for this case should be obvious i think that the results for ArrayGetString are not accurate. So or:
  • They are accurate and i am wrong
  • I made something wrong
  • There's something wrong with ArrayGetString
  • There's something wrong with the profiling.

Arkshine 03-24-2009 09:10

Re: The advantages of a trie over an array for strings storage
 
Interesting ! Thanks joaquimandrade.

joaquimandrade 03-24-2009 09:19

Re: The advantages of a trie over an array for strings storage
 
Quote:

Originally Posted by arkshine (Post 787898)
Interesting ! Thanks joaquimandrade.

Thank you for having the patience to read it.

Arkshine 03-24-2009 09:56

Re: The advantages of a trie over an array for strings storage
 
Hope to see more tutos from you ! :)

stupok 03-24-2009 11:42

Re: The advantages of a trie over an array for strings storage
 
Thanks for taking the time to write this. It's very concise and well-written. :up:

The results for cellArray are quite strange...

joaquimandrade 03-24-2009 12:16

Re: The advantages of a trie over an array for strings storage
 
Quote:

Originally Posted by stupok (Post 787993)
Thanks for taking the time to write this. It's very concise and well-written. :up:

The results for cellArray are quite strange...

Thanks.

About the results of the time spent by ArrayGetString:

I think that they should be far lower. Since they are so closer to those of strcmp my guess is that the problem is with the profiling.

stupok 03-24-2009 14:55

Re: The advantages of a trie over an array for strings storage
 
It's not the profiler, it's your code.

Edit: nevermind... I am wrong.

joaquimandrade 03-24-2009 16:27

Re: The advantages of a trie over an array for strings storage
 
Stupok,

In a trie you associate a value to a string. So first we associated the boolean true to the steamIDs in the list. The we just do
Code:

TrieGetCell(steamIDs,steamID,value)

stupok 03-24-2009 18:34

Re: The advantages of a trie over an array for strings storage
 
Well, now I get it. :oops: Oops.

joaquimandrade 03-24-2009 18:41

Re: The advantages of a trie over an array for strings storage
 
Quote:

Originally Posted by stupok (Post 788377)
Well, now I get it. :oops: Oops.

In fact it is also my bad. I didn't explained how the search then works in a trie, just its structure


All times are GMT -4. The time now is 20:27.

Powered by vBulletin®
Copyright ©2000 - 2024, vBulletin Solutions, Inc.