
Object Library - Dynamic data storage, type safety and validation


rhelgeby
Veteran Member
Join Date: Oct 2008
Location: 0x4E6F72776179
03-13-2013, 12:53 - Object Library - Dynamic data storage, type safety and validation

Object Library

Note: This seems to work well for me, but it's still experimental and may have bugs.

Main Features

Key/value object storage manager
Create objects with dynamic content. Data is internally stored in ADT Arrays.

Object data is accessed through get/set functions (which also perform validation).

Mutable or immutable objects
Objects can be either mutable or immutable. Immutable objects can't modify their type (add/remove keys) once created, but data in existing keys can still be modified.

Both use a type descriptor as a template. Mutable objects store a bundled descriptor so they can be modified independently, while immutable objects store a reference to a shared read-only type descriptor to save memory.

Supports built-in and custom data validation
The library supports basic validation constraints such as min/max limits in addition to a callback where the user can do custom validation of the object.

Type safe (as far as it's possible in SourcePawn)
Each key is assigned a type, and you have to use the matching get/set functions. The library checks at runtime that the correct function is used, and the compiler is able to do tag checks.
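For illustration, a rough sketch of what tag-checked access could look like. ObjLib_GetCell and ObjLib_SetCell are real natives in this library (they're discussed later in this thread), but the exact signatures, the Object: tag and the key name used here are my assumptions for this example, not documented API:

Code:
// Hypothetical usage sketch - check object.inc for the real signatures.
stock BumpDamage(Object:weapon)
{
    // The compiler tag-checks the Object: reference, and the library
    // verifies at runtime that "damage" actually stores a cell value.
    new damage = ObjLib_GetCell(weapon, "damage");

    // The matching set function also runs any validation constraints
    // attached to the "damage" key before storing the value.
    ObjLib_SetCell(weapon, "damage", damage + 10);
}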

Import (and validate) data from Valve's KeyValue file format
Creates objects based on the contents of a KeyValue file and a user-defined object type descriptor with optional validation constraints.
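As an example of the kind of input this targets, here is a small hand-written KeyValue file of weapon profiles (the section and key names are made up for illustration):

Code:
"WeaponProfiles"
{
    "ak47"
    {
        "damage"        "36"
        "clip_size"     "30"
        "description"   "Assault rifle"
    }
    "deagle"
    {
        "damage"        "54"
        "clip_size"     "7"
        "description"   "Pistol"
    }
}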

Reflection
Objects or types can be inspected at run time. Loop through keys, get data types or validation constraints.


Why Use This

This is an alternative to enumerated arrays. If you use various types of data sets, such as player profiles or weapon profiles, you don't have to create a specific storage implementation for each data set when using this library. You just need to define types and create objects.

If you have many data sets, a "hard coded" manual solution for each set will result in a lot of repetitive code.

If you also have validation constraints, that code will be repeated too.

This library will help you with everything from reading KeyValue files to storage and validation. You just need to declare types and validation constraints, and the library will enforce them.
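For comparison, this is roughly the kind of hand-written storage the library replaces. The names and limits below are made up, but the pattern - one enumerated array plus getters/setters with inlined validation - has to be repeated for every data set:

Code:
#include <sourcemod>

// Hypothetical hand-rolled storage for one data set.
enum WeaponProfile
{
    WeaponProfile_Damage,
    WeaponProfile_ClipSize,
    String:WeaponProfile_Description[64]
}

new g_WeaponProfiles[32][WeaponProfile];

stock SetWeaponDamage(profile, damage)
{
    // Validation constraints written by hand, and repeated
    // in every setter of every data set.
    if (damage < 1 || damage > 500)
    {
        ThrowError("Invalid damage value: %d", damage);
    }
    g_WeaponProfiles[profile][WeaponProfile_Damage] = damage;
}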


What It Doesn't Do

Memory management
You'll have to make sure objects and types are deleted when no longer in use. Otherwise there will be memory leaks. Read the API documentation carefully to see which functions return resources that must be released again. (Hint: cloning or creating objects and types, and the parser context object.)

It's not a tree structure
Regular KeyValue files use a tree structure. This object manager uses a plain associative array structure where each object has keys mapped to values.

However, a tree structure is indirectly supported by linking object references together. Objects can store references to other objects. Object references have their own data type so that the compiler can do tag checking on them as well.


Resource Usage

Small processing overhead
The main goal isn't a super efficient object manager, but one that's efficient enough. Because of type checking and validation there is a small overhead when modifying data. These checks are basically comparisons of primitive values and shouldn't be an issue with normal usage.

If you have code that's very busy you should consider using buffers or caches in front of the data storage. Use the SourceMod profiler to measure if this really is an issue in your code - before optimizing.

Memory overhead
Since it's a dynamic storage manager, objects need to store meta data and will use a little bit more memory than a static hard coded solution would. But it's also a much more flexible solution.

However, immutable objects are more memory efficient than mutable objects, since immutable objects share their type descriptor between objects of the same type. Mutable objects have their own private type descriptor.

Use immutable objects when you can to reduce memory overhead, especially on object types that aren't modified after creation.

The memory overhead also depends on how much space you reserve for each value entry. Memory will be wasted if you reserve more space than the longest value requires.

An object with 4 strings of 256 bytes will require about 2 KB, including the object array itself, the list of null keys and a type descriptor reference.

The type descriptor for this object uses 8-9 KB, where the trie (key name index) uses 8 KB alone (probably a candidate for the SourceMod team to optimize).
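The reservation cost is easy to see if you create an ADT array directly (plain SourceMod natives here, not the library): every element reserves the full block size, no matter how small the value stored in it is.

Code:
#include <sourcemod>

stock Handle:CreateValueStorage()
{
    // Reserve room for the longest value: 256 bytes = 64 cells per element.
    new blockSize = ByteCountToCells(256);

    // 4 keys at 64 cells each is about 1 KB of raw value storage,
    // even for keys that only ever hold a single cell.
    new Handle:data = CreateArray(blockSize, 4);

    return data;    // The caller must CloseHandle() this later.
}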

A little memory and CPU overhead is the trade-off for not having to write more code yourself. It can still be efficient if used correctly.


Examples

Example code is provided for:
  • Creating Objects
  • Adding Validation Constraints
  • Reflection
  • Parse and Validate KeyValue File
  • Types and Callbacks


Source
The newest code is available in the SourceMod project base on Google Code (the "libraries" folder in the project-components repository). It's still experimental, and parts of this library may be broken while I work on it.

More documentation and full example usage are provided in the docs folder.

An older snapshot of the library collection is attached below:
Attached: projectcomponents-r189.zip (146.0 KB)
__________________
Richard Helgeby

Zombie:Reloaded | PawnUnit | Object Library
(Please don't send private messages for support, they will be ignored. Use the forum.)

Minoost
SourceMod Donor
Join Date: Aug 2011
03-14-2013, 08:34 - Re: Object Library - Dynamic data storage, type safety and validation

Great Job!
alongub
Veteran Member
Join Date: Aug 2009
Location: Israel
04-24-2013, 02:25 - Re: Object Library - Dynamic data storage, type safety and validation

  • Is there any ObjLib_TypeOf method that returns the ObjectType of an Object?
  • How would you implement inheritance? Using ObjLib_CloneType?
  • What about performance and memory usage? Do you have some benchmarks of this vs enumerated arrays?
rhelgeby
Veteran Member
Join Date: Oct 2008
Location: 0x4E6F72776179
04-24-2013, 04:04 - Re: Object Library - Dynamic data storage, type safety and validation

ObjLib_GetTypeDescriptor is the same as TypeOf, although ObjLib_TypeOf would be a better name.

Edit: Actually I've already made ObjLib_TypeOf a long time ago (in object.inc).

Inheritance is not in my plans and probably adds too much complexity. If you've studied my code and have ideas, you're welcome to share them here or in the issue tracker on Google Code.

My first thought is to add a key for a base type in the type descriptor (ObjectType) and then modify the accessor functions to also check the base type when reading or writing data. It might add some complexity to my KeyValue parser when it's validating keys and sections. I'm not sure what side effects I'd get from doing this. Hopefully my code is flexible enough that I could implement it later.

The intention of this library is to load configuration data from KeyValue trees into objects with automatic validation, not to add object oriented programming to SourcePawn. In the Zombie:Reloaded plugin we do this "manually" with enumerated arrays in several places - with a lot of duplicated or very similar code. I want to just declare data structures and constraints and let this library handle the rest.

There obviously is a performance overhead because it has to work with meta data too, but I haven't done any benchmarks so far.

Internally it's all ADT Arrays and some ADT Tries for fast lookup in arrays. When an object is created, all keys for that object are pre-created, so it doesn't add or remove array elements when accessing the object. The arrays are also created with a predefined size where possible, so they can allocate all the memory they need right away.
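A rough sketch of that layout using plain SourceMod natives (not the library's actual code): a trie maps key names to indexes, and every key gets its data array slot up front.

Code:
#include <sourcemod>

stock CreateObjectSketch(&Handle:keyIndex, &Handle:data)
{
    keyIndex = CreateTrie();                        // key name -> array index
    data = CreateArray(ByteCountToCells(256));      // one slot per key

    // Keys are pre-created when the object is built, so get/set
    // never has to add or remove array elements afterwards.
    SetTrieValue(keyIndex, "damage", PushArrayCell(data, 0));
    SetTrieValue(keyIndex, "clip_size", PushArrayCell(data, 0));
}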

Some rough estimates of resource usage:

Read data in object, ObjLib_GetCell for instance:
  • Validation: object reference (0), null key (2), data type (2).
  • Get object type for meta data access (1).
  • Look up key name in trie to get key index (2).
  • Get object data array (1).
  • Get value in object data array (1).

The numbers are how many native calls it does to access data and meta data - 9 native calls in total for each get function. The library does a lot of native calls, but in an old benchmark (attached) I measured the overhead of a simple dynamic function call to be around 200 to 400 nanoseconds, and it's probably a bit faster when calling natives. The parameters I pass through native calls are mostly cells and a few short strings. You'd have to do a very large number of native calls before the call overhead alone affects performance, so I don't consider that an issue.
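The measurement itself can be reproduced with the SourceMod profiler along these lines (a sketch in the spirit of the attached dyncalltest.sp, not the actual benchmark code):

Code:
#include <sourcemod>

public OnPluginStart()
{
    RegServerCmd("sm_callbench", Command_CallBench);
}

public Action:Command_CallBench(args)
{
    #pragma unused args

    new iterations = 100000;
    new Handle:prof = CreateProfiler();
    new Function:func = GetFunctionByName(GetMyHandle(), "DummyTarget");

    StartProfiling(prof);
    for (new i = 0; i < iterations; i++)
    {
        // One dynamic function call per iteration.
        Call_StartFunction(GetMyHandle(), func);
        Call_PushCell(i);
        Call_Finish();
    }
    StopProfiling(prof);

    // Average time per call, in seconds.
    PrintToServer("Average call time: %f s", GetProfilerTime(prof) / float(iterations));

    CloseHandle(prof);
    return Plugin_Handled;
}

public DummyTarget(value)
{
    #pragma unused value
    // Intentionally empty; only the call overhead is measured.
}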

Then the only performance concern left is all the meta data it has to access. That is mostly getting a single cell from an array and comparing it to something. The trie lookup is the most expensive part - O(keyLength) if I'm correct - which is pretty fast when key names are short (less than ~30 characters, I suppose).

Because of this, the order of get functions is O(keyLength), which is much better than stuff like O(log n) and O(n^2). Correct me if I'm wrong here.
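Put together, the lookup part of the read path listed above boils down to roughly this (a sketch with plain SourceMod natives; the null-key and data-type validation steps are left out):

Code:
#include <sourcemod>

stock any:GetCellSketch(Handle:keyIndex, Handle:data, const String:key[])
{
    // Trie lookup: key name -> array index, O(keyLength).
    new index;
    if (!GetTrieValue(keyIndex, key, index))
    {
        ThrowError("Invalid key: %s", key);
        return 0;   // Not reached.
    }

    // Constant time read from the object data array.
    return GetArrayCell(data, index);
}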

Write data (ObjLib_SetCell):
  • Validation: object reference (0), data type (2).
  • Get object type for meta data access (1).
  • Look up key name in trie to get key index (2).
  • Get constraints object, if any (1). If none it will skip all constraint stuff below.
  • Get constraint type (1).
  • Delegate work to correct constraint handler (x).
  • Get object data array (1).
  • Store value in object data array (1).
  • Remove null flag (1).

It's quite similar to reading data, but now it has to go through constraint handlers to validate the data being set, if any. These will read various constraint settings from the constraint object and validate the value being set. Usually that's O(1) stuff, or O(keyLength) when it reads keys. It depends on what kind of constraint each key has. Some constraints allow callbacks where you can do custom validation.
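A cell min/max constraint, for instance, is just a couple of comparisons against the settings stored in the constraint object (a sketch; the parameter names are made up):

Code:
// Sketch of a cell constraint handler - O(1) comparisons.
stock bool:CheckCellConstraintSketch(value, minValue, maxValue)
{
    // minValue/maxValue stand in for settings read from the constraint object.
    return (value >= minValue && value <= maxValue);
}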

The goal isn't to have a super efficient object library, but one that's good enough. I plan to figure out a cache solution that can be used to dump and load object data to and from enumerated arrays. Then you can use enumerated arrays directly in hot areas and flush/update the cache when ready.
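A sketch of how such a cache could look on the plugin side - a plain enumerated array used in the hot path, with the object storage only touched when flushing or updating. The cache feature doesn't exist yet, so the layout and names below are purely illustrative:

Code:
#include <sourcemod>

// Hypothetical cache layout for a hot code path.
enum PlayerCache
{
    PlayerCache_Health,
    PlayerCache_Speed
}

new g_PlayerCache[MAXPLAYERS + 1][PlayerCache];

stock bool:IsPlayerLowHealth(client)
{
    // Hot path: read the plain enumerated array directly instead of
    // going through ObjLib get functions and their validation.
    return g_PlayerCache[client][PlayerCache_Health] < 20;
}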

There's an obvious memory overhead. It stores meta data about keys, and apparently the trie lookup index alone uses 8 KB with just one small key and value. Type descriptors use the most memory, but they can be shared between objects. It also creates a lot of array handles, so I'm worried it would be hell to troubleshoot memory leaks (a feature in SM for adding a short description to handles would help a lot, but again, performance).

Objects use one data array for storing both cells and strings. Lots of space is wasted for non-string keys, since it has to reserve space for string keys in every array element. Not sure if that is the case for enumerated arrays, but there you have to reserve space in the first dimension. Arrays and strings in the object data array could be moved to their own data array, but that would mess up key indexes, and I haven't figured out an elegant solution there. Internally the library relies heavily on accessing keys by index.

This is what you get with abstract libraries. You get a lot for free, but in turn there is some small overhead and more memory use (though we're still only talking about kilobytes, at most a few megabytes).

Recently I made a huge commit that adds lookup constraints: it reads a name (for instance a class name in the ZR plugin), looks it up through a custom callback or a list/trie, and replaces it with an actual (class) object reference, with validation. It can also convert a lookup name to a number, an enumerated value, an array, or another string. This is achieved by just declaring some meta data and writing the lookup callback; my KeyValue parser handles it automatically. In the old ZR plugin this took a lot of messy code; now you just need the declaration.

The attached plugin is an old benchmark I did with dynamic function calls. It does a lot of native calls so native call overhead won't be higher than this.
Attached: dyncalltest.sp (4.8 KB)
rhelgeby
Veteran Member
Join Date: Oct 2008
Location: 0x4E6F72776179
04-25-2013, 08:27 - Re: Object Library - Dynamic data storage, type safety and validation

Made some benchmarks with graphs: Object Library Benchmark

Based on this revision: r195

Server CPU: AMD Athlon(tm) 64 X2 Dual Core Processor 4600+
RAM: 4GB

Basically we're talking about iteration times below 10 µs. Constraints and mutable objects are most expensive, but still quite fast. The rest is 1-2 µs.

What's surprising is that the iteration time isn't constant. It's doing the same thing up to a million times, but there are spikes and jumps. There might be something in my environment though - other processes or scheduled tasks affecting this. Some of the graphs also amplify the details even though the numbers are pretty stable.

Since I double the iterations for each test, the total time also increases exponentially. If there were any badly optimized code, it would grow much faster.

The objects in this test have only one key, to test the general overhead when reading and writing data. With multiple keys, key name lookup will be a tiny bit slower.
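The doubling scheme is just an outer loop around the timed section, roughly like this (a sketch of the harness shape, not the actual benchmark code):

Code:
#include <sourcemod>

stock RunScalingBench()
{
    new Handle:prof = CreateProfiler();

    // Double the iteration count for each pass. If any code path scaled
    // badly, the pass time would grow faster than 2x per step.
    for (new iterations = 1024; iterations <= 1048576; iterations *= 2)
    {
        StartProfiling(prof);
        for (new i = 0; i < iterations; i++)
        {
            // ... one ObjLib get or set call on a single-key object ...
        }
        StopProfiling(prof);

        PrintToServer("%d iterations: %f s", iterations, GetProfilerTime(prof));
    }

    CloseHandle(prof);
}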
Ermert1992
Member
Join Date: Jan 2012
Location: Germany
05-24-2013, 09:25 - Re: Object Library - Dynamic data storage, type safety and validation

Well Done!
rhelgeby
Veteran Member
Join Date: Oct 2008
Location: 0x4E6F72776179
05-24-2013, 12:02 - Re: Object Library - Dynamic data storage, type safety and validation

Thank you. I recently started working on typed collection objects and had to refactor constraint handlers. It's not tested yet and constraints may be broken.