Quote:
all this "optimization" ... I'm trying to figure out if all this really served a purpose.
|
If you meant Valve's GCC optimizations, I'm not sure. Someone who's looked at them should comment.
If you meant our caching optimizations, which you mentioned removing and then adding back in: it's a trivial 200-line hash table implementation, 30% of which is a copy-pasted hash function.
My benchmarks showed that a single string comparison costs about 40 nanoseconds, so a worst-case hit (roughly 50,000 comparisons) would cost 2ms. That's pretty bad. With the hash table, a lookup is essentially free. Adding an entry to the hash table costs 1,100 nanoseconds, so converting the whole thing up front costs about 60ms. That's really cheap, but just to be safe we fill the table incrementally as we search.
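To make the incremental fill concrete, here is a rough sketch of the idea, not the actual code. The Symbol struct, the fixed CACHE_SLOTS size, and the FNV-1a hash are all stand-ins I picked for illustration; the real implementation uses its own hash function and reads real ELF symbol entries. A lookup tries the hash table first, and on a miss it resumes the linear scan where the last one stopped, caching every symbol it passes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one entry of the ELF symbol table. */
typedef struct { const char *name; void *addr; } Symbol;

#define CACHE_SLOTS 1024  /* power of two; must exceed the symbol count */

typedef struct {
    const Symbol *syms;                /* the linear symbol table */
    size_t count;
    size_t scanned;                    /* symbols [0, scanned) are cached */
    const Symbol *slots[CACHE_SLOTS];  /* open addressing, NULL = empty */
} SymbolCache;

/* FNV-1a, chosen only for the sketch. */
static uint32_t fnv1a(const char *s) {
    uint32_t h = 2166136261u;
    while (*s) { h ^= (unsigned char)*s++; h *= 16777619u; }
    return h;
}

static void cache_init(SymbolCache *c, const Symbol *syms, size_t count) {
    assert(count < CACHE_SLOTS);  /* keeps at least one slot empty */
    memset(c, 0, sizeof *c);
    c->syms = syms;
    c->count = count;
}

static void cache_insert(SymbolCache *c, const Symbol *sym) {
    uint32_t i = fnv1a(sym->name) & (CACHE_SLOTS - 1);
    while (c->slots[i] != NULL) i = (i + 1) & (CACHE_SLOTS - 1);
    c->slots[i] = sym;
}

static void *cache_lookup(SymbolCache *c, const char *name) {
    /* Fast path: probe the hash table. */
    uint32_t i = fnv1a(name) & (CACHE_SLOTS - 1);
    while (c->slots[i] != NULL) {
        if (strcmp(c->slots[i]->name, name) == 0) return c->slots[i]->addr;
        i = (i + 1) & (CACHE_SLOTS - 1);
    }
    /* Slow path: resume the linear scan, caching each symbol we pass. */
    while (c->scanned < c->count) {
        const Symbol *sym = &c->syms[c->scanned++];
        cache_insert(c, sym);
        if (strcmp(sym->name, name) == 0) return sym->addr;
    }
    return NULL;  /* whole table scanned, symbol does not exist */
}
```

Each symbol is string-compared during the linear scan at most once; after that it is found through the table, so the full 2ms walk is paid at most once in total rather than once per lookup.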
You can do even better by getting rid of the malloc() call, which is easily doable by pre-allocating the hash table memory, sized from the ELF section size. I tried this just now and it's about a 5X win, bringing the cost of a complete table fill down to 14ms.
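Roughly what I mean by pre-allocating, as a sketch under assumptions rather than the real code: the Table and Node types, the function names, and the FNV-1a hash are made up for illustration, and symbol_count is assumed to be the symbol section size divided by the size of one symbol entry. All nodes come out of one block allocated up front, so adding an entry never calls malloc().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct Node {
    const char *name;
    void *addr;
    struct Node *next;   /* chain within a bucket */
} Node;

typedef struct {
    Node **buckets;
    size_t nbuckets;     /* power of two */
    Node *pool;          /* one block holding every node we will need */
    size_t pool_cap;
    size_t pool_used;
} Table;

/* FNV-1a, chosen only for the sketch. */
static uint32_t hash_name(const char *s) {
    uint32_t h = 2166136261u;
    while (*s) { h ^= (unsigned char)*s++; h *= 16777619u; }
    return h;
}

/* The only allocations happen here, sized from the symbol count. */
static int table_init(Table *t, size_t symbol_count) {
    assert(symbol_count > 0);
    size_t nb = 16;
    while (nb < symbol_count) nb <<= 1;
    t->buckets = calloc(nb, sizeof *t->buckets);
    t->pool = malloc(symbol_count * sizeof *t->pool);
    if (!t->buckets || !t->pool) {
        free(t->buckets);
        free(t->pool);
        return -1;
    }
    t->nbuckets = nb;
    t->pool_cap = symbol_count;
    t->pool_used = 0;
    return 0;
}

/* Takes the next node from the pool instead of calling malloc(). */
static void table_add(Table *t, const char *name, void *addr) {
    assert(t->pool_used < t->pool_cap);
    Node *n = &t->pool[t->pool_used++];
    size_t b = hash_name(name) & (t->nbuckets - 1);
    n->name = name;
    n->addr = addr;
    n->next = t->buckets[b];
    t->buckets[b] = n;
}

static void *table_find(const Table *t, const char *name) {
    size_t b = hash_name(name) & (t->nbuckets - 1);
    for (const Node *n = t->buckets[b]; n != NULL; n = n->next)
        if (strcmp(n->name, name) == 0) return n->addr;
    return NULL;
}

static void table_free(Table *t) {
    free(t->buckets);
    free(t->pool);
}
```

Since the table can never hold more entries than there are symbols, the pool cannot run out, and teardown is two free() calls instead of one per entry.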
I don't know what your workloads are like, but the cache is not necessary for the dlhsym() kernel to work. SM always assumes the worst case since it's extensible. My guess is that all of this happens on server startup, so as long as it stays under half a second or so, it won't matter (but it's not the last cookie that makes you fat).