
socket_recv() hangs?


xOR
Veteran Member
Join Date: Jun 2006
Location: x-base.info
Old 07-04-2006 , 09:37   Re: socket_recv() hangs
Reply With Quote #11

although i said i was too tired i still changed these 2 things before going to sleep yesterday:

- as suggested by Hawk i left out the timeout value in socket_change() so the default value is used
- i added the socket number to my debug messages

the results after this night are:
- the server is still running and didn't crash yet, but this doesn't mean anything until it is stable for several days
- the socket numbers stayed the same for the whole night and never increased
__________________
Got more than one HL1 (CS, DoD, NS, TS, TFC, HLDM...) server? Check:
xOR
Veteran Member
Join Date: Jun 2006
Location: x-base.info
Old 07-05-2006 , 17:27   Re: socket_recv() hangs
Reply With Quote #12

server is running for 48 hours now without crash.
plus i noticed that i forgot to set mp_timelimit to something so the server was running
- 48 hours without restart
- with checking 8 servers
- with checking them every 5 seconds
stable and without a crash.

i will wait some more days though. i still wouldn't be surprised to see it crashing on the third day =/
__________________
xOR
Veteran Member
Join Date: Jun 2006
Location: x-base.info
Old 07-06-2006 , 19:29   Re: socket_recv() hangs?
Reply With Quote #13

my test server is still running with it but Dominion's servers kept on crashing. so i went on one of his servers with him and we did some testing.
he has 4 servers and server 1 and 4 are on the same machine, as well as server 2 and 3. from his debug log i noticed that server 4 is crashing when trying to receive the UDP answer from server 1. so we went on server 4 and deleted server 1 from the server checking list and TA-DAH: no crashes.
with server 1 added to the list both server 1 and 4 are crashing after some requests (3-5 or so). the socket number does NOT increase. i used default timeout value for socket_change. and i rechecked again with debug messages, it is still socket_recv where the server just hangs.

to sum this up:
the server is only hanging if the server it is receiving the UDP packet from is the same machine (thus on the same IP). of course both servers were on different ports so we had this situation:

server:27400 --- UDP request ---> server:27500
server:27500 --- UDP answer ---> server:27400

this works fine for like 3 times, then both servers lock up.



i really don't think that i can do anything about this myself, it seems to be an AMXX or Metamod bug or even worse: a bug in the linux sockets. oh and btw, he had crashes with AMXX 1.60 as well as 1.75a.

but first i want to hear your suggestions. should i post this to the bug forums or is there still a chance i did something wrong?
remember, the other plugin had the same problem. but such a problem was never reported, because it only shows up when a server is receiving a packet from itself.
__________________
jtp10181
Veteran Member
Join Date: May 2004
Location: Madison, WI
Old 07-06-2006 , 20:00   Re: socket_recv() hangs?
Reply With Quote #14

could be a firewall issue, blocking something and making the sockets module fuck up. Also, try using 127.0.0.1:PORT for the local IPs, just to test and see if it works. I might mess around with this later also to test it on windows. FYI I really doubt using the default timeout value will do anything useful... using a timeout of 1 is the safest way to avoid the server lagging while waiting for data.
__________________
xOR
Veteran Member
Join Date: Jun 2006
Location: x-base.info
Old 07-06-2006 , 20:27   Re: socket_recv() hangs?
Reply With Quote #15

Quote:
Originally Posted by jtp10181
could be a firewall issue, blocking something and making the sockets module fuck up. Also, try using 127.0.0.1:PORT for the local IPs, just to test and see if it works. I might mess around with this later also to test it on windows. FYI I really doubt using the default timeout value will do anything useful... using a timeout of 1 is the safest way to avoid the server lagging while waiting for data.
hmm firewall...but why then hang when receiving? the packet wouldn't arrive there if the firewall was blocking it.
anyway, trying loopback IP is a good idea.

unfortunately i don't have the same problem on my server. servers there can request packets from themselves without problems.

anyway, the timeout value was the only thing i changed when my server stopped crashing (at least for 3 days). so now that i changed it back to 1, i must assume that the crash problem is still there. i will see what happens overnight.
__________________
BAILOPAN
Join Date: Jan 2004
Old 07-07-2006 , 03:09   Re: socket_recv() hangs?
Reply With Quote #16

remember:

socket_change() will return true if there is data waiting in the buffer for a given set of descriptors.

Obviously, doing while (socket_change()) will infinite loop, because you have not called the socket_recv() function yet.

These sockets are blocking and thus any function which requests more than what is in the buffer will block.
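
A minimal sketch of this behaviour in plain C, assuming socket_change() is a thin wrapper around select() (the module source quoted later in this thread suggests so). The helper names has_data, open_loopback_udp and send_to_loopback are made up for illustration and are not part of the AMXX sockets module.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical stand-in for socket_change(): 1 if fd becomes readable
 * within timeout_usec microseconds, 0 on timeout or error. */
static int has_data(int fd, long timeout_usec)
{
    fd_set rfds;
    struct timeval tv;

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = timeout_usec / 1000000;
    tv.tv_usec = timeout_usec % 1000000;
    return select(fd + 1, &rfds, NULL, NULL, &tv) > 0;
}

/* Hypothetical helper: UDP socket bound to 127.0.0.1 on a port chosen by
 * the OS. The chosen port is stored in *port. Returns the fd or -1. */
static int open_loopback_udp(unsigned short *port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *port = ntohs(addr.sin_port);
    return fd;
}

/* Hypothetical helper: send one datagram to 127.0.0.1:port.
 * Returns the number of bytes sent or -1. */
static int send_to_loopback(int fd, unsigned short port, const char *msg)
{
    struct sockaddr_in dst;

    memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;
    dst.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    dst.sin_port = htons(port);
    return (int)sendto(fd, msg, strlen(msg), 0,
                       (struct sockaddr *)&dst, sizeof(dst));
}
```

With a receiver from open_loopback_udp() and one datagram sent to it, has_data() keeps reporting 1 on every call until recv() has actually taken the datagram out of the buffer. That is why a loop on the readiness check alone never ends.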
__________________
egg
xOR
Veteran Member
Join Date: Jun 2006
Location: x-base.info
Old 07-07-2006 , 05:03   Re: socket_recv() hangs?
Reply With Quote #17

Quote:
Originally Posted by BAILOPAN
remember:

socket_change() will return true if there is data waiting in the buffer for a given set of descriptors.

Obviously, doing while (socket_change()) will infinite loop, because you have not called the socket_recv() function yet.

These sockets are blocking and thus any function which requests more than what is in the buffer will block.
i am well aware of this fact and of course never had a while loop with socket_change as condition where socket_recv wasn't called within the loop.

it's always the same procedure:
  1. socket_open
  2. socket_send
  3. let some time go by
  4. while (socket_change()) socket_recv()
  5. socket_close
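
The five steps above can be sketched in plain C roughly like this. This is an illustration under assumptions, not the plugin's code: it talks to 127.0.0.1 only, select() with a timeout stands in for "let some time go by", and all function names (open_udp_to_loopback, socket_ready, drain_replies, query_once) are made up. The loop also stops as soon as recv() returns 0 or less, so a failed receive cannot keep it spinning.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Step 1: open a UDP socket connected to 127.0.0.1:port (hypothetical). */
static int open_udp_to_loopback(unsigned short port)
{
    struct sockaddr_in dst;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;
    dst.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    dst.sin_port = htons(port);
    if (connect(fd, (struct sockaddr *)&dst, sizeof(dst)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Readiness check: 1 only if select() reports readable data,
 * 0 on timeout and also on error. */
static int socket_ready(int fd, long timeout_usec)
{
    fd_set rfds;
    struct timeval tv;

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = timeout_usec / 1000000;
    tv.tv_usec = timeout_usec % 1000000;
    return select(fd + 1, &rfds, NULL, NULL, &tv) > 0;
}

/* Steps 3 and 4: wait up to timeout_usec per packet and read while data
 * is reported. Stops on timeout, on recv() <= 0, or after max_packets.
 * Returns the number of packets read; buf holds the last one. */
static int drain_replies(int fd, char *buf, size_t size,
                         long timeout_usec, int max_packets)
{
    int packets = 0;

    while (packets < max_packets && socket_ready(fd, timeout_usec)) {
        ssize_t n = recv(fd, buf, size, 0);
        if (n <= 0)
            break; /* error or empty packet: do not trust the socket */
        packets++;
    }
    return packets;
}

/* Steps 1 to 5 in order. Returns the number of reply packets,
 * or -1 if opening or sending failed. */
static int query_once(unsigned short port, const char *request,
                      char *reply, size_t reply_size, long timeout_usec)
{
    int packets;
    int fd = open_udp_to_loopback(port);              /* 1. open  */

    if (fd < 0)
        return -1;
    if (send(fd, request, strlen(request), 0) < 0) {  /* 2. send  */
        close(fd);
        return -1;
    }
    packets = drain_replies(fd, reply, reply_size,    /* 3. wait  */
                            timeout_usec, 16);        /* 4. recv  */
    close(fd);                                        /* 5. close */
    return packets;
}
```

The cap of 16 packets and the per-packet timeout are arbitrary choices for the sketch. The point is only that every exit from the loop is bounded, whatever the readiness check claims.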


it looks like the crash is a problem that only occurs when 2 conditions come together, 1 of them being that the server is socket_recv()eiving from itself (so source IP = destination IP) and the other condition i don't know. if i knew, i would have found the error

i would really appreciate it if you could have a short look at my plugin source just to tell whether i made some common mistake. though i don't believe it, because no one else has seen an error yet. plus the plugin has been working stable on my 4 servers for almost two weeks now. just that 2 people reported crashes, Dominion being one of them.
more and more i start to think that this is a problem related to their servers, either with the firewall like jtp10181 suggested or with linux socket libs or what-do-i-know.
someone else just reported the plugin as stable again and he seems to have 2 servers on one machine as well.

i am a little at a loss with this. i really would not like to just tell those people to set the server checking feature to off because the error is due to an unknown problem in their server and not in my plugin. but actually, that's how it currently looks
__________________

Last edited by xOR; 07-07-2006 at 05:13.
xOR
Veteran Member
Join Date: Jun 2006
Location: x-base.info
Old 01-06-2008 , 02:51   Re: socket_recv() hangs?
Reply With Quote #18

quite an old topic, but the problem never got fixed since then. so i was surprised when i just stumbled over this.
hey, there is finally some chance that the problem is fixed \o/

i just wanted to let the people know who are still observing the post.
__________________
xOR
Veteran Member
Join Date: Jun 2006
Location: x-base.info
Old 04-17-2009 , 22:53   Re: socket_recv() hangs?
Reply With Quote #19

unfortunately the link i posted before doesn't work anymore.

but that doesn't matter: the problem is gone since that bug entry was fixed, and comparing the source code of socket_change shows why pretty clearly. there is only one change that matters since v1.75 (the "> 0" comparison):
    if (select(socket+1, &rfds, NULL, NULL, &tv) > 0)
        return 1; // Ok, new data, return it
    else
        return 0; // No new data, return it

the explanation is pretty simple. select() returns -1 if there is a socket error, so with the old code socket_change would act on every socket error as if there was data to receive. when you then call socket_recv it will hang on you, because there is actually no data waiting in the socket.

now from here on systems differ (that's why it didn't hang for everyone). some detect the error pretty fast and return from socket_recv, so people on those wouldn't notice any big problems. some wait longer, and the plugin seems to make the server laggy. and for some socket_recv waits forever. maybe that depends on the operating system, the version of the C sockets implementation or the firewall, i don't know.
for those where socket_recv did return, it returned an empty packet, which means there is a socket error. the plugin should then handle this special case (e.g. show an error, close and try to reopen the socket...).
that's also what jtp10181 had encountered when he said he received an empty socket i guess.
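
To make the difference between the two checks concrete, here is a small C sketch. The "old" variant is an assumption reconstructed from the explanation above (select() result used directly as the condition), not a copy of the v1.75 source. The names select_readable, change_old, change_fixed and make_closed_fd are made up for the example.

```c
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Raw select() result for a single descriptor with a zero timeout:
 * > 0 means readable, 0 means nothing waiting, -1 means error. */
static int select_readable(int fd)
{
    fd_set rfds;
    struct timeval tv = {0, 0};

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    return select(fd + 1, &rfds, NULL, NULL, &tv);
}

/* Assumed pre-fix behaviour: any non-zero result counts as "new data",
 * so the -1 that select() returns on error is reported as data. */
static int change_old(int fd)
{
    return select_readable(fd) ? 1 : 0;
}

/* Behaviour of the check quoted above: only a positive result counts. */
static int change_fixed(int fd)
{
    return select_readable(fd) > 0 ? 1 : 0;
}

/* Hypothetical helper: returns a descriptor number that is certainly
 * invalid, by opening a socket and closing it again right away. */
static int make_closed_fd(void)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd >= 0)
        close(fd);
    return fd;
}
```

On Linux, select() on the closed descriptor fails with -1, so change_old() claims there is data while change_fixed() does not. A receive call made after the old check has nothing to read, which fits the hangs and empty packets described above.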

well, just some details for those who were as curious as me.
__________________
SonicSonedit
Veteran Member
Join Date: Nov 2008
Location: Silent Hill
Old 04-23-2011 , 10:06   Re: socket_recv() hangs?
Reply With Quote #20

bump.

The following problems were detected:
In some cases the socket gets corrupted after receiving data from it, for an unknown reason.
As xOR said, socket_change actually returns 1 on socket error. It even returns 1 on a closed socket. But there is more! socket_recv may or may not crash your server (depending on which version you use), but if there was an error, it actually returns -1 and an empty buffer. So the conditional statement if (socket_recv) is incorrect, use if (socket_recv>0) instead. You can also use if (socket_recv<0) to detect an error.
So, the only stable way to ensure the socket works correctly is to reset (close/open) it each time after socket_recv. But i still don't understand why it gets corrupted after socket_recv and why it works so fine after being reset. It really should allocate a different port instead of using the same port, so incoming packets should be lost, but they are not, and you can read them with socket_recv
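
The return value point carries over to plain C sockets, where recv() also returns -1 on error. A minimal sketch, assuming socket_recv passes that value through as described above; the two helper names are made up for the example.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Mirrors "if (socket_recv)": any non-zero value counts as success,
 * including the -1 that signals an error. */
static int naive_has_data(ssize_t n)
{
    return n ? 1 : 0;
}

/* Mirrors "if (socket_recv > 0)": only a positive byte count is data.
 * Zero (empty packet) and negative values (error) are both rejected. */
static int safe_has_data(ssize_t n)
{
    return n > 0 ? 1 : 0;
}
```

Feeding both helpers the result of recv() on a closed descriptor gives 1 for the naive check and 0 for the safe one, so only the second lets the caller notice the error and reset the socket.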

Any explanation will be really appreciated!
__________________


Last edited by SonicSonedit; 04-23-2011 at 10:11.