el_psycho
Member
Join Date: Feb 2012
Location: Vancouver, Canada
09-10-2012, 18:35   Re: [L4D & L4D2] Custom Player Stats v1.4B117
#1537

TL;DR go to EDIT 2

Hi muukis, I've been looking into the problem where some special characters cause names to show up blank. I'm fairly certain it's an encoding problem. UTF-8 works just fine, but the same special character encoded in ANSI causes the issue. I know UTF-8 is backwards compatible with ASCII, but I don't know if it is with ANSI. As far as I've been able to find, ASCII and ANSI (or Windows-1252) differ in some aspects.

Anyway, the PHP code assumes the content of the blob is UTF-8, so it can't handle ANSI correctly, and that causes the problem. I wanted to find a way to detect which encoding a string in a blob is using.
Although I don't have a final solution, I found some promising information here:
http://php.net/manual/en/function.mb...t-encoding.php
That function detects the encoding used for a string. Once we know a name is ANSI, we can take appropriate action, like converting the value to UTF-8 for display on the page.
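For what it's worth, here is a small sketch of why an ANSI name can come out blank. It assumes the page escapes names with htmlentities() and a UTF-8 charset; render_name() is a made-up helper for illustration, not something from the stats code. htmlentities() returns an empty string when the input is not valid in the given charset, unless ENT_IGNORE or ENT_SUBSTITUTE is passed.

```php
<?php
// Hypothetical helper: escapes a name the way a UTF-8-only page would.
function render_name($name) {
    return htmlentities($name, ENT_COMPAT, 'UTF-8');
}

$utf8 = "Ren\xC3\xA9"; // "René" encoded as UTF-8 (é = C3 A9)
$ansi = "Ren\xE9";     // "René" encoded as Windows-1252 (é = E9)

var_dump(render_name($utf8)); // string(11) "Ren&eacute;"
var_dump(render_name($ansi)); // string(0) ""
```

The lone E9 byte is not a complete UTF-8 sequence, so the whole name is rejected rather than just that one character.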

Take a look and let me know if this is a step in the right direction.
I know what the problem is but I'm not sure how to fix it.

EDIT:
I'm now 100% sure the problem is caused by ANSI encoded characters. I can reproduce the problem at will.

I'm having trouble implementing a fix in the PHP. My PHP knowledge is very limited. This is what I'm trying to do:

- Use the mb_detect_encoding function to detect whether the encoding is Windows-1252 (ANSI).
PHP Code:
if (mb_detect_encoding($NAMEstring, 'windows-1252', true) !== FALSE)
- Then, if it is ANSI, convert the string from Windows-1252 to UTF-8 using iconv.
PHP Code:
$NAMEstring = iconv('windows-1252', 'utf-8//TRANSLIT', $NAMEstring);
It's hard for me to figure out where to do this.
You can read more about these functions here:
mb_detect_encoding
iconv
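A minimal sketch of the detect-then-convert idea, with one change from the plan above: instead of asking whether the string is Windows-1252 (almost any byte string is valid in a single-byte charset, so that check says very little), it asks whether the string is valid UTF-8 and treats everything else as Windows-1252. That fallback is an assumption, and name_to_utf8() is a made-up name. I also left out //TRANSLIT, since every Windows-1252 character exists in Unicode and nothing needs transliterating on the way to UTF-8.

```php
<?php
// Hypothetical helper: returns $name as UTF-8.
// Assumption: anything that is not valid UTF-8 is Windows-1252.
function name_to_utf8($name) {
    if (mb_check_encoding($name, 'UTF-8')) {
        return $name; // already valid UTF-8 (plain ASCII included)
    }
    $converted = @iconv('Windows-1252', 'UTF-8', $name);
    // iconv() returns false if the conversion fails; fall back to an empty name.
    return ($converted === false) ? '' : $converted;
}
```

It would be called on the name right after the row is fetched, e.g. `$row['name'] = name_to_utf8($row['name']);`, so the rest of the page can keep assuming UTF-8.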

EDIT 2:
Good news!!! I got my fix for the empty names working! I didn't need iconv after all.

Fix example for playerlist.php starting at line 160:
PHP Code:
while ($row = mysql_fetch_array($result))
{
    $line = createtablerowtooltip($row, $i);

    // Names that are not valid UTF-8 are assumed to be Windows-1252 (ANSI).
    if (mb_detect_encoding($row['name'], 'UTF-8', true) === FALSE)
        $line .= "<td align=\"center\">" . number_format($i) . "</td><td>" . ($showplayerflags ? $ip2c->get_country_flag($row['ip']) : "") . "<a href=\"player.php?steamid=" . $row['steamid'] . "\">" . htmlentities($row['name'], ENT_COMPAT, "cp1252") . "</a></td>";
    else
        $line .= "<td align=\"center\">" . number_format($i) . "</td><td>" . ($showplayerflags ? $ip2c->get_country_flag($row['ip']) : "") . "<a href=\"player.php?steamid=" . $row['steamid'] . "\">" . htmlentities($row['name'], ENT_COMPAT, "UTF-8") . "</a></td>";

    $line .= "<td>" . number_format($row['real_points']) . "</td>";
    $line .= "<td>" . formatage($row['real_playtime'] * 60) . "</td>";
    $line .= "<td>" . formatage(time() - $row['lastontime']) . " ago</td></tr>\n";
    //$line .= "<td>" . $query . "</td></tr>\n";

    $arr_players[] = $line;
    $i++;
}
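The two branches above differ only in the charset passed to htmlentities(), so the check could be pulled into a small helper and reused in the other pages that print names. This is just a sketch: name_html() is a made-up name, and the "not UTF-8 means cp1252" rule is the same assumption as in the fix.

```php
<?php
// Hypothetical helper: escapes a player name for HTML output.
// Assumption: a name that is not valid UTF-8 is Windows-1252 (cp1252).
function name_html($name) {
    $charset = (mb_detect_encoding($name, 'UTF-8', true) === false) ? 'cp1252' : 'UTF-8';
    return htmlentities($name, ENT_COMPAT, $charset);
}
```

With that, the name cell only needs one version of the line, ending in `name_html($row['name']) . "</a></td>";`.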
EDIT 3:

OK, this works well but it's not perfect. I've noticed an issue with a couple of names using Cyrillic characters that are not detected as UTF-8. They do show up in the lists, but the names have the wrong characters (because I assumed that if they're not UTF-8 then they're Windows-1252). It's probably a matter of properly detecting which charset those names are using (which apparently isn't easy in PHP) and then applying a proper fix that takes that into account.

The following function looked like it could be the answer to our problem; we would just need to add more charsets to the array. But I can't get it to work for some reason. UPDATE: scratch that. It has the same problem as mb_detect_encoding: it can't tell the difference between Windows-1252 and other ISO-8859-* charsets.
PHP Code:
<?php
function detect_encoding($string) {
  static $list = array('utf-8', 'windows-1251');

  foreach ($list as $item) {
    $sample = iconv($item, $item, $string);
    if (md5($sample) == md5($string))
      return $item;
  }
  return null;
}
?>
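
To show what goes wrong with the round-trip idea, here is a small sketch. roundtrip_matches() is a made-up name, and it assumes the iconv build knows the charset names used. It returns every candidate that survives the round trip instead of only the first one. In a single-byte charset nearly every byte value maps to some character, so the same bytes pass for more than one of them.

```php
<?php
// Returns every candidate charset whose iconv round trip leaves $bytes unchanged.
function roundtrip_matches($bytes, array $candidates) {
    $matches = array();
    foreach ($candidates as $charset) {
        // @ hides the notice iconv raises on bytes that are invalid in $charset.
        $sample = @iconv($charset, $charset, $bytes);
        if ($sample === $bytes) {
            $matches[] = $charset;
        }
    }
    return $matches;
}

// "Привет" in Windows-1251; the same bytes read as Windows-1252 give "Ïðèâåò".
$name = "\xCF\xF0\xE8\xE2\xE5\xF2";
print_r(roundtrip_matches($name, array('UTF-8', 'Windows-1251', 'Windows-1252')));
```

UTF-8 is ruled out, but both Windows-1251 and Windows-1252 match, so the bytes alone can't say which one the player actually used. Telling them apart would need some other hint, such as which letters are likely to appear in a name.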

Last edited by el_psycho; 09-17-2012 at 07:40.