UTF-8 Character Map

Written by: NetworkError, on 07-01-2009 12:00
Last update: 21-05-2009 10:16
Published in: Public, Technical Wootness


Update:
I have added search functionality. You can search for any single character among the first 1,000,000 Unicode code points, and the map will take you to the correct page and highlight the matching row. I've updated the source code below.
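
The search in the script below works by scanning code points one at a time. As a possible alternative, the code point of a single character can be computed directly. The following is a minimal sketch, not part of the original script: it assumes PHP 7.2 or later with the mbstring extension, and the names findCodePoint and pageStart are hypothetical helpers introduced only for illustration.

```php
<?php
// Hypothetical helper (not in the original script).
// Returns the Unicode code point of $char, or null when $char is not
// exactly one valid UTF-8 character.
function findCodePoint(string $char): ?int
{
    if (!mb_check_encoding($char, 'UTF-8') || mb_strlen($char, 'UTF-8') !== 1) {
        return null;
    }
    $codePoint = mb_ord($char, 'UTF-8');
    return $codePoint === false ? null : $codePoint;
}

// Hypothetical helper: first code point of the page that contains $codePoint.
// Assumes $pageSize is greater than zero and pages are aligned to multiples of it.
function pageStart(int $codePoint, int $pageSize): int
{
    return intdiv($codePoint, $pageSize) * $pageSize;
}
```

With a page size of 100, a character at code point 8364 would land on the page starting at 8300, which mirrors the page arithmetic the search logic uses.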

I recently needed to write tests using some UTF-8 characters from specific ranges. I couldn't find a character map I liked out on the Intertubes, so I whipped one up myself. It's not what I would consider "well written", but it does work so I thought I would share it here.

You can see the UTF-8 character map in action here, and I have supplied the source code below.
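
The core trick the map relies on is small enough to show on its own: build a numeric HTML entity from the code point and let html_entity_decode turn it into UTF-8. The sketch below isolates that step; the function names codePointToUtf8 and utf8Range are hypothetical and do not appear in the script, and the range is treated as end-exclusive as an assumption.

```php
<?php
// Hypothetical helper: decode the numeric entity for one code point into UTF-8.
function codePointToUtf8(int $codePoint): string
{
    return html_entity_decode('&#' . $codePoint . ';', ENT_NOQUOTES, 'UTF-8');
}

// Hypothetical helper: map each code point in [$start, $end) to its character.
function utf8Range(int $start, int $end): array
{
    $chars = [];
    for ($cp = $start; $cp < $end; $cp++) {
        $chars[$cp] = codePointToUtf8($cp);
    }
    return $chars;
}

// Example: build a test string from the range 65..67.
$sample = implode('', utf8Range(65, 68));
```

A helper like this can also be handy for the original use case, building test strings from a specific range of characters.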

The Code:

<?php
/**
 * This file will display a dynamic UTF-8 codepage list.
 * You can browse it and search it.
 *
 * @author NetworkError <junk@networkerror.org>
 **/
?>
<html>
<head>
<title>UTF-8 Character Map</title>
        <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
        <style>
                body {font-family: arial; font-size: 12px;}
                thead {font-weight: bold;}
                thead td {border-bottom: 1px solid black; font-size: 12px;}
                div table {text-align: center; border-left: 1px solid gray; border-top: 1px solid gray;}
                div td {vertical-align: top; border-right: 1px solid gray; border-bottom: 1px solid gray; font-size: 11px;}
                .row1  {background-color: #FCFCFC;}
                .row1o {background-color: white;}
                .row2  {background-color: #FFFBE7;}
                .row2o {background-color: #FFFEFA;}
                .row_found  {background-color: yellow;}
                .row_foundo {background-color: yellow;}
        </style>
</head>
<body>
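<?php
// No search result yet; the search logic below sets this when it finds a match.
$found = null;

// Set up defaults...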
if (!array_key_exists('start', $_GET)) {
        $_GET['start'] = 0;
        $_GET['end'] = 100;
        $_GET['cols'] = 4;
} else {
        // Error checking.
        $_GET['start'] = (int)$_GET['start'];
        $_GET['end'] = (int)$_GET['end'];
        $_GET['cols'] = (int)$_GET['cols'];

        // Zero or negative column counts would produce no output, so fall back to one column.
        if ($_GET['cols'] < 1) {
                $_GET['cols'] = 1;
        }

        // If start and end are reversed, swap them.
        if ($_GET['start'] > $_GET['end']) {
                $temp = $_GET['start'];
                $_GET['start'] = $_GET['end'];
                $_GET['end'] = $temp;
                unset($temp);
        }

        // If start and end are equal, widen the range so it is not empty.
        if ($_GET['start'] == $_GET['end']) {
                $_GET['end'] += 1;
        }
}

// Search Logic
if (array_key_exists('search', $_GET) && strlen($_GET['search']) > 0) {
        $search = $_GET['search'];
        set_time_limit(60);
        // Brute-force scan: decode each numeric entity and compare it to the search string.
        for ($i = 0, $found = null; $found === null && $i < 1000000; $i++) {
                $html = '&#'.$i.';';
                $str = html_entity_decode($html, ENT_NOQUOTES, 'UTF-8');
                if ($str === $search) {
                        $found = $i;
                }
        }

        // Jump to the page containing the match; leave the range alone if nothing matched.
        if ($found !== null) {
                $diff = $_GET['end'] - $_GET['start'];
                $page_num = (int)($found / $diff);
                $_GET['start'] = $page_num * $diff;
                $_GET['end'] = $_GET['start'] + $diff;
        }
}



$diff = $_GET['end'] - $_GET['start'];
$next_href = '?start=' . ($_GET['start'] + $diff) . '&end=' . ($_GET['end'] + $diff) . '&cols=' . $_GET['cols'];
$next_link = '<a href="'.$next_href.'">Next '.$diff.' &gt;&gt;</a>';
if ($_GET['start'] > 0) {
        $last_href = '?start=' . ($_GET['start'] - $diff) . '&end=' . ($_GET['end'] - $diff) . '&cols=' . $_GET['cols'];
        $last_link = '<a href="'.$last_href.'">&lt;&lt; Last '.$diff.'</a>';
} else {
        $last_link = '';
}
?>
<form action="" method="GET">
        <table>
                <tr>
                        <td style="text-align: right;">
                                Output characters in range:
                        </td>
                        <td>
                                <input type="text" size="5" name="start" value="<?=$_GET['start']?>" /> -
                                <input type="text" size="5" name="end" value="<?=$_GET['end']?>" />
                        </td>
                </tr>
                <tr>
                        <td style="text-align: right;">
                                Number of columns:
                        </td>
                        <td>
                                <input type="text" size="2" name="cols" value="<?=$_GET['cols']?>" />
                        </td>
                </tr>
                <tr>
                        <td style="text-align: right;">
                                Search (optional):
                        </td>
                        <td>
                                <input type="text" size="1" name="search" maxlength="1" />
                        </td>
                </tr>
                <tr>
                        <td style="vertical-align: middle">
                                <input type="submit" value="GO" />
                        </td>
                </tr>
        </table>
</form>
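<?php
// $found holds the matching number; the loop counter $i has already moved one past it.
// The search string comes from the request, so escape it before echoing.
if (isset($found)) {
        echo 'Found character "'.htmlspecialchars($search, ENT_QUOTES, 'UTF-8').'".  Unicode number '.$found.'<p />';
}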
?>
<?=$last_link?> <?=$next_link?><p />

<div>
<table cellspacing="0" cellpadding="4" width="100%">
        <tr>
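        <?php
        // Round up so the columns together cover the whole range.
        $range_size = (int)ceil($diff / $_GET['cols']);
        for ($i = 0; $i < $_GET['cols']; $i++) {
        ?>
                <td>
                        <?php
                        $start = $_GET['start'] + $range_size * $i;
                        // Never run past the end of the requested range.
                        $end = min($start + $range_size, $_GET['end']);
                        outputRange($start, $end);
                        ?>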
                </td>
        <?php
        }
        ?>
        </tr>
</table>
</div>

<?=$last_link?> <?=$next_link?><p />

<p>
<a href="http://www.networkerror.org">&lt;&lt;Back to NetworkError.org</a>
</p>

</body>
</html>

<?php
// Outputs one column of the map: a table with a row per number from $start up to $end.
function outputRange($start, $end)
{
?>
        <table align="center" width="250" cellspacing="0" cellpadding="2">
                <thead>
                        <tr>
                                <td>Unicode<br />Number</td>
                                <td>HTML<br />Entity</td>
                                <td>Char</td>
                        </tr>
                </thead>
                <tbody>
        <?php
        global $found;
        // $end is exclusive so adjacent columns and pages do not repeat a row.
        for ($i = $start; $i < $end; $i++) {
                $code = $i;
                $html = '&#'.$code.';';
                $str = html_entity_decode($html, ENT_NOQUOTES, 'UTF-8');

                if ($found !== null && (int)$found === (int)$i) {
                        $row_class = 'row_found';
                } else {
                        if (($i % 2) == 0) {
                                $row_class = 'row1';
                        } else {
                                $row_class = 'row2';
                        }
                }

                echo '<tr class="'.$row_class.'" onmouseover="this.className=\''.$row_class.'o\';" onmouseout="this.className=\''.$row_class.'\';">';
                echo '<td>'.$code.'</td>';
                echo '<td>'.(htmlentities($html)).'</td>';
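                // Escape the decoded character so that < and & display literally instead of breaking the markup.
                echo '<td>'.htmlspecialchars($str, ENT_NOQUOTES, 'UTF-8').'</td>';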
                echo '</tr>';
        }
        ?>
                </tbody>
        </table>
<?php
}
