04.12.07

Multibyte String Truncate Modifier for Smarty - mb_truncate

Posted in PHP at 3:53 pm by Guy

When working with Smarty, a PHP templating engine, I discovered that while the regular truncate modifier works great on ASCII strings, it doesn’t work with multibyte strings such as UTF-8 encoded text: it counts and cuts by bytes, so it can split a character in the middle. This causes problems in internationalization (i18n), as UTF-8 is the popular encoding for non-Latin alphabets nowadays. The problem can be solved by modifying the built-in truncate modifier to create a new one that takes an additional argument, the charset of the string, and acts accordingly. The new modifier, mb_truncate, is implemented below.
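To see why byte-based truncation breaks multibyte text, here is a minimal sketch comparing PHP's byte-oriented string functions with their mbstring counterparts. It assumes the mbstring extension is loaded and that the file is saved as UTF-8. The sample string and variable names are only illustrative.

```php
<?php
// Illustrative sample: 11 characters, but 20 bytes when encoded as UTF-8.
$text = 'Привет, мир';

// strlen() counts bytes; mb_strlen() counts characters in the given charset.
$bytes = strlen($text);
$chars = mb_strlen($text, 'UTF-8');

// substr() cuts after 5 bytes, which lands in the middle of the third character.
$byteCut = substr($text, 0, 5);

// mb_substr() cuts after 5 whole characters.
$charCut = mb_substr($text, 0, 5, 'UTF-8');

// The byte-based cut is no longer valid UTF-8; the character-based cut is.
$byteCutValid = mb_check_encoding($byteCut, 'UTF-8');
$charCutValid = mb_check_encoding($charCut, 'UTF-8');
```

A string cut in the middle of a character usually shows up in the browser as a replacement glyph at the end of the truncated text, which is the symptom the modifier below is meant to avoid.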

<?php
/**
 * Smarty plugin
 * @package Smarty
 * @subpackage plugins
 */
 
 
/**
 * Smarty truncate modifier plugin
 *
 * Type:     modifier<br>
 * Name:     mb_truncate<br>
 * Purpose:  Truncate a string to a certain length if necessary,
 *           optionally splitting in the middle of a word, and
 *           appending the $etc string or inserting $etc into the middle.
 *           This version also supports multibyte strings.
 * @link http://smarty.php.net/manual/en/language.modifier.truncate.php
 *          truncate (Smarty online manual)
 * @author   Guy Rutenberg <guyrutenberg@gmail.com> based on the original 
 *           truncate by Monte Ohrt <monte at ohrt dot com>
 * @param string
 * @param integer
 * @param string
 * @param string
 * @param boolean
 * @param boolean
 * @return string
 */
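function smarty_modifier_mb_truncate($string, $length = 80, $etc = '...', $charset='UTF-8',
                                  $break_words = false, $middle = false)
{
    if ($length == 0)
        return '';
 
    // Measure in characters, not bytes, so multibyte strings are compared correctly.
    if (mb_strlen($string, $charset) > $length) {
        $length -= min($length, mb_strlen($etc, $charset));
        if (!$break_words && !$middle) {
            // Drop the trailing partial word so the cut falls on a word boundary.
            $string = preg_replace('/\s+?(\S+)?$/', '', mb_substr($string, 0, $length+1, $charset));
        }
        if(!$middle) {
            return mb_substr($string, 0, $length, $charset) . $etc;
        } else {
            // mb_substr() takes the length as its third argument and the charset as its fourth.
            $half = (int) ($length / 2);
            return mb_substr($string, 0, $half, $charset) . $etc . mb_substr($string, -$half, $half, $charset);
        }
    } else {
        return $string;
    }
}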
 
/* vim: set expandtab: */
?>

The license for the code is LGPL (same as Smarty’s). To install the modifier, just save it as /smarty/plugins/modifier.mb_truncate.php.

Using the mb_truncate modifier is similar to truncate.

//$some_string is a UTF-8 encoded string assigned via PHP
{$some_string|mb_truncate:13:"...":'UTF-8'}

The modifier also supports the break-words and truncate-in-the-middle flags of the original truncate; they follow the charset argument, as the fifth and sixth parameters.

4 Comments »

  1. Miroslav Mollov said,

    March 4, 2008 at 3:17 pm

    Nice !
    It is good idea !

  2. curzon said,

    April 9, 2008 at 9:57 am

    i have a 7 line paragraph in my database. now i wanna show only 3 line from this paragraph using PHP. which function i can use ?

  3. Guy said,

    April 9, 2008 at 10:48 am

    Hi curzon,
    A simple solution would be to call strpos repeatedly to find the third occurrence of “\n” and then return the substring up to that point; it will contain only the first 3 lines.

    I don’t know of any existing Smarty modifier that does this, so you will have to wrap it yourself before you can use it in Smarty. See the mb_truncate modifier for an example of how to do it.
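
    The strpos approach described above could be sketched roughly as follows. The names first_lines and smarty_modifier_first_lines are hypothetical and not part of Smarty. The sketch assumes lines are separated by "\n" (with "\r\n" line endings each kept line would retain a trailing "\r"). Searching for "\n" by byte is safe in UTF-8, because that byte never occurs inside a multibyte character.

```php
<?php
/**
 * Hypothetical helper: return at most the first $count lines of $text.
 */
function first_lines($text, $count = 3)
{
    if ($count <= 0) {
        return '';
    }

    $offset = 0;
    for ($i = 0; $i < $count; $i++) {
        // Look for the next line break, starting after the previous one.
        $pos = strpos($text, "\n", $offset);
        if ($pos === false) {
            // Fewer than $count line breaks: the text is already short enough.
            return $text;
        }
        $offset = $pos + 1;
    }

    // $offset - 1 is the position of the $count-th "\n"; cut just before it.
    return substr($text, 0, $offset - 1);
}

/**
 * Hypothetical Smarty wrapper, following the modifier naming convention
 * used by mb_truncate above.
 */
function smarty_modifier_first_lines($string, $count = 3)
{
    return first_lines($string, $count);
}
```

    Saved as modifier.first_lines.php in the plugins directory, this would presumably be used in a template as {$paragraph|first_lines:3}, where $paragraph is a placeholder variable name.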

  4. Pierre said,

    April 10, 2008 at 3:24 pm

    That was exactly what I needed. Took me about two hours to figure out that the problem was not in the database but with Smartys truncate ;-)
    Then I googled and came here -> problem solved.

    Thanks,
    Pierre
