Guy Rutenberg

Keeping track of what I do

Multibyte String Truncate Modifier for Smarty – mb_truncate

While working with Smarty, a PHP templating engine, I discovered that the regular truncate modifier works well on ASCII strings but not on multibyte strings such as UTF-8 encoded text, because it counts and cuts bytes rather than characters. This causes problems in internationalization (i18n), as UTF-8 is nowadays the popular encoding for non-Latin alphabets. The problem can be solved by modifying the built-in truncate modifier into a new one that takes an additional argument, the charset of the string, and acts accordingly. The new modifier, mb_truncate, is implemented below.
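To make the byte-versus-character problem concrete, here is a small standalone illustration. It assumes PHP with the mbstring extension and a source file saved as UTF-8; the variable names are only for this example.

```php
<?php
// Illustration only: byte-based vs. character-based string functions.
$word = 'Привет'; // 6 Cyrillic letters, 2 bytes each in UTF-8

var_dump(strlen($word));             // int(12): counts bytes
var_dump(mb_strlen($word, 'UTF-8')); // int(6): counts characters

// substr() stops after 3 bytes, in the middle of the second letter,
// which leaves an invalid UTF-8 sequence; mb_substr() stops after 3 letters.
$bytes = substr($word, 0, 3);
$chars = mb_substr($word, 0, 3, 'UTF-8');

var_dump(mb_check_encoding($bytes, 'UTF-8')); // bool(false)
var_dump(mb_check_encoding($chars, 'UTF-8')); // bool(true)
```

A byte-based truncate can therefore both cut too early and end the output on a broken character.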

<?php
/**
 * Smarty plugin
 * @package Smarty
 * @subpackage plugins
 */
 
 
/**
 * Smarty truncate modifier plugin
 *
 * Type:     modifier<br>
 * Name:     mb_truncate<br>
 * Purpose:  Truncate a string to a certain length if necessary,
 *           optionally splitting in the middle of a word, and
 *           appending the $etc string or inserting $etc into the middle.
 *           This version also supports multibyte strings.
 * @link http://smarty.php.net/manual/en/language.modifier.truncate.php
 *          truncate (Smarty online manual)
 * @author   Guy Rutenberg <guyrutenberg@gmail.com> based on the original 
 *           truncate by Monte Ohrt <monte at ohrt dot com>
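 * @param string  $string      the text to truncate
 * @param integer $length      maximum length of the result, including $etc
 * @param string  $etc         text appended (or inserted) when truncating
 * @param string  $charset     character encoding of $string
 * @param boolean $break_words if true, allow cutting in the middle of a word
 * @param boolean $middle      if true, cut out the middle instead of the end
 * @return string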
 */
function smarty_modifier_mb_truncate($string, $length = 80, $etc = '...', $charset='UTF-8',
                                  $break_words = false, $middle = false)
{
    if ($length == 0)
        return '';
 
    if (mb_strlen($string, $charset) > $length) {
        $length -= min($length, mb_strlen($etc, $charset));
        if (!$break_words && !$middle) {
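            // The /u flag makes PCRE treat the subject as UTF-8, so use it only for that charset.
            $pattern = (strcasecmp($charset, 'UTF-8') == 0) ? '/\s+?(\S+)?$/u' : '/\s+?(\S+)?$/';
            $string = preg_replace($pattern, '', mb_substr($string, 0, $length+1, $charset));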
        }
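        if(!$middle) {
            return mb_substr($string, 0, $length, $charset) . $etc;
        } else {
            // mb_substr() takes the length as its third argument and the
            // charset as its fourth, so the tail needs an explicit length.
            $half = (int) ($length / 2);
            return mb_substr($string, 0, $half, $charset) . $etc . mb_substr($string, -$half, $half, $charset);
        }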
    } else {
        return $string;
    }
}
 
/* vim: set expandtab: */
?>
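
The word-boundary step in the function above (cut one character past the limit, then strip the trailing partial word) can be tried in isolation. This is a minimal sketch, assuming UTF-8 input and the mbstring extension. The helper name word_boundary_cut_sketch is made up for this illustration, and the /u pattern flag follows the suggestion made in the comments below.

```php
<?php
// Hypothetical helper for illustration; not part of the plugin.
// Cuts a UTF-8 $string to at most $length characters without splitting a word.
function word_boundary_cut_sketch($string, $length)
{
    // Take one extra character so a word ending exactly at $length survives.
    $head = mb_substr($string, 0, $length + 1, 'UTF-8');
    // Drop the last whitespace run and whatever partial word follows it.
    // The /u flag makes PCRE read the subject as UTF-8 rather than as bytes.
    $head = preg_replace('/\s+?(\S+)?$/u', '', $head);
    return mb_substr($head, 0, $length, 'UTF-8');
}

echo word_boundary_cut_sketch('Съешь ещё этих мягких булок', 10), "...\n";
// prints "Съешь ещё..."
```

The first 11 characters are "Съешь ещё э"; the pattern removes the final space and the lone "э", so the result ends on a whole word.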

The license for the code is LGPL (same as Smarty’s). To install the modifier, just save it as /smarty/plugins/modifier.mb_truncate.php.
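
If you would rather not copy files into Smarty's own plugins directory, a separate plugin directory can be registered instead. The following is only a sketch, assuming Smarty 3 or 4, where addPluginsDir() is available (Smarty 2 exposed a plugins_dir array property for the same purpose). Both paths are placeholders, not real locations.

```php
<?php
// Assumes Smarty 3 or 4 is installed; both paths below are placeholders.
require_once '/path/to/smarty/libs/Smarty.class.php';

$smarty = new Smarty();

// Register an extra plugin directory in addition to the built-in one.
// That directory should contain modifier.mb_truncate.php.
$smarty->addPluginsDir('/path/to/my_smarty_plugins');
```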

Using the mb_truncate modifier is similar to using truncate.

{* $some_string is a UTF-8 encoded string assigned from PHP *}
{$some_string|mb_truncate:13:"...":'UTF-8'}

The modifier also supports the break-words and truncate-in-the-middle flags of the original truncate; they follow the charset argument.
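
To illustrate what the truncate-in-the-middle flag is meant to produce on multibyte text, here is a small standalone sketch. It is not the plugin itself: the helper name middle_truncate_sketch is hypothetical, and it assumes UTF-8 input and the mbstring extension.

```php
<?php
// Hypothetical helper for illustration; not part of the plugin.
// Keeps the start and the end of a UTF-8 string and puts $etc in between.
function middle_truncate_sketch($string, $length, $etc = '...')
{
    if (mb_strlen($string, 'UTF-8') <= $length) {
        return $string; // already short enough
    }
    // Characters left for real text once $etc is accounted for.
    $keep = max(0, $length - mb_strlen($etc, 'UTF-8'));
    $half = (int) ($keep / 2);
    // Third argument is the length, fourth the charset.
    return mb_substr($string, 0, $half, 'UTF-8')
        . $etc
        . mb_substr($string, -$half, $half, 'UTF-8');
}

echo middle_truncate_sketch('Съешь ещё этих мягких булок', 13), "\n";
// prints "Съешь...булок"
```

With a limit of 13 and a three-character $etc, ten characters remain, so five are taken from each end.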

Written by Guy

December 4th, 2007 at 3:53 pm

Posted in PHP

19 Responses to 'Multibyte String Truncate Modifier for Smarty – mb_truncate'

  1. Nice!
    It is a good idea!

    Miroslav Mollov

    4 Mar 08 at 15:17

  2. I have a 7-line paragraph in my database. Now I want to show only 3 lines from this paragraph using PHP. Which function can I use?

    curzon

    9 Apr 08 at 09:57

  3. Hi curzon,
    A simple solution would be to call strpos repeatedly to find the third occurrence of "\n" and then return the substring up to that point; it will contain only 3 lines.

    I don’t know of any existing Smarty modifier for this, so you will have to wrap it in one before you can use it in Smarty. See the mb_truncate modifier for an example of how to do it.

    Guy

    9 Apr 08 at 10:48

  4. That was exactly what I needed. Took me about two hours to figure out that the problem was not in the database but with Smarty’s truncate ;-)
    Then I googled and came here -> problem solved.

    Thanks,
    Pierre

    Pierre

    10 Apr 08 at 15:24

  5. {$some_string|mb_truncate:13:"...":'UTF-8'}
    This is the right syntax to truncate…
    :)

    muthu muttu

    30 Jan 09 at 16:46

  6. @muthu: Thanks, I updated the post and removed the extra mb_truncate from the example.

    Guy

    30 Jan 09 at 18:26

  7. wow! thanks great work!

    9h0st

    9 Feb 09 at 00:40

  8. Thank you very much for sharing this code!

    It helped me fix an annoying bug in my weblog, which relies on Smarty.

    Best regards,
    Alex

    alemartini

    13 Feb 09 at 22:34

  9. If you don’t put “/u” at the end of the pattern, you may have a problem at the end of accented characters.

    $string = preg_replace('/\s+?(\S+)?$/u', '', mb_substr($string, 0, $length+1, $charset));

    Best regards
    tics

    tics

    18 Mar 09 at 14:40

  10. In UTF-8, if you don’t put “/u” at the end of the pattern, you may have a problem at the end of accented characters.

    $string = preg_replace('/\s+?(\S+)?$/u', '', mb_substr($string, 0, $length+1, $charset));

    Best regards
    tics

    tics

    18 Mar 09 at 14:41

  11. tics +1

    Alex

    26 Mar 09 at 16:58

  12. Great work,
    I have exactly the problem you described in your post. Thanks for this plugin.

    Hen

    31 Jul 09 at 11:51

  13. Thanks for this plugin.
    There is a small problem in the code.
    On lines 36 and 37, mb_strlen must be used instead of strlen.
    So it should be:
    if (mb_strlen($string, $charset) > $length) {
        $length -= min($length, mb_strlen($etc, $charset));

    :)

    Vladimir Dokuzanov

    25 Oct 09 at 00:08

  14. Nice plugin, works even better if previously suggested changes are made.

    Additionally, one can change the order of the function arguments. To match the argument order of the classic Smarty truncate, the charset should be passed after break_words, so it would be:
    $string, $length, $etc, $break_words …

    I placed it as the last parameter, as PHP’s mb_ functions do.

    You can then simply prepend mb_ to all existing truncate modifier calls within the templates.

    This is a very easy task using find/replace, which would not work so well with the original argument order:
    it would pass the former break_words as the charset, and the mb_ functions will not be happy with it.

    Thomas

    24 Feb 10 at 14:04

  15. Thank you for posting this, an excellent addition to Smarty and very useful for those of us working in UTF-8 only environments.

    Mike

    19 Oct 10 at 09:55

  16. Thanks/Спасибо

    patgod

    8 Feb 11 at 18:13

  17. List of problems with Serendipity…

    It is obvious that there are some problems with Serendipity. This article will be updated to keep the list of known problems, both solved and unsolved. Most of the problems detected have to do with internationalization. In short:…

  18. Thank you! Good work! :)

    Coder.UA

    15 Dec 11 at 17:15

  19. [...] Truncate String smarty mb_truncate Category: PHP Tags: Comments (0) Trackbacks (0) [...]
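
The approach suggested in comment 3 above (repeated strpos calls to find the third newline, then a substring up to that point) could be sketched roughly as follows. The function name first_lines_sketch is hypothetical, and wrapping it as a Smarty modifier is left out. Because "\n" is a single byte that never occurs inside a multibyte UTF-8 character, the byte-based strpos and substr are safe for this particular job.

```php
<?php
// Hypothetical helper for illustration: return at most the first
// $max_lines lines of $text, where lines are separated by "\n".
function first_lines_sketch($text, $max_lines = 3)
{
    if ($max_lines < 1) {
        return '';
    }
    $offset = 0;
    for ($i = 0; $i < $max_lines; $i++) {
        $pos = strpos($text, "\n", $offset);
        if ($pos === false) {
            return $text; // fewer than $max_lines newlines: nothing to cut
        }
        $offset = $pos + 1; // continue searching after this newline
    }
    // $offset - 1 is the position of the last newline found; cut before it.
    return substr($text, 0, $offset - 1);
}

echo first_lines_sketch("one\ntwo\nthree\nfour\nfive"), "\n";
// prints the lines "one", "two" and "three"
```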
