PHP Classes

File: build/docs/base.md

Recommend this page to a friend!
  Classes of Lars Moelleken   Portable UTF-8   build/docs/base.md   Download  
File: build/docs/base.md
Role: Auxiliary data
Content type: text/markdown
Description: Auxiliary data
Class: Portable UTF-8
Manipulate UTF-8 text strings in pure PHP
Author: By
Last change:
Date: 2 years ago
Size: 9,353 bytes
 

Contents

Class file image Download

Build Status Build status FOSSA Status codecov.io Codacy Badge Latest Stable Version Total Downloads License Donate to this project using PayPal Donate to this project using Patreon

? Portable UTF-8

Description

It is written in PHP (PHP 7+) and can work without "mbstring", "iconv" or any other extra encoding php-extension on your server.

The benefit of Portable UTF-8 is that it is easy to use, easy to bundle. This library will also auto-detect your server environment and will use the installed php-extensions if they are available, so you will have the best possible performance.

As a fallback we will use Symfony Polyfills, if needed. (https://github.com/symfony/polyfill)

The project based on ... + Hamid Sarfraz's work - portable-utf8 + Nicolas Grekas's work - tchwork/utf8 + Behat's work - Behat/Transliterator + Sebastián Grignoli's work - neitanod/forceutf8 + Ivan Enderlin's work - hoaproject/Ustring + and many cherry-picks from "GitHub"-gists and "Stack Overflow"-snippets ...

Demo

Here you can test some basic functions from this library and you can compare some results with the native php function results.

Index

Alternative

If you like a more Object Oriented Way to edit strings, then you can take a look at voku/Stringy, it's a fork of "danielstjules/Stringy" but it used the "Portable UTF-8"-Class and some extra methods.

// Standard library
strtoupper('fòôbà?');       // 'FòôBà?'
strlen('fòôbà?');           // 10

// mbstring 
// WARNING: if you don't use a polyfill like "Portable UTF-8", you need to install the php-extension "mbstring" on your server
mb_strtoupper('fòôbà?');    // 'FÒÔBÀ?'
mb_strlen('fòôbà?');        // '6'

// Portable UTF-8
use voku\helper\UTF8;
UTF8::strtoupper('fòôbà?');    // 'FÒÔBÀ?'
UTF8::strlen('fòôbà?');        // '6'

// voku/Stringy
use Stringy\Stringy as S;
$stringy = S::create('fòôbà?');
$stringy->toUpperCase();    // 'FÒÔBÀ?'
$stringy->length();         // '6'

Install "Portable UTF-8" via "composer require"

composer require voku/portable-utf8

If your project do not need some of the Symfony polyfills please use the replace section of your composer.json. This removes any overhead from these polyfills as they are no longer part of your project. e.g.:

{
  "replace": {
    "symfony/polyfill-php72": "1.99",
    "symfony/polyfill-iconv": "1.99",
    "symfony/polyfill-intl-grapheme": "1.99",
    "symfony/polyfill-intl-normalizer": "1.99",
    "symfony/polyfill-mbstring": "1.99"
  }
}

Why Portable UTF-8?[]()

PHP 5 and earlier versions have no native Unicode support. To bridge the gap, there exist several extensions like "mbstring", "iconv" and "intl".

The problem with "mbstring" and others is that most of the time you cannot ensure presence of a specific one on a server. If you rely on one of these, your application is no more portable. This problem gets even severe for open source applications that have to run on different servers with different configurations. Considering these, I decided to write a library:

Requirements and Recommendations

  • No extensions are required to run this library. Portable UTF-8 only needs PCRE library that is available by default since PHP 4.2.0 and cannot be disabled since PHP 5.3.0. "\u" modifier support in PCRE for UTF-8 handling is not a must.
  • PHP 5.3 is the minimum requirement, and all later versions are fine with Portable UTF-8.
  • PHP 7.0 is the minimum requirement since version 4.0 of Portable UTF-8, otherwise composer will install an older version
  • PHP 8.0 support is also available and will adapt the behaviours of the native functions.
  • To speed up string handling, it is recommended that you have "mbstring" or "iconv" available on your server, as well as the latest version of PCRE library
  • Although Portable UTF-8 is easy to use; moving from native API to Portable UTF-8 may not be straight-forward for everyone. It is highly recommended that you do not update your scripts to include Portable UTF-8 or replace or change anything before you first know the reason and consequences. Most of the time, some native function may be all what you need.
  • There is also a shim for "mbstring", "iconv" and "intl", so you can use it also on shared webspace.

Usage

Example 1: UTF8::cleanup()

  echo UTF8::cleanup('?Düsseldorf?');
  
  // will output:
  // Düsseldorf

Example 2: UTF8::strlen()

  $string = 'string <strong>with utf-8 chars åèä</strong> - doo-bee doo-bee dooh';

  echo strlen($string) . "\n<br />";
  echo UTF8::strlen($string) . "\n<br />";

  // will output:
  // 70
  // 67

  $string_test1 = strip_tags($string);
  $string_test2 = UTF8::strip_tags($string);

  echo strlen($string_test1) . "\n<br />";
  echo UTF8::strlen($string_test2) . "\n<br />";

  // will output:
  // 53
  // 50

Example 3: UTF8::fix_utf8()


  echo UTF8::fix_utf8('Düsseldorf');
  echo UTF8::fix_utf8('ä');
  
  // will output:
  // Düsseldorf
  // ä

Portable UTF-8 | API

The API from the "UTF8"-Class is written as small static methods that will match the default PHP-API.

Class methods

%__functions_index__voku\helper\UTF8__%

%__functions_list__voku\helper\UTF8__%

Unit Test

1) Composer is a prerequisite for running the tests.

composer install

2) The tests can be executed by running this command from the root directory:

./vendor/bin/phpunit

Support

For support and donations please visit GitHub | Issues | PayPal | Patreon.

For status updates and release announcements please visit Releases | Twitter | Patreon.

For professional support please contact me.

Thanks

  • Thanks to GitHub (Microsoft) for hosting the code and a good infrastructure including Issues-Management, etc.
  • Thanks to IntelliJ as they make the best IDEs for PHP and they gave me an open source license for PhpStorm!
  • Thanks to Travis CI for being the most awesome, easiest continuous integration tool out there!
  • Thanks to StyleCI for the simple but powerful code style check.
  • Thanks to PHPStan && Psalm for really great Static analysis tools and for discovering bugs in the code!

License and Copyright

"Portable UTF8" is free software; you can redistribute it and/or modify it under the terms of the (at your option): - Apache License v2.0, or - GNU General Public License v2.0.

Unicode handling requires tedious work to be implemented and maintained on the long run. As such, contributions such as unit tests, bug reports, comments or patches licensed under both licenses are really welcomed.

FOSSA Status