NAME
    Encode::DoubleEncodedUTF8 - Fix double encoded UTF-8 bytes to the
    correct one

SYNOPSIS
      use Encode;
      use Encode::DoubleEncodedUTF8;

      my $dodgy_utf8 = "Some byte strings from the web/DB with double-encoded UTF-8 bytes";
      my $fixed = decode("utf-8-de", $dodgy_utf8); # Fix it

WARNINGS
    Use this module only for testing, debugging, data recovery and working
    around buggy software you *can't* fix.

    Do not use this module in your production code just to *work around*
    bugs in code you *can* fix. This module is slow and imperfect, and it
    may break the encoding if you run it against correctly encoded strings.
    See perlunitut for more details.

DESCRIPTION
    Encode::DoubleEncodedUTF8 adds a new encoding "utf-8-de" that replaces
    double-encoded UTF-8 byte sequences found in the input bytes with the
    correct Unicode characters.

    Double-encoded UTF-8 frequently arises when a string with the UTF-8
    flag set is concatenated with a UTF-8 byte string that does not have
    the flag, for instance:

      my $string = "L\x{e9}on"; # latin-1
      utf8::upgrade($string);
      my $bytes = "L\xc3\xa9on"; # utf-8

      my $dodgy_utf8 = encode_utf8($string . " " . $bytes); # $bytes is now double encoded

      my $fixed = decode("utf-8-de", $dodgy_utf8); # "L\x{e9}on L\x{e9}on";

    See encoding::warnings for more details.

AUTHOR
    Tatsuhiko Miyagawa <miyagawa@bulknews.net>

LICENSE
    This library is free software; you can redistribute it and/or modify it
    under the same terms as Perl itself.

SEE ALSO
    encoding::warnings, Test::utf8
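
EXAMPLE
    As a byte-level illustration of the double encoding described in
    DESCRIPTION, the sketch below uses only the core Encode module, not
    Encode::DoubleEncodedUTF8, and does not claim to reflect how this
    module is implemented. The helper hex_bytes is a hypothetical name
    introduced for the example. The manual repair shown here assumes the
    whole string is double encoded; it would not handle the mixed case in
    DESCRIPTION, where only part of the string is affected.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: render a byte string as space-separated hex.
sub hex_bytes {
    my ($bytes) = @_;
    return join " ", map { sprintf("%02X", ord($_)) } split(//, $bytes);
}

my $char   = "\x{e9}";                 # one character, U+00E9
my $single = encode("UTF-8", $char);   # correct UTF-8 bytes

# Encoding the already-encoded bytes again treats 0xC3 and 0xA9 as the
# characters U+00C3 and U+00A9, so each one becomes two bytes of its own.
my $double = encode("UTF-8", $single);

# Manual repair for a string that is double encoded throughout:
# decode once, map the resulting characters back to bytes, decode again.
my $repaired = decode("UTF-8", encode("iso-8859-1", decode("UTF-8", $double)));

print hex_bytes($single), "\n";   # C3 A9
print hex_bytes($double), "\n";   # C3 83 C2 A9
print $repaired eq $char ? "round trip ok\n" : "round trip failed\n";
```

    The four-byte sequence C3 83 C2 A9 is the typical signature of a
    double-encoded U+00E9.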