head 1.10; access; symbols pkgsrc-2023Q4:1.10.0.18 pkgsrc-2023Q4-base:1.10 pkgsrc-2023Q3:1.10.0.16 pkgsrc-2023Q3-base:1.10 pkgsrc-2023Q2:1.10.0.14 pkgsrc-2023Q2-base:1.10 pkgsrc-2023Q1:1.10.0.12 pkgsrc-2023Q1-base:1.10 pkgsrc-2022Q4:1.10.0.10 pkgsrc-2022Q4-base:1.10 pkgsrc-2022Q3:1.10.0.8 pkgsrc-2022Q3-base:1.10 pkgsrc-2022Q2:1.10.0.6 pkgsrc-2022Q2-base:1.10 pkgsrc-2022Q1:1.10.0.4 pkgsrc-2022Q1-base:1.10 pkgsrc-2021Q4:1.10.0.2 pkgsrc-2021Q4-base:1.10 pkgsrc-2021Q3:1.8.0.40 pkgsrc-2021Q3-base:1.8 pkgsrc-2021Q2:1.8.0.38 pkgsrc-2021Q2-base:1.8 pkgsrc-2021Q1:1.8.0.36 pkgsrc-2021Q1-base:1.8 pkgsrc-2020Q4:1.8.0.34 pkgsrc-2020Q4-base:1.8 pkgsrc-2020Q3:1.8.0.32 pkgsrc-2020Q3-base:1.8 pkgsrc-2020Q2:1.8.0.28 pkgsrc-2020Q2-base:1.8 pkgsrc-2020Q1:1.8.0.8 pkgsrc-2020Q1-base:1.8 pkgsrc-2019Q4:1.8.0.30 pkgsrc-2019Q4-base:1.8 pkgsrc-2019Q3:1.8.0.26 pkgsrc-2019Q3-base:1.8 pkgsrc-2019Q2:1.8.0.24 pkgsrc-2019Q2-base:1.8 pkgsrc-2019Q1:1.8.0.22 pkgsrc-2019Q1-base:1.8 pkgsrc-2018Q4:1.8.0.20 pkgsrc-2018Q4-base:1.8 pkgsrc-2018Q3:1.8.0.18 pkgsrc-2018Q3-base:1.8 pkgsrc-2018Q2:1.8.0.16 pkgsrc-2018Q2-base:1.8 pkgsrc-2018Q1:1.8.0.14 pkgsrc-2018Q1-base:1.8 pkgsrc-2017Q4:1.8.0.12 pkgsrc-2017Q4-base:1.8 pkgsrc-2017Q3:1.8.0.10 pkgsrc-2017Q3-base:1.8 pkgsrc-2017Q2:1.8.0.6 pkgsrc-2017Q2-base:1.8 pkgsrc-2017Q1:1.8.0.4 pkgsrc-2017Q1-base:1.8 pkgsrc-2016Q4:1.8.0.2 pkgsrc-2016Q4-base:1.8 pkgsrc-2016Q3:1.7.0.6 pkgsrc-2016Q3-base:1.7 pkgsrc-2016Q2:1.7.0.4 pkgsrc-2016Q2-base:1.7 pkgsrc-2016Q1:1.7.0.2 pkgsrc-2016Q1-base:1.7 pkgsrc-2015Q4:1.6.0.2 pkgsrc-2015Q4-base:1.6 pkgsrc-2015Q3:1.5.0.2 pkgsrc-2015Q3-base:1.5 pkgsrc-2015Q2:1.4.0.2 pkgsrc-2015Q2-base:1.4 pkgsrc-2015Q1:1.3.0.6 pkgsrc-2015Q1-base:1.3 pkgsrc-2014Q4:1.3.0.4 pkgsrc-2014Q4-base:1.3 pkgsrc-2014Q3:1.3.0.2 pkgsrc-2014Q3-base:1.3 pkgsrc-2014Q2:1.1.1.1.0.44 pkgsrc-2014Q2-base:1.1.1.1 pkgsrc-2014Q1:1.1.1.1.0.42 pkgsrc-2014Q1-base:1.1.1.1 pkgsrc-2013Q4:1.1.1.1.0.40 pkgsrc-2013Q4-base:1.1.1.1 pkgsrc-2013Q3:1.1.1.1.0.38 pkgsrc-2013Q3-base:1.1.1.1 
pkgsrc-2013Q2:1.1.1.1.0.36 pkgsrc-2013Q2-base:1.1.1.1 pkgsrc-2013Q1:1.1.1.1.0.34 pkgsrc-2013Q1-base:1.1.1.1 pkgsrc-2012Q4:1.1.1.1.0.32 pkgsrc-2012Q4-base:1.1.1.1 pkgsrc-2012Q3:1.1.1.1.0.30 pkgsrc-2012Q3-base:1.1.1.1 pkgsrc-2012Q2:1.1.1.1.0.28 pkgsrc-2012Q2-base:1.1.1.1 pkgsrc-2012Q1:1.1.1.1.0.26 pkgsrc-2012Q1-base:1.1.1.1 pkgsrc-2011Q4:1.1.1.1.0.24 pkgsrc-2011Q4-base:1.1.1.1 pkgsrc-2011Q3:1.1.1.1.0.22 pkgsrc-2011Q3-base:1.1.1.1 pkgsrc-2011Q2:1.1.1.1.0.20 pkgsrc-2011Q2-base:1.1.1.1 pkgsrc-2011Q1:1.1.1.1.0.18 pkgsrc-2011Q1-base:1.1.1.1 pkgsrc-2010Q4:1.1.1.1.0.16 pkgsrc-2010Q4-base:1.1.1.1 pkgsrc-2010Q3:1.1.1.1.0.14 pkgsrc-2010Q3-base:1.1.1.1 pkgsrc-2010Q2:1.1.1.1.0.12 pkgsrc-2010Q2-base:1.1.1.1 pkgsrc-2010Q1:1.1.1.1.0.10 pkgsrc-2010Q1-base:1.1.1.1 pkgsrc-2009Q4:1.1.1.1.0.8 pkgsrc-2009Q4-base:1.1.1.1 pkgsrc-2009Q3:1.1.1.1.0.6 pkgsrc-2009Q3-base:1.1.1.1 pkgsrc-2009Q2:1.1.1.1.0.4 pkgsrc-2009Q2-base:1.1.1.1 pkgsrc-2009Q1:1.1.1.1.0.2 pkgsrc-2009Q1-base:1.1.1.1 pkgsrc-base:1.1.1.1 TNF:1.1.1; locks; strict; comment @# @; 1.10 date 2021.10.26.11.22.45; author nia; state Exp; branches; next 1.9; commitid TS3y6sgAeGKWpjeD; 1.9 date 2021.10.07.15.01.52; author nia; state Exp; branches; next 1.8; commitid 0fS32tEWoNe7fTbD; 1.8 date 2016.11.28.13.37.53; author wiz; state Exp; branches; next 1.7; commitid 4fVODRca7RxdOTvz; 1.7 date 2016.02.18.03.38.36; author wen; state Exp; branches; next 1.6; commitid NmCAcEOmDRWlClVy; 1.6 date 2015.11.04.01.59.54; author agc; state Exp; branches; next 1.5; commitid 8Vi0UoG7obKytIHy; 1.5 date 2015.08.28.22.46.28; author mef; state Exp; branches; next 1.4; commitid PE1oRzONLsK1z5zy; 1.4 date 2015.05.10.03.02.05; author mef; state Exp; branches; next 1.3; commitid R6kSd2DYJ7uUxQky; 1.3 date 2014.09.16.12.27.48; author wen; state Exp; branches; next 1.2; commitid ZkFYVSmeri71gzQx; 1.2 date 2014.08.11.02.11.27; author wen; state Exp; branches; next 1.1; commitid 0mfgVAwvGSeE0TLx; 1.1 date 2009.02.24.12.00.40; author tonnerre; state Exp; branches 
1.1.1.1; next ; 1.1.1.1 date 2009.02.24.12.00.40; author tonnerre; state Exp; branches; next ; desc @@ 1.10 log @textproc: Replace RMD160 checksums with BLAKE2s checksums All checksums have been double-checked against existing RMD160 and SHA512 hashes Unfetchable distfiles (fetched conditionally?): ./textproc/convertlit/distinfo clit18src.zip @ text @$NetBSD: distinfo,v 1.9 2021/10/07 15:01:52 nia Exp $ BLAKE2s (Text-Unidecode-1.30.tar.gz) = 38b5da74aec821ab44ccc4b87f5a25526dab050e538a70b6c32ec5021130f867 SHA512 (Text-Unidecode-1.30.tar.gz) = 194f8aba0dcdc7a53338b86370b7cfb6c60d4a8982ada6084f0eb0ccd66ce461b831b6daf04932f039ff1b983dc3cd0c0ced1e8b455955d2699c36120b41a526 Size (Text-Unidecode-1.30.tar.gz) = 137977 bytes @ 1.9 log @textproc: Remove SHA1 hashes for distfiles @ text @d1 1 a1 1 $NetBSD: distinfo,v 1.8 2016/11/28 13:37:53 wiz Exp $ d3 1 a3 1 RMD160 (Text-Unidecode-1.30.tar.gz) = c4f5ba6ac84eef0ce4999935b7a32da0576c8720 @ 1.8 log @Updated p5-Text-Unidecode to 1.30. 2016-11-26 Sean M. Burke sburke@@cpan.org * Release 1.30 * Many many (forty?) tables were missing the final character! Fixed. * Minor stuff: . Added just a few Arabesque things to U+FD__ . Renamed t/00400_just_load_module.t to t/00400_just_load_main_module.t . This is the first time non-7bit data appears in any Unidecode/x__.pm files, although it is just in comments. (In x02.pm, x03.pm, xfd.pm) But this is just THE SHAPE OF THINGS TO COME. * Oh look, I blinked and a year went by. I've been spending about the past *two* years trying to think of how Unidecode v2-and-later's data tables should work. * TODO: Kill the surrogatey "xD8", "xD9", "xDA", "xDB" blocks, and actually handle surrogates (when properly encoded). * TODO: Inaugurate the (private) Text::Unidecode::Blackbox namespace. @ text @d1 1 a1 1 $NetBSD: distinfo,v 1.7 2016/02/18 03:38:36 wen Exp $ a2 1 SHA1 (Text-Unidecode-1.30.tar.gz) = 13c28520896a0073e0ea9333a2b6b770dcf17d6e @ 1.7 log @Update to 1.27 Upstream changes: 2015-10-21 Sean M. 
Burke sburke@@cpan.org * RELEASE 1.27. (Stable.) The release, 1.25_01, didn't blow up, so this is just a re-release of it as a normal ("stable") version. * Minor changes to the documentation. Nothing substantial. * Release 1.26 had a confusing mistake in the ChangeLog. Ignore v1.26. 2015-10-21 Sean M. Burke sburke@@cpan.org * RELEASE 1.26. Mistake. See above for change notes between v1.25_01 and v1.27. 2015-10-16 Sean M. Burke sburke@@cpan.org * RELEASE 1.25_01. * !DEVELOPER RELEASE!, OH GOD HELP US ALL! * Here's a new thing that makes me nervous and hesitant, and that I've been talking myself into for weeks: ************************************************************** * I've switched to accepting values in the range 0x80-0x9F * * as if they are the Windows-1252 ("ANSI") characters. * ************************************************************** Previously they had all mapped to emptystring. Technically, Unicode specifies those codepoints as control characters that I've never heard of, "C1 Controls"... ... U+0087 ESA - End of Selected Area U+0088 HTS - Character (Horizontal) Tabulation Set U+0089 HTJ - Character (Horizontal) Tabulation with Justification ... ( See "C1" in https://en.wikipedia.org/wiki/C0_and_C1_control_codes ) And Unidecode mapped all of those to emptystring. Now they are treated as if you fed the Windows-1252 characters, as that is an extremely common thing to have happen. So if you feed character value 0x80 to it, it is taken to mean "€" (which Unidecode then decodes as "EUR", at the moment at least). (This doesn't interfere with the fact that U+20AC is the proper Unicode place for the "€" to be found.) And the smartquotes at 0x91 to 0x94, ‘ ’ “ ” turn into ' ' " " so yaaaay! Note that in theory, according to C1 Controls, 0x85 is "NEL: Next Line", "Equivalent to CR+LF. Used to mark end-of-line on some IBM mainframes." 
I could map this to \n or \r\n or whatever, but I've never seen 0x85 in use in the wild, and I never heard anyone complain about my not having mapped it to "\n" in all the Unidecode versions since the first, in 2001. So instead, Unidecode takes 0x85 as its Windows-1252 value, the ellipsis "…" which of course it Unidecodes as "..." I'm not thrilled with the idea of going off spec but I think this should be okay, and it has massive DWIM value. Let's hope I'm not dividing Unicode times infinity by zero and then the whole universe will disa That's why I'm making this a developer release. Unless anything besplodes by November 1st, I'll re-issue this as a stable release. @ text @d1 1 a1 1 $NetBSD: distinfo,v 1.6 2015/11/04 01:59:54 agc Exp $ d3 4 a6 4 SHA1 (Text-Unidecode-1.27.tar.gz) = 221442bbf1fcb3a1df4b8988033154e3934124e9 RMD160 (Text-Unidecode-1.27.tar.gz) = 07411c625707f3a2ec0adf98b419641e7deb27d2 SHA512 (Text-Unidecode-1.27.tar.gz) = c124e09b75050717fc13716b46ca54e607fd1e093f6ce06db466cda669d772661173a394eac81b5073a757f7af5e0174aa23eac037a356f008268b2bd767428c Size (Text-Unidecode-1.27.tar.gz) = 134929 bytes @ 1.6 log @Add SHA512 digests for distfiles for textproc category Problems found locating distfiles: Package cabocha: missing distfile cabocha-0.68.tar.bz2 Package convertlit: missing distfile clit18src.zip Package php-enchant: missing distfile php-enchant/enchant-1.1.0.tgz Otherwise, existing SHA1 digests verified and found to be the same on the machine holding the existing distfiles (morden). All existing SHA1 digests retained for now as an audit trail. 
@ text @d1 1 a1 1 $NetBSD: distinfo,v 1.5 2015/08/28 22:46:28 mef Exp $ d3 4 a6 4 SHA1 (Text-Unidecode-1.24.tar.gz) = eb492ce66f856d709a54fe5244f424a6555bf580 RMD160 (Text-Unidecode-1.24.tar.gz) = 601464595b2e0942c72e74cf0e09d4ea3f930b8b SHA512 (Text-Unidecode-1.24.tar.gz) = d9abcc2b3425457a814ffd2c1061d232d851633fca5780c87b4bfe0fcfa7025f1519776a433a650fe91f431f76ccf05b4e548f2a760acbd6fb2629675867fec0 Size (Text-Unidecode-1.24.tar.gz) = 131589 bytes @ 1.5 log @Update to 1.24 -------------- 2015-08-28 Sean M. Burke sburke@@cpan.org * RELEASE 1.24. Fixing a little (BIG) bug that David Cusimano is a superstar for having noticed. Ah, what a difference a ";" vs a "," makes! [https://rt.cpan.org/Public/Bug/Display.html?id=105420] * I'M BACK. After nine months of semi-catastrophic system failures, and after Voyager-style flybys of a dozen project deadlines... and now I can somehow try to get back in the swing of things. * ANOTHER superstar is Mistah Brendan Byrd who said that there are [ https://rt.cpan.org/Public/Bug/Display.html?id=102357 ] many ports of Unidecode to other languages and that I should brag about that fact, and he is very extremely correct, so now the Pod in Unidecode.pm indeed does just that. * (I got my distro-building back up and running. WOLVERIIIINES!) * I'm thinking of having future Unidecode/*.pm data files contain the canonical Unicode character name for every character as a comment. Obviously, this would make the dist pretty big. But the lib/Unidecode/*.pm files is somewhere around a meg. What's a few megs more?... with the benefit of added clarity? Everyone's a winner! @ text @d1 1 a1 1 $NetBSD: distinfo,v 1.4 2015/05/10 03:02:05 mef Exp $ d5 1 @ 1.4 log @Update to 1.23 -------------- 2014-12-07 Sean M. Burke sburke@@cpan.org * RELEASE 1.23. Just a bugfix version. 
* The bug in question: https://rt.cpan.org/Ticket/Display.html?id=97456 * Thank you very much to superstar Dagfinn Ilmari Mannsaker for noting it first *and* for providing a patch for a problem that would baffle me completely: "On perls 5.8.8 through 5.12.x, regex matches against UTF-16 surrogate characters emits a fatal "Malformed UTF-8 character" warning if warnings are enabled. ExtUtils::MakeMaker prior to 6.78 runs the test suite with -w, causing the installation to fail. The attached patch [which I applied -SMB] disables utf8 warnings while doing the regex substitution and converting the character number to a character in the test." And thank you very much to Ricardo Signes and Tim Bunce for reminding me to actually release this thang! I was stupid and forgot... for several MONTHS. * Doc: Adding mention of Tom Christiansen's "Perl Unicode Cookbook": http://www.perl.com/pub/2012/04/perlunicook-standard-preamble.html * Doc: Adding a suggestion of "use utf8;" in German example. @ text @d1 1 a1 1 $NetBSD: distinfo,v 1.3 2014/09/16 12:27:48 wen Exp $ d3 3 a5 3 SHA1 (Text-Unidecode-1.23.tar.gz) = fab5894d81b86d63ffdf8a78509ce668e60e4693 RMD160 (Text-Unidecode-1.23.tar.gz) = 888ed310339265dd74e44201b9c4d6bf81f9530a Size (Text-Unidecode-1.23.tar.gz) = 130431 bytes @ 1.3 log @Update to 1.22 Add LICENSE Upstream changes: 2014-08-15 Sean M. Burke sburke@@cpan.org * RELEASE 1.22. (The dev release works, so this is a version bump.) * See notes for 2014-07-25, because this is the first public release with significant changes since 2001! 2014-07-25 Sean M. Burke sburke@@cpan.org * !DEVELOPER RELEASE! * !Release 1.20_01! * Many bugfixes. Thanks especially to Tomaolc! * Yet more *.t files added for improved sanity checking. 
* Shuffling around the internals of Unidecode.pm * Putting in some vacuous 0x__.pm files where previously there would just be a load failure @ text @d1 1 a1 1 $NetBSD: distinfo,v 1.2 2014/08/11 02:11:27 wen Exp $ d3 3 a5 3 SHA1 (Text-Unidecode-1.22.tar.gz) = 845dd22ea0614d5692a6bffaddc1a0e4e8c61ae3 RMD160 (Text-Unidecode-1.22.tar.gz) = 079d73fe93ced3a541efd3cd5474db8917b1fdae Size (Text-Unidecode-1.22.tar.gz) = 129557 bytes @ 1.2 log @Update to 1.01 Upstream changes: 2014-06-30 Sean M. Burke sburke@@cpan.org * Release 1.01 -- first official Unidecode release since 2001!!! * There are no real changes since the 2014-06-23 developer release. I'm just making this all official now. 2014-06-23 Sean M. Burke sburke@@cpan.org * !DEVELOPER RELEASE! * Release 1.00_03 * Now asserting that we need at least Perl 5.8.0 An automated test system that tried running the t/*.t under a 5.6.2 spewed all kinds of crazy error messages. Hence the bump-up. So, I added assertions for the version. * I added some tests for more basic sanity assertions. 2014-06-17 Sean M. Burke sburke@@cpan.org v1.00_02 - Not released. Just internal rearranging. 2014-06-13 Sean M. Burke sburke@@cpan.org * !DEVELOPER RELEASE! * Release 1.00(_01!)- so many years later, finally we bump up to 1.*! * My documentation is now BRILLIANT. * Minor bugfixes. * Some code comments for clarity. * A modern test suite. * A proper release will follow in a few days. 
@ text @d1 1 a1 1 $NetBSD: distinfo,v 1.1.1.1 2009/02/24 12:00:40 tonnerre Exp $ d3 3 a5 3 SHA1 (Text-Unidecode-1.01.tar.gz) = 06cda303c630ca75393ce09322b6e3cea6664493 RMD160 (Text-Unidecode-1.01.tar.gz) = d620cb422c948365399f1e04bdcfd68fd08f21dc Size (Text-Unidecode-1.01.tar.gz) = 122457 bytes @ 1.1 log @Initial revision @ text @d1 1 a1 1 $NetBSD: distinfo,v 1.1.1.1 2005/10/04 18:51:27 wiz Exp $ d3 3 a5 3 SHA1 (Text-Unidecode-0.04.tar.gz) = baf3e2f90011e25fb10cb4d47ade53cc3977b3af RMD160 (Text-Unidecode-0.04.tar.gz) = 4a56c5b7494894c516e01f98ec7af1c7193a7d5b Size (Text-Unidecode-0.04.tar.gz) = 103091 bytes @ 1.1.1.1 log @Initial import of Text::Unidecode version 0.04. It often happens that you have non-Roman text data in Unicode, but you can't display it -- usually because you're trying to show it to a user via an application that doesn't support Unicode, or because the fonts you need aren't accessible. You could represent the Unicode characters as "???????" or "\15BA\15A0\1610...", but that's nearly useless to the user who actually wants to read what the text says. What Text::Unidecode provides is a function, unidecode(...) that takes Unicode data and tries to represent it in US-ASCII characters (i.e., the universally displayable characters between 0x00 and 0x7F). The representation is almost always an attempt at transliteration -- i.e., conveying, in Roman letters, the pronunciation expressed by the text in some other writing system. @ text @@