View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0001565 | OpenMPT | User Interface | public | 2022-02-07 05:25 | 2022-03-15 09:20 |
Reporter | ProgrammerNerd | Assigned To | manx | ||
Priority | normal | Severity | minor | Reproducibility | always |
Status | resolved | Resolution | fixed | ||
Platform | Arch Linux | OS | Linux | ||
Product Version | OpenMPT 1.30.02.00 / libopenmpt 0.6.1 (upgrade first) | ||||
Target Version | OpenMPT 1.30.03.00 / libopenmpt 0.6.2 (upgrade first) | Fixed in Version | OpenMPT 1.30.03.00 / libopenmpt 0.6.2 (upgrade first) | ||
Summary | 0001565: Jester's epilepsy.mod breaks the openmpt123 user interface | ||||
Description | When playing epilepsy.mod everything after the title gets messed up. Here is an example of what it looks like:
For comparison here is what I get when playing any other module:
| ||||
Steps To Reproduce |
| ||||
Tags | No tags attached. | ||||
Has the bug occurred in previous versions? | Yes | ||||
Tested code revision (in case you know it) | |||||
The link to modarchive.org is not correct. The ampersand got replaced with "&". Replace "&" with & or search for epilepsy.mod. There is only one result when doing so. |
|
When I typed & amp ; (without the spaces) it replaced it with an ampersand. I can't edit my comment so I'm posted this. Sorry for all these notes. This is my first time using the issue tracker. |
|
I would assume this to be caused by byte
On Windows, openmpt123 displays the character as |
|
Yes, that makes sense because it displays the 199 but it doesn't display the Z with an accent after it like OpenMPT does. I am wondering what would happen if this the module were loaded into Protracker on an Amiga? Would it also display a 'Z' (with an accent) or does 0x9E mean something else on an Amiga? |
|
Here I opened the module in Protracker 2.3D. Attached is a screenshot. It looks like the Z doesn't show on my setup. I will note that I'm not familiar with what options the Amiga had for international users and if Jester being from Germany would have installed some language pack or something similar in concept and if that would have an impact on Protracker. |
|
ISO8859-1 byte OpenMPT may display I do not think that libopenmpt should filter anything here, as there may be legitimate reasons for module formats supporting Unicode text to include Unicode Control Characters. libopenmpt is documented as returning "UTF8" and not "UTF8 without control characters" or anything similar. openmpt123 should filter any such Unicode Control Characters when displaying them. However, that is not as easy as it sounds. Unicode splatters all kinds of harmless and problematic control characters all over its code point space. If we just filter C1 and some C0, we would still miss a lot. A proper solution would make depend openmpt123 on a full Unicode library like libicu. I am actually not opposed to make openmpt123 depend on libicu, however that is not something that could be implemented in a timely manner. I guess we could just filter C1 and most of C0 for now, as there are only few formats supporting UTF8, this would likely be a fair tradeoff (for now). |
|
Sorry, of course I mean U+009E, not U+00093. From my understanding, the codepoints in C1 have alternative encodings which do not double as control characters such as
Amiga in general uses ISO-8859-1 but ProTracker bitmap fonts may not support characters beyond ASCII, or not all of them. In general, it's impossible to know which tracker a MOD file was made in, so we cannot rely on which characters ProTracker does and does not support, because a MOD file might just as well have been made in another tracker. |
|
As far as Unicode and official mapping tables are concerned, I am not aware of any alternative mappings from ISO-8859-1 to Unicode. However, looking at some Amiga Bitmap Fonts that are used in the ANSI scene (I looked at the Ansilove/PHP package), some fonts have duplicates of the range Other Amiga Bitmap Fonts have this range either blank or all empty squares. |
|
Right, so as the only difference between standard Latin-1 and Windows-1252 are the characters in range 0x80..0x9F being printable, I wonder if we should assume Windows-1252 to be the encoding used in MOD files (as that would allow e.g. umlauts from OpenMPT to keep working), and for other formats using Latin-1, we stop mapping 0x80..0x9F to control characters. Obviously this wouldn't cause the title of this particular module to be displayed "correctly", but it's hard to reason what is "correct" here anyway. To me it looks like this 0x9E byte isn't supposed to be there anyway, and is maybe caused by a corrupted file transfer. |
|
I guess we could do that. According to https://en.wikipedia.org/wiki/ISO/IEC_8859-1, the difference would basically come down to using However, in any case, openmpt123 should probably still filter known control characters. |
|
I think ISO 8859-15 isn't quite what we want here, as it replaces existing characters in ISO-8859-1 rather than adding them to the 0x80..0x9F range like Windows-1252. See this table: https://en.wikipedia.org/wiki/ISO/IEC_8859-15#Codepage_layout |
|
And that would have been my typo here. I meant to say (or copy) |
|
|
|
There are also https://www.iana.org/assignments/charset-reg/Amiga-1251 and https://de.wikipedia.org/wiki/Commodore_Amiga_(Zeichensatz) (german), but both do not solve our issue here (both use C1). It probably makes sense to implement this as a distinct Amiga encoding anyway instead of using any generic 8859-1 encoding for Amiga formats, due to the |
|
So the current plan appears to be:
The remaining questions is:
A, C, and E will be equally easy to implement. I'm not sure if any of this warrants back-porting to earlier release branches, especially since I did rewrite all of charset conversions for 0.6, which basically requires implementing it twice for back-porting. |
|
status update:
bug1565-implement-no-c1-charsets-v1.patch (14,800 bytes)
Index: common/mptString.cpp =================================================================== --- common/mptString.cpp (revision 16908) +++ common/mptString.cpp (working copy) @@ -357,18 +357,21 @@ switch(charset) { #if defined(MPT_ENABLE_CHARSET_LOCALE) - case Charset::Locale: return mpt::encode<Tdststring>(mpt::logical_encoding::locale, src); break; + case Charset::Locale: return mpt::encode<Tdststring>(mpt::logical_encoding::locale, src); break; #endif - case Charset::UTF8: return mpt::encode<Tdststring>(mpt::common_encoding::utf8, src); break; - case Charset::ASCII: return mpt::encode<Tdststring>(mpt::common_encoding::ascii, src); break; - case Charset::ISO8859_1: return mpt::encode<Tdststring>(mpt::common_encoding::iso8859_1, src); break; - case Charset::ISO8859_15: return mpt::encode<Tdststring>(mpt::common_encoding::iso8859_15, src); break; - case Charset::CP850: return mpt::encode<Tdststring>(mpt::common_encoding::cp850, src); break; - case Charset::CP437: return mpt::encode<Tdststring>(mpt::common_encoding::cp437, src); break; - case Charset::CP437AMS: return mpt::encode<Tdststring>(CharsetTableCP437AMS, src); break; - case Charset::CP437AMS2: return mpt::encode<Tdststring>(CharsetTableCP437AMS2, src); break; - case Charset::Windows1252: return mpt::encode<Tdststring>(mpt::common_encoding::windows1252, src); break; - case Charset::Amiga: return mpt::encode<Tdststring>(mpt::common_encoding::amiga, src); break; + case Charset::UTF8: return mpt::encode<Tdststring>(mpt::common_encoding::utf8, src); break; + case Charset::ASCII: return mpt::encode<Tdststring>(mpt::common_encoding::ascii, src); break; + case Charset::ISO8859_1: return mpt::encode<Tdststring>(mpt::common_encoding::iso8859_1, src); break; + case Charset::ISO8859_15: return mpt::encode<Tdststring>(mpt::common_encoding::iso8859_15, src); break; + case Charset::CP850: return mpt::encode<Tdststring>(mpt::common_encoding::cp850, src); break; + case Charset::CP437: return mpt::encode<Tdststring>(mpt::common_encoding::cp437, src); break; + case Charset::CP437AMS: return mpt::encode<Tdststring>(CharsetTableCP437AMS, src); break; + case Charset::CP437AMS2: return mpt::encode<Tdststring>(CharsetTableCP437AMS2, src); break; + case Charset::Windows1252: return mpt::encode<Tdststring>(mpt::common_encoding::windows1252, src); break; + case Charset::Amiga: return mpt::encode<Tdststring>(mpt::common_encoding::amiga, src); break; + case Charset::ISO8859_1_no_C1: return mpt::encode<Tdststring>(mpt::common_encoding::iso8859_1_no_c1, src); break; + case Charset::ISO8859_15_no_C1: return mpt::encode<Tdststring>(mpt::common_encoding::iso8859_15_no_c1, src); break; + case Charset::Amiga_no_C1: return mpt::encode<Tdststring>(mpt::common_encoding::amiga_no_c1, src); break; } return Tdststring(); } @@ -383,18 +386,21 @@ switch(charset) { #if defined(MPT_ENABLE_CHARSET_LOCALE) - case Charset::Locale: return mpt::decode<Tsrcstring>(mpt::logical_encoding::locale, src); break; + case Charset::Locale: return mpt::decode<Tsrcstring>(mpt::logical_encoding::locale, src); break; #endif - case Charset::UTF8: return mpt::decode<Tsrcstring>(mpt::common_encoding::utf8, src); break; - case Charset::ASCII: return mpt::decode<Tsrcstring>(mpt::common_encoding::ascii, src); break; - case Charset::ISO8859_1: return mpt::decode<Tsrcstring>(mpt::common_encoding::iso8859_1, src); break; - case Charset::ISO8859_15: return mpt::decode<Tsrcstring>(mpt::common_encoding::iso8859_15, src); break; - case Charset::CP850: return mpt::decode<Tsrcstring>(mpt::common_encoding::cp850, src); break; - case Charset::CP437: return mpt::decode<Tsrcstring>(mpt::common_encoding::cp437, src); break; - case Charset::CP437AMS: return mpt::decode<Tsrcstring>(CharsetTableCP437AMS, src); break; - case Charset::CP437AMS2: return mpt::decode<Tsrcstring>(CharsetTableCP437AMS2, src); break; - case Charset::Windows1252: return mpt::decode<Tsrcstring>(mpt::common_encoding::windows1252, src); break; - case Charset::Amiga: return mpt::decode<Tsrcstring>(mpt::common_encoding::amiga, src); break; + case Charset::UTF8: return mpt::decode<Tsrcstring>(mpt::common_encoding::utf8, src); break; + case Charset::ASCII: return mpt::decode<Tsrcstring>(mpt::common_encoding::ascii, src); break; + case Charset::ISO8859_1: return mpt::decode<Tsrcstring>(mpt::common_encoding::iso8859_1, src); break; + case Charset::ISO8859_15: return mpt::decode<Tsrcstring>(mpt::common_encoding::iso8859_15, src); break; + case Charset::CP850: return mpt::decode<Tsrcstring>(mpt::common_encoding::cp850, src); break; + case Charset::CP437: return mpt::decode<Tsrcstring>(mpt::common_encoding::cp437, src); break; + case Charset::CP437AMS: return mpt::decode<Tsrcstring>(CharsetTableCP437AMS, src); break; + case Charset::CP437AMS2: return mpt::decode<Tsrcstring>(CharsetTableCP437AMS2, src); break; + case Charset::Windows1252: return mpt::decode<Tsrcstring>(mpt::common_encoding::windows1252, src); break; + case Charset::Amiga: return mpt::decode<Tsrcstring>(mpt::common_encoding::amiga, src); break; + case Charset::ISO8859_1_no_C1: return mpt::decode<Tsrcstring>(mpt::common_encoding::iso8859_1_no_c1, src); break; + case Charset::ISO8859_15_no_C1: return mpt::decode<Tsrcstring>(mpt::common_encoding::iso8859_15_no_c1, src); break; + case Charset::Amiga_no_C1: return mpt::decode<Tsrcstring>(mpt::common_encoding::amiga_no_c1, src); break; } return mpt::widestring(); } Index: common/mptString.h =================================================================== --- common/mptString.h (revision 16908) +++ common/mptString.h (working copy) @@ -68,6 +68,10 @@ Amiga, + ISO8859_1_no_C1, + ISO8859_15_no_C1, + Amiga_no_C1, + #if defined(MPT_ENABLE_CHARSET_LOCALE) Locale, // CP_ACP on windows, current C locale otherwise #endif // MPT_ENABLE_CHARSET_LOCALE Index: src/mpt/string/types.hpp =================================================================== --- src/mpt/string/types.hpp (revision 16908) +++ src/mpt/string/types.hpp (working copy) @@ -37,6 +37,9 @@ cp437, windows1252, amiga, + iso8859_1_no_c1, + iso8859_15_no_c1, + amiga_no_c1, }; Index: src/mpt/string_transcode/transcode.hpp =================================================================== --- src/mpt/string_transcode/transcode.hpp (revision 16908) +++ src/mpt/string_transcode/transcode.hpp (working copy) @@ -43,27 +43,25 @@ -/* -default 1:1 mapping + +// default 1:1 mapping inline constexpr char32_t CharsetTableISO8859_1[256] = { - 0x0000,0x0001,0x0002,0x0003,0x0004,0x0005,0x0006,0x0007,0x0008,0x0009,0x000a,0x000b,0x000c,0x000d,0x000e,0x000f, - 0x0010,0x0011,0x0012,0x0013,0x0014,0x0015,0x0016,0x0017,0x0018,0x0019,0x001a,0x001b,0x001c,0x001d,0x001e,0x001f, - 0x0020,0x0021,0x0022,0x0023,0x0024,0x0025,0x0026,0x0027,0x0028,0x0029,0x002a,0x002b,0x002c,0x002d,0x002e,0x002f, - 0x0030,0x0031,0x0032,0x0033,0x0034,0x0035,0x0036,0x0037,0x0038,0x0039,0x003a,0x003b,0x003c,0x003d,0x003e,0x003f, - 0x0040,0x0041,0x0042,0x0043,0x0044,0x0045,0x0046,0x0047,0x0048,0x0049,0x004a,0x004b,0x004c,0x004d,0x004e,0x004f, - 0x0050,0x0051,0x0052,0x0053,0x0054,0x0055,0x0056,0x0057,0x0058,0x0059,0x005a,0x005b,0x005c,0x005d,0x005e,0x005f, - 0x0060,0x0061,0x0062,0x0063,0x0064,0x0065,0x0066,0x0067,0x0068,0x0069,0x006a,0x006b,0x006c,0x006d,0x006e,0x006f, - 0x0070,0x0071,0x0072,0x0073,0x0074,0x0075,0x0076,0x0077,0x0078,0x0079,0x007a,0x007b,0x007c,0x007d,0x007e,0x007f, - 0x0080,0x0081,0x0082,0x0083,0x0084,0x0085,0x0086,0x0087,0x0088,0x0089,0x008a,0x008b,0x008c,0x008d,0x008e,0x008f, - 0x0090,0x0091,0x0092,0x0093,0x0094,0x0095,0x0096,0x0097,0x0098,0x0099,0x009a,0x009b,0x009c,0x009d,0x009e,0x009f, - 0x00a0,0x00a1,0x00a2,0x00a3,0x00a4,0x00a5,0x00a6,0x00a7,0x00a8,0x00a9,0x00aa,0x00ab,0x00ac,0x00ad,0x00ae,0x00af, - 0x00b0,0x00b1,0x00b2,0x00b3,0x00b4,0x00b5,0x00b6,0x00b7,0x00b8,0x00b9,0x00ba,0x00bb,0x00bc,0x00bd,0x00be,0x00bf, - 0x00c0,0x00c1,0x00c2,0x00c3,0x00c4,0x00c5,0x00c6,0x00c7,0x00c8,0x00c9,0x00ca,0x00cb,0x00cc,0x00cd,0x00ce,0x00cf, - 0x00d0,0x00d1,0x00d2,0x00d3,0x00d4,0x00d5,0x00d6,0x00d7,0x00d8,0x00d9,0x00da,0x00db,0x00dc,0x00dd,0x00de,0x00df, - 0x00e0,0x00e1,0x00e2,0x00e3,0x00e4,0x00e5,0x00e6,0x00e7,0x00e8,0x00e9,0x00ea,0x00eb,0x00ec,0x00ed,0x00ee,0x00ef, - 0x00f0,0x00f1,0x00f2,0x00f3,0x00f4,0x00f5,0x00f6,0x00f7,0x00f8,0x00f9,0x00fa,0x00fb,0x00fc,0x00fd,0x00fe,0x00ff -}; -*/ + 0x0000, 0x0001, 0x0002, 0x0003, 0x0004, 0x0005, 0x0006, 0x0007, 0x0008, 0x0009, 0x000a, 0x000b, 0x000c, 0x000d, 0x000e, 0x000f, + 0x0010, 0x0011, 0x0012, 0x0013, 0x0014, 0x0015, 0x0016, 0x0017, 0x0018, 0x0019, 0x001a, 0x001b, 0x001c, 0x001d, 0x001e, 0x001f, + 0x0020, 0x0021, 0x0022, 0x0023, 0x0024, 0x0025, 0x0026, 0x0027, 0x0028, 0x0029, 0x002a, 0x002b, 0x002c, 0x002d, 0x002e, 0x002f, + 0x0030, 0x0031, 0x0032, 0x0033, 0x0034, 0x0035, 0x0036, 0x0037, 0x0038, 0x0039, 0x003a, 0x003b, 0x003c, 0x003d, 0x003e, 0x003f, + 0x0040, 0x0041, 0x0042, 0x0043, 0x0044, 0x0045, 0x0046, 0x0047, 0x0048, 0x0049, 0x004a, 0x004b, 0x004c, 0x004d, 0x004e, 0x004f, + 0x0050, 0x0051, 0x0052, 0x0053, 0x0054, 0x0055, 0x0056, 0x0057, 0x0058, 0x0059, 0x005a, 0x005b, 0x005c, 0x005d, 0x005e, 0x005f, + 0x0060, 0x0061, 0x0062, 0x0063, 0x0064, 0x0065, 0x0066, 0x0067, 0x0068, 0x0069, 0x006a, 0x006b, 0x006c, 0x006d, 0x006e, 0x006f, + 0x0070, 0x0071, 0x0072, 0x0073, 0x0074, 0x0075, 0x0076, 0x0077, 0x0078, 0x0079, 0x007a, 0x007b, 0x007c, 0x007d, 0x007e, 0x007f, + 0x0080, 0x0081, 0x0082, 0x0083, 0x0084, 0x0085, 0x0086, 0x0087, 0x0088, 0x0089, 0x008a, 0x008b, 0x008c, 0x008d, 0x008e, 0x008f, + 0x0090, 0x0091, 0x0092, 0x0093, 0x0094, 0x0095, 0x0096, 0x0097, 0x0098, 0x0099, 0x009a, 0x009b, 0x009c, 0x009d, 0x009e, 0x009f, + 0x00a0, 0x00a1, 0x00a2, 0x00a3, 0x00a4, 0x00a5, 0x00a6, 0x00a7, 0x00a8, 0x00a9, 0x00aa, 0x00ab, 0x00ac, 0x00ad, 0x00ae, 0x00af, + 0x00b0, 0x00b1, 0x00b2, 0x00b3, 0x00b4, 0x00b5, 0x00b6, 0x00b7, 0x00b8, 0x00b9, 0x00ba, 0x00bb, 0x00bc, 0x00bd, 0x00be, 0x00bf, + 0x00c0, 0x00c1, 0x00c2, 0x00c3, 0x00c4, 0x00c5, 0x00c6, 0x00c7, 0x00c8, 0x00c9, 0x00ca, 0x00cb, 0x00cc, 0x00cd, 0x00ce, 0x00cf, + 0x00d0, 0x00d1, 0x00d2, 0x00d3, 0x00d4, 0x00d5, 0x00d6, 0x00d7, 0x00d8, 0x00d9, 0x00da, 0x00db, 0x00dc, 0x00dd, 0x00de, 0x00df, + 0x00e0, 0x00e1, 0x00e2, 0x00e3, 0x00e4, 0x00e5, 0x00e6, 0x00e7, 0x00e8, 0x00e9, 0x00ea, 0x00eb, 0x00ec, 0x00ed, 0x00ee, 0x00ef, + 0x00f0, 0x00f1, 0x00f2, 0x00f3, 0x00f4, 0x00f5, 0x00f6, 0x00f7, 0x00f8, 0x00f9, 0x00fa, 0x00fb, 0x00fc, 0x00fd, 0x00fe, 0x00ff}; inline constexpr char32_t CharsetTableISO8859_15[256] = { 0x0000, 0x0001, 0x0002, 0x0003, 0x0004, 0x0005, 0x0006, 0x0007, 0x0008, 0x0009, 0x000a, 0x000b, 0x000c, 0x000d, 0x000e, 0x000f, @@ -206,6 +204,60 @@ } template <typename Tsrcstring> +inline mpt::widestring decode_8bit_no_c1(const Tsrcstring & str, const char32_t (&table)[256], mpt::widechar replacement = MPT_WIDECHAR('\uFFFD')) { + mpt::widestring res; + res.reserve(str.length()); + for (std::size_t i = 0; i < str.length(); ++i) { + std::size_t c = static_cast<std::size_t>(mpt::char_value(str[i])); + if ((0x80 <= c) && (c <= 0x9f)) { + res.push_back(replacement); + } else if (c < std::size(table)) { + res.push_back(static_cast<mpt::widechar>(table[c])); + } else { + res.push_back(replacement); + } + } + return res; +} + +template <typename Tdststring> +inline Tdststring encode_8bit_no_c1(const mpt::widestring & str, const char32_t (&table)[256], char replacement = '?') { + Tdststring res; + res.reserve(str.length()); + for (std::size_t i = 0; i < str.length(); ++i) { + char32_t c = static_cast<char32_t>(str[i]); + bool found = false; + // Try non-control characters first. + // In cases where there are actual characters mirrored in this range (like in AMS/AMS2 character sets), + // characters in the common range are preferred this way. + for (std::size_t x = 0x20; x < std::size(table); ++x) { + if ((0x80 <= c) && (c <= 0x9f)) { + continue; + } + if (c == table[x]) { + res.push_back(static_cast<typename Tdststring::value_type>(static_cast<uint8>(x))); + found = true; + break; + } + } + if (!found) { + // try control characters + for (std::size_t x = 0x00; x < std::size(table) && x < 0x20; ++x) { + if (c == table[x]) { + res.push_back(static_cast<typename Tdststring::value_type>(static_cast<uint8>(x))); + found = true; + break; + } + } + } + if (!found) { + res.push_back(replacement); + } + } + return res; +} + +template <typename Tsrcstring> inline mpt::widestring decode_ascii(const Tsrcstring & str, mpt::widechar replacement = MPT_WIDECHAR('\uFFFD')) { mpt::widestring res; res.reserve(str.length()); @@ -493,6 +545,15 @@ case common_encoding::amiga: result = false; break; + case common_encoding::iso8859_1_no_c1: + result = false; + break; + case common_encoding::iso8859_15_no_c1: + result = false; + break; + case common_encoding::amiga_no_c1: + result = false; + break; } return result; } @@ -550,6 +611,15 @@ case common_encoding::amiga: throw std::domain_error("unsupported encoding"); break; + case common_encoding::iso8859_1_no_c1: + throw std::domain_error("unsupported encoding"); + break; + case common_encoding::iso8859_15_no_c1: + throw std::domain_error("unsupported encoding"); + break; + case common_encoding::amiga_no_c1: + throw std::domain_error("unsupported encoding"); + break; } return result; } @@ -856,6 +926,15 @@ case common_encoding::amiga: return encode_8bit<Tdststring>(src, CharsetTableAmiga); break; + case common_encoding::iso8859_1_no_c1: + return encode_8bit_no_c1<Tdststring>(src, CharsetTableISO8859_1); + break; + case common_encoding::iso8859_15_no_c1: + return encode_8bit_no_c1<Tdststring>(src, CharsetTableISO8859_15); + break; + case common_encoding::amiga_no_c1: + return encode_8bit_no_c1<Tdststring>(src, CharsetTableAmiga); + break; } throw std::domain_error("unsupported encoding"); } @@ -953,6 +1032,15 @@ case common_encoding::amiga: return decode_8bit(src, CharsetTableAmiga); break; + case common_encoding::iso8859_1_no_c1: + return decode_8bit_no_c1(src, CharsetTableISO8859_1); + break; + case common_encoding::iso8859_15_no_c1: + return decode_8bit_no_c1(src, CharsetTableISO8859_15); + break; + case common_encoding::amiga_no_c1: + return decode_8bit_no_c1(src, CharsetTableAmiga); + break; } throw std::domain_error("unsupported encoding"); } |
|
status update: RISC OS charset is also based on ISO8859-1, but has actual glyphs in the C1 range, so there is no question as to which these should map to in order to fix the same issue as this one. |
|
status update:
TODO:
bug1565-implement-no-c1-charsets-v2.patch (15,020 bytes)
Index: common/mptString.cpp =================================================================== --- common/mptString.cpp (revision 16910) +++ common/mptString.cpp (working copy) @@ -357,18 +357,21 @@ switch(charset) { #if defined(MPT_ENABLE_CHARSET_LOCALE) - case Charset::Locale: return mpt::encode<Tdststring>(mpt::logical_encoding::locale, src); break; + case Charset::Locale: return mpt::encode<Tdststring>(mpt::logical_encoding::locale, src); break; #endif - case Charset::UTF8: return mpt::encode<Tdststring>(mpt::common_encoding::utf8, src); break; - case Charset::ASCII: return mpt::encode<Tdststring>(mpt::common_encoding::ascii, src); break; - case Charset::ISO8859_1: return mpt::encode<Tdststring>(mpt::common_encoding::iso8859_1, src); break; - case Charset::ISO8859_15: return mpt::encode<Tdststring>(mpt::common_encoding::iso8859_15, src); break; - case Charset::CP850: return mpt::encode<Tdststring>(mpt::common_encoding::cp850, src); break; - case Charset::CP437: return mpt::encode<Tdststring>(mpt::common_encoding::cp437, src); break; - case Charset::CP437AMS: return mpt::encode<Tdststring>(CharsetTableCP437AMS, src); break; - case Charset::CP437AMS2: return mpt::encode<Tdststring>(CharsetTableCP437AMS2, src); break; - case Charset::Windows1252: return mpt::encode<Tdststring>(mpt::common_encoding::windows1252, src); break; - case Charset::Amiga: return mpt::encode<Tdststring>(mpt::common_encoding::amiga, src); break; + case Charset::UTF8: return mpt::encode<Tdststring>(mpt::common_encoding::utf8, src); break; + case Charset::ASCII: return mpt::encode<Tdststring>(mpt::common_encoding::ascii, src); break; + case Charset::ISO8859_1: return mpt::encode<Tdststring>(mpt::common_encoding::iso8859_1, src); break; + case Charset::ISO8859_15: return mpt::encode<Tdststring>(mpt::common_encoding::iso8859_15, src); break; + case Charset::CP850: return mpt::encode<Tdststring>(mpt::common_encoding::cp850, src); break; + case Charset::CP437: return mpt::encode<Tdststring>(mpt::common_encoding::cp437, src); break; + case Charset::CP437AMS: return mpt::encode<Tdststring>(CharsetTableCP437AMS, src); break; + case Charset::CP437AMS2: return mpt::encode<Tdststring>(CharsetTableCP437AMS2, src); break; + case Charset::Windows1252: return mpt::encode<Tdststring>(mpt::common_encoding::windows1252, src); break; + case Charset::Amiga: return mpt::encode<Tdststring>(mpt::common_encoding::amiga, src); break; + case Charset::ISO8859_1_no_C1: return mpt::encode<Tdststring>(mpt::common_encoding::iso8859_1_no_c1, src); break; + case Charset::ISO8859_15_no_C1: return mpt::encode<Tdststring>(mpt::common_encoding::iso8859_15_no_c1, src); break; + case Charset::Amiga_no_C1: return mpt::encode<Tdststring>(mpt::common_encoding::amiga_no_c1, src); break; case Charset::RISC_OS: return mpt::encode<Tdststring>(mpt::common_encoding::riscos, src); break; } return Tdststring(); @@ -384,18 +387,21 @@ switch(charset) { #if defined(MPT_ENABLE_CHARSET_LOCALE) - case Charset::Locale: return mpt::decode<Tsrcstring>(mpt::logical_encoding::locale, src); break; + case Charset::Locale: return mpt::decode<Tsrcstring>(mpt::logical_encoding::locale, src); break; #endif - case Charset::UTF8: return mpt::decode<Tsrcstring>(mpt::common_encoding::utf8, src); break; - case Charset::ASCII: return mpt::decode<Tsrcstring>(mpt::common_encoding::ascii, src); break; - case Charset::ISO8859_1: return mpt::decode<Tsrcstring>(mpt::common_encoding::iso8859_1, src); break; - case Charset::ISO8859_15: return mpt::decode<Tsrcstring>(mpt::common_encoding::iso8859_15, src); break; - case Charset::CP850: return mpt::decode<Tsrcstring>(mpt::common_encoding::cp850, src); break; - case Charset::CP437: return mpt::decode<Tsrcstring>(mpt::common_encoding::cp437, src); break; - case Charset::CP437AMS: return mpt::decode<Tsrcstring>(CharsetTableCP437AMS, src); break; - case Charset::CP437AMS2: return mpt::decode<Tsrcstring>(CharsetTableCP437AMS2, src); break; - case Charset::Windows1252: return mpt::decode<Tsrcstring>(mpt::common_encoding::windows1252, src); break; - case Charset::Amiga: return mpt::decode<Tsrcstring>(mpt::common_encoding::amiga, src); break; + case Charset::UTF8: return mpt::decode<Tsrcstring>(mpt::common_encoding::utf8, src); break; + case Charset::ASCII: return mpt::decode<Tsrcstring>(mpt::common_encoding::ascii, src); break; + case Charset::ISO8859_1: return mpt::decode<Tsrcstring>(mpt::common_encoding::iso8859_1, src); break; + case Charset::ISO8859_15: return mpt::decode<Tsrcstring>(mpt::common_encoding::iso8859_15, src); break; + case Charset::CP850: return mpt::decode<Tsrcstring>(mpt::common_encoding::cp850, src); break; + case Charset::CP437: return mpt::decode<Tsrcstring>(mpt::common_encoding::cp437, src); break; + case Charset::CP437AMS: return mpt::decode<Tsrcstring>(CharsetTableCP437AMS, src); break; + case Charset::CP437AMS2: return mpt::decode<Tsrcstring>(CharsetTableCP437AMS2, src); break; + case Charset::Windows1252: return mpt::decode<Tsrcstring>(mpt::common_encoding::windows1252, src); break; + case Charset::Amiga: return mpt::decode<Tsrcstring>(mpt::common_encoding::amiga, src); break; + case Charset::ISO8859_1_no_C1: return mpt::decode<Tsrcstring>(mpt::common_encoding::iso8859_1_no_c1, src); break; + case Charset::ISO8859_15_no_C1: return mpt::decode<Tsrcstring>(mpt::common_encoding::iso8859_15_no_c1, src); break; + case Charset::Amiga_no_C1: return mpt::decode<Tsrcstring>(mpt::common_encoding::amiga_no_c1, src); break; case Charset::RISC_OS: return mpt::decode<Tsrcstring>(mpt::common_encoding::riscos, src); break; } return mpt::widestring(); Index: common/mptString.h =================================================================== --- common/mptString.h (revision 16910) +++ common/mptString.h (working copy) @@ -69,6 +69,10 @@ Amiga, RISC_OS, + ISO8859_1_no_C1, + ISO8859_15_no_C1, + Amiga_no_C1, + #if defined(MPT_ENABLE_CHARSET_LOCALE) Locale, // CP_ACP on windows, current C locale otherwise #endif // MPT_ENABLE_CHARSET_LOCALE Index: src/mpt/string/types.hpp =================================================================== --- src/mpt/string/types.hpp (revision 16910) +++ src/mpt/string/types.hpp (working copy) @@ -38,6 +38,9 @@ windows1252, amiga, riscos, + iso8859_1_no_c1, + iso8859_15_no_c1, + amiga_no_c1, }; Index: src/mpt/string_transcode/transcode.hpp =================================================================== --- src/mpt/string_transcode/transcode.hpp (revision 16910) +++ src/mpt/string_transcode/transcode.hpp (working copy) @@ -43,27 +43,25 @@ -/* -default 1:1 mapping + +// default 1:1 mapping inline constexpr char32_t CharsetTableISO8859_1[256] = { - 0x0000,0x0001,0x0002,0x0003,0x0004,0x0005,0x0006,0x0007,0x0008,0x0009,0x000a,0x000b,0x000c,0x000d,0x000e,0x000f, - 0x0010,0x0011,0x0012,0x0013,0x0014,0x0015,0x0016,0x0017,0x0018,0x0019,0x001a,0x001b,0x001c,0x001d,0x001e,0x001f, - 0x0020,0x0021,0x0022,0x0023,0x0024,0x0025,0x0026,0x0027,0x0028,0x0029,0x002a,0x002b,0x002c,0x002d,0x002e,0x002f, - 0x0030,0x0031,0x0032,0x0033,0x0034,0x0035,0x0036,0x0037,0x0038,0x0039,0x003a,0x003b,0x003c,0x003d,0x003e,0x003f, - 0x0040,0x0041,0x0042,0x0043,0x0044,0x0045,0x0046,0x0047,0x0048,0x0049,0x004a,0x004b,0x004c,0x004d,0x004e,0x004f, - 0x0050,0x0051,0x0052,0x0053,0x0054,0x0055,0x0056,0x0057,0x0058,0x0059,0x005a,0x005b,0x005c,0x005d,0x005e,0x005f, - 0x0060,0x0061,0x0062,0x0063,0x0064,0x0065,0x0066,0x0067,0x0068,0x0069,0x006a,0x006b,0x006c,0x006d,0x006e,0x006f, - 0x0070,0x0071,0x0072,0x0073,0x0074,0x0075,0x0076,0x0077,0x0078,0x0079,0x007a,0x007b,0x007c,0x007d,0x007e,0x007f, - 0x0080,0x0081,0x0082,0x0083,0x0084,0x0085,0x0086,0x0087,0x0088,0x0089,0x008a,0x008b,0x008c,0x008d,0x008e,0x008f, - 0x0090,0x0091,0x0092,0x0093,0x0094,0x0095,0x0096,0x0097,0x0098,0x0099,0x009a,0x009b,0x009c,0x009d,0x009e,0x009f, - 0x00a0,0x00a1,0x00a2,0x00a3,0x00a4,0x00a5,0x00a6,0x00a7,0x00a8,0x00a9,0x00aa,0x00ab,0x00ac,0x00ad,0x00ae,0x00af, - 0x00b0,0x00b1,0x00b2,0x00b3,0x00b4,0x00b5,0x00b6,0x00b7,0x00b8,0x00b9,0x00ba,0x00bb,0x00bc,0x00bd,0x00be,0x00bf, - 0x00c0,0x00c1,0x00c2,0x00c3,0x00c4,0x00c5,0x00c6,0x00c7,0x00c8,0x00c9,0x00ca,0x00cb,0x00cc,0x00cd,0x00ce,0x00cf, - 0x00d0,0x00d1,0x00d2,0x00d3,0x00d4,0x00d5,0x00d6,0x00d7,0x00d8,0x00d9,0x00da,0x00db,0x00dc,0x00dd,0x00de,0x00df, - 0x00e0,0x00e1,0x00e2,0x00e3,0x00e4,0x00e5,0x00e6,0x00e7,0x00e8,0x00e9,0x00ea,0x00eb,0x00ec,0x00ed,0x00ee,0x00ef, - 0x00f0,0x00f1,0x00f2,0x00f3,0x00f4,0x00f5,0x00f6,0x00f7,0x00f8,0x00f9,0x00fa,0x00fb,0x00fc,0x00fd,0x00fe,0x00ff -}; -*/ + 0x0000, 0x0001, 0x0002, 0x0003, 0x0004, 0x0005, 0x0006, 0x0007, 0x0008, 0x0009, 0x000a, 0x000b, 0x000c, 0x000d, 0x000e, 0x000f, + 0x0010, 0x0011, 0x0012, 0x0013, 0x0014, 0x0015, 0x0016, 0x0017, 0x0018, 0x0019, 0x001a, 0x001b, 0x001c, 0x001d, 0x001e, 0x001f, + 0x0020, 0x0021, 0x0022, 0x0023, 0x0024, 0x0025, 0x0026, 0x0027, 0x0028, 0x0029, 0x002a, 0x002b, 0x002c, 0x002d, 0x002e, 0x002f, + 0x0030, 0x0031, 0x0032, 0x0033, 0x0034, 0x0035, 0x0036, 0x0037, 0x0038, 0x0039, 0x003a, 0x003b, 0x003c, 0x003d, 0x003e, 0x003f, + 0x0040, 0x0041, 0x0042, 0x0043, 0x0044, 0x0045, 0x0046, 0x0047, 0x0048, 0x0049, 0x004a, 0x004b, 0x004c, 0x004d, 0x004e, 0x004f, + 0x0050, 0x0051, 0x0052, 0x0053, 0x0054, 0x0055, 0x0056, 0x0057, 0x0058, 0x0059, 0x005a, 0x005b, 0x005c, 0x005d, 0x005e, 0x005f, + 0x0060, 0x0061, 0x0062, 0x0063, 0x0064, 0x0065, 0x0066, 0x0067, 0x0068, 0x0069, 0x006a, 0x006b, 0x006c, 0x006d, 0x006e, 0x006f, + 0x0070, 0x0071, 0x0072, 0x0073, 0x0074, 0x0075, 0x0076, 0x0077, 0x0078, 0x0079, 0x007a, 0x007b, 0x007c, 0x007d, 0x007e, 0x007f, + 0x0080, 0x0081, 0x0082, 0x0083, 0x0084, 0x0085, 0x0086, 0x0087, 0x0088, 0x0089, 0x008a, 0x008b, 0x008c, 0x008d, 0x008e, 0x008f, + 0x0090, 0x0091, 0x0092, 0x0093, 0x0094, 0x0095, 0x0096, 0x0097, 0x0098, 0x0099, 0x009a, 0x009b, 0x009c, 0x009d, 0x009e, 0x009f, + 0x00a0, 0x00a1, 0x00a2, 0x00a3, 0x00a4, 0x00a5, 0x00a6, 0x00a7, 0x00a8, 0x00a9, 0x00aa, 0x00ab, 0x00ac, 0x00ad, 0x00ae, 0x00af, + 0x00b0, 0x00b1, 0x00b2, 0x00b3, 0x00b4, 0x00b5, 0x00b6, 0x00b7, 0x00b8, 0x00b9, 0x00ba, 0x00bb, 0x00bc, 0x00bd, 0x00be, 0x00bf, + 0x00c0, 0x00c1, 0x00c2, 0x00c3, 0x00c4, 0x00c5, 0x00c6, 0x00c7, 0x00c8, 0x00c9, 0x00ca, 0x00cb, 0x00cc, 0x00cd, 0x00ce, 0x00cf, + 0x00d0, 0x00d1, 0x00d2, 0x00d3, 0x00d4, 0x00d5, 0x00d6, 0x00d7, 0x00d8, 0x00d9, 0x00da, 0x00db, 0x00dc, 0x00dd, 0x00de, 0x00df, + 0x00e0, 0x00e1, 0x00e2, 0x00e3, 0x00e4, 0x00e5, 0x00e6, 0x00e7, 0x00e8, 0x00e9, 0x00ea, 0x00eb, 0x00ec, 0x00ed, 0x00ee, 0x00ef, + 0x00f0, 0x00f1, 0x00f2, 0x00f3, 0x00f4, 0x00f5, 0x00f6, 0x00f7, 0x00f8, 0x00f9, 0x00fa, 0x00fb, 0x00fc, 0x00fd, 0x00fe, 0x00ff}; inline constexpr char32_t CharsetTableISO8859_15[256] = { 0x0000, 0x0001, 0x0002, 0x0003, 0x0004, 0x0005, 0x0006, 0x0007, 0x0008, 0x0009, 0x000a, 0x000b, 0x000c, 0x000d, 0x000e, 0x000f, @@ -226,6 +224,60 @@ } template <typename Tsrcstring> +inline mpt::widestring decode_8bit_no_c1(const Tsrcstring & str, const char32_t (&table)[256], mpt::widechar replacement = MPT_WIDECHAR('\uFFFD')) { + mpt::widestring res; + res.reserve(str.length()); + for (std::size_t i = 0; i < str.length(); ++i) { + std::size_t c = static_cast<std::size_t>(mpt::char_value(str[i])); + if ((0x80 <= c) && (c <= 0x9f)) { + res.push_back(replacement); + } else if (c < std::size(table)) { + res.push_back(static_cast<mpt::widechar>(table[c])); + } else { + res.push_back(replacement); + } + } + return res; +} + +template <typename Tdststring> +inline Tdststring encode_8bit_no_c1(const mpt::widestring & str, const char32_t (&table)[256], char replacement = '?') { + Tdststring res; + res.reserve(str.length()); + for (std::size_t i = 0; i < str.length(); ++i) { + char32_t c = static_cast<char32_t>(str[i]); + bool found = false; + // Try non-control characters first. + // In cases where there are actual characters mirrored in this range (like in AMS/AMS2 character sets), + // characters in the common range are preferred this way. + for (std::size_t x = 0x20; x < std::size(table); ++x) { + if ((0x80 <= c) && (c <= 0x9f)) { + continue; + } + if (c == table[x]) { + res.push_back(static_cast<typename Tdststring::value_type>(static_cast<uint8>(x))); + found = true; + break; + } + } + if (!found) { + // try control characters + for (std::size_t x = 0x00; x < std::size(table) && x < 0x20; ++x) { + if (c == table[x]) { + res.push_back(static_cast<typename Tdststring::value_type>(static_cast<uint8>(x))); + found = true; + break; + } + } + } + if (!found) { + res.push_back(replacement); + } + } + return res; +} + +template <typename Tsrcstring> inline mpt::widestring decode_ascii(const Tsrcstring & str, mpt::widechar replacement = MPT_WIDECHAR('\uFFFD')) { mpt::widestring res; res.reserve(str.length()); @@ -516,6 +568,15 @@ case common_encoding::riscos: result = false; break; + case common_encoding::iso8859_1_no_c1: + result = false; + break; + case common_encoding::iso8859_15_no_c1: + result = false; + break; + case common_encoding::amiga_no_c1: + result = false; + break; } return result; } @@ -576,6 +637,15 @@ case common_encoding::riscos: throw std::domain_error("unsupported encoding"); break; + case common_encoding::iso8859_1_no_c1: + throw std::domain_error("unsupported encoding"); + break; + case common_encoding::iso8859_15_no_c1: + throw std::domain_error("unsupported encoding"); + break; + case common_encoding::amiga_no_c1: + throw std::domain_error("unsupported encoding"); + break; } return result; } @@ -885,6 +955,15 @@ case common_encoding::riscos: return encode_8bit<Tdststring>(src, CharsetTableRISC_OS); break; + case common_encoding::iso8859_1_no_c1: + return encode_8bit_no_c1<Tdststring>(src, CharsetTableISO8859_1); + break; + case common_encoding::iso8859_15_no_c1: + return encode_8bit_no_c1<Tdststring>(src, CharsetTableISO8859_15); + break; + case common_encoding::amiga_no_c1: + return encode_8bit_no_c1<Tdststring>(src, CharsetTableAmiga); + break; } throw std::domain_error("unsupported encoding"); } @@ -985,6 +1064,15 @@ case common_encoding::riscos: return decode_8bit(src, CharsetTableRISC_OS); break; + case common_encoding::iso8859_1_no_c1: + return decode_8bit_no_c1(src, CharsetTableISO8859_1); + break; + case common_encoding::iso8859_15_no_c1: + return decode_8bit_no_c1(src, CharsetTableISO8859_15); + break; + case common_encoding::amiga_no_c1: + return decode_8bit_no_c1(src, CharsetTableAmiga); + break; } throw std::domain_error("unsupported encoding"); } bug1565-use-amiga-no-c1-charset-v1.patch (9,917 bytes)
Index: soundlib/Load_dbm.cpp =================================================================== --- soundlib/Load_dbm.cpp (revision 16910) +++ soundlib/Load_dbm.cpp (working copy) @@ -356,7 +356,7 @@ m_modFormat.formatName = U_("DigiBooster Pro"); m_modFormat.type = U_("dbm"); m_modFormat.madeWithTracker = MPT_UFORMAT("DigiBooster Pro {}.{}")(mpt::ufmt::hex(fileHeader.trkVerHi), mpt::ufmt::hex(fileHeader.trkVerLo)); - m_modFormat.charset = mpt::Charset::ISO8859_1; + m_modFormat.charset = mpt::Charset::Amiga_no_C1; // Name chunk FileReader nameChunk = chunks.GetChunk(DBMChunk::idNAME); @@ -383,7 +383,7 @@ if(Order.AddSequence() == SEQUENCEINDEX_INVALID) break; } - Order().SetName(mpt::ToUnicode(mpt::Charset::ISO8859_1, name)); + Order().SetName(mpt::ToUnicode(mpt::Charset::Amiga_no_C1, name)); ReadOrderFromFile<uint16be>(Order(), songChunk, numOrders); #else const ORDERINDEX startIndex = Order().GetLength(); Index: soundlib/Load_digi.cpp =================================================================== --- soundlib/Load_digi.cpp (revision 16910) +++ soundlib/Load_digi.cpp (working copy) @@ -129,7 +129,7 @@ m_modFormat.formatName = U_("DigiBooster"); m_modFormat.type = U_("digi"); m_modFormat.madeWithTracker = MPT_UFORMAT("Digi Booster {}.{}")(fileHeader.versionInt >> 4, fileHeader.versionInt & 0x0F); - m_modFormat.charset = mpt::Charset::ISO8859_1; + m_modFormat.charset = mpt::Charset::Amiga_no_C1; ReadOrderFromArray(Order(), fileHeader.orders, fileHeader.lastOrdIndex + 1); Index: soundlib/Load_dtm.cpp =================================================================== --- soundlib/Load_dtm.cpp (revision 16910) +++ soundlib/Load_dtm.cpp (working copy) @@ -598,7 +598,7 @@ m_modFormat.formatName = U_("Digital Tracker"); m_modFormat.type = U_("dtm"); m_modFormat.madeWithTracker = std::move(tracker); - m_modFormat.charset = mpt::Charset::ISO8859_1; + m_modFormat.charset = mpt::Charset::Amiga_no_C1; return true; } Index: soundlib/Load_gt2.cpp =================================================================== --- soundlib/Load_gt2.cpp (revision 16910) +++ soundlib/Load_gt2.cpp (working copy) @@ -540,7 +540,7 @@ m_modFormat.madeWithTracker = U_("Graoumf Tracker"); m_modFormat.formatName = MPT_UFORMAT("Graoumf Tracker v{}")(fileHeader.fileVersion); m_modFormat.type = U_("gtk"); - m_modFormat.charset = mpt::Charset::ISO8859_1; + m_modFormat.charset = mpt::Charset::Amiga_no_C1; m_songName = mpt::String::ReadBuf(mpt::String::spacePadded, fileHeader.songName); size_t msgLength = sizeof(fileHeader.smallComment); @@ -1223,7 +1223,7 @@ m_modFormat.madeWithTracker = mpt::ToUnicode(mpt::Charset::ASCII, mpt::String::ReadBuf(mpt::String::spacePadded, fileHeader.trackerName)); m_modFormat.formatName = (fileHeader.fileVersion <= 5 ? MPT_UFORMAT("Graoumf Tracker v{}") : MPT_UFORMAT("Graoumf Tracker 2 v{}"))(fileHeader.fileVersion); m_modFormat.type = U_("gt2"); - m_modFormat.charset = mpt::Charset::ISO8859_1; + m_modFormat.charset = mpt::Charset::Amiga_no_C1; m_songName = mpt::String::ReadBuf(mpt::String::spacePadded, fileHeader.songName); Index: soundlib/Load_med.cpp =================================================================== --- soundlib/Load_med.cpp (revision 16910) +++ soundlib/Load_med.cpp (working copy) @@ -1146,7 +1146,7 @@ } } if(playSeq.name[0]) - order.SetName(mpt::ToUnicode(mpt::Charset::ISO8859_1, mpt::String::ReadAutoBuf(playSeq.name))); + order.SetName(mpt::ToUnicode(mpt::Charset::Amiga_no_C1, mpt::String::ReadAutoBuf(playSeq.name))); // Play commands (jump / stop) if(playSeq.commandTableOffset > 0 && file.Seek(playSeq.commandTableOffset)) @@ -1194,7 +1194,7 @@ { file.ReadString<mpt::String::maybeNullTerminated>(m_songName, expData.songNameLength); if(numSongs > 1) - order.SetName(mpt::ToUnicode(mpt::Charset::ISO8859_1, m_songName)); + order.SetName(mpt::ToUnicode(mpt::Charset::Amiga_no_C1, m_songName)); } if(expData.annoLength > 1 && file.Seek(expData.annoText)) { @@ -1434,7 +1434,7 @@ m_modFormat.formatName = MPT_UFORMAT("OctaMED (MMD{})")(version); m_modFormat.type = MPT_USTRING("med"); m_modFormat.madeWithTracker = madeWithTracker; - m_modFormat.charset = mpt::Charset::ISO8859_1; + m_modFormat.charset = mpt::Charset::Amiga_no_C1; return true; } Index: soundlib/Load_mid.cpp =================================================================== --- soundlib/Load_mid.cpp (revision 16910) +++ soundlib/Load_mid.cpp (working copy) @@ -640,7 +640,7 @@ m_modFormat.formatName = U_("Standard MIDI File"); m_modFormat.type = isRIFF ? UL_("rmi") : UL_("mid"); m_modFormat.madeWithTracker = U_("Standard MIDI File"); - m_modFormat.charset = mpt::Charset::ISO8859_1; + m_modFormat.charset = mpt::Charset::Amiga_no_C1; SetMixLevels(MixLevels::v1_17RC3); m_nTempoMode = TempoMode::Modern; Index: soundlib/Load_mo3.cpp =================================================================== --- soundlib/Load_mo3.cpp (revision 16910) +++ soundlib/Load_mo3.cpp (working copy) @@ -1932,7 +1932,7 @@ if(m_dwLastSavedWithVersion) m_modFormat.charset = mpt::Charset::Windows1252; else if(GetType() == MOD_TYPE_MOD) - m_modFormat.charset = mpt::Charset::ISO8859_1; + m_modFormat.charset = mpt::Charset::Amiga_no_C1; else m_modFormat.charset = mpt::Charset::CP437; Index: soundlib/Load_mod.cpp =================================================================== --- soundlib/Load_mod.cpp (revision 16910) +++ soundlib/Load_mod.cpp (working copy) @@ -1309,7 +1309,7 @@ m_modFormat.type = U_("mod"); if(modMagicResult.madeWithTracker) m_modFormat.madeWithTracker = modMagicResult.madeWithTracker; - m_modFormat.charset = mpt::Charset::ISO8859_1; + m_modFormat.charset = mpt::Charset::Amiga_no_C1; return true; } @@ -1811,7 +1811,7 @@ m_modFormat.formatName = U_("Soundtracker"); m_modFormat.type = U_("stk"); m_modFormat.madeWithTracker = madeWithTracker; - m_modFormat.charset = mpt::Charset::ISO8859_1; + m_modFormat.charset = mpt::Charset::Amiga_no_C1; // Reading samples if(loadFlags & loadSampleData) @@ -1900,13 +1900,13 @@ m_modFormat.formatName = U_("MnemoTroN SoundTracker"); m_modFormat.type = U_("st26"); m_modFormat.madeWithTracker = U_("SoundTracker 2.6"); - m_modFormat.charset = mpt::Charset::ISO8859_1; + m_modFormat.charset = mpt::Charset::Amiga_no_C1; } else if(IsMagic(magic, "IT10")) { m_modFormat.formatName = U_("Ice Tracker"); m_modFormat.type = U_("ice"); m_modFormat.madeWithTracker = U_("Ice Tracker 1.0 / 1.1"); - m_modFormat.charset = mpt::Charset::ISO8859_1; + m_modFormat.charset = mpt::Charset::Amiga_no_C1; } else { return false; @@ -2140,7 +2140,7 @@ chunk.Skip(4); if(chunk.ReadMagic("PT") && iffHead.chunksize > 6) { - chunk.ReadString<mpt::String::maybeNullTerminated>(version, mpt::Charset::ISO8859_1, iffHead.chunksize - 6); + chunk.ReadString<mpt::String::maybeNullTerminated>(version, mpt::Charset::Amiga_no_C1, iffHead.chunksize - 6); } break; @@ -2196,7 +2196,7 @@ std::string author; commentChunk.ReadString<mpt::String::maybeNullTerminated>(author, 32); if(author != "UNNAMED AUTHOR") - m_songArtist = mpt::ToUnicode(mpt::Charset::ISO8859_1, author); + m_songArtist = mpt::ToUnicode(mpt::Charset::Amiga_no_C1, author); if(!commentChunk.NoBytesLeft()) { m_songMessage.ReadFixedLineLength(commentChunk, commentChunk.BytesLeft(), 40, 0); Index: soundlib/Load_okt.cpp =================================================================== --- soundlib/Load_okt.cpp (revision 16910) +++ soundlib/Load_okt.cpp (working copy) @@ -340,7 +340,7 @@ m_modFormat.formatName = U_("Oktalyzer"); m_modFormat.type = U_("okt"); - m_modFormat.charset = mpt::Charset::ISO8859_1; + m_modFormat.charset = mpt::Charset::Amiga_no_C1; // Go through IFF chunks... while(file.CanRead(sizeof(OktIffChunk))) Index: soundlib/Load_sfx.cpp =================================================================== --- soundlib/Load_sfx.cpp (revision 16910) +++ soundlib/Load_sfx.cpp (working copy) @@ -469,7 +469,7 @@ m_modFormat.formatName = m_nSamples == 15 ? MPT_UFORMAT("SoundFX 1.{}")(version) : U_("SoundFX 2.0 / MultiMedia Sound"); m_modFormat.type = m_nSamples == 15 ? UL_("sfx") : UL_("sfx2"); - m_modFormat.charset = mpt::Charset::ISO8859_1; + m_modFormat.charset = mpt::Charset::Amiga_no_C1; return true; } Index: soundlib/Load_stp.cpp =================================================================== --- soundlib/Load_stp.cpp (revision 16910) +++ soundlib/Load_stp.cpp (working copy) @@ -258,7 +258,7 @@ m_modFormat.formatName = MPT_UFORMAT("Soundtracker Pro II v{}")(fileHeader.version); m_modFormat.type = U_("stp"); - m_modFormat.charset = mpt::Charset::ISO8859_1; + m_modFormat.charset = mpt::Charset::Amiga_no_C1; m_nChannels = 4; m_nSamples = 0; Index: soundlib/Load_symmod.cpp =================================================================== --- soundlib/Load_symmod.cpp (revision 16910) +++ soundlib/Load_symmod.cpp (working copy) @@ -1231,7 +1231,7 @@ if(symInst.IsEmpty() || symInst.IsVirtual()) continue; - auto filename = mpt::PathString::FromUnicode(mpt::ToUnicode(mpt::Charset::ISO8859_1, symInst.GetName())); + auto filename = mpt::PathString::FromUnicode(mpt::ToUnicode(mpt::Charset::Amiga_no_C1, symInst.GetName())); if(file.GetOptionalFileName()) filename = file.GetOptionalFileName()->GetPath() + filename.GetFullFileName(); @@ -1939,7 +1939,7 @@ m_modFormat.madeWithTracker = U_("Symphonie Pro"); else m_modFormat.madeWithTracker = U_("Symphonie Pro 256"); - m_modFormat.charset = mpt::Charset::ISO8859_1; + m_modFormat.charset = mpt::Charset::Amiga_no_C1; return true; } |
|
I'd say MOD would make sense to use Windows-1252, and Amiga-noC1 for the rest.
UMX itself shouldn't contain any strings, as all the actual module loading is taken care of by the IT/XM/S3M/etc. loaders. For UAX, Windows-1252 would make sense as it's a Windows format. |
|
In any case, this could also be a later additional commit.
We parse out the filename inside container formats, but it is currently unused, so it does not really matter.
UAX is already using Windows-1252, |
|
UAX and UMX are the same format anyway (just different contents); both should definitely be using Windows-1252. |
|
Yeah, true. r16911 |
|
status update:
TODO:
|
|
merged in 0.6 as r16930 / 0.6.2-pre.1 |
|
I'll close this issue as fixed. The only remaining issue is about how to achieve compatibility for .MOD with earlier OpenMPT versions, which is only vaguely related to this issue. However, the .MOD charset issue does not need to be resolved right now, because in the tracker, |
|
Date Modified | Username | Field | Change |
---|---|---|---|
2022-02-07 05:25 | ProgrammerNerd | New Issue | |
2022-02-07 05:44 | ProgrammerNerd | Note Added: 0005043 | |
2022-02-07 05:45 | ProgrammerNerd | Note Added: 0005044 | |
2022-02-07 08:02 | Saga Musix | Steps to Reproduce Updated | |
2022-02-07 08:02 | Saga Musix | Steps to Reproduce Updated | |
2022-02-07 09:15 | Saga Musix | Note Added: 0005045 | |
2022-02-08 01:15 | ProgrammerNerd | Note Added: 0005048 | |
2022-02-08 03:47 | ProgrammerNerd | Note Added: 0005049 | |
2022-02-08 03:47 | ProgrammerNerd | File Added: protracker-screenshot.png | |
2022-02-08 07:35 | manx | Note Added: 0005050 | |
2022-02-08 07:35 | manx | Assigned To | => manx |
2022-02-08 07:35 | manx | Status | new => confirmed |
2022-02-08 08:13 | Saga Musix | Note Added: 0005051 | |
2022-02-08 09:08 | manx | Note Added: 0005052 | |
2022-02-08 10:51 | Saga Musix | Note Added: 0005053 | |
2022-02-08 12:28 | manx | Note Added: 0005054 | |
2022-02-08 19:37 | Saga Musix | Note Added: 0005055 | |
2022-02-08 19:39 | manx | Note Added: 0005056 | |
2022-02-08 19:42 | manx | Note Added: 0005057 | |
2022-02-08 20:04 | manx | Note Added: 0005058 | |
2022-02-10 10:46 | manx | Note Added: 0005061 | |
2022-02-10 10:49 | manx | Note Edited: 0005061 | |
2022-02-10 11:55 | manx | Note Edited: 0005061 | |
2022-02-10 15:08 | manx | Note Added: 0005062 | |
2022-02-10 15:08 | manx | File Added: bug1565-implement-no-c1-charsets-v1.patch | |
2022-02-10 15:09 | manx | Note Edited: 0005061 | |
2022-02-10 15:09 | manx | Note Edited: 0005062 | |
2022-02-10 15:09 | manx | Note Edited: 0005062 | |
2022-02-10 15:50 | manx | Note Added: 0005063 | |
2022-02-10 16:05 | manx | Note Added: 0005064 | |
2022-02-10 16:05 | manx | File Added: bug1565-implement-no-c1-charsets-v2.patch | |
2022-02-10 16:05 | manx | File Added: bug1565-use-amiga-no-c1-charset-v1.patch | |
2022-02-10 16:20 | manx | Note Edited: 0005064 | |
2022-02-10 18:30 | Saga Musix | Note Added: 0005065 | |
2022-02-10 18:47 | manx | Note Added: 0005066 | |
2022-02-10 18:48 | Saga Musix | Note Added: 0005067 | |
2022-02-10 18:52 | manx | Note Added: 0005068 | |
2022-02-11 09:13 | manx | Relationship added | related to 0001567 |
2022-02-11 09:52 | manx | Note Added: 0005071 | |
2022-02-11 11:49 | manx | Target Version | => OpenMPT 1.30.03.00 / libopenmpt 0.6.2 (upgrade first) |
2022-02-14 08:56 | manx | Note Added: 0005082 | |
2022-02-16 15:20 | manx | Relationship added | related to 0000569 |
2022-02-16 15:25 | manx | Status | confirmed => resolved |
2022-02-16 15:25 | manx | Resolution | open => fixed |
2022-02-16 15:25 | manx | Fixed in Version | => OpenMPT 1.30.03.00 / libopenmpt 0.6.2 (upgrade first) |
2022-02-16 15:25 | manx | Note Added: 0005094 |