View Issue Details

IDProjectCategoryView StatusLast Update
0001242OpenMPTGeneralpublic2023-04-11 09:00
ReporterSaga Musix Assigned Tomanx  
PrioritynormalSeverityminorReproducibilityN/A
Status assignedResolutionwon't fix 
Product VersionOpenMPT 1.29.00.* (old testing) 
Target VersionOpenMPT 1.32 / libopenmpt 0.8 (goals) 
Summary0001242: Eliminate unnecessary copies in charset conversion functions (std::string_view / no-ops)
Description
  • Consider using std::string_view for charset conversion functions where it makes sense. This can save a copy in case the input to the function is not already a std::string. Might be useful e.g. for the scripting API and other interop code.
  • Use std::string rather than const std::string &in the parameters of in no-op string conversion functions. We can avoid a copy there in some (probably most) contexts.
TagsNo tags attached.
Has the bug occurred in previous versions?
Tested code revision (in case you know it)

Activities

manx

manx

2019-07-31 12:25

administrator   ~0003982

Using string_view before actually requiring C++17 results in either a ton of additional ifdef (which mptString.h and mptString.cpp already have plenty enough of anyway), or another zoo of additional string types named mpt::ustring_maybe_view for each and every string type that we have and still then result in strange const string_view & parameters (or again another set of ifdef). I would rather delay that until we require C++17.

manx

manx

2019-07-31 12:41

administrator   ~0003983

Using value parameters instead of const & for trivial iniline conversion functions pessimizes binary size for me: 4.447.744 bytes (before), 4.453.376 bytes (after) (VS2019 Win32 ReleaseShared).

Saga Musix

Saga Musix

2019-07-31 17:35

administrator   ~0003984

Let's wait for C++17 then.

Saga Musix

Saga Musix

2019-09-06 23:44

administrator   ~0004046

C++17 support is now required. We could try using string_viewagain.

manx

manx

2019-11-01 21:35

administrator   ~0004130

mpt::String::ReadBuf() returns a StringModeBufRefImpl object, which currently is implicitly convertible to std::string. Changing mpt::ToUnicode() to take a std::string_view argument breaks implicit conversion (because 2 consecutive conversions would be required). Changing StringModeBufRefImpl to also be implicitly convertible to std::string_view is impossible because of the spacePadded and spacePaddedNull cases which need to modify the actual string data, for which no buffer space would be available unless an actual std::string exists somewhere.

The only way to fix this without breaking implicit conversions for this case would be to add std::string_view overloads in addition to the already existing std::string overloads instead of replacing them, for all mpt::ToUnicode(), mpt::ToCharset(), ... functions. This in turn would break implicit conversion of const char*arguments, which would require another set of overloads. mptString.h is already complicated and convoluted enough even without triple the overloads.

manx

manx

2023-01-25 15:05

administrator   ~0005502

Given that we now have mpt::transcode and mpt::make_string_type, there are maybe more efficient implementations possible. There is still an awful tradeoff between const & and moveable which is very difficult to evaluate at the initial call site.

manx

manx

2023-01-25 16:28

administrator   ~0005503

Last edited: 2023-01-26 07:41

This adds move support and forwarding to the outer layer of mpt::transcode. This is enough to catch the no-op To-to-T transcodes via mpt::transcode.

transcode-move-v1.patch (6,811 bytes)   
Index: src/mpt/string/types.hpp
===================================================================
--- src/mpt/string/types.hpp	(revision 18621)
+++ src/mpt/string/types.hpp	(working copy)
@@ -14,6 +14,7 @@
 #include <string>
 #include <string_view>
 #include <type_traits>
+#include <utility>
 
 #include <cstddef>
 
@@ -504,11 +505,11 @@
 
 
 template <typename T>
-inline typename mpt::make_string_type<T>::type as_string(const T & str) {
-	if constexpr (std::is_pointer<typename std::remove_cv<T>::type>::value) {
-		return str ? typename mpt::make_string_type<T>::type{str} : typename mpt::make_string_type<T>::type{};
-	} else if constexpr (mpt::is_string_view_type<T>::value) {
-		return typename mpt::make_string_type<T>::type{str};
+inline typename mpt::make_string_type<typename std::decay<T>::type>::type as_string(T && str) {
+	if constexpr (std::is_pointer<typename std::decay<T>::type>::value) {
+		return str ? typename mpt::make_string_type<typename std::decay<T>::type>::type{std::forward<T>(str)} : typename mpt::make_string_type<typename std::decay<T>::type>::type{};
+	} else if constexpr (mpt::is_string_view_type<typename std::decay<T>::type>::value) {
+		return typename mpt::make_string_type<typename std::decay<T>::type>::type{std::forward<T>(str)};
 	} else {
 		return str;
 	}
@@ -517,11 +518,11 @@
 
 
 template <typename T>
-inline typename mpt::make_string_view_type<T>::type as_string_view(const T & str) {
-	if constexpr (std::is_pointer<typename std::remove_cv<T>::type>::value) {
-		return str ? typename mpt::make_string_view_type<T>::type{str} : typename mpt::make_string_view_type<T>::type{};
-	} else if constexpr (mpt::is_string_view_type<T>::value) {
-		return typename mpt::make_string_view_type<T>::type{str};
+inline typename mpt::make_string_view_type<typename std::decay<T>::type>::type as_string_view(T && str) {
+	if constexpr (std::is_pointer<typename std::decay<T>::type>::value) {
+		return str ? typename mpt::make_string_view_type<typename std::decay<T>::type>::type{std::forward<T>(str)} : typename mpt::make_string_view_type<typename std::decay<T>::type>::type{};
+	} else if constexpr (mpt::is_string_view_type<typename std::decay<T>::type>::value) {
+		return typename mpt::make_string_view_type<typename std::decay<T>::type>::type{std::forward<T>(str)};
 	} else {
 		return str;
 	}
Index: src/mpt/string_transcode/transcode.hpp
===================================================================
--- src/mpt/string_transcode/transcode.hpp	(revision 18621)
+++ src/mpt/string_transcode/transcode.hpp	(working copy)
@@ -25,6 +25,7 @@
 #if !defined(MPT_COMPILER_QUIRK_NO_WCHAR)
 #include <type_traits>
 #endif // !MPT_COMPILER_QUIRK_NO_WCHAR
+#include <utility>
 #include <vector>
 
 #if MPT_OS_DJGPP
@@ -1891,28 +1892,28 @@
 
 #endif // MPT_DETECTED_MFC
 
-template <typename Tdststring, typename Tsrcstring, std::enable_if_t<mpt::is_string_type<typename mpt::make_string_type<Tsrcstring>::type>::value, bool> = true>
-inline Tdststring transcode(const Tsrcstring & src) {
-	if constexpr (std::is_same<Tdststring, typename mpt::make_string_type<Tsrcstring>::type>::value) {
-		return mpt::as_string(src);
+template <typename Tdststring, typename Tsrcstring, std::enable_if_t<mpt::is_string_type<typename mpt::make_string_type<typename std::decay<Tsrcstring>::type>::type>::value, bool> = true>
+inline Tdststring transcode(Tsrcstring && src) {
+	if constexpr (std::is_same<Tdststring, typename mpt::make_string_type<typename std::decay<Tsrcstring>::type>::type>::value) {
+		return mpt::as_string(std::forward<Tsrcstring>(src));
 	} else {
-		return string_transcoder<Tdststring>::encode(string_transcoder<decltype(mpt::as_string(src))>::decode(mpt::as_string(src)));
+		return string_transcoder<Tdststring>::encode(string_transcoder<decltype(mpt::as_string(std::forward<Tsrcstring>(src)))>::decode(mpt::as_string(std::forward<Tsrcstring>(src))));
 	}
 }
 
-template <typename Tdststring, typename Tsrcstring, typename Tencoding, std::enable_if_t<std::is_same<Tdststring, std::string>::value, bool> = true, std::enable_if_t<mpt::is_string_type<typename mpt::make_string_type<Tsrcstring>::type>::value, bool> = true>
-inline Tdststring transcode(Tencoding to, const Tsrcstring & src) {
-	return mpt::encode<Tdststring>(to, string_transcoder<decltype(mpt::as_string(src))>::decode(mpt::as_string(src)));
+template <typename Tdststring, typename Tsrcstring, typename Tencoding, std::enable_if_t<std::is_same<Tdststring, std::string>::value, bool> = true, std::enable_if_t<mpt::is_string_type<typename mpt::make_string_type<typename std::decay<Tsrcstring>::type>::type>::value, bool> = true>
+inline Tdststring transcode(Tencoding to, Tsrcstring && src) {
+	return mpt::encode<Tdststring>(to, string_transcoder<decltype(mpt::as_string(std::forward<Tsrcstring>(src)))>::decode(mpt::as_string(std::forward<Tsrcstring>(src))));
 }
 
-template <typename Tdststring, typename Tsrcstring, typename Tencoding, std::enable_if_t<std::is_same<typename mpt::make_string_type<Tsrcstring>::type, std::string>::value, bool> = true, std::enable_if_t<mpt::is_string_type<typename mpt::make_string_type<Tsrcstring>::type>::value, bool> = true>
-inline Tdststring transcode(Tencoding from, const Tsrcstring & src) {
-	return string_transcoder<Tdststring>::encode(mpt::decode<decltype(mpt::as_string(src))>(from, mpt::as_string(src)));
+template <typename Tdststring, typename Tsrcstring, typename Tencoding, std::enable_if_t<std::is_same<typename mpt::make_string_type<typename std::decay<Tsrcstring>::type>::type, std::string>::value, bool> = true, std::enable_if_t<mpt::is_string_type<typename mpt::make_string_type<typename std::decay<Tsrcstring>::type>::type>::value, bool> = true>
+inline Tdststring transcode(Tencoding from, Tsrcstring && src) {
+	return string_transcoder<Tdststring>::encode(mpt::decode<decltype(mpt::as_string(std::forward<Tsrcstring>(src)))>(from, mpt::as_string(std::forward<Tsrcstring>(src))));
 }
 
-template <typename Tdststring, typename Tsrcstring, typename Tto, typename Tfrom, std::enable_if_t<mpt::is_string_type<typename mpt::make_string_type<Tsrcstring>::type>::value, bool> = true>
-inline Tdststring transcode(Tto to, Tfrom from, const Tsrcstring & src) {
-	return mpt::encode<Tdststring>(to, mpt::decode<decltype(mpt::as_string(src))>(from, mpt::as_string(src)));
+template <typename Tdststring, typename Tsrcstring, typename Tto, typename Tfrom, std::enable_if_t<mpt::is_string_type<typename mpt::make_string_type<typename std::decay<Tsrcstring>::type>::type>::value, bool> = true>
+inline Tdststring transcode(Tto to, Tfrom from, Tsrcstring && src) {
+	return mpt::encode<Tdststring>(to, mpt::decode<decltype(mpt::as_string(std::forward<Tsrcstring>(src)))>(from, mpt::as_string(std::forward<Tsrcstring>(src))));
 }
 
 
transcode-move-v1.patch (6,811 bytes)   
manx

manx

2023-01-25 17:01

administrator   ~0005504

r18622 implements the trivial case for the no-op functions.

mptString-avoid-noop-copies-v1.patch (2,815 bytes)   
Index: common/mptString.h
===================================================================
--- common/mptString.h	(revision 18621)
+++ common/mptString.h	(working copy)
@@ -127,7 +127,7 @@
 // The wide encoding is UTF-16 or UTF-32, based on sizeof(wchar_t).
 // If str does not contain any invalid characters, this conversion is lossless.
 // Invalid source bytes will be replaced by some replacement character or string.
-inline std::wstring ToWide(const std::wstring &str) { return str; }
+inline std::wstring ToWide(std::wstring str) { return str; }
 inline std::wstring ToWide(const wchar_t * str) { return (str ? std::wstring(str) : std::wstring()); }
 std::wstring ToWide(Charset from, const std::string &str);
 inline std::wstring ToWide(Charset from, const char * str) { return ToWide(from, str ? std::string(str) : std::string()); }
@@ -159,7 +159,7 @@
 #endif
 mpt::lstring ToLocale(Charset from, const std::string &str);
 inline mpt::lstring ToLocale(Charset from, const char * str) { return ToLocale(from, str ? std::string(str): std::string()); }
-inline mpt::lstring ToLocale(const mpt::lstring &str) { return str; }
+inline mpt::lstring ToLocale(mpt::lstring str) { return str; }
 #endif // MPT_ENABLE_CHARSET_LOCALE
 
 #if MPT_OS_WINDOWS
@@ -180,7 +180,7 @@
 // Convert to a MFC CString. The CString encoding depends on UNICODE.
 // This should also be used when converting to TCHAR strings.
 // If UNICODE is defined, this is a completely lossless operation.
-inline CString ToCString(const CString &str) { return str; }
+inline CString ToCString(CString str) { return str; }
 CString ToCString(const std::wstring &str);
 inline CString ToCString(const wchar_t * str) { return ToCString(str ? std::wstring(str) : std::wstring()); }
 CString ToCString(Charset from, const std::string &str);
@@ -210,7 +210,7 @@
 
 
 #if MPT_USTRING_MODE_WIDE
-inline mpt::ustring ToUnicode(const std::wstring &str) { return str; }
+inline mpt::ustring ToUnicode(std::wstring str) { return str; }
 inline mpt::ustring ToUnicode(const wchar_t * str) { return (str ? std::wstring(str) : std::wstring()); }
 inline mpt::ustring ToUnicode(Charset from, const std::string &str) { return ToWide(from, str); }
 inline mpt::ustring ToUnicode(Charset from, const char * str) { return ToUnicode(from, str ? std::string(str) : std::string()); }
@@ -221,7 +221,7 @@
 inline mpt::ustring ToUnicode(const CString &str) { return ToWide(str); }
 #endif // MFC
 #else // !MPT_USTRING_MODE_WIDE
-inline mpt::ustring ToUnicode(const mpt::ustring &str) { return str; }
+inline mpt::ustring ToUnicode(mpt::ustring str) { return str; }
 #if MPT_WSTRING_CONVERT
 mpt::ustring ToUnicode(const std::wstring &str);
 inline mpt::ustring ToUnicode(const wchar_t * str) { return ToUnicode(str ? std::wstring(str) : std::wstring()); }
manx

manx

2023-01-26 07:45

administrator   ~0005505

r18624 implements transcode-move.

manx

manx

2023-04-11 09:00

administrator   ~0005665

r19006:

[Ref] mpt/string_transcode/transcode.hpp: Implement non-transcoding fast-path when converting between mpt::u8string and mpt::common_encoding::utf8.
[Ref] mpt/string_transcode/transcode.hpp: Implement non-transcoding fast-path when converting between identical mpt::common_encoding or mpt::logical_encoding.

Issue History

Date Modified Username Field Change
2019-07-30 19:18 Saga Musix New Issue
2019-07-31 12:08 manx Assigned To => manx
2019-07-31 12:08 manx Status new => assigned
2019-07-31 12:25 manx Note Added: 0003982
2019-07-31 12:41 manx Note Added: 0003983
2019-07-31 12:41 manx Product Version => OpenMPT 1.29.00.* (old testing)
2019-07-31 12:41 manx Target Version => OpenMPT 1.?? (long term goals)
2019-07-31 17:35 Saga Musix Note Added: 0003984
2019-09-06 23:44 Saga Musix Note Added: 0004046
2019-09-07 05:40 manx Target Version OpenMPT 1.?? (long term goals) => OpenMPT 1.29.01.00 / libopenmpt 0.5.0 (upgrade first)
2019-11-01 21:35 manx Note Added: 0004130
2019-11-01 21:35 manx Target Version OpenMPT 1.29.01.00 / libopenmpt 0.5.0 (upgrade first) => OpenMPT 2.0 (very long term goals)
2019-11-01 21:35 manx Status assigned => closed
2019-11-01 21:35 manx Resolution open => won't fix
2023-01-25 15:01 manx Status closed => assigned
2023-01-25 15:01 manx Target Version OpenMPT 2.0 (very long term goals) => OpenMPT 1.31.01.00 / libopenmpt 0.7.0 (upgrade first)
2023-01-25 15:05 manx Note Added: 0005502
2023-01-25 16:28 manx Note Added: 0005503
2023-01-25 16:28 manx File Added: transcode-move-v1.patch
2023-01-25 16:40 manx Note Edited: 0005503
2023-01-25 17:01 manx Note Added: 0005504
2023-01-25 17:01 manx File Added: mptString-avoid-noop-copies-v1.patch
2023-01-26 07:41 manx Note Edited: 0005503
2023-01-26 07:45 manx Note Added: 0005505
2023-04-11 09:00 manx Note Added: 0005665
2023-04-11 09:00 manx Target Version OpenMPT 1.31.01.00 / libopenmpt 0.7.0 (upgrade first) => OpenMPT 1.32 / libopenmpt 0.8 (goals)