View Issue Details

ID: 0001922
Project: OpenMPT
Category: Feature Request
View Status: public
Last Update: 2025-11-10 13:55
Reporter: manx
Assigned To:
Priority: normal
Severity: minor
Reproducibility: always
Status: new
Resolution: open
Product Version: OpenMPT 1.33.00.* (current testing)
Summary: 0001922: CPU usage statistics per Plugin
Description

Suggested by coda on IRC.

Tags: No tags attached.

Activities

manx

2025-09-08 09:12

administrator   ~0006494

kode54 and paper suggested QueryPerformanceCounter.

QueryPerformanceCounter (QPC) is unreliable in every version of Windows when run on the "correct" hardware (i.e. hardware that does not conform to Microsoft's invalid assumptions). It is less of a problem from roughly Windows 8 onwards, because such hardware is rarely used with these Windows versions.

Microsoft assumes that the time stamp counter (TSC) is synchronized across different cores even if the hardware does not provide that. The fluctuations are rather catastrophic on early Athlon 64 Dual Cores, on early Pentium 4/D systems, and on any multi-socket system where power management frequency scaling scales the TSC differently between cores.

And there is another problem with QPC: on the subset of old systems where Microsoft correctly does not assume synchronized TSC, it can fall back to ACPI HPET or ACPI PMTIMER, which can be awfully slow to read, and will be a measurable performance overhead in practice.
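The read overhead mentioned above can be estimated with a small benchmark. The following is a minimal sketch, not code from OpenMPT: `average_read_cost_ns` and `read_steady_clock` are hypothetical names, and `std::chrono::steady_clock` stands in for the clock under test (a direct QPC or RDTSC read could be passed in instead).

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical helper: average cost in nanoseconds of one read_clock() call,
// estimated by timing `iterations` back-to-back calls.
template <typename ReadClock>
double average_read_cost_ns(ReadClock read_clock, std::uint32_t iterations) {
	assert(iterations > 0);
	std::uint64_t sink = 0;  // accumulates results so the calls are not optimized away
	const auto begin = std::chrono::steady_clock::now();
	for (std::uint32_t i = 0; i < iterations; ++i) {
		sink += static_cast<std::uint64_t>(read_clock());
	}
	const auto end = std::chrono::steady_clock::now();
	volatile std::uint64_t keep = sink;  // forces the accumulated value to be materialized
	(void)keep;
	return std::chrono::duration<double, std::nano>(end - begin).count() / iterations;
}

// Example clock source; any callable returning an integer timestamp works.
inline std::int64_t read_steady_clock() {
	return std::chrono::steady_clock::now().time_since_epoch().count();
}
```

Calling `average_read_cost_ns(read_steady_clock, 1000000)` on a given machine gives a rough per-read cost; the number depends on which timer backs the clock on that system, so it is only meaningful when measured on the hardware in question.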

The unreliability is not much of a problem when QPC (or RDTSC) is only used to display a value to the user. It is highly problematic, however, for any sort of scheduling or automatic performance tuning based on it.

Half of the truth is documented at https://learn.microsoft.com/en-us/windows/win32/sysinfo/acquiring-high-resolution-time-stamps, the other half is scattered across the internet and personal experience. On all my systems from Windows 2000 through Windows 8.1, QPC was non-monotonic and/or slow on about 80% of them. I have not checked recently with Windows 10 or Windows 11.
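A check for non-monotonic readings like the one described above could look roughly as follows. This is only a sketch under assumptions: `count_backward_steps` is a hypothetical helper, and a single-threaded loop will only catch backward steps if the thread happens to migrate between cores with unsynchronized counters while it runs.

```cpp
#include <cstddef>

// Hypothetical helper: samples read_clock() `samples` times in a row and
// counts how often a reading is smaller than the one before it.
template <typename ReadClock>
std::size_t count_backward_steps(ReadClock read_clock, std::size_t samples) {
	if (samples == 0) {
		return 0;
	}
	std::size_t backward = 0;
	auto previous = read_clock();
	for (std::size_t i = 1; i < samples; ++i) {
		const auto current = read_clock();
		if (current < previous) {
			++backward;  // the clock went backwards between two consecutive reads
		}
		previous = current;
	}
	return backward;
}
```

Any callable that returns a comparable timestamp can be passed in, for example a lambda wrapping `std::chrono::steady_clock::now()`. A result of zero does not prove the clock is monotonic; it only means no backward step was observed during this run.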

Microsoft C++ STL choosing QPC as the implementation backend for std::chrono::steady_clock is highly questionable IMHO, given that this is supposed to be compatible with systems back to Windows XP SP3.

QPC in the audio hot path (i.e. twice per period per plugin) gets a hard no from me: it will kill performance on systems that we likely cannot easily test on any more.

I would strongly suggest using raw RDTSC for this purpose. Even though it has all the same problems that QPC has (and in some cases is even worse), it does not destroy performance on some systems that we cannot name or list.
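As an illustration of the RDTSC approach, here is a minimal sketch of accumulating raw tick counts around a block of work and turning them into a percentage for display. All names (`read_ticks`, `TickAccumulator`, `ScopedTicks`, `usage_percent`, `SKETCH_HAVE_TSC`) are hypothetical and not OpenMPT API; on non-x86 targets the sketch falls back to `std::chrono::steady_clock` purely so that it still compiles.

```cpp
#include <cstdint>

#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#include <intrin.h>
#define SKETCH_HAVE_TSC 1
#elif (defined(__GNUC__) || defined(__clang__)) && (defined(__i386__) || defined(__x86_64__))
#include <x86intrin.h>
#define SKETCH_HAVE_TSC 1
#else
#include <chrono>
#define SKETCH_HAVE_TSC 0
#endif

// Raw tick source: RDTSC where available, otherwise a portable fallback.
inline std::uint64_t read_ticks() noexcept {
#if SKETCH_HAVE_TSC
	return __rdtsc();
#else
	return static_cast<std::uint64_t>(std::chrono::steady_clock::now().time_since_epoch().count());
#endif
}

// Per-plugin tick counter (hypothetical stand-in for a field in the plugin state).
struct TickAccumulator {
	std::uint64_t ticks = 0;
};

// Adds the ticks elapsed during its lifetime to the given accumulator.
// If the counter jumps backwards, the unsigned difference wraps around,
// which is tolerable for a display-only value but not for scheduling.
class ScopedTicks {
public:
	explicit ScopedTicks(TickAccumulator &target) noexcept
		: m_target(target), m_start(read_ticks()) {}
	~ScopedTicks() { m_target.ticks += read_ticks() - m_start; }
	ScopedTicks(const ScopedTicks &) = delete;
	ScopedTicks &operator=(const ScopedTicks &) = delete;

private:
	TickAccumulator &m_target;
	std::uint64_t m_start;
};

// Share of `part` in `total` as a percentage; 0 if nothing was measured.
inline double usage_percent(std::uint64_t part, std::uint64_t total) noexcept {
	return total ? (part * 100.0 / total) : 0.0;
}
```

A `ScopedTicks` object placed around one plugin's processing call adds that call's ticks to the plugin's accumulator; dividing by the ticks measured around the whole render call gives the share to show to the user. Because both values come from the same counter, no conversion to seconds is needed.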

Saga Musix

2025-11-09 14:29

administrator   ~0006519

Very rough, very unfinished implementation

PluginCPUMeter.patch (5,265 bytes)   
Index: mptrack/AbstractVstEditor.cpp
===================================================================
--- mptrack/AbstractVstEditor.cpp	(revision 24407)
+++ mptrack/AbstractVstEditor.cpp	(working copy)
@@ -514,6 +514,14 @@
 		if(m_VstPlugin.IsBypassed())
 			title += _T(" - Bypass");
 
+		double cpuUsage = 0.0;
+		{
+			CriticalSection cs;
+			if(m_VstPlugin.GetSoundFile().m_PlayState.m_totalClocks)
+				cpuUsage = m_VstPlugin.m_MixState.clockCycles * 100.0 / m_VstPlugin.GetSoundFile().m_PlayState.m_totalClocks;
+		}
+		title += MPT_CFORMAT(" [{}%]")(mpt::cfmt::flt(cpuUsage, 3));
+
 		SetWindowText(title);
 	}
 }
Index: mptrack/AbstractVstEditor.h
===================================================================
--- mptrack/AbstractVstEditor.h	(revision 24407)
+++ mptrack/AbstractVstEditor.h	(working copy)
@@ -97,7 +97,7 @@
 
 	virtual bool OpenEditor(CWnd *parent);
 	virtual void DoClose();
-	virtual void UpdateParamDisplays() { if(m_updateDisplay) { SetupMenu(true); m_updateDisplay = false; } }
+	virtual void UpdateParamDisplays() { SetTitle(); if(m_updateDisplay) { SetupMenu(true); m_updateDisplay = false; } }
 	virtual void UpdateParam(int32 /*param*/) { }
 	virtual void UpdateView(UpdateHint hint);
 
Index: soundlib/Fastmix.cpp
===================================================================
--- soundlib/Fastmix.cpp	(revision 24407)
+++ soundlib/Fastmix.cpp	(working copy)
@@ -21,6 +21,7 @@
 #include "MixerLoops.h"
 #include "MixFuncTable.h"
 #include "plugins/PlugInterface.h"
+#include "mpt/arch/x86_amd64.hpp"
 #include <cfloat>  // For FLT_EPSILON
 #include <algorithm>
 
@@ -573,6 +574,10 @@
 			IMixPlugin *mixPlug = plugin.pMixPlugin;
 			SNDMIXPLUGINSTATE &state = mixPlug->m_MixState;
 
+#if (MPT_COMPILER_MSVC || MPT_COMPILER_GCC || MPT_COMPILER_CLANG) && (MPT_ARCH_X86 || MPT_ARCH_AMD64)
+			uint64 startClock = mpt::arch::current::rdtsc();
+#endif
+
 			//We should only ever reach this point if the song is playing.
 			if (!mixPlug->IsSongPlaying())
 			{
@@ -612,6 +617,10 @@
 			{
 				masterHasInput = true;
 			}
+
+#if(MPT_COMPILER_MSVC || MPT_COMPILER_GCC || MPT_COMPILER_CLANG) && (MPT_ARCH_X86 || MPT_ARCH_AMD64)
+			state.clockCycles += mpt::arch::current::rdtsc() - startClock;
+#endif
 		}
 	}
 	// Convert mix buffer
@@ -667,6 +676,10 @@
 			float *pOutL = pMixL;
 			float *pOutR = pMixR;
 
+#if(MPT_COMPILER_MSVC || MPT_COMPILER_GCC || MPT_COMPILER_CLANG) && (MPT_ARCH_X86 || MPT_ARCH_AMD64)
+			uint64 startClock = mpt::arch::current::rdtsc();
+#endif
+
 			if (!plugin.IsOutputToMaster())
 			{
 				PLUGINDEX nOutput = plugin.GetOutputPlugin();
@@ -770,6 +783,10 @@
 				}
 			}
 			state.dwFlags &= ~SNDMIXPLUGINSTATE::psfHasInput;
+
+#if(MPT_COMPILER_MSVC || MPT_COMPILER_GCC || MPT_COMPILER_CLANG) && (MPT_ARCH_X86 || MPT_ARCH_AMD64)
+			state.clockCycles += mpt::arch::current::rdtsc() - startClock;
+#endif
 		}
 	}
 #ifdef MPT_INTMIXER
Index: soundlib/PlayState.h
===================================================================
--- soundlib/PlayState.h	(revision 24407)
+++ soundlib/PlayState.h	(working copy)
@@ -25,6 +25,7 @@
 	friend class CSoundFile;
 
 public:
+	uint64 m_totalClocks = 0;
 	samplecount_t m_lTotalSampleCount = 0;  // Total number of rendered samples
 protected:
 	samplecount_t m_nBufferCount = 0;  // Remaining number samples to render for this tick
Index: soundlib/plugins/PlugInterface.h
===================================================================
--- soundlib/plugins/PlugInterface.h	(revision 24407)
+++ soundlib/plugins/PlugInterface.h	(working copy)
@@ -44,6 +44,8 @@
 	uint32 inputSilenceCount = 0;      // How much silence has been processed? (for plugin auto-turnoff)
 	mixsample_t nVolDecayL = 0, nVolDecayR = 0; // End of sample click removal
 
+	uint64 clockCycles = 0;
+
 	void ResetSilence()
 	{
 		dwFlags |= psfHasInput;
Index: soundlib/Sndmix.cpp
===================================================================
--- soundlib/Sndmix.cpp	(revision 24407)
+++ soundlib/Sndmix.cpp	(working copy)
@@ -221,6 +221,15 @@
 {
 	MPT_ASSERT_ALWAYS(m_MixerSettings.IsValid());
 
+#if(MPT_COMPILER_MSVC || MPT_COMPILER_GCC || MPT_COMPILER_CLANG) && (MPT_ARCH_X86 || MPT_ARCH_AMD64)
+	uint64 startClock = mpt::arch::current::rdtsc();
+	for(SNDMIXPLUGIN &plugin : m_MixPlugins)
+	{
+		if(plugin.pMixPlugin)
+			plugin.pMixPlugin->m_MixState.clockCycles = 0;
+	}
+#endif
+
 	samplecount_t countRendered = 0;
 	samplecount_t countToRender = count;
 
@@ -387,6 +396,10 @@
 
 	// mix done
 
+#if(MPT_COMPILER_MSVC || MPT_COMPILER_GCC || MPT_COMPILER_CLANG) && (MPT_ARCH_X86 || MPT_ARCH_AMD64)
+	m_PlayState.m_totalClocks = mpt::arch::current::rdtsc() - startClock;
+#endif
+
 	return countRendered;
 
 }
Index: src/mpt/arch/x86_amd64.hpp
===================================================================
--- src/mpt/arch/x86_amd64.hpp	(revision 24407)
+++ src/mpt/arch/x86_amd64.hpp	(working copy)
@@ -1988,7 +1988,15 @@
 };
 
 
+#if MPT_COMPILER_MSVC || MPT_COMPILER_GCC || MPT_COMPILER_CLANG
 
+MPT_ATTR_ALWAYSINLINE MPT_INLINE_FORCE static uint64 rdtsc() noexcept {
+	return __rdtsc();
+}
+
+#endif // MPT_COMPILER_MSVC || MPT_COMPILER_GCC || MPT_COMPILER_CLANG
+
+
 } // namespace x86
 
 namespace amd64 = x86;
manx

2025-11-10 13:55

administrator   ~0006520

r24411 adds mpt::profiler::highres_clock::now() and mpt::profiler::fast_clock::now(), as well as MPT_PROFILER_TSC_CLOCK to check whether a highres and fast TSC clock is available.

Issue History

Date Modified Username Field Change
2025-09-08 07:57 manx New Issue
2025-09-08 09:12 manx Note Added: 0006494
2025-11-09 14:29 Saga Musix Note Added: 0006519
2025-11-09 14:29 Saga Musix File Added: PluginCPUMeter.patch
2025-11-10 13:55 manx Note Added: 0006520