From 60ef4b9cc9828ba994189c4bf1324e70cb5f3c7f Mon Sep 17 00:00:00 2001 From: Gerald Pfeifer Date: Wed, 1 Jan 2025 09:05:02 +0800 Subject: [PATCH] libstdc++: Delete further Profile Mode leftovers Commit 544be2beb1fa in 2019 remove Profile Mode and associated docs. Now also remove generated HTML files. libstdc++-v3: * doc/html/manual/profile_mode.html: Delete. * doc/html/manual/profile_mode_api.html: Ditto. * doc/html/manual/profile_mode_cost_model.html: Ditto. * doc/html/manual/profile_mode_design.html: Ditto. * doc/html/manual/profile_mode_devel.html: Ditto. * doc/html/manual/profile_mode_impl.html: Ditto. --- .../doc/html/manual/profile_mode.html | 145 ------------------ .../doc/html/manual/profile_mode_api.html | 9 -- .../html/manual/profile_mode_cost_model.html | 17 -- .../doc/html/manual/profile_mode_design.html | 121 --------------- .../doc/html/manual/profile_mode_devel.html | 67 -------- .../doc/html/manual/profile_mode_impl.html | 50 ------ 6 files changed, 409 deletions(-) delete mode 100644 libstdc++-v3/doc/html/manual/profile_mode.html delete mode 100644 libstdc++-v3/doc/html/manual/profile_mode_api.html delete mode 100644 libstdc++-v3/doc/html/manual/profile_mode_cost_model.html delete mode 100644 libstdc++-v3/doc/html/manual/profile_mode_design.html delete mode 100644 libstdc++-v3/doc/html/manual/profile_mode_devel.html delete mode 100644 libstdc++-v3/doc/html/manual/profile_mode_impl.html diff --git a/libstdc++-v3/doc/html/manual/profile_mode.html b/libstdc++-v3/doc/html/manual/profile_mode.html deleted file mode 100644 index 39c732180ac..00000000000 --- a/libstdc++-v3/doc/html/manual/profile_mode.html +++ /dev/null @@ -1,145 +0,0 @@ - -Chapter 19. Profile Mode

Chapter 19. Profile Mode

Intro

- Goal: Give performance improvement advice based on - recognition of suboptimal usage patterns of the standard library. -

- Method: Wrap the standard library code. Insert - calls to an instrumentation library to record the internal state of - various components at interesting entry/exit points to/from the standard - library. Process trace, recognize suboptimal patterns, give advice. - For details, see the - Perflint - paper presented at CGO 2009. -

- Strengths: -

  • - Unintrusive solution. The application code does not require any - modification. -

  • The advice is call context sensitive, thus capable of - identifying precisely interesting dynamic performance behavior. -

  • - The overhead model is pay-per-view. When you turn off a diagnostic class - at compile time, its overhead disappears. -

-

- Drawbacks: -

  • - You must recompile the application code with custom options. -

  • You must run the application on representative input. - The advice is input dependent. -

  • - The execution time will increase, in some cases by factors. -

-

Using the Profile Mode

- This is the anticipated common workflow for program foo.cc: -

-$ cat foo.cc
-#include <vector>
-int main() {
-  std::vector<int> v;
-  for (int k = 0; k < 1024; ++k) v.insert(v.begin(), k);
-}
-
-$ g++ -D_GLIBCXX_PROFILE foo.cc
-$ ./a.out
-$ cat libstdcxx-profile.txt
-vector-to-list: improvement = 5: call stack = 0x804842c ...
-    : advice = change std::vector to std::list
-vector-size: improvement = 3: call stack = 0x804842c ...
-    : advice = change initial container size from 0 to 1024
-

-

- Anatomy of a warning: -

  • - Warning id. This is a short descriptive string for the class - that this warning belongs to. E.g., "vector-to-list". -

  • - Estimated improvement. This is an approximation of the benefit expected - from implementing the change suggested by the warning. It is given on - a log10 scale. Negative values mean that the alternative would actually - do worse than the current choice. - In the example above, 5 comes from the fact that the overhead of - inserting at the beginning of a vector vs. a list is around 1024 * 1024 / 2, - which is on the order of 10^5. The improvement from setting the initial size to - 1024 is on the order of 10^3, since the overhead of dynamic resizing is - linear in this case. -

  • - Call stack. Currently, the addresses are printed without - symbol name or code location attribution. - Users are expected to postprocess the output using, for instance, addr2line. -

  • - The warning message. For some warnings, this is static text, e.g., - "change vector to list". For other warnings, such as the one above, - the message contains numeric advice, e.g., the suggested initial size - of the vector. -

-
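As a rough check of the arithmetic in the "Estimated improvement" item, the following sketch computes both estimates on a log10 scale. The helper names are hypothetical, and the operation counts are the approximations quoted in the text, not values taken from the profile mode's actual cost model.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: approximate number of element shifts caused by
// n insertions at the front of a vector (0 + 1 + ... + (n - 1)).
double front_insert_shifts(double n) { return n * (n - 1.0) / 2.0; }

// Hypothetical helper: express a count of saved operations on the
// log10 scale used in the report.
double log10_improvement(double saved_operations) {
  assert(saved_operations > 0.0);
  return std::log10(saved_operations);
}
```

For n = 1024 the first estimate is about 5.7 and the second (a linear cost of 1024) is about 3.0. Truncating both gives the 5 and 3 shown in the sample output, although whether the implementation truncates or rounds is an assumption here.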

Three files are generated. libstdcxx-profile.txt - contains human readable advice. libstdcxx-profile.raw - contains implementation specific data about each diagnostic. - Their format is not documented. They are sufficient to generate - all the advice given in libstdcxx-profile.txt. The advantage - of keeping this raw format is that traces from multiple executions can - be aggregated simply by concatenating the raw traces. We intend to - offer an external utility program that can issue advice from a trace. - libstdcxx-profile.conf.out lists the actual diagnostic - parameters used. To alter parameters, edit this file and rename it to - libstdcxx-profile.conf. -

Advice is given regardless of whether the transformation is valid. - For instance, we advise changing a map to an unordered_map even if the - application semantics require that data be ordered. - We believe such warnings can help users understand the performance - behavior of their application better, which can lead to changes - at a higher abstraction level. -
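To illustrate why such advice may be invalid for a given program, the sketch below contrasts iteration order in the two containers. The function names are hypothetical and exist only for this example.

```cpp
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Collect the keys of any map-like container in iteration order.
template <typename Map>
std::vector<int> keys_in_iteration_order(const Map& m) {
  std::vector<int> keys;
  for (const auto& kv : m) keys.push_back(kv.first);
  return keys;
}

// std::map iterates in sorted key order regardless of insertion order.
std::vector<int> ordered_keys() {
  std::map<int, std::string> m{{3, "c"}, {1, "a"}, {2, "b"}};
  return keys_in_iteration_order(m);
}

// std::unordered_map makes no promise about iteration order.
std::vector<int> unordered_keys() {
  std::unordered_map<int, std::string> m{{3, "c"}, {1, "a"}, {2, "b"}};
  return keys_in_iteration_order(m);
}
```

Code that relies on the sorted order returned by `ordered_keys()` would break after the suggested change, even if lookups became faster.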

Tuning the Profile Mode

Compile time switches and environment variables (see also file - profiler.h). Unless specified otherwise, they can be set at compile time - using -D_<name> or by setting variable <name> - in the environment where the program is run, before starting execution. -

  • - _GLIBCXX_PROFILE_NO_<diagnostic>: - disable specific diagnostics. - See section Diagnostics for possible values. - (Environment variables not supported.) -

  • - _GLIBCXX_PROFILE_TRACE_PATH_ROOT: set an alternative root - path for the output files. -

  • _GLIBCXX_PROFILE_MAX_WARN_COUNT: set it to the maximum - number of warnings desired. The default value is 10.

  • - _GLIBCXX_PROFILE_MAX_STACK_DEPTH: if set to 0, - the advice will - be collected and reported for the program as a whole, and not for each - call context. - This could also be used in continuous regression tests, where you - just need to know whether there is a regression or not. - The default value is 32. -

  • - _GLIBCXX_PROFILE_MEM_PER_DIAGNOSTIC: - set a limit on how much memory to use for the accounting tables for each - diagnostic type. When this limit is reached, new events are ignored - until the memory usage decreases under the limit. Generally, this means - that newly created containers will not be instrumented until some - live containers are deleted. The default is 128 MB. -

  • - _GLIBCXX_PROFILE_NO_THREADS: - Make the library not use threads. If thread local storage (TLS) is not - available, you will get a preprocessor error asking you to set - -D_GLIBCXX_PROFILE_NO_THREADS if your program is single-threaded. - Multithreaded execution without TLS is not supported. - (Environment variable not supported.) -

  • - _GLIBCXX_HAVE_EXECINFO_H: - This name should be defined automatically at library configuration time. - If your library was configured without execinfo.h, but - you have it in your include path, you can define it explicitly. Without - it, advice is collected for the program as a whole, and not for each - call context. - (Environment variable not supported.) -

-

Bibliography

- Perflint: A Context Sensitive Performance Advisor for C++ Programs. Lixia Liu and Silvius Rus. Proceedings of the 2009 International Symposium on Code Generation and Optimization (CGO), 2009.

\ No newline at end of file diff --git a/libstdc++-v3/doc/html/manual/profile_mode_api.html b/libstdc++-v3/doc/html/manual/profile_mode_api.html deleted file mode 100644 index e63bd5701c6..00000000000 --- a/libstdc++-v3/doc/html/manual/profile_mode_api.html +++ /dev/null @@ -1,9 +0,0 @@ - -Extensions for Custom Containers

Extensions for Custom Containers

- Many large projects use their own data structures instead of the ones in the - standard library. If these data structures are similar in functionality - to the standard library, they can be instrumented with the same hooks - that are used to instrument the standard library. - The instrumentation API is exposed in file - profiler.h (look for "Instrumentation hooks"). -

\ No newline at end of file diff --git a/libstdc++-v3/doc/html/manual/profile_mode_cost_model.html b/libstdc++-v3/doc/html/manual/profile_mode_cost_model.html deleted file mode 100644 index bc87048b4df..00000000000 --- a/libstdc++-v3/doc/html/manual/profile_mode_cost_model.html +++ /dev/null @@ -1,17 +0,0 @@ - -Empirical Cost Model

Empirical Cost Model

- Currently, the cost model uses formulas with predefined relative weights - for alternative containers or container implementations. For instance, - iterating through a vector is X times faster than iterating through a list. -

- (Under development.) - We are working on customizing this to a particular machine by providing - an automated way to compute the actual relative weights for operations - on the given machine. -

- (Under development.) - We plan to provide a performance parameter database format that can be - filled in either by hand or by an automated training mechanism. - The analysis module will then use this database instead of the built-in - generic parameters. -

\ No newline at end of file diff --git a/libstdc++-v3/doc/html/manual/profile_mode_design.html b/libstdc++-v3/doc/html/manual/profile_mode_design.html deleted file mode 100644 index 8ce51c88950..00000000000 --- a/libstdc++-v3/doc/html/manual/profile_mode_design.html +++ /dev/null @@ -1,121 +0,0 @@ - -Design

Design

-

Table 19.1. Profile Code Location

Code Location | Use
libstdc++-v3/include/std/* | Preprocessor code to redirect to profile extension headers.
libstdc++-v3/include/profile/* | Profile extension public headers (map, vector, ...).
libstdc++-v3/include/profile/impl/* | Profile extension internals. Implementation files are - only included from impl/profiler.h, which is the only - file included from the public headers.

-

Wrapper Model

- In order to get our instrumented library version included instead of the - release one, - we use the same wrapper model as the debug mode. - We subclass entities from the release version. Wherever - _GLIBCXX_PROFILE is defined, the release namespace is - std::__norm, whereas the profile namespace is - std::__profile. Using plain std translates - into std::__profile. -
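The wrapper model can be sketched in miniature as follows. This is an illustration under stated assumptions, not the library's code: the `demo`, `norm`, and `profile` namespaces, the `DEMO_PROFILE` macro, and the counter are all hypothetical stand-ins for `std`, `std::__norm`, `std::__profile`, and `_GLIBCXX_PROFILE`.

```cpp
#include <cstddef>
#include <vector>

namespace demo {

// Stand-in for the release namespace.
namespace norm {
template <typename T>
using vector = std::vector<T>;
}  // namespace norm

// Stand-in for the profile namespace: subclass and instrument.
namespace profile {

// Hypothetical counter of insertions at the front of a non-empty vector.
inline std::size_t& front_insert_count() {
  static std::size_t count = 0;
  return count;
}

template <typename T>
class vector : public norm::vector<T> {
  using base = norm::vector<T>;

 public:
  using base::base;  // inherit the release constructors

  // Record the event, then forward to the release implementation.
  typename base::iterator insert(typename base::const_iterator pos,
                                 const T& value) {
    if (!this->empty() && pos == this->cbegin()) ++front_insert_count();
    return base::insert(pos, value);
  }
};

}  // namespace profile

// Plain demo::vector resolves to one version or the other.
#ifdef DEMO_PROFILE
using profile::vector;
#else
using norm::vector;
#endif

}  // namespace demo
```

Because the wrapper only overrides a public member and forwards to the base class, it does not depend on how the release container is implemented internally.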

- Whenever possible, we try to wrap at the public interface level, e.g., - in unordered_set rather than in hashtable, - in order not to depend on implementation. -

- Mixing object files built with and without the profile mode must - not affect the program execution. However, there are no guarantees to - the accuracy of diagnostics when using even a single object not built with - -D_GLIBCXX_PROFILE. - Currently, mixing the profile mode with debug and parallel extensions is - not allowed. Mixing them at compile time will result in preprocessor errors. - Mixing them at link time is undefined. -

Instrumentation

- Instead of instrumenting every public entry and exit point, - we chose to add instrumentation on demand, as needed - by individual diagnostics. - The main reason is that some diagnostics require us to extract bits of - internal state that are particular only to that diagnostic. - We plan to formalize this later, after we learn more about the requirements - of several diagnostics. -

- All the instrumentation points can be switched on and off using - -D[_NO]_GLIBCXX_PROFILE_<diagnostic> options. - With all the instrumentation calls off, there should be negligible - overhead over the release version. This property is needed to support - diagnostics based on timing of internal operations. For such diagnostics, - we anticipate turning most of the instrumentation off in order to prevent - profiling overhead from polluting time measurements, and thus diagnostics. -

- All the instrumentation on/off compile time switches live in - include/profile/profiler.h. -

Run Time Behavior

- For practical reasons, the instrumentation library processes the trace - partially - rather than dumping it to disk in raw form. Each event is processed when - it occurs. It is usually assigned a cost and aggregated into - the database of a specific diagnostic class. The cost model - is based largely on the standard performance guarantees, but in some - cases we use knowledge about GCC's standard library implementation. -

- Information is indexed by (1) call stack and (2) instance id or address - to be able to understand and summarize precise creation-use-destruction - dynamic chains. Although the analysis is sensitive to dynamic instances, - the reports are only sensitive to call context. Whenever a dynamic instance - is destroyed, we accumulate its effect to the corresponding entry for the - call stack of its constructor location. -
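A minimal sketch of this indexing scheme, assuming a call stack is represented as a vector of return addresses, might look like the following. The type and member names are hypothetical and do not correspond to the classes in include/profile/impl.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical representation: a call stack is a list of addresses.
using call_stack = std::vector<std::uintptr_t>;

struct instance_info {
  call_stack creation_stack;  // where the instance was constructed
  std::size_t cost;           // cost accumulated while it was alive
};

class trace_table {
 public:
  void on_construct(const void* instance, call_stack stack) {
    live_[instance] = instance_info{std::move(stack), 0};
  }

  void on_event(const void* instance, std::size_t cost) {
    auto it = live_.find(instance);
    if (it != live_.end()) it->second.cost += cost;
  }

  // On destruction, fold the instance's cost into its constructor's stack.
  void on_destruct(const void* instance) {
    auto it = live_.find(instance);
    if (it == live_.end()) return;
    per_stack_[it->second.creation_stack] += it->second.cost;
    live_.erase(it);
  }

  std::size_t cost_for(const call_stack& stack) const {
    auto it = per_stack_.find(stack);
    return it == per_stack_.end() ? 0 : it->second;
  }

 private:
  std::map<const void*, instance_info> live_;    // keyed by instance address
  std::map<call_stack, std::size_t> per_stack_;  // keyed by call context
};
```

Two instances created at the same call context contribute to a single entry, which is why the report is sensitive to call context but not to individual instances.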

- For details, see the Perflint paper presented at - CGO 2009. -

Analysis and Diagnostics

- Final analysis takes place offline, and it is based entirely on the - generated trace and debugging info in the application binary. - See section Diagnostics for a list of analysis types that we plan to support. -

- The input to the analysis is a table indexed by profile type and call stack. - The data type for each entry depends on the profile type. -

Cost Model

- While it is likely that cost models become complex as we get into - more sophisticated analysis, we will try to follow a simple set of rules - at the beginning. -

  • Relative benefit estimation: - The idea is to estimate or measure the cost of all operations - in the original scenario versus the scenario we advise to switch to. - For instance, when advising to change a vector to a list, an occurrence - of the insert method will generally count as a benefit. - Its magnitude depends on (1) the number of elements that get shifted - and (2) whether it triggers a reallocation. -

  • Synthetic measurements: - We will measure the relative difference between similar operations on - different containers. We plan to write a battery of small tests that - compare the times of the executions of similar methods on different - containers. The idea is to run these tests on the target machine. - If this training phase is very quick, we may decide to perform it at - library initialization time. The results can be cached on disk and reused - across runs. -

  • Timers: - We plan to use timers for operations of larger granularity, such as sort. - For instance, we can switch between different sort methods on the fly - and report the one that performs best for each call context. -

  • Show stoppers: - We may decide that the presence of an operation nullifies the advice. - For instance, when considering switching from set to - unordered_set, if we detect use of operator ++, - we will simply not issue the advice, since this could signal that the use - case requires a sorted container.

Reports

-There are two types of reports. First, if we recognize a pattern for which -we have a substitute that is likely to give better performance, we print -the advice and estimated performance gain. The advice is usually associated -with a code position and possibly a call stack. -

-Second, we report performance characteristics for which we do not have -a clear solution for improvement. For instance, we can point the user to -the top 10 multimap locations -which have the worst data locality in actual traversals. -Although this does not offer a solution, -it helps the user focus on the key problems and ignore the uninteresting ones. -

Testing

- First, we want to make sure we preserve the behavior of the release mode. - You can just type "make check-profile", which - builds and runs the whole test suite in profile mode. -

- Second, we want to test the correctness of each diagnostic. - We created a profile directory in the test suite. - Each diagnostic must come with at least two tests, one for false positives - and one for false negatives. -

\ No newline at end of file diff --git a/libstdc++-v3/doc/html/manual/profile_mode_devel.html b/libstdc++-v3/doc/html/manual/profile_mode_devel.html deleted file mode 100644 index 768c610ba80..00000000000 --- a/libstdc++-v3/doc/html/manual/profile_mode_devel.html +++ /dev/null @@ -1,67 +0,0 @@ - -Developer Information

Developer Information

Big Picture

The profile mode headers are included with - -D_GLIBCXX_PROFILE through preprocessor directives in - include/std/*. -

Instrumented implementations are provided in - include/profile/*. All instrumentation hooks are macros - defined in include/profile/profiler.h. -

All the implementation of the instrumentation hooks is in - include/profile/impl/*. Although all the code gets included, - thus is publicly visible, only a small number of functions are called from - outside this directory. All calls to hook implementations must be - done through macros defined in profiler.h. The macro - must ensure (1) that the call is guarded against reentrance and - (2) that the call can be turned off at compile time using a - -D_GLIBCXX_PROFILE_... compiler option. -
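The two requirements on hook macros can be sketched as follows. Every name here (`demo_profile`, `DEMO_PROFCXX_MAGIC_EVENT`, `DEMO_PROFILE_NO_MAGIC`) is hypothetical. The sketch shows only the general technique and is not the contents of profiler.h.

```cpp
namespace demo_profile {

// Per-thread flag: true while a hook is executing on this thread.
inline bool& in_hook() {
  static thread_local bool flag = false;
  return flag;
}

// Hypothetical event counter standing in for a real trace.
inline int& magic_events() {
  static int count = 0;
  return count;
}

inline void trace_magic_event() { ++magic_events(); }

// RAII guard: 'entered' is false if a hook is already active, so the
// nested call is skipped. The flag is reset even if the hook throws.
struct reentrance_guard {
  bool entered;
  reentrance_guard() : entered(!in_hook()) {
    if (entered) in_hook() = true;
  }
  ~reentrance_guard() {
    if (entered) in_hook() = false;
  }
};

}  // namespace demo_profile

#ifndef DEMO_PROFILE_NO_MAGIC
// Instrumentation on: guard against reentrance, then call the hook.
#define DEMO_PROFCXX_MAGIC_EVENT()                        \
  do {                                                    \
    demo_profile::reentrance_guard guard;                 \
    if (guard.entered) demo_profile::trace_magic_event(); \
  } while (0)
#else
// Instrumentation off: the macro expands to no code.
#define DEMO_PROFCXX_MAGIC_EVENT() \
  do {                             \
  } while (0)
#endif
```

Routing every call through the macro keeps both the reentrance check and the compile-time switch in one place, so instrumented headers never call the implementation directly.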

How To Add A Diagnostic

Let's say the diagnostic name is "magic". -

If you need to instrument a header not already under - include/profile/*, first edit the corresponding header - under include/std/ and add a preprocessor directive such - as the one in include/std/vector: -

-#ifdef _GLIBCXX_PROFILE
-# include <profile/vector>
-#endif
-

-

If the file you need to instrument is not yet under - include/profile/, make a copy of the one in - include/debug, or the main implementation. - You'll need to include the main implementation and inherit the classes - you want to instrument. Then define the methods you want to instrument, - define the instrumentation hooks and add calls to them. - Look at include/profile/vector for an example. -

Add macros for the instrumentation hooks in - include/profile/impl/profiler.h. - Hook names must start with __profcxx_. - Make sure they expand - to no code with -D_NO_GLIBCXX_PROFILE_MAGIC. - Make sure all calls to any method in namespace __gnu_profile - are protected against reentrance using macro - _GLIBCXX_PROFILE_REENTRANCE_GUARD. - All names of methods in namespace __gnu_profile called from - profiler.h must start with __trace_magic_. -

Add the implementation of the diagnostic. -

  • - Create new file include/profile/impl/profiler_magic.h. -

  • - Define class __magic_info: public __object_info_base. - This is the representation of a line in the object table. - The __merge method is used to aggregate information - across all dynamic instances created at the same call context. - The __magnitude must return the estimation of the benefit - as a number of small operations, e.g., number of words copied. - The __write method is used to produce the raw trace. - The __advice method is used to produce the advice string. -

  • - Define class __magic_stack_info: public __magic_info. - This defines the content of a line in the stack table. -

  • - Define class __trace_magic: public __trace_base<__magic_info, - __magic_stack_info>. - It defines the content of the trace associated with this diagnostic. -

-

Add initialization and reporting calls in - include/profile/impl/profiler_trace.h. Use - __trace_vector_to_list as an example. -

Add documentation in file doc/xml/manual/profile_mode.xml. -

\ No newline at end of file diff --git a/libstdc++-v3/doc/html/manual/profile_mode_impl.html b/libstdc++-v3/doc/html/manual/profile_mode_impl.html deleted file mode 100644 index e9495273d52..00000000000 --- a/libstdc++-v3/doc/html/manual/profile_mode_impl.html +++ /dev/null @@ -1,50 +0,0 @@ - -Implementation Issues

Implementation Issues

Stack Traces

- Accurate stack traces are needed during profiling since we group events by - call context and dynamic instance. Without accurate traces, diagnostics - may be hard to interpret. For instance, when giving advice to the user - it is imperative to reference application code, not library code. -

- Currently we are using the libc backtrace routine to get - stack traces. - _GLIBCXX_PROFILE_MAX_STACK_DEPTH can be set - to 0 if you are willing to give up call context information, or to a small - positive value to reduce run time overhead. -
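As an illustration of depth-limited stack capture, the sketch below wraps the glibc `backtrace` function from `<execinfo.h>`. The wrapper name `capture_stack` is hypothetical, and the code assumes a platform that provides `<execinfo.h>`.

```cpp
#include <execinfo.h>

#include <cstddef>
#include <vector>

// Hypothetical wrapper: capture at most max_depth return addresses.
// A depth of 0 gives up call context entirely and returns no frames.
std::vector<void*> capture_stack(int max_depth) {
  if (max_depth <= 0) return {};
  std::vector<void*> frames(static_cast<std::size_t>(max_depth));
  int captured = backtrace(frames.data(), max_depth);
  frames.resize(static_cast<std::size_t>(captured));
  return frames;
}
```

A smaller depth means fewer frames to walk and to compare on each event, at the price of merging call contexts that differ only in their outer frames.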

Symbolization of Instruction Addresses

- The profiling and analysis phases use only instruction addresses. - An external utility such as addr2line is needed to postprocess the result. - We do not plan to add symbolization support in the profile extension. - This would require access to symbol tables, debug information tables, - external programs or libraries and other system dependent information. -

Concurrency

- Our current model is simplistic, but precise. - We cannot afford to approximate because some of our diagnostics require - precise matching of operations to container instance and call context. - During profiling, we keep a single information table per diagnostic. - There is a single lock per information table. -

Using the Standard Library in the Instrumentation Implementation

- As much as we would like to avoid uses of libstdc++ within our - instrumentation library, containers such as unordered_map are very - appealing. We plan to use them as long as they are named properly - to avoid ambiguity. -

Malloc Hooks

- User applications/libraries can provide malloc hooks. - When the implementation of the malloc hooks uses libstdc++, there can - be an infinite cycle between the profile mode instrumentation and the - malloc hook code. -

- We protect against reentrance to the profile mode instrumentation code, - which should avoid this problem in most cases. - The protection mechanism is thread safe and exception safe. - This mechanism does not prevent reentrance to the malloc hook itself, - which could still result in deadlock, if, for instance, the malloc hook - uses non-recursive locks. - XXX: A definitive solution to this problem would be for the profile extension - to use a custom allocator internally, and perhaps not to use libstdc++. -

Construction and Destruction of Global Objects

- The profiling library state is initialized at the first call to a profiling - method. This allows us to record the construction of all global objects. - However, we cannot do the same at destruction time. The trace is written - by a function registered by atexit, thus invoked by - exit. -
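The lazy initialization and `atexit` registration described here can be sketched as follows. The names are hypothetical, and the report writer is reduced to a counter.

```cpp
#include <cstdlib>

namespace demo_profile {

// Hypothetical stand-in for writing the trace to disk.
inline int& report_count() {
  static int count = 0;
  return count;
}

void write_report() { ++report_count(); }

// Called at the start of every hook. The first call initializes the
// profiling state and registers the report writer to run at exit.
// Returns true only for the call that performed the initialization.
bool ensure_initialized() {
  static bool initialized = false;
  if (initialized) return false;
  initialized = true;
  std::atexit(write_report);
  return true;
}

}  // namespace demo_profile
```

Because the handler runs during `exit`, any object destroyed after the handler has run would not appear in the written report.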

\ No newline at end of file