|  | 
|  | 1 | +# Profiling on Windows  | 
|  | 2 | + | 
|  | 3 | +## Introducing WPR and WPA | 
|  | 4 | + | 
|  | 5 | +High-level performance analysis (including memory usage) can be performed with the Windows | 
|  | 6 | +Performance Recorder (WPR) and Windows Performance Analyzer (WPA). As the names suggest, WPR is for | 
|  | 7 | +recording system statistics (in the form of event trace log files, a.k.a. ETL files), while WPA is for | 
|  | 8 | +analyzing these ETL files. | 
|  | 9 | + | 
|  | 10 | +WPR collects system-wide statistics, so it won't just record things relevant to rustc but also | 
|  | 11 | +everything else that's running on the machine. During analysis, we can filter to just the things we | 
|  | 12 | +find interesting.  | 
|  | 13 | + | 
|  | 14 | +These tools are quite powerful but also require a bit of learning before we can successfully profile | 
|  | 15 | +the Rust compiler. | 
|  | 16 | + | 
|  | 17 | +Here we will explore how to use WPR and WPA for analyzing the Rust compiler as well as provide | 
|  | 18 | +links to useful "profiles" (i.e., settings files that tweak the defaults for WPR and WPA) that are | 
|  | 19 | +specifically designed to make analyzing rustc easier. | 
|  | 20 | + | 
|  | 21 | +### Installing WPR and WPA | 
|  | 22 | + | 
|  | 23 | +You can install WPR and WPA as part of the Windows Performance Toolkit, which is an optional | 
|  | 24 | +component of the Windows Assessment and Deployment Kit (ADK). You can download the ADK | 
|  | 25 | +installer [here](https://go.microsoft.com/fwlink/?linkid=2086042). Make sure to select the Windows | 
|  | 26 | +Performance Toolkit (you don't need to select anything else).  | 
|  | 27 | + | 
|  | 28 | +## Recording | 
|  | 29 | + | 
|  | 30 | +In order to perform system analysis, you'll first need to record your system with WPR. Open WPR and | 
|  | 31 | +at the bottom of the window select the "profiles" of the things you want to record. For looking | 
|  | 32 | +into memory usage of the rustc bootstrap process, we'll want to select the following items: | 
|  | 33 | + | 
|  | 34 | +* CPU usage | 
|  | 35 | +* VirtualAlloc usage | 
|  | 36 | + | 
|  | 37 | +You might be tempted to record "Heap usage" as well, but this records every single heap allocation | 
|  | 38 | +and can be very, very expensive. For high-level analysis, it might be best to leave that turned | 
|  | 39 | +off. | 
|  | 40 | + | 
|  | 41 | +Now we need to get our setup ready to record. For memory usage analysis, it is best to record the | 
|  | 42 | +stage 2 compiler build using a stage 1 compiler that was built with debug symbols. Having symbols in the | 
|  | 43 | +compiler we're using to build rustc will aid our analysis greatly by allowing WPA to resolve Rust | 
|  | 44 | +symbols correctly. Unfortunately, the stage 0 compiler does not have symbols turned on, which is why | 
|  | 45 | +we'll need to build a stage 1 compiler and then a stage 2 compiler ourselves.  | 
|  | 46 | + | 
|  | 47 | +To do this, make sure you have set `debuginfo-level = 1` in your `config.toml` file. This tells | 
|  | 48 | +rustc to generate debug information which includes stack frames when bootstrapping. | 
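
As a minimal sketch, the setting might look like the fragment below. It assumes the key belongs in the `[rust]` table of `config.toml`, which the text above does not state explicitly, so check it against the example config shipped in your checkout.

```toml
# Assumption: `debuginfo-level` is placed under the `[rust]` table.
[rust]
# Level 1 emits limited debug information, enough for WPA to resolve
# Rust symbols in stack traces.
debuginfo-level = 1
```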
|  | 49 | + | 
|  | 50 | +Now you can build the stage 1 compiler: `python x.py build --stage 1 -i library/std` or however | 
|  | 51 | +else you want to build the stage 1 compiler. | 
|  | 52 | + | 
|  | 53 | +Now that the stage 1 compiler is built, we can record the stage 2 build. Go back to WPR, click the | 
|  | 54 | +"start" button and build the stage 2 compiler (e.g., `python x.py build --stage 2 -i library/std`). | 
|  | 55 | +When this process finishes, stop the recording.  | 
|  | 56 | + | 
|  | 57 | +Click the Save button and once that process is complete, click the "Open in WPA" button which | 
|  | 58 | +appears. | 
|  | 59 | + | 
|  | 60 | +Note: The trace file is fairly large so it can take WPA some time to finish opening the file. | 
|  | 61 | + | 
|  | 62 | +## Analysis | 
|  | 63 | + | 
|  | 64 | +Now that our ETL file is open in WPA, we can analyze the results. First, we'll want to apply the | 
|  | 65 | +pre-made "profile" which will put WPA into a state conducive to analyzing rustc bootstrap. Download | 
|  | 66 | +the profile [here](https://github.com/wesleywiser/rustc-bootstrap-wpa-analysis/releases/download/1/rustc.generic.wpaProfile). | 
|  | 67 | +Select the "Profiles" menu at the top, then "apply" and then choose the downloaded profile.  | 
|  | 68 | + | 
|  | 69 | +You should see something resembling the following: | 
|  | 70 | + | 
|  | 71 | + | 
|  | 72 | + | 
|  | 73 | +Next, we will need to tell WPA to load and process debug symbols so that it can properly demangle | 
|  | 74 | +the Rust stack traces. To do this, click "Trace" and then choose "Load Symbols". This step can take | 
|  | 75 | +a while. | 
|  | 76 | + | 
|  | 77 | +Once WPA has loaded symbols for rustc, we can expand the rustc.exe node and begin drilling down | 
|  | 78 | +into the stack with the largest allocations. | 
|  | 79 | + | 
|  | 80 | +To do that, we'll expand the `[Root]` node in the "Commit Stack" column and continue expanding | 
|  | 81 | +until we find interesting stack frames. | 
|  | 82 | + | 
|  | 83 | +> Tip: After selecting the node you want to expand, press the right arrow key. This will expand the | 
|  | 84 | +node and put the selection on the next largest node in the expanded set. You can continue pressing | 
|  | 85 | +the right arrow key until you reach an interesting frame.  | 
|  | 86 | + | 
|  | 87 | + | 
|  | 88 | + | 
|  | 89 | +In this sample, you can see calls through codegen are allocating ~30 GB of memory in total | 
|  | 90 | +throughout this profile. | 
|  | 91 | + | 
|  | 92 | +## Other Analysis Tabs | 
|  | 93 | + | 
|  | 94 | +The profile also includes a few other tabs which can be helpful: | 
|  | 95 | + | 
|  | 96 | +- System Configuration | 
|  | 97 | +    - General information about the system the capture was recorded on. | 
|  | 98 | +- rustc Build Processes | 
|  | 99 | +    - A flat list of relevant processes such as rustc.exe, cargo.exe, link.exe etc. | 
|  | 100 | +    - Each process lists its command line arguments. | 
|  | 101 | +    - Useful for figuring out what a specific rustc process was working on. | 
|  | 102 | +- rustc Build Process Tree | 
|  | 103 | +    - Timeline showing when processes started and exited. | 
|  | 104 | +- rustc CPU Analysis | 
|  | 105 | +    - Contains charts preconfigured to show hotspots in rustc. | 
|  | 106 | +    - These charts are designed to support analyzing where rustc is spending its time. | 
|  | 107 | +- rustc Memory Analysis | 
|  | 108 | +    - Contains charts preconfigured to show where rustc is allocating memory. | 