Optimizing Linux Performance: A Hands-On Guide to Linux Performance Tools

Phillip G. Ezolt

Table of Contents:

1. Performance Hunting Tips.

    General Tips.

      Take Copious Notes (Save Everything).

      Automate Redundant Tasks.

      Choose Low-Overhead Tools If Possible.

      Use Multiple Tools to Understand the Problem.

      Trust Your Tools.

      Use the Experience of Others (Cautiously).

    Outline of a Performance Investigation.

      Finding a Metric, Baseline, and Target.

      Track Down the Approximate Problem.

      See Whether the Problem Has Already Been Solved.

      The Case Begins (Start to Investigate).

      Document, Document, Document.

    Chapter Summary.

2. Performance Tools: System CPU.

    CPU Performance Statistics.

      Run Queue Statistics.

      Context Switches.

      CPU Utilization.

    Linux Performance Tools: CPU.

      vmstat (Virtual Memory Statistics).

      top (v. 2.0.x).

      top (v. 3.x.x).

      procinfo (Display Info from the /proc File System).

      mpstat (Multiprocessor Stat).

      sar (System Activity Reporter).

    Chapter Summary.

3. Performance Tools: System Memory.

    Memory Performance Statistics.

      Memory Subsystem and Performance.

      Memory Subsystem (Virtual Memory).

    Linux Performance Tools: CPU and Memory.

      vmstat (Virtual Memory Statistics) II.

      top (2.x and 3.x).

      procinfo II.

      gnome-system-monitor (II).

      sar (II).

    Chapter Summary.

4. Performance Tools: Process-Specific CPU.

    Process Performance Statistics.

      Kernel Time Versus User Time.

      Library Time Versus Application Time.

      Subdividing Application Time.

    The Tools.

      ps (Process Status).

      ld.so (Dynamic Loader).

      oprofile (II).

      Languages: Static (C and C++) Versus Dynamic (Java and Mono).

    Chapter Summary.

5. Performance Tools: Process-Specific Memory.

    Linux Memory Subsystem.

    Memory Performance Tools.

      valgrind (cachegrind).

      oprofile (III).

      Dynamic Languages (Java, Mono).

    Chapter Summary.

6. Performance Tools: Disk I/O.

    Introduction to Disk I/O.

    Disk I/O Performance Tools.

      vmstat (II).

      lsof (List Open Files).

    What’s Missing?

    Chapter Summary.

7. Performance Tools: Network.

    Introduction to Network I/O.

      Network Traffic in the Link Layer.

      Protocol-Level Network Traffic.

    Network Performance Tools.

      mii-tool (Media-Independent Interface Tool).

      ifconfig (Interface Configure).

    Chapter Summary.

8. Utility Tools: Performance Tool Helpers.

    Performance Tool Helpers.

      Automating and Recording Commands.

      Graphing and Analyzing Performance Statistics.

      Investigating the Libraries That an Application Uses.

      Creating and Debugging Applications.

      GNU Debugger (gdb).

      gcc (GNU Compiler Collection).

    Chapter Summary.

9. Using Performance Tools to Find Problems.

    Not Always a Silver Bullet.

    Starting the Hunt.

    Optimizing an Application.

      Is Memory Usage a Problem?

      Is Startup Time a Problem?

      Is the Loader Introducing a Delay?

      Is CPU Usage (or Length of Time to Complete) a Problem?

      Is the Application’s Disk Usage a Problem?

      Is the Application’s Network Usage a Problem?

    Optimizing a System.

      Is the System CPU-Bound?

      Is a Single Processor CPU-Bound?

      Are One or More Processes Using Most of the System CPU?

      Are One or More Processes Using Most of an Individual CPU?

      Is the Kernel Servicing Many Interrupts?

      Where Is Time Spent in the Kernel?

      Is the Amount of Swap Space Being Used Increasing?

      Is the System I/O-Bound?

      Is the System Using Disk I/O?

      Is the System Using Network I/O?

    Optimizing Process CPU Usage.

      Is the Process Spending Time in User or Kernel Space?

      Which System Calls Is the Process Making, and How Long Do They Take to Complete?

      In Which Functions Does the Process Spend Time?

      What Is the Call Tree to the Hot Functions?

      Do Cache Misses Correspond to the Hot Functions or Source Lines?

    Optimizing Memory Usage.

      Is the Kernel Memory Usage Increasing?

      What Type of Memory Is the Kernel Using?

      Is a Particular Process’s Resident Set Size Increasing?

      Is Shared Memory Usage Increasing?

      Which Processes Are Using the Shared Memory?

      What Type of Memory Is the Process Using?

      What Functions Are Using All of the Stack?

      What Functions Have the Biggest Text Size?

      How Big Are the Libraries That the Process Uses?

      What Functions Are Allocating Heap Memory?

    Optimizing Disk I/O Usage.

      Is the System Stressing a Particular Disk?

      Which Application Is Accessing the Disk?

      Which Files Are Accessed by the Application?

    Optimizing Network I/O Usage.

      Is Any Network Device Sending/Receiving Near the Theoretical Limit?

      Is Any Network Device Generating a Large Number of Errors?

      What Type of Traffic Is Running on That Device?

      Is a Particular Process Responsible for That Traffic?

      What Remote System Is Sending the Traffic?

      Which Application Socket Is Responsible for the Traffic?

    The End.

    Chapter Summary.

10. Performance Hunt 1: A CPU-Bound Application (GIMP).

    CPU-Bound Application.

    Identify a Problem.

    Find a Baseline/Set a Goal.

    Configure the Application for the Performance Hunt.

    Install and Configure Performance Tools.

    Run Application and Performance Tools.

    Analyze the Results.

    Jump to the Web.

    Increase the Image Cache.

    Hitting a (Tiled) Wall.

    Solving the Problem.

    Verify Correctness?

    Next Steps.

    Chapter Summary.

11. Performance Hunt 2: A Latency-Sensitive Application (nautilus).

    A Latency-Sensitive Application.

    Identify a Problem.

    Find a Baseline/Set a Goal.

    Configure the Application for the Performance Hunt.

    Install and Configure Performance Tools.

    Run Application and Performance Tools.

    Compile and Examine the Source.

    Using gdb to Generate Call Traces.

    Finding the Time Differences.

    Trying a Possible Solution.

    Chapter Summary.

12. Performance Hunt 3: The System-Wide Slowdown (prelink).

    Investigating a System-Wide Slowdown.

    Identify a Problem.

    Find a Baseline/Set a Goal.

    Configure the Application for the Performance Hunt.

    Install and Configure Performance Tools.

    Run Application and Performance Tools.

    Simulating a Solution.

    Reporting the Problem.

    Testing the Solution.

    Chapter Summary.

13. Performance Tools: What’s Next?

    The State of Linux Tools.

    What Tools Does Linux Still Need?

      Hole 1: Performance Statistics Are Scattered.

      Hole 2: No Reliable and Complete Call Tree.

      Hole 3: I/O Attribution.

    Performance Tuning on Linux.

      Available Source.

      Easy Access to Developers.

      Linux Is Still Young.

    Chapter Summary.

Appendix A. Performance Tool Locations.

Appendix B. Installing oprofile.

    Fedora Core 2 (FC2).

    Enterprise Linux 3 (EL3).

    SUSE 9.1.