The metric we all use for CPU utilization is deeply misleading, and getting worse every year. What is CPU utilization? How busy your processors are? No, that's not what it measures. Yes, I'm talking about the "%CPU" metric used everywhere, by everyone.

What you may think 90% CPU utilization means:

[Figure: drawing of busy versus stalled time]

Stalled means the processor was not making forward progress with instructions, and usually happens because it is waiting on memory I/O. The ratio I drew above (between busy and stalled) is what I typically see in production. Chances are, you're mostly stalled, but don't know it.

What does this mean for you? Understanding how much your CPUs are stalled can direct performance tuning efforts between reducing code or reducing memory I/O. Anyone looking at CPU performance, especially on clouds that auto scale based on CPU, would benefit from knowing the stalled component of their %CPU.

The metric we call CPU utilization is really "non-idle time": the time the CPU was not running the idle thread. Your operating system kernel (whatever it is) usually tracks this during context switch. If a non-idle thread begins running, then stops 100 milliseconds later, the kernel considers that CPU utilized that entire time.

This metric is as old as time sharing systems. The Apollo Lunar Module guidance computer (a pioneering time sharing system) called its idle thread the "DUMMY JOB", and engineers tracked cycles running it vs real tasks as an important computer utilization metric.
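To make the "non-idle time" definition concrete, here is a minimal Python sketch of accounting done only at context switches. It is a toy model, not any real kernel's code: the `CpuAccounting` class, the `context_switch` method, and the thread names are all hypothetical, chosen just to replay the 100 millisecond example above.

```python
from dataclasses import dataclass

IDLE = "idle"  # hypothetical name for the idle thread


@dataclass
class CpuAccounting:
    """Toy per-CPU accounting, updated only at context switches."""

    current: str = IDLE          # thread currently on the CPU
    switched_at_ms: float = 0.0  # time of the last context switch
    non_idle_ms: float = 0.0     # accumulated "utilized" time

    def context_switch(self, now_ms: float, next_thread: str) -> None:
        # Charge the whole elapsed interval to the outgoing thread.
        # Nothing here asks whether the CPU was retiring instructions
        # or stalled during that interval.
        elapsed = now_ms - self.switched_at_ms
        if self.current != IDLE:
            self.non_idle_ms += elapsed
        self.current = next_thread
        self.switched_at_ms = now_ms

    def utilization(self, window_ms: float) -> float:
        """Percent of the window spent off the idle thread."""
        return 100.0 * self.non_idle_ms / window_ms


def run_example() -> float:
    cpu = CpuAccounting()
    cpu.context_switch(0.0, "worker")  # a non-idle thread begins running
    cpu.context_switch(100.0, IDLE)    # it stops 100 ms later
    return cpu.utilization(200.0)      # measured over a 200 ms window


if __name__ == "__main__":
    print(f"%CPU = {run_example():.1f}")
```

In this sketch the full 100 ms counts as utilized because the accounting only sees which thread was on the CPU. The result would be the same whether the worker spent that time making forward progress or waiting on memory I/O, which is why the stalled component has to be measured some other way.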