IO Stats on Linux OS with Python

Finding performance bottlenecks on Linux is easy for CPU and memory usage. When the bottleneck is disk IO, it seems harder to find the right tool. Luckily there is an easy way to read the disk IO statistics in Linux, try:

cat /proc/diskstats

This gives you cumulative totals for each device. After the major number, minor number and device name, the values are:

  • read I/Os (requests): number of read I/Os processed
  • read merges (requests): number of read I/Os merged with in-queue I/O
  • read sectors (sectors): number of sectors read
  • read ticks (milliseconds): total wait time for read requests
  • write I/Os (requests): number of write I/Os processed
  • write merges (requests): number of write I/Os merged with in-queue I/O
  • write sectors (sectors): number of sectors written
  • write ticks (milliseconds): total wait time for write requests
  • in_flight (requests): number of I/Os currently in flight
  • io_ticks (milliseconds): total time this block device has been active
  • time_in_queue (milliseconds): total wait time for all requests

Check: https://www.kernel.org/doc/Documentation/block/stat.txt
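As a rough sketch of how these fields line up, one line of /proc/diskstats can be split into the device name plus the eleven values listed above. The function name parse_diskstats_line, the FIELDS labels and the sample line are made up for illustration; they are not part of any library or of real output from a machine.

```python
# Labels for the eleven values, in the order listed above (names are my own)
FIELDS = [
    "read_ios", "read_merges", "read_sectors", "read_ticks",
    "write_ios", "write_merges", "write_sectors", "write_ticks",
    "in_flight", "io_ticks", "time_in_queue",
]

def parse_diskstats_line(line):
    """Split one /proc/diskstats line into (device name, dict of values)."""
    parts = line.split()
    # parts[0] and parts[1] are the major and minor numbers, parts[2] the name
    name = parts[2]
    # Only the first eleven values are used; any extra columns are ignored
    values = [int(v) for v in parts[3:3 + len(FIELDS)]]
    return name, dict(zip(FIELDS, values))

# Made-up sample line, not real output
sample = "   8       0 sda 1200 30 96000 450 800 120 64000 900 0 700 1350"
name, stats = parse_diskstats_line(sample)
print(name, stats["read_sectors"], stats["write_sectors"])
```

On a real system you would loop over the lines of /proc/diskstats and pass each one to the function.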

The main problem is that these values are totals since Linux was booted. What I want is the disk IO over a certain amount of time.

Monitor IO with Python

Python is installed by default in most Linux distributions, which makes it easy to create your own scripts. In our case we want to retrieve the disk IO statistics twice with an interval in between; the delta between the two samples gives us the amount of IO used.

By default I set the interval to 5 seconds, but you can choose a custom value by passing it as an argument to the command. This is the Python script I created for this task:

# Python IOstat
# https://www.kernel.org/doc/Documentation/block/stat.txt

import time
import os
import sys

# Use the first command line argument as the interval, otherwise 5 seconds
if len(sys.argv) > 1:
    timer = float(sys.argv[1])
else:
    timer = 5

# First sample. split() without arguments splits on any whitespace and
# drops empty strings, so the result is a plain list that can be indexed.
stats = os.popen("cat /sys/block/sda/stat").read()
stats_array = stats.split()

time.sleep(timer)

# Second sample, taken after the interval
stats_2 = os.popen("cat /sys/block/sda/stat").read()
stats_array_2 = stats_2.split()

# Delta of each of the first 11 fields between the two samples
iototal = [int(stats_array_2[i]) - int(stats_array[i]) for i in range(11)]

# The deltas are integers, so convert them with str() before concatenating
description = [
    "IO stats over last " + str(timer) + " seconds\n",
    "read I/Os : " + str(iototal[0]),
    "read merges : " + str(iototal[1]),
    "read sectors : " + str(iototal[2]),
    "read ticks (ms) : " + str(iototal[3]),
    "write I/Os : " + str(iototal[4]),
    "write merges : " + str(iototal[5]),
    "write sectors : " + str(iototal[6]),
    "write ticks (ms) : " + str(iototal[7]),
    "in_flight : " + str(iototal[8]),
    "io_ticks (ms) : " + str(iototal[9]),
    "time_in_queue (ms) : " + str(iototal[10])
]
for d in description:
    print(d)

So this script basically retrieves the values for device ‘sda’ twice and calculates the delta between them. The script is very basic and should be readable for a beginning Python learner.
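If you want rates instead of raw deltas, the same idea can be taken one step further. The sketch below is only one possible variation, not part of the script above: read_stat and io_rates are names I made up, the two sample strings are invented, and the conversion assumes the 512-byte sectors that the kernel documentation describes for these counters.

```python
SECTOR_BYTES = 512  # assumption: the stat counters use 512-byte sectors

def read_stat(device="sda"):
    """Return the raw text of /sys/block/<device>/stat (hypothetical helper)."""
    with open("/sys/block/{}/stat".format(device)) as f:
        return f.read()

def io_rates(before, after, interval):
    """Turn two raw stat samples into per-second rates over `interval` seconds."""
    b = [int(v) for v in before.split()]
    a = [int(v) for v in after.split()]
    return {
        "reads_per_s": (a[0] - b[0]) / float(interval),
        "writes_per_s": (a[4] - b[4]) / float(interval),
        "read_bytes_per_s": (a[2] - b[2]) * SECTOR_BYTES / float(interval),
        "write_bytes_per_s": (a[6] - b[6]) * SECTOR_BYTES / float(interval),
        # io_ticks is in milliseconds: share of the interval the device was active
        "busy_percent": (a[9] - b[9]) * 100 / (interval * 1000.0),
    }

# Invented samples standing in for two reads taken 5 seconds apart
before = "1000 10 80000 400 500 50 40000 600 0 300 1000"
after = "1100 12 88000 450 550 55 44000 650 0 800 1100"
rates = io_rates(before, after, 5)
for key in sorted(rates):
    print(key, rates[key])
```

On a live system you would call read_stat() once, sleep for the interval with time.sleep(), call it again, and pass both results to io_rates().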
