Netmiko and what constitutes done

Author: Kirk Byers
Date: July 3rd, 2018

Note, in order to reproduce the results specified here, you will need to use Netmiko 2.2.2 or later.

Screen-scraping suffers from two fundamental problems—the first is that screen-scraping returns unstructured data (i.e. a big block of text) and the second is that the communication channel has no good notion of being done. While people frequently bring up the former, the latter is probably a bigger issue.

Let's discuss the second issue in some more detail (and what it implies).

The communication channel has no good notion of being done.

What does this statement mean?

It means that when you send a command down the SSH channel that there is no clear indicator that your command has completed. The general practice for screen-scraping is to search for the trailing router prompt and to use this to indicate "done".

Let's look at a fairly typical Netmiko send_command() call:

#!/usr/bin/env python
from netmiko import ConnectHandler
from getpass import getpass

host = input("Enter your hostname: ")
device = { 
    'device_type': 'cisco_ios',
    'host': host,
    'username': 'pyclass',
    'password': getpass,
} 

net_connect = ConnectHandler(**device)
output = net_connect.send_command("show ip int brief")
print(output)

How does send_command() determine that the show command is done?

By default send_command() will try to find the current prompt using the find_prompt() method. In other words, Netmiko will attempt to find the existing prompt before the command is sent.

Netmiko will then use this current prompt as the default indicator that the command is done (i.e. when it sees that same prompt string again in the output, then it knows it is done).

Now there are multiple things that can go wrong here.

Netmiko determines the current prompt by sending an 'enter' right before the command is sent. But sending an 'enter' right before the command, might not work (for example, you might be in a context that doesn't allow another 'enter' to be executed)

Additionally, even if find_prompt() works, the command we are executing might not directly return the trailing prompt (for example, the command might prompt for additional information).

Then there is an additional question:

How long should we wait for the trailing prompt to be returned?

Should we wait 10 seconds, a minute, an hour?

Let's look at this some more:

output = net_connect.send_command("show ip int brief")

We send the above command down the SSH channel and it will work successfully (as we successfully match the trailing router prompt).

Note, I have added some minor code to track the execution time and to make the output printing cleaner.

$ python test2_send_command.py 
Enter your hostname: test1.twb-tech.com
Password: 


############################################################
Interface           IP-Address      OK? Method Status   Protocol
FastEthernet0       unassigned      YES unset  down     down    
FastEthernet1       unassigned      YES unset  down     down    
FastEthernet2       unassigned      YES unset  down     down    
FastEthernet3       unassigned      YES unset  down     down    
FastEthernet4       10.220.88.20    YES NVRAM  up       up      
Vlan1               unassigned      YES unset  down     down    
############################################################


Total time: 0:00:05.922922

This SSH login process and show command took about 6 seconds to execute. But it did execute successfully and we found the trailing router prompt.

Now what about executing this second command on the CLI (this is for a Cisco IOS router)?

copy flash:/c880data-universalk9-mz.154-2.T1.bin flash:/test1.bin

The output of this command looks as follows (on the router's CLI):

# copy flash:/c880data-universalk9-mz.154-2.T1.bin flash:/test1.bin
Destination filename [test1.bin]? 
Copy in progress...CCCCCCCCCCCCCCCCCCCCCC
...
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
42628912 bytes copied in 143.676 secs (296702 bytes/sec)

What are the issues with executing this copy command?

First, the command prompts us, "Destination filename [test1.bin]?". The router asks us, is that really the destination filename?

Second, the command starts outputting a bunch of 'C' characters (hundreds of them).

Third, the command takes a long time to complete (over two minutes).

What happens if you execute this copy command using Netmiko's send_command()?

net_connect = ConnectHandler(**device)
cmd = 'copy flash:/c880data-universalk9-mz.154-2.T1.bin flash:/test1.bin'
output = net_connect.send_command(cmd)

And then executing this yields:

$ python test3_send_command_copy.py 
Traceback (most recent call last):
  File "test3_send_command_copy.py", line 16, in 
    output = net_connect.send_command(cmd)
  File "/home/gituser/netmiko/netmiko/base_connection.py", line 1072, in send_command
    search_pattern))
OSError: Search pattern never detected in send_command_expect: pynet\-rtr1\#

Netmiko sent the command and the trailing router prompt never came back (because the router was stuck waiting for us to confirm the filename).

How can we handle this?

There are a few different ways in Netmiko to handle this, one of them is to use the 'expect_string' argument to send_command(). The 'expect_string' argument tells Netmiko what to look for in the output.

The updated code now looks as follows:

starttime_time = datetime.now()
net_connect = ConnectHandler(**device)
cmd = 'copy flash:/c880data-universalk9-mz.154-2.T1.bin flash:/test1.bin'
output = net_connect.send_command(
    cmd, 
    expect_string=r'Destination filename'
)

try:
    output += net_connect.send_command('\n', expect_string=r'#')
except:
    end_time = datetime.now()
    print("Total time: {}".format(end_time - start_time))
    raise

I included both exception handling code and variables to track the Netmiko execution time.

Here is the output that we receive when we execute this:

$ python test3_send_command_copy.py 
Enter your hostname: cisco1.twb-tech.com
Password: 
Total time: 0:01:46.065829
Traceback (most recent call last):
  File "test3_send_command_copy.py", line 19, in 
    output += net_connect.send_command('\n', expect_string=r'#')
  File "/home/gituser/netmiko/netmiko/base_connection.py", line 1072, in send_command
    search_pattern))
OSError: Search pattern never detected in send_command_expect: #

Netmiko waited 106 seconds for the command to complete.

Under the hood, send_command() defaults to a timeout of roughly 100 seconds. The additional 6 seconds is overhead time (the SSH connection setup, disabling output paging, determining the current prompt).

This exception makes sense. In other words, we saw that the copy requires about 143 seconds and Netmiko's send_command() only waits 100 seconds. Consequently, we get an exception.

So how do we tell Netmiko to wait longer?

There are two main ways to do this: global_delay_factor and delay_factor.

delay_factor is an argument that we can pass into send_command(). Its effect is to modify all of the delays embedded in send_command (for example, a delay_factor=2 will double all of the delays; a delay_factor=4 will quadruple all of the delays).

Consequently, if our command requires 143 seconds to complete, we can set delay_factor=2. This will cause send_command to wait at least 200 seconds before the command times out. Note, Netmiko will still end the loop earlier if the expect_string is detected.

Our updated code now looks as follows:

start_time = datetime.now()
net_connect = ConnectHandler(**device)
cmd = 'copy flash:/c880data-universalk9-mz.154-2.T1.bin flash:/test1.bin'
output = net_connect.send_command(
    cmd, 
    expect_string=r'Destination filename'
)
output += net_connect.send_command(
    '\n', 
    expect_string=r'#', 
    delay_factor=2
)
end_time = datetime.now()

And executing it:

$ python test3_send_command_copy.py
Enter your hostname: test1.twb-tech.com
Password: 

############################################################
Destination filename [test1.bin]? Copy in progress...CCCCC
....
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
42628912 bytes copied in 146.256 secs (291468 bytes/sec)
############################################################

Total time: 0:02:32.690754

This time the copy completed successfully.

Now what about global_delay_factor?

global_delay_factor also let's you increase the delays in your program, but it essentially increases ALL of the delays in your program.

When would you want to use global_delay_factor and not just delay_factor?

Occasionally you run into issues where the SSH session setup fails (for example, due to the device being too slow). In these situations, you generally need to tell Netmiko to slow down.

Let's update our code and use global_delay_factor:

#!/usr/bin/env python
from datetime import datetime
from netmiko import ConnectHandler
from getpass import getpass

host = input("Enter your hostname: ")
device = { 
    'device_type': 'cisco_ios',
    'host': host,
    'username': 'pyclass',
    'password': getpass(),
    'global_delay_factor': 2,
} 

start_time = datetime.now()
net_connect = ConnectHandler(**device)
cmd = 'copy flash:/c880data-universalk9-mz.154-2.T1.bin flash:/test1.bin'
output = net_connect.send_command(
    cmd, 
    expect_string=r'Destination filename'
)
output += net_connect.send_command('\n', expect_string=r'#')
end_time = datetime.now()
 
print("\n")
print("#" * 60) 
print(output)
print("#" * 60) 
print("\n")
print("Total time: {}".format(end_time - start_time))
print("\n")

And executing this yields:

$ python test4_global_delay.py 
Enter your hostname: test1.twb-tech.com
Password: 

############################################################
Destination filename [test1.bin]? Copy in progress...CCCCCCCCCCCCCCC
...
CCCCCCCCCCCCCCCCCCCCCC
42628912 bytes copied in 142.880 secs (298355 bytes/sec)
############################################################

Total time: 0:02:33.498948

Once again the file copied successfully. This time we used global_delay_factor=2 and it took marginally longer to execute when compared to using delay_factor=2 (about a second).

Let's look at a more extreme example of slowing down a script.

What if we go back to our initial send_command("show ip int brief") script. In our initial execution, this script took about 6 seconds to execute. This time, however, we set the global_delay_factor equal to 4.

device = {
    'device_type': 'cisco_ios',
    'host': host,
    'username': 'pyclass',
    'password': getpass(),
    'global_delay_factor': 4,
} 

start_time = datetime.now()
net_connect = ConnectHandler(**device)
output = net_connect.send_command("show ip int brief")
end_time = datetime.now()

When we re-execute this script, it now takes over 19 seconds to complete:

$ python test5_send_command_global.py
Enter your hostname: test1.twb-tech.com
Password:

############################################################
Interface          IP-Address      OK? Method Status    Protocol
FastEthernet0      unassigned      YES unset  down      down    
FastEthernet1      unassigned      YES unset  down      down    
FastEthernet2      unassigned      YES unset  down      down    
FastEthernet3      unassigned      YES unset  down      down    
FastEthernet4      10.220.88.20    YES NVRAM  up        up      
Vlan1              unassigned      YES unset  down      down    
############################################################

Total time: 0:00:19.672448

Now one additional item worth noting, what happens if you set both delay_factor and global_delay_factor at the same time?

By default the larger delay factor will win (so whichever value is the larger of the two will win out). In other words, whichever one slows you down the most will be selected.

But, but, but...you want to go faster, not slower, I will talk about that at some other time (hopefully).

Kirk Byers

@kirkbyers

You might also be interested in: