Paradigm shift in IT…
The landscape of networking is changing. Today, and in the days/months/years to follow, automation is becoming a key aspect of how network infrastructure is run. Look closer and it's now the responsibility of the workload (i.e. the application) to dictate how the 'fabric' should be laid out across our data centers. The application/workload drives business revenue, and waiting for changes to be made on the network can take a significant amount of time, which delays product releases… which delays revenue generation.
Better yet – what about a new data center build, a DR site, or the replacement of hardware that went End of Life? Aside from the change control administration that needs to take place, an engineer needs to rack the devices (about 10 minutes each), then cable them – that time varies. Next, power on the device; maybe about 5 minutes to start up. Then console into the device through a serial cable (or remotely through a terminal server), maybe 5 minutes to confirm you're in the device, and manually type in commands via the CLI: enable… configuration mode… interface mode… and so on.
Sure, the engineer can cut-and-paste… BUT they must be very careful about how much the buffer can accept in one paste… and then 'show' the running configuration to verify that the 25 to 30 lines they entered took correctly. Believe me… I'm one to do just that. And afterwards, save and check again… Say that took you 30–35 minutes.
No big deal, right? What if you have four spines and eight leafs? How much time have you exhausted? That's 12 devices at 35 minutes each… plus a bathroom break, maybe a bite to eat, and having to change console connections and PuTTY screens, etc. Doing the math, I would have to say about 8.5 to 9.5 hours? Oops – I forgot the rack-and-stack, cabling, and power-on… so we'll add another 3 hours for a grand total of roughly 12 to 13 hours.
How long should it take?
In the real world, this is a two-day job to make sure all your configurations are solid and your fabric is good to go! But should it take that long? Is there a more efficient way of getting this job done? Yes, there is – through Infrastructure Automation [IA]. I can get a typical spine/leaf fabric up and running in under 5 minutes once the physical connections and rack-and-stack are done. FIVE minutes!!!
To the network engineers out there: this doesn't mean you have to relinquish everything you know about configuring devices; you just need a better way to be more efficient, a bit more reliable, to adapt to the changes across the IT landscape… and to stay, well – relevant!
I'm a 2x CCIE and I love the CLI – that's how I passed the labs; but I am kind of tired of the remote sessions, the cut-and-paste from Notepad++ (I prefer Atom these days), making sure I'm at the correct sub-configuration level… waiting, and then sometimes having my session log me out for inactivity due to the security policies enforced by the SOC. It can get tiresome, especially if you've been in the game a long time.
It's time to move into the DevOps world and tweak our personas a bit – let's become InfraOps engineers [???], NetOps [??], FabOps [??].
So you’ve seen my blog postings on:
- DHCP to the Rescue…using Linux
- Web Servers – Start your NGIN…X
- My name is DNS & Time Check with NTP, Part 1 & 2
- The Answer is ANSIBLE… Sort of…
All of those were leading up to this blog! We'll use each of these services, along with Arista switches, to bring up an OSPF fabric so two hosts on different subnets can communicate – in under 5 minutes (well, maybe 6…) – with the use of Ansible as a CMS [Configuration Management System] tool.
DISCLAIMER: The material presented in this blog is for educational and training purposes only. Neither the author(s), AHA-VTS.com© nor Ahaliblogger© assume any liability or responsibility to any person or entity with respect to loss or damages incurred from the information contained in this blog.
The diagram depicts what we're trying to accomplish from a fabric perspective.
We're going to build out a spine/leaf architecture [using Arista switches], then add two hosts on two different subnets. OSPF in Area 0 is the routing protocol of choice: the switches form neighbor relationships and exchange routing information so the two hosts can communicate. All of this will be automated using Ansible playbooks and just four ad hoc commands.
Automation Provisioning Flow
The flow of automation WRT Arista switches can be seen in the diagram above; which I re-drew from the Arista ACE Certification courseware.
This is what will take place:
First: The Arista switch[es] will boot…
Second: Since the switch[es] do not have any configuration, they will follow the ZTP process as per the automation provisioning flow diagram.
Since each switch will contact a DHCP server for an IP, I have configured my DHCP server to assign IPs based on received MAC addresses. This way I still have a level of control over which IPs to give out (since ZTP uses the MGMT interface of the switch[es], the switch[es] will get an MGMT IP).
I also configured a 'bootfile' option so the switch[es] can locate and download a startup script containing key information such as RSA keys, the MGMT IP, SSH services, etc.
The diagram is a snippet of the actual DHCP configuration – but you'll get the idea.
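For readers who can't make out the screenshot, a minimal ISC dhcpd snippet along these lines would match the description above; the subnet details and the MAC address shown are illustrative assumptions, not my exact lab values:

```
# /etc/dhcp/dhcpd.conf -- illustrative sketch (subnet/MAC values are assumptions)
subnet 172.16.5.0 netmask 255.255.255.0 {
  option routers 172.16.5.1;
}

# Reserve a MGMT IP per switch, keyed on its MAC address
host leaf1 {
  hardware ethernet 00:1c:73:aa:bb:01;                 # switch MAC (example)
  fixed-address 172.16.5.76;                           # MGMT IP handed out via ZTP
  option bootfile-name "http://172.16.5.100/basic";    # startup script on the web server
}
```

The `host` block per switch is what gives the MAC-to-IP control described above, and the `bootfile-name` option is what points ZTP at the startup script.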
Third: As you can see, the DHCP server tells the switch[es] where to fetch the startup script [or bootfile] from: http://172.16.5.100/basic, a web server I built to hold startup config files and more. Also, if you haven't noticed, this web server (and the script) resides on the MGMT network.
Now, the contents of this script look like this:
Does this look familiar to you? It should to all those network engineers! 🙂
Aside from the top line (which is called a 'sha-bang' [#!] and references CLI privilege level 15), the content is all the input commands an engineer would use if he/she were consoled into the switch, in CLI syntax!
I created a startup script based on my knowledge of CLI as well as leveraging Arista switch’s ability to interpret CLI scripts! How great is that?
NOTE: In the script above, you'll notice a reference to an ansible.pub file. This is the Ansible server's public key, which allows it to SSH into the switches. This file also resides on the web server.
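If the screenshot is hard to read, a startup script along these lines would fit the description; treat this as a sketch – the hostname, addressing, and the exact way the ansible.pub key is pulled down are my assumptions, not the author's exact file:

```
#!/usr/bin/Cli -p 15
! Executed by the EOS CLI interpreter at privilege level 15 (the 'sha-bang' line)
configure
hostname leaf1
!
interface Management1
   ip address 172.16.5.76/24      ! MGMT IP (example value)
!
management ssh
   no shutdown
!
! Fetch the Ansible server's public key from the web server (illustrative step)
copy http://172.16.5.100/ansible.pub flash:ansible.pub
!
end
copy running-config startup-config
```

Everything after the first line is ordinary CLI input – exactly the point being made above.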
Fourth: The start-up script is downloaded and executed against the switch[es].
Fifth: Once that's done, the Arista switches are ready for Ansible to push configurations to them.
Sixth: The diagram below illustrates the Ansible playbook workflow.
Since I'm running Ansible ad hoc commands, there is no need for a hosts (inventory) file in this case. Passing the switch's IP address to -i, with a trailing comma so Ansible treats it as an inline inventory rather than a file name, is sufficient to run the playbook against [e.g. leaf1.yml]. Below is a better look at the leaf1.yml file.
In the playbook, a call is made for a source file [leaf1.cfg], local to the Ansible server, which contains the relevant content to configure the switch. Below is the leaf1.cfg file – written in the same CLI syntax an engineer would type, which the Arista switch can interpret.
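For readers who can't see the image, leaf1.yml would look roughly like this; the module choice (eos_config) and its parameters are my assumptions about how the push is done, not a copy of the author's file:

```
# leaf1.yml -- illustrative sketch of the playbook described above
---
- name: Configure leaf1
  hosts: all               # the actual target comes from the inline inventory on the CLI
  gather_facts: no
  connection: network_cli

  tasks:
    - name: Push the local leaf1.cfg onto the switch
      eos_config:
        src: leaf1.cfg     # source file local to the Ansible server
```

With `hosts: all` plus the inline `-i '172.16.5.76,'` inventory, the one IP passed on the command line is the only host the play runs against.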
Doesn't this content look familiar, too? I hope so, because this is essentially the output of show running-config. Again, I took the knowledge I have as a network engineer and applied it to the DevOps world…
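A leaf1.cfg along these lines would match that description; the VLAN number, interface names, and addressing here are illustrative assumptions:

```
! leaf1.cfg -- illustrative snippet, not the author's exact file
vlan 10
!
interface Ethernet1
   no switchport
   ip address 10.0.1.2/30        ! point-to-point uplink to spine1 (example)
!
interface Vlan10
   ip address 192.168.10.1/24    ! gateway for the host subnet (example)
!
router ospf 1
   router-id 10.0.0.3
   network 0.0.0.0/0 area 0      ! advertise all interfaces into Area 0
```

It reads exactly like show running-config output, which is why a CLI-fluent engineer can write it straight away.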
By now, four Ansible playbooks have been created, and four .cfg files too – each reflecting its corresponding switch.
I'll run four ad hoc commands on the Ansible server:
ansible-playbook -i '172.16.5.74,' spine1.yml
ansible-playbook -i '172.16.5.75,' spine2.yml
ansible-playbook -i '172.16.5.76,' leaf1.yml
ansible-playbook -i '172.16.5.77,' leaf2.yml
All of which will configure OSPF, spine/leaf IP addresses, VLANs, etc.
The video below illustrates the automation of Arista switches using Ansible. The run time is 6:17, which includes the initial boot of the Arista switches.
Imagine this: a little over six minutes to configure four switches – a job that would have taken me maybe 30–35 minutes per switch if I had consoled in and used the CLI… What if you had six switches [2 spines, 4 leafs]? Better yet, what if you had a new data center consisting of ten racks, each with 2 TORs going back to a spine core? How would you configure all these devices in a timely manner? You already know the answer… 😉
Enjoy the video and thank you for your support!