
1000v in and out of vCenter

I was setting up the Nexus 1110 (aka the Virtual Services Appliance, or VSA) with one of our best customers, and as we were doing it the appliance rebooted, never to come up again without a complete reinstall of the firmware from remote media.  Most of this was probably my fault because I didn’t follow the docs exactly, and I think we can now move forward, but it made me realize I hadn’t written down an important way to reconnect an orphaned 1000v to a new Virtual Supervisor Module (VSM).
Here’s the situation: when you lose the 1000v that connects into vCenter, there is no way to remove the virtual distributed switch (VDS or DVS) that the 1000v presented to vCenter.  You can remove hosts from the DVS, but you can’t get rid of the switch itself.
In the above picture, there is my DVS.  If I try to remove it, I get the following error:
In my case, I didn’t want to get rid of it; I just wanted to reconnect a new VSM that I created with the same name.  But this same operation can be used to remove the 1000v DVS from vCenter as well.
So here’s how you do it:
Adopt an Orphaned Nexus 1000v DVS

Step 1.  Install a VSM

I usually do mine manually so that it doesn’t try to register with vCenter or one of the hosts.  Don’t do any configuration other than an IP address; just get it to the point where you can log in.  Once you can log in, if you did create an SVS connection, you’ll need to disconnect.  In mine, I made an SVS connection and called it vcenter.  To disconnect from vCenter and erase the SVS connection, run:
# config
# svs connection vcenter
# no connect
# exit
# no svs connection vcenter
Trivia: What does SVS stand for?  “Service Virtual Switch”
Step 2.  Change the hostname to match what is in vCenter
Looking at the error picture above, you can see there is a folder named nexus1000v with a DVS named nexus1000v.  To make vCenter think this new 1000v is the same one, we need to change the hostname to match what is in vCenter:
nexus1000v-a# conf
nexus1000v-a(config)# hostname nexus1000v
nexus1000v(config)#
Step 3.  Build SVS Connection
Since we destroyed (or never built) the SVS connection in Step 1, we’ll need to build one and try to connect.  The SVS connection should have the same name as the one you created when you first made your SVS.  So if you called your SVS ‘vCenter’, or ‘VCENTER’, or ‘VMware’, then you’ll need to name it the same thing.  I named mine ‘vcenter’, so that’s what I use.  Similarly, you’ll have to set the datacenter-name to the same value you had before.
nexus1000v(config)# svs connection vcenter
nexus1000v(config-svs-conn)# remote ip address 10.93.234.91 port 80
nexus1000v(config-svs-conn)# vmware dvs datacenter-name Lucky Lab
nexus1000v(config-svs-conn)# protocol vmware-vim
nexus1000v(config-svs-conn)# max-ports 8192
nexus1000v(config-svs-conn)# admin user n1kUser
nexus1000v(config-svs-conn)# connect
ERROR:  [VMware vCenter Server 5.0.0 build-455964] Cannot create a VDS of extension key Cisco_Nexus_1000V_1169242977 that is different than that of the login user session Cisco_Nexus_1000V_125266846. The extension key of the vSphere Distributed Switch (dvsExtensionKey) is not the same as the login session’s extension key (sessionExtensionKey)..
Notice that when I tried to connect I got an error.  This is because the extension key in my new Nexus 1000v (created when it was installed) doesn’t match the old one.  The nice thing is that I can actually change it, and that is how I make this new 1000v take over for the other one.

Step 4.  Change the extension key to match what is in vCenter.
To see what the current (that is, the offending) extension key is, run the following command:
nexus1000v(config-svs-conn)# show vmware vc extension-key
Extension ID: Cisco_Nexus_1000V_125266846
That is the one we need to change.  You can see the extension key that vCenter wants in the error message from the previous step: ‘Cisco_Nexus_1000V_1169242977’.  So we need to make the extension key on our 1000v match that.  No problem:
nexus1000v(config-svs-conn)# no connect
nexus1000v(config-svs-conn)# exit
nexus1000v(config)# no svs connection vcenter
nexus1000v(config)# vmware vc extension-key Cisco_Nexus_1000V_1169242977

Now we should be able to connect and run things as before.
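
Rebuilding the connection is the same as Step 3 (same values, same commands), only this time the connect should succeed:

nexus1000v(config)# svs connection vcenter
nexus1000v(config-svs-conn)# remote ip address 10.93.234.91 port 80
nexus1000v(config-svs-conn)# vmware dvs datacenter-name Lucky Lab
nexus1000v(config-svs-conn)# protocol vmware-vim
nexus1000v(config-svs-conn)# connect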

Step 5. (Optional) Remove the 1000v

If you’re just trying to remove the 1000v because you had that orphaned one sitting around, cycle the connection to vCenter and then remove the DVS:

nexus1000v(config)# svs connection vcenter
nexus1000v(config-svs-conn)# no connect
nexus1000v(config-svs-conn)# connect
nexus1000v(config-svs-conn)# no vmware dvs
This will remove the DVS from the vCenter Server and any associated port-groups. Do you really want to proceed(yes/no)? [yes] yes

Now the orphaned Nexus 1000v is gone. If you want to remove it from your vCenter plugins, you’ll have to navigate the managed object browser and remove the extension key. Not a big deal. Open a web browser to the host that manages vCenter (e.g. http://10.93.234.91 ) and click “Browse objects managed by vSphere”. From there go to “content”, then “Extension Manager”. To unregister the 1000v plugin, select “UnregisterExtension” and enter the vCenter extension key. This is the same extension key you used in Step 4 (in our example: Cisco_Nexus_1000V_1169242977).

Hope that helps!

Nexus 1000v – A kinder gentler approach

One of the issues skeptical server administrators have with the 1000v is that they don’t like the management interface being dependent on a virtual machine.  True, the 1000v can be configured so that even if the VSM gets disconnected, powered off, or blown up, the system ports still forward traffic.  But to many, that is voodoo.  Most say: give me a simple access port so I can do my business.

I’m totally on board with this level of thinking.  After all, we don’t want some Jr. Woodchuck network engineer taking down our virtual management layer.  So let’s keep it simple.

In fact, you may not want the Jr. Woodchuck networking engineer to be able to touch the production VLANs for your production VMs either.  Well, here’s a solution for you: you don’t want to do the networking, but you don’t want the networking guy to do the networking either.  So how can we make things right?  Why not just ease into it?  The diagram below presents the NIC-level view of how you can configure your ESXi hosts:

Here is what is so great about this configuration: the VMware administrator can run things business as usual with the first 6 NICs.

Management A/B team up with vmknic0 (IP address 192.168.40.101).  This is the management interface, used to talk to vCenter.  It is not controlled by the Nexus 1000v.  Business as usual here.

IP Storage A/B team up with vmknic1 (IP address 192.168.30.101). This is used to communicate with storage devices (NFS, iSCSI).  Not controlled by the Nexus 1000v.  Business as usual.

VM Traffic A/B team up.  This is a trunking interface, and all kinds of VLANs pass through here.  It is controlled either by a virtual standard switch or by VMware’s distributed virtual switch.  Business as usual.  You as the VMware administrator don’t have to worry about anything a Jr. Woodchuck Nexus 1000v administrator might do.

Now, here’s where it all comes together.  With UCS you can create another vmknic2 (IP address 192.168.10.101).  This is our link that is managed by the Nexus 1000v.  In UCS we configure this as a trunk port with all kinds of VLANs enabled over it.  It can use the same vNIC template that the standard VM-A and VM-B used.  Same VLANs, etc.

(Aside: some people would be more comfortable with 8 vNICs; then you can do vMotion over its own native VMware interface.  In my lab this is 192.168.20.101.)

The difference is that this IP address, 192.168.10.101, belongs on our control & packet VLAN.  This is a back-end network that the VSM uses to communicate with the VEM.  Now the only VM kernel interface that needs to be controlled by the Nexus 1000v is 192.168.10.101, and it is isolated from the rest of the virtualization stack.  So if we want to move a machine over to the other virtual switch, we can do that with little problem; a simple edit of the VM’s configuration can change it back.

Now testing can coexist with a production environment, because only the VMs being tested run over the 1000v.  You can install the VSG, DCNM, the ASA 1000v, and all that good vPath stuff, and try it out.
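
To make that concrete, here’s a minimal sketch of a vEthernet port profile a test VM could attach to.  The profile name is a placeholder of mine, and VLAN 501 is just one of the VLANs allowed on the uplink trunk shown below:

port-profile type vethernet vm-test
  vmware port-group
  switchport mode access
  switchport access vlan 501
  no shutdown
  state enabled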

From the 1000v, I created a port profile called “uplink” that I assign to these two interfaces:

port-profile type ethernet uplink
  vmware port-group
  switchport mode trunk
  switchport trunk allowed vlan 1,501-512
  channel-group auto mode on mac-pinning
  no shutdown
  system vlan 505
  state enabled

By making it a system VLAN, I make it so that this control/packet VLAN stays up. For the vmknic (192.168.10.101) I also created a port profile for control:

port-profile type vethernet L3-control
  capability l3control
  vmware port-group
  switchport mode access
  switchport access vlan 505
  no shutdown
  system vlan 505
  state enabled

This allows me to migrate the vmknic over from being managed by VMware to being managed by the Nexus 1000v. My VSM has an IP address on the same subnet as vCenter (even though it’s running in layer 3 mode):

n1kv221# sh interface mgmt 0 brief

--------------------------------------------------------------------------------
Port     VRF     Status    IP Address       Speed    MTU
--------------------------------------------------------------------------------
mgmt0    --      up        192.168.40.31    1000     1500

Interestingly enough, when I run the sh module vem command, the VEMs show up with the management interface:

Mod  Server-IP        Server-UUID                           Server-Name
---  ---------------  ------------------------------------  --------------------
3    192.168.40.102   00000000-0000-0000-cafe-00000000000e  192.168.40.102
4    192.168.40.101   00000000-0000-0000-cafe-00000000000f  192.168.40.101

On the VMware side, too, it shows up with the management interface (192.168.40.101), even though I only migrated the 192.168.10.101 vmknic over.

This configuration works great.  It provides a nice opportunity for the networking team to get with it and start taking back control of the access layer.  And it gives the VMware/server team a clear path to move VMs back to a network they’re more familiar with if they’re not yet comfortable with the 1000v.

Let me know what you think about this set up.

Teaching Kids to Program

I get asked a lot by different parents about teaching their kids to write computer programs.  “What is a good way to get started?”  “How did you get into it?”  As my oldest child is now 9, I’ve been frequently asking myself the same question.  I feel it is very important that young people know how to write code.  I believe that years from now, people will look back on those who couldn’t write basic computer programs the same way we look back at those who can’t write a simple letter.

Much of my thinking has been confirmed and augmented by a TED Talk I watched this week by Mitch Resnick.  In his talk, he affirms that just because people can code doesn’t mean we expect them all to be professional computer scientists or developers.  We don’t expect all people who learn how to write to become novelists or journalists.  It’s just a basic skill that is needed in our day and age.

With Scratch, the program that he and his team made, I think I’ve found the answer I was looking for.  I got home last night and downloaded it onto our family iMac, which sits right in the kitchen, and got my 9-year-old and 6-year-old started on it.  We started out with a picture of a “sprite”, or in our case, the default picture of a kitten.  We then created “controls” such as “when I press the spacebar”.  Then underneath the control we did things like “change color” or “move 10” (the 10 is 10 pixels, but kids don’t really know that yet).  My kids would then keep pressing the space bar.  That’s when we introduced the “forever” loop to them.  Amazing!  In just a quick 10 minutes, they understood loops and making things happen.

I’m hoping to do more with this and my kids.  I don’t want them to think of computer programming as dry and boring, but rather a creative medium for doing really cool things.  I am thankful for the people at MIT for making this possible.


Fabric Interconnect Failover tests

The default timeout for failover of a UCS Fabric Interconnect is 5 seconds. Want to change that? Check this out.

If you fail over the primary Fabric Interconnect (the one UCS Manager runs on), you’ll be logged out of UCS Manager.  No worries; just wait 5 seconds and log back in.  You’ll be on the new primary.

When you fail both of them over to test, make sure HA is back up and running before failing the next one over.  Just log in via SSH:

connect local-mgmt
show cluster stat
A: UP, PRIMARY
B: UP, SUBORDINATE
HA READY

This will tell you that the cluster is ready.  At this point you should be able to unplug one of the Fabric Interconnects to test that failover works.

When they come back online, you may want to change which Fabric Interconnect is the primary.  To do this, once again SSH into the Fabric Interconnect:

connect local-mgmt
cluster lead a

Once, we didn’t let HA get ready, and we had to run cluster force primary to make the subordinate (which hadn’t synced yet) become the primary.
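
If you do end up in that state, the sequence looks like this.  Use it with care, since it promotes a subordinate that may not have the latest configuration synced:

connect local-mgmt
cluster force primary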


CCIE Data Center Exam

The CCIE Data Center exam was announced in March of this year.  The list of topics is quite comprehensive.  I for one was stoked to see it announced, as I wasn’t even thinking about doing a CCIE until this came up.

After some prodding from my teammates, I signed up for the beta written exam and took it today.  120 questions covering UCS, Nexus 7000, 5000, 1000v, and MDS.  I don’t know my result because the exam is in beta form, and they won’t give out a passing score until after the beta period ends.

My overall feeling is that the written exam, in its current incarnation, is passable.  The UCS material I know pretty well.  The other topics… well, I could use some work.  But having taken it (and after all, it’s only $50), I think I’m ready to get serious and go for the CCIE.  I’m setting a timeline of Fall 2013 to have it passed.  Guess we’ll see.

Cisco UCS Role Based Access Control

One of the cool things UCS allows you to do is create a place where users from different organizations can go to configure their own pools of resources.  It’s a common goal for many organizations to reduce duplication while allowing agility and flexibility.  The multi-tenant solution that has long been talked about can actually become a reality with UCS in the form of Role Based Access Control (RBAC).

Let’s suppose a local county has decided it wants to consolidate its IT infrastructure into a central IT department, as opposed to every department having its own IT instance.  It can start off slowly by, say, beginning with one or two organizations like the department of Superior Courts and the department of Executive Services.

Here’s how the main IT organization might configure RBAC for the Superior Courts and the Department of Executive Services.

1.  Create suborganizations

Log in as admin and navigate to the Servers tab.  From there you can expand Service Profiles and see “root” and “Sub-Organizations”.  Right-click on “root” and add an organization:
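
If you prefer the UCSM CLI, here’s a rough sketch of the equivalent (prompts abbreviated; commit-buffer is what saves the change):

UCS-A# scope org /
UCS-A /org # create org Superior_Court
UCS-A /org/org* # commit-buffer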

2.  Create Locale

A locale in UCSM is designed to reflect a user’s place in an organization.  By default all users are at the ‘root’ locale, but since we are creating sub-organizations, we want them to use their own stuff and not modify resources that exist at the root level or in other organizations.

Navigate to the Admin tab in the navigation pane, filter by User Management, expand User Services, and right-click on Locales.

From here we can create a locale named Superior_Court and bind it to the Superior_Court organization we created.

Next, to assign the organization, we just expand the Organizations menu and drag the Superior_Court into the pane on the right.

Clicking Finish gives us our new locale bound to its sub-organization.
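
For the CLI-inclined, roughly the same thing from the command line (the org-ref name ‘sc-ref’ is arbitrary):

UCS-A# scope security
UCS-A /security # create locale Superior_Court
UCS-A /security/locale* # create org-ref sc-ref orgdn org-root/org-Superior_Court
UCS-A /security/locale* # commit-buffer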

3.  Create a User for the Organization

Now let’s create a user called sc-admin that has all the rights in the Superior_Court locale, but can’t change things in the root locale or any other locale.

In the navigation pane, in the same place you were for the previous step, right-click Locally Authenticated Users and select ‘Create User’.

The first fields are pretty self-explanatory.  We created the user and password and left out some of the other information.  The important part is that the locale is set to Superior_Court; this confines the powers of this user to Superior_Court.  We can then select all the roles except the following (a CLI sketch of the whole user setup follows the list):

- aaa:  Authentication, Authorization, and Accounting.  This can only be granted in the root locale.

- admin: this can only be granted in the root locale.

- operations: this can only be granted in the root locale.
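
From the CLI, the whole user setup looks roughly like this (the role names are the built-in ones; pick the same set you checked in the GUI):

UCS-A# scope security
UCS-A /security # create local-user sc-admin
UCS-A /security/local-user* # set password
UCS-A /security/local-user* # create locale Superior_Court
UCS-A /security/local-user* # create role network
UCS-A /security/local-user* # create role server-profile
UCS-A /security/local-user* # commit-buffer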

Now that sc-admin is created, hand the account over to your friendly local Superior Court tenant and let them have access to the system.

Now then… What can sc-admin do?

If you now log in as sc-admin, you can see that he can create service profiles, pools, and policies, but only in his Superior_Court sub-org.  If sc-admin tries to create a resource in the root organization, he is blocked because all of the options are greyed out:


Here’s what else he can do:

  • He can create sub-organizations within his own sub-organization.
  • He can create VLANs in the LAN tab and enable and disable network ports on the Fabric Interconnects (because he was given the network role; if you don’t want this, take away the network privilege).
  • He can create VSANs and disable and enable FC interfaces (take away the storage privilege if you don’t want him to do this).

An interesting scenario I ran across: if you remove a role from a user while that user is still logged in, it doesn’t seem to take effect until the next login.  For example, I disabled sc-admin’s network role and he was still able to create VLANs and turn ports off and on.  When I logged him out and back in again, the role acted how it should have.

One of the disadvantages of disabling the network role is that sc-admin can’t create vNIC templates.  This is something we might want to allow him to do in his own org.  We can change this by creating a new role under User Management called Network_SP.  For this role, we just check:

  • Service Profile Network
  • Service Profile Network-Policy
  • Service Profile Qos
  • Service Profile Qos Policy

Next, add this role to the sc-admin account (click on Locally Authenticated Users, right-click sc-admin, and add a check to the Network_SP role we created).
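
Here’s a rough CLI sketch of the same thing.  Fair warning: the privilege names below are my best guess at how the GUI checkboxes map to CLI privileges, so verify them against show role output on your system before trusting this:

UCS-A# scope security
UCS-A /security # create role Network_SP
UCS-A /security/role* # add privilege ls-network ls-qos
UCS-A /security/role* # commit-buffer
UCS-A /security/role # exit
UCS-A /security # scope local-user sc-admin
UCS-A /security/local-user # create role Network_SP
UCS-A /security/local-user* # commit-buffer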

Now sc-admin can create vNIC templates in his own sub-org, but he isn’t allowed to create external VLANs or disable/enable ports on the Fabric Interconnect.  For this to take effect, have sc-admin log out and log back in again after you apply the role.

You can do something very similar on the Storage tab to allow a sub-org to create and modify its own vHBA templates but not disable FC ports on the Fabric Interconnects.

Once this is in place, you can repeat the operation for the department of Executive Services.  As other departments join the consolidated data center, their users are simply added to locales and given roles.

App crazy

I’ve been going a little app crazy to start out this year, and I’m very pleased with the results.  With the help of others, I’ve released updates to the two Cisco-based apps: UCS Tech Specs and FlexPod Tech Specs.  And I’ve finally released the xCAT iOS client!  Hurray!

I’ve been doing all this for the past several months during those precious moments between when the kids go to bed and when I drift off to sleep.  Lucky for me, my wife has enough interesting projects going on in her life that she doesn’t miss me… too much!  Don’t get me wrong: we still find time to go out and have a great time.  And for those times when my day job also becomes my night job, you can see why it takes a long time for many of these projects to get done.  Whew!

There are also many other projects cooking.  With my coworker Tige Phillips at Cisco, we are slowly creating SiMU HD, an iPad version of SiMU Pro that will manage UCS systems.  …Well, I should restate that: he’s doing most of the work and I’m lending a hand!

I’ve also thought about starting a little game development.  How about a game for managing clusters?  A game for managing UCS that gives you prizes for learning how to use certain cool features?  Ha!  Yes, I have a lot of bad ideas!  Hope you have a great February!

xCAT r* commands with UCS

xCAT out of the box works with UCS.  Or UCS out of the box works with xCAT?  Whichever way you look at it, it works.  All of the cool things you can do with xCAT, like provisioning nodes, KVM, vSphere, stateless computing, etc., can all be done with UCS.  In fact, you can even run most of the r* commands against UCS.

Cisco UCS allows this through IPMI, and configuring IPMI on UCS is easier than on any other system I’ve ever used.  While I still plan on extending my xCAT UCS plugin to get more capabilities into xCAT, most xCAT functions can be used with UCS managing the servers over IPMI.  For most people, this is good enough.

Using IPMI, this is what seems to work with xCAT 2.6.6 and UCSM 2.0(1) (see the end of this post for sample output):

  • rpower on|off|stat|boot
  • rbeacon on|off
  • reventlog [clear]
  • rvitals  (this is quite thorough)

rinv seems to hang on me.  I think this is due to the nature of service profiles, where UUIDs and MAC addresses are transient.  I’ll investigate further.

So how do you do it?

Configuring an IPMI machine with xCAT has been well documented.  What I haven’t seen documented so much is configuring IPMI inside UCS.  This is surprisingly easy.  Here’s how it’s done:

1.  Create a Service Profile Template that you will apply to your blades.  This is documented very well in various places, so I won’t go into it here; creating a Service Profile Template is UCS 101.  After you’ve created your template, assuming it’s an updating template, you can proceed to the next step.  (Don’t worry: the changes needed for IPMI don’t require a reboot.)

2.  From the Servers tab, filter by Service Profile Templates, and navigate to your service profile template.

3.  Click on the Policies tab and look at the IPMI Access Profile policy.

4.  Create a new policy.  In this policy you’ll give the name of the user and give it a password.  Make sure the user has admin privileges.  For simplicity, I just made the user and password the same as my UCSM user and password.

5.  Apply the setting and click save.

From here on out you can just run IPMI commands.  The only remaining issue is knowing which IP address corresponds to the IPMI interface of which blade.

This can be found in UCSM under the Admin tab, Communication Management, Management IP Pool.  If you click on the IP Addresses tab, you’ll see all the IP addresses.
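
Then, on the xCAT side, you just point each node’s BMC attributes at the right address.  A minimal sketch, with a made-up BMC IP and the user/password from the IPMI Access Profile in step 4:

# node definition: mgt=ipmi plus the blade's management IP and IPMI credentials
chdef lucky01 mgt=ipmi bmc=192.168.100.50 bmcusername=admin bmcpassword=admin
# after that, the r* commands just work
rpower lucky01 stat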

Ok my friend, you now have it: xCAT running rpower commands.

And now, here is a sample output running rvitals on a UCS B200 M1:

# rvitals lucky01
lucky01: BIOSPOST_TIMEOUT: N/A
lucky01: BIOS_POST_CMPLT: 0
lucky01: CATERR_N: 0
lucky01: CPUS_PRCHT_N: 0
lucky01: DDR3_P1_A1_ECC: 0 error
lucky01: DDR3_P1_A1_PRS: 0
lucky01: DDR3_P1_A1_TMP: 26 C (79 F)
lucky01: DDR3_P1_A2_ECC: 0 error
lucky01: DDR3_P1_A2_PRS: 0
lucky01: DDR3_P1_A2_TMP: 25 C (77 F)
lucky01: DDR3_P1_B1_ECC: 0 error
lucky01: DDR3_P1_B1_PRS: 0
lucky01: DDR3_P1_B1_TMP: 26 C (79 F)
lucky01: DDR3_P1_B2_ECC: 0 error
lucky01: DDR3_P1_B2_PRS: 0
lucky01: DDR3_P1_B2_TMP: 27 C (81 F)
lucky01: DDR3_P1_C1_ECC: 0 error
lucky01: DDR3_P1_C1_PRS: 0
lucky01: DDR3_P1_C1_TMP: 24 C (75 F)
lucky01: DDR3_P1_C2_ECC: 0 error
lucky01: DDR3_P1_C2_PRS: 0
lucky01: DDR3_P1_C2_TMP: 25 C (77 F)
lucky01: DDR3_P2_D1_ECC: 0 error
lucky01: DDR3_P2_D1_PRS: 0
lucky01: DDR3_P2_D1_TMP: 22 C (72 F)
lucky01: DDR3_P2_D2_ECC: 0 error
lucky01: DDR3_P2_D2_PRS: 0
lucky01: DDR3_P2_D2_TMP: 22 C (72 F)
lucky01: DDR3_P2_E1_ECC: 0 error
lucky01: DDR3_P2_E1_PRS: 0
lucky01: DDR3_P2_E1_TMP: 22 C (72 F)
lucky01: DDR3_P2_E2_ECC: 0 error
lucky01: DDR3_P2_E2_PRS: 0
lucky01: DDR3_P2_E2_TMP: 22 C (72 F)
lucky01: DDR3_P2_F1_ECC: 0 error
lucky01: DDR3_P2_F1_PRS: 0
lucky01: DDR3_P2_F1_TMP: 21 C (70 F)
lucky01: DDR3_P2_F2_ECC: 0 error
lucky01: DDR3_P2_F2_PRS: 0
lucky01: DDR3_P2_F2_TMP: 22 C (72 F)
lucky01: ECC_STROM: 0
lucky01: FM_TEMP_SENS_IO: 21 C (70 F)
lucky01: FM_TEMP_SEN_REAR: 22 C (72 F)
lucky01: HDD0_PRS: 0
lucky01: HDD1_PRS: 0
lucky01: HDD_BP_PRS: 0
lucky01: IOH_THERMALERT_N: 0
lucky01: IOH_THERMTRIP_N: 0
lucky01: IRQ_P1_RDIM_EVNT: 0
lucky01: IRQ_P1_VRHOT: 0
lucky01: IRQ_P2_RDIM_EVNT: 0
lucky01: IRQ_P2_VRHOT: 0
lucky01: LED_BLADE_STATUS: 0
lucky01: LED_FPID: 0
lucky01: LED_MEZZ_FAULT: 0
lucky01: LED_MEZZ_TP_FLT: 0
lucky01: LED_SAS0_FAULT: 0
lucky01: LED_SAS1_FAULT: 0
lucky01: LED_SYS_ACT: 0
lucky01: MAIN_POWER: 0
lucky01: MEZZ_PRS: 0
lucky01: P0V75_DDR3_P1: 0.7644 Volts
lucky01: P0V75_DDR3_P2: 0.7644 Volts
lucky01: P12V_BP: 11.948 Volts
lucky01: P12V_CUR_SENS: 10.78 Amps
lucky01: P1V05_ICH: 1.0486 Volts
lucky01: P1V1_IOH: 1.078 Volts
lucky01: P1V1_VCCP_P1: 1.0192 Volts
lucky01: P1V1_VCCP_P2: 0.931 Volts
lucky01: P1V1_VTT_P1: 1.1368 Volts
lucky01: P1V1_VTT_P2: 1.1564 Volts
lucky01: P1V2_SAS: 1.2152 Volts
lucky01: P1V5_DDR3_P1: 1.5288 Volts
lucky01: P1V5_DDR3_P1_IMN: 5.13 Amps
lucky01: P1V5_DDR3_P2: 1.5386 Volts
lucky01: P1V5_DDR3_P2_IMN: 14.25 Amps
lucky01: P1V5_ICH: 1.5092 Volts
lucky01: P1V8_IOH: 1.813 Volts
lucky01: P1V8_P1: 1.7836 Volts
lucky01: P1V8_P2: 1.7836 Volts
lucky01: P1_PRESENT: 0
lucky01: P1_TEMP_SENS: 39.5 C (103 F)
lucky01: P1_THERMTRIP_N: 0
lucky01: P2_PRESENT: 0
lucky01: P2_TEMP_SENS: 37.5 C (100 F)
lucky01: P2_THERMTRIP_N: 0
lucky01: P3V3_SCALED: 3.2548 Volts
lucky01: P3V_BAT_SCALED: 3.102 Volts
lucky01: P5V_SCALED: 4.9405 Volts
lucky01: POWER_ON_FAIL: 0
lucky01: POWER_USAGE: 126 Watts (430 BTUs/hr)
lucky01: SAS0_FAULT: N/A
lucky01: SAS1_FAULT: N/A
lucky01: SEL_FULLNESS: 0
lucky01: VR_P1_IMON: 1.75 Amps
lucky01: VR_P2_IMON: 3.5 Amps

Cisco UCS + NetApp + MDS configuration Part 1

At a recent training event I went to this week, they stated that a FlexPod = Cisco UCS + Nexus + NetApp. If you don’t have those 3, then it’s not a FlexPod. However, if you have UCS + NetApp, you still get the benefit of the back-end support being tied together. This means that if you open a support request with TAC and it turns out to be a NetApp Filer issue, TAC can call up NetApp and they will both work with the end user to resolve the problem.

In our local lab, we aren’t fortunate enough to have a Nexus 5000 set up. Instead we have MDS 9148s. These are nice boxes, so I wanted to put them to use. What we created is not a Cisco Validated Design, but it still worked for us, and I wanted to show how I set it up. Since it’s just my lab, it’s a very simple configuration, and I’ll try to update this post as I remember things. You should note that the best-documented solution is to use Nexus 5000s (you would then have a FlexPod) and use 10GbE for FCoE and/or NFS.

For this first part, I just wanted to show how we did the cabling.

What you can see from the cabling is that the Fabric Interconnects are connected to the MDSes, but they’re not cross-connected. This is because the Fabric Interconnects are operating in the default “End Host Mode”. You have to look at each Fabric Interconnect like a PCI adapter off of a server. (Yes, I know it’s much more than that.) But if you look at each Fabric Interconnect as an HBA off of a single server, then this topology makes a lot more sense. In the case of one server, it’s like having two dual-port HBAs, each one connected redundantly to a Fibre Channel switch (the MDS in this case).

On the back end of the MDS, the NetApp is cross-connected to each MDS switch. This provides redundancy so that if any single component fails, the solution still works. For example, if MDS 9148a loses power, traffic can flow through MDS 9148b. If Fabric Interconnect A fails, traffic can flow through Fabric Interconnect B. And if one of the Filers fails, the other takes over (they run in cluster mode) and keeps providing the datastores to the servers.

In my next post, I’ll talk a little more of how we configured this.

F Port-Channel Trunking vs. Non Trunking

One of the fun things I get to do is learn about Fibre Channel technologies. Today I’ve been connecting a UCS chassis with two Fabric Interconnects to two MDS 9148 SAN switches.

I’ll go into it a bit more in another post, but as I was configuring it and reading some great documentation, I didn’t understand the difference between trunk mode and non-trunk mode over a port channel.

So here it is based on what I read here.

F Port-Channel: this refers to taking two physical cables and creating one fat pipe with them. So if you have two 8Gbps FC links, you can combine them into a nice big 16Gbps FC pipe. It’s similar to Ethernet port-channels, but you do it on the SAN.

Now, you may have different VSANs that you want to carry over this pipe. If you are in non-trunking mode, you cannot do that; you are only allowed one VSAN. In UCS, this is the default setting: no trunking. This usually works fine, and in my environment I have no reason to change it.

However, if I had multiple VSANs, I’d want to make a trunk out of it. This requires a bit of extra wizardry on UCS, but it is documented here.
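
For reference, here’s a rough sketch of what the MDS side of an F port-channel looks like. The interface numbers are just examples from my lab, and for trunking you also need the fport-channel-trunk feature enabled on the MDS, if memory serves:

mds9148a# configure terminal
mds9148a(config)# interface port-channel 1
mds9148a(config-if)# switchport mode F
mds9148a(config-if)# switchport trunk mode off
mds9148a(config-if)# channel mode active
mds9148a(config-if)# exit
mds9148a(config)# interface fc1/1-2
mds9148a(config-if)# channel-group 1 force
mds9148a(config-if)# no shutdown

Swap switchport trunk mode off for on when you want multiple VSANs carried over the same port channel.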