Configuring nVidia SATA controllers for use with VMware ESX

by dave on October 16, 2008


Part of the beauty of ESX 3.x from a hardware support standpoint was the addition of SATA as a viable install media for the hypervisor and service console.  However, opening up support for SATA also included a few hiccups along the way, most related to the SATA controllers officially supported by VMware.  For folks like myself who spent a lot of time with AMD-based platforms, the only real choices for SATA controllers (onboard the motherboard, not discrete) were offerings from Broadcom and nVidia.  This post will highlight how to configure your ESX 3.x host to use nVidia SATA controllers.

Note: This information is available within the VMware user community as well.  I am indebted to the person(s) in that community who provided this information, albeit in a slightly less “visual” way.

Just a quick note before we begin: just because you can INSTALL ESX 3.5 doesn’t mean that you’ll be able to USE ESX 3.5.  The Service Console loads separately from the hypervisor, and if you don’t follow these instructions you’ll get a nasty little “Mounting root failed” error in the startup screens.

Step 1:  Log in as root

Service Console - Root Logged In

You need to log into the ESX 3.5 service console as root in order to proceed.  If you’re like me and you are going to use the remote CLI (through PuTTY in my case), you’ll need to log in as your authenticated user (in my case “ssh”) and type “su -” to switch to root.

Step 2: Use the “lspci” command to determine the PCIID of your SATA Controller

LSPCI Command - Determine the PCIID of your SATA Controller

The lspci command (typed in lowercase, since the service console is case-sensitive) will give you a listing of all PCIIDs utilized in your system that will look similar to the list below.

List of all PCIIDs in current system

You’ll need to scan through the list and determine the SATA controller PCIID.  Once you find the PCIID, write it down on a piece of paper.  You’re going to need it in a few minutes.

Note: on my system, the nVidia MCP55 SATA controller PCIID was “037f”.  Your PCIID may be different.
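If the ID is hard to pick out of the plain listing, the numeric form of the same command, “lspci -n”, prints vendor:device pairs for each slot, and nVidia’s vendor ID is 10de.  Below is a minimal sketch, assuming the service console’s lspci accepts -n (as standard pciutils does) and prints lines like “00:05.0 0101: 10de:037f (rev a3)”.  The helper name device_id_for_slot and the slot address 00:05.0 are made up for illustration; use the address that your own lspci output shows next to the SATA controller.

```shell
#!/bin/sh
# Print the four-digit device ID for one PCI slot.
#   stdin : output of `lspci -n`
#   $1    : slot address as shown in the first column of plain `lspci`
# device_id_for_slot is a hypothetical helper, not part of ESX or pciutils.
device_id_for_slot() {
    # Keep only the line for the requested slot, then pull out the
    # second half of the vendor:device pair (e.g. 037f from 10de:037f).
    grep "^$1 " | sed -n 's/.*[0-9a-fA-F]\{4\}:\([0-9a-fA-F]\{4\}\).*/\1/p'
}

# Typical use on the host (slot address is an example only):
#   lspci -n | device_id_for_slot 00:05.0
```

Matching on the slot address rather than on the device name avoids guessing which of several nVidia entries is the SATA controller.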

Step 3: Find the “sata_nv.xml” file and update the PCIID

Now that you know your PCIID, you’re going to need to update the file that the hypervisor and service console reference for storage.  For nVidia SATA controllers, this file is called “sata_nv.xml” and can be found at the following path:

Location of the sata_nv.xml config file

The location of the sata_nv.xml file is /etc/vmware/pciid/ and is highlighted in the picture above.  Once you’ve found the file, you’ll need to update it with the appropriate PCIID you recorded in Step 2.
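Since the next step overwrites the file in place, it may be worth keeping a copy first.  This is my own precaution rather than part of the original procedure; a minimal sketch, where backup_file is a hypothetical helper name:

```shell
#!/bin/sh
# Copy a file to <name>.bak next to the original.
# Refuses to overwrite an earlier backup so the pristine copy survives.
# backup_file is a hypothetical helper, not an ESX command.
backup_file() {
    if [ -e "$1.bak" ]; then
        echo "backup already exists: $1.bak" >&2
        return 1
    fi
    cp -p "$1" "$1.bak"
}

# Typical use on the host:
#   backup_file /etc/vmware/pciid/sata_nv.xml
```

If the edit goes wrong, copying sata_nv.xml.bak back over sata_nv.xml returns you to the starting point.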

Editing the sata_nv.xml file with VI

The command line syntax is “vi sata_nv.xml”.  Once you’ve entered that line, hit Enter and you’ll see the following:

VI display of the sata_nv.xml file

Move your cursor to the appropriate line (in the screen shot, that’d be “device id=00ee” or similar).  Hit “Insert” two times so that you see “Replace” in the lower left hand corner of your screen.  Anything that you type from this point on will overwrite the existing content of the file, so be careful.

Once you’ve updated the device id with the appropriate entry, hit the “Esc” key two times to leave Replace mode.  Type “:” to get to the VI command prompt, then enter “w!” and hit Enter to write your updates to the file.  Once that finishes, type “:” again, enter “q!” and hit Enter to quit the VI editor.
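If you’d rather not edit by hand in VI, the same substitution can be sketched with sed.  This assumes the line looks like “device id=00ee” (with or without quotes around the ID), which is all I can say from the screen shot; the exact layout of sata_nv.xml on your host may differ, so check the result before going on.  replace_device_id is a hypothetical helper name, and 00ee and 037f are just the example values from this post.

```shell
#!/bin/sh
# Swap one device ID for another on "device id=" lines of a file.
#   $1 = file, $2 = old ID (e.g. 00ee), $3 = new ID (e.g. 037f)
# Assumes the ID follows 'device id=' directly, optionally after a quote.
# replace_device_id is a hypothetical helper, not an ESX command.
replace_device_id() {
    # Write the edited text to a scratch file, then copy it back over the
    # original so the original file keeps its owner and permissions.
    sed "s/\(device id=\"\{0,1\}\)$2/\1$3/" "$1" > "$1.new" &&
        cat "$1.new" > "$1" &&
        rm -f "$1.new"
}

# Typical use on the host (IDs are examples only):
#   replace_device_id /etc/vmware/pciid/sata_nv.xml 00ee 037f
#   grep 'device id' /etc/vmware/pciid/sata_nv.xml
```

Only lines carrying the old ID are touched; every other device entry is left as it was.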

Note:  The screenshots in this article are from a Dell PE1850 and consequently don’t show ALL the entries that would be found for current nVidia SATA solutions.  If you’re using any Tyan (or other) mainboard that has the NF3400/NF3600 chipset, you need to look for the device id line under the MCP55 line.  That is where you’ll need to update the device id.

Step 4: Reload the PCIID tables to overwrite current config and Reboot

Once you exit the VI editor, you’ll need to make sure that your changes are updated to the PCIID tables that the hypervisor and service console reference.

Updating the PCIID tables using ESXCFG

The command for doing this update is “esxcfg-pciid”.  Type it in, hit Enter, and it will process the table updates.  The process should take around 15 seconds to complete, after which it will put you back at the command line.  At this point you need to reboot for the changes to be applied to your ESX installation.

Closing Thoughts:

Obviously, this process can be a little complicated for people who just want to “throw and go” within the virtualization space.  Part of the reason it is required has to do with differences in implementation from manufacturer to manufacturer, as well as the generic driver model provided by VMware as part of the ESX package.  As future versions of the software are released, I’d expect this amount of legwork to be reduced significantly.

Hope this helps get you up and running!

Cheers,

Dave Graham

  • Mark

    Hi, good article, but in the screenshot it would be nice to see where you get the value for your nVidia MCP55 SATA controller of “037f”. I don't see it anywhere in the output. It's a little confusing to read the lspci output, and it would help connect this step to the next one.

  • dave_graham

    Mark,

    thanks for the comments. Completely understand about LSPCI and the issues in trying to read the device ids. I don't have the server currently installed, but I'll attempt to get a good screen capture within the next couple of weeks for you. Additionally, one of the ways that I found out the PCI ID was doing a base installation of CentOS 5.x and figuring it out through there. Made it a bit easier to do. The PCI ID doesn't fundamentally change between operating systems.

    Anyhow, thanks again for your comment!

    cheers,

    Dave Graham

  • Stephane Lavoie

    Hi,
    Since update 4 i got this when i tried your solution.

    Apr 19 11:20:01 X2ESX vmkernel: 0:00:00:03.295 cpu1:1034)ALERT: Mod: 1413: Initialization for sata_nv failed with -19.

    Apr 19 11:20:01 X2ESX vmkernel: 0:00:00:03.295 cpu1:1034)sata_nv failed to load with status 0, -19, 0xbad0001.

    can you help?
    thanks

  • dave

    Stephane,

    My instructions were based on 3.5 Update 2, so, i don't know if the driver stack changed between Update 2 and Update 4. Again, the safest bet is to understand what PCIID is being assigned to your controller and using that.

    what chipset are you using?

    dave

  • STedy

    Hi

    Thanks for the quick answer.

    Nvidia 570 Ultra on an Asus M2N-E.

    The problem occurs only with U4, since everything was fine in U3 with the change in sata_nv.xml.

    Thanks
    Stephane

  • Matze

    Thx a lot, after hours i got my ESX 3.5 finally run

    Good Job 🙂

  • dave

    absolutely welcome, Matze. Glad they worked for you!

    cheers,

    Dave

  • Chris

    Great article Dave. I'm looking at this with a 4i perspective – has anyone used this successfully? How did you get on Stephane ?
    Cheers

  • dave

    actually don’t need this for 4i as the MCP55 is coded into the base path. I’ve got two Supermicro-based AMD Istanbul nodes going strong right now on 4i with nary a problem!

    cheers,

    dave