Virtual Environment: February 2014

Wednesday, February 26, 2014

Changing the width of the device field in esxtop

Symptom: When running esxtop to troubleshoot an issue, some field widths are too small to show the full value. For example, when troubleshooting a storage issue, the device ID is displayed truncated (naa.xxxx) rather than as the full device ID.

How to fix that: for example, when troubleshooting a storage issue
1. Run esxtop in an SSH session on the ESXi server
2. Press u for the disk device view
3. Press Shift+L (capital L)
4. Type 36 and press Enter (to display 36 characters)

Converting RDM disk to VMDK

Issue: An RDM disk (physical or virtual compatibility mode) needs to be converted to a VMDK disk.

Symptoms: A few VMs have RDM disks (physical or virtual mode) and those disks need to be converted to VMDK.

Scenario A: Converting a physical compatibility mode RDM disk to VMDK

                1. A physical mode RDM cannot be converted to VMDK directly. It must first be converted to virtual compatibility mode.
                2. Shut down the VM.
                3. Click Edit Settings and select the physical mode RDM disk.
                4. Click Remove and choose to delete the files from the datastore (for a physical compatibility mode RDM this does not delete the data on the disk).
                5. Click Edit Settings again.
                6. Add a hard disk.
                7. Choose RDM as the disk type.
                8. Select the same RDM disk and set the mode to virtual compatibility mode.
                9. Power on the VM.
                10. Right-click the VM and select Migrate.
                11. Select the datastore migration option.
                12. Click Advanced to set the options per disk.
                13. Change the disk format to thick or thin.
                14. Select the destination datastore.

Scenario B: Converting a virtual compatibility mode RDM disk to VMDK

                 1. Right-click the VM and select Migrate.
                 2. Select the datastore migration option.
                 3. Click Advanced to set the options per disk.
                 4. Change the disk format to thick or thin.
                 5. Select the destination datastore.

"Media Is Write protected" error while initializing new disk in windows server 2008

Error message: Media is write protected

Symptoms: A new storage disk was added to the Windows server, but initializing it under Disk Management failed with the error "Media is write protected".

Root cause: This is caused by the SAN policy in Windows Server 2008. If the SAN policy is VDS_SP_OFFLINE, the new disk is left offline and read-only.

How to fix the issue:

1. Open the command prompt as administrator, type diskpart and press Enter
2. Type list disk
3. Type select disk X (where X is the number of the offline disk)
4. If the disk is offline, type online disk to bring it online
5. Type detail disk and check whether the read-only attribute shows Yes
6. Type attributes disk clear readonly to clear the read-only attribute
7. Type exit
8. Initialize the disk again. If the initialize prompt does not appear, reboot the server and try again.
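
The diskpart steps above can also be collected into a script file and run with diskpart /s <file>. The following is a minimal Python sketch that only builds the script text; the function name and the disk number 2 are made up for illustration, and "noerr" is added to the online step on the assumption that the disk may already be online.

```python
def build_diskpart_script(disk_number: int) -> str:
    """Return diskpart commands that bring a disk online and clear read-only."""
    if disk_number < 0:
        raise ValueError("disk number must be non-negative")
    commands = [
        f"select disk {disk_number}",
        # noerr lets the script continue if the disk is already online
        "online disk noerr",
        "attributes disk clear readonly",
        # Print the disk details so the read-only state can be confirmed
        "detail disk",
    ]
    return "\n".join(commands) + "\n"


if __name__ == "__main__":
    # Disk 2 is a hypothetical example; confirm the number with "list disk" first.
    print(build_diskpart_script(2), end="")
```

Save the output to a text file and pass it to diskpart /s from an elevated command prompt. Check the disk number carefully first, because the script acts on whichever disk it selects.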

Friday, February 21, 2014

Java error "Certificate has been revoked. The application will not be executed" on UCS KVM

When you try to access the KVM console from the UCS CIMC or UCS Manager, Java shows the error

“Certificate has been revoked. The application will not be executed”

How to fix this issue temporarily (workaround):

1. Go to the Control Panel on the machine where you are opening the KVM

2. Double-click Java

3. Click the Advanced tab in the Java Control Panel

4. Find the section “Perform certificate revocation checks on”

5. Select “Do not check (not recommended)”

Now you will be able to access the KVM. Once you are done with the work, go back to the Control Panel and restore the previous setting.

UCS Manager shows major fault F0909 for the keyring certificate

Where to confirm this issue:
Check the UCS Manager alerts. They show fault code F0909 with the message that the default keyring certificate has expired.
Open an SSH session to UCS Manager using PuTTY, then run the following commands to check the certificate status.

UCS-A# scope security
UCS-A /security # show keyring detail

Certificate status: expired (this is what the output shows)
How to fix this issue:
In the SSH session to UCS Manager, run the following commands to regenerate the default certificate

UCS-A# scope security
UCS-A /security # scope keyring default
UCS-A /security/keyring # set regenerate yes
UCS-A /security/keyring* # commit-buffer
UCS-A /security/keyring #
Go back to the UCS Manager GUI and accept the new certificate. The UCS Manager GUI session will close and reopen.

If a third-party certificate is used instead of the default one, import the certificate using these commands

UCS-A# scope security
UCS-A /security # scope keyring XXXXX ( XXXXX keyring name for ke20)
UCS-A /security/keyring # set trustpoint yyyyy ( yyyyy is the trustpoint name created during the certificate request)
UCS-A /security/keyring* # set cert
Enter lines one at a time. Enter ENDOFBUF to finish. Press ^C to abort.
Keyring certificate:
> -----BEGIN CERTIFICATE-----
               XXXXXXXXXXXXXXXXX
               XXXXXXXXXXXXXXXXX
               XXXXXXXXXXXXXXXXX
               XXXXXXXXXXXXXXXXX
> -----END CERTIFICATE-----
> ENDOFBUF
UCS-A /security/keyring* # commit-buffer
Go back to the GUI and accept the new certificate. The GUI session will close and reopen.

Reclaiming Thin provisioned Disk

Issue: The datastore shows the correct free space from the ESXi side, but the NetApp side does not. For example, a 100 GB thin-provisioned NetApp LUN (lun1) is presented to the ESXi server. A 25 GB VM is created on lun1, so the free space on lun1 is 75 GB, which is correct. The VM is then moved with Storage vMotion to another LUN (lun2). The free space on lun1 should now be 100 GB in both the ESXi view and the NetApp view, but while ESXi shows 100 GB free, NetApp still shows 75 GB free and 25 GB in use.

Root cause: VAAI UNMAP is disabled by default on the ESXi host because of its performance impact. This VAAI feature informs the storage array that VM files have been moved or deleted, which allows the array to reclaim the blocks.

Workaround: To reclaim the free space, use vmkfstools. Before running the reclaim command on the ESXi host, check a few things:
1. Verify the hardware acceleration status on the ESXi storage tab. It should show Supported, or Unknown if the VAAI option is disabled on the NetApp side.
2. Note down the device ID (naa.XXXXXXXXXXXXXXXX). It is shown in the ESXi storage tab view, or you can get it over SSH with the command below

esxcli storage core device list | more

3. Run this command in the SSH session to check whether the LUN is thin provisioned

esxcli storage core device list -d naa.xxxxxxxxxxx
Near the end of the output, check for
                   Thin provisioning status: yes
                   Attached filters: VAAI_filter
                   VAAI status: supported (it may show unknown if it is disabled from NetApp)
4. Check whether delete is supported on that LUN. Run the command below over SSH

esxcli storage core device vaai status get -d naa.XXXXXXXXXX

The last line of the result should show Delete status: supported

5. To reclaim the space, change to the VMFS volume path and run vmkfstools over SSH

cd /vmfs/volumes/lun1/
vmkfstools -y xx (for example, vmkfstools -y 50 reclaims 50% of the free space)

Note: Do not set the reclaim percentage to 100. That would occupy all of your free space until the reclaim process completes, and any other VM running on the same LUN will go down if it does not have enough space.

Here xx is the percentage of free space you want to reclaim. For example, suppose you have a 100 GB LUN named lun1 with two VMs of 25 GB each, so 50 GB is free and 50 GB is used. Now Storage vMotion one of the VMs to another LUN (lun2). lun1 now has 75 GB of free space. If you set the reclaim percentage to 50, a balloon file of 37.5 GB (50% of 75 GB) is created. Setting 100% is not recommended because it may take the other VM down for lack of space.
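
The two checks that matter before a reclaim, whether delete is supported and how large the balloon file will be, can be sketched in Python as below. This is only an illustration: the sample text stands in for the output of the vaai status command and is not captured from a real host, and the function names are made up.

```python
# Illustrative stand-in for "esxcli storage core device vaai status get" output.
# The plugin name is a placeholder, not a real value.
SAMPLE_OUTPUT = """\
naa.XXXXXXXXXX
   VAAI Plugin Name: EXAMPLE_PLUGIN
   ATS Status: supported
   Clone Status: supported
   Zero Status: supported
   Delete Status: supported
"""


def parse_status_fields(output: str) -> dict:
    """Collect 'Name: value' lines into a dict keyed by lower-cased name."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon, such as the device ID line
            fields[key.strip().lower()] = value.strip()
    return fields


def delete_supported(output: str) -> bool:
    """True when the delete status field reports 'supported'."""
    return parse_status_fields(output).get("delete status", "").lower() == "supported"


def balloon_size_gb(free_gb: float, reclaim_percent: int) -> float:
    """Size of the temporary balloon file for a given reclaim percentage."""
    if not 0 < reclaim_percent < 100:
        raise ValueError("use a percentage between 1 and 99; 100 fills the datastore")
    return free_gb * reclaim_percent / 100


if __name__ == "__main__":
    print("delete supported:", delete_supported(SAMPLE_OUTPUT))
    print("balloon size (GB):", balloon_size_gb(75, 50))
```

With 75 GB free and a reclaim percentage of 50, balloon_size_gb returns 37.5, which matches the worked example above. Rejecting 100 in the helper is a design choice that mirrors the warning in the note.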

Virtual Storage Console: Host NetApp discovery hangs at terminating stale tasks @ 15%

Where this error shows:
1. In the vSphere Client, the task hangs at 15% with "terminating stale tasks".
2. In Virtual Storage Console, one or more hosts show as unknown or disconnected.

Cause for this issue:
1. One of the ESXi hosts in vSphere is in a not responding or disconnected state.
2. The account used to add vSphere to VSC does not have the required permissions.

How to fix this issue:
1. Go to VSC, right-click the host that shows as disconnected or not responding, click the Skip option and make sure the small check box in it is checked. The host status should show as skipped once the Skip option is selected.
2. If the ESXi host is permanently down, remove it from the vSphere Client.
3. Check that the account used for the VSC configuration has proper access to vCenter.

This is a bug in VSC version 4.1 with vCenter version 5 and it is fixed in 4.2. The bug ID is 592931. Here is the bug detail

http://support.netapp.com/NOW/cgi-bin/bol?Type=Detail&Display=592931

Unable to communicate with Flex flash controller: Operation ffcardsGet, status Error Timeout / “Bootbank cannot be found at the path /bootbank” error

Where to see this error:

1. CIMC > Storage > Cisco FlexFlash > Controller Info

2. If you are running ESXi on this C-series server, you will get the “Bootbank cannot be found at the path /bootbank” error

Cause for this issue: This is a known bug if you are running a firmware version lower than 1.5(3d) on a C-series server. The bug ID is CSCuh33982. It is resolved in firmware version 1.5(3d).

https://tools.cisco.com/bugsearch/bug/CSCuh33982
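
To check a list of servers against the fixed release, the version strings can be compared programmatically. The sketch below is an assumption-laden illustration: it assumes versions always look like 1.5(3d), that is major.minor(maintenance plus optional letters), and that the letters sort alphabetically. The function names are made up.

```python
import re

# Assumed format: major.minor(maintenance + optional patch letters), e.g. 1.5(3d)
_VERSION = re.compile(r"^(\d+)\.(\d+)\((\d+)([a-z]*)\)$")


def parse_version(text: str) -> tuple:
    """Turn '1.5(3d)' into a comparable tuple (1, 5, 3, 'd')."""
    match = _VERSION.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised firmware version: {text!r}")
    major, minor, maintenance, patch = match.groups()
    return (int(major), int(minor), int(maintenance), patch)


# Release in which the post says the bug is resolved
FIXED_IN = parse_version("1.5(3d)")


def is_affected(version: str) -> bool:
    """True when the firmware is older than the fixed release."""
    return parse_version(version) < FIXED_IN


if __name__ == "__main__":
    for candidate in ("1.4(7h)", "1.5(1f)", "1.5(3d)", "1.5(4)"):
        print(candidate, "affected" if is_affected(candidate) else "not affected")
```

Any version string that does not match the assumed pattern raises an error rather than being silently treated as safe.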

Workaround to fix this issue:
1. Reboot the CIMC. This does not disrupt the ESXi server or the VMs running on it. The CIMC management page needs to be reloaded afterwards. I recommend clearing the browser cache before accessing the management page again after the CIMC restart.

How to restart the CIMC:

Go to the Admin tab in CIMC and click Reboot CIMC.
If you want to do that from the CLI, SSH to the management IP address of the CIMC and run the commands below.
scope cimc
reboot

2. If CIMC now shows the controller as healthy and ESXi still shows the same error, SSH into the ESXi server and restart the services with the command below
services.sh restart (this does not affect the VMs running on the host, but the ESXi server will disconnect and reconnect automatically once the services are back up).

Making Windows Server 2012 an iSCSI target

Windows Server 2012 comes with the iSCSI target as a built-in component. You just have to install the role and make use of it.

How to install the iSCSI target role

1. Open Server Manager and click Add Roles and Features
2. In the Add Roles and Features wizard, click Next
3. Select Role-based or feature-based installation and click Next
4. Click the Select a server from the server pool option, select the server that will run as the iSCSI target and click Next
5. Expand File and Storage Services, select iSCSI Target Server and click Next
6. In the Add Roles and Features wizard, check the box Include management tools and click Add Features
7. Click Next
8. Click Next on the next window and click Install

How to configure the iSCSI target
1. Open Server Manager and click File and Storage Services
2. Click the iSCSI Virtual Disks option
3. Click “Launch the New Virtual Disk wizard to create a virtual disk”
4. Select the option “Type a custom path”
5. Browse to the path where you want to create the target. For example, if you connect an external hard drive and its drive letter is F, enter the path F:\ISCSI-disk1 and click Next.
6. On Specify iSCSI virtual disk name you can give any name, for example VMdatastore1 or Hypervdatastore1
7. Click Next
8. Specify the size of the disk, for example 256 GB
9. Now select the New iSCSI target option
10. Give the target any name. If you are using it for a VMware cluster, use that cluster name; if you are using it for a Hyper-V cluster, use that name
11. Click Add to add the initiators (the initiator is your ESXi host IQN or the iSCSI IP address assigned to the ESXi port group; if you are using Hyper-V, it is the Hyper-V server IP address or IQN)
12. Select the option “Enter a value for the selected type” and select the type IP Address if you want to add an IP, or IQN if you want to add an iSCSI qualified name (IQN)
13. Click Next
14. Enable authentication according to your company policy. If you are not sure, just click Next without checking any option on this screen
15. Review the configuration you have selected so far and click Create
16. Click Close

How to access the iSCSI disks now

1. Go to the server on which you want to configure the iSCSI disk
2. In Server Manager, click Tools and select iSCSI Initiator
3. If the iSCSI service is not running, you will be prompted to start it. Click Yes to start the iSCSI service
4. On the Targets tab, enter the IP address or name of your iSCSI target server and click Quick Connect
5. Once it connects to the target, the IQN of the target is listed in the Quick Connect dialog. Click Done
6. Go to Disk Management and rescan the disks.

Set as Primary option is greyed out in the ADFS certificate options

When you need to replace the token-signing or token-decrypting certificate, after importing the new certificate and trying to make it the primary certificate, the Set as Primary option is greyed out.

Cause: AutoCertificateRollover is enabled in the ADFS properties.
How to fix that:

1. Open PowerShell as administrator
2. Add-PSSnapin Microsoft.Adfs.PowerShell (this loads the PowerShell snap-in for ADFS)
3. Get-ADFSProperties (this shows AutoCertificateRollover as $true, which means it is enabled)
4. Set-ADFSProperties -AutoCertificateRollover $false (this disables automatic certificate rollover)
5. Go back to the ADFS certificate console, right-click the new certificate and set it as the primary certificate.
6. You can re-enable automatic certificate rollover by running Set-ADFSProperties -AutoCertificateRollover $true in PowerShell

Monday, February 10, 2014

Deleting a snapshot failed with the error: the virtual disk is either corrupted or not a supported format

Error file location:

There are two ways to view the vmware.log file

Method 1: Using the CLI

1. SSH into the ESXi host
2. Navigate to the virtual machine folder: cd /vmfs/volumes/<datastore name>/<virtual machine name>/
3. cat vmware.log | more
4. Press the space bar to page through the log entries until you reach the relevant time

Note: The timestamps may be in UTC if the ESXi host is set to UTC.

5. Look for the error at the time the snapshot removal was initiated. The error will be “SNAPSHOT: SnapshotConsolidateOpenDisks failed: Could not open/create change tracking file”

Method 2: Using the GUI
1. In the vSphere Client, browse the datastore and navigate to the virtual machine folder
2. Select the current vmware.log file and download it to your desktop.
3. Open the file with WordPad or Notepad and look for the error at the time the snapshot removal was initiated.
“SNAPSHOT: SnapshotConsolidateOpenDisks failed: Could not open/create change tracking file”
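
Once vmware.log has been downloaded, the search for the error can be scripted instead of paging through the file by hand. This is a minimal Python sketch; the sample log lines are invented for illustration and do not come from a real vmware.log, and the function name is made up.

```python
ERROR_TEXT = "SnapshotConsolidateOpenDisks failed"


def find_snapshot_errors(log_text: str) -> list:
    """Return (line number, line) pairs that contain the consolidation error."""
    return [
        (number, line.strip())
        for number, line in enumerate(log_text.splitlines(), start=1)
        if ERROR_TEXT in line
    ]


if __name__ == "__main__":
    # Invented sample lines; with a real file use open("vmware.log").read()
    sample = (
        "2014-02-10T08:15:01Z| vmx| SNAPSHOT: consolidate requested\n"
        "2014-02-10T08:15:02Z| vmx| SNAPSHOT: SnapshotConsolidateOpenDisks failed: "
        "Could not open/create change tracking file\n"
        "2014-02-10T08:15:03Z| vmx| unrelated entry\n"
    )
    for number, line in find_snapshot_errors(sample):
        print(number, line)
```

The timestamp at the start of each matching line can then be compared with the time the snapshot removal was initiated.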

How to fix this issue:
1. Create a temporary folder on the same datastore
2. Move all the change tracking (CTK) files, whose names end in -ctk.vmdk, to the temporary folder
3. Go back to the VM and initiate the snapshot removal again.

Note: Sometimes you may get the same error again, but the snapshot is deleted even though the error is shown.
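
The move in step 2 can be sketched in Python as below. This is an illustration only: it assumes the change tracking files follow the usual -ctk.vmdk naming, the function name is made up, and on the ESXi shell itself a plain mv does the same job.

```python
from pathlib import Path
import shutil


def move_ctk_files(vm_dir, temp_dir) -> list:
    """Move every *-ctk.vmdk file from vm_dir into temp_dir; return the names moved."""
    vm_dir = Path(vm_dir)
    temp_dir = Path(temp_dir)
    # Create the temporary folder if it does not exist yet
    temp_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in sorted(vm_dir.glob("*-ctk.vmdk")):
        shutil.move(str(path), str(temp_dir / path.name))
        moved.append(path.name)
    return moved
```

Calling move_ctk_files with the VM folder and a temporary folder on the same datastore returns the list of file names it moved. The files are moved rather than deleted so that they can be put back if needed.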

Sunday, February 9, 2014

vCenter SQL transaction log full

If the vCenter Server service stops or terminates unexpectedly, one of the main reasons is that either the database or the transaction log file is full.

Where to confirm that the issue is caused by the database transaction log:

View the vpxd.log file located at C:\ProgramData\VMware\VMware VirtualCenter\Logs\vpxd.log (for Windows 2008 and above). Open the file and go all the way to the end, or check the specific time at which the service stopped.

If the issue is caused by the database transaction log, there will be an error stating

"The transaction log for database 'databasename' is full. To find out why space in the log cannot be reused, see the log_reuse_wait_desc column in sys.databases" is returned when executing SQL statement.

How to fix this issue:

1. Log in to the SQL server with an account that has database admin access
2. Open SQL Server Management Studio
3. Right-click the vCenter database and click Properties
4. Click Options
5. Set the recovery model to Simple and click OK
6. Right-click the database and click Tasks > Shrink > Files
7. Select the file type Log
8. Confirm that the transaction log file name appears in the File name field
9. Select Release unused space and click OK.
10. Go back to vCenter and start the service.

Note: Setting the recovery model to Simple has one drawback. If the database gets corrupted for any reason, you can recover only up to the last full backup.
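
For those who prefer a query window to the GUI, the same two changes can be expressed as T-SQL. The Python sketch below only builds the statements as text; it assumes that the Release unused space option corresponds to DBCC SHRINKFILE with TRUNCATEONLY, and the database name vcdb and log file name vcdb_log are hypothetical placeholders.

```python
def quote_identifier(name: str) -> str:
    """Wrap a SQL Server identifier in brackets, escaping any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def build_log_shrink_statements(database: str, log_file: str) -> list:
    """Return T-SQL that sets SIMPLE recovery and releases unused log space."""
    db = quote_identifier(database)
    # Escape single quotes because the logical file name is passed as a string
    log_name = log_file.replace("'", "''")
    return [
        f"ALTER DATABASE {db} SET RECOVERY SIMPLE;",
        f"USE {db};",
        f"DBCC SHRINKFILE (N'{log_name}', 0, TRUNCATEONLY);",
    ]


if __name__ == "__main__":
    # vcdb and vcdb_log are placeholder names; look up the real ones first.
    for statement in build_log_shrink_statements("vcdb", "vcdb_log"):
        print(statement)
```

The logical name of the log file is the one shown in the File name field of the Shrink File dialog. Review the generated statements before running them against a production database.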