
Dynamically expand a LUN in AIX (after it has first been grown on the storage side).

QUESTION ::  How to dynamically extend a disk in AIX after the storage team has increased the LUN size at their end.

DETAILS from STORAGE TEAM:

FileSystem to Grow  ::      /db2/fs001

LUN Allocated           ::      hdiskpower10 100 125 6000144000000010700803B124345AA7

1. (root@q111pstricfreg):/# lspv | grep -i hdiskpower10       – Run this to find which volume group this disk / PV belongs to.
hdiskpower10    000130e3e549c064       datavg          active

2. (root@q111pstricfreg):/# getconf DISK_SIZE /dev/hdiskpower10        –  This shows the NEW, increased size (in MB).
129024

(root@q111pstricfreg):/# lspv hdiskpower10 | grep -i TOTAL          –  But #lspv still shows OLD SIZE Only.
TOTAL PPs:          1598 (102272 megabytes)  VG DESCRIPTORS:   1

3.  cfgmgr                 – Rescan devices so AIX picks up the changed LUN.

4.  chvg -g datavg         – "-g" examines every disk that is part of "datavg" for size changes.

5. (root@q111pstricfreg):/# lspv hdiskpower10 | grep -i TOTAL     – After #chvg, lspv now reflects the ~25 GB of extra space.
TOTAL PPs:          2014 (128896 megabytes)  VG DESCRIPTORS:   1
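
As a cross-check, the PP counts line up with the growth the storage team reported (PP size on this VG is 64 MB, per the lslv output further down):

2014 PPs - 1598 PPs = 416 PPs
416 PPs  x 64 MB    = 26624 MB  (128896 - 102272), i.e. the ~25 GB that was added.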

6. (root@q111pstricfreg):/# df -g /db2/fs001                      – Check the size of the FS before adding space to it / extending the LV.
Filesystem    GB blocks      Free %Used    Iused %Iused Mounted on
/dev/fslv14      157.81     20.72   87%       69     1% /db2/fs001

7. (root@q111pstricfreg):/# lsvg -l datavg | grep -i /db2/fs001   – Find the LV associated with FS "/db2/fs001"; to add space to the FS, we first need to extend the LV associated with it.
fslv14              jfs2       2525    2525    5    open/syncd    /db2/fs001

8. (root@q111pstricfreg):/# extendlv fslv14 25G                   –  Extending / Adding 25GB space to LV, so that it can be used to increase the Filesystem space.

9. (root@q111pstricfreg):/# chfs -a size=+25G /db2/fs001          –  Adds 25 GB space to filesystem now.
Filesystem size changed to 383385600
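
The figure chfs prints is in 512-byte blocks, which lines up with the df output in the next step:

383385600 blocks x 512 bytes = 196293427200 bytes ≈ 182.81 GB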

10. (root@q111pstricfreg):/# df -g /db2/fs001                     –  Compare with the previous df output above to confirm that the space has been added.
Filesystem    GB blocks      Free %Used    Iused %Iused Mounted on
/dev/fslv14      182.81     45.71   75%       69     1% /db2/fs001
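
* For quick reference, the whole expansion boils down to this short sequence (same disk / VG / LV / FS names as in this example; run it only after the storage side has grown the LUN):

# cfgmgr                           – rescan devices
# chvg -g datavg                   – pick up the new LUN size in the VG
# extendlv fslv14 25G              – grow the LV (needs enough free PPs in the VG)
# chfs -a size=+25G /db2/fs001     – grow the JFS2 filesystem online
# df -g /db2/fs001                 – verify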

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
SOME COMMANDS THAT MIGHT BE USEFUL TO GET PRECISE INFORMATION WHILE TRYING TO DO THIS ARE AS FOLLOWS ::

(root@q111pstricfreg):/# lspv -p hdiskpower10
hdiskpower10:
PP RANGE  STATE   REGION        LV NAME             TYPE       MOUNT POINT
1-320   used    outer edge    fslv19              jfs2       /db2/fs006
321-324   used    outer edge    fslv04              jfs2       /dba
325-331   used    outer edge    fslv10              jfs2       /db2/db2p095b
332-403   used    outer edge    fslv14              jfs2       /db2/fs001
404-456   used    outer middle  fslv14              jfs2       /db2/fs001
457-806   used    outer middle  fslv19              jfs2       /db2/fs006
807-1208  used    center        fslv19              jfs2       /db2/fs006
1209-1598  used    inner middle  fslv19              jfs2       /db2/fs006
1599-1611  used    inner middle  fslv14              jfs2       /db2/fs001
1612-1998  used    inner edge    fslv14              jfs2       /db2/fs001
1999-2014  free    inner edge

(root@q111pstricfreg):/# lsvg -p datavg
datavg:
PV_NAME           PV STATE          TOTAL PPs   FREE PPs    FREE DISTRIBUTION
hdiskpower11      active            2014        16          00..00..00..00..16
hdiskpower10      active            2014        16          00..00..00..00..16
hdiskpower9       active            1598        0           00..00..00..00..00
hdiskpower8       active            2014        16          00..00..00..00..16
hdiskpower7       active            1598        0           00..00..00..00..00
hdiskpower6       active            2014        16          00..00..00..00..16
hdiskpower2       active            1598        0           00..00..00..00..00
hdiskpower1       active            1598        0           00..00..00..00..00
hdiskpower0       active            1598        0           00..00..00..00..00
hdiskpower3       active            1998        0           00..00..00..00..00
hdiskpower12      active            1598        0           00..00..00..00..00
hdiskpower5       active            3998        0           00..00..00..00..00
hdiskpower4       active            3998        105         00..00..00..00..105

(root@q111pstricfreg):/# lslv -l fslv14
fslv14:/db2/fs001
PV                COPIES        IN BAND       DISTRIBUTION
hdiskpower10      525:000:000   10%           072:053:000:013:387
hdiskpower2       1200:000:000  26%           320:320:319:241:000
hdiskpower12      365:000:000   52%           000:192:000:000:173
hdiskpower5       526:000:000   61%           000:323:000:000:203
hdiskpower4       309:000:000   100%          000:309:000:000:000

(root@q111pstricfreg):/# lslv fslv14
LOGICAL VOLUME:     fslv14                 VOLUME GROUP:   datavg
LV IDENTIFIER:      000130e30000d6000000012fe549cb55.12 PERMISSION:     read/write
VG STATE:           active/complete        LV STATE:       opened/syncd
TYPE:               jfs2                   WRITE VERIFY:   off
MAX LPs:            3000                   PP SIZE:        64 megabyte(s)
COPIES:             1                      SCHED POLICY:   parallel
LPs:                2925                   PPs:            2925
STALE PPs:          0                      BB POLICY:      relocatable
INTER-POLICY:       minimum                RELOCATABLE:    yes
INTRA-POLICY:       middle                 UPPER BOUND:    1024
MOUNT POINT:        /db2/fs001             LABEL:          /db2/fs001
MIRROR WRITE CONSISTENCY: on/ACTIVE
EACH LP COPY ON A SEPARATE PV ?: yes
Serialize IO ?:     NO
DEVICESUBTYPE : DS_LVZ
COPY 1 MIRROR POOL: None
COPY 2 MIRROR POOL: None
COPY 3 MIRROR POOL: None

How to map a disk that is provisioned to the VIOs to an LPAR and add it to a VG on the LPAR server

* * * FIRST and FOREMOST, I've received the following details from the storage team about the "100 GB" LUN allocation.

Device ID  : 143
Device WWN : 6000144000000010701206ebda58e4c9
size of the drive : 100 GB
Disk Mapped to servers “odc-vio1 & 2”

[root@odc-vio1:/] cfgmgr                    >> I’ve run this on both VIO servers.
[root@odc-vio1:/] lspv | tail
– – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – –
– – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – –
hdiskpower140     00c5dbb6f88d7bf4           None
hdisk916               none                                 None
hdiskpower143     none                                 None

* * * Then I've run the following command on "hdiskpower143" to verify whether the "Device WWN" provided by the storage team matches… and it did. The rest of the commands after this are used to find more details on the LUN.

[root@odc-vio1:/] powermt display dev=hdiskpower143
Pseudo name=hdiskpower143
VPLEX ID=CKM00132500764
Logical device ID=6000144000000010701206EBDA58E4C9
state=alive; policy=ADaptive; queued-IOs=0
==============================================================================
————— Host —————   – Stor –  — I/O Path —   — Stats —
###  HW Path               I/O Paths    Interf.  Mode     State   Q-IOs Errors
==============================================================================
1 fscsi0                 hdisk916    CL1-0E   active   alive      0      0
1 fscsi0                 hdisk915    CL1-02   active   alive      0      0
0 fscsi1                 hdisk914    CL1-0B   active   alive      0      0
0 fscsi1                 hdisk913    CL1-07   active   alive      0      0

[root@odc-vio1:/] lsattr -El hdiskpower143 -a lun_id
lun_id 0x8f000000000000 LUN ID False

[root@odc-vio1:/] lscfg -vl hdiskpower143
hdiskpower143    U78A0.001.DNWK6DG-P1-C2-T1-L389  PowerPath Device
Manufacturer…………….EMC
Machine Type and Model……Invista
ROS Level and ID…………5210
Serial Number……………CKM00132500764
Subsystem Vendor/Device ID..APM000824019860
Device Specific.(PQ)……..00
Device Specific.(VS)……..da58e4c9701206eb
Device Specific.(UI)……..6000144000000010701206EBDA58E4C9
FRU Label……………….0200
Device Specific.(Z0)……..10
Device Specific.(Z1)……..10

[root@odc-vio1:/] lsattr -El hdisk916 -a lun_id
lun_id 0x8f000000000000 Logical Unit Number ID False

[root@odc-vio1:/] lsattr -El hdisk915 -a lun_id
lun_id 0x8f000000000000 Logical Unit Number ID False

[root@odc-vio1:/] lsattr -El hdisk914 -a lun_id
lun_id 0x8f000000000000 Logical Unit Number ID False

[root@odc-vio1:/] lsattr -El hdisk913 -a lun_id
lun_id 0x8f000000000000 Logical Unit Number ID False

[padmin@odc-vio2:/home/padmin] lspv -free              -> This command runs only as "padmin" on the VIO server and lists unallocated / unused disks.
NAME                  PVID                   SIZE(megabytes)
hdisk84                none                     51200
hdiskpower143    none                    102400

[padmin@odc-vio1:/home/padmin] lspv -free
NAME                  PVID                                SIZE(megabytes)
hdisk400              none                                 51200
hdiskpower143    none                                102400

[padmin@odc-vio2:/home/padmin] chdev -dev hdiskpower143 -attr pv=yes   -> Run as the "padmin" user; this assigns a PVID to the disk, which makes the disk easy to locate anywhere (in the LPAR and on the VIOs).
hdiskpower143 changed

[padmin@odc-vio1:/home/padmin] chdev -dev hdiskpower143 -attr pv=yes
hdiskpower143 changed

* * * NOW, [padmin]# lspv -free should show a PVID for the disk "hdiskpower143" on both servers, as shown below. The PVID of a disk is the same on both servers.

NAME                        PVID                         SIZE(megabytes)
hdiskpower143   00c5dbb6c39d6d74             102400            – Server “odc-vio1”
hdiskpower143   00c5dbb6c39d6d74             102400            – Server “odc-vio2”

* * * Now we need to identify the "Virtual Host Device" (vhost) on each VIO. For that we first need the LPAR ID. In this scenario the LPAR server is "plussrv"; log in to it and run " # lparstat -i ", which gives the LPAR (plussrv) ID in decimal. Convert that from decimal to hexadecimal and then run the " # lsmap " command shown below on each of the VIOs. The "4" used in the grep below is that hexadecimal number.

[root@plussrv:/] lparstat -i
Node Name                                  : plussrv
Partition Name                             : plussrv
Partition Number                           : 4
– – – – – – – – – – – – – – – – – – – – – – – – – – – – – –
– – – – – – – – – – – – – – – – – – – – – – – – – – – – – –

 * * * The LPAR ID is simply the "Partition Number" shown above; 4 happens to be the same in decimal and hex.

[padmin@odc-vio1:/home/padmin] lsmap -all|grep vhost|grep 4
vhost0          U8204.E8A.065DBB6-V1-C15                     0x00000004

[padmin@odc-vio2:/home/padmin] lsmap -all|grep vhost|grep 4
vhost0          U8204.E8A.065DBB6-V2-C16                     0x00000004
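
* * * For partition numbers above 9, the decimal value from lparstat and the hex value shown by lsmap no longer look alike, so it helps to print the ID directly in lsmap's format. A minimal sketch (run on the LPAR; assumes a standard awk is available):

lparstat -i | awk -F: '/Partition Number/ {printf "0x%08x\n", $2}'      – prints 0x00000004 for this LPAR; grep for that string in the lsmap output on each VIO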

* * * Now we need to map the disk as a "Virtual SCSI Disk" to "vhost0" (per the output above) on both VIOs.

[padmin@odc-vio1:/home/padmin] mkvdev -vdev hdiskpower143 -vadapter vhost0 -dev plsrvtrmtrkvgd4
plsrvtrmtrkvgd4 Available

* * * The virtual target device for "hdiskpower143" was named "plsrvtrmtrkvgd4" so that it is easy to tell which server and VG it is allocated to. The device name must not be longer than 15 characters.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

* * * While trying to execute the same # mkvdev command on VIO2, I encountered the following error.

[padmin@odc-vio2:/home/padmin] mkvdev -vdev hdiskpower143 -vadapter vhost0 -dev plsrvtrmtrkvgd4
Cannot access a device.

[padmin@odc-vio2:/home/padmin] oem_setup_env
# cfgmgr
# chdev -l hdiskpower143 -a pv=yes -a reserve_policy=no_reserve
hdiskpower143 changed

* * * Even after doing the above 3 steps on VIO2, the error " Cannot access a device " kept coming. When I executed the same 3 steps on VIO1, I got the following error.

# chdev -l hdiskpower143 -a pv=yes -a reserve_policy=no_reserve
Method error (/etc/methods/chgpowerdisk):
        0514-062 Cannot perform the requested function because the
                 specified device is busy.

* * * A nice way to know if the disk is locked by the other VIO is to run " # bootinfo -s hdiskXX "; if the returned value is 0 instead of the disk size in MB, then something is wrong. I got the following output.
      
[padmin@odc-vio2:/home/padmin] bootinfo -s hdiskpower143
102400

>> >> >> THE FOLLOWING STEPS HELPED OVERCOME THE ABOVE ERROR ” CANNOT ACCESS A DEVICE “.

[padmin@odc-vio1:/home/padmin] rmvdev -vtd plsrvtrmtrkvgd4         >> This is on VIO1, where # mkvdev had already run successfully.
plsrvtrmtrkvgd4 deleted

* * * The following commands were executed on both servers " odc-vio[1,2] ".

[padmin@odc-vio1:/home/padmin] rmdev -dev hdiskpower143
hdiskpower143 deleted

[padmin@odc-vio1:/home/padmin] cfgdev

[padmin@odc-vio1:/home/padmin] chdev -dev hdiskpower143 -attr reserve_policy=no_reserve
hdiskpower143 changed
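
* * * In short, the recovery sequence used above was (run as padmin on each affected VIO; the device and VTD names are the ones from this example):

rmvdev -vtd plsrvtrmtrkvgd4                                – only where mkvdev had already succeeded
rmdev -dev hdiskpower143
cfgdev                                                     – rediscover the device
chdev -dev hdiskpower143 -attr reserve_policy=no_reserve   – stop the two VIOs from locking each other out
mkvdev -vdev hdiskpower143 -vadapter vhost0 -dev plsrvtrmtrkvgd4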

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

* * * After doing the above steps to overcome the " Cannot access a device " error on the 2nd VIO, I am finally able to run " # mkvdev " successfully on both VIO servers.

[padmin@odc-vio1:/home/padmin] mkvdev -vdev hdiskpower143 -vadapter vhost0 -dev plsrvtrmtrkvgd4
plsrvtrmtrkvgd4 Available

[padmin@odc-vio2:/home/padmin] mkvdev -vdev hdiskpower143 -vadapter vhost0 -dev plsrvtrmtrkvgd4
plsrvtrmtrkvgd4 Available

* * * Now " # lsmap " should show the VTD we've added to the vhosts. Run the following command on both VIOs and verify that it shows up on both of them.

[padmin@odc-vio1:/home/padmin] lsmap -vadapter vhost0 | grep -i plsrvtrmtrkvgd4
VTD                          plsrvtrmtrkvgd4

* * * NOW, as a FINAL STEP, log in to the LPAR server "plussrv", scan for the new disk, configure it and then add it to the VG.

[root@plussrv:/] cfgmgr

[root@plussrv:/] lspv | egrep "00c5dbb6c39d6d74"         >> After scanning for new devices, search for the disk by the PVID that was assigned to it earlier with # chdev on both VIOs.
hdisk162        00c5dbb6c39d6d74            None

* * * As the disk is now mapped to the LPAR server "plussrv" from both VIO servers, we need to add "hdisk162" to the VG "trmtrkvg" on "plussrv" as per the requirement. Before adding the disk to that VG, check the attributes of the disk in case anything needs to be changed.

[root@plussrv:/] lsattr -El hdisk162
PCM             PCM/friend/vscsi                 Path Control Module        False
algorithm       fail_over                        Algorithm                  True
hcheck_cmd      test_unit_rdy                    Health Check Command       True
hcheck_interval 0                                Health Check Interval      True
hcheck_mode     nonactive                        Health Check Mode          True
max_transfer    0x40000                          Maximum TRANSFER Size      True
pvid            00c5dbb6c39d6d740000000000000000 Physical volume identifier False
queue_depth     3                                Queue DEPTH                True
reserve_policy  no_reserve                       Reserve Policy             True

[root@plussrv:/] lspv | grep -i trmtrkvg         – Before adding the disk, the VG has 3 disks in it; also note the "FREE PPs" count.
hdisk118        003d081c28f46bf9                    trmtrkvg        active
hdisk136        003d081c2906a6ff                    trmtrkvg        active
hdisk153        003d081c07387d25                  trmtrkvg        active

[root@plussrv:/] lsvg trmtrkvg | grep -i “FREE PPs”
MAX LVs:            256           FREE PPs:       396 (50688 megabytes)

[root@plussrv:/] extendvg trmtrkvg hdisk162      – Add "hdisk162" to the "trmtrkvg" VG. Now check the number of disks and the "FREE PPs" count on this VG; both should increase.

[root@plussrv:/] lspv | grep -i trmtrkvg
hdisk118        003d081c28f46bf9                    trmtrkvg        active
hdisk136        003d081c2906a6ff                    trmtrkvg        active
hdisk153        003d081c07387d25                  trmtrkvg        active
hdisk162        00c5dbb6c39d6d74                  trmtrkvg        active

[root@plussrv:/] lsvg trmtrkvg | grep -i “FREE PPs”
MAX LVs:            256                      FREE PPs:       1195 (152960 megabytes)

Destroying a Volume group in Linux LVM :


* * This was tested and works under RHEL4.

[root@eglts04 ~]# vgs
  VG     #PV #LV #SN Attr   VSize   VFree
  datavg  15   0   0 wz–n- 359.77G 359.77G

[root@eglts04 ~]# df -kh
Filesystem            Size  Used Avail Use% Mounted on
/dev/cciss/c0d0p3      36G   22G   13G  64% /
/dev/cciss/c0d0p1      99M   15M   80M  16% /boot
– – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – 
/dev/mapper/datavg-lvdata1
                      355G  113G  225G  34% /data

[root@eglts04 ~]# umount /data

[root@eglts04 ~]# vgdisplay /dev/mapper/datavg
  — Volume group —
  VG Name               datavg
  System ID
  Format                lvm2
  Metadata Areas        15
  Metadata Sequence No  2
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  “Cur LV                1”
  Open LV               0
  Max PV                0
  Cur PV                15
  Act PV                15
  VG Size               359.77 GB
  PE Size               4.00 MB
  Total PE              92100
  Alloc PE / Size       92100 / 359.77 GB
  Free  PE / Size       0 / 0
  VG UUID               gTMba6-Aof6-ojKO-yaRd-mowh-q0UU-hhoU2l

[root@eglts04 ~]# lvchange -an /dev/mapper/datavg-lvdata1

[root@eglts04 ~]# lvremove /dev/mapper/datavg-lvdata1
Logical volume “lvdata1” successfully removed

[root@eglts04 ~]# vgdisplay /dev/mapper/datavg
  — Volume group —
  VG Name               datavg
  System ID
  Format                lvm2
  Metadata Areas        15
  Metadata Sequence No  3
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
 “Cur LV                 0”
  Open LV               0
  Max PV                0
  Cur PV                15
  Act PV                15
  VG Size               359.77 GB
  PE Size               4.00 MB
  Total PE              92100
  Alloc PE / Size       0 / 0
  Free  PE / Size       92100 / 359.77 GB
  VG UUID               gTMba6-Aof6-ojKO-yaRd-mowh-q0UU-hhoU2l

[root@eglts04 ~]# vgremove /dev/mapper/datavg
  Volume group “datavg” successfully removed

[root@eglts04 ~]# pvs                    * Use # pvdisplay on versions earlier than LVM2.
  PV           VG   Fmt  Attr PSize  PFree
  /dev/sddlmaa      lvm2 —   23.99G 23.99G
  /dev/sddlmab      lvm2 —   23.99G 23.99G
  /dev/sddlmac      lvm2 —   23.99G 23.99G
  /dev/sddlmad      lvm2 —   23.99G 23.99G
  /dev/sddlmae      lvm2 —   23.99G 23.99G
  /dev/sddlmaf      lvm2 —   23.99G 23.99G
  /dev/sddlmag      lvm2 —   23.99G 23.99G
  /dev/sddlmah      lvm2 —   23.99G 23.99G
  /dev/sddlmai      lvm2 —   23.99G 23.99G
  /dev/sddlmaj      lvm2 —   23.99G 23.99G
  /dev/sddlmak      lvm2 —   23.99G 23.99G
  /dev/sddlmal      lvm2 —   23.99G 23.99G
  /dev/sddlmam      lvm2 —   23.99G 23.99G
  /dev/sddlman      lvm2 —   23.99G 23.99G
  /dev/sddlmao      lvm2 —   23.99G 23.99G

[root@eglts04 ~]# pvremove /dev/sddlmaa /dev/sddlmab /dev/sddlmac /dev/sddlmad /dev/sddlmae /dev/sddlmaf /dev/sddlmag /dev/sddlmah /dev/sddlmai /dev/sddlmaj /dev/sddlmak /dev/sddlmal /dev/sddlmam /dev/sddlman /dev/sddlmao

Labels on physical volume “/dev/sddlmaa” successfully wiped
Labels on physical volume “/dev/sddlmab” successfully wiped
Labels on physical volume “/dev/sddlmac” successfully wiped
Labels on physical volume “/dev/sddlmad” successfully wiped
Labels on physical volume “/dev/sddlmae” successfully wiped
Labels on physical volume “/dev/sddlmaf” successfully wiped
Labels on physical volume “/dev/sddlmag” successfully wiped
Labels on physical volume “/dev/sddlmah” successfully wiped
Labels on physical volume “/dev/sddlmai” successfully wiped
Labels on physical volume “/dev/sddlmaj” successfully wiped
Labels on physical volume “/dev/sddlmak” successfully wiped
Labels on physical volume “/dev/sddlmal” successfully wiped
Labels on physical volume “/dev/sddlmam” successfully wiped
Labels on physical volume “/dev/sddlman” successfully wiped
Labels on physical volume “/dev/sddlmao” successfully wiped

[root@eglts04 ~]# vi /etc/fstab           # comment / remove the entry of Volume group from it.

[root@eglts04 ~]# cat /etc/fstab
# This file is edited by fstab-sync – see ‘man fstab-sync’ for details
LABEL=/                 /                       ext3    defaults        1 1
LABEL=/boot             /boot                   ext3    defaults        1 2
#/dev/datavg/lvdata1     /data                   ext3    defaults        1 2
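
For reference, the whole teardown condenses to the following sequence (a sketch using the same VG / LV / mount and device names as above):

umount /data
lvchange -an /dev/mapper/datavg-lvdata1       # deactivate the LV
lvremove /dev/mapper/datavg-lvdata1           # remove the LV
vgremove datavg                               # remove the now-empty VG
pvremove /dev/sddlma[a-o]                     # wipe the LVM labels from the 15 PVs
vi /etc/fstab                                 # comment out / remove the old mount entry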

Releasing SAN disks / LUNs from VxVM for decommissioning of a SAN array / server.


** This procedure worked and tested in Solaris 8 **

–>df -k
Filesystem            kbytes    used   avail capacity  Mounted on
/dev/vx/dsk/rootvol  19778327 12677586 6902958    65%    /
/proc                      0       0       0     0%    /proc
mnttab                     0       0       0     0%    /etc/mnttab
swap                 43897512      24 43897488     1%    /var/run
swap                 44032640  135152 43897488     1%    /tmp
/dev/vx/dsk/dga/vol01
                     104847360 73677500 29221790    72%    /work
/dev/vx/dsk/dga/vol02
                     585604530 340308079 229980692    60%    /u01

* unmount the above TWO filesystems that are from SAN.

–>vxvol -g dga stopall        * Stop all the “volumes” which have luns from SAN.
vxvm:vxvol: ERROR: Volume swapvol2 is currently open or mounted

–>swap -l                              * List all the swap devices.    
swapfile             dev  swaplo blocks   free
/dev/vx/dsk/swapvol 256,5      16 30935392 30935392
/dev/vx/dsk/dga/swapvol2 256,35002     16 31451120 31451120

–>swap -d /dev/vx/dsk/dga/swapvol2     * removing it from SWAP configuration.
–>vxvol -g dga stopall                

–>vxprint -vptg dga             * Lists all ” Volume & Plex ” Information.
V  NAME         RVG          KSTATE   STATE    LENGTH   READPOL   PREFPLEX UTYPE
PL NAME         VOLUME       KSTATE   STATE    LENGTH   LAYOUT    NCOL/WID MODE

pl swapvol2-02  swapvol2     DISABLED CLEAN    31451520 CONCAT    –        RW
pl vol01-02     vol01        DISABLED CLEAN    209695860 CONCAT   –        RW
pl vol02-02     vol02        DISABLED CLEAN    1171212240 CONCAT  –        RW
v  swapvol2     –            DISABLED CLEAN    31451136 SELECT    –        fsgen
v  vol01        –            DISABLED CLEAN    209694720 SELECT   –        fsgen
v  vol02        –            DISABLED CLEAN    1171209060 SELECT  –        fsgen

                                                  * removing all volumes from dg
–>vxedit -g dga -rf rm swapvol2
–>vxedit -g dga -rf rm vol01
–>vxedit -g dga -rf rm vol02

                                                   * removing all the disks from dg
–>vxdg -g dga rmdisk Hbuilding-01
–>vxdg -g dga rmdisk Hbuilding-02
–>vxdg -g dga rmdisk Hbuilding-03
–>vxdg -g dga rmdisk Hbuilding-04
–>vxdg -g dga rmdisk Hbuilding-05
–>vxdg -g dga rmdisk Hbuilding-06
–>vxdg -g dga rmdisk Hbuilding-07
–>vxdg -g dga rmdisk Hbuilding-08
vxvm:vxdg: ERROR: disassociating disk-media Hbuilding-08:
        Cannot remove last disk in disk group

–>vxdisk -e list                         * disk information
DEVICE       TYPE      DISK          GROUP     STATUS     c#t#d#_NAME
c1t0d0s2     sliced    rootdisk      rootdg    online     c1t0d0s2
c1t1d0s2     sliced    rootmir       rootdg    online     c1t1d0s2
c3t0d0s2     sliced    –             –         online     c3t0d0s2
c3t0d1s2     sliced    –             –         online     c3t0d1s2
c3t0d2s2     sliced    –             –         online     c3t0d2s2
c3t0d3s2     sliced    Hbuilding-08  dga       online     c3t0d3s2
c3t0d4s2     sliced    –             –         online     c3t0d4s2
c3t0d5s2     sliced    –             –         online     c3t0d5s2
c3t0d6s2     sliced    –             –         online     c3t0d6s2
c3t0d7s2     sliced    –             –         online     c3t0d7s2

–>vxdg destroy dga                 * To remove the last disk from the dg, we use this.

–>vxdisk -e list                          * Now the last disk is also removed.
DEVICE       TYPE      DISK          GROUP     STATUS     c#t#d#_NAME
c1t0d0s2     sliced    rootdisk      rootdg    online     c1t0d0s2
c1t1d0s2     sliced    rootmir       rootdg    online     c1t1d0s2
c3t0d0s2     sliced    –             –         online     c3t0d0s2
c3t0d1s2     sliced    –             –         online     c3t0d1s2
c3t0d2s2     sliced    –             –         online     c3t0d2s2
c3t0d3s2     sliced    –             –         online     c3t0d3s2
c3t0d4s2     sliced    –             –         online     c3t0d4s2
c3t0d5s2     sliced    –             –         online     c3t0d5s2
c3t0d6s2     sliced    –             –         online     c3t0d6s2
c3t0d7s2     sliced    –             –         online     c3t0d7s2

* removing all the disks from veritas control
–>vxdisk rm c3t0d0s2 c3t0d1s2 c3t0d2s2 c3t0d3s2 c3t0d4s2 c3t0d5s2 c3t0d6s2 c3t0d7s2

–>vxdisk -e list                          * It no longer shows the SAN disks under Veritas control.
DEVICE       TYPE      DISK         GROUP        STATUS       c#t#d#_NAME
c1t0d0s2     sliced    rootdisk      rootdg       online       c1t0d0s2
c1t1d0s2     sliced    rootmir       rootdg       online       c1t1d0s2
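
* For reference, the whole release condenses to this sequence (the disk group, volume and device names are the ones from this example; one vxedit per volume, as above, works just as well):

–>swap -d /dev/vx/dsk/dga/swapvol2            * remove the VxVM swap device from swap first
–>vxvol -g dga stopall                        * stop every volume in the dg
–>vxedit -g dga -rf rm swapvol2 vol01 vol02   * remove the volumes (and their plexes)
–>vxdg destroy dga                            * removes the remaining disk(s) and the dg itself
–>vxdisk rm c3t0d0s2 c3t0d1s2 c3t0d2s2 c3t0d3s2 c3t0d4s2 c3t0d5s2 c3t0d6s2 c3t0d7s2   * drop the LUNs from VxVM control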

Freezing and Unfreezing Service Groups in a Veritas Cluster (VCS)


og-dcc01a # hastatus -summ
                                                – The above command gives many details about the cluster running on this group of servers. The information that can be read from it includes the following.

     * If you see "cvm" (Cluster Volume Manager) in the output, the cluster type is a "parallel cluster", not a fail-over one.
     * You can see the number of hosts that are part of the cluster under the "SYSTEM STATE" section.
     * It shows the service group names and related information. (Everything other than "ClusterService" and "cvm" is a service group that we generally work with.)

og-dcc01a # haconf -makerw     – Before freezing service groups, we should make VCS cluster config RW.
og-dcc01a # hagrp -freeze cmddsdb_grp -persistent  
og-dcc01a # haconf -dump -makero    – make “VCS Config” RO, once freezing completes.

* -persistent makes the freeze / unfreeze survive a cluster restart. The config must be "RW" to use this option.
* In the output below you can see that the service group "cmddsdb_grp" now appears under "GROUPS FROZEN".

og-dcc01a # hastatus -summ

— SYSTEM STATE
— System               State                Frozen

A  og-dcc01a      RUNNING              0
A  og-dcc01b      RUNNING              0

— GROUP STATE
— Group                  System           Probed     AutoDisabled    State

B  ClusterService  og-dcc01a           Y               N               OFFLINE
B  ClusterService  og-dcc01b           Y               N               ONLINE
B  cmddsdb_grp    og-dcc01a           Y               N               ONLINE
B  cmddsdb_grp    og-dcc01b           Y               N               ONLINE
B  cvm                   og-dcc01a           Y               N               ONLINE
B  cvm                   og-dcc01b           Y               N               ONLINE

— GROUPS FROZEN
— Group

C  cmddsdb_grp

— RESOURCES DISABLED
— Group                      Type                Resource

H  cmddsdb_grp     CFSMount        cmddsdb_mnt1
H  cmddsdb_grp     CFSMount        cmddsdb_mnt2
H  cmddsdb_grp     CFSMount        cmddsdb_mnt3
H  cmddsdb_grp     CVMVolDg      cmddsdb_voldg1
H  cmddsdb_grp     CVMVolDg      cmddsdb_voldg2
H  cmddsdb_grp     CVMVolDg      cmddsdb_voldg3
H  cmddsdb_grp     Oracle               cmddsdb

                * As per the above output, this is a 2-node "parallel cluster"; we froze the service group on both systems by logging in to them at the same time.
                * Once the maintenance activity is complete, unfreeze the service group(s) as follows.

og-dcc01a # haconf -makerw     – 1st make VCS Config RW.
og-dcc01a # hagrp -unfreeze cmddsdb_grp -persistent    – Unfreeze the service group(s).
og-dcc01a # haconf -dump -makero      – Once unfreeze is done, make VCS config RO.

## You get the following error if you try to unfreeze a persistently frozen SG without giving "-persistent".
og-dcc01a # hagrp -unfreeze cmddsdb_grp      
VCS WARNING V-16-1-40201 Group is not temporarily frozen
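
* For reference, the freeze and unfreeze flavours must match; a quick sketch using the same group name (the temporary form does not need the config in RW mode):

og-dcc01a # hagrp -freeze cmddsdb_grp                    – temporary; cleared with a plain unfreeze (or a VCS restart)
og-dcc01a # hagrp -unfreeze cmddsdb_grp

og-dcc01a # haconf -makerw
og-dcc01a # hagrp -freeze cmddsdb_grp -persistent        – survives node reboots; needs the config RW
og-dcc01a # hagrp -unfreeze cmddsdb_grp -persistent
og-dcc01a # haconf -dump -makero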

* Now, in the cluster summary, the "GROUPS FROZEN" section is gone, which means the SG "cmddsdb_grp"
   is no longer frozen. We also see that the same SG is "ONLINE" on both nodes of the cluster, because it
   is a "parallel cluster" as mentioned earlier.
* These steps were executed on "og-dcc01a"; the same steps were also executed at the same time on
   "og-dcc01b".

og-dcc01a # hastatus -summ

— SYSTEM STATE
— System               State                Frozen

A  og-dcc01a      RUNNING              0
A  og-dcc01b      RUNNING              0

— GROUP STATE
— Group                  System           Probed     AutoDisabled    State

B  ClusterService  og-dcc01a         Y               N               OFFLINE
B  ClusterService  og-dcc01b         Y               N               ONLINE
B  cmddsdb_grp    og-dcc01a         Y               N               ONLINE
B  cmddsdb_grp    og-dcc01b         Y               N               ONLINE
B  cvm                   og-dcc01a         Y               N               ONLINE
B  cvm                   og-dcc01b         Y               N               ONLINE

* Parallel    :   Service groups run on all nodes of the cluster at the same time.
* Fail-over  :   SGs run on one node at a time and are offline on the rest.

Finding the UID of LUN in Linux.


* The UID of a LUN (Logical Unit Number) is different from the LUN ID, which is generally in hexadecimal format, for example: 1D79

* The following procedure is tested on ” Red Hat Enterprise Linux AS release 4 (Nahant Update 5) “

* The server where I worked out this procedure does not have " # multipath / # dlnkmgr / # inq.linux " or any other third-party tool installed that could show us the " LUN ID " of the disks assigned from storage.

* The LUNs are assigned from a " Hitachi Storage Array ". The server also has " VCS " & " VxVM " installed on it.

# ls -l /dev/sdp
brw-rw—-  1 root disk 8, 240 Jul 19  2013 /dev/sdp            – This is one of the LUN’s from EMC Array.

# /sbin/scsi_id -g -s /block/sdp                                            – This command shows UID of LUN from EMC Array
360060e80057226000000722600000212

* The location " /block " does not physically exist on the Linux server; the command just expects the path in that form (it is interpreted relative to sysfs).
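
* If you need the UID for several disks at once, a small loop works (a sketch assuming the same RHEL4-era scsi_id syntax; newer releases replaced the "-s /block/<dev>" form, see scsi_id(8)):

for d in /sys/block/sd*; do
    dev=$(basename "$d")
    echo "$dev : $(/sbin/scsi_id -g -s /block/$dev)"
done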

* As a sysadmin, if you also have access to the storage array, you can confirm that UID with the following command, provided by the " EMC Unisphere / Navisphere " host agent installed on your Linux machine.

* The leading "3" in the UID shown by scsi_id is to be skipped / is not part of the LUN UID here. The LUN above is the 17th in the series presented to my system / server, hence "getlun 17" below.

There is a lot more that can be done with # navicli; refer here for more info on it –  http://goo.gl/4b41db

# navicli -h getlun 17 -uid
UID: 60:06:0E:80:05:72:26:00:00:00:72:26:00:00:02:12
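
* To compare the two representations mechanically, drop the leading NAA-type digit "3" from the scsi_id output and re-insert the colons; a sketch using standard tools (sdp is the example device from above):

/sbin/scsi_id -g -s /block/sdp | cut -c2- | sed 's/../&:/g; s/:$//' | tr '[:lower:]' '[:upper:]'      – yields the colon-separated UID that navicli shows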

Note :  If anybody who finds this post knows a method to find a " LUN ID / Number " without having to use " # multipath / # dlnkmgr / # inq.linux " or other 3rd-party utilities on a Linux system, please share it via the comments section. Thanks in advance.

Replace a failing / failed HBA on AIX Server.


* * This server is dual-pathed: it has 2 HBAs, fcs0 and fcs1, and the failed HBA here is fcs1.

^^ Make sure the failed HBA really is fcs1; # errpt will also give you a clue about it.

# lspath
Enabled hdisk0  scsi0
Missing hdisk1  scsi0
Enabled hdisk2  scsi0
Enabled hdisk3  fscsi0
Enabled hdisk4  fscsi0
Enabled hdisk5  fscsi0
Enabled hdisk6  fscsi0
Missing hdisk3  fscsi1
Missing hdisk4  fscsi1
Missing hdisk5  fscsi1
Missing hdisk6  fscsi1
Enabled hdisk7  fscsi0
Enabled hdisk8  fscsi0
Enabled hdisk9  fscsi0
Enabled hdisk10 fscsi0
Enabled hdisk11 fscsi0
Missing hdisk7  fscsi1
Missing hdisk8  fscsi1
Missing hdisk9  fscsi1
Missing hdisk10 fscsi1
Missing hdisk11 fscsi1
Enabled hdisk12 fscsi0
Missing hdisk12 fscsi1

^^ Take downtime from the application team and arrange for replacement of the failed HBA if it is not hot-swappable; here the server model is "IBM eServer pSeries 615", so it is not hot-pluggable.
^^ Find the parent PCI device of fcs1
 # lsdev -C -l fcs1 -F parent
pci11

^^ Changing parameters of “FCS1 / FSCSI1”
# chdev -a dyntrk=yes -l fscsi1
fscsi1 changed

# chdev -a fc_err_recov=fast_fail -l fscsi1
fscsi1 changed

^^ Once the failed HBA is replaced and the server is up, find the new HBA's WWPN and give it to the storage team for zoning / mapping of the LUNs to it. You can find the WWPN with the command below; the WWPN is what storage generally uses for mapping / masking / zoning LUNs to an HBA / FC card.

# lscfg -vl fcs1
 fcs1             U0.1-P1-I4/Q1  FC Adapter

        Part Number……………..00P4295
        EC Level………………..A
        Serial Number……………1D3310C353
        Manufacturer…………….001D
        Feature Code/Marketing ID…5704
        FRU Number………………     00P4297
        Device Specific.(ZM)……..3
        Network Address………….10000000C935A988  [ WWPN Number ]
        ROS Level and ID…………02E01991
        Device Specific.(Z0)……..2003806D
        Device Specific.(Z1)……..00000000
        Device Specific.(Z2)……..00000000
        Device Specific.(Z3)……..03000909
        Device Specific.(Z4)……..FF601416
        Device Specific.(Z5)……..02E01991
        Device Specific.(Z6)……..06631991
        Device Specific.(Z7)……..07631991
        Device Specific.(Z8)……..20000000C935A988  [ WWNN Number ]
        Device Specific.(Z9)……..HS1.92A1
        Device Specific.(ZA)……..H1D1.92A1
        Device Specific.(ZB)……..H2D1.92A1
        Device Specific.(YL)……..U0.1-P1-I4/Q1
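
^^ Since the storage team only needs the long hex strings, it is handy to pull them out directly; a small sketch (on AIX the "Network Address" field of lscfg is the port name / WWPN, and Z8 is the node name / WWNN):

# lscfg -vl fcs1 | grep "Network Address"        – WWPN, the value storage zones against
# lscfg -vl fcs1 | grep "(Z8)"                   – WWNN / node name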

^^ Now remove the protocol device and its child devices from the device tree / ODM definition (and hence from the server) with either of the two commands below. This step can be completed before the replacement HBA is installed (the downtime for the physical replacement comes later); in fact, doing it before the HBA replacement is the better approach.

# rmdev -dl fscsi1 -R
fscsi1 deleted

# rmdev -dl fcs1 -R
fcnet1 deleted
fcs1 deleted

^^ This single command accomplishes the same as the two commands above.

# rmdev -Rdl fcs1
fscsi1 deleted
fcnet1 deleted
fcs1 deleted

### Now, if you look at " #lspath / #lsdev -Cc disk ", only the disks from a single path are shown and all the paths that were showing as "Missing" are gone. The 3rd-party multipath software (Hitachi HDLM) likewise shows the disks with a single path only.

# lsdev -Cc disk
hdisk0  Available 1S-08-00-5,0 16 Bit LVD SCSI Disk Drive
hdisk1  Defined   1S-08-00-8,0 16 Bit LVD SCSI Disk Drive
hdisk2  Available 1S-08-00-8,0 16 Bit LVD SCSI Disk Drive
hdisk3  Available 1Z-08-01     Hitachi Disk Array (Fibre)
hdisk4  Available 1Z-08-01     Hitachi Disk Array (Fibre)
hdisk5  Available 1Z-08-01     Hitachi Disk Array (Fibre)
hdisk6  Available 1Z-08-01     Hitachi Disk Array (Fibre)
hdisk7  Available 1Z-08-01     Hitachi Disk Array (Fibre)
hdisk8  Available 1Z-08-01     Hitachi Disk Array (Fibre)
hdisk9  Available 1Z-08-01     Hitachi Disk Array (Fibre)
hdisk10 Available 1Z-08-01     Hitachi Disk Array (Fibre)
hdisk11 Available 1Z-08-01     Hitachi Disk Array (Fibre)
hdisk12 Available 1Z-08-01     Hitachi Disk Array (Fibre)

# lspath
Enabled hdisk0  scsi0
Missing hdisk1  scsi0    – This is onboard / local disk inside server itself.
Enabled hdisk2  scsi0
Enabled hdisk3  fscsi0
Enabled hdisk4  fscsi0
Enabled hdisk5  fscsi0
Enabled hdisk6  fscsi0
Enabled hdisk7  fscsi0
Enabled hdisk8  fscsi0
Enabled hdisk9  fscsi0
Enabled hdisk10 fscsi0
Enabled hdisk11 fscsi0
Enabled hdisk12 fscsi0

/usr/DynamicLinkManager/bin:#  ./dlnkmgr view -lu
Product       : USP_V
SerialNumber  : 0029222
LUs           : 10

iLU    HDevName OSPathID PathID Status
000296 hdisk12  00000    000008 Online
00034D hdisk3   00000    000002 Online
00034F hdisk4   00000    000005 Online
000350 hdisk5   00000    000007 Online
00035B hdisk6   00000    000001 Online
0005F1 hdisk7   00000    000004 Online
0005F2 hdisk8   00000    000003 Online
0005F3 hdisk9   00000    000009 Online
0005F4 hdisk10  00000    000006 Online
0005F5 hdisk11  00000    000000 Online

^^ Now, once storage confirms that they have completed the LUN mapping for the newly provided WWPN of the replaced HBA, do the following.

# cfgmgr             # Scans device tree.

^^ Once scanning is complete, check the " #lspath " and " #dlnkmgr view -lu " outputs to confirm that the second set of paths shows up and you have dual paths again.
[root@dsearch2:/usr/DynamicLinkManager/bin] lspath
Enabled hdisk0  scsi0
Missing hdisk1  scsi0
Enabled hdisk2  scsi0
Enabled hdisk3  fscsi0
Enabled hdisk4  fscsi0
Enabled hdisk5  fscsi0
Enabled hdisk6  fscsi0
Enabled hdisk3  fscsi1
Enabled hdisk4  fscsi1
Enabled hdisk5  fscsi1
Enabled hdisk6  fscsi1
Enabled hdisk7  fscsi0
Enabled hdisk8  fscsi0
Enabled hdisk9  fscsi0
Enabled hdisk10 fscsi0
Enabled hdisk11 fscsi0
Enabled hdisk7  fscsi1
Enabled hdisk8  fscsi1
Enabled hdisk9  fscsi1
Enabled hdisk10 fscsi1
Enabled hdisk11 fscsi1
Enabled hdisk12 fscsi0
Enabled hdisk12 fscsi1

/usr/DynamicLinkManager/bin:# ./dlnkmgr view -lu
Product       : USP_V
SerialNumber  : 0029222
LUs           : 10

iLU    HDevName OSPathID PathID Status
000296 hdisk12  00000    000008 Online
                00001    000015 Online
00034D hdisk3   00000    000002 Online
                00001    000014 Online
00034F hdisk4   00000    000005 Online
                00001    000012 Online
000350 hdisk5   00000    000007 Online
                00001    000016 Online
00035B hdisk6   00000    000001 Online
                00001    000019 Online
0005F1 hdisk7   00000    000004 Online
                00001    000011 Online
0005F2 hdisk8   00000    000003 Online
                00001    000013 Online
0005F3 hdisk9   00000    000009 Online
                00001    000018 Online
0005F4 hdisk10  00000    000006 Online
                00001    000017 Online
0005F5 hdisk11  00000    000000 Online
                00001    000010 Online
KAPL01001-I The HDLM command completed normally. Operation name = view, completion time = 2014/04/21 03:04:53

NOTES on a few things :

> > Dynamic tracking of the FC adapter driver detects when the Fibre Channel N_Port ID of a device changes. The FC adapter driver then reroutes traffic destined for that device to the new address while the devices are still online. Events that can cause an N_Port ID to change include:

  • Moving a cable between a switch and storage device from one switch port to another.
  • Connecting two separate switches by using an inter-switch link (ISL).
  • Rebooting a switch.

> > When dynamic tracking is disabled, there is a marked difference between the delayed_fail and fast_fail settings of the fc_err_recov attribute. With dynamic tracking enabled, however, the setting of the fc_err_recov attribute is less significant, because there is some overlap between the dynamic tracking and fast-fail error-recovery policies; enabling dynamic tracking inherently enables some of the fast-fail logic. The general error-recovery procedure when a device is no longer reachable on the fabric is the same for both fc_err_recov settings with dynamic tracking enabled. The minor difference is that the storage drivers can choose to inject delays between I/O retries if fc_err_recov is set to delayed_fail; this increases the I/O failure time by an additional amount, depending on the delay value and number of retries, before the I/O is permanently failed. With high I/O traffic, the difference between delayed_fail and fast_fail might be more noticeable.
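
^^ The two attributes discussed above can be checked and set together; a short sketch (the -P flag defers the change until the next boot / device reconfiguration, which matters if the adapter still has devices open):

# lsattr -El fscsi1 -a dyntrk -a fc_err_recov                    – show current values
# chdev -l fscsi1 -a dyntrk=yes -a fc_err_recov=fast_fail -P
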
^^ This procedure was tested and worked well on an AIX 5.2 server.
# uname -a
AIX   unixmemis1   2  5   00574ACE4D01