ZFS over iSCSI over ZFS

Jun 11, 2009 19:26

Some stuff I was hacking away on at work today; this is mostly a note to self:

The tl;dr version:
  1. double-check commands you type when playing with disks
  2. triple-check commands you type when playing with disks
  3. see #1 and #2
  4. the way to "refresh" the list of iSCSI targets on the initiator is `iscsiadm modify discovery -t enable` (yes, I already had it enabled; it works anyway)
  5. it's `zpool iostat [pool] [interval]`, not `zpool iostat [interval] [pool]`
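In command form, since future me will want to paste these (the pool name is trump, the one from later in this post):

xitomatl:/# iscsiadm modify discovery -t enable   # refreshes the target list (see #4)
xitomatl:/# zpool iostat trump 30                 # pool first, then interval (see #5)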


So...I set out to find out what happens when we resize a disk that's shared out over iSCSI.

And, just to reassure myself that this will work with real data, I copied over some D&D stuff to xitomatl:

> time find dnd -type f -exec cat {} \; | md5sum
b5f1fc52d09baaf9f3db34408ce9c184 -

real 4m6.008s

I started with a mirror of 2 disks, cleverly named foo and bar, both 10G. Resizing bar to 15G caused the initiator to disconnect, and the pool faulted. Then I removed the faulted disk and added the new one in, and voilà! 25G of space instead of my 9G mirror.
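For my own future reference, that sequence was roughly the following. This is reconstructed rather than pasted, and <bar-dev> stands in for bar's long c1t...d0 device name:

caerbannog:~# zfs set volsize=15G idle/robin/bar   # initiator drops the session; pool faults
xitomatl:/# zpool detach trump <bar-dev>           # drop the faulted half of the mirror
xitomatl:/# zpool add trump <bar-dev>              # whoops: adds it as a second top-level vdev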

...crap; I wanted to mirror that. Backing out changes is a PITA, so I'll create a new disk (cleverly named zot) to be a mirror of foo during the upgrade to 15G.

Before adding zot (I've trimmed the targets I'm not playing with from the listings):

caerbannog:~# iscsitadm list target
Target: idle/robin/foo
    iSCSI Name: iqn.1986-03.com.sun:02:0fb34688-bd5c-6f54-ba25-ca3d972187f6
    Connections: 1
Target: idle/robin/bar
    iSCSI Name: iqn.1986-03.com.sun:02:2e83e95d-3841-44d7-c97c-c8de9fe7629c
    Connections: 1

xitomatl:/# iscsiadm list target
Target: iqn.1986-03.com.sun:02:2e83e95d-3841-44d7-c97c-c8de9fe7629c
    Alias: idle/robin/bar
    TPGT: 1
    ISID: 4000002a0000
    Connections: 0
Target: iqn.1986-03.com.sun:02:0fb34688-bd5c-6f54-ba25-ca3d972187f6
    Alias: idle/robin/foo
    TPGT: 1
    ISID: 4000002a0000
    Connections: 1

Um...this is odd: bar shows 0 connections, even though it's part of my zpool on xitomatl.
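Next time I'll tie the IQN to a device before scratching my head; `iscsiadm list target -S` prints the SCSI-level details for a target, including the OS Device Name, which maps it back to a c1t...d0 device:

xitomatl:/# iscsiadm list target -S iqn.1986-03.com.sun:02:2e83e95d-3841-44d7-c97c-c8de9fe7629c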

And then I added the new ZFS iSCSI target on caerbannog.
mkiscsitarget.sh is a wrapper around all the steps (a sketch of it follows the list):
  1. zfs create -s -V ${SIZE} ${TARGET}
  2. (optional) iscsitadm modify target -l ${INITIATOR} ${TARGET}
  3. (optional) iscsitadm modify target -p ${TPGT} ${TARGET}
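A minimal sketch of what that wrapper looks like, assuming positional arguments matching the invocation below; this is a reconstruction, not the actual script, and the shareiscsi step is the same one used later in this post:

#!/bin/sh
# mkiscsitarget.sh TARGET SIZE [INITIATOR [TPGT]] -- reconstructed sketch
TARGET=$1; SIZE=$2; INITIATOR=$3; TPGT=$4

zfs create -s -V "${SIZE}" "${TARGET}"    # sparse zvol of the requested size
zfs set shareiscsi=on "${TARGET}"         # iscsitgt publishes it as a target
[ -n "${INITIATOR}" ] && iscsitadm modify target -l "${INITIATOR}" "${TARGET}"  # ACL
[ -n "${TPGT}" ] && iscsitadm modify target -p "${TPGT}" "${TARGET}"            # portal group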


caerbannog:~# ./mkiscsitarget.sh idle/robin/zot 10G xitomatl 208
caerbannog:~# iscsitadm list target
Target: idle/robin/foo
    iSCSI Name: iqn.1986-03.com.sun:02:0fb34688-bd5c-6f54-ba25-ca3d972187f6
    Connections: 1
Target: idle/robin/bar
    iSCSI Name: iqn.1986-03.com.sun:02:2e83e95d-3841-44d7-c97c-c8de9fe7629c
    Connections: 1
Target: idle/robin/zot
    iSCSI Name: iqn.1986-03.com.sun:02:e0fe1c0d-634e-e96e-f8ad-dbc3c304a6e5
    Connections: 1

xitomatl:/# iscsiadm modify discovery -t enable
xitomatl:/# iscsiadm list target
Target: iqn.1986-03.com.sun:02:0fb34688-bd5c-6f54-ba25-ca3d972187f6
    Alias: idle/robin/foo
    TPGT: 208
    ISID: 4000002a0000
    Connections: 1
Target: iqn.1986-03.com.sun:02:2e83e95d-3841-44d7-c97c-c8de9fe7629c
    Alias: idle/robin/bar
    TPGT: 208
    ISID: 4000002a0000
    Connections: 1
Target: iqn.1986-03.com.sun:02:e0fe1c0d-634e-e96e-f8ad-dbc3c304a6e5
    Alias: idle/robin/zot
    TPGT: 208
    ISID: 4000002a0000
    Connections: 1
Target: iqn.1986-03.com.sun:02:2e83e95d-3841-44d7-c97c-c8de9fe7629c
    Alias: idle/robin/bar
    TPGT: 1
    ISID: 4000002a0000
    Connections: 0
Target: iqn.1986-03.com.sun:02:0fb34688-bd5c-6f54-ba25-ca3d972187f6
    Alias: idle/robin/foo
    TPGT: 1
    ISID: 4000002a0000
    Connections: 0
xitomatl:/# format
Searching for disks...done

c1t010000144FF2985500002A004A31B8A6d0: configured with capacity of 10.00GB

AVAILABLE DISK SELECTIONS:
       0. c0t0d0
          /pci@1f,0/ide@d/dad@0,0
       1. c1t010000144FF2985500002A004A31A1F7d0
          /scsi_vhci/ssd@g010000144ff2985500002a004a31a1f7
       2. c1t010000144FF2985500002A004A31B8A6d0
          /scsi_vhci/ssd@g010000144ff2985500002a004a31b8a6
       3. c1t010000144FF2985500002A004A304D18d0
          /scsi_vhci/ssd@g010000144ff2985500002a004a304d18
Specify disk (enter its number): ^C

An aside: Ben Rockwood used ^D to get out of format in an article I ran across. Since that works, it would stand to reason that `format <&-` would work in bash; so much for reason.
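The reason, I suspect: ^D sends EOF on a still-open descriptor, while `<&-` closes fd 0 outright, so reads fail with EBADF instead of returning EOF. Quick illustration:

$ cat <&-           # stdin closed: read(2) fails with EBADF ("Bad file descriptor")
$ cat </dev/null    # stdin open but empty: immediate EOF, which is what ^D delivers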

Oh...and I changed the TPGT for zot, bar, and foo: 208 is the box's nge0; 1 is actually TPGT 0 on the targets, but it comes across to the initiator as 1; no idea why that is. Evidently, iscsiadm has decided the disks are different (despite matching GUIDs) because the TPGT entries differ.
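For completeness, the TPGT itself gets created and bound to an address before any target can reference it; roughly like this (a reconstruction, with a made-up address standing in for nge0's):

caerbannog:~# iscsitadm create tpgt 208
caerbannog:~# iscsitadm modify tpgt -i 192.168.0.208 208   # nge0's address, made up here
caerbannog:~# iscsitadm list tpgt -v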

That's just silly; I'll move them back so it sees its targets again.
Besides, my `zpool status` on xitomatl is hanging and unkillable, so maybe the disk is gone and Solaris hasn't quite figured it out yet (40min seek time? sure, that makes sense).

caerbannog:~# iscsitadm delete target -p 208 idle/robin/foo
caerbannog:~# iscsitadm delete target -p 208 idle/robin/bar

xitomatl:/# iscsiadm modify discovery -t enable
xitomatl:/# iscsiadm list target
Target: iqn.1986-03.com.sun:02:0fb34688-bd5c-6f54-ba25-ca3d972187f6
    Alias: idle/robin/foo
    TPGT: 208
    ISID: 4000002a0000
    Connections: 1
Target: iqn.1986-03.com.sun:02:2e83e95d-3841-44d7-c97c-c8de9fe7629c
    Alias: idle/robin/bar
    TPGT: 208
    ISID: 4000002a0000
    Connections: 0
Target: iqn.1986-03.com.sun:02:e0fe1c0d-634e-e96e-f8ad-dbc3c304a6e5
    Alias: idle/robin/zot
    TPGT: 208
    ISID: 4000002a0000
    Connections: 1
Target: iqn.1986-03.com.sun:02:2e83e95d-3841-44d7-c97c-c8de9fe7629c
    Alias: idle/robin/bar
    TPGT: 1
    ISID: 4000002a0000
    Connections: 1
Target: iqn.1986-03.com.sun:02:0fb34688-bd5c-6f54-ba25-ca3d972187f6
    Alias: idle/robin/foo
    TPGT: 1
    ISID: 4000002a0000
    Connections: 0

Wait...what?!? I tried moving foo back to 208, since that's where xitomatl says it's connected to the disk, but nothing changed. Time to turn off the iSCSI sharing entirely and see if ZFS can figure it out when things come back.

caerbannog:~# zfs set shareiscsi=off idle/robin/foo
caerbannog:~# zfs set shareiscsi=off idle/robin/bar
caerbannog:~# zfs set shareiscsi=off idle/robin/zot
caerbannog:~# iscsitadm list target
caerbannog:~#

xitomatl:/# iscsiadm modify discovery -t enable
xitomatl:/# iscsiadm list target
Target: iqn.1986-03.com.sun:02:0fb34688-bd5c-6f54-ba25-ca3d972187f6
    Alias: idle/robin/foo
    TPGT: 208
    ISID: 4000002a0000
    Connections: 0
Target: iqn.1986-03.com.sun:02:2e83e95d-3841-44d7-c97c-c8de9fe7629c
    Alias: idle/robin/bar
    TPGT: 208
    ISID: 4000002a0000
    Connections: 0
Target: iqn.1986-03.com.sun:02:e0fe1c0d-634e-e96e-f8ad-dbc3c304a6e5
    Alias: idle/robin/zot
    TPGT: 208
    ISID: 4000002a0000
    Connections: 0
Target: iqn.1986-03.com.sun:02:2e83e95d-3841-44d7-c97c-c8de9fe7629c
    Alias: idle/robin/bar
    TPGT: 1
    ISID: 4000002a0000
    Connections: 0
Target: iqn.1986-03.com.sun:02:0fb34688-bd5c-6f54-ba25-ca3d972187f6
    Alias: idle/robin/foo
    TPGT: 1
    ISID: 4000002a0000
    Connections: 0
xitomatl:/# iscsiadm modify discovery -t disable
iscsiadm: logical unit in use
iscsiadm: Unable to complete operation

Weird; and the file system is still mounted. But I can't `zpool status` (going on an hour now), so it shouldn't still work, right?

> ls dnd/
2e cerat
3.5e dX_skills.ods
4e dnd
...

Yes, I know I have a dnd/dnd; I really need to clean it up.

Okay...looks like things are still hosed; time to pull the plug on the processes:

xitomatl:/# kill 726 2531 17260
xitomatl:/# kill -9 726 2531 17260
xitomatl:/# kill -CONT 726 2531 17260
xitomatl:/# for i in `/pkgs/gnu/bin/seq 1 48`; do kill -$i 726 2531 17260; done
xitomatl:/#
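What I should have checked instead of spraying signals (hedging: this is the diagnosis I'd try next time, not something I ran): processes wedged in uninterruptible I/O inside the kernel ignore everything until the I/O returns, and their state shows it:

xitomatl:/# ps -o pid,s,wchan,args -p 726,2531,17260   # process state + kernel wait channel
xitomatl:/# pstack 2531                                # user stack; shows the syscall it's stuck in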

Now, the box is just mocking me...I think this is why the default is to panic in such a situation.
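(On builds that have it, this behavior is selectable per pool via the failmode property, with wait / continue / panic as the choices; a hung-but-technically-alive box is what wait looks like. Something to check next time:)

xitomatl:/# zpool get failmode trump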

Sharing all the stuff back out to xitomatl so it can come up "clean":

caerbannog:~# zfs set shareiscsi=on idle/robin/foo
caerbannog:~# iscsitadm modify target -l xitomatl idle/robin/foo
caerbannog:~# iscsitadm modify target -p 208 idle/robin/foo
... ditto for bar and zot ...
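The ditto, spelled out as a loop (equivalent to the three commands above, repeated per volume):

caerbannog:~# for t in foo bar zot; do
>   zfs set shareiscsi=on idle/robin/$t
>   iscsitadm modify target -l xitomatl idle/robin/$t
>   iscsitadm modify target -p 208 idle/robin/$t
> done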

caerbannog:~# iscsitadm list target
Target: idle/robin/foo
    iSCSI Name: iqn.1986-03.com.sun:02:0fb34688-bd5c-6f54-ba25-ca3d972187f6
    Connections: 0
Target: idle/robin/bar
    iSCSI Name: iqn.1986-03.com.sun:02:2e83e95d-3841-44d7-c97c-c8de9fe7629c
    Connections: 0
Target: idle/robin/zot
    iSCSI Name: iqn.1986-03.com.sun:02:e0fe1c0d-634e-e96e-f8ad-dbc3c304a6e5
    Connections: 0
xitomatl:/# reboot
Connection to xitomatl closed by remote host.
Connection to xitomatl closed.

caerbannog:~# iscsitadm list target
Target: idle/robin/foo
    iSCSI Name: iqn.1986-03.com.sun:02:0fb34688-bd5c-6f54-ba25-ca3d972187f6
    Connections: 1
Target: idle/robin/bar
    iSCSI Name: iqn.1986-03.com.sun:02:2e83e95d-3841-44d7-c97c-c8de9fe7629c
    Connections: 1
Target: idle/robin/zot
    iSCSI Name: iqn.1986-03.com.sun:02:e0fe1c0d-634e-e96e-f8ad-dbc3c304a6e5
    Connections: 1

Yes, it connected just after I told it to reboot and SSH kicked me out.

And, back to xitomatl, now sitting at a white screen of unlife:

Stop+A
Type 'go' to resume
> boot
On reboot:

NOTICE: iscsi session(12) iqn.1986-03.com.sun:02:0fb34688-bd5c-6f54-ba25-ca3d972187f6 online
NOTICE: iscsi session(9) iqn.1986-03.com.sun:02:2e83e95d-3841-44d7-c97c-c8de9fe7629c online
NOTICE: iscsi session(6) iqn.1986-03.com.sun:02:e0fe1c0d-634e-e96e-f8ad-dbc3c304a6e5 online

You'll notice these are the IQNs for foo, bar, and zot, respectively. After cde-login comes up, the box helpfully tells me that my pool has faulted.

After logging in and running a `zpool status`, it says that bar is offline. Waiting a little makes bar come back. Weird, but at least it recovered; I hate using the Windows approach (reboot and pray) in UNIX, though. It's not a mirror, and I still only seem to have 10G out of my two 10G disks.

xitomatl:/# zpool status trump
  pool: trump
 state: ONLINE
 scrub: none requested
config:

        NAME                                     STATE     READ WRITE CKSUM
        trump                                    ONLINE       0     0     0
          c1t010000144FF2985500002A004A31A1F7d0  ONLINE       0     0     0
          c1t010000144FF2985500002A004A304D18d0  ONLINE       0     0     0

errors: No known data errors
xitomatl:/# format
AVAILABLE DISK SELECTIONS:
       0. c0t0d0
          /pci@1f,0/ide@d/dad@0,0
       1. c1t010000144FF2985500002A004A31A1F7d0
          /scsi_vhci/ssd@g010000144ff2985500002a004a31a1f7
       2. c1t010000144FF2985500002A004A31B8A6d0
          /scsi_vhci/ssd@g010000144ff2985500002a004a31b8a6
       3. c1t010000144FF2985500002A004A304D18d0
          /scsi_vhci/ssd@g010000144ff2985500002a004a304d18
Specify disk (enter its number):
xitomatl:/# zfs list trump
NAME    USED  AVAIL  REFER  MOUNTPOINT
trump  1.59G  8.19G  1.59G  /disk/trump
xitomatl:/# zpool iostat trump 30
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
trump       1.59G  23.2G     13      0  1.59M      0

Weird...23.2G of capacity means the pool still thinks the devices are 10G and 15G.
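A quicker cross-check of the pool's raw size, for next time:

xitomatl:/# zpool list trump   # the SIZE column is the pool's view of raw capacity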

caerbannog:~# zfs get volsize idle/robin/foo
NAME            PROPERTY  VALUE  SOURCE
idle/robin/foo  volsize   10G    -
caerbannog:~# zfs get volsize idle/robin/bar
NAME            PROPERTY  VALUE  SOURCE
idle/robin/bar  volsize   10G    -
caerbannog:~# zfs get volsize idle/robin/zot
NAME            PROPERTY  VALUE  SOURCE
idle/robin/zot  volsize   10G    -

And a quick data integrity check:

> time find dnd -type f -exec cat {} \; | md5sum
b5f1fc52d09baaf9f3db34408ce9c184 -

real 4m26.241s

Time to attach zot, turning one of the vdevs into a mirror, and wait for it to resilver.

xitomatl:/# zpool attach trump c1t010000144FF2985500002A004A31A1F7d0 c1t010000144FF2985500002A004A31B8A6d0
xitomatl:/# zpool status trump
  pool: trump
 state: FAULTED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-2Q
 scrub: resilver in progress for 0h0m, 6.39% done, 0h3m to go
config:

        NAME                                       STATE     READ WRITE CKSUM
        trump                                      FAULTED      0     6     0  insufficient replicas
          mirror                                   ONLINE       0     0     0
            c1t010000144FF2985500002A004A31A1F7d0  ONLINE       0     0     0
            c1t010000144FF2985500002A004A31B8A6d0  ONLINE       0     0     0
          c1t010000144FF2985500002A004A304D18d0    UNAVAIL      0     6     0  cannot open

errors: No known data errors

0h3m became 0h6m; 0h6m became 0h10m; 0h10m became 0h30m...I went to get food. I'll work on this more later.

unix, angst, zfs, work, ramblings, iscsi
