lbaops:lbaobservingnotes:hobart [2009/07/03 15:36] cehotan |
lbaops:lbaobservingnotes:hobart [2015/12/18 16:38] (current) |
||
---|---|---|---|
Line 1: | Line 1: | ||
+ | ===== Instructions for Hobart disk cleaning ===== | ||
+ | |||
+ | The [[http:// | ||
+ | |||
+ | For specific information relating to disk failures or reformatting, | ||
+ | |||
+ | Hobart has one VLBI xraid, which can be found in rack 6.\\ | ||
+ | The xraid is connected to '' | ||
+ | |||
+ | Things that may need to be done are:\\ | ||
+ | - [[.:# | ||
+ | - Change the number of [[.:# | ||
+ | - Put a new [[.:# | ||
+ | - [[.:# | ||
+ | - [[.:# | ||
+ | The most common, of course, is simply reformatting.\\ | ||
+ | \\ | ||
+ | \\ | ||
+ | [[.:# | ||
+ | **Redetecting devices** needs to be done whenever a disk change alters the number of slices or the disk size, so that queries give sensible feedback.\\ | ||
+ | The first place you should look for this information is the [[http:// | ||
+ | \\ | ||
+ | For completeness only: the simplest way to do this is to restart the computer ('' | ||
+ | However, it may not be possible to reboot the machine if other people are currently logged on to the computer - check this using the command '' | ||
+ | If it is not possible to restart the computer at the time and based on the output of '' | ||
+ | \\ | ||
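A minimal sketch of the "is anyone else logged on?" check before a reboot. It assumes the standard `who` and `id` commands are available on the machine; the function name `other_users` is made up for illustration and is not one of the existing scripts. It only reports, it never reboots anything.

```shell
# Count distinct login names other than our own.
# Reads `who`-style lines (login name in the first column) from stdin.
# "other_users" is a hypothetical helper name, not an existing tool.
other_users() {
    me="$1"
    awk -v me="$me" '
        NF > 0 && $1 != me { seen[$1] = 1 }
        END { n = 0; for (u in seen) n++; print n }
    '
}

# Report only; the decision to reboot stays with the operator.
n=$(who | other_users "$(id -un)")
if [ "$n" -eq 0 ]; then
    echo "No other users logged in"
else
    echo "$n other user(s) logged in - check with them before rebooting"
fi
```

Counting distinct names rather than sessions means one person with several terminals open is only counted once.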
+ | [[.:# | ||
+ | **Adding or removing slices on an xraid set** requires use of the xraid admin tools. The current version is 1.5.1, but I can only find 1.3.3 on '' | ||
+ | To reslice devices, should it be necessary (unlikely but not impossible), | ||
+ | \\ | ||
+ | '' | ||
+ | hovsi:~> su -\\ | ||
+ | Password:\\ | ||
+ | hovsi:~# cd RAID\ Admin\ 1.5.1/\\ | ||
+ | hovsi: | ||
+ | \\ | ||
+ | For screenshots of what to do with the RAID tools (albeit the newer version), see the [[http:// | ||
+ | \\ | ||
+ | Once the admin GUI has opened, you can select the xraid you're interested in, and look at the " | ||
+ | If you need to change the slice settings, select the " | ||
+ | You can then select the set of drives you want to slice/ | ||
+ | \\ | ||
+ | [[.:# | ||
+ | **A disk may need to have a partition applied** if the disk set has been resliced, or the partition table is broken for some other reason. As we use disks which are >2TB, '' | ||
+ | Chris has written scripts for partitioning and formatting: | ||
+ | \\ | ||
+ | < | ||
+ | hovsi:~> su - | ||
+ | Password: </ | ||
+ | %color=# | ||
+ | < | ||
+ | %color=# | ||
+ | < | ||
+ | / | ||
+ | tmpfs | ||
+ | udev | ||
+ | tmpfs | ||
+ | / | ||
+ | / | ||
+ | / | ||
+ | %color=# | ||
+ | %color=# | ||
+ | < | ||
+ | hovsi:~# e2label / | ||
+ | ATNF V017A </ | ||
+ | %color=# | ||
+ | < | ||
+ | |||
+ | We now have a freshly partitioned disk. You can check that it did what you want by running '' | ||
+ | |||
+ | < | ||
+ | (parted) help</ | ||
+ | %color=# | ||
+ | We'll make a GPT disk, with a primary partition that fills the whole disk.**\\ | ||
+ | < | ||
+ | (parted) mkpart primary 0 -0 | ||
+ | (parted) quit</ | ||
+ | \\ | ||
+ | As with any other hard disk, after partitioning the disk needs to be formatted.\\ | ||
+ | \\ | ||
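Why the >2TB disks need a GPT label can be shown with a little arithmetic. An MBR (`msdos`) partition table stores sector counts in 32 bits, so with 512-byte sectors it tops out at about 2^32 x 512 bytes = 2 TiB. The sketch below is only an illustration: the function name `label_for_size` and the example sizes are assumptions, and it prints the parted label type (`gpt` or `msdos`) without touching any disk.

```shell
# Suggest a partition table type for a disk of a given size in bytes.
# "label_for_size" is a hypothetical helper name.
# MBR addresses at most 2^32 sectors; the sector size defaults to 512 bytes.
label_for_size() {
    bytes="$1"
    sector="${2:-512}"
    limit=$((4294967296 * sector))    # 2^32 sectors
    if [ "$bytes" -gt "$limit" ]; then
        echo gpt
    else
        echo msdos
    fi
}

label_for_size 3000000000000    # a hypothetical 3 TB slice
label_for_size 1000000000000    # a hypothetical 1 TB slice
```

The first call prints `gpt` and the second prints `msdos`.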
+ | [[.:# | ||
+ | **Reformatting a disk set** is the most commonly required task, and hopefully the only one you'll ever need to do! :)\\ | ||
+ | Chris has written a script that does this; it lives on hovsi.\\ | ||
+ | Substitute appropriate machine numbers and device names below - check with Brett (or someone else in VLBI) if you don't have the vlbi or root passwords.\\ | ||
+ | \\ | ||
+ | < | ||
+ | hovsi:~> su - | ||
+ | Password: | ||
+ | %color=# | ||
+ | Check whether the disks are currently mounted - they must be unmounted before formatting.**\\ | ||
+ | < | ||
+ | Filesystem | ||
+ | / | ||
+ | tmpfs | ||
+ | udev | ||
+ | tmpfs | ||
+ | / | ||
+ | / | ||
+ | / | ||
+ | %color=# | ||
+ | < | ||
+ | %color=# | ||
+ | < | ||
+ | ATNF V017A </ | ||
+ | %color=# | ||
+ | < | ||
+ | hovsi:~# mount /dev/sdc1 / | ||
+ | %color=# | ||
+ | \\ | ||
+ | If the script doesn' | ||
+ | '' | ||
+ | '' | ||
+ | \\ | ||
+ | Repeat for as many disk sets as necessary - you can do two at a time with a full chassis. When done, run '' | ||
+ | \\ | ||
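The "is it mounted?" check in the reformatting steps can also be scripted. This is a rough sketch, not part of Chris's script: it assumes a Linux machine where `/proc/mounts` lists mounted devices in the first column, the function name `is_mounted` is made up, and `/dev/sdc1` is just the example device used above. It only reports and never unmounts or formats anything.

```shell
# Succeed if a device appears in the first column of a mounts table.
# "is_mounted" is a hypothetical helper name; the table defaults to
# /proc/mounts (Linux only).
is_mounted() {
    dev="$1"
    table="${2:-/proc/mounts}"
    awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$table"
}

if is_mounted /dev/sdc1; then
    echo "/dev/sdc1 is mounted - unmount it before formatting"
else
    echo "/dev/sdc1 is not mounted"
fi
```

Matching the whole first column avoids false hits such as `/dev/sdc10` when looking for `/dev/sdc1`, which a plain substring search would get wrong.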
+ | [[.:# | ||
+ | **Rebuilding a degraded array** is needed whenever a disk loaded in an array comes up with a red light and removing and reinserting it does not fix the problem. Sometimes you may also need to change a disk for an orange light, though usually these can be fixed by " | ||
+ | If you insert a spare disk (at the moment Curtin supplies these when needed), the rebuilding process will occur automatically, | ||
+ | For instructions on how to start the xraid admin gui, see " | ||
+ | \\ | ||
+ | The disks come in sets of 7, which are linked in a RAID 5 array, meaning that if a disk is lost, all data is still recoverable, | ||
+ | Similarly, you must always load disks in the correct order. If a spare disk is required, insert it in the position of the failed disk and do not move any of the other disks.\\ | ||
+ | \\ | ||
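As a worked example of what RAID 5 over a set of 7 means for capacity: one disk's worth of space goes to parity, so the usable space is (number of disks - 1) x disk size. The sketch below just does that arithmetic. The function name `raid5_usable` and the 500 GB disk size are assumptions for illustration only, not the actual size of the disks in use.

```shell
# Usable capacity of a RAID 5 set: (disks - 1) * size of one disk.
# "raid5_usable" is a hypothetical helper name; the result is in the
# same unit as the size passed in.
raid5_usable() {
    disks="$1"
    size="$2"
    if [ "$disks" -lt 3 ]; then
        echo "RAID 5 needs at least 3 disks" >&2
        return 1
    fi
    echo $(( (disks - 1) * size ))
}

raid5_usable 7 500    # 7 disks of a hypothetical 500 GB each
```

For the example call this prints 3000, i.e. six disks' worth of data and one disk's worth of parity. That single disk of redundancy is why the set survives one lost disk but not two.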
+ | %color=# | ||
+ | In this case, carefully remove the disk with a warning (you may need to press the warning button on the chassis to stop the alarm noise), being careful not to release any of the green-light disks. Check the contacts on the disk, and try reinserting the disk, being sure to push it right in until it makes the " | ||
+ | If the disk is still in an error state it will remain orange or turn red. In this case, start the xraid Admin tools, and look at the state of the individual disk. The xraid may recognise the disk as valid, but think it belongs to a different array (even if it clearly doesn' | ||
+ | In one case that I've encountered, the disk lost its SMART capability. SMART (Self-Monitoring, Analysis and Reporting Technology) is the drive's built-in health reporting, so I think this is not fatal; it just means the disk has lost some of its self-monitoring capability. My feeling is to just ignore the warning.\\ | ||
+ | If for some reason the disk can't be made available and the set rebuilt with the same disk, a spare will be needed - contact Claire or someone else at Curtin, as ATNF has run out of spares.\\ | ||
+ | \\ | ||
+ | **//For a red light failure:// | ||
+ | \\ | ||
+ | **Note:** Disks seem most prone to failure when they are removed, inserted or transported. xraids are made to be swappable, but not really to the extent that we swap them! Consequently, if you DO have a failed disk set, leave it loaded until it's fixed!\\ | ||
+ | \\ | ||
+ | \\ | ||
If any of this doesn' | If any of this doesn' | ||