[External] Re: [openhpc-users] xcat efi stateful on instdisk=nvme0n1 stuck at "Generate the repository for the installation" #pxe #openhpc

Miroslav Hodak <mhodak@...>



I have been following this issue and am glad you got it resolved. As you said, the partitionfile is what fixed it, and now you are using it correctly.


I am not sure what your original issue was, but do not be confused by messages like “Found /dev/sdk, generating partition file”. These messages do not mean that the drives will actually be used.


Miro Hodak, PhD

HPC & AI Developer

Lenovo United States


From: OpenHPC-users@groups.io <OpenHPC-users@groups.io> On Behalf Of jesse.stacey@...
Sent: Thursday, September 19, 2019 2:29 PM
To: OpenHPC-users@groups.io
Subject: [External] Re: [openhpc-users] xcat efi stateful on instdisk=nvme0n1 stuck at "Generate the repository for the installation" #openhpc #pxe


Hmm, not much interest in this thread, but for completeness, and for anyone who prefers to go the OpenHPC / xCAT stateful route, I will show how this finally turned out.

First of all, the NEXTSERVER line might be the cause of some issues, since it greps through output that is subject to change, but the real issue was my partitionfile.
Debian/Ubuntu uses a partitionfile.sh script, while RHEL / CentOS prefers a kickstart-formatted partitionfile. I just went to a functional node that had already kickstarted properly and grabbed the partitioning from /root/anaconda-ks.cfg. It looks like this:

part swap --fstype="swap" --ondisk=nvme0n1 --size=4096
part /boot --fstype="xfs" --ondisk=nvme0n1 --size=512
part / --fstype="xfs" --ondisk=nvme0n1 --size=911056
part /boot/efi --fstype="efi" --ondisk=nvme0n1 --size=50 --fsoptions="defaults,uid=0,gid=0,umask=0077,shortname=winnt"
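
One way to pull those lines out of a kickstart file is a simple grep. This is only a sketch: it works on a temporary sample file (a made-up excerpt standing in for /root/anaconda-ks.cfg) so it can be run anywhere, and it assumes that only the "part" lines are wanted. A real anaconda-ks.cfg may contain other disk-related commands that you would have to review by hand.

```shell
#!/bin/sh
# Sketch: copy only the "part" lines from a kickstart file into a partition file.
set -e

workdir=$(mktemp -d)
ks_file="$workdir/anaconda-ks.cfg"   # stand-in for /root/anaconda-ks.cfg
part_file="$workdir/my-partitions"   # stand-in for /install/custom/my-partitions

# Hypothetical kickstart excerpt so the sketch is self-contained.
cat > "$ks_file" <<'EOF'
# System language
lang en_US.UTF-8
part swap --fstype="swap" --ondisk=nvme0n1 --size=4096
part /boot --fstype="xfs" --ondisk=nvme0n1 --size=512
part / --fstype="xfs" --ondisk=nvme0n1 --size=911056
part /boot/efi --fstype="efi" --ondisk=nvme0n1 --size=50 --fsoptions="defaults,uid=0,gid=0,umask=0077,shortname=winnt"
rootpw --iscrypted placeholder
EOF

# Keep only lines that begin with the kickstart "part" command.
grep -E '^part[[:space:]]' "$ks_file" > "$part_file"

cat "$part_file"
```

On a real node you would point ks_file at /root/anaconda-ks.cfg and copy the result to the management node.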

I saved this to /install/custom/my-partitions and attached it to the install image with: 

chdef -t osimage centos7.6-x86_64-install-compute -p partitionfile=/install/custom/my-partitions

Then regenerate the xcat file for the node(s): 

nodeset compute osimage=centos7.6-x86_64-install-compute

This regenerates the files inside /install/autoinst. You can empty this folder before issuing nodeset to make sure they are regenerated.

Then tell xcat to reboot the node(s) for reinstall:

rsetboot compute net
rpower compute boot
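
The steps above can be collected into one small script. This is a sketch, not a tested deployment tool: the run and reinstall helpers and the DRY_RUN, NODES, OSIMAGE and PARTFILE variables are names made up for illustration, with defaults taken from the commands in this post. By default it only prints the commands; set DRY_RUN=0 on an xCAT management node to actually execute them.

```shell
#!/bin/sh
# Sketch: the reinstall steps from this post as one script.
# DRY_RUN=1 (the default here) prints the commands instead of running them.
set -e

DRY_RUN=${DRY_RUN:-1}
NODES=${NODES:-compute}
OSIMAGE=${OSIMAGE:-centos7.6-x86_64-install-compute}
PARTFILE=${PARTFILE:-/install/custom/my-partitions}

# Hypothetical helper: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reinstall() {
    # Attach the partition file to the osimage.
    run chdef -t osimage "$OSIMAGE" -p "partitionfile=$PARTFILE"
    # Regenerate the install files for the node(s).
    run nodeset "$NODES" "osimage=$OSIMAGE"
    # Network-boot on next start, then power cycle to begin the reinstall.
    run rsetboot "$NODES" net
    run rpower "$NODES" boot
}

reinstall
```

Keeping the dry-run default makes it easy to check the node range and image name before anything gets power cycled.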

If you tail the xcat log (or ssh to the node being installed), just ignore the "Found /dev/sdx" messages. It should use the drive you specified in the partitionfile for your osimage regardless.

For more verbose output of what's happening during the xcat kickstart, run xcatprobe before flagging the node(s) for reinstall. I usually do this from a screen tty session. Make sure you set xcatdebugmode=2.
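
The exact commands are not shown in this message. As a sketch, assuming the site object has the default name clustersite and that the osdeploy subcommand of xcatprobe is the probe meant here, the two steps might look like the following. The debug_plan function is a made-up helper that only prints the commands, it does not run them.

```shell
#!/bin/sh
# Sketch: print the debug and monitoring commands; nothing is executed.
debug_plan() {
    nodes=${1:-compute}
    # Raise the xCAT debug level in the site table (2, as mentioned in the post).
    echo "chdef -t site clustersite xcatdebugmode=2"
    # Start the OS deployment probe for the node range (assumed subcommand).
    echo "xcatprobe osdeploy -n $nodes"
}

debug_plan compute
```

Run the printed commands from a screen session before rsetboot and rpower, so the probe is already watching when the node starts its install.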

That's it. I would have loved to continue using Warewulf, but it gives so many headaches to anyone wanting to use NVMe boot drives in stateful mode that I had no choice but to go with xCAT.