SLURM Plugin for Predict-IT

Warning

Currently only available for OKA versions >= 1.15.0 and <= 2.0.0.

Prerequisites

Packages / libraries

  • cmake > 3.14

  • slurm > 2020.xx and its sources

  • libcpr == 1.9.3

  • libcurl >= 7.71 (min version expected by libcpr)

  • gcc & gcc-c++ >= 7.3
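The version bounds listed above can be checked before starting the build. The following is a minimal sketch, not part of the plugin: the helper names version_ge and report_version are hypothetical, it assumes GNU sort (for the -V option) and that gcc and curl-config are on the PATH, and it only covers the gcc and libcurl requirements.

```shell
# version_ge A B: succeeds when version A >= version B.
# Relies on GNU `sort -V` (version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# report_version LABEL DETECTED MINIMUM: print one status line.
report_version() {
    if version_ge "$2" "$3"; then
        echo "OK $1 $2 (>= $3)"
    else
        echo "TOO OLD $1 $2 (< $3)"
    fi
}

# Only query tools that are actually installed.
if command -v gcc >/dev/null 2>&1; then
    # -dumpfullversion exists since GCC 7; fall back to -dumpversion otherwise.
    report_version gcc "$(gcc -dumpfullversion 2>/dev/null || gcc -dumpversion)" 7.3
fi
if command -v curl-config >/dev/null 2>&1; then
    # curl-config --version prints e.g. "libcurl 7.81.0"
    report_version libcurl "$(curl-config --version | awk '{print $2}')" 7.71
fi
```

A tool that is not installed is silently skipped, so a missing line in the output also indicates a missing prerequisite.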

Access rights

  • Slurm sources: [REQUIRED-BUILD] Read access rights to the folder where Slurm sources are located. The plugin needs to access Slurm sources to build properly.

  • Slurm install folder: [REQUIRED-INSTALL] Read and Write access rights to ${PATH_TO_SLURM_DIR}/lib/slurm. The plugin will copy its libraries into the Slurm lib folder upon installation.

  • Slurm configuration file: [REQUIRED-INSTALL] Read and Write access rights to the Slurm configuration file ${PATH_TO_SLURM_CONF_DIR}/slurm.conf (e.g., /etc/slurm/slurm.conf or /opt/etc/slurm/slurm.conf). The plugin needs to be declared in the Slurm configuration.

  • Slurm configuration folder:

    • [REQUIRED-INSTALL] Read and Write access rights to ${PATH_TO_SLURM_CONF_DIR}/. The plugin will copy its own configuration file where the Slurm configuration is stored upon installation.

    • [REQUIRED-TEST] Read and Write access rights to ${PATH_TO_SLURM_CONF_DIR}/predictit.conf. Access to the plugin configuration file will be needed during the testing phase.

  • Slurm services: [REQUIRED-INSTALL] slurmctld.service needs to be restarted for the plugin to be loaded and used by Slurm.

  • Slurm logs: [REQUIRED-TEST] Read access rights to Slurm logs (e.g., /var/log/slurmctld.log, /var/log/messages).
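The access rights listed above can be verified up front for the user who will build and install the plugin. This is a minimal sketch under assumptions: check_access is a hypothetical helper, and the commented paths reuse the placeholders from the list, which must be set to real locations on your system.

```shell
# check_access MODE PATH: MODE is "r" (read) or "rw" (read and write).
# Prints one status line and returns non-zero when access is missing.
check_access() {
    mode="$1"
    path="$2"
    if [ ! -e "$path" ]; then
        echo "MISSING $path"
        return 1
    fi
    if [ ! -r "$path" ]; then
        echo "NO-READ $path"
        return 1
    fi
    if [ "$mode" = "rw" ] && [ ! -w "$path" ]; then
        echo "NO-WRITE $path"
        return 1
    fi
    echo "OK $path"
}

# Example calls (uncomment once the placeholder variables are set):
# check_access r  "${SLURM_SRC_PATH}"
# check_access rw "${PATH_TO_SLURM_DIR}/lib/slurm"
# check_access rw "${PATH_TO_SLURM_CONF_DIR}/slurm.conf"
# check_access r  /var/log/slurmctld.log
```

Run the checks as the same user that will later run make install, since the result depends on that user's permissions.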

Debian 10:

sudo apt update
sudo apt install build-essential

RHEL 7/8:

sudo yum groupinstall 'Development Tools'

Important

Make sure the Slurm libraries and the newly built libcpr.so and libcurl.so libraries are visible in the environment of the slurm user (i.e., the user running the Slurm services) so that the plugin runs properly once installed.

  • Create a new /etc/ld.so.conf.d/predictit-lib-path.conf file containing the paths to the directories where those libraries are located (e.g., /usr/local/lib64, /opt/slurm/lib).

  • Apply the library path changes globally by calling sudo ldconfig
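The two steps above could be scripted as follows. This is a sketch under assumptions: write_ld_conf is a hypothetical helper, the staging path under /tmp is illustrative, and the library directory must be replaced with the one used on your system.

```shell
# write_ld_conf CONF_FILE DIR...: write one library directory per line,
# which is the format expected for files under /etc/ld.so.conf.d/.
write_ld_conf() {
    conf_file="$1"
    shift
    printf '%s\n' "$@" > "$conf_file"
}

# Example (uncomment and adapt the directory):
# write_ld_conf /tmp/predictit-lib-path.conf /usr/local/lib64
# sudo cp /tmp/predictit-lib-path.conf /etc/ld.so.conf.d/predictit-lib-path.conf
# sudo ldconfig
# ldconfig -p | grep -E 'libcpr|libcurl'   # list matching entries in the linker cache
```

The last command prints the linker cache entries for the two libraries; an empty result suggests the configured directory does not contain them.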

Compiling

To compile the code:

  1. Extract tarball slurm_plugin.tar.gz

    tar -xvf slurm_plugin.tar.gz -C ${EXTRACT_PATH}
    
  2. Enter the slurm_plugin directory

    cd ${EXTRACT_PATH}/slurm_plugin
    
  3. Create a build directory and enter it

    mkdir build
    cd build
    
  4. Run cmake

    cmake -DSLURM_SRC_DIR=${SLURM_SRC_PATH}  ../
    

    The plugin accepts the following cmake options:

    • SLURM_SRC_DIR: Path to Slurm sources directory (e.g., /home/slurm-22.05.6)

    • SLURM_CONF_DIR: [OPTIONAL] Installation directory for the plugin configuration file. Defaults to the Slurm configuration directory (e.g., /etc/slurm or /opt/etc/slurm/)

      Note

      Use this option to control where the predictit.conf file is placed. By default, cmake looks in /etc first and falls back to SLURM_PATH only if nothing is found there. As a result, it might not find the path you expect on a machine where Slurm was installed twice and a leftover conf folder still exists in /etc while you actually want to use, for example, the new /opt/slurm/etc folder.

    • BUILD_CPR: [OPTIONAL] Boolean specifying whether the libcpr requirement should be handled during the plugin build (default: False)

    Important

    Make sure your environment contains the path to the Slurm directory so that cmake can find the libraries by itself. If they are not found automatically, run export SLURM_PATH="${PATH_TO_SLURM_DIR}" and try again.

    Note

    You can use ccmake instead of cmake if you prefer an ncurses-based interactive interface.

  5. Run make to start the compilation and install the plugin

    make
    # We need to be root as we install in the same directories as Slurm
    # - job_submit_predictit.so is copied to Slurm libraries directory
    # - predictit.conf is copied to Slurm etc directory
    sudo make install
    

    The plugin library job_submit_predictit.so will be installed in the Slurm libraries directory (e.g., /usr/lib64/slurm/ or /opt/slurm/lib/slurm), and the plugin's own configuration file, predictit.conf, will by default be installed in the Slurm etc directory (e.g., /etc/slurm or /opt/etc/slurm/). See the SLURM_CONF_DIR option if you want to place it elsewhere.

  6. Update predictit.conf

    Edit the newly created configuration file for the plugin to have the following values properly set (see Configuration for more info):

    • "user": "OKA_USER_LOGIN"

    • "password": "OKA_USER_PASSWORD"

    • "oka_url": "OKA_URL"

  7. Update Slurm configuration

    sudo vim /etc/slurm/slurm.conf
    # Edit the file to have:
    JobSubmitPlugins=predictit
    
  8. Restart slurmctld

    sudo systemctl restart slurmctld.service
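Before restarting slurmctld, it can help to confirm that the plugin is actually declared in slurm.conf. The following is a minimal sketch, not part of the plugin: has_job_submit_plugin is a hypothetical helper, and it assumes JobSubmitPlugins holds a comma-separated list on a single uncommented line.

```shell
# has_job_submit_plugin CONF_FILE NAME: succeeds when NAME appears in the
# comma-separated JobSubmitPlugins value of an uncommented line.
has_job_submit_plugin() {
    grep -Ei '^[[:space:]]*JobSubmitPlugins[[:space:]]*=' "$1" \
        | sed -e 's/#.*$//' -e 's/^[^=]*=//' \
        | tr ',' '\n' \
        | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' \
        | grep -qx -- "$2"
}

# Example (uncomment and adapt the path to your slurm.conf):
# if has_job_submit_plugin /etc/slurm/slurm.conf predictit; then
#     echo "predictit is declared in JobSubmitPlugins"
# else
#     echo "predictit is NOT declared in JobSubmitPlugins"
# fi
```

After the restart, the Slurm logs mentioned in the prerequisites (e.g., /var/log/slurmctld.log) are the place to look for plugin loading messages.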