From 0c65a7ad6541274e431ab2c2ee379d89e1ab83aa Mon Sep 17 00:00:00 2001 From: Rasmus Kronberg Date: Tue, 3 Jun 2025 09:50:15 +0300 Subject: [PATCH 001/139] Roihu updates to Docs CSC From 7971b643d3676c55dca7fc4578326d1f8818160f Mon Sep 17 00:00:00 2001 From: Rasmus Kronberg <43936697+rkronberg@users.noreply.github.com> Date: Thu, 12 Mar 2026 16:16:29 +0200 Subject: [PATCH 002/139] Roihu migration guide (#2671) * Roihu migration guide * SSH CA (#2810) * SSH CA * headers * add mycsc instructions * add instructions * fix linkl * fix link * extra step * instructions * SSH agent * computing index page * fix link * use snippets * roihu fingerprints for pilot phase * updates * authentication agent for windows * agent forwarding * document python version of cert helper * quickstart guide * add some link * Add graphics how SSH login works Just a try to add something graphical. Would need at least some text around etc, but remove if not needed. * implement suggestions * fix links * fix links * move diagram up * migration guide: general notes * adjust auto-approved limits * updates * update ssh ca instructions * some updates * data migration guide v1 * fix links * SSH CA updates * more ssh ca updates * more ssh ca updates * update rsync, scp, tar-ssh * filezilla * winscp --------- Co-authored-by: Kylli Ek --- .../assets/snippets/graphical-connection.md | 4 + .../assets/snippets/ssh-agent-forwarding.md | 4 + csc-overrides/assets/snippets/ssh-ca.md | 8 + .../assets/snippets/using-ssh-keys.md | 8 + docs/computing/connecting/index.md | 85 +++- docs/computing/connecting/ssh-keys.md | 186 ++++++- docs/computing/connecting/ssh-unix.md | 134 ++++-- docs/computing/connecting/ssh-windows.md | 213 ++++++--- docs/computing/index.md | 50 +- docs/computing/systems-roihu.md | 1 + docs/data/moving/disk_mount.md | 2 +- docs/data/moving/graphical_transfer.md | 28 ++ docs/data/moving/rsync.md | 16 +- docs/data/moving/scp.md | 24 +- docs/data/moving/tar_ssh.md | 21 +- docs/index.md | 6 +- 
docs/support/tutorials/index.md | 5 + docs/support/tutorials/ml-guide.md | 2 +- docs/support/tutorials/roihu-data.md | 452 ++++++++++++++++++ docs/support/tutorials/roihu.md | 94 ++++ 20 files changed, 1162 insertions(+), 181 deletions(-) create mode 100644 csc-overrides/assets/snippets/graphical-connection.md create mode 100644 csc-overrides/assets/snippets/ssh-agent-forwarding.md create mode 100644 csc-overrides/assets/snippets/ssh-ca.md create mode 100644 csc-overrides/assets/snippets/using-ssh-keys.md create mode 100644 docs/support/tutorials/roihu-data.md create mode 100644 docs/support/tutorials/roihu.md diff --git a/csc-overrides/assets/snippets/graphical-connection.md b/csc-overrides/assets/snippets/graphical-connection.md new file mode 100644 index 0000000000..d6da1032fd --- /dev/null +++ b/csc-overrides/assets/snippets/graphical-connection.md @@ -0,0 +1,4 @@ +!!! info "Note" + For performance reasons, we generally recommend using the + [HPC web interfaces](/computing/webinterface/index.md) to run applications which + require displaying graphics. diff --git a/csc-overrides/assets/snippets/ssh-agent-forwarding.md b/csc-overrides/assets/snippets/ssh-agent-forwarding.md new file mode 100644 index 0000000000..4637bafd4e --- /dev/null +++ b/csc-overrides/assets/snippets/ssh-agent-forwarding.md @@ -0,0 +1,4 @@ +!!! warning "Note" + You should only forward your SSH agent to remote servers that you trust and + only when you really need it. Forwarding your SSH agent by default to any + server you connect to is considered insecure. diff --git a/csc-overrides/assets/snippets/ssh-ca.md b/csc-overrides/assets/snippets/ssh-ca.md new file mode 100644 index 0000000000..b9e847ee20 --- /dev/null +++ b/csc-overrides/assets/snippets/ssh-ca.md @@ -0,0 +1,8 @@ +!!! warning "SSH certificates are required to connect to Roihu over SSH" + + To connect to Roihu, users must sign their public key in MyCSC to obtain a + time-based SSH certificate. 
Each certificate is valid for 24 hours, and + once it expires, a new one must be generated by signing the public key + again. + + [Read the detailed instructions on signing your public key](/computing/connecting/ssh-keys.md#signing-public-key). diff --git a/csc-overrides/assets/snippets/using-ssh-keys.md b/csc-overrides/assets/snippets/using-ssh-keys.md new file mode 100644 index 0000000000..0e8975fda5 --- /dev/null +++ b/csc-overrides/assets/snippets/using-ssh-keys.md @@ -0,0 +1,8 @@ +!!! info "Using SSH keys" + See the page on [setting up SSH keys](/computing/connecting/ssh-keys.md) + for general information about using SSH keys and certificates for + authentication. Please note that it is mandatory to add your public key to + MyCSC – copying it directly to a CSC supercomputer does not work! + + Supported key types are Ed25519 and RSA 4096 through 16384. **We strongly + recommend Ed25519**. diff --git a/docs/computing/connecting/index.md b/docs/computing/connecting/index.md index d0086365f4..421f8bc316 100644 --- a/docs/computing/connecting/index.md +++ b/docs/computing/connecting/index.md @@ -1,6 +1,6 @@ # Connecting to CSC supercomputers ---8<-- "auth-update-ssh.md" +--8<-- "ssh-ca.md" There are two main ways of connecting to CSC supercomputers. @@ -22,8 +22,7 @@ For instructions on connecting to the LUMI supercomputer, please see the ## Using the web interface The [web interface](../webinterface/index.md) is a good platform -for using graphical applications on the Puhti and Mahti supercomputers. -It hosts +for using graphical applications on CSC supercomputers. It hosts [interactive applications for select programs](../webinterface/apps.md) like Jupyter and RStudio, and for other GUI programs you can use the [remote desktop](../webinterface/desktop.md) interface. @@ -34,15 +33,36 @@ will keep running even if you close your browser or lose your internet connection. 
The shell applications are especially convenient for users whose workstation has a Windows operating system, since Windows does not typically come with a pre-installed SSH client. See the instructions for -[connecting to Puhti and Mahti web interfaces](../webinterface/connecting.md). +[connecting to HPC web interfaces](../webinterface/connecting.md). ## Using an SSH client -Logging in to Puhti and Mahti using an SSH client requires that you have -[set up SSH keys](ssh-keys.md) and -[added your public key to MyCSC](ssh-keys.md#adding-public-key-in-mycsc). -Traditional password-based authentication and public keys stored in your -personal `~/.ssh/authorized_keys` file will **not** work. +Logging in to CSC supercomputers using an SSH client requires that you have + +1. [set up SSH keys](ssh-keys.md), +2. [added your public key to MyCSC](ssh-keys.md#adding-public-key-in-mycsc), + and +3. [signed your public key](ssh-keys.md#signing-public-key) to obtain a + time-based SSH certificate. + * Step 3. is only required when connecting to Roihu and must be + repeated every 24 hours. + +```mermaid +flowchart LR + A(**Before first connection:** + Set up SSH keys) + A --> B{Connecting + to Roihu?} + B -->|yes| C(**Once every 24 hours:** + Get a new SSH certificate) + C --> D(SSH with Linux/macOS + or + SSH with Windows) + B -->|no| D +``` + +Please note that traditional password-based authentication and public keys +stored in your personal `~/.ssh/authorized_keys` file will **not** work. Unix-based systems like macOS and Linux typically come with a pre-installed terminal program called simply *Terminal*. The instructions for using an @@ -54,12 +74,13 @@ over SSH, there are multiple programs that can be used for this. The instructions for using an [SSH client on Windows](ssh-windows.md) lists a few popular options. 
-Once you have set up SSH keys and added your public key to MyCSC, use a -command like below to connect over SSH: +Once you have set up SSH keys, added your public key to MyCSC, and signed it to +generate an SSH certificate (only required for Roihu), use a command like below +to connect over SSH: ```bash # Replace with the name of your CSC user account and -# with "puhti" or "mahti" +# with "puhti", "mahti", "roihu-cpu" or "roihu-gpu" ssh @.csc.fi ``` @@ -106,6 +127,22 @@ should again verify the new key against fingerprints provided by CSC. | WC9Lb5tmKDzUJqsQjaZLvp9T7LTs3aMUYSIy2OCdtgg | ssh_host_ecdsa_key.pub (ECDSA) | | tE+1jA4Et1enbbat1V3dMRWlLtJgA8t7ZrkyIkU4ooo | ssh_host_ed25519_key.pub (ED25519) | | 0CxM3ECpD2LhAnMfHnm3YaXresvHrhW4cevvcPb+HNw | ssh_host_rsa_key.pub (RSA) | +=== "Roihu (pilot phase)" + | SHA256 checksum | Key | + |---------------------------------------------|------------------------------------| + | NnNuy5xLxXDhDyBTVCtRbGNSMmTTKdnH6dlomerCg14 | ssh_host_ecdsa_key.pub (ECDSA) | + | mAkMF6xpb4wc1eq+vPc4q4mo7YvcL4GHxe8XauPqGas | ssh_host_ed25519_key.pub (ED25519) | + | IHUo4GZOYH8V9qlcv155iP3w/83SdlS6E2jOb/z01hE | ssh_host_rsa_key.pub (RSA) | +=== "Roihu (general availability)" + | SHA256 checksum | Key | + |---------------------------------------------|------------------------------------| + | h3YVzmNucpxTXcxag8D2TaC21jH8/6LGNNCCOgRDaTU | ssh_host_ecdsa_key.pub (ECDSA) | + | YNdesHbXhxN0hKD4mWvYGQONebjRqY+CGXDqPiZyByQ | ssh_host_ed25519_key.pub (ED25519) | + | cXJ5h3Z9fgu0wVpC2kDIpjdsrFsJF/bfyWegQXsfQpU | ssh_host_rsa_key.pub (RSA) | + +!!! info "Note" + For security reasons, Roihu host keys will be changed after the pilot + phase. ### Graphical connection @@ -125,17 +162,17 @@ the login nodes on the system. However, you can also use your SSH client to connect to a specific login node: ```bash -ssh @-login.csc.fi # e.g. 'puhti-login11.csc.fi' +ssh @-login.csc.fi # e.g. 
'roihu-gpu-login1.csc.fi' ``` The available login nodes are: -| Puhti | Mahti | -|-|-| -| `puhti-login11` | `mahti-login11` | -| `puhti-login12` | `mahti-login12` | -| `puhti-login14` | `mahti-login14` | -| `puhti-login15` | `mahti-login15` | +| Puhti | Mahti | Roihu CPU | Roihu GPU | +|-|-|-|-| +| `puhti-login11` | `mahti-login11` | `roihu-cpu-login1` | `roihu-gpu-login1` | +| `puhti-login12` | `mahti-login12` | `roihu-cpu-login2` | `roihu-gpu-login2` | +| `puhti-login14` | `mahti-login14` | `roihu-cpu-login3` | | +| `puhti-login15` | `mahti-login15` | `roihu-cpu-login4` | | This also applies to compute nodes, although just the ones where you have a job running. Use the `squeue` command to see which node(s) your job is on, and @@ -164,9 +201,11 @@ supercomputers in an [SSH config file](https://www.ssh.com/academy/ssh/config) (e.g. `~/.ssh/config`). ```bash -Host # e.g. "puhti" +Host # e.g. "roihu-cpu" HostName .csc.fi User + IdentityFile + CertificateFile # Required for Roihu only ``` Now you can connect to the host simply by running: @@ -174,9 +213,3 @@ Now you can connect to the host simply by running: ```bash ssh ``` - -#### Remote development - -Some editors like Visual Studio Code and Notepad++ can be used to -[work on files remotely](../../support/tutorials/remote-dev.md) -using an appropriate plugin. **However, this is not recommended.** diff --git a/docs/computing/connecting/ssh-keys.md b/docs/computing/connecting/ssh-keys.md index 9acb9fd7f1..66a00d17a2 100644 --- a/docs/computing/connecting/ssh-keys.md +++ b/docs/computing/connecting/ssh-keys.md @@ -1,10 +1,11 @@ # Setting up SSH keys ---8<-- "auth-update-ssh.md" +--8<-- "ssh-ca.md" [SSH keys](https://www.ssh.com/academy/ssh-keys) provide more convenient and -secure authentication. Setting them up is a two-step process, and is required -to be able to connect to CSC supercomputers using an SSH client. +secure authentication. 
SSH keys are required to be able to connect to CSC +supercomputers using an SSH client. Connecting to Roihu also requires that you +sign your public key in order to obtain a time-based SSH certificate. 1. [Generate SSH keys on your local workstation](#generating-ssh-keys). - SSH keys are always generated in pairs consisting of one _public key_ and @@ -16,6 +17,12 @@ to be able to connect to CSC supercomputers using an SSH client. the _public key_ to MyCSC. **Do not copy the private key.** Note that copying the public key directly to CSC supercomputers using tools such as `ssh-copy-id` will not work. +3. [Sign the public key in MyCSC and download SSH certificate](#signing-public-key) (**required for Roihu only**). + - To connect to Roihu, sign your public key in MyCSC to generate a + time-based SSH certificate that is used for authentication. SSH + certificates have a finite lifetime of 24 hours, which significantly + improves the security of the system. After the SSH certificate expires, a + new one must be generated by signing the public key in MyCSC again. For more information about SSH keys, see: @@ -103,7 +110,180 @@ cat /var/lib/acco/sshkeys/${USER}/${USER}.pub If you have added multiple keys to MyCSC, they should all be visible in the same `${USER}.pub` file. +## Signing public key + +!!! warning "The following is a requirement for connecting to Roihu only" + +To connect to Roihu using SSH, you must sign your public key to get a so-called +**SSH certificate**. SSH certificates significantly improve the security of the +system by introducing an additional authentication factor for SSH logins. + +**SSH certificates are valid for 24 hours at a time**. Once your certificate +expires, a new one must be signed following either of the processes below. 
+ +### Option 1: Certificate helper tool (recommended) + +The certificate helper is a Python tool developed by CSC to simplify the +process of signing and downloading an SSH certificate, and adding it to your +SSH authentication agent. Detailed documentation of the tool is available in +the source repository (TBA). The following instructions illustrate only basic +usage. + +1. Ensure that you have Python installed on your computer. + - Instructions are available in the + [Python Beginners Guide](https://wiki.python.org/moin/BeginnersGuide/Download). + Contact your local IT support if you need assistance. + - If Python for some reason cannot be installed on your computer, fall + back to [Option 2](#option-2-mycsc) instead. +2. [Download the certificate helper tool here](https://gitlab.ci.csc.fi/compen/hpc-environment/certificate-helper-tool/-/blob/main/csc_cert.py). +3. Run the `csc_cert.py` tool: + + === "Linux & macOS" + + 1. Optional, but **strongly recommended:** Ensure that + [`ssh-agent`](ssh-unix.md#authentication-agent) is running to + automatically add SSH key and certificate to SSH agent. + 1. Open a terminal and execute: + + ```bash + # Replace with your CSC user name and + # with the path to your SSH public key + + python3 csc_cert.py -u + ``` + + * The command above assumes that the path to `csc_cert.py` is in + your `$PATH` environment variable, or that you are in the same + directory as the script. If not, make sure to provide the full + path to `csc_cert.py`. + + 2. If you have an earlier certificate which is still valid, the tool + prints the expiration time and exits. + 3. If signing is needed, a login URL is displayed. Follow the link and + authenticate. + 4. Copy the 6-digit code displayed into your terminal and enter your + SSH key passphrase. + - The signed certificate is automatically downloaded and added to + your SSH agent. + - The signed certificate is saved as + `-cert.pub` (e.g., `~/.ssh/id_ed25519-cert.pub`). + 5. 
**[Connect to Roihu following these instructions](ssh-unix.md#basic-usage)**. + + === "Windows" + + 6. Optional, but **strongly recommended**: + [Install WinSCP](https://winscp.net/eng/docs/installation) and + [start the Pageant authentication agent](https://the.earth.li/~sgtatham/putty/0.83/htmldoc/Chapter9.html#pageant) + that comes bundled with + [PuTTY](https://www.chiark.greenend.org.uk/~sgtatham/putty/) to + automatically add SSH key and certificate to SSH agent. + * If you install WinSCP without admin rights, you must add + `WinSCP.exe` to your Path environment variable. Search for the + _Edit environment variables for your account_ settings menu. + * If you intend to connect to Roihu using PowerShell, it is + possible to use `ssh-agent` instead of Pageant and WinSCP. + [See the instructions for starting `ssh-agent` in PowerShell](ssh-windows.md#authentication-agent). + + 7. Open PowerShell and execute: + + ```bash + # Replace with your CSC user name and + # with the path to your SSH public key + # (.pub) or PuTTY key (.ppk) + + python3 csc_cert.py -u + ``` + + * The command above assumes that the path to `csc_cert.py` is in your + Path environment variable, or that you are in the same directory as + the script. If not, make sure to provide the full path to + `csc_cert.py`. + + !!! info "Note" + PowerShell is just needed to run the certificate + helper script. You can still connect to Roihu using your + [favorite SSH client](ssh-windows.md#basic-usage). + + If you intend to use PowerShell to connect to Roihu, make sure + to provide `csc_cert.py` your OpenSSH public key (`.pub`). + Providing a PuTTY `.ppk` key will create a certificate file + that is only compatible with PuTTY or MobaXterm. + + 8. If you have an earlier certificate which is still valid, the tool + prints the expiration time and exits. + 9. If signing is needed, a login URL is displayed. Follow the link and + authenticate. + 10. 
Copy the displayed 6-digit code into PowerShell and enter your SSH + key passphrase. + - The signed certificate is automatically downloaded and added to + your SSH agent (if you have WinSCP installed and Pageant + running). + - The signed certificate is saved as `-cert.pub` and/or + `-cert.ppk` (e.g., + `C:\Users\\.ssh\id_ed25519-cert.ppk`). + 11. **[Connect to Roihu following these instructions](ssh-windows.md#basic-usage)**. + +--- + +### Option 2: MyCSC + +1. Log in to MyCSC with your CSC or Haka/Virtu credentials. +1. Select _Profile_ from the left-hand navigation or the dropdown menu in the + top-right corner. +1. Locate _SSH PUBLIC KEYS_ section and click the three vertical dots next to + the public key you want to sign. +1. Click _Sign and download SSH certificate_. As a security measure, you may be + asked to log in again. + + ![Sign and download SSH certificate](https://a3s.fi/docs-files/sign-download-ssh-cert.png 'Sign and download SSH certificate') + + !!! info "Where to store the SSH certificate?" + We **strongly** advise saving the certificate in the default folder for + SSH-related files (e.g. `~/.ssh` or `C:\Users\/.ssh`). + Specifically, storing the certificate in the same directory as your + SSH private key **and** naming it as `-cert.pub` will simplify + connecting, working with SSH agent, etc. + + For example, if you've stored your SSH private key in + `~/.ssh/id_ed25519`, please save your SSH certificate as + `~/.ssh/id_ed25519-cert.pub`. + +1. **Connect to Roihu following these instructions**: + 1. [Linux/macOS](ssh-unix.md#basic-usage) + 1. [Windows](ssh-windows.md#basic-usage) + + !!! info "Optional: Check when your SSH certificate will expire" + Each SSH certificate is valid for 24 hours. The expiration time can be + checked as follows: + + === "Terminal (Linux, macOS, PowerShell, MobaXterm)" + + 1. Open a terminal client. + 1. 
Run command: + + ```bash + # Replace with the path to your OpenSSH + # certificate file (.pub) + + ssh-keygen -L -f | grep "Valid" + ``` + + === "GUI (PuTTY, MobaXterm)" + + 2. Open PuTTYgen / MobaKeyGen. + 3. Load your `.ppk` private key: + * _File_ :material-arrow-right: _Load private key_ + 4. Add a certificate (`.pub`) to the key (unless already included + in the `.ppk` file): + * _Key_ :material-arrow-right: _Add certificate to key_ + 5. Select _Certificate info_ to see the validity period among other + info. + + --- + ## More information - [Tutorial on setting up SSH keys at CSC](https://csc-training.github.io/csc-env-eff/hands-on/connecting/ssh-keys.html) - [Troubleshooting issues with SSH keys](../../support/faq/ssh-keys-not-working.md) +- [Connecting to CSC supercomputers with SSH on Linux and macOS](ssh-unix.md) +- [Connecting to CSC supercomputers with SSH on Windows](ssh-windows.md) diff --git a/docs/computing/connecting/ssh-unix.md b/docs/computing/connecting/ssh-unix.md index 2d4a81f1da..5e282a71a6 100644 --- a/docs/computing/connecting/ssh-unix.md +++ b/docs/computing/connecting/ssh-unix.md @@ -1,6 +1,6 @@ # SSH client on macOS and Linux ---8<-- "auth-update-ssh.md" +--8<-- "ssh-ca.md" On Unix-based systems like macOS and Linux, it is recommended to connect to CSC supercomputers using the pre-installed terminal program. The OpenSSH client @@ -8,14 +8,7 @@ typically comes pre-installed on macOS and Linux systems. ## Generating SSH keys -!!! info "Using SSH keys" - See the page on [setting up SSH keys](ssh-keys.md) for general - information about using SSH keys for authentication. Please note that it is - mandatory to add your public key to MyCSC – copying it directly to a CSC - supercomputer does not work! - - Supported key types are Ed25519 and RSA 4096 through 16384. **We strongly - recommend Ed25519**. +--8<-- "using-ssh-keys.md" Connecting to CSC supercomputers using an SSH client requires setting up SSH keys. 
On macOS and Linux, you can use the `ssh-keygen` command-line utility for @@ -38,7 +31,7 @@ Overwrite (y/n)? Generally, you do not want to overwrite existing keys, so enter `n`, run `ssh-keygen` again and enter a different file name when prompted. See also the section on -[SSH key files with non-default name or location](#ssh-key-file-with-non-default-name-or-location). +[SSH key files with non-default name or location](#ssh-key-or-certificate-file-with-non-default-name-or-location). Next, you will be asked for a passphrase. Please choose a secure passphrase. It should be at least 8 characters long and contain numbers, @@ -47,48 +40,72 @@ generating an SSH key pair!** After you have generated an SSH key pair, you need to add the **public key** to the MyCSC portal. -[Read the instructions here](ssh-keys.md#adding-public-key-in-mycsc). +[Read the instructions here](ssh-keys.md#adding-public-key-in-mycsc). To +connect to Roihu, you must also +[sign your public key](ssh-keys.md#signing-public-key) to obtain a time-based +SSH certificate which is required for authentication. You may also wish to configure [authentication agent](#authentication-agent) to make using SSH keys more convenient. 
## Basic usage -After setting up SSH keys and adding your public key to MyCSC, you can create a -remote SSH connection by opening the terminal and running: +After setting up SSH keys, adding your public key to MyCSC, and downloading an +SSH certificate (**required for Roihu only**), you can create a remote SSH +connection by opening the terminal and running: ```bash # Replace with the name of your CSC user account and -# with "puhti" or "mahti" +# with "puhti", "mahti", "roihu-cpu" or "roihu-gpu" ssh @.csc.fi ``` -### SSH key file with non-default name or location +This assumes that the SSH keys (and certificate for Roihu) are saved in a +standard location using standard naming: + +- Private key: `~/.ssh/id_` +- Public key: `~/.ssh/id_.pub` +- Certificate: `~/.ssh/id_-cert.pub` + +where `` is either `ed25519` or `rsa`. -If you have stored your SSH key file with a non-default name or in a -non-default location (somewhere else than `~/.ssh/id_`), you must -tell the `ssh` command where to look for the key. Use option `-i` as follows: +### SSH key or certificate file with non-default name or location + +If you have stored your SSH key and/or certificate file with a non-default name +or in a non-default location, you must tell the `ssh` command where to look for +these files. 
Use option `-i` as follows: ```bash # Replace with the name of your CSC user account, -# with "puhti" or "mahti" and -# with the path to your SSH private key +# with "puhti", "mahti", "roihu-cpu" or "roihu-gpu", +# with the path to your SSH private key and +# with the path to your SSH certificate file (Roihu only) -ssh @.csc.fi -i +ssh @.csc.fi -i -i ``` Alternatively, you may specify the key location in the `~/.ssh/config` file: ```bash -Host .csc.fi +Host HostName .csc.fi User IdentityFile + CertificateFile +``` + +The `~/.ssh/config` file above would allow you to log in to `` simply +using: + +```bash +ssh ``` ## Graphical connection +--8<-- "graphical-connection.md" + Displaying graphics, such as GUIs and plots, over an SSH connection requires a window system. Linux systems have a server program for the X window system (X11) installed by default. On macOS you need to install one separately, for @@ -107,8 +124,8 @@ terminal. ## Authentication agent To avoid having to type your passphrase every time you connect to a CSC -supercomputer, the `ssh-agent` utility can hold your keys in memory. The -program's behavior depends on your system: +supercomputer, the `ssh-agent` utility can hold your SSH keys and certificates +in memory. The program's behavior depends on your system: - On Linux systems, `ssh-agent` is typically configured and run automatically at login and requires no additional actions on your part. 
@@ -121,31 +138,57 @@ program's behavior depends on your system: AddKeysToAgent yes ``` -Assuming your SSH private key is stored in `~/.ssh/id_ed25519`, add it to the -authentication agent by running: +- Assuming your SSH private key and certificate (required for Roihu only) are + stored in `~/.ssh/id_ed25519` and `~/.ssh/id_ed25519-cert.pub`, add them to + the authentication agent by running: -```bash -ssh-add ~/.ssh/id_ed25519 -``` + ```bash + $ ssh-add ~/.ssh/id_ed25519 + Enter passphrase for ~/.ssh/id_ed25519: # enter key passphrase here + Identity added: ~/.ssh/id_ed25519 + Certificate added: ~/.ssh/id_ed25519-cert.pub + ``` + + **This step is done automatically if you use the + [CSC certificate helper tool](ssh-keys.md#option-1-certificate-helper-tool-recommended) + to sign and download your SSH certificate!** + +!!! warning "Important note if you're not using the certificate helper tool" + Users downloading an SSH certificate + [manually from MyCSC](ssh-keys.md#option-2-mycsc) **must** store it in the + same directory as the SSH private key **and** name it as + `-cert.pub` to be able to add it to the SSH agent with the + `ssh-add` command. If successful, `ssh-add` outputs: + + ```bash + Certificate added: ~/.ssh/id_ed25519-cert.pub + ``` -For more information about `ssh-agent`, see the -[relevant SSH Academy tutorial](https://www.ssh.com/academy/ssh/agent). + **If the certificate is stored and/or named in any other way, it cannot be + added to the authentication agent because OpenSSH uses hard-coded naming + conventions.** + + * If you intend to connect to Roihu via a jump host (e.g. when transferring + data from another CSC server to Roihu), the SSH certificate must also be + added to the SSH agent so that it can be properly forwarded. + * Alternatively, you may connect to Roihu and **pull** data from servers + that do not require an SSH certificate (e.g. Puhti or Mahti). In this case + it is enough to forward only your SSH keys. 
+ * [Read more about SSH agent forwarding below](#ssh-agent-forwarding). ### SSH agent forwarding -!!! warning "Note" - You should only forward your SSH agent to remote servers that you trust and - only when you really need it. Forwarding your SSH agent by default to any - server you connect to is considered insecure. +--8<-- "ssh-agent-forwarding.md" Agent forwarding is a useful mechanism where the SSH client is configured to allow an SSH server to use your local `ssh-agent` on the server as if it was local there. This means in practice that you can, for example, connect directly -from Puhti to Mahti using the SSH keys you have set up on your local machine, -i.e. you do not need to create a new set of SSH keys on CSC supercomputers. +between CSC supercomputers using the SSH keys and certificates you have on your +local machine, i.e. you do not need to create a new set of SSH keys on CSC +supercomputers. -Agent forwarding is also very handy if you need to copy data between Puhti and -Mahti, or, for example, push to a private Git repository from CSC +Agent forwarding is also very handy if you need to copy data directly between +CSC supercomputers, or, for example, push to a private Git repository from CSC supercomputers. To enable agent forwarding, include the `-A` flag to your `ssh` command: @@ -157,8 +200,15 @@ ssh -A @.csc.fi Once connected, you may verify that SSH agent forwarding worked by running: ```bash -ssh-add -l +$ ssh-add -l ``` -If you see the fingerprint(s) of your SSH key(s) listed, agent forwarding is -working. +If you see the fingerprint(s) of your SSH key(s) and certificate(s) listed, +agent forwarding is working. Associated SSH keys and certificates in the +authentication agent have the same fingerprints and are annotated with +`` and `-CERT`, respectively. 
For example: + +```text +256 SHA256:ZXG7TvhDAWOv8VveFAlt/UYarsO9Nx5md4owX+FE5/M optional_comment (ED25519) +256 SHA256:ZXG7TvhDAWOv8VveFAlt/UYarsO9Nx5md4owX+FE5/M optional_comment (ED25519-CERT) +``` diff --git a/docs/computing/connecting/ssh-windows.md b/docs/computing/connecting/ssh-windows.md index 2fd454f0d8..9c466a6238 100644 --- a/docs/computing/connecting/ssh-windows.md +++ b/docs/computing/connecting/ssh-windows.md @@ -1,6 +1,6 @@ # SSH client on Windows ---8<-- "auth-update-ssh.md" +--8<-- "ssh-ca.md" There are various programs that can be used for creating a remote SSH connection on a Windows system. This page provides instructions for three @@ -8,14 +8,7 @@ popular alternatives: MobaXterm, PuTTY and PowerShell. ## Generating SSH keys -!!! info "Using SSH keys" - See the page on [setting up SSH keys](ssh-keys.md) for general - information about using SSH keys for authentication. Please note that it is - mandatory to add your public key to MyCSC – copying it directly to a CSC - supercomputer does not work! - - Supported key types are Ed25519 and RSA 4096 through 16384. **We strongly - recommend Ed25519**. +--8<-- "using-ssh-keys.md" === "MobaXterm" @@ -43,7 +36,7 @@ popular alternatives: MobaXterm, PuTTY and PowerShell. Generally, you do not want to overwrite existing keys, so enter `n`, run `ssh-keygen` again and enter a different file name when prompted. See also the section on - [SSH key files with non-default name or location](#ssh-key-file-with-non-default-name-or-location). + [SSH key files with non-default name or location](#ssh-key-or-certificate-file-with-non-default-name-or-location). Next, you will be asked for a passphrase. Please choose a secure passphrase. It should be at least 8 characters long and contain numbers, @@ -104,7 +97,7 @@ popular alternatives: MobaXterm, PuTTY and PowerShell. Generally, you do not want to overwrite existing keys, so enter `n`, run `ssh-keygen` again and enter a different file name when prompted. 
See also the section on - [SSH key files with non-default name or location](#ssh-key-file-with-non-default-name-or-location). + [SSH key files with non-default name or location](#ssh-key-or-certificate-file-with-non-default-name-or-location). Next, you will be asked for a passphrase. Please choose a secure passphrase. It should be at least 8 characters long and contain numbers, @@ -115,16 +108,19 @@ popular alternatives: MobaXterm, PuTTY and PowerShell. After you have generated an SSH key pair, you need to add the **public key** to the MyCSC portal. -[Read the instructions here](ssh-keys.md#adding-public-key-in-mycsc). +[Read the instructions here](ssh-keys.md#adding-public-key-in-mycsc). To +connect to Roihu, you must also +[sign your public key](ssh-keys.md#signing-public-key) to obtain a time-based +SSH certificate which is required for authentication. -You may also wish to configure -[authentication agent](#authentication-agent) to make using SSH keys -more convenient. +You may also wish to configure [authentication agent](#authentication-agent) to +make using SSH keys more convenient. ## Basic usage -After setting up SSH keys and adding your public key to MyCSC, you can connect -to a CSC supercomputer. +After setting up SSH keys, adding your public key to MyCSC and downloading an +SSH certificate (**required for Roihu only**) you can connect to a CSC +supercomputer. === "MobaXterm" @@ -132,11 +128,20 @@ to a CSC supercomputer. ```bash # Replace with the name of your CSC user account and - # with "puhti" or "mahti" + # with "puhti", "mahti", "roihu-cpu" or "roihu-gpu" ssh @.csc.fi ``` + This assumes that the SSH keys (and certificate for Roihu) are saved in a standard + location using standard naming: + + - Private key: `~/.ssh/id_` + - Public key: `~/.ssh/id_.pub` + - Certificate: `~/.ssh/id_-cert.pub` + + where `` is either `ed25519` or `rsa`. 
+ Alternatively, you may [connect using the GUI following this tutorial](https://csc-training.github.io/csc-env-eff/hands-on/connecting/ssh-puhti.html#connecting-from-windows). @@ -151,13 +156,13 @@ to a CSC supercomputer. | **Port** | `22` | | **Connection type** | `SSH` | - When creating a remote connection using PuTTY, select the private key file - under `Connection --> SSH --> Auth --> Credentials`. If you want the private - key to be used each time you connect, save your session to store your choice. - Finally, click `Open` and enter your CSC username and SSH key passphrase. + When creating a remote connection using PuTTY, select the private key and + certificate file (**only if connecting to Roihu**) under + `Connection --> SSH --> Auth --> Credentials`. Finally, click `Open` and + enter your CSC username and SSH key passphrase. - If you are connecting for the first time, PuTTY will ask if you trust the host. - Click `Accept`. + If you are connecting for the first time, PuTTY will ask if you trust the + host. Click `Accept`. === "PowerShell" @@ -165,11 +170,20 @@ to a CSC supercomputer. ```bash # Replace with the name of your CSC user account and - # with "puhti" or "mahti" + # with "puhti", "mahti", "roihu-cpu" or "roihu-gpu" ssh @.csc.fi ``` + This assumes that the SSH keys (and certificate for Roihu) are saved in a standard + location using standard naming: + + - Private key: `~/.ssh/id_` + - Public key: `~/.ssh/id_.pub` + - Certificate: `~/.ssh/id_-cert.pub` + + where `` is either `ed25519` or `rsa`. + !!! warning "Corrupted MAC on input" When connecting using the OpenSSH client software on Windows, you might encounter an error stating "Corrupted MAC on input". This is a known @@ -179,27 +193,25 @@ to a CSC supercomputer. 
--- -### SSH key file with non-default name or location +### SSH key or certificate file with non-default name or location If you are connecting via the MobaXterm terminal or PowerShell, and have stored -your SSH key file with a non-default name or in a non-default location -(somewhere else than `~/.ssh/id_`), you must tell the `ssh` command -where to look for the key. Use option `-i` as follows: +your SSH key and/or certificate file with a non-default name or in a +non-default location (somewhere else than `~/.ssh/id_`), you must +tell the `ssh` command where to look for these files. Use option `-i` as +follows: ```bash # Replace with the name of your CSC user account, # with "puhti" or "mahti" and # with the path to your SSH private key -ssh @.csc.fi -i +ssh @.csc.fi -i -i ``` ## Graphical connection -!!! info "Note" - For performance reasons, we generally recommend using the - [HPC web interfaces](../webinterface/index.md) to run applications which - require displaying graphics. +--8<-- "graphical-connection.md" === "MobaXterm" @@ -243,18 +255,35 @@ ssh @.csc.fi -i ## Authentication agent +!!! warning "CSC certificate helper is recommended to simplify working with SSH agent on Windows" + [The certificate helper tool](ssh-keys.md#option-1-certificate-helper-tool-recommended) + developed by CSC simplifies the process of signing and downloading SSH + certificates for connecting to Roihu. Importantly, it also automatically + adds your SSH keys and certificate to your OpenSSH and/or Pageant + authentication agent. + === "MobaXterm" - To avoid having to type your passphrase every time you connect, enable the - MobAgent authentication agent in the program settings (`Settings --> + MobaXterm supports three different SSH agents – Pageant, MobAgent and + Windows `ssh-agent`. They can all be used at the same time if you wish. If + you use the CSC certificate helper tool for managing SSH certificates for + Roihu, **we recommend using Pageant**. 
+ + Authentication agents are enabled in the program settings (`Settings --> Configuration --> SSH --> SSH agents`). - 1. Toggle the option `Use internal SSH agent "MobAgent"`. - 2. Click the `+` button and select the private key you want to load at - MobAgent startup. - 3. Click `OK` and restart MobaXterm. You'll be prompted to enter your key + 1. Toggle the SSH agent(s) you wish to use: + 1. For **MobAgent**, you need to click the `+` button and select the + private key(s) you want to load at startup. + 2. For **Pageant** (PuTTY agent), you must make sure Pageant is running + and holds the keys/certificates you wish to use. See the PuTTY tab + for instructions. + 3. For **`ssh-agent`** (Windows SSH agent), you must make sure + `ssh-agent` service is running and holds the keys/certificates you + wish to use. See the PowerShell tab for instructions. + 2. Click `OK` and restart MobaXterm. You'll be prompted to enter your key passphrase. - 4. You may now connect to CSC supercomputers without having to type your + 3. You may now connect to CSC supercomputers without having to type your passphrase again. === "PuTTY" @@ -270,7 +299,7 @@ ssh @.csc.fi -i no keys, so the list box will be empty. 3. Press the `Add Key` button to add a key to Pageant. 4. Find your private key file in the `Select Private Key File` dialog, and - press `Open`. Pageant will ask you to enter the key passhphrase. + press `Open`. Pageant will ask you to enter the key passphrase. 5. Now start PuTTY and open an SSH session to any CSC supercomputer. PuTTY will notice that Pageant is running, retrieve the key automatically from Pageant, and use it to authenticate. 
You may now open as many PuTTY @@ -278,28 +307,90 @@ ssh @.csc.fi -i === "PowerShell" - To avoid having to type your passphrase every time you connect, - you can - [configure the Windows SSH agent](https://learn.microsoft.com/en-us/windows-server/administration/openssh/openssh_keymanagement?source=recommendations#user-key-generation) - to store your keys in memory for the duration of your local login session. + `ssh-agent` service is usually stopped or disabled in Windows by default, + and starting it requires administrator privileges. + + Run the following commands in an elevated PowerShell prompt: + + ```powershell + # Configure ssh-agent to start automatically. + Get-Service ssh-agent | Set-Service -StartupType Automatic + + # Start the service. + Start-Service ssh-agent + + # The following command should return a status of Running. + Get-Service ssh-agent + + # Load your key files into ssh-agent. + ssh-add $env:USERPROFILE\.ssh\id_ed25519 + ``` + + After you add the key to the `ssh-agent` service on your client, the + `ssh-agent` service automatically retrieves the local private key (and + certificate) and passes it to your SSH client. --- +!!! warning "Important note if you're not using the certificate helper tool" + Users downloading SSH certificates + [manually from MyCSC](ssh-keys.md#option-2-mycsc) must perform some extra + steps to be able to add their certificate to SSH agents. + + === "MobAgent & Pageant" + + To add your SSH certificate to MobAgent or PuTTY, you must first + "combine" the certificate and the PuTTY `.ppk` private key. + + 1. Open MobaKeyGen (_Tools_ tab of MobaXterm) or PuTTYgen. + 2. Load your private key (`File --> Load private key`). + 3. Add a valid certificate to the key (`Key --> Add certificate to key`). + The validity period can be checked by selecting `Certificate info`. + 4. Save the private key as `-cert.ppk`, e.g. + `id_ed25519-cert.ppk`. + 5. 
The new private key including the certificate can now be added to + MobAgent and/or Pageant following the previous instructions. A + successfully combined key and certificate will show up as `Ed25519 + cert` in MobAgent/Pageant. + + === "Windows SSH agent" + + Users of Windows `ssh-agent` **must** make sure to store their manually + downloaded SSH certificate in the same directory as the SSH private key + **and** name it as `-cert.pub` to be able to add it + to the SSH agent with the `ssh-add` command. If successful, `ssh-add` outputs: + + ```bash + Certificate added: C:\Users\\.ssh\id_ed25519-cert.pub + ``` + + **If the certificate is stored and/or named in any other way, it cannot be + added to the authentication agent because OpenSSH uses hard-coded naming + conventions.** + + **Please note**: + + * If you intend to connect to Roihu via a jump host (e.g. when transferring + data from another CSC server to Roihu), the SSH certificate **must** + also be added to the SSH agent so that it can be properly forwarded. + * Alternatively, you may connect to Roihu and **pull** data from servers + that do not require an SSH certificate (e.g. Puhti or Mahti). In this case + it is enough to forward only your SSH keys. + * [Read more about SSH agent forwarding below](#ssh-agent-forwarding). + ### SSH agent forwarding -!!! warning "Note" - You should only forward your SSH agent to remote servers that you trust and - only when you really need it. Forwarding your SSH agent by default to any - server you connect to is considered insecure. +--8<-- "ssh-agent-forwarding.md" Agent forwarding is a useful mechanism where the SSH client is configured to allow an SSH server to use your local `ssh-agent` on the server as if it was local there. This means in practice that you can, for example, connect directly -from Puhti to Mahti using the SSH keys you have set up on your local machine, -i.e. you do not need to create a new set of SSH keys on CSC supercomputers.
+between CSC supercomputers using the SSH keys and certificates you have on your +local machine, i.e. you do not need to create a new set of SSH keys on CSC +supercomputers. -Agent forwarding is also very handy if you need to copy data between Puhti and -Mahti, or, for example, push to a private Git repository from CSC +Agent forwarding is also very handy if you need to copy data directly between +CSC supercomputers, or, for example, push to a private Git repository from CSC supercomputers. === "MobaXterm" @@ -338,5 +429,15 @@ Once connected, you may verify that SSH agent forwarding worked by running: ssh-add -l ``` -If you see the fingerprint(s) of your SSH key(s) listed, agent forwarding is -working. +If you see the fingerprint(s) of your SSH key(s) and certificate(s) listed, +agent forwarding is working. Associated SSH keys and certificates in the +authentication agent have the same fingerprints and are annotated with +`` and `-CERT`, respectively. For example: + +```text +256 SHA256:ZXG7TvhDAWOv8VveFAlt/UYarsO9Nx5md4owX+FE5/M optional_comment (ED25519) +256 SHA256:ZXG7TvhDAWOv8VveFAlt/UYarsO9Nx5md4owX+FE5/M optional_comment (ED25519-CERT) +``` + +If you're using a combined SSH key and certificate file (PuTTYgen and +MobaKeyGen methods), you should only see the `-CERT` line. diff --git a/docs/computing/index.md b/docs/computing/index.md index e31b7986c9..d507fbc8d4 100644 --- a/docs/computing/index.md +++ b/docs/computing/index.md @@ -12,7 +12,7 @@ [Learn more about Roihu :material-arrow-right:](systems-roihu.md) -Puhti and Mahti are CSC's supercomputers. Puhti has been available for CSC users +Puhti, Mahti and Roihu are CSC's supercomputers. Puhti has been available for CSC users since 2 September 2019 and Mahti has been available since 26 August 2020. LUMI is one of the pan-European pre-exascale supercomputers, located in CSC's data center in Kajaani. 
The CPU partition of LUMI (LUMI-C) has been available since @@ -35,8 +35,7 @@ basics of [Linux command line usage](../support/tutorials/env-guide/index.md) be For an overview of the LUMI supercomputer, see [the LUMI documentation](https://docs.lumi-supercomputer.eu/hardware/). - -## Accessing Puhti and Mahti +## Accessing CSC supercomputers To be able to use CSC's supercomputers, you need to have a CSC user account that belongs to a computing project which has access to the respective supercomputers. @@ -52,41 +51,46 @@ of this user guide. ## Connecting to the supercomputers ---8<-- "auth-update-ssh.md" +--8<-- "ssh-ca.md" Connect using an SSH client: ```bash -ssh yourcscusername@puhti.csc.fi -``` - -or - -```bash -ssh yourcscusername@mahti.csc.fi +ssh username@puhti.csc.fi +# or +ssh username@mahti.csc.fi ``` -This will connect you to one of the login nodes. If you need to connect -to a specific login node, use the command: +Roihu has separate login nodes for the CPU and GPU partitions: ```bash -ssh yourcscusername@puhti-login[11-12,14-15].csc.fi +ssh username@roihu-cpu.csc.fi +# or +ssh username@roihu-gpu.csc.fi ``` -or +These commands will connect you to one of the login nodes. If you need to +connect to a specific login node, use the commands: ```bash -ssh yourcscusername@mahti-login[11-12,14-15].csc.fi +ssh username@puhti-login[11-12,14-15].csc.fi +# or +ssh username@mahti-login[11-12,14-15].csc.fi +# or +ssh username@roihu-cpu-login[1-4].csc.fi +# or +ssh username@roihu-gpu-login[1-2].csc.fi ``` -Where `yourcscusername` is the username you get from CSC. +Where `username` is the username you get from CSC. For more details, see the [connecting](connecting/index.md) page. -Puhti and Mahti can also be accessed via their respective +Puhti, Mahti and Roihu can also be accessed via their respective [web interfaces](webinterface/index.md) available at -[www.puhti.csc.fi](https://www.puhti.csc.fi) and -[www.mahti.csc.fi](https://www.mahti.csc.fi). 
+[www.puhti.csc.fi](https://www.puhti.csc.fi), +[www.mahti.csc.fi](https://www.mahti.csc.fi) and +[www.roihu.csc.fi](https://www.roihu.csc.fi). ### Scalability @@ -144,13 +148,13 @@ The [disk areas](disk.md) of your projects can be checked with the command: csc-workspaces ``` -## Using Puhti and Mahti +## Using CSC supercomputers * [Systems](available-systems.md): What computational resources are available * [Usage policy](usage-policy.md): Usage policy of CSC supercomputers * [Connecting](connecting/index.md): How to connect to CSC supercomputers -* [Puhti web interface](webinterface/index.md): How to connect to Puhti using the web - interface +* [Web interfaces](webinterface/index.md): How to connect to CSC supercomputers + using the web interfaces * [Disk areas](disk.md): What places are there for storing data on CSC supercomputers * [Modules](modules.md): How to find the programs you need diff --git a/docs/computing/systems-roihu.md b/docs/computing/systems-roihu.md index bbebd6c4fe..99e75f128e 100644 --- a/docs/computing/systems-roihu.md +++ b/docs/computing/systems-roihu.md @@ -143,6 +143,7 @@ interactive access and running graphical user interfaces. ## More information +* [Getting started with Roihu](../support/tutorials/roihu.md) * [Frequently asked questions](../support/faq/roihu.md) * [See the latest Roihu presentation slides](https://a3s.fi/docs-files/roihu-presentation.pdf) (updated 2026-02-25) diff --git a/docs/data/moving/disk_mount.md b/docs/data/moving/disk_mount.md index 247bf73a5c..2aec86b981 100644 --- a/docs/data/moving/disk_mount.md +++ b/docs/data/moving/disk_mount.md @@ -1,6 +1,6 @@ # Remote disk mounts ---8<-- "auth-update-ssh.md" +--8<-- "ssh-ca.md" With remote disk mounts you can access your CSC directories in a way that resembles the usage of an external disk or USB memory stick.
Using this diff --git a/docs/data/moving/graphical_transfer.md b/docs/data/moving/graphical_transfer.md index a5108cd7e8..29fc8a0ccc 100644 --- a/docs/data/moving/graphical_transfer.md +++ b/docs/data/moving/graphical_transfer.md @@ -35,6 +35,17 @@ For example, use the following settings for connecting to Puhti: Click _Connect_. If it is the first time you're connecting, FileZilla will ask if you trust the host, and then prompt you for your SSH key passphrase. +!!! warning "Important: Connecting to Roihu" + There is no way to manually provide FileZilla an SSH certificate for + connecting to Roihu. Instead, to form a connection to Roihu using + FileZilla, please ensure that you've got Pageant SSH agent running and that + it holds a valid SSH certificate. FileZilla will then fetch it from the + agent automatically. + + [See the SSH certificate instructions here](../../computing/connecting/ssh-keys.md#signing-public-key). + We recommend using the + [CSC certificate helper tool](../../computing/connecting/ssh-keys.md#option-1-certificate-helper-tool-recommended). + Once the connection is opened, FileZilla shows two interactive file listings side by side. On the left side you have your local file system and on the right site the remote file system (e.g. files on Puhti). You can change your location @@ -76,6 +87,23 @@ Click the _Advanced_ button and open the _SSH_ > _Authentication_ tab. Enter the path to your SSH private key in the _Private key file_ field and click _OK_. +!!! warning "Important: Connecting to Roihu" + If you're connecting to Roihu, please specify a `.ppk` file that includes a + valid SSH certificate in the _Private key file_ field (e.g. + `C:\Users\\.ssh\id_ed25519-cert.ppk`). 
+ + Alternatively, if you've added this key to Pageant, you can simply leave + the _Private key file_ field empty – WinSCP will fetch it from the agent + automatically if you've toggled the _Attempt authentication using Pageant_ + option in the _SSH_ > _Authentication_ tab (on by default). Please note + that if you specify a key that does **not** include a valid certificate, + WinSCP will try to use this instead of Pageant. It is thus important that + you leave the field empty. + + [See the SSH certificate instructions here](../../computing/connecting/ssh-keys.md#signing-public-key). + We recommend using the + [CSC certificate helper tool](../../computing/connecting/ssh-keys.md#option-1-certificate-helper-tool-recommended). + ![WinSCP advanced site settings to add ssh private key](https://a3s.fi/docs-files/winscp-ssh-key-add.png 'Add SSH key to WinSCP') Click _Login_ to connect. If it is the first time you're connecting, WinSCP diff --git a/docs/data/moving/rsync.md b/docs/data/moving/rsync.md index 9ebe2260bb..a185b190dd 100644 --- a/docs/data/moving/rsync.md +++ b/docs/data/moving/rsync.md @@ -1,6 +1,6 @@ # Using rsync for data transfer and synchronization ---8<-- "auth-update-ssh.md" +--8<-- "ssh-ca.md" **Rsync** is a data transfer tool that can be used much like the `scp` command. When transferring data, `rsync` checks the difference between the source and @@ -71,21 +71,25 @@ rsync -rP @puhti.csc.fi:/path/to/target/folder /path/to/local ``` !!! info "Note" - If you have stored your SSH key file with a non-default name or in a - non-default location (somewhere else than `~/.ssh/id_`), you can + If you have stored your SSH key and/or certificate file with a non-default + name or in a non-default location (somewhere else than + `~/.ssh/id_` or `~/.ssh/id_-cert.pub`), you can specify where `rsync` should look for the key using the `-e` option. 
For example: ```bash - rsync -rP -e "ssh -i /path/to/private/key" /path/to/local/folder @puhti.csc.fi:/path/to/target + rsync -rP -e "ssh -i /path/to/private/key -i /path/to/certificate" /path/to/local/folder @:/path/to/target ``` + Note that SSH certificates are required for connecting to Roihu only. + ## Using rsync to transfer data directly between CSC supercomputers To transfer data directly between CSC supercomputers, you must be able to access the SSH keys you've set up on your local workstation for authenticating to CSC -supercomputers. This is accomplished by forwarding your SSH agent to the -supercomputer you're first connecting to. +supercomputers. For Roihu, a valid SSH certificate is also needed. This is +accomplished by forwarding your SSH agent including your SSH keys (and +certificate) to the supercomputer you're first connecting to. - [SSH agent forwarding instructions for Linux/macOS](../../computing/connecting/ssh-unix.md#ssh-agent-forwarding) - [SSH agent forwarding instructions for Windows](../../computing/connecting/ssh-windows.md#ssh-agent-forwarding) diff --git a/docs/data/moving/scp.md b/docs/data/moving/scp.md index 04cb316d3b..d306d53ee0 100644 --- a/docs/data/moving/scp.md +++ b/docs/data/moving/scp.md @@ -1,6 +1,6 @@ # Copying files using scp ---8<-- "auth-update-ssh.md" +--8<-- "ssh-ca.md" Copying files between different Linux, macOS and Windows machines can be done with the `scp` command. Thus, you can use `scp` to transport data between CSC @@ -19,18 +19,19 @@ machine is: scp username@server:/path/to/file /path/to/local/destination ``` -!!! info "Non-standard location or name for SSH keys" - If you have stored your SSH key file with a non-default name or in a - non-default location (somewhere else than `~/.ssh/id_`), you - must specify where `scp` should look for the key using the `-i` option, - e.g: +!!! 
info "Non-standard location or name for SSH keys and/or certificates" + If you have stored your SSH key or certificate file with a non-default name + or in a non-default location (somewhere else than `~/.ssh/id_` + and `~/.ssh/id_-cert.pub`), you must specify where `scp` should + look for the files using the `-i` option, e.g: ```bash - scp -i /path/to/sshkey /path/to/file username@server:/path/to/remote/destination + scp -i /path/to/sshkey -i /path/to/cert /path/to/file @:/path/to/remote/destination ``` - The rest of this page assumes the key is stored in a default location using - a standard name, so the `-i` flag is omitted. + The rest of this page assumes the key and certificate are stored in a + default location using standard naming, so the `-i` flag is omitted. Note + that SSH certificates are required for connecting to Roihu only. ## Using scp to copy data between your local computer and Puhti @@ -87,8 +88,9 @@ access mode information from the original file. To copy data directly between CSC supercomputers, `scp` must be able to access the SSH keys you've set up on your local workstation for authenticating to CSC -supercomputers. This is accomplished by forwarding your SSH agent to the -supercomputer you're first connecting to. +supercomputers. For Roihu, a valid SSH certificate is also needed. This is +accomplished by forwarding your SSH agent including your SSH keys (and +certificate) to the supercomputer you're first connecting to. 
- [SSH agent forwarding instructions for Linux/macOS](../../computing/connecting/ssh-unix.md#ssh-agent-forwarding) - [SSH agent forwarding instructions for Windows](../../computing/connecting/ssh-windows.md#ssh-agent-forwarding) diff --git a/docs/data/moving/tar_ssh.md b/docs/data/moving/tar_ssh.md index f3ec1f5daa..260dd07114 100644 --- a/docs/data/moving/tar_ssh.md +++ b/docs/data/moving/tar_ssh.md @@ -1,6 +1,6 @@ # Using Tar over SSH to move many files ---8<-- "auth-update-ssh.md" +--8<-- "ssh-ca.md" Linux tools such as `scp` and `rsync` are commonly used to transfer files between a remote server and a local machine. However, these tools are not @@ -37,16 +37,18 @@ tar c myfiles | ssh @puhti.csc.fi 'cat > /scratch/project_2001234/myfi ``` !!! info "Note" - If you have stored your SSH key file with a non-default name or in a - non-default location (somewhere else than `~/.ssh/id_`), you must - specify where `ssh` should look for the key using the `-i` option, e.g: + If you have stored your SSH key or certificate file with a non-default name + or in a non-default location (somewhere else than `~/.ssh/id_` + and `~/.ssh/id_-cert.pub`), you must specify where `ssh` should + look for the files using the `-i` option, e.g: ```bash - tar c myfiles | ssh -i @puhti.csc.fi 'cat > /scratch/project_2001234/myfiles.tar' + tar c myfiles | ssh -i -i @ 'cat > /scratch/project_2001234/myfiles.tar' ``` - The rest of this page assumes the key is stored in a default location using - a standard name, so the `-i` flag is omitted. + The rest of this page assumes the key and certificate are stored in a + default location using standard naming, so the `-i` flag is omitted. Note + that SSH certificates are required for connecting to Roihu only. 
To extract the tar archive at the same time, replace the `cat` command as: @@ -90,8 +92,9 @@ ssh @puhti.csc.fi 'tar c -C /scratch/project_2001234 myfiles' | tar x To transfer data directly between CSC supercomputers, you must be able to access the SSH keys you've set up on your local workstation for authenticating to CSC -supercomputers. This is accomplished by forwarding your SSH agent to the -supercomputer you're first connecting to. +supercomputers. For Roihu, a valid SSH certificate is also needed. This is +accomplished by forwarding your SSH agent including your SSH keys (and +certificate) to the supercomputer you're first connecting to. - [SSH agent forwarding instructions for Linux/macOS](../../computing/connecting/ssh-unix.md#ssh-agent-forwarding) - [SSH agent forwarding instructions for Windows](../../computing/connecting/ssh-windows.md#ssh-agent-forwarding) diff --git a/docs/index.md b/docs/index.md index 0913b7c494..3cadde16d2 100644 --- a/docs/index.md +++ b/docs/index.md @@ -21,14 +21,14 @@ template: home.html [Getting started with supercomputing :material-arrow-right:](support/tutorials/hpc-quick.md) - [Puhti and Mahti Overview :material-arrow-right:](computing/index.md) + [Getting started with Roihu :material-arrow-right:](support/tutorials/roihu.md) + + [Getting started with LUMI :material-open-in-new:](https://docs.lumi-supercomputer.eu/firststeps/getstarted/){ target=_blank } [Cloud services Overview :material-arrow-right:](cloud/index.md) [What is DBaaS :material-arrow-right:](cloud/dbaas/what-is-dbaas.md) - [Getting started with LUMI :material-open-in-new:](https://docs.lumi-supercomputer.eu/firststeps/getstarted/){ target=_blank } - [How to get support :material-arrow-right:](support/contact.md) - ## User guides diff --git a/docs/support/tutorials/index.md b/docs/support/tutorials/index.md index b4a49e85c0..97e2defeae 100644 --- a/docs/support/tutorials/index.md +++ b/docs/support/tutorials/index.md @@ -17,6 +17,11 @@ * [Using Python on CSC 
supercomputers](python-usage-guide.md) * [Setting up SSH keys at CSC](https://csc-training.github.io/csc-env-eff/hands-on/connecting/ssh-keys.html) +## Roihu + +* [Getting started with Roihu](roihu.md) +* [Roihu data migration guide](roihu-data.md) + ## Installation of tools on supercomputers * [Installing software with Spack](user-spack.md) diff --git a/docs/support/tutorials/ml-guide.md b/docs/support/tutorials/ml-guide.md index fa1a3901e9..15e4240374 100644 --- a/docs/support/tutorials/ml-guide.md +++ b/docs/support/tutorials/ml-guide.md @@ -65,7 +65,7 @@ European LUMI supercomputer. If you are [unsure which supercomputer to choose, read the discussion here](gpu-ml.md#puhti-mahti-or-lumi). If you are a new user, please read [how to access Puhti and -Mahti](../../computing/index.md#accessing-puhti-and-mahti), and [how +Mahti](../../computing/index.md#accessing-csc-supercomputers), and [how to submit computing jobs](../../computing/running/getting-started.md). If you have opted for LUMI read the [LUMI Get Started diff --git a/docs/support/tutorials/roihu-data.md b/docs/support/tutorials/roihu-data.md new file mode 100644 index 0000000000..a681142492 --- /dev/null +++ b/docs/support/tutorials/roihu-data.md @@ -0,0 +1,452 @@ +# Roihu data migration guide + +??? info "About this guide" + This guide is divided into four parts: + + 1. [General guidelines and prerequisites](#1-general-guidelines-and-prerequisites) + 2. [Recommended data migration methods](#2-recommended-data-migration-methods) + 3. [Special cases](#3-special-cases) + 4. [Discouraged methods](#4-discouraged-methods) + + Please read the + [General guidelines and prerequisites](#1-general-guidelines-and-prerequisites) + section before migrating any data to Roihu. If your data migration needs + are small and simple, checking the + [Basic rsync](#21-basic-rsync) example may suffice. If you have **a lot** + of data or other special requirements, please also read the other sections + carefully. + +## 1. 
General guidelines and prerequisites + +### 1.1 Review and clean up your data before migration + +* Roihu scratch disk is not intended for long-term data storage, but should + only be used for data that is in active use. Thus, **only move data that you + truly need**. + * Good data hygiene reduces transfer time and load on the file system, as + well as eliminates the risk of moving redundant or duplicate data. Roihu + will implement a similar disk cleaning policy as Puhti, meaning that + files that have not been accessed in 180 days will be deleted. + * We recommend using the [LUE tool](lue.md) to identify where you have lots of + data. Avoid using tools such as `du` as they may cause a lot of load on + the file system. Simple usage example (run `lue -h` for other options): + + ```bash + module load lue + lue + ``` + +* Other tips: + * Remove or exclude temporary files (cached data, intermediate results, + logs, unnecessary checkpoint files, core dumps, etc.). + * Apptainer containers built for CPUs can be moved to Roihu. Do **not** + move containers targeting GPUs or any native installations from Puhti or + Mahti to Roihu. These must be re-built from scratch for best performance, + or in order for them to work at all (GPU nodes have ARM CPU + architecture). + [More about installing software on Roihu](roihu.md#installing-software). + +### 1.2 Ensure that you have enough disk space on Roihu + +* Once you have identified the data you need to transfer, check that it + fits within the default disk quotas on Roihu: + + | Disk area | Path | Default size | Max. size [^1] | Default file number limit | Max. 
file number limit [^1] | + |-----------|-----------------------|-------------:|--------------------:|--------------------------:|----------------------------:| + | Home | `/users/$USER` | 15 GiB | 15 GiB | 150k | 150k | + | ProjAppl | `/projappl/` | 15 GiB | 250 GiB (< 100 GiB) | 150k | 2.5M (< 1M) | + | ProjData | `/projdata/` | 0 GiB | case-by-case | 0 | case-by-case | + | Scratch | `/scratch/` | 1 TiB | 100 TiB (< 10 TiB) | 1M | 10M (< 5M) | + + [^1]: Values in parentheses indicate automatically approved limits. + +* Please note that existing quota extensions on Puhti/Mahti will not + automatically carry over to Roihu, so you must separately + [apply for increased disk quota](../../accounts/how-to-increase-disk-quotas.md) + via [MyCSC](https://my.csc.fi) beforehand if your data does not fit + within the default limits. + +??? info "New ProjData disk area on Roihu" + Users may apply for new "dataset projects" (cf. regular computing projects) + to get access to a new disk area on Roihu – **ProjData**. This disk area + allows storing datasets on the disk for a longer time (no cleaning, + lifetime is limited by the data project lifetime). Read access to the data + can be shared globally, or with specific project IDs. + + Dataset projects and ProjData quotas are applied for and managed via + [MyCSC](https://my.csc.fi). ProjData quota consumes Storage BUs. + +### 1.3 Add Roihu service access to your CSC project + +* Like any other CSC service, access to Roihu must be enabled for your project + via [MyCSC](https://my.csc.fi). +* Note also that users must have at least a **medium** level of identity + assurance (LoA) to be able to access Roihu. You can check your LoA on your + [profile page in MyCSC](https://my.csc.fi/profile), and + [elevate it if needed following these instructions](../../accounts/strong-identification.md).
+ +### 1.4 Transfer your data directly from Puhti/Mahti to Roihu + +* It is **not** recommended to transfer data to Roihu via Allas or your local + workstation. Instead, CSC recommends using command-line based tools such as + [`rsync`](#2-recommended-data-migration-methods) to **directly transfer data + from Puhti/Mahti to Roihu.** + +!!! warning "Extremely important" + + ### 1.5 Connecting to Roihu requires SSH certificates + + * In addition to SSH keys, a signed **SSH certificate** is required to + connect to Roihu over SSH. + [Read the instructions for getting and using SSH certificates here](../../computing/connecting/ssh-keys.md#signing-public-key). + * To transfer data directly from Puhti/Mahti to Roihu, you must **forward + your SSH agent** when connecting to the system where you launch the data + transfer process. + 1. **[SSH agent instructions for Linux/macOS](../../computing/connecting/ssh-unix.md#authentication-agent).** + 2. **[SSH agent instructions for Windows](../../computing/connecting/ssh-windows.md#authentication-agent).** + * We **strongly** recommend using the + [certificate helper tool](../../computing/connecting/ssh-keys.md#option-1-certificate-helper-tool-recommended) + developed by CSC to simplify the process. + +## 2. Recommended data migration methods + +* **`rsync`** is the preferred tool for transferring data from Puhti or Mahti + to Roihu. [Read more about `rsync` here](../../data/moving/rsync.md). +* **We will use Puhti as an example**, but the exact same steps apply for Mahti + also. Simply replace all occurrences of `puhti` in host names etc. with + `mahti`. +* All examples require that you've **forwarded your SSH agent** including your + **SSH keys** and a **valid SSH certificate** to Puhti when connecting. +* Before starting the data transfer, **ensure that the target directory on + Roihu exists and is writable**. + +??? info "Help! What to do if I struggle to add my SSH certificate to the SSH agent?" 
+ Alternatively, you may log in to Roihu and **pull** data from Puhti. + Because connecting to Puhti does not require an SSH certificate, it is + enough that the forwarded SSH agent holds your SSH keys. + + Note that you will still need a valid SSH certificate when connecting to + Roihu in the first place, **but it does not have to be added to your SSH + agent**. + +### 2.1 Basic `rsync` + +1. [Obtain an SSH certificate](../../computing/connecting/ssh-keys.md#signing-public-key). +2. Add your SSH keys and certificate to your SSH agent. + 1. [Instructions for Linux/macOS](../../computing/connecting/ssh-unix.md#authentication-agent). + 2. [Instructions for Windows](../../computing/connecting/ssh-windows.md#authentication-agent). +3. Log in to Puhti with SSH agent forwarding turned on. + * [Instructions for Linux/macOS](../../computing/connecting/ssh-unix.md#ssh-agent-forwarding). + * [Instructions for Windows](../../computing/connecting/ssh-windows.md#ssh-agent-forwarding). +4. On the login node, transfer directory `/scratch/project_2001234/my-data` + from Puhti to directory `/scratch/project_2001234/` on Roihu. + + ```bash + rsync -aP /scratch/project_2001234/my-data $USER@roihu-cpu.csc.fi:/scratch/project_2001234/ + ``` + + | Option | Description | + |--------|-----------------------------------------------------------------------------------------------------------------------| + | `-a` | Use archive mode: copy files and directories recursively, preserve access permissions, timestamps and symbolic links. | + | `-P` | Keep partially transferred files and show progress during transfer. | + +5. Alternatively, if you've connected to Roihu and are **pulling** data from + Puhti, use the command: + + ```bash + rsync -aP $USER@puhti.csc.fi:/scratch/project_2001234/my-data /scratch/project_2001234/ + ``` + +**The `rsync -aP` command is suitable if:** + +1. The **number of files to transfer is small** (<1000) or the **files are + large** enough (>1 MB on average). 
+ * If not, please + [archive](#23-migrating-data-with-large-amounts-of-small-files) and, + optionally, [compress the data](#31-data-compression) before transfer. +2. You are transferring your own files **or** resulting file ownership on Roihu + does **not** matter. + * You will own all files that you transfer to Roihu irrespective of who the + owner on Puhti is. + +??? info "Note! The trailing `/` character has a meaning in `rsync` commands!" + A trailing `/` character affects what gets transferred **from the source**. + If the source path ends with `/`, then all contents of the directory will + get copied, but not the directory itself. To transfer the directory itself + (and the contents), leave out the trailing `/` as in the previous example. + +??? info "How long will my data migration take?" + The table below can be used as a rough reference for how long certain data + transfers using `rsync` will take. + + | Number of files | Average file size | Total size | Duration | Notes | + |-:|-:|-:|-:|-| + | 1 | 1 GB | 1 GB | 6 s | + | 10 | 100 MB | 1 GB | 6 s | + | 100 | 10 MB | 1 GB | 6 s | + | 1000 | 1 MB | 1 GB | 11 s | Small-file overhead increases, [please archive](#23-migrating-data-with-large-amounts-of-small-files)! + | 10000 | 100 kB | 1 GB | 45 s | Small-file overhead increases, [please archive](#23-migrating-data-with-large-amounts-of-small-files)! + | 1 | 10 GB | 10 GB | ~1 min | + | 10 | 1 GB | 10 GB | ~1 min | + | 100 | 100 MB | 10 GB | ~1 min | + | 1000 | 10 MB | 10 GB | ~1 min | + | 1 | 100 GB | 100 GB | ~11 min | + | 1 | 1 TB | 1 TB | ~ 2 h | + + Please note that the actual performance may vary based on the current + system load. If you need to transfer thousands of small files (<1 MB), + [pack them into a single archive file for better performance](#23-migrating-data-with-large-amounts-of-small-files). + +### 2.2 Performing a dry run + +It may be useful to perform a dry run before starting the actual `rsync` +process. 
Add the option `-n` to your `rsync` command: + +```bash +rsync -anP /scratch/project_2001234/my-data $USER@roihu-cpu.csc.fi:/scratch/project_2001234/ +``` + +This command does not transfer anything, it simply shows what would happen if +you were to run `rsync` without the `-n` option. + +??? warning "Note! A dry run will not catch errors that would be caused by insufficient permissions" + An `rsync` dry run will **not** catch errors caused by insufficient + permissions. In other words, it assumes that: + + 1. You have read and execute permissions for all files and directories, + respectively, that you are trying to migrate from Puhti. + 2. You have write permission on the destination (Roihu). + + To list files and directories that you cannot transfer due to insufficient + permissions, try: + + ```bash + find /scratch/project_2001234/my-data ! -readable 2> /dev/null + ``` + + To check if the destination is writable, try: + + ```bash + ssh $USER@roihu-cpu.csc.fi "touch /scratch/project_2001234/.test && rm /scratch/project_2001234/.test" + ``` + + Missing write permissions will cause a `Permission denied` error. If the + destination does not exist, you will get a `No such file or directory` + error. + +### 2.3 Migrating data with large amounts of small files + +If the data you need to migrate contains thousands of small files, it is +recommended to **archive** the data before transferring it, i.e. pack all files +into a single file. Most data transfer tools handle one large file far better +than thousands of small ones. + +1. Assuming you want to migrate the directory + `/scratch/project_2001234/my-data` from Puhti to Roihu, create (`c`) an + archive of it as follows: + + ```bash + cd /scratch/project_2001234 + tar cf my-data.tar my-data + ``` + +2. Transfer the archived dataset `my-data.tar` to Roihu [using `rsync`](#21-basic-rsync). +3. Extract (`x`) the data on Roihu with: + + ```bash + tar xf my-data.tar + ``` + +4. 
[Read more about `tar` here](env-guide/packing-and-compression-tools.md#tar-packing-several-files-into-one-file). + +??? info "Mind your disk quota!" + Archiving creates new data on the disk. If your dataset is large, you may + end up running out of disk quota since the operation will essentially + double your disk usage (unless the archive is also + [compressed](#31-data-compression)). + + A trick to avoid creating new data on Puhti disk is to pipe the output of + `tar` to Roihu directly over SSH. Use the command: + + ```bash + tar c -C /scratch/project_2001234 my-data | ssh $USER@roihu-cpu.csc.fi 'cat > /scratch/project_2001234/my-data.tar' + ``` + + [Read more about using `tar` over SSH](../../data/moving/tar_ssh.md). + +## 3. Special cases + +### 3.1 Data compression + +Data compression can be useful to save storage space and make data transfer +faster, **but it may take a lot of time**. Data compression is CPU intensive +and compressing large datasets may easily take **several hours**. + +The compressibility of files depends on their content. Certain file formats are +already highly compressed (e.g., images) and trying to compress these further +is counter-productive. On the other hand, data compression can be beneficial +if transferring, for example, many small plain text files, or large text-based +datasets. + +| File types that compress well | File types that do not compress well | +|-------------------------------|--------------------------------------------| +| Plain text | Media (JPG, PNG, GIF, MP3, MP4, WAV, etc.) | +| CSV, XML, JSON, YAML, etc. | Pre-compressed archives (ZIP, gzip, etc.) | +| Source code (Python, C, etc.) | Binary blobs (e.g., compiled executables) | + +`rsync` provides built-in functionality for on-the-fly compression and +decompression via the `-z` option: + +```bash +rsync -azP /scratch/project_2001234/my-data $USER@roihu-cpu.csc.fi:/scratch/project_2001234/ +``` + +??? 
info "Alternative methods to maximize performance" + `rsync` uses the `zlib` library for compressing data during transfer. The + performance is comparable to `gzip`, but there are even faster options + available if needed. One such is + [`zstd` compression](env-guide/packing-and-compression-tools.md#zstandard-compression-tool). + + `zstd` compression can be combined with using `tar` over SSH. To transfer + the directory `my-data` from Puhti to Roihu, run: + + ```bash + tar c -I zstd -C /scratch/project_2001234 my-data | ssh $USER@roihu-cpu.csc.fi 'cat > /scratch/project_2001234/my-data.tar.zst' + ``` + + In cases where compression is not beneficial, you can also use plain `tar` + over `ssh` + [as explained previously](#23-migrating-data-with-large-amounts-of-small-files). + The performance can be better than `rsync`, especially if your dataset + contains a huge number of tiny files. + + * [Read more about using `tar` over `ssh` for data transfer here](../../data/moving/tar_ssh.md). + * [Read more about packing and compression tools here](env-guide/packing-and-compression-tools.md). + +### 3.2 Running long transfer processes safely + +One of the strengths of `rsync` is that interrupted transfers can be easily +resumed – **just run the same `rsync` command again**. `rsync` will compare the +source and destination, skip already transferred files (copies only what's +missing) and resume partially transferred files (as long as option `-P` or +`--partial` is used as [instructed above](#21-basic-rsync)). + +However, to avoid failures caused by interrupted SSH sessions altogether, you +may run your data migration process in a `screen` session. + +1. On Puhti, start a `screen` session: + + ```bash + screen -S roihu_migration + ``` + +2. Start your `rsync` command inside the `screen` session: + + ```bash + rsync -aP /scratch/project_2001234/my-data $USER@roihu-cpu.csc.fi:/scratch/project_2001234/ + ``` + +3. 
Now you may detach and leave the `rsync` process running: + + ```txt + Ctrl + A, then press D + ``` + + The data migration process will keep running safely in the background. You + may log out from Puhti if you want. + +4. Reattach the session with: + + ```txt + screen -r roihu_migration + ``` + + If you forgot the name of the session, try `screen -ls`. + +5. When the data transfer has finished, terminate the session by typing `exit` + inside the `screen` session. + +Using `screen` is useful if your data transfer will take several hours. You +can, for example, power off your computer and leave the `rsync` process running +overnight. + +### 3.3 Using checksums to verify data integrity + +`rsync` ensures data integrity using internal checksum mechanisms by default. +**It is therefore not necessary to verify data integrity separately**. + +If you're not using `rsync`, you may calculate a checksum for files using e.g. +`md5sum`. + +1. Assuming you've got a dataset archive `data.tar` on Puhti, calculate a + checksum for it with: + + ```bash + md5sum data.tar > data.tar.md5 + ``` + + Note that calculating checksums for huge datasets can take some time, + especially if the current disk load is high. + +2. [Transfer](#21-basic-rsync) the dataset and the `data.tar.md5` checksum file + to Roihu. +3. With the `data.tar` and `data.tar.md5` files in the same directory, verify + the checksum with: + + ```bash + md5sum -c data.tar.md5 + ``` + + If any byte changed during transfer, the file will not match, and you will + see `data.tar: FAILED`. Otherwise you should get `data.tar: OK`. + +### 3.4 If file ownership matters + +**You will be set as the owner of all files that you transfer from Puhti to +Roihu**. This is important to realize when migrating data from shared project +directories where you may have read access to data owned by your colleagues. + +There is no way for users to move their colleagues' data in such a way that +the ownerships would be preserved. 
If this is important to your project, then +**please ensure that each member moves only their own data**. + +In case you later notice incorrect file ownerships, Roihu system administrators +may fix them for you. Please [contact CSC Service Desk](../contact.md) with the +details on which files and/or directories are affected and who should be set as +the owner. + +## 4. Discouraged methods + +### 4.1 `scp` + +`scp` has many drawbacks compared to `rsync`. It cannot resume interrupted +transfers, has limited metadata preservation capabilities, no built-in +integrity checks and inferior performance. Using it to migrate data to Roihu is +therefore not recommended, unless your dataset is very small and simple (<10 +GB, <100 files). + +[Read more about `scp` here](../../data/moving/scp.md). + +### 4.2 Using the web interfaces to migrate data + +Unfortunately, there is no good way that the Puhti or Mahti web interfaces can +be used to move data directly to Roihu. There are some indirect ways, but none +of them are efficient, which is why we primarily recommend the command-line +based approaches above. The following options should therefore be considered as +"last resort" choices. + +1. Use the Puhti/Mahti web interface file browser to first download your data + locally, and then upload it to Roihu via the Roihu web interface. Note that + there is a **limit of 10 GB for individual file uploads**, so data larger + than this must be split into suitable chunks. Alternatively, you could use + [graphical file transfer utilities](../../data/moving/graphical_transfer.md) + to upload the data to Roihu since you've already downloaded it locally. +2. If you have a LUMI project, you could use the Puhti/Mahti web interface + [Cloud storage configuration](../../computing/webinterface/file-browser.md#accessing-allas-and-lumi-o) + app to set up a connection to LUMI-O, upload your data there, and then fetch + it from LUMI-O to Roihu. + +??? 
warning "Don't use Allas for migrating data to Roihu!" + It is **strongly discouraged** to use Allas for migrating data to Roihu + because Allas is running out of capacity. Please prefer LUMI-O if you must + migrate data to Roihu via object storage. diff --git a/docs/support/tutorials/roihu.md b/docs/support/tutorials/roihu.md new file mode 100644 index 0000000000..e1c1865c59 --- /dev/null +++ b/docs/support/tutorials/roihu.md @@ -0,0 +1,94 @@ +# Getting started with Roihu + +This is a quickstart guide for Roihu users. It is assumed that you have +previously used CSC supercomputing resources like Puhti, Mahti or LUMI. If not, +you can start by looking at our general +[getting started with supercomputing guide](hpc-quick.md). We also recommend +checking the +[CSC Computing Environment self-learning course materials](https://csc-training.github.io/csc-env-eff/). + +To access Roihu, you need a CSC user account and project that has Roihu service +enabled. [Read more here](../../accounts/index.md). + +[TOC] + +## Connecting + +Connect to Roihu using either: + +* [SSH client](#ssh-client) +* [Roihu web interface](#roihu-web-interface) + +### SSH client + +Connecting to Roihu using an SSH client requires that you have: + +1. Set up SSH keys and added your public key to MyCSC (like on Puhti & Mahti). +2. **New:** _Signed_ your public key and downloaded a _certificate_ that allows + authenticating. + * Each certificate is valid for 24 hours, after which a new one must be + generated. + +**[Read the detailed instructions for managing SSH keys and certificates here](../../computing/connecting/ssh-keys.md).** + +Once you have set up SSH keys and obtained a valid SSH certificate, connect +using an SSH client: + +* [Instructions for Linux/macOS](../../computing/connecting/ssh-unix.md). +* [Instructions for Windows](../../computing/connecting/ssh-windows.md). + +!!! 
info "Roihu has separate login nodes for CPU and GPU partitions" + Roihu has + [different CPU architectures on the CPU and GPU nodes](../../computing/systems-roihu.md#compute). + Hence, there are separate login nodes for building programs and submitting + jobs to the respective nodes: + + 1. **`roihu-cpu.csc.fi`** + 2. **`roihu-gpu.csc.fi`** + + For example, connect to one of the CPU login nodes using a command-line SSH + client like this: + + ```bash + # Replace <username> with the name of your CSC user account. + + ssh <username>@roihu-cpu.csc.fi + ``` + + Please observe that software built on `roihu-cpu.csc.fi` can only be run on + the CPU nodes, while software built on `roihu-gpu.csc.fi` can only be run + on the GPU nodes. Importantly, this also applies to Python environments. + + **Note that you may access your files from all login nodes because they all + use the same shared file system.** + +### Roihu web interface + +The simplest way to connect to Roihu is to use the web interface. + +1. Go to [www.roihu.csc.fi](https://www.roihu.csc.fi). +2. Log in using your Haka, Virtu or CSC user account. + [Multi-factor authentication (MFA)](../../accounts/mfa.md) is required. + +## Migrating research data + +If you need to transfer research data from Puhti or Mahti to Roihu, we require +that you: + +1. Carefully review your data before transferring it – **only move what you + really need and check that you have enough space available on Roihu!** + Notably, previous extended disk quotas on Puhti or Mahti will not be + automatically moved to Roihu. Quota extensions on Roihu must be separately + applied for and properly motivated. +2. Move your data **directly** from Puhti or Mahti to Roihu. 
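
As a rough, non-authoritative sketch of step 1 above, the number of files and the total size of a directory can be checked on Puhti or Mahti before moving it. The helper name `summarize_dir` is hypothetical and not part of any CSC tooling. It only relies on the standard `find`, `wc`, `du` and `cut` commands.

```bash
#!/bin/bash
# Hypothetical helper (an illustration, not a CSC tool): print the number of
# regular files and the total disk usage of a directory before migrating it.
summarize_dir() {
    local dir="$1"
    # Fail early if the argument is not an existing directory.
    [ -d "$dir" ] || { echo "Not a directory: $dir" >&2; return 1; }
    # Count regular files (file names containing newlines would be miscounted).
    local nfiles
    nfiles=$(find "$dir" -type f | wc -l)
    # Total disk usage in KiB (-s: summary only, -k: 1 KiB blocks).
    local size_kib
    size_kib=$(du -sk "$dir" | cut -f1)
    # Arithmetic expansion strips any padding that wc may add.
    echo "files=$((nfiles)) size_kib=$((size_kib))"
}
```

For example, `summarize_dir /scratch/project_2001234/my-data` (the placeholder path used in the data migration guide) prints one line with both values. A large file count relative to the total size suggests archiving the data before transfer.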
+ +**[Read the detailed instructions in the Roihu data migration guide](roihu-data.md).** + +## Installing software + +## Running your first job + +## More information + +* [Roihu system overview](../../computing/systems-roihu.md) +* [CSC Computing Environment self-learning materials](https://csc-training.github.io/csc-env-eff/) From f4afad52b04a638151028ef214b3d7512ea552fe Mon Sep 17 00:00:00 2001 From: Rasmus Kronberg Date: Fri, 13 Mar 2026 09:23:43 +0200 Subject: [PATCH 003/139] update banner --- mkdocs.yml | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/mkdocs.yml b/mkdocs.yml index da967426c0..ab4667990b 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -11,9 +11,9 @@ extra: announcement_visible: true # Controls the visibility of the announcement bar landing_banner: path: https://a3s.fi/docs-files/banners/ # Put the image file in this bucket; Don't touch this value. - image: roihu.png - title: Roihu supercomputer – coming in spring 2026! - link: /computing/systems-roihu/ + image: roihu-new.png + title: Roihu supercomputer – get started here! 
+ link: support/tutorials/roihu/ description: |- Puhti and Mahti will be decommissioned in August 2026 and replaced by Roihu, CSC's next-generation supercomputer offering enhanced performance From e1bc0e282b7c3af0313183c66a1c319e4b51c99d Mon Sep 17 00:00:00 2001 From: Rasmus Kronberg Date: Mon, 16 Mar 2026 10:32:53 +0200 Subject: [PATCH 004/139] test Roihu apps tags --- docs/apps/gromacs.md | 1 + docs/apps/jupyter.md | 1 + 2 files changed, 2 insertions(+) diff --git a/docs/apps/gromacs.md b/docs/apps/gromacs.md index d610be978b..3caebb31e1 100644 --- a/docs/apps/gromacs.md +++ b/docs/apps/gromacs.md @@ -12,6 +12,7 @@ catalog: - LUMI - Puhti - Mahti + - Roihu --- # GROMACS diff --git a/docs/apps/jupyter.md b/docs/apps/jupyter.md index d2ab436925..1042511378 100644 --- a/docs/apps/jupyter.md +++ b/docs/apps/jupyter.md @@ -12,6 +12,7 @@ catalog: - LUMI - Puhti - Mahti + - Roihu - LUMI --- From 4a0a172773b150d1afa6c9b450668c2c9187fdef Mon Sep 17 00:00:00 2001 From: Rasmus Kronberg Date: Mon, 16 Mar 2026 13:21:29 +0200 Subject: [PATCH 005/139] fix bolding --- docs/computing/connecting/index.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/computing/connecting/index.md b/docs/computing/connecting/index.md index 421f8bc316..a79ef1bca6 100644 --- a/docs/computing/connecting/index.md +++ b/docs/computing/connecting/index.md @@ -49,11 +49,11 @@ Logging in to CSC supercomputers using an SSH client requires that you have ```mermaid flowchart LR - A(**Before first connection:** + A(
Before first connection:
Set up SSH keys) A --> B{Connecting to Roihu?} - B -->|yes| C(**Once every 24 hours:** + B -->|yes| C(
Once every 24 hours:
Get a new SSH certificate) C --> D(SSH with Linux/macOS or From db792e9cf7055eeacf9a45061bb75f9a5f4870c4 Mon Sep 17 00:00:00 2001 From: Rasmus Kronberg Date: Mon, 16 Mar 2026 14:10:02 +0200 Subject: [PATCH 006/139] typo --- docs/computing/connecting/index.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/computing/connecting/index.md b/docs/computing/connecting/index.md index a79ef1bca6..9e8610b823 100644 --- a/docs/computing/connecting/index.md +++ b/docs/computing/connecting/index.md @@ -49,11 +49,11 @@ Logging in to CSC supercomputers using an SSH client requires that you have ```mermaid flowchart LR - A(
Before first connection:
+ A(Before first connection: Set up SSH keys) A --> B{Connecting to Roihu?} - B -->|yes| C(
Once every 24 hours:
+ B -->|yes| C(Once every 24 hours: Get a new SSH certificate) C --> D(SSH with Linux/macOS or From b9bff266d6a3c8ae766f8fab4b703563adeda519 Mon Sep 17 00:00:00 2001 From: Rasmus Kronberg Date: Mon, 16 Mar 2026 15:15:52 +0200 Subject: [PATCH 007/139] improve based on comments --- docs/computing/connecting/ssh-keys.md | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/docs/computing/connecting/ssh-keys.md b/docs/computing/connecting/ssh-keys.md index 66a00d17a2..875be3c4e1 100644 --- a/docs/computing/connecting/ssh-keys.md +++ b/docs/computing/connecting/ssh-keys.md @@ -174,14 +174,14 @@ usage. 6. Optional, but **strongly recommended**: [Install WinSCP](https://winscp.net/eng/docs/installation) and [start the Pageant authentication agent](https://the.earth.li/~sgtatham/putty/0.83/htmldoc/Chapter9.html#pageant) - that comes bundled with - [PuTTY](https://www.chiark.greenend.org.uk/~sgtatham/putty/) to + that comes bundled with WinSCP (and + [PuTTY](https://www.chiark.greenend.org.uk/~sgtatham/putty/)) to automatically add SSH key and certificate to SSH agent. * If you install WinSCP without admin rights, you must add `WinSCP.exe` to your Path environment variable. Search for the _Edit environment variables for your account_ settings menu. * If you intend to connect to Roihu using PowerShell, it is - possible to use `ssh-agent` instead of Pageant and WinSCP. + possible to also use Windows `ssh-agent`. [See the instructions for starting `ssh-agent` in PowerShell](ssh-windows.md#authentication-agent). 7. Open PowerShell and execute: @@ -207,7 +207,9 @@ usage. If you intend to use PowerShell to connect to Roihu, make sure to provide `csc_cert.py` your OpenSSH public key (`.pub`). Providing a PuTTY `.ppk` key will create a certificate file - that is only compatible with PuTTY or MobaXterm. + that is only compatible with PuTTY or MobaXterm. 
Providing a + `.pub` file will create both an OpenSSH-compatible `-cert.pub` + file, as well as a `-cert.ppk` file (if WinSCP is available). 8. If you have an earlier certificate which is still valid, the tool prints the expiration time and exits. From 1d2d95888ab95c0edc4168eecd8259e8cd4a0733 Mon Sep 17 00:00:00 2001 From: Rasmus Kronberg Date: Tue, 17 Mar 2026 09:28:46 +0200 Subject: [PATCH 008/139] Link to public cert helper tool --- docs/computing/connecting/ssh-keys.md | 28 +++++++++++++-------------- 1 file changed, 14 insertions(+), 14 deletions(-) diff --git a/docs/computing/connecting/ssh-keys.md b/docs/computing/connecting/ssh-keys.md index 875be3c4e1..28e2a937f6 100644 --- a/docs/computing/connecting/ssh-keys.md +++ b/docs/computing/connecting/ssh-keys.md @@ -126,8 +126,8 @@ expires, a new one must be signed following either of the processes below. The certificate helper is a Python tool developed by CSC to simplify the process of signing and downloading an SSH certificate, and adding it to your SSH authentication agent. A detailed documentation of the tool is available in -the source repository (TBA). The following instructions illustrate only basic -usage. +[the source repository](https://github.com/CSCfi/certificate-helper-tool). The +following instructions illustrate only basic usage. 1. Ensure that you have Python installed on your computer. - Instructions are available in the @@ -135,7 +135,7 @@ usage. Contact your local IT-support if you need assistance. - If Python for some reason cannot be installed on your computer, fall back to [Option 2](#option-2-mycsc) instead. -2. [Download the certificate helper tool here](https://gitlab.ci.csc.fi/compen/hpc-environment/certificate-helper-tool/-/blob/main/csc_cert.py). +2. [Download the certificate helper tool here](https://github.com/CSCfi/certificate-helper-tool/raw/refs/heads/main/csc_cert.py). 3. Run the `csc_cert.py` tool: === "Linux & macOS" @@ -143,7 +143,7 @@ usage. 1. 
Optional, but **strongly recommended:** Ensure that [`ssh-agent`](ssh-unix.md#authentication-agent) is running to automatically add SSH key and certificate to SSH agent. - 1. Open terminal and execute: + 2. Open terminal and execute: ```bash # Replace with your CSC user name and @@ -157,21 +157,21 @@ usage. directory as the script. If not, make sure to provide the full path to `csc_cert.py`. - 2. If you have an earlier certificate which is still valid, the tool + 3. If you have an earlier certificate which is still valid, the tool prints the expiration time and exits. - 3. If signing is needed, a login URL is displayed. Follow the link and + 4. If signing is needed, a login URL is displayed. Follow the link and authenticate. - 4. Copy the 6-digit code displayed into your terminal and enter your + 5. Copy the 6-digit code displayed into your terminal and enter your SSH key passphrase. - The signed certificate is automatically downloaded and added to your SSH agent. - The signed certificate is saved as `-cert.pub` (e.g., `~/.ssh/id_ed25519-cert.pub`). - 5. **[Connect to Roihu following these instructions](ssh-unix.md#basic-usage)**. + 6. **[Connect to Roihu following these instructions](ssh-unix.md#basic-usage)**. === "Windows" - 6. Optional, but **strongly recommended**: + 7. Optional, but **strongly recommended**: [Install WinSCP](https://winscp.net/eng/docs/installation) and [start the Pageant authentication agent](https://the.earth.li/~sgtatham/putty/0.83/htmldoc/Chapter9.html#pageant) that comes bundled with WinSCP (and @@ -184,7 +184,7 @@ usage. possible to also use Windows `ssh-agent`. [See the instructions for starting `ssh-agent` in PowerShell](ssh-windows.md#authentication-agent). - 7. Open PowerShell and execute: + 8. Open PowerShell and execute: ```bash # Replace with your CSC user name and @@ -211,11 +211,11 @@ usage. `.pub` file will create both an OpenSSH-compatible `-cert.pub` file, as well as a `-cert.ppk` file (if WinSCP is available). - 8. 
If you have an earlier certificate which is still valid, the tool + 9. If you have an earlier certificate which is still valid, the tool prints the expiration time and exits. - 9. If signing is needed, a login URL is displayed. Follow the link and + 10. If signing is needed, a login URL is displayed. Follow the link and authenticate. - 10. Copy the displayed 6-digit code into PowerShell and enter your SSH + 11. Copy the displayed 6-digit code into PowerShell and enter your SSH key passphrase. - The signed certificate is automatically downloaded and added to your SSH agent (if you have WinSCP installed and Pageant @@ -223,7 +223,7 @@ usage. - The signed certificate is saved as `-cert.pub` and/or `-cert.ppk` (e.g., `C:\Users\\.ssh\id_ed25519-cert.ppk`). - 11. **[Connect to Roihu following these instructions](ssh-windows.md#basic-usage)**. + 12. **[Connect to Roihu following these instructions](ssh-windows.md#basic-usage)**. --- From 382bead6b39aed542c2c84e9c58088cff459f3e1 Mon Sep 17 00:00:00 2001 From: Rasmus Kronberg Date: Tue, 17 Mar 2026 11:01:39 +0200 Subject: [PATCH 009/139] clone --- docs/computing/connecting/ssh-keys.md | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/docs/computing/connecting/ssh-keys.md b/docs/computing/connecting/ssh-keys.md index 28e2a937f6..4649ecaef1 100644 --- a/docs/computing/connecting/ssh-keys.md +++ b/docs/computing/connecting/ssh-keys.md @@ -135,7 +135,12 @@ following instructions illustrate only basic usage. Contact your local IT-support if you need assistance. - If Python for some reason cannot be installed on your computer, fall back to [Option 2](#option-2-mycsc) instead. -2. [Download the certificate helper tool here](https://github.com/CSCfi/certificate-helper-tool/raw/refs/heads/main/csc_cert.py). +2. Download the certificate helper tool here, or clone the Git repository: + + ```bash + git clone https://github.com/CSCfi/certificate-helper-tool.git + ``` + 3. 
Run the `csc_cert.py` tool: === "Linux & macOS" From 080de4e08c06eff3bf0f1bb51106a51813c92eb4 Mon Sep 17 00:00:00 2001 From: Rasmus Kronberg Date: Tue, 17 Mar 2026 11:22:45 +0200 Subject: [PATCH 010/139] save link as --- docs/computing/connecting/ssh-keys.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/docs/computing/connecting/ssh-keys.md b/docs/computing/connecting/ssh-keys.md index 4649ecaef1..5517ef9ead 100644 --- a/docs/computing/connecting/ssh-keys.md +++ b/docs/computing/connecting/ssh-keys.md @@ -135,7 +135,9 @@ following instructions illustrate only basic usage. Contact your local IT-support if you need assistance. - If Python for some reason cannot be installed on your computer, fall back to [Option 2](#option-2-mycsc) instead. -2. Download the certificate helper tool here, or clone the Git repository: +2. [Download the certificate helper tool here](https://github.com/CSCfi/certificate-helper-tool/raw/refs/heads/main/csc_cert.py) + (_Right click_ :material-arrow-right: _Save Link As_), or clone the Git + repository: ```bash git clone https://github.com/CSCfi/certificate-helper-tool.git From e203a1af9f7545e37eb9c49aa69e12dc3e6bbca8 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Mats=20Sj=C3=B6berg?= Date: Fri, 27 Mar 2026 15:41:31 +0200 Subject: [PATCH 011/139] Added PyTorch and vLLM for Roihu --- docs/apps/pytorch.md | 76 +++++++++++++++++++++++++++----------- docs/apps/vllm.md | 88 ++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 142 insertions(+), 22 deletions(-) create mode 100644 docs/apps/vllm.md diff --git a/docs/apps/pytorch.md b/docs/apps/pytorch.md index fcf179ecd5..533572395b 100644 --- a/docs/apps/pytorch.md +++ b/docs/apps/pytorch.md @@ -11,6 +11,7 @@ catalog: - LUMI - Puhti - Mahti + - Roihu --- # PyTorch @@ -19,6 +20,9 @@ Machine learning framework for Python. !!! info "News" + **7.4.2026** PyTorch is now available on Roihu, the module has been + renamed `python-pytorch`. 
+ **23.1.2026** Since the LUMI service break 21.1.2026, the CSC PyTorch installations no longer work with the fast Slingshot network due to binary incompatibilities with the new drivers. This means that @@ -112,29 +116,34 @@ Machine learning framework for Python. Currently supported PyTorch versions: -| Version | Module | Puhti | Mahti | (LUMI)
*see notes below* | Notes | -|:--------|----------------|:-----:|:-----:|------|:-------------------------| -| 2.9.1 | `pytorch/2.9` | X | X | - | Default on Puhti, Mahti | -| 2.7.1 | `pytorch/2.7` | X | X | (X) | No Slingshot (see below) | -| 2.6.0 | `pytorch/2.6` | X | X | - | | -| 2.5.1 | `pytorch/2.5` | X | X | (X) | | -| 2.4.1 | `pytorch/2.4` | - | - | (X) | | -| 2.4.0 | `pytorch/2.4` | X | X | - | New tykky-based wrappers | -| 2.3.1 | `pytorch/2.3` | X | X | - | New tykky-based wrappers | -| 2.2.2 | `pytorch/2.2` | - | - | (X) | | -| 2.2.1 | `pytorch/2.2` | X | X | - | | -| 2.1.2 | `pytorch/2.1` | - | - | (X) | | -| 2.1.0 | `pytorch/2.1` | X | X | - | | -| 2.0.1 | `pytorch/2.0` | - | - | (X) | | -| 2.0.0 | `pytorch/2.0` | X | X | - | | -| 1.13.1 | `pytorch/1.13` | - | - | (X) | | -| 1.13.0 | `pytorch/1.13` | X | X | - | | -| 1.12.0 | `pytorch/1.12` | X | X | - | | -| 1.11.0 | `pytorch/1.11` | X | X | - | | +| Version | Module | Puhti | Mahti | Roihu | (LUMI)
*see notes below* | Notes | +|:--------|-----------------------|:-----:|:-----:|-------|------------------------------|:-------------------------| +| 2.10.0 | `python-pytorch/2.10` | - | - | X | - | Default on Roihu | +| 2.9.1 | `pytorch/2.9` | X | X | | - | Default on Puhti, Mahti | +| 2.7.1 | `pytorch/2.7` | X | X | | (X) | No Slingshot (see below) | +| 2.6.0 | `pytorch/2.6` | X | X | | - | | +| 2.5.1 | `pytorch/2.5` | X | X | | (X) | | +| 2.4.1 | `pytorch/2.4` | - | - | | (X) | | +| 2.4.0 | `pytorch/2.4` | X | X | | - | New tykky-based wrappers | +| 2.3.1 | `pytorch/2.3` | X | X | | - | New tykky-based wrappers | +| 2.2.2 | `pytorch/2.2` | - | - | | (X) | | +| 2.2.1 | `pytorch/2.2` | X | X | | - | | +| 2.1.2 | `pytorch/2.1` | - | - | | (X) | | +| 2.1.0 | `pytorch/2.1` | X | X | | - | | +| 2.0.1 | `pytorch/2.0` | - | - | | (X) | | +| 2.0.0 | `pytorch/2.0` | X | X | | - | | +| 1.13.1 | `pytorch/1.13` | - | - | | (X) | | +| 1.13.0 | `pytorch/1.13` | X | X | | - | | +| 1.12.0 | `pytorch/1.12` | X | X | | - | | +| 1.11.0 | `pytorch/1.11` | X | X | | - | | Includes [PyTorch](https://pytorch.org/) and related libraries with GPU support via CUDA/ROCm. +!!! note "vLLM on Roihu" + On Roihu we have moved vLLM to a separate module `python-vllm`. + The [vLLM module has its own documentation page](vllm.md). + !!! warning "LUMI installations" LUMI installations - marked with "(X)" in the table above - no longer @@ -198,6 +207,12 @@ with: module load pytorch ``` +To access PyTorch on Roihu: + +```text +module load python-pytorch +``` + To access PyTorch on LUMI - see the [caveats about the LUMI installation above](#lumi-note). 
```text @@ -209,7 +224,8 @@ If you wish to have a specific version ([see above for available versions](#available)), use: ```text -module load pytorch/2.9 +module load python-pytorch/2.10 # on Roihu +module load pytorch/2.9 # on other systems ``` Please note that the module already includes CUDA and cuDNN libraries, @@ -218,14 +234,15 @@ so **there is no need to load cuda and cudnn modules separately!** This command will also show all available versions: ```text -module avail pytorch +module avail python-pytorch # on Roihu +module avail pytorch # on other systems ``` To check the exact packages and versions included in the loaded module you can run: ```text -list-packages +pip list ``` @@ -269,6 +286,21 @@ proportion of the available CPU cores in a single node: srun python3 myprog.py ``` +=== "Roihu" + ```bash + #!/bin/bash + #SBATCH --account= + #SBATCH --partition=gpumedium + #SBATCH --ntasks=1 + #SBATCH --cpus-per-task=72 + #SBATCH --mem=120G + #SBATCH --gres=gpu:gh200:1 + #SBATCH --time=1:00:00 + + module load python-pytorch/2.10 + srun python3 myprog.py + ``` + === "LUMI" ```bash #!/bin/bash diff --git a/docs/apps/vllm.md b/docs/apps/vllm.md new file mode 100644 index 0000000000..864a9b6c46 --- /dev/null +++ b/docs/apps/vllm.md @@ -0,0 +1,88 @@ +--- +tags: + - Free +catalog: + name: vLLM + description: A fast and easy-to-use library for LLM inference and serving + license_type: Free + disciplines: + - Data Analytics and Machine Learning + available_on: + - Roihu +--- + +# vLLM + +A fast and easy-to-use library for LLM inference and serving. + +!!! info "News" + + **7.4..2026** vLLM now available as a separate module on Roihu + + +## Available + +Currently supported PyTorch versions: + +| Version | Module | +|:--------|----------------------| +| 0.18.0 | `python-vllm/0.18.0` | + +Includes [vLLM](), [PyTorch](https://pytorch.org/) and related +libraries with GPU support via CUDA/ROCm. + +!!! 
note "vLLM on Roihu" + If you don't particularly need vLLM, we recommend using the [full PyTorch module](pytorch.md) instead, which includes more packages. + + +If you find that some package is missing, you can often install it +yourself using `pip install`. It is recommended to use Python virtual +environments. See [our Python documentation for more information on +how to install packages +yourself](../support/tutorials/python-usage-guide.md#installing-python-packages-to-existing-modules). +If you think that some important package should be included in the +module provided by CSC, please [contact our +servicedesk](../support/contact.md). + +All modules are based on containers using Apptainer (previously known +as Singularity). Wrapper scripts have been provided so that common +commands such as `python`, `python3`, `pip` and `pip3` should work as +normal. + +## License + +vLLM is covered by the [Apache License +2.0](https://github.com/vllm-project/vllm/blob/main/LICENSE). + +## Usage + +To load the default version on Roihu: + +```text +module load python-vllm +``` + +If you wish to have a specific version ([see above for available +versions](#available)), use: + +```text +module load python-vllm/0.18.0 +``` + +To check the exact packages and versions included in the loaded module you can +run: + +```text +pip list +``` + +### Example script for usage + +See the [vLLM section in CSC's Machine learning +guide](../support/tutorials/ml-llm.md#inference-with-vllm), which has +links to example script for using vLLM on Roihu. + +The full [Machine learning guide](../support/tutorials/ml-guide.md) +might also be of use. 
+ +[vLLM]: https://docs.vllm.ai/en/latest/ From 2895208fceec03fd3baf465147c8bd10f24fa914 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Mats=20Sj=C3=B6berg?= Date: Fri, 27 Mar 2026 15:46:45 +0200 Subject: [PATCH 012/139] Fixed vLLM link --- docs/apps/vllm.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/apps/vllm.md b/docs/apps/vllm.md index 864a9b6c46..c9cb22d482 100644 --- a/docs/apps/vllm.md +++ b/docs/apps/vllm.md @@ -28,7 +28,7 @@ Currently supported PyTorch versions: |:--------|----------------------| | 0.18.0 | `python-vllm/0.18.0` | -Includes [vLLM](), [PyTorch](https://pytorch.org/) and related +Includes [vLLM][], [PyTorch](https://pytorch.org/) and related libraries with GPU support via CUDA/ROCm. !!! note "vLLM on Roihu" From 68b776482f1b216f322096d5246a3cc7b37c01b1 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Mats=20Sj=C3=B6berg?= Date: Fri, 27 Mar 2026 15:55:02 +0200 Subject: [PATCH 013/139] Fixed errors in vLLM. --- docs/apps/vllm.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/apps/vllm.md b/docs/apps/vllm.md index c9cb22d482..ceb41553cb 100644 --- a/docs/apps/vllm.md +++ b/docs/apps/vllm.md @@ -17,12 +17,12 @@ A fast and easy-to-use library for LLM inference and serving. !!! info "News" - **7.4..2026** vLLM now available as a separate module on Roihu + **7.4.2026** vLLM now available as a separate module on Roihu ## Available -Currently supported PyTorch versions: +Currently supported vLLM versions: | Version | Module | |:--------|----------------------| @@ -31,7 +31,7 @@ Currently supported PyTorch versions: Includes [vLLM][], [PyTorch](https://pytorch.org/) and related libraries with GPU support via CUDA/ROCm. -!!! note "vLLM on Roihu" +!!! note "PyTorch vs vLLM" If you don't particularly need vLLM, we recommend using the [full PyTorch module](pytorch.md) instead, which includes more packages. 
From 300fda0be19fcbb7d8e4afc04e3215a084711c1f Mon Sep 17 00:00:00 2001 From: Rasmus Kronberg Date: Tue, 31 Mar 2026 15:40:00 +0300 Subject: [PATCH 014/139] fix link --- docs/computing/connecting/ssh-keys.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/computing/connecting/ssh-keys.md b/docs/computing/connecting/ssh-keys.md index 5517ef9ead..04392d92c0 100644 --- a/docs/computing/connecting/ssh-keys.md +++ b/docs/computing/connecting/ssh-keys.md @@ -259,7 +259,7 @@ following instructions illustrate only basic usage. 1. **Connect to Roihu following these instructions**: 1. [Linux/macOS](ssh-unix.md#basic-usage) - 1. [Windows](ssh-unix.md#basic-usage) + 1. [Windows](ssh-windows.md#basic-usage) !!! info "Optional: Check when your SSH certificate will expire" Each SSH certificate is valid for 24 hours. The expiration time can be From a179ad93b46e6c68dbe6aeec6f9450dffbb742ef Mon Sep 17 00:00:00 2001 From: Rasmus Kronberg <43936697+rkronberg@users.noreply.github.com> Date: Tue, 31 Mar 2026 22:54:48 +0300 Subject: [PATCH 015/139] Gromacs on Roihu (#2921) --- docs/apps/gromacs.md | 217 +++++++++++++++++++++++++++++++++++-------- 1 file changed, 176 insertions(+), 41 deletions(-) diff --git a/docs/apps/gromacs.md b/docs/apps/gromacs.md index 513373a45b..46c74e1a45 100644 --- a/docs/apps/gromacs.md +++ b/docs/apps/gromacs.md @@ -26,6 +26,36 @@ with plenty of analysis scripts. 
## Available +=== "Roihu-CPU" + | Version | Available modules | Notes | + |:-------:|:------------------|:-----:| + |2025.1 |`gromacs/2025.1`| + |2025.2 |`gromacs/2025.2`| + |2025.3 |`gromacs/2025.3`| + |2025.4 |`gromacs/2025.4`| + |2026.0 |`gromacs/2026.0`| + |2026.1 |`gromacs/2026.1`| + +=== "Roihu-GPU" + | Version | Available modules | Notes | + |:-------:|:------------------|:-----:| + |2025.1 |`gromacs/2025.1`| + |2025.2 |`gromacs/2025.2`| + |2025.3 |`gromacs/2025.3`| + |2025.4 |`gromacs/2025.4`| + |2026.0 |`gromacs/2026.0`| + |2026.1 |`gromacs/2026.1`| + +=== "LUMI" + | Version | Available modules | Notes | + |:-------:|:------------------|:-----:| + |2025.1 |`gromacs/2025.1`
`gromacs/2025.1-gpu`
`gromacs/2025.1-heffte`|GPU-enabled module available
Module with heFFTe available for [GPU PME decomposition](#gpu-pme-decomposition) + |2025.2 |`gromacs/2025.2`
`gromacs/2025.2-gpu`|GPU-enabled module available + |2025.3 |`gromacs/2025.3`
`gromacs/2025.3-gpu`|GPU-enabled module available + |2025.4 |`gromacs/2025.4`
`gromacs/2025.4-gpu`
`gromacs/2025.4-heffte`|GPU-enabled module available
Module with heFFTe available for [GPU PME decomposition](#gpu-pme-decomposition) + |2026.0 |`gromacs/2026.0`
`gromacs/2026.0-gpu`|GPU-enabled module available + |2026.1 |`gromacs/2026.1`
`gromacs/2026.1-gpu`
`gromacs/2026.1-heffte`|GPU-enabled module available
Module with heFFTe available for [GPU PME decomposition](#gpu-pme-decomposition) + === "Puhti" | Version | Available modules | Notes | |:-------:|:------------------|:-----:| @@ -62,28 +92,20 @@ with plenty of analysis scripts. |2025.2 |`gromacs/2025.2` |2025.4 |`gromacs/2025.4` -=== "LUMI" - | Version | Available modules | Notes | - |:-------:|:------------------|:-----:| - |2025.1 |`gromacs/2025.1`
`gromacs/2025.1-gpu`
`gromacs/2025.1-heffte`|GPU-enabled module available
Module with heFFTe available for [GPU PME decomposition](#gpu-pme-decomposition) - |2025.2 |`gromacs/2025.2`
`gromacs/2025.2-gpu`|GPU-enabled module available - |2025.3 |`gromacs/2025.3`
`gromacs/2025.3-gpu`|GPU-enabled module available - |2025.4 |`gromacs/2025.4`
`gromacs/2025.4-gpu`
`gromacs/2025.4-heffte`|GPU-enabled module available
Module with heFFTe available for [GPU PME decomposition](#gpu-pme-decomposition) - |2026.0 |`gromacs/2026.0`
`gromacs/2026.0-gpu`|GPU-enabled module available - |2026.1 |`gromacs/2026.1`
`gromacs/2026.1-gpu`
`gromacs/2026.1-heffte`|GPU-enabled module available
Module with heFFTe available for [GPU PME decomposition](#gpu-pme-decomposition) - -- Puhti and Mahti have also `gromacs-env/` modules for loading the - recommended latest minor version from each year (replace `` - accordingly). -- To access modules on LUMI, first load the CSC module tree into use with - `module use /appl/local/csc/modulefiles` -- If you want to use command-line [Plumed tools](plumed.md), load the Plumed - module. - -!!! info - We only provide the MPI version `gmx_mpi`, but it can be used for `grompp`, - `editconf` etc. similarly to the serial version. Instead of `gmx grompp`, - give `gmx_mpi grompp`. +!!! info "Notes" + - Roihu, Puhti and Mahti have also `gromacs-env/` modules for loading + the latest minor version from each year (replace `` accordingly). + - To access modules on LUMI, first load the CSC module tree into use with: + + ```bash + module use /appl/local/csc/modulefiles + ``` + + - Versions 2025.0 and later should support PLUMED by default. If you want + to use PLUMED, also load the [PLUMED module](plumed.md). + - We only provide the MPI version `gmx_mpi`, but it can be used for `grompp`, + `editconf` etc. similarly to the serial version. Instead of `gmx grompp`, + give `gmx_mpi grompp`. ## License @@ -91,7 +113,7 @@ GROMACS is a free software available under LGPL, version 2.1. ## Usage -Initialize recommended version of GROMACS on Puhti or Mahti like this: +Initialize recommended version of GROMACS on Roihu, Puhti or Mahti like this: ```bash module purge @@ -108,28 +130,23 @@ To access CSC's GROMACS modules on LUMI, remember to first run: module use /appl/local/csc/modulefiles ``` -!!! info "Note" +!!! warning "Important" Please use the `-maxh` flag for `mdrun`. Setting this equal to or slightly less than the requested time limit (in hours) will ensure that there's time for your simulation to write a final checkpoint and end gracefully before - the scheduler terminates the job. 
If left unspecified, there's a chance - that the job will crash the node(s) it is running on. For general guidance - on managing long simulations, see the + Slurm terminates the job. + + If left unspecified, there's a chance that the job will crash the node(s) + it is running on. For general tips on managing long simulations, see the [GROMACS manual](https://manual.gromacs.org/current/user-guide/managing-simulations.html). -!!! info "Plumed" - All GROMACS version >=2025.0 should support Plumed by default. If you want - to run GROMACS simulations using Plumed, remember to also load - [any Plumed module](plumed.md). - ### Notes about performance !!! warning "Note" - Please minimize unnecessary disk I/O – never run simulations using - `mdrun -v` (the verbose flag)! + Please minimize unnecessary disk I/O – never run verbose simulations using + the `mdrun -v` flag! It is important to setup the simulations properly to use resources efficiently. -The most important aspects to consider (in addition to avoiding `-v`) are: 1. If you run in parallel, make a scaling test for each system – don't use more cores/GPUs than is efficient. Scaling depends on many aspects of your system @@ -137,11 +154,12 @@ The most important aspects to consider (in addition to avoiding `-v`) are: 2. Use a recent version – there has been significant speedup and bug fixes over the years. If you switch the major version, remember to check that the results are comparable. -3. For large jobs, use full nodes (multiples of 40 cores on Puhti or multiples - of 128 cores on Mahti), see examples below. +3. For large CPU jobs, use full nodes (multiples of 384 cores on Roihu, + multiples of 40 cores on Puhti or multiples of 128 cores on Mahti and LUMI). + See examples below. 4. Performance on GPUs depends on many factors and what calculations you offload. 
Please consult the - [excellent ENCCS online materials](https://enccs.github.io/gromacs-gpu-performance/) + [ENCCS online materials](https://enccs.github.io/gromacs-gpu-performance/) for a general overview, or the [GROMACS on LUMI workshop materials](https://zenodo.org/records/10610643) for how to run efficiently on LUMI-G. @@ -152,7 +170,122 @@ The most important aspects to consider (in addition to avoiding `-v`) are: For a more complete description, consult the [mdrun performance checklist](https://manual.gromacs.org/current/user-guide/mdrun-performance.html) -on the GROMACS page. +in the GROMACS manual. + +### Roihu + +=== "MPI-only batch script" + + ```bash + #!/bin/bash + #SBATCH --time=00:15:00 + #SBATCH --partition=small + #SBATCH --nodes=1 + #SBATCH --ntasks-per-node=192 + #SBATCH --account= + #SBATCH --hint=nomultithread + + # this script runs a 192-core (half a node, no hyperthreading) gromacs + # job, requesting 15 minutes time + + module purge + module load gromacs-env + export OMP_NUM_THREADS=1 + + srun gmx_mpi mdrun -s topol -maxh 0.2 + ``` + +=== "Hybrid MPI/OpenMP batch script" + + ```bash + #!/bin/bash + #SBATCH --time=00:15:00 + #SBATCH --partition=medium + #SBATCH --nodes=1 + #SBATCH --ntasks-per-node=192 + #SBATCH --cpus-per-task=2 + #SBATCH --account= + #SBATCH --hint=nomultithread + + # this script runs a 384-core (one full node, no hyperthreading) gromacs + # job, requesting 15 minutes time and 192 tasks per node, each with 2 + # OpenMP threads + + module purge + module load gromacs-env + + export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK} + + srun gmx_mpi mdrun -s topol -maxh 0.2 + ``` + +=== "Single GPU batch script" + + ```bash + #!/bin/bash + #SBATCH --time=00:15:00 + #SBATCH --partition=gpumedium + #SBATCH --nodes=1 + #SBATCH --ntasks-per-node=1 + #SBATCH --cpus-per-task=72 + #SBATCH --gres=gpu:gh200:1 + #SBATCH --account= + #SBATCH --hint=nomultithread + + # this script runs a single-GPU gromacs job, requesting 1 task per GPU, + # 72 OpenMP
threads per task and 15 minutes time + + module purge + module load gromacs-env + + export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK} + export GMX_ENABLE_DIRECT_GPU_COMM=1 + export GMX_FORCE_GPU_AWARE_MPI=1 + + srun gmx_mpi mdrun -s topol -maxh 0.2 -nb gpu -bonded gpu -pme gpu -update gpu + ``` + +=== "Full GPU node batch script" + + ```bash + #!/bin/bash + #SBATCH --time=00:15:00 + #SBATCH --partition=gpumedium + #SBATCH --nodes=1 + #SBATCH --ntasks-per-node=4 + #SBATCH --cpus-per-task=72 + #SBATCH --gres=gpu:gh200:4 + #SBATCH --account= + #SBATCH --hint=nomultithread + + # this script runs a full GPU node gromacs job, requesting 1 task per GPU, + # 72 OpenMP threads per task and 15 minutes time + + module purge + module load gromacs-env + + export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK} + export GMX_ENABLE_DIRECT_GPU_COMM=1 + export GMX_FORCE_GPU_AWARE_MPI=1 + + srun gmx_mpi mdrun -s topol -maxh 0.2 -nb gpu -bonded gpu -pme gpu -update gpu -npme 1 + ``` + +#### Performance overview + +Below is an overview of the performance of GROMACS 2026.1 on Roihu-CPU and +Roihu-GPU. The STMV benchmark (1067k atoms, 2 fs timestep) is used, and +corresponding results for LUMI-C and LUMI-G are shown for comparison. Note that +each GPU on LUMI contains two physical GPU devices (GCDs), and the plot below +refers specifically to GPUs. + +Bear in mind that this is a large system which exhibits good scalability over +multiple CPU nodes and GPUs. Smaller systems may not be able to utilize +multiple, or even a single GPU efficiently, in which case +[running multiple simulations per GPU](../support/tutorials/gromacs-throughput.md) +is recommended. + +![GROMACS performance on Roihu and LUMI](https://a3s.fi/docs-files/gmx-roihu-vs-lumi.svg 'GROMACS performance on Roihu and LUMI') ### Puhti @@ -166,10 +299,11 @@ on the GROMACS page. 
#SBATCH --account= ##SBATCH --mail-type=END # uncomment to get mail - # this script runs a 1 core gromacs job, requesting 15 minutes time + # this script runs a 1-core gromacs job, requesting 15 minutes time module purge module load gromacs-env + export OMP_NUM_THREADS=1 srun gmx_mpi mdrun -s topol -maxh 0.2 @@ -186,10 +320,11 @@ on the GROMACS page. #SBATCH --account= ##SBATCH --mail-type=END # uncomment to get mail - # this script runs an 80 core (2 full nodes) gromacs job, requesting 15 minutes time + # this script runs an 80-core (2 full nodes) gromacs job, requesting 15 minutes time module purge module load gromacs-env + export OMP_NUM_THREADS=1 srun gmx_mpi mdrun -s topol -maxh 0.2 -dlb yes From 050c0a395f3e50c251cd14e1a8ad3ee782d23d4b Mon Sep 17 00:00:00 2001 From: Rasmus Kronberg <43936697+rkronberg@users.noreply.github.com> Date: Tue, 31 Mar 2026 23:20:14 +0300 Subject: [PATCH 016/139] CP2K on Roihu (#2922) --- docs/apps/cp2k.md | 117 ++++++++++++++++++++++++++++++++----------- docs/apps/gromacs.md | 24 ++++----- 2 files changed, 101 insertions(+), 40 deletions(-) diff --git a/docs/apps/cp2k.md b/docs/apps/cp2k.md index b9ee40c55b..8161a1d521 100644 --- a/docs/apps/cp2k.md +++ b/docs/apps/cp2k.md @@ -11,6 +11,7 @@ catalog: - LUMI - Puhti - Mahti + - Roihu --- # CP2K @@ -22,6 +23,28 @@ parallel quantum chemistry calculations, in particular for AIMD. 
## Available +=== "Roihu-CPU" + | Version | Available modules | Notes | + |:-------:|:------------------|:-----------:| + |2025.1 |`cp2k/2025.1` | CPU version | + |2025.2 |`cp2k/2025.2` | CPU version | + |2026.1 |`cp2k/2026.1` | CPU version | + +=== "Roihu-GPU" + | Version | Available modules | Notes | + |:-------:|:------------------|:-----------:| + |2025.1 |`cp2k/2025.1` | GPU version | + |2025.2 |`cp2k/2025.2` | GPU version | + |2026.1 |`cp2k/2026.1` | GPU version | + +=== "LUMI" + | Version | Available modules | Notes | + |:-------:|:---------------------------------|:---------------------:| + |2024.3 |`cp2k/2024.3`
`cp2k/2024.3-gpu`| GPU version available | + |2025.1 |`cp2k/2025.1`
`cp2k/2025.1-gpu`| GPU version available | + |2025.2 |`cp2k/2025.2`
`cp2k/2025.2-gpu`| GPU version available | + |2026.1 |`cp2k/2026.1`
`cp2k/2026.1-gpu`| GPU version available | + === "Puhti" | Version | Available modules | Notes | |:-------:|:------------------|:-----:| @@ -45,14 +68,6 @@ parallel quantum chemistry calculations, in particular for AIMD. |2024.2 |`cp2k/2024.2` | | |2025.1 |`cp2k/2025.1` | | -=== "LUMI" - | Version | Available modules | Notes | - |:-------:|:---------------------------------|:---------------------:| - |2024.3 |`cp2k/2024.3`
`cp2k/2024.3-gpu`| GPU version available | - |2025.1 |`cp2k/2025.1`
`cp2k/2025.1-gpu`| GPU version available | - |2025.2 |`cp2k/2025.2`
`cp2k/2025.2-gpu`| GPU version available | - |2026.1 |`cp2k/2026.1`
`cp2k/2026.1-gpu`| GPU version available | - ## License CP2K is freely available under the GPL license. @@ -91,43 +106,50 @@ double the number of cores the calculation should be at least 1.5 times faster. ### Example batch scripts -=== "Puhti (MPI only)" +=== "Roihu-CPU (hybrid MPI/OpenMP)" ```bash #!/bin/bash - #SBATCH --time=00:10:00 - #SBATCH --ntasks-per-node=40 - #SBATCH --nodes=2 - #SBATCH --mem-per-cpu=2GB - #SBATCH --partition=large + #SBATCH --time=00:05:00 + #SBATCH --nodes=1 + #SBATCH --ntasks-per-node=96 # maximum 384 + #SBATCH --cpus-per-task=4 # 384 / ntasks-per-node + #SBATCH --partition=medium #SBATCH --account= + #SBATCH --hint=nomultithread module purge - module load gcc/14.2.0 openmpi/5.0.6 - module load cp2k/2025.1 + module load gcc/15.2.0 openmpi/5.0.8 + module load cp2k/2026.1 - srun cp2k.psmp H2O-64.inp > H2O-64.out + srun cp2k.psmp H2O-512.inp > H2O-512.out ``` -=== "Mahti (mixed MPI/OpenMP)" +=== "Roihu-GPU (full GPU node)" ```bash #!/bin/bash - #SBATCH --time=00:05:00 - #SBATCH --ntasks-per-node=32 # 2 - 128 - #SBATCH --cpus-per-task=4 # 128 / ntasks-per-node - #SBATCH --nodes=2 - #SBATCH --partition=test + #SBATCH --time=00:10:00 + #SBATCH --nodes=1 + #SBATCH --ntasks-per-node=64 + #SBATCH --cpus-per-task=4 + #SBATCH --gres=gpu:gh200:4 + #SBATCH --partition=gpumedium #SBATCH --account= + #SBATCH --hint=nomultithread module purge - module load gcc/14.2.0 openmpi/5.0.6 - module load cp2k/2025.1 + module load gcc/13.4.0 openmpi/5.0.8 + module load cp2k/2026.1 - export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK - export OMP_PLACES=cores + export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK} - srun cp2k.psmp H2O-64.inp > H2O-64.out + # Nvidia multi-process service (MPS) required to run multiple tasks per GPU + nvidia-cuda-mps-control -d + + srun cp2k.psmp H2O-dft-ls.inp > H2O-dft-ls.out + + echo quit | nvidia-cuda-mps-control ``` === "LUMI-G (full GPU node)" @@ -171,6 +193,45 @@ double the number of cores the calculation should be at least 1.5 times 
faster. separate GPU, you can reserve up to 8 "GPUs" per node. See more details in [LUMI Docs](https://docs.lumi-supercomputer.eu/hardware/lumig/). +=== "Puhti (MPI only)" + + ```bash + #!/bin/bash + #SBATCH --time=00:10:00 + #SBATCH --ntasks-per-node=40 + #SBATCH --nodes=2 + #SBATCH --mem-per-cpu=2GB + #SBATCH --partition=large + #SBATCH --account= + + module purge + module load gcc/14.2.0 openmpi/5.0.6 + module load cp2k/2025.1 + + srun cp2k.psmp H2O-64.inp > H2O-64.out + ``` + +=== "Mahti (hybrid MPI/OpenMP)" + + ```bash + #!/bin/bash + #SBATCH --time=00:05:00 + #SBATCH --ntasks-per-node=32 # 2 - 128 + #SBATCH --cpus-per-task=4 # 128 / ntasks-per-node + #SBATCH --nodes=2 + #SBATCH --partition=test + #SBATCH --account= + + module purge + module load gcc/14.2.0 openmpi/5.0.6 + module load cp2k/2025.1 + + export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK + export OMP_PLACES=cores + + srun cp2k.psmp H2O-64.inp > H2O-64.out + ``` + ### Performance notes #### Mahti diff --git a/docs/apps/gromacs.md b/docs/apps/gromacs.md index 46c74e1a45..1a7e35182b 100644 --- a/docs/apps/gromacs.md +++ b/docs/apps/gromacs.md @@ -29,22 +29,22 @@ with plenty of analysis scripts. 
=== "Roihu-CPU" | Version | Available modules | Notes | |:-------:|:------------------|:-----:| - |2025.1 |`gromacs/2025.1`| - |2025.2 |`gromacs/2025.2`| - |2025.3 |`gromacs/2025.3`| - |2025.4 |`gromacs/2025.4`| - |2026.0 |`gromacs/2026.0`| - |2026.1 |`gromacs/2026.1`| + |2025.1 |`gromacs/2025.1`|CPU version + |2025.2 |`gromacs/2025.2`|CPU version + |2025.3 |`gromacs/2025.3`|CPU version + |2025.4 |`gromacs/2025.4`|CPU version + |2026.0 |`gromacs/2026.0`|CPU version + |2026.1 |`gromacs/2026.1`|CPU version === "Roihu-GPU" | Version | Available modules | Notes | |:-------:|:------------------|:-----:| - |2025.1 |`gromacs/2025.1`| - |2025.2 |`gromacs/2025.2`| - |2025.3 |`gromacs/2025.3`| - |2025.4 |`gromacs/2025.4`| - |2026.0 |`gromacs/2026.0`| - |2026.1 |`gromacs/2026.1`| + |2025.1 |`gromacs/2025.1`|GPU version + |2025.2 |`gromacs/2025.2`|GPU version + |2025.3 |`gromacs/2025.3`|GPU version + |2025.4 |`gromacs/2025.4`|GPU version + |2026.0 |`gromacs/2026.0`|GPU version + |2026.1 |`gromacs/2026.1`|GPU version === "LUMI" | Version | Available modules | Notes | From c50805ba74777312848f0dbd3bb6adfe272957ad Mon Sep 17 00:00:00 2001 From: Rasmus Kronberg <43936697+rkronberg@users.noreply.github.com> Date: Tue, 31 Mar 2026 23:39:44 +0300 Subject: [PATCH 017/139] Maestro on Roihu (#2923) * Maestro on Roihu * rephrase --- docs/apps/maestro.md | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/docs/apps/maestro.md b/docs/apps/maestro.md index c207c272a5..5b9d7728d0 100644 --- a/docs/apps/maestro.md +++ b/docs/apps/maestro.md @@ -14,6 +14,7 @@ catalog: - Mahti - Puhti - Mahti + - Roihu --- # Maestro @@ -44,6 +45,7 @@ self-learning materials. 
## Available +* Roihu-CPU: 2025.1, 2025.2, 2025.3, 2025.4, 2026.1 * Puhti: 2024.2, 2024.3, 2024.4, 2025.1, 2025.2, 2025.3, 2025.4, 2026.1 * Mahti: 2024.2, 2024.3, 2024.4, 2025.1, 2025.2, 2025.3, 2025.4, 2026.1 @@ -52,6 +54,14 @@ Specifically, this means that module versions older than two years will be remov This policy is enforced to free up disk space and encourage use of the latest versions which tend to be more performant and have less bugs. +!!! warning "Desmond MD simulations cannot be run on Roihu!" + Schrödinger only ships x86 builds of their software suite. This means that + **any Schrödinger modules that require GPUs, most notably Desmond, cannot + be run on Roihu**. + + CSC provides Maestro modules only on Roihu-CPU for purely CPU-based + workloads such as virtual screening (Glide). + !!! info "Maestro versions older than 2023.1 will not work after 13.3.2025!" Schrödinger has taken into use a [new license manager](https://www.schrodinger.com/life-science/learn/white-paper/new-schrodinger-license-manager/), From 79290e8a9b4100a0d0193a8cbd09bb0a51064d54 Mon Sep 17 00:00:00 2001 From: leopekkas Date: Wed, 8 Apr 2026 12:23:17 +0300 Subject: [PATCH 018/139] Add Argos error information for pilot projects --- docs/support/tutorials/roihu.md | 67 +++++++++++++++++++++++---------- 1 file changed, 47 insertions(+), 20 deletions(-) diff --git a/docs/support/tutorials/roihu.md b/docs/support/tutorials/roihu.md index e1c1865c59..0128f092de 100644 --- a/docs/support/tutorials/roihu.md +++ b/docs/support/tutorials/roihu.md @@ -7,7 +7,7 @@ you can start by looking at our general checking the [CSC Computing Environment self-learning course materials](https://csc-training.github.io/csc-env-eff/). -To access Roihu, you need a CSC user account and project that has Roihu service +To access Roihu, you need a CSC user account and a project with the Roihu service enabled. [Read more here](../../accounts/index.md). [TOC] @@ -17,19 +17,18 @@ enabled. 
[Read more here](../../accounts/index.md). Connect to Roihu using either: * [SSH client](#ssh-client) -* [Roihu web interface](#roihu-web-interface) +* [Roihu web interface (available after general availability)](#roihu-web-interface) ### SSH client -Connecting to Roihu using an SSH client requires that you have: +Connecting to Roihu using an SSH client requires that you: -1. Set up SSH keys and added your public key to MyCSC (like on Puhti & Mahti). -2. **New:** _Signed_ your public key and downloaded a _certificate_ that allows - authenticating. +1. Set up SSH keys and add your public key to MyCSC (like on Puhti & Mahti). +2. **New:** _Sign_ your public key and download a _certificate_ for authentication. * Each certificate is valid for 24 hours, after which a new one must be generated. -**[Read the detailed instructions for managing SSH keys and certificates here](../../computing/connecting/ssh-keys.md).** +**[Read detailed instructions for managing SSH keys and certificates](../../computing/connecting/ssh-keys.md).** Once you have set up SSH keys and obtained a valid SSH certificate, connect using an SSH client: @@ -37,17 +36,16 @@ using an SSH client: * [Instructions for Linux/macOS](../../computing/connecting/ssh-unix.md). * [Instructions for Windows](../../computing/connecting/ssh-windows.md). -!!! info "Roihu has separate login nodes for CPU and GPU partitions" +!!! info "Separate login nodes for CPU and GPU partitions" Roihu has - [different CPU architectures on the CPU and GPU nodes](../../computing/systems-roihu.md#compute). + [different CPU architectures on Roihu-CPU and Roihu-GPU](../../computing/systems-roihu.md#compute). Hence, there are separate login nodes for building programs and submitting - jobs to the respective nodes: + jobs to their respective nodes: 1. **`roihu-cpu.csc.fi`** 2. 
**`roihu-gpu.csc.fi`** - For example, connect to one of the CPU login nodes using a command-line SSH - client like this: + Connecting example (Roihu-CPU): ```bash # Replace with the name of your CSC user account. @@ -59,8 +57,7 @@ using an SSH client: the CPU nodes, while software built on `roihu-gpu.csc.fi` can only be run on the GPU nodes. Importantly, this applies also to Python environments. - **Note that you may access your files from all login nodes because they all - use the same shared file system.** + **All login nodes still share the same file system, so your files are accessible from all of them.** ### Roihu web interface @@ -72,22 +69,52 @@ The simplest way to connect to Roihu is to use the web interface. ## Migrating research data -If you need to transfer research data from Puhti or Mahti to Roihu, we require +If you need to transfer data from Puhti or Mahti to Roihu, we require that you: -1. Carefully review your data before transferring it – **only move what you +1. Review your data carefully – **only move what you really need and check that you have enough space available on Roihu!** - Notably, previous extended disk quotas on Puhti or Mahti will not be - automatically moved to Roihu. Quota extensions on Roihu must be separately - applied for and properly motivated. -2. Move your data **directly** from Puhti or Mahti to Roihu. + Note that extended disk quotas from Puhti or Mahti are not automatically transferred. + Quota extensions on Roihu must be applied for separately. +2. Transfer data **directly** from Puhti or Mahti to Roihu. 
**[Read the detailed instructions in the Roihu data migration guide](roihu-data.md).** ## Installing software +For instructions on the available compilers and preferred options, see the instructions for compiling software on: + +- [Compiling on Roihu-CPU](https://csc-guide-preview.2.rahtiapp.fi/origin/roihu/computing/compiling-roihu/) +- [Compiling on Roihu-GPU](https://csc-guide-preview.2.rahtiapp.fi/origin/roihu/computing/compiling-roihu/) + ## Running your first job +### Known issues (pilot phase) + +During the pilot phase, you may encounter multiple warnings or errors related to *Argos* when running jobs, for example: + +``` +error: argos:slurm_spank_task_init: get_env_var: cannot get SLURM_ARGOS_SPANK_OPT from job(22474) environment (No such environment variable) +``` + +These messages are **harmless** and do not affect your job execution. +Your job will continue normally with Argos disabled. + +If your job completes successfully, you can safely ignore these messages. + +To suppress most of the Argos related warnings and errors, you can pass the `--argos=no` flag option to srun in the following manner: + +```bash +#!/bin/bash +#SBATCH --account=project_2001659 +#SBATCH --partition=test +#SBATCH --nodes=1 +#SBATCH --ntasks=1 +#SBATCH --time=DD:HH:MM + +srun --argos=no +``` + ## More information * [Roihu system overview](../../computing/systems-roihu.md) From 51074ab62236928627e10870b4da6d70cc13ee1d Mon Sep 17 00:00:00 2001 From: leopekkas Date: Wed, 8 Apr 2026 12:31:42 +0300 Subject: [PATCH 019/139] Add preliminary information on running, fix links to be dynamic vs. 
static --- docs/support/tutorials/roihu.md | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/docs/support/tutorials/roihu.md b/docs/support/tutorials/roihu.md index 0128f092de..2dca469a60 100644 --- a/docs/support/tutorials/roihu.md +++ b/docs/support/tutorials/roihu.md @@ -84,14 +84,16 @@ that you: For instructions on the available compilers and preferred options, see the instructions for compiling software on: -- [Compiling on Roihu-CPU](https://csc-guide-preview.2.rahtiapp.fi/origin/roihu/computing/compiling-roihu/) -- [Compiling on Roihu-GPU](https://csc-guide-preview.2.rahtiapp.fi/origin/roihu/computing/compiling-roihu/) +- [Compiling on Roihu-CPU](../../computing/compiling-roihu/) +- [Compiling on Roihu-GPU](../../computing/compiling-roihu/) ## Running your first job +For more examples, see [Roihu example Slurm job scripts](../../computing/example-job-scripts-roihu.md) + ### Known issues (pilot phase) -During the pilot phase, you may encounter multiple warnings or errors related to *Argos* when running jobs, for example: +During the pilot phase, you may encounter multiple warnings or errors related to *Argos* in your Slurm job output, for example: ``` error: argos:slurm_spank_task_init: get_env_var: cannot get SLURM_ARGOS_SPANK_OPT from job(22474) environment (No such environment variable) From 619a01040ab19209ee680445b2f438f6b986cef9 Mon Sep 17 00:00:00 2001 From: leopekkas Date: Wed, 8 Apr 2026 14:06:38 +0300 Subject: [PATCH 020/139] Add links to Roihu on the computing index page --- docs/computing/index.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/docs/computing/index.md b/docs/computing/index.md index d507fbc8d4..b949745d9a 100644 --- a/docs/computing/index.md +++ b/docs/computing/index.md @@ -61,7 +61,7 @@ ssh username@puhti.csc.fi ssh username@mahti.csc.fi ``` -Roihu has separate login nodes for the CPU and GPU partitions: +Roihu has separate login nodes for the Roihu-CPU and Roihu-GPU partitions: 
```bash ssh username@roihu-cpu.csc.fi @@ -90,7 +90,7 @@ Puhti, Mahti and Roihu can also be accessed via their respective [web interfaces](webinterface/index.md) available at [www.puhti.csc.fi](https://www.puhti.csc.fi), [www.mahti.csc.fi](https://www.mahti.csc.fi) and -[www.roihu.csc.fi](https://www.mahti.csc.fi). +[www.roihu.csc.fi](https://www.roihu.csc.fi). ### Scalability @@ -165,6 +165,7 @@ csc-workspaces * [Installing software](installing.md) * [Compiling on Puhti](compiling-puhti.md) * [Compiling on Mahti](compiling-mahti.md) + * [Compiling on Roihu](compiling-roihu.md) * [Debugging applications](debugging.md): How to debug your applications * [Performance analysis](performance.md): How to understand the performance of your applications From 8fa0e2a7f44989fa377b173b86376d61e53d1cbf Mon Sep 17 00:00:00 2001 From: leopekkas Date: Thu, 9 Apr 2026 14:53:59 +0300 Subject: [PATCH 021/139] Add sections for installing and running jobs in Roihu --- docs/support/tutorials/roihu.md | 114 ++++++++++++++++++++++++-------- 1 file changed, 85 insertions(+), 29 deletions(-) diff --git a/docs/support/tutorials/roihu.md b/docs/support/tutorials/roihu.md index 2dca469a60..321809b8c3 100644 --- a/docs/support/tutorials/roihu.md +++ b/docs/support/tutorials/roihu.md @@ -1,17 +1,24 @@ # Getting started with Roihu -This is a quickstart guide for Roihu users. It is assumed that you have -previously used CSC supercomputing resources like Puhti, Mahti or LUMI. If not, -you can start by looking at our general -[getting started with supercomputing guide](hpc-quick.md). We also recommend -checking the -[CSC Computing Environment self-learning course materials](https://csc-training.github.io/csc-env-eff/). +This guide assumes familiarity with CSC supercomputers such as Puhti, Mahti or LUMI. + +If you are new to CSC systems, start with the [getting started with supercomputing guide](hpc-quick.md). To access Roihu, you need a CSC user account and a project with the Roihu service -enabled. 
[Read more here](../../accounts/index.md). +enabled. [Read more about CSC accounts and projects](../../accounts/index.md). [TOC] +!!! info "Key differences compared to Puhti and Mahti" + Before you begin, note the following important differences: + + - **SSH authentication requires short-lived certificates (24h)** + - **Separate login nodes for CPU and GPU environments** + - Software built on CPU nodes cannot be used on GPU nodes (and vice versa) + - Disk quota extensions are **not automatically transferred** from earlier projects on Puhti/Mahti + + These differences affect most workflows, so read the sections below carefully. + ## Connecting Connect to Roihu using either: @@ -21,22 +28,18 @@ Connect to Roihu using either: ### SSH client -Connecting to Roihu using an SSH client requires that you: - -1. Set up SSH keys and add your public key to MyCSC (like on Puhti & Mahti). -2. **New:** _Sign_ your public key and download a _certificate_ for authentication. - * Each certificate is valid for 24 hours, after which a new one must be - generated. - -**[Read detailed instructions for managing SSH keys and certificates](../../computing/connecting/ssh-keys.md).** +To connect via SSH: -Once you have set up SSH keys and obtained a valid SSH certificate, connect -using an SSH client: +1. Set up SSH keys (same as Puhti/Mahti) +2. **New:** _Sign_ your public key and download a _certificate_ + - Certificates are valid for **24 hours** * [Instructions for Linux/macOS](../../computing/connecting/ssh-unix.md). * [Instructions for Windows](../../computing/connecting/ssh-windows.md). -!!! info "Separate login nodes for CPU and GPU partitions" +**[Read detailed instructions for managing SSH keys and certificates](../../computing/connecting/ssh-keys.md).** + +!!! warning "Separate CPU and GPU environments" Roihu has [different CPU architectures on Roihu-CPU and Roihu-GPU](../../computing/systems-roihu.md#compute). 
Hence, there are separate login nodes for building programs and submitting @@ -53,13 +56,14 @@ using an SSH client: ssh @roihu-cpu.csc.fi ``` - Please observe that software built on `roihu-cpu.csc.fi` can only be run on - the CPU nodes, while software built on `roihu-gpu.csc.fi` can only be run - on the GPU nodes. Importantly, this applies also to Python environments. + **Important:** + - Software compiled on Roihu-CPU nodes only works on Roihu-CPU nodes + - Software compiled on Roihu-GPU nodes only works on Roihu-GPU nodes + - This also applies to Python environments **All login nodes still share the same file system, so your files are accessible from all of them.** -### Roihu web interface +### Roihu web interface (available after general availability) The simplest way to connect to Roihu is to use the web interface. @@ -73,23 +77,74 @@ If you need to transfer data from Puhti or Mahti to Roihu, we require that you: 1. Review your data carefully – **only move what you - really need and check that you have enough space available on Roihu!** - Note that extended disk quotas from Puhti or Mahti are not automatically transferred. - Quota extensions on Roihu must be applied for separately. -2. Transfer data **directly** from Puhti or Mahti to Roihu. + really need** +2. Check your available disk space on Roihu + - Use e.g. the `csc-workspaces` command for this + - Quota extensions are not transferred automatically +3. Transfer data **directly** from Puhti or Mahti to Roihu. **[Read the detailed instructions in the Roihu data migration guide](roihu-data.md).** ## Installing software +Before installing anything: + +1. 
Check if the software is already available: + - [List of pre-installed applications](../apps/index.md) + - `module spider ` + +If not available, choose one of the following approaches depending on your needs: + +### Compiling C/C++/Fortran code + +HPC software written using programming languages such as C, C++ or Fortran need to be compiled before installing. For instructions on the available compilers and preferred options, see the instructions for compiling software on: -- [Compiling on Roihu-CPU](../../computing/compiling-roihu/) -- [Compiling on Roihu-GPU](../../computing/compiling-roihu/) +- [Compiling on Roihu-CPU](../../computing/compiling-roihu.md#building-mpi-applications) +- [Compiling on Roihu-GPU](../../computing/compiling-roihu.md#building-gpu-applications) + +### Containers + +Roihu supports Apptainer/Singularity containers for container installations. +In most cases, ready-made Docker containers can be easily converted into an Apptainer image. +Another option is to build your own container from scratch. + +More details on working with containers in CSC's computing environment can be found from the links below: + +- [Overview of containers](containers/overview.md) +- [Running containers](containers/overview.md#running-containers) +- [Creating containers](containers/overview.md#building-container-images) +- [Tykky container wrapper](containers/tykky.md) + +### Python/R environments + +Best practice guidelines on installing your own Python and R packages can be found in the Python, R and Tykky container wrapper pages below. + +- [Installing Python packages and environments](../support/tutorials/python-usage-guide.md) +- [Containerizing Conda and pip environments with Tykky](containers/tykky.md) +- [R package installations](../apps/r-env.md#r-package-installations) ## Running your first job -For more examples, see [Roihu example Slurm job scripts](../../computing/example-job-scripts-roihu.md) +Roihu uses Slurm, similarly to Puhti and Mahti. 
+ +Basic workflow: + +1. Create a job script where you + - Define the resources for your job (time, memory, cores) + - Load the required modules + - Launch your executable +2. Submit your batch job into the queuing system +3. Wait for the job to finish, and look for its output + +See the relevant sections for detailed steps: + +1. [Available batch job partitions](batch-job-partitions.md) +2. [Creating a batch job script](creating-job-scripts-roihu.md) +3. [Submit a batch job](submitting-jobs.md) +4. [Performance checklist](performance-checklist.md) + +For common Slurm error messages, see our FAQ on [Why does my batch job fail?](../faq/why-does-my-batch-job-fail.md). ### Known issues (pilot phase) @@ -121,3 +176,4 @@ srun --argos=no * [Roihu system overview](../../computing/systems-roihu.md) * [CSC Computing Environment self-learning materials](https://csc-training.github.io/csc-env-eff/) +* [Contact our service desk](https://docs.csc.fi/support/contact/) \ No newline at end of file From 94a5c3ad145ad9f46d9dbc5a0c89d760024e6ffa Mon Sep 17 00:00:00 2001 From: leopekkas Date: Thu, 9 Apr 2026 15:05:57 +0300 Subject: [PATCH 022/139] Fix list formatting --- docs/support/tutorials/roihu.md | 18 ++++++++++-------- 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/docs/support/tutorials/roihu.md b/docs/support/tutorials/roihu.md index 321809b8c3..3852e3a6b1 100644 --- a/docs/support/tutorials/roihu.md +++ b/docs/support/tutorials/roihu.md @@ -32,7 +32,9 @@ To connect via SSH: 1. Set up SSH keys (same as Puhti/Mahti) 2. **New:** _Sign_ your public key and download a _certificate_ - - Certificates are valid for **24 hours** + * Certificates are valid for **24 hours** + +For platform-specific instructions, see: * [Instructions for Linux/macOS](../../computing/connecting/ssh-unix.md). * [Instructions for Windows](../../computing/connecting/ssh-windows.md). @@ -78,11 +80,11 @@ that you: 1. Review your data carefully – **only move what you really need** -2. 
Check your available disk space on Roihu - - Use e.g. the `csc-workspaces` command for this - - Quota extensions are not transferred automatically +2. Check your available disk space on Roihu (for example, using the `csc-workspaces` command) 3. Transfer data **directly** from Puhti or Mahti to Roihu. +Note that previous extended disk quotas on Puhti or Mahti will not be automatically moved to Roihu. Quota extensions on Roihu must be separately applied for and properly motivated. + **[Read the detailed instructions in the Roihu data migration guide](roihu-data.md).** ## Installing software @@ -131,13 +133,13 @@ Roihu uses Slurm, similarly to Puhti and Mahti. Basic workflow: 1. Create a job script where you - - Define the resources for your job (time, memory, cores) - - Load the required modules - - Launch your executable + * Define the resources for your job (time, memory, cores) + * Load the required modules + * Launch your executable 2. Submit your batch job into the queuing system 3. Wait for the job to finish, and look for its output -See the relevant sections for detailed steps: +See the relevant documentation below for detailed information: 1. [Available batch job partitions](batch-job-partitions.md) 2. 
[Creating a batch job script](creating-job-scripts-roihu.md) From 6b8b4d89f2a71fdfc57b495ec823ffa201410e16 Mon Sep 17 00:00:00 2001 From: leopekkas Date: Thu, 9 Apr 2026 15:10:06 +0300 Subject: [PATCH 023/139] Remove specific project id from example Slurm script --- docs/support/tutorials/roihu.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/support/tutorials/roihu.md b/docs/support/tutorials/roihu.md index 3852e3a6b1..2c2359fbc5 100644 --- a/docs/support/tutorials/roihu.md +++ b/docs/support/tutorials/roihu.md @@ -165,7 +165,7 @@ To suppress most of the Argos related warnings and errors, you can pass the `--a ```bash #!/bin/bash -#SBATCH --account=project_2001659 +#SBATCH --account= #SBATCH --partition=test #SBATCH --nodes=1 #SBATCH --ntasks=1 From d3cbbfae38892f8bd72fb6809c85a562235ee47c Mon Sep 17 00:00:00 2001 From: leopekkas Date: Thu, 9 Apr 2026 15:14:50 +0300 Subject: [PATCH 024/139] Fix warning section formatting --- docs/support/tutorials/roihu.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/support/tutorials/roihu.md b/docs/support/tutorials/roihu.md index 2c2359fbc5..a10b54afb6 100644 --- a/docs/support/tutorials/roihu.md +++ b/docs/support/tutorials/roihu.md @@ -58,10 +58,10 @@ For platform-specific instructions, see: ssh @roihu-cpu.csc.fi ``` - **Important:** - - Software compiled on Roihu-CPU nodes only works on Roihu-CPU nodes - - Software compiled on Roihu-GPU nodes only works on Roihu-GPU nodes - - This also applies to Python environments + **Importantly:** + - Software compiled on Roihu-CPU nodes only works on Roihu-CPU nodes + - Software compiled on Roihu-GPU nodes only works on Roihu-GPU nodes + - This also applies to Python environments **All login nodes still share the same file system, so your files are accessible from all of them.** From 55421a74bacad17d73726186c67a5a58a821b4f3 Mon Sep 17 00:00:00 2001 From: leopekkas Date: Fri, 10 Apr 2026 09:58:36 +0300 Subject: [PATCH 025/139] 
Add link to relevant docs pages on SSH --- docs/support/tutorials/roihu.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/docs/support/tutorials/roihu.md b/docs/support/tutorials/roihu.md index a10b54afb6..977d102b0d 100644 --- a/docs/support/tutorials/roihu.md +++ b/docs/support/tutorials/roihu.md @@ -33,13 +33,14 @@ To connect via SSH: 1. Set up SSH keys (same as Puhti/Mahti) 2. **New:** _Sign_ your public key and download a _certificate_ * Certificates are valid for **24 hours** + * See our instructions for managing certificates: [Signing public SSH keys](https://csc-guide-preview.2.rahtiapp.fi/origin/roihu-quickstart/computing/connecting/ssh-keys/#signing-public-key) For platform-specific instructions, see: * [Instructions for Linux/macOS](../../computing/connecting/ssh-unix.md). * [Instructions for Windows](../../computing/connecting/ssh-windows.md). -**[Read detailed instructions for managing SSH keys and certificates](../../computing/connecting/ssh-keys.md).** +**[Read detailed instructions for creating and managing SSH keys and certificates](../../computing/connecting/ssh-keys.md).** !!! warning "Separate CPU and GPU environments" Roihu has From 3d77f2dd6531f03f9f11a74b51b0b65114bc7a78 Mon Sep 17 00:00:00 2001 From: Lauri Niemi <113029612+niemilau@users.noreply.github.com> Date: Fri, 10 Apr 2026 13:41:31 +0300 Subject: [PATCH 026/139] Roihu docs on compiling (#2938) * Roihu compilation instructions * Grammar fixes Co-authored-by: Martti Louhivuori * - Updated nvcc instructions - Use Roihu-CPU and Roihu-GPU names when referring to partitions - Small formatting changes * Wrote basic compilation instructions for Roihu-CPU. 
Some TODOs left still * Added instructions for compiling with MPI on Roihu-GPU, plus general cleanup * Added instructions for using provided HPC libraries on Roihu * Fixed link to CUDA section * Fixed Roihu-CPU sub-sectioning * Apply suggestions from code review Co-authored-by: Leopekka Saraste <45951298+leopekkas@users.noreply.github.com> * Raise roihu section above mahti/puhti * wording --------- Co-authored-by: leopekkas Co-authored-by: Martti Louhivuori Co-authored-by: Leopekka Saraste <45951298+leopekkas@users.noreply.github.com> --- docs/computing/compiling-roihu.md | 218 ++++++++++++++++++++++++++++++ docs/computing/hpc-libraries.md | 40 +++++- docs/computing/installing.md | 1 + mkdocs.yml | 1 + 4 files changed, 256 insertions(+), 4 deletions(-) create mode 100644 docs/computing/compiling-roihu.md diff --git a/docs/computing/compiling-roihu.md b/docs/computing/compiling-roihu.md new file mode 100644 index 0000000000..71eb907a66 --- /dev/null +++ b/docs/computing/compiling-roihu.md @@ -0,0 +1,218 @@ +# Compiling applications in Roihu + +!!! info + Roihu has separate CPU and GPU partitions with different CPU architectures: + + - Roihu-CPU nodes use AMD (x86) processors + - Roihu-GPU nodes use NVIDIA Grace (ARM) processors + + Binaries compiled for one architecture are generally not usable on the other. + Accordingly, software should be compiled on the same side where it will be run: + + - Compile for CPU nodes on the Roihu-CPU login node + - Compile for GPU nodes on the Roihu-GPU login node + + +## General instructions + +- Whenever possible, use the [local disk](disk.md#login-nodes) on the login node for compiling software. + - Compiling on the local disk is much faster and shifts load from the shared file system. + - The local disk is cleaned frequently, so please move your files elsewhere after compiling. + + +- Please see [the page on available HPC libraries](hpc-libraries.md#libraries-on-roihu) for using common libraries (BLAS, FFTW, ...) 
+and linking them to your applications. + +## Compiling on Roihu-CPU + +C/C++ and Fortran applications can be built with +the [GNU](https://gcc.gnu.org) or the [AMD](https://developer.amd.com/amd-aocc/) +compiler suites. GNU compilers are loaded by default. AMD compilers can be +loaded using the [Module system](modules.md) with the command: +``` +module load aocc +``` + +The compiler executables are as follows: + +| Compiler suite | C | C++ | Fortran | +| :------------- | :- | :-- | :------ | +| GNU | gcc | g++ | gfortran | +| AMD | clang | clang++ | flang | + +For applications that depend on MPI, it is recommended to instead use the compiler +wrappers described in the [MPI section](#building-mpi-applications) below. + +The compiler options for different suites are different. The +recommended basic optimization flags are listed in the table below. It +is recommended to start from the safe level and then move up to intermediate +or even aggressive, while ensuring the results are correct and the program's +performance has improved. + +| Optimization level | GNU | AMD (clang) | +| :----------------- | :---------------- | :----------- | +| **Safe** | -O2 -march=native | -O2 -march=native | +| **Intermediate** | -O3 -march=native | -O3 -march=native | +| **Aggressive** | -O3 -march=native -ffast-math -funroll-loops | + +!!! info + Because the Roihu-CPU login and compute nodes share the same CPU architecture, + compiling for the native architecture (`-march=native`) is optimal even if + the compilation is done on login nodes. + +Example of compiling a non-MPI C program in GNU environment: +```bash +gcc -O3 -march=native example.c -o example +``` + +A detailed list of options for the GNU and AMD compilers can be found in the _man_ +pages (`man gcc/gfortran`) when the corresponding programming +environment is loaded, or in the compiler manuals (see the links above). 
+ +We recommend testing and profiling your application with both compiler suites +to see which compiler works the best for your use case. + +List all available versions of the compiler suites: +``` +module spider gcc +module spider aocc +``` + + +### Building MPI applications + +!!! warning + The AMD compiler environment does not yet have a supporting MPI module. + We expect to set this up shortly; until then, please use the GNU environment + for building MPI applications. + + +The MPI environment in Roihu is OpenMPI. You may use one of the MPI compiler wrappers +`mpicc` (C), `mpicxx` (C++), or `mpif90` (Fortran) when compiling MPI applications. +These wrappers end up calling the compiler from your currently loaded compiler suite +(GNU or AMD) and work in both compiler suites. + +Example: +```bash +mpicc -O3 -march=native example.c -o example +``` + +List all available versions of OpenMPI (one is always loaded by default): +``` +module spider openmpi +``` + + +### Building OpenMP and hybrid applications + +An additional compiler and linker flag is needed when building an OpenMP or a hybrid +MPI/OpenMP application: + +| Compiler suite | OpenMP flag | +| :------------- | :---------- | +| GNU and AMD | -fopenmp | + +Example compilation of a hybrid MPI/OpenMP application: +```bash +mpicc -O3 -march=native -fopenmp example.c -o example +``` + + +## Compiling on Roihu-GPU + +!!! info + When compiling for the GPU nodes on Roihu, make sure you use Roihu's GPU login nodes. + Binaries compiled on Roihu-CPU are not compatible with Roihu-GPU nodes. + + +Roihu-GPU provides two compiler environments for building C/C++ and Fortran applications: +the [GNU](https://gcc.gnu.org) suite and the [NVIDIA-HPC](https://developer.nvidia.com/hpc-compilers) +suite. GNU compilers are loaded by default. 
NVIDIA compilers can be +loaded using the [Module system](modules.md) with the command: +``` +module load nvhpc +``` + +The compiler executables are as follows: + +| Compiler suite | C | C++ | Fortran | +| :------------- | :- | :-- | :------ | +| GNU | gcc | g++ | gfortran | +| NVIDIA | nvc | nvc++ | nvfortran | + + +In addition, the CUDA [`nvcc`](https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html) compiler is available for building GPU kernel code. See the [CUDA section below](#compiling-cuda-code). + + +List all available versions of the compiler suites: +``` +module spider gcc +module spider nvhpc +``` + + +### Compiling CUDA code + +CUDA is the recommended programming model for NVIDIA GPUs and is provided as an environment module +on Roihu-GPU (loaded by default). + +The CUDA compiler (`nvcc`) takes care of compiling CUDA kernel code for the target +GPU device and passes the rest to the currently loaded host compiler like `gcc` or `nvhpc`. +For example, to load the CUDA 13.1 environment together with the GNU compiler: + +```bash +module load gcc/15.2.0 cuda/13.1.1 +``` + +To generate code for a given target device, tell the CUDA +compiler what compute capability the target device supports. On Roihu, the +GPUs (Hopper 200) support compute capability 9.0. Specify this using +`-gencode arch=compute_90,code=sm_90`. Alternatively, you may use `compute_90a` +or `sm_90a` to enable Hopper-specific extension features that may produce more +performant code. + +For example, compiling a CUDA kernel (`example.cu`) on Roihu: + +```bash +nvcc -gencode arch=compute_90a,code=sm_90a example.cu +``` + +!!! info + Code generated with `arch=compute_90a` or `code=sm_90a` is not backwards or forwards + compatible with other GPU architectures. If this is a concern for you, use the + more generic `arch=compute_90,code=sm_90` options. 
+ + +### Building MPI applications on Roihu-GPU + +In the GNU compiler environment, an OpenMPI module is available that implements +CUDA-aware MPI. It is loaded by default. You may use one of the MPI compiler wrappers `mpicc` (C), +`mpicxx` (C++), or `mpif90` (Fortran) when compiling MPI applications. When compiling +MPI applications with `nvcc`, you will need to explicitly provide MPI include and library +paths: +```bash +nvcc -gencode arch=compute_90a,code=sm_90a example.cu -lmpi -I$OPENMPI_INSTROOT/include -L$OPENMPI_INSTROOT/lib +``` + +In the NVIDIA compiler environment, the MPI is bundled by NVIDIA and is directly +available after loading the compiler suite. There is no separate MPI module to load. + +!!! warning + The NVHPC environment on Roihu is still undergoing configuration. + The current version may have issues with, for example, its Slurm integration + on Roihu. For now, we strongly recommend using the GNU compiler suite when + building MPI applications. + + + diff --git a/docs/computing/hpc-libraries.md b/docs/computing/hpc-libraries.md index 01936fd631..c52f0649df 100644 --- a/docs/computing/hpc-libraries.md +++ b/docs/computing/hpc-libraries.md @@ -26,17 +26,49 @@ module load fftw and the directory containing `include`, `lib`, *etc.* are found under `FFTW_INSTALL_ROOT` environment variable. -## Libraries in Puhti -Selected libraries available in Puhti: +## Libraries on Roihu + +!!! warning + On Roihu-CPU and Roihu-GPU, many of the installed modules do not currently + set the `CPATH`, `LIBRARY_PATH` or `LD_LIBRARY_PATH` environment variables. + We expect to change this in the near future; until then, you may have to + set them manually, e.g. when compiling an application that depends on a module. + You can use `module show ` to see where the module files are located. + Many modules define a variable like `modulename_INSTROOT` that points to the + installation directory once the module has been loaded. 
For example, `fftw` + headers are in `$FFTW_INSTROOT/include` and the compiled library files are + in `$FFTW_INSTROOT/lib`. + + +### Roihu-CPU + +Selected libraries available on Roihu-CPU: + +- Dense linear algebra: `openblas` +- Dense distributed linear algebra: `netlib-scalapack` +- Fast fourier transforms: `fftw` + +### Roihu-GPU + +Selected libraries available on Roihu-GPU: + +- Dense linear algebra: `openblas`, `netlib-lapack`, `cublas` +- Dense distributed linear algebra: `netlib-scalapack` +- Fast fourier transforms: `fftw` + + +## Libraries on Puhti + +Selected libraries available on Puhti: - Dense linear algebra: `intel-oneapi-mkl` - Dense distributed linear algebra: `intel-oneapi-mkl`, `netlib-scalapack` - Fast fourier transforms: `fftw` -## Libraries in Mahti +## Libraries on Mahti -Selected libraries available in Mahti: +Selected libraries available on Mahti: - Dense linear algebra: `openblas`, `amdblis`, `amdlibflame` - Dense distributed linear algebra: `netlib-scalapack`, `amdscalapack` diff --git a/docs/computing/installing.md b/docs/computing/installing.md index 6755d2e474..f6221fad20 100644 --- a/docs/computing/installing.md +++ b/docs/computing/installing.md @@ -37,6 +37,7 @@ compiled before installing. Guidelines on compiling software on CSC supercompute be found from the links below. A list of available HPC libraries that may need to be linked upon compiling is also provided. 
+- [Compiling on Roihu](compiling-roihu.md) - [Compiling on Puhti](compiling-puhti.md) - [Compiling on Mahti](compiling-mahti.md) - [Compiling on LUMI](compiling-lumi.md) diff --git a/mkdocs.yml b/mkdocs.yml index bca25bb9dd..f9736dffd0 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -324,6 +324,7 @@ nav: - Performance checklist: computing/running/performance-checklist.md - Installing software: - computing/installing.md + - Compiling on Roihu: computing/compiling-roihu.md - Compiling on Puhti: computing/compiling-puhti.md - Compiling on Mahti: computing/compiling-mahti.md - Compiling on LUMI: computing/compiling-lumi.md From 712ff667393221ec7576239a494ca8483b50966f Mon Sep 17 00:00:00 2001 From: Iidahak <75577529+Iidahak@users.noreply.github.com> Date: Mon, 13 Apr 2026 13:46:03 +0300 Subject: [PATCH 027/139] update python docs for Roihu --- docs/apps/python-data.md | 23 +++++++++++++++----- docs/apps/python.md | 18 ++++++++------- docs/support/tutorials/python-usage-guide.md | 17 ++++++++++----- 3 files changed, 38 insertions(+), 20 deletions(-) diff --git a/docs/apps/python-data.md b/docs/apps/python-data.md index fd789aee44..4a0aef1f3d 100644 --- a/docs/apps/python-data.md +++ b/docs/apps/python-data.md @@ -8,7 +8,9 @@ catalog: disciplines: - Data Analytics and Machine Learning available_on: + - Roihu - Puhti + - Mahti --- # Python Data @@ -16,6 +18,8 @@ catalog: Collection of Python libraries for data analytics and machine learning. !!! info "News" + **31.3.2026** Python-data is now available on Roihu. + **12.9.2025** Installed `python-data/3.12-25.09` with newer packages of popular Python modules. @@ -43,7 +47,13 @@ installation. Typically the module will include the newest versions of libraries at installation time, to the extent software dependencies allow. 
-Current versions are: +Current versions in Roihu are: + +- Roihu-CPU: (default version) `python-data/3.12-31.03`: installed in March 2026, + includes for example Scikit-learn 1.8.0, SciPy 1.17.1, Pandas 3.0.2 + and JupyterLab 4.5.6. + +Current versions in Puhti and Mahti are: - (default version) `python-data/3.12-25.09`: installed in September 2025, includes for example Scikit-learn 1.7.2, SciPy 1.16.1, Pandas 2.3.2 @@ -75,6 +85,7 @@ data analytics and machine learning, for example: - [Dask](https://dask.org/): Scalable analytics in Python - [Gensim](https://radimrehurek.com/gensim/): Topic modelling - [Jupyter](https://jupyter.org/index.html) and [JupyterLab](https://jupyterlab.readthedocs.io/en/stable/) +- [Marimo](https://marimo.io) - [NLTK](https://matplotlib.org/): Natural language toolkit - [PyTables](http://www.pytables.org/) - [SciPy](https://www.scipy.org/), including [NumPy](https://www.numpy.org/), [Matplotlib](https://matplotlib.org/) and [Pandas](https://pandas.pydata.org/) @@ -95,8 +106,7 @@ If you think that some important package should be included in the module provided by CSC, please [contact our servicedesk](../support/contact.md). Note that some machine learning frameworks have their own specific modules, for example: -[PyTorch](pytorch.md), [TensorFlow](tensorflow.md), [JAX](jax.md), and -[RAPIDS](rapids.md). +[python-pytorch](python-pytorch.md), [python-vllm](python-vllm.md), [python-tensorFlow](python-tensorflow.md), [python-JAX](python-jax.md), and [RAPIDS](rapids.md). !!! info "Note about multi-threading" @@ -116,7 +126,7 @@ All packages are licensed under various free and open source licenses (FOSS). 
## Usage -To use this software on Puhti, initialize it with: +To use this software on Roihu, Puhti or Mahti, initialize it with: ```text module load python-data @@ -126,13 +136,14 @@ to access the default version, or if you wish to have a specific version ([see above for available versions](#available)): ```text -module load python-data/3.10-2023.11 +module load python-data/3.12-31.03 # on Roihu +module load python-data/3.12-25.09 # on other systems ``` If you just want the most recent version with a specific Python version, you can also run: ```text -module load python-data/3.10 +module load python-data/3.12 ``` This will show all available versions: diff --git a/docs/apps/python.md b/docs/apps/python.md index c4aa580db1..ed3d29f54e 100644 --- a/docs/apps/python.md +++ b/docs/apps/python.md @@ -8,6 +8,7 @@ catalog: disciplines: - Mathematics and Statistics available_on: + - Roihu - Puhti - Mahti --- @@ -22,6 +23,7 @@ please see our ## Available +* Roihu: 3.x versions * Puhti: 3.x versions * Mahti: 3.x versions @@ -65,19 +67,19 @@ python3.9 ### Pre-installed Python environments -Puhti and Mahti have several pre-installed +Roihu, Puhti and Mahti have several pre-installed [environment modules](../computing/modules.md) containing Python environments made for different science areas. 
| Module name | Purpose | |-|-| | [biopythontools](biopython.md) | bioinformatics | -| [geoconda](geoconda.md) | geoinformatics | -| [jax](jax.md) | JAX ML framework | +| [python-geo](python-geo.md) | geoinformatics | +| [python-jax](python-jax.md) | JAX ML framework | | [python-data](python-data.md) | data analysis and ML utilities | -| [pytorch](pytorch.md) | PyTorch ML framework | -| [qiskit](qiskit.md) | quantum computing | -| [tensorflow](tensorflow.md) | TensorFlow ML framework | +| [python-pytorch](python-pytorch.md) | PyTorch ML framework | +| [python-qiskit](python-qiskit.md) | quantum computing | +| [python-tensorflow](python-tensorflow.md) | TensorFlow ML framework | To use any of the above environments, simply load the corresponding module using the `module load` command. @@ -105,12 +107,12 @@ modules listed above). Note that most of the pre-installed Python environment modules are self-contained and mutually exclusive environments, so it does not - make sense to for example load both python-data and pytorch + make sense to for example load both python-data and python-pytorch modules. The module loaded last will be the only active one, and the module load command will warn about this, for example: ``` - Lmod is automatically replacing "python-data/3.10-24.04" with "pytorch/2.5". + Lmod is automatically replacing "python-data/3.12-31.03" with "python-pytorch/2.10". ``` diff --git a/docs/support/tutorials/python-usage-guide.md b/docs/support/tutorials/python-usage-guide.md index a86cde3b66..9d0fecba42 100644 --- a/docs/support/tutorials/python-usage-guide.md +++ b/docs/support/tutorials/python-usage-guide.md @@ -84,7 +84,11 @@ in a module provided by CSC, do not hesitate to contact our Packages are by default installed to your home directory under `.local/lib/pythonx.y/site-packages` (where `x.y` is - the version of Python being used). **Please note that if you install a lot of + the version of Python being used) except in Roihu. 
In Roihu, the + default installation path for additional packages is set accordingly + to the CPU architechture: `.local/cpu-arch/lib/pythonx.y/site-packages`, + where 'cpu-arch' is 'x86_64' for Roihu-CPU and 'aarch64' for Roihu-GPU. + **Please note that if you install a lot of packages, your home directory can easily run out of space.** This can be avoided by changing the installation folder to make a project-wide installation instead of a personal one. This is @@ -127,11 +131,12 @@ in a module provided by CSC, do not hesitate to contact our You can fix this by editing the first line of the executable (which in our example is located using `which whatshap`) to point to the real Python interpreter (can be found with `which python3`). - In our example we would edit the file `~/.local/bin/whatshap` + In our example we would edit the file `~/.local/bin/whatshap` (`~/.local/cpu-arch/bin/whatshap` in Roihu) to have the following as its first line: ```bash - #!/appl/soft/ai/tykky/python-data-2022-09/bin/python3 + #!/appl/soft/manual/aida/$(uname -m)/python-data/python-data-2026-03/bin/python3 # In Roihu + #!/appl/soft/ai/tykky/python-data-2022-09/bin/python3 # In Puhti and Mahti ``` --- @@ -248,8 +253,8 @@ in a single document. The [Jupyter interactive application](../../computing/webinterface/jupyter.md) on our web interface allows using Jupyter on CSC supercomputers. Many of our Python environments, including -[`python-data`](../../apps/python-data.md), [`geoconda`](../../apps/geoconda.md) -as well as deep learning modules like [`pytorch`](../../apps/pytorch.md) +[`python-data`](../../apps/python-data.md), [`python-geo`](../../apps/python-geo.md) +as well as deep learning modules like [`python-pytorch`](../../apps/python-pytorch.md) include the main Jupyter packages, so they can be used in the application. 
The documentation page for the application includes a [list of supported environments](../../computing/webinterface/jupyter.md#currently-supported-python-environments). @@ -306,7 +311,7 @@ In addition, there are examples of on our CSC Training GitHub organization. Of the above four packages, examples are provided for `multiprocessing`, `joblib` and `dask`. -The `mpi4py` package is included in our [PyTorch environment](../../apps/pytorch.md). +The `mpi4py` package is included in our [PyTorch environment](../../apps/python-pytorch.md). It is generally the most efficient option for multinode jobs with non-trivial parallelization. For a short tutorial on `mpi4py`, along with other approaches for improving the performance of Python programs, please see the free From 357179f35717d5f7237972f144c4b052ee3e3ff7 Mon Sep 17 00:00:00 2001 From: Iidahak <75577529+Iidahak@users.noreply.github.com> Date: Mon, 13 Apr 2026 15:49:32 +0300 Subject: [PATCH 028/139] update python docs --- docs/apps/python-data.md | 6 +++--- docs/apps/python.md | 22 ++++++++++++++++---- docs/support/tutorials/python-usage-guide.md | 4 ++-- 3 files changed, 23 insertions(+), 9 deletions(-) diff --git a/docs/apps/python-data.md b/docs/apps/python-data.md index 4a0aef1f3d..95cbb9a72a 100644 --- a/docs/apps/python-data.md +++ b/docs/apps/python-data.md @@ -85,7 +85,7 @@ data analytics and machine learning, for example: - [Dask](https://dask.org/): Scalable analytics in Python - [Gensim](https://radimrehurek.com/gensim/): Topic modelling - [Jupyter](https://jupyter.org/index.html) and [JupyterLab](https://jupyterlab.readthedocs.io/en/stable/) -- [Marimo](https://marimo.io) +- [Marimo](https://marimo.io) (Roihu) - [NLTK](https://matplotlib.org/): Natural language toolkit - [PyTables](http://www.pytables.org/) - [SciPy](https://www.scipy.org/), including [NumPy](https://www.numpy.org/), [Matplotlib](https://matplotlib.org/) and [Pandas](https://pandas.pydata.org/) @@ -105,8 +105,8 @@ To create a virtual 
environment use the command `python3 -m venv If you think that some important package should be included in the module provided by CSC, please [contact our servicedesk](../support/contact.md). Note that some machine learning -frameworks have their own specific modules, for example: -[python-pytorch](python-pytorch.md), [python-vllm](python-vllm.md), [python-tensorFlow](python-tensorflow.md), [python-JAX](python-jax.md), and [RAPIDS](rapids.md). +frameworks have their own specific modules, for example in Roihu: +[python-pytorch](pytorch.md), [python-vllm](vllm.md), [python-tensorflow](tensorflow.md), and [python-jax](jax.md). !!! info "Note about multi-threading" diff --git a/docs/apps/python.md b/docs/apps/python.md index ed3d29f54e..c8699dcc73 100644 --- a/docs/apps/python.md +++ b/docs/apps/python.md @@ -71,15 +71,29 @@ Roihu, Puhti and Mahti have several pre-installed [environment modules](../computing/modules.md) containing Python environments made for different science areas. +In Roihu: + | Module name | Purpose | |-|-| | [biopythontools](biopython.md) | bioinformatics | | [python-geo](python-geo.md) | geoinformatics | -| [python-jax](python-jax.md) | JAX ML framework | +| [python-jax](jax.md) | JAX ML framework | +| [python-data](python-data.md) | data analysis and ML utilities | +| [python-pytorch](pytorch.md) | PyTorch ML framework | +| [python-qiskit](qiskit.md) | quantum computing | +| [python-tensorflow](tensorflow.md) | TensorFlow ML framework | + +In other systems: + +| Module name | Purpose | +|-|-| +| [biopythontools](biopython.md) | bioinformatics | +| [geoconda](python-geo.md) | geoinformatics | +| [jax](jax.md) | JAX ML framework | | [python-data](python-data.md) | data analysis and ML utilities | -| [python-pytorch](python-pytorch.md) | PyTorch ML framework | -| [python-qiskit](python-qiskit.md) | quantum computing | -| [python-tensorflow](python-tensorflow.md) | TensorFlow ML framework | +| [pytorch](pytorch.md) | PyTorch ML framework | +| 
[qiskit](qiskit.md) | quantum computing | +| [tensorflow](tensorflow.md) | TensorFlow ML framework | To use any of the above environments, simply load the corresponding module using the `module load` command. diff --git a/docs/support/tutorials/python-usage-guide.md b/docs/support/tutorials/python-usage-guide.md index 9d0fecba42..be0e6acd1d 100644 --- a/docs/support/tutorials/python-usage-guide.md +++ b/docs/support/tutorials/python-usage-guide.md @@ -86,8 +86,8 @@ in a module provided by CSC, do not hesitate to contact our directory under `.local/lib/pythonx.y/site-packages` (where `x.y` is the version of Python being used) except in Roihu. In Roihu, the default installation path for additional packages is set accordingly - to the CPU architechture: `.local/cpu-arch/lib/pythonx.y/site-packages`, - where 'cpu-arch' is 'x86_64' for Roihu-CPU and 'aarch64' for Roihu-GPU. + to the CPU architecture: `.local/x86_64/lib/pythonx.y/site-packages` + for Roihu-CPU and `.local/aarch64/lib/pythonx.y/site-packages` for Roihu-GPU. 
**Please note that if you install a lot of packages, your home directory can easily run out of space.** This can be avoided by changing the installation folder to make From f627ea75ce401d8741c2ef2e6258953b3a9e5791 Mon Sep 17 00:00:00 2001 From: Iidahak <75577529+Iidahak@users.noreply.github.com> Date: Mon, 13 Apr 2026 15:54:59 +0300 Subject: [PATCH 029/139] add vllm to module list --- docs/apps/python.md | 1 + 1 file changed, 1 insertion(+) diff --git a/docs/apps/python.md b/docs/apps/python.md index c8699dcc73..8bee1d95bb 100644 --- a/docs/apps/python.md +++ b/docs/apps/python.md @@ -80,6 +80,7 @@ In Roihu: | [python-jax](jax.md) | JAX ML framework | | [python-data](python-data.md) | data analysis and ML utilities | | [python-pytorch](pytorch.md) | PyTorch ML framework | +| [python-vllm](vllm.md) | LLM inference | | [python-qiskit](qiskit.md) | quantum computing | | [python-tensorflow](tensorflow.md) | TensorFlow ML framework | From 1e12f986e6c3ebddc9c8009c936af3af68b8890f Mon Sep 17 00:00:00 2001 From: Iidahak <75577529+Iidahak@users.noreply.github.com> Date: Mon, 13 Apr 2026 15:58:14 +0300 Subject: [PATCH 030/139] fix link to geoconda.md --- docs/apps/python.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/apps/python.md b/docs/apps/python.md index 8bee1d95bb..c70d7a2b6d 100644 --- a/docs/apps/python.md +++ b/docs/apps/python.md @@ -89,7 +89,7 @@ In other systems: | Module name | Purpose | |-|-| | [biopythontools](biopython.md) | bioinformatics | -| [geoconda](python-geo.md) | geoinformatics | +| [geoconda](geoconda.md) | geoinformatics | | [jax](jax.md) | JAX ML framework | | [python-data](python-data.md) | data analysis and ML utilities | | [pytorch](pytorch.md) | PyTorch ML framework | From 4e17084e400f887f75a390f0be21bdbd64a454cf Mon Sep 17 00:00:00 2001 From: Iidahak <75577529+Iidahak@users.noreply.github.com> Date: Mon, 13 Apr 2026 16:00:47 +0300 Subject: [PATCH 031/139] update python docs --- docs/apps/python.md | 2 +- 1 file 
changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/apps/python.md b/docs/apps/python.md index c70d7a2b6d..94bd23fe05 100644 --- a/docs/apps/python.md +++ b/docs/apps/python.md @@ -124,7 +124,7 @@ modules listed above). self-contained and mutually exclusive environments, so it does not make sense to for example load both python-data and python-pytorch modules. The module loaded last will be the only active one, and - the module load command will warn about this, for example: + the module load command will warn about this, for example in Roihu: ``` Lmod is automatically replacing "python-data/3.12-31.03" with "python-pytorch/2.10". From 713faf4bc948eec90f666b45ede68687a24960b9 Mon Sep 17 00:00:00 2001 From: Iidahak <75577529+Iidahak@users.noreply.github.com> Date: Mon, 13 Apr 2026 16:08:52 +0300 Subject: [PATCH 032/139] fix python usage guide --- docs/support/tutorials/python-usage-guide.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/support/tutorials/python-usage-guide.md b/docs/support/tutorials/python-usage-guide.md index be0e6acd1d..eef768f4ac 100644 --- a/docs/support/tutorials/python-usage-guide.md +++ b/docs/support/tutorials/python-usage-guide.md @@ -131,7 +131,7 @@ in a module provided by CSC, do not hesitate to contact our You can fix this by editing the first line of the executable (which in our example is located using `which whatshap`) to point to the real Python interpreter (can be found with `which python3`). 
- In our example we would edit the file `~/.local/bin/whatshap` (`~/.local/cpu-arch/bin/whatshap` in Roihu) + In our example we would edit the file `~/.local/bin/whatshap` (`~/.local/$(uname -m)/bin/whatshap` in Roihu) to have the following as its first line: ```bash From 0b9351283835deff2da98ee0d788b67baf538bd6 Mon Sep 17 00:00:00 2001 From: Iidahak <75577529+Iidahak@users.noreply.github.com> Date: Mon, 13 Apr 2026 16:23:01 +0300 Subject: [PATCH 033/139] update python docs --- docs/support/tutorials/python-usage-guide.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/support/tutorials/python-usage-guide.md b/docs/support/tutorials/python-usage-guide.md index eef768f4ac..ba6adf8cf1 100644 --- a/docs/support/tutorials/python-usage-guide.md +++ b/docs/support/tutorials/python-usage-guide.md @@ -135,8 +135,8 @@ in a module provided by CSC, do not hesitate to contact our to have the following as its first line: ```bash - #!/appl/soft/manual/aida/$(uname -m)/python-data/python-data-2026-03/bin/python3 # In Roihu - #!/appl/soft/ai/tykky/python-data-2022-09/bin/python3 # In Puhti and Mahti + #!/appl/soft/manual/aida/$(uname -m)/python-data/python-data-2026-03/bin/python3 # On Roihu + #!/appl/soft/ai/tykky/python-data-2022-09/bin/python3 # On Puhti and Mahti ``` --- @@ -253,8 +253,8 @@ in a single document. The [Jupyter interactive application](../../computing/webinterface/jupyter.md) on our web interface allows using Jupyter on CSC supercomputers. 
Many of our Python environments, including -[`python-data`](../../apps/python-data.md), [`python-geo`](../../apps/python-geo.md) -as well as deep learning modules like [`python-pytorch`](../../apps/python-pytorch.md) +[`python-data`](../../apps/python-data.md), [`python-geo`](../../apps/python-geo.md) (geoconda on Puhti and Mahti) +as well as deep learning modules like [`python-pytorch`](../../apps/pytorch.md) (pytorch on Puhti and Mahti) include the main Jupyter packages, so they can be used in the application. The documentation page for the application includes a [list of supported environments](../../computing/webinterface/jupyter.md#currently-supported-python-environments). From 085e076f5f24f596c2d67013f4e177988f7382e8 Mon Sep 17 00:00:00 2001 From: Iidahak <75577529+Iidahak@users.noreply.github.com> Date: Mon, 13 Apr 2026 16:23:59 +0300 Subject: [PATCH 034/139] fix link --- docs/support/tutorials/python-usage-guide.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/support/tutorials/python-usage-guide.md b/docs/support/tutorials/python-usage-guide.md index ba6adf8cf1..8c330682b2 100644 --- a/docs/support/tutorials/python-usage-guide.md +++ b/docs/support/tutorials/python-usage-guide.md @@ -311,7 +311,7 @@ In addition, there are examples of on our CSC Training GitHub organization. Of the above four packages, examples are provided for `multiprocessing`, `joblib` and `dask`. -The `mpi4py` package is included in our [PyTorch environment](../../apps/python-pytorch.md). +The `mpi4py` package is included in our [PyTorch environment](../../apps/pytorch.md). It is generally the most efficient option for multinode jobs with non-trivial parallelization. 
For a short tutorial on `mpi4py`, along with other approaches for improving the performance of Python programs, please see the free From f3faf0365acd122b36b35a299818f1987222bd8c Mon Sep 17 00:00:00 2001 From: Leopekka Saraste <45951298+leopekkas@users.noreply.github.com> Date: Tue, 14 Apr 2026 10:10:24 +0300 Subject: [PATCH 035/139] Fix broken links in quickstart guide (#2942) * Fix broken links in quickstart guide * Fix Python guide and application list links * Fix link for creating Slurm scripts --- docs/support/tutorials/roihu.md | 26 +++++++++++++------------- 1 file changed, 13 insertions(+), 13 deletions(-) diff --git a/docs/support/tutorials/roihu.md b/docs/support/tutorials/roihu.md index 977d102b0d..96d2091eb3 100644 --- a/docs/support/tutorials/roihu.md +++ b/docs/support/tutorials/roihu.md @@ -40,7 +40,7 @@ For platform-specific instructions, see: * [Instructions for Linux/macOS](../../computing/connecting/ssh-unix.md). * [Instructions for Windows](../../computing/connecting/ssh-windows.md). -**[Read detailed instructions for creating and managing SSH keys and certificates](../../computing/connecting/ssh-keys.md).** +**[Read detailed instructions for managing SSH keys and certificates](../../computing/connecting/ssh-keys.md).** !!! warning "Separate CPU and GPU environments" Roihu has @@ -93,7 +93,7 @@ Note that previous extended disk quotas on Puhti or Mahti will not be automatica Before installing anything: 1. Check if the software is already available: - - [List of pre-installed applications](../apps/index.md) + - [List of pre-installed applications](../../apps/index.md) - `module spider ` If not available, choose one of the following approaches depending on your needs: @@ -114,18 +114,18 @@ Another option is to build your own container from scratch. 
More details on working with containers in CSC's computing environment can be found from the links below: -- [Overview of containers](containers/overview.md) -- [Running containers](containers/overview.md#running-containers) -- [Creating containers](containers/overview.md#building-container-images) -- [Tykky container wrapper](containers/tykky.md) +- [Overview of containers](../../computing/containers/overview.md) +- [Running containers](../../computing/containers/overview.md#running-containers) +- [Creating containers](../../computing/containers/overview.md#building-container-images) +- [Tykky container wrapper](../../computing/containers/tykky.md) ### Python/R environments Best practice guidelines on installing your own Python and R packages can be found in the Python, R and Tykky container wrapper pages below. -- [Installing Python packages and environments](../support/tutorials/python-usage-guide.md) -- [Containerizing Conda and pip environments with Tykky](containers/tykky.md) -- [R package installations](../apps/r-env.md#r-package-installations) +- [Installing Python packages and environments](../tutorials/python-usage-guide.md) +- [Containerizing Conda and pip environments with Tykky](../../computing/containers/tykky.md) +- [R package installations](../../apps/r-env.md#r-package-installations) ## Running your first job @@ -142,10 +142,10 @@ Basic workflow: See the relevant documentation below for detailed information: -1. [Available batch job partitions](batch-job-partitions.md) -2. [Creating a batch job script](creating-job-scripts-roihu.md) -3. [Submit a batch job](submitting-jobs.md) -4. [Performance checklist](performance-checklist.md) +1. [Available batch job partitions](../../computing/running/batch-job-partitions.md) +2. [Creating a batch job script](../../../computing/running/creating-job-scripts-roihu.md) +3. [Submit a batch job](../../computing/running/submitting-jobs.md) +4. 
[Performance checklist](../../computing/running/performance-checklist.md) For common Slurm error messages, see our FAQ on [Why does my batch job fail?](../faq/why-does-my-batch-job-fail.md). From 4c888a4f3dd9d01a274ea9029d1a7274df77eaf2 Mon Sep 17 00:00:00 2001 From: Leopekka Saraste <45951298+leopekkas@users.noreply.github.com> Date: Tue, 14 Apr 2026 16:40:14 +0300 Subject: [PATCH 036/139] WIP Roihu quickstart guide link fixes (#2943) * Fix broken links and link formatting in quickstart guide --- docs/support/tutorials/roihu.md | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-) diff --git a/docs/support/tutorials/roihu.md b/docs/support/tutorials/roihu.md index 96d2091eb3..a33de44431 100644 --- a/docs/support/tutorials/roihu.md +++ b/docs/support/tutorials/roihu.md @@ -33,7 +33,7 @@ To connect via SSH: 1. Set up SSH keys (same as Puhti/Mahti) 2. **New:** _Sign_ your public key and download a _certificate_ * Certificates are valid for **24 hours** - * See our instructions for managing certificates: [Signing public SSH keys](https://csc-guide-preview.2.rahtiapp.fi/origin/roihu-quickstart/computing/connecting/ssh-keys/#signing-public-key) + * See our instructions for managing certificates: [Signing public SSH keys](../../computing/connecting/ssh-keys.md#signing-public-key) For platform-specific instructions, see: @@ -66,7 +66,11 @@ For platform-specific instructions, see: **All login nodes still share the same file system, so your files are accessible from all of them.** -### Roihu web interface (available after general availability) +### Roihu web interface + +!!! warning "Roihu web interface availability during the pilot period" + The Roihu web interface will only be available after General Availability. Please connect via SSH during the pilot phase. + The simplest way to connect to Roihu is to use the web interface.
@@ -103,8 +107,8 @@ If not available, choose one of the following approaches depending on your needs HPC software written using programming languages such as C, C++ or Fortran need to be compiled before installing. For instructions on the available compilers and preferred options, see the instructions for compiling software on: -- [Compiling on Roihu-CPU](../../computing/compiling-roihu.md#building-mpi-applications) -- [Compiling on Roihu-GPU](../../computing/compiling-roihu.md#building-gpu-applications) +- [Compiling on Roihu-CPU](../../computing/compiling-roihu.md#compiling-on-roihu-gpu) +- [Compiling on Roihu-GPU](../../computing/compiling-roihu.md#compiling-on-roihu-gpu) ### Containers @@ -143,7 +147,7 @@ Basic workflow: See the relevant documentation below for detailed information: 1. [Available batch job partitions](../../computing/running/batch-job-partitions.md) -2. [Creating a batch job script](../../../computing/running/creating-job-scripts-roihu.md) +2. [Creating a batch job script](../../computing/running/batch-job-partitions.md) 3. [Submit a batch job](../../computing/running/submitting-jobs.md) 4. 
[Performance checklist](../../computing/running/performance-checklist.md) @@ -179,4 +183,4 @@ srun --argos=no * [Roihu system overview](../../computing/systems-roihu.md) * [CSC Computing Environment self-learning materials](https://csc-training.github.io/csc-env-eff/) -* [Contact our service desk](https://docs.csc.fi/support/contact/) \ No newline at end of file +* [Contact our service desk](../contact.md) \ No newline at end of file From c038e1352d8b17fb4ac9262ee75160888b14035f Mon Sep 17 00:00:00 2001 From: Kylli Ek Date: Wed, 15 Apr 2026 16:08:31 +0300 Subject: [PATCH 037/139] Add python-geo (#2944) * Add python-geo * remove geoplot from the list * Specify Roihu-CPU --- docs/apps/geoconda.md | 2 + docs/apps/python-geo.md | 241 ++++++++++++++++++++++++++++++++++++++++ 2 files changed, 243 insertions(+) create mode 100644 docs/apps/python-geo.md diff --git a/docs/apps/geoconda.md b/docs/apps/geoconda.md index d840e675d2..8655a4ebc8 100644 --- a/docs/apps/geoconda.md +++ b/docs/apps/geoconda.md @@ -147,6 +147,8 @@ The `geoconda` module is available: Version number is the same as the Python version. +In Roihu geoconda is renamed to [python-geo](python-geo.md) + ## Usage When using in LUMI, run this first: diff --git a/docs/apps/python-geo.md b/docs/apps/python-geo.md new file mode 100644 index 0000000000..e84f6d8e9f --- /dev/null +++ b/docs/apps/python-geo.md @@ -0,0 +1,241 @@ +--- +tags: + - Free +catalog: + name: Python-geo + description: Python libraries for spatial analysis + license_type: Free + disciplines: + - Geosciences + available_on: + - Roihu +--- + +# Python-geo + +Python-geo is a collection of python packages that facilitate the +development of python scripts for geoinformatics applications. It +includes following python packages: + + + +- [access](https://access.readthedocs.io/) - for calculating the spatial accessibility of resources. +- [async-tiff](https://github.com/developmentseed/async-tiff) - fast reader for TIFF-files. 
NEW 2026 +- [boto3](https://boto3.readthedocs.io) - for working with files in S3 storage, for example Allas. [Allas S3 example in CSC geocomputing Github](https://github.com/csc-training/geocomputing/blob/master/python/allas/working_with_allas_from_Python_S3.py). +- [cartopy] - for map plotting. +- [cdsapi](https://cds.climate.copernicus.eu/how-to-api) - access to Copernicus Climate Data Store. NEW 2026 +- [cfgrib](https://pypi.org/project/cfgrib/) - map GRIB files to the NetCDF Common Data Model +- [contextily](https://contextily.readthedocs.io/en/latest/) - to retrieve tile maps from the internet. +- [copc-lib](https://github.com/RockRobotic/copc-lib) - reader and writer interface for [Cloud Optimized Point Clouds (COPC)](https://copc.io/) +- [dask](https://dask.org/) - provides advanced parallelism for analytics, enabling performance at scale, including [dask-geopandas](https://dask-geopandas.readthedocs.io/), [Dask-ML](https://ml.dask.org/) and [Dask JupyterLab extension](https://github.com/dask/dask-labextension). + - [Dask parallelization example in CSC geocomputing Github](https://github.com/csc-training/geocomputing/tree/master/python/puhti/05_parallel_dask). + - [STAC example in CSC geocomputing Github](https://github.com/csc-training/geocomputing/tree/master/python/STAC). + - [dask-image](https://dask-image.readthedocs.io/) - image processing with Dask Arrays. +- [datashader](https://datashader.org/) - for big data rendering. NEW 2026 +- [duckdb](https://duckdb.org/docs/index.html) - to execute analytical SQL queries fast. +- [esda](https://github.com/pysal/esda) - Exploratory Spatial Data Analysis. +- [fiona] - reads and writes spatial data files. +- [geoalchemy2] - provides extensions to [SQLAlchemy] for working with spatial databases, primarily PostGIS. +- [geocube](https://corteva.github.io/geocube/stable/readme.html) - convert geopandas vector data into rasterized xarray data.
+- [geodatasets](https://geodatasets.readthedocs.io/) download and cache spatial data example files. +- **[geopandas]** - GeoPandas extends the datatypes used by [pandas]. +- [geoparquet-io](https://geoparquet.io/) - fast reader for GeoParquet files. NEW 2026 +- [geopy](https://geopy.readthedocs.io/) - client for several popular geocoding web services. +- [geoviews](https://geoviews.org/) - geographic visualizations for HoloViews. NEW 2026 +- [Google Earth Engine API](https://developers.google.com/earth-engine/guides/python_install) - see how to [set up GEE authentication](#google-earth-engine-authentication-set-up). +- [holoviews](https://holoviews.org/) - plot big datasets. NEW 2026 +- [h3pandas](https://h3-pandas.readthedocs.io/en/latest/) - for hexagonal geospatial indexing system, with Pandas and GeoPandas. +- [h3-py](https://uber.github.io/h3-py/intro.html) - Python bindings for H3, a hierarchical hexagonal geospatial indexing system. +- [h5py](https://www.h5py.org/) - for HDF5 files. NEW 2026 +- [icechunk](https://icechunk.io/en/stable/) - cloud-native transactional tensor storage engine. NEW 2026 +- [igraph](https://igraph.org/python/) - for fast routing. [Routing examples in CSC geocomputing Github](https://github.com/csc-training/geocomputing/tree/master/python/routing) +- [laspy](https://pythonhosted.org/laspy/) - for reading, modifying, and creating .LAS LIDAR files. +- [leafmap](https://leafmap.org/) - for geospatial analysis and interactive mapping in a Jupyter environment. +- [lidar](https://lidar.gishub.org/) - for delineating the nested hierarchy of surface depressions in digital elevation models (DEMs). +- [lonboard](https://developmentseed.org/lonboard/latest/) - fast, interactive geospatial data visualization in Jupyter. NEW 2026 +- [metpy](https://unidata.github.io/MetPy/latest/index.html) - reading, visualizing, and performing calculations with weather data. 
+- [movingpandas](http://movingpandas.org) - for trajectory data +- [networkx] - for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks. [Routing examples in CSC geocomputing Github](https://github.com/csc-training/geocomputing/tree/master/python/routing) +- [papermill](https://papermill.readthedocs.io/en/latest/) - for parameterizing and executing Jupyter Notebooks. NEW 2026 +- [pot](https://pythonot.github.io/) - solvers for optimization problems related to Optimal Transport for signal, image processing and machine learning. NEW 2026 +- [pyproj] - performs cartographic transformations and geodetic computations. +- [pyogrio](https://pyogrio.readthedocs.io/en/latest/index.html) - vectorized spatial vector file format I/O using GDAL/OGR. +- [obstore](https://developmentseed.org/obstore/latest/) - fast access to S3, Google Cloud Storage and Azure Storage. NEW 2026 +- [odc-stac](https://odc-stac.readthedocs.io/en/latest/) - STAC data to xarray, [STAC example in CSC geocomputing Github](https://github.com/csc-training/geocomputing/tree/master/python/STAC). NEW 2026 +- [openeo](https://openeo.org/) - for connecting to Earth observation cloud back-ends in a simple and unified way. +- [open3d](http://www.open3d.org/docs/release/index.html) - for 3D data processing +- [osmnx] - download spatial geometries and construct, project, visualize, and analyze street networks from + OpenStreetMap's APIs. [Routing examples in CSC geocomputing Github](https://github.com/csc-training/geocomputing/tree/master/python/routing) +- [owslib](https://geopython.github.io/OWSLib/index.html) - for retrieving data from Open Geospatial Consortium (OGC) web services +- [pcraster](https://pcraster.geo.uu.nl/) - for spatio-temporal environmental modelling. +- [psycopg2](https://www.psycopg.org/docs/) - PostgreSQL database adapter for Python. 
+- [python-pdal](https://github.com/PDAL/python) - PDAL Python extension for lidar data +- [pysal] - spatial analysis functions. +- [pdal](https://pdal.io/) - for lidar data +- [pysheds](https://github.com/pysheds/pysheds) - for watershed delineation. +- [pystac-client](https://pystac-client.readthedocs.io/) - for working with STAC Catalogs and APIs. [STAC example in CSC geocomputing Github](https://github.com/csc-training/geocomputing/tree/master/python/STAC). +- [python-cdo](https://pypi.org/project/cdo/) - scripting interface to CDO (Climate Data Operators). +- **[rasterio]** - access to geospatial raster data. +- [rasterstats] - for summarizing geospatial raster datasets based on + vector geometries. It includes functions for zonal statistics and + interpolated point queries. [rasterstats example in CSC geocomputing Github](https://github.com/csc-training/geocomputing/tree/master/python/zonal_stats) +- [rio-cogeo](https://cogeotiff.github.io/rio-cogeo/) - for Cloud Optimized GeoTIFF (COG) creation. +- [rtree] - spatial indexing and search. +- [r5py](https://r5py.readthedocs.io) - for rapid realistic routing on multimodal transport networks, see [below how to set memory correctly](#r5py-memory-settings) for r5py. +- [shap](https://shap.readthedocs.io/en/latest/) - for explaining the output of any machine learning model. NEW 2026 +- [sentinelhub](https://sentinelhub-py.readthedocs.io/en/latest/index.html) - for working with new Sentinel Hub services. +- [shapely] - manipulation and analysis of geometric objects in the Cartesian plane. +- [scikit-gstat](https://scikit-gstat.readthedocs.io/en/latest/) - for variogram analysis. NEW 2026 +- **[scikit-learn]** - machine learning for Python. [Spatial machine learning scikit-learn (shallow learning) exercises](https://github.com/csc-training/geocomputing/tree/master/machineLearning) +- [skimage] - algorithms for image processing. 
+- [scipy](https://www.scipy.org/) - inc pandas, numpy, matplotlib etc +- [sparse](https://sparse.pydata.org/en/stable/) - for sparse arrays. NEW 2026 +- [spectral](https://www.spectralpython.net/) - for processing hyperspectral image data. NEW 2026 +- [stackstac](https://stackstac.readthedocs.io/) - STAC data to xarray, [STAC example in CSC geocomputing Github](https://github.com/csc-training/geocomputing/tree/master/python/STAC/stacstac(old)). Has not been updated lately, use rather `odc-stac`. +- [swiftclient, keystoneclient](https://docs.openstack.org/python-swiftclient/latest/) - for working with SWIFT storage, for example Allas. [Allas Swift example in CSC geocomputing Github](https://github.com/csc-training/geocomputing/blob/master/python/allas/working_with_allas_from_Python_Swift.py). +- [whiteboxtools](https://www.whiteboxgeo.com/) - wide-scope processing of geospatial data, many tools operate in parallel, see [CSC whiteboxtools page](whiteboxtools.md) for details. Also Whitebox Workflows for Python. +- **[xarray](http://xarray.pydata.org)** - for multidimensional raster data, inc. [rioxarray](https://corteva.github.io/rioxarray). [STAC example in CSC geocomputing Github](https://github.com/csc-training/geocomputing/tree/master/python/STAC). + - [cf_xarray](https://cf-xarray.readthedocs.io/en/latest/) - interpret Climate and Forecast metadata convention attributes present on xarray objects. NEW 2026 + - [flox](https://flox.readthedocs.io/en/latest/) - fast GroupBy reductions for Xarray. NEW 2026 + - [xarray-spatial](https://xarray-spatial.readthedocs.io/) - efficient common raster analysis functions for xarray. [xarray-spatial example in CSC geocomputing Github](https://github.com/csc-training/geocomputing/tree/master/python/zonal_stats) + - [xclim](https://xclim.readthedocs.io/en/stable/) - for climate analysis. NEW 2026 +- [xgboost](https://xgboost.readthedocs.io/) - Gradient Boosting machine learning algorithms. 
NEW 2026 +- [zarr](https://zarr.readthedocs.io/en/stable/) - for reading and writing data to Zarr format. NEW 2026 + +- And many more, for retrieving the full list use: + `list-packages` + +Additionally python-geo includes: + +- **[jupyter]** - Jupyter Notebooks and JupyterLab. Use from [web interface](../computing/webinterface/index.md) with [Jupyter app](../computing/webinterface/jupyter.md). Includes [Dask Extension](https://github.com/dask/dask-labextension) and [Resource usage Extension](https://github.com/jupyter-server/jupyter-resource-usage). +- **[GDAL/OGR](../apps/gdal.md)** commandline tools +- [GMT] The Generic Mapping Tools +- [PDAL](https://pdal.io/) - Point Data Abstraction Library + +Python has multiple packages for parallel computing, for example +**multiprocessing**, **joblib** and **dask**. In our [Puhti Python examples](https://github.com/csc-training/geocomputing/tree/master/python/puhti) there are examples how to utilize these different parallelisation libraries. + +If you think that some important GIS package for Python is missing from here, you can ask for installation from [CSC Service Desk](../support/contact.md). + + +## Available + +The `python-geo` module is available: + +* 3.14.3 (Python 3.14.3, PDAL 2.10.0, GDAL 3.12.2, created April 2026), in Roihu-CPU + +The version number is the same as the Python version. + +In Puhti, Mahti and LUMI python-geo is named [geoconda](geoconda.md) + +## Usage + + + +For using Python packages and other tools listed above, you can initialize them with: + +```bash +module load python-geo +``` + +By default the latest python-geo module is loaded. 
If you want a specific version you can specify the version number of python-geo: + +```bash +module load python-geo/[VERSION] +``` + +To check the exact packages and versions included in the loaded module: + +```bash +list-packages +``` + +You can add more Python packages to `python-geo` by following the instructions in our +[Python usage guide](../support/tutorials/python-usage-guide.md#installing-python-packages-to-existing-modules). + +You can edit your Python code with [web interface](../computing/webinterface/index.md) or [LUMI](https://docs.lumi-supercomputer.eu/runjobs/webui/jupyter/) web interface : + +* [Visual Studio Code](../computing/webinterface/vscode.md) +* [JupyterLab](../computing/webinterface/jupyter.md) + +### r5py memory settings +`r5py` by default does not correctly understand how much memory it has available in a supercomputer so, it has to be defined manually. It is using Java in the background, so add environmental variable to set maximum memory available for Java: + +* `export _JAVA_OPTIONS="-Xmx4g"` from command-line before starting Python OR +* `os.environ["_JAVA_OPTIONS"] = "-Xmx4g"` in the beginning of your Python code. + +### Google Earth Engine authentication set up +For using Google Earth Engine (GEE) API with `earthengine-api` package, GEE account and project are needed. Before first usage, also set up GEE authentication: + +``` +module load python-geo allas +earthengine authenticate --quiet +``` + +This prints out a long link and asks for a code. Copy the link to the web browser of your local laptop. Follow the instructions on the web page and finally copy the created code back to Terminal. + +## Using Allas or LUMI-O from Python + +There are two Python libraries installed in Python-geo that can interact with Allas or LUMI-O. __Swiftclient__ uses the swift protocol and __boto3__ uses S3 protocol. You can find CSC examples how to use both [here](https://github.com/csc-training/geocomputing/tree/master/python/allas). 
+ +It is also possible to __read__ and __write__ files from and to Allas or other cloud object storage directly with GDAL-based packages such as `geopandas` and `rasterio`. Please check our [Using geospatial files directly from cloud, inc Allas tutorial](../support/tutorials/gis/gdal_cloud.md) for instructions and examples. + +With large quantities of raster data, consider using [virtual rasters](https://research.csc.fi/virtual_rasters). + +## License + +All packages are licensed under various free and open source licenses (FOSS), see the linked pages above for exact details. + +## Citation + +Please see the above linked package pages for citation information per package. + +## Acknowledgement + +Please acknowledge CSC and Geoportti in your publications; it is important for project continuation and funding reports. +As an example, you can write "The authors wish to thank CSC - IT Center for Science, Finland (urn:nbn:fi:research-infras-2016072531) and the Open Geospatial Information Infrastructure for Research (Geoportti, urn:nbn:fi:research-infras-2016072513) for computational resources and support". + +## Installation + +Python-geo was installed to Roihu using [Tykky's conda-containerize functionality](../computing/containers/tykky.md). In LUMI, geoconda was installed using [LUMI container wrapper](https://docs.lumi-supercomputer.eu/software/installing/container-wrapper/). The two tools work almost identically; the `--post` option is called `--post-install` in the LUMI container wrapper. The WhiteboxTools conda package installs only the WhiteboxTools installer, so a proper WhiteboxTools installation requires an additional post-installation command and a folder for wrapping the command-line tools.
+ +```bash +conda-containerize new --mamba --prefix install_dir --post download_wbt -w miniconda/envs/env1/lib/python3.11/site-packages/whitebox/WBT/whitebox_tools python-geo_3.11.10.yml +``` + +Python-geo conda environment files and `download_wbt` and `start_wbt.py` needed for WhiteboxTools are available in [CSCs geocomputing repository](https://github.com/csc-training/geocomputing/tree/master/supercomputer_installations/python-geo). Note that for reproducibility, you'll need to define the package versions in the environment file, which can be checked using `list-packages` command after loading the `python-geo` module. + + +## References + +- [CSC Python parallelisation examples](https://github.com/csc-training/geocomputing/tree/master/python/puhti) +- [Multiprocessing Basics](https://pymotw.com/2/multiprocessing/basics.html) +- [Automating GIS processes course materials](https://automating-gis-processes.github.io) by University of Helsinki +- [Aalto Spatial Analytics course material](https://spatial-analytics.readthedocs.io/en/latest/course-info/course-info.html) by Henrikki Tenkanen / Aalto University +- [Introduction to GIS Programming](https://geog-312.gishub.org/index.html) by Dr. 
Qiusheng Wu / University of Tennessee +- [Geographic Data Science with Python](https://geographicdata.science/book/intro.html) by Sergio Rey, Dani Arribas-Bel, Levi Wolf +- [Python Foundation for Spatial Analysis](https://courses.spatialthoughts.com/python-foundation.html) by Ujaval Gandhi + +------------------------------------------------------------------------ + + + [cartopy]: http://scitools.org.uk/cartopy/ + [descartes]: https://pypi.python.org/pypi/descartes + [fiona]: https://pypi.python.org/pypi/Fiona + [gdal]: https://pypi.python.org/pypi/GDAL + [geoalchemy2]: https://geoalchemy-2.readthedocs.io/en/latest/ + [GMT]: https://www.generic-mapping-tools.org/ + [SQLAlchemy]: http://sqlalchemy.org + [geopandas]: http://geopandas.org/ + [jupyter]: https://jupyter.org/ + [pandas]: http://pandas.pydata.org + [networkx]: https://networkx.github.io/ + [pyproj]: https://pypi.python.org/pypi/pyproj? + [pysal]: https://pysal.org/ + [osmnx]: https://osmnx.readthedocs.io/en/stable/index.html + [rasterio]: https://rasterio.readthedocs.io/en/latest/ + [rasterstats]: http://pythonhosted.org/rasterstats/ + [rtree]: http://toblerity.org/rtree/ + [shapely]: https://pypi.python.org/pypi/Shapely + [skimage]: http://scikit-image.org/ + [scikit-learn]: https://scikit-learn.org/stable/ From dd60999892787e80b219ca15676a0e4fb5c55b24 Mon Sep 17 00:00:00 2001 From: Tuomas Rossi Date: Thu, 16 Apr 2026 17:27:45 +0300 Subject: [PATCH 038/139] Roihu docs on affinity (#2936) * Reorder * Add OMP_PROC_BIND * Move affinity section to a separate tutorial * Apply suggestions from review --------- Co-authored-by: Leopekka Saraste <45951298+leopekkas@users.noreply.github.com> --- .../running/performance-checklist.md | 127 ++++---- docs/support/tutorials/affinity.md | 302 ++++++++++++++++++ docs/support/tutorials/index.md | 1 + 3 files changed, 364 insertions(+), 66 deletions(-) create mode 100644 docs/support/tutorials/affinity.md diff --git a/docs/computing/running/performance-checklist.md 
b/docs/computing/running/performance-checklist.md index 47116e9c4c..077562949f 100644 --- a/docs/computing/running/performance-checklist.md +++ b/docs/computing/running/performance-checklist.md @@ -4,78 +4,21 @@ This page collects important information to enable maximum performance for your jobs and the system. If you know how to improve job performance, please contribute to the list! -## Limit unnecessary spreading of parallel tasks in Puhti -One of the limiting factors for strong scaling is the communication -between tasks. Communication within a node is faster than between -nodes. It is optimal to use as few nodes as possible. - -If resources are requested simply by: -``` -#SBATCH --ntasks=200 -``` -the queuing system may spread them on tens of nodes (just a few cores each). -This will be very bad for the performance of the job, and will cause a lot of -(unnecessary) communication in the system interconnect. If the performance of -your parallel jobs has decreased, this could be the reason. -Overall, this should be avoided. This also -fragments the system increasing queuing times for large jobs. - -The best performance (fastest communication) can be achieved by requesting -full nodes: -``` -#SBATCH --nodes=5 -#SBATCH --ntasks-per-node=40 -``` -Since Puhti is currently fragmented, requesting full nodes may mean longer queuing -time, but it may be regained by faster execution. If queuing times this way seem -unacceptable, you can still limit the maximum number of nodes the job can spread on. -For example, limiting the 200 task job (which optimally fits on 5 nodes) to a maximum -of 10 nodes, you could use: - -``` -#SBATCH --ntasks=200 -#SBATCH --nodes=5-10 -``` -Slurm will then allocate 200 cores from 5 to 10 nodes for your job. - -### How many nodes to allow? -If full nodes or the minimum is not suitable, it is probably best to try -and monitor job performance. Choosing too many nodes will deteriorate -performance more than is gained by less queuing. 
Note also that overall this is lost -computer capacity. -Perhaps, a rule of thumb could be -to set the upper limit to 2 or 3 times the number which would accommodate -all tasks. With very large parallel jobs, even smaller is recommended as -communication and the likelihood of one slow node in the allocation gets -higher and poor load balancing gets more likely. Anyway, large parallel jobs -should be run in Mahti. +## Check CPU Affinity -## Hybrid parallelization in Mahti +CPU affinity describes how a running program is placed on +the available CPU cores of a supercomputer node. +In high‑performance computing, setting affinity correctly is important for performance. +It helps programs make better use of fast processor caches and memory, +reduces unnecessary movement between cores, and leads to more stable and predictable runtimes. -Many HPC applications benefit from binding OpenMP threads to CPU cores -which can be achieved by setting `export OMP_PLACES=cores` in the -batch job script. - -When starting new production runs it is also good -practice to ensure correct thread affinity by adding to batch job -script -``` -export OMP_AFFINITY_FORMAT="Process %P level %L thread %0.3n affinity %A" -export OMP_DISPLAY_AFFINITY=true -``` -The runtime affinity will be printed to the standard error of the batch -job. If the output shows that several processes/threads are bound to -the same core, *i.e.* -``` -Process 164433 level 1 thread 000 affinity 0 -Process 164433 level 1 thread 001 affinity 0 -``` -the performance might be deteriorated and one should check the settings -in the batch script. +Please see the [CPU affinity tutorial](../../support/tutorials/affinity.md) for +instructions how to inspect and control CPU affinity of the programs. ## Perform a scaling test + It is important to make sure that your job can efficiently use all the allocated resources (cores). This needs to be verified for each new code and job type (different input) by a scaling test. 
@@ -93,6 +36,7 @@ completes faster. Note, that not all codes or job types can be run in parallel. Confirm this first for your code. + ## Mind your I/O - it can make a big difference If your workload writes or reads a large number of small files then you may @@ -122,3 +66,54 @@ improved by proper Lustre settings: * Use collective parallel I/O if possible. * See also more extensive [I/O optimization hints](../../support/tutorials/lustre_performance.md). + + +## Limit unnecessary spreading of parallel tasks in Puhti + +One of the limiting factors for strong scaling is the communication +between tasks. Communication within a node is faster than between +nodes. It is optimal to use as few nodes as possible. + +If resources are requested simply by: +``` +#SBATCH --ntasks=200 +``` +the queuing system may spread them on tens of nodes (just a few cores each). +This will be very bad for the performance of the job, and will cause a lot of +(unnecessary) communication in the system interconnect. If the performance of +your parallel jobs has decreased, this could be the reason. +Overall, this should be avoided. This also +fragments the system increasing queuing times for large jobs. + +The best performance (fastest communication) can be achieved by requesting +full nodes: +``` +#SBATCH --nodes=5 +#SBATCH --ntasks-per-node=40 +``` +Since Puhti is currently fragmented, requesting full nodes may mean longer queuing +time, but it may be regained by faster execution. If queuing times this way seem +unacceptable, you can still limit the maximum number of nodes the job can spread on. +For example, limiting the 200 task job (which optimally fits on 5 nodes) to a maximum +of 10 nodes, you could use: + +``` +#SBATCH --ntasks=200 +#SBATCH --nodes=5-10 +``` +Slurm will then allocate 200 cores from 5 to 10 nodes for your job. + +### How many nodes to allow? +If full nodes or the minimum is not suitable, it is probably best to try +and monitor job performance. 
Choosing too many nodes will deteriorate +performance more than is gained by less queuing. Note also that overall this is lost +computer capacity. + +Perhaps, a rule of thumb could be +to set the upper limit to 2 or 3 times the number which would accommodate +all tasks. With very large parallel jobs, even smaller is recommended as +communication and the likelihood of one slow node in the allocation gets +higher and poor load balancing gets more likely. Anyway, large parallel jobs +should be run in Mahti. + + diff --git a/docs/support/tutorials/affinity.md b/docs/support/tutorials/affinity.md new file mode 100644 index 0000000000..adb98e5f6d --- /dev/null +++ b/docs/support/tutorials/affinity.md @@ -0,0 +1,302 @@ +# Inspecting and Controlling CPU Affinity + +This tutorial describes how CPU affinities can be inspected and controlled in job scripts on CSC supercomputers. +The tutorial scripts are written for the Roihu supercomputer, but are generally also applicable on Puhti, Mahti and LUMI. + + +## What is CPU Affinity? + +CPU affinity describes how a running program is placed on +the available CPU cores of a supercomputer node. +By controlling affinity, we decide which CPU cores a program is allowed to run on, +limiting or fixing its placement rather than leaving it entirely to the system. + +In high‑performance computing, setting affinity correctly is important for performance. +It helps programs make better use of fast processor caches and memory, +reduces unnecessary movement between cores, and leads to more stable and predictable runtimes. + + +## Inspecting CPU Affinity in Slurm Jobs + +The following example job script can be used for checking the CPU affinities of each Slurm task. 
+The job script creates a script `print_affinity..sh` that is then executed via `srun`: + +```bash +#!/bin/bash +#SBATCH --job-name=example +#SBATCH --account= +#SBATCH --partition=medium +#SBATCH --time=00:30:00 +#SBATCH --nodes=2 +#SBATCH --ntasks-per-node=8 --cpus-per-task=48 # The product should be 384 +#SBATCH --hint=nomultithread + +# Create a script for printing affinity +PRINT_AFFINITY="./print_affinity.$SLURM_JOB_ID.sh" +cat << 'EOF' > $PRINT_AFFINITY +#!/bin/bash +printf "Task %4d running on node %s core %s\n" \ + "$SLURM_PROCID" \ + "$SLURMD_NODENAME" \ + "$(grep Cpus_allowed_list /proc/self/status | cut -f2)" +EOF +chmod +x $PRINT_AFFINITY + +# Remove script on exit +trap "rm -f $PRINT_AFFINITY" EXIT + +# Run the program +srun $PRINT_AFFINITY +``` + +Example output for this job: + +```txt +Task 10 running on node rc6284 core 96-143 +Task 11 running on node rc6284 core 144-191 +Task 9 running on node rc6284 core 48-95 +Task 15 running on node rc6284 core 336-383 +Task 12 running on node rc6284 core 192-239 +Task 13 running on node rc6284 core 240-287 +Task 14 running on node rc6284 core 288-335 +Task 8 running on node rc6284 core 0-47 +Task 7 running on node rc6283 core 336-383 +Task 1 running on node rc6283 core 48-95 +Task 3 running on node rc6283 core 144-191 +Task 6 running on node rc6283 core 288-335 +Task 4 running on node rc6283 core 192-239 +Task 2 running on node rc6283 core 96-143 +Task 5 running on node rc6283 core 240-287 +Task 0 running on node rc6283 core 0-47 +``` + +This output shows that CSC supercomputers are configured so that, by default, each Slurm task is assigned +its own exclusive set of CPU cores from the allocation. +The number of cores reserved per task is determined by the Slurm option `cpus-per-task`. + + +## Advanced: Controlling CPU Affinity Manually in Slurm Jobs + +!!! warning "Advanced topic" + This section describes **advanced, manual control of CPU affinity**. 
+ In practice, this is **rarely needed**, and the default Slurm configuration is best for most workloads. + If you believe you need manual CPU binding, [**please contact us first**](../contact.md) for guidance. + +If a different placement strategy is needed, the default CPU binding can be disabled with `srun --cpu-bind=none`. +This removes all CPU affinity restrictions, allowing processes to run on **any allocated core on the node**. +This is **generally undesirable for performance** unless combined with explicit manual binding, +which can be done using tools such as `numactl`. + +The following job script provides an example for setting CPU binding manually. +The job script creates two scripts: `print_affinity..sh` and `bind_cpu..sh`. +The first script is the same script used above for checking affinities and the second script +uses `numactl` for binding tasks to CPU cores in the same way as done by default: +each task is assigned a contiguous block of CPU cores based on the task's local ID and +the number of CPUs per task: + +!!! warning "Script for full nodes" + This script works correctly only on full nodes.
+ +```bash +#!/bin/bash +#SBATCH --job-name=example +#SBATCH --account= +#SBATCH --partition=medium +#SBATCH --time=00:30:00 +#SBATCH --nodes=2 +#SBATCH --ntasks-per-node=8 --cpus-per-task=48 # The product should be 384 +#SBATCH --hint=nomultithread + +# Create a script for printing affinity +PRINT_AFFINITY="./print_affinity.$SLURM_JOB_ID.sh" +cat << 'EOF' > $PRINT_AFFINITY +#!/bin/bash +printf "Task %4d running on node %s core %s\n" \ + "$SLURM_PROCID" \ + "$SLURMD_NODENAME" \ + "$(grep Cpus_allowed_list /proc/self/status | cut -f2)" +EOF +chmod +x $PRINT_AFFINITY + +# Create a script for binding tasks to CPU cores +BIND_CPU="./bind_cpu.$SLURM_JOB_ID.sh" +cat << 'EOF' > $BIND_CPU +#!/bin/bash +cpus_per_task=${SLURM_CPUS_PER_TASK:-1} +start_core=$((SLURM_LOCALID * cpus_per_task)) +end_core=$((SLURM_LOCALID * cpus_per_task + cpus_per_task - 1)) +numactl --physcpubind ${start_core}-${end_core} "$@" +EOF +chmod +x $BIND_CPU + +# Remove scripts on exit +trap "rm -f $PRINT_AFFINITY $BIND_CPU" EXIT + +# Run the program with manual binding +srun --cpu-bind=none $BIND_CPU $PRINT_AFFINITY +``` + +This produces output equivalent to the default CPU binding seen above, +confirming that the manual placement reproduces the standard behaviour: + +```txt +Task 4 running on node rc6283 core 192-239 +Task 6 running on node rc6283 core 288-335 +Task 7 running on node rc6283 core 336-383 +Task 5 running on node rc6283 core 240-287 +Task 0 running on node rc6283 core 0-47 +Task 1 running on node rc6283 core 48-95 +Task 2 running on node rc6283 core 96-143 +Task 3 running on node rc6283 core 144-191 +Task 15 running on node rc6284 core 336-383 +Task 8 running on node rc6284 core 0-47 +Task 9 running on node rc6284 core 48-95 +Task 10 running on node rc6284 core 96-143 +Task 11 running on node rc6284 core 144-191 +Task 12 running on node rc6284 core 192-239 +Task 13 running on node rc6284 core 240-287 +Task 14 running on node rc6284 core 288-335 +``` + +If needed, the logic in the 
binding script can be modified to bind CPU cores according to the needs of your application. + + +## Inspecting and Controlling CPU Affinity of MPI+OpenMP Applications + +Many HPC applications benefit from binding OpenMP threads to CPU cores. +This does not happen automatically but need to be enabled with the following lines in the batch job script: + +```bash +# Place and bind threads to single cores +export OMP_PLACES=cores +export OMP_PROC_BIND=spread +``` + +This tutorial illustrates the meaning of these settings. + +First, let's check the default behaviour in the case when threads are not bound. +The following job script exemplifies the use of `OMP_*` for checking the CPU affinities of each process and OpenMP thread: + +```bash +#!/bin/bash +#SBATCH --job-name=example +#SBATCH --account= +#SBATCH --partition=small +#SBATCH --time=00:30:00 +#SBATCH --nodes=1 +#SBATCH --ntasks=2 +#SBATCH --cpus-per-task=8 +#SBATCH --mem-per-cpu=1000M +#SBATCH --hint=nomultithread + +# Set the number of threads based on cpus-per-task +export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1} + +# Print thread affinities +export OMP_DISPLAY_AFFINITY=true +export OMP_AFFINITY_FORMAT="Process %P level %L thread %0.4n/%0.4N on node %H core %A" + +# Run the program +srun +``` + +Example output for this job: + +```txt +Process 2808761 level 1 thread 0000/0004 on node rc6224 core 0-3 +Process 2808761 level 1 thread 0001/0004 on node rc6224 core 0-3 +Process 2808761 level 1 thread 0002/0004 on node rc6224 core 0-3 +Process 2808761 level 1 thread 0003/0004 on node rc6224 core 0-3 +Process 2808762 level 1 thread 0000/0004 on node rc6224 core 4-7 +Process 2808762 level 1 thread 0001/0004 on node rc6224 core 4-7 +Process 2808762 level 1 thread 0002/0004 on node rc6224 core 4-7 +Process 2808762 level 1 thread 0003/0004 on node rc6224 core 4-7 +``` + +This output means that the threads of the processes are free to move between the sets of four cores (0-3 or 4-7), +which can lead to worse performance 
due to increased context switching and thread migration during execution, +as opposed to a case where threads are bound to single cores. + +In the following example job script, we have enabled thread binding: + +```bash +#!/bin/bash +#SBATCH --job-name=example +#SBATCH --account= +#SBATCH --partition=small +#SBATCH --time=00:30:00 +#SBATCH --nodes=1 +#SBATCH --ntasks=2 +#SBATCH --cpus-per-task=8 +#SBATCH --mem-per-cpu=1000M +#SBATCH --hint=nomultithread + +# Set the number of threads based on cpus-per-task +export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1} + +# Place and bind threads to single cores +export OMP_PLACES=cores +export OMP_PROC_BIND=spread + +# Print thread affinities +export OMP_DISPLAY_AFFINITY=true +export OMP_AFFINITY_FORMAT="Process %P level %L thread %0.4n/%0.4N on node %H core %A" + +# Run the program +srun +``` + +Now the output looks like this: + +```txt +Process 2808761 level 1 thread 0000/0004 on node rc6224 core 0 +Process 2808761 level 1 thread 0001/0004 on node rc6224 core 1 +Process 2808761 level 1 thread 0002/0004 on node rc6224 core 2 +Process 2808761 level 1 thread 0003/0004 on node rc6224 core 3 +Process 2808762 level 1 thread 0000/0004 on node rc6224 core 4 +Process 2808762 level 1 thread 0001/0004 on node rc6224 core 5 +Process 2808762 level 1 thread 0002/0004 on node rc6224 core 6 +Process 2808762 level 1 thread 0003/0004 on node rc6224 core 7 +``` + +This output means that each thread is bound to its own core, which is often desired for best performance. + +In general, when starting to use a new application or configuration, +it is essential to use these settings to verify thread placement and confirm that affinity is applied as expected, +which is often a key step when diagnosing or optimizing performance on a supercomputer. 
+ +If the output would show that several processes or threads are bound to the same core on the same node, for example +```txt +Process 2808761 level 1 thread 0000/0004 on node rc6224 core 0 +Process 2808761 level 1 thread 0001/0004 on node rc6224 core 1 +Process 2808761 level 1 thread 0002/0004 on node rc6224 core 2 +Process 2808761 level 1 thread 0003/0004 on node rc6224 core 3 +Process 2808762 level 1 thread 0000/0004 on node rc6224 core 0 +Process 2808762 level 1 thread 0001/0004 on node rc6224 core 1 +Process 2808762 level 1 thread 0002/0004 on node rc6224 core 2 +Process 2808762 level 1 thread 0003/0004 on node rc6224 core 3 +``` +or +```txt +Process 2808761 level 1 thread 0000/0004 on node rc6224 core 0 +Process 2808761 level 1 thread 0001/0004 on node rc6224 core 0 +Process 2808761 level 1 thread 0002/0004 on node rc6224 core 0 +Process 2808761 level 1 thread 0003/0004 on node rc6224 core 0 +Process 2808762 level 1 thread 0000/0004 on node rc6224 core 0 +Process 2808762 level 1 thread 0001/0004 on node rc6224 core 0 +Process 2808762 level 1 thread 0002/0004 on node rc6224 core 0 +Process 2808762 level 1 thread 0003/0004 on node rc6224 core 0 +``` +or something similar, then the performance is likely deteriorated +and the settings in the batch script should be fixed. + +Please do not hesitate to [**contact us**](../contact.md) if you need help with ensuring good +performance for your application. 
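
For longer outputs, checking for shared cores by eye becomes tedious. The following is a minimal sketch, not an official CSC tool: the function name `find_shared_cores` is hypothetical, and it assumes the exact `OMP_AFFINITY_FORMAT` string used in this tutorial. It only counts lines where a thread is bound to a single core, so unbound threads that report a range such as `0-3` are ignored.

```bash
#!/bin/bash
# Hypothetical helper: report cores that more than one thread is bound to.
# Reads affinity lines from standard input, in the format
# "Process <pid> level <L> thread <n>/<N> on node <host> core <core>".
find_shared_cores() {
    awk '
        # Count only lines whose last field is a single core number
        $1 == "Process" && $8 == "node" && $10 == "core" && $11 ~ /^[0-9]+$/ {
            count[$9 " " $11]++
        }
        END {
            # Print every node/core pair that was seen more than once
            for (key in count) {
                if (count[key] > 1) {
                    split(key, part, " ")
                    printf "node %s core %s is shared by %d threads\n", part[1], part[2], count[key]
                }
            }
        }
    ' | sort
}
```

After defining the function in a shell, it could be applied to the file that contains the job's standard error, for example `find_shared_cores < slurm-1234567.out`, where the file name is only a placeholder. No output means that no single core was reported by more than one thread.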
+ + +## More information + +* [`numactl` man page](https://man7.org/linux/man-pages/man8/numactl.8.html) +* [OpenMP API Specification: `OMP_PROC_BIND`](https://www.openmp.org/spec-html/5.0/openmpse52.html) +* [OpenMP API Specification: `OMP_PLACES`](https://www.openmp.org/spec-html/5.0/openmpse53.html) +* [OpenMP API Specification: `OMP_AFFINITY_FORMAT`](https://www.openmp.org/spec-html/5.0/openmpse62.html) diff --git a/docs/support/tutorials/index.md b/docs/support/tutorials/index.md index 97e2defeae..40e538f12c 100644 --- a/docs/support/tutorials/index.md +++ b/docs/support/tutorials/index.md @@ -30,6 +30,7 @@ ## Performance and high-throughput workflows * [General high-throughput guidelines](../../computing/running/throughput.md) +* [Inspecting and controlling CPU affinity](affinity.md) * [Optimising parallel I/O](lustre_performance.md) * [Dask & parallel Python](dask-python.md) * [HyperQueue meta-scheduler](../../apps/hyperqueue.md) From 046d71af95fd3464037263ad55f58789268323f1 Mon Sep 17 00:00:00 2001 From: leopekkas Date: Fri, 17 Apr 2026 10:21:21 +0300 Subject: [PATCH 039/139] Update roihu disk quotas and formatting in tutorials --- docs/support/tutorials/roihu-data.md | 4 ++-- docs/support/tutorials/roihu.md | 12 ++++++------ 2 files changed, 8 insertions(+), 8 deletions(-) diff --git a/docs/support/tutorials/roihu-data.md b/docs/support/tutorials/roihu-data.md index a681142492..2fce738918 100644 --- a/docs/support/tutorials/roihu-data.md +++ b/docs/support/tutorials/roihu-data.md @@ -54,9 +54,9 @@ | Disk area | Path | Default size | Max. size [^1] | Default file number limit | Max. 
file number limit [^1] | |-----------|-----------------------|-------------:|--------------------:|--------------------------:|----------------------------:| | Home | `/users/$USER` | 15 GiB | 15 GiB | 150k | 150k | - | ProjAppl | `/projappl/` | 15 GiB | 250 GiB (< 100 GiB) | 150k | 2.5M (< 1M) | + | ProjAppl | `/projappl/` | 15 GiB | 250 GiB | 150k | 2.5M | | ProjData  | `/projdata/` | 0 GiB | case-by-case | 0 | case-by-case | - | Scratch  | `/scratch/` | 1 TiB | 100 TiB (< 10 TiB) | 1M | 10M (< 5M) | + | Scratch  | `/scratch/` | 250 GiB | 100 TiB | 500k | 10M | [^1]: Values in parentheses indicate automatically approved limits. diff --git a/docs/support/tutorials/roihu.md b/docs/support/tutorials/roihu.md index a33de44431..b17b007c0f 100644 --- a/docs/support/tutorials/roihu.md +++ b/docs/support/tutorials/roihu.md @@ -60,9 +60,9 @@ For platform-specific instructions, see: ``` **Importantly:** - - Software compiled on Roihu-CPU nodes only works on Roihu-CPU nodes - - Software compiled on Roihu-GPU nodes only works on Roihu-GPU nodes - - This also applies to Python environments + - Software compiled on Roihu-CPU nodes only works on Roihu-CPU nodes
+ - Software compiled on Roihu-GPU nodes only works on Roihu-GPU nodes
+ - This also applies to Python environments
**All login nodes still share the same file system, so your files are accessible from all of them.** @@ -83,12 +83,12 @@ The simplest way to connect to Roihu is to use the web interface. If you need to transfer data from Puhti or Mahti to Roihu, we require that you: -1. Review your data carefully – **only move what you - really need** +1. Review your data carefully – **only move what you really need** 2. Check your available disk space on Roihu (for example, using the `csc-workspaces` command) 3. Transfer data **directly** from Puhti or Mahti to Roihu. -Note that previous extended disk quotas on Puhti or Mahti will not be automatically moved to Roihu. Quota extensions on Roihu must be separately applied for and properly motivated. +Note that previous extended disk quotas on Puhti or Mahti will not be automatically moved to Roihu. +Quota extensions on Roihu must be separately applied for and properly motivated. **[Read the detailed instructions in the Roihu data migration guide](roihu-data.md).** From df063603b036435ed8eaaa7050054bd7f4fd8347 Mon Sep 17 00:00:00 2001 From: leopekkas Date: Fri, 17 Apr 2026 13:53:24 +0300 Subject: [PATCH 040/139] Fix formatting and link to the correct CPU compiling section --- docs/support/tutorials/roihu.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/support/tutorials/roihu.md b/docs/support/tutorials/roihu.md index b17b007c0f..6467c45c7a 100644 --- a/docs/support/tutorials/roihu.md +++ b/docs/support/tutorials/roihu.md @@ -97,8 +97,8 @@ Quota extensions on Roihu must be separately applied for and properly motivated. Before installing anything: 1. 
Check if the software is already available: - - [List of pre-installed applications](../../apps/index.md) - - `module spider ` + - [List of pre-installed applications](../../apps/index.md) + - `module spider ` If not available, choose one of the following approaches depending on your needs: @@ -107,7 +107,7 @@ If not available, choose one of the following approaches depending on your needs HPC software written using programming languages such as C, C++ or Fortran need to be compiled before installing. For instructions on the available compilers and preferred options, see the instructions for compiling software on: -- [Compiling on Roihu-CPU](../../computing/compiling-roihu.md#compiling-on-roihu-gpu) +- [Compiling on Roihu-CPU](../../computing/compiling-roihu.md#compiling-on-roihu-cpu) - [Compiling on Roihu-GPU](../../computing/compiling-roihu.md#compiling-on-roihu-gpu) ### Containers From 1a2af3148e6b9a5ab8726ade44c7bcba3a07aa7e Mon Sep 17 00:00:00 2001 From: Iidahak <75577529+Iidahak@users.noreply.github.com> Date: Fri, 17 Apr 2026 15:21:36 +0300 Subject: [PATCH 041/139] Update python.md Add information of the separate module trees to python.md --- docs/apps/python.md | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/docs/apps/python.md b/docs/apps/python.md index 94bd23fe05..4e548f6a70 100644 --- a/docs/apps/python.md +++ b/docs/apps/python.md @@ -73,16 +73,16 @@ Python environments made for different science areas. 
In Roihu: -| Module name | Purpose | -|-|-| -| [biopythontools](biopython.md) | bioinformatics | -| [python-geo](python-geo.md) | geoinformatics | -| [python-jax](jax.md) | JAX ML framework | -| [python-data](python-data.md) | data analysis and ML utilities | -| [python-pytorch](pytorch.md) | PyTorch ML framework | -| [python-vllm](vllm.md) | LLM inference | -| [python-qiskit](qiskit.md) | quantum computing | -| [python-tensorflow](tensorflow.md) | TensorFlow ML framework | +| Module name | Purpose | Roihu-CPU/Roihu-GPU | +|-|-|-| +| [biopythontools](biopython.md) | bioinformatics | | +| [python-geo](python-geo.md) | geoinformatics | CPU | +| [python-jax](jax.md) | JAX ML framework | GPU | +| [python-data](python-data.md) | data analysis and ML utilities | CPU/GPU | +| [python-pytorch](pytorch.md) | PyTorch ML framework | GPU | +| [python-vllm](vllm.md) | LLM inference | GPU | +| [python-qiskit](qiskit.md) | quantum computing | | +| [python-tensorflow](tensorflow.md) | TensorFlow ML framework | GPU | In other systems: From f388094d2f04ac4c9b50a21e9e2bbe8999b523e8 Mon Sep 17 00:00:00 2001 From: leopekkas Date: Mon, 20 Apr 2026 10:22:26 +0300 Subject: [PATCH 042/139] Fix formatting, link to roihu applications --- docs/support/tutorials/roihu.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/support/tutorials/roihu.md b/docs/support/tutorials/roihu.md index 6467c45c7a..615359883c 100644 --- a/docs/support/tutorials/roihu.md +++ b/docs/support/tutorials/roihu.md @@ -59,7 +59,7 @@ For platform-specific instructions, see: ssh @roihu-cpu.csc.fi ``` - **Importantly:** + **Importantly:**
- Software compiled on Roihu-CPU nodes only works on Roihu-CPU nodes
- Software compiled on Roihu-GPU nodes only works on Roihu-GPU nodes
- This also applies to Python environments
@@ -97,7 +97,7 @@ Quota extensions on Roihu must be separately applied for and properly motivated. Before installing anything: 1. Check if the software is already available: - - [List of pre-installed applications](../../apps/index.md) + - [List of pre-installed applications](../../apps/by_availability.md#roihu) - `module spider ` If not available, choose one of the following approaches depending on your needs: From 34f9e259b76a558d22e0f02fbdf5eab72d241b88 Mon Sep 17 00:00:00 2001 From: Jaan Tollander de Balsch Date: Mon, 20 Apr 2026 12:05:32 +0300 Subject: [PATCH 043/139] roihu base containers --- docs/support/tutorials/roihu.md | 48 ++++++++++++++++++++++++++++++++- 1 file changed, 47 insertions(+), 1 deletion(-) diff --git a/docs/support/tutorials/roihu.md b/docs/support/tutorials/roihu.md index a33de44431..2bd5a94afb 100644 --- a/docs/support/tutorials/roihu.md +++ b/docs/support/tutorials/roihu.md @@ -115,6 +115,52 @@ For instructions on the available compilers and preferred options, see the instr Roihu supports Apptainer/Singularity containers for container installations. In most cases, ready-made Docker containers can be easily converted into an Apptainer image. Another option is to build your own container from scratch. +You can build containers on top of Roihu base containers which have the same software stack as is available via the module system natively. +Base container are built on top of Rockylinux 9. + +=== "Roihu CPU base container (~4 GB)" + ```sh title="container.def" + Bootstrap: docker + From: satama.csc.fi/r_installation_spack/core-cpu-gcc-15.2.0:v2026_03 + + %post + # Activate module environment and load default modules. + . /opt/activate.sh + # Build your application here: + + %runscript + . /opt/activate.sh + exec "$@" + ``` + + When building the containers, set you cache directory to temporary directory to avoid filling you home directory quota. 
+ + ```bash + export APPTAINER_CACHEDIR=$TMPDIR + apptainer build --fakeroot container.sif container.def + ``` + +=== "Roihu GPU base container (~ 16 GB)" + ```sh title="container.def" + Bootstrap: docker + From: satama.csc.fi/r_installation_spack/core-gpu-gcc-14.3.0-cuda-12.9.1 + + %post + # Activate module environment and load default modules. + . /opt/activate.sh + # Build your application here: + + %runscript + . /opt/activate.sh + exec "$@" + ``` + + When building the containers, set you cache directory to temporary directory to avoid filling you home directory quota. + + ```bash + export APPTAINER_CACHEDIR=$TMPDIR + apptainer build --fakeroot container.sif container.def + ``` More details on working with containers in CSC's computing environment can be found from the links below: @@ -183,4 +229,4 @@ srun --argos=no * [Roihu system overview](../../computing/systems-roihu.md) * [CSC Computing Environment self-learning materials](https://csc-training.github.io/csc-env-eff/) -* [Contact our service desk](../contact.md) \ No newline at end of file +* [Contact our service desk](../contact.md) From 115a0b9e00acd61f77a10cdc9d2307d07302e0fc Mon Sep 17 00:00:00 2001 From: Jaan Tollander de Balsch Date: Mon, 20 Apr 2026 14:29:16 +0300 Subject: [PATCH 044/139] base containers --- docs/support/tutorials/roihu.md | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/docs/support/tutorials/roihu.md b/docs/support/tutorials/roihu.md index e7a787f5b1..92eaf2f222 100644 --- a/docs/support/tutorials/roihu.md +++ b/docs/support/tutorials/roihu.md @@ -140,7 +140,13 @@ Base container are built on top of Rockylinux 9. 
apptainer build --fakeroot container.sif container.def ``` -=== "Roihu GPU base container (~ 16 GB)" + Now, you can run commands inside the container with clean environment and environment active as follows: + + ```bash + apptainer run --cleanenv run mycmd + ``` + +=== "Roihu GPU base container (~16 GB)" ```sh title="container.def" Bootstrap: docker From: satama.csc.fi/r_installation_spack/core-gpu-gcc-14.3.0-cuda-12.9.1 @@ -162,6 +168,12 @@ Base container are built on top of Rockylinux 9. apptainer build --fakeroot container.sif container.def ``` + Now, you can run commands inside the container with clean environment and environment active as follows: + + ```bash + apptainer run --cleanenv run mycmd + ``` + More details on working with containers in CSC's computing environment can be found from the links below: - [Overview of containers](../../computing/containers/overview.md) From cf9f98a4870d440c2889aa0d04a0c0effb9e366c Mon Sep 17 00:00:00 2001 From: Iidahak <75577529+Iidahak@users.noreply.github.com> Date: Mon, 20 Apr 2026 16:11:41 +0300 Subject: [PATCH 045/139] update python.md --- docs/apps/python.md | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/docs/apps/python.md b/docs/apps/python.md index 4e548f6a70..9c0152327d 100644 --- a/docs/apps/python.md +++ b/docs/apps/python.md @@ -75,16 +75,13 @@ In Roihu: | Module name | Purpose | Roihu-CPU/Roihu-GPU | |-|-|-| -| [biopythontools](biopython.md) | bioinformatics | | | [python-geo](python-geo.md) | geoinformatics | CPU | -| [python-jax](jax.md) | JAX ML framework | GPU | | [python-data](python-data.md) | data analysis and ML utilities | CPU/GPU | | [python-pytorch](pytorch.md) | PyTorch ML framework | GPU | | [python-vllm](vllm.md) | LLM inference | GPU | | [python-qiskit](qiskit.md) | quantum computing | | -| [python-tensorflow](tensorflow.md) | TensorFlow ML framework | GPU | -In other systems: +In Puhti and Mahti: | Module name | Purpose | |-|-| From 
d89b18a73dab5c66f1898a965f01bedaa4b4d83d Mon Sep 17 00:00:00 2001 From: Iidahak <75577529+Iidahak@users.noreply.github.com> Date: Mon, 20 Apr 2026 16:13:21 +0300 Subject: [PATCH 046/139] update python.md --- docs/apps/python.md | 1 - 1 file changed, 1 deletion(-) diff --git a/docs/apps/python.md b/docs/apps/python.md index 9c0152327d..f02767f145 100644 --- a/docs/apps/python.md +++ b/docs/apps/python.md @@ -79,7 +79,6 @@ In Roihu: | [python-data](python-data.md) | data analysis and ML utilities | CPU/GPU | | [python-pytorch](pytorch.md) | PyTorch ML framework | GPU | | [python-vllm](vllm.md) | LLM inference | GPU | -| [python-qiskit](qiskit.md) | quantum computing | | In Puhti and Mahti: From 311e0a2ea6067491320820e95bbaf555566ba1de Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Mats=20Sj=C3=B6berg?= Date: Tue, 21 Apr 2026 09:47:51 +0300 Subject: [PATCH 047/139] Changed Roihu to Roihu-GPU for PyTorch, vLLM --- docs/apps/pytorch.md | 50 ++++++++++++++++++++++---------------------- docs/apps/vllm.md | 4 ++-- 2 files changed, 27 insertions(+), 27 deletions(-) diff --git a/docs/apps/pytorch.md b/docs/apps/pytorch.md index 533572395b..f39154721a 100644 --- a/docs/apps/pytorch.md +++ b/docs/apps/pytorch.md @@ -20,7 +20,7 @@ Machine learning framework for Python. !!! info "News" - **7.4.2026** PyTorch is now available on Roihu, the module has been + **7.4.2026** PyTorch is now available on Roihu-GPU, the module has been renamed `python-pytorch`. **23.1.2026** Since the LUMI service break 21.1.2026, the CSC PyTorch @@ -116,26 +116,26 @@ Machine learning framework for Python. Currently supported PyTorch versions: -| Version | Module | Puhti | Mahti | Roihu | (LUMI)
*see notes below* | Notes | -|:--------|-----------------------|:-----:|:-----:|-------|------------------------------|:-------------------------| -| 2.10.0 | `python-pytorch/2.10` | - | - | X | - | Default on Roihu | -| 2.9.1 | `pytorch/2.9` | X | X | | - | Default on Puhti, Mahti | -| 2.7.1 | `pytorch/2.7` | X | X | | (X) | No Slingshot (see below) | -| 2.6.0 | `pytorch/2.6` | X | X | | - | | -| 2.5.1 | `pytorch/2.5` | X | X | | (X) | | -| 2.4.1 | `pytorch/2.4` | - | - | | (X) | | -| 2.4.0 | `pytorch/2.4` | X | X | | - | New tykky-based wrappers | -| 2.3.1 | `pytorch/2.3` | X | X | | - | New tykky-based wrappers | -| 2.2.2 | `pytorch/2.2` | - | - | | (X) | | -| 2.2.1 | `pytorch/2.2` | X | X | | - | | -| 2.1.2 | `pytorch/2.1` | - | - | | (X) | | -| 2.1.0 | `pytorch/2.1` | X | X | | - | | -| 2.0.1 | `pytorch/2.0` | - | - | | (X) | | -| 2.0.0 | `pytorch/2.0` | X | X | | - | | -| 1.13.1 | `pytorch/1.13` | - | - | | (X) | | -| 1.13.0 | `pytorch/1.13` | X | X | | - | | -| 1.12.0 | `pytorch/1.12` | X | X | | - | | -| 1.11.0 | `pytorch/1.11` | X | X | | - | | +| Version | Module | Puhti | Mahti | Roihu-GPU | (LUMI)
*see notes below* | Notes | +|:--------|-----------------------|:-----:|:-----:|-----------|------------------------------|:-------------------------| +| 2.10.0 | `python-pytorch/2.10` | - | - | X | - | Default on Roihu-GPU | +| 2.9.1 | `pytorch/2.9` | X | X | | - | Default on Puhti, Mahti | +| 2.7.1 | `pytorch/2.7` | X | X | | (X) | No Slingshot (see below) | +| 2.6.0 | `pytorch/2.6` | X | X | | - | | +| 2.5.1 | `pytorch/2.5` | X | X | | (X) | | +| 2.4.1 | `pytorch/2.4` | - | - | | (X) | | +| 2.4.0 | `pytorch/2.4` | X | X | | - | New tykky-based wrappers | +| 2.3.1 | `pytorch/2.3` | X | X | | - | New tykky-based wrappers | +| 2.2.2 | `pytorch/2.2` | - | - | | (X) | | +| 2.2.1 | `pytorch/2.2` | X | X | | - | | +| 2.1.2 | `pytorch/2.1` | - | - | | (X) | | +| 2.1.0 | `pytorch/2.1` | X | X | | - | | +| 2.0.1 | `pytorch/2.0` | - | - | | (X) | | +| 2.0.0 | `pytorch/2.0` | X | X | | - | | +| 1.13.1 | `pytorch/1.13` | - | - | | (X) | | +| 1.13.0 | `pytorch/1.13` | X | X | | - | | +| 1.12.0 | `pytorch/1.12` | X | X | | - | | +| 1.11.0 | `pytorch/1.11` | X | X | | - | | Includes [PyTorch](https://pytorch.org/) and related libraries with GPU support via CUDA/ROCm. 
@@ -207,7 +207,7 @@ with: module load pytorch ``` -To access PyTorch on Roihu: +To access PyTorch on Roihu-GPU: ```text module load python-pytorch @@ -224,7 +224,7 @@ If you wish to have a specific version ([see above for available versions](#available)), use: ```text -module load python-pytorch/2.10 # on Roihu +module load python-pytorch/2.10 # on Roihu-GPU module load pytorch/2.9 # on other systems ``` @@ -234,7 +234,7 @@ so **there is no need to load cuda and cudnn modules separately!** This command will also show all available versions: ```text -module avail python-pytorch # on Roihu +module avail python-pytorch # on Roihu-GPU module avail pytorch # on other systems ``` @@ -286,7 +286,7 @@ proportion of the available CPU cores in a single node: srun python3 myprog.py ``` -=== "Roihu" +=== "Roihu-GPU" ```bash #!/bin/bash #SBATCH --account= diff --git a/docs/apps/vllm.md b/docs/apps/vllm.md index ceb41553cb..a4819042bf 100644 --- a/docs/apps/vllm.md +++ b/docs/apps/vllm.md @@ -17,7 +17,7 @@ A fast and easy-to-use library for LLM inference and serving. !!! info "News" - **7.4.2026** vLLM now available as a separate module on Roihu + **7.4.2026** vLLM now available as a separate module on Roihu-GPU ## Available @@ -56,7 +56,7 @@ vLLM is covered by the [Apache License ## Usage -To load the default version on Roihu: +To load the default version on Roihu-GPU: ```text module load python-vllm From 8f45fcd3665a0ea8cdceae5857d239c925769d3c Mon Sep 17 00:00:00 2001 From: Jaan Tollander de Balsch Date: Tue, 21 Apr 2026 10:14:14 +0300 Subject: [PATCH 048/139] update gpu base container version --- docs/support/tutorials/roihu.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/support/tutorials/roihu.md b/docs/support/tutorials/roihu.md index 92eaf2f222..1a193c8f13 100644 --- a/docs/support/tutorials/roihu.md +++ b/docs/support/tutorials/roihu.md @@ -149,7 +149,7 @@ Base container are built on top of Rockylinux 9. 
=== "Roihu GPU base container (~16 GB)" ```sh title="container.def" Bootstrap: docker - From: satama.csc.fi/r_installation_spack/core-gpu-gcc-14.3.0-cuda-12.9.1 + From: satama.csc.fi/r_installation_spack/core-gpu-gcc-15.2.0-cuda-13.1.1 %post # Activate module environment and load default modules. From b5fa72b64b04f2f07ab3149dcd387126adf5a399 Mon Sep 17 00:00:00 2001 From: kkmattil Date: Tue, 21 Apr 2026 10:20:21 +0300 Subject: [PATCH 049/139] Create allas-in-roihu.md Instructions for Allas and Lumi-O configuration in Roihu --- docs/computing/allas-in-roihu.md | 97 ++++++++++++++++++++++++++++++++ 1 file changed, 97 insertions(+) create mode 100644 docs/computing/allas-in-roihu.md diff --git a/docs/computing/allas-in-roihu.md b/docs/computing/allas-in-roihu.md new file mode 100644 index 0000000000..c808ef3450 --- /dev/null +++ b/docs/computing/allas-in-roihu.md @@ -0,0 +1,97 @@ +# Using Allas and Lumi-O object storage services in Roihu + + +Object storage related tools are initialized in Roihu with command: + +```text +module load allas +``` +The allas module enables command: + +```text +allas-conf +``` + +that is used to configure connections to [Allas](../../data/Allas/index.md) and [Lumi-O](https://docs.lumi-supercomputer.eu/storage/lumio/) object storage services. + +In addition this module brings available a set of command line tools that can be used to operate with Allas and Lumi-O object storage services. 
These tools include: + + [a-commands] (https://docs.csc.fi/data/Allas/using_allas/a_commands/) + Rclone (https://docs.csc.fi/data/Allas/using_allas/rclone/) + s3cmd (https://docs.csc.fi/data/Allas/using_allas/s3_client/) + s5cmd + aws s3 (https://docs.aws.amazon.com/cli/latest/reference/s3/) + swift (https://docs.csc.fi/data/Allas/using_allas/swift_client/) + allas-backup and restic (https://docs.csc.fi/data/Allas/using_allas/a_backup/) + + +Configuring connections with allas-conf + +allas-conf can be used to configure S3 based connections to Lumi-O and Allas and swift based connections to Allas. Note that in Roihu allas-conf now by default configures an S3 based connection to Allas, unlike to Puhti and Mahti where swift is used by default. + + +S3 connection to Allas + +You can define a new S3 connection to Allas with command: + + allas-conf project_proj-number + +or + + allas-conf + +First allas-conf asks you to give your CSC password. After that, if target project is not given as an argument, it lists all available Allas projects and asks user to pick one. ( Note that allas-conf has often problems with passwords that have characters that have special meaning in bash shell. For example space, *, ; and different quotation marks can cause allas-conf to fail). + +The project specific access key pair is stored to to the configuration files of aws, s3cmd a rclone in your home directory. Due to this the configuration is not session specific, but applies to all sessions that utilize aws, s3cmd, rclone and a-commands. S3 keys are permanent so you need to run allas-conf command again only when you wish to set a new default S3 connection in use. Thus, in case of S3 based Allas usage, you normally need just to load the Allas module and then start using Allas. + +In case of aws and s3cmd, only one connection is defined and running allas-conf overwrites the old default connections. + +In case of rclone, two endpoints are defined s3allas: and s3allas_project-proj-number. 
Both endpoints refer to the same Allas project. When a new project is defined with allas-conf, a3allas: endpoint is changed to refer to the new project, but the older project specific endpoint is preserved in addition to the new project specific endpoint that gets generated. + +For example after commands + + allas-conf project_200111 + allas-conf project_200222 + +Following connections are in use: + +a-commands, aws and s3cmd project_200222 +rclone s3allas: project_200222 +rclone a3allas_project_200111: project_200111 +rclone a3allas_project_200222: project_200222 + + + +Swift connection to Allas + +In Puhti and Mahti a-commands, rclone and allas-backup used by default swift based connections. In Roihu you can define swift based Allas connection with command: + allas-conf --swift +This connection is session specific and valid only for 8 hours. After the connection is activated, rclone endpoint allas: provides swift based connection to Allas. +A-commands need extra option --swift to use swift based Allas connection + + a-list --swift +Note that in a terminal session, S3 based a-list and Swift based a-list --swift may refer to different Allas projects. + +Swift configuration has no effect on aws and s3cmd commands as they use only S3 protocol. + +S3 connections to Lumi-O + +Connections to Lumi-O are defined with command: + + allas-conf --lumi + +The configuration process asks you to login to https://auth.lumidata.eu where you can create an access key pair for your Lumi-project. You can the copy the project name, access key and secret key to the configuration process in Roihu. + +Lumi connections use always S3 protocol and this configuration process changes aws and s3cmd commands to use the Lumi-O project as the default project. In case of a-commands you can add option --lumi to the command in order to make use Lumi-o. For example: + + a-list --lumi + +In the case of rclone four new endpoints are created. 
+ • lumi-o: and lumi-proj-number-private: refer to the non-public area of the Lumi-O project + • lumi-pub: and lumi-proj-number-public: to the public are of the Lumi-O project. +In the same way as in the case of S3 connections to Allas, executing Lumi-O configuration to a new project, changes the target project of a-commands, aws, s3cmd as well as lumi-o: and lumi-pub: endpoints but preserves the long endpoint names that include the project numbers. + +Note that the Lumi-O keys have a validity time, defined in the authentication interface. Thus you may need to update the connection configuration every now and then. + + + From fdafa52392305d662ef5f55eaf1b16a1116b6db7 Mon Sep 17 00:00:00 2001 From: Jaan Tollander de Balsch Date: Tue, 21 Apr 2026 10:35:36 +0300 Subject: [PATCH 050/139] add --nv flag --- docs/support/tutorials/roihu.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/support/tutorials/roihu.md b/docs/support/tutorials/roihu.md index 1a193c8f13..e09b2b5c9f 100644 --- a/docs/support/tutorials/roihu.md +++ b/docs/support/tutorials/roihu.md @@ -171,7 +171,7 @@ Base container are built on top of Rockylinux 9. 
Now, you can run commands inside the container with clean environment and environment active as follows: ```bash - apptainer run --cleanenv run mycmd + apptainer run --cleanenv --nv run mycmd ``` More details on working with containers in CSC's computing environment can be found from the links below: From c5284ff6e6d975bc162ecee94b8d8322663d5d87 Mon Sep 17 00:00:00 2001 From: kkmattil Date: Tue, 21 Apr 2026 11:08:20 +0300 Subject: [PATCH 051/139] Update allas-in-roihu.md --- docs/computing/allas-in-roihu.md | 121 +++++++++++++++++++------------ 1 file changed, 74 insertions(+), 47 deletions(-) diff --git a/docs/computing/allas-in-roihu.md b/docs/computing/allas-in-roihu.md index c808ef3450..0f7e534956 100644 --- a/docs/computing/allas-in-roihu.md +++ b/docs/computing/allas-in-roihu.md @@ -6,89 +6,116 @@ Object storage related tools are initialized in Roihu with command: ```text module load allas ``` -The allas module enables command: +The allas module enables command: **allas-conf** that is used to configure **S3* connections to [Allas](../../data/Allas/index.md) and [Lumi-O](https://docs.lumi-supercomputer.eu/storage/lumio/) object storage services and *Swift* based connections to Allas. Note that in Roihu _allas-conf: by default configures an S3 based connection to Allas, unlike to Puhti and Mahti where swift is used by default. -```text -allas-conf -``` +In addition this module brings available a set of command line tools that can be used to operate with Allas and Lumi-O object storage services. These tools include: -that is used to configure connections to [Allas](../../data/Allas/index.md) and [Lumi-O](https://docs.lumi-supercomputer.eu/storage/lumio/) object storage services. 
+ * [a-commands](../../data/Allas/using_allas/a_commands.md) + * [Rclone](../../data/Allas/using_allas/rclone.md) + * [s3cmd](../../data/Allas/using_allas/s3_client.md) + * [aws s3](https://docs.aws.amazon.com/cli/latest/reference/s3/) + * [swift](../../data/Allas/using_allas/swift_client/) + * [allas-backup](https://docs.csc.fi/data/Allas/using_allas/a_backup/) and restic -In addition this module brings available a set of command line tools that can be used to operate with Allas and Lumi-O object storage services. These tools include: +You can check current object storage connections with command: - [a-commands] (https://docs.csc.fi/data/Allas/using_allas/a_commands/) - Rclone (https://docs.csc.fi/data/Allas/using_allas/rclone/) - s3cmd (https://docs.csc.fi/data/Allas/using_allas/s3_client/) - s5cmd - aws s3 (https://docs.aws.amazon.com/cli/latest/reference/s3/) - swift (https://docs.csc.fi/data/Allas/using_allas/swift_client/) - allas-backup and restic (https://docs.csc.fi/data/Allas/using_allas/a_backup/) +```text +check-allas-connections +``` +### S3 connection to Allas -Configuring connections with allas-conf +You can define a new S3 connection to Allas with command: -allas-conf can be used to configure S3 based connections to Lumi-O and Allas and swift based connections to Allas. Note that in Roihu allas-conf now by default configures an S3 based connection to Allas, unlike to Puhti and Mahti where swift is used by default. +```text +allas-conf project_proj-number +``` +or +```text +allas-conf +``` -S3 connection to Allas +First allas-conf asks you to give your CSC password (Haka-password can't be used here). After that, if target project is not given as an argument, it lists all available Allas projects and asks user to pick one. ( Note that allas-conf has often problems with passwords that have characters that have special meaning in bash shell. For example space, *, ; and different quotation marks can cause allas-conf to fail). 
-You can define a new S3 connection to Allas with command: +The project specific access key pair is stored to to the configuration files of *aws*, *s3cmd* a *rclone* in your home directory. Due to this the configuration is not session specific, but applies to all sessions that utilize aws, s3cmd, rclone and a-commands. S3 keys are permanent so you need to run allas-conf command again only when you wish to set a new default S3 connection in use. Thus, in case of S3 based Allas usage, you normally need just to load the Allas module and then start using Allas. - allas-conf project_proj-number +In case of **aws** and **s3cmd**, only one connection is defined and running allas-conf overwrites the old default connections. -or +In case of **rclone**, two endpoints are defined **s3allas:** and **s3allas-project-_proj-number_**. Both endpoints refer to the same Allas project. When a new project is defined with allas-conf, a3allas: endpoint is changed to refer to the new project, but the older project specific endpoint is preserved in addition to the new project specific endpoint that gets generated. - allas-conf +For example after commands: -First allas-conf asks you to give your CSC password. After that, if target project is not given as an argument, it lists all available Allas projects and asks user to pick one. ( Note that allas-conf has often problems with passwords that have characters that have special meaning in bash shell. For example space, *, ; and different quotation marks can cause allas-conf to fail). +```text +allas-conf project_200111 +allas-conf project_200222 +``` -The project specific access key pair is stored to to the configuration files of aws, s3cmd a rclone in your home directory. Due to this the configuration is not session specific, but applies to all sessions that utilize aws, s3cmd, rclone and a-commands. S3 keys are permanent so you need to run allas-conf command again only when you wish to set a new default S3 connection in use. 
Thus, in case of S3 based Allas usage, you normally need just to load the Allas module and then start using Allas. +Following connections are in use: -In case of aws and s3cmd, only one connection is defined and running allas-conf overwrites the old default connections. +| Tool | Target project | +|--------------------------------|----------------| +| a-commands, aws and s3cmd | project_200222 | +| rclone s3allas: | project_200222 | +| rclone a3allas-project200111: | project_200111 | +| rclone a3allas-project_200222: | project_200222 | -In case of rclone, two endpoints are defined s3allas: and s3allas_project-proj-number. Both endpoints refer to the same Allas project. When a new project is defined with allas-conf, a3allas: endpoint is changed to refer to the new project, but the older project specific endpoint is preserved in addition to the new project specific endpoint that gets generated. +And with these settings all the commands below list the Allas buckets of project 200222 -For example after commands +```txt +a-list +rclone lsd s3allas: +rclone lsd a3allas-project_200222: +s3cmd ls s3:// +aws s3 ls +`` - allas-conf project_200111 - allas-conf project_200222 - -Following connections are in use: +### Swift connection to Allas -a-commands, aws and s3cmd project_200222 -rclone s3allas: project_200222 -rclone a3allas_project_200111: project_200111 -rclone a3allas_project_200222: project_200222 +In Puhti and Mahti a-commands, rclone and allas-backup used by default swift based connections. In Roihu you can define Swift based Allas connection with command: +```text +allas-conf --swift +``` +This connection is session specific and valid only for 8 hours. After the connection is activated, rclone endpoint **allas:** provides Swift based connection to Allas. For example: -Swift connection to Allas +```text +rclone lsd allas: +``` -In Puhti and Mahti a-commands, rclone and allas-backup used by default swift based connections. 
In Roihu you can define swift based Allas connection with command: - allas-conf --swift -This connection is session specific and valid only for 8 hours. After the connection is activated, rclone endpoint allas: provides swift based connection to Allas. -A-commands need extra option --swift to use swift based Allas connection +A-commands need extra option `--swift` to use Swift based Allas connection. For example: - a-list --swift -Note that in a terminal session, S3 based a-list and Swift based a-list --swift may refer to different Allas projects. +```text +a-list --swift +``` + +Note that in a terminal session, S3 based `a-list` and Swift based `a-list --swift` may refer to different Allas projects. Swift configuration has no effect on aws and s3cmd commands as they use only S3 protocol. -S3 connections to Lumi-O + +### S3 connections to Lumi-O Connections to Lumi-O are defined with command: - allas-conf --lumi +```text +allas-conf --lumi +``` -The configuration process asks you to login to https://auth.lumidata.eu where you can create an access key pair for your Lumi-project. You can the copy the project name, access key and secret key to the configuration process in Roihu. +The configuration process asks you to login to [https://auth.lumidata.eu](https://auth.lumidata.eu) where you can create an access key pair for your Lumi-project. You can then copy the _project name_, _access key_ and _secret key_ to the configuration process in Roihu. -Lumi connections use always S3 protocol and this configuration process changes aws and s3cmd commands to use the Lumi-O project as the default project. In case of a-commands you can add option --lumi to the command in order to make use Lumi-o. For example: +Lumi-O connections use always S3 protocol and this configuration process changes *aws* and *s3cmd* commands to use the Lumi-O project as the default project. In case of *a-commands* you can add option `--lumi` to the command in order to make use Lumi-o. 
For example: - a-list --lumi +```text +a-list --lumi +``` In the case of rclone four new endpoints are created. - • lumi-o: and lumi-proj-number-private: refer to the non-public area of the Lumi-O project - • lumi-pub: and lumi-proj-number-public: to the public are of the Lumi-O project. + + * **lumi-o:** and **lumi-_proj-number_-private:** refer to the non-public area of the Lumi-O project + * **lumi-pub:** and **lumi-_proj-number_-public:** to the public are of the Lumi-O project. + In the same way as in the case of S3 connections to Allas, executing Lumi-O configuration to a new project, changes the target project of a-commands, aws, s3cmd as well as lumi-o: and lumi-pub: endpoints but preserves the long endpoint names that include the project numbers. Note that the Lumi-O keys have a validity time, defined in the authentication interface. Thus you may need to update the connection configuration every now and then. From 430771b6f3ea4a890f19f869dcbf37748766af27 Mon Sep 17 00:00:00 2001 From: kkmattil Date: Tue, 21 Apr 2026 11:13:50 +0300 Subject: [PATCH 052/139] Update index.md Link to object storage instructions added --- docs/computing/index.md | 1 + 1 file changed, 1 insertion(+) diff --git a/docs/computing/index.md b/docs/computing/index.md index b949745d9a..59930e4672 100644 --- a/docs/computing/index.md +++ b/docs/computing/index.md @@ -157,6 +157,7 @@ csc-workspaces using the web interfaces * [Disk areas](disk.md): What places are there for storing data on CSC supercomputers +* [Object storage](allas-in-roihu): Opening connections to Allas and Lumi-O object storage services * [Modules](modules.md): How to find the programs you need * [Applications](../apps/index.md): Application specific instructions. 
* [Running jobs](running/getting-started.md): How to run programs on the From a29d6161902f461c10e0622d5626d66ac5bb039c Mon Sep 17 00:00:00 2001 From: kkmattil Date: Tue, 21 Apr 2026 11:14:30 +0300 Subject: [PATCH 053/139] Update index.md --- docs/computing/index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/computing/index.md b/docs/computing/index.md index 59930e4672..418ad754e9 100644 --- a/docs/computing/index.md +++ b/docs/computing/index.md @@ -157,7 +157,7 @@ csc-workspaces using the web interfaces * [Disk areas](disk.md): What places are there for storing data on CSC supercomputers -* [Object storage](allas-in-roihu): Opening connections to Allas and Lumi-O object storage services +* [Object storage](allas-in-roihu.md): Opening connections to Allas and Lumi-O object storage services * [Modules](modules.md): How to find the programs you need * [Applications](../apps/index.md): Application specific instructions. * [Running jobs](running/getting-started.md): How to run programs on the From 045165a71a7c1cfcd84bc54faff61482b02caf0f Mon Sep 17 00:00:00 2001 From: kkmattil Date: Tue, 21 Apr 2026 11:15:06 +0300 Subject: [PATCH 054/139] Update allas-in-roihu.md --- docs/computing/allas-in-roihu.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/computing/allas-in-roihu.md b/docs/computing/allas-in-roihu.md index 0f7e534956..981f665421 100644 --- a/docs/computing/allas-in-roihu.md +++ b/docs/computing/allas-in-roihu.md @@ -68,7 +68,7 @@ rclone lsd s3allas: rclone lsd a3allas-project_200222: s3cmd ls s3:// aws s3 ls -`` +``` ### Swift connection to Allas From 7e0c9ab4b7c1fd2f78492e6c27dd838678523bba Mon Sep 17 00:00:00 2001 From: kkmattil Date: Tue, 21 Apr 2026 11:59:22 +0300 Subject: [PATCH 055/139] Update allas-in-roihu.md --- docs/computing/allas-in-roihu.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/computing/allas-in-roihu.md b/docs/computing/allas-in-roihu.md index 
981f665421..424f8dc782 100644 --- a/docs/computing/allas-in-roihu.md +++ b/docs/computing/allas-in-roihu.md @@ -6,7 +6,7 @@ Object storage related tools are initialized in Roihu with command: ```text module load allas ``` -The allas module enables command: **allas-conf** that is used to configure **S3* connections to [Allas](../../data/Allas/index.md) and [Lumi-O](https://docs.lumi-supercomputer.eu/storage/lumio/) object storage services and *Swift* based connections to Allas. Note that in Roihu _allas-conf: by default configures an S3 based connection to Allas, unlike to Puhti and Mahti where swift is used by default. +The allas module enables command: **allas-conf** that is used to configure **S3* connections to [Allas](../data/Allas/index.md) and [Lumi-O](https://docs.lumi-supercomputer.eu/storage/lumio/) object storage services and *Swift* based connections to Allas. Note that in Roihu _allas-conf: by default configures an S3 based connection to Allas, unlike to Puhti and Mahti where swift is used by default. In addition this module brings available a set of command line tools that can be used to operate with Allas and Lumi-O object storage services. These tools include: From c55595a22eed8f81afdf8602fa91cd626d3802b3 Mon Sep 17 00:00:00 2001 From: kkmattil Date: Tue, 21 Apr 2026 12:00:46 +0300 Subject: [PATCH 056/139] Update allas-in-roihu.md --- docs/computing/allas-in-roihu.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/docs/computing/allas-in-roihu.md b/docs/computing/allas-in-roihu.md index 424f8dc782..b3e8bf11e2 100644 --- a/docs/computing/allas-in-roihu.md +++ b/docs/computing/allas-in-roihu.md @@ -10,12 +10,12 @@ The allas module enables command: **allas-conf** that is used to configure **S3* In addition this module brings available a set of command line tools that can be used to operate with Allas and Lumi-O object storage services. 
These tools include: - * [a-commands](../../data/Allas/using_allas/a_commands.md) - * [Rclone](../../data/Allas/using_allas/rclone.md) - * [s3cmd](../../data/Allas/using_allas/s3_client.md) + * [a-commands](../data/Allas/using_allas/a_commands.md) + * [Rclone](../data/Allas/using_allas/rclone.md) + * [s3cmd](../data/Allas/using_allas/s3_client.md) * [aws s3](https://docs.aws.amazon.com/cli/latest/reference/s3/) - * [swift](../../data/Allas/using_allas/swift_client/) - * [allas-backup](https://docs.csc.fi/data/Allas/using_allas/a_backup/) and restic + * [swift](../data/Allas/using_allas/swift_client/) + * [allas-backup](../data/Allas/using_allas/a_backup/) and restic You can check current object storage connections with command: @@ -57,10 +57,10 @@ Following connections are in use: |--------------------------------|----------------| | a-commands, aws and s3cmd | project_200222 | | rclone s3allas: | project_200222 | -| rclone a3allas-project200111: | project_200111 | +| rclone a3allas-project200111: | project_200111 | | rclone a3allas-project_200222: | project_200222 | -And with these settings all the commands below list the Allas buckets of project 200222 +And with these settings all the commands below list the Allas buckets of project 200222. 
```txt a-list From 485a17fbb5f76bedc6ad9a65f56448044772cc63 Mon Sep 17 00:00:00 2001 From: kkmattil Date: Tue, 21 Apr 2026 12:05:32 +0300 Subject: [PATCH 057/139] Update allas-in-roihu.md --- docs/computing/allas-in-roihu.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/computing/allas-in-roihu.md b/docs/computing/allas-in-roihu.md index b3e8bf11e2..6c95b92744 100644 --- a/docs/computing/allas-in-roihu.md +++ b/docs/computing/allas-in-roihu.md @@ -14,8 +14,8 @@ In addition this module brings available a set of command line tools that can be * [Rclone](../data/Allas/using_allas/rclone.md) * [s3cmd](../data/Allas/using_allas/s3_client.md) * [aws s3](https://docs.aws.amazon.com/cli/latest/reference/s3/) - * [swift](../data/Allas/using_allas/swift_client/) - * [allas-backup](../data/Allas/using_allas/a_backup/) and restic + * [swift](../data/Allas/using_allas/swift_client.md) + * [allas-backup](../data/Allas/using_allas/a_backup.md) and restic You can check current object storage connections with command: From 80e89bc4c7f37c0dab1f6af27b5ed825e137101c Mon Sep 17 00:00:00 2001 From: Leopekka Saraste <45951298+leopekkas@users.noreply.github.com> Date: Tue, 21 Apr 2026 12:58:39 +0300 Subject: [PATCH 058/139] Apply suggestions from code review MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: Mats Sjöberg --- docs/support/tutorials/roihu-data.md | 3 +-- docs/support/tutorials/roihu.md | 10 ++++------ 2 files changed, 5 insertions(+), 8 deletions(-) diff --git a/docs/support/tutorials/roihu-data.md b/docs/support/tutorials/roihu-data.md index 2fce738918..abb9f3a60d 100644 --- a/docs/support/tutorials/roihu-data.md +++ b/docs/support/tutorials/roihu-data.md @@ -51,14 +51,13 @@ * Once you have identified the data you need to transfer, check that it fits within the default disk quotas on Roihu: - | Disk area | Path | Default size | Max. size [^1] | Default file number limit | Max. 
file number limit [^1] | + | Disk area | Path | Default size | Max. size | Default file number limit | Max. file number limit | |-----------|-----------------------|-------------:|--------------------:|--------------------------:|----------------------------:| | Home | `/users/$USER` | 15 GiB | 15 GiB | 150k | 150k | | ProjAppl | `/projappl/` | 15 GiB | 250 GiB | 150k | 2.5M | | ProjData  | `/projdata/` | 0 GiB | case-by-case | 0 | case-by-case | | Scratch  | `/scratch/` | 250 GiB | 100 TiB | 500k | 10M | - [^1]: Values in parentheses indicate automatically approved limits. * Please note that existing quota extensions on Puhti/Mahti will not automatically carry over to Roihu, so you must separately diff --git a/docs/support/tutorials/roihu.md b/docs/support/tutorials/roihu.md index e09b2b5c9f..bf9f400b2c 100644 --- a/docs/support/tutorials/roihu.md +++ b/docs/support/tutorials/roihu.md @@ -94,11 +94,9 @@ Quota extensions on Roihu must be separately applied for and properly motivated. ## Installing software -Before installing anything: - -1. Check if the software is already available: +Before installing anything check if the software is already available: - [List of pre-installed applications](../../apps/by_availability.md#roihu) - - `module spider ` + - `module spider ` If not available, choose one of the following approaches depending on your needs: @@ -116,7 +114,7 @@ Roihu supports Apptainer/Singularity containers for container installations. In most cases, ready-made Docker containers can be easily converted into an Apptainer image. Another option is to build your own container from scratch. You can build containers on top of Roihu base containers which have the same software stack as is available via the module system natively. -Base container are built on top of Rockylinux 9. +Base container are built on top of Rocky Linux 9. === "Roihu CPU base container (~4 GB)" ```sh title="container.def" @@ -133,7 +131,7 @@ Base container are built on top of Rockylinux 9. 
exec "$@" ``` - When building the containers, set you cache directory to temporary directory to avoid filling you home directory quota. + When building containers, set the Apptainer cache directory to `$TMPDIR` to avoid filling your home directory quota. ```bash export APPTAINER_CACHEDIR=$TMPDIR From 2504eb6343893d520ac7fe81f7623012c381330b Mon Sep 17 00:00:00 2001 From: Leopekka Saraste <45951298+leopekkas@users.noreply.github.com> Date: Tue, 21 Apr 2026 12:59:27 +0300 Subject: [PATCH 059/139] Add suggestion from review MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: Mats Sjöberg --- docs/computing/compiling-roihu.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/computing/compiling-roihu.md b/docs/computing/compiling-roihu.md index 71eb907a66..f63e20280a 100644 --- a/docs/computing/compiling-roihu.md +++ b/docs/computing/compiling-roihu.md @@ -6,7 +6,7 @@ - Roihu-CPU nodes use AMD (x86) processors - Roihu-GPU nodes use NVIDIA Grace (ARM) processors - Binaries compiled for one architecture are generally not usable on the other. + Binaries compiled for one architecture are not usable on the other. Accordingly, software should be compiled on the same side where it will be run: - Compile for CPU nodes on the Roihu-CPU login node From 00478c44fc599faf860cd37e3a5141bc19685a26 Mon Sep 17 00:00:00 2001 From: leopekkas Date: Tue, 21 Apr 2026 16:33:37 +0300 Subject: [PATCH 060/139] Apply suggestion from review --- docs/support/tutorials/roihu.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/docs/support/tutorials/roihu.md b/docs/support/tutorials/roihu.md index bf9f400b2c..4e496de643 100644 --- a/docs/support/tutorials/roihu.md +++ b/docs/support/tutorials/roihu.md @@ -222,7 +222,7 @@ Your job will continue normally with Argos disabled. If your job completes successfully, you can safely ignore these messages. 
-To suppress most of the Argos related warnings and errors, you can pass the `--argos=no` flag option to srun in the following manner: +To suppress most of the Argos related warnings and errors, you can pass the `--argos=no` flag option to Slurm in the following manner: ```bash #!/bin/bash @@ -231,8 +231,9 @@ To suppress most of the Argos related warnings and errors, you can pass the `--a #SBATCH --nodes=1 #SBATCH --ntasks=1 #SBATCH --time=DD:HH:MM +#SBATCH --argos=no -srun --argos=no +srun ``` ## More information From 5125649c18c3f46dd8cbc628a4fa4df366b823e0 Mon Sep 17 00:00:00 2001 From: leopekkas Date: Tue, 21 Apr 2026 16:43:56 +0300 Subject: [PATCH 061/139] Fix argos option that breaks automated tests --- docs/support/tutorials/roihu.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/docs/support/tutorials/roihu.md b/docs/support/tutorials/roihu.md index 4e496de643..5e2cd5c464 100644 --- a/docs/support/tutorials/roihu.md +++ b/docs/support/tutorials/roihu.md @@ -222,7 +222,7 @@ Your job will continue normally with Argos disabled. If your job completes successfully, you can safely ignore these messages. -To suppress most of the Argos related warnings and errors, you can pass the `--argos=no` flag option to Slurm in the following manner: +To suppress most of the Argos related warnings and errors, you can pass the `--argos=no` flag option to srun in the following manner: ```bash #!/bin/bash @@ -231,11 +231,12 @@ To suppress most of the Argos related warnings and errors, you can pass the `--a #SBATCH --nodes=1 #SBATCH --ntasks=1 #SBATCH --time=DD:HH:MM -#SBATCH --argos=no -srun +srun --argos=no ``` +The same option can also be passed as an `#SBATCH` input. 
+ ## More information * [Roihu system overview](../../computing/systems-roihu.md) From 0e0651c0a6799a52b77c85d352398ae2a61dede6 Mon Sep 17 00:00:00 2001 From: Kylli Ek Date: Wed, 22 Apr 2026 12:45:07 +0300 Subject: [PATCH 062/139] Roihu ssh ca from windows (#2949) * fix: advanced level (#2925) * Urgent announcement feature (#2927) * Urgent announcement feature * Enable internal links Edit content, adjust margins * Swap url to lue tutorial (#2928) small change for explanation and rather link to `lue` instructions than pre-cleaning tips * BeeGFS Pouta tutorial (#2924) * BeeGFS Pouta tutorial * Add Benchmark and metadata migration --------- Co-authored-by: Joona Tolonen <54183803+tolonenj@users.noreply.github.com> * Update strong-identification.md (#2930) clarified a few points Co-authored-by: Joonas Somero <50655931+joonas-somero@users.noreply.github.com> * Monitor Pukki DBaaS instance sizes from a Rahti CronJob using application credentials (#2926) * tutorial: Monitor Pukki DBaaS instance sizes from a Rahti CronJob using application credentials * fix broken link * fix: polish and add links * fix bullet points * fix link * fix bullet points * add clone rule * add clone rule * Ac conv (#2935) * Create sd-connect-conversion-tool.md * Updates * Update sd-connect-conversion-tool-ui.md * Update sd-connect-conversion-tool-ui.md * Update sd-connect-conversion-tool-ui.md * Update sd-connect-conversion-tool-ui.md * Navigation * Update sd-connect-conversion.md * Fixes * Fixes * Fixes --------- Co-authored-by: Joonas Somero <50655931+joonas-somero@users.noreply.github.com> * Nextcloud rahti (#2931) * Tutorial: Nextcloud on Rahti * Update docs/cloud/rahti/tutorials/nextcloud.md Co-authored-by: Dean Ruina <81315494+DeRuina@users.noreply.github.com> * Update docs/cloud/rahti/tutorials/nextcloud.md Co-authored-by: Dean Ruina <81315494+DeRuina@users.noreply.github.com> * Update docs/cloud/rahti/tutorials/nextcloud.md Co-authored-by: Dean Ruina 
<81315494+DeRuina@users.noreply.github.com> * Update docs/cloud/rahti/tutorials/nextcloud.md Co-authored-by: Dean Ruina <81315494+DeRuina@users.noreply.github.com> * Update docs/cloud/rahti/tutorials/nextcloud.md Co-authored-by: Dean Ruina <81315494+DeRuina@users.noreply.github.com> * Update docs/cloud/rahti/tutorials/nextcloud.md Co-authored-by: Dean Ruina <81315494+DeRuina@users.noreply.github.com> * How to install kustomize * Fix internal links --------- Co-authored-by: Dean Ruina <81315494+DeRuina@users.noreply.github.com> * Add volume deletion instructions to documentation (#2940) Co-authored-by: Tristan <74349933+trispera@users.noreply.github.com> * add lastools 2026 (#2946) * add lastools 2026 * Update docs/apps/lastools.md Co-authored-by: EetuHuuskoCSC <116141296+EetuHuuskoCSC@users.noreply.github.com> --------- Co-authored-by: EetuHuuskoCSC <116141296+EetuHuuskoCSC@users.noreply.github.com> * Fix broken external links (#2929) * some rewriting to Windows connecting with CA * styling fixes * styling, links * fix links * fix links * Revert "Merge branch 'master' into roihu-ssh-ca" This reverts commit f2186d5b4eec606618cd8bce8facd1caefabab46, reversing changes made to 59481e63db5b381eefc54c84d67666e369c7261b. 
--------- Co-authored-by: Dean Ruina <81315494+DeRuina@users.noreply.github.com> Co-authored-by: Joonas Somero <50655931+joonas-somero@users.noreply.github.com> Co-authored-by: attesillanpaa Co-authored-by: Tristan <74349933+trispera@users.noreply.github.com> Co-authored-by: Joona Tolonen <54183803+tolonenj@users.noreply.github.com> Co-authored-by: Johanna Baldrighi <48281666+jotaskin@users.noreply.github.com> Co-authored-by: Aino-Riikka Corell <101800110+ainoc@users.noreply.github.com> Co-authored-by: EetuHuuskoCSC <116141296+EetuHuuskoCSC@users.noreply.github.com> --- docs/computing/connecting/index.md | 25 +- docs/computing/connecting/ssh-keys.md | 101 +++++---- docs/computing/connecting/ssh-unix.md | 20 ++ docs/computing/connecting/ssh-windows.md | 276 +++++++++++------------ 4 files changed, 211 insertions(+), 211 deletions(-) diff --git a/docs/computing/connecting/index.md b/docs/computing/connecting/index.md index 9e8610b823..974b32eecd 100644 --- a/docs/computing/connecting/index.md +++ b/docs/computing/connecting/index.md @@ -42,10 +42,8 @@ Logging in to CSC supercomputers using an SSH client requires that you have 1. [set up SSH keys](ssh-keys.md), 2. [added your public key to MyCSC](ssh-keys.md#adding-public-key-in-mycsc), and -3. [signed your public key](ssh-keys.md#signing-public-key) to obtain a - time-based SSH certificate. - * Step 3. is only required when connecting to Roihu and must be - repeated every 24 hours. +3. Only in Roihu: [sign your public key](ssh-keys.md#signing-public-key) to obtain a + time-based SSH certificate, must be repeated every 24 hours. ```mermaid flowchart LR @@ -194,22 +192,3 @@ If you try to connect to a node where you have no active jobs, you will receive the following error message: `Access denied by pam_slurm_adopt: you have no active jobs on this node`. 
-#### Configuring SSH client - -You can save yourself some time by adding host-specific options for CSC -supercomputers in an [SSH config file](https://www.ssh.com/academy/ssh/config) -(e.g. `~/.ssh/config`). - -```bash -Host # e.g. "roihu-cpu" - HostName .csc.fi - User - IdentityFile - CertificateFile # Required for Roihu only -``` - -Now you can connect to the host simply by running: - -```bash -ssh -``` diff --git a/docs/computing/connecting/ssh-keys.md b/docs/computing/connecting/ssh-keys.md index 04392d92c0..958139cd9c 100644 --- a/docs/computing/connecting/ssh-keys.md +++ b/docs/computing/connecting/ssh-keys.md @@ -114,7 +114,7 @@ same `${USER}.pub` file. !!! warning "The following is a requirement for connecting to Roihu only" -To connect to Roihu using SSH, you must sign your public key to get a so called +To connect to Roihu using SSH or SFTP (WinSCP, FileZilla), you must sign your public key to get a so called **SSH certificate**. SSH certificates significantly improve the security of the system by introducing an additional authentication factor for SSH logins. @@ -178,24 +178,23 @@ following instructions illustrate only basic usage. === "Windows" - 7. Optional, but **strongly recommended**: - [Install WinSCP](https://winscp.net/eng/docs/installation) and - [start the Pageant authentication agent](https://the.earth.li/~sgtatham/putty/0.83/htmldoc/Chapter9.html#pageant) - that comes bundled with WinSCP (and - [PuTTY](https://www.chiark.greenend.org.uk/~sgtatham/putty/)) to - automatically add SSH key and certificate to SSH agent. - * If you install WinSCP without admin rights, you must add - `WinSCP.exe` to your Path environment variable. Search for the - _Edit environment variables for your account_ settings menu. - * If you intend to connect to Roihu using PowerShell, it is - possible to also use Windows `ssh-agent`. - [See the instructions for starting `ssh-agent` in PowerShell](ssh-windows.md#authentication-agent). - 8. 
Open PowerShell and execute: + 6. [Depending on the tool you plan to use](ssh-windows.md), select the OpenSSH or Putty key as input for the script. + 7. If you are using Putty keys, install [WinSCP](https://winscp.net/eng/docs/installation). + * If you install WinSCP without admin rights, you must add + `WinSCP.exe` to your Path environment variable. Search for the + _Edit environment variables for your account_ settings menu. + 8. Optional, but **strongly recommended** start [SSH agent](ssh-windows.md#authentication-agent) to + automatically add SSH key and certificate to the SSH agent: + * Pageant for Putty keys. + * Windows `ssh-agent` for OpenSSH keys. + + 9. Open PowerShell and run the script. Depending on what tools you plan to use, provide the helper script the right type of key. ```bash - # Replace with your CSC user name and - # with the path to your SSH public key + # Replace: + # with your CSC user name and + # with the path to your OpenSSH public key # (.pub) or PuTTY key (.ppk) python3 csc_cert.py -u @@ -218,19 +217,18 @@ following instructions illustrate only basic usage. `.pub` file will create both an OpenSSH-compatible `-cert.pub` file, as well as a `-cert.ppk` file (if WinSCP is available). - 9. If you have an earlier certificate which is still valid, the tool + 10. If you have an earlier certificate which is still valid, the tool prints the expiration time and exits. - 10. If signing is needed, a login URL is displayed. Follow the link and + 11. If signing is needed, a login URL is displayed. Follow the link and authenticate. - 11. Copy the displayed 6-digit code into PowerShell and enter your SSH + 12. Copy the displayed 6-digit code into PowerShell and enter your SSH key passphrase. - The signed certificate is automatically downloaded and added to - your SSH agent (if you have WinSCP installed and Pageant - running). - - The signed certificate is saved as `-cert.pub` and/or - `-cert.ppk` (e.g., - `C:\Users\\.ssh\id_ed25519-cert.ppk`). - 12. 
**[Connect to Roihu following these instructions](ssh-windows.md#basic-usage)**. + your SSH authentication agent if you have it running. + - The signed certificate is saved to the same folder as the input key + as `-cert.pub` for OpenSSH keys and/or + `-cert.ppk` for Putty keys. + 13. Connect to Roihu [with SSH clients](ssh-windows.md#basic-usage) or [graphical file transfer tools](../../data/moving/graphical_transfer.md). --- @@ -258,37 +256,48 @@ following instructions illustrate only basic usage. `~/.ssh/id_ed25519-cert.pub`. 1. **Connect to Roihu following these instructions**: - 1. [Linux/macOS](ssh-unix.md#basic-usage) - 1. [Windows](ssh-windows.md#basic-usage) - !!! info "Optional: Check when your SSH certificate will expire" +=== "Linux & macOS" + + 1. Optional, add certificate to [SSH authentication agent](ssh-unix.md#authentication-agent). Mandatory for SSH agent forwarding. + 1. [Connect from Terminal](ssh-unix.md#basic-usage) + + +=== "Windows" + + 1. Optional, add certificate to [SSH authentication agent](ssh-windows.md#authentication-agents-with-roihu). Mandatory for SSH agent forwarding or for using FileZilla and WinSCP. + 1. Connect to Roihu [with SSH clients](ssh-windows.md#basic-usage) or [graphical file transfer tools](../../data/moving/graphical_transfer.md). + +--- + +### Check when your SSH certificate will expire Each SSH certificate is valid for 24 hours. The expiration time can be checked as follows: - === "Terminal (Linux, macOS, PowerShell, MobaXterm)" +=== "Terminal (Linux, macOS, PowerShell, MobaXterm)" - 1. Open a terminal client. - 1. Run command: + 1. Open a terminal client. + 1. Run command: - ```bash - # Replace with the path to your OpenSSH - # certificate file (.pub) + ```bash + # Replace with the path to your OpenSSH + # certificate file (.pub) - ssh-keygen -L -f | grep "Valid" - ``` + ssh-keygen -L -f | grep "Valid" + ``` - === "GUI (PuTTY, MobaXterm)" +=== "GUI (PuTTY, MobaXterm)" - 2. Open PuTTYgen / MobaKeyGen. - 3. 
Load your `.ppk` private key: - * _File_ :material-arrow-right: _Load private key_ - 4. Add a certificate (`.pub`) to the key (unless already included - in the `.ppk` file): - * _Key_ :material-arrow-right: _Add certificate to key_ - 5. Select _Certificate info_ to see the validity period among other - info. + 2. Open PuTTYgen / MobaKeyGen. + 3. Load your `.ppk` private key: + * _File_ :material-arrow-right: _Load private key_ + 4. Add a certificate (`.pub`) to the key (unless already included + in the `.ppk` file): + * _Key_ :material-arrow-right: _Add certificate to key_ + 5. Select _Certificate info_ to see the validity period among other + info. - --- +--- ## More information diff --git a/docs/computing/connecting/ssh-unix.md b/docs/computing/connecting/ssh-unix.md index 5e282a71a6..3e977c1ae7 100644 --- a/docs/computing/connecting/ssh-unix.md +++ b/docs/computing/connecting/ssh-unix.md @@ -212,3 +212,23 @@ authentication agent have the same fingerprints and are annotated with 256 SHA256:ZXG7TvhDAWOv8VveFAlt/UYarsO9Nx5md4owX+FE5/M optional_comment (ED25519) 256 SHA256:ZXG7TvhDAWOv8VveFAlt/UYarsO9Nx5md4owX+FE5/M optional_comment (ED25519-CERT) ``` + +## Configuring SSH client + +You can save yourself some time by adding host-specific options for CSC +supercomputers in an [SSH config file](https://www.ssh.com/academy/ssh/config) +(e.g. `~/.ssh/config`). + +```bash +Host # e.g. "roihu-cpu" + HostName .csc.fi + User + IdentityFile + CertificateFile # Required for Roihu only +``` + +Now you can connect to the host simply by running: + +```bash +ssh +``` diff --git a/docs/computing/connecting/ssh-windows.md b/docs/computing/connecting/ssh-windows.md index 9c466a6238..b44a71d4e7 100644 --- a/docs/computing/connecting/ssh-windows.md +++ b/docs/computing/connecting/ssh-windows.md @@ -1,84 +1,68 @@ # SSH client on Windows ---8<-- "ssh-ca.md" - There are various programs that can be used for creating a remote SSH connection on a Windows system. 
This page provides instructions for three -popular alternatives: MobaXterm, PuTTY and PowerShell. +popular alternatives: [MobaXterm](https://mobaxterm.mobatek.net/), [PuTTY](https://www.chiark.greenend.org.uk/~sgtatham/putty/) and [Windows PowerShell](https://learn.microsoft.com/en-us/powershell/scripting/security/remoting/ssh-remoting-in-powershell). PowerShell is included by default to all modern Windows machines. MobaXterm and PuTTY need to be installed. -## Generating SSH keys +For file transfer popular options are WinSCP, FileZilla, Cyberduck and MobaXterm SFTP browser. ---8<-- "using-ssh-keys.md" +For using SSH clients **with CSC supercomputers SSH keys are required**. -=== "MobaXterm" +In Windows, 2 different key types are widely used: - [MobaXterm](https://mobaxterm.mobatek.net/) is an SSH client with an embedded X - server, which means that it can be used to display graphics. + * **OpenSSH keys** (the same as for Linux/Mac), used with MobaXterm, PowerShell and Cyberduck. + * **PuTTY keys** .ppk, used with PuTTY, MobaXterm, WinSCP, FileZilla and Cyberduck. + +## Windows SSH and SFTP tools for Roihu - You can generate SSH keys using the utility tool MobaKeyGen - ([see tutorial](https://csc-training.github.io/csc-env-eff/hands-on/connecting/ssh-keys.html)), - or in a local terminal by running: +--8<-- "ssh-ca.md" - ```bash - ssh-keygen -a 100 -t ed25519 - ``` +CSC provides two options for this: - If you have not set up SSH keys before, feel free to accept the default - name and location by pressing `ENTER` (recommended). 
However, if using the - default file name would overwrite an existing key, you will receive a - warning that looks like this: +* Option 1, the [certificate helper tool](ssh-keys.md#option-1-certificate-helper-tool-recommended) +* Option 2, [manual download of SSH certificate from MyCSC](ssh-keys.md#option-2-mycsc) + +So for Roihu, consider also how different tools support updating the SSH certificate: - ```text - /home//.ssh/id_ed25519 already exists. - Overwrite (y/n)? - ``` +| Tool | Roihu, option 1 | Roihu, option 2| +|:-----------------|-----------------:|------------------:| +| MobaXterm, inc SFTP browser | :ok:| :ok: | +| Putty | :ok:| :ok: | +| PowerShell | :ok:| :ok:| +| [WinSCP](../../data/moving/graphical_transfer.md#winscp-file-transfer-and-more-on-windows) | :ok:| Difficult | +| [FileZilla](../../data/moving/graphical_transfer.md#filezilla-a-general-file-transfer-tool) | Only with PageAnt | Difficult | +| Cyberduck | :ok:| :ok: with OpenSSH key, difficult with Putty key | - Generally, you do not want to overwrite existing keys, so enter `n`, run - `ssh-keygen` again and enter a different file name when prompted. See also - the section on - [SSH key files with non-default name or location](#ssh-key-or-certificate-file-with-non-default-name-or-location). - Next, you will be asked for a passphrase. Please choose a secure - passphrase. It should be at least 8 characters long and contain numbers, - letters and special characters. **Never leave the passphrase empty when - generating an SSH key pair!** +For first/little usage, Roihu [web interface](../webinterface/index.md) might be the easiest optoin with login-node and compute-node shells and file transfer. - If you want your generated keys to persist through MobaXterm restarts, - set a persistent home directory for MobaXterm in the program settings - (`Settings --> Configuration --> General`). Note, this is only required if - you have generated your keys via the terminal, not MobaKeyGen. 
+## Generating SSH keys -=== "PuTTY" +--8<-- "using-ssh-keys.md" + +Depending on the tools you plan to use (see above) for SSH connection and moving files, generate right type of SSH keys. - The [PuTTY SSH client](https://www.chiark.greenend.org.uk/~sgtatham/putty/) - is an alternative to using OpenSSH. +=== "PuTTY keys" - To generate SSH keys for connecting with PuTTY, use the PuTTYgen key - generator. Normally, PuTTYgen does not need to be installed separately, as - it comes bundled with the PuTTY installation package. + You can generate PuTTY SSH keys using the PuTTYgen or MobaKeyGen tools. + Normally, PuTTYgen does not need to be installed separately, as + it comes bundled with the PuTTY installation package. MobaKeyGen is included in the MobaXterm installation. - Launch PuTTYgen and - [follow this tutorial to set up SSH keys](https://csc-training.github.io/csc-env-eff/hands-on/connecting/ssh-keys.html#windows). + Launch PuTTYgen or MobaXterm and follow + [the tutorial to set up SSH keys](https://csc-training.github.io/csc-env-eff/hands-on/connecting/ssh-keys.html#windows). Although the tutorial is formally written for MobaKeyGen, the instructions can easily be adapted for PuTTYgen as the user interface is virtually identical. - You may also consult the + You may also consult the [PuTTYgen documentation](https://the.earth.li/~sgtatham/putty/0.83/htmldoc/Chapter8.html#pubkey) or the relevant [SSH Academy tutorial](https://www.ssh.com/academy/ssh/putty/windows/puttygen). -=== "PowerShell" +=== "OpenSSH keys" - You can use the - [Windows PowerShell](https://learn.microsoft.com/en-us/powershell/scripting/security/remoting/ssh-remoting-in-powershell) - command-line shell to connect to a CSC supercomputer using the - [Win32 OpenSSH client](https://learn.microsoft.com/en-us/windows-server/administration/openssh/openssh_install_firstuse). 
- To install OpenSSH on a Windows device, follow - [these installation instructions](https://learn.microsoft.com/en-us/windows-server/administration/openssh/openssh_install_firstuse?tabs=gui#install-openssh-for-windows). - After installing OpenSSH, you can generate SSH keys using PowerShell by - running: + To generate SSH keys using MobaXterm or PowerShell, open the terminal and run: ```bash ssh-keygen -a 100 -t ed25519 @@ -90,7 +74,7 @@ popular alternatives: MobaXterm, PuTTY and PowerShell. warning that looks like this: ```text - C:\Users\/.ssh/id_ed25519 already exists. + /home//.ssh/id_ed25519 already exists. Overwrite (y/n)? ``` @@ -104,15 +88,23 @@ popular alternatives: MobaXterm, PuTTY and PowerShell. letters and special characters. **Never leave the passphrase empty when generating an SSH key pair!** + If you want your generated keys to persist through MobaXterm restarts, + set a persistent home directory for MobaXterm in the program settings + (`Settings --> Configuration --> General`). Note, this is only required if + you have generated your keys via the terminal, not MobaKeyGen. + --- -After you have generated an SSH key pair, you need to add the **public key** to -the MyCSC portal. -[Read the instructions here](ssh-keys.md#adding-public-key-in-mycsc). To +PuTTYgen or MobaKeyGen can also be used for converting keys from OpenSSH to Putty format and vice versa. + +After you have generated an SSH key pair, you need to [add the **public key** to +the MyCSC portal](ssh-keys.md#adding-public-key-in-mycsc). To connect to Roihu, you must also [sign your public key](ssh-keys.md#signing-public-key) to obtain a time-based SSH certificate which is required for authentication. + + You may also wish to configure [authentication agent](#authentication-agent) to make using SSH keys more convenient. @@ -123,7 +115,7 @@ SSH certificate (**required for Roihu only**) you can connect to a CSC supercomputer. 
=== "MobaXterm" - + To connect using MobaXterm, open the terminal and run: ```bash @@ -155,11 +147,13 @@ supercomputer. | **Host Name** | `puhti.csc.fi` or `mahti.csc.fi` | | **Port** | `22` | | **Connection type** | `SSH` | + | Connection -> Data -> Auto-login username | `csc_username` | - When creating a remote connection using PuTTY, select the private key and + It is recommended to use [PageAnt](#authentication-agent) for providing your SSH keys. If you do not use PageAnt, add the keys manually: select the private key and certificate file (**only if connecting to Roihu**) under - `Connection --> SSH --> Auth --> Credentials`. Finally, click `Open` and - enter your CSC username and SSH key passphrase. + `Connection --> SSH --> Auth --> Credentials`. + + Finally, click `Open`. If you do not use PageAnt, your SSH key passphrase is asked. If you are connecting for the first time, PuTTY will ask if you trust the host. Click `Accept`. @@ -255,43 +249,19 @@ ssh @.csc.fi -i -i ## Authentication agent -!!! warning "CSC certificate helper is recommended to simplify working with SSH agent on Windows" - [The certificate helper tool](ssh-keys.md#option-1-certificate-helper-tool-recommended) - developed by CSC simplifies the process of signing and downloading SSH - certificates for connecting to Roihu. Importantly, it also automatically - adds your SSH keys and certificate to your OpenSSH and/or Pageant - authentication agent. +SSH authentication agents help managing your keys and their passphrases. It can hold your SSH keys and certificates in memory. -=== "MobaXterm" +Different authentication agents work with different tools: - MobaXterm supports three different SSH agents – Pageant, MobAgent and - Windows `ssh-agent`. They can all be used at the same time if you wish. If - you use the CSC certificate helper tool for managing SSH certificates for - Roihu, **we recommend using Pageant**. 
+* [PageAnt](https://the.earth.li/~sgtatham/putty/0.83/htmldoc/Chapter9.html#pageant): PuTTY, WinSCP, FileZilla, MobaXterm, Cyberduck +* Window ssh-agent: PowerShell, Cyberduck, MobaXterm +* MobAgent: MobaXterm - Authentication agents are enabled in the program settings (`Settings --> - Configuration --> SSH --> SSH agents`). - - 1. Toggle the SSH agent(s) you wish to use: - 1. For **MobAgent**, you need to click the `+` button and select the - private key(s) you want to load at startup. - 2. For **Pageant** (PuTTY agent), you must make sure Pageant is running - and holds the keys/certificates you wish to use. See the PuTTY tab - for instructions. - 3. For **`ssh-agent`** (Windows SSH agent), you must make sure - `ssh-agent` service is running and holds the keys/certificates you - wish to use. See the PowerShell tab for instructions. - 2. Click `OK` and restart MobaXterm. You'll be prompted to enter your key - passphrase. - 3. You may now connect to CSC supercomputers without having to type your - passphrase again. +### Authentication agents with Puhti, Mahti and LUMI -=== "PuTTY" +Puhti, Mahti and LUMI do not use SSH certificates, so adding keys to SSH authentication agents is done once and can be used for longer time. Below are the instructions for adding SSH keys to SSH agent manually. - To avoid having to type your passphrase every time you connect, you can use - the - [Pageant authentication agent](https://the.earth.li/~sgtatham/putty/0.83/htmldoc/Chapter9.html#pageant) - to store your private keys in memory. +=== "Pageant" 1. Start Pageant. It will put an icon into the System tray. 2. Right-click the Pageant icon and select `View Keys` from the menu to @@ -299,16 +269,16 @@ ssh @.csc.fi -i -i no keys, so the list box will be empty. 3. Press the `Add Key` button to add a key to Pageant. 4. Find your private key file in the `Select Private Key File` dialog, and - press `Open`. Pageant will ask you to enter the key passphrase. - 5. 
Now start PuTTY and open an SSH session to any CSC supercomputer. PuTTY + press `Open`. Pageant will ask you to enter the key passphrase. + 5. Now start PuTTY or other Pageant supported tools. PuTTY will notice that Pageant is running, retrieve the key automatically from - Pageant, and use it to authenticate. You may now open as many PuTTY + Pageant, and use it to authenticate. You may now open as many sessions as you like without having to type your passphrase again. -=== "PowerShell" +=== "Windows ssh-agent" - `ssh-agent` service is usually stopped or disabled in Windows by default, - and starting it requires administrator privileges. + [Windows `ssh-agent`](ssh-windows.md#authentication-agent) service is usually stopped or disabled in Windows by default, + and starting it requires **administrator privileges**. Run the following commands in an elevated PowerShell prompt: @@ -330,53 +300,75 @@ ssh @.csc.fi -i -i `ssh-agent` service automatically retrieves the local private key (and certificate) and passes it to your SSH client. +=== "MobAgent" + + MobaXterm has internal MobAgent, but it supports also Pageant and + Windows `ssh-agent`. They can all be used at the same time if you wish. + + 1. Enable MobAgent: `Settings --> Configuration --> SSH --> SSH agents`. + 1. Add your key file, click the `+` button and select the + private key(s) you want to load at startup. + 2. Click `OK` and restart MobaXterm. You'll be prompted to enter your key + passphrase. --- -!!! warning "Important note if you're not using the certificate helper tool" - Users downloading SSH certificates - [manually from MyCSC](ssh-keys.md#option-2-mycsc) must perform some extra - steps to be able to add their certificate to SSH agents. +### Authentication agents with Roihu + +In Roihu, besides SSH keys a SSH certificate is required. If using SSH agent, a new SSH certificate must be added daily. 
CSC provides two options for this: + +* Option 1, the [certificate helper tool](ssh-keys.md#option-1-certificate-helper-tool-recommended) +* Option 2, [manual download of SSH certificate from MyCSC](ssh-keys.md#option-2-mycsc) + +Option 1 provides the easiest process to sign and download the SSH certificates for connecting to Roihu. +Importantly, it also automatically adds your SSH keys and certificate +to **Windows ssh-agent** and/or **Pageant**. The script does not update MobAgent, +so using Pageant is recommended for MobaXterm-users. + + +Option 2 requires some extra steps for adding the SSH certificate to the SSH agent. - === "MobAgent & Pageant" - - To add your SSH certificate to MobAgent or PuTTY, you must first - "combine" the certificate and the PuTTY `.ppk` private key. - - 1. Open MobaKeyGen (_Tools_ tab of MobaXterm) or PuTTYgen. - 2. Load your private key (`File --> Load private key`). - 3. Add a valid certificate to the key (`Key --> Add certificate to key`). - The validity period can be checked by selecting `Certificate info`. - 4. Save the private key as `-cert.ppk`, e.g. - `id_ed25519-cert.ppk`. - 5. The new private key including the certificate can now be added to - MobAgent and/or Pageant following the previous instructions. A - successfully combined key and certificate will show up as `Ed25519 - cert` in MobAgent/Pageant. - - === "Windows SSH agent" - - Users of Windows `ssh-agent` **must** make sure to store their manually - downloaded SSH certificate in the same directory as the SSH private key - **and** name it as `-cert.pub` to be able to add it - to SSH agent with `ssh-add` command. If successful, `ssh-add` outputs: - - ```bash - Certificate added: C:\Users\\.ssh\id_ed25519-cert.pub - ``` - - **If the certificate is stored and/or named in any other way, it cannot be - added to the authentication agent because OpenSSH uses hard-coded naming - conventions.** - - **Please note**: - - * If you intend to connect to Roihu via a jump host (e.g. 
when transferring - data from another CSC server to Roihu), also the SSH certificate **must** - be added to the SSH agent so that it can be properly forwarded. - * Alternatively, you may connect to Roihu and **pull** data from servers - that do not require a SSH certificate (e.g. Puhti or Mahti). In this case - it is enough to forward only your SSH keys. - * [Read more about SSH agent forwarding below](#ssh-agent-forwarding). +=== "Pageant & MobAgent" + + To add your SSH certificate to MobAgent or Pageant, you must first + "combine" the certificate and the PuTTY `.ppk` private key. + + 1. Open PuTTYgen or MobaKeyGen (_Tools_ tab of MobaXterm). + 2. Load your private key (`File --> Load private key`). + 3. Add a valid certificate to the key (`Key --> Add certificate to key`). + The validity period can be checked by selecting `Certificate info`. + 4. Save the private key as `-cert.ppk`, e.g. + `id_ed25519-cert.ppk`. + 5. The new private key including the certificate can now be added to + Pageant and/or MobAgent following the instructions above. A + successfully combined key and certificate will show up as `Ed25519 + cert` in Pageant/MobAgent. + +=== "Windows ssh-agent" + + Users of Windows `ssh-agent` **must** make sure to store their manually + downloaded SSH certificate in the same directory as the SSH private key + **and** name it as `-cert.pub` to be able to add it + to SSH agent with `ssh-add` command. If successful, `ssh-add` outputs: + + ```bash + Certificate added: C:\Users\\.ssh\id_ed25519-cert.pub + ``` + + **If the certificate is stored and/or named in any other way, it cannot be + added to the authentication agent because OpenSSH uses hard-coded naming + conventions.** + +--- + +**Please note**: + +* If you intend to connect to Roihu via a jump host (e.g. when transferring +data from another CSC server to Roihu), also the SSH certificate **must** +be added to the SSH agent so that it can be properly forwarded. 
+* Alternatively, you may connect to Roihu and **pull data** from servers +that do not require a SSH certificate (e.g. Puhti or Mahti). In this case +it is enough to forward only your SSH keys. +* [Read more about SSH agent forwarding below](#ssh-agent-forwarding). ### SSH agent forwarding From 3a1d92abd79ad6cfa484aae656221ecddc2aaf23 Mon Sep 17 00:00:00 2001 From: Jaan Tollander de Balsch Date: Wed, 22 Apr 2026 14:42:20 +0300 Subject: [PATCH 063/139] removed --cleanenv --- docs/support/tutorials/roihu.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/support/tutorials/roihu.md b/docs/support/tutorials/roihu.md index 5e2cd5c464..4ff9c77933 100644 --- a/docs/support/tutorials/roihu.md +++ b/docs/support/tutorials/roihu.md @@ -141,7 +141,7 @@ Base container are built on top of Rocky Linux 9. Now, you can run commands inside the container with clean environment and environment active as follows: ```bash - apptainer run --cleanenv run mycmd + apptainer run run mycmd ``` === "Roihu GPU base container (~16 GB)" @@ -169,7 +169,7 @@ Base container are built on top of Rocky Linux 9. 
Now, you can run commands inside the container with clean environment and environment active as follows: ```bash - apptainer run --cleanenv --nv run mycmd + apptainer run --nv run mycmd ``` More details on working with containers in CSC's computing environment can be found from the links below: From 5b26c5926321f765f0cecf57d7bbc61258de4708 Mon Sep 17 00:00:00 2001 From: kkmattil Date: Thu, 23 Apr 2026 11:28:17 +0300 Subject: [PATCH 064/139] Update introduction.md Roihu updates --- docs/data/Allas/introduction.md | 30 +++++++++++++----------------- 1 file changed, 13 insertions(+), 17 deletions(-) diff --git a/docs/data/Allas/introduction.md b/docs/data/Allas/introduction.md index 78304f27b3..11cb80f740 100644 --- a/docs/data/Allas/introduction.md +++ b/docs/data/Allas/introduction.md @@ -20,9 +20,7 @@ The stored objects can be of any data type, such as images or compressed data fi **Limitations** * Specific tools are required to use the object storage. The object storage cannot be properly mounted for local disk-like usage. There are some tools that can do this, but they have their limitations. For example, _svfs_ can be used to mount _Swift_ as a file system, but it uses _FUSE_ which is slow. - * It is unsuitable for files that change constantly during their lifetime (e.g. most SQL databases). - * The data cannot be modified while it is in Allas. It must be downloaded to a server for processing, and the previous version replaced with a new one. - * In case of swift protocol, files larger than 5 GB are divided into smaller segments. Normally, this is done automatically during the upload. See [Files larger than 5 GB](accessing_allas.md#files-larger-than-5-gb). + * The data cannot be modified while it is in Allas. For modification, a file must be downloaded to a server for processing, and the previous version replaced with a new one. Because of this, Allas is unsuitable for files that change constantly during their lifetime (e.g. most SQL databases).
See also the [common use cases](./using_allas/common_use_cases.md). @@ -79,45 +77,43 @@ Allas has technical limits, that normally can not be increased: | Buckets per project | 1 000 | | Objects per bucket | 500 000 | -If you a lot of objects, please plan on spreading the objects across multiple buckets. Spreading data to multiple buckets will give a better performance whenever writing objects. +If you need to store a lot of objects, please plan on spreading the objects across multiple buckets. Spreading data to multiple buckets will give a better performance whenever writing objects. Alternatively, consider storing data in bigger units, for example as tar or zip archive files instead of storing each file as a separate object. ## Protocols -The object storage service is provided over two different protocols, **Swift** and **S3**. From the user perspective, one of the main differences between S3 and Swift is authentication. +The object storage service is provided over two different protocols, **S3** and **OpenStack Swift**. From the user perspective, one of the main differences between S3 and Swift is authentication. -* The token-based **Swift authentication** used remains valid for **eight hours* at a time. * The key-based **S3**, the connection can stay **permanently open**. +* The token-based **Swift authentication** used remains valid for **eight hours** at a time. -The permanent connection of S3 is practical in many ways, but it includes a security aspect: if the server where Allas is used is compromised, the object storage space will be compromised as well. Due to this security concern, Swift is the recommended protocol for multiple-user servers such as Mahti and Puhti. Thus, for example, the CSC-specific `a-commands`, as well as the `rclone` configuration in Puhti and Mahti, are by default based on Swift.
However, in some cases, the permanent connections provided by the S3 protocol may be the most reasonable option, for example, in personal virtual machines running in cPouta. -The Swift and S3 protocols are not mutually compatible when handling objects. For small objects that do not need to be split during the upload, the protocols can be used interchangeably, but split objects can be accessed only with the protocol that was used for uploading them. The size limit for splitting an object depends on the settings and protocol. The limit is typically between 500 MB and 5 GB. +The permanent connection of S3 is practical in many ways, but it includes a security aspect: if the server where Allas is used is compromised, the object storage space will be compromised as well. Due to this security concern, Swift has been the default for multiple-user servers such as the Allas web interface, Mahti and Puhti. However, due to technical development, Allas services are starting to use S3 as the default protocol. For example, in Roihu, S3 will be the default Allas protocol. -Generic recommendations for selecting the protocol: +Thus, for example, the CSC-specific `a-commands`, as well as the `rclone` configuration in Puhti and Mahti, are by default based on Swift. In Roihu, however, they use S3. + +The Swift and S3 protocols are not mutually compatible when handling objects. For small objects that do not need to be split during the upload, the protocols can be used interchangeably, but split objects can be accessed only with the protocol that was used for uploading them. The size limit for splitting an object depends on the settings and protocol. The limit is typically between 500 MB and 5 GB. - * If possible, use the _Swift_ protocol. It is better supported. - * In any case, choose only one of the protocols. Do not mix _S3_ and _Swift_. - Note, that some [Allas clients](accessing_allas.md) support only one of these protocols.
## Naming buckets and objects Each bucket has a name that must be unique across all Allas users. If another user has a bucket called `test`, another bucket called `test` cannot be created. All bucket names are public, so please do not include any confidential information in the bucket names. You may, for example, use your project ID, e.g. _2000620-raw-data_. It is not possible to rename a bucket. -Object URLs can be in the DNS format, e.g. _https://a3s.fi/bucketname/objectname_. Please use a valid DNS name (RFC 1035). We recommend not using upper case or non-ASCII (ä, ö etc.) characters. +Object URLs can be in the DNS format, e.g. _https://a3s.fi/bucketname/objectname_. Please use a valid DNS name (RFC 1035). **We recommend not using upper case or non-ASCII (ä, ö etc.) characters in bucket names**. -For object names, you can use [pseodo folders](terms_and_concepts.md#pseudo-folder), which some Allas clients display as folders. +Object names inside a bucket do not have character restrictions like bucket names do. However, some Allas interfaces may have problems with certain characters (e.g. spaces) in object names. +Note also that there is no directory structure inside a bucket. You can include slash (/) characters in object names to define [pseudo folders](terms_and_concepts.md#pseudo-folder), which some Allas clients display as folders. However, technically slash characters are just part of the object name. ## File sizes and packaging File size considerations: * It is better to store a few large objects than many small objects. -* Keeping your objects under 5 GB if often practical, because bigger objects are chunked at upload. * Using over 100 GB objects may cause problems because of long upload/download times. When moving your data to Allas, you can take few different strategies: -* Create one package of all your files, for example .tar or .zip and move the package to Allas. This suits for use cases, when the amount of data is not too big (< 100Gb).
Allas is used as storage of data and for active use the data is moved elsewhere, for example CSC computing services. In this scenario, it is difficult to access a single original file. From Allas clients `a-commands` has best support for this. +* Create one package of all your files, for example a .tar or .zip archive, and move the package to Allas. This suits use cases where the amount of data is not too big (< 500 GB). Allas is then used for storing the data, and for active use the data is moved elsewhere, for example to CSC computing services. In this scenario, it is difficult to access a single original file. Of the Allas clients, `a-commands` has the best support for this. * Move your files as such to Allas, so that in Allas would be as many files than originally. This suits for use cases, when originally the files have reasonable size and there is not too many of them. This is reasonable also, if access to single files is important. Many of the Allas clients support this. * A combination of these approaches, so that some subsets of files are packaged for Allas. If you have a lot of small files and the total amount of data is big, then it likely makes sense to package for example different folders to own files that are then stored to Allas. @@ -147,4 +143,4 @@ Allas data is spread across various servers, which protects against disk and ser 6. Move data to/from Allas. 7. If you want to share the data publically or with another project, change access rights for your data. -For last two steps, see tool specific instructions. \ No newline at end of file +For the last two steps, see the tool-specific instructions.
From fcb82992a5d74cae763730434ba2b1133618309e Mon Sep 17 00:00:00 2001 From: Juha Lento Date: Thu, 23 Apr 2026 09:30:23 +0300 Subject: [PATCH 065/139] Add files via upload --- docs/support/tutorials/roihu-user-spack.md | 296 +++++++++++++++++++++ 1 file changed, 296 insertions(+) create mode 100644 docs/support/tutorials/roihu-user-spack.md diff --git a/docs/support/tutorials/roihu-user-spack.md b/docs/support/tutorials/roihu-user-spack.md new file mode 100644 index 0000000000..62c5196bd6 --- /dev/null +++ b/docs/support/tutorials/roihu-user-spack.md @@ -0,0 +1,296 @@ +# Using Spack to install software in Roihu + +Roihu's scientific software stack is maintained using the Spack package +manager. This document describes how regular users can use Spack +to install additional software, libraries and applications, on top of +the already installed software. + +Throughout this document, we use the package `eccodes` as an example, +and assume that the commands are run in the user's custom software install +root, usually somewhere under `/projappl//$USER`. + +In addition to this tutorial-style documentation, please refer to the +full +[Spack documentation](https://spack.readthedocs.io/en/v1.1.1/index.html). +For example, the term "environment" in this document specifically refers to +[Spack environments](https://spack.readthedocs.io/en/latest/environments.html). + + +## When to install with Spack + +Spack installation is a viable option for "traditional" HPC software, +parallel applications and the libraries they depend on, especially +when Spack package recipes already exist. Using Spack is an +alternative to traditional "manual" installation, loading modules, +running configure or cmake, and make, for the application and its +dependencies. + +Usually containers are a better approach when the number of files in the +installation goes to tens of thousands, which is often the case for +Python and R environments, for example.
+ + +## What software is available as a Spack package + +The Spack packages can be searched from +[Spack Packages](https://packages.spack.io) (the latest versions), or +directly from the Roihu directory +`/appl/soft/spack/v2026_03/spack-packages/repos/spack_repo/builtin/packages`, +which contains almost 9000 package definitions. + + +## How to set up Spack + +First, let's initialize Spack. Here we set the Spack cache directory to a +temporary directory, which is fine for one-shot installations, and +isolate Spack from system and user configuration scopes, so that no +settings from those scopes leak into our setup. See +[Overriding local configuration](https://spack.readthedocs.io/en/v1.1.1/configuration.html#overriding-local-configuration) +for details. + +```console +$ source /appl/soft/spack/v2026_03/spack/share/spack/setup-env.sh +$ source /appl/soft/spack/v2026_03/spack/share/spack/bash/spack-completion.bash +$ export SPACK_USER_CACHE_PATH=$TMPDIR/spack +$ export SPACK_DISABLE_LOCAL_CONFIG=true +``` + + +## How the Spack system installation is organised, and what is installed already + +The different versions of Spack itself are installed in +`/appl/soft/spack`. At the time of writing this, the latest installed +Spack version is in + +`/appl/soft/spack/v2026_03` + +The corresponding core environments and the application environments +(built on top of the core environments) are in the directories + +``` +/appl/soft/spack/core/v2026_03/$target_family +/appl/soft/spack/apps/v2026_03/$target_family +``` + +where `$target_family` is either `x86_64` or `aarch64`, referring to +the processor architecture on the CPU or the GPU nodes, respectively. + +In general, the core environments provide a good base, "upstream" +package environment, to build on. Core environments contain compilers +and the most common libraries, such as MPI libraries, already configured to +work efficiently.
+ +The available environments can be listed for example with + +```console +$ ls /appl/soft/spack/core/v2026_03/x86_64/ +aocc50_ec compilers_ce compilers_ec gcc152_ec +``` + +The name of the environment gives a hint what is available in the +environment, for example `gcc152_ec` has GNU version 15.2 compiler +collection, and the most commonly used libraries built with the +compiler for the CPU nodes. + +The packages in the upstream environment can be listed, for example, +with command + +```console +$ spack -c upstreams:gcc152_ec:install_tree:/appl/soft/spack/core/v2026_03/x86_64/gcc152_ec/install_dir find +-- linux-rhel9-x86_64 / %c=gcc@15.2.0 --------------------------- +knem@1.1.4 + +-- linux-rhel9-x86_64 / no compilers ---------------------------- +gcc@15.2.0 glibc@2.34 lustre@2.14.0 rdma-core@54.0 slurm@25.05.3 + +-- linux-rhel9-zen5 / %c,cxx,fortran=gcc@15.2.0 ----------------- +openblas@0.3.30 openmpi@5.0.10 papi@7.2.0 + +-- linux-rhel9-zen5 / %c,cxx=gcc@15.2.0 ------------------------- +berkeley-db@18.1.40 c-blosc@1.21.6 eigen@5.0.1 gettext@1.0 krb5@1.22.2 lz4@1.10.0 ncurses@6.6 openssl@3.6.1 ucx@1.20.0 +bison@3.8.2 cmake@3.31.11 expat@2.7.4 hwloc@2.4.1 libaec@1.1.4 m4@1.4.21 nghttp2@1.67.1 python@3.14.3 zlib-ng@2.3.3 +boost@1.88.0 curl@8.18.0 ffmpeg@7.1 icu4c@74.2 libffi@3.5.2 mimalloc@3.2.7 openssh@10.2p1 snappy@1.2.1 zstd@1.5.7 + +-- linux-rhel9-zen5 / %c,fortran=gcc@15.2.0 --------------------- +fftw@3.3.10 netcdf-fortran@4.6.2 netlib-lapack@3.12.1 netlib-scalapack@2.2.2 + +-- linux-rhel9-zen5 / %c=gcc@15.2.0 ----------------------------- +alsa-lib@1.2.15.3 diffutils@3.12 gmake@4.4.1 libbsd@0.12.2 libiconv@1.18 libtool@2.5.4 nasm@2.16.03 perl@5.42.0 readline@8.3 util-linux-uuid@2.41 +automake@1.18.1 findutils@4.10.0 gsl@2.8 libedit@3.1-20240808 libmd@1.1.0 libxcrypt@4.5.2 netcdf-c@4.9.3 pigz@2.8 sqlite@3.51.2 xz@5.8.2 +bzip2@1.0.8 gdbm@1.26 hdf5@1.14.6 libevent@2.1.12 libsigsegv@2.15 libxml2@2.15.1 numactl@2.0.19 pkgconf@2.5.1 tar@1.35 + +-- 
linux-rhel9-zen5 / %cxx=gcc@15.2.0 --------------------------- +kokkos@5.0.2 + +-- linux-rhel9-zen5 / no compilers ------------------------------ +autoconf@2.72 compiler-wrapper@1.0 gcc-runtime@15.2.0 +==> 73 installed packages +``` + + +## How to set up a custom environment for the installs + +The overall plan is to set up a custom environment that uses as many +already existing installed packages from the system core environment +(upstream) as possible, and installs the missing ones in a custom +install tree. Multiple environments can use the same install trees. + +The commands + +```console +$ spack env create environments/mygcc152_ec +$ spack env activate -p environments/mygcc152_ec +``` + +create the initial version of the file defining the environment, +`environments/mygcc152_ec/spack.yaml`, and make the subsequent Spack +commands act within the environment. If you plan to install multiple +versions of the same packages in the environment, consider adding +the `--without-view` option to the +[spack env activate](https://spack.readthedocs.io/en/v1.1.1/environments.html#activating-an-environment) +command.
+ +Similar to defining `SPACK_USER_CACHE_PATH`, we need to override some +default settings, so that they do not point to default +system locations (which are not writable by users): + +```console +[mygcc152_ec] $ spack config add 'config:source_cache:$spack_user_cache/source-cache' +``` + +The chosen upstream environment and the location of our custom environment's +actual software install root can be added to environment configuration +(`spack.yaml` file) with commands + +```console +[mygcc152_ec] $ spack config add 'upstreams:gcc152_ec:install_tree:/appl/soft/spack/core/v2026_03/x86_64/gcc152_ec/install_dir' +[mygcc152_ec] $ spack config add config:install_tree:root:$PWD/mygcc152_ec-install +``` + +Optionally, you can also add other configuration settings, for example +flatten the default hierarchy in the install tree and use short +hashes: + +```console +[mygcc152_ec] $ spack config add 'config:install_tree:projections:all:"{name}-{version}-{hash:7}"' +``` + +The final step in the custom environment configuration is to define what to install to the environment: + +```console +[mygcc152_ec] $ spack add eccodes +``` + +In Spack terminology `eccodes` above is a +[spec](https://spack.readthedocs.io/en/v1.1.1/spec_syntax.html), +defining the exact +configuration of the software to be installed. Obviously `eccodes` is +quite abstract, yet. Next we will refine it. 
+ + +## Refining the spec and installing + +Often the defaults are ok, but it is always best to check the +concretized spec before installing: + +```console +[mygcc152_ec] $ spack concretize +==> Fetching https://ghcr.io/v2/spack/bootstrap-buildcache-v2.2/blobs/sha256:2010a2a50b9620c2bda7c5fa4e9ce137a115dbba35094857fecc819d9a00a789 +==> Fetching https://ghcr.io/v2/spack/bootstrap-buildcache-v2.2/blobs/sha256:31f1649728e2d58902eb62d1c2e37b1cfc73e007089322a17463b3cb5777cb98 +==> Installing "clingo-bootstrap@=spack~apps~docs+ipo+optimized+python+static_libstdcpp build_system=cmake build_type=Release commit=2a025667090d71b2c9dce60fe924feb6bde8f667 generator=make patches:=bebb819,ec99431 platform=linux os=centos7 target=x86_64" from a buildcache +==> Concretized 1 spec: + - bc5ld7m eccodes@2.45.0+aec~fortran~ipo~memfs~netcdf~openmp~png~pthreads+shared~tools build_system=cmake build_type=Release extra_definitions:=none generator=make jp2k=openjpeg platform=linux os=rhel9 target=zen5 %c,cxx=gcc@15.2.0 +[^] zwnvbda ^cmake@3.31.11~doc+ncurses+ownlibs~qtgui build_system=generic build_type=Release platform=linux os=rhel9 target=zen5 %c,cxx=gcc@15.2.0 +[^] lrippm6 ^curl@8.18.0~gssapi~ldap~libidn2~librtmp~libssh~libssh2+nghttp2 build_system=autotools libs:=shared,static tls:=openssl platform=linux os=rhel9 target=zen5 %c,cxx=gcc@15.2.0 +[^] 45i4fsq ^nghttp2@1.67.1 build_system=autotools platform=linux os=rhel9 target=zen5 %c,cxx=gcc@15.2.0 +[^] wxvpbhd ^diffutils@3.12 build_system=autotools platform=linux os=rhel9 target=zen5 %c=gcc@15.2.0 +[^] 3n6hlmz ^libiconv@1.18 build_system=autotools libs:=shared,static platform=linux os=rhel9 target=zen5 %c=gcc@15.2.0 +[^] gouvek3 ^openssl@3.6.1~docs+shared build_system=generic certs=system platform=linux os=rhel9 target=zen5 %c,cxx=gcc@15.2.0 +[^] lo2oqga ^perl@5.42.0+cpanm+opcode+open+shared+threads build_system=generic platform=linux os=rhel9 target=zen5 %c=gcc@15.2.0 +[^] osujnxv ^berkeley-db@18.1.40+cxx~docs+stl 
build_system=autotools patches:=26090f4,b231fcc platform=linux os=rhel9 target=zen5 %c,cxx=gcc@15.2.0 +[^] tmpzrkl ^bzip2@1.0.8~debug~pic+shared build_system=generic platform=linux os=rhel9 target=zen5 %c=gcc@15.2.0 +[^] nc55qc5 ^gdbm@1.26 build_system=autotools platform=linux os=rhel9 target=zen5 %c=gcc@15.2.0 +[^] lr4bql4 ^readline@8.3 build_system=autotools patches:=21f0a03 platform=linux os=rhel9 target=zen5 %c=gcc@15.2.0 +[^] 5psyrcf ^pkgconf@2.5.1 build_system=autotools platform=linux os=rhel9 target=zen5 %c=gcc@15.2.0 +[e] i7ouzvl ^gcc@15.2.0+binutils+bootstrap~graphite+libsanitizer~mold~nvptx~piclibs~profiled~strip build_system=autotools build_type=RelWithDebInfo languages:='c,c++,fortran,jit' platform=linux os=rhel9 target=x86_64 +[^] yi5ecvd ^gcc-runtime@15.2.0 build_system=generic platform=linux os=rhel9 target=zen5 +[^] kztfkls ^ncurses@6.6~symlinks+termlib abi=none build_system=autotools patches:=7a351bc platform=linux os=rhel9 target=zen5 %c,cxx=gcc@15.2.0 +[^] turbkrl ^zlib-ng@2.3.3+compat+new_strategies+opt+pic+shared build_system=autotools platform=linux os=rhel9 target=zen5 %c,cxx=gcc@15.2.0 +[^] iglv3xy ^compiler-wrapper@1.0 build_system=generic platform=linux os=rhel9 target=zen5 +[e] fx7oixv ^gcc@15.2.0+binutils+bootstrap~graphite+libsanitizer~mold~nvptx~piclibs~profiled~strip build_system=autotools build_type=RelWithDebInfo languages:='c,c++,fortran' platform=linux os=rhel9 target=x86_64 + - szq7ch3 ^gcc-runtime@15.2.0 build_system=generic platform=linux os=rhel9 target=zen5 +[e] 45if5qv ^glibc@2.34 build_system=autotools platform=linux os=rhel9 target=x86_64 +[^] jvnmnyr ^gmake@4.4.1~guile build_system=generic platform=linux os=rhel9 target=zen5 %c=gcc@15.2.0 +[^] elazd5o ^libaec@1.1.4~ipo+shared build_system=cmake build_type=Release generator=make platform=linux os=rhel9 target=zen5 %c,cxx=gcc@15.2.0 + - szh7ud6 ^openjpeg@2.3.1~codec~ipo build_system=cmake build_type=Release generator=make platform=linux os=rhel9 target=zen5 
%c,cxx=gcc@15.2.0 + +==> Updating view at /users/jlento/user-spack/environments/mygcc152_ec/.spack-env/view +``` + +There are some keypoints to check in the concretized spec. First, +we need to check that the variant (build options) are what we +want. In the case of eccodes we can compare the current spec + +``` +eccodes@2.45.0+aec~fortran~ipo~memfs~netcdf~openmp~png~pthreads+shared~tools build_system=cmake build_type=Release extra_definitions:=none generator=make jp2k\ +=openjpeg platform=linux os=rhel9 target=zen5 %c,cxx=gcc@15.2.0 +``` + +against the spec of the eccodes installation in Mahti + +``` +eccodes@2.34.0+aec+fortran~ipo+memfs~netcdf+openmp+png~pthreads+shared~tools build_system=cmake build_type=Release extra_definitions=none generator=make jp2k=jasper +``` + +We notice the version number update, which is fine, but then we notice +that variants `fortran` (fortran interface), `memfs` (some definition +data in the library/memory instead of in small files on disc), +`openmp` (thread support), `png` are marked with `~`, which means that +those configuration options (variants) are not set. In addition, we'd +like to include variant `tools`, which builds command line tools with +the library. The last thing we notice is that the compression library +is openjpeg instead of jasper (fine?). + +Let's update the variant information, and reconcretize (omitting the output): + +```console +[mygcc152_ec] $ spack change eccodes@2.45.0+aec+fortran~ipo+memfs~netcdf+openmp+png~pthreads+shared+tools +[mygcc152_ec] $ spack concretize +``` + +Now the variant is correct. Next we check that Spack is actually using +the packages that are already installed in the upstream +environment. The information is in the first column of the concretized +spec. Entry ` - ` means that the package will be installed, `[^]` +tells Spack will use the already existing upstream installation, and +`[e]` stands for "external", defined separately in configuration. 
Here +all looks fine, and we can proceed to installation + +```console +[mygcc152_ec] $ spack install +``` + +## Using the environment + +The actual software installs are in the directory +`$PWD/mygcc152_ec-install` that we set earlier in the environment +configuration: + +```console +[mygcc152_ec] $ ls mygcc152_ec-install/ +bin eccodes-2.45.0-4fi3e5d gcc-runtime-15.2.0-szq7ch3 libpng-1.6.55-dhucqwk openjpeg-2.3.1-szh7ud6 +``` + +If the environment was activated with the view, the installed software and libraries +are also accessible through the environment's +[view](https://spack.readthedocs.io/en/latest/environments.html#environment-views), +which in this tutorial example is in `environments/mygcc152_ec/.spack-env/view`. + +The command + +```console +[mygcc152_ec] $ spack env deactivate +``` + +will deactivate the current Spack environment. + From 975a1d35eb5cd4ca48d1482cb8062fa57c394d1e Mon Sep 17 00:00:00 2001 From: Juha Lento Date: Thu, 23 Apr 2026 09:38:24 +0300 Subject: [PATCH 066/139] Update user-spack.md Added pointer where corresponding Roihu documentation can be found. --- docs/support/tutorials/user-spack.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/docs/support/tutorials/user-spack.md b/docs/support/tutorials/user-spack.md index 6fbbb75969..3ca572d486 100644 --- a/docs/support/tutorials/user-spack.md +++ b/docs/support/tutorials/user-spack.md @@ -11,6 +11,9 @@ customers that enables per-project software installations using Spack. compiling and linking programs. !!! info "Available versions" + A similar tutorial for Roihu, in which Spack is used directly, + is in a separate document + [Using Spack to install software in Roihu](roihu-user-spack.md). This tutorial assumes you are on Puhti, which has `spack/v0.18-user` installed. Mahti has two versions of Spack available for users, `spack/v0.17-user` and `spack/v0.20-user`.
Aside from the module versions, From b20ede1eda438d7ae6b441a501474d09eebd3950 Mon Sep 17 00:00:00 2001 From: Juha Lento Date: Thu, 23 Apr 2026 09:51:06 +0300 Subject: [PATCH 067/139] Update roihu-user-spack.md Added shell quotation to make html rendering look more coherent --- docs/support/tutorials/roihu-user-spack.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/support/tutorials/roihu-user-spack.md b/docs/support/tutorials/roihu-user-spack.md index 62c5196bd6..f6b5b69877 100644 --- a/docs/support/tutorials/roihu-user-spack.md +++ b/docs/support/tutorials/roihu-user-spack.md @@ -96,7 +96,7 @@ The packages in the upstream environment can be listed, for example, with command ```console -$ spack -c upstreams:gcc152_ec:install_tree:/appl/soft/spack/core/v2026_03/x86_64/gcc152_ec/install_dir find +$ spack -c 'upstreams:gcc152_ec:install_tree:/appl/soft/spack/core/v2026_03/x86_64/gcc152_ec/install_dir' find -- linux-rhel9-x86_64 / %c=gcc@15.2.0 --------------------------- knem@1.1.4 @@ -164,7 +164,7 @@ actual software install root can be added to environment configuration ```console [mygcc152_ec] $ spack config add 'upstreams:gcc152_ec:install_tree:/appl/soft/spack/core/v2026_03/x86_64/gcc152_ec/install_dir' -[mygcc152_ec] $ spack config add config:install_tree:root:$PWD/mygcc152_ec-install +[mygcc152_ec] $ spack config add 'config:install_tree:root:$PWD/mygcc152_ec-install' ``` Optionally, you can also add other configuration settings, for example @@ -254,7 +254,7 @@ is openjpeg instead of jasper (fine?). 
Let's update the variant information, and reconcretize (omitting the output): ```console -[mygcc152_ec] $ spack change eccodes@2.45.0+aec+fortran~ipo+memfs~netcdf+openmp+png~pthreads+shared+tools +[mygcc152_ec] $ spack change 'eccodes@2.45.0+aec+fortran~ipo+memfs~netcdf+openmp+png~pthreads+shared+tools' [mygcc152_ec] $ spack concretize ``` From 36bcc6a202794df9e8fbddc4ff264512abce701d Mon Sep 17 00:00:00 2001 From: Juha Lento Date: Thu, 23 Apr 2026 10:06:47 +0300 Subject: [PATCH 068/139] Update roihu-user-spack.md Testing html conversion --- docs/support/tutorials/roihu-user-spack.md | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/docs/support/tutorials/roihu-user-spack.md b/docs/support/tutorials/roihu-user-spack.md index f6b5b69877..d2f42e0674 100644 --- a/docs/support/tutorials/roihu-user-spack.md +++ b/docs/support/tutorials/roihu-user-spack.md @@ -83,7 +83,10 @@ work efficiently. The available environments can be listed for example with ```console -$ ls /appl/soft/spack/core/v2026_03/x86_64/ +ls /appl/soft/spack/core/v2026_03/x86_64/ +``` + +```output aocc50_ec compilers_ce compilers_ec gcc152_ec ``` From bb4a30af9b522420df02a81c4eaaa4182eb8a12f Mon Sep 17 00:00:00 2001 From: Juha Lento Date: Thu, 23 Apr 2026 10:54:11 +0300 Subject: [PATCH 069/139] Update roihu-user-spack.md --- docs/support/tutorials/roihu-user-spack.md | 68 ++++++++++++++-------- 1 file changed, 44 insertions(+), 24 deletions(-) diff --git a/docs/support/tutorials/roihu-user-spack.md b/docs/support/tutorials/roihu-user-spack.md index d2f42e0674..8a05d87440 100644 --- a/docs/support/tutorials/roihu-user-spack.md +++ b/docs/support/tutorials/roihu-user-spack.md @@ -49,10 +49,10 @@ settings from those scopes leak into our setup. See for details. 
```console -$ source /appl/soft/spack/v2026_03/spack/share/spack/setup-env.sh -$ source /appl/soft/spack/v2026_03/spack/share/spack/bash/spack-completion.bash -$ export SPACK_USER_CACHE_PATH=$TMPDIR/spack -$ export SPACK_DISABLE_LOCAL_CONFIG=true +source /appl/soft/spack/v2026_03/spack/share/spack/setup-env.sh +source /appl/soft/spack/v2026_03/spack/share/spack/bash/spack-completion.bash +export SPACK_USER_CACHE_PATH=$TMPDIR/spack +export SPACK_DISABLE_LOCAL_CONFIG=true ``` @@ -60,9 +60,7 @@ $ export SPACK_DISABLE_LOCAL_CONFIG=true The different versions of Spack itself are installed in `/appl/soft/spack`. At the time of writing this, the latest installed -Spack version is in - -`/appl/soft/spack/v2026_03` +Spack version is in `/appl/soft/spack/v2026_03`. The corresponding core environments and the application environments (built on top of the core environments) are in directories @@ -86,6 +84,8 @@ The available environments can be listed for example with ls /appl/soft/spack/core/v2026_03/x86_64/ ``` +which (at the time of writing this) gives + ```output aocc50_ec compilers_ce compilers_ec gcc152_ec ``` @@ -99,7 +99,12 @@ The packages in the upstream environment can be listed, for example, with command ```console -$ spack -c 'upstreams:gcc152_ec:install_tree:/appl/soft/spack/core/v2026_03/x86_64/gcc152_ec/install_dir' find +spack -c 'upstreams:gcc152_ec:install_tree:/appl/soft/spack/core/v2026_03/x86_64/gcc152_ec/install_dir' find +``` + +which gives + +```output -- linux-rhel9-x86_64 / %c=gcc@15.2.0 --------------------------- knem@1.1.4 @@ -141,8 +146,8 @@ install tree. Multiple environments can use the same install trees. 
The commands ```console -$ spack env create environments/mygcc152_ec -$ spack env activate -p environments/mygcc152_ec +spack env create environments/mygcc152_ec +spack env activate -p environments/mygcc152_ec ``` create the initial version of the file defining the environment, @@ -158,7 +163,7 @@ default settings, so that they do not point to default system locations (which are not writable by users): ```console -[mygcc152_ec] $ spack config add 'config:source_cache:$spack_user_cache/source-cache' +spack config add 'config:source_cache:$spack_user_cache/source-cache' ``` The chosen upstream environment and the location of our custom environment's @@ -166,8 +171,8 @@ actual software install root can be added to environment configuration (`spack.yaml` file) with commands ```console -[mygcc152_ec] $ spack config add 'upstreams:gcc152_ec:install_tree:/appl/soft/spack/core/v2026_03/x86_64/gcc152_ec/install_dir' -[mygcc152_ec] $ spack config add 'config:install_tree:root:$PWD/mygcc152_ec-install' +spack config add 'upstreams:gcc152_ec:install_tree:/appl/soft/spack/core/v2026_03/x86_64/gcc152_ec/install_dir' +spack config add 'config:install_tree:root:$PWD/mygcc152_ec-install' ``` Optionally, you can also add other configuration settings, for example @@ -175,13 +180,13 @@ flatten the default hierarchy in the install tree and use short hashes: ```console -[mygcc152_ec] $ spack config add 'config:install_tree:projections:all:"{name}-{version}-{hash:7}"' +spack config add 'config:install_tree:projections:all:"{name}-{version}-{hash:7}"' ``` The final step in the custom environment configuration is to define what to install to the environment: ```console -[mygcc152_ec] $ spack add eccodes +spack add eccodes ``` In Spack terminology `eccodes` above is a @@ -194,10 +199,15 @@ quite abstract, yet. Next we will refine it. 
## Refining the spec and installing Often the defaults are ok, but it is always best to check the -concretized spec before installing: +concretized spec before installing. Command ```console -[mygcc152_ec] $ spack concretize +spack concretize +``` + +shows the concretized spec + +```output ==> Fetching https://ghcr.io/v2/spack/bootstrap-buildcache-v2.2/blobs/sha256:2010a2a50b9620c2bda7c5fa4e9ce137a115dbba35094857fecc819d9a00a789 ==> Fetching https://ghcr.io/v2/spack/bootstrap-buildcache-v2.2/blobs/sha256:31f1649728e2d58902eb62d1c2e37b1cfc73e007089322a17463b3cb5777cb98 ==> Installing "clingo-bootstrap@=spack~apps~docs+ipo+optimized+python+static_libstdcpp build_system=cmake build_type=Release commit=2a025667090d71b2c9dce60fe924feb6bde8f667 generator=make patches:=bebb819,ec99431 platform=linux os=centos7 target=x86_64" from a buildcache @@ -254,14 +264,16 @@ like to include variant `tools`, which builds command line tools with the library. The last thing we notice is that the compression library is openjpeg instead of jasper (fine?). -Let's update the variant information, and reconcretize (omitting the output): +Let's update the variant information, and reconcretize (the output of +the command is omitted as it is similar to the previous output from +`spack concretize`command): ```console -[mygcc152_ec] $ spack change 'eccodes@2.45.0+aec+fortran~ipo+memfs~netcdf+openmp+png~pthreads+shared+tools' -[mygcc152_ec] $ spack concretize +spack change 'eccodes@2.45.0+aec+fortran~ipo+memfs~netcdf+openmp+png~pthreads+shared+tools' +spack concretize ``` -Now the variant is correct. Next we check that Spack is actually using +Now the variant is correct. Next, we check that Spack is actually using the packages that are already installed in the upstream environment. The information is in the first column of the concretized spec. 
Entry ` - ` means that the package will be installed, `[^]` @@ -270,20 +282,28 @@ tells Spack will use the already existing upstream installation, and all looks fine, and we can proceed to installation ```console -[mygcc152_ec] $ spack install +spack install ``` ## Using the environment The actual software installs are in the directory `$PWD/mygcc152_ec-install` that we set earlier in the environment -configuration: +configuration. Command ```console -[mygcc152_ec] $ ls mygcc152_ec-install/ +ls mygcc152_ec-install +``` + +shows the install roots: + +```output bin eccodes-2.45.0-4fi3e5d gcc-runtime-15.2.0-szq7ch3 libpng-1.6.55-dhucqwk openjpeg-2.3.1-szh7ud6 ``` +Different versions (variants, anything different in the concretized spec) of the packages +are installed with unique hashes. + If the environment was activated with the view, the installed software and libraries are also accessible through environment's [view](https://spack.readthedocs.io/en/latest/environments.html#environment-views), From 303a13e8bdad64b4f0fc8d42101ec59b33f1f9cb Mon Sep 17 00:00:00 2001 From: Jaan Tollander de Balsch Date: Thu, 23 Apr 2026 12:39:29 +0300 Subject: [PATCH 070/139] fix typo --- docs/support/tutorials/roihu.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/support/tutorials/roihu.md b/docs/support/tutorials/roihu.md index 4ff9c77933..f4c0377553 100644 --- a/docs/support/tutorials/roihu.md +++ b/docs/support/tutorials/roihu.md @@ -141,7 +141,7 @@ Base container are built on top of Rocky Linux 9. Now, you can run commands inside the container with clean environment and environment active as follows: ```bash - apptainer run run mycmd + apptainer run container.sif mycmd ``` === "Roihu GPU base container (~16 GB)" @@ -169,7 +169,7 @@ Base container are built on top of Rocky Linux 9. 
Now, you can run commands inside the container with clean environment and environment active as follows: ```bash - apptainer run --nv run mycmd + apptainer run --nv container.sif mycmd ``` More details on working with containers in CSC's computing environment can be found from the links below: From 5dd98696352b3f86e6e2b27fdcd245c685407e54 Mon Sep 17 00:00:00 2001 From: Juha Lento Date: Thu, 23 Apr 2026 12:51:29 +0300 Subject: [PATCH 071/139] Update roihu-user-spack.md One more prompt removed... --- docs/support/tutorials/roihu-user-spack.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/support/tutorials/roihu-user-spack.md b/docs/support/tutorials/roihu-user-spack.md index 8a05d87440..d1773ffa84 100644 --- a/docs/support/tutorials/roihu-user-spack.md +++ b/docs/support/tutorials/roihu-user-spack.md @@ -312,7 +312,7 @@ which in this tutorial example is in `environments/mygcc152_ec/.spack-env/view`. Command ```console -[mygcc152_ec] $ spack env deactivate +spack env deactivate ``` will deactivate the current Spack environment. From 68c4327fbb35442751cd275a5a2027c6501994ce Mon Sep 17 00:00:00 2001 From: Sami Ilvonen Date: Thu, 23 Apr 2026 12:47:51 +0300 Subject: [PATCH 072/139] Fixed typos --- docs/data/Allas/introduction.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/docs/data/Allas/introduction.md b/docs/data/Allas/introduction.md index 11cb80f740..72a6b4f480 100644 --- a/docs/data/Allas/introduction.md +++ b/docs/data/Allas/introduction.md @@ -15,7 +15,7 @@ The stored objects can be of any data type, such as images or compressed data fi * The data can be accessed from anywhere. * The data can have different levels of access control. * The data can have lifecycle policy set. - * You can access Allas from any machine or server that is connected to internet. This can be a your laptop, supercomputer at CSC, virtual machine in cloud or enven your phone. 
+ * You can access Allas from any machine or server that is connected to internet. This can be a your laptop, supercomputer at CSC, virtual machine in cloud or even your phone. **Limitations** @@ -77,7 +77,7 @@ Allas has technical limits, that normally can not be increased: | Buckets per project | 1 000 | | Objects per bucket | 500 000 | -If you need to store lot of objects, please plan on spreading the objects across multiple buckets. Spreading data to multiple buckets will give a better performance whenever writing objects. Alternatively, consider storing data in bigger units, for example as tar or zip arcive files instead of storing each file as a separate object. +If you need to store lot of objects, please plan on spreading the objects across multiple buckets. Spreading data to multiple buckets will give a better performance whenever writing objects. Alternatively, consider storing data in bigger units, for example as tar or zip archive files instead of storing each file as a separate object. ## Protocols @@ -87,7 +87,7 @@ The object storage service is provided over two different protocols, **S3** and * The token-based **Swift authentication** used remains valid for **eight hours* at a time. -The permanent connection of S3 is practical in many ways, but it includes a security aspect: if the server where Allas is used is compromised, the object storage space will be compromised as well. Due to this security concern, Swift has been the defult for multiple-user servers such as Allas web interface, Mahti and Puhti. However, due to technical development Allas services are starting to use S3 as the default protocol. For example in Roihu S3 will be the default Allas protocol. +The permanent connection of S3 is practical in many ways, but it includes a security aspect: if the server where Allas is used is compromised, the object storage space will be compromised as well. 
Due to this security concern, Swift has been the default for multiple-user servers such as Allas web interface, Mahti and Puhti. However, due to technical development Allas services are starting to use S3 as the default protocol. For example in Roihu S3 will be the default Allas protocol. Thus, for example, the CSC-specific `a-commands`, as well as the `rclone` configuration in Puhti and Mahti, are by default based on Swift. But in Roihu they use S3- @@ -101,8 +101,8 @@ Each bucket has a name that must be unique across all Allas users. If another us Object URLs can be in the DNS format, e.g. _https://a3s.fi/bucketname/objectname_. Please use a valid DNS name (RFC 1035). **We recommend not using upper case or non-ASCII (ä, ö etc.) characters in bucket names**. -Object names, inside the bucket don't charcter restrictions like bucket names. Howerver, some Allas interfaces may have problems with certain characters (e.g. spaces ) in object names. -Not also that there is no directory sturcture in side the bucket. You can incluce slash (/) characters in object names, to define [pseodo folders](terms_and_concepts.md#pseudo-folder), which some Allas clients display as folders. However thenically slash caracters are just part of a the object name. +Object names, inside the bucket don't character restrictions like bucket names. However, some Allas interfaces may have problems with certain characters (e.g. spaces ) in object names. +Not also that there is no directory structure in side the bucket. You can include slash (/) characters in object names, to define [pseudo folders](terms_and_concepts.md#pseudo-folder), which some Allas clients display as folders. However technically slash characters are just part of a the object name. ## File sizes and packaging @@ -141,6 +141,6 @@ Allas data is spread across various servers, which protects against disk and ser 4. If moving data from/to your local machine, install the selected tool (not needed if using webinterfaces). 5. 
[Configure the connection](using_allas/allas-conf.md) to Allas. 6. Move data to/from Allas. -7. If you want to share the data publically or with another project, change access rights for your data. +7. If you want to share the data publicly or with another project, change access rights for your data. For last two steps, see tool specific instructions. From 9138c4ed4e7fd1dbb24cbe1cb63ff697bb114b2b Mon Sep 17 00:00:00 2001 From: leopekkas Date: Thu, 23 Apr 2026 13:24:15 +0300 Subject: [PATCH 073/139] Fix formatting in Roihu tutorial --- docs/support/tutorials/roihu.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/support/tutorials/roihu.md b/docs/support/tutorials/roihu.md index f4c0377553..ea46202060 100644 --- a/docs/support/tutorials/roihu.md +++ b/docs/support/tutorials/roihu.md @@ -95,8 +95,8 @@ Quota extensions on Roihu must be separately applied for and properly motivated. ## Installing software Before installing anything check if the software is already available: - - [List of pre-installed applications](../../apps/by_availability.md#roihu) - - `module spider ` +- [List of pre-installed applications](../../apps/by_availability.md#roihu) +- `module spider ` If not available, choose one of the following approaches depending on your needs: From 9b5a1e2af6840c9486779dd12f64b426f2c7f7bd Mon Sep 17 00:00:00 2001 From: leopekkas Date: Thu, 23 Apr 2026 14:29:10 +0300 Subject: [PATCH 074/139] Apply suggestion on SSH-CA instructions, prioritize the MyCSC route --- docs/computing/connecting/ssh-keys.md | 89 ++++++++++++++------------- docs/support/tutorials/roihu.md | 1 + 2 files changed, 48 insertions(+), 42 deletions(-) diff --git a/docs/computing/connecting/ssh-keys.md b/docs/computing/connecting/ssh-keys.md index 958139cd9c..e0296503ce 100644 --- a/docs/computing/connecting/ssh-keys.md +++ b/docs/computing/connecting/ssh-keys.md @@ -121,7 +121,51 @@ system by introducing an additional authentication factor for SSH logins. 
**SSH certificates are valid for 24 hours at a time**. Once your certificate expires, a new one must be signed following either of the processes below. -### Option 1: Certificate helper tool (recommended) +--- + +### Option 1: MyCSC (primary method) + +1. Log in to MyCSC with your CSC or Haka/Virtu credentials. +1. Select _Profile_ from the left-hand navigation or the dropdown menu in the + top-right corner. +1. Locate _SSH PUBLIC KEYS_ section and click the three vertical dots next to + the public key you want to sign. +1. Click _Sign and download SSH certificate_. As a security measure, you may be + asked to log in again. + + ![Sign and download SSH certificate](https://a3s.fi/docs-files/sign-download-ssh-cert.png 'Sign and download SSH certificate') + + !!! info "Where to store the SSH certificate?" + We **strongly** advice saving the certificate in the default folder for + SSH-related files (e.g. `~/.ssh` or `C:\Users\/.ssh`). + Specifically, storing the certificate in the same directory as your + SSH private key **and** naming it as `-cert.pub` will simplify + connecting, working with SSH agent, etc. + + For example, if you've stored your SSH private key in + `~/.ssh/id_ed25519`, please save your SSH certificate as + `~/.ssh/id_ed25519-cert.pub`. + +1. **Connect to Roihu following these instructions**: + +=== "Linux & macOS" + + 1. Optional, add certificate to [SSH authentication agent](ssh-unix.md#authentication-agent). Mandatory for SSH agent forwarding. + 1. [Connect from Terminal](ssh-unix.md#basic-usage) + + +=== "Windows" + + 1. Optional, add certificate to [SSH authentication agent](ssh-windows.md#authentication-agents-with-roihu). Mandatory for SSH agent forwarding or for using FileZilla and WinSCP. + 1. Connect to Roihu [with SSH clients](ssh-windows.md#basic-usage) or [graphical file transfer tools](../../data/moving/graphical_transfer.md). 
+ + +--- + +### Option 2: Certificate helper tool + +We recommend trying the MyCSC workflow first, as it is the simplest and most reliable option. +If you later find the process repetitive or want to automate it, you can use the helper tool described below. The certificate helper is a Python tool developed by CSC to simplify the process of signing and downloading an SSH certificate, and adding it to your @@ -131,7 +175,7 @@ following instructions illustrate only basic usage. 1. Ensure that you have Python installed on your computer. - Instructions are available in the - [Python Beginners Guide](https://wiki.python.org/moin/BeginnersGuide/Download). + [Official Python ownloads page](https://www.python.org/downloads/). Contact your local IT-support if you need assistance. - If Python for some reason cannot be installed on your computer, fall back to [Option 2](#option-2-mycsc) instead. @@ -174,7 +218,7 @@ following instructions illustrate only basic usage. your SSH agent. - The signed certificate is saved as `-cert.pub` (e.g., `~/.ssh/id_ed25519-cert.pub`). - 6. **[Connect to Roihu following these instructions](ssh-unix.md#basic-usage)**. + 6. Your now have everything ready to **[Connect to Roihu following these instructions](ssh-unix.md#basic-usage)**. === "Windows" @@ -229,45 +273,6 @@ following instructions illustrate only basic usage. as `-cert.pub` for OpenSSH keys and/or `-cert.ppk` for Putty keys. 13. Connect to Roihu [with SSH clients](ssh-windows.md#basic-usage) or [graphical file transfer tools](../../data/moving/graphical_transfer.md). - ---- - -### Option 2: MyCSC - -1. Log in to MyCSC with your CSC or Haka/Virtu credentials. -1. Select _Profile_ from the left-hand navigation or the dropdown menu in the - top-right corner. -1. Locate _SSH PUBLIC KEYS_ section and click the three vertical dots next to - the public key you want to sign. -1. Click _Sign and download SSH certificate_. As a security measure, you may be - asked to log in again. 
- - ![Sign and download SSH certificate](https://a3s.fi/docs-files/sign-download-ssh-cert.png 'Sign and download SSH certificate') - - !!! info "Where to store the SSH certificate?" - We **strongly** advice saving the certificate in the default folder for - SSH-related files (e.g. `~/.ssh` or `C:\Users\/.ssh`). - Specifically, storing the certificate in the same directory as your - SSH private key **and** naming it as `-cert.pub` will simplify - connecting, working with SSH agent, etc. - - For example, if you've stored your SSH private key in - `~/.ssh/id_ed25519`, please save your SSH certificate as - `~/.ssh/id_ed25519-cert.pub`. - -1. **Connect to Roihu following these instructions**: - -=== "Linux & macOS" - - 1. Optional, add certificate to [SSH authentication agent](ssh-unix.md#authentication-agent). Mandatory for SSH agent forwarding. - 1. [Connect from Terminal](ssh-unix.md#basic-usage) - - -=== "Windows" - - 1. Optional, add certificate to [SSH authentication agent](ssh-windows.md#authentication-agents-with-roihu). Mandatory for SSH agent forwarding or for using FileZilla and WinSCP. - 1. Connect to Roihu [with SSH clients](ssh-windows.md#basic-usage) or [graphical file transfer tools](../../data/moving/graphical_transfer.md). - --- ### Check when your SSH certificate will expire diff --git a/docs/support/tutorials/roihu.md b/docs/support/tutorials/roihu.md index ea46202060..e195ce10f7 100644 --- a/docs/support/tutorials/roihu.md +++ b/docs/support/tutorials/roihu.md @@ -95,6 +95,7 @@ Quota extensions on Roihu must be separately applied for and properly motivated. 
## Installing software Before installing anything check if the software is already available: + - [List of pre-installed applications](../../apps/by_availability.md#roihu) - `module spider ` From 01b2cbd680c94368d85cccbe3aa8a899b8480e82 Mon Sep 17 00:00:00 2001 From: leopekkas Date: Thu, 23 Apr 2026 14:34:59 +0300 Subject: [PATCH 075/139] Add up-to-date Python link/fix typo --- docs/computing/connecting/ssh-keys.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/computing/connecting/ssh-keys.md b/docs/computing/connecting/ssh-keys.md index e0296503ce..274b7a2a2d 100644 --- a/docs/computing/connecting/ssh-keys.md +++ b/docs/computing/connecting/ssh-keys.md @@ -175,7 +175,7 @@ following instructions illustrate only basic usage. 1. Ensure that you have Python installed on your computer. - Instructions are available in the - [Official Python ownloads page](https://www.python.org/downloads/). + [Official Python downloads page](https://www.python.org/downloads/). Contact your local IT-support if you need assistance. - If Python for some reason cannot be installed on your computer, fall back to [Option 2](#option-2-mycsc) instead. From 2b635ca71768807dea253d865ede4a95c428b312 Mon Sep 17 00:00:00 2001 From: leopekkas Date: Thu, 23 Apr 2026 14:52:31 +0300 Subject: [PATCH 076/139] Fix dynamic links to old SSH headers --- docs/computing/connecting/ssh-keys.md | 3 +-- docs/computing/connecting/ssh-unix.md | 4 ++-- docs/computing/connecting/ssh-windows.md | 13 +++++++------ docs/data/moving/graphical_transfer.md | 8 ++++---- docs/support/tutorials/roihu-data.md | 3 --- 5 files changed, 14 insertions(+), 17 deletions(-) diff --git a/docs/computing/connecting/ssh-keys.md b/docs/computing/connecting/ssh-keys.md index 274b7a2a2d..f62e567adf 100644 --- a/docs/computing/connecting/ssh-keys.md +++ b/docs/computing/connecting/ssh-keys.md @@ -159,7 +159,6 @@ expires, a new one must be signed following either of the processes below. 1. 
Optional, add certificate to [SSH authentication agent](ssh-windows.md#authentication-agents-with-roihu). Mandatory for SSH agent forwarding or for using FileZilla and WinSCP. 1. Connect to Roihu [with SSH clients](ssh-windows.md#basic-usage) or [graphical file transfer tools](../../data/moving/graphical_transfer.md). - --- ### Option 2: Certificate helper tool @@ -178,7 +177,7 @@ following instructions illustrate only basic usage. [Official Python downloads page](https://www.python.org/downloads/). Contact your local IT-support if you need assistance. - If Python for some reason cannot be installed on your computer, fall - back to [Option 2](#option-2-mycsc) instead. + back to [Option 1](#option-1-mycsc-primary-method) instead. 2. [Download the certificate helper tool here](https://github.com/CSCfi/certificate-helper-tool/raw/refs/heads/main/csc_cert.py) (_Right click_ :material-arrow-right: _Save Link As_), or clone the Git repository: diff --git a/docs/computing/connecting/ssh-unix.md b/docs/computing/connecting/ssh-unix.md index 3e977c1ae7..58eaa76ac4 100644 --- a/docs/computing/connecting/ssh-unix.md +++ b/docs/computing/connecting/ssh-unix.md @@ -150,12 +150,12 @@ in memory. The program's behavior depends on your system: ``` **This step is done automatically if you use the - [CSC certificate helper tool](ssh-keys.md#option-1-certificate-helper-tool-recommended) + [CSC certificate helper tool](ssh-keys.md#option-2-certificate-helper-tool) to sign and download your SSH certificate!** !!! warning "Important note if you're not using the certificate helper tool" Users downloading SSH certificates - [manually from MyCSC](ssh-keys.md#option-2-mycsc) **must** store it in the + [manually from MyCSC](ssh-keys.md#option-1-mycsc-primary-method) **must** store it in the same directory as the SSH private key **and** name it as `-cert.pub` to be able to add it to SSH agent with `ssh-add` command. 
If successful, `ssh-add` outputs: diff --git a/docs/computing/connecting/ssh-windows.md b/docs/computing/connecting/ssh-windows.md index b44a71d4e7..4d461481a2 100644 --- a/docs/computing/connecting/ssh-windows.md +++ b/docs/computing/connecting/ssh-windows.md @@ -19,8 +19,8 @@ In Windows, 2 different key types are widely used: CSC provides two options for this: -* Option 1, the [certificate helper tool](ssh-keys.md#option-1-certificate-helper-tool-recommended) -* Option 2, [manual download of SSH certificate from MyCSC](ssh-keys.md#option-2-mycsc) +* Option 1, [manual download of SSH certificate from MyCSC](ssh-keys.md#option-1-mycsc-primary-method) +* Option 2, the [certificate helper tool](ssh-keys.md#option-2-certificate-helper-tool) So for Roihu, consider also how different tools support updating the SSH certificate: @@ -316,16 +316,17 @@ Puhti, Mahti and LUMI do not use SSH certificates, so adding keys to SSH authent In Roihu, besides SSH keys a SSH certificate is required. If using SSH agent, a new SSH certificate must be added daily. CSC provides two options for this: -* Option 1, the [certificate helper tool](ssh-keys.md#option-1-certificate-helper-tool-recommended) -* Option 2, [manual download of SSH certificate from MyCSC](ssh-keys.md#option-2-mycsc) +* Option 1, [manual download of SSH certificate from MyCSC](ssh-keys.md#option-1-mycsc-primary-method) +* Option 2, the [certificate helper tool](ssh-keys.md#option-2-certificate-helper-tool) -Option 1 provides the easiest process to sign and download the SSH certificates for connecting to Roihu. +Start with option 1, as it is the simplest and most reliable to start with. +Option 2 provides a more streamlined process to sign and download the SSH certificates for connecting to Roihu. Importantly, it also automatically adds your SSH keys and certificate to **Windows ssh-agent** and/or **Pageant**. The script does not update MobAgent, so using Pageant is recommended for MobaXterm-users. 
-Option 2 requires some extra steps for adding the SSH certificate to the SSH agent. +Option 1 requires some extra steps for adding the SSH certificate to the SSH agent. === "Pageant & MobAgent" diff --git a/docs/data/moving/graphical_transfer.md b/docs/data/moving/graphical_transfer.md index 29fc8a0ccc..d9e0f0382b 100644 --- a/docs/data/moving/graphical_transfer.md +++ b/docs/data/moving/graphical_transfer.md @@ -43,8 +43,8 @@ if you trust the host, and then prompt you for your SSH key passphrase. agent automatically. [See the SSH certificate instructions here](../../computing/connecting/ssh-keys.md#signing-public-key). - We recommend using the - [CSC certificate helper tool](../../computing/connecting/ssh-keys.md#option-1-certificate-helper-tool-recommended). + For streamlining the process down the line, we recommend using the + [CSC certificate helper tool](../../computing/connecting/ssh-keys.md#option-2-certificate-helper-tool). Once the connection is opened, FileZilla shows two interactive file listings side by side. On the left side you have your local file system and on the right @@ -101,8 +101,8 @@ _OK_. you leave the field empty. [See the SSH certificate instructions here](../../computing/connecting/ssh-keys.md#signing-public-key). - We recommend using the - [CSC certificate helper tool](../../computing/connecting/ssh-keys.md#option-1-certificate-helper-tool-recommended). + For streamlining the process down the line, we recommend using the + [CSC certificate helper tool](../../computing/connecting/ssh-keys.md#option-2-certificate-helper-tool). ![WinSCP advanced site settings to add ssh private key](https://a3s.fi/docs-files/winscp-ssh-key-add.png 'Add SSH key to WinSCP') diff --git a/docs/support/tutorials/roihu-data.md b/docs/support/tutorials/roihu-data.md index abb9f3a60d..d2e42177d3 100644 --- a/docs/support/tutorials/roihu-data.md +++ b/docs/support/tutorials/roihu-data.md @@ -103,9 +103,6 @@ transfer process. 1. 
**[SSH agent instructions for Linux/macOS](../../computing/connecting/ssh-unix.md#authentication-agent).** 2. **[SSH agent instructions for Windows](../../computing/connecting/ssh-windows.md#authentication-agent).** - * We **strongly** recommend using the - [certificate helper tool](../../computing/connecting/ssh-keys.md#option-1-certificate-helper-tool-recommended) - developed by CSC to simplify the process. ## 2. Recommended data migration methods From 6a74a3e07806cd51b8b59241ffd8095d2488676d Mon Sep 17 00:00:00 2001 From: Sami Ilvonen Date: Thu, 23 Apr 2026 12:47:51 +0300 Subject: [PATCH 077/139] Fixed typos --- docs/data/Allas/introduction.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/docs/data/Allas/introduction.md b/docs/data/Allas/introduction.md index 11cb80f740..72a6b4f480 100644 --- a/docs/data/Allas/introduction.md +++ b/docs/data/Allas/introduction.md @@ -15,7 +15,7 @@ The stored objects can be of any data type, such as images or compressed data fi * The data can be accessed from anywhere. * The data can have different levels of access control. * The data can have lifecycle policy set. - * You can access Allas from any machine or server that is connected to internet. This can be a your laptop, supercomputer at CSC, virtual machine in cloud or enven your phone. + * You can access Allas from any machine or server that is connected to internet. This can be a your laptop, supercomputer at CSC, virtual machine in cloud or even your phone. **Limitations** @@ -77,7 +77,7 @@ Allas has technical limits, that normally can not be increased: | Buckets per project | 1 000 | | Objects per bucket | 500 000 | -If you need to store lot of objects, please plan on spreading the objects across multiple buckets. Spreading data to multiple buckets will give a better performance whenever writing objects. 
Alternatively, consider storing data in bigger units, for example as tar or zip arcive files instead of storing each file as a separate object. +If you need to store lot of objects, please plan on spreading the objects across multiple buckets. Spreading data to multiple buckets will give a better performance whenever writing objects. Alternatively, consider storing data in bigger units, for example as tar or zip archive files instead of storing each file as a separate object. ## Protocols @@ -87,7 +87,7 @@ The object storage service is provided over two different protocols, **S3** and * The token-based **Swift authentication** used remains valid for **eight hours* at a time. -The permanent connection of S3 is practical in many ways, but it includes a security aspect: if the server where Allas is used is compromised, the object storage space will be compromised as well. Due to this security concern, Swift has been the defult for multiple-user servers such as Allas web interface, Mahti and Puhti. However, due to technical development Allas services are starting to use S3 as the default protocol. For example in Roihu S3 will be the default Allas protocol. +The permanent connection of S3 is practical in many ways, but it includes a security aspect: if the server where Allas is used is compromised, the object storage space will be compromised as well. Due to this security concern, Swift has been the default for multiple-user servers such as Allas web interface, Mahti and Puhti. However, due to technical development Allas services are starting to use S3 as the default protocol. For example in Roihu S3 will be the default Allas protocol. Thus, for example, the CSC-specific `a-commands`, as well as the `rclone` configuration in Puhti and Mahti, are by default based on Swift. But in Roihu they use S3- @@ -101,8 +101,8 @@ Each bucket has a name that must be unique across all Allas users. If another us Object URLs can be in the DNS format, e.g. 
_https://a3s.fi/bucketname/objectname_. Please use a valid DNS name (RFC 1035). **We recommend not using upper case or non-ASCII (ä, ö etc.) characters in bucket names**. -Object names, inside the bucket don't charcter restrictions like bucket names. Howerver, some Allas interfaces may have problems with certain characters (e.g. spaces ) in object names. -Not also that there is no directory sturcture in side the bucket. You can incluce slash (/) characters in object names, to define [pseodo folders](terms_and_concepts.md#pseudo-folder), which some Allas clients display as folders. However thenically slash caracters are just part of a the object name. +Object names, inside the bucket don't character restrictions like bucket names. However, some Allas interfaces may have problems with certain characters (e.g. spaces ) in object names. +Not also that there is no directory structure in side the bucket. You can include slash (/) characters in object names, to define [pseudo folders](terms_and_concepts.md#pseudo-folder), which some Allas clients display as folders. However technically slash characters are just part of a the object name. ## File sizes and packaging @@ -141,6 +141,6 @@ Allas data is spread across various servers, which protects against disk and ser 4. If moving data from/to your local machine, install the selected tool (not needed if using webinterfaces). 5. [Configure the connection](using_allas/allas-conf.md) to Allas. 6. Move data to/from Allas. -7. If you want to share the data publically or with another project, change access rights for your data. +7. If you want to share the data publicly or with another project, change access rights for your data. For last two steps, see tool specific instructions. 
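The packaging advice in the Allas introduction touched by this patch (store data in bigger units, such as tar archives, instead of one object per file) can be sketched as below. This is a minimal sketch under stated assumptions: the helper name `pack_dir`, the directory name `results` and the throwaway files are hypothetical and only illustrate the idea. Uploading the archive to Allas is left out on purpose.

```shell
#!/bin/bash
# Hypothetical helper: pack a directory into one gzip-compressed tar archive
# so it can be stored as a single object instead of many small ones.
pack_dir() {
    local src_dir=$1      # directory containing many small files
    local archive=$2      # archive file to create
    tar -czf "$archive" -C "$(dirname "$src_dir")" "$(basename "$src_dir")"
}

# Demo with throwaway data (assumption: any small test directory will do)
workdir=$(mktemp -d)
mkdir "$workdir/results"
for i in 1 2 3; do
    echo "sample $i" > "$workdir/results/file_$i.txt"
done

pack_dir "$workdir/results" "$workdir/results.tar.gz"

# List the archive members to check what would be stored as one object
tar -tzf "$workdir/results.tar.gz"
```

Packing this way also helps a project stay below the per-bucket object limit quoted in the same document.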
From be552f64c5440a5484a634d8953e084b8e589671 Mon Sep 17 00:00:00 2001 From: Tuomas Rossi Date: Fri, 24 Apr 2026 09:37:55 +0300 Subject: [PATCH 078/139] Update Roihu affinity tutorial Simplify the Roihu affinity tutorial by using the csc-print-affinity tool --- docs/support/tutorials/affinity.md | 38 +++++------------------------- 1 file changed, 6 insertions(+), 32 deletions(-) diff --git a/docs/support/tutorials/affinity.md b/docs/support/tutorials/affinity.md index adb98e5f6d..a7dd7fa5df 100644 --- a/docs/support/tutorials/affinity.md +++ b/docs/support/tutorials/affinity.md @@ -19,7 +19,7 @@ reduces unnecessary movement between cores, and leads to more stable and predict ## Inspecting CPU Affinity in Slurm Jobs The following example job script can be used for checking the CPU affinities of each Slurm task. -The job script creates a script `print_affinity..sh` that is then executed via `srun`: +The job script executes `csc-print-affinity` (available in `csc-tools` module) via `srun`: ```bash #!/bin/bash @@ -31,22 +31,8 @@ The job script creates a script `print_affinity..sh` that is then execute #SBATCH --ntasks-per-node=8 --cpus-per-task=48 # The product should be 384 #SBATCH --hint=nomultithread -# Create a script for printing affinity -PRINT_AFFINITY="./print_affinity.$SLURM_JOB_ID.sh" -cat << 'EOF' > $PRINT_AFFINITY -#!/bin/bash -printf "Task %4d running on node %s core %s\n" \ - "$SLURM_PROCID" \ - "$SLURMD_NODENAME" \ - "$(grep Cpus_allowed_list /proc/self/status | cut -f2)" -EOF -chmod +x $PRINT_AFFINITY - -# Remove script on exit -trap "rm -f $PRINT_AFFINITY" EXIT - # Run the program -srun $PRINT_AFFINITY +srun csc-print-affinity ``` Example output for this job: @@ -88,8 +74,7 @@ This is **generally undesirable for performance** unless combined with explicit which can be done using tools such as numactl. The following job script provides an example for setting CPU binding manually. 
-The job script creates two scripts: `print_affinity..sh` and `cpu_bind..sh`. -The first script is the same script used above for checking affinities and the second script +The job script creates the script `bind_cpu..sh` that uses numactl for binding tasks to CPU cores in the same way as done by default: each task is assined a contiguous block of CPU cores based on the task's local ID and the number of CPUs per task: @@ -107,17 +92,6 @@ the number of CPUs per task: #SBATCH --ntasks-per-node=8 --cpus-per-task=48 # The product should be 384 #SBATCH --hint=nomultithread -# Create a script for printing affinity -PRINT_AFFINITY="./print_affinity.$SLURM_JOB_ID.sh" -cat << 'EOF' > $PRINT_AFFINITY -#!/bin/bash -printf "Task %4d running on node %s core %s\n" \ - "$SLURM_PROCID" \ - "$SLURMD_NODENAME" \ - "$(grep Cpus_allowed_list /proc/self/status | cut -f2)" -EOF -chmod +x $PRINT_AFFINITY - # Create a script for binding tasks to CPU cores BIND_CPU="./bind_cpu.$SLURM_JOB_ID.sh" cat << 'EOF' > $BIND_CPU @@ -129,11 +103,11 @@ numactl --physcpubind ${start_core}-${end_core} "$@" EOF chmod +x $BIND_CPU -# Remove scripts on exit -trap "rm -f $PRINT_AFFINITY $BIND_CPU" EXIT +# Remove the script on exit +trap "rm -f $BIND_CPU" EXIT # Run the program with manual binding -srun --cpu-bind=none $BIND_CPU $PRINT_AFFINITY +srun --cpu-bind=none $BIND_CPU csc-print-affinity ``` This produces output equivalent to the default CPU binding seen above, From 7d8da188b027673f5cce324cdcfb3834ca4ab4aa Mon Sep 17 00:00:00 2001 From: Sebastian von Alfthan Date: Fri, 24 Apr 2026 10:27:24 +0300 Subject: [PATCH 079/139] Minor fixes to SSH instructions (#2956) * Minor fixes to SSH instructions * Small fixes for readability and CI/CD * Internal link fixes for CI/CD --------- Co-authored-by: leopekkas --- docs/computing/connecting/index.md | 12 ++++++------ docs/computing/connecting/ssh-keys.md | 18 ++++++++++-------- docs/computing/connecting/ssh-unix.md | 2 +- docs/computing/connecting/ssh-windows.md |
17 +++++++++-------- 4 files changed, 26 insertions(+), 23 deletions(-) diff --git a/docs/computing/connecting/index.md b/docs/computing/connecting/index.md index 974b32eecd..f8bf576026 100644 --- a/docs/computing/connecting/index.md +++ b/docs/computing/connecting/index.md @@ -67,10 +67,10 @@ terminal program called simply *Terminal*. The instructions for using an [SSH client on macOS and Linux](ssh-unix.md) show how to connect to a CSC supercomputer using the terminal program. -While Windows systems do not have a similar pre-existing solution for connecting -over SSH, there are multiple programs that can be used for this. The -instructions for using an [SSH client on Windows](ssh-windows.md) lists a few -popular options. +Windows comes with the `Command Prompt` terminal program that typically has the OpenSSH +ssh client installed. This client works in a similar fashion to the ssh clients on Linux and macOS. +In addition to this client, Windows has multiple other programs that can be used for connecting over SSH. +The instructions for using an [SSH client on Windows](ssh-windows.md) list a few popular options. Once you have set up SSH keys, added your public key to MyCSC, and signed it to generate an SSH certificate (only required for Roihu), use a command like below @@ -84,8 +84,8 @@ ssh @.csc.fi ``` !!! note - It might take up to one hour for your new key to become active after adding - it to MyCSC. + It might take up to one hour for your new key to become active on Puhti or Mahti after adding + it to MyCSC. Roihu has no such delay since it is based on SSH certificates. Once the SSH connection to the supercomputer is open, you can interact with it by issuing Linux commands using the Bash shell program. An introduction to
An introduction to diff --git a/docs/computing/connecting/ssh-keys.md b/docs/computing/connecting/ssh-keys.md index f62e567adf..0266d2346d 100644 --- a/docs/computing/connecting/ssh-keys.md +++ b/docs/computing/connecting/ssh-keys.md @@ -123,7 +123,10 @@ expires, a new one must be signed following either of the processes below. --- -### Option 1: MyCSC (primary method) +### Option 1: Download from MyCSC + +We recommend trying the MyCSC workflow first, since it should work out-of-the-box on all systems. + 1. Log in to MyCSC with your CSC or Haka/Virtu credentials. 1. Select _Profile_ from the left-hand navigation or the dropdown menu in the @@ -163,8 +166,7 @@ expires, a new one must be signed following either of the processes below. ### Option 2: Certificate helper tool -We recommend trying the MyCSC workflow first, as it is the simplest and most reliable option. -If you later find the process repetitive or want to automate it, you can use the helper tool described below. +To make the process smoother and easier, you can use the helper tool. The certificate helper is a Python tool developed by CSC to simplify the process of signing and downloading an SSH certificate, and adding it to your @@ -177,7 +179,7 @@ following instructions illustrate only basic usage. [Official Python downloads page](https://www.python.org/downloads/). Contact your local IT-support if you need assistance. - If Python for some reason cannot be installed on your computer, fall - back to [Option 1](#option-1-mycsc-primary-method) instead. + back to [Option 1](#option-1-download-from-mycsc) instead. 2. [Download the certificate helper tool here](https://github.com/CSCfi/certificate-helper-tool/raw/refs/heads/main/csc_cert.py) (_Right click_ :material-arrow-right: _Save Link As_), or clone the Git repository: @@ -222,12 +224,12 @@ following instructions illustrate only basic usage. === "Windows" - 6. 
[Depending on the tool you plan to use](ssh-windows.md), select the OpenSSH or Putty key as input for the script. - 7. If you are using Putty keys, install [WinSCP](https://winscp.net/eng/docs/installation). + 6. [Depending on the tool you plan to use](ssh-windows.md), select the OpenSSH or Putty Private Keys (PPK) as input for the script. With OpenSSH keys you can generate certificates for both the OpenSSH client and all Windows SSH applications that use Putty keys. If you only use graphical Windows applications, you can also use Putty keys as input. + 7. If you are using Putty keys, install [WinSCP](https://winscp.net/eng/docs/installation). This tool is used to generate Putty Private Keys (PPK) from the certificate downloaded from MyCSC. * If you install WinSCP without admin rights, you must add `WinSCP.exe` to your Path environment variable. Search for the _Edit environment variables for your account_ settings menu. - 8. Optional, but **strongly recommended** start [SSH agent](ssh-windows.md#authentication-agent) to + 8. Optional, but **strongly recommended**: start one of the [SSH agents](ssh-windows.md#authentication-agent) to automatically add SSH key and certificate to the SSH agent: * Pageant for Putty keys. * Windows `ssh-agent` for OpenSSH keys. @@ -240,7 +242,7 @@ following instructions illustrate only basic usage. # with the path to your OpenSSH public key # (.pub) or PuTTY key (.ppk) - python3 csc_cert.py -u + python csc_cert.py -u ``` * The command above assumes that the path to `csc_cert.py` is in your diff --git a/docs/computing/connecting/ssh-unix.md b/docs/computing/connecting/ssh-unix.md index 58eaa76ac4..1d0eebfcae 100644 --- a/docs/computing/connecting/ssh-unix.md +++ b/docs/computing/connecting/ssh-unix.md @@ -155,7 +155,7 @@ in memory. The program's behavior depends on your system: !!!
warning "Important note if you're not using the certificate helper tool" Users downloading SSH certificates - [manually from MyCSC](ssh-keys.md#option-1-mycsc-primary-method) **must** store it in the + [manually from MyCSC](ssh-keys.md#option-1-download-from-mycsc) **must** store it in the same directory as the SSH private key **and** name it as `-cert.pub` to be able to add it to SSH agent with `ssh-add` command. If successful, `ssh-add` outputs: diff --git a/docs/computing/connecting/ssh-windows.md b/docs/computing/connecting/ssh-windows.md index 4d461481a2..6c48fe2caa 100644 --- a/docs/computing/connecting/ssh-windows.md +++ b/docs/computing/connecting/ssh-windows.md @@ -19,7 +19,7 @@ In Windows, 2 different key types are widely used: CSC provides two options for this: -* Option 1, [manual download of SSH certificate from MyCSC](ssh-keys.md#option-1-mycsc-primary-method) +* Option 1, [manual download of SSH certificate from MyCSC](ssh-keys.md#option-1-download-from-mycsc) * Option 2, the [certificate helper tool](ssh-keys.md#option-2-certificate-helper-tool) So for Roihu, consider also how different tools support updating the SSH certificate: @@ -29,12 +29,12 @@ So for Roihu, consider also how different tools support updating the SSH certifi | MobaXterm, inc SFTP browser | :ok:| :ok: | | Putty | :ok:| :ok: | | PowerShell | :ok:| :ok:| -| [WinSCP](../../data/moving/graphical_transfer.md#winscp-file-transfer-and-more-on-windows) | :ok:| Difficult | -| [FileZilla](../../data/moving/graphical_transfer.md#filezilla-a-general-file-transfer-tool) | Only with PageAnt | Difficult | -| Cyberduck | :ok:| :ok: with OpenSSH key, difficult with Putty key | +| [WinSCP](../../data/moving/graphical_transfer.md#winscp-file-transfer-and-more-on-windows) | Difficult | :ok:| +| [FileZilla](../../data/moving/graphical_transfer.md#filezilla-a-general-file-transfer-tool) | Difficult | Only with PageAnt | +| Cyberduck | :ok: with OpenSSH key, difficult with Putty key | :ok:| -For 
first/little usage, Roihu [web interface](../webinterface/index.md) might be the easiest optoin with login-node and compute-node shells and file transfer. +For initial use and light usage, Roihu's [web interface](../webinterface/index.md) might be the easiest starting option as it provides access to login and compute node shells as well as a [graphical file moving tool](../../data/moving/web-interface.md). ## Generating SSH keys @@ -316,11 +316,12 @@ Puhti, Mahti and LUMI do not use SSH certificates, so adding keys to SSH authent In Roihu, besides SSH keys a SSH certificate is required. If using SSH agent, a new SSH certificate must be added daily. CSC provides two options for this: -* Option 1, [manual download of SSH certificate from MyCSC](ssh-keys.md#option-1-mycsc-primary-method) +* Option 1, [manual download of SSH certificate from MyCSC](ssh-keys.md#option-1-download-from-mycsc) * Option 2, the [certificate helper tool](ssh-keys.md#option-2-certificate-helper-tool) -Start with option 1, as it is the simplest and most reliable to start with. -Option 2 provides a more streamlined process to sign and download the SSH certificates for connecting to Roihu. +Option 1 can be used out-of-the-box, without any additional installations. +Option 2 provides an easier and more streamlined process to sign and download the +SSH certificates for connecting to Roihu, but requires you to download a script and to have Python and WinSCP installed. Importantly, it also automatically adds your SSH keys and certificate to **Windows ssh-agent** and/or **Pageant**. The script does not update MobAgent, so using Pageant is recommended for MobaXterm-users. 
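The naming rule from the `ssh-unix.md` note in this patch (a manually downloaded certificate must sit in the same directory as the private key and carry the `-cert.pub` suffix) can be sketched as below. This is a minimal sketch: the helper name `cert_path_for_key` and the key path `~/.ssh/id_ed25519` are assumptions for illustration, and the script only checks for the file rather than calling `ssh-add` itself.

```shell
#!/bin/bash
# Hypothetical helper: derive the certificate path that belongs to a private key.
# OpenSSH expects the certificate next to the key, with "-cert.pub" appended.
cert_path_for_key() {
    echo "${1}-cert.pub"
}

key="$HOME/.ssh/id_ed25519"            # assumption: path to your private key
cert="$(cert_path_for_key "$key")"

if [ -f "$cert" ]; then
    # Adding the key also picks up the matching certificate in the same directory
    echo "Found ${cert}; run: ssh-add ${key}"
else
    echo "Missing ${cert}; download the certificate from MyCSC and save it under this name"
fi
```

After adding the key, `ssh-add -l` lists the identities held by the agent, which is a quick way to confirm that the certificate was loaded.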
From 299c0f3cc1cefe2dd5b48106fd4f7a28caed1374 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Sampo=20Sillanp=C3=A4=C3=A4?=
Date: Fri, 24 Apr 2026 12:26:27 +0300
Subject: [PATCH 080/139] Updated Running jobs -> Getting started

---
 docs/computing/running/getting-started.md | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/docs/computing/running/getting-started.md b/docs/computing/running/getting-started.md
index 4ab50a33d0..f03eca259e 100644
--- a/docs/computing/running/getting-started.md
+++ b/docs/computing/running/getting-started.md
@@ -47,12 +47,14 @@ the **Slurm** batch job system at CSC.
 To get started with running your application on CSC supercomputers:
 
 1. [Available batch job partitions](batch-job-partitions.md)
-2. [Creating a batch job script for Puhti](creating-job-scripts-puhti.md)
-3. [Creating a batch job script for Mahti](creating-job-scripts-mahti.md)
-4. [Submit a batch job](submitting-jobs.md)
-5. [Performance checklist](performance-checklist.md)
+1. [Creating a batch job script for Roihu](creating-job-scripts-roihu.md)
+1. [Creating a batch job script for Puhti](creating-job-scripts-puhti.md)
+1. [Creating a batch job script for Mahti](creating-job-scripts-mahti.md)
+1. [Submit a batch job](submitting-jobs.md)
+1. [Performance checklist](performance-checklist.md)
 
 If you are already familiar with Slurm, check out our
+[example batch job scripts for Roihu](example-job-scripts-roihu.md),
 [example batch job scripts for Puhti](example-job-scripts-puhti.md) or
 [example batch job scripts for Mahti](example-job-scripts-mahti.md).
From beee982aefc1c592a3e98ca77fc0fc3732ada349 Mon Sep 17 00:00:00 2001 From: Leopekka Saraste <45951298+leopekkas@users.noreply.github.com> Date: Fri, 24 Apr 2026 14:11:57 +0300 Subject: [PATCH 081/139] WIP Supercomputer storage/disk sections (#2937) * Add updated disk quotas on Roihu * Fix the formatting on tabbed content * Update disk quotas on Roihu * Add Roihu local storage details * Fix header and formatting * Fix header to fix an error on a broken link * Add section about projdata into disk.md * Fix typo, formatting * Move Roihu storage to a separate page for clarity and easier management * Roll back Mahti/Puhti disk examples to prior values * Update roihu-disk.md - Minor fixes - Marked `lue` tool references with "not implemented" - Section "Temporary local disk areas" needs a review and rewrite * Fix link that breaks CI/CD * Added roihu lustre specs to lustre page * roihu scratch performance values added * burst buffer section added * typo fix * fix warning box not rendering * Add single line fix to retrigger rahti build --------- Co-authored-by: Juha Lento Co-authored-by: Henry Barton --- docs/computing/disk.md | 19 +- docs/computing/lustre.md | 29 +-- docs/computing/roihu-disk.md | 304 ++++++++++++++++++++++++++++++++ docs/computing/systems-roihu.md | 21 ++- mkdocs.yml | 1 + 5 files changed, 351 insertions(+), 23 deletions(-) create mode 100644 docs/computing/roihu-disk.md diff --git a/docs/computing/disk.md b/docs/computing/disk.md index e0beee78d7..c69863e595 100644 --- a/docs/computing/disk.md +++ b/docs/computing/disk.md @@ -1,5 +1,9 @@ # Disk areas +!!! warning "Roihu documentation on a separate docs page" + This page contains storage information on Puhti and Mahti. + For information on Roihu's storage, see: [Roihu storage](roihu-disk.md) + CSC supercomputers have three main disk areas: **home**, **projappl** and **scratch**. 
In addition to these disk areas visible to all compute and login nodes, each node has a **local temporary disk area** that is visible to the particular compute node during a batch @@ -45,7 +49,7 @@ These disk areas have quotas for both the amount of data and total number of fil ## Home directory -Each user has a home directory (`$HOME`) that can contain up to 10 GB of data. +Each user on Mahti and Puhti has a home directory (`$HOME`) that can contain up to 10 GB of data. The home directory is the default directory where you begin after logging in to CSC supercomputers. However, typically you should change to your project's `scratch` directory when working because @@ -59,7 +63,8 @@ are project-specific. If you are a member of several projects, you also have acc ## Scratch directory -Each project has by default 1 TB of scratch disk space in the directory `/scratch/`. +Each project on Mahti and Puhti has, by default, 1 TiB of scratch disk space in the +directory `/scratch/`. This fast parallel scratch space is intended as temporary storage space for the data that is used in the supercomputer. The scratch directory is not intended @@ -76,8 +81,8 @@ manage your data on `scratch`](../support/tutorials/clean-up-data.md). ## Projappl directory -Each project has also a 50 GB project application disk space in the directory -`/projappl/`. +Each project on Mahti and Puhti also has 50 GB project application disk space +in the directory `/projappl/`. It is intended for storing compiled software binaries, source code, libraries, scripts and small-scale reference data that are shared within a project. It is not a @@ -106,6 +111,7 @@ cycle and which to the 180 day `scratch` cleaning cycle. 
For example, if you are a member in two projects, with unix groups `project_2000123` and `project_2001234`, then you have access to two `scratch` and `projappl` directories: + ```text [kkayttaj@puhti-login11 ~]$ csc-workspaces @@ -206,7 +212,7 @@ are only needed within a single login- or compute node. ### Login nodes Each of the login nodes have 2900 GiB of fast local storage. The storage is located under -`$TMPDIR` and is separate for each login node. +`$TMPDIR` and is separate for each login node. The local storage is good for compiling applications and performing pre- and post-processing that require heavy I/O operations, for example packing and unpacking archive files. @@ -215,6 +221,7 @@ that require heavy I/O operations, for example packing and unpacking archive fil The local storage is meant for **temporary** storage and is cleaned frequently. Remember to move your data to a shared disk area after completing your task. + ### Compute nodes with local SSD (NVMe) disks Jobs running in the I/O- and GPU-nodes in Puhti and Mahti have local fast storage @@ -249,4 +256,4 @@ this usually does no other harm. The plus side is that if it works, it should be However, in Puhti, as well as Mahti `small`, `interactive` and GPU partitions, where applications from multiple users can share the same node, running out of memory by filling up `/dev/shm` will -crash other users applications, too! **In these cases it is not recommended to use `/dev/shm` at all.** +crash other users applications, too! **In these cases it is not recommended to use `/dev/shm` at all.** \ No newline at end of file diff --git a/docs/computing/lustre.md b/docs/computing/lustre.md index 1c66e9fb68..82c6209727 100644 --- a/docs/computing/lustre.md +++ b/docs/computing/lustre.md @@ -156,27 +156,29 @@ lmm_stripe_offset: 6 * setup a directory with the desired configuration and copy (not move) the file into the directory. 
-## Differences between Puhti and Mahti +## Differences between Roihu, Puhti and Mahti -Puhti and Mahti have similar storage areas +Roihu, Puhti, and Mahti have similar storage areas [home](disk.md#home-directory), [project](disk.md#projappl-directory) and [scratch](disk.md#scratch-directory), however, their the Lustre configuration is not the same. -| Name | Puhti | | Mahti | | -|-------------|--------|--------|--------|--------| -|**Disk area** | **# OSTs** | **# MDTs** | **# OSTs** | **# MDTs** | -| home | 24 | 4 | 8 | 1 | -| projappl | 24 | 4 | 8 | 1 | -| scratch | 24 | 4 | 24 | 2 | +| Name | Roihu | | Puhti | | Mahti | | +|--------------|------------|------------|------------|------------|------------|------------| +|**Disk area** | **# OSTs** | **# MDTs** | **# OSTs** | **# MDTs** | **# OSTs** | **# MDTs** | +| home | 4 | 4 | 24 | 4 | 8 | 1 | +| projappl | 4 | 4 | 24 | 4 | 8 | 1 | +| scratch | 24 | 4 | 24 | 4 | 24 | 2 | -One main difference is that for Mahti there are separate MDTs between -`scratch`, `home`, and `project`, thus the metadata performance does +One main difference between the systems is the separation between the storage areas. +For Mahti there are separate MDTs between +`scratch`, `home`, and `projappl`, thus the metadata performance does not interfere from the different file systems. Moreover, the `scratch` on Mahti can have better performance than the other storage areas if your -application and the data size is big enough because of more OSTs and MDTs. On -Puhti, all the OSTs and MDTs are shared across the storage areas, thus the +application and the data size is big enough because of more OSTs and MDTs. +Similarly, on Roihu `scratch` is separate from `home` and `projappl`. On +Puhti, however, all the OSTs and MDTs are shared across the storage areas, thus the performance should be similar between them.
The peak I/O performance for Mahti is around to 100 GB/sec for write @@ -187,6 +189,9 @@ significant I/O, then you will not achieve 1.5 GB/sec, including also that maybe the I/O pattern of an application is not efficient. The corresponding performance for Puhti is half of that of Mahti. +Peak performance on Roihu `scratch` is improved over previous systems, boasting +peak I/O values of 219 GB/sec for write and 180 GB/sec for read. + ## Best practices * If possible, avoid using `ls -l` as the information on ownership and diff --git a/docs/computing/roihu-disk.md b/docs/computing/roihu-disk.md new file mode 100644 index 0000000000..c1da0dab04 --- /dev/null +++ b/docs/computing/roihu-disk.md @@ -0,0 +1,304 @@ +# Roihu disk areas + +Roihu provides three main shared disk areas: **home**, **projappl**, and **scratch**. +In addition, each compute node provides a local temporary disk area that +is available only during a job or interactive session on that node. +Please familiarize yourself with the areas and their specific +purposes. + +Roihu users can also apply for separate dataset projects. +These provide access to a dedicated disk area, **projdata**, intended for sharing +datasets between multiple projects. Unlike computational projects, dataset projects +do not include `scratch` or `projappl` directories. + +These directories are shared across the login and compute nodes on the system, and are based on the Lustre filesystem. +See [a more technical description of the Lustre filesystem on CSC supercomputers](lustre.md). + +!!! warning "CSC does not backup your data!" + None of the disk areas are automatically backed up by CSC! + Deleted files **cannot be recovered**. To avoid unintended data loss, make sure + to perform regular backups to, for example, [Allas](../data/Allas/index.md). See also the + [allas-backup tool](../data/Allas/using_allas/a_backup.md). 
+ +| |Owner |Environment variable|Path |Cleaning |Automatic backup| +|-------------|--------|--------------------|----------------------|---------------------|----------------| +|**home** |Personal|`${HOME}` |`/users/` |No |No | +|**projappl** |Project |Not defined |`/projappl/` |No |No | +|**scratch** |Project |Not defined |`/scratch/` |180 days |No | +|**projdata** |Project |Not defined |`/projdata/` |No |No | + +These disk areas have quotas for both the amount of data and total number of files: + +| |Capacity|Number of files|Notes | +|------------|--------|---------------|------------------------------| +|**home** |15 GiB |150 000 files | | +|**projappl**|15 GiB |150 000 files | | +|**scratch** |250 GiB |500 000 files | | +|**projdata**|0 GiB |0 files |Must be applied for separately| + +!!! info "LUE (not implemented 2026-04-22)" + To easily check the amount of data and number of files within a given folder on + the parallel file system, please consider using the [LUE](../support/tutorials/lue.md) + tool. This tool is significantly faster than tools like `stat` or `du` and causes + much less load on the file system. + +!!! info "Quotas and cleaning" + While it is possible to [apply for increased quotas](#increasing-quotas), we + recommend that you always first ensure that the data you have stored on the + shared file system is really needed and in active use. Unused data should be + deleted or moved to e.g. [Allas](../data/Allas/index.md). A general tutorial on [managing + and cleaning data on Puhti and Mahti disks](../support/tutorials/clean-up-data.md) + is also available. + +## Home directory + +Each user has a home directory (`$HOME`) that can contain up to 15 GB of data on Roihu. + +The home directory is the default location after logging in. +However, it is not intended for data analysis or running jobs. +Its purpose is to store configuration files and other minor personal data. 
+Keep an eye on the remaining quota in your home directory, as +a home directory that exceeds its capacity can cause various account problems. + +The home directory is the only user-specific directory in supercomputers. All other directories +are project-specific. If you are a member of several projects, you also have access to several +`scratch` or `projappl` directories, but still have only one home directory. + +For all computing work, you should use your project's `scratch` directory. + +## Scratch directory + +Each project on Roihu has, by default, 250 GiB of scratch disk space in the +directory `/scratch/`. + +The scratch directory is a fast parallel filesystem intended for temporary storage of +data used in computation, and should contain, for example, the input and output files of your +programs. +You should aim to run your jobs on the supercomputer in this `scratch` directory. + +The scratch directory is **not intended for long-term storage**. Files that have not +been accessed for a long time may be automatically removed to free up space. +The current policy on Roihu is to remove files that have not been accessed for +more than 180 days. + +Make sure to consult our tutorial for [tips and guidelines on how to +manage your data on `scratch`](../support/tutorials/clean-up-data.md). + +## Projappl directory + +Each project on Roihu also has 15 GB of project application disk space +in the directory `/projappl/`. + +Use the projappl area for storing compiled software binaries, source code, libraries, scripts +and small-scale reference data that are shared within a project. It is not a +personal storage space, as it is shared with all members of a project. +Files in projappl are not automatically removed, but the quota is limited. + +Please do not submit jobs from or write +large-scale data to your project's `projappl` directory, but use `scratch` +instead for this purpose. Note that any self-installed applications you run +can and should still be stored in `projappl`.
+ + +## Using scratch and projappl directories + +An overview of your directories in the supercomputer you are currently logged on can be +displayed with: + +```bash +csc-workspaces +``` + +The above command displays all `scratch` and `projappl` directories you have access to. +It also displays which of your projects are subject to the 180 day `scratch` cleaning cycle. + +For example, if you are a member in two projects, with unix groups `project_2000123` +and `project_2001234`, then you have access to two `scratch` and `projappl` directories: + +```text +[kkayttaj@roihu-login11 ~]$ csc-workspaces + +Disk area Capacity(used/max) Files(used/max) Cleanup +---------------------------------------------------------------------- +Personal home folder + +/users/kkayttaj 4.4G/15G 24K/150K n/a +---------------------------------------------------------------------- +Project: project_2000123 "Project X" + +/projappl/project_2000123 24G/15G 36K/150K n/a +/scratch/project_2000123 103G/250G 389K/500k 180d +---------------------------------------------------------------------- +Project: project_2001234 "Project Y" + +/projappl/project_2001234 25G/100G 282K/1.0M n/a +/scratch/project_2001234 7.2/10TB 2.1M/2.5M 180d +---------------------------------------------------------------------- +``` + +Moving to the scratch directory of `project_2000123`: + +```bash +cd /scratch/project_2000123 +``` + +Note that not all CSC projects have Roihu access, so you may not +necessarily find a `scratch` or `projappl` directory for all your CSC projects. + +!!! Note + The `scratch` and `projappl` directories are shared by all the members of the + project. All new files and directories are also fully accessible for other + group members (including read, write and execution permissions) by default. + +If you need to restrict access from your group members, you can reset the permissions +with the `chmod` command as usual. 
In general, we recommend that you allow the group +members the access, but use a subdirectory with your username for your data, for example + +``` +/scratch/project_2000123/$USER +``` + +This way the data is accessible to other group members in case of long vacations, etc, +but the ownership is still clear and organized. Note, some programs change the file permissions +from the defaults, which may restrict the access from group members. + +As mentioned earlier, the `scratch` directory is only intended for processing data. +Any data that should be preserved for a longer time should be copied to the *Allas* +object storage server. Instructions for backing up files from CSC supercomputers to +Allas can be found in the [Allas guide](../data/Allas/index.md). + +## Projdata directory + +Roihu users can apply for separate dataset projects, which provide access to shared disk +area under `/projdata/`, but no computational resources. + +Unlike normal computational projects, dataset projects do not include scratch or projappl +directories. Instead, they are designed specifically for sharing data between multiple +projects. + +Write access to a projdata directory is restricted to a single project, while multiple +other projects can be granted read access to this disk area. + +!!! note + Dataset projects are intended for data sharing and active use, not long-term storage.
+ For long-term storage, consider using [Allas](../data/Allas/index.md). + +## Moving data between supercomputers + +Data can be moved directly between supercomputers using the +[rsync](../data/moving/rsync.md) command. + +See our [data migration guide](../support/tutorials/roihu-data.md) for migrating data +from Puhti/Mahti to Roihu. + +## Increasing quotas + +You can use the **MyCSC portal** to [manage quotas of the `scratch` and `projappl` +directories](../accounts/how-to-increase-disk-quotas.md). + +Remember that even after the quota is increased, the planned automatic cleaning process +will continue removing idle files from the `scratch` directory. Data that is not in +active use should be stored in the Allas storage service. + +Quota increases are limited. If your workflow requires storing very large +numbers of files (e.g. millions), you should reconsider your data workflow, +as this can lead to performance issues on the whole filesystem. + +!!! info + To find out how much data/files you have on the disk, please use our [LUE + tool](../support/tutorials/lue.md) (not implemented as of 2026-04-22), which is much more performant than standard + tools such as `stat` or `du`. + +## Temporary local disk areas + +Roihu compute nodes provide fast local disk storage that can significantly improve +performance for I/O-intensive workloads. + +This storage is available via the environment variable `$TMPDIR`, which many +applications use automatically for temporary files. + +Local disk is node-specific and available on the login nodes, as well as in a job +or interactive session. It is intended for temporary files that do not need to be +shared between nodes. + +### Login nodes + +Each login node on both Roihu-CPU and Roihu-GPU provides 80 GB of local storage under `$TMPDIR`. + +The local storage is intended for compiling applications and performing pre- and post-processing +that requires heavy I/O operations, for example packing and unpacking archive files. + +!!!
Note + The local storage is meant for **temporary** storage and is cleaned frequently. + Remember to move your data to a shared disk area after completing your task. + +### Compute nodes + +All compute nodes in Roihu provide fast NVMe local storage. + +These local disk areas are designed to support I/O-intensive computing tasks and cases where you +need to process large numbers (over 100 000) of small files. + +Data in local storage is removed when the job finishes. You must copy any results you want to +keep to `scratch` or Allas before the job ends. + +Based on your [Slurm job reservation](running/batch-job-partitions.md) type, you will have access +to the following amount of local disk space: + +| Allocation type | Quota per user | +|:--------------------------|---------------:| +| R (Shared nodes) | 20 GiB | +| N (Full nodes) | 600 GiB | +| G (GPU nodes) | 150 GiB | +| XL (Hugemem nodes) | 1.6 TiB | +| VIZ (Visualization nodes) | 6.5 TiB | + +The disk space can be accessed under `$TMPDIR`, and does not need to be separately reserved in +your job script to be usable. Using the local disk does not consume [billing units](../accounts/billing.md). + +## Disaggregated storage + +It is also possible to request local disk mounts from a centralised pool of fast storage resources. +This fast storage capacity is provided over the network and will appear as local scratch from +within a Slurm job. The total capacity of the disaggregated NVMe resource is 307.2 TB, allowing you +to get larger-capacity fast storage for your jobs. + +### Requesting storage from Slurm + +!!! Note + At present, you can only request this storage for jobs that make use of full nodes, + i.e. that are submitted with the `--exclusive` flag. Support for shared node jobs is coming + at a later date.
+ +To request flash storage to be mounted in an sbatch job you must add the following to the resource +request block of your script: + +```bash + +#BB_LUA SBF storagesize=20GB path=/run/sbb/ +``` + +Where `storagesize` specifies the amount of storage you need and `path` the location that the +storage will be mounted. + +You can also request resources directly on the command line with the `--bb` flag: + +```bash +srun -p small --exclusive --nodes 1 --mem 20G --account --bb="#BB_LUA SBF storagesize=10G path=/run/sbb/" --pty bash -i +``` + +Alternatively you can pass the request in a file using the `--bbf` flag, for example: + +```bash +srun -p small --exclusive --nodes 1 --mem 20G --account project_2001659 --bbf bb.spec --pty bash -i +``` + +!!! warning "Steps must use `srun`!" + When running a multinode job with sbatch, if each step is expected to run with the disaggregated + disk, then the steps must be started with srun. Otherwise, only the compute node that runs the + sbatch script will be able to use the storage. + + +!!! warning "Remember to move your data!" + Move any data you need off the flash storage before your job completes, i.e. within + your sbatch script. diff --git a/docs/computing/systems-roihu.md b/docs/computing/systems-roihu.md index 99e75f128e..be16fe0ba7 100644 --- a/docs/computing/systems-roihu.md +++ b/docs/computing/systems-roihu.md @@ -108,8 +108,19 @@ billing model. Each Roihu CPU and GPU node will have a small 960 GB local disk suitable for storing temporary files during jobs. High-performance local storage will be -available on the high-memory and visualization nodes, each of which will -include 2 x 7.68 TB fast NVMe disks. +available on the high-memory (XL) and visualization (VIZ) nodes, where each +node will include a total of 13 TiB of fast NVMe disks. 
+ +The available storage quota that a single user can access in their jobs depends +on the system [partition](running/batch-job-partitions.md) they use: + +| Allocation type | Quota per user | +|:-------------------|---------------:| +| R (shared nodes) | 20 GiB | +| N (full nodes) | 600 GiB | +| G (GPU nodes) | 150 GiB | +| Hugemem (XL) nodes | 1,6 TiB | +| VIZ nodes | 6,5 TiB | As a new feature, users will also be able to request local disk mounts from a centralized pool of fast storage resources. This fast storage capacity will be @@ -133,10 +144,10 @@ recompiled on Roihu. More information will be included in the migration guide. The programming environment of Roihu will otherwise be similar to Mahti, including e.g. -* GNU compiler stack -* AOCC compiler stack +* The GNU compiler stack +* The AOCC compiler stack * CUDA and Nvidia HPC Software Development Kit (SDK) -* OpenMPI as main MPI library +* OpenMPI as the main MPI library Like Puhti and Mahti, Roihu will also feature a web interface for easy-to-use interactive access and running graphical user interfaces. 
diff --git a/mkdocs.yml b/mkdocs.yml index f9736dffd0..4dea3d144c 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -309,6 +309,7 @@ nav: - SSH client on Windows: computing/connecting/ssh-windows.md - Supercomputer storage: - computing/disk.md + - Roihu disk areas: computing/roihu-disk.md - Lustre filesystem: computing/lustre.md - Module environment: computing/modules.md - Running jobs: From 578d4951c3f3087577abb170e60b065c9c56218c Mon Sep 17 00:00:00 2001 From: Tuomas Rossi Date: Fri, 24 Apr 2026 14:35:24 +0300 Subject: [PATCH 082/139] Roihu docs for nvhpc (#2954) * Add compilation instructions for nvhpc * Add more details about compiler environments * Add general MPI+CUDA compilation instructions --------- Co-authored-by: Sami Ilvonen Co-authored-by: leopekkas --- docs/computing/compiling-roihu.md | 153 +++++++++++++++++++++++------- 1 file changed, 118 insertions(+), 35 deletions(-) diff --git a/docs/computing/compiling-roihu.md b/docs/computing/compiling-roihu.md index f63e20280a..2b98ca801e 100644 --- a/docs/computing/compiling-roihu.md +++ b/docs/computing/compiling-roihu.md @@ -125,44 +125,59 @@ mpicc -O3 -march=native -fopenmp example.c -o example Binaries compiled on Roihu-CPU are not compatible with Roihu-GPU nodes. -Roihu-GPU provides two compiler environments for building C/C++ and Fortran applications: -the [GNU](https://gcc.gnu.org) suite and the [NVIDIA-HPC](https://developer.nvidia.com/hpc-compilers) -suite. GNU compilers are loaded by default. 
NVIDIA compilers can be -loaded using the [Module system](modules.md) with the command: -``` -module load nvhpc -``` +Roihu-GPU provides [GNU](https://gcc.gnu.org) and [NVIDIA-HPC](https://developer.nvidia.com/hpc-compilers) +compiler environments for building C/C++ and Fortran applications under the following [modules](modules.md): -The compiler executables are as follows: +| Compiler suite | Modules | +| :--------------------------- | :------------------------------------------------------- | +| GNU 14.3.0 + CUDA 12.9.1 | `gcc/14.3.0 cuda/12.9.1 openmpi/5.0.8 openblas/0.3.30` | +| GNU 15.2.0 + CUDA 13.1.1 | `gcc/15.2.0 cuda/13.1.1 openmpi/5.0.8 openblas/0.3.30` | +| NVIDIA HPC 26.3 | `nvhpc/26.3` | -| Compiler suite | C | C++ | Fortran | -| :------------- | :- | :-- | :------ | -| GNU | gcc | g++ | gfortran | -| NVIDIA | nvc | nvc++ | nvfortran | +The first compiler suite is loaded by default. +You can change the environment by loading the listed modules, for example, + +```bash +module load nvhpc/26.3 +``` +!!! info "About the `nvhpc` module" + Note that the `nvhpc` module includes CUDA, MPI, and BLAS implementations, + so you don't need to load these modules separately when using the `nvhpc` module. + For this reason, the `module load` might note you about inactive modules. -In addition, the CUDA [`nvcc`](https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html) compiler is available for building GPU kernel code. See the [CUDA section below](#compiling-cuda-code). 
+ To avoid leaving inactive modules, you can purge modules before loading the environment: + ```bash + module purge + module load nvhpc/26.3 + ``` List all available versions of the compiler suites: -``` +```bash module spider gcc +module spider cuda module spider nvhpc ``` +The compiler executables are as follows: + +| Compiler suite | C | C++ | Fortran | +| :------------- | :-- | :---- | :-------- | +| GNU | gcc | g++ | gfortran | +| NVIDIA HPC | nvc | nvc++ | nvfortran | + -### Compiling CUDA code +In addition, the CUDA [`nvcc`](https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html) compiler is available for building GPU kernel code. See the [CUDA section below](#compiling-cuda-applications). -CUDA is the recommended programming model for Nvidia GPUs and is provided as an environment module -on Roihu-GPU (loaded by default). + +### Compiling CUDA applications + +CUDA is the recommended programming model for Nvidia GPUs and it is +provided in `cuda` and `nvhpc` modules. The CUDA compiler (`nvcc`) takes care of compiling CUDA kernels code for the target GPU device and passes the rest to the currently loaded host compiler like `gcc` or `nvhpc`. -For example, to load the CUDA 13.1 environment together with the GNU compiler: - -```bash -module load gcc/15.2.0 cuda/13.1.1 -``` To generate code for a given target device, tell the CUDA compiler what compute capability the target device supports. On Roihu, the @@ -183,25 +198,93 @@ nvcc -gencode arch=compute_90a,code=sm_90a example.cu more generic `arch=compute_90,code=sm_90` options. -### Building MPI applications on Roihu-GPU +### Compiling MPI+CUDA applications + +All the provided GNU and NVIDIA compiler environments provide a CUDA-aware MPI library. + +If the structure of the MPI+CUDA application allows, you can build it in parts: + +1. Compile CUDA kernels to object files with `nvcc -c` +2. 
Compile host code to object files with the MPI compiler wrappers that will call the loaded host compiler (`mpicc -c`, `mpicxx -c`, or `mpif90 -c`) +3. Link all the object files with the MPI compiler wrapper (`mpicc`, `mpicxx`, or `mpif90`) + +It is also possible to compile the whole codebase with `nvcc`, but then +we need to provide the necessary MPI compile and link options +to the underlying host compiler called by `nvcc`. +This can be achieved as follows via `-Xcompiler` and `-Xlinker` flags: -In the GNU compiler environment, an OpenMPI module is available that implements -CUDA-aware MPI. It is loaded by default. You may use one of the MPI compiler wrappers `mpicc` (C), -`mpicxx` (C++), or `mpif90` (Fortran) when compiling MPI applications. When compiling -MPI applications with `nvcc`, you will need to explicitly provide MPI include and library -paths: ```bash -nvcc -gencode arch=compute_90a,code=sm_90a example.cu -lmpi -I$OPENMPI_INSTROOT/include -L$OPENMPI_INSTROOT/lib +# Parse MPI options for compiler +Xcompiler="-Xcompiler $(mpicxx --showme | tr ' ' '\n' | sed '/^-Wl,/d;1d' | paste -sd, -)" + +# Parse MPI options for linker +Xlinker="-Xlinker $(mpicxx --showme | tr ' ' '\n' | sed -n 's/^-Wl,//p' | paste -sd, -)" + +# Compile MPI code using nvcc +nvcc -gencode arch=compute_90a,code=sm_90a $Xcompiler $Xlinker mpi_cuda_code.cu ``` -In the NVIDIA compiler environment, the MPI is bundled by NVIDIA and is directly -available after loading the compiler suite. There is no separate MPI module to load. +!!! warning + Remember to load the modules used for compiling also when running the application + to ensure that the correct MPI library is used during the runtime. + + +### Compiling application using OpenMP offload, OpenACC, and C++ standard parallelism !!! warning - The NVHPC environment on Roihu is still undergoing configuration. - The current version may have issues with, for example, its Slurm integration - on Roihu. 
For now, we strongly recommend using the GNU compiler suite when - building MPI applications." + It is recommended to use the NVIDIA HPC compilers for + compiling codes using OpenMP offload, OpenACC, and C++ standard parallelism. + +Start by loading NVIDIA HPC compilers: + +```bash +module purge +module load nvhpc/26.3 +``` + +The compiler options for enabling different GPU programming models are as follows: + +| Programming model | Compiler option | +| :---------------- | :--------------------------- | +| OpenMP offload | `-mp=gpu` | +| OpenACC | `-acc=gpu` | +| C++ stdpar | `-stdpar=gpu` (`nvc++` only) | + + +To generate efficient code for the GH200 superchips on Roihu, +specify the target with the following option: +```raw +-gpu=cc90 +``` + +Example compilation commands: + +| Programming model | C | C++ | Fortran | +| :---------------- | :------------------------------------- | :--------------------------------------------- | :--------------------------------------------- | +| OpenMP offload | `nvc -O3 -mp=gpu -gpu=cc90 example.c` | `nvc++ -O3 -mp=gpu -gpu=cc90 example.cpp` | `nvfortran -O3 -mp=gpu -gpu=cc90 example.F90` | +| OpenACC | `nvc -O3 -acc=gpu -gpu=cc90 example.c` | `nvc++ -O3 -acc=gpu -gpu=cc90 example.cpp` | `nvfortran -O3 -acc=gpu -gpu=cc90 example.F90` | +| C++ stdpar | N/A | `nvc++ -O3 -stdpar=gpu -gpu=cc90 example.cpp` | N/A | + + +The compilers support also codes that contain multiple programming models. 
+As an example, compile a C++ code that contains OpenMP offload, OpenACC, and C++ parallel algorithms with: +```bash +nvc++ -O3 -mp=gpu -acc=gpu -stdpar=gpu -gpu=cc90 example.cpp +``` + +### Compiling MPI application using OpenMP offload, OpenACC, and C++ standard parallelism + +The `nvhpc` module is bundled with GPU-aware MPI implementation with +the usual compiler wrappers, and MPI applications can be compiled +like above but replacing `nvc`, `nvc++`, and `nvfortran` with +`mpicc`, `mpicxx`, and `mpif90`, respectively: + +| Programming model | C | C++ | Fortran | +| :---------------- | :--------------------------------------- | :---------------------------------------------- | :------------------------------------------ | +| OpenMP offload | `mpicc -O3 -mp=gpu -gpu=cc90 example.c` | `mpicxx -O3 -mp=gpu -gpu=cc90 example.cpp` | `mpif90 -O3 -mp=gpu -gpu=cc90 example.F90` | +| OpenACC | `mpicc -O3 -acc=gpu -gpu=cc90 example.c` | `mpicxx -O3 -acc=gpu -gpu=cc90 example.cpp` | `mpif90 -O3 -acc=gpu -gpu=cc90 example.F90` | +| C++ stdpar | N/A | `mpicxx -O3 -stdpar=gpu -gpu=cc90 example.cpp` | N/A | + [Grand Challenge project]: https://research.csc.fi/grand-challenge-proposals [LUMI documentation]: https://docs.lumi-supercomputer.eu/runjobs/scheduled-jobs/partitions/ diff --git a/docs/computing/running/creating-job-scripts-roihu.md b/docs/computing/running/creating-job-scripts-roihu.md new file mode 100644 index 0000000000..6f6a0237a8 --- /dev/null +++ b/docs/computing/running/creating-job-scripts-roihu.md @@ -0,0 +1,523 @@ +# Creating a batch job script for Roihu + +A batch job script contains the definitions of the resources to be reserved for +a job and the commands the user wants to run. + +[TOC] + +## The structure of a batch job script + +An example of a batch job script using a share of resources on a single node: + +```bash +#!/bin/bash +#SBATCH --job-name=my-test # Job name +#SBATCH --account= # Billing project, has to be defined! 
+#SBATCH --partition=small # Job partition (queue) +#SBATCH --time=00:30:00 # Max. duration of the job +#SBATCH --nodes=1 # Number of nodes used for the job +#SBATCH --ntasks=1 # Number of tasks allocated +#SBATCH --cpus-per-task=1 # Number of CPU cores allocated per task +#SBATCH --mem-per-cpu=1000M # Memory to reserve per CPU core +#SBATCH --output=slurm-%j.out # Standard output of the job script +#SBATCH --hint=nomultithread # Allocate physical cores only, avoid simultaneous multithreading +##SBATCH --mail-type=BEGIN # Uncomment to enable mail + +module load myprog/1.2.3 # Load required modules + +srun myprog -i input -o output # Run program using requested resources +``` + +The first line `#!/bin/bash` tells that the file should be interpreted as a +Bash script. + +The lines starting with `#SBATCH` are arguments (directives) for the batch job +system. These examples only use a small subset of the options. For a list of +all possible options, see the +[Slurm documentation](https://slurm.schedmd.com/sbatch.html). + +The general syntax of an `#SBATCH` option is: + +```bash +#SBATCH --option-name=argument +``` + +The first line in our example, sets the name of the job: + +```bash +#SBATCH --job-name=my-test +``` + +The name of the job will be *my-test*. It can be used to identify a job in the +queue and other listings. + + +The billing project for the job is set with option `--account`: + +```bash +#SBATCH --account= +``` + +Please replace `` with the Unix +group of your project. You can find it in [MyCSC](https://my.csc.fi) under the +*Projects* tab. [More information about billing](../../accounts/billing.md). + +!!! warning "Remember to specify the billing project" + The billing project argument is mandatory. 
Failing to + set it will cause an error: + + ```text + sbatch: error: AssocMaxSubmitJobLimit + sbatch: error: Batch job submission failed: Job violates accounting/QOS policy (job submit limit, user's size and/or time limits) + ``` + +The partition (queue) needs to be set according to the job requirements. For +example: + +```bash +#SBATCH --partition=small +``` + +!!! info "Available partitions" + [See the available batch job partitions](batch-job-partitions.md). + +The runtime reservation is set with option `--time`: + +```bash +#SBATCH --time=00:30:00 +``` + +Time is provided using the format `hours:minutes:seconds` (optionally `days-hours:minutes:seconds`). +The maximum runtime depends on the selected queue. **When the +time reservation ends, the job is terminated regardless of whether it has +finished or not**, so the time reservations should be sufficiently long. Note +that a job consumes Billing Units according to its actual runtime. + +The number of nodes used for the job can be set with option `--nodes`: + +```bash +#SBATCH --nodes=1 +``` + +This does **not** mean that all the resources of the node(s) would be reserved, but that +all the tasks and CPU cores will be allocated from a single node in this case. In general, +here one could give a range `--nodes=-` to define the spread of nodes +across which the resources would be allocated. + +The number of tasks allocated for the job can be set with option `--ntasks`: + +```bash +#SBATCH --ntasks=1 +``` + +The allocated tasks can be used in different ways in the job script, most typically as +MPI processes. + +The number of CPU cores per task allocated for the job can be set with option `--cpus-per-task`: + +```bash +#SBATCH --cpus-per-task=1 +``` + +The product of `--ntasks` and `--cpus-per-task` defines the total number of CPU cores +allocated for the job.
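
As a minimal sketch of this arithmetic, the total core count can be computed in the shell. The values below are hypothetical and chosen only for illustration; they do not come from the example script above.

```bash
# Hypothetical values mirroring "--ntasks=4" and "--cpus-per-task=2"
ntasks=4
cpus_per_task=2

# Total CPU cores allocated = tasks x cores per task
total_cores=$((ntasks * cpus_per_task))
echo "Total CPU cores: ${total_cores}"
```

Inside a running job, Slurm normally exposes the requested values as the environment variables `SLURM_NTASKS` and `SLURM_CPUS_PER_TASK`, so the same product could be formed from those instead of hard-coded numbers.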
+ +The amount of memory reserved for each CPU core is set with option `--mem-per-cpu`: + +```bash +#SBATCH --mem-per-cpu=1000M +``` + +If the program exceeds the reserved amount of memory, the job is terminated. + +The standard output file of the job script is set with option `--output`: + +```bash +#SBATCH --output=slurm-%j.out +``` + +The standard output means all the prints that would be visible in the shell if +the commands listed in the script were executed in an interactive shell. +Here `%j` is a replacement symbol for jobid, so the output will go to the file `slurm-.out`. +By default, this file collects also the standard error, but it is possible +to specify a different file for standard error with `--error=`. + +The allocation of CPU cores vs hardware threads is controlled with the option: + +```bash +#SBATCH --hint=nomultithread +``` + +Use this option always by default and change it only if you are absolutely sure that it is beneficial. + +!!! info "About `--hint=nomultithread`" + The default behavior regarding this setting is likely to change. + +The user can be notified by email when the job *starts* by using the +`--mail-type` option + +```bash +##SBATCH --mail-type=BEGIN # Uncomment to enable mail +``` + +Other useful arguments (multiple arguments are separated by a comma) are `END` +and `FAIL`. By default, the email will be sent to the email address linked to +your CSC account. This can be overridden with the `--mail-user=` option. + +After defining all required resources in the batch job script, set up the +required environment by loading suitable modules. Note that for modules to be +available for batch jobs, they need to be loaded in the batch job script. +[More information about environment modules](../modules.md). 
+ +```bash +module load myprog/1.2.3 +``` + +Finally, we launch our application using the requested resources with the +`srun` command: + +```bash +srun myprog -i input -o output +``` + +## Serial and shared memory batch jobs + +Serial and shared memory jobs need to be run within one compute node. Thus, the +jobs are limited by the hardware specifications available in the nodes. +See the available node types and the number of cores available per node +on [this page](../systems-roihu.md). + +The `#SBATCH` option `--cpus-per-task` is used to define the number of +computing cores that the batch job task uses. The option `--nodes=1` ensures +that all the reserved cores are located in the same node, and `--ntasks=1` +assigns all reserved computing cores for the same task. + +In thread-based jobs, the `--mem` option is recommended for memory reservation. +This option defines the amount of memory required *per node*. Note that if you +use `--mem-per-cpu` option instead, the total memory request of the job will be +the memory requested per CPU core (`--mem-per-cpu`) multiplied by the number of +reserved cores (`--cpus-per-task`). **Thus, if you modify the number of cores, +also check that the memory reservation is appropriate.** + +Typically, the most efficient practice is to match the number of reserved cores +(`--cpus-per-task`) to the number of threads or processes the application uses. +However, always [check the application-specific details](../../apps/index.md). + +If the application has a command-line option to set the number of +threads/processes/cores, it should always be used to ensure that the software +behaves as expected. Some applications use only one core by default, even if +more are reserved. + +Other applications may try to use all cores in the node, even if only some +are reserved. The environment variable `$SLURM_CPUS_PER_TASK`, which stores the +value of `--cpus-per-task`, can be used instead of a number when specifying the +amount of cores to use. 
This is useful as the command does not need to be +modified if the `--cpus-per-task` is changed later. + +Finally, use the environment variable `OMP_NUM_THREADS` to set the number of +threads the application uses. For example, + +```bash +export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1} +``` + +(note the `:-1` syntax, which sets the number of threads to 1 +if `--cpus-per-task` was not set). + +## MPI-based batch jobs + +In MPI jobs, each task has its own memory allocation. Thus, the tasks can be +distributed over multiple nodes. + +When running jobs on a partial node (`small` partition), set the number of MPI tasks with: + +``` bash +#SBATCH --partition=small +#SBATCH --ntasks= +``` + +When running on full nodes (`medium` and `large` partitions), +it is recommended to **not** use `--ntasks` option, +but instead set `--nodes`, `--ntasks-per-node`, and `--cpus-per-task` instead: + +``` bash +#SBATCH --partition=medium +#SBATCH --nodes= +#SBATCH --ntasks-per-node=384 --cpus-per-task=1 # The product should be 384 +``` + +This ensures predictable distribution and CPU binding of processes within the node, +[see Performance checklist](./performance-checklist.md). + +!!! info "Set both `--ntasks-per-node` and `--cpus-per-task` for full nodes" + It is advisable to set both `--ntasks-per-node` and `--cpus-per-task` + to keep their product as 384 for best performance. + See [notes on undersubscribing full nodes](#undersubscribing-full-nodes-on-roihu-cpu). + +!!! info "Running MPI programs" + - MPI programs **should not** be started with `mpirun` or `mpiexec`. Use + `srun` instead. + - An MPI module has to be loaded in the batch job script for the program + to work properly. + +## Hybrid batch jobs (e.g. MPI+OpenMP) + +In hybrid jobs, each task is allocated several cores. Each task then uses some +parallelization, other than MPI, to do the work. The most common strategy is +for every MPI task to launch multiple threads using OpenMP. 
To request more +cores per MPI task, use the argument `--cpus-per-task`. The default value is +one core per task. + +When running on full nodes, it is recommended to write +`--ntasks-per-node` and `--cpus-per-task` options on +the same `#SBATCH` line for clarity: + +``` bash +#SBATCH --partition=medium +#SBATCH --nodes= +#SBATCH --ntasks-per-node=192 --cpus-per-task=2 # The product should be 384 +#SBATCH --ntasks-per-node=96 --cpus-per-task=4 # The product should be 384 +``` + +The reason is that these options go hand-in-hand in a sense that their product should +always be 384 in order to use all CPU cores available on the node. +You can comment out one of the lines to test the optimal run configuration for +your application, +[see Performance checklist](./performance-checklist.md). + +The optimal ratio between the number of tasks and cores per tasks varies for each +program and job input. Testing is required to find the right combination for your +application. You can find some examples for +[CP2K](../../apps/cp2k.md#performance-notes) and +[NAMD](../../apps/namd.md#performance-considerations). + +!!! info "Threads per task in hybrid MPI+OpenMP jobs" + Set the number of OpenMP threads per MPI task in your batch script using + the `OMP_NUM_THREADS` and `SLURM_CPUS_PER_TASK` environment variables: + + ```bash + export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1} + ``` + +## High-memory jobs + +Roihu-CPU has high-memory CPU nodes with 6 TiB memory. +These nodes are available in the `hugemem` and `hugemem_longrun` partitions. +See the technical details of the nodes on [this page](../systems-roihu.md). + +The use of these nodes is similar to other Roihu-CPU nodes, but note that +these nodes have a different processor. +In particular, these nodes have 128 CPU cores per node in total. 
+This means that **if running on a full node in `hugemem` partitions**, +the product of `--ntasks-per-node` and `--cpus-per-task` should be 128 +for best performance: + + +``` bash +#SBATCH --partition=hugemem +#SBATCH --nodes=1 +#SBATCH --ntasks-per-node=128 --cpus-per-task=1 # The product should be 128 +#SBATCH --ntasks-per-node=64 --cpus-per-task=2 # The product should be 128 +#SBATCH --ntasks-per-node=32 --cpus-per-task=4 # The product should be 128 +``` + +## GPU jobs + +Each Roihu GPU node has four Nvidia GH200 superchips. The GPUs are available in the `gpu*` partitions. +See the technical details of the nodes on [this page](../systems-roihu.md). + +The resource allocation is based on full GH200 GPUs in `gputest`, `gpumedium`, and `gpularge` partitions +and the GPUs can be requested with: + +```bash +#SBATCH --partition=gpumedium +#SBATCH --gres=gpu:gh200: +``` + +Note that the `--gres` reservation is on a per-node basis. There are 4 GPUs per GPU node. + +!!! info "About `gpuinteractive` partition" + The MIGs are not configured yet. + +In `gpuinteractive` partition, the GH200 GPUs are sliced into smaller Multi-Instance GPUs (MIG). +Each MIG here has one XXXth of the compute and memory capacity of a full GH200 GPU. +For each GPU slice you can reserve at most XXX CPU cores and for each GPU slice the job is allocated XXX GiB of CPU memory. +Also note that you can reserve at most one GPU slice per job. The GPU slices are available using the options: + +```bash +#SBATCH --partition=gpuinteractive +#SBATCH --gres=gpu:gh200_xxx:1 +``` + +## GPU visualization jobs + +Roihu has visualization nodes with Nvidia L40 GPUs. These nodes are available in the `vizinteractive` partition. +See the technical details of the nodes on [this page](../systems-roihu.md). + +These nodes can be requested with: + +```bash +#SBATCH --partition=vizinteractive +#SBATCH --gres=gpu:l40: +``` + +Note that the `--gres` reservation is on a per-node basis. There are 2 GPUs per GPU node. 
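
Because `--gres` is counted per node, the total number of GPUs in a multi-node job is the per-node count multiplied by the number of nodes. A minimal sketch, assuming a hypothetical two-node job that requests all four GH200 GPUs on each node (whether a given partition accepts such a request is not confirmed here):

```bash
# Hypothetical request: "--nodes=2" together with "--gres=gpu:gh200:4"
nodes=2
gpus_per_node=4

# --gres applies to every allocated node, so the totals multiply
total_gpus=$((nodes * gpus_per_node))
echo "Total GPUs: ${total_gpus}"
```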
+ + +## Additional resources in batch jobs + +### Local temporary storage + +All nodes on Roihu have local storage space (NVMe) available for jobs. +Using local storage is recommended for I/O-intensive applications, i.e. jobs +that, for example, read and write a lot of small files. +[See more details](../disk.md#temporary-local-disk-areas). + +Local temporary storage is available for every job without extra billing. +Quota is set per user, so available space on a node is independent of +job count or reserved resources: + +- Roihu-CPU shared nodes (`small`, `interactive`, and `test` partitions) have 20 GiB quota +- Roihu-CPU full nodes (`medium` and `large` partitions) have 600 GiB quota +- Roihu-GPU nodes have 150 GiB quota + +Use the environment variable `$TMPDIR` in your batch job scripts to +access the local temporary storage space on each node. For example, to extract a large +dataset package to the local storage: + +```bash +tar xf my-large-dataset.tar.gz -C $TMPDIR +``` + +!!! warning "Remember to recover your data" + The local storage space reserved for your job is emptied after the job has + finished. Thus, if you write data to the local disk during your job, please + remember to move anything you want to preserve to the shared disk area at + the end of your job. Particularly, the commands to move the data must be + given in the batch job script as you cannot access the local storage space + anymore after the batch job has completed. For example, to copy some output + data back to the directory from where the batch job was submitted: + + ```bash + mv $TMPDIR/my-important-output.log $SLURM_SUBMIT_DIR + ``` + +### Fast local scratch storage + +As a new feature on Roihu, it is possible to request local disk mounts from a centralized pool of fast storage resources. +This fast storage capacity is provided over the network and +appears as local scratch from within a Slurm job. + +!!! info "About fast local scratch storage" + These settings are likely to change. 
+ +Request this local storage using the following flag in the batch script: + +```bash +#SBATCH --bb="#BB_LUA SBF storagesize= path=/run/sbb/" +``` + +For example, requesting 100 GiB storage: + +```bash +#SBATCH --bb="#BB_LUA SBF storagesize=100G path=/run/sbb/" +``` + +Then, this storage is available in path `/run/sbb/$USER` during the job script. + + +### Simultaneous multithreading (SMT) on Roihu-CPU + +SMT support can be enabled with the `--hint=multithread` option. +When this option is used, it is important to set `--ntasks-per-node=X` and +`--cpus-per-task=Y` so that `X * Y = 768` on full nodes. Failing to do so will leave some of the +actual physical cores unallocated and performance will be suboptimal. + +### Undersubscribing full nodes on Roihu-CPU + +If an application requires more memory per core than is available +on a full node (2 GB / core), it is also possible to use only a subset of +cores within a node. Also, if the application is memory bound, memory +bandwidth and the application performance can be improved by using +only a single core per NUMA domain or L3 cache (see the +[Roihu technical description](../systems-roihu.md) for details). +Note that billing is, however, always based on full nodes. + +When undersubscribing nodes, one should always set +`--ntasks-per-node=X` and `--cpus-per-task=Y` so that `X * Y = 384`, +even with pure MPI jobs. By default, Slurm scatters MPI tasks +`--cpus-per-task` apart, i.e. with `--cpus-per-task=16` the MPI task +**0** is bound to CPU core **0**, the MPI task **1** is bound to CPU +core **16**, _etc._ Memory bandwidth (and application performance) is +the best when the tasks are executing on maximally scattered cores. As +an example, in order to use 32 GB / core, one can run with only 24 +tasks per node as + +```bash +...
+#SBATCH --ntasks-per-node=24 --cpus-per-task=16 # The product should be 384 + +module load myprog/1.2.3 +export OMP_NUM_THREADS=1 + +srun myprog -i input -o output +``` + +For hybrid applications, one should use the +`OMP_PROC_BIND` OpenMP runtime environment variable for +placing the OpenMP threads. As an example, in order to run +one MPI task per NUMA domain and one OpenMP thread per L3 cache, one +can set + +```bash +... +#SBATCH --ntasks-per-node=8 --cpus-per-task=48 # The product should be 384 + +export OMP_NUM_THREADS=3 +export OMP_PROC_BIND=spread + +module load myprog/1.2.3 + +srun myprog -i input -o output +``` + +### Using the small partition for non-parallel pre- or post-processing +`` + +In many cases, large computing tasks include pre- or post-processing steps that cannot utilize parallel computing. +In these cases it is recommended that, if possible, the task is split into several chained batch jobs and that the non-parallel +processing is executed in the small partition of Roihu. +In the small partition, jobs can reserve just a few cores so that the non-parallel tasks can be executed without wasting resources. + +For example, say that we would like to post-process an _output_ file of a previous job. The post-processing command +`python post-proc.py output` uses only serial computing and requires about 40 minutes and 3 GB of memory. Instead of including the post-processing +in the main job, it is reasonable to execute it as a separate job in the small partition, as in the example below. +Further, by defining `--dependency=afterok:`, the job is allowed to start only when the previously submitted job has finished successfully. +Here the `` is replaced with the ID number of the batch job that produces the _output_ file (you'll get the ID number when you submit the job).
+ +```bash +#!/bin/bash +#SBATCH --job-name=post-process-my-test +#SBATCH --account= +#SBATCH --time=00:50:00 +#SBATCH --partition=small +#SBATCH --nodes=1 +#SBATCH --ntasks=1 +#SBATCH --cpus-per-task=1 +#SBATCH --mem-per-cpu=4G +#SBATCH --dependency=afterok: + +python post-proc.py output +``` + +### Executing large amounts of small non-MPI jobs + +In Roihu, [HyperQueue](../../apps/hyperqueue.md) meta-scheduler +can be used to process large amounts of small non-MPI jobs. + +## More information + +* [Roihu example batch scripts](example-job-scripts-roihu.md) +* [Available batch job partitions](batch-job-partitions.md) +* [Batch job training materials](https://csc-training.github.io/csc-env-eff/part-1/batch-jobs/) +* [Slurm documentation](https://slurm.schedmd.com/documentation.html) diff --git a/docs/computing/running/example-job-scripts-roihu.md b/docs/computing/running/example-job-scripts-roihu.md new file mode 100644 index 0000000000..977d0d34b9 --- /dev/null +++ b/docs/computing/running/example-job-scripts-roihu.md @@ -0,0 +1,344 @@ +# Example batch job scripts for Roihu + +Example job scripts for running different types of programs: + +[TOC] + +!!! note + If you use the scripts (please do!), do not forget to change the resources + (time, tasks etc.) to match your needs and to replace `myprog ` + with the executable (and options) of the program you wish to run as well + as `` with the name of your project. + +## Pilot projects + +During the pilot period, pilot users will have access to separate `pilot` and `gpupilot` partitions. +These partitions allow you to run larger test cases on both Roihu-CPU (up to 200 nodes) and Roihu-GPU (up to 60 nodes). + +See [job time and node limits in the pilot partitions.](batch-job-partitions.md#roihu-pilot-partitions) + +Pilot partitions will provide you with full nodes and may experience long queue times during peak use. 
+Normal partitions (see examples below) are still available during the pilot projects and are +recommended especially for smaller-scale and routine runs. + +### Example pilot project CPU job script (MPI+OpenMP) + +```bash +#!/bin/bash +#SBATCH --job-name=example +#SBATCH --account= +#SBATCH --partition=pilot +#SBATCH --time=00:30:00 +#SBATCH --nodes=2 +#SBATCH --ntasks-per-node=384 --cpus-per-task=1 # The product should be 384 +###SBATCH --ntasks-per-node=192 --cpus-per-task=2 # The product should be 384 +###SBATCH --ntasks-per-node=96 --cpus-per-task=4 # The product should be 384 +#SBATCH --hint=nomultithread +#SBATCH --mem=744G # Ensure we use all available memory on the nodes + +# Set the number of threads based on cpus-per-task +export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1} + +# Place and bind threads to single hardware threads +# Comment the following lines if binding is not desired +export OMP_PLACES=threads +export OMP_PROC_BIND=spread + +# Run the program +srun myprog +``` + +In the above, set the MPI task (`--ntasks-per-node`) and OpenMP thread (`--cpus-per-task`) counts to best +fit your program, while ensuring that their product uses all 384 cores +on each node.
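
As a rough aid for choosing these counts, the sketch below lists task and thread combinations whose product fills a node exactly. The function name `full_node_layouts` and the set of candidate thread counts are illustrative assumptions, not part of Slurm or the Roihu environment; the default of 384 cores per node is taken from the example above.

```bash
#!/bin/bash
# Hypothetical helper (not a Slurm or CSC tool): print the
# --ntasks-per-node / --cpus-per-task pairs whose product equals
# the number of cores on a full node.
full_node_layouts() {
    local cores_per_node=${1:-384}   # 384 cores on a full Roihu-CPU node
    local cpus

    # Candidate thread counts are an arbitrary illustrative selection
    for cpus in 1 2 4 8 16; do
        # Keep only thread counts that divide the node evenly
        if (( cores_per_node % cpus == 0 )); then
            echo "--ntasks-per-node=$(( cores_per_node / cpus )) --cpus-per-task=${cpus}"
        fi
    done
}

full_node_layouts 384
```

Each printed line is one valid layout for the `#SBATCH` options; which one performs best depends on the program and has to be measured.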
+ +### Example pilot project GPU job script + +```bash +#!/bin/bash +#SBATCH --job-name=example +#SBATCH --account= +#SBATCH --partition=gpupilot +#SBATCH --time=00:30:00 +#SBATCH --nodes=4 +#SBATCH --ntasks-per-node=4 --cpus-per-task=72 # The product should be 288 +#SBATCH --gres=gpu:gh200:4 # 4 GPUs per node + +# Set the number of CPU threads based on cpus-per-task +export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1} + +# Place and bind CPU threads to single CPU cores +# Comment the following lines if binding is not desired +export OMP_PLACES=cores +export OMP_PROC_BIND=spread + +# Run the program +srun myprog +``` + +## Serial CPU + +```bash +#!/bin/bash +#SBATCH --job-name=example +#SBATCH --account= +#SBATCH --partition=small +#SBATCH --time=00:30:00 +#SBATCH --nodes=1 +#SBATCH --ntasks=1 +#SBATCH --mem-per-cpu=1000M +#SBATCH --hint=nomultithread + +# Run the program +srun myprog +``` + +## Partial CPU node: MPI + +```bash +#!/bin/bash +#SBATCH --job-name=example +#SBATCH --account= +#SBATCH --partition=small +#SBATCH --time=00:30:00 +#SBATCH --nodes=1 +#SBATCH --ntasks=2 +#SBATCH --mem-per-cpu=1000M +#SBATCH --hint=nomultithread + +# Run the program +srun myprog +``` + +## Partial CPU node: OpenMP + +```bash +#!/bin/bash +#SBATCH --job-name=example +#SBATCH --account= +#SBATCH --partition=small +#SBATCH --time=00:30:00 +#SBATCH --nodes=1 +#SBATCH --ntasks=1 +#SBATCH --cpus-per-task=4 +#SBATCH --mem-per-cpu=1000M +#SBATCH --hint=nomultithread + +# Set the number of threads based on cpus-per-task +export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1} + +# Place and bind threads to single cores +# Comment the following lines if binding is not desired +export OMP_PLACES=cores +export OMP_PROC_BIND=spread + +# Run the program +srun myprog +``` + +## Partial CPU node: MPI+OpenMP + +```bash +#!/bin/bash +#SBATCH --job-name=example +#SBATCH --account= +#SBATCH --partition=small +#SBATCH --time=00:30:00 +#SBATCH --nodes=1 +#SBATCH --ntasks=2 +#SBATCH --cpus-per-task=4 
+#SBATCH --mem-per-cpu=1000M +#SBATCH --hint=nomultithread + +# Set the number of threads based on cpus-per-task +export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1} + +# Place and bind threads to single cores +# Comment the following lines if binding is not desired +export OMP_PLACES=cores +export OMP_PROC_BIND=spread + +# Run the program +srun myprog +``` + +## Partial CPU node: MPI+OpenMP with simultaneous multithreading + +```bash +#!/bin/bash +#SBATCH --job-name=example +#SBATCH --account= +#SBATCH --partition=small +#SBATCH --time=00:30:00 +#SBATCH --ntasks=2 +#SBATCH --cpus-per-task=4 +#SBATCH --hint=multithread + +# Set the number of threads based on cpus-per-task +export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1} + +# Place and bind threads to single hardware threads +# Comment the following lines if binding is not desired +export OMP_PLACES=threads +export OMP_PROC_BIND=spread + +# Run the program +srun myprog +``` + +## Full CPU nodes: MPI + +```bash +#!/bin/bash +#SBATCH --job-name=example +#SBATCH --account= +#SBATCH --partition=medium +##SBATCH --partition=large # uncomment if using 6 or more nodes +#SBATCH --time=00:30:00 +#SBATCH --nodes=2 +#SBATCH --ntasks-per-node=384 --cpus-per-task=1 # The product should be 384 +#SBATCH --hint=nomultithread +#SBATCH --mem=744G + +# Run the program +srun myprog +``` + +## Full CPU nodes: OpenMP + +```bash +#!/bin/bash +#SBATCH --job-name=example +#SBATCH --account= +#SBATCH --partition=medium +##SBATCH --partition=large # uncomment if using 6 or more nodes +#SBATCH --time=00:30:00 +#SBATCH --nodes=2 +#SBATCH --ntasks-per-node=1 --cpus-per-task=384 # The product should be 384 +#SBATCH --hint=nomultithread +#SBATCH --mem=744G + +# Set the number of threads based on cpus-per-task +export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1} + +# Place and bind threads to single cores +# Comment the following lines if binding is not desired +export OMP_PLACES=cores +export OMP_PROC_BIND=spread + +# Run the program +srun myprog +``` 
+ +## Full CPU nodes: MPI+OpenMP + +```bash +#!/bin/bash +#SBATCH --job-name=example +#SBATCH --account= +#SBATCH --partition=medium +##SBATCH --partition=large # uncomment if using 6 or more nodes +#SBATCH --time=00:30:00 +#SBATCH --nodes=2 +#SBATCH --ntasks-per-node=192 --cpus-per-task=2 # The product should be 384 +###SBATCH --ntasks-per-node=96 --cpus-per-task=4 # The product should be 384 +#SBATCH --hint=nomultithread +#SBATCH --mem=744G + +# Set the number of threads based on cpus-per-task +export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1} + +# Place and bind threads to single cores +# Comment the following lines if binding is not desired +export OMP_PLACES=cores +export OMP_PROC_BIND=spread + +# Run the program +srun myprog +``` + +## Full CPU nodes: MPI+OpenMP with simultaneous multithreading + +```bash +#!/bin/bash +#SBATCH --job-name=example +#SBATCH --account= +#SBATCH --partition=medium +##SBATCH --partition=large # uncomment if using 6 or more nodes +#SBATCH --time=00:30:00 +#SBATCH --nodes=2 +#SBATCH --ntasks-per-node=384 --cpus-per-task=2 # The product should be 768 +###SBATCH --ntasks-per-node=192 --cpus-per-task=4 # The product should be 768 +#SBATCH --hint=multithread +#SBATCH --mem=744G + +# Set the number of CPU threads based on cpus-per-task +export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1} + +# Place and bind CPU threads to single CPU cores +# Comment the following lines if binding is not desired +export OMP_PLACES=cores +export OMP_PROC_BIND=spread + +# Run the program +srun myprog +``` + +## GPU slices + +!!! info "Work in progress" + This section is work in progress.
+ + +## Partial GPU nodes: 1-16 GPUs + +```bash +#!/bin/bash +#SBATCH --job-name=example +#SBATCH --account= +#SBATCH --partition=gpumedium +#SBATCH --time=00:30:00 +#SBATCH --nodes=1 +#SBATCH --ntasks-per-node=1 --cpus-per-task=72 # The product should be 72 if requesting 1 GPU per node +#SBATCH --gres=gpu:gh200:1 # Corresponds to 1 GPU per node + +# Set the number of CPU threads based on cpus-per-task +export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1} + +# Place and bind CPU threads to single CPU cores +# Comment the following lines if binding is not desired +export OMP_PLACES=cores +export OMP_PROC_BIND=spread + +# Run the program +srun myprog +``` + +## Full GPU nodes: 16 or more GPUs + +```bash +#!/bin/bash +#SBATCH --job-name=example +#SBATCH --account= +#SBATCH --partition=gpularge +#SBATCH --time=00:30:00 +#SBATCH --nodes=4 +#SBATCH --ntasks-per-node=4 --cpus-per-task=72 # The product should be 288 +#SBATCH --gres=gpu:gh200:4 # 4 GPUs per node + +# Set the number of CPU threads based on cpus-per-task +export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1} + +# Place and bind CPU threads to single CPU cores +# Comment the following lines if binding is not desired +export OMP_PLACES=cores +export OMP_PROC_BIND=spread + +# Run the program +srun myprog +``` + +## Fast disk (NVMe over Fabric) + +!!! info "Work in progress" + This section is work in progress. diff --git a/docs/computing/running/interactive-usage.md b/docs/computing/running/interactive-usage.md index a74455480a..8342888c1c 100644 --- a/docs/computing/running/interactive-usage.md +++ b/docs/computing/running/interactive-usage.md @@ -81,7 +81,7 @@ On Mahti, each user can have up to 8 active sessions on the `interactive` partition. See the [Mahti `interactive` partition details](batch-job-partitions.md#mahti-cpu-partitions-with-core-based-allocation) for information on the available resources. 
It is also possible to request a -a [GPU slice](./batch-job-partitions.md#gpu-slices) for interactive work by +a [GPU slice](./batch-job-partitions.md#mahti-gpu-slices) for interactive work by using the `-g` flag, which submits the job to the `gpusmall` partition. Note that using a GPU slice restricts the amount of CPU cores and memory that is available for your job. diff --git a/docs/computing/running/submitting-jobs.md b/docs/computing/running/submitting-jobs.md index 0002af502f..c65b3d17f7 100644 --- a/docs/computing/running/submitting-jobs.md +++ b/docs/computing/running/submitting-jobs.md @@ -69,6 +69,7 @@ parameters that can be used to select which data is displayed. ## More information +- [Creating Roihu batch jobs](creating-job-scripts-roihu.md) - [Creating Puhti batch jobs](creating-job-scripts-puhti.md) - [Creating Mahti batch jobs](creating-job-scripts-mahti.md) - [Available batch job partitions](batch-job-partitions.md) diff --git a/docs/support/tutorials/services-for-courses.md b/docs/support/tutorials/services-for-courses.md index 9b5d7cb8eb..23c0c73e61 100644 --- a/docs/support/tutorials/services-for-courses.md +++ b/docs/support/tutorials/services-for-courses.md @@ -280,8 +280,8 @@ If you're unsure which services would be suitable for your course, |--------|----------|-----------|-----------|------------------------|---------------------------| | Puhti | Automatic | 2 nodes (80 cores) | 0 | 08:00–17:00 | 5 | | | CSC Resource Allocation Group | 5 nodes (200 cores) | 4 nodes (16 GPUs) | Up to 12 hrs (e.g., 08:00–20:00 or 12:00–24:00) | 10 | - | Mahti | Automatic | 2 nodes (256 cores) | 14 [GPU slices](../../computing/running/batch-job-partitions.md#gpu-slices) | 08:00–17:00 | 5 | - | | CSC Resource Allocation Group | 8 nodes (1024 cores) | 56 [GPU slices](../../computing/running/batch-job-partitions.md#gpu-slices) | Up to 12 hrs (e.g., 08:00–20:00 or 12:00–24:00) | 10 | + | Mahti | Automatic | 2 nodes (256 cores) | 14 [GPU 
slices](../../computing/running/batch-job-partitions.md#mahti-gpu-slices) | 08:00–17:00 | 5 | + | | CSC Resource Allocation Group | 8 nodes (1024 cores) | 56 [GPU slices](../../computing/running/batch-job-partitions.md#mahti-gpu-slices) | Up to 12 hrs (e.g., 08:00–20:00 or 12:00–24:00) | 10 | A granted advance resource reservation will be visible in the form for launching an interactive session in the Puhti and Mahti web interfaces. diff --git a/mkdocs.yml b/mkdocs.yml index 4dea3d144c..f359ad1b76 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -315,6 +315,8 @@ nav: - Running jobs: - computing/running/getting-started.md - Available batch job partitions: computing/running/batch-job-partitions.md + - Create Roihu batch jobs: computing/running/creating-job-scripts-roihu.md + - Roihu example scripts: computing/running/example-job-scripts-roihu.md - Create Puhti batch jobs: computing/running/creating-job-scripts-puhti.md - Puhti example scripts: computing/running/example-job-scripts-puhti.md - Create Mahti batch jobs: computing/running/creating-job-scripts-mahti.md diff --git a/tests/check_commands.sh b/tests/check_commands.sh index c2d8676229..4205a9893b 100644 --- a/tests/check_commands.sh +++ b/tests/check_commands.sh @@ -1,6 +1,10 @@ #!/usr/bin/env bash is_valid_slurm_option(){ + # This --option-name is used in docs as an example + if [[ "$1" == "--option-name" ]]; then + return 0 + fi res=$(grep -- "$1" tests/slurm_options.txt) if [[ -z $res ]];then diff --git a/tests/check_partitions.sh b/tests/check_partitions.sh index a61472f5e8..db09c23b17 100644 --- a/tests/check_partitions.sh +++ b/tests/check_partitions.sh @@ -1,7 +1,7 @@ #!/usr/bin/env sh part_flags=$(grep -E -n -r --include \*.md "^\s*#SBATCH\s*--partition=" docs) -res=$(echo "$part_flags" | grep -Ev "#SBATCH\s*--partition=(All|small|large|medium|gc|test|longrun|fmi|hugemem|hugemem_longrun|gputest|gpu|interactive|q_fiqci|standard-g|small-g|dev-g|standard|small|debug|largemem)") +res=$(echo "$part_flags" | grep 
-Ev "#SBATCH\s*--partition=(All|small|large|medium|gc|test|longrun|fmi|hugemem|hugemem_longrun|gputest|gpu|interactive|vizinteractive|q_fiqci|standard-g|small-g|dev-g|standard|small|debug|largemem|pilot|gpupilot)") if [ -z "$res" ]; then echo "All partition names seem to be valid" From 631ad78d03d1c450610760a076e8e2731a91f397 Mon Sep 17 00:00:00 2001 From: leopekkas Date: Fri, 24 Apr 2026 16:06:10 +0300 Subject: [PATCH 085/139] Add Roihu Slurm links to the Getting started tutorial --- docs/support/tutorials/roihu.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/docs/support/tutorials/roihu.md b/docs/support/tutorials/roihu.md index e195ce10f7..2c29521c7f 100644 --- a/docs/support/tutorials/roihu.md +++ b/docs/support/tutorials/roihu.md @@ -204,9 +204,10 @@ Basic workflow: See the relevant documentation below for detailed information: 1. [Available batch job partitions](../../computing/running/batch-job-partitions.md) -2. [Creating a batch job script](../../computing/running/batch-job-partitions.md) -3. [Submit a batch job](../../computing/running/submitting-jobs.md) -4. [Performance checklist](../../computing/running/performance-checklist.md) +1. [Creating a batch job script](../../computing/running/creating-job-scripts-roihu.md) +1. [Example job scripts](../../computing/running/example-job-scripts-roihu.md) +1. [Submit a batch job](../../computing/running/submitting-jobs.md) +1. [Performance checklist](../../computing/running/performance-checklist.md) For common Slurm error messages, see our FAQ on [Why does my batch job fail?](../faq/why-does-my-batch-job-fail.md). 
From 6b84b92b671b6f3d36a320e4087279f83677f375 Mon Sep 17 00:00:00 2001 From: Henry Barton Date: Mon, 27 Apr 2026 12:42:30 +0300 Subject: [PATCH 086/139] added roihu info and updated examples --- docs/computing/modules.md | 129 ++++++++++++++++++++++++-------------- 1 file changed, 81 insertions(+), 48 deletions(-) diff --git a/docs/computing/modules.md b/docs/computing/modules.md index 713260f008..0d07461f48 100644 --- a/docs/computing/modules.md +++ b/docs/computing/modules.md @@ -18,6 +18,48 @@ details can be found on the [Lmod homepage]. [TOC] +## Roihu GPU and CPU specific modules + +On Roihu the GPU and CPU paritions exist in separare environments due to the +different base architecture of the CPU and GPU nodes, for more information see +[Getting started with Roihu](/docs/support/tutorials/roihu.md). + +Consequently, software must be built separately for the GPU and CPU node +architectures, therefore there are independent Lmod environments for the GPU +and CPU parititions, accessible from the respective login nodes; +`roihu-gpu.csc.fi` and `roihu-cpu.csc.fi` respectively. + +Both GPU and CPU environments have a collection of modules loaded by default, that +differ slightly. + +### GPU default modules + +The default modules in the GPU environment are as follows: + +```bash +module list + +Currently Loaded Modules: + 1) csc-tools/default (S) 2) gcc/14.3.0 3) cuda/12.9.1 4) openmpi/5.0.10 5) openblas/0.3.30 6) StdEnv + + Where: + S: Module is Sticky, requires --force to unload or purge +``` + +### CPU default modules + +The default modules in the CPU environment are as follows: + +```bash +module list + +Currently Loaded Modules: + 1) csc-tools/default (S) 2) gcc/15.2.0 3) ucx/1.20.0 4) openmpi/5.0.10 5) openblas/0.3.30 6) StdEnv + + Where: + S: Module is Sticky, requires --force to unload or purge +``` + ## Basic usage The syntax of the module command: @@ -33,17 +75,17 @@ module list ``` The command `module help` provides general information about a module. 
For -example, to get more information about the module `intel-oneapi-compilers`, use: +example, to get more information about the module `openblas`, use: ```text -module help intel-oneapi-compilers +module help openblas ``` Load new modules to your environment with the command `load`. For -example, to load the `intel-oneapi-mpi` module, use: +example, to load the `openblas` module, use: ```text -module load intel-oneapi-mpi +module load openblas ``` Note that you can only load modules that are compatible with the other @@ -55,7 +97,7 @@ Modules that are not needed or conflict with other modules can be unloaded using `unload`: ```text -module unload intel-oneapi-mkl +module unload openblas ``` ### The most commonly used module commands {#module-commands-table} @@ -99,14 +141,14 @@ module spider List modules by name: ```text -module spider int +module spider mpi ``` -The above command will list all modules with the string _int_ in their name. A more detailed +The above command will list all modules with the string _mpi_ in their name. A more detailed description of a module can be printed using the full module name with a version number: ```text -module spider intel-oneapi-mkl/2022.1.0 +module spider openmpi/5.0.10 ``` ### Solving module dependencies @@ -115,27 +157,14 @@ Some modules depend on other modules. If a required module is missing, the modul prints an error message: ```text -$ module load parallel-netcdf - -Lmod has detected the following error: These module(s) exist but -cannot be loaded as requested: "parallel-netcdf" -Try: "module spider parallel-netcdf" to see how to load the module(s). 
- -$ module spider parallel-netcdf - ----------------------------------------------------------------------------- - parallel-netcdf: ----------------------------------------------------------------------------- - Versions: - parallel-netcdf/1.12.2 - ----------------------------------------------------------------------------- - For detailed information about a specific "parallel-netcdf" module - (including how to load the modules) use the module's full name. - For example: - -$ module spider parallel-netcdf/1.12.2 ----------------------------------------------------------------------------- +$ module load boost/1.88.0 + +Lmod has detected the following error: These module(s) or extension(s) exist but cannot be loaded as requested: +"boost/1.88.0" + Try: "module spider boost/1.88.0" to see how to load the module(s). + Or load any one of these options: + module load aocc/5.0.0 boost/1.88.0 + module load gcc/15.2.0 boost/1.88.0 ``` In such cases, the `module avail` command excludes the module from the list and the @@ -143,17 +172,23 @@ In such cases, the `module avail` command excludes the module from the list and is to use the `module spider` command with the version information. For example: ```text -$ module spider parallel-netcdf/1.12.2 ------------------------------------------------------------------- - parallel-netcdf: parallel-netcdf/1.12.2 ------------------------------------------------------------------- - You will need to load all module(s) on any one of the lines below before - the "parallel-netcdf/1.12.2" module is available to load. - - gcc/11.3.0 openmpi/4.1.4 - gcc/9.4.0 openmpi/4.1.4 - intel-oneapi-compilers-classic/2021.6.0 intel-oneapi-mpi/2021.6.0 -... 
+$ module spider boost/1.88.0 + +----------------------------------------------------------------------------------------------------------------------------------- + boost: boost/1.88.0 +----------------------------------------------------------------------------------------------------------------------------------- + + You will need to load all module(s) on any one of the lines below before the "boost/1.88.0" module is available to load. + + aocc/5.0.0 + gcc/15.2.0 + + Help: + Boost provides free peer-reviewed portable C++ source libraries, + emphasizing libraries that work well with the C++ Standard Library. + Boost libraries are intended to be widely useful, and usable across a + broad spectrum of applications. The Boost license encourages both + commercial and non-commercial use. ``` In this case, you will have to load one of the listed environments before @@ -177,22 +212,20 @@ correct versions of the loaded modules: ```text $ module list Currently Loaded Modules: - 1) gcc/11.3.0 2) openmpi/4.1.4 3) parallel-netcdf/1.12.2 + 1) gcc/15.2.0 2) ucx/1.20.0 3) openmpi/5.0.10 4) parallel-netcdf/1.14.1 -$ module swap gcc intel-oneapi-compilers-classic +$ module swap gcc aocc Inactive Modules: - 1) parallel-netcdf/1.12.2 - -Due to MODULEPATH changes the following modules have been reloaded: - 1) openmpi/4.1.4 + 1) openmpi 2) parallel-netcdf 3) ucx/1.20.0 $ module list + Currently Loaded Modules: - 1) intel-oneapi-compilers-classic/2021.6.0 2) openmpi/4.1.4 + 1) aocc/5.1.0 Inactive Modules: - 1) parallel-netcdf/1.12.2 + 1) ucx/1.20.0 2) openmpi 3) parallel-netcdf ``` If the correct version is not found, the module system _deactivates_ these From 7e7beb6b2cd505bfb14c4aad89d32324acc8b13a Mon Sep 17 00:00:00 2001 From: Henry Barton Date: Mon, 27 Apr 2026 12:47:58 +0300 Subject: [PATCH 087/139] link fix --- docs/computing/modules.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/computing/modules.md b/docs/computing/modules.md index 
0d07461f48..c766d88674 100644 --- a/docs/computing/modules.md +++ b/docs/computing/modules.md @@ -22,7 +22,7 @@ details can be found on the [Lmod homepage]. On Roihu the GPU and CPU paritions exist in separare environments due to the different base architecture of the CPU and GPU nodes, for more information see -[Getting started with Roihu](/docs/support/tutorials/roihu.md). +[Getting started with Roihu](../support/tutorials/roihu.md). Consequently, software must be built separately for the GPU and CPU node architectures, therefore there are independent Lmod environments for the GPU From 514805a665ad4f6a9e83e6c4be74886d465fcdb5 Mon Sep 17 00:00:00 2001 From: Jaan Tollander de Balsch Date: Mon, 27 Apr 2026 13:42:16 +0300 Subject: [PATCH 088/139] available base containers --- docs/support/tutorials/roihu.md | 16 +++++++++++++++- 1 file changed, 15 insertions(+), 1 deletion(-) diff --git a/docs/support/tutorials/roihu.md b/docs/support/tutorials/roihu.md index 2c29521c7f..c70cc3ce9f 100644 --- a/docs/support/tutorials/roihu.md +++ b/docs/support/tutorials/roihu.md @@ -118,6 +118,12 @@ You can build containers on top of Roihu base containers which have the same sof Base container are built on top of Rocky Linux 9. === "Roihu CPU base container (~4 GB)" + Base containers available: + + - `satama.csc.fi/r_installation_spack/core-cpu-gcc-15.2.0:v2026_03` + + Build definition file: + ```sh title="container.def" Bootstrap: docker From: satama.csc.fi/r_installation_spack/core-cpu-gcc-15.2.0:v2026_03 @@ -146,9 +152,17 @@ Base container are built on top of Rocky Linux 9. 
``` === "Roihu GPU base container (~16 GB)" + Base containers available: + + - `satama.csc.fi/r_installation_spack/core-gpu-gcc-15.2.0-cuda-13.1.1:v2026_03` + - `satama.csc.fi/r_installation_spack/core-gpu-gcc-14.3.0-cuda-12.9.1:v2026_03` + - `satama.csc.fi/r_installation_spack/core-gpu-gcc-13.4.0-cuda-12.6.3:v2026_03` + + Build definition file: + ```sh title="container.def" Bootstrap: docker - From: satama.csc.fi/r_installation_spack/core-gpu-gcc-15.2.0-cuda-13.1.1 + From: satama.csc.fi/r_installation_spack/core-gpu-gcc-14.3.0-cuda-12.9.1:v2026_03 %post # Activate module environment and load default modules. From fc8b7514d14fb998352c6b4f8a6b92fc17175721 Mon Sep 17 00:00:00 2001 From: Henry Barton Date: Mon, 27 Apr 2026 13:54:09 +0300 Subject: [PATCH 089/139] typo and phrasing --- docs/computing/modules.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/computing/modules.md b/docs/computing/modules.md index c766d88674..89d3f85dea 100644 --- a/docs/computing/modules.md +++ b/docs/computing/modules.md @@ -20,13 +20,13 @@ details can be found on the [Lmod homepage]. ## Roihu GPU and CPU specific modules -On Roihu the GPU and CPU paritions exist in separare environments due to the +On Roihu the GPU and CPU paritions exist in separate environments due to the different base architecture of the CPU and GPU nodes, for more information see [Getting started with Roihu](../support/tutorials/roihu.md). Consequently, software must be built separately for the GPU and CPU node -architectures, therefore there are independent Lmod environments for the GPU -and CPU parititions, accessible from the respective login nodes; +architectures, therefore there are different, independent software modules +for the GPU and CPU parititions, accessible from the login nodes; `roihu-gpu.csc.fi` and `roihu-cpu.csc.fi` respectively. 
Both GPU and CPU environments have a collection of modules loaded by default, that From d504abae4406a29391c6904e754f2569ebf3aa42 Mon Sep 17 00:00:00 2001 From: Henry Barton Date: Mon, 27 Apr 2026 13:54:57 +0300 Subject: [PATCH 090/139] rephrase --- docs/computing/modules.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/computing/modules.md b/docs/computing/modules.md index 89d3f85dea..2231e84364 100644 --- a/docs/computing/modules.md +++ b/docs/computing/modules.md @@ -26,7 +26,7 @@ different base architecture of the CPU and GPU nodes, for more information see Consequently, software must be built separately for the GPU and CPU node architectures, therefore there are different, independent software modules -for the GPU and CPU parititions, accessible from the login nodes; +for the GPU and CPU parititions, accessible from the corresponding login nodes; `roihu-gpu.csc.fi` and `roihu-cpu.csc.fi` respectively. Both GPU and CPU environments have a collection of modules loaded by default, that From 9de30fec7decf7310dc1031e058542bb41391ea9 Mon Sep 17 00:00:00 2001 From: Henry Barton Date: Mon, 27 Apr 2026 14:22:20 +0300 Subject: [PATCH 091/139] Roihu sinteractive warning (#2969) * added sinteractive missing to known issues * added warning to sinteractive docs * triggering rebuild * Update docs/computing/running/interactive-usage.md Co-authored-by: Leopekka Saraste <45951298+leopekkas@users.noreply.github.com> * Update docs/support/tutorials/roihu.md Co-authored-by: Leopekka Saraste <45951298+leopekkas@users.noreply.github.com> * Fx typo from suggestion --------- Co-authored-by: Henry Barton Co-authored-by: Leopekka Saraste <45951298+leopekkas@users.noreply.github.com> Co-authored-by: leopekkas --- docs/computing/running/interactive-usage.md | 5 +++++ docs/support/tutorials/roihu.md | 11 +++++++++++ 2 files changed, 16 insertions(+) diff --git a/docs/computing/running/interactive-usage.md b/docs/computing/running/interactive-usage.md index 
8342888c1c..9d89a80fef 100644
--- a/docs/computing/running/interactive-usage.md
+++ b/docs/computing/running/interactive-usage.md
@@ -13,6 +13,11 @@ running [web interface applications](../webinterface/apps.md) and
 [batch jobs](getting-started.md), but the most convenient way to use it is via
 the [`sinteractive` command](#the-sinteractive-command).
 
+!!! warning "`sinteractive` not yet available on Roihu!"
+    At present the `sinteractive` tool is not yet installed on Roihu,
+    as such interactive jobs must be launched directly:
+    `srun --partition interactive --ntasks 1 --cpus-per-task 1 --mem-per-cpu 2G --account --pty bash -i`
+
 ## The `sinteractive` command
 
 `sinteractive` starts a new shell program on a compute node with the resources
diff --git a/docs/support/tutorials/roihu.md b/docs/support/tutorials/roihu.md
index c70cc3ce9f..f2068b9db3 100644
--- a/docs/support/tutorials/roihu.md
+++ b/docs/support/tutorials/roihu.md
@@ -227,6 +227,8 @@ For common Slurm error messages, see our FAQ on [Why does my batch job fail?](..
 
 ### Known issues (pilot phase)
 
+#### Argos errors
+
 During the pilot phase, you may encounter multiple warnings or errors related to *Argos* in your Slurm job output, for example:
 
 ```
@@ -253,6 +255,15 @@ srun --argos=no
 
 The same option can also be passed as an `#SBATCH` input.
 
+#### sinteractive missing
+
+At present the `sinteractive` tool is not yet installed on Roihu, as such interactive jobs must be
+launched directly as below:
+
+```bash
+srun --partition interactive --ntasks 1 --cpus-per-task 1 --mem-per-cpu 2G --account --pty bash -i
+```
+
 ## More information
 
 * [Roihu system overview](../../computing/systems-roihu.md)

From 02ceefcb9674d5414ef22a703cf6670f494504bc Mon Sep 17 00:00:00 2001
From: Henry Barton
Date: Tue, 28 Apr 2026 08:58:40 +0300
Subject: [PATCH 092/139] removed sinteractive missing warnings

---
 docs/computing/running/interactive-usage.md | 5 -----
 docs/support/tutorials/roihu.md             | 9 ---------
 2 files changed, 14 deletions(-)

diff --git a/docs/computing/running/interactive-usage.md b/docs/computing/running/interactive-usage.md
index 9d89a80fef..8342888c1c 100644
--- a/docs/computing/running/interactive-usage.md
+++ b/docs/computing/running/interactive-usage.md
@@ -13,11 +13,6 @@ running [web interface applications](../webinterface/apps.md) and
 [batch jobs](getting-started.md), but the most convenient way to use it is via
 the [`sinteractive` command](#the-sinteractive-command).
 
-!!! warning "`sinteractive` not yet available on Roihu!"
-    At present the `sinteractive` tool is not yet installed on Roihu,
-    as such interactive jobs must be launched directly:
-    `srun --partition interactive --ntasks 1 --cpus-per-task 1 --mem-per-cpu 2G --account --pty bash -i`
-
 ## The `sinteractive` command
 
 `sinteractive` starts a new shell program on a compute node with the resources
diff --git a/docs/support/tutorials/roihu.md b/docs/support/tutorials/roihu.md
index f2068b9db3..93581a04e9 100644
--- a/docs/support/tutorials/roihu.md
+++ b/docs/support/tutorials/roihu.md
@@ -255,15 +255,6 @@ srun --argos=no
 
 The same option can also be passed as an `#SBATCH` input.
 
-#### sinteractive missing
-
-At present the `sinteractive` tool is not yet installed on Roihu, as such interactive jobs must be
-launched directly as below:
-
-```bash
-srun --partition interactive --ntasks 1 --cpus-per-task 1 --mem-per-cpu 2G --account --pty bash -i
-```
-
 ## More information
 
 * [Roihu system overview](../../computing/systems-roihu.md)

From 8b693c1455b250d123e997c98622bfa97055c7f4 Mon Sep 17 00:00:00 2001
From: Henry Barton
Date: Tue, 28 Apr 2026 09:29:21 +0300
Subject: [PATCH 093/139] info on roihu interactive parititons

---
 docs/computing/running/interactive-usage.md | 25 ++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/docs/computing/running/interactive-usage.md b/docs/computing/running/interactive-usage.md
index 8342888c1c..ab09e2b354 100644
--- a/docs/computing/running/interactive-usage.md
+++ b/docs/computing/running/interactive-usage.md
@@ -44,7 +44,30 @@ When this option is used, the user is prompted for the individual parameters
 of the session (runtime, memory, cores, etc.). If you do not want to specify the
 resources interactively, you can simply pass them to the command as arguments.
 Note that the available options and resources are not identical on
-Puhti and Mahti due to differences in hardware.
+Roihu, Puhti and Mahti due to differences in hardware.
+
+### `sinteractive` on Roihu
+
+There are two interactive partitions available on Roihu; `interactive` for CPU
+resources and `gpuinteractive` for GPU resources. See the
+[Roihu `interactive` partition details](./batch-job-partitions.md#roihu-partitions)
+for information on the available resources. The Roihu `gpuinteractive` partition
+features GH200 superchips that are divided into a total of 48 smaller slices that
+have one-seventh of the compute capacity and one-eighth of the GPU memory capacity
+(12 GiB) of a full GH200 superchip.
`sinteractive` will select the correct partition
+based on your resource request, and will automatically provide you with a GPU if
+run from the GPU login node without additional parameters.
+
+!!! warning "Submit from the correct login node CPU or GPU"
+    It is imperative that if you are requesting a interactive GPU job that you
+    request it from `roihu-gpu.csc.fi`, and likewise a CPU job from `roihu-cpu.csc.fi`.
+    Failure to do so will result in modules incompatible with the system architecture
+    being loaded and available, as the interactive job inherits the environment from
+    the login node.
+
+!!! info "`gpuinteractive` currently gives full GPUs during pilot"
+    The GPU slicing in the `gpuinteractive` partition is not yet implemented, so
+    during the pilot users will be allocated full GPUs.
 
 ### `sinteractive` on Puhti
 

From 111a4361fdab935f1034c456c27bc1dff2f8092b Mon Sep 17 00:00:00 2001
From: Jussi Enkovaara
Date: Tue, 28 Apr 2026 11:15:22 +0300
Subject: [PATCH 094/139] Document how to connect to compute node

---
 docs/computing/running/interactive-usage.md | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/docs/computing/running/interactive-usage.md b/docs/computing/running/interactive-usage.md
index 9d89a80fef..c47fc7104f 100644
--- a/docs/computing/running/interactive-usage.md
+++ b/docs/computing/running/interactive-usage.md
@@ -130,6 +130,25 @@ module load gromacs-env
 orterun -n 4 --oversubscribe gmx_mpi mdrun -s topol.tpr
 ```
 
+## Connecting to a compute node of a running job
+
+Sometimes (e.g. for debugging purposes) it is useful to login to a compute
+node where a Slurm job is currently running. This can be achieved with the
+`srun` command as follows:
+```bash
+srun --overlap --pty --jobid= bash
+```
+If a job spans multiple nodes, you will be connected to the master node of
+your job, which is the first node in your allocation and the one on which
+your batch script is executed.
It is also possible to connect to a specific
+node with the `-w` option:
+```bash
+srun --overlap --pty --jobid= -w rcXXXX bash
+```
+where `rcXXXX` is the name of a node as shown e.g. by the `squeue` command.
+(Note that the format of the node names varies between different systems).
+
+
 ## Explicit interactive shell without X11 graphics
 
 If you do not want to use the `sinteractive` wrapper, it is also possible to

From 46b6d110ae31267a0c788d946dbb15bd1507f972 Mon Sep 17 00:00:00 2001
From: Henry Barton
Date: Tue, 28 Apr 2026 11:38:21 +0300
Subject: [PATCH 095/139] upadated example commands

---
 docs/computing/running/interactive-usage.md | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/docs/computing/running/interactive-usage.md b/docs/computing/running/interactive-usage.md
index ab09e2b354..7907051198 100644
--- a/docs/computing/running/interactive-usage.md
+++ b/docs/computing/running/interactive-usage.md
@@ -58,7 +58,7 @@ have one-seventh of the compute capacity and one-eighth of the GPU memory capaci
 based on your resource request, and will automatically provide you with a GPU if
 run from the GPU login node without additional parameters.
 
-!!! warning "Submit from the correct login node CPU or GPU"
+!!! warning "Submit from the correct login node; CPU or GPU"
     It is imperative that if you are requesting a interactive GPU job that you
     request it from `roihu-gpu.csc.fi`, and likewise a CPU job from `roihu-cpu.csc.fi`.
     Failure to do so will result in modules incompatible with the system architecture
@@ -66,9 +66,16 @@ run from the GPU login node without additional parameters.
     the login node.
 
 !!! info "`gpuinteractive` currently gives full GPUs during pilot"
-    The GPU slicing in the `gpuinteractive` partition is not yet implemented, so
+    The GPU slicing in the `gpuinteractive` partition is not yet implemented, so
     during the pilot users will be allocated full GPUs.
 
+To see the command options available on Roihu, run the following while
+logged into the system:
+
+```bash
+sinteractive --help
+```
+
 ### `sinteractive` on Puhti
 
 On Puhti, each user can have up to two active sessions on the `interactive`
@@ -127,7 +134,7 @@ Since the shell that is started in the interactive session is already a job
 step in Slurm, additional job steps cannot be created. This prevents running
 e.g. GROMACS tools in the usual way, since `gmx_mpi` is a parallel program and
 normally requires using `srun`. In this case, `srun` must be replaced with
-`orterun -n 1` in the interactive shell. Orterun does not know of the Slurm
+`prterun -n 1` in the interactive shell. Orterun does not know of the Slurm
 flags, so it needs to be told how many tasks/threads to use. The following
 example will run a [GROMACS](../../apps/gromacs.md) mean square displacement
 analysis for an existing trajectory:
@@ -135,7 +142,7 @@ analysis for an existing trajectory:
 ```bash
 sinteractive --account
 module load gromacs-env
-orterun -n 1 gmx_mpi msd -n index.ndx -f traj.xtc -s topol.tpr
+prterun -n 1 gmx_mpi msd -n index.ndx -f traj.xtc -s topol.tpr
 ```
 
 To use all requested cores in parallel, you need to add `--oversubscribe`.
@@ -145,9 +152,13 @@ E.g. for 4 cores, a parallel interactive job
 ```bash
 sinteractive --account --cores 4
 module load gromacs-env
-orterun -n 4 --oversubscribe gmx_mpi mdrun -s topol.tpr
+prterun -n 4 --oversubscribe gmx_mpi mdrun -s topol.tpr
 ```
 
+!!!
info
+    The older runner `orterun` is now renamed as `prterun`
+    see:
+
 ## Explicit interactive shell without X11 graphics
 
 If you do not want to use the `sinteractive` wrapper, it is also possible to

From d0a9c8b335d9d40e9683d8ce0d75b4ff6ca48f3d Mon Sep 17 00:00:00 2001
From: Henry Barton
Date: Tue, 28 Apr 2026 11:50:36 +0300
Subject: [PATCH 096/139] typos

---
 docs/computing/running/interactive-usage.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/computing/running/interactive-usage.md b/docs/computing/running/interactive-usage.md
index 7907051198..8db2a4742a 100644
--- a/docs/computing/running/interactive-usage.md
+++ b/docs/computing/running/interactive-usage.md
@@ -59,7 +59,7 @@ based on your resource request, and will automatically provide you with a GPU if
 run from the GPU login node without additional parameters.
 
 !!! warning "Submit from the correct login node; CPU or GPU"
-    It is imperative that if you are requesting a interactive GPU job that you
+    It is imperative that if you are requesting an interactive GPU job that you
     request it from `roihu-gpu.csc.fi`, and likewise a CPU job from `roihu-cpu.csc.fi`.
     Failure to do so will result in modules incompatible with the system architecture
     being loaded and available, as the interactive job inherits the environment from
@@ -134,7 +134,7 @@ Since the shell that is started in the interactive session is already a job
 step in Slurm, additional job steps cannot be created. This prevents running
 e.g. GROMACS tools in the usual way, since `gmx_mpi` is a parallel program and
 normally requires using `srun`. In this case, `srun` must be replaced with
-`prterun -n 1` in the interactive shell. Orterun does not know of the Slurm
+`prterun -n 1` in the interactive shell. Prterun does not know of the Slurm
 flags, so it needs to be told how many tasks/threads to use.
The following
 example will run a [GROMACS](../../apps/gromacs.md) mean square displacement
 analysis for an existing trajectory:

From cc5c5dd39514ff8ddc429983d4b689ce9f766d36 Mon Sep 17 00:00:00 2001
From: Henry Barton
Date: Tue, 28 Apr 2026 14:49:15 +0300
Subject: [PATCH 097/139] shared node sbof release date (#2973)

* added expected date for sbof on shared nodes
* typo fix
* made note an info box

---------

Co-authored-by: Henry Barton
---
 docs/computing/roihu-disk.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/computing/roihu-disk.md b/docs/computing/roihu-disk.md
index c1da0dab04..7bfa40e390 100644
--- a/docs/computing/roihu-disk.md
+++ b/docs/computing/roihu-disk.md
@@ -265,10 +265,10 @@ to get larger capacity fast storage for your jobs.
 
 ### Requesting storage from slurm
 
-!!! Note
+!!! info "Support in jobs on shared nodes coming Q3 2026"
     At the present you can only request this storage for jobs that are making use of full nodes,
     i.e. that are submitted with the `--exclusive` flag. Support for shared node jobs is coming
-    at a later date.
+    in Q3 2026.
 
 To request flash storage to be mounted in an sbatch job you must add the following to the resource request block
 of your script:

From 7b1d7e183ff576ec7c2f051eaafc2b1e759b1531 Mon Sep 17 00:00:00 2001
From: Henry Barton
Date: Tue, 28 Apr 2026 15:20:03 +0300
Subject: [PATCH 098/139] Update docs/computing/running/interactive-usage.md

Co-authored-by: Leopekka Saraste <45951298+leopekkas@users.noreply.github.com>
---
 docs/computing/running/interactive-usage.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/computing/running/interactive-usage.md b/docs/computing/running/interactive-usage.md
index 8db2a4742a..f4942a5ab7 100644
--- a/docs/computing/running/interactive-usage.md
+++ b/docs/computing/running/interactive-usage.md
@@ -157,7 +157,7 @@ prterun -n 4 --oversubscribe gmx_mpi mdrun -s topol.tpr
 
 !!!
info
     The older runner `orterun` is now renamed as `prterun`
-    see:
+    See also:
 
 ## Explicit interactive shell without X11 graphics
 

From aa75c00f643343668fb57cea74a36c1e7cd676eb Mon Sep 17 00:00:00 2001
From: Henry Barton
Date: Tue, 28 Apr 2026 15:20:10 +0300
Subject: [PATCH 099/139] Update docs/computing/running/interactive-usage.md

Co-authored-by: Leopekka Saraste <45951298+leopekkas@users.noreply.github.com>
---
 docs/computing/running/interactive-usage.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/docs/computing/running/interactive-usage.md b/docs/computing/running/interactive-usage.md
index f4942a5ab7..99f621e465 100644
--- a/docs/computing/running/interactive-usage.md
+++ b/docs/computing/running/interactive-usage.md
@@ -156,7 +156,9 @@ prterun -n 4 --oversubscribe gmx_mpi mdrun -s topol.tpr
 ```
 
 !!! info
-    The older runner `orterun` is now renamed as `prterun`
+The legacy launcher orterun (based on ORTE) has been replaced by prterun (based on PRRTE) starting with OpenMPI 5.0.
+
+On Mahti and Puhti, you can either use `orterun` with the default MPI environment, or load a newer OpenMPI module (see `module spider openmpi/5.0.6`) to use `prterun`.
     See also:
 
 ## Explicit interactive shell without X11 graphics

From 965a6b99ca597e55e18abdf67b39e44adfd01395 Mon Sep 17 00:00:00 2001
From: Henry Barton
Date: Tue, 28 Apr 2026 15:25:33 +0300
Subject: [PATCH 100/139] box formatting

---
 docs/computing/running/interactive-usage.md | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/docs/computing/running/interactive-usage.md b/docs/computing/running/interactive-usage.md
index 99f621e465..57777592f6 100644
--- a/docs/computing/running/interactive-usage.md
+++ b/docs/computing/running/interactive-usage.md
@@ -156,10 +156,12 @@ prterun -n 4 --oversubscribe gmx_mpi mdrun -s topol.tpr
 ```
 
 !!! info
-The legacy launcher orterun (based on ORTE) has been replaced by prterun (based on PRRTE) starting with OpenMPI 5.0.
+    The legacy launcher orterun (based on ORTE) has been replaced by prterun
+    (based on PRRTE) starting with OpenMPI 5.0.
 
-On Mahti and Puhti, you can either use `orterun` with the default MPI environment, or load a newer OpenMPI module (see `module spider openmpi/5.0.6`) to use `prterun`.
-    See also:
+    On Mahti and Puhti, you can either use `orterun` with the default MPI environment,
+    or load a newer OpenMPI module (see `module spider openmpi/5.0.6`) to use `prterun`. See also:
+    .
 
 ## Explicit interactive shell without X11 graphics
 

From e49c8c37efaae161068d74c6e553998bf26b3e0d Mon Sep 17 00:00:00 2001
From: Tuomas Rossi
Date: Tue, 28 Apr 2026 17:27:04 +0300
Subject: [PATCH 101/139] Roihu HPC libraries and compiling (#2972)

* Remove outdated warning
* Update list of libraries
* Add more details about available compilers
* Use -march=znver5 for explicitness

---------

Co-authored-by: leopekkas
---
 docs/computing/compiling-roihu.md | 78 ++++++++++++++++---------------
 docs/computing/hpc-libraries.md   | 54 +++++++++------------
 2 files changed, 63 insertions(+), 69 deletions(-)

diff --git a/docs/computing/compiling-roihu.md b/docs/computing/compiling-roihu.md
index a22e9a35df..23f0172665 100644
--- a/docs/computing/compiling-roihu.md
+++ b/docs/computing/compiling-roihu.md
@@ -20,25 +20,43 @@
 
 - The local disk is cleaned frequently, so please move your files elsewhere after compiling.
 
-- Please see [the page on available HPC libraries](hpc-libraries.md#libraries-on-roihu) for using common libraries (BLAS, FFTW, ...)
+- Please see [the page on available HPC libraries](hpc-libraries.md) for using common libraries (BLAS, FFTW, ...)
   and linking them to your applications.
 
 ## Compiling on Roihu-CPU
 
-C/C++ and Fortran applications can be built with
-the [GNU](https://gcc.gnu.org) or the [AMD](https://developer.amd.com/amd-aocc/)
-compiler suites. GNU compilers are loaded by default. AMD compilers can be
-loaded using the [Module system](modules.md) with the command:
+!!!
info
+    When compiling for the CPU nodes on Roihu, make sure you use Roihu's CPU login nodes.
+    Binaries compiled on Roihu-GPU are not compatible with Roihu-CPU nodes.
+
+Roihu-CPU provides [GNU](https://gcc.gnu.org) and [AMD AOCC](https://developer.amd.com/amd-aocc/)
+compiler environments for building C/C++ and Fortran applications.
+These environments are available under the following [modules](modules.md):
+
+| Compiler suite | Modules |
+| :--------------------------- | :------------------------------------------------------- |
+| GNU 15.2.0 | `gcc/15.2.0 openmpi/5.0.10` |
+| AMD AOCC 5.0.0 | `aocc/5.0.0 openmpi/5.0.10` |
+
+The first compiler suite is loaded by default.
+You can change the environment by loading the listed modules, for example,
+
+```bash
+module load aocc/5.0.0 openmpi/5.0.10
 ```
-module load aocc
+
+List all available versions of the compiler suites:
+```bash
+module spider gcc
+module spider aocc
 ```
 
 The compiler executables are as follows:
 
-| Compiler suite | C | C++ | Fortran |
-| :------------- | :- | :-- | :------ |
-| GNU | gcc | g++ | gfortran |
-| AMD | clang | clang++ | flang |
+| Compiler suite | C | C++ | Fortran |
+| :------------- | :---- | :------ | :------- |
+| GNU | gcc | g++ | gfortran |
+| AMD | clang | clang++ | flang |
 
 For applications that depend on MPI, it is recommended to instead use the
 compiler wrappers described in the [MPI section](#building-mpi-applications) below.
@@ -49,44 +67,30 @@ is recommended to start from the safe level and then move up to intermediate
 or even aggressive, while ensuring the results are correct and the program's
 performance has improved.
 
-| Optimization level | GNU | AMD (clang) |
-| :----------------- | :---------------- | :----------- |
-| **Safe** | -O2 -march=native | -O2 -march=native |
-| **Intermediate** | -O3 -march=native | -O3 -march=native |
-| **Aggressive** | -O3 -march=native -ffast-math -funroll-loops |
+| Optimization level | GNU | AMD (clang) |
+| :----------------- | :------------------------------------------- | :---------------- |
+| **Safe** | -O2 -march=znver5 | -O2 -march=znver5 |
+| **Intermediate** | -O3 -march=znver5 | -O3 -march=znver5 |
+| **Aggressive** | -O3 -march=znver5 -ffast-math -funroll-loops | |
 
-!!! info
-    Because the Roihu-CPU login and compute nodes share the same CPU architecture,
-    compiling for the native architecture (`-march=native`) is optimal even if
-    the compilation is done on login nodes.
 
 Example of compiling a non-MPI C program in GNU environment:
 
 ```bash
-gcc -O3 -march=native example.c -o example
+gcc -O3 -march=znver5 example.c -o example
 ```
 
 A detailed list of options for the GNU and AMD compilers can be found in the
 _man_ pages (`man gcc/gfortran`) when the corresponding programming
-environment is loaded, or in the compiler manuals (see the links above).
+environment is loaded, or in the compiler manuals:
+- [GNU](https://gcc.gnu.org)
+- [AMD AOCC](https://developer.amd.com/amd-aocc/)
 
 We recommend testing and profiling your application with both compiler suites
 to see which compiler works the best for your use case.
 
-List all available versions of the compiler suites:
-```
-module spider gcc
-module spider aocc
-```
-
 ### Building MPI applications
 
-!!! warning
-    The AMD compiler environment does not yet have a supporting MPI module.
-    We expect to set this up shortly; until then, please use the GNU environment
-    for building MPI applications.
-
-
 The MPI environment in Roihu is OpenMPI. You may use one of the MPI compiler wrappers
 `mpicc` (C), `mpicxx` (C++), or `mpif90` (Fortran) when compiling MPI applications.
 These wrappers end up calling the compiler from your currently loaded compiler suite
@@ -94,7 +98,7 @@ These wrappers end up calling the compiler from your currently loaded compiler s
 Example:
 
 ```bash
-mpicc -O3 -march=native example.c -o example
+mpicc -O3 -march=znver5 example.c -o example
 ```
 
 List all available versions of OpenMPI (one is always loaded by default):
@@ -106,15 +110,15 @@ module spider openmpi
 ### Building OpenMP and hybrid applications
 
 An additional compiler and linker flag is needed when building an OpenMP or a hybrid
-MPI/OpenMP application:
+MPI+OpenMP application:
 
 | Compiler suite | OpenMP flag |
 | :------------- | :---------- |
 | GNU and AMD | -fopenmp |
 
-Example compilation of a hybrid MPI/OpenMP application:
+Example compilation of a hybrid MPI+OpenMP application:
 
 ```bash
-mpicc -O3 -march=native -fopenmp example.c -o example
+mpicc -O3 -march=znver5 -fopenmp example.c -o example
 ```
 
diff --git a/docs/computing/hpc-libraries.md b/docs/computing/hpc-libraries.md
index c52f0649df..2294bd99c9 100644
--- a/docs/computing/hpc-libraries.md
+++ b/docs/computing/hpc-libraries.md
@@ -1,21 +1,21 @@
-# High performance libraries
+# High-performance libraries
 
-Various high performance libraries for dense linear algebra, fast
-fourier transforms *etc.* are available via the module system. Many
-libraries are provided both as single threaded and multithreaded
+Various high-performance libraries for dense linear algebra, fast
+Fourier transforms, *etc.* are available via the module system. Many
+libraries are provided both as single-threaded and multithreaded
 versions, multithreaded modules are designated with `omp` in the
 module version. For pure MPI applications and applications calling
-libraries from multiple threads it is recommended to use a single
+libraries from multiple threads it is recommended to use a single-
 threaded library.
 
-Availibility of libraries may depend on the loaded compiler suite and
+Availability of libraries may depend on the loaded compiler suite and
 MPI environment, use `module avail` for finding out available
-libraries. See the documentation of library for
-instructions on how to build against that particular library. Note
+libraries. See the documentation of the library for
+instructions on how to build against it. Note
 that most modules set `LIBRARY_PATH` and `LD_LIBRARY_PATH` environment
-variables so that `-llibrary` linker flag is often enough. Most
+variables so that the `-llibrary` linker flag is often enough. Most
 modules set also `_INSTALL_ROOT` environment variables that
-can be utilized in custom build scripts. As an example, `fftw`
+can be utilized in custom build scripts. As an example, the `fftw`
 library can be used as follows:
 
 ```bash
@@ -23,39 +23,29 @@ module load fftw
  -o myprog myprog.o -lfftw3
 ```
 
-and the directory containing `include`, `lib`, *etc.* are found under
+and the directory containing `include`, `lib`, *etc.* is found under
 the `FFTW_INSTALL_ROOT` environment variable.
 
-## Libraries on Roihu
-
-!!! warning
-    On Roihu-CPU and Roihu-GPU, many of the installed modules do not currently
-    set the `CPATH`, `LIBRARY_PATH` or `LD_LIBRARY_PATH` environment variables.
-    We expect to change this in the near future; until then, you may have to
-    set them manually eg. when compiling an application that depends on a module.
-    You can use `module show ` to see where the module files are located.
-    Many modules define variable like `modulename_INSTROOT` that points to the
-    installation directory once the module has been loaded. For example, `fftw`
-    headers are in `$FFTW_INSTROOT\include` and the compiled library files are
-    in `$FFTW_INSTROOT\lib`.
-
-
-### Roihu-CPU
+## Libraries on Roihu-CPU
 
 Selected libraries available on Roihu-CPU:
 
-- Dense linear algebra: `openblas`
+- Dense linear algebra: `openblas`, `amdblis`
 - Dense distributed linear algebra: `netlib-scalapack`
-- Fast fourier transforms: `fftw`
+- Fast Fourier transforms: `fftw`
 
-### Roihu-GPU
+## Libraries on Roihu-GPU
 
 Selected libraries available on Roihu-GPU:
 
-- Dense linear algebra: `openblas`, `netlib-lapack`, `cublas`
-- Dense distributed linear algebra: `netlib-scalapack`
-- Fast fourier transforms: `fftw`
+- For Grace CPU:
+    - Dense linear algebra: `nvhpc` (includes NVPL), `openblas`, `netlib-lapack`
+    - Dense distributed linear algebra: `nvhpc` (includes NVPL), `netlib-scalapack`
+    - Fast fourier transforms: `nvhpc` (includes NVPL), `fftw`
+- For Hopper GPU:
+    - CUDA (module `cuda`/`nvhpc`) includes libraries such as
+      cublas, cufft, cusolver, ...
 
 ## Libraries on Puhti
 

From 2282f22c936889958e8f9de9240b33b4f7a29cf3 Mon Sep 17 00:00:00 2001
From: Leopekka Saraste <45951298+leopekkas@users.noreply.github.com>
Date: Tue, 28 Apr 2026 18:21:59 +0300
Subject: [PATCH 102/139] Update Roihu Slurm instructions (#2976)

* Remove SMT instructions from examples, add SBoF example
* Remove smt examples, use default mem on full nodes
---
 .../running/creating-job-scripts-roihu.md |  20 ----
 .../running/example-job-scripts-roihu.md  | 100 +++++++-----------
 docs/support/tutorials/affinity.md        |   4 -
 3 files changed, 36 insertions(+), 88 deletions(-)

diff --git a/docs/computing/running/creating-job-scripts-roihu.md b/docs/computing/running/creating-job-scripts-roihu.md
index 6f6a0237a8..9fdb85a9fc 100644
--- a/docs/computing/running/creating-job-scripts-roihu.md
+++ b/docs/computing/running/creating-job-scripts-roihu.md
@@ -20,7 +20,6 @@ An example of a batch job script using a share of resources on a single node:
 #SBATCH --cpus-per-task=1 # Number of CPU cores allocated per task
 #SBATCH --mem-per-cpu=1000M # Memory to reserve per CPU core
 #SBATCH
--output=slurm-%j.out # Standard output of the job script
-#SBATCH --hint=nomultithread # Allocate physical cores only, avoid simultaneous multithreading
 ##SBATCH --mail-type=BEGIN # Uncomment to enable mail
 
 module load myprog/1.2.3 # Load required modules
@@ -142,17 +141,6 @@ Here `%j` is a replacement symbol for jobid, so the output will go to the file `
 By default, this file collects also the standard error, but it is possible to specify
 a different file for standard error with `--error=`.
 
-The allocation of CPU cores vs hardware threads is controlled with the option:
-
-```bash
-#SBATCH --hint=nomultithread
-```
-
-Use this option always by default and change it only if you are absolutely sure that it is beneficial.
-
-!!! info "About `--hint=nomultithread`"
-    The default behavior regarding this setting is likely to change.
-
 The user can be notified by email when the job *starts* by using the
 `--mail-type` option
 
@@ -425,14 +413,6 @@ For example, requesting 100 GiB storage:
 
 Then, this storage is available in path `/run/sbb/$USER` during the job script.
 
-
-### Simultaneous multithreading (SMT) on Roihu-CPU
-
-SMT support can be enabled with `--hint=multithread` option.
-When this option is used, it is important to use the `--ntasks-per-node=X` and
-`--cpus-per-task=Y` so that `X * Y = 768` on full nodes. Failing to do so will leave some of the
-actual physical cores unallocated and performance will be suboptimal.
-
 ### Undersubscribing full nodes on Roihu-CPU
 
 If an application requires more memory per core than there is available
diff --git a/docs/computing/running/example-job-scripts-roihu.md b/docs/computing/running/example-job-scripts-roihu.md
index 977d0d34b9..69ea8ad863 100644
--- a/docs/computing/running/example-job-scripts-roihu.md
+++ b/docs/computing/running/example-job-scripts-roihu.md
@@ -33,8 +33,6 @@ recommended to use especially for smaller scale and routine runs.
 #SBATCH --ntasks-per-node=384 --cpus-per-task=1 # The product should be 384
 ###SBATCH --ntasks-per-node=192 --cpus-per-task=2 # The product should be 384
 ###SBATCH --ntasks-per-node=96 --cpus-per-task=4 # The product should be 384
-#SBATCH --hint=nomultithread
-#SBATCH --mem=744G # Ensure we use all available memory on the nodes
 
 # Set the number of threads based on cpus-per-task
 export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1}
@@ -87,7 +85,6 @@ srun myprog
 #SBATCH --nodes=1
 #SBATCH --ntasks=1
 #SBATCH --mem-per-cpu=1000M
-#SBATCH --hint=nomultithread
 
 # Run the program
 srun myprog
@@ -104,7 +101,6 @@ srun myprog
 #SBATCH --nodes=1
 #SBATCH --ntasks=2
 #SBATCH --mem-per-cpu=1000M
-#SBATCH --hint=nomultithread
 
 # Run the program
 srun myprog
@@ -122,7 +118,6 @@ srun myprog
 #SBATCH --ntasks=1
 #SBATCH --cpus-per-task=4
 #SBATCH --mem-per-cpu=1000M
-#SBATCH --hint=nomultithread
 
 # Set the number of threads based on cpus-per-task
 export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1}
@@ -148,7 +143,6 @@ srun myprog
 #SBATCH --ntasks=2
 #SBATCH --cpus-per-task=4
 #SBATCH --mem-per-cpu=1000M
-#SBATCH --hint=nomultithread
 
 # Set the number of threads based on cpus-per-task
 export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1}
@@ -162,30 +156,6 @@ export OMP_PROC_BIND=spread
 srun myprog
 ```
 
-## Partial CPU node: MPI+OpenMP with simultaneous multithreading
-
-```bash
-#!/bin/bash
-#SBATCH --job-name=example
-#SBATCH --account=
-#SBATCH --partition=small
-#SBATCH --time=00:30:00
-#SBATCH --ntasks=2
-#SBATCH --cpus-per-task=4
-#SBATCH --hint=multithread
-
-# Set the number of threads based on cpus-per-task
-export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1}
-
-# Place and bind threads to single hardware threads
-# Comment the following lines if binding is not desired
-export OMP_PLACES=threads
-export OMP_PROC_BIND=spread
-
-# Run the program
-srun myprog
-```
-
 ## Full CPU nodes: MPI
 
 ```bash
@@ -197,8 +167,6 @@ srun myprog
 #SBATCH --time=00:30:00
 #SBATCH --nodes=2
 #SBATCH --ntasks-per-node=384
--cpus-per-task=1 # The product should be 384
-#SBATCH --hint=nomultithread
-#SBATCH --mem=744G
 
 # Run the program
 srun myprog
@@ -215,8 +183,6 @@ srun myprog
 #SBATCH --time=00:30:00
 #SBATCH --nodes=2
 #SBATCH --ntasks-per-node=1 --cpus-per-task=384 # The product should be 384
-#SBATCH --hint=nomultithread
-#SBATCH --mem=744G
 
 # Set the number of threads based on cpus-per-task
 export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1}
@@ -242,8 +208,6 @@ srun myprog
 #SBATCH --nodes=2
 #SBATCH --ntasks-per-node=192 --cpus-per-task=2 # The product should be 384
 #SBATCH --ntasks-per-node=96 --cpus-per-task=4 # The product should be 384
-#SBATCH --hint=nomultithread
-#SBATCH --mem=744G
 
 # Set the number of threads based on cpus-per-task
 export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1}
@@ -257,33 +221,6 @@ export OMP_PROC_BIND=spread
 srun myprog
 ```
 
-## Full CPU nodes: MPI+OpenMP with simultaneous multithreading
-
-```bash
-#!/bin/bash
-#SBATCH --job-name=example
-#SBATCH --account=
-#SBATCH --partition=medium
-##SBATCH --partition=large # uncomment if using 6 or more nodes
-#SBATCH --time=00:30:00
-#SBATCH --nodes=2
-#SBATCH --ntasks-per-node=384 --cpus-per-task=2 # The product should be 768
-#SBATCH --ntasks-per-node=192 --cpus-per-task=4 # The product should be 768
-#SBATCH --hint=multithread
-#SBATCH --mem=744G
-
-# Set the number of CPU threads based on cpus-per-task
-export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1}
-
-# Place and bind CPU threads to single CPU cores
-# Comment the following lines if binding is not desired
-export OMP_PLACES=cores
-export OMP_PROC_BIND=spread
-
-# Run the program
-srun myprog
-```
-
 ## GPU slices
 
 !!! info "Work in progress"
@@ -341,4 +278,39 @@ srun myprog
 ## Fast disk (NVMe over Fabric)
 
 !!! info "Work in progress"
-    This section is work in progress.
+    This section is a work in progress.
+
+On Roihu, it is possible to request local disk mounts from a centralised pool of fast storage resources.
+This fast storage capacity is provided over the network and will appear as local scratch from +within a Slurm job. + +Example script reserving 10G of fast NVMe disk space: + +```bash +#!/bin/bash +#SBATCH --job-name=example +#SBATCH --account= +#SBATCH --partition=medium +#SBATCH --time=00:10:00 +#SBATCH --nodes=1 +#SBATCH --ntasks-per-node=4 --cpus-per-task=96 +#SBATCH --bb="#BB_LUA SBF storagesize=10G path=/run/sbb/" + +# Set the number of CPU threads based on cpus-per-task +export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1} + +# Place and bind CPU threads to single CPU cores +# Comment the following lines if binding is not desired +export OMP_PLACES=cores +export OMP_PROC_BIND=spread + +# Run the program +srun myprog +``` + +!!! Note + At present, you can only request this storage for jobs that use full nodes, + i.e. jobs that are submitted with the `--exclusive` flag or in exclusive partitions (e.g. `medium`). + Support for shared-node jobs is coming at a later date. + +See [detailed usage instructions](../roihu-disk.md#disaggregated-storage).
\ No newline at end of file diff --git a/docs/support/tutorials/affinity.md b/docs/support/tutorials/affinity.md index a7dd7fa5df..25d5dd641a 100644 --- a/docs/support/tutorials/affinity.md +++ b/docs/support/tutorials/affinity.md @@ -29,7 +29,6 @@ The job script executes `csc-print-affinity` (available in `csc-tools` module) v #SBATCH --time=00:30:00 #SBATCH --nodes=2 #SBATCH --ntasks-per-node=8 --cpus-per-task=48 # The product should be 384 -#SBATCH --hint=nomultithread # Run the program srun csc-print-affinity @@ -90,7 +89,6 @@ the number of CPUs per task: #SBATCH --time=00:30:00 #SBATCH --nodes=2 #SBATCH --ntasks-per-node=8 --cpus-per-task=48 # The product should be 384 -#SBATCH --hint=nomultithread # Create a script for binding tasks to CPU cores BIND_CPU="./bind_cpu.$SLURM_JOB_ID.sh" @@ -161,7 +159,6 @@ The following job script exemplifies the use of `OMP_*` for checking the CPU aff #SBATCH --ntasks=2 #SBATCH --cpus-per-task=8 #SBATCH --mem-per-cpu=1000M -#SBATCH --hint=nomultithread # Set the number of threads based on cpus-per-task export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1} @@ -203,7 +200,6 @@ In the following example job script, we have enabled thread binding: #SBATCH --ntasks=2 #SBATCH --cpus-per-task=8 #SBATCH --mem-per-cpu=1000M -#SBATCH --hint=nomultithread # Set the number of threads based on cpus-per-task export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1} From f5a52c13ab92d62c7da314283e83ea9d89d6763b Mon Sep 17 00:00:00 2001 From: Leopekka Saraste <45951298+leopekkas@users.noreply.github.com> Date: Wed, 29 Apr 2026 09:32:38 +0300 Subject: [PATCH 103/139] WIP: Roihu user spack tutorial (#2968) * Typo fixes * Add section for Spack installations in quickstart guide * Add a warning about user-spack being WIP --- docs/support/tutorials/roihu-user-spack.md | 26 +++++++++++++--------- docs/support/tutorials/roihu.md | 8 +++++++ 2 files changed, 23 insertions(+), 11 deletions(-) diff --git a/docs/support/tutorials/roihu-user-spack.md 
b/docs/support/tutorials/roihu-user-spack.md index d1773ffa84..1cb5175f7f 100644 --- a/docs/support/tutorials/roihu-user-spack.md +++ b/docs/support/tutorials/roihu-user-spack.md @@ -5,7 +5,7 @@ manager. This document describes how regular users can use Spack to install additional software, libraries and applications, on top of the already installed software. -Throughout this document, we use a package `eccodes` as an example, +Throughout this document, we use the package `eccodes` as an example, and assume that the commands are run in user's custom software install root, usually somewhere under `/projappl//$USER`. @@ -15,17 +15,21 @@ full For example, term "environment" in this document specifically refers to [Spack environments](https://spack.readthedocs.io/en/latest/environments.html). +!!! warning "Work in progress" + This section is a work in progress. Some steps may be outdated, + incomplete, or not fully tested in the current Roihu environment. + Use with caution and report any issues you encounter. ## When to install with Spack Spack installation is a viable option for "traditional" HPC software, parallel applications and the libraries they depend on, especially -when Spack package recipies already exist. Using Spack is an +when Spack package recipes already exist. Using Spack is an alternative to traditional "manual" installation, loading modules, -running configure or cmake, and make, for the application and it's +running configure or cmake, and make, for the application and its dependencies. -Usually containers are better approach when the number of files in the +Usually containers are a better approach when the number of files in the installation goes to tens of thousands, which is often the case for Python and R environments, for example. @@ -34,15 +38,15 @@ Python and R environments, for example. 
The Spack packages can be searched from [Spack Packages](https://packages.spack.io) (the latest versions), or -directly from Roihu directory +directly from the Roihu directory `/appl/soft/spack/v2026_03/spack-packages/repos/spack_repo/builtin/packages`, which contains almost 9000 package definitions. ## How to set up Spack -First, let's initialize Spack. In here we set Spack cache directory to -temporary directory, which is fine for one shot installations, and +First, let's initialize Spack. Here we set the Spack cache directory to a +temporary directory, which is fine for one-shot installations, and isolate Spack from system and user configuration scopes, so that no settings from those scopes leak into our setup. See [Overriding local configuration](https://spack.readthedocs.io/en/v1.1.1/configuration.html#overriding-local-configuration) @@ -75,10 +79,10 @@ the processor architecture on the CPU or the GPU nodes, respectively. In general, the core environments provide a good base, "upstream" package environment, to build on. Core environments contain compilers -and most common libraries, such as MPI libraries, already configured +and most common libraries, such as MPI libraries, already configured to work efficiently. -The available environments can be listed for example with +The available environments can be listed, for example, with ```console ls /appl/soft/spack/core/v2026_03/x86_64/ @@ -240,7 +244,7 @@ shows the concretized spec ==> Updating view at /users/jlento/user-spack/environments/mygcc152_ec/.spack-env/view ``` -There are some keypoints to check in the concretized spec. First, +There are some key points to check in the concretized spec. First, we need to check that the variant (build options) are what we want. In the case of eccodes we can compare the current spec @@ -266,7 +270,7 @@ is openjpeg instead of jasper (fine?).
Let's update the variant information, and reconcretize (the output of the command is omitted as it is similar to the previous output from -`spack concretize`command): +`spack concretize` command): ```console spack change 'eccodes@2.45.0+aec+fortran~ipo+memfs~netcdf+openmp+png~pthreads+shared+tools' diff --git a/docs/support/tutorials/roihu.md b/docs/support/tutorials/roihu.md index 93581a04e9..abab715845 100644 --- a/docs/support/tutorials/roihu.md +++ b/docs/support/tutorials/roihu.md @@ -194,6 +194,14 @@ More details on working with containers in CSC's computing environment can be fo - [Creating containers](../../computing/containers/overview.md#building-container-images) - [Tykky container wrapper](../../computing/containers/tykky.md) +### Spack + +Spack is a flexible package manager that can be used to install software on supercomputers as well as on Linux and macOS systems. The basic module tree, including compilers, MPI libraries and much of the available software on CSC supercomputers, has been installed using Spack. Spack is similar to the EasyBuild package manager extensively used on LUMI. + +CSC provides user Spack modules on Roihu that can be used to build software on top of the available compilers and libraries. It is also possible to install different customized versions of packages available in the module tree for special use cases. + +[See here for a short tutorial on how to install software on Roihu using Spack.](roihu-user-spack.md) + ### Python/R environments Best practice guidelines on installing your own Python and R packages can be found in the Python, R and Tykky container wrapper pages below.
From e1ce135e2aff444c8ba38995fecff63bb630e722 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Mats=20Sj=C3=B6berg?= Date: Wed, 29 Apr 2026 09:42:51 +0300 Subject: [PATCH 104/139] Update vLLM versions --- docs/apps/vllm.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/docs/apps/vllm.md b/docs/apps/vllm.md index a4819042bf..b8648f1487 100644 --- a/docs/apps/vllm.md +++ b/docs/apps/vllm.md @@ -24,9 +24,10 @@ A fast and easy-to-use library for LLM inference and serving. Currently supported vLLM versions: -| Version | Module | -|:--------|----------------------| -| 0.18.0 | `python-vllm/0.18.0` | +| Version | Module | Notes | +|:--------|----------------------|---------| +| 0.19.1 | `python-vllm/0.19.1` | Default | +| 0.18.0 | `python-vllm/0.18.0` | | Includes [vLLM][], [PyTorch](https://pytorch.org/) and related libraries with GPU support via CUDA/ROCm. From ca82b1639547d654da501fd8b9ef8f6bc911f2e3 Mon Sep 17 00:00:00 2001 From: leopekkas Date: Wed, 29 Apr 2026 10:08:18 +0300 Subject: [PATCH 105/139] Add warning about Satama not being in use yet --- docs/support/tutorials/roihu.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/docs/support/tutorials/roihu.md b/docs/support/tutorials/roihu.md index abab715845..b8910e1691 100644 --- a/docs/support/tutorials/roihu.md +++ b/docs/support/tutorials/roihu.md @@ -117,6 +117,11 @@ Another option is to build your own container from scratch. You can build containers on top of Roihu base containers which have the same software stack as is available via the module system natively. Base container are built on top of Rocky Linux 9. +!!! warning "Work in progress" + Satama is not yet available on Roihu, so the container images + referenced below cannot currently be accessed. + Satama support on Roihu is expected very soon. 
+ === "Roihu CPU base container (~4 GB)" Base containers available: From b467fb8eef4911b79a16f8c093ba77add3659b13 Mon Sep 17 00:00:00 2001 From: Jussi Enkovaara Date: Wed, 29 Apr 2026 10:16:14 +0300 Subject: [PATCH 106/139] Small text improvements based on review --- docs/computing/running/interactive-usage.md | 17 +++++++++-------- 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/docs/computing/running/interactive-usage.md b/docs/computing/running/interactive-usage.md index 9cae735c62..8c53fe5575 100644 --- a/docs/computing/running/interactive-usage.md +++ b/docs/computing/running/interactive-usage.md @@ -167,17 +167,18 @@ prterun -n 4 --oversubscribe gmx_mpi mdrun -s topol.tpr ## Connecting to a compute node of a running job Sometimes (e.g. for debugging purposes) it is useful to login to a compute -node where a Slurm job is currently running. This can be achieved with the -`srun` command as follows: +node where a Slurm job is currently running. A Bash shell can be started on the master +node of a job (i.e. the first node in your allocation and the one on which +your batch script is executed) as follows: ```bash -srun --overlap --pty --jobid= bash +srun --jobid= --overlap --pty bash ``` -If a job spans multiple nodes, you will be connected to the master node of -your job, which is the first node in your allocation and the one on which -your batch script is executed. It is also possible to connect to a specific -node with the `-w` option: +(`--overlap` allows multiple job steps to run at the same time, and `--pty` enables normal +terminal behaviour). +For jobs spanning multiple nodes, it is also possible to connect to a specific +node with the `--nodelist=` option: ```bash -srun --overlap --pty --jobid= -w rcXXXX bash +srun --jobid= --overlap --nodelist=rcXXXX --pty bash ``` where `rcXXXX` is the name of a node as shown e.g. by the `squeue` command. (Note that the format of the node names varies between different systems).
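The two `srun` invocations in the patch above differ only in the optional `--nodelist` argument, so they can be wrapped in a small shell function. The following is a minimal sketch, not part of Slurm or of any CSC tooling: the function name `attach_cmd` is hypothetical, and the job ID and node name are placeholders that you would take from the `squeue` output. The function only prints the command, so it can be inspected before it is run.

```bash
#!/bin/bash
# Hypothetical helper (an illustration, not a CSC or Slurm tool):
# print the srun command that opens a Bash shell inside a running job.
#   $1 = Slurm job ID (required)
#   $2 = node name, e.g. rcXXXX (optional; defaults to the master node)
attach_cmd() {
    local jobid="${1:-}" node="${2:-}"
    if [ -z "$jobid" ]; then
        echo "usage: attach_cmd <jobid> [node]" >&2
        return 1
    fi
    # --overlap lets the new job step share resources with running steps
    local cmd="srun --jobid=${jobid} --overlap"
    # Only add --nodelist when a specific node was requested
    if [ -n "$node" ]; then
        cmd+=" --nodelist=${node}"
    fi
    # --pty gives normal terminal behaviour in the started shell
    echo "${cmd} --pty bash"
}
```

For example, `attach_cmd 1234567` prints the command for the master node of job `1234567`, and `attach_cmd 1234567 rcXXXX` prints the variant for a specific node. Printing instead of executing is a deliberate choice here, since it keeps the helper side-effect free; the printed command can then be pasted into the terminal.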
From b98d7225a618bc2cd8e1982dacd6d9994eb2cf52 Mon Sep 17 00:00:00 2001 From: Iidahak <75577529+Iidahak@users.noreply.github.com> Date: Wed, 29 Apr 2026 10:52:36 +0300 Subject: [PATCH 107/139] Update python-data.md Add gpu installations to python-data.md --- docs/apps/python-data.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/docs/apps/python-data.md b/docs/apps/python-data.md index 95cbb9a72a..dc3423a0b5 100644 --- a/docs/apps/python-data.md +++ b/docs/apps/python-data.md @@ -53,6 +53,12 @@ Current versions in Roihu are: includes for example Scikit-learn 1.8.0, SciPy 1.17.1, Pandas 3.0.2 and JupyterLab 4.5.6. +- Roihu-GPU: (default version) `python-data/3.12-20.04`: installed in April 2026, + includes for example Cupy 14.0.1. + +- Roihu-GPU: `python-data/3.10-17.04`: installed in April 2026, + includes for example Cupy 13.6.0. + Current versions in Puhti and Mahti are: - (default version) `python-data/3.12-25.09`: installed in September 2025, From 33f518db8328a75689337f178d81a57a93a7a66a Mon Sep 17 00:00:00 2001 From: Iidahak <75577529+Iidahak@users.noreply.github.com> Date: Wed, 29 Apr 2026 11:17:28 +0300 Subject: [PATCH 108/139] Update python-data.md --- docs/apps/python-data.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/apps/python-data.md b/docs/apps/python-data.md index dc3423a0b5..1a4ca5d4f4 100644 --- a/docs/apps/python-data.md +++ b/docs/apps/python-data.md @@ -54,10 +54,10 @@ Current versions in Roihu are: and JupyterLab 4.5.6. - Roihu-GPU: (default version) `python-data/3.12-20.04`: installed in April 2026, - includes for example Cupy 14.0.1. + includes for example Cupy 14.0.1 in addition to the Python libraries in Roihu-CPU python-data. - Roihu-GPU: `python-data/3.10-17.04`: installed in April 2026, - includes for example Cupy 13.6.0. + includes for example Cupy 13.6.0 in addition to the Python libraries in Roihu-CPU python-data. 
Current versions in Puhti and Mahti are: From 1e0e139f980a0371cb5cf494d6027c5b8e174951 Mon Sep 17 00:00:00 2001 From: Leopekka Saraste <45951298+leopekkas@users.noreply.github.com> Date: Wed, 29 Apr 2026 11:35:41 +0300 Subject: [PATCH 109/139] Fix Roihu SSH instructions (#2978) * Fix SSH instruction typos * Add info about python script printing cert expiration --- docs/computing/connecting/ssh-keys.md | 17 +++++++++++------ 1 file changed, 11 insertions(+), 6 deletions(-) diff --git a/docs/computing/connecting/ssh-keys.md b/docs/computing/connecting/ssh-keys.md index 0266d2346d..37f4b9206e 100644 --- a/docs/computing/connecting/ssh-keys.md +++ b/docs/computing/connecting/ssh-keys.md @@ -139,7 +139,7 @@ We recommend trying the MyCSC workflow first, since it should work out-of-the-bo ![Sign and download SSH certificate](https://a3s.fi/docs-files/sign-download-ssh-cert.png 'Sign and download SSH certificate') !!! info "Where to store the SSH certificate?" - We **strongly** advice saving the certificate in the default folder for + We **strongly** advise saving the certificate in the default folder for SSH-related files (e.g. `~/.ssh` or `C:\Users\/.ssh`). Specifically, storing the certificate in the same directory as your SSH private key **and** naming it as `-cert.pub` will simplify @@ -209,8 +209,8 @@ following instructions illustrate only basic usage. directory as the script. If not, make sure to provide the full path to `csc_cert.py`. - 3. If you have an earlier certificate which is still valid, the tool - prints the expiration time and exits. + 3. **If you have an earlier certificate which is still valid, the tool + prints the expiration time and exits.** 4. If signing is needed, a login URL is displayed. Follow the link and authenticate. 5. Copy the 6-digit code displayed into your terminal and enter your @@ -219,7 +219,7 @@ following instructions illustrate only basic usage. your SSH agent. 
- The signed certificate is saved as `-cert.pub` (e.g., `~/.ssh/id_ed25519-cert.pub`). - 6. Your now have everything ready to **[Connect to Roihu following these instructions](ssh-unix.md#basic-usage)**. + 6. You now have everything ready to **[connect to Roihu following these instructions](ssh-unix.md#basic-usage)**. === "Windows" @@ -277,8 +277,13 @@ following instructions illustrate only basic usage. --- ### Check when your SSH certificate will expire - Each SSH certificate is valid for 24 hours. The expiration time can be - checked as follows: + +Each SSH certificate is valid for 24 hours. + +If you have an active certificate, the expiration time is printed when running the +[`csc_cert.py` tool](#option-2-certificate-helper-tool). + +The expiration time can also be checked as follows: === "Terminal (Linux, macOS, PowerShell, MobaXterm)" From e65e1b44f5bef3c95726de78475e4882db4e67b5 Mon Sep 17 00:00:00 2001 From: Iidahak <75577529+Iidahak@users.noreply.github.com> Date: Wed, 29 Apr 2026 11:56:00 +0300 Subject: [PATCH 110/139] Update python-data.md --- docs/apps/python-data.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/apps/python-data.md b/docs/apps/python-data.md index 1a4ca5d4f4..7957e4c9ee 100644 --- a/docs/apps/python-data.md +++ b/docs/apps/python-data.md @@ -54,10 +54,10 @@ Current versions in Roihu are: and JupyterLab 4.5.6. - Roihu-GPU: (default version) `python-data/3.12-20.04`: installed in April 2026, - includes for example Cupy 14.0.1 in addition to the Python libraries in Roihu-CPU python-data. + includes for example Cupy 14.0.1 in addition to the Python libraries available in Roihu-CPU python-data. - Roihu-GPU: `python-data/3.10-17.04`: installed in April 2026, - includes for example Cupy 13.6.0 in addition to the Python libraries in Roihu-CPU python-data. + includes for example Cupy 13.6.0 in addition to the Python libraries available in Roihu-CPU python-data. 
Current versions in Puhti and Mahti are: From e15b2b538b0d8f50389314bbcb69f3b0b053fe7f Mon Sep 17 00:00:00 2001 From: Nino Runeberg Date: Wed, 29 Apr 2026 14:46:55 +0300 Subject: [PATCH 111/139] Update OpenMPI version in cp2k.md --- docs/apps/cp2k.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/apps/cp2k.md b/docs/apps/cp2k.md index 8161a1d521..2d47db0fcd 100644 --- a/docs/apps/cp2k.md +++ b/docs/apps/cp2k.md @@ -119,7 +119,7 @@ double the number of cores the calculation should be at least 1.5 times faster. #SBATCH --hint=nomultithread module purge - module load gcc/15.2.0 openmpi/5.0.8 + module load gcc/15.2.0 openmpi/5.0.10 module load cp2k/2026.1 srun cp2k.psmp H2O-512.inp > H2O-512.out @@ -139,7 +139,7 @@ double the number of cores the calculation should be at least 1.5 times faster. #SBATCH --hint=nomultithread module purge - module load gcc/13.4.0 openmpi/5.0.8 + module load gcc/13.4.0 openmpi/5.0.10 module load cp2k/2026.1 export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK} From dd1925c841df0f19fef28002c1849a89bd3d32bf Mon Sep 17 00:00:00 2001 From: Jussi Enkovaara Date: Wed, 29 Apr 2026 15:30:31 +0300 Subject: [PATCH 112/139] Include Roihu and improve the documentation --- docs/apps/gdb.md | 17 +++++++++++------ 1 file changed, 11 insertions(+), 6 deletions(-) diff --git a/docs/apps/gdb.md b/docs/apps/gdb.md index 189484357d..e4d8145861 100644 --- a/docs/apps/gdb.md +++ b/docs/apps/gdb.md @@ -10,6 +10,7 @@ catalog: available_on: - Puhti - Mahti + - Roihu-CPU --- # gdb: GNU Debugger @@ -18,6 +19,7 @@ catalog: - Puhti - Mahti +- Roihu-CPU ## License @@ -34,18 +36,21 @@ correct the effects of a bug. In order to use the debugger the program has to be compiled with `-g` flag to enable symbolic debugging. -The debugger can either start a new process or attach to a running process. - -Example of starting a new process to be debugged: +One can either start the application under the debugger, or attach the debugger +to a running application. 
+In order to start the application under the debugger, first launch an +[interactive session](../computing/running/interactive-usage.md) and then execute: ``` gdb --tui ./myexecutable ``` -Example of attaching to an existing process (with process ID `pid`): - +In order to attach to a running application, +[first connect to a compute node](../computing/running/interactive-usage.md#connecting-to-a-compute-node-of-a-running-job). +Next, you need to find the process ID ``, e.g. by running the command `ps ux`, +and then attach the debugger to it: ``` -gdb --tui ./myexecutable pid +gdb --tui ./myexecutable ``` If additional arguments are needed for the program, one can use the option From 15fea5d7ceba143b529fcfd506afc79d23c47d42 Mon Sep 17 00:00:00 2001 From: Jussi Enkovaara Date: Wed, 29 Apr 2026 15:33:07 +0300 Subject: [PATCH 113/139] Add Roihu and improve description a bit --- docs/apps/cuda-gdb.md | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/docs/apps/cuda-gdb.md b/docs/apps/cuda-gdb.md index e82168cd00..5911bdf483 100644 --- a/docs/apps/cuda-gdb.md +++ b/docs/apps/cuda-gdb.md @@ -10,6 +10,7 @@ catalog: available_on: - Puhti - Mahti + - Roihu-GPU --- # cuda-gdb: CUDA debugger @@ -18,6 +19,7 @@ catalog: - Puhti: 10.2 - Mahti: 10.1 +- Roihu-GPU ## License @@ -32,14 +34,14 @@ CUDA programs. In order to use tool the CUDA code has to be compiled with the extra flags `-g` and `-G`.
-Next in an [interactive session](../computing/running/interactive-usage.md) one needs to +Next, in an [interactive session](../computing/running/interactive-usage.md) one needs to first load the CUDA module: ```bash module load cuda ``` -and then the debugging can be started by running: +and then start the application under the debugger by running: ```bash cuda-gdb ./cuda_program @@ -52,6 +54,6 @@ specific to CUDA debugging: * Focus Commands: Commands to query or switch the focus of the debugger * Configuration Commands: Commands to configure the CUDA-specific commands -Out of bonds accesses can be checked inside the debugger by activating +Out-of-bounds accesses can be checked inside the debugger by activating the memory checker with `set cuda memcheck on`. Alternatively the `cuda-memcheck` or [`compute-sanitizer`](compute-san.md) tool can be used outside of the debugger (`cuda-memcheck ./cuda_program` or `compute-sanitizer ./cuda_program`). From a0be52dd477578e2412e933763fbb3810adc1de3 Mon Sep 17 00:00:00 2001 From: Jussi Enkovaara Date: Wed, 29 Apr 2026 15:47:52 +0300 Subject: [PATCH 114/139] Include Roihu and improve a bit --- docs/apps/compute-san.md | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/docs/apps/compute-san.md b/docs/apps/compute-san.md index e64bbd9c64..88eea93892 100644 --- a/docs/apps/compute-san.md +++ b/docs/apps/compute-san.md @@ -10,6 +10,7 @@ catalog: available_on: - Puhti - Mahti + - Roihu-GPU --- # compute-sanitizer: functional correctness checking suite for CUDA programs @@ -18,6 +19,7 @@ catalog: - Puhti: 2022.2.0 - Mahti: 2021.3.0 +- Roihu-GPU ## License @@ -29,8 +31,9 @@ Usage is possible for both academic and commercial purposes. In order to use the tool, the CUDA code has to be compiled with the extra flags `-g` and `-G`.
-Debugging is started in an [interactive session](../computing/running/interactive-usage.md) -by running: +The tool can be run either in an [interactive session](../computing/running/interactive-usage.md) or in a +[batch job](../computing/running/submitting-jobs.md). The application is started as follows (in a batch job, prepend `srun` to +`compute-sanitizer`): ```bash compute-sanitizer --tool ./cuda_program @@ -44,3 +47,8 @@ where `` is one of the several sub-tools for different type of checks: * `initcheck`: can report cases where the GPU performs uninitialized accesses to global memory * `synccheck`: can report cases where the application is attempting invalid usages of synchronization primitives + +!!! info + Sometimes external libraries (e.g. MPI) generate a lot of false positives. + + From 9d8227298b1be53bee6504d7bee28c17da67bb88 Mon Sep 17 00:00:00 2001 From: Jussi Enkovaara Date: Wed, 29 Apr 2026 15:54:00 +0300 Subject: [PATCH 115/139] Add Roihu and improve the description --- docs/apps/pdb.md | 16 ++++++---------- 1 file changed, 6 insertions(+), 10 deletions(-) diff --git a/docs/apps/pdb.md b/docs/apps/pdb.md index c0e4c486a6..33fd90bfde 100644 --- a/docs/apps/pdb.md +++ b/docs/apps/pdb.md @@ -10,6 +10,8 @@ catalog: available_on: - Puhti - Mahti + - Roihu-CPU + - Roihu-GPU --- # pdb: Python debugger @@ -18,6 +20,8 @@ catalog: - Mahti: Any Python version - Puhti: Any Python version +- Roihu-CPU: Any Python version +- Roihu-GPU: Any Python version ## License @@ -29,16 +33,8 @@ Usage is possible for both academic and commercial purposes. debugger that supports breakpoints, stepping through the source line by line, inspection of stack frames, source code listing, etc. -There are two ways to use the debugger.
Within the code (or from the -interpreter): - -``` -import pdb -pdb.run('functbd(list_parameters)') -``` - -Alternatively pdb can also be invoked as a script to profile another script: - +In order to use the tool, first launch an [interactive session](../computing/running/interactive-usage.md) +and then start your Python program under the debugger: ``` python -m pdb myscript.py ``` From b18ec5759239474ef1d36372020b715b6645c373 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Mats=20Sj=C3=B6berg?= Date: Wed, 29 Apr 2026 16:28:22 +0300 Subject: [PATCH 116/139] Roihu ML base container (#2981) * Add ML base image example to Roihu getting started * Further fixes to ML base image description * Remove Satama warning * Add section lines for containers --------- Co-authored-by: leopekkas --- docs/support/tutorials/roihu.md | 62 +++++++++++++++++++++++++-------- 1 file changed, 48 insertions(+), 14 deletions(-) diff --git a/docs/support/tutorials/roihu.md b/docs/support/tutorials/roihu.md index b8910e1691..36c577620e 100644 --- a/docs/support/tutorials/roihu.md +++ b/docs/support/tutorials/roihu.md @@ -69,7 +69,7 @@ For platform-specific instructions, see: ### Roihu web interface !!! warning "Roihu web interface availability during the pilot period" - Roihu web interface will only be vailable after General Availability. Please connect via SSH during the pilot phase. + Roihu web interface will only be available after General Availability. Please connect via SSH during the pilot phase. The simplest way to connect to Roihu is to use the web interface. @@ -94,7 +94,7 @@ Quota extensions on Roihu must be separately applied for and properly motivated.
## Installing software -Before installing anything check if the software is already available: +Before installing anything, check if the software is already available: - [List of pre-installed applications](../../apps/by_availability.md#roihu) - `module spider ` @@ -115,15 +115,12 @@ Roihu supports Apptainer/Singularity containers for container installations. In most cases, ready-made Docker containers can be easily converted into an Apptainer image. Another option is to build your own container from scratch. You can build containers on top of Roihu base containers which have the same software stack as is available via the module system natively. -Base container are built on top of Rocky Linux 9. +Base containers are built on top of Rocky Linux 9. -!!! warning "Work in progress" - Satama is not yet available on Roihu, so the container images - referenced below cannot currently be accessed. - Satama support on Roihu is expected very soon. +--- === "Roihu CPU base container (~4 GB)" - Base containers available: + Base containers are available: - `satama.csc.fi/r_installation_spack/core-cpu-gcc-15.2.0:v2026_03` @@ -143,21 +140,21 @@ Base container are built on top of Rocky Linux 9. exec "$@" ``` - When building containers, set the Apptainer cache directory to `$TMPDIR` to avoid filling your home directory quota. + When building the containers, set the Apptainer cache directory to `$TMPDIR` to avoid filling your home directory quota. 
```bash export APPTAINER_CACHEDIR=$TMPDIR apptainer build --fakeroot container.sif container.def ``` - Now, you can run commands inside the container with clean environment and environment active as follows: + Now, you can run commands inside the container with the environment active as follows: ```bash apptainer run container.sif mycmd ``` === "Roihu GPU base container (~16 GB)" - Base containers available: + Base containers are available: - `satama.csc.fi/r_installation_spack/core-gpu-gcc-15.2.0-cuda-13.1.1:v2026_03` - `satama.csc.fi/r_installation_spack/core-gpu-gcc-14.3.0-cuda-12.9.1:v2026_03` @@ -179,19 +176,56 @@ Base container are built on top of Rocky Linux 9. exec "$@" ``` - When building the containers, set you cache directory to temporary directory to avoid filling you home directory quota. + When building the containers, set the Apptainer cache directory to `$TMPDIR` to avoid filling your home directory quota. ```bash export APPTAINER_CACHEDIR=$TMPDIR apptainer build --fakeroot container.sif container.def ``` - Now, you can run commands inside the container with clean environment and environment active as follows: + Now, you can run commands inside the container with the environment active as follows: ```bash apptainer run --nv container.sif mycmd ``` +=== "Roihu ML/AI GPU base containers" + Base containers for machine learning/AI are available. + + These containers are built on Rocky Linux 9.7 with Python 3, MPI and CUDA installed via RPM packages. 
+ *This approach produces a container that is not identical to Roihu's host system, but may be easier to extend in some cases than the normal base containers.* + + - `satama.csc.fi/r_installation_aida/ml-base:rocky9.7_gcc12_py3.12_cuda12.9` + - `satama.csc.fi/r_installation_aida/ml-base:rocky9.7_gcc12_py3.12_cuda13` + - `satama.csc.fi/r_installation_aida/pytorch-base:2.10_cuda13_roihu` - `ml-base` image with basic PyTorch 2.10 packages added + - `satama.csc.fi/r_installation_aida/pytorch:2.10_cuda13_roihu` - full PyTorch installation (same as CSC module) + - `satama.csc.fi/r_installation_aida/vllm:0.19.1_cuda12.9_roihu` - vLLM container (same as CSC module) + + Build definition file: + + ```sh title="container.def" + Bootstrap: docker + From: satama.csc.fi/r_installation_aida/ml-base:rocky9.7_gcc12_py3.12_cuda13 + + %post + # Build your application here: + ``` + + When building the containers, set the Apptainer cache directory to `$TMPDIR` to avoid filling your home directory quota. + + ```bash + export APPTAINER_CACHEDIR=$TMPDIR + apptainer build --fakeroot container.sif container.def + ``` + + Now, you can run commands inside the container. For example to launch python3: + + ```bash + apptainer exec --nv --bind=$(csc-common-bind) container.sif python3 + ``` + +--- + More details on working with containers in CSC's computing environment can be found from the links below: - [Overview of containers](../../computing/containers/overview.md) @@ -225,7 +259,7 @@ Basic workflow: * Define the resources for your job (time, memory, cores) * Load the required modules * Launch your executable -2. Submit your batch job into the queuing system +2. Submit your batch job to the queuing system 3. 
Wait for the job to finish, and look for its output See the relevant documentation below for detailed information: From 780b6948e7107aa4e8e4a643a2d7301ec820655a Mon Sep 17 00:00:00 2001 From: Leopekka Saraste <45951298+leopekkas@users.noreply.github.com> Date: Wed, 29 Apr 2026 17:04:17 +0300 Subject: [PATCH 117/139] Update Roihu Slurm docs (#2982) * Add a max nodes section for Roihu GPU partitions --- .../computing/running/batch-job-partitions.md | 30 +++++++++---------- 1 file changed, 15 insertions(+), 15 deletions(-) diff --git a/docs/computing/running/batch-job-partitions.md b/docs/computing/running/batch-job-partitions.md index 2aa56f9ea0..3eb3f36348 100644 --- a/docs/computing/running/batch-job-partitions.md +++ b/docs/computing/running/batch-job-partitions.md @@ -2,8 +2,8 @@ On CSC supercomputers, programs are run by submitting them to partitions, which are logical sets of nodes managed by the Slurm workload manager. -This page lists the available Slurm partitions on the Roihu, Puhti and Mahti -supercomputers, as well as explains their intended uses. Below are the general +This page lists the available Slurm partitions on the Roihu, Puhti, and Mahti +supercomputers and explains their intended uses. Below are the general guidelines for using the Slurm partitions on our systems: 1. **Use the `test` and `gputest` partitions for testing your code, not production.** @@ -13,15 +13,15 @@ guidelines for using the Slurm partitions on our systems: 2. **Only request multiple CPU cores if you know your program supports parallel processing.** Reserving multiple cores does not automatically speed up your job. Your program must be written in a way that the - computations can be done in multiple threads or processes. Reserving more + computations can be performed in multiple threads or processes. Reserving more cores does nothing by itself if your code is not parallelized, except making you queue for longer. 3. 
**Only use the GPU partitions if you know your program can utilize GPUs.** Running your computations using one or more GPUs is a very effective parallelization method for certain applications, but your program must be configured to use the CUDA platform. If you are unsure whether this is the - case, it is better to submit it to a CPU partition, since you will be - allocated resources sooner. If unsure, contact + case, it is better to submit your job to a CPU partition, since you will be + allocated resources sooner. If unsure, contact the [CSC Service Desk](../../support/contact.md). The following commands can be used to show information about available @@ -73,13 +73,13 @@ Roihu features the following partitions for submitting jobs to CPU nodes: Roihu features the following partitions for submitting jobs to GPU nodes: -| Partition | Allocation type | Time limit | Min GPUs | Max GPUs | [Node types](../systems-roihu.md#nodes) | Memory per GPU | Requirements | -|------------------|-----------------|------------|----------|----------|-----------------------------------------|------------------|--------------------| -| `gputest` | G | 15 minutes | 1 | 8 | GPU | 116 GiB + 95 GiB | | -| `gpuinteractive` | G | 12 hours | 1 | 1 | GPU ([slice](#roihu-gpu-slices)) | TBA | | -| `gpumedium` | G | 36 hours | 1 | 4 | GPU | 116 GiB + 95 GiB | | -| `gpularge` | G | 36 hours | 4 | 40 | GPU | 116 GiB + 95 GiB | [scalability test] | -| `vizinteractive` | G | 12 hours | 1 | 1 | V | 183 GiB + 44 GiB | | +| Partition | Allocation type | Time limit | Min GPUs | Max GPUs | Max nodes | [Node types](../systems-roihu.md#nodes) | Memory per GPU | Requirements | +|------------------|-----------------|------------|----------|----------|-----------|-----------------------------------------|------------------|--------------------| +| `gputest` | G | 15 minutes | 1 | 8 | 2 | GPU | 116 GiB + 95 GiB | | +| `gpuinteractive` | G | 12 hours | 1 | 1 | 1 | GPU ([slice](#roihu-gpu-slices)) | TBA | | +| 
`gpumedium` | G | 36 hours | 1 | 4 | 1 | GPU | 116 GiB + 95 GiB | | +| `gpularge` | G | 36 hours | 4 | 40 | 10 | GPU | 116 GiB + 95 GiB | [scalability test] | +| `vizinteractive` | G | 12 hours | 1 | 1 | 1 | V | 183 GiB + 44 GiB | | #### Roihu GPU slices @@ -103,7 +103,7 @@ available during the Roihu pilot phase: Local storage on Roihu M, L and GPU nodes is meant for storing temporary files only, not high-performance I/O. -High-performance local storage is available on Roihu XL and V nodes. Ideal for I/O intensive jobs. +High-performance local storage is available on Roihu XL and V nodes. Ideal for I/O-intensive jobs. Read more about: [Local storage on Roihu nodes](../disk.md#temporary-local-disk-areas) @@ -197,7 +197,7 @@ accessible to Two CPU partitions on Mahti allow you to reserve cores instead of full nodes. These are the `small` partition and the `interactive` -partition. In these partitions jobs are allocated 1.875 GiB of memory +partition. In these partitions, jobs are allocated 1.875 GiB of memory for each reserved CPU core, and the only way to reserve more memory is to reserve more cores. These partitions are also special in that you can reserve local storage on the node. It is important that you only @@ -216,7 +216,7 @@ anything in between. The `small` partition is intended for batch processing of small scale CPU compute workloads, that do not need a full node. It is also able to support applications that need local storage to perform -optimally. Many workloads that have traditionally used Puhti, may +optimally. Many workloads that have traditionally used Puhti may benefit from this partition. | Partition | Time
limit | Max CPU cores | Max nodes | [Node types](../systems-mahti.md) | Max memory per node | Max local storage
([NVMe]) per node | From 6a044205e3cf737af9036813cb5f275d87ebdc6b Mon Sep 17 00:00:00 2001 From: Jussi Enkovaara Date: Mon, 4 May 2026 09:05:47 +0300 Subject: [PATCH 118/139] Fix links --- docs/apps/compute-san.md | 4 ++-- docs/apps/gdb.md | 4 ++-- docs/apps/pdb.md | 2 +- 3 files changed, 5 insertions(+), 5 deletions(-) diff --git a/docs/apps/compute-san.md b/docs/apps/compute-san.md index 88eea93892..66a946c2a5 100644 --- a/docs/apps/compute-san.md +++ b/docs/apps/compute-san.md @@ -31,8 +31,8 @@ Usage is possible for both academic and commercial purposes. In order to use the tool, the CUDA code has to be compiled with the extra flags `-g` and `-G`. -Running the tool can be done either in an [interactive session](running/interactive-usage.md) or in a -[batch job](running/submitting-jobs.md). The application is started as (prepend `compute-sanitizer` by `srun` +Running the tool can be done either in an [interactive session](../computing/running/interactive-usage.md) or in a +[batch job](../computing/running/submitting-jobs.md). The application is started as (prepend `compute-sanitizer` by `srun` in a batch job): ```bash diff --git a/docs/apps/gdb.md b/docs/apps/gdb.md index e4d8145861..fc6fde1936 100644 --- a/docs/apps/gdb.md +++ b/docs/apps/gdb.md @@ -40,13 +40,13 @@ One can either start the application under the debugger, or attach the debugger to a running application. In order to start the application under the debugger, launch first an -[interactive session](running/interactive-usage.md) and execute then: +[interactive session](../computing/running/interactive-usage.md) and execute then: ``` gdb --tui ./myexecutable ``` In order to attach to a running application, -[connect first to a compute node](running/interactive-usage.md#connecting-to-a-compute-node-of-a-running-job). +[connect first to a compute node](../computing/running/interactive-usage.md#connecting-to-a-compute-node-of-a-running-job). Next, you need to find the process ID `` e.g. 
by running the command `ps ux`, attach then debugger to that: ``` diff --git a/docs/apps/pdb.md b/docs/apps/pdb.md index 33fd90bfde..d83b5449d6 100644 --- a/docs/apps/pdb.md +++ b/docs/apps/pdb.md @@ -33,7 +33,7 @@ Usage is possible for both academic and commercial purposes. debugger that supports breakpoints, stepping through the source line by line, inspection of stack frames, source code listing, etc. -In order to use the tool, launch first an [interactive session](running/interactive-usage.md) +In order to use the tool, launch first an [interactive session](../computing/running/interactive-usage.md) and start then your Python program under the debugger: ``` python -m pdb myscript.py From 7cb6ac43dcf3ee47e31ba4cc4e0d9a04d8e08d04 Mon Sep 17 00:00:00 2001 From: Jussi Enkovaara Date: Tue, 5 May 2026 08:44:10 +0300 Subject: [PATCH 119/139] Apply suggestions from code review Co-authored-by: Leopekka Saraste <45951298+leopekkas@users.noreply.github.com> --- docs/apps/compute-san.md | 2 +- docs/apps/cuda-gdb.md | 2 +- docs/apps/gdb.md | 2 +- docs/apps/pdb.md | 3 +-- 4 files changed, 4 insertions(+), 5 deletions(-) diff --git a/docs/apps/compute-san.md b/docs/apps/compute-san.md index 66a946c2a5..82db6786c1 100644 --- a/docs/apps/compute-san.md +++ b/docs/apps/compute-san.md @@ -10,7 +10,7 @@ catalog: available_on: - Puhti - Mahti - - Roihu-GPU + - Roihu --- # compute-sanitizer: functional correctness checking suite for CUDA programs diff --git a/docs/apps/cuda-gdb.md b/docs/apps/cuda-gdb.md index 5911bdf483..e33f7fe3c6 100644 --- a/docs/apps/cuda-gdb.md +++ b/docs/apps/cuda-gdb.md @@ -10,7 +10,7 @@ catalog: available_on: - Puhti - Mahti - - Roihu-GPU + - Roihu --- # cuda-gdb: CUDA debugger diff --git a/docs/apps/gdb.md b/docs/apps/gdb.md index fc6fde1936..76296cac22 100644 --- a/docs/apps/gdb.md +++ b/docs/apps/gdb.md @@ -10,7 +10,7 @@ catalog: available_on: - Puhti - Mahti - - Roihu-CPU + - Roihu --- # gdb: GNU Debugger diff --git a/docs/apps/pdb.md b/docs/apps/pdb.md 
index d83b5449d6..162657d16a 100644 --- a/docs/apps/pdb.md +++ b/docs/apps/pdb.md @@ -10,8 +10,7 @@ catalog: available_on: - Puhti - Mahti - - Roihu-CPU - - Roihu-GPU + - Roihu --- # pdb: Python debugger From 197032b8bd6bee060afdb39dc9bb7689010da68f Mon Sep 17 00:00:00 2001 From: Jussi Enkovaara Date: Tue, 5 May 2026 08:46:45 +0300 Subject: [PATCH 120/139] Update version information for compute-sanitizer --- docs/apps/compute-san.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/apps/compute-san.md b/docs/apps/compute-san.md index 82db6786c1..74a4af0e34 100644 --- a/docs/apps/compute-san.md +++ b/docs/apps/compute-san.md @@ -17,9 +17,9 @@ catalog: ## Available -- Puhti: 2022.2.0 -- Mahti: 2021.3.0 -- Roihu-GPU +- Puhti: version depends on the CUDA environment +- Mahti: version depends on the CUDA environment +- Roihu-GPU: version depends on the CUDA environment ## License @@ -27,7 +27,7 @@ Usage is possible for both academic and commercial purposes. ## Usage -[compute-sanitizer](https://docs.nvidia.com/cuda/compute-sanitizer/index.html) is a functional correctness checking suite included in the CUDA toolkit (starting from version 11). +[compute-sanitizer](https://docs.nvidia.com/cuda/compute-sanitizer/index.html) is a functional correctness checking suite included in the CUDA toolkit (`cuda` module needs to be loaded). In order to use the tool, the CUDA code has to be compiled with the extra flags `-g` and `-G`. 
From 11299fe6b7c72ecf97d62e47484061f4598de45d Mon Sep 17 00:00:00 2001 From: Leopekka Saraste <45951298+leopekkas@users.noreply.github.com> Date: Tue, 5 May 2026 12:30:01 +0300 Subject: [PATCH 121/139] WIP: Fix Roihu details in docs (#2985) * Fix Roihu documentation details * Update allas-in-roihu.md * Fix gpupilot partition name --------- Co-authored-by: kkmattil --- docs/computing/allas-in-roihu.md | 13 ++--- .../computing/running/batch-job-partitions.md | 2 +- docs/computing/systems-roihu.md | 50 ++++++++++--------- docs/support/tutorials/roihu-data.md | 4 +- 4 files changed, 37 insertions(+), 32 deletions(-) diff --git a/docs/computing/allas-in-roihu.md b/docs/computing/allas-in-roihu.md index 6c95b92744..d23abf0bfd 100644 --- a/docs/computing/allas-in-roihu.md +++ b/docs/computing/allas-in-roihu.md @@ -6,7 +6,7 @@ Object storage related tools are initialized in Roihu with command: ```text module load allas ``` -The allas module enables command: **allas-conf** that is used to configure **S3* connections to [Allas](../data/Allas/index.md) and [Lumi-O](https://docs.lumi-supercomputer.eu/storage/lumio/) object storage services and *Swift* based connections to Allas. Note that in Roihu _allas-conf: by default configures an S3 based connection to Allas, unlike to Puhti and Mahti where swift is used by default. +The allas module enables the command **allas-conf**, which is used to configure *S3* connections to the [Allas](../data/Allas/index.md) and [Lumi-O](https://docs.lumi-supercomputer.eu/storage/lumio/) object storage services and *Swift*-based connections to Allas. Note that on Roihu, _allas-conf_ configures an S3-based connection to Allas by default, unlike on Puhti and Mahti, where Swift is the default. In addition this module brings available a set of command line tools that can be used to operate with Allas and Lumi-O object storage services. These tools include:
These tools include: @@ -17,12 +17,13 @@ In addition this module brings available a set of command line tools that can be * [swift](../data/Allas/using_allas/swift_client.md) * [allas-backup](../data/Allas/using_allas/a_backup.md) and restic -You can check current object storage connections with command: +You can check current Allas and Lumi-O connections with command: ```text check-allas-connections ``` + ### S3 connection to Allas You can define a new S3 connection to Allas with command: @@ -38,7 +39,7 @@ allas-conf First allas-conf asks you to give your CSC password (Haka-password can't be used here). After that, if target project is not given as an argument, it lists all available Allas projects and asks user to pick one. ( Note that allas-conf has often problems with passwords that have characters that have special meaning in bash shell. For example space, *, ; and different quotation marks can cause allas-conf to fail). -The project specific access key pair is stored to to the configuration files of *aws*, *s3cmd* a *rclone* in your home directory. Due to this the configuration is not session specific, but applies to all sessions that utilize aws, s3cmd, rclone and a-commands. S3 keys are permanent so you need to run allas-conf command again only when you wish to set a new default S3 connection in use. Thus, in case of S3 based Allas usage, you normally need just to load the Allas module and then start using Allas. +The project specific access key pair is stored to the configuration files of *aws*, *s3cmd* a *rclone* in your home directory. Due to this the configuration is not session specific, but applies to all sessions that utilize aws, s3cmd, rclone and a-commands. S3 keys are permanent so you need to run allas-conf command again only when you wish to set a new default S3 connection in use. Thus, in case of S3 based Allas usage, you normally need just to load the Allas module and then start using Allas. 
In case of **aws** and **s3cmd**, only one connection is defined and running allas-conf overwrites the old default connections. @@ -57,7 +58,7 @@ Following connections are in use: |--------------------------------|----------------| | a-commands, aws and s3cmd | project_200222 | | rclone s3allas: | project_200222 | -| rclone a3allas-project200111: | project_200111 | +| rclone a3allas-project_200111: | project_200111 | | rclone a3allas-project_200222: | project_200222 | And with these settings all the commands below list the Allas buckets of project 200222. @@ -84,7 +85,7 @@ This connection is session specific and valid only for 8 hours. After the connec rclone lsd allas: ``` -A-commands need extra option `--swift` to use Swift based Allas connection. For example: +On Roihu, a-commands need the extra option `--swift` to use a Swift-based Allas connection. For example: ```text a-list --swift @@ -105,7 +106,7 @@ allas-conf --lumi The configuration process asks you to login to [https://auth.lumidata.eu](https://auth.lumidata.eu) where you can create an access key pair for your Lumi-project. You can then copy the _project name_, _access key_ and _secret key_ to the configuration process in Roihu. -Lumi-O connections use always S3 protocol and this configuration process changes *aws* and *s3cmd* commands to use the Lumi-O project as the default project. In case of *a-commands* you can add option `--lumi` to the command in order to make use Lumi-o. For example: +Lumi-O connections always use the S3 protocol, and this configuration process changes the *aws* and *s3cmd* commands to use the Lumi-O project as the default project. For *a-commands*, you can add the option `--lumi` to the command in order to use Lumi-O. For example:
For example: ```text a-list --lumi diff --git a/docs/computing/running/batch-job-partitions.md b/docs/computing/running/batch-job-partitions.md index 3eb3f36348..5ab61b3206 100644 --- a/docs/computing/running/batch-job-partitions.md +++ b/docs/computing/running/batch-job-partitions.md @@ -96,7 +96,7 @@ available during the Roihu pilot phase: | Partition | Allocation type | Time limit | Min nodes | Max nodes | [Node types](../systems-roihu.md#nodes) | |------------|-----------------|------------|-----------|-----------|-----------------------------------------| | `pilot` | N | 24 hours | 1 | 200 | M | -| `pilotgpu` | N | 48 hours | 1 | 60 | GPU | +| `gpupilot` | N | 48 hours | 1 | 60 | GPU | ### Local storage on Roihu nodes diff --git a/docs/computing/systems-roihu.md b/docs/computing/systems-roihu.md index 98df53840a..d9773e337f 100644 --- a/docs/computing/systems-roihu.md +++ b/docs/computing/systems-roihu.md @@ -2,8 +2,8 @@ !!! info "Note" This page contains preliminary information about CSC's next national - supercomputer Roihu, which is projected to be in researchers' use in spring - 2026. Please note that the details may evolve over time. + supercomputer Roihu, which is projected to be available for researchers + in summer 2026. Please note that the details may evolve over time. [See tentative schedule below](#schedule). ## Schedule @@ -32,20 +32,21 @@ graph LR; ``` **Roihu** will be installed in the same datacenter as LUMI, meaning that the -system will be brought up without disturbing Puhti and Mahti services. There +system will be brought up without disrupting Puhti and Mahti services. There will also be a margin between Roihu general availability and the decommissioning of Puhti and Mahti to enable users to migrate to Roihu without a break in HPC access. Puhti will be decommissioned in two steps: First, the computing services of Puhti will be shut down one month after the general availability of Roihu. 
This -means that jobs cannot be submitted on Puhti anymore. Puhti's storage will, -however, remain accessible until August 2026, after which Puhti will be retired +means that jobs cannot be submitted on Puhti anymore. However, Puhti's storage will +remain accessible until August 2026, after which it will be retired completely. Mahti will be closed in August 2026. If you have any data that you need to migrate from Puhti to Roihu, please be -prepared to do it during spring 2026, at the very latest in August 2026. CSC -will publish a detailed Roihu migration guide in early 2026. +prepared to do it by August 2026 at the very latest. +See [the Roihu migration guide](../support/tutorials/roihu-data.md) for instructions +on moving your data from Mahti and Puhti to Roihu. ## Compute @@ -62,9 +63,9 @@ memory of 1 536 GiB each. Each GPU node will be equipped with 4 Nvidia GH200 Grace Hopper superchips. Each GH200 superchip comprises one Hopper (H100) GPU and one Grace CPU with -72 ARM CPU cores which are connected with a very fast interface. Each +72 ARM CPU cores, which are connected via a very fast interface. Each GH200 superchip has 120 GiB CPU memory and 96 GiB GPU memory, providing -a total of 480 GiB CPU memory per node. This gives a total of 528 GPUs and +a total of 480 GiB CPU memory per node. This results in a total of 528 GPUs and 38 016 CPU cores in the whole GPU partition. The system will also provide four visualization nodes with two Nvidia L40 GPUs @@ -93,9 +94,10 @@ and users' personal Home directories. Separate file systems will ensure responsiveness of Home and ProjAppl even under heavy Scratch usage. The Scratch disk of Roihu will be more than ten times as performant as Puhti -Scratch. Specifically, the peak I/O performance of Roihu Scratch is expected to -be around 560 GB/s for read and 280 GB/s for write. The Home and ProjAppl will -have read and write bandwidths of 120 GB/s and 100 GB/s, respectively. +Scratch. 
Specifically, the peak I/O performance of Roihu Scratch is expected +to be around 560 GB/s for read and 280 GB/s for write. The Home and ProjAppl +disk areas are expected to have read and write bandwidths of 120 GB/s and +100 GB/s, respectively. Similar to Puhti, Roihu Scratch disk will be regularly cleaned of files that have not been accessed in the last 180 days to avoid inactive data accumulating @@ -119,34 +121,36 @@ on the system [partition](running/batch-job-partitions.md) they use: | R (shared nodes) | 20 GiB | | N (full nodes) | 600 GiB | | G (GPU nodes) | 150 GiB | -| Hugemem (XL) nodes | 1,6 TiB | -| VIZ nodes | 6,5 TiB | +| Hugemem (XL) nodes | 1.6 TiB | +| VIZ nodes | 6.5 TiB | As a new feature, users will also be able to request local disk mounts from a centralized pool of fast storage resources. This fast storage capacity will be -provided over the network and will appear as local scratch from within a Slurm -job. The total capacity of the disaggregated NVMe resource will be 307.2 TB. +provided over the network and will appear as local scratch storage from within a +Slurm job. The total capacity of the disaggregated NVMe resource will be 307.2 TB. ## Network -The network of Roihu is based on Infiniband NDR interconnect. Each CPU node +The network of Roihu is based on an InfiniBand NDR interconnect. Each CPU node will be connected to the network with one 200 Gb/s link, while in the GPU -partition there will be four 200 Gb/s links per node, one for each GPU. +partition there will be four 200 Gb/s links per node, one per GPU. ## Software and programming environment We intend to provide a comprehensive stack of pre-installed HPC libraries and -scientific software on Roihu similar to Puhti and Mahti. Some older and less -used software and software versions may, however, be deprecated. Please also +scientific software on Roihu, similar to those on Puhti and Mahti. Some older and less +used software packages and versions may, however, be deprecated. 
Please also note that any software compiled on Puhti and Mahti will most likely need to be -recompiled on Roihu. More information will be included in the migration guide. +recompiled on Roihu. +Instructions for installing applications are provided in +[the getting started with Roihu tutorial](../support/tutorials/roihu.md#installing-software). The programming environment of Roihu will otherwise be similar to Mahti, -including e.g. +including: * The GNU compiler stack * The AOCC compiler stack -* CUDA and Nvidia HPC Software Development Kit (SDK) +* CUDA and NVIDIA HPC Software Development Kit (SDK) * OpenMPI as the main MPI library Like Puhti and Mahti, Roihu will also feature a web interface for easy-to-use diff --git a/docs/support/tutorials/roihu-data.md b/docs/support/tutorials/roihu-data.md index d2e42177d3..8a16adba9b 100644 --- a/docs/support/tutorials/roihu-data.md +++ b/docs/support/tutorials/roihu-data.md @@ -89,7 +89,7 @@ * It is **not** recommended to transfer data to Roihu via Allas or your local workstation. Instead, CSC recommends using command-line based tools such as [`rsync`](#2-recommended-data-migration-methods) to **directly transfer data - from Puhti/Mahti to Roihu.** + from Puhti/Mahti/LUMI to Roihu.** !!! warning "Extremely important" @@ -108,7 +108,7 @@ * **`rsync`** is the preferred tool for transferring data from Puhti or Mahti to Roihu. [Read more about `rsync` here](../../data/moving/rsync.md). -* **We will use Puhti as an example**, but the exact same steps apply for Mahti +* **We will use Puhti as an example**, but the exact same steps apply to Mahti and LUMI as well. Simply replace all occurrences of `puhti` in host names etc. with `mahti`.
* All examples require that you've **forwarded your SSH agent** including your From 2e8b7ef523a02f7b4f40dc64791ca615bfa6938f Mon Sep 17 00:00:00 2001 From: Henry Barton Date: Wed, 6 May 2026 15:09:11 +0300 Subject: [PATCH 122/139] sbf exclusive use clarification (#2991) * --exclusive clarity * typos --------- Co-authored-by: Henry Barton --- docs/computing/roihu-disk.md | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/docs/computing/roihu-disk.md b/docs/computing/roihu-disk.md index 7bfa40e390..5f56ae7607 100644 --- a/docs/computing/roihu-disk.md +++ b/docs/computing/roihu-disk.md @@ -265,16 +265,17 @@ to get larger capacity fast storage for your jobs. ### Requesting storage from slurm -!!! info "Support in jobs on shared nodes coming Q3 2026" - At the present you can only request this storage for jobs that are making use of full nodes, - i.e. that are submitted with the `--exclusive` flag. Support for shared node jobs is coming - in Q3 2026. +!!! warning "You must request resources in conjunction with `--exclusive`" + At present, you can only request this storage for jobs that use full nodes, + i.e. jobs that are submitted with the `--exclusive` flag. Currently, if you do not specify this flag, + your job will fail but be marked "CANCELLED by 350", and you will not get any stdout or stderr + logs. This should be resolved once support for shared node jobs arrives in Q3 2026.
-To request flash storage to be mounted in an sbatch job you must add the following to the resource +To request flash storage to be mounted in an sbatch job, you must add the following to the resource request block of your script: ```bash - +#SBATCH --exclusive #BB_LUA SBF storagesize=20GB path=/run/sbb/ ``` From bd0511cf041793680d76d511154eda593da36b56 Mon Sep 17 00:00:00 2001 From: Jaan Tollander de Balsch Date: Thu, 7 May 2026 09:47:25 +0300 Subject: [PATCH 123/139] warning to avoid mixing archictures on roihu --- docs/computing/running/submitting-jobs.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/docs/computing/running/submitting-jobs.md b/docs/computing/running/submitting-jobs.md index c65b3d17f7..19fb77a166 100644 --- a/docs/computing/running/submitting-jobs.md +++ b/docs/computing/running/submitting-jobs.md @@ -67,6 +67,9 @@ parameters that can be used to select which data is displayed. sacct --starttime now-7days > sacct-output.txt ``` +!!! warning "Do not mix CPU and GPU architectures on Roihu" + Although the scheduler allows submitting jobs from CPU login nodes (x86_64) to the GPU queue (aarch64) and vice versa, this is **strongly discouraged**: binaries are not portable across architectures, and jobs may fail unpredictably. Always submit from a login node matching your target compute nodes.
+ ## More information - [Creating Roihu batch jobs](creating-job-scripts-roihu.md) From c77a766406c30e5255afe6db0e3c6885865cc1c1 Mon Sep 17 00:00:00 2001 From: Juha Lento Date: Thu, 7 May 2026 14:08:25 +0300 Subject: [PATCH 124/139] Roihu user spack tough case (#2989) * Update roihu-user-spack.md What to do when spack is not using upstream packages that it should --- docs/support/tutorials/roihu-user-spack.md | 142 ++++++++++++++++++--- 1 file changed, 125 insertions(+), 17 deletions(-) diff --git a/docs/support/tutorials/roihu-user-spack.md b/docs/support/tutorials/roihu-user-spack.md index 1cb5175f7f..24c31f83c1 100644 --- a/docs/support/tutorials/roihu-user-spack.md +++ b/docs/support/tutorials/roihu-user-spack.md @@ -5,7 +5,7 @@ manager. This document describes how regular users can use Spack to install additional software, libraries and applications, on top of the already installed software. -Throughout this document, we use the package `eccodes` as an example, +Throughout this document, we use the package `eccodes` as an example unless stated otherwise, and assume that the commands are run in user's custom software install root, usually somewhere under `/projappl//$USER`. @@ -20,6 +20,7 @@ For example, term "environment" in this document specifically refers to incomplete, or not fully tested in the current Roihu environment. Use with caution and report any issues you encounter. + ## When to install with Spack Spack installation is a viable option for "traditional" HPC software, @@ -53,10 +54,10 @@ settings from those scopes leak into our setup. See for details.
```console -source /appl/soft/spack/v2026_03/spack/share/spack/setup-env.sh -source /appl/soft/spack/v2026_03/spack/share/spack/bash/spack-completion.bash export SPACK_USER_CACHE_PATH=$TMPDIR/spack export SPACK_DISABLE_LOCAL_CONFIG=true +source /appl/soft/spack/v2026_03/spack/share/spack/setup-env.sh +source /appl/soft/spack/v2026_03/spack/share/spack/bash/spack-completion.bash ``` @@ -103,42 +104,48 @@ The packages in the upstream environment can be listed, for example, with command ```console -spack -c 'upstreams:gcc152_ec:install_tree:/appl/soft/spack/core/v2026_03/x86_64/gcc152_ec/install_dir' find +spack -E -c 'upstreams:gcc152_ec:install_tree:/appl/soft/spack/core/v2026_03/x86_64/gcc152_ec/install_dir' find -l ``` which gives ```output -- linux-rhel9-x86_64 / %c=gcc@15.2.0 --------------------------- -knem@1.1.4 +vshzxn2 knem@1.1.4 -- linux-rhel9-x86_64 / no compilers ---------------------------- -gcc@15.2.0 glibc@2.34 lustre@2.14.0 rdma-core@54.0 slurm@25.05.3 +nsx4vac gcc@15.2.0 45if5qv glibc@2.34 jix3h7v lustre@2.14.0 aeadm2n rdma-core@54.0 fkinzg5 slurm@25.05.3 -- linux-rhel9-zen5 / %c,cxx,fortran=gcc@15.2.0 ----------------- -openblas@0.3.30 openmpi@5.0.10 papi@7.2.0 +2d66cws hdf5@1.14.6 qskucwe openblas@0.3.30 jtu4mle openmpi@5.0.10 ad53otu papi@7.2.0 msn34e2 parallel-netcdf@1.14.1 -- linux-rhel9-zen5 / %c,cxx=gcc@15.2.0 ------------------------- -berkeley-db@18.1.40 c-blosc@1.21.6 eigen@5.0.1 gettext@1.0 krb5@1.22.2 lz4@1.10.0 ncurses@6.6 openssl@3.6.1 ucx@1.20.0 -bison@3.8.2 cmake@3.31.11 expat@2.7.4 hwloc@2.4.1 libaec@1.1.4 m4@1.4.21 nghttp2@1.67.1 python@3.14.3 zlib-ng@2.3.3 -boost@1.88.0 curl@8.18.0 ffmpeg@7.1 icu4c@74.2 libffi@3.5.2 mimalloc@3.2.7 openssh@10.2p1 snappy@1.2.1 zstd@1.5.7 +5jgdak6 berkeley-db@18.1.40 5huzpbm curl@8.18.0 ctjvy35 hwloc@2.4.1 gavswuq lz4@1.10.0 n2yix5u openssh@10.2p1 35almze ucx@1.20.0 +bwi6e4z bison@3.8.2 qjlzhxn eigen@5.0.1 sa7jjuo icu4c@74.2 57nb6no m4@1.4.21 3a2xrgw openssl@3.6.1 siciate zlib-ng@2.3.3 +niyho5a 
boost@1.88.0 m3d4hr2 expat@2.7.4 ro346yy krb5@1.22.2 b5ymicp mimalloc@3.2.7 be236m7 osu-micro-benchmarks@7.5.2 pvz4ljc zstd@1.5.7 +dhosjnx c-blosc@1.21.6 bkrrw3k ffmpeg@7.1 5ddtnbo libaec@1.1.4 ezw6kfg ncurses@6.6 qvwhx54 python@3.14.3 +dkj6m2a cmake@3.31.11 sgerwwx gettext@1.0 6neer47 libffi@3.5.2 zj62cb6 nghttp2@1.67.1 xjllls2 snappy@1.2.1 -- linux-rhel9-zen5 / %c,fortran=gcc@15.2.0 --------------------- -fftw@3.3.10 netcdf-fortran@4.6.2 netlib-lapack@3.12.1 netlib-scalapack@2.2.2 +gafrwzb fftw@3.3.10 ilpb6cd netcdf-fortran@4.6.2 lagqyzt netlib-lapack@3.12.1 stnq2qp netlib-scalapack@2.2.2 -- linux-rhel9-zen5 / %c=gcc@15.2.0 ----------------------------- -alsa-lib@1.2.15.3 diffutils@3.12 gmake@4.4.1 libbsd@0.12.2 libiconv@1.18 libtool@2.5.4 nasm@2.16.03 perl@5.42.0 readline@8.3 util-linux-uuid@2.41 -automake@1.18.1 findutils@4.10.0 gsl@2.8 libedit@3.1-20240808 libmd@1.1.0 libxcrypt@4.5.2 netcdf-c@4.9.3 pigz@2.8 sqlite@3.51.2 xz@5.8.2 -bzip2@1.0.8 gdbm@1.26 hdf5@1.14.6 libevent@2.1.12 libsigsegv@2.15 libxml2@2.15.1 numactl@2.0.19 pkgconf@2.5.1 tar@1.35 +w5oytoe alsa-lib@1.2.15.3 rd2uuxx gdbm@1.26 kcjhrtc libevent@2.1.12 ddibcjq libxcrypt@4.5.2 ejivk2i perl@5.42.0 rzwegeu tar@1.35 +os67qlb automake@1.18.1 3jmx5cd gmake@4.4.1 anayviw libiconv@1.18 6mr7zcy libxml2@2.15.1 b5e6x6u pigz@2.8 it55vd4 util-linux-uuid@2.41 +an5rzrc bzip2@1.0.8 lsilpvy gsl@2.8 i3l5zsd libmd@1.1.0 jwkph32 nasm@2.16.03 btxv56s pkgconf@2.5.1 vk7ckqw xz@5.8.2 +72jfleb diffutils@3.12 dl3mtdk libbsd@0.12.2 qurnpw5 libsigsegv@2.15 pcod7dd netcdf-c@4.9.3 lf5ljls readline@8.3 +p4i5zpo findutils@4.10.0 vbo25pf libedit@3.1-20240808 dbbnpoy libtool@2.5.4 zvxjta5 numactl@2.0.19 3m5q6wh sqlite@3.51.2 -- linux-rhel9-zen5 / %cxx=gcc@15.2.0 --------------------------- -kokkos@5.0.2 +t4kg5zu kokkos@5.0.2 -- linux-rhel9-zen5 / no compilers ------------------------------ -autoconf@2.72 compiler-wrapper@1.0 gcc-runtime@15.2.0 -==> 73 installed packages +5e4345x autoconf@2.72 iglv3xy compiler-wrapper@1.0 uaq7tuq 
gcc-runtime@15.2.0 +==> 75 installed packages ``` +The seven character string in front of the package name is a hash that we can use to refer to particular concretized spec or install. + ## How to set up a custom environment for the installs @@ -289,6 +296,107 @@ all looks fine, and we can proceed to installation spack install ``` + +## What to do when concretization is not using an upstream dependency + +Sometimes there are `-` entries in the concretization for the dependency +packages that would be available in the upstream. The concretization of +`eccodes` used packages installed in the upstream, so let's try to install something +slightly more difficult: + +```console +spack add 'vasp+hdf5+openmp' +spack concretize +``` + +```output + - rfhap2d vasp@6.5.1~cuda+fftlib+hdf5~libbeef~libxc+openmp+shmem~wannier90 build_system=makefile platform=linux os=rhel9 target=zen5 %c,cxx,fortran=gcc@15.2.0 +[^] iglv3xy ^compiler-wrapper@1.0 build_system=generic platform=linux os=rhel9 target=zen5 + - bska4tv ^fftw@3.3.10+mpi+openmp~pfft_patches+shared build_system=autotools patches:=872cff9 precision:=double,float platform=linux os=rhel9 target=zen5 %c,fortran=gcc@15.2.0 +[e] jzf6h3h ^gcc@15.2.0+binutils+bootstrap~graphite+libsanitizer~mold~nvptx~piclibs~profiled~strip build_system=autotools build_type=RelWithDebInfo languages:='c,c++,fortran' platform=linux os=rhel9 target=x86_64 + - qgv5zle ^gcc-runtime@15.2.0 build_system=generic platform=linux os=rhel9 target=zen5 +[e] 45if5qv ^glibc@2.34 build_system=autotools platform=linux os=rhel9 target=x86_64 +[^] 3jmx5cd ^gmake@4.4.1~guile build_system=generic platform=linux os=rhel9 target=zen5 %c=gcc@15.2.0 +[e] nsx4vac ^gcc@15.2.0+binutils+bootstrap~graphite+libsanitizer~mold~nvptx~piclibs~profiled~strip build_system=autotools build_type=RelWithDebInfo languages:='c,c++,fortran,jit' platform=linux os=rhel9 target=x86_64 +[^] uaq7tuq ^gcc-runtime@15.2.0 build_system=generic platform=linux os=rhel9 target=zen5 + - qlknyrh 
^hdf5@1.14.6~cxx+fortran~hl~ipo~java~map+mpi+shared~subfiling~szip~threadsafe+tools api=default build_system=cmake build_type=Release generator=make platform=linux os=rhel9 target=zen5 %c,fortran=gcc@15.2.0 +[^] dkj6m2a ^cmake@3.31.11~doc+ncurses+ownlibs~qtgui build_system=generic build_type=Release platform=linux os=rhel9 target=zen5 %c,cxx=gcc@15.2.0 +[^] 5huzpbm ^curl@8.18.0~gssapi~ldap~libidn2~librtmp~libssh~libssh2+nghttp2 build_system=autotools libs:=shared,static tls:=openssl platform=linux os=rhel9 target=zen5 %c,cxx=gcc@15.2.0 +[^] zj62cb6 ^nghttp2@1.67.1 build_system=autotools platform=linux os=rhel9 target=zen5 %c,cxx=gcc@15.2.0 +[^] ezw6kfg ^ncurses@6.6~symlinks+termlib abi=none build_system=autotools patches:=7a351bc platform=linux os=rhel9 target=zen5 %c,cxx=gcc@15.2.0 +[^] btxv56s ^pkgconf@2.5.1 build_system=autotools platform=linux os=rhel9 target=zen5 %c=gcc@15.2.0 +[^] siciate ^zlib-ng@2.3.3+compat+new_strategies+opt+pic+shared build_system=autotools platform=linux os=rhel9 target=zen5 %c,cxx=gcc@15.2.0 + - yswx35z ^netlib-scalapack@2.2.2~ipo~pic+shared build_system=cmake build_type=Release generator=make platform=linux os=rhel9 target=zen5 %c,fortran=gcc@15.2.0 +[^] qskucwe ^openblas@0.3.30~bignuma~consistent_fpcsr+dynamic_dispatch+fortran~ilp64+locking+pic+shared build_system=makefile symbol_suffix=none threads=openmp platform=linux os=rhel9 target=zen5 %c,cxx,fortran=gcc@15.2.0 + - zxmpyf5 ^openmpi@5.0.10+atomics~cuda~debug+fortran~gpfs~internal-hwloc~internal-libevent~internal-pmix~ipv6~java~lustre~memchecker~openshmem~rocm~romio+rsh~static~two_level_namespace+vt+wrapper-rpath build_system=autotools fabrics:=none romio-filesystem:=none schedulers:=none platform=linux os=rhel9 target=zen5 %c,cxx,fortran=gcc@15.2.0 +[^] 5e4345x ^autoconf@2.72 build_system=autotools platform=linux os=rhel9 target=zen5 +[^] 57nb6no ^m4@1.4.21+sigsegv build_system=autotools platform=linux os=rhel9 target=zen5 %c,cxx=gcc@15.2.0 +[^] 72jfleb ^diffutils@3.12 
build_system=autotools platform=linux os=rhel9 target=zen5 %c=gcc@15.2.0 +[^] qurnpw5 ^libsigsegv@2.15 build_system=autotools platform=linux os=rhel9 target=zen5 %c=gcc@15.2.0 +[^] os67qlb ^automake@1.18.1 build_system=autotools platform=linux os=rhel9 target=zen5 %c=gcc@15.2.0 +[^] ctjvy35 ^hwloc@2.4.1~cairo~cuda~gl~level_zero~libudev+libxml2~netloc~nvml~opencl~pci~rocm build_system=autotools libs:=shared,static platform=linux os=rhel9 target=zen5 %c,cxx=gcc@15.2.0 +[^] 6mr7zcy ^libxml2@2.15.1+pic~python+shared build_system=autotools platform=linux os=rhel9 target=zen5 %c=gcc@15.2.0 +[^] vk7ckqw ^xz@5.8.2~pic build_system=autotools libs:=shared,static platform=linux os=rhel9 target=zen5 %c=gcc@15.2.0 +[^] kcjhrtc ^libevent@2.1.12+openssl build_system=autotools platform=linux os=rhel9 target=zen5 %c=gcc@15.2.0 +[^] dbbnpoy ^libtool@2.5.4 build_system=autotools platform=linux os=rhel9 target=zen5 %c=gcc@15.2.0 +[^] p4i5zpo ^findutils@4.10.0 build_system=autotools patches:=440b954 platform=linux os=rhel9 target=zen5 %c=gcc@15.2.0 +[^] sgerwwx ^gettext@1.0+bzip2+curses+git~libunistring+libxml2+pic+shared+tar+xz build_system=autotools platform=linux os=rhel9 target=zen5 %c,cxx=gcc@15.2.0 +[^] rzwegeu ^tar@1.35 build_system=autotools zip=pigz platform=linux os=rhel9 target=zen5 %c=gcc@15.2.0 +[^] b5e6x6u ^pigz@2.8 build_system=makefile platform=linux os=rhel9 target=zen5 %c=gcc@15.2.0 +[^] zvxjta5 ^numactl@2.0.19 build_system=autotools platform=linux os=rhel9 target=zen5 %c=gcc@15.2.0 +[^] n2yix5u ^openssh@10.2p1+gssapi build_system=autotools platform=linux os=rhel9 target=zen5 %c,cxx=gcc@15.2.0 +[^] ro346yy ^krb5@1.22.2+shared build_system=autotools platform=linux os=rhel9 target=zen5 %c,cxx=gcc@15.2.0 +[^] bwi6e4z ^bison@3.8.2~color build_system=autotools platform=linux os=rhel9 target=zen5 %c,cxx=gcc@15.2.0 +[^] vbo25pf ^libedit@3.1-20240808 build_system=autotools platform=linux os=rhel9 target=zen5 %c=gcc@15.2.0 +[^] ddibcjq ^libxcrypt@4.5.2~obsolete_api 
build_system=autotools platform=linux os=rhel9 target=zen5 %c=gcc@15.2.0 +[^] ejivk2i ^perl@5.42.0+cpanm+opcode+open+shared+threads build_system=generic platform=linux os=rhel9 target=zen5 %c=gcc@15.2.0 +[^] 5jgdak6 ^berkeley-db@18.1.40+cxx~docs+stl build_system=autotools patches:=26090f4,b231fcc platform=linux os=rhel9 target=zen5 %c,cxx=gcc@15.2.0 +[^] an5rzrc ^bzip2@1.0.8~debug~pic+shared build_system=generic platform=linux os=rhel9 target=zen5 %c=gcc@15.2.0 +[^] rd2uuxx ^gdbm@1.26 build_system=autotools platform=linux os=rhel9 target=zen5 %c=gcc@15.2.0 +[^] lf5ljls ^readline@8.3 build_system=autotools patches:=21f0a03 platform=linux os=rhel9 target=zen5 %c=gcc@15.2.0 + - utr7ymg ^pmix@6.1.0~munge~python build_system=autotools platform=linux os=rhel9 target=zen5 %c=gcc@15.2.0 + - ccotrup ^prrte@4.1.0 build_system=autotools schedulers:=none platform=linux os=rhel9 target=zen5 %c=gcc@15.2.0 + - syzfpop ^flex@2.6.3+lex~nls build_system=autotools platform=linux os=rhel9 target=zen5 %c,cxx=gcc@15.2.0 + - 24qltll ^rsync@3.4.1 build_system=autotools platform=linux os=rhel9 target=zen5 %c,cxx=gcc@15.2.0 +[^] gavswuq ^lz4@1.10.0+pic build_system=makefile libs:=shared,static platform=linux os=rhel9 target=zen5 %c,cxx=gcc@15.2.0 +[^] 3a2xrgw ^openssl@3.6.1~docs+shared build_system=generic certs=system platform=linux os=rhel9 target=zen5 %c,cxx=gcc@15.2.0 + - m2nrnpe ^popt@1.19 build_system=autotools platform=linux os=rhel9 target=zen5 %c=gcc@15.2.0 +[^] anayviw ^libiconv@1.18 build_system=autotools libs:=shared,static platform=linux os=rhel9 target=zen5 %c=gcc@15.2.0 + - fre4ogj ^xxhash@0.8.3 build_system=makefile platform=linux os=rhel9 target=zen5 %c,cxx=gcc@15.2.0 +[^] pvz4ljc ^zstd@1.5.7+programs build_system=makefile compression:=none libs:=shared,static platform=linux os=rhel9 target=zen5 %c,cxx=gcc@15.2.0 +``` + +The above concretization shows that if we would +proceed to `spack install`, spack would rebuild and install `fftw`, `hdf5` and `openmpi` +packages, among 
some others, that should come from the upstream. Usually it is best to start
+fixing from the first such package in the concretization, here `fftw`.
+
+Let's first verify that the upstream package actually is suitable, by comparing
+the spec of the upstream package and the spec in the concretization. A particular upstream
+package spec can be printed by referring to its hash (note the syntax with `/`):
+
+```console
+spack -E -c 'upstreams:gcc152_ec:install_tree:/appl/soft/spack/core/v2026_03/x86_64/gcc152_ec/install_dir' spec /gafrwzb
+```
+
+In this case the specs for the `fftw` packages in the environment concretization and in the upstream look identical,
+so the upstream package should be fine.
+
+Let's update the vasp spec in the environment by specifying which exact `fftw` package to use as a dependency.
+For some reason `spack change` does not work here, but we can update the `spack.yaml` file in the environment
+with two commands (or alternatively edit the `spack.yaml` file directly):
+
+```console
+spack remove vasp
+spack add 'vasp+hdf5+openmp ^*/gafrwzb'
+```
+
+Notice the syntax for specifying the dependency using its hash.
+
+In this case fixing the first dependency also fixed all the others in the concretized spec, and
+we can proceed to `spack install`.
+ + ## Using the environment The actual software installs are in the directory From 9424b9472d25efc323d0b994d3754417d0ce6f48 Mon Sep 17 00:00:00 2001 From: Iidahak <75577529+Iidahak@users.noreply.github.com> Date: Thu, 7 May 2026 19:46:26 +0300 Subject: [PATCH 125/139] Update tykky.md Add documentation for uv support in Tykky --- docs/computing/containers/tykky.md | 23 +++++++++++++++-------- 1 file changed, 15 insertions(+), 8 deletions(-) diff --git a/docs/computing/containers/tykky.md b/docs/computing/containers/tykky.md index fb8f392f92..a857517ffb 100644 --- a/docs/computing/containers/tykky.md +++ b/docs/computing/containers/tykky.md @@ -129,14 +129,6 @@ To use a Tykky installation with [Jupyter](https://jupyter.org/), include correc The best way to use Jupyter in Puhti or Mahti is with [the web interface](../webinterface/index.md). See [Jupyter application page](../webinterface/jupyter.md#tykky-installations) for details how to use your own Tykky installation with Puhti web interface Jupyter. -### Pip with Conda - -To install some additional pip packages, add the `-r ` argument, e.g.: - -```bash -conda-containerize new -r req.txt --prefix env.yml -``` - ### Mamba The tool also supports using [Mamba](https://github.com/mamba-org/mamba) for installing @@ -148,6 +140,17 @@ flag. conda-containerize new --mamba --prefix env.yml ``` +### Pip/uv with Conda + +To install some additional pip packages, add the `-r ` argument, e.g.: + +```bash +conda-containerize new -r req.txt --prefix env.yml +``` +In addition, there exists a '--uv' flag that can be used to install the additional pip +packages with uv instead of pip, often resulting in faster installation. + + ### End-to-end example Create a new Conda-based installation using the previous `env.yml` file. @@ -214,6 +217,10 @@ modifying a Conda installation apply here as well. Note that the Python version used by `pip-containerize` is the first Python executable found in the path, so it's affected by loaded modules. 
+Alternatively, pip can be replaced with uv by passing a '--uv' flag. Packages listed in +the requirements file will then be installed using 'uv pip install', enabling faster +dependency installation. + **Important:** This Python can not be itself container-based as nesting is not possible! An additional `--slim` flag exists, which will instead use a pre-built minimal Python From 728ce6b14cc0bae1b3e6289c7dbee282c0c1cfd9 Mon Sep 17 00:00:00 2001 From: Iidahak <75577529+Iidahak@users.noreply.github.com> Date: Thu, 7 May 2026 19:48:21 +0300 Subject: [PATCH 126/139] Update tykky.md --- docs/computing/containers/tykky.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/computing/containers/tykky.md b/docs/computing/containers/tykky.md index a857517ffb..9176206cdb 100644 --- a/docs/computing/containers/tykky.md +++ b/docs/computing/containers/tykky.md @@ -147,8 +147,8 @@ To install some additional pip packages, add the `-r ` argument, e.g.: ```bash conda-containerize new -r req.txt --prefix env.yml ``` -In addition, there exists a '--uv' flag that can be used to install the additional pip -packages with uv instead of pip, often resulting in faster installation. +In addition, there exists a '--uv' flag that can be used to install additional pip +packages with uv instead of pip, resulting in faster installation. 
### End-to-end example From 0b81897c2e7ddcf35cd2163cdd3ef8189809a547 Mon Sep 17 00:00:00 2001 From: Juha Lento Date: Fri, 8 May 2026 06:40:43 +0300 Subject: [PATCH 127/139] Update roihu-user-spack.md variable was not expanded as I expected, but seems to create the source-cache directory under the environment root - fine --- docs/support/tutorials/roihu-user-spack.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/support/tutorials/roihu-user-spack.md b/docs/support/tutorials/roihu-user-spack.md index 24c31f83c1..c5c2599ebf 100644 --- a/docs/support/tutorials/roihu-user-spack.md +++ b/docs/support/tutorials/roihu-user-spack.md @@ -174,7 +174,7 @@ default settings, so that they do not point to default system locations (which are not writable by users): ```console -spack config add 'config:source_cache:$spack_user_cache/source-cache' +spack config add 'config:source_cache:source-cache' ``` The chosen upstream environment and the location of our custom environment's From dc06edf276e8fe3ade365ae37f12a5bc8f0acd34 Mon Sep 17 00:00:00 2001 From: Iidahak <75577529+Iidahak@users.noreply.github.com> Date: Fri, 8 May 2026 09:55:42 +0300 Subject: [PATCH 128/139] Update tykky.md --- docs/computing/containers/tykky.md | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/docs/computing/containers/tykky.md b/docs/computing/containers/tykky.md index 9176206cdb..73493d4851 100644 --- a/docs/computing/containers/tykky.md +++ b/docs/computing/containers/tykky.md @@ -140,16 +140,20 @@ flag. conda-containerize new --mamba --prefix env.yml ``` -### Pip/uv with Conda +### Install additional packages with pip or uv To install some additional pip packages, add the `-r ` argument, e.g.: ```bash conda-containerize new -r req.txt --prefix env.yml ``` -In addition, there exists a '--uv' flag that can be used to install additional pip -packages with uv instead of pip, resulting in faster installation. 
+By default, pip is used to install the dependencies. Additionally, using the '--uv' flag +together with the '--mamba' flag enables faster installation. The corresponding env.yml file must also include uv. + +```bash +conda-containerize new -r req.txt --mamba --uv --prefix env.yml +``` ### End-to-end example From 1bbcb270af4e27cb1336080c1dd0a152d324ce58 Mon Sep 17 00:00:00 2001 From: Iidahak <75577529+Iidahak@users.noreply.github.com> Date: Fri, 8 May 2026 09:58:12 +0300 Subject: [PATCH 129/139] Update tykky.md --- docs/computing/containers/tykky.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/computing/containers/tykky.md b/docs/computing/containers/tykky.md index 73493d4851..120f24c87f 100644 --- a/docs/computing/containers/tykky.md +++ b/docs/computing/containers/tykky.md @@ -140,7 +140,7 @@ flag. conda-containerize new --mamba --prefix env.yml ``` -### Install additional packages with pip or uv +### Add additional packages with pip or uv To install some additional pip packages, add the `-r ` argument, e.g.: @@ -221,12 +221,12 @@ modifying a Conda installation apply here as well. Note that the Python version used by `pip-containerize` is the first Python executable found in the path, so it's affected by loaded modules. -Alternatively, pip can be replaced with uv by passing a '--uv' flag. Packages listed in +**Important:** This Python can not be itself container-based as nesting is not possible! + +Alternatively, uv can be used to manage the Python installations by passing a '--uv' flag. Packages listed in the requirements file will then be installed using 'uv pip install', enabling faster dependency installation. -**Important:** This Python can not be itself container-based as nesting is not possible! - An additional `--slim` flag exists, which will instead use a pre-built minimal Python container with a much newer version of Python as a base. 
Without the `--slim` flag, the whole host system is available, whereas with the flag the system installations (i.e. From 45ac6ab4c407b5d2f893d572924cc42be91e7c65 Mon Sep 17 00:00:00 2001 From: Iidahak <75577529+Iidahak@users.noreply.github.com> Date: Fri, 8 May 2026 10:07:27 +0300 Subject: [PATCH 130/139] Update tykky.md --- docs/computing/containers/tykky.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/computing/containers/tykky.md b/docs/computing/containers/tykky.md index 120f24c87f..8656ab7788 100644 --- a/docs/computing/containers/tykky.md +++ b/docs/computing/containers/tykky.md @@ -218,15 +218,15 @@ pip-containerize new --prefix req.txt where `req.txt` is a standard pip requirements file. The notes and options for modifying a Conda installation apply here as well. +Alternatively, uv can be used to manage the Python installations by passing a '--uv' flag. Packages listed in +the requirements file will then be installed using 'uv pip install', enabling faster +dependency installation. + Note that the Python version used by `pip-containerize` is the first Python executable found in the path, so it's affected by loaded modules. **Important:** This Python can not be itself container-based as nesting is not possible! -Alternatively, uv can be used to manage the Python installations by passing a '--uv' flag. Packages listed in -the requirements file will then be installed using 'uv pip install', enabling faster -dependency installation. - An additional `--slim` flag exists, which will instead use a pre-built minimal Python container with a much newer version of Python as a base. Without the `--slim` flag, the whole host system is available, whereas with the flag the system installations (i.e. 
From e6795981ec794fa765f167c0ee5ba73972717445 Mon Sep 17 00:00:00 2001 From: Iidahak <75577529+Iidahak@users.noreply.github.com> Date: Fri, 8 May 2026 14:27:11 +0300 Subject: [PATCH 131/139] Update docs/computing/containers/tykky.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: Mats Sjöberg --- docs/computing/containers/tykky.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/computing/containers/tykky.md b/docs/computing/containers/tykky.md index 8656ab7788..8d7fe312e2 100644 --- a/docs/computing/containers/tykky.md +++ b/docs/computing/containers/tykky.md @@ -148,8 +148,8 @@ To install some additional pip packages, add the `-r ` argument, e.g.: conda-containerize new -r req.txt --prefix env.yml ``` -By default, pip is used to install the dependencies. Additionally, using the '--uv' flag -together with the '--mamba' flag enables faster installation. The corresponding env.yml file must also include uv. +By default, pip is used to install the dependencies. Additionally, using the `--uv` flag +together with the `--mamba` flag enables faster installation. The corresponding `env.yml` file must also include the `uv` package manager. 
```bash conda-containerize new -r req.txt --mamba --uv --prefix env.yml From f3e2ba9f36053b139d142c1829ef7bfe609cea00 Mon Sep 17 00:00:00 2001 From: Iidahak <75577529+Iidahak@users.noreply.github.com> Date: Fri, 8 May 2026 14:27:22 +0300 Subject: [PATCH 132/139] Update docs/computing/containers/tykky.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: Mats Sjöberg --- docs/computing/containers/tykky.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/computing/containers/tykky.md b/docs/computing/containers/tykky.md index 8d7fe312e2..8a6a0dcd23 100644 --- a/docs/computing/containers/tykky.md +++ b/docs/computing/containers/tykky.md @@ -218,8 +218,8 @@ pip-containerize new --prefix req.txt where `req.txt` is a standard pip requirements file. The notes and options for modifying a Conda installation apply here as well. -Alternatively, uv can be used to manage the Python installations by passing a '--uv' flag. Packages listed in -the requirements file will then be installed using 'uv pip install', enabling faster +Alternatively, uv can be used to manage the Python installations by passing a `--uv` flag. Packages listed in +the requirements file will then be installed using `uv pip install`, enabling faster dependency installation. 
Note that the Python version used by `pip-containerize` is the first Python executable From c9d0477579139f0cbbdd0677d2406c3d21cf16ab Mon Sep 17 00:00:00 2001 From: Kylli Ek Date: Mon, 11 May 2026 16:47:48 +0300 Subject: [PATCH 133/139] Roihu updates to GIS tutorials (#3001) * Roihu updates to GIS tutorials --- docs/support/tutorials/gis/eo_guide.md | 25 +++++++++---------- docs/support/tutorials/gis/gdal.md | 2 +- docs/support/tutorials/gis/virtual-rasters.md | 19 ++++---------- 3 files changed, 18 insertions(+), 28 deletions(-) diff --git a/docs/support/tutorials/gis/eo_guide.md b/docs/support/tutorials/gis/eo_guide.md index 34a12623df..e390f3883d 100644 --- a/docs/support/tutorials/gis/eo_guide.md +++ b/docs/support/tutorials/gis/eo_guide.md @@ -26,13 +26,13 @@ For working with EO data in general, there are three main options: CSC services do not fit well in this categorization, as they provide some features from all of these. **CSC computing services provide a lot of computing power and storage space, and they are free of charge** for Finnish researchers for academic or educational use. -At CSC, EO data can be processed and analyzed using a supercomputer, for example [supercomputer Puhti](../../../computing/systems-puhti.md), or a virtual machine in the [cPouta cloud service](../../../cloud/pouta/index.md). Puhti's computing capacity can hardly be compared to any other EO service, in both available processing power and amount of memory. Both Puhti and cPouta have also GPU resources, which are especially useful for large simulations and deep learning use cases. +At CSC, EO data can be processed and analyzed using a supercomputer, for example [supercomputer Roihu](../../../computing/systems-roihu.md), or a virtual machine in the [cPouta cloud service](../../../cloud/pouta/index.md). Roihu 's computing capacity can hardly be compared to any other EO service, in both available processing power and amount of memory. 
Both Roihu and cPouta have also GPU resources, which are especially useful for large simulations and deep learning use cases. -Puhti has a lot of [pre-installed applications](#what-applications-are-available-on-puhti), so it is a ready-to-use environment. cPouta virtual machines are similar to commercial cloud services, where all set-up and installations are done by the end-user. In general, both services only support Linux software. +Roihu has a lot of [pre-installed applications](#what-applications-are-available-on-roihu), so it is a ready-to-use environment. cPouta virtual machines are similar to commercial cloud services, where all set-up and installations are done by the end-user. In general, both services only support Linux software. -At CSC, [some Finnish EO datasets](#eo-data-at-csc) are available for direct use. In many cases, however, downloading EO data from other services (see [list of EO data download services](#eo-data-download-services)) is a required step of the process. Puhti and cPouta provide local storage of ~1-20 Tb. For more storage space, [Allas object storage](../../../data/Allas/index.md) can be used. +At CSC, [some Finnish EO datasets](#eo-data-at-csc) are available for direct use. In many cases, however, downloading EO data from other services (see [list of EO data download services](#eo-data-download-services)) is a required step of the process. Roihu and cPouta provide local storage of ~1-20 Tb. For more storage space, [Allas object storage](../../../data/Allas/index.md) can be used. -Using CSC computing services requires basic Linux skills and ability to use some scripting language (for example Python, R, Julia) or command-line tools. In addition, supercomputers and virtual machines require you to understand some specific concepts, so it takes a few hours to get started. 
The [Puhti web interface](https://www.puhti.csc.fi/) makes the start considerably easier, providing a desktop environment in the web browser, which enables the use of tools with Graphical User Interfaces (GUI) and also tools like R Studio and JupyterLab for an easy start with R, Python and Julia. +Using CSC computing services requires basic Linux skills and ability to use some scripting language (for example Python, R, Julia) or command-line tools. In addition, supercomputers and virtual machines require you to understand some specific concepts, so it takes a few hours to get started. The [Roihu web interface](https://www.Roihu.csc.fi/) makes the start considerably easier, providing a desktop environment in the web browser, which enables the use of tools with Graphical User Interfaces (GUI) and also tools like R Studio and JupyterLab for an easy start with R, Python and Julia. ## What data do I need? @@ -108,7 +108,6 @@ Commercial datasets are usually available from data provider, while open dataset Some Finnish EO datasets are available locally at CSC. [Paituli STAC](https://paituli.csc.fi/stac.html) includes all raster data available at CSC. -* **Landsat mosaics** in Puhti. * **Sentinel-2 L2A data**, selection of cloud-free tiles in Allas. * [More information and list of all spatial datasets in CSC computing environment](../../../data/datasets/spatial-data-in-csc-computing-env.md) @@ -186,15 +185,15 @@ There is no single software perfect for every task and taste. The right software * Proprietary tools need licenses which may be expensive and/or limiting the use of the tool * FOSS (free and open source software) allows the user to inspect the source code and provide high level insights in its functionality -### What applications are available on Puhti? +### What applications are available on Roihu? -[**FORCE**](../../../apps/force.md) - Framework for Operational Radiometric Correction for Environmental monitoring. 
All-in-one processing engine with CLI for EO image archives. [FORCE example for Puhti](https://github.com/csc-training/geocomputing/tree/master/force) +[**FORCE**](../../../apps/force.md) - Framework for Operational Radiometric Correction for Environmental monitoring. All-in-one processing engine with CLI for EO image archives. [FORCE example for Roihu](https://github.com/csc-training/geocomputing/tree/master/force) -[**GDAL (OGR)**](../../../apps/gdal.md) - Geospatial Data Abstraction Library. Collection of command-line tools for accessing and transforming geospatial data. It is relatively fast and requires little computational resources. GDAL supports reading data directly from the Internet or object storage. GDAL is included in many other tools for data reading and writing. [GDAL example for Puhti](https://github.com/csc-training/geocomputing/tree/master/gdal) +[**GDAL (OGR)**](../../../apps/gdal.md) - Geospatial Data Abstraction Library. Collection of command-line tools for accessing and transforming geospatial data. It is relatively fast and requires little computational resources. GDAL supports reading data directly from the Internet or object storage. GDAL is included in many other tools for data reading and writing. [GDAL example for Roihu](https://github.com/csc-training/geocomputing/tree/master/gdal) -[**Julia**](../../../apps/julia.md) - Puhtis Julia installation does not include any geospatial packages, but they can be installed by the user. [JuliaGeo](https://github.com/JuliaGeo) provides an overview of packages for geospatial data. +[**Julia**](../../../apps/julia.md) - Roihu's Julia installation does not include any geospatial packages, but they can be installed by the user. [JuliaGeo](https://github.com/JuliaGeo) provides an overview of packages for geospatial data. -[**Matlab**](../../../apps/matlab.md) - you can run Matlab jobs on Puhti conveniently from your own computers Matlab installation. 
+[**Matlab**](../../../apps/matlab.md) - you can run Matlab jobs on Roihu conveniently from your own computer's Matlab installation. [**Orfeo Toolbox (OTB)**](../../../apps/otb.md) - offers a wide variety of applications from ortho-rectification or pansharpening, all the way to classification, SAR processing, and much more. Orfeo Toolbox is available as CLI, GUI and via Python interface. @@ -205,13 +204,13 @@ There is no single software perfect for every task and taste. The right software [**QGIS**](../../../apps/qgis.md) - open source tool with GUI for working with spatial data including limited multispectral image processing capabilities. GUI with batch processing possibility and Python interface. Used for example for visualization, map algebra and other raster processing. Many plug-ins available, for EO data processing, check out the [QGIS Semi-automatic classification plugin](https://fromgistors.blogspot.com/p/semi-automatic-classification-plugin.html). -[**R**](../../../apps/r-env-for-gis.md) - Puhti R installation includes a lot of geospatial packages, including several useful for EO data processing, such as `terra`, `CAST`, `raster`, `rstac` and `spacetime`. +[**R**](../../../apps/r-env-for-gis.md) - Roihu R installation includes a lot of geospatial packages, including several useful for EO data processing, such as `terra`, `CAST`, `raster`, `rstac` and `spacetime`. [**Sen2Cor**](../../../apps/sen2cor.md) - a command-line tool for Sentinel-2 Level 2A product generation and formatting. [**Sen2mosaic**](../../../apps/sen2cor.md) - a command-line tool to download, preprocess and mosaic Sentinel-2 data. -[**SNAP**](../../../apps/snap.md) - ESA Sentinel Application Platform. Tool for processing of Sentinel data (+ support for other data sources). GUI, CLI (Graph Processing Tool, GPT) and Python interfaces. [SNAP GPT example for Puhti](https://github.com/csc-training/geocomputing/tree/master/snap). 
+[**SNAP**](../../../apps/snap.md) - ESA Sentinel Application Platform. Tool for processing of Sentinel data (+ support for other data sources). GUI, CLI (Graph Processing Tool, GPT) and Python interfaces. [SNAP GPT example for Roihu](https://github.com/csc-training/geocomputing/tree/master/snap). [**allas'']](../../../apps/allas.md) - tools for working with S3 storage, inc CSC Allas, CDSE S3 etc: `rclone` and `s3cmd`. @@ -246,7 +245,7 @@ If you are interested in using CSC services for your EO research, please make yo * Find information about services and how to use them in [CSC's documentation pages](../../../index.md) * For information on geocomputing in CSC environment, checkout the collection of [CSC's geocomputing learning materials](https://research.csc.fi/gis-learning-materials) and [CSC geocomputing examples on Github](https://github.com/csc-training/geocomputing) -You can find all the ways that you can get help from CSC specialists via [CSC contact page](../../contact.md). We are happy to help with technical problems around our services and are open for suggestions on which software should be installed to Puhti, or what kind of courses should be offered or materials/examples should be prepared. Please let us know, if you would like to add a service to this page or find anything unclear. +You can find all the ways that you can get help from CSC specialists via [CSC contact page](../../contact.md). We are happy to help with technical problems around our services and are open for suggestions on which software should be installed to Roihu, or what kind of courses should be offered or materials/examples should be prepared. Please let us know, if you would like to add a service to this page or find anything unclear. 
## Acknowledgement diff --git a/docs/support/tutorials/gis/gdal.md b/docs/support/tutorials/gis/gdal.md index 2a3d782e14..042f467b21 100644 --- a/docs/support/tutorials/gis/gdal.md +++ b/docs/support/tutorials/gis/gdal.md @@ -7,7 +7,7 @@ * [Install GDAL](https://gdal.org/download.html#binaries). If you have installed already QGIS or R/Python GIS packages, then you should have GDAL already. Just find where it is, look for example for OSGeo shell, Anaconda Prompt or gdalinfo file from your disk. * Open terminal, OSGeo shell, Anaconda Prompt or Windows Command Prompt. * (See the basic command-line help at the end of this page.) -* [GDAL is available in Puhti](../../../apps/gdal.md). +* [GDAL is available in CSC supercomputers](../../../apps/gdal.md). ## Main tools diff --git a/docs/support/tutorials/gis/virtual-rasters.md b/docs/support/tutorials/gis/virtual-rasters.md index 3967bbc6a7..6bf1b0fe45 100644 --- a/docs/support/tutorials/gis/virtual-rasters.md +++ b/docs/support/tutorials/gis/virtual-rasters.md @@ -4,7 +4,7 @@ Technically a virtual raster is just a small xml file that tells GDAL where the actual data files are, but from user's point of view virtual rasters can be treated much like any other raster format. Virtual rasters can include raster data in any file format GDAL supports. Virtual rasters are useful because they allow handling of large datasets as if they were a single file eliminating need for locating correct files. -For example the [NLS 2m and 10m DEM are available in Puhti](../../../data/datasets/spatial-data-in-csc-computing-env.md). These datasets are split into a number of tif files (map sheets) and if we wanted for example to calculate zonal statistics for some areas scattered around whole Finland we would have to somehow find out which file covers which area and compute statistics from correct file. Further complications would arise if an area we want to calculate statistics for happens to lie at a border between two or more map sheets. 
Similar issues with edge effects would arise for example when using focal functions where information from surrounding files is also needed. These issues can be easily avoided by creating a virtual raster for the whole study area and above mentioned problems will be automatically taken care of by GDAL. +For example the [NLS 2m and 10m DEM are available in Roihu](../../../data/datasets/spatial-data-in-csc-computing-env.md). These datasets are split into a number of tif files (map sheets) and if we wanted for example to calculate zonal statistics for some areas scattered around whole Finland we would have to somehow find out which file covers which area and compute statistics from correct file. Further complications would arise if an area we want to calculate statistics for happens to lie at a border between two or more map sheets. Similar issues with edge effects would arise for example when using focal functions where information from surrounding files is also needed. These issues can be easily avoided by creating a virtual raster for the whole study area and above mentioned problems will be automatically taken care of by GDAL. It is possible to use virtual rasters so, that only the small xml-file is stored locally and the big raster files are in Allas, Amazon S3, publicly on server or any other place supported by GDAL virtual drivers. The data is moved to local only for the area and zoom level requested when the virtual raster is opened. The best performing format to save your raster data in remote service is [Cloud optimized GeoTIFF](https://www.cogeo.org/), but other formats are also possible. 
@@ -21,20 +21,12 @@ With **GDAL** it is easy to crop a small part out of the big virtual raster: **R**: -Terra: ``` library(terra) vrt <- rast("test.vrt") data = crop(vrt , ext(614500, 644500, 6640500, 6668500)) ``` -Raster: -``` -library(raster) -vrt <- raster("test.vrt") -data = crop(vrt , extent(614500, 644500, 6640500, 6668500)) -``` - **Python**: ``` @@ -52,7 +44,7 @@ It's possible to work with very large virtual rasters when the analysis doesn't ### Working with large virtual rasters visually -It is worth noting that while running some analysis on a 2m DEM covering whole Finland is entirely feasible in Puhti with the basic .vrt, viewing the data with for example QGIS is not practical for such a large dataset without further optimization. If you wanted to easily view a big virtual raster, you have to do a few things: +It is worth noting that while running some analysis on a 2m DEM covering whole Finland is entirely feasible in a supercomputer with the basic .vrt, viewing the data with for example QGIS is not practical for such a large dataset without further optimization. If you wanted to easily view a big virtual raster, you have to do a few things: * Create overviews for your virtual raster using gdaladdo command. You should take care to not create overviews that are so large that the overviews become a huge file themselves. * If your virtual raster is really big it makes sense to create a hierarchial structure of virtual rasters where topmost virtual raster points to smaller virtual rasters which point to smaller virtual rasters and so on until you have the last virtual raster pointing to actual files. The reason for using this approach is that if you don't do this also the overviews used get really big. Note that using this kind of hierachial structure may produce some artifacts when running analysis on the data so it should be reserved for viewing purposes. 
@@ -69,9 +61,8 @@ Following tools support creating virtual rasters: * [Python](https://gdal.org/api/python/osgeo.gdal.html#osgeo.gdal.BuildVRT) and [R](https://rdrr.io/cran/terra/man/vrt.html) have wrappers for GDAL gdalbuildvrt, for [longer example for R see StackOverflow's answer](https://stackoverflow.com/questions/68332846/improving-computational-speed-of-zonal-statistics-on-150gb-of-raster-tiles-in-r). * [QGIS,](https://docs.qgis.org/3.10/en/docs/user_manual/processing_algs/gdal/rastermiscellaneous.html?highlight=virtual#build-virtual-raster) [GrassGIS](https://grass.osgeo.org/grass79/manuals/r.buildvrt.html) and [SagaGIS](http://www.saga-gis.org/saga_tool_doc/7.5.0/io_gdal_12.html) provide graphical interface for gdalbuildvrt * [lidR](https://cran.r-project.org/web/packages/lidR/index.html) supports writing lidar data analysis results directly as virtual raster -* [vrt_creator.py](../../../data/datasets/spatial-data-in-csc-computing-env.md) in Puhti for custom areas with 2m or 10m DEM -In Puhti glalbuildvrt is included in all [modules including GDAL](../../../apps/gdal.md), Python BuildVRT in [geoconda](../../../apps/geoconda.md), QGIS in [QGIS](../../../apps/qgis.md), R terra and lidR in [r-env](../../../apps/r-env.md) module. +On supercomputers, gdalbuildvrt is included in all [modules including GDAL](../../../apps/gdal.md), Python BuildVRT in [geoconda](../../../apps/geoconda.md), QGIS in [QGIS](../../../apps/qgis.md), R terra and lidR in [r-env](../../../apps/r-env.md) module. ### Creating virtual raster with GDAL gdalbuildvrt @@ -96,13 +87,13 @@ File list should include preferably full paths, but for local files also relativ **Raster files in Allas / some other S3** -If doing this from Puhti, load allas module. +If doing this from a supercomputer, load the allas module.
List the file names as they are in the bucket with rclone or some other tool: `rclone lsf --include '*.**tif**' allas: file_list.txt` -Next add to the file list the full paths as they are required by GDAL, using vsicurl, vsis3 or vsiswift drivers. See longer explanations of GDAL drivers and Allas from [Puhti GDAL page](../../../apps/gdal.md). +Next add to the file list the full paths as they are required by GDAL, using vsicurl, vsis3 or vsiswift drivers. See longer explanations of GDAL drivers and Allas from [CSC GDAL page](../../../apps/gdal.md). `sed -i -e 's-^-/**vsicurl**/https://a3s.fi//-' file_list.txt` From 90134315ad6b846cc48f5e93d38d5ae8c207dd9b Mon Sep 17 00:00:00 2001 From: Joona Tolonen Date: Tue, 12 May 2026 09:53:13 +0300 Subject: [PATCH 134/139] Replaced Mahti references to Roihu. --- docs/cloud/dbaas/firewalls.md | 4 ++-- docs/cloud/rahti/access.md | 2 +- docs/cloud/rahti/tutorials/connect-database-hpc.md | 4 ++-- 3 files changed, 5 insertions(+), 5 deletions(-) diff --git a/docs/cloud/dbaas/firewalls.md b/docs/cloud/dbaas/firewalls.md index 9ced96334e..2087a4cd26 100644 --- a/docs/cloud/dbaas/firewalls.md +++ b/docs/cloud/dbaas/firewalls.md @@ -97,9 +97,9 @@ If one would like to have even strictre rules one could limit it only these puhti-nat-[1,2].csc.fi and puhti-login[11-15].csc.fi --> -### Mahti +### Roihu -Accessing your Pukki database from Mahti from both login nodes and compute node you can allow this: +Accessing your Pukki database from Roihu from both login nodes and compute node you can allow this: ``` 86.50.165.192/27 diff --git a/docs/cloud/rahti/access.md b/docs/cloud/rahti/access.md index 2729062059..5f163571ab 100644 --- a/docs/cloud/rahti/access.md +++ b/docs/cloud/rahti/access.md @@ -8,7 +8,7 @@ In order to use the Rahti container cloud with a CSC account, you need: 1. A CSC user account. You can check which is your "CSC username" in [MyCSC profile page](https://my.csc.fi/profile). You can also change the password from there. 2. 
A computing project with access to Rahti -Access to another CSC computing system such as cPouta, Mahti or Puhti counts as a valid computing project. The same +Access to another CSC computing system such as cPouta or Roihu counts as a valid computing project. The same project can be used in to access Rahti after adding this later to the list of enabled services. ### Applying for access diff --git a/docs/cloud/rahti/tutorials/connect-database-hpc.md b/docs/cloud/rahti/tutorials/connect-database-hpc.md index 7f3d9c7111..58538e9c10 100644 --- a/docs/cloud/rahti/tutorials/connect-database-hpc.md +++ b/docs/cloud/rahti/tutorials/connect-database-hpc.md @@ -13,7 +13,7 @@ Many HPC workflows require a database. Running these on the login node poses several issues and running on Pouta brings administration overhead. Rahti is a good candidate, but one obstacle is that Rahti does not support non-HTTP traffic from external sources. -A workaround for this problem is to establish a TCP tunnel over an HTTP-compatible WebSocket connection. This can be achieved using a command-line client for connecting to and serving WebSockets called [WebSocat](https://github.com/vi/websocat). Here, a WebSocat instance running on Puhti/Mahti translates a database request coming from a workflow to an HTTP-compatible WebSocket protocol. Once the traffic enters Rahti we use another WebSocat instance running inside Rahti to translate back the WebSocket connection to a TCP connection over the original port the database is configured to receive traffic. A drawing of the process is shown below. +A workaround for this problem is to establish a TCP tunnel over an HTTP-compatible WebSocket connection. This can be achieved using a command-line client for connecting to and serving WebSockets called [WebSocat](https://github.com/vi/websocat). Here, a WebSocat instance running on Roihu translates a database request coming from a workflow to an HTTP-compatible WebSocket protocol. 
Once the traffic enters Rahti we use another WebSocat instance running inside Rahti to translate back the WebSocket connection to a TCP connection over the original port the database is configured to receive traffic. A drawing of the process is shown below. ![Image illustrating a WebSocket connection bridging CSC's HPC environment and a database service on Rahti](../../../img/websocat-diagram-4.drawio.png) @@ -51,7 +51,7 @@ Only WebSocket connections are welcome here ## Step 2: Running WebSocat on CSC supercomputers -MariaDB and WebSocat have now been set up on Rahti and you should have the following details: MariaDB username, password, database name and the WebSocat route hostname. These are needed when connecting to the database. However, first we need to run the `websocat` binary on Puhti/Mahti to open the required TCP tunnel. +MariaDB and WebSocat have now been set up on Rahti and you should have the following details: MariaDB username, password, database name and the WebSocat route hostname. These are needed when connecting to the database. However, first we need to run the `websocat` binary on Roihu to open the required TCP tunnel. - [Download `websocat` from GitHub](https://github.com/vi/websocat/releases) and add it to your `PATH`. For example: From 94e69bdf35c273592096ec9c6ad36a62cb970a98 Mon Sep 17 00:00:00 2001 From: Joona Tolonen Date: Tue, 12 May 2026 12:05:29 +0300 Subject: [PATCH 135/139] Added Roihu related IP-info and removed references to Puhti. 
--- docs/cloud/dbaas/firewalls.md | 22 ++++++++++++++++++---- docs/cloud/dbaas/mariadb-accessing.md | 13 ++++++++----- docs/cloud/dbaas/postgres-accessing.md | 11 +++++++---- docs/cloud/pouta/vm-flavors-and-billing.md | 4 ++-- 4 files changed, 35 insertions(+), 15 deletions(-) diff --git a/docs/cloud/dbaas/firewalls.md b/docs/cloud/dbaas/firewalls.md index 2087a4cd26..0512800729 100644 --- a/docs/cloud/dbaas/firewalls.md +++ b/docs/cloud/dbaas/firewalls.md @@ -58,7 +58,6 @@ This is done by allowing subnets. You can find the IP from the Pouta web interface under Network -> Routers -> The specific router -> "External Fixed IPs" - ### ePouta It is important to remember that all traffic from ePouta to Pukki will be going over "the internet" @@ -77,15 +76,27 @@ which makes it even more important to use a strong username and password for you More information can be found in [Rahti security guide](../rahti/security-guide.md) - - ### Noppe + If you need to access your Pukki database instance from Noppe then you need to allow this IP `193.167.189.137/32` . Note that all other Notebook users will be able to access your database instances as well so it is important to use strong passwords for your database user. + +### Roihu + +Accessing your Pukki database from login and compute nodes you can allow this: + +``` +86.50.172.0/27 +``` + ### Puhti +!!! warning "Puhti step-wise retirement during spring and summer 2026" + Puhti will be gradually decommissioned during spring and summer 2026 and + replaced by Roihu. + Accessing your Pukki database from login and compute nodes you can allow this: ``` @@ -97,7 +108,10 @@ If one would like to have even strictre rules one could limit it only these puhti-nat-[1,2].csc.fi and puhti-login[11-15].csc.fi --> -### Roihu +### Mahti + +!!! warning "Mahti retirement in August 2026" + Mahti will be decommissioned in August 2026 and replaced by Roihu. 
Accessing your Pukki database from Roihu from both login nodes and compute node you can allow this: diff --git a/docs/cloud/dbaas/mariadb-accessing.md b/docs/cloud/dbaas/mariadb-accessing.md index b7d4f7f8f3..179a58e8ed 100644 --- a/docs/cloud/dbaas/mariadb-accessing.md +++ b/docs/cloud/dbaas/mariadb-accessing.md @@ -110,17 +110,20 @@ ERROR 1044 (42000): Access denied for user 'username'@'%' to database 'databasen Either the database specified does not exist, or the username specified has no access to it. -### Accessing your Pukki MariaDB database from Puhti +### Accessing your Pukki MariaDB database from Roihu -1. Ensure your database instance allows [network traffic from Puhti.](firewalls.md#puhti) -2. `ssh` onto Puhti and load the `mariadb` module +!!! info "Puhti compatible" + The following instructions also apply to Puhti, but it will be decommissioned in summer 2026. Please use Roihu instead. + +1. Ensure your database instance allows [network traffic from Roihu.](firewalls.md#roihu) +2. `ssh` onto Roihu and load the `mariadb` module ``` module load mariadb ``` 3. Now you can connect to the database with the mariadb-client -