From 7de5067198aaabb958c4f5d55543816139dfd22d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Gon=C3=A7alo?=
Date: Thu, 13 Mar 2025 11:30:42 +0100
Subject: [PATCH 1/5] Added files for the workflow service.

---
 workflow-service/01_activate_env.sh  |  3 ++
 workflow-service/02_wrap_packages.sh |  3 ++
 workflow-service/README.md           | 57 ++++++++++++++++++++++++++++
 3 files changed, 63 insertions(+)
 create mode 100644 workflow-service/01_activate_env.sh
 create mode 100644 workflow-service/02_wrap_packages.sh
 create mode 100644 workflow-service/README.md

diff --git a/workflow-service/01_activate_env.sh b/workflow-service/01_activate_env.sh
new file mode 100644
index 0000000..bf7f382
--- /dev/null
+++ b/workflow-service/01_activate_env.sh
@@ -0,0 +1,3 @@
+#!/bin/bash
+python3 -m venv naavre_env
+source naavre_env/bin/activate || exit
\ No newline at end of file
diff --git a/workflow-service/02_wrap_packages.sh b/workflow-service/02_wrap_packages.sh
new file mode 100644
index 0000000..5aac899
--- /dev/null
+++ b/workflow-service/02_wrap_packages.sh
@@ -0,0 +1,3 @@
+#!/bin/bash
+pip install --platform manylinux2014_x86_64 --target=python --implementation cp --python-version 3.11 --only-binary=:all: --upgrade icoscp==0.2.2 icoscp_core==0.3.9 matplotlib python-slugify || exit
+zip -r python.zip python || rm -rf python
\ No newline at end of file
diff --git a/workflow-service/README.md b/workflow-service/README.md
new file mode 100644
index 0000000..dac333f
--- /dev/null
+++ b/workflow-service/README.md
@@ -0,0 +1,57 @@
+#### Notes
+- Uploaded to S3 because of the limit in Layers, which is roughly ~50MB.
+- Runtime 3.11 because of the Numpy dependencies for compilation. This was noted because while compiling there were several incompatibility errors thrown, specifically by `numpy` package. After some trial and error, we figured out that we had to run this on the `3.11` version of Python.
+
+- Imported all the cells into one (copy/paste) just for fast validation
+- Solved runtime issues and binary targeting for the libraries—I am assuming that’s what you were having problems with
+- Fixed some syntax errors (inside of the code actually)
+  - `if isinstance(d, str) and (d == 'no data available'):` was changed to `if isinstance(datasets, str) and (datasets == 'no data available'):`
+- Fixed temporary folder for cache stuff—I’ll explain this later
+
+#### To-dos
+- [ ] Verify that the PDF is really created (at the end of the code).
+- [ ] Check if the zipped folder must really be `python` for it to work on AWS Lambda.
+- [ ] There was a blog that hinted me for the MacOS binary compilation problem, I am not sure if it is the one below. Confirm this.
+
+
+#### References
+- Main reference for the binary targeting issue: https://repost.aws/knowledge-center/lambda-python-package-compatible. This was the page that hinted for the MacOS-specific compilation errors.
+- Solution that was not used (using pre-configured layers from AWS that included Numpy and Pandas): https://stackoverflow.com/questions/46185297/using-numpy-in-aws-lambda
+
+---
+#### Lambda Python Package Builder
+This is the guide to create and package Python library/package dependencies to be used inside of AWS Lambda functions for the NaaVRE [examples on Github](https://github.com/QCDIS/tmp-devops-test-workflows).
+
+> Please, notice that for this particular case, we're installing the libraries as specicified on [NaaVRE Github Example Page](https://github.com/QCDIS/tmp-devops-test-workflows) for the `use-case-icos`. All commands are manually set. In case you have different dependencies and configurations (e.g., Python runtime), please change that in the code.
+
+> You may remove the first line from each code snippet. This is assuming that you're going to store this in `.sh` files and run them. Note: for that, after you create them, you must run `chmod +x your_shell_command.sh` in order for it to be executable by your system.
+1. Installation of Python packages locally using `venv`
+```sh
+#!/bin/bash
+python3 -m venv naavre_env
+source naavre_env/bin/activate || exit
+```
+
+2. Running `pip` install inside of the `venv`. This creates a zipped folder `python` that you should upload to either S3 (and use its link inside of AWS Lambda Layers) or directly into a Layer object. _Please notice that AWS sets a limit for the Layer's package size, so keep that in mind depending on the service that you want to use._
+```sh
+#!/bin/bash
+pip install --platform manylinux2014_x86_64 --target=python --implementation cp --python-version 3.11 --only-binary=:all: --upgrade icoscp==0.2.2 icoscp_core==0.3.9 matplotlib python-slugify || exit
+zip -r python.zip python || rm -rf python
+```
+
+3. Upload the code into AWS Lambda through the Console or AWS CLI.
+
+4. Fixing errors inside the code.
+> This sort of errors are more open-ended and might question the feasibility of the project. If those patterns are the same for every package, then the coverage is ensured. However, we're not sure about this. If you have any time error, probably is related with the specificity of the library/package you're using.
+- Had to create a temporary folder path for some libraries to work
+```py
+import os
+os.environ['HOME'] = '/tmp'
+```
+- Defining a `/tmp/data` folder to satisfy a write command inside of the function
+```py
+# Ensure the /tmp/data directory exists
+data_dir = '/tmp/data'
+os.makedirs(data_dir, exist_ok=True)
+```
+- Create an account to get the token: https://cpauth.icos-cp.eu/home/
\ No newline at end of file

From 7aebb30851d108fe91a26d21ee7b7db22b84bdd0 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Gon=C3=A7alo?=
Date: Thu, 13 Mar 2025 11:34:10 +0100
Subject: [PATCH 2/5] Changed md

---
 workflow-service/README.md | 47 +++++++++++++++++++-------------------
 1 file changed, 24 insertions(+), 23 deletions(-)

diff --git a/workflow-service/README.md b/workflow-service/README.md
index dac333f..92f62d7 100644
--- a/workflow-service/README.md
+++ b/workflow-service/README.md
@@ -1,25 +1,4 @@
-#### Notes
-- Uploaded to S3 because of the limit in Layers, which is roughly ~50MB.
-- Runtime 3.11 because of the Numpy dependencies for compilation. This was noted because while compiling there were several incompatibility errors thrown, specifically by `numpy` package. After some trial and error, we figured out that we had to run this on the `3.11` version of Python.
-
-- Imported all the cells into one (copy/paste) just for fast validation
-- Solved runtime issues and binary targeting for the libraries—I am assuming that’s what you were having problems with
-- Fixed some syntax errors (inside of the code actually)
-  - `if isinstance(d, str) and (d == 'no data available'):` was changed to `if isinstance(datasets, str) and (datasets == 'no data available'):`
-- Fixed temporary folder for cache stuff—I’ll explain this later
-
-#### To-dos
-- [ ] Verify that the PDF is really created (at the end of the code).
-- [ ] Check if the zipped folder must really be `python` for it to work on AWS Lambda.
-- [ ] There was a blog that hinted me for the MacOS binary compilation problem, I am not sure if it is the one below. Confirm this.
-
-
-#### References
-- Main reference for the binary targeting issue: https://repost.aws/knowledge-center/lambda-python-package-compatible. This was the page that hinted for the MacOS-specific compilation errors.
-- Solution that was not used (using pre-configured layers from AWS that included Numpy and Pandas): https://stackoverflow.com/questions/46185297/using-numpy-in-aws-lambda
-
----
-#### Lambda Python Package Builder
+#### Lambda Python Package Builder guide
 This is the guide to create and package Python library/package dependencies to be used inside of AWS Lambda functions for the NaaVRE [examples on Github](https://github.com/QCDIS/tmp-devops-test-workflows).
 
 > Please, notice that for this particular case, we're installing the libraries as specicified on [NaaVRE Github Example Page](https://github.com/QCDIS/tmp-devops-test-workflows) for the `use-case-icos`. All commands are manually set. In case you have different dependencies and configurations (e.g., Python runtime), please change that in the code.
@@ -54,4 +33,26 @@ os.environ['HOME'] = '/tmp'
 data_dir = '/tmp/data'
 os.makedirs(data_dir, exist_ok=True)
 ```
-- Create an account to get the token: https://cpauth.icos-cp.eu/home/
\ No newline at end of file
+
+---
+
+#### Notes
+- Uploaded to S3 because of the limit in Layers, which is roughly ~50MB.
+- Runtime 3.11 because of the Numpy dependencies for compilation. This was noted because while compiling there were several incompatibility errors thrown, specifically by `numpy` package. After some trial and error, we figured out that we had to run this on the `3.11` version of Python.
+
+- Imported all the cells into one (copy/paste) just for fast validation
+- Solved runtime issues and binary targeting for the libraries—I am assuming that’s what you were having problems with
+- Fixed some syntax errors (inside of the code actually)
+  - `if isinstance(d, str) and (d == 'no data available'):` was changed to `if isinstance(datasets, str) and (datasets == 'no data available'):`
+- Fixed temporary folder for cache stuff—I’ll explain this later
+- Create an account to get the token: https://cpauth.icos-cp.eu/home/
+
+#### To-dos
+- [ ] Verify that the PDF is really created (at the end of the code).
+- [ ] Check if the zipped folder must really be `python` for it to work on AWS Lambda.
+- [ ] There was a blog that hinted me for the MacOS binary compilation problem, I am not sure if it is the one below. Confirm this.
+
+
+#### References
+- Main reference for the binary targeting issue: https://repost.aws/knowledge-center/lambda-python-package-compatible. This was the page that hinted for the MacOS-specific compilation errors.
+- Solution that was not used (using pre-configured layers from AWS that included Numpy and Pandas): https://stackoverflow.com/questions/46185297/using-numpy-in-aws-lambda
\ No newline at end of file

From 9d9cd03ddb98dfa533f17d410aece21b86aeefbe Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Gon=C3=A7alo?=
Date: Thu, 13 Mar 2025 11:59:24 +0100
Subject: [PATCH 3/5] Added timeout to the md file.
---
 workflow-service/README.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/workflow-service/README.md b/workflow-service/README.md
index 92f62d7..a70477b 100644
--- a/workflow-service/README.md
+++ b/workflow-service/README.md
@@ -28,6 +28,7 @@ import os
 os.environ['HOME'] = '/tmp'
 ```
 - Defining a `/tmp/data` folder to satisfy a write command inside of the function
+> Ref (second comment ): https://stackoverflow.com/questions/31938828/aws-lambda-downloading-a-file-and-using-it-in-the-same-function-nodejs
 ```py
 # Ensure the /tmp/data directory exists
 data_dir = '/tmp/data'
@@ -46,9 +47,10 @@ os.makedirs(data_dir, exist_ok=True)
   - `if isinstance(d, str) and (d == 'no data available'):` was changed to `if isinstance(datasets, str) and (datasets == 'no data available'):`
 - Fixed temporary folder for cache stuff—I’ll explain this later
 - Create an account to get the token: https://cpauth.icos-cp.eu/home/
+- Changed the timeout from Lambda from the default `3s` to `>3m`. In this case, timeout is of no use.
 
 #### To-dos
-- [ ] Verify that the PDF is really created (at the end of the code).
+- [ ] Verify that the PDF is really creat ed (at the end of the code).
 - [ ] Check if the zipped folder must really be `python` for it to work on AWS Lambda.
 - [ ] There was a blog that hinted me for the MacOS binary compilation problem, I am not sure if it is the one below. Confirm this.
 

From e4034f9c46af46f4e836331ccea02560d696f4bf Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Gon=C3=A7alo?=
Date: Thu, 13 Mar 2025 13:56:40 +0100
Subject: [PATCH 4/5] Comment added.
---
 workflow-service/README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/workflow-service/README.md b/workflow-service/README.md
index a70477b..88b5a32 100644
--- a/workflow-service/README.md
+++ b/workflow-service/README.md
@@ -48,6 +48,7 @@ os.makedirs(data_dir, exist_ok=True)
 - Fixed temporary folder for cache stuff—I’ll explain this later
 - Create an account to get the token: https://cpauth.icos-cp.eu/home/
 - Changed the timeout from Lambda from the default `3s` to `>3m`. In this case, timeout is of no use.
+- Reference for
 
 #### To-dos
 - [ ] Verify that the PDF is really creat ed (at the end of the code).

From 3b0feaf506311d004d88beb21a3d0fb19407afd7 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Gon=C3=A7alo?=
Date: Thu, 13 Mar 2025 13:56:43 +0100
Subject: [PATCH 5/5] Comment added.

---
 workflow-service/datetime_test.py | 9 +++++++++
 1 file changed, 9 insertions(+)
 create mode 100644 workflow-service/datetime_test.py

diff --git a/workflow-service/datetime_test.py b/workflow-service/datetime_test.py
new file mode 100644
index 0000000..7ae0083
--- /dev/null
+++ b/workflow-service/datetime_test.py
@@ -0,0 +1,9 @@
+# Test for folder creation in lambda and S3
+from datetime import datetime
+import math
+
+def test():
+    date_time = datetime.now().timestamp()
+    print(math.floor(date_time))
+
+test()
\ No newline at end of file
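
Follow-up note on the series: the README fixes (pointing `HOME` at `/tmp`, creating `/tmp/data`) and the timestamp printed by `datetime_test.py` could be combined into one helper for the folder-creation test. This is only a sketch under assumptions: `prepare_run_dir` and its parameters are hypothetical names that do not appear in the patches, and naming the folder after the floored Unix timestamp is a guess at what the test is meant to exercise.

```python
import math
import os
from datetime import datetime


def prepare_run_dir(base_dir="/tmp/data", home="/tmp"):
    """Create a timestamp-named folder under base_dir and return its path.

    Hypothetical helper for illustration; not part of the patch series.
    """
    # Some libraries write caches under $HOME; on Lambda only /tmp is writable.
    os.environ["HOME"] = home
    # Same timestamp logic as datetime_test.py: whole seconds since the epoch.
    stamp = math.floor(datetime.now().timestamp())
    run_dir = os.path.join(base_dir, str(stamp))
    # exist_ok=True keeps repeated calls within the same second from failing.
    os.makedirs(run_dir, exist_ok=True)
    return run_dir
```

Calling `prepare_run_dir()` at the top of the handler would give each invocation its own folder. Two invocations within the same second would share one folder, so a finer-grained name may be needed if that matters.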