start-stop-mwaa-environment - mwaa_import_data.py - variable.csv - fails for field larger than field limit  #74

@mvitale-kensu

Hello guys,

In our case, the resume step fails because the mwaa_import_data DAG fails while importing variable.csv.
This is the error:

[2024-05-16, 08:01:16 UTC] {{taskinstance.py:1937}} ERROR - Task failed with exception
Traceback (most recent call last):
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/operators/python.py", line 192, in execute
    return_value = self.execute_callable()
                   ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/operators/python.py", line 209, in execute_callable
    return self.python_callable(*self.op_args, **self.op_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/airflow/dags/mwaa_import_data.py", line 146, in importVariable
    for row in reader:
_csv.Error: field larger than field limit (131072)

Just FYI I've fixed it by adding these few lines of code to mwaa_import_data.py:

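import sys
import csv

# Start from the largest native integer and shrink until the csv module accepts it.
maxInt = sys.maxsize

while True:
    # Decrease the maxInt value by a factor of 10
    # as long as the OverflowError occurs.
    try:
        csv.field_size_limit(maxInt)
        break
    except OverflowError:
        # Integer division avoids the precision loss of float division on large ints.
        maxInt = maxInt // 10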

Coming from this: https://stackoverflow.com/questions/15063936/csv-error-field-larger-than-field-limit-131072

I am not sure if this is the correct way to handle this, but from what I've seen it seems to be working fine for us.
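For anyone who wants to reproduce this outside of MWAA, here is a minimal sketch of the same idea. It builds an in-memory CSV with one oversized field (a hypothetical stand-in for variable.csv; the key name and the 200,000-character value are made up for illustration), shows that the default limit rejects it, and then reads it after raising the limit. The helper name `raise_csv_field_limit` is not part of this project, it is only used here for the example.

```python
import csv
import io
import sys


def raise_csv_field_limit(limit: int = sys.maxsize) -> int:
    """Set csv.field_size_limit to the largest value the platform accepts.

    Hypothetical helper: shrinks the candidate limit by a factor of 10
    each time the csv module raises OverflowError, then returns the
    value that was finally applied.
    """
    while True:
        try:
            csv.field_size_limit(limit)
            return limit
        except OverflowError:
            limit //= 10


# Hypothetical stand-in for variable.csv: one row whose value exceeds
# the default field limit of 131072 characters.
big_value = "x" * 200_000
buffer = io.StringIO()
csv.writer(buffer).writerow(["my_key", big_value])
csv_text = buffer.getvalue()

# With the default limit, reading the row raises csv.Error.
csv.field_size_limit(131072)
try:
    list(csv.reader(io.StringIO(csv_text)))
    failed_with_default = False
except csv.Error:
    failed_with_default = True

# After raising the limit, the same data can be read.
applied_limit = raise_csv_field_limit()
rows = list(csv.reader(io.StringIO(csv_text)))
```

Note that `csv.field_size_limit` is a process-wide setting, so calling it once before the reader is created (for example at the top of the import function) should be enough.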

We are on Airflow 2.7.2 and I am using the latest code of this project from the main branch.

Labels

bug: Something isn't working