Accommodating more channels in sagemaker PIPE mode #135

@Naveen03

I am training a classification model with TensorFlow on AWS SageMaker. My training dataset is large and is split across 4 folders in the same S3 bucket.

I defined the input channels like this:
inputs = {
    'train1': folder1,
    'train2': folder2,
    'train3': folder3,
    'train4': folder4,
    'valid': folder,
}
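For context, the list of training channel names (the 'ids' used below) can be derived from this dictionary instead of being typed by hand. This is only a minimal sketch: the S3 URIs are hypothetical placeholders standing in for folder1 through folder4 and folder, and the 'train' prefix convention is an assumption based on the channel names above.

```python
# Hypothetical placeholder URIs; the real folder1..folder4 values are not shown in this issue.
inputs = {
    'train1': 's3://example-bucket/folder1',
    'train2': 's3://example-bucket/folder2',
    'train3': 's3://example-bucket/folder3',
    'train4': 's3://example-bucket/folder4',
    'valid': 's3://example-bucket/valid',
}

# Keep only the channels whose name starts with 'train', in a stable order.
train_ids = sorted(name for name in inputs if name.startswith('train'))
```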

I pass these channel names ('ids') into my main training code and then read the data in Pipe mode like this:
from sagemaker_tensorflow import PipeModeDataset

all_data = []
if mode == 'train':
    for channel_id in ids:
        # One PipeModeDataset per input channel
        data = PipeModeDataset(channel=channel_id, record_format='TFRecord')
        # ... parse the records here ...
        all_data.append(data)

I then treat all_data as my full dataset, apply augmentation to it, and pass it to the training step.
When I do this I get a data-related error, and sometimes training hangs.
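Note that all_data here is a Python list of datasets, not a single dataset, so it would need to be merged before augmentation and training. The following is a minimal sketch of one way to do that, assuming each element exposes a concatenate method as tf.data.Dataset does. merge_datasets is a hypothetical helper name, and whether reading the channels one after another behaves well with Pipe mode is not confirmed here.

```python
from functools import reduce


def merge_datasets(datasets):
    """Fold a non-empty list of datasets into one by chaining concatenate()."""
    if not datasets:
        raise ValueError("expected at least one dataset")
    # reduce() calls first.concatenate(second), then result.concatenate(third), and so on.
    return reduce(lambda merged, nxt: merged.concatenate(nxt), datasets)
```

With the loop above, this would be called as merge_datasets(all_data) before the augmentation step. Concatenation reads the channels sequentially; interleaving them would need a different approach.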

What I want to know is the correct way to use multiple channels for a single training job in Pipe mode.

Thanks
