Accommodating more channels in SageMaker PIPE mode

I am training a classification model using AWS SageMaker with TensorFlow. My training dataset is huge and is spread across 4 folders in the same AWS S3 bucket.

I defined the input channels like this:
inputs = {
    'train1': folder1,
    'train2': folder2,
    'train3': folder3,
    'train4': folder4,
    'valid': folder,
}
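For illustration, here is a minimal sketch of how such a channel dictionary could be built when the folders are S3 prefixes. The helper name `build_channel_inputs`, the bucket name, and the prefix names are all hypothetical placeholders, not values from my actual job.

```python
def build_channel_inputs(bucket, train_prefixes, valid_prefix):
    """Map channel names to S3 URIs (hypothetical helper, for illustration only).

    Each training prefix gets its own uniquely named channel: train1, train2, ...
    """
    inputs = {
        f"train{i}": f"s3://{bucket}/{prefix.strip('/')}"
        for i, prefix in enumerate(train_prefixes, start=1)
    }
    inputs["valid"] = f"s3://{bucket}/{valid_prefix.strip('/')}"
    return inputs


# Example with placeholder names; substitute the real bucket and prefixes.
inputs = build_channel_inputs(
    "my-bucket", ["folder1", "folder2", "folder3", "folder4"], "folder"
)
```

The keys of this dictionary are the channel names that the training code later refers to.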

I pass these channel names ('ids') into my main training code and then read the data using PIPE mode like this:
from sagemaker_tensorflow import PipeModeDataset

all_data = []
if mode == 'train':
    for channel_id in ids:
        # One PipeModeDataset per input channel
        data = PipeModeDataset(channel=channel_id, record_format='TFRecord')
        # parsing data here
        all_data.append(data)

Now I use _all_data_ as my whole dataset, apply augmentation to it, and then pass it to the training script.
When I do this I get an error (related to the data), and sometimes training hangs.
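One thing I may need to change is that a Python list of datasets is not itself a dataset. Below is a rough sketch of one way to merge the per-channel datasets into a single one by chaining `concatenate` calls (a method that `tf.data.Dataset` provides). The helper name `combine_channels` and the `make_dataset` factory argument are hypothetical, and I have not confirmed that this resolves the error or the hang.

```python
from functools import reduce


def combine_channels(channel_names, make_dataset):
    """Build one dataset per channel and chain them with .concatenate().

    make_dataset is a caller-supplied factory (an assumption of this sketch),
    for example:
        lambda name: PipeModeDataset(channel=name, record_format='TFRecord')
    """
    if not channel_names:
        raise ValueError("at least one channel name is required")
    datasets = [make_dataset(name) for name in channel_names]
    # Channels are consumed one after another, in the order given.
    return reduce(lambda left, right: left.concatenate(right), datasets)
```

Because concatenation reads the channels sequentially, records from different folders are not mixed unless the combined dataset is shuffled afterwards.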

What I want to know is the correct way to use multiple channels for a single training job in PIPE mode.

Thanks
