Hi,
This is my first time working with SageMaker. I successfully trained a model; however, I'm having difficulty getting it to output evaluation metrics to the log files.
Here is a snippet of my model:
def metric_fn(label_ids, predicted_labels):
    accuracy = tf.compat.v1.metrics.accuracy(label_ids, predicted_labels)
    recall = tf.compat.v1.metrics.recall(label_ids, predicted_labels)
    precision = tf.compat.v1.metrics.precision(label_ids, predicted_labels)
    return {"eval_accuracy": accuracy,
            "precision": precision,
            "recall": recall}

if mode == tf.estimator.ModeKeys.EVAL:
    eval_metrics = metric_fn(label_ids, predicted_labels)
    return tf.estimator.EstimatorSpec(mode=mode, loss=loss, eval_metric_ops=eval_metrics)
And this is how the model is fit:
estimator = TensorFlow(
    entry_point='script.py',
    source_dir=[#Source_dir],
    train_instance_type='ml.m5.2xlarge',
    train_instance_count=4,
    output_path=s3_output_location,
    hyperparameters=hyperparameters,
    role=role,
    py_version='py3',
    framework_version='1.15.2',
    sagemaker_session=sess,
    metric_definitions=[{'Name': 'eval-accuracy', 'Regex': 'eval-accuracy=(\d\.\d+)'},
                        {'Name': 'precision', 'Regex': 'precision=(\d\.\d+)'},
                        {'Name': 'recall', 'Regex': 'recall=(\d\.\d+)'}],
    enable_sagemaker_metrics=True,
    distributions={'parameter_server': {'enabled': True}})
When training finishes, I don't see any of these metrics in the logs or in the 'training jobs' section of the console. This is how the Metrics section looks:
Metrics
Name             Regex
eval-accuracy    eval-accuracy=(\d.\d+)
precision        precision=(\d.\d+)
recall           recall=(\d.\d+)
I don't know why this should be so obscure. I've run the script multiple times with SageMaker, with no luck so far. I'd appreciate any help!
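A quick way to narrow this down is to test the regexes from metric_definitions locally against a line copied from the training log. The sketch below is only an illustration. The sample log line is hypothetical and assumes the estimator prints evaluation results as "key = value" pairs using the dictionary keys returned by metric_fn; the real log output may look different, so an actual line from the job's logs should be pasted in. The "relaxed" patterns are likewise an assumption about what might match, not a confirmed fix.

```python
import re

# Hypothetical log line in "key = value" style. This is an assumption about
# the log format; replace it with a real line from the training job's logs.
sample_log_line = (
    "Saving dict for global step 1000: eval_accuracy = 0.91, "
    "global_step = 1000, loss = 0.35, precision = 0.88, recall = 0.93"
)

# Regexes copied from metric_definitions in the question.
original_patterns = {
    "eval-accuracy": r"eval-accuracy=(\d\.\d+)",
    "precision": r"precision=(\d\.\d+)",
    "recall": r"recall=(\d\.\d+)",
}

# Assumed alternative: use the key name returned by metric_fn
# ("eval_accuracy", with an underscore) and allow spaces around "=".
relaxed_patterns = {
    "eval-accuracy": r"eval_accuracy\s*=\s*(\d+\.\d+)",
    "precision": r"precision\s*=\s*(\d+\.\d+)",
    "recall": r"recall\s*=\s*(\d+\.\d+)",
}


def extract_metrics(patterns, line):
    """Return {metric name: captured string, or None if the regex misses}."""
    results = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, line)
        results[name] = match.group(1) if match else None
    return results


original_result = extract_metrics(original_patterns, sample_log_line)
relaxed_result = extract_metrics(relaxed_patterns, sample_log_line)

print("original:", original_result)
print("relaxed: ", relaxed_result)
```

If the original patterns return None for a real log line while a relaxed pattern captures a value, the metric definitions (rather than the training script) are the likely place to look.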