Skip to content

Avoid intermediate memcpy in _take_serialized_message#877

Open
facontidavide wants to merge 1 commit intoros2:rollingfrom
facontidavide:direct-write-take-serialized
Open

Avoid intermediate memcpy in _take_serialized_message#877
facontidavide wants to merge 1 commit intoros2:rollingfrom
facontidavide:direct-write-take-serialized

Conversation

@facontidavide
Copy link
Copy Markdown

The previous implementation of _take_serialized_message filled a local eprosima::fastcdr::FastBuffer during the DataReader::take() call, then memcpy'd the payload from that FastBuffer into the user's rmw_serialized_message_t. For large serialized messages (e.g. uncompressed image frames) the second memcpy is the dominant cost of the take path.

Add a new FASTDDS_SERIALIZED_DATA_TYPE_RMW_SERIALIZED_MESSAGE enum variant to SerializedDataType, and teach TypeSupport::deserialize to write the CDR payload straight into the user's rmw_serialized_message_t, resizing it via rmw_serialized_message_resize when the capacity is insufficient.

Measured on ROS 2 Humble (the equivalent bool-flag backport), FastDDS SHM transport, 1920x1080 BGR8 images (6.22 MB / frame), 80 measured frames, perf stat on the subscriber process, mean of 3 runs:

 before (stock)      169 ms
 after  (this patch) 138 ms   (-18%)

Wire format and external behaviour are unchanged. The existing FASTDDS_SERIALIZED_DATA_TYPE_CDR_BUFFER path is preserved for other callers (publisher serialize, services).

The previous implementation of _take_serialized_message filled a local
eprosima::fastcdr::FastBuffer during the DataReader::take() call, then
memcpy'd the payload from that FastBuffer into the user's
rmw_serialized_message_t. For large serialized messages (e.g.
uncompressed image frames) the second memcpy is the dominant cost of
the take path.

Add a new FASTDDS_SERIALIZED_DATA_TYPE_RMW_SERIALIZED_MESSAGE enum
variant to SerializedDataType, and teach TypeSupport::deserialize to
write the CDR payload straight into the user's rmw_serialized_message_t,
resizing it via rmw_serialized_message_resize when the capacity is
insufficient.

Measured on ROS 2 Humble (the equivalent bool-flag backport), FastDDS
SHM transport, 1920x1080 BGR8 images (6.22 MB / frame), 80 measured
frames, perf stat on the subscriber process, mean of 3 runs:

                      sub task-clock
  before (stock)      169 ms
  after  (this patch) 138 ms   (-18%)

Wire format and external behaviour are unchanged. The existing
FASTDDS_SERIALIZED_DATA_TYPE_CDR_BUFFER path is preserved for other
callers (publisher serialize, services); only _take_serialized_message
opts in to the new path.

Signed-off-by: Davide Faconti <davide.faconti@gmail.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
if (serialized_message->buffer_capacity < buffer_size) {
auto ret = rmw_serialized_message_resize(serialized_message, buffer_size);
if (ret != RMW_RET_OK) {
return ret; // Error message already set
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

before, it returns corresponding error code that is propagated for the failure here? but the new code conceals the reason because it just returns false without any information with it. i think this is minor but we would want to consider to add RMW_SET_ERROR_MSG("failed to resize serialized message") in FASTDDS_SERIALIZED_DATA_TYPE_RMW_SERIALIZED_MESSAGE case statement?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either that, or pass in data.data a pointer to a custom structure that has both the serialized_message and a rmw_ret_t field that is filled in the new case of the type support implementation

@fujitatomoya
Copy link
Copy Markdown
Collaborator

@MiguelCompany can you or someone else take a look at this?

Copy link
Copy Markdown
Collaborator

@MiguelCompany MiguelCompany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with @fujitatomoya that the return code from rmw_serialized_message_resize shall be preserved.

Also please change the PR description to follow the standard PR template

@facontidavide
Copy link
Copy Markdown
Author

will do

Comment on lines +45 to +48
// `data` points to a `rmw_serialized_message_t` (aka
// `rcutils_uint8_array_t`). The CDR payload is written directly into that
// buffer on deserialize, avoiding an intermediate FastBuffer and the
// follow-up memcpy in `_take_serialized_message`.
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this comment too verbose or useful to keep?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would put similar comments in the new structure I suggested in #877 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants