Skip to content

Data Migration Tool inserts massive duplicate gallery rows in catalog_product_entity_media_gallery_value causing major slowdown #934

@ioweb-gr

Description

@ioweb-gr

Bug Description

After migrating from Magento 1 to Magento 2 using the Data Migration Tool, we identified a severe performance issue in the Magento\Catalog\Model\Product\Gallery\ReadHandler::sortMediaEntriesByPosition method. Product image galleries contain massive duplicate entries in the catalog_product_entity_media_gallery_value table.

Detailed Findings

  • Example: Product SKU BC-1-GOLD (entity_id 3477) only has 3 valid gallery value_ids, but each value_id appears 198 times with the same (value_id, store_id=0, entity_id) tuple in catalog_product_entity_media_gallery_value.
  • As a result, gallery queries return hundreds of thousands of rows per product (e.g., 198×3 value_ids = 594, joined twice yields 117,612 rows for one product), causing severe slowdowns in catalog operations.
  • The Magento 1 database did not contain these duplicates — the issue only arose post-migration.

Root Cause Analysis

  • The source repository code at src/Migration/Handler/Gallery/InsertValueToEntity.php is responsible for inserting gallery value-to-entity records. This code does not appear to check for pre-existing (value_id, entity_id) records before insertion, potentially allowing for many duplicates during migration.
  • Migration config XML only defines document keys, but does not enforce idempotency or uniqueness at the DB/module level for this case.
  • There is no unique constraint enforced for (value_id, store_id, entity_id) in the target table, increasing the risk.
  • No raw direct SQL for inserts is found in this repo, but any repeated/looped execution of the handler or scripts against the same data would amplify duplication.

Magento 2 runs this query

SELECT `main`.`value_id`,
       `main`.`value`                                                           AS `file`,
       `main`.`media_type`,
       `entity`.`entity_id`,
       IFNULL(`value`.`label`, `default_value`.`label`)                         AS `label`,
       IFNULL(`value`.`position`, `default_value`.`position`)                   AS `position`,
       IFNULL(`value`.`disabled`, `default_value`.`disabled`)                   AS `disabled`,
       `default_value`.`label`                                                  AS `label_default`,
       `default_value`.`position`                                               AS `position_default`,
       `default_value`.`disabled`                                               AS `disabled_default`,
       IFNULL(`value_video`.`provider`, `default_value_video`.`provider`)       AS `video_provider`,
       IFNULL(`value_video`.`url`, `default_value_video`.`url`)                 AS `video_url`,
       IFNULL(`value_video`.`title`, `default_value_video`.`title`)             AS `video_title`,
       IFNULL(`value_video`.`description`, `default_value_video`.`description`) AS `video_description`,
       IFNULL(`value_video`.`metadata`, `default_value_video`.`metadata`)       AS `video_metadata`,
       `default_value_video`.`provider`                                         AS `video_provider_default`,
       `default_value_video`.`url`                                              AS `video_url_default`,
       `default_value_video`.`title`                                            AS `video_title_default`,
       `default_value_video`.`description`                                      AS `video_description_default`,
       `default_value_video`.`metadata`                                         AS `video_metadata_default`
FROM `catalog_product_entity_media_gallery` AS `main`
         INNER JOIN `catalog_product_entity_media_gallery_value_to_entity` AS `entity`
                    ON main.value_id = entity.value_id
         LEFT JOIN `catalog_product_entity_media_gallery_value` AS `value`
                   ON main.value_id = value.value_id AND value.store_id = 0 AND value.entity_id = entity.entity_id
         LEFT JOIN `catalog_product_entity_media_gallery_value` AS `default_value`
                   ON main.value_id = default_value.value_id AND default_value.store_id = 0 AND
                      default_value.entity_id = entity.entity_id
         LEFT JOIN `catalog_product_entity_media_gallery_value_video` AS `value_video`
                   ON value.value_id = value_video.value_id AND value.store_id = value_video.store_id
         LEFT JOIN `catalog_product_entity_media_gallery_value_video` AS `default_value_video`
                   ON default_value.value_id = default_value_video.value_id AND
                      default_value.store_id = default_value_video.store_id
WHERE (main.attribute_id = '88')
  AND (main.disabled = 0)
  AND (entity.entity_id = '3477')
ORDER BY IF(value.position IS NULL, default_value.position, value.position) ASC

Which in our case yielded 117k results to sort by usort

17272 /s/x/sxara-autokinitou-trion-theseon-bc-1-gold_1.jpg image 3477 Σχάρα αυτοκινήτου για 3 ποδήλατα | BC-1 GOLD | example.com 1 0 Σχάρα αυτοκινήτου για 3 ποδήλατα | BC-1 GOLD | example.com 1 0                    
17272 /s/x/sxara-autokinitou-trion-theseon-bc-1-gold_1.jpg image 3477 Σχάρα αυτοκινήτου για 3 ποδήλατα | BC-1 GOLD | example.com 1 0 Σχάρα αυτοκινήτου για 3 ποδήλατα | BC-1 GOLD | example.com 1 0                    
17272 /s/x/sxara-autokinitou-trion-theseon-bc-1-gold_1.jpg image 3477 Σχάρα αυτοκινήτου για 3 ποδήλατα | BC-1 GOLD | example.com 1 0 Σχάρα αυτοκινήτου για 3 ποδήλατα | BC-1 GOLD | example.com 1 0                    
17272 /s/x/sxara-autokinitou-trion-theseon-bc-1-gold_1.jpg image 3477 Σχάρα αυτοκινήτου για 3 ποδήλατα | BC-1 GOLD | example.com 1 0 Σχάρα αυτοκινήτου για 3 ποδήλατα | BC-1 GOLD | example.com 1 0                    
17272 /s/x/sxara-autokinitou-trion-theseon-bc-1-gold_1.jpg image 3477 Σχάρα αυτοκινήτου για 3 ποδήλατα | BC-1 GOLD | example.com 1 0 Σχάρα αυτοκινήτου για 3 ποδήλατα | BC-1 GOLD | example.com 1 0                    
17272 /s/x/sxara-autokinitou-trion-theseon-bc-1-gold_1.jpg image 3477 Σχάρα αυτοκινήτου για 3 ποδήλατα | BC-1 GOLD | example.com 1 0 Σχάρα αυτοκινήτου για 3 ποδήλατα | BC-1 GOLD | example.com 1 0                    
17272 /s/x/sxara-autokinitou-trion-theseon-bc-1-gold_1.jpg image 3477 Σχάρα αυτοκινήτου για 3 ποδήλατα | BC-1 GOLD | example.com 1 0 Σχάρα αυτοκινήτου για 3 ποδήλατα | BC-1 GOLD | example.com 1 0                    
17272 /s/x/sxara-autokinitou-trion-theseon-bc-1-gold_1.jpg image 3477 Σχάρα αυτοκινήτου για 3 ποδήλατα | BC-1 GOLD | example.com 1 0 Σχάρα αυτοκινήτου για 3 ποδήλατα | BC-1 GOLD | example.com 1 0                    
17272 /s/x/sxara-autokinitou-trion-theseon-bc-1-gold_1.jpg image 3477 Σχάρα αυτοκινήτου για 3 ποδήλατα | BC-1 GOLD | example.com 1 0 Σχάρα αυτοκινήτου για 3 ποδήλατα | BC-1 GOLD | example.com 1 0                    
17272 /s/x/sxara-autokinitou-trion-theseon-bc-1-gold_1.jpg image 3477 Σχάρα αυτοκινήτου για 3 ποδήλατα | BC-1 GOLD | example.com 1 0 Σχάρα αυτοκινήτου για 3 ποδήλατα | BC-1 GOLD | example.com 1 0                    

Proposed Solution

  • Add logic in the migration handler and/or resource model to check for existing records before inserting into catalog_product_entity_media_gallery_value.

Additional Information

  • No duplicates existed in source (Magento 1) DB before migration.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions