Coding rules #249
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,73 @@ | ||
| # Coding Rules for migrate-confluence-rules | ||
|
|
||
| ## 1. Analyzer Processor Pattern | ||
|
|
||
| Each XML entity processed by the Analyzer requires: | ||
|
|
||
| ### File Naming | ||
| - Location: `src/Analyzer/Processor/` | ||
| - Pattern: `{EntityName}.php` (singular, PascalCase) | ||
| - Examples: `Page.php`, `BlogPost.php`, `Users.php`, `Comments.php` | ||
|
|
||
| ### Class Convention | ||
| - Implements: `IAnalyzerProcessor` | ||
| - Extends: `ProcessorBase` | ||
| - Name: `{EntityName}` class in namespace `HalloWelt\MigrateConfluence\Analyzer\Processor` | ||
|
|
||
| ### Database Table Requirement | ||
| Each processor must have corresponding table(s) in WorkspaceDB: | ||
| - Primary table: `snake_case` plural form (e.g., `pages`, `blog_posts`) | ||
| - Meta/auxiliary tables: `{primary}_meta`, `{primary}_additional`, etc. | ||
| - Registration: Must be added to `WorkspaceDB::createTables()` and `$allowedTables` whitelist | ||
|
|
||
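The naming rules above can be illustrated with a small helper. This is only a sketch: `primaryTableName()` is a hypothetical function that does not exist in the project, and its pluralisation is deliberately naive (it just appends `s`), so irregular table names such as `gliffy` would still need to be set by hand.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of the project): derives the primary
 * table name from a singular PascalCase entity name.
 */
function primaryTableName(string $entityName): string
{
    // Insert "_" before every uppercase letter except the first one.
    $snake = strtolower(preg_replace('/(?<!^)[A-Z]/', '_$0', $entityName));

    // Naive pluralisation: irregular names need a manual override.
    return $snake . 's';
}

echo primaryTableName('BlogPost'), PHP_EOL;     // blog_posts
echo primaryTableName('PageTemplate'), PHP_EOL; // page_templates
```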
| ## 2. WorkspaceDB Table Registration | ||
|
|
||
| For any new processor, follow this checklist: | ||
|
|
||
| 1. Define table schema in `WorkspaceDB::createTableXxx()` method | ||
| 2. Add table name to `$allowedTables` array in `getAllData()` | ||
| 3. Register creation call in `createTables()` method | ||
| 4. Add indexes in `createIndexes()` if performance-critical | ||
| 5. Add export method in JSON export chain | ||
| 6. Create add method: `add{EntityName}()` (e.g., `addPage()`, `addBlogPost()`, `addAttachment()`) | ||
| - Method signature: `public function add{EntityName}( ... ): void` | ||
| - Inserts a single object record into the corresponding table | ||
| - Example: `WorkspaceDB::addPage(...)` inserts into `pages` table | ||
|
|
||
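The checklist above could look roughly like the following. This is a minimal sketch under stated assumptions, not the project's `WorkspaceDB`: the class `SketchWorkspaceDB`, the in-memory SQLite backend (requires the `pdo_sqlite` extension), the column names, and the parameters of `addPage()` are all invented for illustration. Indexes and the JSON export step are omitted.

```php
<?php
declare(strict_types=1);

/**
 * Simplified stand-in for WorkspaceDB. The SQLite backend, the columns
 * and all method bodies are assumptions made for this sketch.
 */
final class SketchWorkspaceDB
{
    /** Tables that getAllData() is allowed to read (step 2). */
    private array $allowedTables = ['pages'];

    private PDO $pdo;

    public function __construct()
    {
        $this->pdo = new PDO('sqlite::memory:');
        $this->pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
        $this->createTables();
    }

    /** Step 3: every table creation call is registered here. */
    private function createTables(): void
    {
        $this->createTablePages();
    }

    /** Step 1: schema definition for one table. */
    private function createTablePages(): void
    {
        $this->pdo->exec(
            'CREATE TABLE IF NOT EXISTS pages (page_id INTEGER PRIMARY KEY, title TEXT NOT NULL)'
        );
    }

    /** Step 6: inserts a single record into the `pages` table. */
    public function addPage(int $pageId, string $title): void
    {
        $stmt = $this->pdo->prepare('INSERT INTO pages (page_id, title) VALUES (:id, :title)');
        $stmt->execute([':id' => $pageId, ':title' => $title]);
    }

    public function getAllData(string $table): array
    {
        // Table names cannot be bound as parameters, hence the allowlist check.
        if (!in_array($table, $this->allowedTables, true)) {
            throw new InvalidArgumentException("Table not allowed: $table");
        }

        return $this->pdo->query("SELECT * FROM $table")->fetchAll(PDO::FETCH_ASSOC);
    }
}

$db = new SketchWorkspaceDB();
$db->addPage(42, 'Getting Started');
print_r($db->getAllData('pages'));
```

The allowlist matters because the table name is interpolated into the SQL string; forgetting step 2 makes a new table unreadable through `getAllData()` in this sketch.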
| ## 3. Filename Conventions | ||
|
|
||
| | Component | Location | Pattern | Example | | ||
| |-----------|----------|---------|---------| | ||
| | Processor | `src/Analyzer/Processor/` | `{Entity}.php` | `Page.php` | | ||
| | Composer Processor | `src/Composer/Processor/` | `{Entity}.php` | `Pages.php` | | ||
| | Converter | `src/Converter/Processor/` | `{Operation}Macro.php` | `CodeMacro.php` | | ||
| | Postprocessor | `src/Converter/Postprocessor/` | `{Fix/Operation}.php` | `FixLineBreaks.php` | | ||
| | Preprocessor | `src/Converter/Preprocessor/` | Domain-specific | `HtmlPreprocessor.php` | | ||
|
|
||
| ## Wiki Title Conventions | ||
|
|
||
| - Wiki titles have to be created using `HalloWelt\MigrateConfluence\Utility\TitleBuilder` or `HalloWelt\MediaWiki\Lib\Migration\TitleBuilder` | ||
|
|
||
| ## 4. Database Relationships | ||
|
|
||
| Current entities and their tables: | ||
| - **Spaces** → `spaces`, `spaces_descriptions` | ||
| - **Pages** → `pages`, `pages_meta` | ||
| - **Blog Posts** → `blog_posts`, `blog_posts_meta` | ||
| - **Body Contents** → `body_contents`, `body_contents_bodies` | ||
| - **Attachments** → `attachments`, `attachments_meta`, `page_attachments`, `additional_attachments` | ||
| - **Users** → `users` | ||
| - **Comments** → `comments` | ||
| - **Labels** → `labels`, `labellings` | ||
| - **Content Properties** → `content_properties` | ||
| - **Gliffy** → `gliffy` | ||
| - **PageTemplates** → `page_templates`, `page_template_contents` | ||
|
|
||
|
Contributor
Maybe mention using (Generic)TitleBuilder to create safe and sanitized page titles
Contributor
Author
done |
||
| ## 5. Adding a New Processor | ||
|
|
||
| Steps: | ||
| 1. Create `src/Analyzer/Processor/{Entity}.php` extending `ProcessorBase` | ||
| 2. Add table creation to `WorkspaceDB` | ||
| 3. Register in `ConfluenceAnalyzer::processXML()` | ||
| 4. Create corresponding Composer processor if needed | ||
| 5. Create Converter processor if transformation required | ||
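Step 1 might start from a skeleton like the one below. Everything here is an assumption for illustration: `SketchProcessorBase` only stands in for the project's `ProcessorBase` (whose real API is not shown in this document), `SpacePermission` is a hypothetical new entity, and the sample XML shape (`<object class="...">` with `<property name="...">` children) is assumed rather than taken from the project.

```php
<?php
declare(strict_types=1);

/** Stand-in for the project's ProcessorBase; its real API may differ. */
abstract class SketchProcessorBase
{
    /** Collects the <property name="..."> children of an <object> element. */
    protected function readProperties(DOMElement $object): array
    {
        $properties = [];
        foreach ($object->childNodes as $child) {
            if ($child instanceof DOMElement && $child->tagName === 'property') {
                $properties[$child->getAttribute('name')] = trim($child->textContent);
            }
        }

        return $properties;
    }
}

/** Hypothetical processor for a "SpacePermission" entity. */
final class SpacePermission extends SketchProcessorBase
{
    /** Rows that a real processor would write to its `space_permissions` table. */
    public array $rows = [];

    public function process(DOMElement $object): void
    {
        if ($object->getAttribute('class') !== 'SpacePermission') {
            return; // not our entity
        }
        $this->rows[] = $this->readProperties($object);
    }
}

$dom = new DOMDocument();
$dom->loadXML(
    '<hibernate-generic>'
    . '<object class="Page"><property name="title">Home</property></object>'
    . '<object class="SpacePermission"><property name="type">VIEWSPACE</property></object>'
    . '</hibernate-generic>'
);

$processor = new SpacePermission();
foreach ($dom->getElementsByTagName('object') as $object) {
    $processor->process($object);
}
print_r($processor->rows);
```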
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,128 @@ | ||
| # Coding Rules for Composer Component | ||
|
|
||
| The Composer assembles converted WikiText content and resources into a MediaWiki importable XML format. | ||
|
|
||
| ## 1. Processor Pattern | ||
|
|
||
| Composer processors handle building specific parts of the final MediaWiki XML. | ||
|
|
||
| ### File Naming & Location | ||
| - Location: `src/Composer/Processor/{Entity}.php` | ||
| - Pattern: Plural entity names (Pages, Files, Comments) | ||
| - Examples: `Pages.php`, `Comments.php`, `Files.php` | ||
|
|
||
| ### Class Convention | ||
| - Implements: `IConfluenceComposerProcessor` | ||
| - Extends: `ProcessorBase` | ||
| - Namespace: `HalloWelt\MigrateConfluence\Composer\Processor` | ||
| - Method to implement: `process( Builder $builder, ... ): void` | ||
|
|
||
| ### Processor Responsibilities | ||
| - Read converted data from workspace files | ||
| - Read metadata from `WorkspaceDB` | ||
| - Build XML elements using `Builder` class | ||
| - Add pages, files, or metadata to the MediaWiki XML output | ||
|
|
||
| ## 2. Processor Methods | ||
|
|
||
| ### Standard Methods in ProcessorBase | ||
| - `__construct()`: Accept `Builder`, `DBComposerDataLookup`, `Workspace`, `Output`, etc. | ||
| - `process()`: Main entry point for building XML elements | ||
| - `getName()`: Return processor identifier string | ||
|
|
||
| ### File Naming & Location | ||
| - Location: `src/Composer/Processor/{Name}ContentPostProcessor.php` | ||
| - Example: `TemplateContentPostProcessor.php` | ||
|
|
||
| ### Class Convention | ||
| - Implements: `IPageContentPostProcessor` | ||
| - Namespace: `HalloWelt\MigrateConfluence\Composer\Processor` | ||
| - Method to implement: `process( string $pageId, string $pageTitle, string $content ): string` | ||
|
|
||
| ### Responsibilities | ||
| - Accept page content as WikiText string | ||
|
|
||
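A content post-processor with the signature quoted above might look like this. It is a sketch only: `SketchPageContentPostProcessor` stands in for `IPageContentPostProcessor`, and `CategoryContentPostProcessor` is a hypothetical class that is not part of the project. The code assumes PHP 8.0 or later.

```php
<?php
declare(strict_types=1);

/** Stand-in interface declaring the signature quoted above. */
interface SketchPageContentPostProcessor
{
    public function process(string $pageId, string $pageTitle, string $content): string;
}

/** Hypothetical post-processor: appends a category to every page. */
final class CategoryContentPostProcessor implements SketchPageContentPostProcessor
{
    public function __construct(private string $category)
    {
    }

    public function process(string $pageId, string $pageTitle, string $content): string
    {
        $marker = "[[Category:{$this->category}]]";

        // Idempotent: do not add the category a second time.
        if (str_contains($content, $marker)) {
            return $content;
        }

        return rtrim($content) . "\n\n" . $marker . "\n";
    }
}

$postProcessor = new CategoryContentPostProcessor('Imported');
echo $postProcessor->process('123', 'Home', "Welcome to the wiki.\n");
```

Returning the input unchanged when the marker is already present keeps the processor safe to run more than once on the same page.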
| ## 3. Processor Registration | ||
|
|
||
| All processors must be registered in `ConfluenceComposer::buildXML()`: | ||
|
|
||
| 1. **Create processor instance** with required dependencies: | ||
| - `Builder` instance | ||
| - `DBComposerDataLookup` for data access | ||
| - `Workspace` for file access | ||
| - `Output` for progress reporting | ||
| - `MigrationConfig` for settings | ||
|
|
||
| 2. **Call processor** in appropriate order: | ||
| - Files: typically first (attachments, images) | ||
| - Pages: main content | ||
| - Comments: page comments | ||
| - Post-processors: applied per-page during processing | ||
|
|
||
| ### Example Registration Pattern | ||
| ```php | ||
| $processors = [ | ||
| new Files( | ||
| $builder, $composerDataLookup, $this->workspace, | ||
| $this->output, $this->dest, $this->migrationConfig, | ||
| $deploymentInfo | ||
| ), | ||
| new Pages( | ||
| $builder, $composerDataLookup, $this->workspace, | ||
| $this->output, $this->dest, $this->migrationConfig, | ||
| $deploymentInfo | ||
| ), | ||
| ]; | ||
| ``` | ||
|
|
||
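How the registered processors are then executed is not shown here; one plausible shape is a plain loop in registration order. The sketch below uses stand-ins only: `SketchComposerProcessor` and `RecordingProcessor` are hypothetical, and `process()` takes no arguments here, unlike the `process( Builder $builder, ... )` signature described above. It assumes PHP 8.0 or later.

```php
<?php
declare(strict_types=1);

/** Stand-in for a Composer processor; only the two methods used here. */
interface SketchComposerProcessor
{
    public function getName(): string;

    public function process(): void;
}

/** Hypothetical processor that only records that it ran. */
final class RecordingProcessor implements SketchComposerProcessor
{
    public function __construct(private string $name, private ArrayObject $log)
    {
    }

    public function getName(): string
    {
        return $this->name;
    }

    public function process(): void
    {
        $this->log->append($this->name);
    }
}

$log = new ArrayObject();

// Registration order is execution order: files, then pages, then comments.
$processors = [
    new RecordingProcessor('files', $log),
    new RecordingProcessor('pages', $log),
    new RecordingProcessor('comments', $log),
];

foreach ($processors as $processor) {
    echo 'Running ', $processor->getName(), PHP_EOL;
    $processor->process();
}
```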
| ## 4. Data Lookup Pattern | ||
|
|
||
| ### DBComposerDataLookup | ||
| - Provides convenient access to composed data from database | ||
| - Methods like `getPageData()`, `getAttachmentData()`, etc. | ||
| - Filters and caches results for performance | ||
|
|
||
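The caching behaviour mentioned above can be sketched as a memoising wrapper. This is an assumption about the general idea, not the implementation of `DBComposerDataLookup`: `SketchDataLookup`, its loader callback, and the returned array shape are hypothetical. It assumes PHP 8.0 or later.

```php
<?php
declare(strict_types=1);

/** Hypothetical lookup that loads each page record at most once. */
final class SketchDataLookup
{
    private array $cache = [];

    /** Number of times the underlying loader was actually called. */
    public int $loads = 0;

    public function __construct(private Closure $loader)
    {
    }

    public function getPageData(int $pageId): array
    {
        if (!array_key_exists($pageId, $this->cache)) {
            $this->loads++;
            $this->cache[$pageId] = ($this->loader)($pageId);
        }

        return $this->cache[$pageId];
    }
}

// The loader stands in for a database query.
$lookup = new SketchDataLookup(
    static fn (int $pageId): array => ['page_id' => $pageId, 'title' => "Page $pageId"]
);

$lookup->getPageData(7);
$lookup->getPageData(7); // served from the cache
echo $lookup->loads, PHP_EOL; // 1
```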
| ## 6. Builder Integration | ||
|
|
||
| ### Required Data for Builder | ||
| - **Pages**: title, content, timestamp, author, page_id | ||
| - **Files**: filename, content (binary), description, upload_date | ||
|
|
||
| ## 7. Progress Reporting | ||
|
|
||
| ### Output Integration | ||
| - Use `$this->output->writeln()` for progress messages | ||
| - Report processing status per entity type | ||
| - Indicate progress: "Processing 250/1000 pages..." | ||
|
|
||
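A progress message in the format shown above could be produced as follows. The helper names `progressMessage()` and `reportProgress()` are hypothetical, and a plain callable stands in for `$this->output->writeln()` so that the sketch runs without any framework.

```php
<?php
declare(strict_types=1);

/** Hypothetical helper: formats one progress line. */
function progressMessage(int $done, int $total, string $entityType): string
{
    return sprintf('Processing %d/%d %s...', $done, $total, $entityType);
}

/** Hypothetical helper: reports every $step items and once at the end. */
function reportProgress(callable $writeln, int $total, string $entityType, int $step = 250): void
{
    for ($done = 1; $done <= $total; $done++) {
        // ... process one item here ...
        if ($done % $step === 0 || $done === $total) {
            $writeln(progressMessage($done, $total, $entityType));
        }
    }
}

reportProgress(
    static function (string $line): void {
        echo $line, PHP_EOL;
    },
    1000,
    'pages'
);
```

Reporting in steps rather than per item keeps the output readable for large exports.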
| ### Logging | ||
| - Use `DBLog` for errors or warnings | ||
| - Log skipped items and reasons | ||
| - Log final statistics | ||
|
|
||
| ## 8. Configuration & Deployment Info | ||
|
|
||
| ### MigrationConfig Usage | ||
| - Access namespaces configuration | ||
| - Access file extension whitelist | ||
| - Access custom replacements or mappings | ||
| - Passed to constructor, stored as instance variable | ||
|
|
||
| ### ComposerDeploymentInfo | ||
| - Stores deployment-specific information | ||
| - Passed to all processors for consistency | ||
| - Used for namespace and prefix mapping | ||
|
|
||
| ## 9. Adding a New Processor | ||
|
|
||
| Steps to add a new Composer Processor: | ||
|
|
||
| 1. Create `src/Composer/Processor/{Entity}.php` (plural entity name, e.g. `Pages.php`) | ||
| 2. Implement `IConfluenceComposerProcessor` or extend `ProcessorBase` | ||
| 3. Implement `process()` method: | ||
| - Accept `Builder` and required data sources | ||
| - Read from workspace/database as needed | ||
| - Call appropriate `Builder` methods | ||
| 4. Register in `ConfluenceComposer::buildXML()` | ||
| 5. Add appropriate data lookup methods to `DBComposerDataLookup` if needed | ||
| 6. Test end-to-end XML output |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,108 @@ | ||
| # Coding Rules for Converter Component | ||
|
|
||
| The Converter transforms Confluence Storage XML content into MediaWiki WikiText format. It processes DOM documents through processors, preprocessors, and postprocessors. | ||
|
|
||
| ## 1. Processor Pattern | ||
|
|
||
| Converter processors handle transformation of specific Confluence elements or macros. | ||
|
|
||
| ### File Naming & Location | ||
| - **Macro Processors**: `src/Converter/Processor/{MacroName}Macro.php` | ||
| - Examples: `CodeMacro.php`, `TocMacro.php`, `PanelMacro.php` | ||
| - **Content Processors**: `src/Converter/Processor/{ElementType}.php` | ||
| - Examples: `Image.php`, `PageLink.php`, `UserLink.php`, `Emoticon.php` | ||
| - **Base Classes**: `src/Converter/Processor/{BaseType}Base.php` | ||
| - Examples: `MacroProcessorBase.php`, `StructuredMacroProcessorBase.php`, `LinkProcessorBase.php` | ||
|
|
||
| ### Class Convention | ||
| - Implements: `IProcessor` | ||
| - Extends: One of the base classes (`MacroProcessorBase`, `StructuredMacroProcessorBase`, `LinkProcessorBase`) | ||
| - Namespace: `HalloWelt\MigrateConfluence\Converter\Processor` | ||
| - Method to implement: `process( DOMDocument $dom ): void` | ||
| - Searches for target elements/macros in the DOM | ||
| - Transforms them using DOM manipulation | ||
|
|
||
| ### Pattern Specifics | ||
| - For macro processors: implement `getMacroName(): string` to specify target macro name | ||
| - Locate elements via `getElementsByTagName()` or `DOMXPath::query()`; PHP's `DOMDocument` has no `getElementsByClassName()`, so match classes with an XPath expression | ||
| - Replace or modify DOM nodes in place | ||
| - Handle parameters from `ac:parameter` child elements, identified by their `ac:name` attribute (Confluence storage format) | ||
|
|
||
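The macro pattern above can be sketched as follows. This is illustrative only: `SketchInfoMacro` does not extend the project's base classes, the target markup (a `div` with a `title` attribute) is invented, and `urn:example:ac` is a placeholder namespace URI chosen so that the sample document parses. Only `getMacroName()` and `process( DOMDocument $dom ): void` are taken from the rules above.

```php
<?php
declare(strict_types=1);

/** Hypothetical macro processor; it does not extend the project's base classes. */
final class SketchInfoMacro
{
    // Placeholder namespace URI, chosen only so the sample document parses.
    public const AC_NS = 'urn:example:ac';

    public function getMacroName(): string
    {
        return 'info';
    }

    public function process(DOMDocument $dom): void
    {
        $xpath = new DOMXPath($dom);
        $xpath->registerNamespace('ac', self::AC_NS);
        $query = sprintf('//ac:structured-macro[@ac:name="%s"]', $this->getMacroName());

        // Copy the matches into an array before modifying the document.
        foreach (iterator_to_array($xpath->query($query)) as $macro) {
            // Parameters are child elements, identified by their ac:name attribute.
            $title = $xpath->evaluate('string(ac:parameter[@ac:name="title"])', $macro);

            $box = $dom->createElement('div');
            $box->setAttribute('class', 'info-box');
            $box->setAttribute('title', $title);

            // Swap the macro for its replacement in place.
            $macro->parentNode->replaceChild($box, $macro);
        }
    }
}

$dom = new DOMDocument();
$dom->loadXML(
    '<root xmlns:ac="' . SketchInfoMacro::AC_NS . '">'
    . '<ac:structured-macro ac:name="info">'
    . '<ac:parameter ac:name="title">Heads up</ac:parameter>'
    . '</ac:structured-macro>'
    . '</root>'
);

(new SketchInfoMacro())->process($dom);
echo $dom->saveXML($dom->documentElement), PHP_EOL;
```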
| ## 2. Preprocessor Pattern | ||
|
|
||
| Preprocessors prepare the HTML/DOM **before** macro conversion to fix structural issues. | ||
|
|
||
| ### File Naming & Location | ||
| - HTML Preprocessors: `src/Converter/Preprocessor/html/{Name}.php` | ||
| - Example: `CDATAClosingFixer.php` | ||
| - DOM Preprocessors: `src/Converter/Preprocessor/dom/{Name}.php` | ||
| - Examples: `HoistMacroFromHeading.php`, `SanitizeLinkContent.php`, `Table.php` | ||
|
|
||
| ### Class Convention | ||
| - Implements: `IHtmlPreprocessor` or `IDomPreprocessor` | ||
| - Namespace: `HalloWelt\MigrateConfluence\Converter\Preprocessor\{html|dom}` | ||
| - Method to implement: | ||
| - `IHtmlPreprocessor`: `process( string $html ): string` | ||
| - `IDomPreprocessor`: `process( DOMDocument $dom ): void` | ||
|
|
||
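An HTML preprocessor with the string-in, string-out signature above might look like this. It is a sketch under assumptions: `SketchHtmlPreprocessor` stands in for `IHtmlPreprocessor`, and `NbspEntityFixer` is a hypothetical class, not one of the project's preprocessors. It addresses a common structural issue: `&nbsp;` is not a predefined XML entity, so an XML parser rejects it unless it is rewritten first.

```php
<?php
declare(strict_types=1);

/** Stand-in interface declaring the HTML preprocessor signature. */
interface SketchHtmlPreprocessor
{
    public function process(string $html): string;
}

/** Hypothetical preprocessor: makes non-breaking spaces XML-safe. */
final class NbspEntityFixer implements SketchHtmlPreprocessor
{
    public function process(string $html): string
    {
        // The numeric character reference is valid in any XML document.
        return str_replace('&nbsp;', '&#160;', $html);
    }
}

$fixed = (new NbspEntityFixer())->process('<p>a&nbsp;b</p>');

// After the fix the fragment can be loaded as XML for DOM preprocessing.
$dom = new DOMDocument();
$loaded = $dom->loadXML($fixed);
var_dump($loaded);
```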
| ## 3. Postprocessor Pattern | ||
|
|
||
| Postprocessors fix content **after** macro conversion and the Pandoc HTML-to-WikiText transformation. | ||
|
|
||
| ### File Naming & Location | ||
| - Location: `src/Converter/Postprocessor/{Fix|Operation}.php` | ||
| - Examples: `FixLineBreakInHeadings.php`, `FixMultilineTable.php`, `NestedHeadings.php` | ||
| - Use `Fix` prefix for bug fixes, descriptive name for enhancements | ||
|
|
||
| ### Class Convention | ||
| - Implements: `IPostprocessor` | ||
| - Namespace: `HalloWelt\MigrateConfluence\Converter\Postprocessor` | ||
| - Method to implement: `process( string $output ): string` | ||
| - Takes WikiText string as input | ||
| - Returns modified WikiText string | ||
| - Use regex or string manipulation for text-level changes | ||
|
|
||
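The postprocessor contract above, and the reason ordering matters, can be sketched like this. All names are hypothetical: `SketchPostprocessor` stands in for `IPostprocessor`, and `TrimTrailingSpaces`, `CollapseBlankLines`, and `applyPostprocessors()` are not part of the project.

```php
<?php
declare(strict_types=1);

/** Stand-in interface declaring the postprocessor signature. */
interface SketchPostprocessor
{
    public function process(string $output): string;
}

/** Hypothetical: strips spaces and tabs at the end of every line. */
final class TrimTrailingSpaces implements SketchPostprocessor
{
    public function process(string $output): string
    {
        return preg_replace('/[ \t]+$/m', '', $output) ?? $output;
    }
}

/** Hypothetical: turns three or more consecutive newlines into two. */
final class CollapseBlankLines implements SketchPostprocessor
{
    public function process(string $output): string
    {
        return preg_replace('/\n{3,}/', "\n\n", $output) ?? $output;
    }
}

/** Applies each postprocessor in sequence to the WikiText. */
function applyPostprocessors(array $postprocessors, string $wikiText): string
{
    foreach ($postprocessors as $postprocessor) {
        $wikiText = $postprocessor->process($wikiText);
    }

    return $wikiText;
}

$wikiText = "A\n \n \nB";
$result = applyPostprocessors([new TrimTrailingSpaces(), new CollapseBlankLines()], $wikiText);
echo $result, PHP_EOL;
```

Each class handles one concern, and the order is significant: trimming first turns the whitespace-only lines into empty ones, which the second step can then collapse. In the reverse order the blank-line pass would find nothing to collapse.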
| ### Usage Pattern | ||
| - Applied in sequence after HTML-to-WikiText conversion | ||
| - Each postprocessor should handle one specific concern | ||
| - Can be disabled/reordered via configuration | ||
|
Contributor
Postprocessing of specific titles can be skipped by injecting the title into the corresponding PostProcessor constructor.
Contributor
Author
That should be clear when checking other processors as examples. |
||
|
|
||
| ## 4. Processor Registration | ||
|
|
||
| All processors must be registered in `ConfluenceConverter::__construct()`: | ||
|
|
||
| 1. **Processors**: Add to processor instantiation list | ||
| - Order matters (executed in registration order) | ||
| 2. **Preprocessors**: Add to appropriate preprocessor chain | ||
| - HTML preprocessors before DOM preprocessing | ||
| - DOM preprocessors before macro conversion | ||
| 3. **Postprocessors**: Add to postprocessor chain | ||
| - Order: Fix issues bottom-up (earlier fixes enable later ones) | ||
|
Contributor
There is also DOM and WikiText postprocessing.
Contributor
Author
Yes, we postprocess both DOM and WikiText, but only WikiText has dedicated classes. The rest has to be reworked in the future, and then we can update it here. |
||
|
|
||
| ## 5. DOM Processing Best Practices | ||
|
|
||
| - Use `DOMXPath` for complex queries instead of `getElementsByTagName()` | ||
| - Always iterate over a copy of the NodeList before modifying: | ||
| ```php | ||
| $nodes = []; | ||
| foreach ($dom->getElementsByTagName('macro') as $node) { | ||
| $nodes[] = $node; | ||
| } | ||
| foreach ($nodes as $node) { | ||
| // Safe to modify DOM here | ||
| } | ||
| ``` | ||
| - Replace nodes using `replaceChild()`, or `insertBefore()` followed by `removeChild()` | ||
| - Set attributes with `setAttribute()`, get with `getAttribute()` | ||
| - Create new elements with `createElement()` | ||
|
|
||
| ## 6. Naming Conventions Summary | ||
|
|
||
| | Type | Location | Pattern | Example | | ||
| |------|----------|---------|---------| | ||
| | Macro Processor | `Processor/` | `{MacroName}Macro.php` | `CodeMacro.php` | | ||
| | Content Processor | `Processor/` | `{ElementType}.php` | `Image.php` | | ||
| | Processor Base | `Processor/` | `{Type}ProcessorBase.php` | `MacroProcessorBase.php` | | ||
| | HTML Preprocessor | `Preprocessor/html/` | `{Name}.php` | `CDATAClosingFixer.php` | | ||
| | DOM Preprocessor | `Preprocessor/dom/` | `{Name}.php` | `Table.php` | | ||
| | Postprocessor | `Postprocessor/` | `{Fix\|Operation}.php` | `FixLineBreakInHeadings.php` | | ||
Maybe also add the `page_templates` table already.
done