Data Model Refactor + Better Parsing#277

Open
ekeilty17 wants to merge 2 commits into master from data-model-refactor

@ekeilty17
Collaborator

Reasons for Refactor

1. Fixing Critical Bugs

The previous iteration of the codebase contained a few bugs that were not easy to fix, most of them related to how watsonx Assistant manages step variable IDs. When an action is duplicated, its variables are duplicated too, so in many situations the same variable ID is reused across actions. The previous code conflated these duplicate variables, and therefore their metadata, which led to inaccurate reports.

2. More Complete Parsing

The previous iteration of the codebase made certain simplifications when parsing the wxa JSON. As a result, some features would have been difficult to implement on top of it, in particular the step-settings toggles such as Display options and Repeat response after the validation message. These toggles are not stored as simple booleans and must be computed from other fields.

3. Better Modularization and Documentation

Good code should be self-documenting, which is a must in the era of AI coding tools. The previous iteration of the codebase was close, but not quite there.

4. CLI Deficiencies

The previous iteration of the codebase had only moderate CLI documentation and did not allow for packaging.

New Features

1. Strong Typing

All relevant objects in the assistant JSON are now parsed into dataclasses. This makes the code easier to understand and maintain, and easier to extend as the JSON format changes.

The other benefit of using dataclasses is that LLMs have seen a great deal of them in training, so AI coding tools can easily navigate to the appropriate dataclass to get exactly the context they need. In the future, this will make it easier to write a skill.md file so that an AI coding tool can use the SDK features in custom scripts.
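As a rough illustration of the pattern described above, the sketch below parses a small JSON fragment into frozen dataclasses. The class names (Action, Step), the from_json helper, and the JSON keys are assumptions made for this example; they are simplified and are not taken from the refactored code or from the real assistant schema.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Step:
    """One step of an action (hypothetical, simplified shape)."""
    step_id: str
    title: str = ""


@dataclass(frozen=True)
class Action:
    """One action and its steps (hypothetical, simplified shape)."""
    action_id: str
    title: str = ""
    steps: tuple[Step, ...] = ()

    @classmethod
    def from_json(cls, raw: dict[str, Any]) -> Action:
        # The key names used here are assumptions for illustration only.
        steps = tuple(
            Step(step_id=s["step"], title=s.get("title", ""))
            for s in raw.get("steps", [])
        )
        return cls(action_id=raw["action"], title=raw.get("title", ""), steps=steps)


raw_action = {
    "action": "action_1",
    "title": "Reset password",
    "steps": [{"step": "step_001", "title": "Ask for email"}, {"step": "step_002"}],
}
action = Action.from_json(raw_action)
```

Keeping all knowledge of the raw key names inside from_json means the rest of the code only ever sees typed attributes such as action.steps[0].title.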

2. Complete Modelling of the Assistant JSON

All important fields in the assistant JSON are now parsed. This includes several fields that the previous codebase did not handle, such as:

  • handlers
  • question
  • response_type (all of them)
  • system_settings

3. Variables

Variable handling has been completely overhauled to account for duplication. Each variable now has a uid field that differentiates duplicated variables sharing the same id.

Additionally, there is better differentiation between all variable types (skill, step, result, and system). The user can now filter on these variable types.
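The difference between id and uid can be sketched as follows. This is only an illustration: the Variable fields, the scope attribute, and the way the uid is derived here are assumptions, and the actual uid scheme in the refactor may differ.

```python
from dataclasses import dataclass
from enum import Enum


class VariableType(Enum):
    SKILL = "skill"
    STEP = "step"
    RESULT = "result"
    SYSTEM = "system"


@dataclass(frozen=True)
class Variable:
    id: str              # ID as it appears in the JSON; may repeat across actions
    scope: str           # owning action or skill (hypothetical field)
    type: VariableType

    @property
    def uid(self) -> str:
        # Assumed scheme: qualify the shared id with the scope that owns it.
        return f"{self.scope}:{self.id}"


def filter_by_type(variables, wanted):
    """Return only the variables of the requested type."""
    return [v for v in variables if v.type is wanted]


variables = [
    Variable("step_001", "action_1", VariableType.STEP),
    Variable("step_001", "action_1_copy", VariableType.STEP),  # from a duplicated action
    Variable("customer_name", "main_skill", VariableType.SKILL),
]

by_id = {v.id: v for v in variables}    # the two step_001 entries collapse into one
by_uid = {v.uid: v for v in variables}  # every variable stays distinct
step_vars = filter_by_type(variables, VariableType.STEP)
```

Indexing by id silently drops one of the duplicated variables, which is the kind of conflation described under Reasons for Refactor; indexing by uid keeps both.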

4. Entities

Entities are now extracted from conditions and context. Additionally, the question field is parsed for entities, and those are surfaced to the user.

5. Action Conditions

Action conditions are now parsed and surfaced to the user.

6. Response Types

In the previous iteration, only text responses and option responses were parsed. Now all response types are parsed, along with more metadata about each of them.
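One way to picture parsing by response type is a small dispatcher keyed on response_type, with a fallback so that unrecognized types are preserved rather than dropped. The class names, the parse_response function, and the simplified payload shapes below are assumptions for illustration, not the refactor's actual code.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class TextResponse:
    text: str


@dataclass(frozen=True)
class OptionResponse:
    labels: tuple[str, ...]


@dataclass(frozen=True)
class UnknownResponse:
    response_type: str
    raw: dict[str, Any]  # kept as-is so no information is lost


def parse_response(raw: dict[str, Any]):
    """Map one raw response dict to a typed object (simplified payloads)."""
    kind = raw.get("response_type", "")
    if kind == "text":
        return TextResponse(text=raw.get("text", ""))
    if kind == "option":
        labels = tuple(o["label"] for o in raw.get("options", []))
        return OptionResponse(labels=labels)
    # Any type without a dedicated class still gets a typed wrapper.
    return UnknownResponse(response_type=kind, raw=raw)


text = parse_response({"response_type": "text", "text": "Hi"})
option = parse_response(
    {"response_type": "option", "options": [{"label": "Yes"}, {"label": "No"}]}
)
other = parse_response({"response_type": "image", "source": "cat.png"})
```

Supporting a further response type then only requires a new dataclass and one more branch in the dispatcher.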

7. Inferred Settings Toggles

Step-settings toggles such as Display options and Repeat response after the validation message must be inferred from fields in the TextResponse and Question objects. This was difficult to do in the previous version of the code. These toggle values are now computed.
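A computed toggle can be modelled as a read-only property on a dataclass, so callers see a plain boolean even though none is stored in the JSON. The sketch below shows only the shape of that idea: the field names and the inference rule are placeholders invented for this example, not the rules the refactor implements.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Question:
    options: tuple[str, ...] = ()
    hide_options: bool = False  # hypothetical raw field


@dataclass(frozen=True)
class StepSettings:
    question: Optional[Question] = None

    @property
    def display_options(self) -> bool:
        # Placeholder rule: the toggle counts as "on" when the step asks a
        # question that has options and nothing hides them.
        q = self.question
        return q is not None and bool(q.options) and not q.hide_options


shown = StepSettings(Question(options=("Yes", "No")))
hidden = StepSettings(Question(options=("Yes", "No"), hide_options=True))
no_question = StepSettings()
```

Because the value is derived on access, it cannot drift out of sync with the fields it depends on.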

8. Improved CLI

  • As a result of better parsing, more fine-grained CLI functions can be provided to the user. See the cli/ folder
  • Better documentation of the CLI has been written. The user can pass --help to any command to see how to use it, along with examples.
  • The CLI can be packaged
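The kind of per-command help with examples described above can be sketched with the standard library's argparse. The program name wxa-tool, the variables subcommand, and its --type flag are hypothetical and are not taken from the cli/ folder; the PR also does not say which CLI library the project uses.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="wxa-tool",  # hypothetical program name
        description="Inspect an exported assistant JSON file.",
    )
    sub = parser.add_subparsers(dest="command", required=True)

    variables = sub.add_parser(
        "variables",
        help="List the variables found in the assistant.",
        # RawDescriptionHelpFormatter keeps the line breaks in the epilog,
        # so the example is printed verbatim by `variables --help`.
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="example:\n  wxa-tool variables assistant.json --type step",
    )
    variables.add_argument("path", help="Path to the assistant JSON file.")
    variables.add_argument(
        "--type",
        choices=["skill", "step", "result", "system"],
        default=None,
        help="Only show variables of this type.",
    )
    return parser


args = build_parser().parse_args(["variables", "assistant.json", "--type", "step"])
```

Exposing build_parser() as a function also makes it simple to reference from a packaging entry point and to exercise in tests without spawning a subprocess.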

DCO 1.1 Signed-off-by: Eric Keilty eric.keilty@ibm.com
