Forum: SDL Trados support
Topic: QA Checker 3.0 & tags
Poster: asr2
Post title: Regexes in QA
This is an extract from a book I am preparing; unfortunately, the formatting and figures have been lost.
20.4 Use of regexes in QA
Although Trados Studio provides powerful integrated QA processing, there are many situations where regexes can be used to define user-specific QA rules. This section provides several examples of user-written QA regexes.
“Project Settings” → “Verification” → “QA Checker 3.0” → “Regular Expressions” lists the regexes applied during QA (Quality Assurance).
Activate the “Search regular expressions” checkbox to enable editing (otherwise the input fields are locked).
Condition
The “Condition” specifies which regex verification should be performed.
Condition | Meaning
Report if both target and source RegEx patterns match | The patterns found in the source and the target match, i.e. no transformation was performed
Report if target matches but not the source | The pattern found in the target does not have the specified matching pattern in the source
Report if source matches but not the target | The pattern found in the source does not have the specified matching pattern in the target
Report if source matches (source check only) | The specified source pattern is found; this can be used for pattern searching in the source
Report if target matches (target check only) | The specified target pattern is found; this can be used for pattern searching in the target
Report if both source and target match but with different count | The number of matches found in the source and in the target differ
Grouped search expression - report if source matches but not target | The capturing group specified in the source is not found in the target
Grouped search expression - report if source and target matches | The capturing group specified in the source matches that in the target, i.e. no translation was performed
Table 20-4: Regex verification conditions
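The non-grouped conditions in Table 20-4 can be approximated outside Studio to try out a pair of regexes before entering them. The following is only a minimal sketch in Python: the function name would_report and the short condition keys are hypothetical, Python's re module is not the regex engine Studio uses, and the logic reflects my reading of the table rather than Studio's actual implementation.

```python
import re

def would_report(condition, source_rx, target_rx, source, target):
    """Return True if the rule would flag this segment pair (approximation only)."""
    # Count non-overlapping matches on each side; an empty pattern counts as 0.
    src = len(re.findall(source_rx, source)) if source_rx else 0
    tgt = len(re.findall(target_rx, target)) if target_rx else 0
    rules = {
        "both_match": src > 0 and tgt > 0,
        "target_not_source": tgt > 0 and src == 0,
        "source_not_target": src > 0 and tgt == 0,
        "source_only": src > 0,
        "target_only": tgt > 0,
        # Assumption: both sides must match at least once for the counts to be compared.
        "different_count": src > 0 and tgt > 0 and src != tgt,
    }
    return rules[condition]

DATE_DE = r"\d+\.\d+\.\d+"   # dd.mm.yy
DATE_EN = r"\d+-\d+-\d+"     # dd-mm-yy

# The date was left in the source format, so the rule should report the segment.
print(would_report("source_not_target", DATE_DE, DATE_EN,
                   "Termin: 24.12.21", "Date: 24.12.21"))
```

Such a check is useful for catching an over-broad or over-narrow pattern early; the result in Studio itself remains the reference.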
“Grouped” means that capturing groups are involved. In this case, the source field contains a regex used for matching and the target field contains a regex in substitution format. The capturing groups (either numbered or named) from the source can be referenced in the target, see Example 2. The sequence of the capturing groups can be changed, for example, to check that a European format date (dd.mm.yy) has been converted correctly to an American format date (yy-mm-dd) and that the individual fields match.
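The reordering of capturing groups can be illustrated with a small Python sketch. It is an approximation under several assumptions: the helper name grouped_source_not_target is hypothetical, the target is treated as a plain substitution template whose expansion must appear literally in the target segment, and Python writes group references as \1, \2, \3 (the .NET style typically seen in Studio examples is $1, $2, $3).

```python
import re

def grouped_source_not_target(source_rx, target_template, source, target):
    """Report True if any source match, rewritten via the template, is missing in the target."""
    for match in re.finditer(source_rx, source):
        # Build the expected target text from the captured source groups.
        expected = match.expand(target_template)
        if expected not in target:
            return True
    return False

SOURCE_RX = r"(\d{2})\.(\d{2})\.(\d{2})"   # dd.mm.yy
TEMPLATE = r"\3-\2-\1"                      # yy-mm-dd, groups in reversed order

print(grouped_source_not_target(SOURCE_RX, TEMPLATE,
                                "Lieferung am 24.12.21",
                                "Delivery on 21-12-24"))
```

Unlike a plain format check, this also compares the individual fields, so a target date with the right format but a wrong day or month would be reported.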
Example 1
Check for format mismatch.
Figure 20-4.3: QA regex entry example 1
Enter a “Description” (mandatory).
Enter the regex(es) in the “RegEx source” and/or “RegEx target” fields as appropriate.
Select an appropriate “Condition”.
This regex example checks that a source date (German, in the form dd.mm.yy) has the correct target date format (English, dd-mm-yy).
Enter (\d+)\.(\d+)\.(\d+) and (\d+)-(\d+)-(\d+) in the “RegEx source” and “RegEx target” fields, respectively.
In this case, select “Report if source matches but not the target” as the “Condition”.
Note: The converse condition (“Report if target matches but not the source”) checks that for the specified target format, the associated source format matches. In the above case, a target format date dd-mm-yy must have a corresponding source format date dd.mm.yy.
Activate the “Ignore case” checkbox if case-insensitive matching is required.
Under “Action”, click “Add item” to add the regex rule.
Figure 20-4.4: Erroneous regex
If the regex is erroneous, its font colour changes to red (see above figure). Clicking the (red exclamation mark) icon displays the associated error message.
Click “OK” to save all the regex rules with the project.
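The two regexes of Example 1 can be tried on sample segments before they are entered in Studio. The sketch below uses Python's re module as a stand-in; the function name example1_reports and the sample segments are made up for illustration, and Studio's own engine may differ in detail.

```python
import re

SOURCE_RX = re.compile(r"(\d+)\.(\d+)\.(\d+)")   # dd.mm.yy
TARGET_RX = re.compile(r"(\d+)-(\d+)-(\d+)")     # dd-mm-yy

def example1_reports(source, target):
    """Approximate 'Report if source matches but not the target'."""
    return bool(SOURCE_RX.search(source)) and not TARGET_RX.search(target)

pairs = [
    ("Liefertermin: 24.12.21", "Delivery date: 24-12-21"),  # date converted
    ("Liefertermin: 24.12.21", "Delivery date: 24.12.21"),  # date left unchanged
    ("Version 3.10.2", "Version 3.10.2"),                   # not a date at all
]
for source, target in pairs:
    print(example1_reports(source, target), "|", source, "|", target)
```

The third pair shows that \d+ is loose enough to match version numbers as well; a tighter source pattern such as (\d{2})\.(\d{2})\.(\d{2}) is one possible way to reduce such false reports.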