When locating a defined term in paragraph text, quote style can differ between the document text and the search phrase. A matching step that treats curly and straight quote variants as the same character prevents punctuation style from hiding a real match.
findUniqueSubstringMatch compares a haystack (the paragraph text being searched) with a needle (the shorter phrase being located). The function tries its matching stages in order. The quote-normalized stage maps curly and straight quote characters to the same form before searching, and the result still carries a mode that records which stage found the substring.[1]
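As a rough illustration of the quote-normalization idea, the sketch below maps curly quotes to straight quotes on both strings before comparing them. The helper name normalizeQuotes and the exact set of characters it handles are assumptions made for this example; the source does not show how findUniqueSubstringMatch implements this stage.

```typescript
// Hypothetical helper (not taken from the source): maps common curly quote
// characters to their straight equivalents.
function normalizeQuotes(text: string): string {
  return text
    .replace(/[\u201C\u201D]/g, '"') // curly double quotes -> straight double quote
    .replace(/[\u2018\u2019]/g, "'"); // curly single quotes -> straight single quote
}

const haystack = '\u201CCompany\u201D means ABC Corp.';
const needle = '"Company" means ABC Corp.';

// A plain substring search fails because the quote characters differ.
const exactHit = haystack.includes(needle);

// Normalizing both sides first lets the same search succeed.
const normalizedHit = normalizeQuotes(haystack).includes(normalizeQuotes(needle));

console.log({ exactHit, normalizedHit });
```

Normalizing both the haystack and the needle, rather than only one side, keeps the comparison symmetric: it does not matter which of the two strings carries the curly quotes.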
Below is a test scenario for findUniqueSubstringMatch: quote-normalized matching treats curly quotes in the haystack as equivalent to straight quotes in the needle.
The scenario
When findUniqueSubstringMatch is called with a haystack containing curly quotes and a needle containing straight quotes,
Then
- the result status is unique;
- the result mode is quote_normalized.
The test fixture
The fixture supplies one paragraph-like haystack and one needle, then checks the returned matching status and mode. The haystack and needle differ only in quote style, so the scenario isolates the quote-normalized matching stage.[2]
Below is the test fixture code.
test.openspec('quote_normalized matches curly quotes against straight quotes')(
  'Scenario: quote_normalized matches curly quotes against straight quotes',
  async ({ when, then, attachPrettyJson }: AllureBddContext) => {
    // The haystack uses curly double quotes (U+201C, U+201D); the needle uses straight quotes.
    const haystack = '\u201CCompany\u201D means ABC Corp.';
    const needle = '"Company" means ABC Corp.';
    let result!: ReturnType<typeof findUniqueSubstringMatch>;

    await when('findUniqueSubstringMatch is called', async () => {
      result = findUniqueSubstringMatch(haystack, needle);
      await attachPrettyJson('Result', result);
    });

    await then('the result SHALL have status unique and mode quote_normalized', () => {
      expect(result.status).toBe('unique');
      // Early return narrows the result type before mode is read.
      if (result.status !== 'unique') return;
      expect(result.mode).toBe('quote_normalized');
    });
  },
);
The expected result shape
The scenario asserts two fields on the returned value, so the expected shape shows those fields rather than the unasserted offsets or matched substring.
Below is the result that findUniqueSubstringMatch is expected to return for this scenario.
{
  status: 'unique',
  mode: 'quote_normalized',
}
Below is a description of the expected fields:
- status is expected to be unique, because the transformed needle appears once in the transformed haystack.
- mode is expected to be quote_normalized, because the match is found after quote characters are normalized.
A non-obvious detail
The quote-normalized stage follows the exact and clean stages, so the returned mode records the first stage that finds one match. In this scenario, the quote style difference prevents an exact-stage match, and quote normalization produces the single match checked by the assertions.
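The stage ordering described above can be sketched as a loop over transforms that stops at the first stage yielding exactly one occurrence. This is a simplified illustration under stated assumptions, not the actual implementation: the names sketchStagedMatch, Stage, and countOccurrences are hypothetical, the not_found status is invented for the example, and the clean stage is omitted because the source does not describe what it does.

```typescript
// Hypothetical types for illustration; only status 'unique' and the mode
// values 'exact' and 'quote_normalized' are suggested by the source text.
type Stage = { mode: string; transform: (s: string) => string };
type SketchResult =
  | { status: 'unique'; mode: string }
  | { status: 'not_found' }; // assumed status name

// Counts occurrences of needle in haystack (overlapping ones included).
function countOccurrences(haystack: string, needle: string): number {
  if (needle.length === 0) return 0;
  let count = 0;
  let index = haystack.indexOf(needle);
  while (index !== -1) {
    count += 1;
    index = haystack.indexOf(needle, index + 1);
  }
  return count;
}

// Tries each stage in order and reports the first one with a single match.
function sketchStagedMatch(haystack: string, needle: string, stages: Stage[]): SketchResult {
  for (const stage of stages) {
    const h = stage.transform(haystack);
    const n = stage.transform(needle);
    if (countOccurrences(h, n) === 1) {
      return { status: 'unique', mode: stage.mode };
    }
  }
  return { status: 'not_found' };
}

// Stage list for the sketch: exact first, then quote-normalized.
const stages: Stage[] = [
  { mode: 'exact', transform: (s) => s },
  {
    mode: 'quote_normalized',
    transform: (s) => s.replace(/[\u201C\u201D]/g, '"').replace(/[\u2018\u2019]/g, "'"),
  },
];

const staged = sketchStagedMatch(
  '\u201CCompany\u201D means ABC Corp.',
  '"Company" means ABC Corp.',
  stages,
);
console.log(staged);
```

Because the stages run in order, a needle that already matches exactly would be reported with mode exact and would never reach the quote-normalized stage. How the real function treats a stage that finds several matches is not shown in the source; this sketch simply moves on to the next stage.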