When locating a defined phrase in document text, the phrase as stored and the phrase supplied by a caller can differ in spacing. The matched substring (i.e., the contiguous range of source characters being matched) still needs to map back to its original source range so that downstream operations do not target the wrong text.
findUniqueSubstringMatch handles that case by applying ordered matching modes to a haystack (the source text being searched) and a needle (the phrase being searched for). Because the flexible_whitespace mode collapses whitespace while keeping source spans mapped to the original haystack, the returned match can report the mode that found the unique substring and still preserve original offsets.[1]
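The ordered-mode idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual implementation: the names `Span`, `MatchMode`, `SketchResult`, and `findUniqueSketch` are hypothetical; only two modes are shown, since the full list is not given here; the flexible mode uses a regular expression as a stand-in for whitespace-tolerant matching; and stopping with an `ambiguous` status when a mode finds several hits is an assumption.

```typescript
// Hypothetical names throughout; this is not the library's real API.
type Span = { start: number; end: number }; // end is exclusive

type MatchMode = {
  name: string;
  // Returns every span of the haystack that this mode treats as a match.
  findAll: (haystack: string, needle: string) => Span[];
};

type SketchResult =
  | { status: 'unique'; mode: string; span: Span }
  | { status: 'ambiguous'; mode: string; count: number }
  | { status: 'not_found' };

// Exact mode: plain substring search.
const exactMode: MatchMode = {
  name: 'exact',
  findAll: (haystack, needle) => {
    const spans: Span[] = [];
    if (needle.length === 0) return spans;
    let from = haystack.indexOf(needle);
    while (from !== -1) {
      spans.push({ start: from, end: from + needle.length });
      from = haystack.indexOf(needle, from + 1);
    }
    return spans;
  },
};

const escapeRegExp = (s: string): string => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

// Flexible mode (regex stand-in): any whitespace run in the haystack may
// stand where the needle has whitespace. Offsets come from the haystack itself.
const flexibleWhitespaceMode: MatchMode = {
  name: 'flexible_whitespace',
  findAll: (haystack, needle) => {
    const tokens = needle.trim().split(/\s+/).filter((t) => t.length > 0);
    if (tokens.length === 0) return [];
    const pattern = new RegExp(tokens.map(escapeRegExp).join('\\s+'), 'g');
    const spans: Span[] = [];
    let m: RegExpExecArray | null;
    while ((m = pattern.exec(haystack)) !== null) {
      spans.push({ start: m.index, end: m.index + m[0].length });
    }
    return spans;
  },
};

// Tries each mode in order and reports the first mode that finds anything.
// Assumption: several hits in one mode end the search as 'ambiguous'.
function findUniqueSketch(haystack: string, needle: string, modes: MatchMode[]): SketchResult {
  for (const mode of modes) {
    const spans = mode.findAll(haystack, needle);
    const first = spans[0];
    if (spans.length === 1 && first !== undefined) {
      return { status: 'unique', mode: mode.name, span: first };
    }
    if (spans.length > 1) {
      return { status: 'ambiguous', mode: mode.name, count: spans.length };
    }
  }
  return { status: 'not_found' };
}
```

Called with `[exactMode, flexibleWhitespaceMode]`, a haystack with extra spaces, and a single-spaced needle, the exact mode finds nothing and the flexible mode reports the match, with a span expressed in haystack offsets.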
Below is a test scenario of the baseline successful case of findUniqueSubstringMatch: flexible whitespace matches across spacing variance.
The scenario
Given a haystack with extra spacing and a needle with single spaces,
When findUniqueSubstringMatch is called,
Then
- the result status is unique.
- the result mode is flexible_whitespace.
The test fixture
The fixture sets up the source phrase and search phrase, calls the matching primitive, and asserts the reported status and mode.[2]
Below is the test fixture code.
test.openspec('flexible_whitespace matches across spacing variance')('Scenario: flexible_whitespace matches across spacing variance', async ({ when, then, attachPrettyJson }: AllureBddContext) => {
  // The haystack has extra spaces between words; the needle is single-spaced.
  const haystack = 'The   Purchase   Price';
  const needle = 'The Purchase Price';
  let result!: ReturnType<typeof findUniqueSubstringMatch>;

  await when('findUniqueSubstringMatch is called', async () => {
    result = findUniqueSubstringMatch(haystack, needle);
    await attachPrettyJson('Result', result);
  });

  await then('the result SHALL have status unique and mode flexible_whitespace', () => {
    expect(result.status).toBe('unique');
    // Guard so that mode is only read on a unique result.
    if (result.status !== 'unique') return;
    expect(result.mode).toBe('flexible_whitespace');
  });
});
The expected result shape
The scenario asserts predicates over the returned match result rather than equality over the full object. The expected result is therefore shown as the literal assertions the scenario performs.
Below is the result that findUniqueSubstringMatch is expected to return for this scenario.
expect(result.status).toBe('unique');
if (result.status !== 'unique') return;
expect(result.mode).toBe('flexible_whitespace');
Below is a description of the expected fields:
- status is expected to be unique, because the collapsed-spacing comparison finds one matching substring.
- mode is expected to be flexible_whitespace, because the exact spacing differs and the successful stage is the whitespace-collapsing stage.
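The guard on status before the mode assertion suggests that the result is a union type in which mode exists only on the unique variant. The sketch below illustrates that narrowing pattern. The type `MatchResultSketch`, its `not_found` variant, and the function `describeMatch` are hypothetical; only the `status` and `mode` fields appear in the scenario.

```typescript
// Hypothetical result shape: only `status` and `mode` come from the scenario;
// the 'not_found' variant is an assumption made for illustration.
type MatchResultSketch =
  | { status: 'unique'; mode: string }
  | { status: 'not_found' };

function describeMatch(result: MatchResultSketch): string {
  if (result.status !== 'unique') {
    // On this branch the compiler rejects any access to result.mode.
    return 'no unique match';
  }
  // Narrowed to the 'unique' variant, so mode is available.
  return `unique via ${result.mode}`;
}
```

Under this assumed shape, an early return on a non-unique status is what lets the later mode assertion type-check.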
A non-obvious detail
The flexible_whitespace mode does not return normalized text as the matched substring. The implementation collapses whitespace for comparison, but it maps each normalized character back to source spans so the unique result can still identify the original haystack range.
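The mapping from normalized characters back to source spans can be sketched as follows. This is one possible approach under stated assumptions, not the actual implementation: `collapseWithSpans` and `findFlexibleSpans` are hypothetical names, and trimming the needle before comparison is an assumption.

```typescript
type Normalized = {
  text: string;     // haystack with each whitespace run collapsed to one space
  starts: number[]; // starts[i]: source offset where normalized char i begins
  ends: number[];   // ends[i]: source offset just past normalized char i
};

// Collapses whitespace runs and records the source range of every
// normalized character.
function collapseWithSpans(source: string): Normalized {
  let text = '';
  const starts: number[] = [];
  const ends: number[] = [];
  let i = 0;
  while (i < source.length) {
    const start = i;
    if (/\s/.test(source.charAt(i))) {
      // Consume the whole whitespace run and emit a single space for it.
      while (i < source.length && /\s/.test(source.charAt(i))) i++;
      text += ' ';
    } else {
      text += source.charAt(i);
      i++;
    }
    starts.push(start);
    ends.push(i);
  }
  return { text, starts, ends };
}

// Compares in normalized space, then reports spans in source offsets.
function findFlexibleSpans(haystack: string, needle: string): Array<{ start: number; end: number }> {
  const hay = collapseWithSpans(haystack);
  const ndl = collapseWithSpans(needle.trim()).text;
  const spans: Array<{ start: number; end: number }> = [];
  if (ndl.length === 0) return spans;
  let at = hay.text.indexOf(ndl);
  while (at !== -1) {
    const start = hay.starts[at];
    const end = hay.ends[at + ndl.length - 1];
    if (start !== undefined && end !== undefined) spans.push({ start, end });
    at = hay.text.indexOf(ndl, at + 1);
  }
  return spans;
}
```

Slicing the haystack with a returned span yields the original text, extra spaces included, rather than the collapsed form used for comparison.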