When parsing agreement paragraphs that use typed list labels, the leading marker must be separated from the paragraph content before classification can proceed. The label (i.e., the visible numbering or lettering marker at the start of the paragraph) can determine how later style inference treats the paragraph.
extractListLabel returns the extracted label and its label type for text-formatted list labels. Because parenthesized letters can overlap with other parenthesized patterns, the parser classifies the label after checking the higher-priority label forms handled by the same primitive.[1]
Scenario
Below is a test scenario of the baseline successful case of extractListLabel: a parenthesized letter label is classified as a letter label.
The scenario
Given the input text is (a) First item of the agreement,
When extractListLabel is called with that text,
Then
result.label_typeisLabelType.LETTER.result.labelis(a).
Test fixture
The fixture calls extractListLabel with a paragraph-like string and checks the returned label classification.
Below is the test fixture code.
test.openspec('extract parenthesized letter labels')('Scenario: extract parenthesized letter labels', async ({ when, then, attachPrettyJson }: AllureBddContext) => {
const text = '(a) First item of the agreement';
let result!: ReturnType<typeof extractListLabel>;
await when('extractListLabel is called', async () => {
result = extractListLabel(text);
await attachPrettyJson('Result', result);
});
await then('the result SHALL have label_type LETTER', () => {
expect(result.label_type).toBe(LabelType.LETTER);
expect(result.label).toBe('(a)');
});
});
Expected result shape
The scenario asserts the label type and the extracted label returned by extractListLabel.[2]
Below is the result that extractListLabel is expected to return for this scenario.
expect(result.label_type).toBe(LabelType.LETTER);
expect(result.label).toBe('(a)');
Below is a description of the expected fields:
label_typeis expected to beLabelType.LETTER, because the leading marker is a parenthesized alphabetic label and does not match the higher-priority section, article, numbered-heading, number, or multi-character roman forms.labelis expected to be(a), because the matched parenthesized letter is returned as the visible label without the following paragraph content.