Extracting Text from an Analysis Result
Overview
An embedding application sometimes needs to obtain content from the generated result. The following describes a technique for doing so using a series of text extraction commands.
General Approach
The helper’s getTextContent(options) call extracts tab-delimited text from the generated result. Its options contain a reference and a list of extract commands.
The asynchronous reply is sent to a function which is set in the embedOptions as
handleGetTextContentReply: aFunction(status)
The general structure of the callback is shown below.
IFrameEmbeddedDualExample.prototype.handleGetTextContentReply = function (status)
{
// The status may be {error} or {results[], reference}
…
}
The status reference is set by the original helper request. Use it to distinguish this reply from replies to other requests made by your application. There is one result for each text extraction. You can use the extracted text items to perform another application activity or even to format your own Generative AI completion request.
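As a minimal sketch of such a callback, assuming the reply has the shape noted in the comment above ({error} or {results[], reference}) and that each result is a string: the names makeReplyHandler, onText, and collected are illustrative and not part of the helper.

```javascript
// Hypothetical helper (not part of the product API): builds a reply handler
// that only accepts replies carrying the expected reference.
// Assumed status shape: {error} or {results[], reference}.
function makeReplyHandler(expectedReference, onText) {
  return function handleGetTextContentReply(status) {
    if (status.error) {
      console.error('getTextContent failed:', status.error);
      return false;
    }
    if (status.reference !== expectedReference) {
      // The reply belongs to a different request; ignore it.
      return false;
    }
    // Pass each extracted text entry to the caller, in order.
    status.results.forEach(function (text, index) {
      onText(text, index);
    });
    return true;
  };
}

// Usage: register the handler in the embed options (other options omitted).
var collected = [];
var embedOptions = {
  handleGetTextContentReply: makeReplyHandler('sample1', function (text) {
    collected.push(text);
  })
};
```

Returning a boolean is only a convenience of this sketch; the source does not say that the helper inspects the callback's return value.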
Extract Commands
The various extract commands are listed below.
Command | Parameters | Description |
---|---|---|
extractToEmptyLine | (none) | Gather lines up to an empty line. Add the text as a result entry. |
extractToLineWith | text: string, advance: boolean, includeLastLine: boolean | Gather lines until the string is found on a line. If advance is false, remain on that line. If includeLastLine is false, exclude the matching line. Add the text as a result entry. |
skipLines | count: howMany | Skip howMany lines. |
skipToLineWith | text: string, advance: boolean | Skip lines until the string is found. If advance is false, remain on that line. |
skipToEndOfPage | (none) | Skip all lines until a form feed is reached. Advance past the form feed line. * |
extractToEndOfPage | (none) | Extract lines until a form feed line is reached. Advance past the form feed line. * |
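To make the cursor behavior of advance and includeLastLine easier to trace, the following is a small local model of four of the commands. It is not the helper's implementation: it reads the table above literally, and it assumes that lines are separated by newlines, that an omitted boolean option behaves as false, and that extractToEmptyLine leaves the cursor on the empty line. The page commands are not modeled, and runExtractCommands is a hypothetical name.

```javascript
// Illustrative model only, NOT the helper's implementation.
function runExtractCommands(text, commands) {
  var lines = text.split('\n');
  var pos = 0;       // index of the current line
  var results = [];  // one entry per extract command

  // Index of the first line at or after pos that satisfies test,
  // or lines.length when no such line exists.
  function findFrom(test) {
    var i = pos;
    while (i < lines.length && !test(lines[i])) { i++; }
    return i;
  }
  function contains(needle) {
    return function (line) { return line.indexOf(needle) !== -1; };
  }

  commands.forEach(function (cmd) {
    var hit, end;
    switch (cmd.command) {
      case 'skipLines':
        pos = Math.min(pos + cmd.count, lines.length);
        break;
      case 'skipToLineWith':
        hit = findFrom(contains(cmd.text));
        pos = (cmd.advance && hit < lines.length) ? hit + 1 : hit;
        break;
      case 'extractToLineWith':
        hit = findFrom(contains(cmd.text));
        end = (cmd.includeLastLine && hit < lines.length) ? hit + 1 : hit;
        results.push(lines.slice(pos, end).join('\n'));
        pos = (cmd.advance && hit < lines.length) ? hit + 1 : hit;
        break;
      case 'extractToEmptyLine':
        hit = findFrom(function (line) { return line.trim() === ''; });
        results.push(lines.slice(pos, hit).join('\n'));
        pos = hit; // assumption: remain on the empty line
        break;
      default:
        throw new Error('Command not modeled: ' + cmd.command);
    }
  });
  return results;
}

// Usage with made-up input lines.
var sample = ['Header', 'Stocks', 'Apple\t400', 'Bonds', 'Muni\t0.075', '', 'Tail'].join('\n');
var out = runExtractCommands(sample, [
  {command: 'skipToLineWith', text: 'Stocks', advance: false},
  {command: 'extractToLineWith', text: 'Bonds', advance: false, includeLastLine: false},
  {command: 'extractToEmptyLine'}
]);
// out[0] is 'Stocks\nApple\t400' and out[1] is 'Bonds\nMuni\t0.075'
```

Because advance is false in the second command, the cursor stays on the Bonds line, so the next extraction starts with that line.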
Example Extraction
Sample Result
A sample template result is shown below.
The extraction process iterates through the tab-delimited text, applying a series of commands in order. Below is an example list of commands.
var extractCommands = [ ];
var cmd;
cmd = {command: 'skipToLineWith', text: 'Stocks', advance: false};
extractCommands.push(cmd);
cmd = {command: 'extractToLineWith', text: 'Bonds', advance: false, includeLastLine: false};
extractCommands.push(cmd);
cmd = {command: 'extractToLineWith', text: 'Total portfolio', advance: true, includeLastLine: false};
extractCommands.push(cmd);
cmd = {command: 'skipToLineWith', text: 'Summary of Managed Assets'};
extractCommands.push(cmd);
cmd = {command: 'skipLines', count: 2};
extractCommands.push(cmd);
cmd = {command: 'skipEmptyLines'};
extractCommands.push(cmd);
cmd = {command: 'extractToEmptyLine'};
extractCommands.push(cmd);
var options = { extractCommands };
options.reference = 'sample1';
options.exportOptions = {includeFormFeedPageBreak: false};
this.helper.getTextContent(options);
Extracted Text
The result of the extract commands on the sample template result is a list of strings:
Extracted text |
---|
Stocks<br>Alphabet 200 $905.00 $1,236.00 UT $247,200.00 0.47 46.5%<br>Amazon 100 $1,020.00 $1,864.25 UT $186,425.00 0.35 35.1%<br>Apple 400 $175.00 $203.75 UT $81,500.00 0.15 15.3%<br>DuPont 300 $48.25 $39.50 DT $11,850.00 0.02 2.2%<br>PG&E 200 $45.00 $21.50 DT $4,300.00 0.01 0.8%<br>___________________________________________________ |
Bonds<br>2030 New York State Thruway Authority 0.045 $40,000.00 0.62<br>2025 California Municipal 0.075 $25,000.00 0.38<br>___________________________________________________<br>$65,000.00 |
Cash $20,000.00 0.03<br>Stocks $531,275.00 0.86<br>Bonds $65,000.00 0.11<br>Total $616,275.00 |
* Note that form feed separators are included in the exported text only when the exportOptions field includeFormFeedPageBreak is set to true.
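Each extracted string can then be broken into rows and cells for further processing. The sketch below assumes that lines within a result are separated by newlines and cells by tabs; toRows and toAmount are hypothetical helper names, and the input string is a shortened, hand-written stand-in for a real result.

```javascript
// Hypothetical helper: split one extracted result into rows of cells.
// Assumes newline-separated lines and tab-separated cells.
function toRows(resultText) {
  return resultText
    .split(/\r?\n/)
    .filter(function (line) { return line.trim() !== ''; })
    .map(function (line) { return line.split('\t'); });
}

// Hypothetical helper: convert a currency cell such as '$531,275.00' to a number.
function toAmount(cell) {
  return parseFloat(cell.replace(/[$,]/g, ''));
}

var rows = toRows('Cash\t$20,000.00\t0.03\nStocks\t$531,275.00\t0.86');
var stocksAmount = toAmount(rows[1][1]);
// rows[1][0] is 'Stocks' and stocksAmount is 531275
```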