storageRobot

DocuWare storageRobot is a specialized tool for automatic document import in complex scenarios, such as integration with external ERP and inventory management systems.

More information about the use of DocuWare storageRobot

Additionally, storageRobot can use the AI-powered service DocuWare Intelligent Document Processing (IDP) with the “document separation” or “splitting” method. Batch scans are automatically split into separate documents, even without separator sheets or barcode labels. The individual PDFs can then be further processed in company-wide workflows.

More information about the setup of DocuWare IDP with storageRobot

storageRobot generally performs document import automatically. Import configurations are either scheduled or activate automatically when documents or metadata are newly saved or changed in the file system on a server.

The import configurations can be set up in the storageRobot Administration Tool, which can be downloaded and locally installed together with other program components.

Start download of storageRobot Administration Tool

Setting up Import Configurations

The storageRobot Administration Tool offers a central main window as well as a wizard.

In the main window, you create new automatic import configurations or edit existing ones. You can also run a configuration manually here, for example, to test the settings of an initial setup. In addition, you enter the license key, retrieve updates, and access log files in the main window.
The wizard guides you step by step through the creation of a new configuration. Depending on the type of configuration, different functions are available. For example, when storing documents in the file system, different settings are required than when storing in a file cabinet.
Note: The settings are only saved in the last step of the wizard with the Finish button.

Key Functions in the storageRobot Wizard at a Glance

These functions are available in every wizard of storageRobot:

Storage location
Sample document
IDP
Replace values of placeholders
Placeholder check

Source Document Storage Location

Specify the source for the documents to be imported. The following locations are available: file system, DocuWare document tray, FTP, and meta files.
In addition to the source for documents to be imported, you can specify a folder into which a document is copied if an error occurs during processing.

File System

With this option, files from a local folder or a UNC path are processed.

Include/Exclude Files: If you only want to process certain file types or subfolders from the source folder, you can individually include or exclude file types and subfolders under the following points.
Including or excluding subfolders only works if the AllDirectories search option is selected.
- Include file pattern: Here you can individually include file types or subfolders in the processing. If this field is empty, storageRobot will process all documents by default.
- Include only defined file types: For example, if only PDF documents are to be archived, you must enter *.pdf in this field. storageRobot will then process only files with the extension ".pdf", whatever their name (*).
- This function also works with multiple file types. The individual filters must be separated by a comma. Example: *.pdf,*.jpg
- Include all file types in defined subfolders: If the source folder contains subfolders such as "Incoming invoices," "Outgoing invoices," "Orders," and "Delivery notes" and only the "Incoming invoices" folder is to be processed, you must enter *\Incoming invoices\*.* in this field. storageRobot will then only process the documents in the "Incoming invoices" subfolder.
- Multiple subfolders can also be included, separated by commas. Example: *\Incoming invoices\*.*, *\Outgoing invoices\*.*
- Include only defined file types in defined subfolders: The above described functions can also be combined, so that, for example, only all PDF files from the subfolders "Incoming invoices" and "Outgoing invoices" are processed. The correct syntax would then be *\Incoming invoices\*.pdf, *\Outgoing invoices\*.pdf.
- Exclude file pattern: Here you can individually exclude file types or subfolders from processing. If this field is empty, storageRobot will process all documents by default. It works identically to "Include file pattern".
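The include and exclude rules above can be illustrated with a small Python sketch. It is not storageRobot's implementation: it assumes that patterns are comma-separated wildcards, that matching is case-insensitive, and that an exclude match takes precedence over an include match. The function names split_patterns and is_selected are hypothetical.

```python
from fnmatch import fnmatchcase


def split_patterns(field: str) -> list[str]:
    """Split a comma-separated pattern field and drop empty entries."""
    return [p.strip().lower() for p in field.split(",") if p.strip()]


def is_selected(path: str, include: str = "", exclude: str = "") -> bool:
    """Return True if the path passes the include and exclude patterns.

    Assumptions (not confirmed by the storageRobot documentation):
    matching is case-insensitive, and exclude wins over include.
    """
    candidate = path.lower()
    includes = split_patterns(include)
    excludes = split_patterns(exclude)
    # An empty include field means "process all documents".
    included = not includes or any(fnmatchcase(candidate, p) for p in includes)
    excluded = any(fnmatchcase(candidate, p) for p in excludes)
    return included and not excluded


if __name__ == "__main__":
    pattern = r"*\Incoming invoices\*.pdf, *\Outgoing invoices\*.pdf"
    print(is_selected(r"C:\Import\Incoming invoices\a.pdf", include=pattern))
    print(is_selected(r"C:\Import\Orders\a.pdf", include=pattern))
```

Running a planned pattern against a handful of real paths in this way can help catch typos before the pattern is entered in the wizard. The actual matching behavior of storageRobot may differ in details.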
Minimum age: It can be useful to check the age of a file, for example, if you want a file to be processed only after it has been fully placed in the folder.
For this purpose, enter the minimum age in seconds. storageRobot checks during each run whether the file meets this rule. If so, the file will be processed; if not, it will remain in the folder.
Zip files: The Read ZIP files option instructs storageRobot to search only for ZIP files in the document location. These files are extracted and treated as folders. Include/exclude patterns and other options are applied to the contents of the ZIP files.
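The minimum-age rule described above can be sketched in Python as follows. The sketch assumes that the age of a file is measured from its last modification time; the documentation does not state which timestamp storageRobot actually uses. The function name is_old_enough is hypothetical.

```python
import os
import time
from typing import Optional


def is_old_enough(path: str, minimum_age_seconds: float,
                  now: Optional[float] = None) -> bool:
    """Return True if the file was last modified at least
    minimum_age_seconds ago (assumption: age = time since last change)."""
    current = time.time() if now is None else now
    age = current - os.path.getmtime(path)
    return age >= minimum_age_seconds
```

A file that fails the check simply stays in the folder and is evaluated again on the next run, which is why a minimum age of a few seconds is often enough to skip files that are still being copied.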

DocuWare Document Tray

Files from a DocuWare document tray are processed.

FTP

Files from an FTP or SFTP server are processed.

Meta files

Meta files is a processing mode designed to update index data of already stored documents.

Sample Document

The sample document selected and set up here is used as a reference throughout the subsequent configuration steps when you make settings.

IDP

Detailed information about configuring IDP in storageRobot can be found below in the chapter DocuWare IDP with storageRobot.

Replace values of placeholders

Here you can specify, for example, that the value “RE” of a barcode is entered as “Invoice” in an Index Field (optional).

Placeholder Check

A classic example of a placeholder check is using a value from a barcode for a database query. The document should only be stored once the query has provided the required information. If the SQL placeholder is not filled, the document can be moved to the error folder.
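The logic of a placeholder check can be sketched in Python as follows. This is an illustration, not storageRobot's implementation: the placeholder name {SQL_Supplier} and the functions missing_placeholders and route_document are hypothetical, and the sketch assumes that an empty or whitespace-only value counts as "not filled".

```python
import shutil
from pathlib import Path
from typing import Optional


def missing_placeholders(values: dict[str, Optional[str]],
                         required: list[str]) -> list[str]:
    """Return the required placeholders that have no usable value."""
    return [name for name in required if not (values.get(name) or "").strip()]


def route_document(document: Path, values: dict[str, Optional[str]],
                   required: list[str], error_folder: Path) -> bool:
    """Return True if the document may be stored; otherwise move it
    to the error folder and return False."""
    if missing_placeholders(values, required):
        error_folder.mkdir(parents=True, exist_ok=True)
        shutil.move(str(document), str(error_folder / document.name))
        return False
    return True
```

For example, if the database query for a barcode value returns nothing, {SQL_Supplier} stays empty and the document ends up in the error folder instead of the file cabinet.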

Date Format

If the documents to be imported contain a date that does not conform to the standard DD.MM.YYYY, storageRobot can convert this value into a format valid for DocuWare.

Use the plus symbol to add the field that should be read from the import document and assigned to the date-time field type.
All reading areas available in the storageRobot configuration can be used, including FileProperty, OCR, Barcode, and SQL.

Format: Specify the current format for the value so that storageRobot can recognize and convert it during import.

Common format strings are listed in the following table. More information and a complete overview of the formats are available from Microsoft.

Format	Description
yyyy	Year (4 digits, e.g. 2024)
yy	Year (2 digits, e.g. 24)
MM	Month (with leading zero, e.g. 01 - 12)
M	Month (without leading zero, e.g. 1 - 12)
dd	Day (with leading zero, e.g. 01 - 31)
d	Day (without leading zero, e.g. 1 - 31)
HH	Hour (with leading zero, e.g. 00-23)
mm	Minute (with leading zero, e.g. 00-59)
ss	Seconds (with leading zero, e.g. 00-59)

Note
Format strings are case-sensitive: upper and lower case change the meaning. For example, MM stands for the month, while mm stands for the minute.
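To check a format string before entering it in the wizard, the conversion can be simulated in Python. The following sketch covers only the tokens from the table above and translates them into Python's strptime directives. It assumes that the target is the DD.MM.YYYY format mentioned earlier. The names TOKEN_MAP, to_strptime, and convert_date are hypothetical and not part of storageRobot.

```python
import re
from datetime import datetime

# Tokens from the table above mapped to Python strptime directives.
TOKEN_MAP = {
    "yyyy": "%Y", "yy": "%y",
    "MM": "%m", "M": "%m",
    "dd": "%d", "d": "%d",
    "HH": "%H", "mm": "%M", "ss": "%S",
}
# Longer tokens are listed first so that "yyyy" is not read as "yy" + "yy".
TOKEN_RE = re.compile(r"yyyy|yy|MM|M|dd|d|HH|mm|ss")


def to_strptime(source_format: str) -> str:
    """Translate a format string such as 'yyyy-MM-dd' to '%Y-%m-%d'."""
    return TOKEN_RE.sub(lambda m: TOKEN_MAP[m.group(0)], source_format)


def convert_date(value: str, source_format: str) -> str:
    """Parse value using source_format and return it as DD.MM.YYYY."""
    parsed = datetime.strptime(value, to_strptime(source_format))
    return parsed.strftime("%d.%m.%Y")


if __name__ == "__main__":
    print(convert_date("2024-03-07", "yyyy-MM-dd"))
    print(convert_date("3/7/24", "M/d/yy"))
```

If the format string does not match the value, for example because mm was written instead of MM, strptime raises a ValueError. This makes case mistakes visible early.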

DocuWare IDP with storageRobot

The AI behind DocuWare Intelligent Document Processing (IDP) is trained to classify various document types such as invoices and delivery notes and extract relevant data from them. These classifications and extraction models can be used in storageRobot for automatic indexing during import. storageRobot can also split scan batches with DocuWare IDP and automatically index them even without barcodes.

More information about DocuWare IDP

General information on DocuWare IDP on the DocuWare website
Configuration DocuWare IDP with Connect to Mail
Connect DocuWare Workflow Manager with DocuWare IDP

Set up DocuWare IDP for storageRobot

To set up DocuWare IDP for use in storageRobot, use the DocuWare IDP wizard in the Administration Tool. The other functions of storageRobot remain available. The only exception: storageRobot's own document separation functions cannot be used when DocuWare IDP is used with splitting.

*In the main window of the DocuWare IDP wizard you connect storageRobot with DocuWare IDP*

Make the following settings:

API URL: Enter the URL for DocuWare IDP.
If you are in the EU, just click EU Standard.
If you are in the USA, click US Standard.
API Key: The key for connecting via API can be found in your account at natif.ai. If you have questions, contact your DocuWare contact.
Refresh Workflows: After connecting storageRobot with DocuWare IDP via the API, use the Refresh Workflows button to load the configurations available in DocuWare IDP, such as classifications and extraction models. These are then available further down in this dialog under Select Workflows.
A “workflow” here refers to a configuration in DocuWare IDP based on function types such as classification, extraction, or splitting. A DocuWare workflow in the usual sense is not meant here.
Select Workflow: Select a configuration or “workflow” from DocuWare IDP that you want to use for import with storageRobot. storageRobot supports the following configurations from DocuWare IDP:
1. Splitting: DocuWare IDP automatically splits document stacks into single documents.
2. Extraction: Data is automatically extracted from almost all document types. This includes invoices, contracts, HR documents, and emails. For invoices, line items can also be recognized and extracted.
3. Classification: Documents are classified automatically, for example by document type.
Additional features such as cropping and OCR are also supported by storageRobot.
Again, note: “Workflow” here refers to a configuration in DocuWare IDP and not a DocuWare workflow.
Maximum Wait Time: Maximum Wait Time specifies how long the system should wait for a response from DocuWare IDP when processing a document. This time is specified in seconds.
1. What does this mean for you? If you set a time, the system will wait as long as you have specified for a response. If document processing is completed within this time, no further action is required.
2. What happens in case of a delay? If processing takes longer than the set Maximum Wait Time, the system pauses and renames the original document (extension: .dwidp.original).
  Additionally, a new file (extension: .dwidp) is created. The system continues to process the documents at a later time without you having to intervene manually.
3. Important with 0 seconds: If you set 0 seconds as Maximum Wait Time, delayed processing is always triggered. The system immediately marks the documents for later processing.
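The behavior of Maximum Wait Time described above can be sketched in Python as follows. This is an illustration under several assumptions, not storageRobot's implementation: the poll_result callback stands in for the request to DocuWare IDP, the extensions are appended to the full file name, and the content of the .dwidp file is a placeholder. The function name process_with_max_wait is hypothetical.

```python
import time
from pathlib import Path
from typing import Callable, Optional


def process_with_max_wait(document: Path,
                          poll_result: Callable[[], Optional[dict]],
                          max_wait_seconds: float,
                          poll_interval: float = 0.5) -> Optional[dict]:
    """Wait up to max_wait_seconds for a result. On timeout, mark the
    document for later processing and return None."""
    deadline = time.monotonic() + max_wait_seconds
    while time.monotonic() < deadline:
        result = poll_result()
        if result is not None:
            return result  # finished within the wait time
        time.sleep(poll_interval)

    # Timeout (always reached when max_wait_seconds is 0): defer processing.
    original = document.with_name(document.name + ".dwidp.original")
    marker = document.with_name(document.name + ".dwidp")
    document.rename(original)
    marker.write_text("pending")  # assumed content; the real format is unknown
    return None
```

With a wait time of 0 the loop body never runs, so every document is deferred immediately. This matches the special case described in item 3.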

Additional Notes on Setting up the Import Configuration for DocuWare IDP

Depending on the configuration type you selected in DocuWare IDP, the displayed elements in the IDP wizard of storageRobot will change:

Splitting: If you use a configuration of the type Splitting, you can enter a confidence value.
- Minimum confidence is a value between 1 and 100. This indicates how certain DocuWare IDP is that a recognized separation point is correct. The higher you set this value, the less likely it is that a document will be separated incorrectly, but the rejection rate may be noticeably higher.
- If the confidence of a separation point is below the set value, the entire document is considered an error and moved to the low confidence path. If you do not enter anything here, storageRobot will use the previously configured error directory.
Other DocuWare IDP techniques: If you use the Analyze Sample Document option, the previously selected sample document will be sent to the API and analyzed. The results are then available for later use in the wizard.
The IDP processing of the sample document may incur costs. For more information, refer to the DocuWare price list or contact your DocuWare representative if in doubt.
In some cases, the IDP creates a new PDF file from your document. These PDFs can be a cropped version of the original file or contain additional text if OCR was performed. Enable "Use IDP-generated PDF file" to use this PDF instead of the original document.
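The minimum-confidence rule for splitting described above can be sketched in Python as follows. The sketch assumes that DocuWare IDP reports one confidence value per separation point on the same 1 to 100 scale. The function name reject_target and its parameters are hypothetical.

```python
from pathlib import Path
from typing import Optional


def reject_target(confidences: list[float],
                  minimum_confidence: float,
                  low_confidence_path: Optional[Path],
                  error_directory: Path) -> Optional[Path]:
    """Return None if every separation point reaches the minimum
    confidence; otherwise return the folder that receives the
    entire document."""
    if all(c >= minimum_confidence for c in confidences):
        return None  # split is accepted
    # Fall back to the error directory if no low confidence path is set.
    return low_confidence_path or error_directory
```

A single weak separation point is enough to reject the whole batch. This explains why a high minimum confidence reduces incorrect splits but raises the rejection rate.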

Assign Extracted Information to DocuWare Index Fields

An IDP import configuration reads different information from imported documents depending on the type and automatically assigns this extracted information to certain placeholders of storageRobot.

In storageRobot, these placeholders are then linked to the DocuWare index fields, so that finally the extracted metadata is stored as index data for the imported document in the DocuWare file cabinet.

Some placeholders of storageRobot are the same for all IDP configuration types, for example:

{IDP_WorkflowId} Workflow ID
{IDP_ProcessingId} Processing number
{IDP_ClassifiedAs} Classification result

The placeholders that were created based on an extraction are generated dynamically depending on the selected IDP workflow and the fields and annotations created there, for example:

{IDP_DocumentType} Document type
{IDP_SenderCity} Sender's city
The following screenshot shows an example for the storageRobot placeholders of an extraction configuration:
storageRobot placeholders for an IDP extraction
For comparison: These placeholders are based on the following metadata in DocuWare IDP:
{IDP_SenderCity} = Sender > City
{IDP_ReceiverCity} = Receiver > City
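The relationship between IDP metadata and storageRobot placeholders can be sketched in Python as follows. The naming rule is inferred only from the two examples above and may not hold for every field. The index field names SENDER_CITY and RECEIVER_CITY and the function placeholder_for are hypothetical.

```python
def placeholder_for(metadata_path: str) -> str:
    """Build a placeholder name from an IDP metadata path such as
    'Sender > City' (rule inferred from the examples, not documented)."""
    parts = [part.strip() for part in metadata_path.split(">")]
    return "{IDP_" + "".join(parts) + "}"


# Hypothetical assignment of placeholders to DocuWare index fields.
INDEX_FIELD_MAPPING = {
    placeholder_for("Sender > City"): "SENDER_CITY",
    placeholder_for("Receiver > City"): "RECEIVER_CITY",
}

if __name__ == "__main__":
    for placeholder, index_field in INDEX_FIELD_MAPPING.items():
        print(placeholder, "->", index_field)
```

Because the extraction placeholders are generated dynamically, it is safer to check the names shown in the wizard than to rely on a derived name.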