DocuWare stores all documents in file cabinets where they can be saved long-term. Each file cabinet is assigned to a DocuWare organization. Users access documents in the DocuWare interface using a search query in the respective file cabinet or in multiple file cabinets.
Every organization has at least one file cabinet for storing documents. Under file cabinet settings, you can determine:
General file cabinet characteristics, e.g. the name
The database to be used for the documents' index information and any additional database-related settings
The storage location to be used for the documents and (if applicable) their subdivision into logical disks with associated capacity limits
File cabinet fields / index fields
Access rights and file cabinet profiles for the archive or for individual fields
The user dialogs for file storage, searches, results list, and folder structures
Additional functionalities, e.g. the availability of a fulltext index and the type and extent of the stamps that are available for document processing
The disk concept
The documents of a file cabinet are stored on "DocuWare disks." DocuWare disks are generally directories in the file cabinet identified by a name that DocuWare has assigned them. The subdivision of the file cabinet into logical disks is a means of organizing the storage media.
You can transfer these logical disks to another medium at any time you choose, for example when they reach a certain size. Document management with DocuWare has the advantage that documents can be swapped out either by pre-defined rules or automatically. DocuWare offers features for conveniently automating the corresponding steps.
The concept of logical disks and the open file structure give the administrator a high degree of transparency and flexibility when managing the DocuWare system.
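The subdivision into logical disks with capacity limits can be pictured with a minimal sketch. The class names, the `disk_00001` naming pattern, and the fill-in-order policy are all assumptions made for illustration; the text does not describe how DocuWare names or fills its disks.

```python
from dataclasses import dataclass, field


@dataclass
class LogicalDisk:
    """Hypothetical model of one logical disk: a named directory with a capacity limit."""
    name: str
    capacity_bytes: int
    used_bytes: int = 0

    def has_room_for(self, size: int) -> bool:
        return self.used_bytes + size <= self.capacity_bytes


@dataclass
class FileCabinetStorage:
    """Hypothetical file cabinet storage that fills its logical disks in order."""
    disk_capacity_bytes: int
    disks: list[LogicalDisk] = field(default_factory=list)

    def store(self, size: int) -> str:
        """Record a document of `size` bytes and return the name of the disk used."""
        if size > self.disk_capacity_bytes:
            raise ValueError("document is larger than a single logical disk")
        if not self.disks or not self.disks[-1].has_room_for(size):
            # The current disk is full (or none exists yet): open the next one.
            next_name = f"disk_{len(self.disks) + 1:05d}"  # assumed naming pattern
            self.disks.append(LogicalDisk(next_name, self.disk_capacity_bytes))
        current = self.disks[-1]
        current.used_bytes += size
        return current.name


storage = FileCabinetStorage(disk_capacity_bytes=100)
placements = [storage.store(size) for size in (60, 30, 20, 100)]
print(placements)
```

Because each disk is a self-contained unit in this model, a full disk could be moved to another medium without touching the disk that is currently being filled.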
Document structure
A document in DocuWare can consist of one or more files. A document can also combine files in different formats (e.g. PDF/A, PDF, MS Excel). This happens, for instance, when DocuWare accepts an email with several attachments as one associated document.
Each file, in turn, comprises one or more pages:
The structure of a document that contains two files: one with three and one with two pages
Example 1 (see graphic above):
A 3-page paper document that was scanned into DocuWare consists of a 3-page PDF/A file.
Example 2 (see graphic above):
For a document, a 3-page PDF/A file generated by DocuWare and a 2-page Word file are clipped together in the document tray.
Example 3:
For one document, a PDF/A file generated by DocuWare, a 3-page Word file and a 2-page PDF file are clipped together in the tray. The document then consists of three files:
1. File of the document: PDF/A file with page 1
2. File of the document: Word file with pages 1, 2, and 3
3. File of the document: PDF file with pages 1 and 2
Annotations can be made on every page of a file within a document, on multiple annotation levels if required. Annotations are stored with their characteristics and additional attributes and are reproduced at runtime by the DocuWare Viewer.
Each document in DocuWare can have a maximum of 999 document files.
Documents scanned and printed with DocuWare applications are stored in the DocuWare file cabinets as PDF/A files. All other documents that are read into DocuWare, such as PDF and MS Office files, are stored in their original formats.
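The document, file, and page hierarchy described above can be written down as a small data model. This is only an illustrative sketch: the class and method names are invented, and only the limit of 999 files per document comes from the text.

```python
from dataclasses import dataclass, field

MAX_FILES_PER_DOCUMENT = 999  # limit stated in the text


@dataclass
class DocumentFile:
    file_format: str  # e.g. "PDF/A", "Word", "PDF"
    page_count: int


@dataclass
class Document:
    files: list[DocumentFile] = field(default_factory=list)

    def clip(self, new_file: DocumentFile) -> None:
        """Append a file to the document, as when clipping in the document tray."""
        if len(self.files) >= MAX_FILES_PER_DOCUMENT:
            raise ValueError("a document can hold at most 999 files")
        if new_file.page_count < 1:
            raise ValueError("each file comprises one or more pages")
        self.files.append(new_file)

    def pages(self) -> list[tuple[int, int]]:
        """Return (file number, page number) pairs, both 1-based."""
        return [
            (file_no, page_no)
            for file_no, f in enumerate(self.files, start=1)
            for page_no in range(1, f.page_count + 1)
        ]


# Example 3 from the text: PDF/A (1 page), Word (3 pages), PDF (2 pages).
example_3 = Document()
for fmt, count in (("PDF/A", 1), ("Word", 3), ("PDF", 2)):
    example_3.clip(DocumentFile(fmt, count))
print(example_3.pages())
```

Page numbering restarts in every file, which matches the listing in Example 3.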
Metadata of the documents
The metadata contains information about the document, such as stamps, index data, and annotations. The metadata is automatically stored in the file cabinet database. Copies of this data can optionally be saved in a ZIP-based file format (extension .DWX) in the file cabinet storage location.
To do this, the option Index data backup in the storage location for the file cabinet must be enabled in the DocuWare configuration under File Cabinets.
The DWX files are updated asynchronously, not as part of the document change. After an upgrade to DocuWare version 7, this redundant storage option is set to ON for on-premises systems. This also applies to new file cabinets.
This means that the documents with their index data are still completely available even if the database fails completely without a backup. However, the restoration can be very time-consuming and is therefore no substitute for a conventional database backup.
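The interplay of database, asynchronous backup copy, and restoration can be sketched as follows. This is a toy model under loud assumptions: the in-memory `database`, the queue, the function names, and the `metadata.json` entry are invented for illustration. The real internal layout of DWX files is not described in the text, and the actual restore tool is a separate command line program.

```python
import json
import queue
import tempfile
import zipfile
from pathlib import Path

database = {}                 # stand-in for the file cabinet database
backup_queue = queue.Queue()  # documents whose backup copy is out of date


def change_document(doc_id: str, index_data: dict) -> None:
    """Commit the change to the database; the backup copy is only scheduled."""
    database[doc_id] = dict(index_data)
    backup_queue.put(doc_id)


def run_backup_worker(storage_dir: Path) -> int:
    """Write one ZIP-based copy per queued document; return how many were written."""
    written = 0
    while not backup_queue.empty():
        doc_id = backup_queue.get()
        target = storage_dir / f"{doc_id}.dwx"
        with zipfile.ZipFile(target, "w") as archive:
            # "metadata.json" is an invented entry name, not the real DWX layout.
            archive.writestr("metadata.json", json.dumps(database[doc_id]))
        written += 1
    return written


def restore_from_backups(storage_dir: Path) -> dict:
    """Rebuild the database entries from the backup copies alone."""
    restored = {}
    for path in sorted(storage_dir.glob("*.dwx")):
        with zipfile.ZipFile(path) as archive:
            restored[path.stem] = json.loads(archive.read("metadata.json"))
    return restored


with tempfile.TemporaryDirectory() as tmp:
    location = Path(tmp)
    change_document("doc-1", {"Company": "Example Ltd"})
    files_before_worker = list(location.glob("*.dwx"))  # nothing written yet
    written = run_backup_worker(location)
    restored = restore_from_backups(location)
```

The sketch also shows why restoration is slow compared with a database backup: every document's copy has to be opened and read individually.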
A command line tool is available for restoring database entries from the file cabinet location in a DocuWare on-premises system.
When the user deletes a document from the file cabinet, it is moved to the recycle bin and all of its metadata from the database is saved in a TBDWX file. This file is removed when the document is restored or deleted from the recycle bin.
Each document file has a unique name (GUID). When a document file is updated, the existing file in the storage location is not overwritten. Instead, a new file with a new unique name (GUID) is created. Once the new file exists, the old one is deleted.
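The update sequence (create a new file under a new GUID, then delete the old one) can be illustrated with a short sketch. The helper names and the `.bin` extension are assumptions; only the order of the two steps is taken from the text.

```python
import tempfile
import uuid
from pathlib import Path


def write_new_file(directory: Path, content: bytes) -> Path:
    """Store content under a freshly generated GUID-based name."""
    path = directory / f"{uuid.uuid4()}.bin"  # ".bin" is a placeholder extension
    path.write_bytes(content)
    return path


def update_file(old_path: Path, new_content: bytes) -> Path:
    """Replace a document file without ever overwriting it in place."""
    new_path = write_new_file(old_path.parent, new_content)  # 1. create the new file
    old_path.unlink()                                        # 2. then delete the old one
    return new_path


with tempfile.TemporaryDirectory() as tmp:
    original = write_new_file(Path(tmp), b"version 1")
    updated = update_file(original, b"version 2")
    names_differ = original.name != updated.name
    old_exists = original.exists()
    new_content = updated.read_bytes()
```

With this ordering, a failure while writing the new file leaves the old file intact.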
File structure change in version 6.12 and version 7
As with all contents of this white paper, this chapter refers exclusively to DocuWare. With DocuWare version 7, the file structure of the documents has changed. If you were already using DocuWare version 6.12 or earlier and are now working with version 7 or higher, your documents are stored in different structures.
There is no other version between 6.12 and 7.
DocuWare version 7 and later: The metadata is automatically stored in the file cabinet database. Copies of this data can optionally be saved in a ZIP-based file format (extension .DWX) in the file cabinet storage location. (Not using this option may result in a performance gain.) The DWX files are updated asynchronously and not as part of the document change.
Up to DocuWare version 6.12: The metadata was always automatically stored in a header file per document in the file cabinet storage location.
Documents stored with DocuWare version 6.x or earlier are transferred to the new storage structure only when they are edited or their index entries are changed. Until then, their storage structure remains unchanged and their header files continue to be used.
If such a document is changed in DocuWare version 7 or higher (e.g. if an annotation is added in the viewer), the metadata of the document is copied to the database and the header file is deleted. If the option to save DWX files is activated, a DWX file is saved. The new file naming convention (GUID) is used.
| | DocuWare Version 7 and later | Up to DocuWare Version 6.12 |
| --- | --- | --- |
| Document file name | GUID | Continuous number (DocID) |
| Index data and annotations | DWX file (optional) | Header file (XML) |
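The migrate-on-change behavior for documents stored with version 6.x or earlier can be summarized in a sketch. All names here are hypothetical, the DocID-style file names are invented examples, and the DWX copy is written synchronously only to keep the example short (the text states that DWX files are updated asynchronously).

```python
import uuid
from dataclasses import dataclass
from typing import Optional

database: dict[int, dict] = {}  # stand-in for the file cabinet database


@dataclass
class StoredDocument:
    doc_id: int
    file_name: str                      # DocID-based (old) or GUID-based (new)
    header_file: Optional[dict] = None  # metadata in the old structure only
    dwx_copy: Optional[dict] = None     # optional backup copy, new structure only


def change_document(doc: StoredDocument, changes: dict, save_dwx: bool) -> None:
    """Apply a change; migrate the document first if it still has a header file."""
    if doc.header_file is not None:
        database[doc.doc_id] = dict(doc.header_file)  # copy metadata to the database
        doc.header_file = None                        # delete the header file
        doc.file_name = str(uuid.uuid4())             # switch to GUID naming
    database.setdefault(doc.doc_id, {}).update(changes)
    if save_dwx:
        doc.dwx_copy = dict(database[doc.doc_id])


legacy = StoredDocument(42, "00000042", header_file={"Company": "Example Ltd"})
untouched = StoredDocument(43, "00000043", header_file={"Company": "Other Ltd"})

# Only the document that is actually changed gets migrated.
change_document(legacy, {"Status": "Approved"}, save_dwx=True)
```

After the call, `legacy` lives entirely in the new structure, while `untouched` keeps its header file and its old file name.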
Fulltext functionality
DocuWare provides its own fulltext functionality, which allows you to run an effective search in the fulltext of documents and their index entries. Its use is optional. The fulltext functionality operates as follows:
The Background Process Service extracts text shots from the document and stores them in the data store. The text shots record the search terms of each document page together with their positions. This allows search hits to be highlighted in the document.
At the same time, the Background Process Service transfers the text shots to the Fulltext Server. The Fulltext Server stores the text shots again in catalog files (index files) and uses them to answer search requests. The catalog files are created per DocuWare file cabinet. By default, they are stored on the computer where the Fulltext Server is installed.
If an error occurs during indexing for the fulltext search, for example if a server is not accessible, the indexing of these documents is automatically repeated at a later time.
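The flow described above (text with positions per page, one catalog per file cabinet, automatic retry after errors) can be modeled in a few lines. This is a conceptual sketch, not the Fulltext Server's actual data structures: the class, the exception, and the whitespace tokenization are assumptions.

```python
from collections import defaultdict, deque


class FulltextServerDown(Exception):
    """Raised by the hypothetical server stub when it cannot be reached."""


class FulltextServer:
    def __init__(self) -> None:
        self.available = True
        # catalogs[cabinet][term] -> list of (doc_id, page number, word position)
        self.catalogs = defaultdict(lambda: defaultdict(list))

    def index(self, cabinet: str, doc_id: int, text_shot: list[str]) -> None:
        """Add one document's text shot (one string per page) to the cabinet catalog."""
        if not self.available:
            raise FulltextServerDown(cabinet)
        for page_no, page_text in enumerate(text_shot, start=1):
            for position, word in enumerate(page_text.lower().split()):
                self.catalogs[cabinet][word].append((doc_id, page_no, position))

    def search(self, cabinet: str, term: str) -> list[tuple[int, int, int]]:
        return list(self.catalogs[cabinet][term.lower()])


def process_pending(server: FulltextServer, pending: deque) -> int:
    """Try each pending job once; failed jobs stay queued for a later run."""
    for _ in range(len(pending)):
        job = pending.popleft()
        try:
            server.index(*job)
        except FulltextServerDown:
            pending.append(job)
    return len(pending)


server = FulltextServer()
pending = deque([("Invoices", 1, ["Invoice for paper", "Total due"])])

server.available = False
left_after_outage = process_pending(server, pending)  # job is kept for later
server.available = True
left_after_retry = process_pending(server, pending)   # job succeeds now
hits = server.search("Invoices", "total")
print(hits)
```

Storing the page number and word position with each term is what makes it possible to highlight a hit at the right place in the document.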
Document tray in a file cabinet
The essential application scenario for document trays is the viewing of new documents and their processing before archiving. For this reason, new documents are often first imported into a document tray. This is also where the documents are evaluated using Intelligent Indexing. In addition, a document tray can be used for copies of documents that have already been archived.
Document trays are technically structured like archives, but the data is stored in a simplified format. Unlike archives, document trays have neither a structured search nor a fulltext search, nor do they have a rights concept. They are usually configured so that only one user has access to them.
In principle, it is also possible to give multiple users access to a document tray. However, it is important to note that no logging takes place in the document tray and no more precise assignment of rights is possible. Anyone who has access to a document tray may perform any action there. An individual action cannot be assigned to a specific user in retrospect.
File cabinet synchronization
Two file cabinets can be synchronized with each other (documents and database). This synchronization is managed in DocuWare Configuration.
Documents are matched using GUIDs. One of the file cabinets to be synchronized must be located in the local system; the other one can be located in the same or in another DocuWare system.
The file cabinets are compared using a single text field column, so the comparison takes little time. The synchronization is executed by the Background Process Service over cloud-compatible HTTPS.
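Matching by GUID and comparing one text value per document can be sketched as a set comparison. The function name, the result keys, and the idea of a per-document "stamp" string are assumptions for illustration; the text does not say what the compared column contains or how differences are resolved.

```python
def plan_synchronization(local: dict[str, str], remote: dict[str, str]) -> dict[str, list[str]]:
    """Compare two file cabinets by GUID.

    Each mapping is GUID -> text value of the compared column (hypothetical stamp).
    """
    local_ids, remote_ids = set(local), set(remote)
    return {
        "copy_to_remote": sorted(local_ids - remote_ids),
        "copy_to_local": sorted(remote_ids - local_ids),
        "differs": sorted(g for g in local_ids & remote_ids if local[g] != remote[g]),
    }


# Shortened placeholder GUIDs for readability.
local_cabinet = {"guid-a": "v1", "guid-b": "v2", "guid-c": "v1"}
remote_cabinet = {"guid-b": "v1", "guid-c": "v1", "guid-d": "v1"}

plan = plan_synchronization(local_cabinet, remote_cabinet)
print(plan)
```

Only one short string per document is compared, which is consistent with the statement that the comparison itself takes little time; the documents are transferred only for the GUIDs that appear in the plan.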