White Paper on-premises: databases, storage locations and fulltext index

DocuWare requires several databases and at least one file storage (file cabinet). Installing the fulltext functionality is optional.

Databases

For its operation, DocuWare requires several relational databases. These databases store the structured index data of the documents, support searching for documents, and hold the full-text index. In addition, DocuWare stores all essential system information (such as Authentication Server data) and workflow information in databases.

Note: The database user that DocuWare uses to access its databases must have owner-level permissions for all databases. Permissions such as reader, writer or administrator are not sufficient.

Supported database systems

MS SQL Server and MySQL Server can be coupled with a DocuWare system. The administrator has the option of specifying a particular database to be used for each file cabinet. In addition, a cluster system can be connected. Databases may reside on autonomous servers outside the DocuWare server area. DocuWare can work with several database connections to different servers and different databases simultaneously. Several simultaneous connections can be established to one database.

To ensure optimum performance and maintainability, DocuWare recommends using the Microsoft SQL Server database system for archives with more than 1 million documents (without full-text functionality) or more than 200,000 document pages (with full-text functionality). Contact DocuWare Professional Services for support in the migration of databases.
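The sizing recommendation above can be sketched as a simple check. This is only an illustration: the function `recommend_mssql` and its parameters are hypothetical and not part of any DocuWare API. Only the two thresholds come from the text above.

```python
# Thresholds taken from the recommendation above.
MAX_DOCS_WITHOUT_FULLTEXT = 1_000_000
MAX_PAGES_WITH_FULLTEXT = 200_000


def recommend_mssql(documents: int, pages: int, fulltext: bool) -> bool:
    """Return True if Microsoft SQL Server is recommended for the archive.

    Hypothetical helper for illustration; not a DocuWare function.
    Assumption: an archive above the document threshold is treated as
    large whether or not full-text functionality is used.
    """
    if documents > MAX_DOCS_WITHOUT_FULLTEXT:
        return True
    # The page threshold only applies when full-text functionality is used.
    return fulltext and pages > MAX_PAGES_WITH_FULLTEXT
```

For example, `recommend_mssql(50_000, 250_000, fulltext=True)` evaluates to `True`, because the archive exceeds the page threshold for systems with full-text functionality.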

Internal database server

In the event that no external database server is provided or can be set up, DocuWare also offers an integrated database server as part of the standard feature set (Internal Database). This MySQL server can be optionally installed with the Server Setup.

With MS SQL Server, the archive name can be up to 128 characters long; with MySQL, up to 64 characters.
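The length limits above can be checked before an archive is created. The following is a minimal sketch: the helper `archive_name_fits` and the keys `"mssql"` and `"mysql"` are hypothetical names chosen for this example, and it assumes that the limit counts characters as Python's `len` does.

```python
# Maximum archive name length per database system, as stated above.
MAX_ARCHIVE_NAME_LENGTH = {"mssql": 128, "mysql": 64}


def archive_name_fits(name: str, db_system: str) -> bool:
    """Return True if `name` is within the limit for `db_system`.

    Hypothetical helper for illustration; `db_system` must be
    "mssql" or "mysql" (case-insensitive), otherwise KeyError is raised.
    """
    return len(name) <= MAX_ARCHIVE_NAME_LENGTH[db_system.lower()]
```

A name of 100 characters would pass for `"mssql"` and fail for `"mysql"`.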

Structure of the databases

A DocuWare system contains the following databases:

  • System database (DWSYSTEM)

    All data on rights, licenses, and settings are stored in this database. Auditing data at system and organization level can also be found here.

  • Database for document data (DWDATA)

    This database contains all internal system information for searching and finding documents. You can create several such databases.

  • Notification database (DWNOTIFICATION)

    This database contains all the events that the Background Process Service needs to run workflows and email notifications.

  • Workflow Engine database (DWWORKFLOWENGINE)

    This database contains all information required by the Background Process Service for creating, editing, and executing workflow configurations.
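When verifying an installation, the database names listed above can be compared with the databases that actually exist on the server. The sketch below is an illustration only: `missing_databases` is a hypothetical helper, the list of existing names has to be obtained separately from the database server, and the case-insensitive comparison is an assumption.

```python
# Database names as listed above.
REQUIRED_DATABASES = ("DWSYSTEM", "DWDATA", "DWNOTIFICATION", "DWWORKFLOWENGINE")


def missing_databases(existing: list[str]) -> list[str]:
    """Return the required database names not found in `existing`.

    Hypothetical helper for illustration. Names are compared
    case-insensitively (an assumption, not documented behavior).
    """
    present = {name.upper() for name in existing}
    return [name for name in REQUIRED_DATABASES if name not in present]
```

An empty result means that all four databases were found. Additional document databases under other names are ignored by this check.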

Storage locations

DocuWare supports a broad spectrum of storage media for storing documents. This includes local hard disks, (virtual) network storage media, and external storage systems. Which media actually come into use depends on the volume of the documents to be stored and requirements concerning access and safeguarding. As long as conventions for Windows file systems are complied with, the technological basis of these systems is irrelevant. You can also use storage procedures such as RAID systems (RAID = Redundant Array of Independent Disks) or NetApp storage solutions, provided that these can be incorporated into the Windows file system as a virtual system drive.

DocuWare also supports special storage systems. DocuWare delivers software that incorporates these storage systems into a file cabinet as DocuWare file deposits, in the same way as Windows file deposits. You can set options to determine whether files are written directly to the target medium, which ensures maximum security with WORM media, for example, or whether they are written via an intermediate virtual disk.

Hard disks, RAID

In addition to using individual hard disks, you have the option of combining several hard disks in a "Disk Array." These arrays are the ideal solution for an archiving system where magnetic storage technology does not present a problem. A RAID increases protection against data loss in the event of hard disk failure thanks to redundancy. Depending on the RAID level, you can also swap a hard disk during operation.

Directories and drives can be used as document storage. It is irrelevant whether the directories and drives are simple hard disks, virtual disks, RAID networks (hardware or software RAID, storage spaces) or network drives.

For production systems, it is recommended to store the data on redundant storage systems. The use of simple, non-redundant storage systems is not recommended.

If DocuWare is installed distributed over several servers, use network storage with SMBv3 as the protocol. SMBv1 should not be used for security reasons.

For installations with a high volume and many users, the database files should be stored on redundant flash memory. The same applies to the full text index files. The storage locations for the documents can be distributed on classic disks even in large installations.

Platform Service and Background Process Service must have read and write access to all storage locations and databases used by DocuWare:

  • All access to the storage takes place under the Windows account that was entered in the Server Setup for the service user. In addition, this user must have full access to the storage to support the full functionality of the product.

    The app pools of the Frontend Services (like Platform) access the storage for interactive requests, for example for storing a new document or repeating Intelligent Indexing interactively.

    The Windows service of the Backend Services (like Background Process Service) accesses the storage for queued background tasks, like extracting document text and sending documents to Intelligent Indexing in the normal case.

  • It does not matter which DocuWare user is served by the services. Access is always done in the context of the service user, both in the Frontend (app pool) as well as in Backend Services (Windows service).
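Because all access happens in the context of the service user, one way to verify a storage location is to run a small probe under that Windows account. The following is a minimal sketch, not a DocuWare tool: `can_read_and_write` is a hypothetical helper that only tests the account running the script, so it must be started as the service user to be meaningful.

```python
import tempfile


def can_read_and_write(directory: str) -> bool:
    """Return True if the current account can create, write, read back
    and delete a file in `directory`.

    Hypothetical helper for illustration; it checks only the account
    that runs this script.
    """
    try:
        # The temporary file is deleted automatically when it is closed.
        with tempfile.NamedTemporaryFile(dir=directory) as probe:
            probe.write(b"probe")
            probe.flush()
            probe.seek(0)
            return probe.read() == b"probe"
    except OSError:
        # Covers a missing directory and denied permissions.
        return False
```

The check can be repeated for every storage location, including network shares, on each server that hosts Frontend or Backend Services.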

NetApp storage

The NetApp storage solutions are based on NetApp's own operating system and can be integrated in various storage area networks similarly to hard disks (NAS, SAN, iSCSI). They are especially intended to manage large volumes of data and for the long-term archiving of WORM documents. NetApp storage can be used with DocuWare for storing documents. Files in NetApp storage cannot be edited and are assigned the "Read Only" attribute. Even though disks on NetApp storage solutions can be set to different types in the DocuWare Administration, we recommend selecting the type "WORM" because it is best suited to the NetApp behavior.

Fulltext index

During a fulltext search, the Fulltext Server lists the occurrences as well as the context strings for the individual search terms in a fulltext index. At the same time, the estimated relevance of a term is evaluated. The result list of a fulltext search is sorted according to this relevance. The optional Fulltext Server is based on the Solr 9 platform. For more information, see the Fulltext Functionality section in the chapter on file cabinet structure.