Digital Transformation with Documents

Charles Drayson
Jul 26 · 3 min read

You store documents on digital media but, ultimately, they remain analogue materials for human consumption. Advances in machine learning and artificial intelligence cannot yet extract full and reliable data from documents. Hell, you can put the same document in front of several humans and still not get consistent interpretations of the content.
What can we do with those pesky documents in a quest to make things better?
Abandon documents that we don’t need
If a document exists to present digital data in human-readable form, keep the data and create a transient document on each occasion when it’s needed. Bank statements are a good example. Who needs years of bank statements if we can reconstruct a statement (probably with more insightful data) on demand from a digital record of transactions? The veracity of a digital record is easier to verify – documents are prone to fabrication. Document assembly tools are ideal for this.
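To make the bank statement idea concrete, here is a minimal sketch in Python. It assumes a very simple transaction record and a plain-text layout; the `Transaction` fields and the `render_statement` function are invented for illustration and do not describe any particular banking system or document assembly tool.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class Transaction:
    """One entry in the digital record (hypothetical schema)."""
    when: date
    description: str
    amount: Decimal  # positive = credit, negative = debit


def render_statement(transactions, start, end, opening_balance=Decimal("0")):
    """Build a plain-text statement for the period start..end on demand."""
    balance = opening_balance
    lines = [f"Statement {start.isoformat()} to {end.isoformat()}"]
    for tx in sorted(transactions, key=lambda t: t.when):
        if start <= tx.when <= end:
            balance += tx.amount
            lines.append(
                f"{tx.when.isoformat()}  {tx.description:<20} "
                f"{tx.amount:>10.2f} {balance:>10.2f}"
            )
    lines.append(f"Closing balance: {balance:.2f}")
    return "\n".join(lines)
```

Nothing is stored except the transactions. The statement exists only for as long as someone needs to read it, and a different period or layout is just another call.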
Store documents with the data that spawned them
If you must keep a document, and if the document was generated automatically from machine-readable data (just about all high volume documents), make sure you store the data (or a link to it) with the document. Systems can then read the data rather than trying to reverse engineer the document. Insurance policies, mortgage documents, employment contracts, purchase orders – all fall into this category.

If a dispute arises, someone can verify the data tallies with the document on an exceptions basis. You also mitigate the risk of a document presenting ambiguous information – you can look back at the source to resolve discrepancies. It’s helpful to have a document management system that allows such files to be collocated.
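As a rough illustration of keeping a document and its source data together, the sketch below bundles the two in one record and checks that they still tally. The purchase order fields, the `render` stand-in and the bundle layout are all assumptions made for the example, not features of a specific document management system.

```python
import json


def render(data):
    """Stand-in for a document assembly step (hypothetical template)."""
    return f"Purchase order {data['po_number']}: {data['quantity']} x {data['item']}"


def package(data):
    """Bundle the rendered document with the data that spawned it."""
    return {"document": render(data), "source_data": dict(data)}


def tallies(bundle):
    """Re-render from the stored data and compare with the stored document."""
    return render(bundle["source_data"]) == bundle["document"]


bundle = package({"po_number": "PO-1", "item": "widget", "quantity": 3})
stored = json.dumps(bundle)  # one record holds both the document and its data
```

A downstream system reads `source_data` directly. The `tallies` check is only needed on the rare occasion that someone questions the document.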
Store documents with metadata
If you cannot link a document to machine-readable data that contains the same information, tag the document with clues about what it contains. Typical metadata could include the date of the document, the name of the author, the type of document, maybe a tag to describe a related project. This might be sufficient for most reporting purposes, and reduce the task of searching for specific documents when the occasion arises. Create metadata at the same time as the document – it’s harder to tag documents retrospectively (although some AI systems are good at that).
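A metadata record does not need to be elaborate. The sketch below, in Python, uses the fields mentioned above (date, author, type, project tag); the class name, field names and the `find` helper are illustrative assumptions rather than any product's schema.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class DocumentMetadata:
    """Tags recorded when the document is created (hypothetical field names)."""
    path: str
    document_date: date
    author: str
    document_type: str
    project_tags: List[str] = field(default_factory=list)


def find(records, document_type=None, project=None):
    """Return the records matching the given type and/or project tag."""
    return [
        r for r in records
        if (document_type is None or r.document_type == document_type)
        and (project is None or project in r.project_tags)
    ]
```

Reporting and search then run against these small records, without anyone having to open the documents themselves.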
Limit the use of documents as the sole repository of information
This requires some discipline among the teams of prospective authors. They need to understand that using a document as the medium to record the product of their work could be the least efficient way to apply their efforts.
For example, a real estate agent could feasibly create a very pleasing document describing a client’s house, with photographs, plans and descriptions. How useful is that document for analysis by anything systematic (or to share with marketing agents)? Instead, have a taxonomy for describing houses (number of rooms, dimensions, plot size, etc.) and for collating plans and photographs. Compile it digitally, and then use document assembly to render descriptive documents (with the added benefit of having documents that are consistent and match the organisation’s branding). Lawyers should create more legal documents this way too.
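Here is one way the house example might look in practice: a small structured record, plus a template that turns it into a description. The `Property` fields and the template wording are invented for illustration, and Python's standard `string.Template` stands in for a real document assembly tool.

```python
from dataclasses import dataclass
from string import Template


@dataclass
class Property:
    """Structured description of a house (hypothetical taxonomy)."""
    address: str
    bedrooms: int
    reception_rooms: int
    plot_size_m2: float
    photo_paths: tuple = ()


LISTING_TEMPLATE = Template(
    "$address\n"
    "$bedrooms bedrooms, $reception_rooms reception rooms\n"
    "Plot size: $plot_size_m2 m2\n"
    "Photographs: $photo_count"
)


def render_listing(prop):
    """Assemble a descriptive document from the structured record."""
    return LISTING_TEMPLATE.substitute(
        address=prop.address,
        bedrooms=prop.bedrooms,
        reception_rooms=prop.reception_rooms,
        plot_size_m2=f"{prop.plot_size_m2:g}",
        photo_count=len(prop.photo_paths),
    )
```

The same record can feed a brochure, a web listing or a report on plot sizes across the whole portfolio. The pleasing document becomes one output among several, not the only place the information lives.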
Impose a screening process before assimilating ad hoc documents into a digital system
If you must have incoming documents that don’t fall into the categories above, you do at least need to safeguard your organisation with a few steps to keep a healthy document store. The exact steps will depend on your industry but think about compliance and safeguarding.
Before you accept a document, you might want to remind users about data protection (you might want to know if documents contain personal data), security classification and password protection (in case documents have to be opened by someone without the password), or simply ask them to check that this is the correct system for storing such documents.
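A screening step can be as simple as a short list of questions answered at upload time. The sketch below is one possible shape in Python; the `Submission` fields and the classification labels are assumptions, and the real checks will depend on your industry.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed classification scheme; substitute your organisation's own labels.
ALLOWED_CLASSIFICATIONS = {"public", "internal", "confidential"}


@dataclass
class Submission:
    """What the uploader declares about an incoming document (hypothetical)."""
    filename: str
    contains_personal_data: Optional[bool]  # None means the question was skipped
    classification: Optional[str]
    password_protected: bool
    belongs_in_this_system: bool


def screen(sub):
    """Return the reasons to hold the document; an empty list means accept."""
    problems = []
    if sub.contains_personal_data is None:
        problems.append("personal data question not answered")
    if sub.classification not in ALLOWED_CLASSIFICATIONS:
        problems.append("missing or unknown security classification")
    if sub.password_protected:
        problems.append("password-protected file cannot be opened by others")
    if not sub.belongs_in_this_system:
        problems.append("uploader indicated another system is more appropriate")
    return problems
```

Returning reasons rather than a bare yes or no gives the user something to act on before the document reaches the store.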
Build a process for the human-in-the-loop
If documents are for human consumption, build an automated business process that has a place for the human-in-the-loop. Don’t try to replicate the nuanced, sensitive, intuitive work that only humans do well. Provide a way for humans to participate and leave their mark. Workflow tools integrated with document automation are what you need.
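As a small sketch of what a place for the human-in-the-loop could mean, the Python below has two automated steps with a review step between them that only a person can complete. The status names, the `Draft` record and the letter wording are made up for the example and are not taken from any workflow product.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Draft:
    text: str
    status: str = "awaiting_review"  # automation stops here until a person acts
    notes: List[str] = field(default_factory=list)


def assemble(data):
    """Automated step: build a draft and park it for a person."""
    return Draft(text=f"Dear {data['name']}, your renewal is due on {data['due']}.")


def review(draft, reviewer, approved, comment=""):
    """Human step: record who decided, what they decided, and why."""
    draft.notes.append(f"{reviewer}: {comment}" if comment else reviewer)
    draft.status = "approved" if approved else "returned"
    return draft


def send(draft):
    """Automated step: refuses to run until a human has approved."""
    if draft.status != "approved":
        raise ValueError("draft has not been approved by a reviewer")
    return f"SENT: {draft.text}"
```

The automation does the assembling and the sending. The judgement in the middle stays with a person, and their note travels with the document.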