Banking M&A – Document Discovery & Allocation Approach

A major financial services institution divesting different business lines needed to identify, tag and separate a large volume of documents. Using its rapid OCR and indexing features, Frisk created an index of the data which, in conjunction with reference lists, identified and classified all available documents. Once classified, documents were appropriately tagged and separated.
  • Significant time savings over manual review processes.
  • Increased confidence in the outcome compared to a sampling approach.
  • Where necessary, outputs could be quickly validated using the Frisk Flexible UI and reporting capabilities.


A large bank was divesting one of its subsidiary businesses. The sale resulted in two separate buyers for different parts of the business, with another segment being retained by the bank. A major challenge was that over 100 million documents needed to be classified so they could be provided to the purchaser of each relevant part of the business. A fixed transaction period made it imperative that the process happened both quickly and accurately.


Frisk developed a process that would enable documents to be classified to facilitate extraction and transfer to the purchaser of each business line:

  1. Configure and Index: Documents that are not text searchable are put through the Frisk OCR process. Given the large volume, the Frisk multi-threading configuration makes optimal use of available computing capacity to reduce the OCR processing time required. At the same time, indexing of every word in every document commences.
  2. Content Analysis: A set of documents with known heritage is identified. These are then analysed using the Term Frequency Report to look for words that may be unique to each segment, which are then used as reference points for identification. Review of the documents also identifies unique phrases and contextual reference points to enable search-driven classification.
  3. Cross Reference Data Sources: A variety of data sources are identified as additional points of reference. These can be any unique element that links documents back to the target groups, such as product names, staff lists and business unit names. All these reference points are used to run bulk searches across the documents to identify those that match the relevant criteria.
  4. Document Tagging: As each document is identified as belonging to a particular segment, it is tagged in the index, making the classification a searchable field. For example, a document is flagged as being Insurance related and ‘Insurance’ can then become a filter in the search User Interface (UI).
  5. Refinement Analysis: Using the configurable UI, searches can then be run and the results used to further refine the search criteria. This iterative process enables a very efficient allocation of documents to their correct classification.
  6. Reporting: Once the documents have been tagged and allocated to each segment, a report is run which lists each document, its classification and its location on the network. This can inform a document extraction process which physically moves each document to its required location.
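
The content analysis, cross-referencing and tagging steps above can be illustrated with a minimal Python sketch. It is not Frisk's implementation and does not use any Frisk API: the function names, segment names and sample documents are all hypothetical, and "unique to each segment" is approximated as "appears in one segment's known documents and in no other segment's".

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def distinctive_terms(known_docs):
    """For each segment, return the terms found only in that segment's known documents."""
    counts = {
        segment: Counter(token for doc in docs for token in tokenize(doc))
        for segment, docs in known_docs.items()
    }
    result = {}
    for segment, counter in counts.items():
        # Collect every term seen in any other segment's documents.
        seen_elsewhere = set()
        for other, other_counter in counts.items():
            if other != segment:
                seen_elsewhere.update(other_counter)
        result[segment] = {term for term in counter if term not in seen_elsewhere}
    return result


def tag_documents(documents, reference_terms):
    """Tag each document with every segment whose reference terms it contains."""
    tags = {}
    for doc_id, text in documents.items():
        tokens = set(tokenize(text))
        tags[doc_id] = sorted(
            segment for segment, terms in reference_terms.items() if tokens & terms
        )
    return tags


# Hypothetical documents of known heritage, grouped by segment.
known_docs = {
    "Insurance": ["Customer policy premium renewal", "Claim lodged against customer policy"],
    "Lending": ["Customer mortgage loan repayment schedule", "Loan interest rate variation"],
}

# Hypothetical unclassified documents.
documents = {
    "doc-001": "Renewal notice: your premium is due",
    "doc-002": "Your loan repayment is overdue",
    "doc-003": "Staff picnic on Friday",
}

terms = distinctive_terms(known_docs)
tags = tag_documents(documents, terms)
```

In this sketch "customer" appears in both segments and is therefore not used as a reference term, while a document matching no reference term (here `doc-003`) stays untagged and would be a candidate for the refinement step.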

Stage: Configure & Index
  Process:
    1. Configure appliances to maximise streaming options
    2. OCR and index to enable search across document content and metadata
  Frisk Capability:
    • OCR and index on the fly
    • Streaming – take advantage of available computer resources

Stage: Content Analysis
  Process:
    1. Technical consulting
    2. Query sample set of documents with known “heritage” to identify unique content to classify
  Frisk Capability:
    • Term Frequency Report
    • Smart Search
    • Export to Report

Stage: Cross Reference Data Sources
  Process:
    1. Technical consulting
    2. Build bulk queries from reference sources, i.e.
      • Staff names
      • Product names
      • Business units
      • Unique language
  Frisk Capability:
    • Bulk Query tool
    • Report outcomes for further analysis

Stage: Document Tagging
  Process:
    1. Technical consulting
    2. As documents are classified, either:
      • Augment the document metadata held in the index
      • Or write to a reference list
  Frisk Capability:
    • Index new classifications and make searchable

Stage: Refinement Analysis
  Process:
    1. Technical consulting
    2. Enable filtering to include/exclude documents from query results
    3. Augment the document metadata held in the index
    4. Iteratively define ambiguous documents
  Frisk Capability:
    • Configurable UI to streamline analysis process

Stage: Reporting
  Process:
    1. Technical consulting
    2. Produce a report that lists all query results (documents) in each relevant classification or grouping
  Frisk Capability:
    • Export to Report
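
As a rough illustration of the reporting stage, the Python sketch below writes one row per document with its classification and network location. It is an assumption-laden example, not Frisk's Export to Report feature: the column names, file names and paths are hypothetical.

```python
import csv
import io


def build_report(tagged_docs):
    """Return CSV text with one row per document, grouped by classification."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["document", "classification", "location"])
    # Sort so that documents for the same classification are listed together.
    ordered = sorted(tagged_docs, key=lambda d: (d["classification"], d["document"]))
    for doc in ordered:
        writer.writerow([doc["document"], doc["classification"], doc["location"]])
    return buffer.getvalue()


# Hypothetical tagged documents; names and locations are illustrative only.
tagged_docs = [
    {"document": "loan_variation.pdf", "classification": "Lending", "location": "/shares/scans/batch2"},
    {"document": "renewal_notice.pdf", "classification": "Insurance", "location": "/shares/scans/batch1"},
    {"document": "board_minutes.pdf", "classification": "Retained", "location": "/shares/corporate"},
]

report = build_report(tagged_docs)
report_lines = report.splitlines()
```

A downstream extraction process could read such a file and move each document to the destination assigned to its classification.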

Contact Us


180 Flinders St ADELAIDE SA 5000
PO Box 879, UNLEY BC SA 5061

Phone: 1300 43 33 11
