A major financial services institution divesting different business lines needed to identify, tag and separate a large volume of documents.
Using Rapid OCR and indexing features, Frisk created an index of the data which, in conjunction with reference lists, identified and classified all available documents. Once classified, the documents were tagged and separated accordingly.
A large bank was divesting one of its subsidiary businesses. The sale resulted in two separate buyers for different parts of the business, with another segment being retained by the bank. A major challenge involved over 100 million documents, which needed to be classified so they could be provided to the purchaser of each relevant part of the business. A fixed transaction period made it imperative that the process be both quick and accurate.
Frisk developed a process that would enable documents to be classified to facilitate extraction and transfer to the purchaser of each business line:
- Configure and Index: Documents that are not text searchable are put through the Frisk OCR process. Given the large volume, the Frisk multi-threading configuration makes optimal use of the available computing capacity to reduce the OCR processing time required. At the same time, indexing of every word in every document begins.
- Content Analysis: A set of documents with known heritage is identified. These are then analysed using the Term Frequency Report to look for words that may be unique to each segment, which can then be used as reference points for identification. Review of the documents also identifies unique phrases and contextual reference points to enable a search-driven classification.
- Cross Reference Data Sources: A variety of data sources are identified as additional points of reference. These can be any unique element that links documents back to the target groups, such as product names, staff lists or business unit names. All these reference points are used to run bulk searches across the documents to identify those that match the relevant criteria.
- Document Tagging: As documents are identified as belonging to a particular segment, each is tagged in the index so that the segment becomes a searchable field. For example, a document is flagged as being Insurance related and ‘Insurance’ can then become a filter in the search User Interface (UI).
- Refinement Analysis: Using the configurable UI, searches can then be run and the results used to further refine the search criteria. This iterative process enables a very efficient allocation of documents to their correct classification.
- Reporting: Once the documents have been tagged and allocated to each segment, a report is run which lists each document, its classification and its location on the network. This can inform a document extraction process that physically moves each document to its required location.
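The steps above can be illustrated with a small Python sketch. It is not Frisk's implementation and does not use any Frisk API: the function names, segment names, file paths and sample texts are all hypothetical, and a simple word count stands in for the Term Frequency Report. It derives distinctive terms from documents of known heritage, adds a cross-referenced product name, tags each document with a segment, and produces a report of document location and classification.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def distinctive_terms(known_docs, min_count=2):
    """Return, per segment, the words seen at least min_count times in that
    segment's known documents and never in any other segment's documents."""
    counts = {
        seg: Counter(t for doc in docs for t in tokenize(doc))
        for seg, docs in known_docs.items()
    }
    result = {}
    for seg, seg_counts in counts.items():
        other_words = set()
        for other_seg, other_counts in counts.items():
            if other_seg != seg:
                other_words.update(other_counts)
        result[seg] = {
            t for t, n in seg_counts.items()
            if n >= min_count and t not in other_words
        }
    return result


def tag_documents(documents, reference_terms):
    """Tag each document with the segment whose reference terms it matches
    most often; no match or a tie is left as 'Unclassified' for review."""
    tags = {}
    for path, text in documents.items():
        words = set(tokenize(text))
        scores = {seg: len(words & terms) for seg, terms in reference_terms.items()}
        best = max(scores, key=scores.get)
        top = scores[best]
        if top == 0 or list(scores.values()).count(top) > 1:
            tags[path] = "Unclassified"
        else:
            tags[path] = best
    return tags


def build_report(tags):
    """List (location, classification) rows sorted by location."""
    return sorted(tags.items())


# Hypothetical documents of known heritage (Content Analysis step).
known_docs = {
    "Insurance": ["Policy premium renewal for policy holder", "Premium claim on policy"],
    "Lending": ["Mortgage loan repayment schedule", "Loan approval and mortgage terms"],
}
terms = distinctive_terms(known_docs)

# Hypothetical product name from a reference list (Cross Reference step).
terms["Insurance"].add("homeguard")

# Hypothetical documents to classify (Document Tagging and Reporting steps).
documents = {
    "/share/a.txt": "Your policy premium is due",
    "/share/b.txt": "Mortgage loan statement",
    "/share/c.txt": "Staff picnic on Friday",
}
tags = tag_documents(documents, terms)
for location, segment in build_report(tags):
    print(location, segment)
```

Documents left as 'Unclassified' are the natural input to the refinement step: the search criteria are adjusted and the tagging is run again until the remaining set is small enough to review by hand.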