
Welcome to Banking Quest

Business Continuity Management

March 20, 2024, 8:33 a.m.

Mr. Anand Shrimali, Ex Deputy General Manager (IT) Bank of India

What is Disaster

  • ‘Disaster’ is defined as a crisis situation causing widespread damage.

 Why to Worry

  • Threat to Life
  • Threat to Business Continuity

 What to Do

  • Plan for Emergency Response
  • Plan for Business Continuity

 What may go wrong?

Types of Failures & Security Breaches 

  • System Failure – HW / SW / Network
  • Operations failure
  • Data Damage
  • Data Loss
  • Data Leakages
  • Data Theft - Insider / Internal Threats
  • Data Theft – Breaches / External Threats

 Murphy's law: "Anything that can go wrong will go wrong."

 Threats to Business Continuity

These can happen due to 

  • Hardware / Software failures
  • Power failures
  • Network Link failures (LAN / WAN)
  • Building fires
  • Electrical Short-circuits
  • Floods
  • Earthquakes
  • Epidemic, Pandemic disease, Outbreak
  • Transport shut-downs, disruptions
  • Terrorist attacks, Bomb Blasts
  • Civil Unrest, War

 

Business Continuity Planning & Process ...

  • Lack of Business Continuity may result in:
  1. Loss of Reputation/Image/Public confidence;
  2. Loss of Customers to Competitors;
  3. Loss of Revenue;
  4. Loss of Opportunity;
  5. Loss of Productivity;
  6. Other Material Consequences

 (Loss Event Types - BDSF-EDPM & DPA)

(Business Disruption due to System Failure, Execution Delivery & Process Management, Damage to Physical Assets)

 Major Types of Disruptions & Mitigation

A disruption can be caused by a multitude of threats, for example:

 Natural Hazards –

✓Geophysical - Earthquakes, Volcano, Tsunami

✓Hydrological – Flood and landslides

✓Meteorological (Short lived) – Cyclone, Storm Surge, Lightning, Heavy Rains, Dust Storm, Snow, Winter Storm, Extreme Temperature, Fog, Heat-wave, etc.

 Environmental Occurrences e.g. fire, explosion, nuclear disaster, pollution, radiation, chemical, biological incidents

Utility failures including electrical power, water supply, air conditioning

IT- Hardware and software failures including loss of historical/current data

People including human error, disclosure, accidental damage, staff shortage, loss of key by staff, labour unrest, mass absence due to a pandemic, bio-chemical disaster

Criminal actions including fraud, sabotage, hacking, theft, terrorist actions, hostage taking

Civil / Political / Legal actions including acts of war, riots, demonstrations, labour strikes, legal or regulatory action

Third party failures such as disruption of delivery from suppliers/service providers/ outsourced services

 Business Continuity Planning & Process

Objectives

  • A planning process to ensure that an organization can continue to provide an acceptable level of service throughout an event that causes disruption.
  • Preparedness for managing disruptions to ensure continuity, resumption and recovery of critical business processes to an agreed level, and to limit the impact on people, processes, infrastructure and systems.
  • Entrust management with responsibility for the implementation, monitoring and review of Business Continuity Planning documentation.
  • Define the roles and responsibilities of staff.
  • Minimize the impact of business discontinuity.

 RBI Guidelines on Business Continuity Planning

  • RBI Guidance Note on Management of Operational Risk-2005
  • Controls/Mitigation of Operational Risk – General
  • G.Gopalakrishna Working Group on IS, Electronic Banking, Technology Risk Management & Cyber Frauds – 2011
  • Chapter 7 – Business Continuity Planning:
  • Roles, Responsibilities and Organizational Structure

– Board of Directors & Senior Management

– BCP Head or Business Continuity Coordinator

– BCP Committee or Crisis Management Team

– BCP Teams

 

Key Factors to be considered for BCP Design

  • Testing and its techniques – Simulation, Component Testing, Technical Recovery Testing, Testing recovery at an alternate site, Testing of suppliers' facilities and services, Complete Rehearsals
  • Maintenance and Re-assessment of Plans (Review)
  • Procedural aspects of BCP – Extract of BCP to be made available in the public domain
  • Infrastructure aspects of BCP
  • Human aspects of BCP
  • Technology aspects of BCP – Data Recovery Strategies, RPO, RTO, MTPD, Near Site, DR
  • Issues/Challenges in DC/DR implementation by the Banks

 What is BCP?...

  • Preparedness of an Organization
    • Policies
    • Standards & Procedures to ensure Continuity
    • Resumption and recovery of critical business processes

 Why BCP?

➢ Uninterrupted customer service

➢ Better “Risk Management”

➢ Compliance with “Regulatory requirements”

➢ Identification of weaknesses and implementation of a disaster prevention program

➢ Minimizing the duration of a serious disruption to business operations

➢ Reduction in the complexity of the recovery effort

➢ Absence of BCP can result in Operational & Financial Losses, Reputation & Legal Risks, other consequences

 

What if we Don’t Plan?

➢ Operational Loss

➢ Financial Loss

➢ Reputation Risk

➢ Legal Risk

➢ Loss of Stakeholders’ Confidence

➢ Other material consequences arising due to disaster e.g. Loss of Life

 

Business Impact Analysis (BIA)

BIA Covers the following activities: 

Operational functions- Customer service

Support functions- IT, Human resources, Administration, external support services

Strategic activities- Management, Projects and planning

 

Purpose:

  • To document the impacts that may result over time from loss or disruption.
  • Identify the Max. Tolerable Period of Disruption (MTPD)
  • Identify the dependencies
  • Determine, in advance, the impact of interruption arising from business changes such as:
    • Significant change in business operations
    • Change in organizational structure or staffing levels
    • A significant new Project
    • Change in outsourcing contract

 Risk Assessment

Classification of Business functions 

Critical: These functions cannot be performed unless they are replaced by identical capabilities. E.g. a hardware crash in a server.

Vital: These functions can be performed manually or otherwise, but only for a brief period of time.

Sensitive: These functions can be performed manually or otherwise at a tolerable cost and for an extended period of time. E.g. one of a server's two hard disks has crashed.

Non-sensitive: These functions may be interrupted for an extended period of time at little or no extra cost. E.g. one printer (out of two) is not functioning in a branch.
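
The four classes above can be read as a small decision rule. The following is a minimal sketch only: the three yes/no questions and all names are illustrative assumptions, and the boundary between a "brief" and an "extended" period is left as a judgement for whoever answers them.

```python
from enum import Enum


class FunctionClass(Enum):
    CRITICAL = "Critical"
    VITAL = "Vital"
    SENSITIVE = "Sensitive"
    NON_SENSITIVE = "Non-sensitive"


def classify(has_manual_fallback: bool,
             fallback_sustainable_long_term: bool,
             interruption_cost_negligible: bool) -> FunctionClass:
    """Map answers about a business function onto the four classes.

    The three parameters are hypothetical inputs chosen for illustration;
    they are not taken from any regulatory definition.
    """
    if interruption_cost_negligible:
        # Can stay interrupted for an extended period at little or no cost.
        return FunctionClass.NON_SENSITIVE
    if not has_manual_fallback:
        # Only an identical replacement capability will do.
        return FunctionClass.CRITICAL
    if fallback_sustainable_long_term:
        # Manual working is tolerable for an extended period.
        return FunctionClass.SENSITIVE
    # Manual working is possible, but only briefly.
    return FunctionClass.VITAL


print(classify(True, False, False).value)
```

Checking negligible cost first means a function that nobody would miss is never escalated merely because it lacks a manual fallback.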

 

Recovery Strategy

Recovery Point Objective (RPO) - The acceptable latency of data that will be recovered: the amount of data, measured back from a given moment / state, that the Organisation can afford to lose.

Recovery Time Objective (RTO) - The acceptable amount of time to restore the function: the time within which the Organisation should resume business operations / services, that is, the downtime it can afford to sustain before recovery / restoration of services.
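
As an illustration of how the two objectives are checked in practice, here is a minimal sketch. It assumes periodic backups (so the worst-case data loss equals one backup interval) and a recovery time measured in a drill; the function names and the sample figures are hypothetical.

```python
from datetime import timedelta


def worst_case_data_loss(backup_interval: timedelta) -> timedelta:
    """With periodic backups, a failure just before the next backup
    loses up to one full interval of data."""
    return backup_interval


def meets_objectives(backup_interval: timedelta,
                     measured_recovery_time: timedelta,
                     rpo: timedelta,
                     rto: timedelta) -> dict:
    """Compare the backup schedule with the RPO and a drill result with the RTO."""
    return {
        "rpo_met": worst_case_data_loss(backup_interval) <= rpo,
        "rto_met": measured_recovery_time <= rto,
    }


# Hypothetical figures: backups every 4 hours, last drill restored in 3 hours.
result = meets_objectives(
    backup_interval=timedelta(hours=4),
    measured_recovery_time=timedelta(hours=3),
    rpo=timedelta(hours=1),
    rto=timedelta(hours=6),
)
print(result)  # {'rpo_met': False, 'rto_met': True}
```

In this example the service comes back in time, but up to four hours of data could be lost against a one-hour RPO, so the backup or replication frequency would need to increase.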

 

Risk Analysis

➢ Identification of “Resources” and “Processes”

➢ Identification of Vulnerabilities and Threats:

Vulnerability: Chances of failure of a resource due to its inherent property. E.g. antivirus not being updated.

Threat: Event of failure of resource

➢ Identification of possibility of occurrence of threat

➢ Identification of “Controls” to manage a threat

➢ Cost-Benefit analysis

➢ Residual Risk: Acceptable risk
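
The steps above can be sketched with a simple scoring model. This is an assumption-laden illustration, not a prescribed method: the 1 to 5 scales, the likelihood × impact formula, the control-effectiveness factor and the acceptance threshold are all hypothetical choices.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    resource: str
    threat: str
    likelihood: int               # 1 (rare) .. 5 (almost certain), assumed scale
    impact: int                   # 1 (minor) .. 5 (severe), assumed scale
    control_effectiveness: float  # 0.0 (no control) .. 1.0 (fully mitigated)

    def inherent_score(self) -> int:
        """Risk before any control is applied."""
        return self.likelihood * self.impact

    def residual_score(self) -> float:
        """Risk remaining after the control is applied."""
        return self.inherent_score() * (1.0 - self.control_effectiveness)


ACCEPTABLE_RESIDUAL = 6.0  # hypothetical risk appetite

risks = [
    Risk("Branch server", "Malware (antivirus not updated)", 4, 4, 0.75),
    Risk("Leased line", "Network link failure", 3, 5, 0.5),
]

for r in risks:
    status = "accept" if r.residual_score() <= ACCEPTABLE_RESIDUAL else "treat further"
    print(f"{r.resource}: inherent={r.inherent_score()}, "
          f"residual={r.residual_score():.1f} -> {status}")
```

With these sample values the server risk falls to 4.0 and is accepted, while the link risk stays at 7.5 and calls for a further control, which is where the cost-benefit analysis comes in.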

 

Development & Implementation of BCP

➢ Action plans, i.e. defined response actions specific to the bank’s processes

➢ Establishing management succession and emergency powers

➢ Compatibility and co-ordination of contingency plans at both the bank and its service providers

➢ Having specific contingency plans for each outsourcing arrangement or service agency

➢ Periodic updating to absorb changes in the institution or its service providers

 

Key Factors for BCP

  • Probability of unplanned events
  • Security threats
  • Increasing infra and application interdependencies
  • Regulatory and compliance requirements
  • Failure of key third party arrangements
  • Globalisation and the challenges of operating in multiple countries

BCP Framework

➢ Conditions for activating plans

(Declaration of Disaster)

➢ Emergency procedures

(Liaison with police, fire service, health-care services and local government)

➢ Identification of the processing resources and locations, available to replace those supporting critical activities

➢ Identification of information to be backed up

➢ Resumption procedures

➢ Awareness and education activities

➢ Roles and Responsibilities

 

Testing Techniques

  • Table-top testing for scenarios (discussing business recovery arrangements using example interruptions)
  • Simulations (particularly for training people in their post-incident or crisis management roles)
  • Technical recovery testing (ensuring information systems can be restored effectively)
  • Testing recovery at an alternate site (running business processes in parallel with recovery operations away from the main site)
  • Tests of supplier facilities and services (ensuring externally provided services and products will meet the contracted commitment)
  • Complete rehearsals (testing that the organisation, personnel, equipment, facilities and processes can cope with interruptions)

 Various Teams & Responsibilities ... (1)

➢ Incident Response Team

➢ Emergency Action Team

➢ Damage Assessment Team

➢ Emergency Management Team

➢ Offsite Storage Team

➢ Software / Application Team

➢ Security Team

➢ Emergency Operations Team

 

The Teams & Responsibilities .... (2)

➢ Network Recovery Team

➢ Communications Team

➢ Transportation Team

➢ User Hardware Team

➢ Data Preparation and Records Team

➢ Supplies Team

➢ Relocation Team

➢ Legal Affairs Team

➢ Recovery Test Team

➢ Training Team

 

Testing BCP

➢ Regularly test BCPs to ensure that they are up to date and effective

➢ Audit the effectiveness of BCP

➢ BCP drill with the critical third parties

➢ Periodically move the operations to DR Centre including People, Processes and Resources (IT & Non IT).

➢ Testing the readiness of alternative staff at the DR site

➢ Consider having unplanned BCP drill

 

Maintenance/Reassessment of BCP

➢Strategy may not be adequate as organizational needs change

➢New resources/applications may be developed

➢Changes in HW/SW environment

➢Changes in Business Strategy

 

Disaster Recovery Site

➢ HA - High availability systems which keep both the data and system replicated off-site, enabling continuous access to systems and data

➢ Replication of data to an off-site location, which overcomes the need to restore the data

 

Backup sites

Hot Sites- duplicate of the original site

Warm Sites- backups are available, but they may not be complete

Cold Sites- does not include backed up copies of data and information from the original location of the organization, nor does it include hardware already set up

Near Sites- duplicate of the original site, but only with data backup + some other minimum backups

 

Overview of Data Centre

  • Structure & Design of Data Centre
  • Important Functions
  • Various Teams
  • BCP & DR Operations
  • NOC
  • SOC
  • Change Management
  • Backup Management
  • HR Management

 Data Centre

Data Centre Infrastructure, HW etc. 

  • Software / Applications and other areas
  • Health of systems, patching, upgrade
  • Application Version – Patch Updates, UAT, making Systems Live
  • Application ownerships, procedures for access
  • System performance / efficiencies, provisioning for capacity requirements
  • Access Controls over CBS, Databases, Tools, Applications
  • Software parameter changes
  • Access Controls to sensitive directories / folders
  • Maintenance of sensitive passwords
  • Anti Virus Servers, Update Procedures
  • E-mail Systems
  • Card Operations – NFS and other Card Networks Interfaces
  • ATM Switch (Servers & Security Systems, HSM)
  • Remittances – RTGS, NEFT, SWIFT, UPI Systems & Interfaces
  • Cheque Truncation Interfacing Systems
  • Backup Procedures, Offsite transfers, transports, tests & restore Procedures

 Data Centre Tier Level Requirements

 Uptime guarantee, Performance, Investment, Return on investment (ROI)

Tier 1

➢ Single non-redundant distribution path serving the IT equipment

➢ Non-redundant capacity components

➢ Basic site infrastructure with expected availability of 99.671%

Tier 2

➢ Meets or exceeds all Tier 1 requirements

➢ Redundant site infrastructure capacity components with expected availability of 99.741%

Tier 3

➢ Meets or exceeds all Tier 1 and Tier 2 requirements

➢ Multiple independent distribution paths serving the IT equipment

➢ All IT equipment must be dual-powered and fully compatible with the topology of a site's architecture

➢ Concurrently maintainable site infrastructure with expected availability of 99.982%

Tier 4

➢ Meets or exceeds all Tier 1, Tier 2 and Tier 3 requirements

➢ All cooling equipment is independently dual-powered, including chillers and heating, ventilating and air-conditioning (HVAC) systems

➢ Fault-tolerant site infrastructure with electrical power storage and distribution facilities with expected availability of 99.995%
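
To make the availability percentages above concrete, they can be converted into expected downtime per year. A minimal sketch, assuming a 365-day year; the function and constant names are illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years

# Expected availability per tier, as listed above.
TIER_AVAILABILITY = {
    "Tier 1": 99.671,
    "Tier 2": 99.741,
    "Tier 3": 99.982,
    "Tier 4": 99.995,
}


def annual_downtime_hours(availability_percent: float) -> float:
    """Expected downtime in hours per year for a given availability."""
    return (1 - availability_percent / 100) * HOURS_PER_YEAR


for tier, pct in TIER_AVAILABILITY.items():
    print(f"{tier}: {annual_downtime_hours(pct):.1f} hours/year")
```

This works out to roughly 28.8 hours a year for Tier 1, 22.7 for Tier 2, 1.6 for Tier 3 and 0.4 (about 26 minutes) for Tier 4.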

 

Enablers for Business Continuity

  • People
  • Premises
  • Technology Infrastructure
  • Information / Data
  • Support Services

 EMERGENCY RESPONSE

 NON-LIFE THREATENING EMERGENCY

BCM Team shall be in action to respond to the situation and co-ordinate with concerned staff / Vendors to ensure that the negative impact is reduced.

➢ BCM Team should take note of the incident and validate whether it needs attention for business continuity

➢ Communicate incident details to the BCM Leader & Zonal EMT

➢ Engage with the concerned staff / Vendors / departments for resolution of the incident and note what actions are being taken towards resolution of the incident

➢ Assist the concerned staff members based on past learning of similar incidents

 

LIFE THREATENING EMERGENCY

BCM Team, with help from Security staff, Floor wardens & First aid volunteers, shall:

➢ Take appropriate action based on the Emergency Response & Evacuation Guidelines

➢ In case evacuation is required, floor wardens to ensure that all staff are evacuated

➢ While evacuating staff, floor wardens should note the preliminary impact of the incident on the building / floors

➢ Floor wardens should convey the preliminary impact observations to BCM Coordinator / BCM Head.

➢ Fire fighting efforts should be initiated by trained staff / Security staff

➢ First Aid volunteers should help others in need of first aid

 

YEARLY PLAN REVIEW 

➢ During the annual review process, the BCM Team should internally discuss the effectiveness of this plan (Table top testing) and make necessary changes in the plan to suit branch requirements.

➢ One Copy of updated plan should be kept in branch record & one copy should be shared with Zonal BCM Committee for record.

➢ The Updated plan shall be distributed to all BCM Team members. The records of the same should be maintained in Plan distribution list (annexure – V).

➢ Before delivery of the new plan, old plan copies should be collected from all members and marked expired or old.

 

EMERGENCY MANAGEMENT

PRE-EMERGENCY PREPARATION 

  • Emergency Response & Evacuation Guidelines

➢ Suggested response to emergencies such as Fire, Earthquake, Flood, Tsunami, Civil disorders, Terrorism, Pandemic etc.

  • Facility, Employee and Asset inventory information –

Ensure the following documents are available with the BCM Team:

➢ Branch / Center Employee contact details

➢ Vendor contact details for Emergency Services & Critical Supply

➢ Emergency Response and Evacuation Guidelines

➢ Copies of building / floor plans, Electrical & LAN Diagrams

➢ Asset Inventories (IT & Non-IT)

➢ Insurance information

  • Staff Awareness & Training for

1. Emergency response & evacuation guidelines

2. Emergency Evacuation procedures

3. Use of Fire fighting equipment

4. First aid trainings / Availability of First Aid Box

 

EMERGENCY RESPONSE

POST EMERGENCY ANALYSIS

 Non Life Threatening Emergency

  • root cause analysis
  • report to the Branch / Center BCM Head and Zonal EMT.

Life Threatening Emergency

  • root cause analysis
  • incident report to the Branch / Center BCM Head and Zonal EMT.
  • Assess the damage to assets & human life and document the same
  • Conduct a thorough inventory check of equipment
  • Report estimated losses sustained
  • Highlight critical facilities requiring priority repairs if damaged
  • Request manpower aid from Zonal Office if required
  • Compile damage assessment reports and submit to Branch BCM Head and Zonal EMT

RECOVERY STRATEGY - PRE-DISRUPTION PREPARATION (ENABLER WISE) 

(I) People –

➢ For each process, identify backup for key persons and on a periodic basis (or whenever key person is not available) assign work of respective seat to identified backup person.

➢ Provide multi-skill training to staff members so that maximum processes can be continued with minimum staff strength

➢ Encourage planned leaves to avoid surprise staff shortages

(II) Premises –

➢ Alternative sites should be identified. The alternative site administration should be well informed and taken into confidence about such plan.

➢ Special care is needed for the processes which are specific to this branch and not available in nearby branches. Separate detailed plans may be developed for such unique processes, in which the absence of each of the five enablers is separately taken care of.

(III) Technology –

➢ BCM Team members should have fair understanding of the Branch / Center technical setup. This will help in proper diagnosis of technology related disruptions.

➢ The contact list of critical vendors like ATM vendor, UPS vendor, Head office helpdesk numbers etc. should be readily available in branch.

➢ The Branch / Unit should ensure that Antivirus & Operating system patches are regularly updated in branch. This will significantly reduce computer problems in branch.

(IV) Data / Information -

➢ Due to CBS, most of branch data is in Data Centre, but still many processes have associated manual activities and require multiple physical documents. Such documents need to be identified and their availability at alternate site should be ensured.

➢ The digitisation of critical documents, especially credit / legal / customer documents, should be done on priority.

➢ The work files used by branch / office staff for reporting / credit appraisal & other miscellaneous activities should be backed up periodically and stored at onsite as well as off-site locations.

(V) Support Services & stock -

➢ The Branch / Office should have suitable arrangements for power supply failure situations, like UPS power for computers and an inverter or generator for lighting, fans etc.

➢ For telephone line failure, the branch / office should have some wireless based telephone sets so that communication can be ensured in case of line failure. The Branch leased line should have functional ISDN or VSAT as a fall back option for network connectivity.

➢ The BCM Team should ensure that fire fighting equipment in the branch / office is refilled before expiry.

➢ A suitable stock of computer / other stationery items and canteen food articles needs to be ensured in the branch / office.

 

RECOVERY STRATEGY - DURING-DISRUPTION RESPONSE 

  • Processes common with Alternate Branch

➢ With the CBS system in place, customers can avail most of the services from any other branch of the Bank.

➢ For customer convenience, however, a notice board stating the alternate branch name & address (refer Alternate Site details) should be put up outside the branch.

➢ In all communications to the customers (phone and email), staff should inform them of the alternate branch location.

➢ The Alternate Branch’s Manager should be consulted for staff support to handle increased load.

➢ The Branch BCM Team should engage with the concerned staff / Vendors / departments for resolution of the incident and initiate recovery efforts towards restoration of operations at branch within agreed RTO timelines.

  • Processes unique to the Branch –

Alternative Arrangements and Requirements:

➢ Separate detailed plans may be developed for such unique processes, in which the absence of each of the five enablers is separately taken care of.

 

RECOVERY STRATEGY - POST -DISRUPTION ANALYSIS

Once continuity of processes is restored with the coordinated efforts of all, the following tasks should be carried out:

➢ Root cause analysis of the incident should be done by the Branch BCM Team, and the report submitted to the Branch BCM Head and Zonal EMT.

➢ This report should be sent by the branch within a maximum of 3 days after the incident.

➢ The Branch BCM Head should ensure that preventive and corrective actions are initiated as per the root cause analysis report of the incident.

 

QUARTERLY PREPAREDNESS CHECK

➢ The Branch BCM Team should assess branch preparedness for business continuity every quarter.

➢ The contact details and other routine changes required in annexures due to movement of staff etc. should be updated during the quarterly review, with each BCM Team member updating their plan copy.

➢ During this exercise, gap analysis shall be carried out with respect to pre-emergency preparatory guidelines

➢ The findings, along with the list of action items and action owners, shall be reported to the Zonal BCM committee. The Physical & Environmental Control checklist (annexure – IV) should be part of this report.
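
The gap analysis in this check can be as simple as comparing what the pre-emergency preparation requires with what the branch actually holds. A minimal sketch, assuming the required set is the document list from the pre-emergency preparation section; the function name and the sample branch data are hypothetical.

```python
# Documents the BCM Team is expected to hold (from pre-emergency preparation).
REQUIRED_ITEMS = {
    "Employee contact details",
    "Vendor contact details",
    "Emergency Response and Evacuation Guidelines",
    "Building / floor plans, Electrical & LAN diagrams",
    "Asset inventories (IT & Non-IT)",
    "Insurance information",
}


def gap_analysis(available: set[str]) -> list[str]:
    """Return the required items that are missing, sorted for stable reporting."""
    return sorted(REQUIRED_ITEMS - available)


# Hypothetical state of one branch at the time of the quarterly check.
available_at_branch = {
    "Employee contact details",
    "Vendor contact details",
    "Insurance information",
}

for item in gap_analysis(available_at_branch):
    print("MISSING:", item)
```

Each missing item would then be given an action owner before the findings go to the Zonal BCM committee.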

 

YEARLY PLAN REVIEW

➢ The Branch BCP Plan should be maintained by annual reviews and updates to ensure its continued effectiveness. Updated copy should reach controlling office in time.

➢ The Maintenance & review exercise shall ensure that any changes (internal or external) which impact the branch are reviewed in relation to BCM.

➢ It shall also identify any new products and services and their dependent activities which need to be included in the Business Continuity Plan.

 
