Business Continuity Management
March 20, 2024, 8:33 a.m.
What is Disaster
- ‘Disaster’ is defined as a crisis situation causing widespread damage.
Why to Worry
- Threat to Life
- Threat to Business Continuity
What to Do
- Plan for Emergency Response
- Plan for Business Continuity
What may go wrong?
Types of Failures & Security Breaches
- System Failure – HW / SW / Network
- Operations failure
- Data Damage
- Data Loss
- Data Leakages
- Data Theft - Insider / Internal Threats
- Data Theft – Breaches / External Threats
Murphy's law: "Anything that can go wrong will go wrong."
Threats to Business Continuity
These can happen due to
- Hardware / Software failures
- Power failures
- Network Link failures (LAN / WAN)
- Building fires
- Electrical Short-circuits
- Floods
- Earthquakes
- Epidemic, Pandemic disease, Outbreak
- Transport shut-downs, disruptions
- Terrorist attacks, Bomb Blasts
- Civil Unrest, War
Business Continuity Planning & Process ...
- Lack of Business Continuity may result in:
- Loss of Reputation/Image/Public confidence;
- Loss of Customers to Competitors;
- Loss of Revenue;
- Loss of Opportunity;
- Loss of Productivity;
- Other Material Consequences
(Loss Event Types - BDSF-EDPM & DPA)
(Business Disruption due to System Failure, Execution Delivery & Process Management, Damage to Physical Assets)
Major Types of Disruptions & Mitigation
A disruption can be caused by a multitude of threats, for example:
Natural Hazards –
✓Geophysical - Earthquakes, Volcano, Tsunami
✓Hydrological – Flood and landslides
✓Meteorological (short-lived) – Cyclone, Storm Surge, Lightning, Heavy Rains, Dust Storm, Snow, Winter Storm, Extreme Temperature, Fog, Heat-wave, etc.
Environmental Occurrences e.g. fire, explosion, nuclear disaster, pollution, radiation, chemical, biological incidents
Utility failures including electrical power, water supply, air conditioning
✓IT- Hardware and software failures including loss of historical/current data
✓People including human error, disclosure, accidental damage, staff shortage, loss of key by staff, labour unrest, mass absence due to a pandemic, bio-chemical disaster
✓Criminal actions including fraud, sabotage, hacking, theft, terrorist actions, hostage taking
✓Civil / Political / Legal actions including acts of war, riots, demonstrations, labour strikes, legal or regulatory action
✓Third party failures such as disruption of delivery from suppliers/service providers/ outsourced services
Business Continuity Planning & Process
Objectives
- A planning process to ensure that an organization can continue to provide an acceptable level of service throughout an event that causes disruption.
- Preparedness for managing disruptions to ensure continuity, resumption and recovery of critical business processes to an agreed level, and to limit the impact on people, processes, infrastructure and systems.
- Entrust management responsibility for the implementation, monitoring and review of Business Continuity Planning documentation
- Define the roles and responsibilities of staff
- Minimize the impact of business discontinuity
RBI Guidelines on Business Continuity Planning
- RBI Guidance Note on Management of Operational Risk-2005
- Controls/Mitigation of Operational Risk – General
- G.Gopalakrishna Working Group on IS, Electronic Banking, Technology Risk Management & Cyber Frauds – 2011
- Chapter 7 – Business Continuity Planning:
- Roles, Responsibilities and Organizational Structure
– Board of Directors & Senior Management
– BCP Head or Business Continuity Coordinator
– BCP Committee or Crisis Management Team
– BCP Teams
Key Factors to be considered for BCP Design
- Testing, its techniques – Simulation; Component Testing, Technical Recovery Testing, Testing recovery at an alternate site, Testing of suppliers facilities and services, Complete Rehearsals
- Maintenance and Re-assessment of Plans (Review)
- Procedural aspects of BCP – Extract of BCP to be made available on Public Domain
- Infrastructure aspects of BCP
- Human aspects of BCP
- Technology aspects of BCP – Data Recovery Strategies, RPO, RTO, MTPD, Near Site, DR
- Issues/Challenges in DC/DR implementation by the Banks
What is BCP?...
- Preparedness of an Organization
- Policies
- Standards & Procedures to ensure Continuity
- Resumption and recovery of critical business processes
Why BCP?
➢ Uninterrupted customer service
➢ Better “Risk Management”
➢ Compliance of “Regulatory requirements”
➢ Identification of weaknesses and implementation of a disaster prevention program
➢ Minimizing the duration of a serious disruption to business operations
➢ Reduction in the complexity of the recovery effort
➢ Absence of BCP can result in Operational & Financial Losses, Reputation & Legal Risks, other consequences
What if we Don’t Plan?
➢ Operational Loss
➢ Financial Loss
➢ Reputation Risk
➢ Legal Risk
➢ Loss of Stakeholders’ Confidence
➢ Other material consequences arising due to disaster e.g. Loss of Life
Business Impact Analysis (BIA)
BIA Covers the following activities:
➢Operational functions- Customer service
➢Support functions- IT, Human resources, Administration, external support services
➢Strategic activities- Management, Projects and planning
Purpose:
- To document the impacts, over time, that may result from loss or disruption.
- Identify the Max. Tolerable Period of Disruption (MTPD)
- Identify the dependencies
- Determine the impact of an interruption in advance, and reassess it when the business changes, for example:
- Significant change in business operations
- Change in organizational structure or staffing levels
- A significant new Project
- Change in outsourcing contract
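The idea of documenting impacts over time and reading off the MTPD can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `estimate_mtpd`, the loss figures and the tolerable-loss threshold are all hypothetical and do not come from any real BIA.

```python
from typing import Optional

# Hypothetical BIA table: (hours of disruption, estimated cumulative loss).
# The figures are illustrative only.
impact_over_time = [
    (1, 10_000),
    (4, 60_000),
    (8, 150_000),
    (24, 600_000),
    (72, 2_500_000),
]


def estimate_mtpd(impacts: list[tuple[int, int]], tolerable_loss: int) -> Optional[int]:
    """Return the longest documented disruption (in hours) whose estimated
    loss stays within tolerable_loss, or None if even the shortest
    documented disruption already exceeds it."""
    mtpd = None
    for hours, loss in sorted(impacts):
        if loss > tolerable_loss:
            break  # every longer disruption costs at least this much
        mtpd = hours
    return mtpd


# Assumed threshold: management tolerates a loss of up to 200,000.
print(estimate_mtpd(impact_over_time, tolerable_loss=200_000))
```

In practice the impact would be assessed per business function and would include non-financial effects such as reputation, which this sketch ignores.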
Risk Assessment
Classification of Business functions
• Critical: These functions cannot be performed unless they are replaced by identical capabilities. E.g. a hardware crash in a server
• Vital: These functions can be performed manually or otherwise, but only for a brief period of time.
• Sensitive: These functions can be performed manually or otherwise at a tolerable cost and for an extended period of time. E.g. one of a server's two hard disks has crashed
• Non-sensitive: These functions may be interrupted for an extended period of time at little or no extra cost. E.g. one of the two printers in a branch is not functioning
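One possible use of this classification is to order functions for recovery, most critical first. The sketch below assumes that use; the class `Criticality`, the helper `recovery_order` and the sample function names are hypothetical.

```python
from enum import IntEnum


class Criticality(IntEnum):
    """The four classes above; a lower value means it is recovered earlier."""
    CRITICAL = 1
    VITAL = 2
    SENSITIVE = 3
    NON_SENSITIVE = 4


# Hypothetical branch functions, one per class, for illustration only.
functions = {
    "Branch printing": Criticality.NON_SENSITIVE,
    "Server hardware": Criticality.CRITICAL,
    "Redundant server disk": Criticality.SENSITIVE,
    "Cash counter operations": Criticality.VITAL,
}


def recovery_order(classified: dict[str, Criticality]) -> list[str]:
    """Sort function names so that the most critical class comes first."""
    return sorted(classified, key=lambda name: classified[name])


for name in recovery_order(functions):
    print(f"{functions[name].name:<14} {name}")
```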
Recovery Strategy
➢ Recovery Point Objective (RPO) - The acceptable latency of the data that will be recovered, i.e. the amount of data, measured back from a given moment / state, that the Organisation can afford to lose.
➢ Recovery Time Objective (RTO) - The acceptable amount of time to restore the function, i.e. the time within which the Organisation should resume business operations / services, and hence the downtime it can afford to sustain before recovery / restoration of services.
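A minimal sketch of how an incident could be checked against RPO and RTO is given below. It assumes data loss is measured from the last good backup to the incident, and downtime from the incident to restoration; the function name `check_objectives`, the timestamps and the 4-hour targets are hypothetical.

```python
from datetime import datetime, timedelta


def check_objectives(last_backup: datetime, incident: datetime,
                     restored: datetime, rpo: timedelta, rto: timedelta) -> dict:
    """Compare the actual data loss and downtime of an incident with RPO/RTO."""
    data_loss = incident - last_backup   # work done since the last backup is lost
    downtime = restored - incident       # time the service was unavailable
    return {
        "data_loss": data_loss,
        "downtime": downtime,
        "rpo_met": data_loss <= rpo,
        "rto_met": downtime <= rto,
    }


# Illustrative incident: backup at 02:00, failure at 08:30, service back at 11:00.
result = check_objectives(
    last_backup=datetime(2024, 3, 20, 2, 0),
    incident=datetime(2024, 3, 20, 8, 30),
    restored=datetime(2024, 3, 20, 11, 0),
    rpo=timedelta(hours=4),
    rto=timedelta(hours=4),
)
print(result)
```

With these sample values the downtime of 2.5 hours is within the RTO, but 6.5 hours of data would be lost, which exceeds the RPO. That outcome would suggest more frequent backups or replication.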
Risk Analysis
➢ Identification of “Resources” and “Processes”
➢ Identification of Vulnerabilities and Threats:
• Vulnerability: The chance that a resource fails because of an inherent property. E.g. antivirus software that is not updated.
• Threat: An event in which a resource fails
➢ Identification of possibility of occurrence of threat
➢ Identification of “Controls” to manage a threat
➢ Cost-Benefit analysis
➢ Residual Risk: Acceptable risk
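The steps above can be illustrated with a simple expected-loss calculation. This is one common way to do a cost-benefit analysis, not necessarily the method intended here; the classes `Threat` and `Control`, the helper `evaluate_control` and all figures are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Threat:
    name: str
    annual_probability: float  # assumed chance of occurrence in a year (0 to 1)
    impact: float              # assumed loss per occurrence

    def expected_loss(self) -> float:
        return self.annual_probability * self.impact


@dataclass
class Control:
    name: str
    annual_cost: float
    probability_after: float   # assumed chance of occurrence once the control is in place


def evaluate_control(threat: Threat, control: Control) -> tuple[float, float]:
    """Return (residual_risk, net_benefit) of applying a control to a threat."""
    residual_risk = control.probability_after * threat.impact
    risk_reduction = threat.expected_loss() - residual_risk
    net_benefit = risk_reduction - control.annual_cost
    return residual_risk, net_benefit


# Illustrative figures only.
malware = Threat("Malware outbreak", annual_probability=0.5, impact=200_000)
antivirus = Control("Regular antivirus updates", annual_cost=5_000, probability_after=0.1)

residual, benefit = evaluate_control(malware, antivirus)
print(f"Residual risk: {residual:.0f}, net benefit of control: {benefit:.0f}")
```

A control is worth considering when its net benefit is positive; the residual risk is what remains to be accepted.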
Development & Implementation of BCP
➢ Action plans, i.e.: defined response actions specific to the bank’s processes
➢ Establishing management succession and emergency powers
➢ Compatibility and co-ordination of contingency plans at both the bank and its service providers
➢ Having specific contingency plans for each outsourcing arrangement or service agency
➢ Periodic updating to absorb changes in the institution or its service providers
Key Factors for BCP
- Probability of unplanned events
- Security threats
- Increasing infra and application interdependencies
- Regulatory and compliance requirements
- Failure of key third party arrangements
- Globalisation and the challenges of operating in multiple countries
BCP Framework
➢ Conditions for activating plans
(Declaration of Disaster)
➢ Emergency procedures
(Liaison with police, fire service, health-care services and local government)
➢ Identification of the processing resources and locations, available to replace those supporting critical activities
➢ Identification of information to be backed up
➢ Resumption procedures
➢ Awareness and education activities
➢ Roles and Responsibilities
Testing Techniques
- Table-top testing for scenarios (discussing business recovery arrangements using example interruptions)
- Simulations (particularly for training people in their post-incident or crisis management roles)
- Technical recovery testing (ensuring information systems can be restored effectively)
- Testing recovery at an alternate site (running business processes in parallel with recovery operations away from the main site)
- Tests of supplier facilities and services (ensuring externally provided services and products will meet the contracted commitment)
- Complete rehearsals (testing that the organisation, personnel, equipment, facilities and processes can cope with interruptions)
Various Teams & Responsibilities ... (1)
➢ Incident Response Team
➢ Emergency Action Team
➢ Damage Assessment Team
➢ Emergency Management Team
➢ Offsite Storage Team
➢ Software / Application Team
➢ Security Team
➢ Emergency Operations Team
The Teams & Responsibilities .... (2)
➢ Network Recovery Team
➢ Communications Team
➢ Transportation Team
➢ User Hardware Team
➢ Data Preparation and Records Team
➢ Supplies Team
➢ Relocation Team
➢ Legal Affairs Team
➢ Recovery Test Team
➢ Training Team
Testing BCP
➢ Regularly test the BCP to ensure that it is up to date and effective
➢ Audit the effectiveness of BCP
➢ BCP drill with the critical third parties
➢ Periodically move the operations to DR Centre including People, Processes and Resources (IT & Non IT).
➢ Testing the readiness of alternative staff at the DR site
➢ Consider having unplanned BCP drill
Maintenance/Reassessment of BCP
➢Strategy may not be adequate as organizational needs change
➢New resources/applications may be developed
➢Changes in HW/SW environment
➢Changes in Business Strategy
Disaster Recovery Site
➢ HA - High availability systems which keep both the data and system replicated off-site, enabling continuous access to systems and data
➢ Replication of data to an off-site location, which overcomes the need to restore the data
Backup sites
➢ Hot Sites- duplicate of the original site
➢ Warm Sites- backups are available, but they may not be complete
➢ Cold Sites- does not include backed up copies of data and information from the original location of the organization, nor does it include hardware already set up
➢ Near Sites- duplicate of the original site, but only with data backup + some other minimum backups
Overview of Data Centre
- Structure & Design of Data Centre
- Important Functions
- Various Teams
- BCP & DR Operations
- NOC
- SOC
- Change Management
- Backup Management
- HR Management
Data Centre
Data Centre Infrastructure, HW etc.
- Software / Applications and other areas
- Health of systems, patching, upgrade
- Application Version – Patch Updates, UAT, making Systems Live
- Application ownerships, procedures for access
- System performance / efficiencies, provisioning for capacity requirements
- Access Controls over CBS, Databases, Tools, Applications
- Software parameter changes
- Access Controls to sensitive directories / folders
- Maintenance of sensitive passwords
- Anti Virus Servers, Update Procedures
- E-mail Systems
- Card Operations – NFS and other Card Networks Interfaces
- ATM Switch (Servers & Security Systems, HSM)
- Remittances – RTGS, NEFT, SWIFT, UPI Systems & Interfaces
- Cheque Truncation Interfacing Systems
- Backup Procedures, Offsite transfers, transports, tests & restore Procedures
Data Centre Tier Level Requirements
Uptime guarantee, Performance, Investment, Return on investment (ROI)
Tier 1
➢ Single non-redundant distribution path serving the IT equipment
➢ Non-redundant capacity components
➢ Basic site infrastructure with expected availability of 99.671%
Tier 2
➢ Meets or exceeds all Tier 1 requirements
➢ Redundant site infrastructure capacity components with expected availability of 99.741%
Tier 3
➢ Meets or exceeds all Tier 1 and Tier 2 requirements
➢ Multiple independent distribution paths serving the IT equipment
➢ All IT equipment must be dual-powered and fully compatible with the topology of a site's architecture
➢ Concurrently maintainable site infrastructure with expected availability of 99.982%
Tier 4
➢ Meets or exceeds all Tier 1, Tier 2 and Tier 3 requirements
➢ All cooling equipment is independently dual-powered, including chillers and heating, ventilating and air-conditioning (HVAC) systems
➢ Fault-tolerant site infrastructure with electrical power storage and distribution facilities with expected availability of 99.995%
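To make the availability percentages easier to compare, they can be converted into expected downtime per year. The sketch below assumes a 365-day year; the names `TIER_AVAILABILITY` and `annual_downtime_hours` are introduced only for this example.

```python
HOURS_PER_YEAR = 365 * 24  # assumes a non-leap year

# Expected availability per tier, as listed above (in percent).
TIER_AVAILABILITY = {
    "Tier 1": 99.671,
    "Tier 2": 99.741,
    "Tier 3": 99.982,
    "Tier 4": 99.995,
}


def annual_downtime_hours(availability_percent: float) -> float:
    """Hours per year a site may be unavailable at the given availability."""
    return (1 - availability_percent / 100) * HOURS_PER_YEAR


for tier, availability in TIER_AVAILABILITY.items():
    print(f"{tier}: {annual_downtime_hours(availability):.2f} hours/year")
```

This works out to roughly 28.8 hours a year for Tier 1, 22.7 for Tier 2, 1.6 for Tier 3 and 0.4 (about 26 minutes) for Tier 4.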
Enablers for Business Continuity
- People
- Premises
- Technology Infrastructure
- Information / Data
- Support Services
EMERGENCY RESPONSE
NON-LIFE THREATENING EMERGENCY
BCM Team shall be in action to respond to the situation and co-ordinate with concerned staff / Vendors to ensure that the negative impact is reduced.
➢ BCM Team should take note of the incident and validate whether it needs attention for business continuity
➢ Communicate incident details to the BCM Leader & Zonal EMT
➢ Work with the concerned staff / Vendors / departments to resolve the incident and note what actions are being taken towards its resolution
➢ Assist the concerned staff members based on past learning from similar incidents
LIFE THREATENING EMERGENCY
BCM Team, with help from Security staff, Floor wardens & First aid volunteers, shall:
➢ Take appropriate action based on the Emergency Response & Evacuation Guidelines
➢ In case evacuation is required, floor wardens should ensure that all staff are evacuated
➢ While evacuating staff, floor wardens should observe and make note of the preliminary impact of the incident on the building / floors
➢ Floor wardens should convey the preliminary impact observations to BCM Coordinator / BCM Head.
➢ Fire fighting efforts should be initiated by trained staff / Security staff
➢ First Aid volunteers should help others in need of first aid
YEARLY PLAN REVIEW
➢ During the annual review process, the BCM Team should internally discuss the effectiveness of this plan (Table top testing) and make necessary changes in the plan to suit branch requirements.
➢ One Copy of updated plan should be kept in branch record & one copy should be shared with Zonal BCM Committee for record.
➢ The Updated plan shall be distributed to all BCM Team members. The records of the same should be maintained in Plan distribution list (annexure – V).
➢ Before delivery of the new plan, copies of the old plan should be collected from all members and marked expired or old.
EMERGENCY MANAGEMENT
PRE-EMERGENCY PREPARATION
- Emergency Response & Evacuation Guidelines
➢ Suggested response to emergencies such as Fire, Earthquake, Flood, Tsunami, Civil disorders, Terrorism, Pandemic etc.
- Facility, Employee and Asset inventory information –
Ensure the following documents are available with the BCM Team:
➢ Branch / Center Employee contact details
➢ Vendor contact details for Emergency Services & Critical Supply
➢ Emergency Response and Evacuation Guidelines
➢ Copies of building / floor plans, Electrical & LAN Diagrams
➢ Asset Inventories (IT & Non-IT)
➢ Insurance information
- Staff Awareness & Training for
1. Emergency response & evacuation guidelines
2. Emergency Evacuation procedures
3. Use of Fire fighting equipment
4. First aid training / Availability of First Aid Box
EMERGENCY RESPONSE
POST EMERGENCY ANALYSIS
Non Life Threatening Emergency
- root cause analysis
- report to the Branch / Center BCM Head and Zonal EMT.
Life Threatening Emergency
- root cause analysis
- incident report to the Branch / Center BCM Head and Zonal EMT.
- Assess the damage to assets & human life and document the same
- Conduct a thorough inventory check of equipment
- Report estimated losses sustained
- Highlight critical facilities requiring priority repairs if damaged
- Request manpower aid from Zonal Office if required
- Compile damage assessment reports and submit to Branch BCM Head and Zonal EMT
RECOVERY STRATEGY - PRE-DISRUPTION PREPARATION (ENABLER WISE)
(I) People –
➢ For each process, identify backup for key persons and on a periodic basis (or whenever key person is not available) assign work of respective seat to identified backup person.
➢ Provide multi-skill training to staff members so that maximum processes can be continued with minimum staff strength
➢ Encourage planned leaves to avoid surprise staff shortages
(II) Premises –
➢ Alternative sites should be identified. The alternative site administration should be well informed and taken into confidence about such plan.
➢ Special care should be taken for processes which are specific to this branch and not available in nearby branches. Separate detailed plans may be developed for such unique processes, where the absence of each of the five enablers should be separately taken care of.
(III) Technology –
➢ BCM Team members should have fair understanding of the Branch / Center technical setup. This will help in proper diagnosis of technology related disruptions.
➢ The Contact list of critical vendors like ATM vendor, UPS vendor, Head office helpdesk numbers etc. should be readily available in branch.
➢ The Branch / Unit should ensure that Antivirus & Operating system patches are regularly updated in branch. It will significantly reduce computer problems in branch.
(IV) Data / Information -
➢ Due to CBS, most of branch data is in Data Centre, but still many processes have associated manual activities and require multiple physical documents. Such documents need to be identified and their availability at alternate site should be ensured.
➢ The Digitisation of critical documents, especially credit / legal / customer documents, should be done on priority.
➢ The work files used by branch / office staff for reporting / credit appraisal & other miscellaneous activities should be backed up periodically and stored at onsite as well as off-site locations.
(V) Support Services & stock -
➢ The Branch / Office should have suitable arrangements for power supply failure situations, like UPS power for computers and Inverter or generator for lighting, fan etc.
➢ For Telephone line failure, branch / office should have some wireless based telephone sets so that communication can be ensured in case of line failure. The Branch leased line should have functional ISDN or VSAT as fall back option for network connectivity
➢ The BCM Team should ensure that Fire fighting equipment in branch / office is refilled before expiry.
➢ A suitable stock of computer / other stationery items / canteen food articles needs to be ensured in branch / office
RECOVERY STRATEGY - DURING-DISRUPTION RESPONSE
- Processes common with Alternate Branch
➢ With the CBS in place, customers can avail most of the services from any other branch of the Bank.
➢ For customer convenience, a notice board stating the alternate branch name & address (refer Alternate Site details) should be put up outside the branch.
➢ In all communications to the customers (phone and email) staff should inform them of the alternate branch location.
➢ Alternate Branch’s Manager should be consulted for staff support to handle increased load.
➢ The Branch BCM Team should work with the concerned staff / Vendors / departments for resolution of the incident and initiate recovery efforts towards restoration of operations at branch within agreed RTO timelines.
- Processes unique to the Branch –
Alternative Arrangements and Requirements:
➢ Separate detailed plans may be developed for such unique processes, where the absence of each of the five enablers should be separately taken care of.
RECOVERY STRATEGY - POST -DISRUPTION ANALYSIS
Once continuity of processes is restored with the coordinated efforts of all, the following tasks should be carried out:
➢ Root cause analysis of the incident should be done by the Branch BCM Team, and the report submitted to the Branch BCM Head and Zonal EMT.
➢ This Report should be sent by the branch within a maximum of 3 days after the incident.
➢ The Branch BCM Head should ensure that preventive and corrective actions are initiated as per the root cause analysis report of the incident.
QUARTERLY PREPAREDNESS CHECK
➢ The Branch BCM Team should quarterly assess branch preparedness for business continuity.
➢ The Contact details and other routine changes required in annexures due to movement of staff etc. should be updated during the quarterly review, with each BCM Team member updating their copy of the plan.
➢ During this exercise gap analysis shall be carried out with respect to pre-emergency preparatory guidelines
➢ The Findings along with the list of action items and action owners shall be reported to Zonal BCM committee. The Physical & Environmental Control checklist (annexure – IV) should be part of this report.
YEARLY PLAN REVIEW
➢ The Branch BCP Plan should be maintained by annual reviews and updates to ensure its continued effectiveness. Updated copy should reach controlling office in time.
➢ The Maintenance & review exercise shall ensure that any changes (internal or external) which impact the branch are reviewed in relation to BCM.
➢ It shall also identify any new products and services and their dependent activities which need to be included in the Business Continuity Plan.