QRadar Technical Blog: HA and DR
People often ask whether to use High Availability (HA) or Disaster Recovery (DR). In our view this is not really a valid question. The two techniques address different issues, and the choice of “HA or DR” should probably be redefined as “HA and DR”.
The limitations imposed by the configuration of an HA pair restrict its use to physical appliances. A virtual environment cannot host an HA pair, as high latency between the pair makes block replication unreliable. Further, the recommendation is for the two appliances to be installed within the same rack, which is not an option for virtual installs. The objective of HA is to provide a hot-swap should the primary device fail. Failover takes place automatically once the failure has been established, which ensures continuity. From a purely practical point of view, HA pairing in a distributed QRadar environment only provides real benefit when applied to the data collection portions of the deployment. Creating an HA pair for the console will therefore not enhance security if the event and flow processors are single devices.
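The argument about where HA pays off can be illustrated with a minimal sketch. This is a hypothetical model for reasoning about a deployment, not a QRadar API: the `Appliance` class, the role names and the `unprotected_collectors` function are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Appliance:
    """Hypothetical model of one appliance; not a QRadar object."""
    name: str
    role: str               # "console", "event_processor" or "flow_processor"
    has_ha_secondary: bool  # True if the appliance is one half of an HA pair


# Roles that collect data; losing one of these means losing events or flows.
COLLECTION_ROLES = {"event_processor", "flow_processor"}


def unprotected_collectors(deployment):
    """Return the names of data-collecting appliances with no HA secondary."""
    return [
        a.name
        for a in deployment
        if a.role in COLLECTION_ROLES and not a.has_ha_secondary
    ]


# HA on the console only: both processors remain single points of failure.
console_only_ha = [
    Appliance("console", "console", True),
    Appliance("ep1", "event_processor", False),
    Appliance("fp1", "flow_processor", False),
]
print(unprotected_collectors(console_only_ha))  # prints ['ep1', 'fp1']
```

In this model the console's HA secondary never shortens the list, which is the point made above: data collection is only protected once the processors themselves are paired.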
The concept of a DR installation is to provide continuity when an entire data centre becomes unusable. As many companies have multiple data centres providing a measure of disaster recovery, the option of being able to spin up consoles and processors in the event of a disaster is merely an adjunct to any plans to provide backup for the company’s core business processes. In this case, virtualising the entire QRadar DR deployment makes perfect sense.
In a small deployment of a console and two processors, the optimum configuration would be to install the console in a virtual environment, install both processors with HA secondaries, and set up a DR centre based around a virtual console and two virtual processors. This provides security in event capture and processing in a day-to-day context, and the capability to raise a replica deployment in the event of an emergency. The network integrity could be further enhanced by maintaining the DR deployment in a state of readiness, with all event/flow data being forwarded by the primary system to the secondary (DR) network. Then, with a small amount of manual intervention, the secondary system can take over completely.
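The small deployment described above can be written down as plain data, which makes it easy to check that the DR site mirrors the primary site. The following is only a sketch: the host names, the dictionary layout and the `forwarding_plan` helper are hypothetical, and pairing each primary processor with exactly one DR processor is an assumption rather than a QRadar requirement.

```python
# Hypothetical description of the two sites; all host names are made up.
PRIMARY = {
    "console": {"virtual": True, "ha_secondary": False},
    "processor-1": {"virtual": False, "ha_secondary": True},
    "processor-2": {"virtual": False, "ha_secondary": True},
}
DR = {
    "dr-console": {"virtual": True, "ha_secondary": False},
    "dr-processor-1": {"virtual": True, "ha_secondary": False},
    "dr-processor-2": {"virtual": True, "ha_secondary": False},
}


def forwarding_plan(primary, dr):
    """Pair each primary processor with a DR processor, in sorted name order.

    Assumes one DR processor per primary processor; raises if the counts differ.
    """
    sources = sorted(name for name in primary if "processor" in name)
    targets = sorted(name for name in dr if "processor" in name)
    if len(sources) != len(targets):
        raise ValueError("DR site needs one processor per primary processor")
    return dict(zip(sources, targets))


for source, target in forwarding_plan(PRIMARY, DR).items():
    print(f"{source} forwards event/flow data to {target}")
```

Keeping the plan as data means a mismatch between the sites, such as a DR centre with only one processor, shows up as an error before an emergency rather than during one.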