talegence Infrastructure Security and Reliability
Providing a secure, reliable environment for talegence customers is a critical component of our service. We have developed multiple layers of protection and integrated them into our processes to ensure that every user experiences the highest level of security and reliability without the protection intruding on the user’s interaction with the system. The result is a system that is extremely easy to use while remaining protected and robust.
The first layers of security involve the servers. All servers are located in state-of-the-art data centers that provide exceptional protection and reliability. We use two data centers for all servers to provide failover capability in case of a catastrophic failure of the primary center.
Access to the data centers is protected 24 hours a day. To gain entry, all employees and guests must pass through two-factor authentication barriers. A proximity security badge is required for entry and exit at all data center doors. Guards patrol the entire facility, and motion-sensitive cameras track all activity throughout the facility and on the data center floors.
Each server is behind hardware and software firewalls. The servers are locked down to allow traffic only on the ports necessary to operate talegence. File access software tracks all changes to all files. The resulting reports are examined by two people each morning to verify that only the files expected to change have been altered. Files in sensitive directories are monitored on an hourly basis. Intrusion detection software is used to detect port scanning and to automatically block potential attackers in real time.
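The file change tracking described above can be illustrated with a minimal sketch. This is not the file access software talegence uses; it only assumes that a change report can be produced by comparing content hashes taken at two points in time. The function names snapshot and diff_snapshots are hypothetical.

```python
import hashlib
from pathlib import Path


def snapshot(root):
    """Map each file under root (by relative path) to the SHA-256 digest of its contents."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            key = path.relative_to(root).as_posix()
            digests[key] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def diff_snapshots(old, new):
    """Return the files added, removed, and modified between two snapshots."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    # A file counts as modified when it exists in both snapshots with different digests.
    modified = sorted(p for p in set(old) & set(new) if old[p] != new[p])
    return {"added": added, "removed": removed, "modified": modified}
```

A reviewer would compare the resulting report against the list of files that were expected to change; anything outside that list warrants investigation.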
Multiple security notification services are monitored to ensure our underlying infrastructure software is secure. Applying security patches receives the highest priority and is usually completed within twenty-four hours of a patch being made available. If no patch is available, certain website functions may be disabled to secure the servers until a patch is provided.
Hardware and software mechanisms continuously monitor all traffic and guard against denial-of-service attacks, including distributed denial-of-service (DDoS) attacks. If a DDoS attack is detected, it is verified by one of two network operations centers. The affected traffic is then routed through a DDoS mitigation service that filters out the attack traffic and allows the legitimate traffic to continue to its original destination. When the attack has subsided, the original direct path between the destination and the rest of the Internet is restored.
All server access is recorded in internal log files that are parsed and reported each day. These reports include information on httpd access, named access, all batch jobs, known site probes, IP addresses automatically locked out for suspicious behavior, secured and unsecured logins (successful and unsuccessful) and disk utilization. Two employees review the log files of each server daily.
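As a rough illustration of how a daily login summary might be derived from such logs, the sketch below counts successful and failed logins and tallies failures per IP address. The line format "<timestamp> <ip> LOGIN <OK|FAIL>" and the function name summarize_logins are assumptions made for this example; the actual talegence log formats are not described here.

```python
from collections import Counter


def summarize_logins(lines):
    """Count login outcomes and failed attempts per IP address.

    Assumes a hypothetical line format: "<timestamp> <ip> LOGIN <OK|FAIL>".
    Lines that do not match that shape are ignored.
    """
    totals = Counter()
    failed_by_ip = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) != 4 or parts[2] != "LOGIN" or parts[3] not in ("OK", "FAIL"):
            continue
        ip, outcome = parts[1], parts[3]
        totals[outcome] += 1
        if outcome == "FAIL":
            failed_by_ip[ip] += 1
    return totals, failed_by_ip
```

Addresses with an unusually high failure count are natural candidates for the human review step described above.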
Server Power Management
Power is supplied to the data center via redundant direct connections from the local power company and fed into transfer switches. Master distribution boards funnel the power into the UPS systems, which provide power to the servers. All components between the incoming power and the servers have redundant paths that switch automatically in the event of failure.
If the power transfer switches sense a decline in power from the local utility, they automatically signal diesel-powered generators to start up and then switch the power flow to the generators. In the few seconds between the decline in utility power and the switchover to the backup generators, the UPS units draw power from a battery system so that the power supply to the servers is uninterrupted.
Should any generator fail, additional backup generators start automatically to keep power flowing. In the event of an extended power outage, the on-site diesel generators can run indefinitely. Servers are fed by multiple uninterruptible power supply (UPS) units with redundant, instant failover coverage should any single UPS unit fail.
All heating, ventilation, and air conditioning (HVAC) systems are rated N+1 redundant. At the N+1 level, the failure of any one component does not compromise overall system integrity, and the system continues to function at acceptable performance levels after the loss of that component.
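The N+1 condition can be expressed as a simple capacity check: the system must still meet its required load after losing any single unit, so the worst case is the loss of the largest unit. The sketch below is illustrative only; the function name and the capacity figures in the usage note are hypothetical, not data center specifications.

```python
def survives_single_failure(unit_capacities, required_load):
    """Return True if the remaining units can carry required_load after any one unit fails."""
    if not unit_capacities:
        return False
    # Losing the largest unit is the worst case, so checking it covers every single failure.
    remaining = sum(unit_capacities) - max(unit_capacities)
    return remaining >= required_load
```

For example, three units of 50 each satisfy a load of 100 under this check, while two units of 50 do not.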
All HVAC power is supplied through the data center power grid, guaranteeing electricity to the units even in the midst of a complete power failure. The diesel generator backups are sized to power all components in the data centers and the HVAC systems indefinitely.
Keeping the servers up and secure is only part of the process. To ensure that talegence customers have access to the system, we need network connectivity that will not fail. Four different backbone providers connect the core routers to the Internet so that neither physical failures nor corporate bankruptcies interfere with the operation of talegence.
The Internet connections feed through two levels of routers and switches to reach the servers. All routers and switches are in groups that contain redundant components. If a component fails, traffic is automatically routed to the other devices in the group and both network operations centers are notified.
Server Backups and Failover
All servers in the primary data center are duplicated in a second data center that is owned by a different provider and located in a different geographic location. The primary and secondary servers are kept in real-time synchronization to provide component-for-component (1+1) redundancy in the event of a catastrophic failure of the primary data center. This capability is tested on a quarterly basis.
All files on the primary servers are automatically backed up every day, including supporting programs and databases. The backups are stored locally on the hosting server (to facilitate quick restores) and copied to remote servers to protect against catastrophic hardware failures. Backups rotate automatically every week, giving access to the last seven days. Unlike the failover servers, which mirror the current state, this process provides snapshots of the system that serve as restore points.
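One way to picture the seven-day rotation is as a retention window: any daily backup older than the window is eligible for removal. The sketch below assumes one backup per calendar day and a hypothetical function name, backups_to_keep; the actual rotation mechanism is not specified here.

```python
from datetime import date, timedelta


def backups_to_keep(backup_dates, today, retention_days=7):
    """Return the backup dates that fall within the retention window ending today."""
    # With retention_days=7 the window covers today and the six days before it.
    oldest_kept = today - timedelta(days=retention_days - 1)
    return sorted(d for d in backup_dates if oldest_kept <= d <= today)
```

Any date not returned by the function falls outside the seven restore points and could be overwritten by the rotation.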
All servers also watch each other and automatically notify our technical staff if one server detects a problem with another server (such as running low on disk space or excessive CPU utilization).
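The peer monitoring described above amounts to comparing each server's reported metrics against limits and raising an alert when a limit is exceeded. The following is a minimal sketch under stated assumptions: the metric names, the threshold values, and the function find_problems are all hypothetical, and how metrics are collected and how staff are notified is left out.

```python
def find_problems(metrics, max_disk_used=0.90, max_cpu_load=0.85):
    """Return (server, message) alerts for peers that exceed either threshold.

    metrics maps a server name to {"disk_used": fraction, "cpu_load": fraction}.
    The thresholds are illustrative defaults, not values used by talegence.
    """
    alerts = []
    for server, m in sorted(metrics.items()):
        if m["disk_used"] > max_disk_used:
            alerts.append((server, f"disk {m['disk_used']:.0%} used"))
        if m["cpu_load"] > max_cpu_load:
            alerts.append((server, f"CPU load {m['cpu_load']:.0%}"))
    return alerts
```

Each server would run a check like this against the metrics of its peers and forward any alerts to the technical staff.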
talegence was written from the ground up with security in mind. The system was developed completely in-house, so our staff knows all of the code and each of its functions. Its design incorporates numerous security provisions, and we are always looking for new ways to increase protection. We have layered the security to provide a solid, reliable system that does not interfere with the user’s ability to navigate through the system easily and naturally.
Some of the security measures in place include:
Even the best security in the world must be checked regularly to remain effective. We have procedures in place to ensure that our security measures are fully in force. Although much of security and reliability can be automated, some tasks are still best handled by humans.