Software, Security & Disaster Management Policy
Software Release Management
Software Release Procedure
- Upon completion of development, a feature shall be committed in source control to a separate branch and deployed to the staging server for QA.
- A technical lead shall review the feature’s source code for correctness, and for compliance with the security policy outlined below. Any non-compliant feature may not proceed to QA until the security issues are resolved.
- QA personnel shall review the source code for any affected functionality. They shall then thoroughly test all aspects of the feature and all affected functionality, and perform light testing of other major features that should not be affected.
- Upon successful QA, the feature branch shall be merged into the production branch and tagged with an incremental version number.
- The feature shall be deployed to production no later than mid-week, and only using the Ansible deployment system.
- If it becomes necessary to roll back the feature due to unexpected issues, said rollback shall be approved by a technical lead and performed only using the Ansible deployment system targeted at the previous version.
- If it is necessary to deploy an emergency patch, steps 3 to 5 of the regular procedure above may be omitted after approval from a technical lead. A technical lead shall be present for the deployment and subsequent testing of the patch.
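The tagging step in the procedure above could be automated along the following lines. This is a minimal sketch only: the policy requires an "incremental version number" but does not define a tag format, so the `vMAJOR.MINOR.PATCH` pattern, the patch-level bump, the `v0.0.1` starting value and the `next_version_tag` helper are all assumptions made for illustration.

```python
import re

# Assumed tag format (not specified by the policy): v<major>.<minor>.<patch>
TAG_PATTERN = re.compile(r"^v(\d+)\.(\d+)\.(\d+)$")


def next_version_tag(existing_tags):
    """Return the tag that follows the highest existing version tag.

    Tags that do not match the assumed format are ignored. The patch
    component is incremented; major/minor bumps are left to a human.
    """
    versions = []
    for tag in existing_tags:
        match = TAG_PATTERN.match(tag)
        if match:
            versions.append(tuple(int(part) for part in match.groups()))
    if not versions:
        return "v0.0.1"  # hypothetical starting point when no tags exist
    major, minor, patch = max(versions)  # numeric comparison, so 10 > 9
    return f"v{major}.{minor}.{patch + 1}"
```

Comparing versions as integer tuples rather than strings avoids the common mistake of ranking `v1.2.9` above `v1.2.10`.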
Backup & Disaster Recovery
Automated Backups
- An automated backup of the database to an offsite backup server shall be performed. Backup frequency is daily. Backup retention is 1 week of daily backups and 2 months of monthly backups.
- An automated backup of all source code, including full history, to an offsite backup server shall be performed. Backup frequency is daily.
- Once per quarter a technical lead shall check both database and source code backup systems for correct operation and consistency.
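The database retention rule above can be expressed as a small pruning check. This is a sketch under stated assumptions, not the backup system's actual logic: it assumes one backup per day, treats the backup taken on the 1st of each month as the "monthly" backup, and approximates "2 months" as 62 days. The `backups_to_keep` helper and both constants are hypothetical names.

```python
from datetime import date, timedelta

DAILY_RETENTION_DAYS = 7     # "1 week of daily backups"
MONTHLY_RETENTION_DAYS = 62  # assumption: "2 months" approximated as 62 days


def backups_to_keep(backup_dates, today):
    """Return the set of backup dates that the retention rule preserves."""
    keep = set()
    for backup_date in backup_dates:
        age_days = (today - backup_date).days
        if age_days < 0:
            continue  # ignore dates in the future
        if age_days < DAILY_RETENTION_DAYS:
            keep.add(backup_date)  # within the last week of dailies
        elif backup_date.day == 1 and age_days <= MONTHLY_RETENTION_DAYS:
            keep.add(backup_date)  # assumed monthly backup (1st of month)
    return keep


if __name__ == "__main__":
    today = date(2024, 3, 15)
    all_backups = [date(2024, 1, 1) + timedelta(days=n) for n in range(75)]
    for kept in sorted(backups_to_keep(all_backups, today)):
        print(kept.isoformat())
```

Anything not returned by `backups_to_keep` would be a candidate for deletion; the quarterly check by a technical lead could compare this expected set against what is actually present on the backup server.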
Disaster Recovery - Servers & Code
Upon a total failure of one or more servers, the following procedure shall be observed:
- Latest source code shall be reloaded from backup if necessary.
- New server(s) shall be provisioned at the appropriate cloud server providers as needed.
- For “central” servers, the new servers shall be added to the firewall & security policy for central servers to prevent unauthorized access.
- Ansible server inventories shall be updated to cover the new server IPs.
- A complete deployment to those servers shall be run using the Ansible deployment system.
- Elastic IP, DNS and/or database entry changes shall be applied to point the system to the new servers as necessary.
- A technical lead shall verify the system is back to normal operation.
- The failed servers shall be taken permanently offline.
Disaster Recovery - Database
- A new RDS instance shall be created via AWS if necessary.
- The latest backup copy of the database shall be loaded to the new RDS instance from the backup server.
- Redundancy shall be enabled on the new RDS instance.
- Software settings for all central servers shall be updated to point to the new RDS instance.
- A technical lead shall verify the database is correctly loaded & all required settings are in place.
- Updated software settings shall be pushed to source code control via the emergency procedure outlined in “Software Release Management”.
- A technical lead shall deploy the new settings to all central servers using the Ansible deployment system.
Personal Information
Personal Information Storage
- Any personal data collected shall be stored only on the database server, which is part of the central server set hosted on AWS in a USA region.
- Only the following personal data may be collected:
  - Names, email addresses and passwords of users entitled to use the system.
  - Names, email addresses and phone numbers of people to be contacted in case of check failures.
- The following additional data may also be collected:
  - Integration API keys for any 3rd party integrations supported by Uptime.com.
  - Domain names, URLs, basic auth logins and other impersonal data required to perform monitoring checks as part of the Uptime.com service.
- Credit card details shall NOT be stored by the system in any way, and must be transmitted to and handled directly by a PCI compliant payment provider per their recommended usage guidelines.
Personal Information Security
- Any transmission of personal information shall be encrypted using industry-standard SSL/TLS communication with valid certificates.
- Passwords for user logins shall be stored in the database only as industry-standard one-way salted hashes, never in plain text or in a reversible form.
- All servers that store or process personal information (database servers, web servers and background processing servers) shall be behind an IP-restricted firewall accessible only to other such servers and authorized developers/system administrators of Uptime.com. The only exceptions are ports 80 and 443 being open on web servers.
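As an illustration of the salted one-way hashing requirement above, the following sketch uses only the Python standard library. The policy does not name an algorithm, so PBKDF2-HMAC-SHA256, the 16-byte salt, the iteration count and the `hash_password` / `verify_password` helpers are assumptions; a production system would more likely rely on its web framework's built-in password hashing.

```python
import hashlib
import hmac
import secrets

# Assumed work factor; should be tuned to current guidance and hardware.
PBKDF2_ITERATIONS = 600_000
SALT_BYTES = 16


def hash_password(password, salt=None):
    """Return (salt, digest) for a password using PBKDF2-HMAC-SHA256.

    A fresh random salt is generated unless one is supplied, so two users
    with the same password end up with different stored digests.
    """
    if salt is None:
        salt = secrets.token_bytes(SALT_BYTES)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)
```

Only the salt and digest would be stored. The original password cannot be recovered from them, which is what distinguishes hashing from encryption.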
Security Management
Security Policy
- All communication between users and the Uptime.com system, and between the system and 3rd party providers, shall be encrypted using industry-standard SSL/TLS communication with valid certificates.
- Access to core system servers and components shall be done using industry-standard SSH encrypted communication, and shall be protected by an IP-restricted firewall open to authorized developers/system administrators of Uptime.com only.
- All internal and external passwords shall be random strings of at least 16 characters including a mix of lowercase letters, uppercase letters and numerals.
- Any employee leaving Uptime.com shall immediately have their public keys revoked from all firewalls and servers in an automated fashion using the Ansible deployment system. Any passwords that were accessible to them shall be changed.
- Technical leads shall stay abreast of the latest developments and industry standards in web security. Technical leads shall advise the directors if and when external resources (such as courses or expert consultants) are required to maintain in-house knowledge and system security at an acceptable level.
- Twice annually the technical leads shall conduct a comprehensive vulnerability assessment of the system using the procedure outlined below. Any security issues that arise from this assessment shall be added to the near-term development roadmap and resolved within 3 months, or as soon as possible for critical issues.
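The password requirement in the policy above (random strings of at least 16 characters mixing lowercase letters, uppercase letters and numerals) can be met with a generator like the one sketched here. The `generate_password` helper is a hypothetical name; the sketch assumes the standard-library `secrets` module is an acceptable source of randomness.

```python
import secrets
import string

MIN_LENGTH = 16  # minimum length required by the policy
ALPHABET = string.ascii_lowercase + string.ascii_uppercase + string.digits


def generate_password(length=MIN_LENGTH):
    """Return a random password containing lowercase, uppercase and digits."""
    if length < MIN_LENGTH:
        raise ValueError(f"policy requires at least {MIN_LENGTH} characters")
    while True:
        candidate = "".join(secrets.choice(ALPHABET) for _ in range(length))
        # Retry in the rare case that one character class is missing.
        if (
            any(c.islower() for c in candidate)
            and any(c.isupper() for c in candidate)
            and any(c.isdigit() for c in candidate)
        ):
            return candidate


if __name__ == "__main__":
    print(generate_password())
```

The `secrets` module is used instead of `random` because it draws from the operating system's cryptographically secure random source.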
Vulnerability Assessment Procedure
- Review & update high level infrastructure documentation.
- Identify & document all points of communication between internal servers, and with external services.
- For each of the above, check communication protocol and method for compliance with industry-standard security procedures.
- For each of the above, verify server allows access to the necessary clients only.
- For each of the above, consider potential exploits based on recent developments in web security.
- Review firewall settings for IP-restrictions to authorized Uptime.com developers/system administrators. Verify all “central” servers are fully hidden behind the firewall.
- Run port scanning to ensure only HTTP/HTTPS is visible to the outside.
- Verify that all OS security updates have been applied to all servers.
- Verify that all security updates have been applied to security-related applications and libraries on all servers (SSL, SSH, Nginx, gunicorn).
- Document any potential security issues for development & resolution in subsequent development cycles.
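The port scanning step of the procedure above could be approximated with a simple TCP connect check such as the following. This is a rough sketch, not a replacement for a dedicated scanner: it covers IPv4 TCP only, the `open_ports` and `unexpected_ports` helpers are hypothetical, and the allowed set of 80 and 443 is taken from the web server exception stated in "Personal Information Security".

```python
import socket

ALLOWED_PUBLIC_PORTS = {80, 443}  # HTTP/HTTPS, per the web server exception


def open_ports(host, ports, timeout=1.0):
    """Return the subset of `ports` that accept a TCP connection on `host`."""
    found = set()
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success instead of raising an error.
            if sock.connect_ex((host, port)) == 0:
                found.add(port)
    return found


def unexpected_ports(host, ports, allowed=ALLOWED_PUBLIC_PORTS):
    """Return open ports on `host` that are not in the allowed set."""
    return open_ports(host, ports) - set(allowed)
```

For the result to be meaningful, the check would need to run from a machine outside the firewall's list of authorized IPs; any port reported by `unexpected_ports` would then be documented as a potential security issue.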
Security Incident Response Procedure
- Technical and business leads shall be notified as soon as any security breach has been detected.
- The affected areas of the system shall be determined and either blocked or shut down to mitigate the incident as soon as possible.
- All logs from the affected servers shall be saved offsite, and the cause of the security breach shall be determined.
- A short-term fix shall be approved by the technical leads and put into place to prevent an immediate recurrence of the breach.
- Affected systems shall be restarted.
- Detailed diagnosis of the breach and development of a permanent solution shall be added to the near-term development pipeline at high priority.