7 critical security requirements for modern IoT troubleshooting systems

Introduction

Embedded and IoT devices are getting increasingly complex and connected. These devices occasionally fail due to hardware or software issues; when that happens, remotely diagnosing and fixing issues is critical to avoid the exploding costs of support.

When considering how to remotely access and troubleshoot devices in the field, the security aspects of remote, over-the-network, access to the device and its data must also be addressed. This white paper explores the requirements and industry best practices for secure remote troubleshooting.

 

01 Authentication and attack vectors

Correctly verifying the identity of a user is important to prevent an attacker from impersonating someone else with access to the system. Connected servers and devices frequently receive malicious login attempts from attackers. These brute-force attacks often target Secure Shell (SSH), Virtual Network Computing (VNC), teletype network (telnet), and remote desktop [1] by connecting to the server listening on the appropriate port, and testing weak or known credentials such as admin:admin, user:password, and pi:raspberry, among others.

SSH, VNC, telnet, and remote desktop are all server software, meaning they listen on a network interface and port and accept incoming connections. To prevent brute-force attacks, default credentials must be avoided, as must short, weak, and easily guessable passwords. Telnet is widely considered insecure (lacking encryption and basic security features) and should not be used at all. As a standard practice, any remote access software not in use should be disabled or uninstalled to prevent accidental use and abuse.

In general, key-based or token-based authentication is more secure than usernames and passwords. Both of these approaches use secrets that are long and practically impossible to guess. Key-based authentication (using public key cryptography) has the added benefit that an exposed public key is not an issue. Only the private key needs to be kept private, and it can typically be kept in just one place; it is the public key that needs to be distributed in multiple places. Randomly generated tokens or passwords (or their salted hashes) are shared secrets that must be kept confidential on both sides (the user or server and remote device). If using passwords for authentication, requiring a second factor, such as a time-based one-time password (TOTP) app or security key, is recommended.
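The second factor mentioned above can be illustrated with a minimal sketch of a time-based one-time password, following the published algorithm in RFC 6238 (which builds on RFC 4226). The function names, the 30-second step, and the one-step tolerance window are assumptions chosen for illustration; in line with the advice on cryptographic libraries, a production system should rely on a well-tested implementation rather than this sketch.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226) using HMAC-SHA-1."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # low 4 bits of the last byte select a 4-byte window
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, now: Optional[float] = None,
         step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password (RFC 6238): HOTP over the current time step."""
    current = time.time() if now is None else now
    return hotp(secret, int(current // step), digits)


def verify_totp(secret: bytes, candidate: str, now: Optional[float] = None,
                step: int = 30, window: int = 1) -> bool:
    """Accept codes for the current step and `window` steps on either side."""
    current = time.time() if now is None else now
    counter = int(current // step)
    matched = False
    for drift in range(-window, window + 1):
        if counter + drift < 0:
            continue  # no time steps exist before the epoch
        # Constant-time comparison avoids leaking how many digits matched.
        if hmac.compare_digest(hotp(secret, counter + drift), candidate):
            matched = True
    return matched
```

The shared secret here is exactly the kind of value that must be kept confidential on both sides, which is why the code is only ever used alongside a password or key and never instead of one.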

Centralized authentication – such as single sign-on (SSO) or Security Assertion Markup Language (SAML) – provides the ability to effectively control and revoke access to systems. Centralized authentication can also help in consistently enforcing strong authentication requirements, such as password length, two-factor authentication (2FA), and others, across different services and systems. This approach also reduces the risk of users reusing the same passwords in multiple places, which exposes devices or systems to credential stuffing attacks.

Another approach is to have the software agent running on the edge device not accept incoming connections but instead reach out and initiate a connection. The benefit of this approach is that it limits the attack surface, making it significantly more difficult for an adversary. In this setup, an attacker can no longer just connect and try passwords; they instead need to compromise the network to perform a man-in-the-middle attack or coerce the device to connect and communicate to their malicious server instead of the authentic one.
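A minimal sketch of this outbound-only pattern, using Python's standard ssl and socket modules, might look as follows. The function names and the optional private CA file are assumptions for illustration. The point is that the device opens no listening port, and that certificate and hostname verification stay enabled so the device cannot easily be coerced into talking to a malicious server.

```python
import socket
import ssl
from typing import Optional


def build_client_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Create a TLS client context that verifies the server's identity.

    `ca_file` is a hypothetical path to a private CA bundle; when omitted,
    the system's default trust store is used.
    """
    context = ssl.create_default_context(cafile=ca_file)
    # create_default_context already requires a valid certificate and a
    # matching hostname; only the minimum protocol version is tightened here.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def connect_to_server(host: str, port: int, context: ssl.SSLContext,
                      timeout: float = 10.0) -> ssl.SSLSocket:
    """Open an outbound TLS connection from the device to its server."""
    raw = socket.create_connection((host, port), timeout=timeout)
    try:
        # server_hostname enables SNI and hostname checking against the cert.
        return context.wrap_socket(raw, server_hostname=host)
    except Exception:
        raw.close()
        raise
```

An agent built this way would call connect_to_server on a schedule or keep one long-lived connection open, and receive troubleshooting requests over that channel rather than by accepting incoming connections.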

For both of these approaches, it is also a good idea to limit the network access of the device – both for incoming and outgoing network traffic. Limiting network access is commonly achieved through some combination of a virtual private network (VPN) and a firewall. A zero-trust, deny-by-default approach is generally recommended, meaning only allowing network traffic that is known to be necessary and blocking everything else.
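The deny-by-default idea can be sketched as a small allow-list check. This is an illustration of the policy logic only, not a firewall: real enforcement would happen in the operating system's firewall or the network. The rule structure, addresses (taken from documentation-reserved ranges), and ports are all hypothetical.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Iterable


@dataclass(frozen=True)
class AllowRule:
    direction: str        # "in" or "out"
    network: str          # remote network in CIDR notation
    port: int
    protocol: str = "tcp"


# Hypothetical policy: the device may only reach its management server
# over HTTPS and a time server over NTP. Nothing inbound is allowed.
RULES = (
    AllowRule("out", "203.0.113.10/32", 443),
    AllowRule("out", "192.0.2.0/24", 123, "udp"),
)


def is_allowed(direction: str, remote_ip: str, port: int,
               protocol: str = "tcp",
               rules: Iterable[AllowRule] = RULES) -> bool:
    """Return True only if some rule explicitly allows the traffic."""
    address = ip_address(remote_ip)
    return any(
        rule.direction == direction
        and rule.protocol == protocol
        and rule.port == port
        and address in ip_network(rule.network)
        for rule in rules
    )
```

Because the function returns False whenever no rule matches, forgetting to add a rule results in blocked traffic rather than unintended exposure.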

In all cases, authentication and cryptography should be done using well-studied and widely tested algorithms and libraries. Implementing cryptographic methods that are actually secure is difficult and mistakes can be easily made. To ensure the security of IoT products, it is best to rely on software written by trusted subject matter experts with years of experience and dedication to making authentication as secure as possible.

02 Encryption

In order to ensure the confidentiality of intellectual property, secrets, user data, and day-to-day operations, all information should be stored and transmitted encrypted. Confidentiality can be achieved by ensuring all network communication happens over Transport Layer Security (TLS) and that all devices use disk encryption, including edge devices, cloud servers, and user laptops. For encryption to work effectively, it is crucial to safeguard the encryption keys. Proper key management routines [2], limitations on how and where keys are used, using secret management solutions, and security hardware [3] can help with this.

With hardware security, cryptographic keys are stored in a way that makes them easy to use for signing and encryption but almost impossible to copy or extract. As a result, it is very difficult for an attacker to steal the keys and impersonate the device, so there can be a high level of trust in the signatures and encryption performed with these keys.

As with authentication, it is strongly recommended that widely available, well-tested, and trusted cryptographic algorithms and libraries are leveraged instead of relying on a bespoke or alternative solution.

03 Access control

Once strong authentication, network protections, and encryption are established, the weakest link becomes human users. In a system with many users, it is reasonable to assume one of them will eventually be compromised. Attacks such as social engineering, phishing, and the use of exposed credentials commonly target the human user as the weakness and system entry point.

The principle of least privilege [4] is used to limit access within software, systems, and services. At its core, it means that access should be limited as much as possible – only the people who need access should have it, and they should only have the level of access they need. The principle of least privilege provides two main benefits: limiting the access and capabilities of an attacker using a compromised account and preventing legitimate users from accidentally making damaging changes.

Access should be separated into at least three levels:

  • Level 1 – Read (Default): View basic information about devices and groups of devices.
  • Level 2 – Write: The level of access needed to make changes. For remote troubleshooting, users can log into devices and make changes. However, access is still restricted: users cannot make arbitrary changes and are only granted access to make the changes typically needed by, for example, engineers or support staff.
  • Level 3 – Admin: The highest level of access should only be given to a limited number of trusted users. This includes access to significant and potentially disruptive changes to the system and functionality, which is not needed by engineers or support staff on a day-to-day basis, such as:
    1. Managing other users and roles
    2. Managing integrations
    3. Global settings and defaults
    4. Access to view or manage secrets
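The three levels above can be sketched as an ordered enumeration with a lookup of the minimum level each action requires. The action names and their mapping are hypothetical and would differ between products. Unknown actions are denied, in keeping with the principle of least privilege.

```python
from enum import IntEnum


class AccessLevel(IntEnum):
    READ = 1    # view basic information about devices and groups
    WRITE = 2   # log into devices and make routine changes
    ADMIN = 3   # users, integrations, global settings, secrets


# Hypothetical mapping from action name to the minimum level it requires.
REQUIRED_LEVEL = {
    "view_device": AccessLevel.READ,
    "open_remote_terminal": AccessLevel.WRITE,
    "manage_users": AccessLevel.ADMIN,
    "manage_integrations": AccessLevel.ADMIN,
    "change_global_settings": AccessLevel.ADMIN,
    "view_secrets": AccessLevel.ADMIN,
}


def is_permitted(user_level: AccessLevel, action: str) -> bool:
    """Allow an action only if the user's level meets the requirement."""
    required = REQUIRED_LEVEL.get(action)
    if required is None:
        return False  # actions missing from the mapping are denied
    return user_level >= required
```

With this layout, a support engineer at the WRITE level can open a remote terminal but cannot view secrets or manage other users.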

Splitting access according to this three-way separation, at a minimum, is a standard practice. However, it is easy to see the benefits of splitting access up further:

  • Dividing read access: Read access could be split into basic read access and access to some more valuable information, such as security or compliance related logs.
  • Dividing write access: Similarly, write access for running terminal commands could distinguish between a shell user with limited access versus a root or sudo user with access to everything on the device.
  • Partial admin access: Some admin actions could be split into separate roles, especially if they are needed more frequently or executed by people who don't need access to the other admin-level commands, such as adding users.

Beyond this, limiting which devices a user or role can access is also beneficial for larger organizations with many devices across different teams and departments. Users can be grouped based on geographical location (such as facility, state, country, or continent), specific hardware, product line, responsible team, or different classes of devices (such as production, test, and development prototypes).
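One way to express this kind of scoping is to label devices and have each role name the labels a device must carry before the role may access it. The following is a sketch under that assumption; the label format, class names, and example values are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Device:
    device_id: str
    labels: FrozenSet[str]  # e.g. {"region:eu", "class:production"}


@dataclass(frozen=True)
class Role:
    name: str
    required_labels: FrozenSet[str]  # a device must carry all of these


def can_access(role: Role, device: Device) -> bool:
    """A role may access a device only if the device has every required label."""
    return role.required_labels <= device.labels


def visible_devices(role: Role, devices):
    """Return the IDs of the devices a role is scoped to, in sorted order."""
    return sorted(d.device_id for d in devices if can_access(role, d))
```

Note that in this sketch a role with no required labels matches every device, so such a role should be reserved for the small group of trusted administrators.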

Additionally, it is important to have routines and systems for updating and revoking access, for example, when an employee leaves the company. If the employee had access to secrets (such as passwords, tokens, and keys), these should also be revoked or rotated. If all communication with devices, or at least the authentication step, goes through a centralized system, it makes it very easy to lock out users, for example, during a security incident or after the employee leaves the company.

Setting up access, roles, and users requires some planning and forethought. The setup should reflect what makes sense for the specific environment while ensuring the final state is efficient and easy to maintain. The goal should be to minimize both operational overhead and the risk of human error in access management.

04 Auditability and monitoring

When users log into devices, their commands should be logged. Similarly, changes to access, users, and groups, among other things, should also be logged. Whenever a user changes something, at a minimum, log when the change happened, who performed it, and what they changed or which command they ran.
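As a minimal sketch of such a record, the following captures when, who, and what as one JSON line per event. The field names are assumptions for illustration; in practice the line would be shipped to a central log store rather than kept on the device.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class AuditEvent:
    timestamp: str   # when the change happened (ISO 8601, UTC)
    user: str        # who performed it
    device_id: str   # where it was performed
    action: str      # kind of change, e.g. "command" or "role_change"
    detail: str      # what was changed or which command was run


def make_event(user: str, device_id: str, action: str, detail: str,
               now: Optional[datetime] = None) -> AuditEvent:
    """Build an audit event stamped with the current UTC time."""
    moment = now if now is not None else datetime.now(timezone.utc)
    return AuditEvent(moment.isoformat(), user, device_id, action, detail)


def to_log_line(event: AuditEvent) -> str:
    """Serialize an event as a single JSON line with stable key order."""
    return json.dumps(asdict(event), sort_keys=True)
```

Keeping one structured line per event makes it straightforward to filter by user, device, or time range during an audit.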

In the case of a security incident or audit, the audit log allows you to review precisely what actions happened leading up to the event. These logs should not be stored on the device, and they should be hard or close to impossible for an attacker to delete.

Remote access systems are a high-value target for an adversary; a compromised system could give them remote access to run commands on the entire device fleet. It is therefore also critical to monitor for drastic changes in usage patterns and other suspicious activity, or to use other intrusion detection methods, so that you are alerted if the system is being used maliciously.
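A very simple form of usage-pattern monitoring compares the current activity level with a historical baseline. The sketch below flags a value that deviates from the mean of past observations by more than a chosen number of standard deviations. The threshold, the minimum sample count, and the choice of metric (for example, remote sessions opened per day) are assumptions; real intrusion detection would combine several signals.

```python
from statistics import mean, pstdev
from typing import Sequence


def is_unusual(history: Sequence[float], current: float,
               threshold: float = 3.0, min_samples: int = 7) -> bool:
    """Flag `current` if it deviates strongly from the historical baseline."""
    if len(history) < min_samples:
        return False  # too little data to judge; do not raise an alert
    baseline = mean(history)
    spread = pstdev(history)
    if spread == 0:
        # Perfectly constant history: any change at all is a deviation.
        return current != baseline
    return abs(current - baseline) / spread > threshold
```

A check like this could run once per day per user or per device group, with flagged values sent to whoever handles security alerts.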

05 Configuration and maintenance complexity

A system implementing all of these features – such as centralized authentication, encryption, logging, and monitoring – will rely on a combination of different libraries, components, or products.

Security planning should account for maintenance complexity, especially for custom code or configuration files that must be written, edited, or maintained, as well as separate systems or user interfaces (UIs) that require manual work.

Beyond the initial investment, system usage – requirements to add users, user training and adoption, machine installation or configuration requirements, etc. – should also be considered. If any parts of using the system are unintuitive or require clarification, these things should be documented for users to access easily.

Managing the complexity of ongoing maintenance and configuration requirements is central to a strong security posture. Even if a system is initially set up securely, lapses in its maintenance or configuration errors create vulnerabilities an adversary could exploit.

 

06 Routines, assessments, and penetration tests

Dedicated resources are required to verify and improve the security of your product, the remote access solution, or networked software in general. There are many options to assess security, using both internal and external resources.

It can be beneficial to work with security researchers to perform realistic penetration tests where they attempt to break into the system or device as an attacker would. Internal engineers can also test the software, use scanning tools, or analyze the source code to find vulnerabilities. Bug bounty programs invite outside individuals to search for exploitable weaknesses, and companies pay only for confirmed vulnerabilities. A security-focused organization will want to rely on a combination of these and other approaches to ensure its products and systems are not easily breached.

07 Security updates

Any software of significant size has bugs and vulnerabilities; software authors or maintainers release new versions to fix these issues. Ensuring software is on the latest version is a central component of security. If a device is stuck on an older version, it will be exposed to bugs (which have already been fixed in newer versions) and vulnerabilities (which have been publicly announced). Older or outdated versions will, in some cases, be easy to exploit.
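Knowing which devices are stuck on an older version is a precondition for updating them. The following sketch lists devices whose installed version is behind the latest release. It assumes purely numeric, dot-separated version strings and a hypothetical inventory mapping of device ID to installed version; real version schemes with suffixes such as "-rc1" would need a proper version parser.

```python
from typing import Dict, List, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn "1.10.2" into (1, 10, 2) so that versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def outdated_devices(installed: Dict[str, str], latest: str) -> List[str]:
    """Return the IDs of devices running a version older than `latest`."""
    target = parse_version(latest)
    return sorted(
        device_id
        for device_id, version in installed.items()
        if parse_version(version) < target
    )
```

Comparing parsed tuples rather than raw strings matters: as text, "1.9.0" sorts after "1.10.0", which would hide an outdated device.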

The remote access software, its dependencies, the operating system, and other software on the device must be kept up to date with the latest security patches. A secure and reliable update mechanism enables frequent software updates while minimizing the risk of disruption or downtime due to an update error or unintended consequences. Most of the same IoT troubleshooting security requirements listed are also applicable to software updates.

Conclusion

It is essential to secure your IoT devices – ensuring they remain available to users who should have access and are inaccessible to attackers while preserving the integrity and confidentiality of data and communication. Developing and maintaining the solution needed to achieve your security and compliance requirements takes resources, even if using ‘free’ software tools and components. But in today’s digitally dependent and connected world, security is paramount. The potential consequences of lacking security – a successful cyberattack, data breach, device disruption, or downtime – far outweigh the cost of investing in security upfront.

To start to bolster your security, leverage a best-in-class, security-oriented, remote troubleshooting solution that encompasses and facilitates the top 7 critical security requirements. Updates, automation, remote access, configuration, monitoring, reporting, and more are best practices across security disciplines and should be embedded within individual solutions and the overall architecture supporting your products. By implementing these features and functions in a secure way, organizations can rest assured their infrastructure and devices are secure and focus resources on developing the differentiating aspects of their product.

Notes
[1] These botnets and brute-force attacks have been widely documented by security news outlets and journalists over the past 8 years, including: KrebsOnSecurity, Darktrace, and BleepingComputer. At FOSDEM 2018, an engineer from Northern.tech gave the talk "IoT botnet wars" to show how different IoT botnets worked and targeted these vulnerable devices with default credentials.

[2] The National Institute of Standards and Technology (NIST) has well-respected and widely used guidelines for cryptographic key management, which are freely available online: https://csrc.nist.gov/Projects/Key-Management/Key-Management-Guidelines

[3] By security hardware, we mean some way of securely creating and storing cryptographic keys and performing cryptographic operations without accessing the private keys directly or otherwise exposing them. Different implementations and names exist for this concept, such as Trusted Platform Modules, Hardware Security Modules, Secure Enclave, Secure Element, etc. Which technology is available to you and appropriate depends on the application and type of device in question.

[4] The principle of least privilege: https://www.cloudflare.com/en-gb/learning/access-management/principle-of-least-privilege/
