Blog posts by Mike Small
On December 11th I attended an analyst webcast from Cisco entitled “The Future of the Internet”. At this event, Cisco unveiled its plans for its next generation of networking products. While this was interesting, it did not meet my expectations for a deeper vision of the future of the internet.
The timing is interesting because 50 years ago in 1969 there were several events that were seminal to the internet. Many people will remember Apollo 11 and the moon landing – while this was an enormous achievement in its own right – it was the space race that led to the miniaturization and commercialization of the silicon-based technology upon which the internet is based.
The birth of the Internet
1969 also marked the birth of Unix. Two Bell Labs computer scientists, Ken Thompson and Dennis Ritchie, had been working on an experimental time-sharing operating system called Multics as part of a joint research group with General Electric and MIT. They decided to take the best ideas from Multics and implement them on a smaller scale – on a PDP-7 minicomputer at Bell Labs. That marked the birth of Unix, the ancestor of Linux, the most widely deployed computing platform in the cloud, as well as macOS, iOS and Android.
October 29th, 1969 was another important date: it marked the first computer-to-computer communication using a packet-switched network, ARPANET. In 1958, US President Dwight D. Eisenhower formed the Advanced Research Projects Agency (ARPA), bringing together some of the best scientific minds in the US with the aim of helping American military technology stay ahead of its enemies. Among ARPA’s projects was a remit to test the feasibility of a large-scale computer network.
In July 1961 Leonard Kleinrock at MIT published the first paper on packet switching theory and the first book on the subject in 1964. Lawrence Roberts, the first person to connect two computers, was responsible for developing computer networks at ARPA, working with scientist Leonard Kleinrock. When the first packet-switching network was developed in 1969, Kleinrock successfully used it to send messages to another site, and the ARPA Network, or ARPANET, was born—the forerunner of the internet.
These were the foundational ideas and technologies that led to the largest man-made artifact – the internet.
Cisco announces new products based on single multi-purpose ASIC
The internet today depends upon the technology provided by a range of vendors including Cisco. During the webcast, Chuck Robbins, CEO of Cisco, made the comment that he believed that 90% of the traffic on the internet flows through silicon created by Eyal Dagan, SVP silicon engineering at Cisco. Obviously, this technology is an important part of the internet infrastructure. So, what did Cisco announce that is new?
The first and most significant announcement was that Cisco has created the first single multi-purpose ASIC (computer chip) that can handle both routing and switching efficiently, with high performance and with lower power consumption. According to Cisco, this is a first, and in the past, it was generally thought that the requirements of routing and switching were so different that it was not possible for a single chip to meet all the requirements for both these tasks. Why does this matter?
The current generation of network appliances is built with different chips for different jobs, which means that multiple software stacks are needed to manage the different hardware combinations. A single chip means a single software stack which is smaller, more efficient and easier to maintain. Their new network operating system IOS XR7 implements this and is claimed to be simpler, to support modern APIs and to be more trustworthy. Trustworthiness is ensured through a single hardware root of trust.
One of the problems that network operators have when deploying new hardware is testing to ensure that the service is maintained after the change. Cisco also announced a new cloud service that helps them with this problem. The end user uploads their current configuration to the service which generates tests for the new platform and greatly speeds up testing as well as providing more assurance that the service can be cut over without problems.
Cisco delivers infrastructure but is that enough?
It is easy to forget, when you click on your phone, just how many technical problems had to be overcome to provide the seamless connectivity that we all now take for granted. The vision, creativity, and investment by Cisco over the five years that it took for this development are to be applauded. It is excellent news to hear that the next generation of infrastructure components that underlie the internet will provide more bandwidth, better connectivity and use less energy. However, this infrastructure does not define the future of the internet.
The internet has created enormous opportunities but has also brought significant challenges. It has provided opportunities for new kinds of business but also enabled the scourge of cybercrime by providing an unpoliced hiding place for cybercriminals. It has opened windows across the world to allow people from all cultures to connect with each other but has also provided a platform for cyberbullying, fake news and political interference. It is enabling new technologies such as artificial intelligence which have the potential to do great good but also raise many ethical concerns.
Mankind has created the internet, with the help of technology from companies like Cisco. The internet is available to mankind, but can mankind master its creation?
Artificial Intelligence is a hot topic and many organizations are now starting to exploit these technologies. At the same time, there are many concerns around the impact this will have on society. Governance sets the framework within which organizations conduct their business in a way that manages risk and compliance and ensures an ethical approach. AI has the potential to improve governance and reduce costs, but it also creates challenges that need to be governed.
The concept of AI is not new, but cloud computing has provided the access to data and the computing power needed to turn it into a practical reality. However, while there are some legitimate concerns, the current state of AI is still a long way from the science fiction portrayal of a threat to humanity. Machine Learning technologies provide significantly improved capabilities to analyze large amounts of data in a wide range of forms. While this poses a threat of “Big Other” it also makes them especially suitable for spotting patterns and anomalies and hence potentially useful for detecting fraudulent activity, security breaches and non-compliance.
AI covers a range of capabilities, including ML (machine learning), RPA (Robotic Process Automation), NLP (Natural Language Processing) amongst others. But AI tools are simply mathematical processes which come in a wide variety of forms and have relative strengths and weaknesses.
ML (Machine Learning) is based on artificial neural networks, inspired by the way in which animal brains work. These use networks of machine learning algorithms which can be trained to perform tasks using data as examples and without needing any preprogrammed rules.
For ML training to be effective it needs large amounts of data and acquiring this data can be problematic. The data may need to be obtained from third parties and this can raise issues around privacy and transparency. The data may contain unexpected biases and, in any case, needs to be tagged or classified with the expected results which can take significant effort.
One major vendor successfully applied this to detect and protect against identity-led attacks. This was not a trivial project and took 12 people over 4 years to complete. However, the results were worth the cost since this is now much more effective than the hand-crafted rules that were previously used. It is also capable of automatically adapting to new threats as they emerge.
So how can this technology be applied to governance? Organizations are faced with a tidal wave of regulation and need to cope with the vast amount of data that is now regularly collected for compliance. The current state of AI technologies makes them very suitable to meet these challenges. ML can be used to identify abnormal patterns in event data and detect cyber threats while in progress. The same approach can help to analyze the large volumes of data collected to determine the effectiveness of compliance controls. Its ability to process textual data makes it practical to process regulatory texts to extract the obligations and compare these with the current controls. It can also process textbooks, manuals, social media and threat sharing sources to relate event data to threats.
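To make the idea of spotting abnormal patterns in event data concrete, here is a minimal sketch. It is an assumption for illustration only: it uses a simple statistical baseline (a z-score) rather than a trained ML model, and the function name, the threshold and the login counts are all hypothetical.

```python
from statistics import mean, stdev


def find_anomalies(baseline, observed, threshold=3.0):
    """Return (index, value) pairs in `observed` that deviate from the baseline.

    A value is flagged when it lies more than `threshold` standard
    deviations away from the mean of the baseline data.
    """
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any different value counts as abnormal.
        return [(i, v) for i, v in enumerate(observed) if v != mu]
    return [(i, v) for i, v in enumerate(observed)
            if abs(v - mu) / sigma > threshold]


# Hypothetical event data: failed logins per hour for one account.
baseline_logins = [2, 3, 1, 2, 4, 3, 2, 3, 2, 1]
todays_logins = [2, 3, 40, 1]

anomalies = find_anomalies(baseline_logins, todays_logins)
print(anomalies)
```

A production system would learn what is normal from far richer data than a single counter, but the principle is the same: establish a picture of normal behaviour and flag events that fall well outside it.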
However, the system needs to be trained by regulatory professionals to recognize the obligations in regulatory texts and to extract these into a common form that can be compared with the existing obligations documented in internal systems to identify where there is a match. It also needs training to discover existing internal controls that may be relevant or, where there are no controls, to advise on what is needed.
Linked with a conventional GRC system, this can augment the existing capabilities and help to consolidate new and existing regulatory requirements into a central repository used to classify complex regulations and help stakeholders across the organization to process large volumes of regulatory data. It can help to map regulatory requirements to internal taxonomies, business structures and basic GRC data, thus connecting regulatory data to key risks, controls and policies, and linking that data to an overall business strategy.
Governance also needs to address the ethical challenges that come with the use of AI technologies. These include unintentional bias, the need for explanation, avoiding misuse of personal data and protecting personal privacy as well as vulnerabilities that could be exploited to attack the system.
Bias is a very current issue, with bias related to gender and race as top concerns. Training depends upon the data used and many datasets contain an inherent if unintentional bias. For example, see the 2018 paper Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification. There are also subtle differences between human cultures, and it is very difficult for humans to develop AI systems to be culturally neutral. Great care is needed in this area.
Explanation – In many applications, it may be very important to provide an explanation for conclusions reached and actions taken. Rule-based systems can provide this to some extent but ML systems, in general, are poor at this. Where explanation is important some form of human oversight is needed.
One of the driving factors in the development of ML is the vast amount of data that is now available, and organizations would like to get maximum value from this. Conventional analysis techniques are very labor-intensive, and ML provides a potential solution to get more from the data with less effort. However, organizations need to beware of breaching public trust by using personal data, that may have been legitimately collected, in ways for which they have not obtained informed consent. Indeed, this is part of the wider issue of surveillance capitalism - Big Other: Surveillance Capitalism and the Prospects of an Information Civilization.
ML systems, unlike humans, do not use understanding; they simply match patterns. This makes them open to attacks using inputs that are invisible to humans. Recent examples of vulnerabilities reported include one where the autopilot of a Tesla car was tricked into changing lanes into oncoming traffic by stickers placed on the road. A wider review of this challenge is reported in A survey of practical adversarial example attacks.
In conclusion, AI technologies, and ML in particular, provide the potential to assist governance by reducing the costs associated with onboarding new regulations, managing controls and processing compliance data. The exploitation of AI within organizations needs to be well governed to ensure that it is applied ethically and to avoid unintentional bias and misuse of personal data. The ideal areas for the application of ML are those with a limited scope and where explanation is not important.
For more information attend KuppingerCole’s AImpact Summit 2019.
Getting competitive advantage from data is not a new idea. However, the volume of data now available and the way in which it is being collected and analysed have led to increasing concerns. As a result, there are a growing number of regulations over its collection, processing and use. Organizations need to take care to ensure compliance with these regulations as well as to secure the data they use. This is increasing costs and organizations need a more sustainable approach.
In the past organizations could be confident that their data was held internally and was under their own control. Today this is no longer true: internally generated data is held in cloud services, and the Internet of Things and Social Media provide rich external sources of valuable data. These innovations have increased the opportunities for organizations to use this data to create new products, get closer to their customers and improve efficiency. However, this data needs to be controlled to ensure security and to comply with regulatory obligations.
The recent EU GDPR (General Data Protection Regulation) provides a good example of the challenges organizations face. Under GDPR the definition of data processing is very broad: it covers any operation that is performed on personal data or on sets of personal data. It includes everything from the initial collection of personal data through to its final deletion. Processing covers every operation on personal data including storage, alteration, retrieval, use, transmission, dissemination or otherwise making available. The sanctions for non-compliance are very severe and, critically, the burden of proof to demonstrate compliance with these principles lies with the Data Controller.
There are several important challenges that need to be addressed – these include:
- Governance – externally sourced and unstructured data may be poorly governed, with no clear objectives for its use;
- Ownership – external and unstructured data may have no clear owner to classify its sensitivity and control its lifecycle;
- Sharing – individuals can easily share unstructured data using email, SharePoint and cloud services;
- Development use – developers can use and share regulated data for development and test purposes in ways that may breach the regulatory obligations;
- Data Analytics – the permissions to use externally acquired data may be unclear and data scientists may use legitimately acquired data in ways that may breach obligations.
To meet these challenges in a sustainable way, organizations need to adopt Good Information Stewardship. This is a subject that KuppingerCole have written about in the past. Good Information Stewardship takes an information-centric, rather than technology-centric, approach to security, and information-centric security starts with good data governance.
Figure: Sustainable Data Management
Access controls are fundamental to ensuring that data is only accessed and used in ways that are authorized. However, additional kinds of controls are needed to protect against illegitimate access, since cyber adversaries now routinely target access credentials. Further controls are also needed to ensure the privacy of personal data when it is processed, shared or held in cloud services. Encryption, tokenization, anonymization and pseudonymization technologies provide such “data centric” controls.
Encryption protects data against unauthorized access but is only as strong as the control over the encryption keys. Rights Management using public / private key encryption is an important control that allows controlled sharing of unstructured data. DLP (Data Leak Prevention) technology and CASBs (Cloud Access Security Brokers) are also useful to help to control the movement of regulated data outside of the organization.
Pseudonymization is especially important because it is recognized by GDPR as an approach to data protection by design and by default. GDPR treats properly anonymized data as outside the scope of its obligations, while pseudonymized data remains personal data but carries a reduced risk to the data subjects. Pseudonymization also allows operational data to be processed while it is still protected. This is very useful since pseudonymized personal data can therefore be used for development and test purposes, as well as for data analytics. Indeed, according to the UK Information Commissioner’s Office “To protect privacy it is better to use or disclose anonymized data than personal data”.
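As an illustration of how pseudonymization lets data be used for test or analytics purposes without exposing the identifier, here is a minimal sketch using a keyed hash (HMAC-SHA256) from the Python standard library. This is one possible technique, not a method prescribed by GDPR or by any vendor; the key, the field names and the record are hypothetical.

```python
import hashlib
import hmac


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed, repeatable pseudonym.

    The same input and key always give the same pseudonym, so records can
    still be joined and analysed without revealing the original identifier.
    """
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Hypothetical key for illustration only. In practice the key would be
# held separately from the data, under strict access control.
SECRET_KEY = b"example-key-held-separately"

record = {"email": "alice@example.com", "purchase_total": 42.50}

# Build a copy that is safer to hand to developers or data scientists.
safe_record = {
    "customer_ref": pseudonymize(record["email"], SECRET_KEY),
    "purchase_total": record["purchase_total"],
}
print(safe_record)
```

As with encryption, the protection is only as strong as the control over the key: anyone holding both the key and a list of candidate identifiers could recompute the pseudonyms and re-identify the records.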
In summary, organizations need to take a sustainable approach to managing the potential risks both to the organization and to those outside (for example to the data subjects) from their use of data. Good information stewardship based on a data centric approach, where security controls follow the data, is best. This provides a sustainable approach that is independent of the infrastructure, tools and technologies used to store, analyse and process data.
For more details see: Advisory Note: Big Data Security, Governance, Stewardship - 72565
The dream of being able to create systems that can simulate human thought and behaviour is not new. Now that this dream appears to be coming closer to reality there is both excitement and alarm. Famously, in 2014 Prof. Stephen Hawking told the BBC: “The development of full artificial intelligence could spell the end of the human race”. Should we be alarmed by these developments and what in practice does this mean today?
The origins of today’s AI (Artificial Intelligence) can be traced back to the seminal work on computers by Dr Alan Turing. He proposed an experiment that became known as the “Turing Test”, to define the standard for a machine to be called "intelligent". A computer could only be said to "think" if a human was not able to distinguish it from a human being through a conversation with it.
The theoretical work that underpins today’s AI and ML (Machine Learning) was developed in the 1940s and 1950s. The early computers of that era were slow and could only store limited amounts of data; this restricted what could practically be implemented. This has now changed – the cloud provides the storage for vast amounts of data and the computing power needed for ML.
The theoretical basis for ML stems from work published in 1943 by Warren McCulloch and Walter Pitts on a computational model for neural networks based on mathematics and algorithms called threshold logic. Artificial neural networks provide a framework for machine learning algorithms to learn based on examples without being formally programmed. This learning needs large amounts of data and the significant computing power which the cloud can provide.
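The idea of threshold logic can be shown in a few lines. The sketch below is a simplified illustration, not a reproduction of the 1943 model: it assumes binary inputs and integer weights, and the function names are hypothetical. A unit “fires” when the weighted sum of its inputs reaches a threshold, and different thresholds give different logical functions.

```python
def threshold_unit(inputs, weights, threshold):
    """Fire (return 1) when the weighted sum of the inputs reaches the threshold."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0


def logical_and(a, b):
    # Both inputs must be active to reach a threshold of 2.
    return threshold_unit([a, b], [1, 1], threshold=2)


def logical_or(a, b):
    # A single active input is enough to reach a threshold of 1.
    return threshold_unit([a, b], [1, 1], threshold=1)


# Truth tables for the input pairs (0,0), (0,1), (1,0), (1,1).
and_table = [logical_and(a, b) for a in (0, 1) for b in (0, 1)]
or_table = [logical_or(a, b) for a in (0, 1) for b in (0, 1)]
print(and_table, or_table)
```

Here the weights and thresholds are set by hand. In machine learning they are instead adjusted automatically from examples, which is why training needs so much data and computing power.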
Analysing the vast amount of data now available in the cloud creates its own challenges and ML provides a potential solution to these. Normal statistical approaches may not be capable of spotting patterns that a human would see, and programming individual analyses is laborious and slow. ML provides a way to supercharge human ability to analyse data. However, it changes the development cycle from programming to training based on curated examples overseen by human trainers. Self-learning systems may provide a way around the programming bottleneck. However, the training-based development cycle creates new challenges around testing, auditing and assurance.
ML has also provided a way to enhance algorithmic approaches to understanding visual and auditory data. It has for example enabled facial recognition systems as well as chatbots for voice-based user interactions. However, ML is only as good as the training and is not able to provide explanations for the conclusions that it reaches. This leads to the risk of adversarial attacks – where a third party spots a weakness in the training and exploits this to subvert the system. However, it has been applied very successfully to visual component inspection in manufacturing where it is faster and more accurate than a human.
One significant challenge is how to avoid bias – there are several reported examples of bias in facial recognition systems. Bias can come from several sources. There may be insufficient data to provide a representative sample. The data may have been consciously or unconsciously chosen in a way that introduces bias. This latter is difficult to avoid since every human is part of a culture which is inherently founded on a set of shared beliefs and behaviours which may not be the same as in other cultures.
Another problem is one of explanation – ML systems are not usually capable of providing an explanation for their conclusions. This makes training ML doubly difficult because when the system being trained gets the wrong answer it is hard to figure out why. The trainer needs to know this to correct the error. In use, an explanation may be required to justify a life-changing decision to the person that it affects, to provide the confidence needed to invest in a project based on a projection, or to justify why a decision was taken in a court of law.
A third problem is that ML systems do not have what most people would call “common sense”. This is because currently each is narrowly focussed on one specialized problem. Common sense comes from a much wider understanding of the world and allows the human to recognize and discard what may appear to be a logical conclusion because in the wider context it is clearly stupid. This was apparent when Microsoft released a chatbot that was supposed to train itself but did not recognize mischievous behaviour.
Figure: AI, Myths, Reality and Challenges
In conclusion, AI systems are evolving but they have not yet reached the state portrayed in popular science fiction. ML is ready for practical application and major vendors offer tools to support this. The problems to which AI can be applied today can be described in two dimensions – the scope of knowledge required and the need for explanation. Note that the need for explanation is related to the need for legal justification or to cases where the potential consequences of mistakes are high.
Organizations are recommended to look for applications that fit the green area in the diagram and to use caution when considering those that would lie in the amber areas. The red area is still experimental and should only be considered for research.
For more information on this subject attend the AI track at EIC in Munich in May 2019.
According to the Ponemon Institute, cyber incidents that take over 30 days to contain cost $1m more than those contained within 30 days. However, less than 25% of organizations surveyed globally say that they have a coordinated incident response plan in place. In the UK, only 13% of businesses have an incident management process in place according to a government report. This appears to show a shocking lack of preparedness, since it is a question of when, not if, your organization will be the target of a cyber-attack.
Last week on January 24th I attended a demonstration of IBM’s new C-TOC (Cyber Tactical Operations Centre) in London. The C-TOC is an incident response centre housed in an 18-wheel truck. It can be deployed in a wide range of environments, with self-sustaining power, an on-board data centre and cellular communications to provide a sterile environment for cyber incident response. It is designed to provide companies with immersion training in high pressure cyber-attack simulations to help them to prepare for and to improve their response to these kinds of incidents.
The key to managing incidents is preparation. There are 3 phases to a cyber incident: the events that led up to the incident, the incident itself and what happens after the incident. Prior to the incident the victim may have missed opportunities to prevent it. When the incident occurs, the victim needs to detect what is happening and to manage and contain its effects. After the incident the victim needs to respond in a way that not only manages the cyber related aspects but also deals with potential customer issues as well as reputational damage.
Prevention is always better than cure, so it is important to continuously improve your organization’s security posture, but you still need to be prepared to deal with an incident when it occurs.
The so-called Y2K (Millennium) bug is an example of an incident that was so well managed that some people believe it was a myth. In fact I, like many other IT professionals, spent the turn of the century in a bunker ready to help any organization experiencing this problem. However, I am glad to say that the biggest problem that I met came when I returned to my hotel the next morning: I had to climb six flights of stairs because the lifts had been disabled as a precaution. There were many pieces of software that contained the error and it was only through the recognition of the problem, rigorous preparation to remove the bug as well as planning to deal with it where it arose that major problems were averted.
In the IBM C-TOC I participated in a cyber response challenge involving a fictitious international financial services organization called “Bane and Ox”. This organization has a cyber security team and a so-called “Fusion Centre” to manage cyber security incident response. This exercise started with an HR Onboarding briefing welcoming me into the team.
We were then taken through an unfolding cyber incident and asked to respond to the events as they occurred, with phone calls from the press, attempts to steal money via emails exploiting the situation, a ransom demand, physical danger to employees, customers claiming that their money was being stolen, a data leak and an attack on the bank’s ATMs. I then underwent a TV interview about the bank’s response to the event with hostile questioning by the news reporter, not a pleasant experience!
According to IBM, organizations need a clear statement of the “Commander’s Intent”. This is needed to ensure that everyone works together towards a common goal that everyone can understand when under pressure and making difficult decisions. IBM gave the example that the D Day Commander’s Intent statement was “Take the beach”.
The next priority is to collect information. “The first call is the most important”, whether it is from the press, a customer or an employee. You need to get the details, check the details and determine the credibility of the source.
You then need to implement a process to resolve where the problems lie and to take corrective action as well as to inform regulators and other people as necessary. This is not easy unless you have planned and prepared in advance. Everyone needs to know what they must do, and management cover is essential to ensure that resources and budget are available as needed. It may also be necessary to enable deviation from normal business processes.
Given the previously mentioned statistics on organizational preparedness for cyber incidents, many organizations need to take urgent action. The preparation needed involves many parts of the organization, not just IT; it must be supported at the board level and involve senior management. Sometimes the response will require faster decision making with the ability to bypass normal processes – only senior management can ensure that this is possible. An effective response needs planning, preparation and above all practice.
- Obtain board level sponsorship for your incident response approach;
- Identify the team of people / roles that must be involved in responding to an incident;
- Ensure that it is clear what constitutes an incident and who can invoke the response plan;
- Make sure that you can contact the people involved when you need to;
- You will need external help – set up the agreement for this before you need it;
- Planning, preparation and practice can avoid pain and prosecution;
- Practice, practice and practice again.
KuppingerCole Advisory Note: GRC Reference Architecture – 72582 provides some advice on this area.
Last week I attended Oracle OpenWorld Europe 2019 in London. At this event Andrew Sutherland, VP of technology, told us that security was one of the main reasons why customers were choosing the Oracle autonomous database. This is interesting for two reasons: firstly, it shows that security is now top of mind amongst the buyers of IT systems and, secondly, that buyers have more faith in technology than in their own efforts.
The first of these reasons is not surprising. The number of large data breaches disclosed by organizations continues to grow and enterprise databases contain the most valuable data. The emerging laws mandating data breach disclosure, such as GDPR, have made it more difficult for organizations to withhold information when they are breached. Being in a position of responsibility when your organization suffers a data breach is not a career enhancing event.
The second reason takes me back to the 1980s when I was working in the Knowledge Engineering Group in ICL. This group was working as part of the UK Government sponsored Alvey Programme which was created in response to the Japanese report on 5th generation computing. This programme had a focus on Intelligent Knowledge Based Systems (IKBS) and Artificial Intelligence (AI). One of the most successful products from this group was VCMS, a performance advisor for the ICL mainframe. This was commercially very successful with a large uptake in the VME customer base. However, one interesting observation was that it prompted customers to upgrade their mainframes. It became apparent that the buyers of these systems were more ready to accept the advice from VCMS than from the customer service representatives.
From an AI point of view managing computer systems is relatively easy. This is because computer systems are deterministic, and their behaviour can be described using rules. These rules may be complex and there may be a wide range of circumstances that need to be considered. However, given the rules and the volume of metered data available, an AI system can usually make a good job of optimizing performance. Computer systems manufacturers also have the knowledge needed.
That is not to diminish the remarkable achievements by Oracle to make their database systems autonomous. Mark Hurd, CEO of Oracle (who had to present via a satellite link because he had been unable to renew his passport because of the US government shutdown), described how the NetSuite team had spent 15 years creating 9,000 specialized table indexes for performance. The autonomous database created 6,000 indexes in a short time, which improved performance by 7%.
Oracle’s strategy for AI extends beyond the autonomous database. Melissa Boxer, VP Fusion Adaptive Intelligence, described how Oracle Adaptive Intelligent Apps (Oracle AI Apps) provide a suite of prebuilt AI and data-driven applications across Customer Experience (CX), Human Capital Management (HCM), Enterprise Resource Planning (ERP), and Manufacturing. These use a shared data model to provide Adaptive Intelligence and Machine Learning across all the different pillars. They include a personalized user experience (OAUX) and intelligent chat bots.
So, to return to the original question – can Autonomous Systems (as defined by Oracle) improve security posture? In my opinion the answer is both yes and no. The benefits of autonomous systems are that they can automatically incorporate patches and implement best practice configuration. Since many data breaches stem from these areas this is a good thing. They can also automatically tune themselves to optimize performance for an individual customer’s workload and, since they discover what is normal, they can potentially identify abnormal activity.
The challenge is that AI technology is not yet able to defend against adversarial activity. Machine Learning is only as good as its training and, at the current state of the art, it does not provide good explanations for the conclusions reached. This means that sometimes the learnt behavior contains flaws that can be exploited by a malicious agent to trick the system into the wrong conclusions. Given that cyber crime is an arms race, and that the cyber adversaries are working with the same technology, we must assume that this will be deployed against us as well as for us.
For security, AI can help to avoid the simple mistakes, but defending against a concerted attack still needs a human in the loop.
On October 28th IBM announced its intention to acquire Red Hat. At $34 Billion, this is the largest software acquisition ever. So why would IBM pay such a large amount of money for an Open Source software company? I believe that this acquisition needs to be seen as going beyond DevOps and Hybrid Cloud, in the context of IBM’s view of where the business value from IT services will come from in future. This acquisition provides near-term tactical benefits from Red Hat’s OpenShift Platform and its participation in the Kubeflow project. It strengthens IBM’s capabilities to deliver the foundation for digital business transformation. However, digital business is increasingly based on AI delivered through the cloud. IBM recently announced a $240M investment in a 10-year research collaboration on AI with MIT, and this illustrates the strategy. This adds to the significant investments that IBM has already made in Watson, setting up a division in 2016, as well as in cloud services.
Red Hat was founded in 1993 and in 1994 released the Red Hat version of Linux. This evolved into a complete development stack (JBoss), and the company recently released Red Hat OpenShift, a container- and microservices-based (Kubernetes) DevOps platform. Red Hat operates on a business model based on open-source software development within a community, professional quality assurance, and subscription-based customer support.
The synergy between IBM and Red Hat is clear. IBM has worked with Red Hat on Linux for many years and both have a commitment to Open Source software development. Both companies have a business model in which services are the key element. Although these are two fairly different types of services (Red Hat’s being service fees for software, IBM’s being all types of services including consultancy and development), they both fit well into IBM’s overall business.
One critical factor is the need for tools to accelerate the development lifecycle for ML projects. For ML projects this can be much less predictable than for software projects. In the non-ML DevOps world, microservices and containers are the key technologies that have helped here. How can these technologies help with ML projects?
There are several differences between developing ML and coding applications. Specifically, ML uses training rather than coding and, in principle, this in itself should accelerate the development of much more sophisticated ways to use data. The ML Development lifecycle can be summarized as:
- Obtain, prepare and label the data
- Train the model
- Test and refine the model
- Deploy the model
While the processes involved in ML development are different to conventional DevOps, a microservices-based approach is potentially very helpful. ML Training involves multiple parties working together and microservices provide a way to orchestrate various types of functions, so that data scientists, experts and business users can just use the capabilities without caring about coding etc. A common platform based on microservices could also provide automated tracing of the data used and training results to improve traceability and auditing. It is here that there is a great potential for IBM/Red Hat to deliver better solutions.
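To make the idea of automated tracing more concrete, here is a minimal sketch in Python of the four lifecycle stages, each writing a record to an audit trail. It is an illustration only, not how OpenShift or Kubeflow work: the `AuditTrail` class, the `fingerprint` helper, the stage functions and the toy threshold "model" are all hypothetical names invented for this example.

```python
import hashlib
import json
from dataclasses import dataclass, field


def fingerprint(obj) -> str:
    """Stable short hash of any JSON-serialisable object."""
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


@dataclass
class AuditTrail:
    """Collects one record per lifecycle stage (hypothetical structure)."""
    records: list = field(default_factory=list)

    def log(self, stage: str, inputs, outputs) -> None:
        self.records.append({
            "stage": stage,
            "input_hash": fingerprint(inputs),
            "output_hash": fingerprint(outputs),
        })


def prepare(raw):
    # Stage 1: obtain, prepare and label the data (label = value above 5).
    return [{"x": x, "label": int(x > 5)} for x in raw]


def train(rows):
    # Stage 2: "train" a toy threshold model: midpoint between class means.
    pos = [r["x"] for r in rows if r["label"] == 1]
    neg = [r["x"] for r in rows if r["label"] == 0]
    return {"threshold": (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2}


def evaluate(model, rows):
    # Stage 3: test the model; returns accuracy on the supplied rows.
    hits = sum(int(r["x"] > model["threshold"]) == r["label"] for r in rows)
    return {"accuracy": hits / len(rows)}


def run_pipeline(raw, trail):
    rows = prepare(raw)
    trail.log("prepare", raw, rows)
    model = train(rows)
    trail.log("train", rows, model)
    metrics = evaluate(model, rows)
    trail.log("test", [model, rows], metrics)
    # Stage 4: deployment is represented only by an audit record here.
    trail.log("deploy", model, {"deployed": True})
    return model, metrics
```

Because each record stores a hash of its inputs and outputs, an auditor can check that the data that came out of one stage is exactly the data that went into the next. In a microservices setting each stage would be a separate service, and the trail would be written by the platform rather than by the stage itself.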
Red Hat OpenShift provides a DevOps environment to orchestrate the development to deployment workflow for Kubernetes based software. OpenShift is, therefore, a potential solution to some of the complexities of ML development. Red Hat OpenShift with Kubernetes has the potential to enable a data scientist to train and query models as well as to deploy a containerized ML stack on-premises or in the cloud.
In addition, Red Hat is a participant in the Kubeflow project. This is an Open Source project dedicated to making deployments of machine learning (ML) workflows on Kubernetes simple, portable and scalable. Their goal is not to recreate other services, but to provide a straightforward way to deploy best-of-breed open-source systems for ML to diverse infrastructures.
In conclusion, the acquisition has strengthened IBM’s capabilities to deliver ML applications in the near term. These capabilities complement and extend IBM’s Watson and improve and accelerate their ability and the ability of their joint customers to create, test and deploy ML-based applications. They should be seen as part of a strategy towards a future where more and more value is delivered through AI-based solutions.
Read as well: IBM & Red Hat – and now?
Famously, in 2014 Prof. Stephen Hawking told the BBC: "The development of full artificial intelligence could spell the end of the human race." The ethical questions around Artificial Intelligence were discussed at a meeting led by the BCS President Chris Rees in London on October 2nd. This is also an area covered by KuppingerCole under the heading of Cognitive Technologies and this blog provides a summary of some of the issues that need to be considered.
Firstly, AI is a generic term and it is important to understand precisely what this means. Currently the state of the art can be described as Narrow AI. This is where techniques such as ML (machine learning) combined with massive amounts of data are providing useful results in narrow fields, for example the diagnosis of certain diseases and predictive marketing. There are now many tools available to help organizations exploit and industrialise Narrow AI.
At the other extreme is what is called General AI where the systems are autonomous and can decide for themselves what actions to take. This is exemplified by the fictional Skynet that features in the Terminator games and movies. In these stories this system has spread to millions of computers and seeks to exterminate humanity in order to fulfil the mandates of its original coding. In reality, the widespread availability of General AI is still many years away.
In the short term, Narrow AI can be expected to evolve into Broad AI where a system will be able to support or perform multiple tasks using what is learnt in one domain applied to another. Broad AI will evolve to use multiple approaches to solve problems, for example by linking neural networks with other forms of reasoning. It will be able to work with limited amounts of data, or at least data which is not well tagged or curated. In the cyber-security space, for example, it would be able to identify a threat pattern that has not been seen before.
What is ethics and why is it relevant to AI? The term is derived from the Greek word “ethos” which can mean custom, habit, character or disposition. Ethics is a set of moral principles that govern behaviour, or the conduct of an activity. Ethics is also a branch of philosophy that studies these principles. Ethics is important to AI because of the potential for these systems to cause harm to individuals as well as society in general. Ethical considerations can help to better identify beneficial applications while avoiding harmful ones. In addition, new technologies are often viewed with suspicion and mistrust. This can unreasonably inhibit the development of technologies that have significant beneficial potential. Ethics provides a framework that can be used to understand and overcome these concerns at an early stage.
Chris Rees identified 5 major ethical issues that need to be addressed in relation to AI:
- Bias;
- Explanation;
- Harmlessness;
- Economic Impact;
- Responsibility.
Bias is a very current issue with bias related to gender and race as top concerns. AI systems are likely to be biased because people are biased, and AI systems amplify human capabilities. Social media provides an example of this kind of amplification where uncontrolled channels provide the means to share beliefs that may be popular but have no foundation in fact - “fake news”. The training of AI systems depends upon the use of data which may include inherent bias even though this may not be intentional. The training process and the trainers may pass on their own unconscious bias to the systems they train. Allowing systems to train themselves can lead to unexpected outcomes since the systems do not have the common sense to recognize mischievous behaviour. There are other reported examples of bias in facial recognition systems.
Explanation - It is very important in many applications that AI systems can explain themselves. Explanation may be required to justify a life changing decision to the person that it affects, to provide the confidence needed to invest in a project based on a projection, or to justify after the event why a decision was taken in a court of law. While rule-based systems can provide a form of explanation based on the logical rules that were fired to arrive at a particular conclusion, neural networks are much more opaque. This makes it hard not only to explain to the end user why a conclusion was reached, but also for the developer or trainer to understand what needs to be changed to correct the behaviour of the system.
Harmlessness – the three laws of robotics that were devised by Isaac Asimov in the 1940s, and subsequently extended to include a zeroth law, apply equally to AI systems. However, the use or abuse of the systems could breach these laws and special care is needed to ensure that this does not happen. For example, the hacking of an autonomous car could turn it into a weapon, which emphasizes the need for strong inbuilt security controls. AI systems can be applied to cyber security to accelerate the development of both defence and offence. They could be used by the cyber adversaries as well as the good guys. It is therefore essential that this aspect is considered and that countermeasures are developed to cover the malicious use of this technology.
Economic impact – new technologies have both destructive and constructive impacts. In the short-term the use of AI is likely to lead to the destruction of certain kinds of jobs. However, in the long term it may lead to the creation of new forms of employment as well as unforeseen social benefits. While the short-term losses are concrete the longer-term benefits are harder to see and may take generations to materialize. This makes it essential to create protection for those affected by the expected downsides to improve acceptability and to avoid social unrest.
Responsibility – AI is just an artefact and so if something bad happens who is responsible morally and in law? The AI system itself cannot be prosecuted but the designer, the manufacturer or the user could be. The designer may claim that the system was not manufactured to the design specification. The manufacturer may claim that the system was not used or maintained correctly (for example patches not applied). This is an area where there will need to be debate and this should take place before these systems cause actual harm.
In conclusion, AI systems are evolving but they have not yet reached the state portrayed in popular fiction. However, the ethical aspects of this technology need to be considered and this should be done sooner rather than later. In the same way that privacy by design is an important consideration we should now be working to develop “Ethical by Design”. GDPR allows people to take back control over how their data is collected and used. We need controls over AI before the problems arise.
As organizations go through digital transformation, the cyber challenges they face become more important. Their IT systems and applications become more critical and at the same time more open. The recent data breach suffered by British Airways illustrates the sophistication of the cyber adversaries and the difficulties organizations face in preventing, detecting, and responding to these challenges. One approach that is gaining ground is the application of AI technologies to cyber security and, at an event in London on September 24th, IBM described how IBM Watson is being integrated with other IBM security products to meet these challenges.
The current approaches to cyber defence include multiple layers of protection including firewalls and identity and access management as well as event monitoring (SIEM). While these remain necessary they have not significantly reduced the time to detect breaches. For example, the IBM-sponsored 2018 Cost of a Data Breach Study by Ponemon showed that the mean time for organizations to identify a breach was 197 days. This length of time has hardly improved over many years. The reasons for this long delay are many and include: the complexity of the IT infrastructure, the sophistication of the techniques used by cyber adversaries to hide their activities and the sheer volume of data available.
So, what is AI and how can it help to mitigate this problem?
AI is a generic term that covers a range of technologies. In general, the term AI refers to systems that “simulate thought processes to assist in finding solutions to complex problems through augmentation and enhancement of human capabilities”. KuppingerCole has analysed in detail what this really means in practice and this is summarized in the following slide from the EIC 2017 Opening Keynote by Martin Kuppinger.
At the lower layer, improved algorithms enable the transformation of Big Data into “Smart Information”. See KuppingerCole Advisory Note: Big Data Security, Governance, Stewardship - 72565. This is augmented by Machine Learning where human reinforcement is used to tune the algorithms to identify those patterns that are of interest and to ignore those that are not. Cognitive technologies add an important element to this mix through their capability to include speech, vision and unstructured data into the analysis. Today, this represents the state of the art for the practical application of AI to cyber security.
The challenges of AI at the state of the art are threefold:
- The application of common sense – a human applies a very wide context to decision making whereas AI systems tend to be very narrowly focussed and so sometimes reach what the human would consider to be a stupid conclusion.
- Explanation – of how the conclusions were reached by the AI system to demonstrate that they are valid and can be trusted.
- Responsibility – for action based on the conclusions from the system.
Cyber security products collect vast amounts of data – the cyber security analyst is literally drowning in data. The challenge is to find the so-called IoCs (Indicators of Compromise), which show the existence of a real threat, amongst this enormous amount of data. The problem is not just to find what is abnormal, but to filter out the many false positives that obscure the real threats.
There are several vendors that have incorporated Machine Learning (ML) systems into their products to tune the identification of important anomalies. This is useful to reduce false positives, but it is not enough. To be really useful to a security analyst, the abnormal pattern needs to be related to known or emerging threats. While there have been several attempts to standardize the way information on threats is described and shared, most of this information is still held in unstructured form in documents, blogs and Twitter feeds. It is essential to take account of these.
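The two steps described here, finding what is abnormal and then relating it to known threats, can be sketched in a few lines of Python. This is a deliberately simple illustration and not how any vendor's product works: it uses a plain z-score instead of Machine Learning, and the event fields (`dest`, `bytes`), the function names and the indicator dictionary are all assumptions made for the example.

```python
from statistics import mean, stdev


def find_anomalies(events, threshold=2.0):
    """Return events whose byte count deviates strongly from the mean."""
    counts = [e["bytes"] for e in events]
    mu, sigma = mean(counts), stdev(counts)
    if sigma == 0:
        # All events look identical, so nothing stands out.
        return []
    return [e for e in events if abs(e["bytes"] - mu) / sigma > threshold]


def correlate(anomalies, threat_indicators):
    """Keep only anomalies whose destination matches a known indicator."""
    return [
        {**e, "threat": threat_indicators[e["dest"]]}
        for e in anomalies
        if e["dest"] in threat_indicators
    ]
```

Given a list of events and a dictionary mapping known-bad destinations to a description, `correlate(find_anomalies(events), indicators)` returns only the unusual events that also match threat intelligence. An unusually large transfer to a destination that is not in the dictionary is dropped at the second step, which is the filtering of false positives that the analyst needs. The hard part in practice is building that dictionary from unstructured sources.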
This is where IBM QRadar Advisor with Watson is different. A Machine Learning system is only as good as its training – training is the key to its effectiveness. IBM say that it has been trained through the ingestion of over 10 billion pieces of structured data and 1.24 million unstructured documents to assist with the investigation of security incidents. This training involved IBM X-Force experts as well as IBM customers. Because of this training, it can now identify patterns that represent potential threats and provide links to the relevant sources that have been used to reach these conclusions. However, while this helps the security analyst to do their job more efficiently and more effectively, it does not yet replace the human.
Organizations now need to assume that cyber adversaries have access to their organizational systems and to constantly monitor for this activity in a way that will enable them to take action before damage is done. AI provides a great potential to help with this challenge and to evolve to help organizations to improve their cyber security posture through intelligent code analysis and configuration scanning as well as activity monitoring. For more information on the future of cyber security attend KuppingerCole’s Cybersecurity Leadership Summit 2018 Europe.
The primary factor that most organizations consider when choosing a cloud service is how well the service meets their functional needs. However, this must be balanced against the non-functional aspects such as compliance, security and manageability. These aspects are increasingly becoming a challenge in the hybrid multi-cloud IT environment found in most organizations. This point was emphasized by Virtustream during their briefing in London on September 6th, 2018.
Virtustream was founded in 2009 with a focus on providing cloud services for mission-critical applications like SAP. In order to achieve this, Virtustream developed its xStream cloud management platform to meet the requirements of complex production applications in the private, public and hybrid cloud. This uses patented xStream cloud resource management technology (μVM) to deliver assured SLA levels for business-critical applications and services. Through a series of acquisitions Virtustream is now a Dell Technologies business.
The hybrid multi-cloud IT environment has made the challenges of governance, compliance and security even more complex. There is no single complete solution currently on the market to this problem.
Typically, organizations use multiple cloud services including office productivity tools from one CSP (Cloud Service Provider), a CRM system from another CSP, and a test and development service from yet another one. At the same time, legacy applications and business critical data may be retained on-premises or in managed hosting. This hybrid multi-cloud environment creates significant challenges relating to the governance, management, security and compliance of the whole system.
What is needed is a consistent approach with common processes supported by a single platform that provides all the necessary functions across all the various components involved in delivering all the services.
Most CSPs offer their own proprietary management portal, which may in some cases extend to cover some on-premises cases. This makes it important when choosing a cloud service to evaluate how the needs for management, security and compliance will be integrated with the existing processes and components that make up the enterprise IT architecture. The hybrid IT service model requires an overall IT governance approach as described in KuppingerCole Advisory Note: Security Organization Governance and the Cloud - 72564.
An added complexity is that the division of responsibility for the different layers of the service depends upon how the service is delivered. There are 5 layers:
- The lowest layer is the physical service infrastructure, which includes the data center, the physical network, the physical servers and the storage devices. In the case of IaaS (Infrastructure as a Service) this is the responsibility of the CSP.
- Above this sit the operating systems, basic storage services and the logical network. For IaaS, the management of this layer is the responsibility of the customer.
- The next plane includes the tools and middleware needed to build and deploy business applications. For PaaS (Platform as a Service) these are the responsibility of the CSP.
- Above the middleware are the business applications and for SaaS (Software as a Service) these are the responsibility of the CSP.
- The highest plane is the governance of business data and control of access to the data and applications. This is always the responsibility of the customer.
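The division of responsibility across the five layers can be written down as a small lookup, sketched here in Python. The entries stated in the list above are taken as given. The assumption added here is that the boundary is cumulative, so that for PaaS and SaaS the CSP also manages every layer below the one named. The names `LAYERS`, `FIRST_CUSTOMER_LAYER` and `responsibility` are hypothetical.

```python
LAYERS = [
    "physical infrastructure",
    "operating system, storage and logical network",
    "tools and middleware",
    "business applications",
    "data governance and access control",
]

# Index of the first layer the customer manages for each delivery model.
# The IaaS boundary and the CSP-side entries for PaaS and SaaS follow the
# text; applying the boundary cumulatively to lower layers is an assumption.
FIRST_CUSTOMER_LAYER = {"IaaS": 1, "PaaS": 3, "SaaS": 4}


def responsibility(model: str) -> dict:
    """Map each layer to 'CSP' or 'customer' for the given delivery model."""
    boundary = FIRST_CUSTOMER_LAYER[model]
    return {
        layer: ("CSP" if index < boundary else "customer")
        for index, layer in enumerate(LAYERS)
    }
```

Calling `responsibility("IaaS")`, `responsibility("PaaS")` and `responsibility("SaaS")` in turn shows the boundary moving up the stack, while the top layer stays with the customer in every case. A real organization using several CSPs would need one such table per service, which is part of what makes the hybrid multi-cloud environment hard to govern.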
An ideal solution would be a common management platform that covers all the cloud and on-premises services and components. However, most cloud services only offer a proprietary management portal that covers the management of their service.
So, does Virtustream provide a solution that completely meets these requirements? The answer is: Not yet. However, there are two important points in its favour:
- Firstly, Virtustream have highlighted that the problem exists. Acceptance is the first step on the road to providing a solution.
- Secondly, Virtustream is a part of Dell and Dell also own VMware. VMware provides a solution to this problem but only where VMware is used across different IT service delivery models. VMware is used by Virtustream and is also supported by several other CSPs.
In conclusion, the hybrid multi-cloud environment presents a complex management challenge, particularly in the areas of security and compliance. There are five layers each with six dimensions that need to be managed, and the responsibilities are shared between the CSP and the customer. It is vital that organizations consider this when selecting cloud services and that they implement a governance-based approach. Look for the emergence of tools to help with this challenge. There was a workshop on this subject at KuppingerCole’s EIC earlier this year.