I’ve looked at a bunch of automation platforms which could meet any of Cyte’s goals – integrating existing platforms would, ideally, accelerate development.
Read on for a summary of these platforms and how they may integrate with Cyte.
Siri is one of the better known assistants. In recent times, Apple has released an SDK and the Shortcuts app to encourage developers and users to build and share automations.
Siri uses a voice activated intent mechanism for operation, where spoken commands are translated to one of the available intents, which is then used to look up a set of procedures registered with the active intent to execute.
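To make the intent mechanism concrete, here is a minimal sketch of the idea. It is a toy, not Siri's actual API: the `IntentRegistry` class, the intent names and the keyword matching are all hypothetical, and a real assistant would use a trained language model rather than keyword lookup.

```python
import re
from collections import defaultdict
from typing import Callable, Dict, List, Optional


class IntentRegistry:
    """Toy intent dispatcher (hypothetical; not Siri's real API)."""

    def __init__(self) -> None:
        # intent name -> keywords that signal the intent
        self._keywords: Dict[str, List[str]] = {}
        # intent name -> procedures registered against the intent
        self._handlers: Dict[str, List[Callable[[str], str]]] = defaultdict(list)

    def define_intent(self, name: str, keywords: List[str]) -> None:
        self._keywords[name] = [k.lower() for k in keywords]

    def register(self, name: str, handler: Callable[[str], str]) -> None:
        self._handlers[name].append(handler)

    def resolve(self, utterance: str) -> Optional[str]:
        """Translate a (transcribed) command to the first matching intent."""
        words = re.findall(r"[a-z']+", utterance.lower())
        for name, keywords in self._keywords.items():
            if any(k in words for k in keywords):
                return name
        return None

    def dispatch(self, utterance: str) -> List[str]:
        """Run every procedure registered with the resolved intent."""
        intent = self.resolve(utterance)
        if intent is None:
            return []
        return [handler(utterance) for handler in self._handlers[intent]]
```

For example, after `define_intent("play_music", ["play", "music"])` and registering a handler for `"play_music"`, calling `dispatch("Play some music!")` runs that handler, while an unrecognized command runs nothing. Note that nothing happens until the user explicitly issues a command.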
Where Siri falls short of our needs is its limitation to Apple devices, and the fact that automations must be explicitly requested by the user, forcing them to modify their existing workflow and break focus from whatever context they were in. It also relies primarily on speech input, and lacks the ability to explicitly train the core assistant.
During the integration development phase, we will look to send intent signals from Siri to Cyte, using Siri for natural language processing on Apple devices.
Cortana is the assistant software available on most Microsoft systems, and operates with an intent mechanism like Siri.
The drawbacks of Cortana are analogous to those of Siri, with Microsoft platforms in place of Apple devices (although Cortana is available on iOS and via emulation on OSX, the user experience leaves a lot to be desired).
As with Siri, we’ll look to send Cortana intent signals to Cyte on Windows, taking advantage of its great natural language parsing.
Alexa is yet another intent driven assistant, developed by Amazon. Quite popular in home settings, Alexa can be developed on any hardware platform, and so it does have an advantage over Siri and Cortana in that it seems to be compatible with a larger range of operating systems.
The other drawbacks mentioned above still apply to Alexa, and we’ll also look to send Alexa intents as signals to Cyte.
Google Assistant is similar to Alexa, except backed by Google. It has the same drawbacks, and again we will aim to use intents as input signals for Cyte.
Mycroft is an open source voice assistant launched with the backing of a Kickstarter campaign. A great positive is that the core assistant can be modified, as it is open source. However, all the other drawbacks previously mentioned still apply, and there doesn’t seem to be any added benefit to integrating its intent signals with Cyte over the previously mentioned platforms.
Screen-Scraper is the first non-assistant platform. Effectively an intelligent and customizable web crawler, it is the first entry which accounts for visual context, although that context is limited to websites. Sadly, it does not handle speech context. While it could be used to automate web workflows, we need something that works across all applications.
Automator is a robotic process automation (RPA) platform native to OSX. There are a number of predefined actions/workflows, the ability to script new workflows using AppleScript, and a large community of shared workflows available to use.
Apart from lacking cross platform support, Automator only activates on an explicit trigger, meaning it lacks visual and speech context. However, there is value in integrating Cyte with Automator, by triggering pre-made workflows connected to specific context triggers in Cyte.
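As a rough sketch of what that integration might look like, the snippet below maps context triggers to saved workflows and runs them through the `automator` command-line tool that ships with OSX. The trigger names, the workflow path and the `run_workflow` helper are all hypothetical; how Cyte would actually emit triggers is not decided here.

```python
import os
import shutil
import subprocess
from typing import Dict, List, Optional

# Hypothetical mapping from Cyte context triggers to saved Automator workflows.
WORKFLOWS: Dict[str, str] = {
    "invoice_opened": "~/Library/Workflows/FileInvoice.workflow",
}


def build_command(trigger: str,
                  workflows: Optional[Dict[str, str]] = None) -> Optional[List[str]]:
    """Return the `automator` invocation for a trigger, or None if unmapped."""
    table = WORKFLOWS if workflows is None else workflows
    path = table.get(trigger)
    if path is None:
        return None
    return ["automator", os.path.expanduser(path)]


def run_workflow(trigger: str, dry_run: bool = True) -> Optional[List[str]]:
    """Run the workflow mapped to a trigger; returns the command that was built."""
    command = build_command(trigger)
    if command is None:
        return None
    # Only execute when asked to and when the automator binary is present.
    if not dry_run and shutil.which("automator") is not None:
        subprocess.run(command, check=True)
    return command
```

Keeping the mapping in plain data means new context triggers can be wired to existing community workflows without writing any AppleScript.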
WinAutomation is pretty much like Automator, but for Windows. There are plenty of prebuilt automations, and also some basic visual context capability in the form of optical character recognition (OCR).
Automation Anywhere is a fully fledged, standalone RPA and Intelligent Automation (IA) platform. It is like Automator and WinAutomation, but with deeper functionality and a simpler drag-and-drop visual programming interface for creating workflows in a manner akin to flowcharts.
The primary downsides include a lack of cross platform support, lack of visual and speech context, and the inability to learn from experience (requires explicit programming).
Another complete RPA/IA solution like Automation Anywhere, with the same upsides and downsides.
Another complete RPA/IA solution.
A complete RPA and Smart Process Automation (SPA) platform – meaning it can leverage data and learn from experience. Supports OCR but lacks visual and speech context beyond that.
The first in a selection of cloud based services. IBM provides a range of machine learning services at various levels, and the accuracy and quality is impressive.
There are a couple of major drawbacks, however. The latency incurred sending data to the service and getting a response back makes it suitable only for tasks where the frequency of events is sufficiently low. Additionally, charges are incurred per message, making it difficult to budget and potentially very expensive to run high frequency analysis.
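Both drawbacks are easy to put numbers on. The sketch below is back-of-the-envelope arithmetic only: the round trip time and the per-request price are made-up figures, not any provider's actual latency or pricing, and it assumes each request must complete before the next is sent.

```python
def max_event_rate(round_trip_seconds: float) -> float:
    """Highest events per second if each request must finish before the next."""
    if round_trip_seconds <= 0:
        raise ValueError("round trip time must be positive")
    return 1.0 / round_trip_seconds


def monthly_cost(events_per_second: float,
                 price_per_request: float,
                 hours_per_day: float = 8.0,
                 days_per_month: int = 22) -> float:
    """Cost of sending every event to a per-message priced cloud service."""
    requests = events_per_second * 3600 * hours_per_day * days_per_month
    return requests * price_per_request


# Hypothetical figures: 250 ms round trip, $0.001 per request.
rate_limit = max_event_rate(0.25)   # 4 events per second at most
cost = monthly_cost(1.0, 0.001)     # 633,600 requests, about $634 a month
```

Even at a single event per second during working hours, the hypothetical bill is in the hundreds of dollars a month, and it scales linearly with event frequency.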
Amazon Web Services
Amazon’s offering of cloud based machine learning APIs – which come with the same limitations as the Watson services discussed above.
Google’s cloud based machine learning APIs. See above for limitations in the context of Cyte.
Microsoft’s cloud based machine learning APIs. See above for limitations in the context of Cyte.
ROS is an operating system for real time control systems, interfacing software with servos, actuators and other mechanical parts. Operating in three dimensions allows ROS applications to interact with software, and more, using the same natural inputs as humans. Sophia, the Hanson Robotics humanoid (the first robot citizen of a country), runs on ROS.
While there is a lot of open source code available for ROS (see the OpenCog project), 3D spatial navigation brings massive complexity for relatively little gain in terms of Cyte’s short to mid term goals. Eventually, however, we’d like to integrate Cyte’s outputs with a ROS system in a real world environment.
There are a few clear classes of solutions in the list above, distinguished primarily by the organization developing the platform.
First, there are the consumer grade voice assistants like Siri, Cortana, Alexa, Mycroft and Google Assistant. One of Cyte’s primary goals is silent and non-intrusive operation, which these platforms currently lack. I don’t doubt there are efforts underway to implement visual recognition in these platforms, and that they are currently limited by network latency and computation power on the edge. Another differentiating factor for Cyte will be its personalized models, trained for and by the user, targeting a more technically oriented user base interested in customization.
Second, there are the enterprise grade RPA solutions like Automation Anywhere, UIPath, Blue Prism and Work Fusion. These platforms are highly customizable by the user, but are still lacking in AutoML and visual recognition. The primary distinction between Cyte and these tools is accessibility in terms of pricing and complexity.
Thirdly, we have the cloud services offered by IBM, AWS, Azure and Google. These are not fully fledged automation solutions; they are intended to be used within a larger pipeline. Their major disadvantages are pay-as-you-go pricing and the latency incurred moving data to and from the edge.
Finally, there are the specialized solutions which cater to specific tasks instead of taking a machine learning approach, e.g. Automator, WinAutomation, Screen-Scraper and ROS. Machine learned patterns are an integral part of Cyte’s goals, as they greatly reduce the up front time investment required for automation.
In summary, Cyte aims to be a higher performance, more customizable alternative to current voice assistants. It aims to be a simpler, more accessible alternative to current RPA platforms. It needs to deliver lower latency and a much lower, more predictable cost than cloud solutions. It aims to encourage automation by using machine learning to reduce the effort required to program a workflow. And most importantly, for me, it should allow the user to remain in their workflow and automate without a context switch.