By Jonas Bonér, CEO and co-founder of Lightbend
Gartner predicts that by 2025, 75% of enterprise-generated data will be created and processed at the edge. The cost and latency of sending all of this data back to the cloud for processing is prohibitive, so enterprises will need to keep as much data on the edge as possible while also moving processing to the edge.
The benefits of keeping data at the edge include:
- Real-time processing with lower latency.
- Resilience and availability.
- Greater resource and cost efficiency, thanks to less data shipping.
Organizations will realize these benefits via a “Cloud-to-Edge Data Plane” model that unifies cloud and edge into a single network. In this model, you define your data and business logic, package it up as a service, and deploy it without having to worry about where it should run. The data plane is then free to adaptively move services around, optimizing data access, coordination, and replication, and ensuring physical co-location of data, compute, and user for the lowest possible latency with the highest throughput and availability.
From an architectural perspective, the edge is a continuum consisting of hierarchical layers between the cloud and end devices; each layer is further away from the cloud but closer to the user. Many key parameters like latency, throughput, consistency, availability, scalability, compute resources, and others change drastically as you move through this continuum.
Mobile, Location-Transparent Services
A successful Cloud-to-Edge Data Plane model requires the emergence of mobile, location-transparent services: services free to run anywhere and to collaborate with other services without restriction. They allow the underlying platform to optimize overall system behavior by relocating services as needed, based on changes in usage, available resources, communication patterns, failure scenarios, and more.
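As a minimal sketch of what location transparency means in practice (all class and node names here are hypothetical illustrations, not an actual Lightbend API): the developer registers only the service's logic, while the platform owns placement and may move the service between nodes without callers noticing:

```python
# Hypothetical sketch: a service is defined only by its logic; the
# platform, not the developer, decides which node it runs on and may
# relocate it at any time. Callers address it by name, never by location.

class Platform:
    def __init__(self, nodes):
        self.nodes = nodes          # e.g. ["cloud", "edge-eu", "edge-us"]
        self.handlers = {}          # service name -> logic
        self.placement = {}         # service name -> current node

    def deploy(self, name, handler):
        # Developers supply only the logic; the platform picks a node.
        self.handlers[name] = handler
        self.placement[name] = self.nodes[0]

    def relocate(self, name, node):
        # The platform may move a service based on observed usage,
        # available resources, or failures.
        self.placement[name] = node

    def call(self, name, request):
        # Location-transparent invocation: the caller never names a node.
        return self.handlers[name](request)

platform = Platform(["cloud", "edge-eu", "edge-us"])
platform.deploy("greeter", lambda req: f"hello, {req}")
print(platform.call("greeter", "world"))      # hello, world
platform.relocate("greeter", "edge-eu")
print(platform.call("greeter", "world"))      # same result, new location
```

The point of the sketch is that `relocate` changes nothing for callers: because services are addressed by name, the platform can move them freely.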
Developers no longer have to make these decisions, or worry about how they will be maintained efficiently; they can focus on the API, data model, and business logic alone, which reduces time-to-market and overall risk.
Vision for Data
1: Data always exists wherever and whenever needed, but only for the time it is needed.
Having the latest and correct data always available to you, right when you need it, wherever you happen to be located — in the central cloud or at the far edge — sounds like a developer’s dream. On the other hand, replicating data to locations where it will never be needed is wasteful, and the same is true for holding on to data longer than needed. This balancing act is extremely hard and should ideally be done by the platform, adaptively at runtime, at a global system level, rather than by the developer.
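As a toy illustration of the “only for the time it is needed” half of this balancing act (purely a sketch; in the vision above, the platform would make these decisions adaptively and globally): a local replica store that tracks last access and evicts entries whose time-to-live has expired:

```python
import time

class TtlReplica:
    """Illustrative local replica: data lives here only while it is in use."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.store = {}             # key -> (value, last_access_time)

    def put(self, key, value):
        self.store[key] = (value, time.monotonic())

    def get(self, key):
        value, _ = self.store[key]
        self.store[key] = (value, time.monotonic())   # refresh TTL on access
        return value

    def evict_stale(self, now=None):
        # Drop replicas that nobody has touched within the TTL window,
        # so unneeded data does not linger at this location.
        now = time.monotonic() if now is None else now
        stale = [k for k, (_, t) in self.store.items() if now - t > self.ttl]
        for k in stale:
            del self.store[k]
        return stale

replica = TtlReplica(ttl_seconds=0.05)
replica.put("user-42", {"name": "Ada"})
assert replica.get("user-42")["name"] == "Ada"
time.sleep(0.1)
print(replica.evict_stale())   # ['user-42']
```

A real platform would of course also decide *where* to place replicas, not just how long to keep them, but the TTL sketch shows why holding data past its useful lifetime is pure waste.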
2: Data is always co-located with processing and end-user, ensuring ultra-low latency, high throughput, and resilience.
Without co-location, as in the traditional stateless three-tier architecture, data usually lives somewhere other than where it is needed: it has to be fetched before use and pushed back out to storage after modification. This increases latency, reduces throughput, and leaves the service at the mercy of the storage service’s availability.
3: Data and compute move adaptively, together with the end-user.
Many use-cases out at the edge are serving users on the move (e.g. cell phones, cars) – their data and compute should move geographically with them.
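A minimal sketch of this idea (the node names and topology are hypothetical, and Euclidean distance on latitude/longitude is a crude stand-in for real network proximity): as the user's position changes, their session state is migrated to whichever edge node is now closest:

```python
import math

# Hypothetical edge topology: node name -> (latitude, longitude).
EDGE_NODES = {
    "edge-stockholm": (59.33, 18.07),
    "edge-frankfurt": (50.11, 8.68),
    "edge-virginia": (38.95, -77.45),
}

def nearest_node(user_pos):
    """Pick the edge node geographically closest to the user."""
    return min(EDGE_NODES, key=lambda n: math.dist(EDGE_NODES[n], user_pos))

def migrate(session, placements, user_pos):
    """Move a session's data (and, implicitly, its compute) to follow
    the user. A real system would ship state and drain the old node;
    here we only record the new placement."""
    target = nearest_node(user_pos)
    if placements.get(session) != target:
        placements[session] = target
    return target

placements = {}
print(migrate("car-17", placements, (59.0, 18.0)))   # edge-stockholm
print(migrate("car-17", placements, (50.0, 8.8)))    # edge-frankfurt
```

In practice the platform would trigger such migrations itself, based on observed request origins rather than explicit coordinates, but the shape of the decision is the same: keep each user's state one hop away from the user.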
4: Data is injected into the services on an as-needed basis — automatically, efficiently, intelligently, and in a timely fashion.
Data storage management often intrudes deeply into application design and business logic: mapping domain objects to database tables (O/R mapping), managing transactions, making storage API calls to fetch and update state, and so on. Having the platform inject data into the service as new data becomes available makes the code easier to design, write, understand, and maintain.
Additionally, if the service does all data management itself, it is a black box: the platform has no idea what access patterns are used, on what data, or when data can safely be cached, replicated, sharded, prefetched, or shared. If the platform does all the data management, it can look broadly across all services and safely optimize the overall system holistically.
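A sketch of the contrast (hypothetical API, loosely in the spirit of entity-style programming models; none of these names come from a real product): the handler is pure business logic that receives its current state from the platform and returns the new state. Because it never touches a database, the platform stays free to cache, replicate, or shard however it sees fit:

```python
# Hypothetical sketch: the platform owns state management and *injects*
# the current state into the handler; business logic stays storage-free.

def counter_handler(state, command):
    """Pure business logic: state in, (reply, new state) out.
    No DB calls, no O/R mapping, no transaction management."""
    if command == "increment":
        new_state = state + 1
        return new_state, new_state
    if command == "read":
        return state, state
    raise ValueError(f"unknown command: {command}")

class EntityPlatform:
    """Loads state, invokes the handler, persists the result. Because it
    mediates every access, it can observe access patterns and decide
    globally when to cache, replicate, shard, or prefetch."""

    def __init__(self, handler, initial_state):
        self.handler = handler
        self.initial = initial_state
        self.states = {}            # entity id -> state (stand-in for storage)

    def handle(self, entity_id, command):
        state = self.states.get(entity_id, self.initial)   # inject state
        reply, new_state = self.handler(state, command)
        self.states[entity_id] = new_state                 # persist result
        return reply

counters = EntityPlatform(counter_handler, initial_state=0)
print(counters.handle("visits", "increment"))   # 1
print(counters.handle("visits", "increment"))   # 2
print(counters.handle("visits", "read"))        # 2
```

The design choice being illustrated: inverting control over state turns the service from an opaque black box into something the platform can reason about and optimize system-wide.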
Today, “Cloud Computing” and “Edge Computing” are typically treated as separate domains, but rapidly evolving demands for intelligent, low-latency processing of edge-generated data are creating requirements for a new approach that combines cloud and edge into a single continuum.
Cloud-native architects and developers have already embraced the concepts required to build services that run in an essentially infinitely scalable (but logically centralized) cloud infrastructure. Extending this thinking to a fully decentralized infrastructure that includes the edge is a natural next step in enterprise application architecture. Programming models that enable a data-centric, location-transparent approach, combined with intelligent runtimes that automatically co-locate data and processing where and when they are needed, will allow developers to take full advantage of such a wholly distributed edge-to-cloud environment.
About the author
Jonas Bonér is CEO and co-founder of Lightbend, and the creator of the Akka event-driven middleware project. Previously he was a core technical contributor at Terracotta, working on core JVM-level clustering technology, and at BEA, as part of the JRockit JVM team. Jonas has also been an active contributor to open source projects, including the AspectWerkz Aspect-Oriented Programming (AOP) framework and the Eclipse AspectJ project. He is an amateur jazz musician, a passionate skier, and holds a Bachelor of Science from Mid Sweden University.
DISCLAIMER: Guest posts are submitted content. The views expressed in this post are that of the author, and don’t necessarily reflect the views of Edge Industry Review (EdgeIR.com).
data management | DevOps | enterprise architecture | Lightbend | mobile edge | programming model