Graph-Based AI Enters the Enterprise Mainstream

Graph AI is becoming fundamental to anti-fraud, sentiment monitoring, market segmentation, and other applications where complex patterns must be rapidly identified.

Artificial intelligence (AI) is one of the most ambitious, amorphous, and comprehensive visions in the history of automated information systems.

Fundamentally, AI’s core approach is to model intelligence — or represent knowledge — so that it can be executed algorithmically in general-purpose or specialized computing architectures. AI developers typically build applications through an iterative process of constructing and testing knowledge-representation models to optimize them for specific outcomes.


AI’s advances move in broad historical waves of innovation, and we’re on the cusp of yet another. Starting in the late 1950s, the first generation of AI was predominantly anchored in deterministic rules for a limited range of expert systems applications in well-defined solution domains. In the early years of this century, AI’s next generation came to the forefront, grounded in statistical models — especially machine learning (ML) and deep learning (DL) — that infer intelligence from correlations, anomalies, and other patterns in complex data sets.

Graph data is a key pillar of the post-pandemic “new normal”

Building on but not replacing these first two waves, AI’s future focuses on graph modeling. Graphs encode intelligence in the form of models that describe the linked contexts within which intelligent decisions are executed. They can illuminate the shifting relationships among users, nodes, applications, edge devices and other entities.

Graph-shaped data forms the backbone of our “new normal” existence. Graph-shaped business problems encompass any scenario in which one is more concerned with relationships among entities than with the entities in isolation. Graph modeling is best suited to complex relationships that are flattened, federated, and distributed, rather than hierarchically patterned.

Graph AI is becoming fundamental to anti-fraud, influence analysis, sentiment monitoring, market segmentation, engagement optimization, and other applications where complex patterns must be rapidly identified.

We find applications of graph-based AI anywhere there are data sets that are intricately connected and context-sensitive. Common examples include:

  • Mobility data, for which graphs can map the “intelligent edge” of shifting relationships among linked users, devices, apps, and distributed resources;
  • Social network data, for which graphs can illuminate connections among people, groups, and other shared content and resources;
  • Customer transaction data, for which graphs can show interactions between customers and items for the purpose of recommending products of interest, as well as detect shifting influence patterns among families, friends, and other affinity groups;
  • Network and system log data, for which connections between source and destination IP addresses are best visualized and processed as graph structures, making this technology very useful for anti-fraud, intrusion detection, and other cybersecurity applications;
  • Enterprise content management data, for which semantic graphs and associated metadata can capture and manage knowledge among distributed virtual teams;
  • Scientific data, for which graphs can represent the physical laws, molecular structures, biochemical interactions, metallurgic properties, and other patterns to be used in engineering intelligent and adaptive robotics;
  • The Internet of Things (IoT), for which graphs can describe how the “things” themselves — such as sensor-equipped endpoints for consumer, industrial, and other uses — are configured in nonhierarchical grids of incredible complexity.
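To make the common thread concrete, here is a minimal sketch of graph-shaped data in plain Python: entities become nodes, relationships become labeled edges, and relationship-first questions become simple traversals. The node and edge names are invented for illustration.

```python
from collections import defaultdict

# Entities as nodes, relationships as labeled edges (all names illustrative).
edges = [
    ("alice", "follows", "bob"),
    ("bob", "follows", "carol"),
    ("alice", "purchased", "laptop"),
    ("carol", "purchased", "laptop"),
]

# Build an adjacency structure keyed by source node.
adjacency = defaultdict(list)
for src, rel, dst in edges:
    adjacency[src].append((rel, dst))

def neighbors(node, relation=None):
    """Return nodes reachable from `node`, optionally filtered by edge label."""
    return [dst for rel, dst in adjacency[node] if relation is None or rel == relation]

# Relationship-first queries are one-liners over the adjacency structure:
print(neighbors("alice"))                      # every entity alice is linked to
print(neighbors("alice", relation="follows"))  # only the social edges
```

Real graph databases layer indexing, query languages, and persistence over exactly this kind of structure.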

Graph AI is coming fast to enterprise data analytics


Graphs enable great expressiveness in modeling, but also entail considerable computational complexity and resource consumption. We’re seeing more enterprise data analytics environments that are designed and optimized to support extreme-scale graph analysis.

Graph databases are a key pillar of this new order. They provide APIs, languages, and other tools that facilitate the modeling, querying, and writing of graph-based data relationships. They have been moving into enterprise cloud architectures over the past two to three years, especially since AWS launched Neptune and Microsoft launched Azure Cosmos DB, each of which brought graph-based data analytics to its cloud customer base.

Riding on the adoption of graph databases, graph neural networks (GNNs) are an emerging approach that applies statistical algorithms to graph-shaped data sets. GNNs are not entirely new from an R&D standpoint, however: research in this area has been ongoing since the early 1990s, focused on foundational data science applications in natural language processing and other fields with complex, recursive, branching data structures.

GNNs are not to be confused with computational graphs, the directed graphs of tensor operations from which ML/DL models are built. In a fascinating trend in which AI helps build AI, ML/DL tools such as neural architecture search and reinforcement learning are increasingly used to optimize computational graphs for deployment on edge devices and other target platforms. Indeed, it's probably only a matter of time before GNNs are themselves used to optimize other GNNs' structures, weights, and hyperparameters, driving more accurate, speedy, and efficient inferencing over graph data.
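The distinction is easier to see in code. Below is a toy computational graph in Python: operations are nodes, data flows along edges, and evaluation walks the structure. It is a bare-bones illustration of the concept, not how TensorFlow or PyTorch implement it.

```python
# Minimal computational-graph sketch: each node is an operation whose inputs
# are other nodes, and evaluation recursively walks the graph.
class Node:
    def __init__(self, op, *inputs, value=None):
        self.op, self.inputs, self.value = op, inputs, value

    def evaluate(self):
        if self.op == "const":
            return self.value
        args = [n.evaluate() for n in self.inputs]
        if self.op == "add":
            return sum(args)
        if self.op == "mul":
            out = 1.0
            for a in args:
                out *= a
            return out
        raise ValueError(f"unknown op {self.op}")

# y = (x * w) + b, expressed as a graph rather than as straight-line code
x, w, b = Node("const", value=2.0), Node("const", value=3.0), Node("const", value=1.0)
y = Node("add", Node("mul", x, w), b)
print(y.evaluate())  # 7.0
```

Frameworks build far richer versions of this structure, and it is these operation graphs, not the GNN's input data, that tools like neural architecture search optimize.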

In the new cloud-to-edge world, AI platforms will increasingly be engineered for GNN workloads that are massively parallel, distributed, in-memory, and real-time. Already, GNNs are driving some powerful commercial applications.

For example, Alibaba has deployed GNNs to automate product recommendations and personalized searches in its e-commerce platform. Apple, Amazon, Twitter, and other tech firms apply ML/DL to knowledge graph data for question answering and semantic search. Google’s PageRank models facilitate contextual relevance searches across collections of linked webpages that are modeled as graphs. And Google’s DeepMind unit is using GNNs to enable computer vision applications to predict what will happen over an extended time given a few frames of a video scene, without needing to code the laws of physics.

A key recent milestone in the mainstreaming of GNNs was AWS' December 2020 release of Neptune ML. This cloud service automates the modeling, training, and deployment of artificial neural networks on graph-shaped data sets. It automatically selects and trains the best ML model for the workload, enabling developers to expedite ML-based predictions on graph data. Because it spares developers from needing ML expertise, Neptune ML makes it easy to develop inferencing models for classifying and predicting nodes and links in graph-shaped data.

Neptune ML is designed to accelerate GNN workloads while achieving high predictive accuracy, even when processing graph data sets with billions of relationships. It uses the Deep Graph Library (DGL), an open-source Python library for fast modeling, training, and evaluation of GNNs on graph-shaped data sets. First released on GitHub in December 2018, DGL was brought into the AWS fold in December 2019 in conjunction with the SageMaker data-science pipeline cloud platform.
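To give a feel for what a GNN layer actually computes, here is a NumPy sketch of one round of message passing, the core operation underlying libraries like DGL. This illustrates the idea only; it is not the DGL or Neptune ML API, and the graph and weights are random.

```python
import numpy as np

# One round of message passing: each node averages its neighbors' feature
# vectors, then mixes the result through a (here random) weight matrix.
rng = np.random.default_rng(0)

A = np.array([[0, 1, 1, 0],      # adjacency matrix of a 4-node graph
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = rng.standard_normal((4, 8))  # an 8-dim feature vector per node
W = rng.standard_normal((8, 4))  # learned projection (random for the sketch)

deg = A.sum(axis=1, keepdims=True)          # per-node neighbor counts
H = np.maximum((A / deg) @ X @ W, 0.0)      # mean-aggregate, project, ReLU

print(H.shape)  # one new 4-dim embedding per node
```

Stacking several such layers lets information propagate across multiple hops, which is what makes GNN inference over billion-edge graphs so computationally demanding.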

When using Neptune ML, AWS customers pay only for cloud resources used, such as the Amazon SageMaker data science platform, Amazon Neptune graph database, Amazon CloudWatch application and infrastructure monitoring tool, and Amazon S3 cloud storage service.

Graph AI will demand an increasing share of cloud computing resources

Graph analysis is still outside the core scope of traditional analytic databases and even beyond the ability of many Hadoop and NoSQL databases. Graph databases are a young but potentially huge segment of enterprise big data analytics architectures.

However, that doesn’t mean you have to acquire a new database in order to do graph analysis. You can, to varying degrees, execute graph models on a wide range of existing enterprise databases. That’s an important reason why enterprises can begin to play with GNNs now without having to shift right away to an all-new cloud computing or database architecture. Or they can trial AWS’ Neptune ML and other GNN solutions that we expect other cloud computing powerhouses to roll out this year.

If you’re a developer of traditional ML/DL, GNNs can be an exciting but challenging new approach to work in. Fortunately, ongoing advances in network architectures, parallel computation, and optimization techniques, as evidenced by AWS’ evolution of its Neptune offerings, are bringing GNNs more fully into the enterprise cloud AI mainstream.

Over the coming two to three years, GNNs will become a standard feature of most enterprise AI frameworks and DevOps pipelines. Bear in mind, though, that as graph-based AI is adopted by enterprises everywhere for their most challenging initiatives, it will prove to be a resource hog par excellence.

GNNs already operate at a massive scale. Depending on the amount of data, the complexity of models, and the range of applications, GNNs can easily become huge consumers of processing, storage, I/O bandwidth, and other big-data platform resources. If you’re driving the results of graph processing into real-time applications, such as anti-fraud, you’ll need an end-to-end low-latency graph database.

GNN sizes are sure to grow by leaps and bounds. That’s because enterprise graph AI initiatives will undoubtedly become increasingly complex, the range of graph data sources will continually expand, workloads will jump by orders of magnitude, and low-latency requirements will become more stringent.

If you’re serious about evolving your enterprise AI into the age of graphs, you’re going to need to scale your cloud computing environment on every front. Before long, it will be common for GNNs to execute graphs consisting of trillions of nodes and edges. All-in-memory, massively parallel graph-database architectures will be de rigueur for graph AI applications. Cloud database architectures will evolve to enable faster, more efficient discovery, processing, querying, and analysis of an ever-widening range of graph data types and formats.

Questions to Ask About DevOps Strategy On-Prem vs. the Cloud

Not every company can or wants to go cloud native, but that does not mean they are completely cut off from the advantages of DevOps.

Organizations with predominantly on-prem IT and data environments might feel left out of conversations about DevOps because such talk leans heavily on the cloud — but are there ways for such companies to benefit from this strategic approach to faster app development? Experts from the DevOps Institute and Perforce Software offer some insight on how DevOps might be approached in such ecosystems, including what happens if they later pursue cloud migration.

In many ways, the DevOps approach to app development is linked to digital transformation. “DevOps was really born out of migration to the cloud,” says Jayne Groll, CEO of the DevOps Institute. She says there was an early belief among some that DevOps could not be done on-prem because it drew upon having an elastic infrastructure, cloud native tools, and containers. “DevOps was intended always to be about faster flow of software from developers into production,” Groll says.


While being in the cloud facilitates the automation DevOps calls for, she says there are still ways for hybrid cloud environments to support DevOps, including with continuous delivery. Groll says it is also possible to manage DevOps-related apps in on-prem situations.

Even for companies that sought to remain on-prem, she says the pandemic drove many to change their structure and operations. That may have influenced their approach to DevOps as well. “Some organizations that had to pivot very quickly to work remotely still had on-prem datacenters,” Groll says. “Last year pushed us into the future faster than we were prepared for.”

Though newer cloud-native companies might wonder why anyone would pursue DevOps on-prem, such a development strategy can make sense for some organizations, says Johan Karlsson, senior consultant with Perforce Software, a developer of app development tools.

Legacy organizations, he says, might still be slow to migrate to the cloud for a variety of reasons, even if their IT represents a cost center. “For them to move to the cloud it’s a cultural journey,” Karlsson says. A company might need to do a bit of soul searching before embracing DevOps in the cloud if the organization is very used to an in-house strategy. “On-prem is often associated with a slow IT department, filing requests, and getting access to your hardware,” he says. “All of that is a tedious process with a lot of internal approval steps.”

IT bureaucracy aside, there may be performance needs that drive organizations to stick with on-prem for DevOps and other needs, Karlsson says. “Certain computers may want to be close to certain other computers to get things done fast,” he says. “Putting things on a machine somewhere else may not immediately give you the performance you are looking for.”

That may have been the case pre-pandemic, Karlsson says, for offices that were not near a datacenter. Keeping resources on-prem also retains full control over the IT environment, he says. There also may be regulatory pressures, such as working with government entities, that preclude operating in the cloud, Karlsson says, where data can cross national borders. Data privacy regulations might also restrict what information can be moved to the cloud or be seen on networks. “In medical device development, there’s a reason why they need to be 100% sure where the data is at all times and that’s the reason why they stay on-prem,” he says.

Organizations outside the US may have additional considerations, Karlsson says, about how their DevOps strategy might play out in the cloud, given that the major cloud providers are American companies with US- and Western European-centric services. Users in Eastern Europe, Australia, Asia, and other locales might feel disconnected, he says. “If you’re developing products that are typically distributed around the world or you’re located outside of Silicon Valley, there may be a strong need to have things closer to you because it’s too far away and doesn’t comply with local regulations,” Karlsson says.

Overcoming Digital Transformation Challenges With The Cloud

Here’s how cloud computing can enable the future of work, accelerate data strategies, integrate AI and cyber strategies, and innovate for social good.

Digital transformation remains a top priority for this year, with remote work, digital transactions, customer interactions, and business collaboration all requiring flexible, personalized solutions. Cloud is the essential orchestrator and backbone behind everything digital.     

As cloud providers see continued growth and cloud migration projects abound, we should stop to ask: why?


Yes, the cloud can scale infrastructure, applications, and data services at speed as demand shifts. We don’t think about cloud powering our digital apps, teleconferencing, collaboration tools, remote work and education, video streaming, telehealth, and more (until there’s an outage) — but it does.

As organizations innovate their business strategies and implement their 2021 technology roadmaps, a cohesive cloud strategy can address four major challenges facing organizations — digital work, data modernization, integrated business solutions (e.g., AI), and social impact.   

1. Cloud and the future of work

During 2020, organizations embraced a work-from-anywhere (WFA) model that accelerated the use of cloud collaboration tools, video conferencing, cloud applications, and cloud infrastructure to support work. In January 2020, only 3% of full-time employees worked remotely; by April 2020 that number had increased to 64%, according to SHRM, and 81% of the global labor force was impacted by May 2020, according to the International Labour Organization. In response, one cloud collaboration platform vendor reported it almost quadrupled its daily active users over that period. As organizations and the workforce embrace the digital workplace, the future of work will be powered by the cloud.

Additionally, organizations have realized that on-premises data centers require physical access to the “workplace,” a challenge that will persist in 2021 and one that cloud data centers can solve. Those with essential workers have embraced innovative cloud-native digital solutions to protect workers and advance the HR technology strategy.

2. Cloud data platforms and ecosystems

The cloud can enhance information sharing and collaboration across data platforms and digital ecosystems. Deloitte research shows 84% of physicians expect secure, efficient sharing of patient data integrated into care in the next five to 10 years. Real-world evidence will be critically important in enhancing digital healthcare with historical patient data, real-time diagnostics, and personalized care. Organizations can leverage the cloud for greater collaboration, data standardization, and interoperability across their ecosystem. Research shows digital business ecosystems using cloud experience greater customer satisfaction, with 96% of organizations surveyed saying their brand is perceived better and reporting improved revenue growth — with leaders reporting 6.7% average annual revenue growth (vs. 4.9% reported by others).


3. Cloud for integrated business applications

New Cloud ML approaches for developers and data scientists have become available. These include Cloud AI platforms where organizations bring their existing AI models into the cloud; Cloud ML services where organizations can tap into pretrained models, frameworks, and general-purpose algorithms; and AutoML services to augment their AI teams. In the retail sector, organizations have embraced cloud ML to create digital businesses and predict shifting customer demands. Financial services organizations have used the cloud to modernize legacy lending applications for small businesses during the crisis. And, in the technology, media, and telecommunications sector, the cloud is powering your favorite video streaming service.

As organizations rely on the cloud, cloud security becomes increasingly important for data integrity and workload and network security. Information leakage, cloud misconfiguration, and supply chain risk are the top concerns for organizations. A federated security model, zero trust approach, and robust cloud security controls can help to remediate these risks, increase business agility, and improve trust.

4. Cloud innovation and social impact

Finally, organizations are innovating new “intelligent edge” computing architectures by combining cloud, edge, AI, AR/VR and digital twin technologies that tap into the potential of the spatial web. The innovation and social impact potential are tremendous. Smart buildings have the potential to better report on energy consumption across a smart grid network with the cloud. Farms can benefit from precision agriculture solutions. The potential to use cloud to innovate for business and social impact is a rapidly maturing opportunity for the social enterprise.

Online data migration made simple with these 3 tools from AWS, Azure, and GCP

As data – and our reliance on it – grows, the way we store it becomes equally vital. Which is why more and more organizations are turning to the cloud.

Moving data to the cloud reduces your infrastructure, maintenance, and operations costs, and frees up valuable resources by turning capital expenses (capex) into operating expenses (opex).

But like many companies looking to migrate data to the cloud, you may still have lingering questions. Namely, how can you expect to move large quantities of data quickly, efficiently, and with as little disruption as possible?

To this end, the three major cloud service providers (CSPs) – Amazon Web Services (AWS), Azure, and Google Cloud Platform (GCP) – have new tools for online data migration to simplify sending your on-premises data to the cloud.

In this post, I’ll examine how these tools simplify and speed up the data transfer process, as well as take a closer look at each CSP’s respective tool.

Enhancing the data migration process through parallelizing writes

Older online methods of data migration, like secure file transfer protocol (SFTP), use only a single thread to transfer data. While functional, this doesn’t allow for top-of-the-line throughput, limiting the speed at which data moves to the cloud.

The newer tools, on the other hand, take advantage of parallelizing writes, or multi-thread writes.

Think of this like a highway: If you’re moving data through a single lane, you can only go so fast. By adding additional lanes, or parallel writes, you improve the write performance and, as such, decrease the time it takes to transfer data.
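The highway analogy can be sketched in a few lines of Python: the same set of “chunks” moves noticeably faster through a pool of worker threads than through a single lane. The sleep call stands in for network I/O; real transfer tools parallelize actual socket writes.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def transfer_chunk(chunk_id):
    time.sleep(0.05)  # simulated per-chunk network latency
    return chunk_id

chunks = list(range(16))

start = time.perf_counter()
for c in chunks:                 # single lane: one chunk at a time
    transfer_chunk(c)
serial = time.perf_counter() - start

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=8) as pool:   # eight lanes
    results = list(pool.map(transfer_chunk, chunks))
parallel = time.perf_counter() - start

print(f"serial: {serial:.2f}s, parallel: {parallel:.2f}s")
```

Because each chunk mostly waits on I/O rather than the CPU, adding workers cuts the wall-clock time roughly in proportion to the number of lanes, which is exactly the effect the newer migration tools exploit.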

The chart below provides an estimate on speeds and transfer times when working with online data replication.

3 new tools from AWS, Azure, and GCP

The tool you use will depend on which CSP you’ve chosen, and while there are differences, each accelerates data transfer.

  1. AWS DataSync. This data transfer service simplifies moving data between on-premises storage and AWS. Key features include:
    • Parallelism and multi-threading, which can increase data transfer performance by up to 10x
    • An on-premises component that’s simple to deploy and easy to manage
    • Encryption of all transferred data
  2. Azure AzCopy. AzCopy is a command-line tool used to copy or sync files to Azure storage; version 10 is the most recent. Key features include:
    • Optimized for multi-threading and parallelism, increasing throughput when replicating data between on-premises storage and Azure
    • Version 10 is supported on Windows, Linux, and macOS
    • Scripts can run on a schedule, replicating data to defined Azure storage targets
  3. GCP Cloud Storage. GCP provides the gsutil command-line utility to replicate or synchronize an on-premises volume to Google Cloud Storage. Key features include:
    • With an existing bucket created, gsutil is downloaded and configured to run once or on a schedule via a script
    • gsutil’s rsync command keeps the data replicated to Cloud Storage in sync with the source volume (add the -d option for an exact mirror)
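As a rough illustration of how such a transfer might be scripted, the sketch below assembles sync command lines for two of the tools above. The bucket, container URL, and directory names are placeholders, and you would verify flags against each tool’s current documentation before running anything.

```python
import shlex

# Build (not execute) replication commands. `gsutil -m rsync -r` is the
# standard parallel sync to a GCS bucket; `azcopy sync --recursive` is the
# AzCopy v10 equivalent for Azure storage. All paths below are placeholders.
def gsutil_sync(local_dir, bucket):
    return f"gsutil -m rsync -r {shlex.quote(local_dir)} gs://{bucket}"

def azcopy_sync(local_dir, container_url):
    return f"azcopy sync {shlex.quote(local_dir)} {shlex.quote(container_url)} --recursive"

cmd = gsutil_sync("/data/exports", "my-backup-bucket")
print(cmd)  # run via subprocess.run(shlex.split(cmd)) once credentials are configured
```

Wrapping the command construction in a script like this makes it easy to schedule recurring replication with cron or Task Scheduler.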

These three online tools are available at no additional charge from each provider. But be sure to check for additional fees for ingress and egress before transferring large data sets in either direction.

The tools are different, but the goal is the same

Older methods of migrating data to the cloud, including copying via the console or using a third-party product, still exist, but with these new tools, CSPs are looking to reduce the operational overhead of migrating data.

In the end, your on-premises data can be sent to the cloud faster, more efficiently, and without impacting the applications or data you’re creating on premises.

How this state health agency regained control of its cloud footprint with AWS Landing Zone

A state health agency, which purchases health care for more than 2 million people, found itself with a visibility problem – it didn’t have any.

The health agency had zero visibility into its Amazon Web Services (AWS) billing or its overall AWS footprint, leaving it frustrated with its AWS reseller.

With an annual $500,000 cloud spend and optimization budget, the health agency knew that without properly understanding how its billing correlated with deployment, it would struggle to plan for future engagements within AWS, including a data center migration it had planned.

SHI learned about the health agency’s struggles and reached out to offer our professional services.

The agency switched its managed billing over to SHI and got its management of cloud costs under control. But this was just the beginning of the engagement.

Impressed with SHI’s cloud capabilities, the organization presented SHI with a new challenge: migrate its existing on-premises data center to AWS.

Assessing the situation

The health agency wanted to move its current AWS workload and a separate standalone environment – consisting of its encryption-protected health care applications – to AWS Landing Zone. It wanted to add security monitoring and hardening.

But most importantly, the health agency wanted to reduce the number of hours required every time it wanted to spin up a development environment into the cloud.

SHI performed an AWS migration assessment and discovered the health agency had numerous siloed AWS accounts to consolidate. The environment was not built to AWS best practices. The health agency didn’t have control of its accounts and couldn’t pursue DevOps with its current environment.

This was not going to be a simple migration. But that didn’t mean it couldn’t be done.

Devising a plan, migrating to AWS Landing Zone, and incorporating AWS Direct Connect

SHI built out the customer’s new accounts using AWS Well-Architected Review and AWS Landing Zone.

Before migrating to AWS Landing Zone, however, SHI crafted a new organizational structure for the health agency by setting up service control policies (SCPs) with AWS Organizations on each of its AWS accounts. This would give the health agency central governance and management for its multiple accounts and would allow it to expand its AWS footprint.
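For readers unfamiliar with SCPs, the sketch below shows the general shape of one, expressed as a Python dict. The statement content is illustrative, not the agency’s actual policy; the boto3 call in the comment is the standard way such a policy would be created in AWS Organizations.

```python
import json

# A service control policy (SCP) uses IAM policy syntax. This example denies
# member accounts the ability to leave the organization, a common guardrail;
# the agency's real policies are not known, so treat this as a template.
scp = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyLeavingOrg",
            "Effect": "Deny",
            "Action": "organizations:LeaveOrganization",
            "Resource": "*",
        }
    ],
}

policy_doc = json.dumps(scp, indent=2)
print(policy_doc)
# Attach with boto3 once credentials are configured:
#   boto3.client("organizations").create_policy(
#       Content=policy_doc, Name="deny-leave-org",
#       Description="Guardrail SCP", Type="SERVICE_CONTROL_POLICY")
```

Because SCPs apply at the organization level, one guardrail like this covers every member account at once, which is what gives the central governance described above.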

SHI also created a service catalog that lets developers request an environment that fits predefined parameters. Development request times shrank from days to minutes. Given the health agency’s small cloud team, this would be invaluable moving forward.

This wasn’t a simple migration, as all the workloads were encrypted: SHI took snapshots of the workloads, decrypted them, and re-encrypted them in the new AWS Landing Zone. SHI documented this process, showed the health agency how to do it, and helped it perform migrations of its own. SHI also attended bi-weekly state networking meetings to help the health agency troubleshoot its network architecture.

The final piece of the puzzle was setting up AWS Direct Connect. The health agency had been using a VPN to connect to its environment in AWS. While stable, the VPN couldn’t offer guaranteed high performance and bandwidth. AWS Direct Connect could, which is why it had long been on the agency’s roadmap.

The health agency didn’t know how to implement AWS Direct Connect with failover to the VPN, so SHI handled that as well. Now the state health agency has two redundant ways to connect to its AWS environment.

Gaining more control over AWS

This state health agency had a laundry list of needs. It wanted to be able to migrate workloads to AWS Landing Zone. But it also wanted to gain legitimate visibility into its AWS billing and footprint.

8 tips to enable a cloud-based remote workforce

Only a third of people in the United States worked remotely prior to March 4, 2020, according to a Workhuman survey. A mere three weeks later, the reality is starkly different.

Thousands of employees nationwide are being told by their employers or their government officials to stay and work from home with no clear answers on when the restriction will lift.

With so much uncertainty, one thing is crystal clear: Your organization needs a business continuity plan in place to minimize disruptions and enable employees to do their jobs from home.

Cloud providers like Amazon and Microsoft offer platforms for employees to securely work remotely with cloud virtual desktops and remote app access. And over the last week, we’ve seen a sharp uptick in customer requests to scale out the number of virtual desktops in the cloud, rapidly interconnect public clouds to customers’ on-premises networks, and preconfigure remote access to Microsoft Azure services, like Windows Virtual Desktop (WVD) and Remote Desktop Services (RDS), or Amazon WorkSpaces and AppStream, with core business applications.

If you find yourself suddenly playing catch up, now tasked with enabling access to core applications and data in the cloud, consider these tips:

  1. Confirm that your staff can reach your cloud-based hosted desktops, applications, and services directly using HTTPS or SSL without having to go through the company network. Eliminating any single point of failure reduces potential downtime in the most critical of situations.
  2. Implement two-factor authentication using smart cards, security keys, or mobile devices. This will add additional layers of security for individuals truly authorized to access your data.
  3. Ensure you have enough bandwidth coming into your company to handle any increased remote traffic. Then consider doubling it. The last thing you need is a poor – or no – user experience, which eventually equates to productivity loss.
  4. If leveraging Azure, consider implementing technology like Azure File Sync to replicate on-premises data to the cloud, keeping it accessible and modifiable while maintaining replication and limiting reliance on a virtual private network (VPN) connection.
  5. Validate you have backups, redundancy, and scale-up capability designed into your services so your employees can keep working when the extra traffic makes your primary services slow down.
  6. If using user-initiated VPN, ensure all remote employees can access it and that you have enough licenses for everyone working remotely.
  7. Enable the necessary logging and diagnostics tools so you can quickly mitigate and troubleshoot the user experience or connectivity issues.
  8. Clearly document all technology protocols and instructions and include visuals. For many users, this may be their first time working remotely and using extra layers of authentication. Clear documentation will help lower the learning curve and eliminate calls to the Help Desk.

We know how important the right technology is for your organization to continue to operate, serve customers, and support employees. And now, more than ever, we know time is of the essence.

Whether you already had a business continuity plan in place, or you’re rushing to stand up a solution now, these tips will help ensure you’re on the right path.

What is data fabric, and why should you care?

Data is growing at an exponential rate. Given the increasing number of handheld and IoT devices, it’s getting hard to ignore. And this is only the beginning.

It’s believed that by 2020 we will have some 50-60 zettabytes (ZB) of data. This doesn’t just include documents, pictures, and videos. We’re talking about the data that companies not only need, but should be using to meet their overall business goals.

Unfortunately, just because all this data exists, that doesn’t mean organizations have the means to collect, organize, and maximize all it has to offer. Considering the time to market with products, services, and opportunities has shrunk to the point where decisions and reactions need to be instantaneous, companies need a way to tap into all that data has to offer.

So what’s the solution to this problem? Data fabric. Allow us to explain.

What is data fabric?

Simply put, a data fabric is an integration platform that connects all data together with data management services, enabling companies to access and utilize data from multiple sources. With data fabric, companies can design, collaborate, transform, and manage data regardless of where it resides or is generated. According to NetApp, by simplifying and integrating data management across on-premises and cloud environments, data fabric allows companies to speed up digital transformation.

Why do we need data fabric?

Three things come to mind: speed, agility, and data unification.

For instance, say you need a new shirt. The first thing you do is search the web for one. Within seconds, ads for shirts appear on every website you visit. That targeting is data fabric at work: analyzing, organizing, and transferring data to fit the needs of the company.

Data now comes from almost anywhere in the world, and in many different forms. Database data, for example, is now distributed across the globe, much like the electronic libraries used for research and learning.

As a result, leaders in software and hardware technologies are working to figure out how to share data among different platforms without conversion or migration. Whether they’re dealing with on-premises, cloud, or hybrid environments, thanks to data fabric, companies are better equipped to unify, blend, integrate, transfer, and manage the siloed variety of data.
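The core idea — one access layer over many stores, without conversion or migration — can be sketched in a few lines. This is a minimal illustration, not any vendor's implementation; the source classes and sample records are invented, and a real data fabric adds cataloging, governance, and transformation services on top.

```python
from abc import ABC, abstractmethod


class DataSource(ABC):
    """Uniform interface over heterogeneous stores (on-prem, cloud, etc.)."""

    @abstractmethod
    def query(self, key: str):
        ...


class OnPremSQL(DataSource):
    def __init__(self, rows):
        self._rows = rows  # stand-in for an on-premises SQL table

    def query(self, key):
        return self._rows.get(key)


class CloudObjectStore(DataSource):
    def __init__(self, objects):
        self._objects = objects  # stand-in for a cloud object store

    def query(self, key):
        return self._objects.get(key)


class DataFabric:
    """Routes queries across registered sources without moving the data."""

    def __init__(self):
        self._sources = {}

    def register(self, name, source):
        self._sources[name] = source

    def query(self, key):
        # Return the first source that can answer, tagged with its origin.
        for name, source in self._sources.items():
            value = source.query(key)
            if value is not None:
                return {"source": name, "value": value}
        return None


fabric = DataFabric()
fabric.register("erp", OnPremSQL({"cust-42": {"name": "Acme"}}))
fabric.register("lake", CloudObjectStore({"clickstream-7": {"events": 311}}))

print(fabric.query("cust-42"))        # served from on-prem SQL
print(fabric.query("clickstream-7"))  # served from the cloud store
```

The point of the sketch is that callers never learn, or care, where a record physically lives — the fabric answers from whichever registered environment holds it.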

What does the future of data fabric look like?

Technology systems are becoming more complex and businesses are facing greater challenges than ever before. In addition, future technology promises to be even more complex, with IoT, mobility, digital workspace, and virtualization generating a variety of new data.

As a result, developers are being asked to build applications that gather data across diverse silo systems, including on-premises, cloud, multi-cloud, SQL, NoSQL, or HDFS repositories. Often that results in slow, ineffective, and complex solutions. Unfortunately, these solutions clash with the business challenges companies face today, many of which require applications with accelerated timelines.

Data fabric enables companies to face these challenges head on, offering better connectors for data unification, integration, and analytical insight. We expect the demand for data fabric to keep growing as companies look to stay on top of emerging technologies and services, new trends, and the continued deployment of applications to meet any and all of their business needs.

Automating data governance: 5 immediate benefits of modern data mapping

Data privacy legislation like GDPR and CCPA has caused a certain amount of upheaval among businesses.

When GDPR went into effect, many organizations weren’t in compliance, and some still weren’t sure where to start. With the advent of CCPA in California, and ongoing talks at the FTC about national consumer data privacy rules for the U.S., the issue of data governance will only grow more visible and more critical.

Unfortunately, many businesses still aren’t ready. Many have no documentation on how data moves through their organization, how it’s modified, or where it’s stored. Some attempt documentation using spreadsheets that lack version control and are at the mercy of human error.

But the implications of data governance go well beyond compliance. Here are five reasons why it’s time to professionalize your documentation and map your data using automation.

1. Compliance

Data privacy laws are arriving whether businesses like it or not. GDPR famously caught businesses by surprise, despite a two-year heads-up that it would be taking effect. Less than a month before GDPR was set to take effect in May 2018, Gartner predicted that more than half of companies affected by the law would fail to reach compliance by the end of the year. More than 18 months later, at the end of 2019, new data suggested 58% of GDPR-relevant companies still couldn’t address data requests in the designated time frame.

To achieve compliance with GDPR and CCPA, you essentially need to know how data comes into your company, where it goes, and how it’s transformed along the way. Organizations struggle to do so because they haven’t properly mapped data’s path through their environment.

Spreadsheets are no solution. To prove compliance, you need accurate, current, and centralized documentation mapping your data. Automated tools speed the process and make compliance far more reliable.
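What automated mapping produces, at its simplest, is a lineage graph you can query to answer the "where does this data go?" question behind GDPR and CCPA requests. The sketch below is illustrative only — the table names, edges, and transform labels are invented, and real tools extract these flows from ETL code automatically rather than taking them as manual input.

```python
from collections import defaultdict

# Lineage graph: edges point from a dataset/field to where it flows next.
lineage = defaultdict(list)


def record_flow(source, target, transform="copy"):
    """Record that `source` feeds `target` via `transform` (what an automated
    scanner would extract from ETL code instead of a spreadsheet)."""
    lineage[source].append((target, transform))


def trace(start):
    """Return every downstream path from `start` — i.e., everywhere a piece
    of personal data ends up, and how it was transformed along the way."""
    paths, stack = [], [(start, [start])]
    while stack:
        node, path = stack.pop()
        nexts = lineage.get(node, [])
        if not nexts:
            paths.append(path)  # reached a terminal destination
        for target, transform in nexts:
            stack.append((target, path + [f"--{transform}--> {target}"]))
    return paths


# Hypothetical pipeline: a CRM email flows into a warehouse, then marketing.
record_flow("crm.customers.email", "warehouse.dim_customer.email", "load")
record_flow("warehouse.dim_customer.email", "marketing.mailing_list.address", "join")

for path in trace("crm.customers.email"):
    print(" ".join(path))
```

Because every hop and transformation is recorded centrally rather than buried in ETL code or a spreadsheet, a data-subject request becomes a graph traversal instead of an archaeology project.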

2. Saving Time And Resources

Data scientists bring a lot of value to an organization, but given their specialized and in-demand skills, the average base salary for a data scientist ranges from $113,000 to $123,000. More experienced data scientists command even more.

Unfortunately, at many organizations, data scientists spend 30-40% of their time doing data preparation and grooming, figuring out where data elements came from, and other basic tasks that could be automated.

When data scientists spend so much time on basic tasks, the organization isn’t just losing the time and cost it takes to do that work, it’s losing the opportunities that could be uncovered if the data scientists spent more of their time on data modeling, actionable insights, and predictive analysis.

3. Accelerating Modernization Efforts

Many companies are looking to transition their data from in-house data centers to more cost-efficient cloud databases, where they only pay for the compute power they use. It’s an opportunity to realize cost savings and modernize their environment.

But it can be a strenuous process if you don’t know what’s flowing into on-premises hardware, because you won’t be able to point the same pathways at the cloud. Documenting those pathways manually is error-prone and time-consuming at best. Automated tools that map the data pathways for you can accelerate the transition.

4. Improving Communication And Collaboration Between Business And IT

Business and IT often live in different worlds. This is one of the reasons why so few companies have successfully mapped their data. The ETL coders live in the world of data, so when asked about mapping, they just say it’s in the code. They’ve been trained to prioritize productivity in their code, not to focus on documenting, and that different mindset can create a disconnect with business users.

Business users can see IT as uncooperative. IT can see business users as demanding. But with the right tools, you can take this area of contention and turn it into business value.

With the right tools, you can generate glossaries relating business terms to technical metadata and find every term related to GDPR, for example, offering business users the insight they need. IT can do what it has been trained to do and business users can access data lineage and even a whole architectural overview. The end result is better communication and collaboration between the two.

5. Making Confident And Precise Decisions

If you haven’t mapped your data, if you don’t know where the information came from, how it was modified, and so on, that creates major problems for any actionable insights you’ve extracted from the data. If you don’t know where the data originated, how can you trust that it’s accurate and a solid basis for decision-making?

Without that certainty, you might have decisions that are directionally correct, but lack precision. With detail, transformation logic, and comprehensive documentation, you can trust the data is high quality and accurate, and any decisions based on that data are more targeted and precise.

Automating Data Mapping And Governance

Especially now that major consumer privacy laws are active and others are taking shape, it’s essential to know the entire path of data through your organization and to map and understand that path.

But manual processes, spreadsheets, and no process at all won’t cut it anymore. These are dated strategies from before the explosion in data collection, and before user-friendly automated tools were widely available.

Asking ETL coders to document what they’re coding is often a battle you’ll lose. But ultimately, you don’t need to ask. Automated tools can map the path of data and provide full documentation of your data lineage and impact analysis that’s scalable, always up to date, and not vulnerable to human error.

Data governance is more important than ever. It’s time to invest in the tools to prioritize it.

Gain control at the edge: 3 strategies for overcoming edge computing challenges

More than 75% of enterprise data will be created and processed outside the data center or cloud by 2025, according to Gartner. To keep pace with this movement, you’ll need to place compute power closer to where data is generated – whether on premises, in the cloud, or at the edge – and ensure 100% availability.

This requires complete visibility into every aspect of your distributed environment – a requirement complicated by decreased headcounts, lack of dedicated IT spaces, unstaffed sites, and budget restrictions.

But unified management across sites isn’t out of reach. Here are three strategies for overcoming edge computing challenges.

1. Streamlined, Standardized Technologies

Edge computing environments are often found in remote, real-world locations – think manufacturing plants, hospital operating rooms, or school systems – where there is little-to-no IT support and varying levels of power and connectivity.

Standardized edge computing components can help in these areas.

With pre-integrated, pre-validated, and pre-tested converged and hyperconverged infrastructures in these environments, you can deploy devices quickly or easily swap them out on the fly. Furthermore, these components have smaller form factors that allow you to maximize floor space for the core business.

2. Intelligent Monitoring And Management Tools

Having watchful eyes to manage your environment when you’re not there is vital to overcoming edge challenges. This is where intelligent monitoring and management tools come into play.

Today’s intelligent software solutions leverage cloud, IoT, machine learning, and artificial intelligence (AI), enabling you to remotely monitor thousands of connected devices. From there, you can centralize that data, prioritize alerts, and aid in remote troubleshooting.

With these tools keeping an eye over your environment, you can gain remote visibility and proactive control over all edge computing system assets, regardless of vendor.
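The alert prioritization these tools perform can be sketched simply: score each incoming device alert by severity and by how critical its site is, then surface the riskiest items first. The severity weights, criticality scale, and sample alerts below are all assumptions for illustration, not any product's actual scoring model.

```python
from dataclasses import dataclass

# Assumed severity scale; real tools expose configurable policies.
SEVERITY_WEIGHT = {"critical": 100, "warning": 10, "info": 1}


@dataclass
class Alert:
    device: str
    site: str
    severity: str
    site_criticality: int  # assumed scale: 1 (low) .. 5 (mission-critical)

    @property
    def priority(self) -> int:
        # Weight raw severity by how important the affected site is.
        return SEVERITY_WEIGHT[self.severity] * self.site_criticality


def prioritize(alerts):
    """Return alerts sorted so on-call staff see the riskiest first."""
    return sorted(alerts, key=lambda a: a.priority, reverse=True)


incoming = [
    Alert("ups-01", "plant-a", "warning", 5),
    Alert("switch-7", "branch-b", "critical", 2),
    Alert("sensor-3", "branch-b", "info", 1),
]

for a in prioritize(incoming):
    print(f"{a.priority:>4}  {a.severity:<8} {a.device} @ {a.site}")
```

Even this toy version shows why centralizing device data matters: a warning at a mission-critical site can outrank an info event anywhere, and the ranking only works if alerts from every vendor land in one queue.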

3. Expert Partner-Led Services

If your organization doesn’t have the expertise to implement or manage the tools for overcoming edge difficulties, you can also look for management services to augment your IT team.

Some of these services may include design, configuration, delivery, installation, remote monitoring and troubleshooting, on-site parts or unit replacement, and monthly reporting and recommendations.

These sophisticated, scalable services and support offerings are often less expensive than in-house models. Plus, they keep you from having to hire expensive, hard-to-find IT talent, so your team can focus on mission-critical initiatives.

The Three-Pronged Approach In Action

Our engagement with the Bainbridge Island School District near Seattle, Washington, is a perfect example of how this three-pronged approach can help you overcome edge IT challenges.

While the district serves just 4,000 students, it has a relatively small IT team and its 11 sites are mostly unstaffed. However, its distributed environment relies heavily on continuous connectivity to power its digital classrooms and devices. Downtime isn’t in the curriculum.

After installing uninterruptible power supplies in all facilities, along with battery back-up, the district installed data center infrastructure management (DCIM) software to help supervise its network infrastructure. Unfortunately, the flood of data caused by the frequent changes in the power utility supply overwhelmed the district’s IT team.

To prioritize time and attention, the Bainbridge Island IT team implemented a cloud-based data center management as a service (DMaaS) solution that pulls all device data, regardless of vendor, into a central repository and makes it available on their smartphones. Alarms are automatically prioritized, allowing the team to zoom in on affected devices and address problems quickly and efficiently.

As a result, not only do students and teachers enjoy an uninterrupted learning experience, the IT team has reduced the time spent on after-hours monitoring and troubleshooting from hours to minutes.

Get Control At The Edge

Standardized technologies, AI-enabled remote monitoring and management, and a proactive partnership helped the Bainbridge Island School District gain control at the edge. And its action plan can help others achieve similar results.

If Gartner’s forecast holds true, and 75% of data is created and processed outside the data center and cloud by 2025, you’ll need to take the necessary steps to adapt. By incorporating an approach that combines streamlined technology, monitoring and management tools, and services delivered via expert partners, you’ll be well on your way.

AI and ML in edge computing: Benefits, applications, and how they’re driving the future of business

The edge is not a new concept, but it’s about to take off in a major way.

According to Gartner, by 2022, over 50% of enterprise-generated data will be created and processed outside the data center and cloud.

To keep up with the proliferation of new devices and applications that require real-time decisions, organizations will need a new strategy. That’s where edge computing comes into play.

Edge computing involves moving processing power closer to the source of the data to reduce network congestion and latency, and extract maximum value from your data. By taking this approach, companies can use this information to further enhance business outcomes.

However, there is a caveat. You must account for data growth. The new applications at the edge are continuously producing massive amounts of data, and often, organizations require real-time responses based on that data.

One way to do so is with artificial intelligence (AI) and machine learning (ML). AI and ML enable companies to parse the data and maximize the value of their assets, while accelerating the push to the edge.

The Role Of AI And ML At The Edge

AI and ML are transforming how we leverage application and instrumentation data. Real-time analytics are now possible.

According to an IDC FutureScape report, within two years we will see a 10% improvement in asset utilization, driven solely by a 50% increase in new industrial assets that have some form of AI deployed on edge devices.

Take wind turbine farms, for example. Wind turbines are typically in remote settings and widely distributed. This makes computing extremely difficult. It’s inefficient to try to stream all the data back to a centralized data center for processing.

Edge computing removes these physical limitations. The data is collected from sensors on the turbines and processed closer to the source at the edge, reducing latency.

The addition of ML and AI at the edge then enables business intelligence and data warehousing. You can spot historical trends, optimize inventory, identify anomalies, and even prevent future issues, resulting in less downtime and higher profitability.
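A minimal sketch of the kind of anomaly detection that might run on an edge gateway near the turbines: flag readings that drift several standard deviations from a rolling baseline, so only anomalies (not the raw sensor stream) travel back to the data center. The window size, threshold, and readings are illustrative assumptions; production systems typically use trained ML models rather than a simple z-score.

```python
from collections import deque
from statistics import mean, stdev


class EdgeAnomalyDetector:
    """Rolling z-score detector small enough to run on an edge gateway."""

    def __init__(self, window=20, threshold=3.0):
        self.readings = deque(maxlen=window)  # recent baseline only
        self.threshold = threshold

    def observe(self, value):
        """Return True if `value` is anomalous vs. the rolling baseline."""
        if len(self.readings) >= 5:  # need a minimal baseline first
            mu, sigma = mean(self.readings), stdev(self.readings)
            if sigma > 0 and abs(value - mu) / sigma > self.threshold:
                return True  # e.g. a vibration spike on a turbine bearing
        self.readings.append(value)  # normal reading extends the baseline
        return False


detector = EdgeAnomalyDetector()
for v in [10.0, 10.2, 9.9, 10.1, 10.0, 9.8, 10.3, 10.1]:
    detector.observe(v)  # steady-state vibration readings

print(detector.observe(10.2))  # False: within the rolling baseline
print(detector.observe(25.0))  # True: far outside the baseline
```

Keeping the detector this small is the point: the turbine's raw telemetry never has to leave the site, and only the rare `True` result needs to be streamed back for business intelligence and maintenance planning.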

Everyday Use Cases For AI And ML In Edge Computing

The use of AI and ML in edge computing usually falls within two emerging technologies: natural language processing and convolutional neural networks.

Natural language processing involves parsing human speech and human handwriting. It also incorporates text classification. Some common use cases include:

  • Smart retail: AI analyzes customer service conversations and recognizes historically successful interactions
  • Call centers: AI analyzes calls and creates metadata that offers predictions and suggestions for automated customer responses
  • Smart security: For consumers, smart devices listen for noises that sound like broken glass; for public safety, AI detects gunshots
  • Legal assistants: AI assistants review legal documents and make suggestions for language clarity and strength

Convolutional neural networks focus on visualization algorithms. These can identify faces, people, street signs, and other forms of visual data. Some common use cases with this technology include:

  • Quality control: Inspect for defects in factories and other facilities
  • Facial recognition: Find people at risk in a crowd; control access to a facility or workplace
  • Smart retail: Analyze shoppers’ personal attributes to make product suggestions that elevate the customer experience and recommend additional items
  • Health care: Assist doctors by analyzing an image to check for things like tumors
  • Industrial: Improve safety in a factory setting by locating workers if someone is injured, and by identifying dangerous machinery and shutting it down if there’s a malfunction

There are countless applications for ML and AI in edge computing. Whether you’re analyzing video or audio data, these technologies can enhance safety and security, improve interactions with customers, and achieve more efficient business outcomes. By processing, interpreting, and acting on the data at the edge using AI and ML, the results arrive with the speed required for safety, sales, and manufacturing situations.