Backblaze Inc.

01/09/2025 | Press release

AI for Enterprise: Getting Started

AI is here to stay, and the question on everyone's mind is how to implement it successfully. If you're ready to implement AI in your business, consider this article a good jumping-off point. I'll talk about different options for integrating AI into your operations and how to make it truly custom, based on your own data, and useful for your business.

More from AI 101

Want to read more about AI? We've got you covered in our AI 101 series, which includes plenty of posts that are useful when you're thinking about building AI into your business.

How many companies use AI today?

How many businesses are using AI, you ask? Well, let's ask Google. According to their AI overview (yes, we appreciate the irony), anywhere between 55% and 83% of companies are using or exploring AI in some way.

It's not lost on me that the above results illustrate some of the big limitations of AI: namely, that it's only as good as the data it's trained on, it's far from infallible, and it can't replace humans wholesale, especially when someone needs to fact-check its results. Google's AI Overviews have been criticized for providing inaccurate information, hallucinating (with sometimes hilarious results), giving overly neat answers to complicated questions, drawing on unreliable sources, showing potential for bias, and so on. Nevertheless, the feature has had several updates since it was first released (which at least means it's no longer telling us to put glue on pizza).

But, setting all that aside, this is actually a great example to consider before we dig into options for incorporating AI into your business. AI Overviews have improved enough (for example, by adding source transparency) that, with a bit of human oversight, we can consider the above directionally accurate. The landscape of technology is changing, and, ready or not, businesses are being forced to figure out how AI should fit into their strategies.

What we'll talk about today

Today we'll talk about some foundational topics you need to understand when deciding how to incorporate AI into your business. We'll define the following:

  1. Software as a service (SaaS) AI add-ons
  2. AI as a service (AIaaS)
  3. Foundation models
  4. Retrieval augmented generation (RAG)

Those definitions will lead us quickly to some practical examples that illustrate how businesses are using AI.

Software as a service (SaaS) applications, aka AI as a feature

You may have noticed that many of the web-based applications you use are suddenly AI-powered or have AI capabilities. While some of that is marketing hype, this can be a low-effort way to get started with AI in your organization: simply turn on a feature in a SaaS product you're already using. There are lots of options here. Slack, for example, offers AI tools for summarizing conversations and answering questions to help teams work faster.

Example AI use case: AI in customer support

Generative AI capabilities such as chatbots are often added to customer-facing applications like your customer support service. The chatbot is trained using your product support materials or actual questions your staff previously answered.

By providing a cache of human-generated questions and answers, the chatbot can be trained to respond in your company's unique voice.


Before you activate a built-in AI feature of an existing service, you'll want to determine how you can measure any changes in overall productivity and user satisfaction. In the customer service example above, that could mean capturing metrics such as customer satisfaction rating, time to first contact, time to resolution, escalation ratio, and so on. Then establish a baseline for the existing system before engaging the AI assistant, and set specific points where you will compare that baseline to the AI-powered system.
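One lightweight way to make that comparison concrete is to record the same metrics for the baseline period and the AI-assisted period, then compute the change for each. The metric names and figures below are hypothetical, purely for illustration:

```python
# Compare hypothetical customer support metrics before and after
# enabling an AI assistant. All numbers are illustrative.

baseline = {
    "csat": 4.1,                    # customer satisfaction (1-5 scale)
    "time_to_resolution_hrs": 9.5,
    "escalation_ratio": 0.18,
}

with_ai = {
    "csat": 4.3,
    "time_to_resolution_hrs": 6.2,
    "escalation_ratio": 0.15,
}

def percent_change(before: float, after: float) -> float:
    """Signed percent change relative to the baseline value."""
    return (after - before) / before * 100

report = {
    metric: round(percent_change(baseline[metric], with_ai[metric]), 1)
    for metric in baseline
}

for metric, delta in report.items():
    print(f"{metric}: {delta:+.1f}%")
```

Note that for metrics like time to resolution and escalation ratio, a negative change is the improvement, so interpret the signs per metric rather than in aggregate.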

Using an AI-powered service has many benefits, but there are a number of considerations to keep in mind:

  • You are limited in functionality by what the vendor provides.
  • What is the expertise of the software vendor in developing, training, and implementing an AI model?
  • What happens when the model data changes? For example, you've employed AI to respond to customer queries. What happens when you add a new product to your lineup or a new feature to an existing product? Is the model retrained? What are the costs? Does it still make economic sense given any new cost?
  • During the model creation and operational phases, ancillary files such as checkpoints, prompts, responses, and so on are created. Do you have visibility into these files and what analysis can you perform?
  • Given these ancillary files are derived in part from your original data, can you download these files to your central repository or is the data locked in the vendor's application?

Artificial intelligence as a service (AIaaS)

AIaaS is one of the many areas of AI where definitions and capabilities are a moving target. That said, we'll offer this: AIaaS is an outsourced service in which a cloud-based company gives other organizations access to AI models, algorithms, and related resources directly through the vendor's cloud computing platform via a user interface (UI), API, or SDK. The aim is a user-friendly interface that makes training and deploying AI models accessible to non-AI experts.

AIaaS is worth considering if you're interested in working with artificial intelligence but don't have the in-house resources or expertise to build and manage your own AI technology. There is a broad range of solutions offered in this space, which vary by the services provided. Let's categorize them as follows:

  • Walled gardens:
    • What they offer: In my experience, AIaaS providers in this group usually host most or all of the model training data, checkpoints, inferences, and prompts.
    • Pros and cons: This is the most straightforward option, but in practice it can be cost-prohibitive and lacks transparency. There are few, if any, options to reduce the cost or to economically transfer the model, its work products, or its data elsewhere.
    • Who are they: The obvious ones that come to mind for me are AWS, Google, and IBM (with Watson).
  • Mix-and-match:
    • What they offer: Solutions in this group vary by the services they provide as well as add-on options and support services. They typically provide hosting services which are used to train, deploy, and use the model. They can also provide data analysis and cleansing for the model input, model testing, engineering support, and general support services as you might require.
    • Pros and cons: As with the walled garden approach, once data is ingested or ancillary data is created within the system, it may be difficult to access and, if available, expensive to retrieve. Providers in this group often offer specialized services: for instance, companies that solve one type of problem (a computer vision specialist vs. a natural language processing model), or companies that focus on AI in IT operations, call center operations, cybersecurity, etc.
    • Who are they: This group includes companies like Twelve Labs, Proofpoint, or Amplify. Note that there's a porous line between some of the providers in this category and the next; think of it as a gradient.
  • Open cloud:
    • What they offer: Providers in this group offer a variety of tools and services that, when combined, allow an organization to construct, test, operate, and maintain an AI-based solution.
    • Pros and cons: The open cloud approach allows you to select best-of-breed providers for the various stages of your AI project. It also gives you control over the model and its byproducts, such as checkpoint data, inferences, and prompts, which are key to ensuring the model is performing as expected. In summary, while your level of effort for this approach will be higher, you will have more control over your model and, more importantly, over your data.
    • Who are they: This includes platforms like Hugging Face and vendors like OpenAI of ChatGPT fame. Hugging Face is intentionally open source, whereas OpenAI is under pressure to monetize its models, which is one of the bigger evolving conversations in the AI landscape. Today, anyone can purchase an API subscription from OpenAI to access GPT-4 from their own application. Such subscriptions offer quick access to organizations that want a mature model but aren't able to or interested in building one themselves.
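As a sketch of what that kind of API access looks like in practice, the request body below follows the general shape of a chat-completions call. The company name, system prompt, and helper function are placeholders, and actually sending the request requires the vendor's SDK and an API key:

```python
# Build a chat-completion style request payload for an AIaaS vendor.
# The system prompt and "Acme Corp" are invented placeholders; sending
# this payload requires the vendor's SDK (e.g., the openai package) and a key.

def build_chat_request(question: str, model: str = "gpt-4") -> dict:
    """Assemble a request body in the common chat-completions shape."""
    return {
        "model": model,
        "messages": [
            {"role": "system",
             "content": "You are a helpful support assistant for Acme Corp."},
            {"role": "user", "content": question},
        ],
        "temperature": 0.2,  # lower values favor consistent answers
    }

request = build_chat_request("How do I reset my account password?")
# With the openai SDK, this would be sent roughly as:
#   client.chat.completions.create(**request)
print(request["messages"][1]["content"])
```

Keeping payload construction in a small function like this also gives you a natural place to log every request, which matters later when you want visibility into what data leaves your organization.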

The AIaaS approach is a good choice for organizations that lack expertise in building and operating AI systems. The approach you take (walled garden, mix-and-match, or open cloud) will affect how much access and flexibility you have with the data used and produced by the system. This may not matter today, but as your organization becomes more AI savvy, being able to access and share the data within the system could become important.

Foundation models

The term "foundation model" originated with the Stanford Institute for Human-Centered Artificial Intelligence's (HAI) Center for Research on Foundation Models (CRFM) which defines it as "any model that is trained on broad data that can be adapted (e.g., fine-tuned) to a wide range of downstream tasks." Most, but not all, foundation models are generative AI in form and perform tasks such as language processing, visual comprehension, code generation, and human-centered engagement.

Although foundation models are pre-trained, their behavior can be shaped at inference time by the prompts you supply. An organization can develop tailored outputs using techniques such as prompt engineering, fine-tuning, and pipeline engineering. Prompt engineering, for example, involves supplying the model with a series of carefully curated prompts so that it produces more precise answers related to the subject matter of those prompts. This makes the model's output less generic and more specific to your organization.
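A minimal sketch of that idea: few-shot prompt engineering prepends curated question/answer pairs to each new query so the model imitates their style and subject matter. The example pairs below are invented for illustration:

```python
# Few-shot prompt engineering: prepend curated example Q/A pairs so the
# model's answers match your organization's voice and subject matter.
# The example pairs below are invented placeholders.

CURATED_EXAMPLES = [
    ("What file types can I back up?",
     "You can back up any file type; there are no restrictions."),
    ("Is my data encrypted?",
     "Yes, your data is encrypted in transit and at rest."),
]

def build_few_shot_prompt(question: str) -> str:
    """Interleave curated Q/A examples ahead of the new question."""
    lines = ["Answer in our support team's concise, friendly voice.", ""]
    for q, a in CURATED_EXAMPLES:
        lines += [f"Q: {q}", f"A: {a}", ""]
    lines += [f"Q: {question}", "A:"]
    return "\n".join(lines)

prompt = build_few_shot_prompt("How do I restore a deleted file?")
print(prompt)
```

Because the examples live in the prompt rather than in the model's weights, swapping in new curated pairs changes behavior immediately, with no retraining step.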

When using a foundation model, you will need to capture and store all data used to adapt the model, for example, the prompts and responses used in the prompt engineering process. This will allow you to analyze how the inference process is shifting over time.
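One simple way to capture that data is to log every prompt/response pair as an append-only record for later analysis. The JSONL layout and field names here are just one possible scheme, not a standard:

```python
# Append each prompt/response pair to a JSONL log so inference drift
# can be analyzed later. Field names are one possible scheme, not a standard.
import io
import json
from datetime import datetime, timezone

def log_interaction(log_file, prompt: str, response: str, model: str) -> None:
    """Write one timestamped prompt/response record as a JSON line."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model": model,
        "prompt": prompt,
        "response": response,
    }
    log_file.write(json.dumps(record) + "\n")

# In production this would be a file in cloud storage; here, an in-memory buffer.
buf = io.StringIO()
log_interaction(buf, "What is our refund policy?",
                "Refunds are available within 30 days.", "tuned-model-v2")
first = json.loads(buf.getvalue().splitlines()[0])
print(first["prompt"])
```

JSONL works well here because each interaction is one self-contained line, so logs can be appended cheaply and replayed or diffed later to spot shifts in the model's answers.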

Using a foundation model as a starting point is a good choice, but techniques such as prompt engineering are far from an exact science. Such tuning can often exacerbate a subtle bias in the existing model or introduce a new one. This is especially important to watch for if the model is public facing.

Retrieval augmented generation (RAG)

Retrieval augmented generation (RAG) is a relatively new technique that allows AI models to link to external sources. The underlying model is, in most cases, a generative AI model, such as a large language model (LLM). With RAG, external resources, often rich in technical content, are consulted during inference and incorporated into the response to the user. One commonly cited example is indexing medical journals so their content is reviewed when the model generates a response. The same could be done with financial data, legal case law, and so on.

RAG works by adding code around the original generative AI model to continuously review defined external resources and convert them into machine-readable indices (vector databases) so they are available at inference time. This means the core generative model does not have to be retrained; instead, it can use new or updated sources on the fly. This allows you to use your data to make the model your own and lets you update the data sources to keep the model current.
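The retrieve-then-generate flow can be sketched end to end. Real systems use learned embeddings and a vector database; the toy bag-of-words similarity below merely stands in for that retrieval step, and the documents are invented placeholders:

```python
# Minimal RAG sketch: retrieve the most relevant external document,
# then splice it into the prompt sent to the generative model.
# Real systems use learned embeddings and a vector database; the
# bag-of-words cosine similarity here merely stands in for retrieval.
import math
from collections import Counter

DOCUMENTS = [  # invented placeholder sources
    "Backup retention: deleted files are kept for 30 days by default.",
    "Billing: invoices are issued on the first day of each month.",
    "Security: all data is encrypted in transit and at rest.",
]

def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question: str) -> str:
    """Return the stored document most similar to the question."""
    q = embed(question)
    return max(DOCUMENTS, key=lambda d: cosine(q, embed(d)))

def build_rag_prompt(question: str) -> str:
    """Augment the question with retrieved context before generation."""
    context = retrieve(question)
    return (f"Use this context to answer.\nContext: {context}\n"
            f"Question: {question}\nAnswer:")

print(build_rag_prompt("How long are deleted files retained?"))
```

Updating the model's knowledge is then just a matter of editing `DOCUMENTS` (or, in a real system, re-indexing the external sources); the generative model itself is untouched.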

This technique is extremely powerful, but it does require you to store the original model, the testing or validation data, the external resources you are using to augment the model, their vector databases, and any prompts and inferred responses. Given the tools and utilities you will use to monitor and analyze how your RAG-infused AI model is performing, a central cloud storage repository is a good choice for storing this data.

It's all about the data: your data

AI, at least in its current form, is not deus ex machina. Yes, ChatGPT and its ilk can create wonderful stories of fact or fiction and amazing, never-before-seen imagery, but without your data, they are marvelously generic. In other words, you, and more precisely your data, are the key to the value your organization will achieve in using AI.

As we have seen, there are a multitude of options. On one hand, we can hand off our data to a company, pay them handsomely, and let them build and run our AI models: the walled garden approach. While this is enticing, the reality is that AI is still a moving target with few rules and regulations in place, your visibility into what is happening to your data is limited, and so is your ability to do something if there is a problem.

At the other end is the open cloud approach. This allows you to choose best-of-breed cloud-based applications and cloud compute services to create and run your model. These applications and services can interact freely with your cloud storage platform to leverage your organization's data while providing you complete visibility and control. Yes, it will require more investment on your part, but given the maturity of AI in general, it makes sense to keep a watchful eye on how AI is used in your organization and, more importantly, on how well it is performing.

In short, AI requires your data to be truly useful to your organization. AI in its current form is still a young science, one that requires watching to ensure it does what is expected. That's not paranoia; that's just good business. To do this, you will need unfettered, affordable access to your data, the AI model, and its work products.