When you're serious about AI, it makes sense to host your own


Your own dedicated AI Hosting makes everything, from experimenting to budgeting, easier.


Once you have a clear idea of where AI is going within your business, it’s time to get serious. Public AI models are useful for experimenting and seeing what’s possible. But when you want a properly integrated AI that can access your data and work alongside your team, you need GPU Hosting, which lets you host your own AI.

When you run your own server - dedicated hardware, including GPUs - the entire neural network is on your infrastructure. There are no connections to AIs run by others, and no sharing of computing resources. All your data is securely stored, and as you refine your models or feed them sensitive data, no one else can peek in.

Services vs servers

Whatever AI-powered tools you want to run, such as large language models (LLMs), you have three broad options:

  • Pay-as-you-go access to a model directly provided by an AI company like OpenAI. This is handy for tinkering, but less useful for anything more intensive.

  • Running your AI model on cloud services provided by an international tech behemoth - for example, setting up a foundation model on AWS and paying as you go. The exact architecture is up to you, and the cost of each component is up to them.

  • Provisioning your own custom-built hardware, including dedicated GPUs, in a data centre and running whatever model you choose - typically something from a model hub like Hugging Face, behind an interface like Gradio. You can either buy the hardware outright and pay for colocation, or rent and run the hardware for a fixed monthly cost. Either way, this is self-hosting.

You have these same options no matter what sort of AI model you’re running, where you source it from, or what you want to achieve with it.


It’s important to note that self-hosting doesn’t mean having a server on premises. It doesn’t mean that you have to take on the responsibility of server management, either.

Self-hosting takes away three types of unpredictability

In terms of cost, the big difference with self-hosting is that you know what your bills will be. Paying as you go is much less predictable - especially if the way that you engage with your AI changes over time. (Even for less compute-heavy projects, AWS cost management can be a very big headache.)

Fine-tuning, for example, is computationally heavier than inference. When more computing means higher bills, you’re open to nasty shocks every time you fine-tune your AI model. The simple maths is that you have to pay more - sometimes much more - just so your model works the way you want it to. Or you try to manage costs by using GPU resources less than you ought to.
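To make that arithmetic concrete, here is a minimal sketch comparing a metered bill with a flat monthly fee. Every figure in it is a made-up placeholder rather than a quote from any provider, and the function names are purely illustrative.

```python
# All figures below are hypothetical placeholders, not real price quotes.

def pay_as_you_go_cost(gpu_hours: float, rate_per_hour: float) -> float:
    """Monthly bill when every GPU hour is metered."""
    return gpu_hours * rate_per_hour


def break_even_hours(fixed_monthly: float, rate_per_hour: float) -> float:
    """GPU hours per month at which metered billing equals the fixed fee."""
    return fixed_monthly / rate_per_hour


FIXED_MONTHLY = 2000.0  # assumed flat fee for a dedicated GPU server
RATE_PER_HOUR = 4.0     # assumed metered price per GPU hour

# Assumed usage: a quiet inference-only month, then a month with fine-tuning.
usage_by_month = {"inference only": 150, "with fine-tuning": 700}

for label, hours in usage_by_month.items():
    metered = pay_as_you_go_cost(hours, RATE_PER_HOUR)
    print(f"{label}: metered {metered:.0f} vs fixed {FIXED_MONTHLY:.0f}")

hours_needed = break_even_hours(FIXED_MONTHLY, RATE_PER_HOUR)
print(f"break-even at {hours_needed:.0f} GPU hours per month")
```

With these assumed numbers the metered bill swings from 600 to 2800 between the two months, while the fixed fee stays at 2000. Your own break-even point depends entirely on the rates and usage you plug in.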

In terms of computing power, self-hosting again has the advantage of giving you a known quantity. With hardware assembled to your exact specifications, including powerful GPUs entirely at your disposal, there are none of the unexpected queues or delays that shared AI models can suffer from. Your computing time is much more predictable, and you have control over which jobs your resources are dedicated to.


A third factor is complexity. If you are already familiar with running your own servers, then you’re probably ready to self-host AI. GPU Hosting puts a few extra zeroes on some system specifications, but there aren’t many new tricks to learn at the infrastructure level. Keep your headspace free for the application layer and the changes AI can bring to your business.

On the other hand, finding your way around a new set of AWS or Azure services while you’re also adopting new business technology is a recipe for over-complication. You might even need to hire a specialist just to run and manage your service architecture - if you can find (and afford) anyone with the right skills.


A familiar hosting environment with a fixed cost and a defined set of resources is a much simpler way to go about AI Hosting. It’s also a potential source of competitive advantage, because not everyone knows how many benefits you leave on the table when you rush to AWS or Azure for cloud services. Or how much you stand to lose to AWS’s pricing power, which we wrote about when their IPv4 addresses suddenly got expensive.

As well as being much more predictable and user-friendly, self-hosting brings big business benefits.

Intellectual property, data integrity, and security

The field is still young, but it’s well established that AI models are most valuable when they are specific to your business. That means giving the model access to your proprietary data, extending its general capabilities into any area you wish.

Don’t hook your best ideas into someone else’s machines

If your AI model is self-hosted, your data doesn’t leave your business’s systems. In any other architecture you want to think twice about how sensitive that data is, and how it gets used. 

AI providers are already facing criticism for being overly ravenous. Stability AI, the company behind Stable Diffusion, has been sued by Getty Images for allegedly using more than 12 million copyrighted images without permission.

It’s in the interests of AI providers to make it oh-so-easy for you to upload commercially sensitive data to infrastructure that they control. They’re not always transparent about how else that data gets used once it arrives, or whether any lessons you learn might be incorporated into future foundation models (which your competitors might pick up).

Don’t let other companies peek behind your curtains

There are countless ways that AI might be integrated into business systems and processes. Most of them haven’t been tried yet. Some will save paradigm-shifting amounts of time and money.


Take process mining as an example. It’s smart to use AI to accelerate or improve the way you operate, which is what process mining is all about. But it requires an AI system with deep access to data that reveals your unique ways of working. This IP is worth protecting. It’s probably not wise to mine processes on infrastructure that you don’t control. The risks include accidental data breaches, and lessons being imported back into AI models that your competitors can access.

As well as potentially leaking valuable data to other companies, there is the risk of breaching privacy. The Privacy Commissioner, Michael Webster, has made it clear that providing the wrong type of information to an AI system can breach the Privacy Act. His advice: “Do not input into a generative AI tool personal or confidential information, unless it has been explicitly confirmed that inputted information is not retained or disclosed by the provider.”

Keeping up with computing’s fastest-moving field

When you’re well-resourced, you can keep evolving

In the AI field there’s always another experiment to perform, another new technique to try, another set of refinements to work through.


All of this makes it hard to set, and stick to, a pay-as-you-go budget. When your head is properly in the game, busy weeks or months are going to be part of it. If your cloud bills are fluctuating massively or setting off alarms in your CFO’s office, you can find yourself spending more time justifying expenses than doing your actual job. 

Accommodate changes in technology without re-budgeting every time

To take a few examples from recent months, there have been big changes in how fine-tuning ought to be done to an off-the-shelf model, and how much of it is necessary. Vector databases have moved out of academic papers to somewhere much nearer the mainstream. Methods that make embeddings quicker to create and score are evolving fast - the evolution of SentenceTransformers is a good example of that.
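For readers new to the term, "scoring" embeddings usually means measuring how close two vectors are. The sketch below ranks a few documents against a query using cosine similarity. The vectors are tiny hand-typed stand-ins, not output from SentenceTransformers or any real model, and the document names are invented for illustration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up 3-dimensional "embeddings". Real ones have hundreds of dimensions
# and come from an embedding model, not from hand-typed numbers.
documents = {
    "invoice policy": [0.9, 0.1, 0.0],
    "holiday roster": [0.0, 0.2, 0.9],
    "payment terms": [0.8, 0.3, 0.1],
}
query = [1.0, 0.2, 0.0]  # stand-in for an embedded question about billing

# Rank documents from most to least similar to the query.
ranked = sorted(
    documents,
    key=lambda name: cosine_similarity(query, documents[name]),
    reverse=True,
)
print(ranked)
```

A vector database performs the same kind of comparison, but across millions of vectors and with indexes that avoid checking every one.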


When you own your AI servers, you can take all of this in your stride. If you are paying as you go, the times when you want to run fastest will drain your bank account.

Experiment as much as you need to

Since AI is a black box, there’s no substitute for seeing how tweaks and changes affect the output. Careful testing is the most reliable way to catch and reduce issues like hallucinations. The more that you can safely experiment with AI, the better. Your own self-hosted AI instance makes this easy - it’s there, at your fingertips, paid for and ready to go.

On the other hand, cloud services incentivise you to cut corners. If you are paying as you go, then every test and each new development adds new costs.

When you’re not in control, you can lose your footing

We all know that AI research and development is a race between companies like OpenAI, Google, and others. They all want to earn headlines about AI’s next leap forward. For users of their models, that can mean operating on less-than-stable ground. 

If you’re using an AI model that’s controlled by a provider who wants to make some tweaks, they are not going to ask you first. Imagine a mechanic letting themselves into your garage overnight to tinker under your car’s bonnet, or maybe change your tyres to something a little more slick. You’re not going to notice anything until you’re in the driver’s seat and things don’t feel like they used to.

This can affect integrations with AI models, or the behaviour of the models themselves. But if you are running your own instance of the model on your own hardware, you are immune to the whims of anonymous developers.

Need a quick summary of this for your boss? Copy and paste this:

If you want to run a business with AI built in, then direct access to a GPU beats queuing for a pay-as-you-go service.

If you want to be assured that your AI tools won’t spill any data or leak any IP, then you want complete control of your hardware. Self-hosting lets you flexibly experiment with AI-powered ideas, without blowing budgets or giving anything away.

If you want to stop prying eyes from seeing how AI operates for you, then you need to host it away from the companies that want to learn from you.

Fixed-cost hosting is price-competitive with cloud services. Dedicated servers are yours, and yours only. They are less complex to operate and to budget for. 

In short, if you want to build a lasting advantage, at a predictable cost, and without creating new data security headaches, then self-hosting AI on your own Dedicated Server is a very smart choice.