How to kill any AI project – 8 easy and powerful ideas – Part I

5 January 2024
Mathias Landhäußer

What’s the biggest problem for every AI? Leaving the lab and encountering our ambiguous, noisy, and ever-evolving reality. There are many reasons AI projects fail. We collected some of the most common problems that kill AI projects, discuss their roots, and tell you what to do about them. The problem descriptions might sound oversimplified at first, but think about them. Chances are high that exactly these tripped you up last time (or are breathing down your neck, constantly whispering in your ear…).

Let’s get the obvious ones out of the way first: No budget, no time, no people, no idea, no business value → no project. Easy as that. But what if you think you have all that…? Let’s take a look at the data an AI could help you with: Only about 20% of business data is structured; the overwhelming 80% is made for, and accessible only to, humans. If your task is based on that 80%, it’s invisible (or incomprehensible) to most AIs and you are stuck with manual work and little technical support. But are you?

You need an AI like semantha that works with unstructured data (for example, text documents, videos, Slack messages) and semi-structured data (for example, forms, spreadsheets that contain text, requirements documents). So let’s explore what semantha can do:

1. The data dilemma: not enough data – and what I have isn’t labeled yet

Kicking off your AI project doesn’t have to be daunting. But depending on the type of solution you’re looking for, you need training data. If you don’t have the data, you need to get it. The questions are: is the data cleaned, representative, … and sufficient for the AI method at hand? Often, the data is not ready for training: there isn’t enough of it, or it isn’t labeled yet. Working with semantha in that case is similar to working with a colleague: you don’t train, you educate, using a small and potentially growing number of examples.

Most of the time, these examples are already available (onboarding materials, guidelines, checklists, etc.). You can feed them to semantha’s library – the knowledge base – and you’re ready to go. That makes starting with little data easy. But what if you have a lot of data but it’s unlabeled? Subject matter experts tend not to like sifting through “old” documents on top of their daily workload. And to be honest, it’s hardly an appealing task. Handing the task to the interns, figuratively speaking, isn’t a good idea either – they’d first have to know what to label and how. Fortunately, AI platforms can help you with that.

semantha’s expert modules, for example, provide you with a structure and a guided approach to prepare the analyses. In our scenario, you’d use a feature called Smart Clustering. You can upload your unlabeled documents and let semantha read all of them, identify overlapping content (highlighting contradictions), and summarize and propose names for the clusters – just like a subject matter expert would. Note that semantha does not cluster at the document level but slices and dices the input and works with meaningful chunks. These chunks are exactly what you’d need to prepare semantha’s library for analyzing unseen documents.
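To make the idea of clustering chunks rather than whole documents more concrete, here is a minimal Python sketch. It is not semantha’s implementation: the function names, the paragraph-based chunking, the word-overlap (Jaccard) similarity, and the 0.5 threshold are all assumptions chosen for illustration, standing in for whatever semantic comparison a real platform uses.

```python
import re


def split_into_chunks(document: str) -> list[str]:
    """Split a document into paragraph-sized chunks (separated by blank lines)."""
    return [p.strip() for p in re.split(r"\n\s*\n", document) if p.strip()]


def tokens(text: str) -> set[str]:
    """Lowercased set of alphanumeric words in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Word-overlap similarity between 0.0 (disjoint) and 1.0 (identical)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_chunks(chunks: list[str], threshold: float = 0.5) -> list[list[str]]:
    """Greedy clustering: a chunk joins the first cluster whose first member
    is similar enough; otherwise it starts a new cluster."""
    clusters: list[list[str]] = []
    for chunk in chunks:
        for cluster in clusters:
            if jaccard(tokens(chunk), tokens(cluster[0])) >= threshold:
                cluster.append(chunk)
                break
        else:
            clusters.append([chunk])
    return clusters


# Two made-up documents that overlap in one paragraph but disagree on a detail.
documents = [
    "Invoices must be paid within 30 days.\n\nAll data is stored in the EU.",
    "Invoices must be paid within 14 days.\n\nSupport is available on weekdays.",
]
chunks = [c for doc in documents for c in split_into_chunks(doc)]
clusters = cluster_chunks(chunks)
for cluster in clusters:
    print(cluster)
```

In this toy example, the two payment-term paragraphs end up in the same cluster even though they come from different documents, which is the kind of overlap (and potential contradiction) a subject matter expert would want to review.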

semantha breaks down the configuration process into manageable and measurable steps, ensuring that you and your team can initiate and manage the setup process effectively.

2. I’d like to have full control and build it in-house

Yes, that’s actually a tough one… Or to be honest, it’s a bunch of different arguments coming from very different corners. There are many good reasons to have a software product built in-house: From privacy and confidentiality concerns to the promises of a custom-built product – the “pros” are almost limitless. Yet, building a software product poses many challenges too.

If you are concerned only about privacy and confidentiality, you might want to skip to section three below, as we’ll focus on the engineering challenges first. 😉 If you’d like to read more on the question of outsourcing, I recommend reading “When should companies implement AI projects internally and when should they outsource?”

A common goal is to integrate the envisioned AI solution into the processes of an existing software landscape. Having full control over the implementation and integration of new software is marvelous – especially if the IT and innovation departments are already on board. No need for discussions, just straight to the solution, smooth from kick-off to launch. “Special wishes? No problem! Most of AI land is built on open-source components anyway.” Yet, building an AI product is rarely a straight line. And if you go that route, you want to make sure to focus on the AI part of the task. Make sure that IT experts and subject matter experts work hand-in-hand to understand the problem, avoid the data dilemma, define the desired outcome, and benchmark the resulting models after every update.

But there’s also a software engineering part to it – and you want to make sure that your solution (assuming that you arrive at one) can actually be deployed, run, and maintained in your enterprise IT infrastructure. And this is where quick wins are hard to come by. If you need more than a script or notebook to run your data experiments in (I’d call that a feasibility study), you need enterprise-grade software including encryption, integration into your identity management (for single sign-on and authorization), data governance, patch management for the numerous dependencies, etc. And this is where outside help should be embraced.

semantha complements your in-house capabilities by seamlessly integrating with existing systems. Whether it’s through APIs, plugins, or custom connectors, our platform is designed to collaborate harmoniously with your current (and future) infrastructure. This integration ensures a smooth transition to AI without disrupting established workflows. You can set up semantha and run feasibility studies using the web-based interface. Or you can head directly to the API and run everything from within a script – which could be the first step towards an integration. Our pre-built models and expert modules allow for rapid implementation, reducing the development timeline compared to a solution built entirely in-house. Quicker iterations mean higher efficiency so that you can capitalize on the benefits of AI sooner rather than later.

3. Falling into the privacy/confidentiality abyss

Are you on the brink of entrusting your AI endeavors to a service provider, only to discover they demand vast amounts of your sensitive data? The dilemma is real, especially when your data comprises confidential information, trade secrets, or private customer data that must not be shared. Then the AI train is heading for a sudden stop, at the latest when you inform your DPO or legal department. (Pro tip: Involve them early in your plans!)

If you find yourself in a slightly better AI position, able to use your data for training but under strict confidentiality mandates, the hope is that your provider of choice can train a private model for you without your data mingling with data coming from other customers (and without retaining a copy of your data for future use). But even if this is a given for them, company policies can forbid your data from leaving your premises. Sometimes, policies allow uploading data to the cloud but only to preferred cloud providers (and usually only within your company’s subscription to maintain full control). From the AI provider’s point of view, this is as inconvenient as an on-premises deployment.

If this describes your situation, then you need an AI solution that can be deployed on-premises and you need a service provider that recognizes your confidentiality requirements. In contrast to providers that demand vast amounts of data, semantha does not require large training datasets, so the few examples it needs can be anonymized easily. Most often, you don’t even need the personal information for the analyses, so there is no personal data in semantha’s library. Moreover, semantha doesn’t store the analyzed data, earning nods of approval from DPOs. If your policies demand on-premises deployment, semantha happily complies, even embracing CPU-only environments for flexibility (believe me, it’ll use a GPU if it gets one). This makes on-premises deployment feasible.
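As an illustration of why a small set of example texts is easy to anonymize, the following Python sketch masks e-mail addresses and phone-like numbers before a text would be added to a knowledge base. This is a hypothetical helper, not part of semantha: the function name, the placeholders, and the deliberately simple regular expressions are assumptions, and real anonymization would also have to cover names, addresses, and other identifiers.

```python
import re

# Deliberately simple patterns (assumptions for illustration only).
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s()/-]{6,}\d")


def mask_personal_data(text: str) -> str:
    """Replace e-mail addresses and phone-like numbers with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


sample = "Contact jane.doe@example.com or call +49 721 1234567 about clause 4."
redacted = mask_personal_data(sample)
print(redacted)
```

With only a handful of examples to prepare, a pass like this can be followed by a quick manual review, which is rarely practical for the large datasets that model training usually requires.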

Finally, being an on-prem customer doesn’t mean settling for less: You’ll get the same software stack as our cloud customers, with identical update cycles. We’re flexible too – a feasibility study in the cloud can seamlessly transition into an on-premises deployment or a hybrid model.

Get started with semantha today

We’re positive that semantha can help you with your AI challenges when it comes to unstructured data. And if we avoid the pitfalls described above together, we’re sure we’ll reap the fruits of our joint work. But don’t just trust a blog post: request a personal demo!

The Library – as well as all other features – is included in every semantha subscription. Each Co-Worker comes with full flexibility, and you can start using the capabilities right away. Get in touch with your semantha representative to find out how to get started with your next use case. If you don’t have an active subscription, simply book a meeting with our team to learn more about how semantha can help you draw insights from your unstructured data.
