Elliott Hoffman at AI Tool Tracker outlines the various risks that businesses planning to use open-source AI will have to manage
Open-source artificial intelligence (AI) brings with it many benefits for the businesses that use it. The primary ones are cost savings, improved scalability and faster development times.
But alongside these key advantages, it also presents a host of risks. These include, but are not limited to, problems with training data and algorithms, poor objective setting, and a lack of training for the people using the AI.
Cyber security
The chief problem, however, is security. Because open-source AI is publicly available, almost anyone can inspect and use it, including hackers, which makes it hard to know whether the technology is safe to deploy.
By deploying this technology, organisations potentially expose themselves to malevolent actors who can manipulate an AI program's outputs, create security vulnerabilities or cause it to deliver inaccurate information.
Another major stumbling block is the lack of a thorough method for detecting and reporting flaws in AI models. Traditional techniques used to scan for security issues in standard software can't locate them in an AI program. However, new organisations have stepped up to address these problems, for example by cleaning the data used in models to identify security flaws.
Training data
As far as training data is concerned, there are three main risks: biased data and data drift, privacy infringement, and labelling errors. Biased data arises from the exclusion of, or discrimination against, certain demographic groups, caused by a developer's personal biases or a lack of available data for those groups.
Data drift occurs when the distribution of the input data changes over time, making a model trained on the initial data less effective. It can also introduce bias, because the distribution of data in the training set is no longer representative of the real world.
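To make this concrete, here is a minimal sketch of one common drift check: comparing a feature's distribution in recent production data against the data the model was trained on, using a two-sample Kolmogorov-Smirnov test. The feature, sample values and significance threshold are illustrative assumptions, not part of any specific product.

```python
import numpy as np
from scipy.stats import ks_2samp

def detect_drift(reference: np.ndarray, current: np.ndarray,
                 alpha: float = 0.05) -> bool:
    """Flag drift if the two samples are unlikely to share a distribution."""
    statistic, p_value = ks_2samp(reference, current)
    return p_value < alpha

# Illustrative data: a feature at training time vs. a recent production window
rng = np.random.default_rng(seed=0)
training_ages = rng.normal(loc=35, scale=8, size=5_000)    # training-time distribution
production_ages = rng.normal(loc=42, scale=8, size=1_000)  # the population has shifted

if detect_drift(training_ages, production_ages):
    print("Input drift detected - consider retraining on fresh data")
```

In practice a check like this would run on a schedule for every important input feature, with a positive result triggering the retraining described below.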
Privacy infringement stems from training data exposing sensitive personal information which can be used to identify or single out an individual. The use of such data carries significant financial risks, with organisations facing fines of up to 4 percent of their total global annual turnover for a privacy infringement under the General Data Protection Regulation.
Labelling errors are another problem area. Unfortunately, labelling quality is often less than ideal, especially in publicly available data sets: research has found that, on average, 3.4% of examples in commonly used data sets are mislabelled.
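Label errors of this kind can be surfaced automatically. The sketch below follows the broad idea behind confident learning: obtain out-of-fold predicted probabilities via cross-validation and flag examples where the model assigns very little probability to the recorded label. The classifier, data set and 0.05 threshold are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

X, y = load_digits(return_X_y=True)

# Out-of-fold probabilities: each example is scored by a model that never saw it
probs = cross_val_predict(
    LogisticRegression(max_iter=1000), X, y, cv=5, method="predict_proba"
)

# Flag examples where the model gives the recorded label very little probability
given_label_prob = probs[np.arange(len(y)), y]
suspects = np.where(given_label_prob < 0.05)[0]  # illustrative threshold
print(f"{len(suspects)} of {len(y)} examples flagged for manual label review")
```

Flagged examples still need a human reviewer; the point is to concentrate scarce review time on the small fraction of labels most likely to be wrong.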
Biased data and data drift can be addressed by continuously monitoring the model’s performance and regularly retraining it with updated data to ensure that it remains effective and unbiased.
For privacy infringement, firms need to examine whether the data contains sensitive personally identifiable information and whether it has been legally and appropriately collected with the owner's explicit consent.
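A simple first step is an automated scan of free-text fields for obvious identifiers before the data enters a training pipeline. Below is a minimal sketch, assuming regular expressions are enough for a first pass; the patterns and sample records are illustrative, and a real review would also cover names, addresses, IDs and consent records.

```python
import re

# Illustrative patterns; a production scan would use a dedicated PII library
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "uk_mobile": re.compile(r"\b0\d{4}\s?\d{6}\b"),  # simplified UK mobile format
}

def scan_for_pii(records: list[str]) -> list[tuple[int, str]]:
    """Return (record index, PII type) pairs for every match found."""
    hits = []
    for i, text in enumerate(records):
        for kind, pattern in PII_PATTERNS.items():
            if pattern.search(text):
                hits.append((i, kind))
    return hits

training_texts = [
    "Customer asked about delivery times.",
    "Contact jane.doe@example.com or 07700 900123 to confirm.",
]
print(scan_for_pii(training_texts))  # [(1, 'email'), (1, 'uk_mobile')]
```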
Algorithms
Another big problem area is algorithm design, which is vulnerable to risks such as biased logic, flawed assumptions or judgments, inappropriate modelling techniques, coding errors, and the identification of spurious patterns in the training data.
To tackle these issues, organisations need to retrain the model to eliminate bias and other incorrect assumptions. Outputs should be regularly monitored and reviewed to ensure that old problems aren't repeated and that new ones are picked up.
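One lightweight output review is to compare positive-prediction rates across demographic groups after each retrain, a rough demographic-parity check. The group labels, predictions and tolerance below are illustrative assumptions.

```python
import numpy as np

def positive_rate_gap(predictions: np.ndarray, groups: np.ndarray) -> float:
    """Largest gap in positive-prediction rate between any two groups."""
    rates = [predictions[groups == g].mean() for g in np.unique(groups)]
    return max(rates) - min(rates)

# Illustrative batch of binary model outputs, with a group label per row
preds = np.array([1, 0, 1, 1, 0, 0, 1, 0])
groups = np.array(["a", "a", "a", "a", "b", "b", "b", "b"])

gap = positive_rate_gap(preds, groups)
if gap > 0.2:  # illustrative tolerance
    print(f"Positive-rate gap of {gap:.0%} between groups - review for bias")
```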
Poor objective setting
Poor objective setting for AI is often a company's downfall. If the people using the technology don't know clearly what outcome they want to achieve, then the end result is likely to be confused too.
AI goals should be appropriate to a company’s technical maturity and chosen to maximise the likelihood of success, prove value and build a foundation from which to create increasingly sophisticated AI tools that achieve higher-level business goals.
Therefore, objectives should be well-formed, meaning they are stakeholder-specific, map actual AI outputs to applications and use cases that achieve business goals, and are appropriately sized.
Lack of training on AI use
Then there is the lack of training for the people who use AI day-to-day. That's largely because there's a fundamental shortage of experts in the field who can provide it.
According to SAS, 63% of businesses lacked staff with the necessary AI skills, 61% didn't have enough staff to deliver the benefits of AI and 53% didn't know what skills AI required.
To solve the problem, companies first need to cast their net wider to find specialists who can carry out the training, looking at people from other industries and backgrounds.
Rather than investing only in the technology itself, they should also focus on upskilling employees to provide them with the right level of data literacy and awareness to do the job properly.
As AI evolves, businesses need to keep up with the technology, while ensuring its safe implementation. Open-source AI has merely added another layer of complexity that they have to contend with.
Elliott Hoffman is co-founder of AI Tool Tracker