ChatGPT: potential and risks of the most clicked software of the moment

Technology, as we know, moves very fast and, even if we do not always realise it immediately, it drags us into a world from which there is no turning back. One such technology that has been making headlines recently is ChatGPT, AI software designed to simulate a conversation and quickly answer written questions in a precise manner. It has quickly become a worldwide mass phenomenon, with applications ranging from automation to music composition. Among its most interesting features is the ability to write programming code quickly and intuitively.

ChatGPT was launched on 30 November 2022 by the company OpenAI, founded in 2015 by a group that included current CEO Sam Altman and Elon Musk (who left its board in 2018), with early backing from Peter Thiel (co-founder of PayPal) and Reid Hoffman (co-founder of LinkedIn and its CEO until 2007). All are considered giants of the IT world.

ChatGPT uses Natural Language Processing, a technology that relies on machine learning algorithms. It is capable not only of processing billions of data points and learning from them and from the flow of information provided by users themselves, but also of grasping the nuances of human language.

While ChatGPT has some filters to reject requests that the AI considers to be ‘inappropriate’ and admits its limitations when it cannot answer questions, it also sometimes gives answers that are completely wrong. This can be explained in part by the fact that it has no access to the Internet and has been trained only on data up to 2022, so it cannot give answers on current events.

The ChatGPT engine requires huge amounts of data to function and improve. These data are generalised into models (much as the human mind does), which are then used to answer users’ questions. In general, the more data a model is trained on, the more patterns it can detect, and the more accurately, which improves its ability to give comprehensive answers. OpenAI provided the tool with around 300 billion words systematically downloaded from the Internet: books, articles, websites and posts, including personal information that was probably obtained without consent.
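To make the idea of "generalising data into a model" more concrete, here is a deliberately tiny sketch. It is not how ChatGPT works internally: it is a toy word-pair (bigram) counter, and the function names and example sentences are our own assumptions, chosen only to show that a model trained on more text can answer in cases where a model trained on less text cannot.

```python
from collections import Counter, defaultdict


def train_bigram_model(corpus):
    """Count, for each word, which words follow it in the training text."""
    model = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        # Pair every word with the word that comes right after it.
        for current, following in zip(words, words[1:]):
            model[current][following] += 1
    return model


def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical training data: one sentence versus three sentences.
small_corpus = ["the cat sat on the mat"]
larger_corpus = small_corpus + ["the dog sat on the sofa", "a dog sat on the mat"]

small_model = train_bigram_model(small_corpus)
larger_model = train_bigram_model(larger_corpus)

# The small model has never seen "dog", so it has nothing to say about it.
print(predict_next(small_model, "dog"))
# The larger model has seen "dog" followed by "sat" twice.
print(predict_next(larger_model, "dog"))
```

The same principle, on a vastly larger scale and with far more sophisticated models, is why the volume of training data matters so much.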

From these brief considerations, it is immediately apparent that the data collection used to train ChatGPT can be problematic for a variety of reasons. Ever since the release of ChatGPT, questions have surfaced about the possible fate of certain professions, in particular the creative categories of writers, journalists and artists.

Here at Trust-IT/COMMpla, we do not believe that ChatGPT, or AI more generally, will replace people in their jobs, or at least not in the immediate future. Much more likely, it will change the way they work, as the history of the IT world has already taught us. Certainly, a whole series of processes can be automated and made more efficient, but they cannot function without human oversight. For example, even the best translation software can never completely replace a skilled interpreter, who is still needed wherever a literal translation does not suffice.

We believe, however, that the point about ChatGPT, and about AI more generally, is not so much whether or not it will replace people in their jobs, whether students will be able to pass exams without studying, or whether everyone will become ‘good’ thanks to AI. The crucial question, in our opinion, relates to two main aspects on which we have been working in multiple research and innovation projects funded by the European Commission: privacy and (cyber)security. Firstly, the people whose personal data were used were probably not asked whether ChatGPT could use them in the learning phase. Even when data are publicly available, their use may infringe copyright: think, for instance, of books or musical compositions. At the moment, it also seems that OpenAI does not offer procedures to check whether the company is storing personal information or to request its deletion. This right should be guaranteed in accordance with the European General Data Protection Regulation (GDPR). The so-called ‘right to be forgotten’ becomes especially important when information is inaccurate or simply wrong.

Secondly, we cannot overlook an aspect of IPR (Intellectual Property Rights): are we sure that ChatGPT does not store, or re-use for its own learning, the information submitted in requests, instead of handling it confidentially? For example, if employees of a company use ChatGPT by sending confidential information about their products, algorithms, patents, contracts and emails to be reviewed, could such data be used to provide other users with answers to their queries, creating the serious problem of disclosure of confidential information?
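One simple precaution a company could consider is to strip obviously sensitive details from a prompt before it leaves the organisation. The sketch below is only an illustration under our own assumptions: the `redact` function, the e-mail pattern and the project name are hypothetical, and a simple filter like this will not catch every kind of confidential information.

```python
import re

# Simplified pattern for e-mail addresses (an assumption; it does not cover every valid address).
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact(text, confidential_terms):
    """Replace e-mail addresses and listed confidential terms with placeholders."""
    redacted = EMAIL_PATTERN.sub("[EMAIL]", text)
    for term in confidential_terms:
        # re.escape treats the term as literal text rather than as a regular expression.
        redacted = re.sub(re.escape(term), "[REDACTED]", redacted, flags=re.IGNORECASE)
    return redacted


# Hypothetical prompt an employee might want to send to a chatbot.
prompt = "Please review this note from anna@example.com about Project Falcon pricing."
safe_prompt = redact(prompt, ["Project Falcon"])
print(safe_prompt)
```

A filter of this kind reduces accidental disclosure, but it is no substitute for a clear company policy on what may be shared with external AI services.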

The platform’s terms of service are very long. I was surprised by two aspects: the service is (in theory) reserved for those over 18 years old, and it is provided without any guarantee of the quality of the answers, which are used at one’s own risk. In fact, if ChatGPT should directly or indirectly create problems for those who have used it, the maximum sum that OpenAI says it is willing to pay is $100.

Lastly, the (cyber)security considerations: it has been demonstrated (some examples are available on the internet) that ChatGPT can be used to create advanced malware capable of evading the common defences of cybersecurity solutions. ChatGPT has a number of filters that should prevent it from being used for ‘malicious’ purposes, but it is quite easy to circumvent them and induce the software to write malicious code, which can then be used even in complex attacks. Furthermore, ChatGPT can be used to mutate this code, in theory producing multiple variants of the same malware and making the attack even more difficult to foresee and combat.

Clearly, advanced programming skills are still needed to put the generated code to full use, but there is no doubt that the tool lowers the barrier to entry for all actors revolving around malware, which can therefore also be ‘written’ by people with few programming skills or very limited technical expertise.

We may be exaggerating a little bit by imagining the most catastrophic implications, but it is a fact that we live in a world built on data and content. What is clear is that the availability of AI chatbots will make it possible to multiply certain kinds of information with minimal effort. Undeniably, ChatGPT is a powerful and versatile language model with the potential to revolutionise the way we learn and interact with machines. It is not the oracle that solves all our problems… it is simply another resource we have at our disposal to improve our work!