We first posted this alert on 3 April 2023 and have updated it to reflect recent developments.
We have talked in previous alerts about the increased conflict between AI applications and data protection. There was another example of this on Friday when the Italian Data Protection Authority (the Garante) ordered ChatGPT to stop processing data on Italian users.
ChatGPT is an AI-powered chatbot operated by OpenAI, a US corporation which has received substantial investment from Microsoft. According to OpenAI, Microsoft has invested $1 billion in OpenAI as part of a partnership announced in 2019. OpenAI has its HQ in the US but ChatGPT was available to users throughout the EU, including in Italy. Following the release of ChatGPT, OpenAI’s valuation was estimated at US$29 billion.
The Garante has previously taken action against other AI-based applications including ReplikaAI (see https://www.corderycompliance.com/italy-dpa-chatbot-0223/) and Clearview AI (see https://www.corderycompliance.com/clearview-ai-italy-gdpr-fine). In addition, the Garante has previously brought proceedings relating to the use of AI in food delivery – https://www.corderycompliance.com/garante-fines-deliveroo/. When we wrote about the ReplikaAI case in February we predicted that ChatGPT might be next and Friday’s announcement confirms that to be the case.
What did the Garante decide?
The Garante decided that it should impose “an immediate temporary limitation” on the processing of Italian users’ data by OpenAI. It has also started an inquiry into OpenAI’s use of data.
The main concern here was ChatGPT’s pot of training data. Chatbots often use large pots of data to train the bot and to provide data to answer questions. As in the Clearview AI case, the Garante was concerned about how that data was collected, whether data subjects had been told and what the lawful basis of processing was. It said:
“no information is provided to users and data subjects whose data are collected by Open AI; more importantly, there appears to be no legal basis underpinning the massive collection and processing of personal data in order to ‘train’ the algorithms on which the platform relies.”
The Garante also had concerns about the accuracy of some of ChatGPT’s answers. GDPR Art. 5 contains 6 principles for data processing. Principle (d) says that data must be accurate and, where necessary, kept up to date.
The Garante also had concerns about the use of ChatGPT by children. As in the ReplikaAI case, the Garante was concerned that there was no age verification element with the chatbot and that the sign-up process was not sufficient to make sure OpenAI’s wish to exclude children under 13 was being enforced.
What does ChatGPT say about the data it uses?
ChatGPT itself is fairly vague as to the data it uses as the basis for its answers. It says:
“As an AI language model, I use a vast amount of text data from various sources to answer questions. This data includes books, articles, research papers, websites, and other publicly available information on the internet. My training dataset was collected from a diverse range of sources, and it includes texts from many different fields such as science, history, literature, and more. When I receive a question, I analyse the input text to identify keywords and phrases that can help me understand the question’s context and meaning. Based on this analysis, I search through my vast database of text data to find relevant information that can help me generate an answer. It’s important to note that my responses are generated based on statistical patterns and associations in the data I was trained on, and I don’t have personal experience or feelings like a human being.”
ChatGPT does not use live data like a search engine might. It says it only uses data it collected in 2021, although some users have highlighted answers which might suggest more recent data has been used.
What about the extra-territorial aspects?
What will ChatGPT do next?
That seems unclear. We asked ChatGPT the question today and it erroneously seemed to imply the ruling was not binding on it:
“As an AI language model, I do not have the ability to take any action regarding the Italian ruling against ChatGPT, as I am not a legal entity and do not have any physical presence or legal representation in Italy. My purpose is to generate human-like responses to the queries I receive based on the data I have been trained on. I do not have any control over the questions asked of me, nor do I have any knowledge of specific legal proceedings or disputes related to my responses. It is important to note that while I strive to provide accurate and helpful information, the responses I generate are not legal advice, and users should always consult with qualified legal professionals for any legal matters.”
OpenAI’s CEO Sam Altman has said on Twitter that he is considering the ruling. On Friday he tweeted:
“We of course defer to the Italian government and have ceased offering ChatGPT in Italy (though we think we are following all privacy laws). Italy is one of my favourite countries and I look forward to visiting again soon!”
Setting aside the fact that the ruling was from the Garante rather than the Italian government, it is unclear what changes (if any) OpenAI has made to comply with the order. For example, on 4 April 2023 we asked ChatGPT about the marital history of a former Italian Prime Minister. It replied:
“Silvio Berlusconi, the former Prime Minister of Italy, has been married three times. His first marriage was to Carla Dall’Oglio in 1965, which ended in divorce in 1985. He then married actress Veronica Lario in 1990, but they separated in 2009 and their divorce was finalized in 2014. Berlusconi’s third marriage was to Francesca Pascale in 2013, whom he had been dating since 2009.”
According to a public statement from the Garante, OpenAI approached it on 3 April 2023 and asked to start a dialogue about complying with the order. Talks will be held on 5 April 2023, a move which the Garante said it welcomed. The Garante said OpenAI had written:
“to express its immediate willingness to collaborate with the Italian Authority in order to comply with the European privacy discipline and reach a shared solution capable of resolving the critical profiles raised by the Authority regarding the processing of data of Italian citizens.”
Will there be a fine?
Possibly, although no fine has been levied yet. The Garante has said that OpenAI has 20 days to tell it what measures it has taken to comply with the order. Otherwise, it says, a fine of up to €20 million or 4% of OpenAI’s total worldwide annual turnover may be imposed.
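To illustrate how that ceiling works in practice, here is a minimal sketch in Python. It assumes the rule in GDPR Art. 83(5), under which the maximum is the higher of the fixed €20 million figure and 4% of turnover. The function name and the turnover figures are hypothetical and chosen only for illustration. They are not OpenAI’s actual figures, and the sketch says nothing about the fine a regulator would actually impose.

```python
from decimal import Decimal

# Assumption: ceiling under GDPR Art. 83(5), the higher of the two figures.
FIXED_CAP_EUR = Decimal("20000000")
TURNOVER_RATE = Decimal("0.04")


def max_gdpr_fine(worldwide_annual_turnover_eur: Decimal) -> Decimal:
    """Return the upper limit of a fine, not the fine itself."""
    if worldwide_annual_turnover_eur < 0:
        raise ValueError("turnover cannot be negative")
    turnover_based_cap = TURNOVER_RATE * worldwide_annual_turnover_eur
    return max(FIXED_CAP_EUR, turnover_based_cap)


# Hypothetical turnover figures, for illustration only.
smaller_business = max_gdpr_fine(Decimal("100000000"))   # 4% is EUR 4 million
larger_business = max_gdpr_fine(Decimal("1000000000"))   # 4% is EUR 40 million
print(smaller_business, larger_business)
```

On these assumed figures the fixed €20 million ceiling applies to the smaller business, while the turnover-based ceiling of €40 million applies to the larger one.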
How have other DPAs reacted?
The UK Data Protection Authority, the ICO, published a blog today in which it said it would also be investigating chatbots, although ChatGPT was not specifically mentioned as a bot under investigation. The ICO announcement said:
“As the data protection regulator, we will be asking these questions of organisations that are developing or using generative AI. We will act where organisations are not following the law and considering the impact on individuals.”
Other DPAs have said that they are also monitoring developments including the DPAs in Canada, France, Ireland and South Korea.
What does the ruling mean?
The ruling is a reminder that any business thinking of adopting an AI-based solution must think through the likely consequences. Businesses thinking of adopting or buying in an AI-based tool should:
- Ensure your solution complies with data protection laws, and in time purpose-built AI legislation – existing data protection requirements will need to be complied with, such as the data protection principles of fairness, accuracy and transparency. You’ll also need to make sure that you have a proper legal basis for processing the data. Just because the data ‘is in the public domain’ doesn’t mean it’s fair game for re-use.
- You’ll need to check that any data you’re using was gathered lawfully. This will involve making sure you’re aware of IP rights. Since the Shetland Times case in 1996 the courts have upheld copyright in website content and been prepared to prohibit use elsewhere. You’ll also need to consider the terms of service on any website the data was scraped from. If the data has been scraped in breach of the terms of service that could be a criminal offence too.
- You’ll need to work out where the data is being stored. You might need to address data transfer issues too.
- If you’re inputting your data into the tool, do you know what that data will be used for? Will your data be added to the training data pot? If so, are you happy with that, especially given the questionable ownership of some AI solutions? How can you guarantee the security of the data, which is also a GDPR requirement?
- Do proper due diligence on any potential vendor or provider. Don’t be fooled by sales spiel and make sure that any solution you are being offered does what is being promised, is offered by a reputable supplier and addresses any compliance concerns. You’ll need a proper contract too with the right level of compensation if things go wrong.
- Keep on top of the available guidance – there is a high volume of guidance in this area, and expect more to come in future as this area develops further.
- Ensure that adequate safeguards are in place to protect people from biased or discriminatory decisions or outcomes – these should leverage the best aspects of both human and automated intelligence.
- Ensure that ethical considerations are given sufficient weight – just because you have the technical capability to do something does not necessarily mean that you should.
- Make sure you have proper processes in place to recognise and act on data subject requests. These could be requests for the information being processed, requests to correct inaccurate data or requests to delete data. Individuals also have more rights under GDPR Art. 21 and GDPR Art. 22 when automated decision making is involved.
- Do a proper DPIA. The ICO’s recent guidance says “You must assess and mitigate any data protection risks via the DPIA process before you start processing personal data. Your DPIA should be kept up to date as the processing and its impacts evolve.”
- Consult with specialist lawyers and build time into your development schedules to assess the risks properly.
For more information
We have talked in more detail about these issues in our short films here https://bit.ly/chatfilm2 and here https://bit.ly/chatgptfilm. Our note from 2021 has some more background on the legal aspects of AI https://www.corderycompliance.com/ai-and-gdpr-teaching-machines-fairness/.
There is more information about this and other data protection topics in Cordery’s GDPR Navigator subscription service. GDPR Navigator includes short films, straightforward guidance, checklists and regular conference calls to help you comply. More details are at www.bit.ly/gdprnav.
You can read the Garante’s decision here https://bit.ly/3GuPA6P.
For more information please contact Jonathan Armstrong or André Bywater who are lawyers with Cordery in London where their focus is on compliance issues.
|Jonathan Armstrong, Cordery, Lexis House, 30 Farringdon Street, London, EC4A 4HH||André Bywater, Cordery, Lexis House, 30 Farringdon Street, London, EC4A 4HH|
|Office: +44 (0)207 075 1784||Office: +44 (0)207 075 1785|