Highlight AI-generated information!

So now the time has come… In a new paper (https://lnkd.in/eADdp5r6), some colleagues from Stanford and Rice University prove what has been bothering me for some time:

𝘁đ—ŋ𝗮đ—ļđ—ģđ—ļ𝗲đ—ŋ𝘁 đ—ē𝗮đ—ģ 𝗞𝗜 đ—ēđ—ļ𝘁 𝗞𝗜-𝗴𝗲đ—ģ𝗲đ—ŋđ—ļ𝗲đ—ŋ𝘁𝗲đ—ģ 𝗧đ—ŋ𝗮đ—ļđ—ģđ—ļđ—ģ𝗴𝘀𝗱𝗮𝘁𝗲đ—ģ, 𝘄𝗲đ—ŋ𝗱𝗲đ—ģ 𝗱đ—ļ𝗲 𝗘đ—ŋđ—´đ—˛đ—¯đ—ģđ—ļ𝘀𝘀𝗲 𝘀𝗰đ—ĩ𝗹𝗲𝗰đ—ĩ𝘁𝗲đ—ŋ (i.e. they are becoming more and more similar).

The paper demonstrates this effect impressively using image generation as the example, but in general it applies to any type of generative AI! So it also applies, for example, if you train ChatGPT with data generated by ChatGPT… At some point, the AI cannibalizes itself.

We are currently living in an age in which the ratio of generated to real data is still very favorable (AI is only just beginning). However, this is changing rapidly. And that means that we will soon have nothing left with which we can train the AIs in a meaningful way (almost all available data has already been used to train the large language models anyway).

Because even if new information is constantly being produced: if we cannot distinguish between what is human-generated and what is AI-generated, then we will no longer have any data we can trust for training, i.e. at some point our AIs will stop improving.

𝗚𝗲đ—ŋ𝗮𝗱𝗲 đ—ŗđ˜‚Ėˆđ—ŋ 𝗨đ—ģ𝘁𝗲đ—ŋđ—ģ𝗲đ—ĩđ—ē𝗲đ—ģ đ—ļ𝘀𝘁 𝗱𝗮𝘀 𝗱𝗲đ—ŋ đ—¯đ—šđ—Žđ—ģ𝗸𝗲 𝗛đ—ŧđ—ŋđ—ŋđ—ŧđ—ŋ!

As soon as your company’s employees start using generative AI in an uncontrolled/unguided manner, you are potentially digging your own data grave, because at some point you will no longer be able to rely on your data. And your data is your capital…

𝗗𝗮đ—ĩ𝗲đ—ŋ đ—ē𝘂𝘀𝘀 𝗮𝗸𝘁𝘂𝗲𝗹𝗹 𝗱đ—ļ𝗲 đ—ŧđ—¯đ—˛đ—ŋ𝘀𝘁𝗲 đ—Ŗđ—ŋđ—ļđ—ŧđ—ŋđ—ļđ˜đ—ŽĖˆđ˜ 𝘀𝗲đ—ļđ—ģ, 𝗞𝗜-𝗗𝗮𝘁𝗲đ—ģ 𝘇𝘂 𝗸𝗲đ—ģđ—ģ𝘇𝗲đ—ļ𝗰đ—ĩđ—ģ𝗲đ—ģ!

Difficult to impossible in public, but fortunately feasible within the company.
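
To make the idea of labeling a bit more concrete, here is a minimal sketch of what it could look like inside a company. Everything in it is an assumption for illustration: the names `Origin`, `Record` and `training_candidates` are hypothetical and not taken from the paper or from any specific product. The point is only that every piece of data carries its origin with it, and that anything unlabeled is treated as not trustworthy for training.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Origin(Enum):
    """Hypothetical provenance label attached to every stored record."""
    HUMAN = "human"
    AI = "ai"
    UNKNOWN = "unknown"


@dataclass(frozen=True)
class Record:
    text: str
    # Assumption: unlabeled data defaults to UNKNOWN and is therefore excluded.
    origin: Origin = Origin.UNKNOWN
    # Optional free-text note on which tool produced the content.
    tool: Optional[str] = None


def training_candidates(records: List[Record]) -> List[Record]:
    """Keep only records explicitly labeled as human-generated."""
    return [r for r in records if r.origin is Origin.HUMAN]


records = [
    Record("Quarterly report written by the finance team", Origin.HUMAN),
    Record("Draft summary produced by a chatbot", Origin.AI, tool="ChatGPT"),
    Record("Text copied from an old wiki page"),  # no label, so UNKNOWN
]

clean = training_candidates(records)
print([r.text for r in clean])
```

In this sketch, only the first record ends up in `clean`. The strict default is a deliberate design choice: it is safer to lose some unlabeled data than to let AI-generated content slip into the training set unnoticed.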

If you don’t know how to do this, please contact me!



P.S.: Of course, there are also applications where synthetic data is very helpful. However, these are isolated exceptions and not the general rule.

P.P.S.: here are some of my previous posts on this topic:
https://lnkd.in/eY8rC8C7
https://lnkd.in/e3bcJ_92
https://lnkd.in/efAex_M2
https://lnkd.in/eHZMm6KZ

P.P.P.S.: I generated the cover picture with Midjourney. Because our Generative AI is still working 🙂