The Dark Side of LLMs
Hype
The thing about hype is that value often comes later, after a wave of inflated expectations. During the cycle, a lot of crazy things happen, and that's usually where things go wrong. The sooner we get past the hype cycle, the sooner we can be productive and use AI well.
People believe different things. Some think code itself is the problem, and that the best thing a company can do is buy every product in the universe and stay away from code. I believe code is code, and you want to own your code. One bad thing about the AI hype is that people think engineering will be dead and no one will be coding in two years, which is a very wrong idea; I doubt it will happen that fast. To some degree, the AI hype creates a kind of Fear, Uncertainty, and Doubt (FUD):
- You don't need engineers; everything will be completely automated
- Stop learning engineering. Just focus on AI
- No one will be coding, so don't build anything
The last one could well be the worst. Because, like I said before, if your culture is anti-coding and anti-building, this is gasoline on the fire.
The focus should be on bad products; code is not the problem.
AI is not magic and cannot fix all the problems companies have, especially because some problems are unique. Also, think about who is better at being creative, AI or humans? Who do you want in control, AI or humans? It depends. For sure, humans are more creative. AI is much better with numbers, and while we humans make mistakes, AI makes mistakes too, actually a lot of them.
Code is not the bottleneck.
Code is not the bottleneck. Typing is easy; people type fast nowadays, and many can type out an answer faster than ChatGPT can generate one. The issue is understanding and figuring out what to do. Remember, when you use an LLM, you need to tell it what to do. What if you can't articulate what you want? Now think about this: we have been doing engineering for 50+ years, and we still struggle to discover requirements, understand users, and figure out what works and what does not. The limitations are still the same; don't expect to say half a word and have AI generate precisely what you were thinking.
Discovery is the bottleneck, and discovery is a journey. It is about back and forth, experiments, hard work, tons of iterations and thinking, mistakes, and bets. It's not an answer that already exists or a problem that is already solved, where we just need the LLM to do it. If that were the case, engineering would have died long ago. If you pay attention to what's happening, AI is working more as a search engine and auto-complete than as Skynet and the rise of the machines.
The reality is that people, especially management, have always been obsessed with engineering productivity. I understand engineers are expensive; thanks to AI, engineers are not the most costly thing anymore :-). But if you are obsessed with productivity, you are probably reading the AI hype incorrectly. Again, the question is not just how to do it faster, but how to do it right, in a way that fixes users' problems and generates revenue for the company.
Fooled by AI
We also see all sorts of scams, like Rabbit, and people tricking LLMs, like the guy who bought a car for 1 USD. It does not stop there; Amazon is dropping its Go stores, which relied on an army of people in India watching videos and doing much of the work manually. Where there is hype, there is investment, and where there is money, there are scams. Let's remember Gemini telling a kid that C++ is too dangerous for him. LLMs are misleading customers; the list goes on and on, and we will see plenty more. Let's not forget the FAKE DEMOS, like Devin and Sora. Hype sells and is an effective marketing tool. Remember, reality and hype are different things.
AI Tools Gold Rush
AI is like the Internet in its early days. There is a massive gold rush, and companies are trying to catch this wave and surf it. Not all products are good, and not all companies are serious. When a gold rush happens, companies take advantage of it by selling mining tools. In AI, we are seeing a crazy amount of tooling popping up. Some of it is good and useful, but not all.
Copilots
I actually think copilots are cool and useful. There are lots of copilots out there right now, to name a few: GitHub Copilot, Supermaven, Amazon Q, and many others. IMHO, copilots are here to stay. GitHub Copilot is good, though slow. There are security implications with copilots, but with enough care and due diligence, we can definitely use them safely.
Hallucinations
(Image: The Temptation of Saint Anthony, detail, Mathias Grünewald, 1515)
One thing LLMs do is hallucinate. They will provide an answer that looks right on the surface but is actually wrong. I have repeatedly asked LLMs to generate Zig code and gotten Rust or C++ code back. I have seen copilots generate code, many times, that does not compile, is full of bugs, or is simply wrong. So they are much better auto-complete tools than we used to have in our editors and IDEs, but like I said, they are far from getting it right all the time. They are improving daily, but they could be better. For instance, AI legal research products hallucinate 17-33% of the time.
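Hallucinated code can at least be screened mechanically before a human reads it. Here is a minimal sketch in Python (illustrative only, not a real validation pipeline): reject output that does not even parse. This catches gross hallucinations, like an answer in the wrong language entirely, but of course not logic bugs.

```python
import ast

def looks_like_valid_python(generated: str) -> bool:
    """Cheap first-line defense: reject LLM output that is not even
    syntactically valid Python. Catches wrong-language answers,
    but NOT logic bugs."""
    try:
        ast.parse(generated)
        return True
    except SyntaxError:
        return False

# A plausible-looking hallucination: Rust syntax returned for a Python request
assert not looks_like_valid_python("fn add(a: i32, b: i32) -> i32 { a + b }")
assert looks_like_valid_python("def add(a, b):\n    return a + b")
```

A stronger version of the same idea is to actually compile or run the generated code in a sandbox before accepting it.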
Insecure code
LLMs also generate code that is insecure and has vulnerabilities, so you can't trust the generated code 100%. For a senior engineer, this is fine: a senior can make sense of things, know what to do, spot what's wrong, and improve it. Junior engineers are trickier because they are at the beginning of their careers and may not yet know the difference between right and wrong. So you need to find a way to give them a copilot while still reviewing the code.
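A classic example of the kind of vulnerability LLMs reproduce is string-interpolated SQL. The sketch below (hypothetical helper names, an in-memory SQLite table invented for the example) shows the insecure pattern next to the parameterized fix a reviewer should insist on:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

def find_user_unsafe(name: str):
    # The kind of code LLMs routinely generate: user input interpolated
    # straight into SQL -- vulnerable to injection.
    return conn.execute(f"SELECT name FROM users WHERE name = '{name}'").fetchall()

def find_user_safe(name: str):
    # Parameterized query: the driver handles escaping.
    return conn.execute("SELECT name FROM users WHERE name = ?", (name,)).fetchall()

# An injection payload dumps every row in the unsafe version...
assert find_user_unsafe("' OR '1'='1") == [("alice",)]
# ...but matches nothing when handled safely.
assert find_user_safe("' OR '1'='1") == []
```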
Copy Paste
Using a copilot to speed things up is great. Using it without understanding the code being generated is a recipe for failure. We need to understand the code at all times.
Less Refactoring, more anti-patterns, faster
Slow DevEx
LLM Architecture
Let's address the big problems now.
Training Cost
Training costs are enormous: pre-training an LLM costs millions of dollars and takes months. Not all companies will be able to run pre-training for LLMs, because it is a very intensive process that costs a lot of money and takes time. Big tech companies run such training 1-2 times per year.
Data
Data is a big problem. Some data is hard to obtain in high volumes. Synthetic data generation can help, but it is limited to patterns we already know. Usually, big tech companies use Wikipedia and other large corpora, combined with books and papers, but code is also used to train LLMs; GitHub's code gives Microsoft's Copilot a significant advantage there. Either way, data is an essential and limiting factor. We need more data.
Data has another problem: a lot of the data out there is low quality, and LLMs need data to be trained. For instance, one study found that 51.24% of the samples from 112,000 C programs contained vulnerabilities. Now think about this: you pre-train or fine-tune a model in your company and feed it your code. If your code is well written, great; but what if your code is poorly written and full of anti-patterns and technical debt? What do you think you will be "teaching" the LLM? The model will replicate the anti-patterns, because LLMs cannot really reason.
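One mitigation is to curate the corpus before training on it. Here is a toy sketch of that idea (the heuristics are illustrative, not a real static analyzer): keep only samples that parse, and drop one obvious anti-pattern.

```python
import ast

def is_acceptable_sample(code: str) -> bool:
    """Toy curation filter for a code training corpus: drop samples
    that do not parse, and samples containing a bare 'except:' handler
    (which silently swallows every error)."""
    try:
        tree = ast.parse(code)
    except SyntaxError:
        return False
    for node in ast.walk(tree):
        if isinstance(node, ast.ExceptHandler) and node.type is None:
            return False
    return True

corpus = [
    "def div(a, b):\n    return a / b",        # fine
    "def div(a, b) return a / b",              # does not parse
    "try:\n    risky()\nexcept:\n    pass",    # anti-pattern
]
cleaned = [c for c in corpus if is_acceptable_sample(c)]
assert len(cleaned) == 1
```

Real pipelines use much heavier filtering (deduplication, license checks, security scanners), but the principle is the same: garbage in, garbage out.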
The problem with Fine-Tuning
So, if pre-training is too expensive and data is limited, how can we overcome this problem? There are a couple of routes, such as RAG or fine-tuning. The problem with fine-tuning is that some papers already claim that fine-tuning makes the model forget its original training, which makes performance drop considerably. So there are limits to fine-tuning.
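RAG, by contrast, sidesteps retraining entirely: retrieve relevant documents at query time and put them in the prompt. A minimal sketch, using bag-of-words cosine similarity as a stand-in for a real embedding model (the documents and question are made up for illustration):

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question: str, docs: list) -> str:
    """Return the document most similar to the question. A real RAG
    system would use embeddings and a vector index instead."""
    q = Counter(question.lower().split())
    return max(docs, key=lambda d: cosine(q, Counter(d.lower().split())))

docs = [
    "Deploys run every Friday from the main branch.",
    "The billing service retries failed charges three times.",
]
context = retrieve("how often do deploys run", docs)
# The retrieved context is prepended to the prompt sent to the LLM:
prompt = f"Answer using only this context:\n{context}\n\nQ: how often do deploys run?"
assert "Friday" in context
```

The model's weights never change; only the prompt does, which is why RAG avoids the catastrophic-forgetting problem of fine-tuning.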
Transformers' complexity and inefficiency
Gen AI uses a lot of power, costs a lot of money, and takes a long time. Clearly, things are not scalable as they are, and there are lots of inefficiencies and problems to overcome. The transformer architecture itself is pretty complex and hard to understand.
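One concrete source of that inefficiency is self-attention itself: every token attends to every other token, so the score matrix grows quadratically with sequence length. A toy sketch with 1-D "embeddings" (real models use high-dimensional vectors and learned projections) makes the n x n cost visible:

```python
import math

def attention_scores(seq):
    """The core of self-attention: a weight for every pair of positions.
    For n tokens this is an n x n matrix -- the quadratic term that
    makes long contexts expensive in compute and memory."""
    n = len(seq)
    # Dot products between all pairs (toy 1-D "embeddings").
    raw = [[seq[i] * seq[j] for j in range(n)] for i in range(n)]
    # Softmax each row so the weights form a distribution.
    out = []
    for row in raw:
        m = max(row)
        exps = [math.exp(v - m) for v in row]
        s = sum(exps)
        out.append([e / s for e in exps])
    return out

scores = attention_scores([0.1, 0.5, 0.9, 0.3])
assert len(scores) == 4 and all(len(r) == 4 for r in scores)   # n^2 entries
assert all(abs(sum(r) - 1.0) < 1e-9 for r in scores)           # rows sum to 1
```

Doubling the context length quadruples this matrix, which is one reason long-context models cost so much to train and serve.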