The Dark Side of LLMs

AI is the most considerable hype right now. It is not as new as people think, starting in the 1950s. Significant advances have happened since 2017 with the transformers architecture, which is the heart of all generative AI. There is an excellent potential for substantial disruption because of AI. We are seeing significant improvements, but we are far from AGI. AI can be practical and honest, add value to the business, and improve our lives. AI is narrow at the moment and has many challenges. One of the industries that has the potential to be highly disrupted by AI is the technology industries and engineers. Large Language Models (LLM) can do amazing things. From generating text, generating creative images and videos(with lots of problems), and even generating code. It's absolutely normal to be concerned, but the more you understand what's actually going on, the less you need to be worried. If you are an expert, you will be fine. We saw a great leap and boost of evolution, but we will still see such growth year after year. That does not mean we can completely ignore AI and pretend it is not happening. However, there is a big difference between learning AI and AI, which will take all our jobs in two years. Before I start talking about the problems and the challenges, let me be clear, I think AI can be good, and we can use it to build better products and tools; however, it is not a panacea, and it's not the solution for all problems.

Hype

Hype Cycle

The thing about hype is that often, value comes later, after lots of inflated expectations. During the cycle, a lot of crazy things happen. That's where usually things go wrong. We can pass this cycle of hype as soon as possible to be productive and use AI well.

People believe in different things. Some people think code is wrong, and the best thing is to just buy all the products of the universe, and staying away from code is the best thing a company can do. I believe that code is code, and you want to have code.  One bad thing about the AI hype is that people think engineering will be dead and no one will be coding in two years, which is a very wrong idea, and I wonder if it will happen that fast. To some degree, it is almost like the AI hype creates a Fear Understancy Doubt (FUD) that:

  • You don't need engineers; everything will be completely automated
  • Stop learning engineering. Just focus on AI
  • No One will be coding, so don't build anything

The last one, for sure, could be the worst. Because, like I said before, if the culture you have is anti-coding and anti-building, this is for sure gasoline into the fire. 

More focus on bad products; Code is not the problem.

Bad products

Now, I would like to say that not all products are good. People romanticize products, thinking they are perfect and can fix all the universe's problems. But the realization is that even when you buy a product, you always need to do some sort of integration; however, because you need the code, that can be pretty hard and lead to poor user experience. We need to avoid the trap that AI will boost products so that we don't need to do anything; we just need to take a vacation and let the robots work for us; we are not quite there. 

AI is not magic and cannot fix all the problems companies have, especially because some issues are unique. Also, think about Who is best at being creative, AI or humans? Who do you want in control: AI or Humans? So it depends. For sure, humans are more innovative than others. AI, for sure, is much better with numbers, and as humans, we can make mistakes, but hey, AI can make mistakes too, actually a lot of them. 

Code is not the bottleneck.

Code is not the bottleneck. Typing is easy; people can type fast nowadays. A lot of people can type and generate answers much quicker than chatGPT. The issue is understanding and figuring out what to do. Remember, when you use LLMs, you need to tell the LLM what to do. What if you need to say to the LLM what you want better? Now think about this, we have been doing engineering for more than 50+ years, and we still need help discovering requirements, understanding the users, and figuring out what works and what does not. The limitations are still the same; don't think you will say half of a word, and AI will be able to generate 100% precisely what you think.

Discovering is the bottleneck; discovery is a journey. Discovery is about back and forth, experiments, hard work, tons of interactions and thinking, experiments, mistakes, and bets. It's not an answer that is already done or a problem that is already solved; we just need the LLM to do it. If that were the case, engineering would have died long ago. If you pay attention to what's happening, AI is working more as a search engine and auto-complete than Skynet and the revolution of machines.

The reality is that people, especially management, have always been obsessed with engineering productivity. I understand engineers are expensive; thanks to AI, engineers are not the most costly thing anymore :-). But if you are obsessed with productivity, you probably take the AI hype incorrectly. Again, the question could be whether to do it faster, do it right, or do it in a way that fixes the problems of the users and generates revenue for the company.

Fooled by AI 

We also see all sorts of scams, like Rabbit, and people tricking LLMs, like the guy who bought a car for 1 USD. It does not stop there; Amazon is also dropping Go stores because it has an army of people in India watching videos and doing most of the work manually. Where there is hype, there is investment, and where there is money, there are scams. Let's remember Gemini telling a kid that C++ is too dangerous for him. LLMs are misleading customers; the list goes on and on, and we will see plenty more. Let's not forget the FAKE DEMOS like Devin and Sora. Hype is a way to sell and is an effective marketing tool. Remeber reality and hype are different things.

AI Tools Gold Rush

Gold Rush

AI is like the Internet in its early days. There is a massive gold rush, and companies are trying to get into this wave and surf it. Not all products are good, and not all companies are severe. When the gold rush happens, companies take advantage of that to sell mining tools. In AI, we are seeing a crazy amount of tooling popping up. Some are good and useful, but not all. 

Copilots

I actually think copilots are cool and useful. There are lots of copilots out there right now, to quote a few: Github Copilot, Supermaven, AmazonQ, and many others. IMHO, Copilots are here to stay. Github copilot is good, however slow. There are security implications with copilots, but with enough care and due diligence, we can definitely use them safely. 

Hallucinations

The Temptation of Saint Anthony, Detail, Mathias Grünewald, 1515

One thing LLM do is to hallucinate. They will provide an answer that will look right on the surface but might need to be corrected. I repeatedly asked LLMs to generate Zig code and got Rust or C++ code back. I have seen copilots generating code many times that does not compile, is full of bugs, or needs to be corrected. So these are much better auto-complete tools that we used to have in our editors and IDEs, but like I said, we must do more than just get it right all the time. They are getting it right and improving daily, but they could be better. For instance, AI Legal Research products hallucinate 17-33% of the time.

Insecure code

LLMs also generate code that is not secure and has vulnerabilities. So you can't trust 100% on the code that is being generated. For a senior, it is completely fine because a senior engineer can make sense of things and know what to do or even ignore what's wrong and improve it. However, junior engineers can be tricky because they are beginning their careers and might need to learn the difference between right and wrong. So you need to find a way to give them a copilot and look at the code. 

Copy Paste

Copy Pasta

One of the worst things is that LLMs must be better integrated into IDEs. And you need to copy and paste most of the code, sure you can use auto-complete. But usually, it is better than the chat. The problem with copying and pasting is that people typically need to think. For decades, I've been fighting the copy-and-paste culture, where engineers must understand what they are doing. They need help copying, pasting, and understanding. This will create a wrong code base full of anti-patterns and technical debt.

If you use a copilot to speed things up, that would be great. IF you need help understanding the code being generated, that is a recipe for failure. We need to be knowledge at all times.

Less Refactoring, more anti-patterns, faster

Here is my biggest fear. IF you are in a culture of delivering, no matter the consequences. AI can again put a lot of gasoline on the fire. Because people need to pay attention to details, we will be putting poison on the system just much faster. We can introduce anti-patterns and technical debts very fast. Don't want to take my word? Check out this research.

Slow DevEx


Here is something for us to think about. We code faster with LLMs and Copilots. But then we go to prod, and if things dont work, we will have more errors (faster). Are we going faster or slower if we have more bugs and need more troubleshooting time? The real problem is how we measure people; again, obsession with productivity is not good. Don't get me wrong, it is always good to be able to deliver more, and there is nothing wrong with that. But if we want to speed up, we must first speed up learning and understanding. Otherwise, we are not just making it all slower. There are two reasons for the one I mentioned; the second is that we are waiting for LLM to answer :-) 

LLM Architecture

Let's address big problems now. 

Training Cost

Training costs are enormous. It costs millions of dollars and months. Not all companies will be able to run pre-training for LLMs. Because it is a very intensive process, costs a lot of money, and takes time. Big tech companies are doing such training 1- 2 times per year.

Data

This is a big problem. Some data is complex to have in high volumes. Synthetical data generation can help but is limited to what we know; if there is a pattern, we know it can help. Usually, big tech companies use Wikipedia and other big data corpora combined with books and papers, but code is also used to train LLMs. Github has a significant advantage over MS Copilot. However, data is an essential and limiting factor. We need more data.

Data has another problem; a lot of data out there needs to be revised. LLM needs data to be trained. For instance, 51.24% of the samples from the 112,000 C programs contain vulnerabilities. Now, think about this: you will pre-train or fine-tune a model in your company and feed it with your code. If your code is well written, great, but what if your code needs to be better written and full of anti-patterns and technical debt? What do you think will be "teaching" the LLM model? The LLM model will replicate the anti-patterns because LLMs cannot really reason.

The problem with Fine Tunning 

So, if pre-training is too expensive and data is limited, how can we overcome this problem. There are a couple of routes, such as RaG or fine-tuning. The problem with fine-tunning is that some papers already claim that fine-tunning makes the model forget original training,, and that makesperformance drop considerably. So, there are limits to fine-tuning.

Transformers' complexity and inefficiency

Gen AI uses a lot of power, costs a lot of money, and takes a long time. Clearly, things are not scalable in the way they are, and there are lots of inefficiencies and problems that need to be overcome. Transformers' architecture is pretty complex and hard to understand. 

Making Sense of AI

IMHO, you need to be careful when putting AI in front of the end user for now. Engineering is a safe bet to use AI because if an engineer can review, it's internal and avoids creating customer problems. AI outside engineering needs to be evaluated with much caution and concern for security, privacy, and expectations. 

Google is adding AI to almost all its products, but I would argue in a pretty controlled way. Having an LLM chatbot in front of the user is where things can go wrong. Sure, there are techniques like proxy, adding guardrails, and even sanitizing user requests, even sending them to another LLM to check or summarize them before coming to the core LLM.  

We can still use AI to drive exciting benefits for users. But keep in mind that gen AI does not apply to all problems. It is suitable as an internal search system for finding information (again, search). 

The Road Ahead

AI will disrupt many industries, including technology, but we don't need to worry about losing our jobs over the years, which will take a long time. Since I started in the technology industry, I have heard people saying that engineering will be done and coding will be done. I remenber +20 years ago a teacher of mine telling me that he was always hearing that coding would end. Perhaps that will never end.

We still have a long road ahead with AI. Things could happen fast or take another 50 years for a significant improvement. Clearly, the current architecture needs to be faster and optimized enough. Everything is so resource-intensive and very, very complex. However, the good news is that we are seeing more APIs. Hugging Face is doing a great job of democratizing AI. LLMs tend to become comonitites and the real thing it would be able data, having the data and knowing how to use it, again the bottleneck is not productivity. 

Att some point, the architecture gets better and more efficient and less resource intensive. Until then, it's good to learn and keep exploring but with both feet on earth, grounded on reality and common sense. 

Cheers,
Diego Pacheco





Popular posts from this blog

C Unit Testing with Check

Having fun with Zig Language

HMAC in Java