The problems generated by vibe coding, which we discussed in previous articles, are serious enough to become the cause of the next major computer catastrophe. It's time to move to AI-assisted programming, which involves a more professional (and less risky) approach. It's ideal when you need programs in situations requiring security and stability.
In this article we will begin by discussing the most common problems with vibe coding and why, despite the noise made by some promoters, it does not constitute the new industrial revolution. In the following sections we will develop the procedure to go from vibe coding to AI-assisted programming in Linux.
The problems of vibe coding
Let's take an example: Two people go to a fancy restaurant. One knows nothing about cooking, the other is an expert chef. The first only knows what he wants to eat and, at most, will judge the quality of what he's served by the price or by what his untrained palate tells him. The second will be able to distinguish whether fresh ingredients were used, whether the seasonings were used in the correct amounts, and whether he's being overcharged.
It's no coincidence that we're using a restaurant as an example. Andrej Karpathy, an expert in Artificial Intelligence, is one of the co-founders of OpenAI and a former director of Artificial Intelligence at Tesla. Wanting to solve the problem of not knowing what to order, he used vibe coding to create an app that displays photos of the dishes on a menu. Here's how he describes his experience:
“Creating MenuGen with vibe coding was fun as a local demo, but quite tough as a real app. Building a modern app is like assembling IKEA furniture of the future: lots of services, APIs, configurations, limits, and prices. LLMs have outdated knowledge, make subtle mistakes, and hallucinate. And the funny thing is, I hardly spent any time coding, but rather configuring services in the browser. All of that isn't even accessible to an LLM. How are we supposed to automate everything by 2027 like that?”
Let's look in more detail at what happened to Andrej, who, by the way, is not a web developer, which probably led him to make things more complicated than necessary.
“I used Cursor + Claude 3.7, gave it the app description, and it wrote all the frontend components in React very quickly, building a beautiful website with soft multicolored fonts, small CSS animations, responsive design and all that, except for the actual backend functionality.”
Frontend and backend are two sides of the same coin in application design. The frontend is the part that users interact with, while the backend is where information is processed and stored. The frontend runs locally, in the user's browser or device, whereas the backend typically runs on a remote server.
Claude is, at least among vibe coding fans, the Large Language Model of the moment. The reasons for this are more ideological than technical. Anthropic, the company that develops it, clashed with the Pentagon over the unsupervised use of its models in the military.
Cursor is a code editor in which the selected Artificial Intelligence model acts as a programming companion. We'll talk later about how to install Cursor on Linux.
React is a library for creating dynamic graphical interfaces. I'm not familiar with the application, so I can't judge whether using React is appropriate, but based on my own experience with vibe coding, models tend to use libraries for things that could be done perfectly well with pure HTML, CSS, and JavaScript.
Where things started to get complicated for Karpathy was the backend. His application begins by taking a photo of the menu and using optical character recognition (OCR) to extract the dishes, then searches for details about each one. Let's see his description of what happened:
“This is where some problems started. I needed to call the OpenAI APIs to perform OCR on the menu items from the image. I had to obtain the API keys, navigating somewhat convoluted menus about ‘projects’ and detailed permissions. Claude kept hallucinating deprecated APIs, model names, and recently changed input/output conventions. This was confusing, but it was resolved after copying and pasting documentation several times. Once the API calls were working, I immediately encountered some pretty strict rate limits, which only allowed me to make a few queries every 10 minutes.”
This is a classic example of the tendency of models to use a sledgehammer to crack a nut. The menu image recognition could have been done with a local library like Tesseract.js.
Tesseract.js is a JavaScript port of the Tesseract OCR engine, whose development Google sponsored for years, and it supports over 100 languages. It runs directly in the browser and, most importantly, it doesn't consume tokens, doesn't require API keys, and has no rate limits.
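To give an idea of how little code local OCR needs, here is a minimal sketch in Python. It does not use Tesseract.js itself but pytesseract, a Python wrapper around the same Tesseract engine, and it assumes that pytesseract, Pillow and the Tesseract binary are installed. The function names read_menu and clean_menu_lines are hypothetical, and the price-stripping rule is only a rough heuristic that a real menu would probably need to refine.

```python
import re


def clean_menu_lines(ocr_text: str) -> list[str]:
    """Turn raw OCR output into a list of candidate dish names."""
    dishes = []
    for line in ocr_text.splitlines():
        # Drop a trailing price such as "..... 12.50", "6 €" or "$9".
        line = re.sub(r"[\s.]*[$€£]?\s*\d+(?:[.,]\d{1,2})?\s*[$€£]?\s*$", "", line)
        line = line.strip(" \t-.:")
        # Keep only lines that still contain at least two letters.
        if len(re.findall(r"[^\W\d_]", line)) >= 2:
            dishes.append(line)
    return dishes


def read_menu(image_path: str, lang: str = "eng") -> list[str]:
    """Run Tesseract locally on a menu photo: no API key, no tokens."""
    # Imported lazily so the text-cleaning helper works without OCR installed.
    from PIL import Image
    import pytesseract

    text = pytesseract.image_to_string(Image.open(image_path), lang=lang)
    return clean_menu_lines(text)
```

Calling read_menu("menu.jpg") on a photo of the menu would return the recognized lines with prices removed. Everything runs on the local machine, so no rate limit applies.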
Unfortunately, there's no solution to the hallucinations caused by deprecated APIs and outdated documentation, other than learning to program and not relying on automated tools.
He had similar problems with the second part of the application: converting the description of each dish into images.
“I registered, obtained a Replicate API key, and encountered similar problems. The queries weren't working because the LLM knowledge was outdated, and this time even the official documentation was a bit outdated due to recent changes in the API, which no longer returns direct JSON but a streaming object that neither Claude nor I fully understood. Then I ran into usage limits again, which made debugging difficult. Later, I was told that these are common anti-fraud measures, but they also make it harder to start legitimate new accounts. I was told that Replicate is migrating to a prepaid credit model, which might help.”
Andrej could have solved this without resorting to image generators. There are several tools that can search for real images of dishes (avoiding potential AI hallucinations). Among them are two recipe database APIs: TheMealDB and Spoonacular Food API.
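As an illustration, this is roughly what looking up a real photo in TheMealDB could look like. It is a sketch, not a finished client: it assumes the public search endpoint with the test key "1" that the service offers for development, and the response fields strMeal and strMealThumb. The function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumption: public development key "1"; a published app would need its own key.
SEARCH_URL = "https://www.themealdb.com/api/json/v1/1/search.php"


def build_search_url(dish_name: str) -> str:
    """Build the search URL for a dish name."""
    return SEARCH_URL + "?" + urllib.parse.urlencode({"s": dish_name})


def extract_photos(payload: dict) -> list[tuple[str, str]]:
    """Return (dish name, photo URL) pairs from a search response."""
    # The API answers {"meals": null} when nothing matches.
    meals = payload.get("meals") or []
    return [(m["strMeal"], m["strMealThumb"]) for m in meals if m.get("strMealThumb")]


def find_dish_photos(dish_name: str, timeout: float = 10.0) -> list[tuple[str, str]]:
    """Query the API over the network and return the matching photos."""
    with urllib.request.urlopen(build_search_url(dish_name), timeout=timeout) as resp:
        return extract_photos(json.load(resp))
```

A dish that isn't in the database simply yields an empty list, so the application can show the name without a photo instead of inventing one.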
The problems didn't end there for our friend the vibe coder. When uploading the application to Vercel (a platform that allows deploying and hosting applications from a GitHub repository), errors appeared that did not occur locally. It took an hour to realize that the API keys hadn't been uploaded to the server. An experienced programmer would have known to check this first and saved themselves the token expense.
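A common way to catch this kind of mistake in seconds instead of an hour is to have the program check its configuration at startup and fail with a clear message. The following is a minimal sketch of that idea; the variable names OPENAI_API_KEY and REPLICATE_API_TOKEN are assumptions chosen for illustration, not necessarily the ones MenuGen uses.

```python
import os

# Hypothetical variable names, chosen for illustration.
REQUIRED_KEYS = ("OPENAI_API_KEY", "REPLICATE_API_TOKEN")


def missing_settings(env=os.environ, required=REQUIRED_KEYS) -> list[str]:
    """Return the names of required variables that are absent or blank."""
    return [name for name in required if not env.get(name, "").strip()]


def check_settings(env=os.environ) -> None:
    """Stop at startup with a readable error instead of failing mid-request."""
    missing = missing_settings(env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

Calling check_settings() as the first thing the server does turns a mysterious production error into a one-line explanation in the deployment log.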
The author's idea is to charge for using the application (I wonder who's going to pay for something they can get for free by asking Gemini or Siri). For this, he needs user authentication. At Claude's suggestion, Karpathy turned to another cloud-based platform known as Clerk. Clerk handles everything necessary for registration and access. I have no objection to this decision.
The problem is that Claude wrote the code for a deprecated version of the Clerk API and forgot to tell him that, if he wanted to use it in production, he needed his own domain, not the free one that Vercel provides. He had to buy one and configure it.
There were also complications when setting up the payment platform. When he finally managed to send it to production, he discovered that:
“All processing was real-time, with no persistence. If it took too long, it failed. If you refreshed, you lost everything. The right solution would be a database plus a queuing system. But that meant more services (e.g., Supabase, Upstash), more complexity. Too much. I left it for the future.”
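The idea of "a database plus a queuing system" doesn't have to mean more cloud services. As a rough sketch of the concept, the code below stores each request as a row in a SQLite table and lets a worker process pending rows one at a time, so a page refresh or a slow request no longer loses the work. The table layout and the function names are assumptions made for this example, and it assumes a single worker; several concurrent workers would need locking that is not shown here.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the jobs table if needed."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        " id INTEGER PRIMARY KEY,"
        " dish TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'pending',"
        " result TEXT)"
    )
    db.commit()
    return db


def enqueue(db: sqlite3.Connection, dish: str) -> int:
    """Record a request and return its id; the client can poll with it later."""
    cur = db.execute("INSERT INTO jobs (dish) VALUES (?)", (dish,))
    db.commit()
    return cur.lastrowid


def process_next(db: sqlite3.Connection, handler) -> bool:
    """Run the oldest pending job; return False when the queue is empty."""
    row = db.execute(
        "SELECT id, dish FROM jobs WHERE status = 'pending' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return False
    job_id, dish = row
    try:
        result, status = handler(dish), "done"
    except Exception as exc:
        # A failure is stored too, so it can be inspected or retried.
        result, status = str(exc), "failed"
    db.execute(
        "UPDATE jobs SET status = ?, result = ? WHERE id = ?",
        (status, result, job_id),
    )
    db.commit()
    return True


def job_status(db: sqlite3.Connection, job_id: int):
    """Return (status, result) for a job, or None if the id is unknown."""
    return db.execute(
        "SELECT status, result FROM jobs WHERE id = ?", (job_id,)
    ).fetchone()
```

With a file path instead of ":memory:", the jobs survive a restart, which is exactly the persistence the original application lacked.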
Rather, the solution is to learn to program or pay someone who knows what they're doing. In the next article, we'll begin looking at the steps to do the former.


