Why LLMs are not Good for Coding. Challenges of Using LLMs for Coding | by Andrea Valenzuela | Feb, 2024 – Towards Data Science

Over the past year, Large Language Models (LLMs) have demonstrated astonishing capabilities thanks to their natural language understanding. These advanced models have not only redefined the standards in Natural Language Processing but have also found their way into a wide range of applications and services.

Interest in using LLMs for coding has grown rapidly, with some companies striving to extend natural language processing to code understanding and generation. This effort has already exposed several challenges that remain unaddressed. Despite these obstacles, the trend has led to the development of AI code generator products.

Have you ever used ChatGPT for coding?

While it can be helpful in some instances, it often struggles to generate efficient and high-quality code. In this article, we will explore three reasons why LLMs are not inherently proficient at coding out of the box: the tokenizer, the complexity of context windows when applied to code, and the nature of the training itself.

Identifying the key areas that need improvement is crucial to transforming LLMs into more effective coding assistants!

The LLM tokenizer is responsible for converting the user's input text, written in natural language, into a numerical format that the LLM can process.

The tokenizer processes raw text by breaking it down into tokens. Tokens can be whole words, parts of words (subwords), or individual characters, depending on the tokenizer's design and the requirements of the task.
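To make the three granularities concrete, here is a minimal sketch in plain Python. It is not how any production tokenizer works: the greedy longest-match rule and the tiny `TOY_VOCAB` are assumptions chosen for illustration, and real LLM tokenizers learn their subword vocabulary from data.

```python
import re

def word_tokens(text):
    # Word-level: runs of word characters, plus each punctuation symbol on its own.
    return re.findall(r"\w+|[^\w\s]", text)

def char_tokens(text):
    # Character-level: every character (including spaces) is a token.
    return list(text)

def subword_tokens(text, vocab):
    # Subword-level (illustrative): greedily take the longest piece found in
    # `vocab`, falling back to a single character when nothing matches.
    tokens = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab or j == i + 1:
                tokens.append(piece)
                i = j
                break
    return tokens

# Hypothetical hand-written vocabulary, only for this example.
TOY_VOCAB = {"token", "izer", "s", " ", "split", "text"}

sample = "tokenizers split text"
print(word_tokens(sample))                # 3 tokens
print(subword_tokens(sample, TOY_VOCAB))  # 7 tokens
print(len(char_tokens(sample)))           # 21 tokens
```

The same string yields very different sequence lengths depending on the granularity, which is one reason the tokenizer's design matters for a given task.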

Since LLMs operate on numerical data, each token is assigned an ID determined by the LLM's vocabulary. Each ID is then mapped to a vector in the LLM's high-dimensional latent space. For this last mapping, LLMs use embeddings that are learned during training and capture complex relationships and nuances in the data.
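The token-to-ID-to-vector pipeline can be sketched in a few lines. Everything here is an assumption made for illustration: the helper names (`build_vocab`, `encode`, `init_embeddings`), the `<unk>` fallback token, and the 4-dimensional random vectors, which merely stand in for embeddings that a real model would learn during training.

```python
import random

def build_vocab(tokens):
    # Assign each distinct token the next free integer ID; 0 is reserved
    # for unknown tokens (a hypothetical convention for this sketch).
    vocab = {"<unk>": 0}
    for tok in tokens:
        if tok not in vocab:
            vocab[tok] = len(vocab)
    return vocab

def encode(tokens, vocab):
    # Token -> ID lookup, with out-of-vocabulary tokens mapped to <unk>.
    return [vocab.get(tok, vocab["<unk>"]) for tok in tokens]

def init_embeddings(vocab_size, dim, seed=0):
    # One row per ID. Random values stand in for learned parameters.
    rng = random.Random(seed)
    return [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(vocab_size)]

tokens = ["def", " ", "add", "(", "a", ",", " ", "b", ")"]
vocab = build_vocab(tokens)
ids = encode(tokens, vocab)
table = init_embeddings(len(vocab), dim=4)
vectors = [table[i] for i in ids]  # ID -> embedding row

print(ids)
print(vectors[0])
```

Note that the two space tokens share an ID and therefore the same embedding row: the model sees identical input vectors wherever the same token appears.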

If you are interested in playing around with different LLM tokenizers and see how they
