Large Language Models

Large Language Models (LLMs) are incredible generative tools. They can have philosophical debates, plan trips, even replace your friends. Let’s look at three different approaches to working with LLMs:

Prompt a pretrained LLM

Commercial pretrained LLMs minimize upfront training costs and often give the best-performing experience with minimal technical expertise. Vercel’s AI Playground is a handy tool for experimenting with various commercial LLMs, though GPT-4 is usually my go-to:

Prompt: Write a conversation between two friends. Their names are Amar and Vijay. They are both immigrants to USA, working on H1B visa. Amar has just been laid off, and is at the risk of being deported. Focus on the characters emotions and suffering due to circumstances. Use the writing style of Fyodor Dostoevsky.

Response: Amar and Vijay sat in a small, dimly lit apartment, in a city far from the warmth of their homeland. A sense of impending doom hung in the air as they exchanged a languid greeting.

Amar, his face pale and gaunt, sank into a worn armchair, the weight of the world seemingly on his shoulders. “I have been laid off from my job,” he said, his voice barely audible. Vijay stared at him, his eyes wide with terror, as if Amar’s fate were a cruel specter that could just as easily turn its gaze upon him. “How could this happen? What will you do now?” he asked, his voice shaking.

“I don’t know. I am not given much time. I must find a new job within the next few weeks, or I will be deported.”

Vijay, who had suffered from sleepless nights for many months, fearing the very same fate, felt the familiar ache of empathy for his friend. “I am so sorry. Is there anything I can do to help you?”

Amar’s gaze fell to the floor, as if searching for some hidden truth among the dust and shadows. “I am not certain. You know as well as I do the cruel nature of our situation. We are trapped by our visas, bound to jobs that hold our lives hostage, and threatened with deportation if we dare to defy the system.” He clenched his hands, his thoughts drifting. “I fear not just for myself but for my family. If I am forced to leave the United States, their lives will be upended,” he said, the pain evident in his voice.

Vijay frowned, sharing in his friend’s torment. “I understand. My family, too, relies on the income I provide. The education of my younger siblings, the care of my parents, all of it would be jeopardized if I were to be sent back. My assets, my savings, everything I have worked for… It could all be lost if I am deported.” He stared at the floor, a sense of powerlessness gnawing at him. The room grew silent as the two friends sat, contemplating the cruel hand that life had dealt them.
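Outside a playground UI, the same prompt can also be sent programmatically. Below is a minimal sketch using the `openai` Python package (an assumption on my part; the example above was run through Vercel’s AI Playground), with the API key read from the `OPENAI_API_KEY` environment variable:

```python
# Minimal sketch: sending a prompt to GPT-4 via the OpenAI API.
# Assumes the `openai` package is installed and OPENAI_API_KEY is set.

def build_messages(prompt: str) -> list[dict]:
    """Wrap a user prompt in the chat-message format the API expects."""
    return [
        {"role": "system", "content": "You are a creative fiction writer."},
        {"role": "user", "content": prompt},
    ]

def generate(prompt: str, model: str = "gpt-4") -> str:
    """Send the prompt and return the model's reply as plain text."""
    from openai import OpenAI  # deferred import; needs OPENAI_API_KEY

    client = OpenAI()
    response = client.chat.completions.create(
        model=model,
        messages=build_messages(prompt),
        temperature=0.8,  # leave some randomness for creative writing
    )
    return response.choices[0].message.content
```

Keeping prompt construction in `build_messages`, separate from the network call, makes the formatting easy to inspect without spending API credits.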

Fine-tune a pretrained LLM

Fine-tuning a pretrained LLM leverages what the base model has already learned, building on top of it with additional data. I chose GPT-2 as the base model. Fine-tuning is slower than prompting an LLM, but offers more control over inference. Fine-tuning GPT-2, with 123M parameters, took about 30 minutes on my M2 MacBook with MPS. Training data and code are available at: https://github.com/umangsh/dostoevskyGPT.

Prompt: What is the meaning of life?

Response: “What is the meaning of life?”, he asks. “The question we’re asking are not religious questions, but philosophical questions … in the philosophical sense of the word. Does God have a place and purpose in the world? Does he have a purpose and purpose in his life? Can he do this or that? What is the life that he will have on this earth? Is he going to be an excellent human being afterwards?”

He makes a sound statement. If God can do that, why not of a kind with human beings?

“It’s not only that there are some types of human beings, but there are some types of human beings who are living in the world and it’s possible that they would understand that in a very specific way, and they would be able to make the necessary decisions that might be required or require for them. That’s what we call a ’life choice’ in philosophy, which is not as much about how you do something, as it is about how you do something. It’s not about the future. It’s not about what you do, but what you think of yourself. The answer to that question is, ‘I’m not going to go that far into the unknown. Until then, I’m not going to tell you anything that I think could be of benefit to me. I’m going to tell you what I think about myself and why I think it’s wonderful. I’ll share with you some of that advice I gave you, but that’s it. I don’t tell you what you’re going to do.”
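The fine-tuning step above can be sketched with Hugging Face `transformers` and `datasets`. This is illustrative only, not the exact pipeline from the linked repo, and the training file name `dostoevsky.txt` is a placeholder:

```python
# Sketch of fine-tuning GPT-2 on a text corpus as a causal language model.
# Illustrative; the actual code in the linked repo may differ.

def chunk_ids(ids, block_size):
    """Group a flat list of token ids into fixed-size training blocks,
    dropping the ragged tail (a common causal-LM preprocessing step)."""
    n = (len(ids) // block_size) * block_size
    return [ids[i:i + block_size] for i in range(0, n, block_size)]

def finetune(train_file="dostoevsky.txt", epochs=3, block_size=512):
    # Deferred imports: requires `transformers` and `datasets`.
    import itertools
    from datasets import Dataset, load_dataset
    from transformers import (DataCollatorForLanguageModeling,
                              GPT2LMHeadModel, GPT2TokenizerFast,
                              Trainer, TrainingArguments)

    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")

    # Tokenize the raw text, then pack all ids into fixed-size blocks.
    raw = load_dataset("text", data_files={"train": train_file})
    tokenized = raw["train"].map(lambda b: tokenizer(b["text"]),
                                 batched=True, remove_columns=["text"])
    all_ids = list(itertools.chain.from_iterable(tokenized["input_ids"]))
    train_ds = Dataset.from_dict({"input_ids": chunk_ids(all_ids, block_size)})

    args = TrainingArguments(
        output_dir="gpt2-dostoevsky",
        num_train_epochs=epochs,
        per_device_train_batch_size=4,
        # Recent PyTorch picks up the MPS backend on Apple Silicon
        # automatically when it is available.
    )
    trainer = Trainer(
        model=model,
        args=args,
        train_dataset=train_ds,
        # mlm=False -> causal LM: labels are the input ids shifted by one.
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()
    trainer.save_model("gpt2-dostoevsky")
```

Packing the corpus into contiguous fixed-size blocks, rather than padding individual lines, is what lets a small model make the most of a modest corpus.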

Train an LLM from scratch

This option offers full control over training datasets and inference pipelines. The character-based nanoGPT is a 10M-parameter model, and it takes about 30 minutes to train on an M2 MacBook with MPS. Training data and code are available at: https://github.com/umangsh/dostoevskyGPT.
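A character-based model like nanoGPT starts from the simplest possible tokenizer: every distinct character in the corpus gets its own integer id. The data-preparation step can be sketched as below (the short string and the `train.bin` file name are placeholders; the repo’s layout may differ):

```python
# Sketch: character-level data preparation in the style of nanoGPT.
import numpy as np

def build_vocab(text: str):
    """Map each unique character to an integer id and back."""
    chars = sorted(set(text))
    stoi = {ch: i for i, ch in enumerate(chars)}
    itos = {i: ch for ch, i in stoi.items()}
    return stoi, itos

def encode(text: str, stoi: dict) -> list[int]:
    return [stoi[c] for c in text]

def decode(ids, itos: dict) -> str:
    return "".join(itos[i] for i in ids)

text = "crime and punishment"  # stand-in for the full corpus
stoi, itos = build_vocab(text)
ids = encode(text, stoi)
assert decode(ids, itos) == text  # encoding must round-trip losslessly

# nanoGPT stores the encoded corpus as a flat binary file of uint16 ids,
# which the training loop then samples fixed-length windows from.
np.array(ids, dtype=np.uint16).tofile("train.bin")
```

With only a few dozen distinct characters in the vocabulary, the embedding table stays tiny, which is a big part of why a 10M-parameter model is trainable on a laptop.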