How To: Fine-tune or Train an AI model for your WordPress ChatBot

Chatbots used to be clunky or extremely complicated to build. But with ChatGPT and the advanced GPT-3 (and soon GPT-4) family of models, they have reached a new level of sophistication. Creating a custom chatbot may not be easy, but it is definitely achievable. I am here to make the process as straightforward as possible for you. Join me on this exciting journey! 🎵

Important update: This article is about fine-tuning models based on GPT-3. However, since August 2023, the default fine-tuning features are all based on GPT-3.5+. The process is now much simpler: the Casual Fine-Tuning feature and the delimiters are no longer needed. This article will be updated at a later point; everything else in it is still valid.

The basics of AI

It all starts with AI models. A model is a type of algorithm that can be simple or complex, and can run on a variety of devices, from a mobile phone to a network of servers. There are many different types of models, but their overall goal is typically the same: to take an input and produce an output. Models are trained using data to generate more accurate outputs. Each time a model is trained, it becomes a new version of the model.

Now, let’s get to the topic on everyone’s mind: ChatGPT, developed by OpenAI. It is both a chat system and a model. The ChatGPT model grew out of the GPT-3 family and is closely related to another model in that family, text-davinci-003. The good news is that you can use the davinci models directly, or train one of the base models to create a new model.

In theory, you can create your own ChatGPT model using your own data and preferences. I’ve even had fun experimenting with creating my own version, called MeowGPT! 😋 I’m not sure if I’ll make it publicly available (it’s silly!), but the point is that you can create your own chatbot.

AI hallucinates content. I am not kidding, it’s the term! And this image represents it well! Honestly, what’s going on here?

Do you really need to train a model?

This is an important question to consider. In many cases, you can avoid the tedious task of fine-tuning a model by carefully designing your prompts (known as prompt engineering). This means writing a comprehensive, well-crafted prompt (in AI Engine, this is the context you set in the shortcodes).

For example, I have provided a practical demo. In this scenario, the chatbot’s context includes the content of the current article, and I am asking it to have a conversation about it. As you can see, the chatbot is able to understand and discuss the content, allowing you to specify the desired tone and behavior for the model.
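To make the idea concrete, here is a minimal Python sketch of how such a context could be assembled. The function name build_context_prompt and the wording of the instructions are hypothetical; this is not how AI Engine builds its context internally, only an illustration of putting the tone, the article, and the question into a single prompt.

```python
def build_context_prompt(article_text: str, question: str, tone: str = "friendly") -> str:
    """Combine instructions, the article content, and the user's question into one prompt."""
    # Hypothetical instructions: this is where the tone and behavior are specified.
    instructions = (
        f"You are a helpful assistant. Use a {tone} tone. "
        "Answer only from the article below."
    )
    return (
        f"{instructions}\n\n"
        f"Article:\n{article_text.strip()}\n\n"
        f"User: {question.strip()}\n"
        "AI:"
    )


prompt = build_context_prompt("AI Engine is a WordPress plugin.", "What is AI Engine?")
print(prompt)
```

Changing the tone argument (or the instructions themselves) is often all it takes to change how the chatbot behaves, with no training involved.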

Another issue to keep in mind is that fine-tuned models can become more constrained in their abilities: they are limited to the format, the tone, and to some extent the content of the data they were trained on.

That said, if you are willing to invest the time and effort in creating a high-quality dataset, fine-tuned models can be awesome. Let’s explore this further!

Let’s fine-tune a model

To fine-tune a model, you will need the best WordPress plugin for AI (which is, of course, AI Engine), an account at OpenAI, and some time to complete the process.

The process is simple and involves the following steps:

  1. Gather all the necessary data
  2. Format the data correctly
  3. Upload the data to OpenAI in the JSONL format
  4. Train an existing OpenAI model with your data
  5. Obtain your own fine-tuned model

The AI Engine plugin makes the process much easier, but it’s important to remember that just because it’s easy, it doesn’t mean it should be rushed.

1. Datasets: All you need is data! ✌️

This step is crucial to the fine-tuning process. In the settings of AI Engine, you can access and manage your datasets:

Of course, you might not have any data at the moment. In this case, you can switch to “Dataset Builder” mode in the AI Engine settings by moving the “Model Finetune” toggle to the “Dataset Builder” position. This is where you will spend time creating your dataset. It will look something like this:

As you can see from the image above, the data is essentially a spreadsheet with two columns: prompt and completion. In the case of a chatbot, we can simplify the concept and think of it as a question and an answer. The type of data, and how it is written and formatted, depends heavily on the goal and the type of application. In our case, the focus is a chatbot, and I’ll try to keep it simple and easy to understand.

To gather your data, start by collecting all of your pages, content, and any ideas you have in mind. Try to create a file, or several files, without any HTML formatting or other unnecessary elements. If you have access to the free version of ChatGPT, use it to generate a large number of questions and answers based on your content. Gather the data in a Google Sheet with the two columns, and make sure to review and perfect it. A dataset should have a minimum of 500 rows to offer useful results, and many more if you want to achieve better results. According to the OpenAI documentation, 3,000 to 5,000 rows are recommended. But it ultimately depends on what you’re trying to achieve.
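If you want a quick sanity check before importing, a small script can help with the review. This is only a sketch: the function name review_dataset and the checks it performs are my own illustration, not an AI Engine feature. The 500-row threshold is the minimum mentioned above.

```python
from collections import Counter

MIN_ROWS = 500  # minimum suggested in this article


def review_dataset(rows):
    """Return a small report of basic problems in a list of (prompt, completion) rows."""
    prompts = [prompt.strip() for prompt, _ in rows]
    counts = Counter(prompts)
    return {
        "rows": len(rows),
        "enough_rows": len(rows) >= MIN_ROWS,
        # Rows where the question or the answer is missing.
        "empty": sum(1 for p, c in rows if not p.strip() or not c.strip()),
        # Questions that appear more than once (possibly with conflicting answers).
        "duplicate_prompts": sorted(p for p, n in counts.items() if p and n > 1),
    }


sample_rows = [("Q1", "A1"), ("Q1", "A2"), ("Q2", "")]
print(review_dataset(sample_rows))
```

A report like this will not tell you whether the answers are good, but it catches the boring mistakes before you pay for training.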

Once you have your dataset, you can import it into AI Engine using the “Import File” button. You can export a CSV file from Google Sheets and use it here, but it also supports JSON and JSONL formats if you prefer. Alternatively, you can type the data manually.
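For the curious, here is roughly what the CSV and JSONL versions of the same dataset look like, as a small Python sketch. It assumes the CSV has a header row with the column names prompt and completion; AI Engine does the import for you, so this is purely illustrative.

```python
import csv
import io
import json


def csv_to_jsonl(csv_text: str) -> str:
    """Convert CSV text with 'prompt' and 'completion' columns into JSONL text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = []
    for row in reader:
        prompt = (row.get("prompt") or "").strip()
        completion = (row.get("completion") or "").strip()
        if not prompt or not completion:
            continue  # skip incomplete rows
        # JSONL means one JSON object per line.
        lines.append(json.dumps({"prompt": prompt, "completion": completion}, ensure_ascii=False))
    return "\n".join(lines)


sample_csv = 'prompt,completion\n"What is AI Engine?","A WordPress plugin for AI."\n'
print(csv_to_jsonl(sample_csv))
```

Each line of the result is an independent JSON object, which is why the format is convenient for large datasets.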

Wait, I know that you are a bit lazy (and I am too). So here is a handy tool: the Dataset Generator! Here is how it looks.

The Single Generate button will generate data based on one post. You can choose which one by its ID; otherwise it runs on the very first post. Think of it as a test, or as a one-by-one action. The Run Bulk Generate button will generate data based on all your posts of a certain post type (Posts, Pages, etc.). This process takes time and has a cost (as it uses the OpenAI API), so be careful with it.

You can modify the prompt to better suit your case. You can also use the {URL} and {TITLE} placeholders, like this:

Generate 30 questions and answers from this text. Question use a neutral tone. Answers use the same tone as the text. If necessary, the answer can end with "More information at {URL}.".
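Conceptually, the placeholders are simply replaced with values from each post before the prompt is sent. Here is a minimal Python sketch of that substitution; the function name fill_placeholders and the example URL are hypothetical, and AI Engine’s actual implementation may differ.

```python
DEFAULT_PROMPT = (
    "Generate 30 questions and answers from this text. "
    "Question use a neutral tone. Answers use the same tone as the text. "
    'If necessary, the answer can end with "More information at {URL}.".'
)


def fill_placeholders(template: str, title: str, url: str) -> str:
    """Replace the {TITLE} and {URL} placeholders with values from one post."""
    # str.replace is used instead of str.format so other braces in the prompt are left alone.
    return template.replace("{TITLE}", title).replace("{URL}", url)


filled = fill_placeholders(DEFAULT_PROMPT, "My Post", "https://example.com/my-post")
print(filled)
```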

You now have a lot of data, or so I hope. However, you may see red crosses next to all your content, so let me explain why.

Using raw text is not enough; OpenAI requires you to specify separators/delimiters (check Preparing your dataset on OpenAI) for both the prompt and the completion. This is to ensure that the model understands where the input and output begin and end. To make this process even easier, I have set default separators in the Dataset Builder (although these can be overridden for those who are more experienced). You can try to enter the separators manually, but I recommend using the Format with Defaults button, which will automatically add the separators if they aren’t already present. I refer to this process as Casual Fine-Tuning. Keep this in mind… for later! 👍
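To give you an idea of what this formatting does, here is a minimal Python sketch. The separator values below are illustrative assumptions in the style suggested by the OpenAI documentation; they are not necessarily the defaults used by AI Engine, and the function name format_with_defaults is simply borrowed from the button label.

```python
PROMPT_END = "\n\n###\n\n"  # assumed separator marking the end of the prompt
COMPLETION_END = "\n\n"     # assumed stop sequence marking the end of the completion


def format_with_defaults(prompt: str, completion: str) -> dict:
    """Add the separators only when they are missing, so running it twice is harmless."""
    if not prompt.endswith(PROMPT_END):
        prompt = prompt.rstrip() + PROMPT_END
    # Completions conventionally start with a single space.
    if not completion.startswith(" "):
        completion = " " + completion.lstrip()
    if not completion.endswith(COMPLETION_END):
        completion = completion.rstrip() + COMPLETION_END
    return {"prompt": prompt, "completion": completion}


row = format_with_defaults("What is AI Engine?", "A WordPress plugin for AI.")
print(repr(row["prompt"]))
print(repr(row["completion"]))
```

Whatever separators you end up with, the key point is consistency: the same ones must be used for every row, and again later when the model is used.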

You can exit this page and return to it later as it will be saved in your browser’s local storage. Additionally, you can Export as CSV for backup and later import.

Once everything is looking good with green checkmarks, it’s ready to be uploaded to OpenAI.

A filename is automatically generated for you, but you can and should change it to something more meaningful and recognizable. This filename will help you identify this dataset once it’s on the OpenAI servers. Are you ready? Click on Upload to OpenAI. Once the upload is complete, AI Engine will take you to your list of datasets, where you will see your newly uploaded file.

2. Train a model with your dataset

To train a model with your dataset, click on the Train Model button next to your dataset. This will open a modal that represents the final step of the process.

You’ll need to choose a base model and a suffix. The suffix is a simple reminder of what your model is for and it will be included in the middle of the entire model name generated by OpenAI. AI Engine will try to suggest a suffix based on the name of your dataset (preview).

For the base model, I recommend using either curie or davinci. As of April 2023, we can’t use Turbo or GPT-4 models. OpenAI generally recommends curie for most cases, as it works well with large, high-quality datasets and is more cost-effective than davinci. davinci is a more powerful model, but it is also more expensive. Keep in mind that you will need to pay for the fine-tuning and later for the usage of the new model.

Are you ready? Click on Start. AI Engine will take you to the list of your models, and you will see that a new model is being built for you. This process will take some time. For a dataset of 500 rows, it typically takes around one hour, but it can take up to a day, or more if you have a huge number of rows.

Click on Refresh from time to time. Wait for the status of your model to be marked as SUCCEEDED.
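If you were scripting this instead of clicking Refresh, the logic would be a simple polling loop. The Python sketch below is generic: fetch_status is a hypothetical callable that you would have to implement against the OpenAI API yourself, and the FAILED and CANCELLED status names are assumptions (only SUCCEEDED comes from the screen described here).

```python
import time
from typing import Callable

FINAL_STATUSES = {"SUCCEEDED", "FAILED", "CANCELLED"}  # FAILED/CANCELLED are assumed names


def wait_for_model(fetch_status: Callable[[], str], interval_s: float = 60.0, max_checks: int = 1440) -> str:
    """Call fetch_status() until it reports a final status, pausing between checks."""
    for check in range(max_checks):
        status = fetch_status().upper()
        if status in FINAL_STATUSES:
            return status
        if check < max_checks - 1:
            time.sleep(interval_s)
    raise TimeoutError(f"Model still not ready after {max_checks} checks")


# Demo with a fake status source instead of a real API call.
fake_statuses = iter(["pending", "running", "succeeded"])
result = wait_for_model(lambda: next(fake_statuses), interval_s=0)
print(result)
```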

Then, you can train another model with the same dataset, or you can download the dataset, re-import it into the Dataset Builder, make changes, and train again!

3. Your fine-tuned chatbot!

Using your fine-tuned chatbot is now very easy. In the Settings of your Chatbot, you will need to select your new fine-tuned model. If it doesn’t appear, make sure that it has been successfully trained, click on the Refresh button in the Models list again, and refresh the page, just in case. Last, but not least…

You must check the Casually Fine-Tuned checkbox. This will ensure that the separators used during the training of your model are also used in the chatbot and the way completions are generated. This is an important step in order to have the chatbot understand the inputs and outputs. Please note that this checkbox is grayed-out if the selected model is not a fine-tuned model.
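Behind the scenes, this roughly means that the user’s message gets the same prompt separator appended, and the completion separator is used as a stop sequence. Here is a minimal Python sketch of such a request payload. The separator values and the model name are placeholders I made up for illustration; model, prompt, stop, and max_tokens are parameters of OpenAI’s completions endpoint, but the exact request AI Engine sends may differ.

```python
PROMPT_END = "\n\n###\n\n"  # assumed: must match the separator used in the training data
COMPLETION_END = "\n\n"     # assumed: must match the end of the training completions


def build_completion_request(model: str, user_message: str, max_tokens: int = 256) -> dict:
    """Build a completion payload that reuses the separators from training."""
    return {
        "model": model,
        # The separator tells the model that the input is finished.
        "prompt": user_message.strip() + PROMPT_END,
        # The stop sequence tells the API where the answer ends.
        "stop": [COMPLETION_END],
        "max_tokens": max_tokens,
    }


request = build_completion_request("my-fine-tuned-model", "What is AI Engine?")
print(request)
```

Without the matching separators, the model receives input in a shape it never saw during training, which is why the checkbox matters.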

You don’t need to provide context as your model already knows everything. It’s ready to use.

In conclusion…

Fine-tuning a model and creating your own chatbot has never been easier thanks to AI Engine and OpenAI. By following the simple steps outlined in this article, you can create your own fine-tuned chatbot with your own unique data and flavor. The process of gathering data, formatting it, and training a model may take some time, but the result is worth it! With AI Engine’s user-friendly interface and OpenAI’s powerful models, the possibilities are endless. Let’s get started on creating your own chatbot and have some fun with it! 🦄