This post (Part I) walks you through a Git repo I put together to allow anyone to fine-tune a model to write emails in their own ‘voice’.
I did this to use a small model that I could run at home to help me draft emails that didn’t suck.
You don’t need a hefty GPU, but you will need a credit card to rent one on vast.ai for the training (note: affiliate link – I will benefit if you sign up and spend). It costs me about $1 per run to create a useful model that I can use on my home (non-GPU) server.
Part II will dive deeper into how the script works and what’s going on.
What It Does
At the very highest level, the script fine-tunes a base large language model (LLM) on sample emails you give it, producing a new model. Here’s some sample output (truncated to 40 tokens):
INPUT:
Write an email to a friend arranging a drink:
OUTPUT:
Hey, how are you? I was thinking of going out for a drink this evening. Are you free? Let me know what time and where we could meet.
Ian
On 10 Jul
You can see how it’s put my name (Ian) at the bottom and a (spurious) date, because that’s what my typical email looks like.
Compare this to the ‘base’ model this was built from (Qwen/Qwen2.5-3B-Instruct), which gives a far more ‘generic’ and formal response (also truncated to 40 tokens):
INPUT:
Write an email to a friend arranging a drink:
OUTPUT:
The date is 10/24/10 at 6:00pm.
Subject: Exciting Plans for Our Drink Date!
Dear [Friend's Name],
I hope
That, by contrast, does not sound like me at all.
…in a bit more detail
Here’s what the script (run_fine_tune.sh) does:
- Picks the cheapest available GPU from vast.ai that fits the criteria needed, and provisions it
- Copies up a script to run on the machine (build_and_push_model.sh), the gmail samples you want to train it on (compressed), and a huggingface token (to store the resulting models)
- Runs build_and_push_model.sh on the GPU, which:
  - Installs the needed software (LLaMA-Factory, llama.cpp)
  - Logs into huggingface
  - Creates the huggingface repos (if needed)
  - Trains the new model
  - Merges the new model
  - Converts the new model to .gguf format (allowing it to be run in llama.cpp)
  - Quantizes the new model (so it can be run more easily on a non-GPU machine)
  - Pushes the merged model, and the .gguf, to huggingface repos
- Destroys the vast.ai instance (asking you first)
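The offer-selection step above can be sketched in a few lines. This is a hedged illustration, not the repo’s actual code: the real work happens in bash via the vast.ai CLI, and the field names (gpu_ram, dph_total) and the minimum-VRAM threshold here are assumptions about the shape of the offer JSON.

```python
import json
import subprocess

def pick_cheapest_offer(offers, min_gpu_ram_gb=24):
    """Pick the cheapest GPU offer meeting a minimum VRAM requirement.

    `offers` is assumed to be a list of dicts like those returned by
    `vastai search offers --raw`; the field names are illustrative.
    """
    candidates = [o for o in offers if o["gpu_ram"] >= min_gpu_ram_gb]
    if not candidates:
        raise RuntimeError("no suitable GPU offers found")
    # dph_total = dollars per hour for the whole instance
    return min(candidates, key=lambda o: o["dph_total"])

# Hypothetical usage -- the real script drives the vast.ai CLI from bash:
#   raw = subprocess.check_output(["vastai", "search", "offers", "--raw"])
#   offer = pick_cheapest_offer(json.loads(raw))
#   then `vastai create instance <offer id> ...` to provision it
```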
How To Run It
The latest instructions will be maintained here:
- Clone the repo, and navigate to root folder
- Set up the virtual environment, install required pips
- Use Google Takeout to download your sent email to a .mbox file. You probably only need your ‘Sent’ emails, as they contain your responses. If you don’t use gmail, the script still expects a .mbox file. Then run the python script to extract the emails and put them in the correct format, and compress the result with xz
- Get a huggingface write token
- Set up a vast.ai API key
- Run the script
- Here’s an example invocation, with placeholders in caps:
./run_fine_tune.sh --max-steps 1 --hf-repo qwen2.5-3b-YOUR_NAME_sft --hf-user YOUR_HUGGING_FACE_ACCOUNT
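If you’re curious what the email-extraction step looks like, here’s a minimal sketch using Python’s standard-library mailbox module. The function name is mine, and the repo’s real script does more (cleaning quoted replies, writing out the LLaMA-Factory training format), so treat this as an illustration of the idea rather than the actual code:

```python
import mailbox

def extract_sent_bodies(mbox_path):
    """Yield the plain-text body of each message in a .mbox file.

    Simplified sketch: multipart messages yield their first text/plain
    part; messages with no decodable body are skipped.
    """
    for msg in mailbox.mbox(mbox_path):
        if msg.is_multipart():
            for part in msg.walk():
                if part.get_content_type() == "text/plain":
                    yield part.get_payload(decode=True).decode("utf-8", "replace")
                    break
        else:
            payload = msg.get_payload(decode=True)
            if payload:
                yield payload.decode("utf-8", "replace")
```

Each extracted body then becomes one training sample for the fine-tune.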
If you like this, you might like one of my books:
Learn Bash the Hard Way
Learn Git the Hard Way
Learn Terraform the Hard Way
If you feel this was useful, consider buying me a coffee to say thanks!

