Why does nobody talk about the real cost of fine-tuning a big model?

I just spent two months trying to get a 7-billion-parameter model to understand my company's specific legal documents. We threw about $15,000 in cloud credits at it, tweaking it over and over. The results were barely better than what we got from a simple, well-made prompt for GPT-4. All that time and money for a 5% accuracy bump? It felt like we were just feeding a furnace. When does chasing the newest, biggest model stop making sense? Has anyone else hit this wall with custom AI projects?

2 comments

ryan_nguyen 3mo ago

Honestly, that just sounds like a really expensive way to learn that bigger isn't always better. Maybe the wall you hit is the one telling you to stop.

elizabeth_martin 3mo ago

Used to chase big models too, until a project like yours showed me the hard truth.