I have a research background in ML but have never actually developed any models, as it was all theoretical work. I got lucky during the interview stage for this role because my research impressed them. My project involves fine-tuning a GPT-3 model for a specific task and hosting the model on a website. Does anyone have any tips on how to go about learning what I need to know to do this? Also, what should I consider when curating my custom dataset for fine-tuning the model? I really want this to be a learning experience for me.

  • OrganicCriticism6232@alien.topB
    10 months ago

    Maybe practice on smaller open-source models to get a feel for it, then work your way up. Tuning is more of an art than a science.

  • glitch83@alien.topB
    10 months ago

    Sounds a little like you’re overqualified. Fine-tuning is something you could probably pick up from a Udacity course if you’d like.

    • Lambda_Lifter@alien.topB
      10 months ago

      OP - “I have no clue wtf I’m doing”

      ML subreddit - “sounds like you’re overqualified”

    • Celsuss@alien.topB
      10 months ago

      How can someone be overqualified when they don’t know how to do their work tasks?

    • killver@alien.topB
      10 months ago

      He obviously is not overqualified, otherwise he would know how to do it. Sure, calling some random code snippets to fine-tune is easy, but doing this properly takes quite some experience.

    • __ingeniare__@alien.topB
      10 months ago

      The first result of the Google query is literally OpenAI explaining how to fine-tune GPT-3, which also requires minimal ML and coding knowledge since they have already done all the heavy lifting. Curating the dataset is the hardest part, but even that is just basic data science.

  • JDubbleu@programming.dev
    10 months ago

    I don’t have any resources as I’m a SWE, but I do have some advice.

    Ask for help from your mentor/other engineers. Seriously, I’m a software engineer (non-ML, but ML teams operate similarly to SWE teams) and we don’t expect interns to know almost anything, and we understand they’re gonna need quite a bit of hand-holding. I know I did. It’s okay! That’s how we all learn, and being able to ask for help when you need it is one of the most vital skills to have in software. The absolute worst thing you could do is struggle the whole internship without getting the help you need.

    All you gotta do is say, “Hey, I’m struggling with the fine-tuning of this model for my project. My research and academic experience have all been extremely theoretical, and I never got the chance to do much practical tuning. Do you have some suggestions given where I’m at?” Obviously provide a lot of extra context for where you’re at/what you’re struggling with, but you get the point. They’re not gonna fire you, so don’t worry about that (literally every intern’s worst fear), and they want you to learn! Asking would reflect well on you too, since you’re showing 1) you know your shortcomings and 2) you are actively working to overcome them. If you can do both of those things you’re already ahead of most people.

    Good luck!

  • Infamous-Bank-7739@alien.topB
    10 months ago

    Start with BERT, as it’s easy to pick up. Follow some guide. I agree that you are actually overqualified for this, unless you are afraid of coding.

    • space-tardigrade-1@alien.topB
      10 months ago

      Fine-tuning via the OpenAI API is actually easier, because you only need to work on preparing a clean dataset and sending it to OpenAI, and you don’t have to worry about any other part of the pipeline.
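
For anyone wondering what "preparing a clean dataset" can look like in practice, here is a minimal sketch using only the Python standard library. It assumes the prompt/completion JSONL layout that OpenAI's legacy GPT-3 fine-tuning guide described (newer chat models expect a different "messages" layout, so check the current docs). The function names and the sample sentiment data are made up for illustration.

```python
import json


def clean_examples(pairs):
    """Strip whitespace and drop empty or duplicate (prompt, completion) pairs."""
    seen = set()
    cleaned = []
    for prompt, completion in pairs:
        prompt, completion = prompt.strip(), completion.strip()
        if not prompt or not completion:
            continue  # skip incomplete examples
        key = (prompt, completion)
        if key in seen:
            continue  # skip exact duplicates after normalization
        seen.add(key)
        # Field names assume the legacy prompt/completion format.
        cleaned.append({"prompt": prompt, "completion": completion})
    return cleaned


def to_jsonl(records):
    """Serialize records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


# Hypothetical raw data with stray whitespace, a duplicate, and an empty prompt.
raw_pairs = [
    ("  Classify: great product ", " positive"),
    ("Classify: great product", "positive"),
    ("Classify: arrived broken", "negative"),
    ("", "positive"),
]

records = clean_examples(raw_pairs)
jsonl_text = to_jsonl(records)
print(jsonl_text)
```

Real curation usually goes further (label balance, held-out validation split, checking for leakage), but deduplication and dropping empty rows are a reasonable first pass.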

  • blahblahwhateveryeet@alien.topB
    10 months ago

    Yeah, basically what I think has probably happened is that these guys are like everybody else who thinks that fine-tuning is somehow going to make it possible to open Pandora’s black box on GPT, and more than likely the folks that tasked you with this have absolutely no clue about how fine-tuning works…

  • ZachVorhies@alien.topB
    10 months ago

    Yeah, none of this requires an ML degree. You are doing data plumbing, which means looking at the API, making sure your data goes from your computer to theirs, and then using the stored server state to perform computation.

    I recommend using Python and the OpenAI Python bindings to handle the plumbing. I wrote a simple script in about a day. Depending on your level of skill with Python, it could take a bit longer than that.

  • Competitive_Drive_14@alien.topB
    10 months ago

    You should check out the Low-Rank Adaptation (LoRA) repo and try running the example for fine-tuning GPT-2 small. Once you’ve got that running, you could use the library for fine-tuning other, larger open-source models in the cloud (check out SkyPilot for this).
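
To give a feel for why LoRA is cheap to train, here is a small back-of-the-envelope sketch. LoRA freezes a weight matrix of shape d x k and trains two small factors of shapes d x r and r x k instead. The numbers below assume a single 768 x 768 projection (768 is the hidden size of GPT-2 small) and a rank of 4, which is an illustrative choice and not a recommendation from the LoRA repo.

```python
def full_params(d, k):
    """Trainable parameters when fine-tuning a dense d x k weight matrix directly."""
    return d * k


def lora_params(d, k, r):
    """Trainable parameters for a rank-r update: B is d x r and A is r x k."""
    return r * (d + k)


d = k = 768  # hidden size of GPT-2 small
r = 4        # assumed rank, chosen only for illustration

full = full_params(d, k)
lora = lora_params(d, k, r)
print(full, lora, full // lora)  # 589824 6144 96
```

So for this one matrix, the low-rank update trains roughly 1/96 of the parameters that full fine-tuning would touch.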

  • VitalYin@alien.topB
    10 months ago

    What if you save the company money on fine-tuning by figuring out whether prompting would work better for this task?
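
A quick way to test the prompting route before paying for fine-tuning is a few-shot prompt: put a handful of labeled examples in front of the new input and let the base model continue the pattern. Below is a minimal sketch of building such a prompt. The "Input:/Output:" layout, the function name, and the sample data are all assumptions for illustration, and the resulting string would still need to be sent to whichever model you are evaluating.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble an instruction, labeled examples, and a final unlabeled query."""
    lines = [instruction, ""]
    for text, label in examples:
        lines.append(f"Input: {text}")
        lines.append(f"Output: {label}")
        lines.append("")  # blank line between examples
    lines.append(f"Input: {query}")
    lines.append("Output:")  # left open for the model to complete
    return "\n".join(lines)


# Hypothetical sentiment task with two demonstrations.
prompt = build_few_shot_prompt(
    "Classify the sentiment of each input as positive or negative.",
    [("great product", "positive"), ("arrived broken", "negative")],
    "works as described",
)
print(prompt)
```

If a prompt like this already hits the accuracy the task needs on a small held-out set, fine-tuning may not be worth the cost.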

  • Current_Ferret_4981@alien.topB
    10 months ago

    I can only hope this wasn’t a highly competitive internship that more qualified students (in the sense of having done similar work) were passed over for. The majority of ML students, regardless of speciality, should have applied experience, even if it’s on a smaller scale, imo.

  • Weird_Ad_3153@alien.topB
    10 months ago

    Towards Data Science, Kaggle and Stack… are your go-to sites. Most of them tell you how to integrate and use APIs. And of course, you can just Google it!