I have a database behind what is essentially a survey tool: admins define a survey and distribute it to a number of users, who then answer the survey questions.

The surveys are all independent of each other and fairly random…there’s no real theme to them beyond the industry in which the survey tool is used.

There are a fairly large number of questions/answers in the DB (in the millions). What would be an interesting ML exercise to run on the data for a complete ML novice (but competent coder)?

  • bl4h101bl4h@alien.top (OP)
    10 months ago

    Hi…I wouldn’t call it garbage, but it is only relevant to the context for which it was created.

    And as we are data processors, it needs no cleaning to speak of. The answers provided are appropriate to the needs of the survey creators.