Without a way to estimate how much "AI power" was used for the task, I don't see how you can fairly rate take-home assignments.

Like I said, either use a pre-baked env or give the candidate an auth token with something like a $100-200 quota from the provider your company already uses.