SIOP 2025 Machine Learning FAQ

Version: Feb 17, 2025

Please email us your questions and we will update the FAQ.
Ivan Hernandez: ivanhernandez@vt.edu

The competition involves multiple challenges that reflect the types of assessments commonly used during selection. You will complete job applications for a range of positions at different organizations. You are given a list of organizations and the positions they are looking to fill, along with some additional information about each organization and position.

You will then use any combination of computer-mediated approaches (e.g., Large Language Models, ChatGPT, programming libraries) to answer those questions in a way that optimizes the score a job candidate for that position would receive. As in real life, the scoring system is not fully transparent. The overall score for an applicant is calculated as a weighted combination of the applicant’s evaluated Personality, Cognitive Ability, and Skills and Ability, with a penalty applied if the organization believes the applicant is “faking,” based on a combination of signals from the responses.
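To make the shape of that scoring concrete, here is a minimal sketch of a weighted combination with a faking penalty. The actual weights, score scales, and faking signals are not disclosed, so every number and name below (WEIGHTS, FAKING_PENALTY, the 0-1 scales) is a made-up assumption for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Evaluation:
    # Assumption: each component is on a 0-1 scale. The real scales are unknown.
    personality: float
    cognitive_ability: float
    skills_and_ability: float
    faking_signal: float  # assumed: 0 = no suspicion, 1 = strong suspicion


# Hypothetical weights and penalty size; the competition does not publish these.
WEIGHTS = {"personality": 0.3, "cognitive_ability": 0.4, "skills_and_ability": 0.3}
FAKING_PENALTY = 0.5


def overall_score(e: Evaluation) -> float:
    """Weighted sum of the three components minus a penalty for suspected faking."""
    weighted = (
        WEIGHTS["personality"] * e.personality
        + WEIGHTS["cognitive_ability"] * e.cognitive_ability
        + WEIGHTS["skills_and_ability"] * e.skills_and_ability
    )
    return weighted - FAKING_PENALTY * e.faking_signal


# A perfect-looking personality profile that triggers suspicion can end up
# below a more modest profile that does not.
plausible = Evaluation(0.7, 0.8, 0.6, faking_signal=0.0)
too_good = Evaluation(1.0, 0.8, 0.6, faking_signal=0.4)
print(overall_score(plausible) > overall_score(too_good))
```

Under these assumed numbers, maximizing one component is not enough: answers that look implausibly ideal can cost more through the penalty than they gain.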

You can use any model and any computational approach available. As long as you can reproduce it at scale on 200 jobs for the test period, and over a Zoom call that lasts one hour, you can use it. The only approach that is not allowed is having a human supply the answers to the questions.
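A rough time budget can help when judging whether an approach is fast enough. The sketch below assumes the strictest reading of the requirement (all 200 test jobs processed within a single one-hour call) and assumes the test jobs have about 260 questions each, as in the development data described later in this FAQ. Neither assumption is confirmed.

```python
CALL_SECONDS = 60 * 60    # one-hour Zoom call
TEST_JOBS = 200           # number of jobs in the test period
QUESTIONS_PER_JOB = 260   # assumption: same per-job count as the development data


def seconds_per_job(total_seconds: float, n_jobs: int) -> float:
    """Average time available for each job."""
    return total_seconds / n_jobs


def seconds_per_question(total_seconds: float, n_jobs: int, per_job: int) -> float:
    """Average time available for each question."""
    return total_seconds / (n_jobs * per_job)


print(seconds_per_job(CALL_SECONDS, TEST_JOBS))  # 18.0
print(f"{seconds_per_question(CALL_SECONDS, TEST_JOBS, QUESTIONS_PER_JOB):.3f}")  # 0.069
```

On this reading, a pipeline that makes one slow sequential LLM call per question would not fit, which is a reason to consider batching several questions per request or running requests in parallel.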

Use of LLMs: You may use any approach, including LLMs (proprietary or open source), but if you choose to use an LLM for a set of questions, you must not apply any ad-hoc human correction to the response. The approach must be reproducible by anyone trying the method.

No previous experience is needed! This competition is a learning opportunity in a rapidly evolving field. Your solution does not need to be all "code": you could copy and paste answers directly from an LLM for some sections and mix and match approaches. The submission dataset is in Excel (xlsx) format, which should make it approachable for people used to working in non-programming environments.

The competition will begin on February 17th, when you will first have access to the development dataset.

The last day for submissions will be March 16th. Final submissions will be scored against the test dataset.

You can sign up any time before the final day of the competition. The sooner you sign up, the more time you’ll have to experiment with solutions.

Based on previous competitions, the top 4 teams will be invited to present their methods at a special session during SIOP.

You can share the competition data with any member of your team. By participating, you agree not to share data outside your team. The data will be public after the competition concludes.

Yes, you can add members by emailing the organizer at ivanhernandez@vt.edu.

Yes, you can change your team name by emailing the organizer at ivanhernandez@vt.edu. Clever puns are welcome!

No. The competition is meant to reflect the real-world challenges of building effective automated job application tools. These tools rarely get feedback on how they scored; they only learn whether they received the job (or were a finalist). Therefore, to find out whether selection systems that do not provide ground-truth data are still vulnerable, we will not provide any ground-truth answer data either.

There are 13,000 questions across roughly 50 jobs, which is about 260 questions per job. Having at least 50 jobs lets competitors see whether their approaches are improving systematically rather than simply because of noise. Because many personality inventories contain 300 questions or more, this number of questions may realistically reflect the number of items a candidate might complete. Additionally, consider scalability as part of the challenge. The volume of questions has real-world implications: if automated tools can handle thousands of items, they pose a greater threat to selection than if they only work on an ad-hoc, individual basis.
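One way to handle this volume is to group the questions by job and answer each job's questions in a loop, so the same code works for 50 jobs or 200. The sketch below uses synthetic rows in place of the real spreadsheet. The (job_id, question) layout and the placeholder_answer function are hypothetical stand-ins, not the competition's actual column names or a recommended answering method.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple


def group_by_job(rows: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Collect question texts under the job they belong to."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for job_id, question in rows:
        grouped[job_id].append(question)
    return dict(grouped)


def answer_all(
    grouped: Dict[str, List[str]],
    answer_fn: Callable[[str, str], str],
) -> Dict[str, List[str]]:
    """Apply the same automated answer function to every question of every job."""
    return {job: [answer_fn(job, q) for q in qs] for job, qs in grouped.items()}


def placeholder_answer(job_id: str, question: str) -> str:
    # Stand-in for a real model call; no human edits the output.
    return f"[automated answer for {job_id}]"


# Synthetic data matching the stated sizes: 50 jobs x 260 questions = 13,000.
rows = [(f"job_{j}", f"question {q}") for j in range(50) for q in range(260)]
grouped = group_by_job(rows)
answers = answer_all(grouped, placeholder_answer)
print(len(grouped), sum(len(a) for a in answers.values()))  # 50 13000
```

Keeping the answer function separate from the grouping logic makes it easy to swap in different approaches per question type while the rest of the pipeline stays reproducible.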