[JOB] M2 internship: Generative-AI Assisted Generation of Spack Package Recipes

Duration: 4-6 months
Where: Maison de la Simulation, CEA Saclay, France

Supervisors:

Benoît Martin, MdlS, bmartin (at) cea.fr
Thomas Bouvier, MdlS, thomas.bouvier (at) cea.fr
Thomas Padioleau, MdlS, thomas.padioleau (at) cea.fr

Context

Spack is a package manager for supercomputers. It is commonly used in high-performance computing (HPC) environments of all scales, from kinetic and gyrokinetic simulations on exascale architectures to AI inference stacks on smaller clusters and even local development environments.

Spack recipes are Python scripts defining how packages should be installed. They define properties and behaviors of the build, such as where to find and how to retrieve the software, its dependencies, and options for building from source. Recipes rely on declarative package directives whose arguments, named specs, express all the intersecting and disjoint compatibilities that one version of a package has with another. This provides the flexibility needed in scientific stacks, where we may want to keep some dependencies at older versions while upgrading others to newer versions (for reasons of reproducibility and stability).
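
To make the structure of a recipe concrete, the sketch below embeds a small recipe for an imaginary package and lists its directives with Python's standard ast module, so it runs without Spack installed. The package name Examplelib, its URL, versions and dependencies are all invented for illustration, and the import line follows the layout of older Spack releases, which may differ in recent ones.

```python
import ast

# Hypothetical recipe for an imaginary package; the name, URL, version and
# checksum placeholder are illustrative only. The import line follows older
# Spack releases and may need adjusting for recent ones.
RECIPE = '''
from spack.package import *


class Examplelib(CMakePackage):
    """Imaginary library used to illustrate the layout of a recipe."""

    homepage = "https://example.org/examplelib"
    url = "https://example.org/examplelib-1.2.0.tar.gz"

    version("1.2.0", sha256="<checksum goes here>")

    variant("mpi", default=True, description="Build with MPI support")

    depends_on("cmake@3.20:", type="build")
    depends_on("mpi", when="+mpi")
    depends_on("hdf5+mpi", when="+mpi")

    def cmake_args(self):
        return [self.define_from_variant("ENABLE_MPI", "mpi")]
'''


def list_directives(source):
    """Return (directive, first argument) pairs found in class bodies."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ClassDef):
            continue
        for stmt in node.body:
            # A directive is a bare call such as depends_on("mpi", when="+mpi").
            if isinstance(stmt, ast.Expr) and isinstance(stmt.value, ast.Call):
                call = stmt.value
                if isinstance(call.func, ast.Name) and call.args:
                    first = call.args[0]
                    if isinstance(first, ast.Constant):
                        found.append((call.func.id, first.value))
    return found


if __name__ == "__main__":
    for name, argument in list_directives(RECIPE):
        print(name, argument)
```

Each string passed to version, variant or depends_on is a spec or a fragment of one: "cmake@3.20:" constrains a version range, and when="+mpi" makes a dependency conditional on a variant.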

However, this flexibility comes at the cost of increased complexity in writing Spack recipes, especially for the numerous, complex dependencies of scientific stacks involving multiple languages and toolchains. Thanks to recent advances in generative AI, it is now feasible to develop agentic skills that 1) automate the generation of such Spack recipes, and 2) assist in identifying and creating test scenarios to ensure the quality of the produced recipes.

This work involves developing a methodology that reuses existing Spack utilities, AI agentic frameworks and CI tooling. The effectiveness of the generative-AI-assisted approach will also be evaluated.

Expected Results and Milestones

End goal: Generate a correct Spack recipe for a complex application or library using a generative/agentic AI tool

Get familiar with
1. Spack packaging best practices
2. Existing generative AI and agentic tools
Set up an LLM/agent on our local cluster
Explore what metric to use to quantify the quality of a generated Spack recipe
Generate a Spack recipe (3 levels of difficulty)
1. for a simple Python library
2. for a library containing compiled code
3. for one library and its full dependency tree
Package a complex application/library such as Gysela or vLLM
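
As an illustration of what a quality metric could look like, the sketch below scores the dependency specs of a generated recipe against those of a hand-written reference using precision, recall and F1. This is only one possible candidate, not the metric retained for the internship; the function name dependency_scores and the example specs are hypothetical.

```python
def dependency_scores(generated, reference):
    """Precision, recall and F1 of generated dependency specs vs. a reference."""
    generated, reference = set(generated), set(reference)
    common = generated & reference
    # Guard against empty sets to avoid division by zero.
    precision = len(common) / len(generated) if generated else 0.0
    recall = len(common) / len(reference) if reference else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical specs: the generated recipe misses hdf5 and adds zlib.
    reference = {"cmake@3.20:", "mpi", "hdf5+mpi"}
    generated = {"cmake@3.20:", "mpi", "zlib"}
    print(dependency_scores(generated, reference))
```

Exact string matching is deliberately crude: two specs that are equivalent but written differently count as a mismatch. A build-based check, such as whether spack install succeeds with the generated recipe, would be a natural complement.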

Required skills

Python programming
Generative/agentic AI knowledge and experience
Packaging and build system knowledge
Interest in scientific software stacks

Bonus skills

Hugging Face
- smolagents: https://github.com/huggingface/smolagents
- Skills: https://github.com/huggingface/skills
Spack: https://spack-tutorial.readthedocs.io/en/latest/
PyPI to Spack package.py utility: https://github.com/spack/pypi-to-spack-package
Showboat: https://github.com/simonw/showboat
Beads: https://github.com/steveyegge/beads
Autoresearch: https://github.com/karpathy/autoresearch
Fabric: https://github.com/danielmiessler/fabric
Opencode: https://opencode.ai

AI Policy

AI assistance is allowed for this contribution. The applicant takes full responsibility for all code and results and must disclose AI use for non-routine tasks (algorithm design, architecture, complex problem-solving). Routine tasks (grammar, formatting, style) do not require disclosure.