Duration: 4-6 months
Where: Maison de la Simulation, CEA Saclay, France
Supervisors:
- Benoît Martin, MdlS, bmartin (at) cea.fr
- Thomas Bouvier, MdlS, thomas.bouvier (at) cea.fr
- Thomas Padioleau, MdlS, thomas.padioleau (at) cea.fr
Context
Spack is a package manager for supercomputers. It is commonly used in high-performance computing (HPC) environments of all scales, from kinetic and gyrokinetic simulation codes on exascale architectures to AI inference stacks on smaller clusters and even local development environments.
Spack recipes are Python scripts that define how packages should be installed. They describe the properties and behavior of the build: where to find and how to retrieve the software, its dependencies, and the options for building from source. Recipes use declarative package directives, whose arguments are written in Spack's spec syntax, to express which versions and configurations of a package are compatible with one another and which are not. This provides the flexibility needed in scientific stacks, where some dependencies may have to stay at older versions while others are upgraded to newer ones, for reasons of reproducibility and stability.
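To make the notion of a recipe concrete, here is a minimal sketch of what one might look like. The package name, URL, checksum, versions and dependencies are all hypothetical placeholders, and the import line may differ between Spack versions. Because the recipe is kept as a string, the snippet runs without Spack installed: it only checks that the text is valid Python and lists the directives it declares.

```python
import ast

# Hypothetical recipe: package name, URL, checksum and versions are placeholders.
RECIPE = '''
from spack.package import *


class PyExamplelib(PythonPackage):
    """Hypothetical Python library used only for illustration."""

    homepage = "https://example.org/examplelib"
    pypi = "examplelib/examplelib-1.2.0.tar.gz"

    # Placeholder checksum, not a real digest.
    version("1.2.0", sha256="0" * 64)

    # Optional feature that users can switch on with "+mpi".
    variant("mpi", default=False, description="Enable MPI support")

    # Dependencies, with version ranges written in spec syntax.
    depends_on("python@3.9:", type=("build", "run"))
    depends_on("py-setuptools", type="build")
    depends_on("py-numpy@1.22:", type=("build", "run"))
    depends_on("py-mpi4py", type=("build", "run"), when="+mpi")
'''


def directive_calls(source: str) -> list[str]:
    """Return the names of functions called directly in a class body, in order."""
    tree = ast.parse(source)  # raises SyntaxError if the recipe is not valid Python
    names = []
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            for stmt in node.body:
                if (isinstance(stmt, ast.Expr)
                        and isinstance(stmt.value, ast.Call)
                        and isinstance(stmt.value.func, ast.Name)):
                    names.append(stmt.value.func.id)
    return names


print(directive_calls(RECIPE))
```

A syntax check of this kind is only a first filter: it says nothing about whether the package actually builds, which requires running Spack itself.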
However, this flexibility comes at the cost of increased complexity in writing Spack recipes, especially for the numerous, complex dependencies of scientific stacks involving multiple languages and toolchains. Thanks to recent advances in generative AI, it is now feasible to develop agentic skills that (1) automate the generation of such Spack recipes and (2) assist in identifying and creating test scenarios to ensure the quality of the generated recipes.
This work involves developing a methodology that reuses existing Spack utilities, agentic AI frameworks, and CI tooling. The effectiveness of the generative-AI-assisted approach will also be evaluated.
Expected Results and Milestones
End goal: Generate a correct Spack recipe for a complex application or library using a generative/agentic AI tool
- Get familiar with
  - Spack packaging best practices
  - Existing generative AI and agentic tools
- Set up an LLM/agent on our local cluster
- Explore which metrics to use to quantify the quality of a generated Spack recipe
- Generate a Spack recipe (3 levels of difficulty)
  - for a simple Python library
  - for a library containing compiled code
  - for one library and its full dependency tree
- Package a complex application/library such as Gysela or vLLM
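
As an illustration of what a quality metric for a generated recipe could look like, the sketch below compares the dependencies declared in a generated recipe against those of a hand-written reference and reports precision, recall and F1. This is only one possible candidate, not the metric chosen for the internship. The function names and both recipe fragments are hypothetical, and the package-name extraction is a deliberate simplification that ignores version ranges, variants and `when=` conditions.

```python
import ast
import re


def declared_dependencies(source: str) -> set[str]:
    """Collect the package names passed as first argument to depends_on(...)."""
    deps = set()
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id == "depends_on"
                and node.args
                and isinstance(node.args[0], ast.Constant)
                and isinstance(node.args[0].value, str)):
            # Simplification: keep the leading package name, drop the rest of the spec.
            match = re.match(r"[A-Za-z0-9_.\-]+", node.args[0].value.strip())
            if match:
                deps.add(match.group(0))
    return deps


def dependency_scores(generated: str, reference: str) -> dict[str, float]:
    """Precision, recall and F1 of generated dependencies against a reference."""
    gen = declared_dependencies(generated)
    ref = declared_dependencies(reference)
    hits = len(gen & ref)
    precision = hits / len(gen) if gen else 0.0
    recall = hits / len(ref) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical fragments: only the depends_on lines matter for this metric.
REFERENCE = '''
depends_on("python@3.9:", type=("build", "run"))
depends_on("py-setuptools", type="build")
depends_on("py-numpy@1.22:", type=("build", "run"))
'''

GENERATED = '''
depends_on("python@3.8:", type=("build", "run"))
depends_on("py-numpy", type=("build", "run"))
depends_on("py-scipy", type=("build", "run"))
'''

scores = dependency_scores(GENERATED, REFERENCE)
print(scores)
```

In this example the generated fragment misses `py-setuptools` and adds `py-scipy`, so two of its three dependencies match and precision and recall are both 2/3. A static score like this would complement, not replace, dynamic checks such as whether the package installs and passes its tests.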
Required skills
- Python programming
- Generative/agentic AI knowledge and experience
- Packaging and build system knowledge
- Interest in scientific software stacks
Bonus skills
- Hugging Face
  - smolagents: https://github.com/huggingface/smolagents
  - Skills: https://github.com/huggingface/skills
- Spack: https://spack-tutorial.readthedocs.io/en/latest/
- PyPI to Spack package.py utility: https://github.com/spack/pypi-to-spack-package
- Showboat: https://github.com/simonw/showboat
- Beads: https://github.com/steveyegge/beads
- Autoresearch: https://github.com/karpathy/autoresearch
- Fabric: https://github.com/danielmiessler/fabric
- Opencode: https://opencode.ai
AI Policy
AI assistance is allowed for this contribution. The applicant takes full responsibility for all code and results and must disclose AI use for non-routine tasks (algorithm design, architecture, complex problem-solving). Routine tasks (grammar, formatting, style) do not require disclosure.
