📊

Professional Overview

Research Leadership Profile
  • Three decades of leading research and development at CNRS (France), Inria (France), and Argonne National Laboratory (USA).
  • R&D topics: LLMs as Research Assistants, High-Performance Computing, Fault-Tolerance, and Scientific Data Compression.
  • Fifteen years of establishing and leading international collaborations through Joint Laboratories.
  • Key Achievements
    • IEEE Fellow
    • IEEE Charles Babbage Award
    • 5 Achievement/Service/Honor Awards
    • 2 R&D 100 Awards for innovative software
    • 12 Best Paper Awards/Finalists
    • 300+ publications with 19,500+ citations
    • 113 invited talks including dozens of keynotes
    • 22 Ph.D. students advised
    Current Focus Areas
    • LLMs Evaluation / LLMs for Science
    • Resilient High-Performance Parallel/Distributed Computing
    • Compression of Scientific Data
    🎓

    Education

    Habilitation à Diriger des Recherches (HDR, a Ph.D. + 7 years diploma)
    University of Paris XI, October 2001
    Jury: Michel Cosnard, Brigitte Plateau, Marc Snir, Ian Foster, Mitsuhisa Sato, Joffroy Beauquier
    Ph.D. in Computer Science
    University of Paris XI, January 1994
    Highest honors with the congratulations of the jury (Mention très honorable avec les félicitations du jury)
    Jury: Michel Cosnard, Brigitte Plateau, William Jalby, Zvonko Vranesic, Daniel Etiemble
    DEA (Pre-doctoral degree)
    University of Paris XI, July 1989
    Master of Science
    University of Paris VIII, July 1988
    💼

    Professional Positions

    Project Manager and Senior Computer Scientist
    Argonne National Laboratory | April 2013 - Present
    Leading research on resilience and compression in high-performance computing, directing multiple ECP projects, and developing innovative solutions for extreme-scale computing challenges.
    Adjunct Research Professor
    University of Illinois at Urbana-Champaign | April 2013 - Present
    Visiting Research Professor
    University of Illinois at Urbana-Champaign | July 2009 - March 2013
    Senior Researcher
    INRIA | September 2003 - December 2013
    Junior Researcher
    CNRS | February 1994 - August 2003
    🏆

    Awards & Honors

    Scientific & Leadership Recognitions
    • 2024 IEEE Charles Babbage Award
    • 2024 DOE Secretary's Honor Award
    • 2024 Euro-Par Achievement Award
    • 2022 ACM HPDC Achievement Award
    • 2021 IEEE TC Award for Editorial Service and Excellence
    • 2018 IEEE TCPP Outstanding Service and Contribution Award
    • 2017 IEEE Fellow, Class of 2017
    Technical Software Recognitions
    • 2021 R&D 100 Award - SZ: A Lossy Compression Framework for Scientific Data
    • 2019 R&D 100 Award - Scalable Checkpoint/Restart (SCR) Framework
    Best Papers & Special Recognitions
    Best Paper Awards by Year
    • 2025 IEEE IPDPS - "Enabling Efficient Error-controlled Lossy Compression" Best Paper
    • 2025 ACM HPDC - "IPComp: Interpolation Based Progressive Lossy Compression" Best Paper Candidate
    • 2025 ACM ICS - "Pushing the Limits of GPU Lossy Compression" Best Paper Candidate
    • 2024 ACM HPDC - "DataStates-LLM: Lazy Asynchronous Checkpointing for LLMs" Best Paper
    • 2024 IEEE/ACM SC - "hZCC: Accelerating Collective Communication" Best Paper Candidate
    • 2023 IEEE Transactions on Big Data - Best Paper (out of 123 papers published in 2023)
    • 2023 ACM ICS - "FZ: A flexible auto-tuned modular framework" Best Paper Candidate
    • 2023 IEEE Cluster - Best Student Poster Finalist
    • 2022 HiPC - "Towards Efficient Cache Allocation" Best Paper
    • 2022 DRBSD workshop (IEEE/ACM SC) - "Understanding Effects of Modern Compressors" Best Paper
    • 2022 IEEE/ACM SC - "Mitigating Silent Data Corruptions" Best Paper & Best Student Paper Finalist
    • 2018 IEEE Cluster - Overall Best Paper + 3 area Best Papers
    • 2011 IEEE/ACM SC - FTI Paper Perfect Score Award
    • 2007 Euro-Par - "Characterizing Result Errors" Best Paper
    • 2001 IEEE CCGRID - "OVM: Out-of-Order Execution" Best Paper
    Student Awards:
    • 2023: Graduate student 1st place ACM SRC award
    • 2022: Undergraduate 1st place ACM SRC award
    • 2022: Graduate student 2nd place ACM SRC award
    📋

    Current Responsibilities

    Lead of AuroraGPT Evaluation Group
    Since January 2024
    Responsible for overseeing the design and development of evaluation methods, including benchmarks (multiple-choice questions, open responses, chain of thought, etc.) and other measurement techniques, to assess the performance of Large Language Models (LLMs) as research assistants.
    Leader of Resilience and Compression Topics at Argonne/MCS
    Since April 2013
    Developing research strategy for resilience and compression within MCS, coordinating with DOE program managers, submitting research proposals, participating in ECP-related projects, developing new research topics, and disseminating research results and software.
    Executive Director for ANL of JLESC
    Initiated in 2014
    INRIA-Illinois-ANL-BSC-JSC-Riken-UTK Joint-Laboratory on Extreme Scale Computing. Responsible for development of JLESC activities including workshops and visits, as well as day-to-day management. Initiated this Joint-Lab in 2014 and directed it until 2022.
    📚

    Publications

    Publication Statistics
    • 6 Books and Proceedings
    • 76 Peer-Reviewed Journal Articles
    • 237 Peer-Reviewed Conference and Workshop Papers
    • 113 Invited Keynotes, Plenaries and Talks
    Most Influential Publications
    The International Exascale Software Project Roadmap
    966 citations
    IJHPCA, 2011 - J. Dongarra et al.
    Grid'5000: A Large-scale Platform
    711 citations
    IEEE/ACM GRID 2005 - R. Bolze, F. Cappello et al.
    XtremWeb: Generic Global Computing System
    600 citations
    IEEE/ACM CCGRID 2001 - G. Fedak et al.
    Fast Error-bounded Lossy HPC Data Compression with SZ
    599 citations
    IEEE IPDPS 2016 - S. Di, F. Cappello
    Cost-benefit Analysis of Cloud Computing
    539 citations
    IEEE IPDPS 2009 - D. Kondo et al.
    Toward Exascale Resilience
    493 citations
    IJHPCA, 2009 - F. Cappello et al.
    MPICH-V: Toward a scalable fault tolerant MPI for volatile nodes
    483 citations
    IEEE/ACM SC 2002 - G. Bosilca et al.
    FTI: High performance fault tolerance interface for hybrid systems
    453 citations
    IEEE/ACM SC 2011 - L. Bautista-Gomez et al.
    MPI versus MPI+OpenMP on the IBM SP for the NAS Benchmarks
    390 citations
    IEEE/ACM SC 2000 - F. Cappello et al.
    All Invited Keynotes/Plenaries and Talks
    • [I113] AuroraGPT: Exploring AI Assistants for Science, ORAP Forum, Invited talk, Nov., 2025
    • [I112] EAIRA: Establishing a Methodology to Evaluate LLMs/LRMs as Research Assistants, Exa-DoST 2025 Annual Meeting, Invited talk, Nov., 2025
    • [I111] EAIRA: Establishing a Methodology to Evaluate LLMs/LRMs as Research Assistants, Clemson's inaugural HPC Day, Invited Keynote, Sept., 2025
    • [I110] EAIRA: Establishing a Methodology to Evaluate LLMs/LRMs as Research Assistants, ECMWF’s 50th anniversary celebrations, Invited Keynote, Sept., 2025
    • [I109] EAIRA: Establishing a Methodology to Evaluate LLMs/LRMs as Research Assistants, Trillion Parameters Consortium annual meeting (TPC25), Invited plenary, Aug., 2025
    • [I108] EAIRA: Establishing a Methodology to Evaluate LLMs/LRMs as Research Assistants, AI workshop on spectroscopy, JLAB (Jefferson Lab), Invited talk, June, 2025
    • [I107] AuroraGPT and the 1000 Scientists AI JAM, Exascale Round Table Committee, Invited talk, May, 2025
    • [I106] EAIRA: Establishing a Methodology to Evaluate LLMs/LRMs as Research Assistants, HPC&AI Workshop at Stony Brook University, Theoretical Physics Symposium, Invited talk, May, 2025
    • [I105] EAIRA: Establishing a Methodology to Evaluate LLMs/LRMs as Research Assistants, Theoretical Physics Symposium, Perimeter Institute, Toronto, Keynote, April, 2025
    • [I104] EAIRA: Establishing a Methodology to Evaluate LLMs/LRMs as Research Assistants, TPC (Trillion Parameters Consortium), Invited talk, virtual, April, 2025
    • [I103] EAIRA: Establishing a Methodology to Evaluate LLMs/LRMs as Research Assistants, Invited seminar, Virginia University, March 2025
    • [I102] AuroraGPT/Eval: Establishing a Methodology to Evaluate LLMs/LRMs as Research Assistants, ETP4HPC, Keynote, February 2025
    • [I101] AuroraGPT/Eval: Establishing a Methodology to Evaluate LLMs/LRMs as Research Assistants, Riken RCCS Symposium, Invited Plenary, Jan 2025
    • [I100] AuroraGPT/Eval: Establishing a methodology to evaluate LLMs/FMs as Research Assistants, University of Virginia, AI for science workshop, Keynote, October 2024
    • [I99] How much can we reduce scientific data without losing science, Invited Seminar CS department, Northwestern U., Evanston, Jan 2024
    • [I98] AuroraGPT: Exploring AI Assistants for Science, Keynote, IPDPS24, San Francisco, June 2024
    • [I97] AuroraGPT, Evaluation of AI Assistants for Science: Critical and non-Trivial, Invited Plenary, TPC 2024 Barcelona, August 2024
    • [I96] AuroraGPT: Rationale, Challenges and Development of an AI Research Assistant, Keynote, Europar24, August 2024
    • [I95] Establishing a Methodology to Evaluate AI Models as Research Assistants, Invited Plenary, CCDSC 2024 Lyon, September 2024
    • [I94] Frontier AI for Science Security and Technology: FASST, Invited Plenary, NSF HDR Ecosystem Conference 2024, Champaign, September 2024
    • [I93] AuroraGPT: Rationale, Data Challenges and Development of an AI Research Assistant, NYSDS 2024, Invited Plenary, New York (remote), September 2024
    • [I92] Toward AI-augmented SWARM-based resilience for Integrated Research Infrastructures, Keynote, SuperCheck workshop at SC23, November 2023
    • [I91] How much can we really compress scientific data without losing science?, Keynote, LIG (Laboratoire Informatique de Grenoble) Keynote Speeches, 2023
    • [I90] How much can we really compress scientific data, Invited talk, Workshop on Clusters, Clouds, and Data for Scientific Computing, CCDSC, 2022
    • [I89] A Reflection on Methodologies, Algorithms, and Software for HPDC, Keynote, ACM HPDC 2022
    • [I88] Fault-tolerance Resilience at Extreme Scale, Keynote, IEEE DSN, 2022
    • [I87] Compression techniques in the US Exascale Program (ECP), Invited talk, Workshop on data compression for weather and climate data, 2022
    • [I86] High Ratio, Speed and Accuracy Customizable Scientific Data Compression with SZ, Keynote, The Second International Workshop on Big Data Reduction, IEEE International Conference on Big Data, 2021
    • [I85] Lossy compression for scientific data, APS seminar, Dec. 2021
    • [I84] Scientific data reduction, from renaissance to modern age (CCDSC), Invited talk, Lyon, Sept. 2021. Canceled because of COVID-19
    • [I83] Cooking the perfect reduction or how to shrink science data while keeping its substance, Invited seminar, Clemson University, Computer Science, February 2021
    • [I82] International Forum on Detectors for Photon Science (IFDEPS 2020), Mar. 2020. Canceled because of COVID-19
    • [I81] Fulfilling the promises of Lossy compression for scientific applications, (CCDSC), Invited talk, Lyon, Sept. 2020. Canceled because of COVID-19
    • [I80] Compression for scientific data, invited seminar, Inria Grenoble, IMAG building, Feb. 2020
    • [I79] HPC-BigData Convergence: What to do when scientific data becomes too big?, Keynote, Scheduling workshop, Bordeaux, June 2019
    • [I78] Trends in HPC Resilience: From Extreme Homogeneity to Extreme Heterogeneity, HPDC PC meeting workshop, Arlington, March 2019
    • [I77] The ECP EZ project, ECP Video Interview, Dec. 2018
    • [I76] Keeping-up with the flood of scientific data, Keynote, HiPEAC CSW, Oct. 2018
    • [I75] Three frontiers of lossy compression for scientific data, HPC and Data Science for Scientific Discovery, Invited talk, UCLA, Oct. 2018
    • [I74] Keeping-up with the flood of scientific data, Invited talk, Co-design Workshop, China, Oct. 2018
    • [I73] Three frontiers of lossy compression for scientific data, Workshop on Clusters, Clouds, and Data for Scientific Computing (CCDSC), Invited talk, Lyon, Sept. 2018
    • [I72] Keeping-up with the flood of scientific data, Keynote, IEEE ISPDC 2018, Switzerland, June 2018
    • [I71] Lossy compression of scientific simulation data: from visualization to checkpoint/restart, International workshop on the Convergence of Extreme Scale Computing and Big Data Analysis, collocated with IEEE IPDPS 2018, Invited talk, May 2018
    • [I70] Keeping-up with the Flood of Data in Extreme Scale Simulations, Colloquium of Center for Computational Sciences, University of Tsukuba, May 2018.
    • [I69] Progress toward transparent asynchronous multi-level checkpointing with VeloC, SIAM-PP - Resilience mini-symposium, Invited talk, Tokyo, Japan, Mar. 2018, replaced by Bogdan Nicolae
    • [I68] Addressing Fault Tolerance and Data Compression at Exascale, ECP Podcast, published in Inside HPC, https://insidehpc.com/2018/02/podcast-addressing-fault-tolerance-data-compression-exascale/
    • [I67] Lossy Compression of Scientific Datasets, Keynote, PPAM 2017 conference, Poland, Sept. 2017
    • [I66] Reconfigurable Computing for Beyond Moore Computing, Invited Panelist, Smoky Mountain Conference (SMC), Sept. 2017
    • [I65] Checkpoint/Restart: Why You Should Delegate it to a Specialized Library, Invited talk, SIAM Annual Meeting, Pittsburgh, July 2017
    • [I64] From General Purpose-Exact computing to Tailored-Lossy computing (scientific computing), Invited talk, Greater Chicago Area System Research Workshop, IIT, Chicago, April 2017
    • [I63] FPGA for Scientific Computing and Data Analytics, Invited talk, International workshop on Co-design, Xian, China, Oct. 2016
    • [I62] The Exascale Computing Project and Argonne software activity, Keynote, CREST workshop, Tokyo, Dec. 2016
    • [I61] Scientific Computing and Data Analytics: How to Deal with the Flood of Data, Distinguished Lecture, Northeastern University, Boston, Nov. 2016
    • [I60] Lossy Compression of Scientific Data: From Stone Age to Renaissance, Workshop on Clusters, Clouds, and Data for Scientific Computing (CCDSC), Invited talk, Lyon, Oct. 2016
    • [I59] FPGA for Scientific Computing and Data Analytics, Invited talk, International workshop on Co-design, Xian, China, Oct. 2016
    • [I58] Reconfigurable Computing: An Ingredient of Post-Moore Scientific Computing?, Invited dinner talk, Argonne Training Program on Extreme-Scale Computing (ATPESC), St Charles, Aug. 2016
    • [I57] On-Demand Data Analytics and Storage for Extreme-Scale Simulations and Experiments, Invited short talk, BDEC meeting, Frankfurt, June 2016
    • [I56] Trust in Results of Numerical Simulation: the New Challenging Scientific Problem in Reliability, Invited Plenary, Conference on Data Analysis: CODA 2016, Santa Fé, March 2016
    • [I55] Grid'5000 Origin and Some Suggestions for the Next 10 Years, Invited talk, Grid5000 school, Feb. 2016
    • [I54] The Joint Laboratory for Extreme Scale Computing: Investigating the challenges of post petascale scientific computing, Invited talk, 6th AICS International Symposium, Feb. 2016
    • [I53] Taking on Exascale Challenges: Key Lessons and International Collaboration Opportunities Delivered by European Cutting-Edge HPC Initiatives, Invited Panelist, SC15 BOF on European HPC Technology Projects, Nov. 2015
    • [I52] Re-form: Approaching Reconfigurable Computing for HPC and Data Analytics, Invited talk, International Workshop on Co-design, Wuxi, China, Nov. 2015
    • [I51] Let's Forget about "Fault Tolerance" and "Resilience" for HPC; Trust is the New Challenging Scientific Problem in Reliability, Keynote, FTS workshop as part of IEEE Cluster 2015, Chicago, Sept. 2015
    • [I50] Toward Exascale Resilience, Keynote, HiPEAC, thematic session on “reliability for exascale platforms and its impact on performance, from the point of view of programming models,” Oslo, May 5-7, 2015 (cancelled)
    • [I49] Advances in Climate Simulations at Extreme Scale, Invited Plenary, International workshop on Co-design, Guangzhou, China, 2014 (cancelled)
    • [I48] Toward Approximate Detection of Silent Data Corruptions, CCDSC, Invited Plenary, France, 2014
    • [I47] Resilient Algorithms and Computing Models, SIAM Annual Meeting, Invited talk, USA, 2014
    • [I46] Climate Modeling at Extreme Scale, Invited plenary, International workshop on Co-design, Guilin, China, 2013
    • [I45] High Performance Fault Tolerance / Resilience at Extreme Scale, Keynote, HPCS 2013, Helsinki, July 2013
    • [I44] Advanced Fault Tolerance Techniques for Postpetascale Systems, Invited plenary, AICS Symposium, Kobe, 2013
    • [I43] Fault Tolerance at Exascale: Recent Progresses and Open Questions, Keynote, IEEE Cluster, Beijing, 2012
    • [I42] Fault Tolerance for HPC at Extreme Scale: The Disruptive Way, Keynote, SPAC-PAD, New York, 2012
    • [I41] Toward Exascale Climate Simulation: Exploring Limits of Current Codes, Invited talk at “Weather and Climate Prediction on Next Generation Supercomputers: Numerical and Computational Aspects,” Met Office, Exeter, UK, 2012
    • [I40] Failure Prediction: Current Situation and Open Questions, CCDSC, Invited plenary, France, 2012
    • [I39] Redesigning Fault Tolerance for High Performance Computing, Distinguished Speaker seminar, I2PC, UIUC, 2012
    • [I38] A Holistic Approach for Exascale (Scalable) Resilience, Keynote talk, IEEE/ACM SC11/ScalA workshop, 2011
    • [I37] Fault Tolerance for High Performance Computing Applications in Hostile Environments: Exascale and Cloud, Keynote talk, IEEE IPDPS/DPDNS11, Anchorage, 2011
    • [I36] Exascale: The Great Disruption, Keynote talk, PDP 2011, Cyprus, 2011
    • [I35] EESI: the European Exascale Software Initiative, Keynote talk, Intel Exascale Leadership Conference, 2011
    • [I34] Toward Exascale Resilience, Invited talk HiPC workshop “Reaching Exascale in This Decade,” 2010
    • [I33] From Grid to Cloud: A View from the Experimental Platform Side, Invited Plenary talk, IEEE Grid 2008
    • [I32] Fault Tolerance & PetaScale Systems: Current Knowledge, Challenges and Opportunities, Keynote talk, Euro-Par, Spain, 2008
    • [I31] Fault Tolerance & PetaScale Systems: Current Knowledge, Challenges and Opportunities, Keynote talk, EuroPVM/MPI, Dublin, 2008
    • [I30] French National Grid Testbed: Grid 5000, Keynote talk, DCABES 2008, Dalian, China
    • [I29] Towards an International Computer Science Grid, Keynote talk, IEEE/ACM CCGRID'2007, Rio, Brazil, 2007
    • [I28] Towards an International Computer Science Grid, Keynote talk, GCP'2007, Paris, 2007
    • [I27] Towards an International Computer Science Grid, Keynote talk, Symposium on Grid, Delft, 2007
    • [I26] Towards an International Computer Science Grid, Keynote talk, IEEE WETICE, Paris, 2007
    • [I25] Fault Tolerance & PetaScale Systems: Current Knowledge, Challenges and Opportunities, Clusters and Computational Grids for Scientific Computing 2008, Highland Lake Inn, Asheville, September 10–13, 2008
    • [I24] Fault Tolerance & PetaScale Systems: Current Knowledge, Challenges and Opportunities, HPC Conference, Cetraro, July 2008
    • [I23] Grid'5000, Motivations, Status and Early Results, Grid@Asia workshop, Seoul, Korea, December 13, 2006
    • [I22] When Scale Reactivates Research in Distributed Computing: Grid'5000, Instant Grid and MPI-V, STIC-Amsud meeting, Santiago, Chile, October 18-20, 2006
    • [I21] Grid'5000, Motivations, Status and Early Results, Workshop of the Grille Academic Tunisienne pour la Recherche Scientifique, Tunis, Tunisia, Oct. 2006
    • [I20] An Update of Grid'5000 and a Focus on a Fault Tolerant MPI Experiment, Clusters and Computational Grids for Scientific Computing 2006, Highland Lake Inn, Asheville, USA, September 10-13, 2006
    • [I19] Grid'5000, Motivations, Status and Early Results, HPC Conference, Cetraro, July 2006
    • [I18] Grid and Utility Computing: Do they really mean Pervasive Services?, ICPS 2006 panel session, June 2006
    • [I17] Grid 5000: The Need for Experimental Platform for Grid Research, ExpGrid Workshop Panel, Paris, June 2006
    • [I16] Grid'5000, Motivations, Status and Early Results, Workshop new trends in HPDC, Amsterdam, March, 2006
    • [I15] Grid Projects in France and Europe, Colloquium on "25 years of collaboration between Instituto de Informatica de l'UFRGS and France,” Porto Alegre, November 2005
    • [I14] Grid'5000, Workshop Grid@large, in conjunction with Euro-Par 2005, Lisbon, August 2005
    • [I13] Dependability in Grids, Workshop of the IFIP WG10.4 on Dependable Computing and Fault Tolerance, Yokohama, July 2005
    • [I12] Grid Research Tools and Grid'5000, workshop on P2P: concept, outils et applications, Geneva, May 2005
    • [I11] Dependability in Grids, panel "Dependability Challenges and Education Perspectives", Fifth European Dependable Computing Conference, Budapest, April 2005
    • [I10] Desktop Grid, Global Computing and P2P Distributed Systems, workshop on Advanced Grid Technologies, Systems & Services, Session: Grid Foundations for Business & Industry, IST Call 5, Brussels, February 2005
    • [I9] The MPICH-V Project, ENS/NSF Workshop, Lyon, September 2004
    • [I8] Hybrid Preemptive Scheduling of MPI Applications on the Grids, Scheduling Workshop, Modane, August 2004
    • [I7] P2P Computing: From Expectations to Feedback, Trans-European-Research and Education Networking Association, Zagreb, Croatia, May 2003
    • [I6] Desktop Grids with XtremWeb: Experiences and Feedback, panel “Desktop Grids: 10,000 fold parallelism for the masses” SuperComputing 2002 (SC2002), November 2002, Baltimore
    • [I5] XtremWeb: Toward High Performance Computing on P2P systems, Advanced Research Workshop on High Performance Computing, Cetraro, June 2002
    • [I4] Système de Calcul Global Pair à Pair, Journée de l'ORAP, Saclay, March 2002
    • [I3] OVM: High Performance Computing with RPC Programming Style, Score Users Group Meeting, Oxford, UK September 2000
    • [I2] Understanding Performance of SMP Clusters for the NAS Benchmark, Workshop on Grid and Cluster Computing, Tsukuba, Japan, March 2000
    • [I1] Comparing Performance of MPI and MPI+OpenMP for NAS Benchmark on IBM SP3, IBM Watson ACTC European Workshop, Paris, France, May 2000
    💻

    Software & Methods

    SZ Lossy Compressor
    Started in 2015 | R&D100 Award 2021 | Part of the E4S software stack deployed on Exascale Systems
    Lossy compressor strictly respecting user-set error bounds. Integrated into HDF5, ADIOS, and NetCDF I/O libraries as part of the ECP project. Helps several Exascale Computing Project applications reduce dataset sizes significantly.
    VeloC Checkpoint-Restart Framework
    Started in 2016 | R&D100 Award 2019 (as part of SCR2) | Part of the E4S software stack deployed on Exascale Systems
    Multilevel checkpoint-restart framework helping Exascale Computing Project applications reduce checkpoint/restart overhead with minimum code modification.
    Grid'5000
    Started in 2003 | 6000+ users | 2500+ publications and 300+ Ph.D. theses used Grid'5000 for their experiments
    Initiated and directed this experimental platform for parallel and distributed computing from 2003 to 2008. Transformed clusters distributed across France into a fully reconfigurable experimental platform. Inspired the NSF FutureGrid and Chameleon projects.
    MPICH-V
    Started in 2002 | Served as the basis or inspiration for tens of papers (see list) totaling 7000+ citations
    Experimental platform for fault-tolerant protocols. Origin of a decade of research producing tens of publications in fault tolerance.
    XtremWeb
    Started in 1999
    Experimental platform for Desktop Grid. Adopted as the foundation of the iExec platform.
    💰

    Research Grants

    Grant Portfolio Summary
    60+ Grants as Main PI or Co-PI:
    • 20+ French research projects
    • 6 European projects (2 STREPS, 1 NoE, 1 Infrastructure, 2 support actions)
    • 20+ USA grants (DOE ECP, DOE ASCR, NSF, Sandia, ANL LDRD)
    • 1 International project (G8)
    Recent Grants (2020-2024)
    • [60] 2024 DOE ASCR AI for Science - co-PI
    • [59] 2024 DOE ASCR ZF Reduction project - lead PI
    • [58] 2024 Argonne LDRD, AuroraGPT - group lead
    • [57] 2024 NSF CSSI FZ project supplemental funding - Lead PI
    • [56] 2023-2028 DOE ASCR Xscope - X-ray & Neutron Scientific Center
    • [55] 2023-2028 DOE ASCR Illumine - Intelligent Learning for Light Source
    • [54] 2023-2027 NSF CSSI FZ: Cyberinfrastructure for lossy compression - Lead PI
    • [53] 2023-2028 DOE ASCR Distributed Intelligence for Resilient Workflows - Lead Resilience
    • [52] 2023-2026 DOE FAIR DTIO: Computational Storage - co-PI
    • [51] 2022-2025 DOE ASCR Actionable Intelligent Visual Analytics - co-PI
    • [50] 2022-2025 DOE ASCR support for JLESC - PI
    • [49] 2021-2023 NSF CSSI ROCCI: In Situ Lossy Compression - Co-PI
    • [48] 2020-2022 SPP with oil company - lead PI
    • [46] 2020-2023 DOE ECP VeloC-SZ - lead PI
    DOE ECP and Major US Grants (2014-2020)
    • [47] 2018-2019 DOE support for JLESC - lead PI
    • [45] 2018-2020 NSF Ephemeral Coherence Cohort - co-PI (Marc Snir, PI)
    • [44] 2017-2018 DOE support for JLESC - lead PI
    • [43] 2017-2020 DOE ECP VeloC - lead PI
    • [42] 2017-2020 DOE ECP EZ - lead PI
    • [41] 2017-2023 DOE ECP CODAR - Lead for data reduction (Ian Foster, PI)
    • [40] 2017-2023 DOE ECP Computing the Sky - co-lead Data Analysis (Salman Habib, PI)
    • [39] 2016-2019 NSF ALETHEIA: Automatic detection framework - co-PI
    • [38] 2016-2019 DOE ASCR Catalog - co-PI
    • [37] 2015-2018 EDF Grant to support JLESC - PI
    • [36] 2015-2018 DARPA BRASS - senior personnel
    • [35] 2015-2017 ANL LDRD Re-form: FPGA reconfigurability - co-PI
    • [34] 2014-2017 DOE ASC DECAF: High-Performance Decoupling - co-PI
    • [33] 2014-2017 PUF NextGN: Next Generation Simulation Platforms - PI
    European and International Projects (2006-2016)
    • [32] 2013-2015 Anomaly@Exascale: INRIA International Associate Team - co-PI
    • [31] 2013-2016 ANL LDRD Paris: Data Knowledge-Based Resilience - PI
    • [30] 2013-2016 MontBlanc 2: European FP7 IP - Co-PI (as Inria)
    • [29] 2013-2016 Scorpio: Significance-Based Computing - European FP7 FET
    • [28] 2012 EESI2: European Exascale Software Initiative 2 - Leader Resilience
    • [27] 2012 DOE Fault Tolerance Framework for Cray XE6 - PI
    • [26] 2012 AMFT: Advanced Multilevel Fault Tolerance - STRATOS prototype
    • [25] 2011 G8 ECS: Towards Exascale Climate Simulation - initiator & director
    • [24] 2011 DOE XStack: Event Log Analysis - main PI
    • [23] 2010 ANR-JST FP3C: Framework for Post Petascale - co-PI
    • [22] 2010 EESI: European Exascale Software Initiative - co-initiator
    • [18] 2007 EDGeS: Infrastructure European Project FP7 - co-PI
    • [17] 2006 Grid4All: European Project STREP FP6 - co-PI
    • [16] 2006 QosCos Grid: European Project STREP FP6 - PI for INRIA
    • [11] 2004 CoreGRID: European Network of Excellence - senior personnel
    French National Projects and Early Grants (1999-2010)
    • [21] 2010 RESCUE, ANR White - co-PI
    • [20] MAP-REDUCE, ANR ARPEGE - co-PI
    • [19] 2008 Aladdin: Project after Grid'5000 - Scientific Director
    • [15] 2006 HIPcal: ANR Calcul Intensif et Simulation - co-PI
    • [14] 2005 CARRIOCAS: Competitivity pole System@tic - PI INRIA
    • [13] 2005 Grid eXplorer: Large-Scale Emulation Platform - PI
    • [12] 2004 Large-Scale Evaluation of HP Networks, ACI Grid'5000 - PI
    • [10] 2003 CNRS action: National experimental platform - PI
    • [9] 2003 Data GRID Explorer, Data Mass ACI - PI
    • [8] 2003 CNRS-Urbana collaboration - co-PI with Marc Snir
    • [7] 2002 DataGraal, ACI GRID - co-PI
    • [6] 2002 cASPer: Community based Application Service Provider - co-PI
    • [5] 2002 Augernome XtremWeb, PPF Paris XI University - co-PI
    • [4] 2001 CGP2P: P2P Global Computing, ACI GRID - PI
    • [3] 2001 GRID2, ACI GRID coordination action - senior personnel
    • [2] 2000 XtremWeb Desktop Grid, Ministry of research - co-PI
    • [1] 1999 RNRT ROM Project: Multi-service Optical network - senior personnel
    👥

    Student Advising

    Advising Statistics
    • 22 Ph.D. students advised (most now in research or professor positions)
    • 58 Ph.D. defense juries
    • 8 tenure track examinations (French Habilitation)
    • Multiple postdoctoral researchers supervised
    Alumni Success Stories
    Olivier Richard
    Ph.D. 1999 - Hybrid Parallel Programming, Cluster scheduler
    Current: Assistant Professor IMAG
    George Bosilca
    Ph.D. 2003 - OVM Project
    Current: Software Architect, NVIDIA
    Gilles Fedak
    Ph.D. 2004 - XtremWeb
    Current: INRIA Researcher & Founder and CEO of iExec
    Leonardo Bautista Gomez
    Ph.D. Co-advisor 2012 - FTI Software
    Current: Founder and team leader of MigaLabs
    Ana Gainaru
    Ph.D. Co-advisor 2015 - Failure Prediction
    Current: Computer Scientist at ORNL
    Dingwen Tao
    Ph.D. Co-advisor - Lossy Compression
    Current: Full Professor, Institute of Computing Technology, Chinese Academy of Sciences
    Geraud Krawezik
    Ph.D. Advisor 2005 - Advanced programming with OpenMP
    Current: Software Engineer, Flatiron Institute, Simons Foundation
    Aurelien Bouteiller
    Ph.D. Co-Advisor 2006 - MPICH-V1 environment and protocols
    Current: Research Assistant Professor, Innovative Computing Laboratory, UTK
    Oleg Lodigensky
    Ph.D. Co-Advisor 2006 - XtremWeb-Auger : HEP desktop Grid
    Current: Expert, CryptoNext Security
    Pierre Lemarinier
    Ph.D. Co-Advisor 2006 - MPICH-V2 environment and protocol
    Current: Product Owner at Atos BDS R&D
    Benjamin Quetier
    Ph.D. Co-Advisor 2008 - very large scale distributed system emulator
    Current: Co-founder and CTO, Invenis
    👥

    Services

    Journal editorial boards and conference steering committees
    • Editorial Board Elsevier Parallel Computing, 2021
    • Editorial Board IEEE Transactions on Computers, since 2019
    • Editorial Board IEEE Transactions on Parallel and Distributed Systems until 2018
    • Editorial Board International Journal of Grid Computing, Kluwer Academic Publishers since 2003
    • Editorial Board International Journal of Cluster Computing since 2008
    • Steering Committee IEEE and ACM HPDC 2014-2020 and previously 2006-2010
    • Steering Committee IEEE/ACM CCGRID
    Conference, workshop, session (co-)organization
    • Tech paper area co-chair IEEE IPDPS 2025
    • Tutorial co-chair IEEE/ACM SC 2023
    • Award chair IEEE/ACM SC 2022
    • Award deputy chair IEEE/ACM SC 2021
    • Virtual Logistics Liaison - Tech papers IEEE/ACM SC 2021
    • Tech Paper Chair IEEE/ACM SC 2020
    • Program Chair IEEE Cluster 2020
    • Deputy Tech Paper Chair IEEE/ACM SC 2019
    • System software track chair IEEE IPDPS 2018
    • Poster Chair IEEE/ACM SC 2018
    • Program co-chair IEEE CCGrid 2017
    • Emerging Technology chair ACM/IEEE SC 2017
    • Program vice-chair: Security, Privacy, and Reliability track IEEE CCGrid 2016
    • Award chair ACM/IEEE SC 2015
    • Scientific visualization showcase chair IEEE/ACM SC 2014
    • Program co-chair ACM CAC 2014
    • Program co-chair ACM HPDC 2014
    • Test of Time Award chair ACM IEEE SC 2013
    • Birds of a feather: G8 Exascale projects, ACM/IEEE SC13
    • Panel: "Fault Tolerance/resilience at Petascale/Exascale: Is it really critical? Are solutions necessarily disruptive?" IEEE/ACM SC13
    • Session: "system software challenges" at ISC 2013
    • Tutorial co-chair ACM IEEE SC 2012
    • Technical Paper co-chair ACM IEEE SC 2011
    • Program chair HiPC 2010
    • Program chair IEEE NCA 2010
    • Technical Paper Area co-chair IEEE/ACM SC 2009
    • Program co-chair IEEE/ACM CCGrid 2009
    • Dagstuhl Seminar on Fault tolerance for HPC, 2009
    • General co-chair Grid and Pervasive Computing
    • General co-chair PCCGrid'2007, First Workshop on Large-Scale and Volatile Desktop Grids (PCCGrid) in conjunction with the IEEE International Parallel & Distributed Processing Symposium, 2007
    • Program co-chair EuroPVM/MPI, Paris, Sept. 2007, http://www.pvmmpi07.org/
    • Program co-chair HotP2P'2007, Fourth International Workshop on Hot Topics in Peer-to-Peer Systems (Hot-P2P) in conjunction with the IEEE International Parallel & Distributed Processing Symposium, 2007
    • Workshop PC-Grid at IEEE IPDPS 2006
    • General Chair IEEE HPDC 2006
    • Workshop GP2PC'2005, "Global and Peer to Peer Computing", CCGRID'2005, Cardiff, 9 May 2005
    • Workshop GP2PC'2004, "Global and Peer to Peer Computing", CCGRID'2004, Chicago, 19 April 2004
    • Workshop "Global and Peer-to-Peer Computing" (GP2PC) co-located with IEEE/ACM CCGrid’2003, Tokyo, Japan, 2003
    • Winter school GRID 2002, Aussois, Dec. 2002
    • GRID summer school co-located with RenPar 2002, Hammamet, May 2002
    • Workshop "Global and Peer-to-Peer Computing" (GP2PC) co-located with IEEE/ACM CCGrid’2002, Berlin, Germany, 2002 (www.lri.fr/~fci/GP2PC.htm)
    • Workshop "Global Computing on Personal Devices" (GP2PC) co-located with IEEE/ACM CCGrid’2001, Brisbane, Australia, 2001
    • IEEE HPCA-6, Toulouse, France, 2000
    📧

    Contact Information

    📱
    Phone:
    +1 217 417 8557
    🌐
    Website:
    ANL Profile
    🔬
    Google Scholar:
    View Publications
    🏢
    JLESC Website:
    Joint Laboratory
    Mathematics and Computer Science Division
    Argonne National Laboratory
    9700 S. Cass Avenue
    Lemont, IL 60439, USA